summaryrefslogtreecommitdiffstats
path: root/doc/gettext_2.html
blob: 699dfc5fe061569e1a0a7c38a99f917e6f501593 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
<HTML>
<HEAD>
<!-- This HTML file has been created by texi2html 1.52a
     from gettext.texi on 31 January 2002 -->

<TITLE>GNU gettext utilities - 2  PO Files and PO Mode Basics</TITLE>
</HEAD>
<BODY>
Go to the <A HREF="gettext_1.html">first</A>, <A HREF="gettext_1.html">previous</A>, <A HREF="gettext_3.html">next</A>, <A HREF="gettext_16.html">last</A> section, <A HREF="gettext_toc.html">table of contents</A>.
<P><HR><P>


<H1><A NAME="SEC7" HREF="gettext_toc.html#TOC7">2  PO Files and PO Mode Basics</A></H1>

<P>
The GNU <CODE>gettext</CODE> toolset helps programmers and translators
at producing, updating and using translation files, mainly those
PO files which are textual, editable files.  This chapter stresses
the format of PO files, and contains a PO mode starter.  PO mode
description is spread throughout this manual instead of being concentrated
in one place.  Here we present only the basics of PO mode.

</P>



<H2><A NAME="SEC8" HREF="gettext_toc.html#TOC8">2.1  Completing GNU <CODE>gettext</CODE> Installation</A></H2>

<P>
Once you have received, unpacked, configured and compiled the GNU
<CODE>gettext</CODE> distribution, the <SAMP>`make install&acute;</SAMP> command puts in
place the programs <CODE>xgettext</CODE>, <CODE>msgfmt</CODE>, <CODE>gettext</CODE>, and
<CODE>msgmerge</CODE>, as well as their available message catalogs.  To
top off a comfortable installation, you might also want to make the
PO mode available to your Emacs users.

</P>
<P>
During the installation of the PO mode, you might want to modify your
file <TT>`.emacs&acute;</TT>, once and for all, so it contains a few lines looking
like:

</P>

<PRE>
(setq auto-mode-alist
      (cons '("\\.po\\'\\|\\.po\\." . po-mode) auto-mode-alist))
(autoload 'po-mode "po-mode" "Major mode for translators to edit PO files" t)
</PRE>

<P>
Later, whenever you edit some <TT>`.po&acute;</TT>
file, or any file having the string <SAMP>`.po.&acute;</SAMP> within its name,
Emacs loads <TT>`po-mode.elc&acute;</TT> (or <TT>`po-mode.el&acute;</TT>) as needed, and
automatically activates PO mode commands for the associated buffer.
The string <EM>PO</EM> appears in the mode line for any buffer for
which PO mode is active.  Many PO files may be active at once in a
single Emacs session.

</P>
<P>
If you are using Emacs version 20 or newer, and have already installed
the appropriate international fonts on your system, you may also tell
Emacs how to determine automatically the coding system of every PO file.
This will often (but not always) cause the necessary fonts to be loaded
and used for displaying the translations on your Emacs screen.  For this
to happen, add the lines:

</P>

<PRE>
(modify-coding-system-alist 'file "\\.po\\'\\|\\.po\\."
                            'po-find-file-coding-system)
(autoload 'po-find-file-coding-system "po-mode")
</PRE>

<P>
to your <TT>`.emacs&acute;</TT> file.  If, with this, you still see boxes instead
of international characters, try a different font set (via Shift Mouse
button 1).

</P>


<H2><A NAME="SEC9" HREF="gettext_toc.html#TOC9">2.2  The Format of PO Files</A></H2>

<P>
A PO file is made up of many entries, each entry holding the relation
between an original untranslated string and its corresponding
translation.  All entries in a given PO file usually pertain
to a single project, and all translations are expressed in a single
target language.  One PO file <EM>entry</EM> has the following schematic
structure:

</P>

<PRE>
<VAR>white-space</VAR>
#  <VAR>translator-comments</VAR>
#. <VAR>automatic-comments</VAR>
#: <VAR>reference</VAR>...
#, <VAR>flag</VAR>...
msgid <VAR>untranslated-string</VAR>
msgstr <VAR>translated-string</VAR>
</PRE>

<P>
The general structure of a PO file should be well understood by
the translator.  When using PO mode, very little has to be known
about the format details, as PO mode takes care of them for her.

</P>
<P>
Entries begin with some optional white space.  Usually, when generated
through GNU <CODE>gettext</CODE> tools, there is exactly one blank line
between entries.  Then comments follow, on lines all starting with the
character <KBD>#</KBD>.  There are two kinds of comments: those which have
some white space immediately following the <KBD>#</KBD>, which comments are
created and maintained exclusively by the translator, and those which
have some non-white character just after the <KBD>#</KBD>, which comments
are created and maintained automatically by GNU <CODE>gettext</CODE> tools.
All comments, of either kind, are optional.

</P>
<P>
After white space and comments, entries show two strings, namely
first the untranslated string as it appears in the original program
sources, and then, the translation of this string.  The original
string is introduced by the keyword <CODE>msgid</CODE>, and the translation,
by <CODE>msgstr</CODE>.  The two strings, untranslated and translated,
are quoted in various ways in the PO file, using <KBD>"</KBD>
delimiters and <KBD>\</KBD> escapes, but the translator does not really
have to pay attention to the precise quoting format, as PO mode fully
takes care of quoting for her.

</P>
<P>
The <CODE>msgid</CODE> strings, as well as automatic comments, are produced
and managed by other GNU <CODE>gettext</CODE> tools, and PO mode does not
provide means for the translator to alter these.  The most she can
do is merely deleting them, and only by deleting the whole entry.
On the other hand, the <CODE>msgstr</CODE> string, as well as translator
comments, are really meant for the translator, and PO mode gives her
the full control she needs.

</P>
<P>
The comment lines beginning with <KBD>#,</KBD> are special because they are
not completely ignored by the programs as comments generally are.  The
comma separated list of <VAR>flag</VAR>s is used by the <CODE>msgfmt</CODE>
program to give the user some better diagnostic messages.  Currently
there are two forms of flags defined:

</P>
<DL COMPACT>

<DT><KBD>fuzzy</KBD>
<DD>
This flag can be generated by the <CODE>msgmerge</CODE> program or it can be
inserted by the translator herself.  It shows that the <CODE>msgstr</CODE>
string might not be a correct translation (anymore).  Only the translator
can judge if the translation requires further modification, or is
acceptable as is.  Once satisfied with the translation, she then removes
this <KBD>fuzzy</KBD> attribute.  The <CODE>msgmerge</CODE> program inserts this
when it combined the <CODE>msgid</CODE> and <CODE>msgstr</CODE> entries after fuzzy
search only.  See section <A HREF="gettext_6.html#SEC45">6.3  Fuzzy Entries</A>.

<DT><KBD>c-format</KBD>
<DD>
<DT><KBD>no-c-format</KBD>
<DD>
These flags should not be added by a human.  Instead only the
<CODE>xgettext</CODE> program adds them.  In an automated PO file processing
system as proposed here the user changes would be thrown away again as
soon as the <CODE>xgettext</CODE> program generates a new template file.

In case the <KBD>c-format</KBD> flag is given for a string the <CODE>msgfmt</CODE>
does some more tests to check to validity of the translation.
See section <A HREF="gettext_8.html#SEC118">8.1  Invoking the <CODE>msgfmt</CODE> Program</A>.

</DL>

<P>
A different kind of entries is used for translations which involve
plural forms.

</P>

<PRE>
<VAR>white-space</VAR>
#  <VAR>translator-comments</VAR>
#. <VAR>automatic-comments</VAR>
#: <VAR>reference</VAR>...
#, <VAR>flag</VAR>...
msgid <VAR>untranslated-string-singular</VAR>
msgid_plural <VAR>untranslated-string-plural</VAR>
msgstr[0] <VAR>translated-string-case-0</VAR>
...
msgstr[N] <VAR>translated-string-case-n</VAR>
</PRE>

<P>
It happens that some lines, usually whitespace or comments, follow the
very last entry of a PO file.  Such lines are not part of any entry,
and PO mode is unable to take action on those lines.  By using the
PO mode function <KBD>M-x po-normalize</KBD>, the translator may get
rid of those spurious lines.  See section <A HREF="gettext_2.html#SEC12">2.5  Normalizing Strings in Entries</A>.

</P>
<P>
The remainder of this section may be safely skipped by those using
PO mode, yet it may be interesting for everybody to have a better
idea of the precise format of a PO file.  On the other hand, those
not having Emacs handy should carefully continue reading on.

</P>
<P>
Each of <VAR>untranslated-string</VAR> and <VAR>translated-string</VAR> respects
the C syntax for a character string, including the surrounding quotes
and embedded backslashed escape sequences.  When the time comes
to write multi-line strings, one should not use escaped newlines.
Instead, a closing quote should follow the last character on the
line to be continued, and an opening quote should resume the string
at the beginning of the following PO file line.  For example:

</P>

<PRE>
msgid ""
"Here is an example of how one might continue a very long string\n"
"for the common case the string represents multi-line output.\n"
</PRE>

<P>
In this example, the empty string is used on the first line, to
allow better alignment of the <KBD>H</KBD> from the word <SAMP>`Here&acute;</SAMP>
over the <KBD>f</KBD> from the word <SAMP>`for&acute;</SAMP>.  In this example, the
<CODE>msgid</CODE> keyword is followed by three strings, which are meant
to be concatenated.  Concatenating the empty string does not change
the resulting overall string, but it is a way for us to comply with
the necessity of <CODE>msgid</CODE> to be followed by a string on the same
line, while keeping the multi-line presentation left-justified, as
we find this to be a cleaner disposition.  The empty string could have
been omitted, but only if the string starting with <SAMP>`Here&acute;</SAMP> was
promoted on the first line, right after <CODE>msgid</CODE>.<A NAME="DOCF2" HREF="gettext_foot.html#FOOT2">(2)</A> It was not really necessary
either to switch between the two last quoted strings immediately after
the newline <SAMP>`\n&acute;</SAMP>, the switch could have occurred after <EM>any</EM>
other character, we just did it this way because it is neater.

</P>
<P>
One should carefully distinguish between end of lines marked as
<SAMP>`\n&acute;</SAMP> <EM>inside</EM> quotes, which are part of the represented
string, and end of lines in the PO file itself, outside string quotes,
which have no incidence on the represented string.

</P>
<P>
Outside strings, white lines and comments may be used freely.
Comments start at the beginning of a line with <SAMP>`#&acute;</SAMP> and extend
until the end of the PO file line.  Comments written by translators
should have the initial <SAMP>`#&acute;</SAMP> immediately followed by some white
space.  If the <SAMP>`#&acute;</SAMP> is not immediately followed by white space,
this comment is most likely generated and managed by specialized GNU
tools, and might disappear or be replaced unexpectedly when the PO
file is given to <CODE>msgmerge</CODE>.

</P>


<H2><A NAME="SEC10" HREF="gettext_toc.html#TOC10">2.3  Main PO mode Commands</A></H2>

<P>
After setting up Emacs with something similar to the lines in
section <A HREF="gettext_2.html#SEC8">2.1  Completing GNU <CODE>gettext</CODE> Installation</A>, PO mode is activated for a window when Emacs finds a
PO file in that window.  This puts the window read-only and establishes a
po-mode-map, which is a genuine Emacs mode, in a way that is not derived
from text mode in any way.  Functions found on <CODE>po-mode-hook</CODE>,
if any, will be executed.

</P>
<P>
When PO mode is active in a window, the letters <SAMP>`PO&acute;</SAMP> appear
in the mode line for that window.  The mode line also displays how
many entries of each kind are held in the PO file.  For example,
the string <SAMP>`132t+3f+10u+2o&acute;</SAMP> would tell the translator that the
PO mode contains 132 translated entries (see section <A HREF="gettext_6.html#SEC44">6.2  Translated Entries</A>,
3 fuzzy entries (see section <A HREF="gettext_6.html#SEC45">6.3  Fuzzy Entries</A>), 10 untranslated entries
(see section <A HREF="gettext_6.html#SEC46">6.4  Untranslated Entries</A>) and 2 obsolete entries (see section <A HREF="gettext_6.html#SEC47">6.5  Obsolete Entries</A>).  Zero-coefficients items are not shown.  So, in this example, if
the fuzzy entries were unfuzzied, the untranslated entries were translated
and the obsolete entries were deleted, the mode line would merely display
<SAMP>`145t&acute;</SAMP> for the counters.

</P>
<P>
The main PO commands are those which do not fit into the other categories of
subsequent sections.  These allow for quitting PO mode or for managing windows
in special ways.

</P>
<DL COMPACT>

<DT><KBD>_</KBD>
<DD>
Undo last modification to the PO file (<CODE>po-undo</CODE>).

<DT><KBD>Q</KBD>
<DD>
Quit processing and save the PO file (<CODE>po-quit</CODE>).

<DT><KBD>q</KBD>
<DD>
Quit processing, possibly after confirmation (<CODE>po-confirm-and-quit</CODE>).

<DT><KBD>0</KBD>
<DD>
Temporary leave the PO file window (<CODE>po-other-window</CODE>).

<DT><KBD>?</KBD>
<DD>
<DT><KBD>h</KBD>
<DD>
Show help about PO mode (<CODE>po-help</CODE>).

<DT><KBD>=</KBD>
<DD>
Give some PO file statistics (<CODE>po-statistics</CODE>).

<DT><KBD>V</KBD>
<DD>
Batch validate the format of the whole PO file (<CODE>po-validate</CODE>).

</DL>

<P>
The command <KBD>_</KBD> (<CODE>po-undo</CODE>) interfaces to the Emacs
<EM>undo</EM> facility.  See section `Undoing Changes' in <CITE>The Emacs Editor</CITE>.  Each time <KBD>U</KBD> is typed, modifications which the translator
did to the PO file are undone a little more.  For the purpose of
undoing, each PO mode command is atomic.  This is especially true for
the <KBD><KBD>RET</KBD></KBD> command: the whole edition made by using a single
use of this command is undone at once, even if the edition itself
implied several actions.  However, while in the editing window, one
can undo the edition work quite parsimoniously.

</P>
<P>
The commands <KBD>Q</KBD> (<CODE>po-quit</CODE>) and <KBD>q</KBD>
(<CODE>po-confirm-and-quit</CODE>) are used when the translator is done with the
PO file.  The former is a bit less verbose than the latter.  If the file
has been modified, it is saved to disk first.  In both cases, and prior to
all this, the commands check if any untranslated messages remain in the
PO file and, if so, the translator is asked if she really wants to leave
off working with this PO file.  This is the preferred way of getting rid
of an Emacs PO file buffer.  Merely killing it through the usual command
<KBD>C-x k</KBD> (<CODE>kill-buffer</CODE>) is not the tidiest way to proceed.

</P>
<P>
The command <KBD>0</KBD> (<CODE>po-other-window</CODE>) is another, softer way,
to leave PO mode, temporarily.  It just moves the cursor to some other
Emacs window, and pops one if necessary.  For example, if the translator
just got PO mode to show some source context in some other, she might
discover some apparent bug in the program source that needs correction.
This command allows the translator to change sex, become a programmer,
and have the cursor right into the window containing the program she
(or rather <EM>he</EM>) wants to modify.  By later getting the cursor back
in the PO file window, or by asking Emacs to edit this file once again,
PO mode is then recovered.

</P>
<P>
The command <KBD>h</KBD> (<CODE>po-help</CODE>) displays a summary of all available PO
mode commands.  The translator should then type any character to resume
normal PO mode operations.  The command <KBD>?</KBD> has the same effect
as <KBD>h</KBD>.

</P>
<P>
The command <KBD>=</KBD> (<CODE>po-statistics</CODE>) computes the total number of
entries in the PO file, the ordinal of the current entry (counted from
1), the number of untranslated entries, the number of obsolete entries,
and displays all these numbers.

</P>
<P>
The command <KBD>V</KBD> (<CODE>po-validate</CODE>) launches <CODE>msgfmt</CODE> in
checking and verbose
mode over the current PO file.  This command first offers to save the
current PO file on disk.  The <CODE>msgfmt</CODE> tool, from GNU <CODE>gettext</CODE>,
has the purpose of creating a MO file out of a PO file, and PO mode uses
the features of this program for checking the overall format of a PO file,
as well as all individual entries.

</P>
<P>
The program <CODE>msgfmt</CODE> runs asynchronously with Emacs, so the
translator regains control immediately while her PO file is being studied.
Error output is collected in the Emacs <SAMP>`*compilation*&acute;</SAMP> buffer,
displayed in another window.  The regular Emacs command <KBD>C-x`</KBD>
(<CODE>next-error</CODE>), as well as other usual compile commands, allow the
translator to reposition quickly to the offending parts of the PO file.
Once the cursor is on the line in error, the translator may decide on
any PO mode action which would help correcting the error.

</P>


<H2><A NAME="SEC11" HREF="gettext_toc.html#TOC11">2.4  Entry Positioning</A></H2>

<P>
The cursor in a PO file window is almost always part of
an entry.  The only exceptions are the special case when the cursor
is after the last entry in the file, or when the PO file is
empty.  The entry where the cursor is found to be is said to be the
current entry.  Many PO mode commands operate on the current entry,
so moving the cursor does more than allowing the translator to browse
the PO file, this also selects on which entry commands operate.

</P>
<P>
Some PO mode commands alter the position of the cursor in a specialized
way.  A few of those special purpose positioning are described here,
the others are described in following sections (for a complete list try
<KBD>C-h m</KBD>):

</P>
<DL COMPACT>

<DT><KBD>.</KBD>
<DD>
Redisplay the current entry (<CODE>po-current-entry</CODE>).

<DT><KBD>n</KBD>
<DD>
Select the entry after the current one (<CODE>po-next-entry</CODE>).

<DT><KBD>p</KBD>
<DD>
Select the entry before the current one (<CODE>po-previous-entry</CODE>).

<DT><KBD>&#60;</KBD>
<DD>
Select the first entry in the PO file (<CODE>po-first-entry</CODE>).

<DT><KBD>&#62;</KBD>
<DD>
Select the last entry in the PO file (<CODE>po-last-entry</CODE>).

<DT><KBD>m</KBD>
<DD>
Record the location of the current entry for later use
(<CODE>po-push-location</CODE>).

<DT><KBD>r</KBD>
<DD>
Return to a previously saved entry location (<CODE>po-pop-location</CODE>).

<DT><KBD>x</KBD>
<DD>
Exchange the current entry location with the previously saved one
(<CODE>po-exchange-location</CODE>).

</DL>

<P>
Any Emacs command able to reposition the cursor may be used
to select the current entry in PO mode, including commands which
move by characters, lines, paragraphs, screens or pages, and search
commands.  However, there is a kind of standard way to display the
current entry in PO mode, which usual Emacs commands moving
the cursor do not especially try to enforce.  The command <KBD>.</KBD>
(<CODE>po-current-entry</CODE>) has the sole purpose of redisplaying the
current entry properly, after the current entry has been changed by
means external to PO mode, or the Emacs screen otherwise altered.

</P>
<P>
It is yet to be decided if PO mode helps the translator, or otherwise
irritates her, by forcing a rigid window disposition while she
is doing her work.  We originally had quite precise ideas about
how windows should behave, but on the other hand, anyone used to
Emacs is often happy to keep full control.  Maybe a fixed window
disposition might be offered as a PO mode option that the translator
might activate or deactivate at will, so it could be offered on an
experimental basis.  If nobody feels a real need for using it, or
a compulsion for writing it, we should drop this whole idea.
The incentive for doing it should come from translators rather than
programmers, as opinions from an experienced translator are surely
more worth to me than opinions from programmers <EM>thinking</EM> about
how <EM>others</EM> should do translation.

</P>
<P>
The commands <KBD>n</KBD> (<CODE>po-next-entry</CODE>) and <KBD>p</KBD>
(<CODE>po-previous-entry</CODE>) move the cursor the entry following,
or preceding, the current one.  If <KBD>n</KBD> is given while the
cursor is on the last entry of the PO file, or if <KBD>p</KBD>
is given while the cursor is on the first entry, no move is done.

</P>
<P>
The commands <KBD>&#60;</KBD> (<CODE>po-first-entry</CODE>) and <KBD>&#62;</KBD>
(<CODE>po-last-entry</CODE>) move the cursor to the first entry, or last
entry, of the PO file.  When the cursor is located past the last
entry in a PO file, most PO mode commands will return an error saying
<SAMP>`After last entry&acute;</SAMP>.  Moreover, the commands <KBD>&#60;</KBD> and <KBD>&#62;</KBD>
have the special property of being able to work even when the cursor
is not into some PO file entry, and one may use them for nicely
correcting this situation.  But even these commands will fail on a
truly empty PO file.  There are development plans for the PO mode for it
to interactively fill an empty PO file from sources.  See section <A HREF="gettext_3.html#SEC16">3.3  Marking Translatable Strings</A>.

</P>
<P>
The translator may decide, before working at the translation of
a particular entry, that she needs to browse the remainder of the
PO file, maybe for finding the terminology or phraseology used
in related entries.  She can of course use the standard Emacs idioms
for saving the current cursor location in some register, and use that
register for getting back, or else, use the location ring.

</P>
<P>
PO mode offers another approach, by which cursor locations may be saved
onto a special stack.  The command <KBD>m</KBD> (<CODE>po-push-location</CODE>)
merely adds the location of current entry to the stack, pushing
the already saved locations under the new one.  The command
<KBD>r</KBD> (<CODE>po-pop-location</CODE>) consumes the top stack element and
repositions the cursor to the entry associated with that top element.
This position is then lost, for the next <KBD>r</KBD> will move the cursor
to the previously saved location, and so on until no locations remain
on the stack.

</P>
<P>
If the translator wants the position to be kept on the location stack,
maybe for taking a look at the entry associated with the top
element, then go elsewhere with the intent of getting back later, she
ought to use <KBD>m</KBD> immediately after <KBD>r</KBD>.

</P>
<P>
The command <KBD>x</KBD> (<CODE>po-exchange-location</CODE>) simultaneously
repositions the cursor to the entry associated with the top element of
the stack of saved locations, and replaces that top element with the
location of the current entry before the move.  Consequently, repeating
the <KBD>x</KBD> command toggles alternatively between two entries.
For achieving this, the translator will position the cursor on the
first entry, use <KBD>m</KBD>, then position to the second entry, and
merely use <KBD>x</KBD> for making the switch.

</P>


<H2><A NAME="SEC12" HREF="gettext_toc.html#TOC12">2.5  Normalizing Strings in Entries</A></H2>

<P>
There are many different ways for encoding a particular string into a
PO file entry, because there are so many different ways to split and
quote multi-line strings, and even, to represent special characters
by backslashed escaped sequences.  Some features of PO mode rely on
the ability for PO mode to scan an already existing PO file for a
particular string encoded into the <CODE>msgid</CODE> field of some entry.
Even if PO mode has internally all the built-in machinery for
implementing this recognition easily, doing it fast is technically
difficult.  To facilitate a solution to this efficiency problem,
we decided on a canonical representation for strings.

</P>
<P>
A conventional representation of strings in a PO file is currently
under discussion, and PO mode experiments with a canonical representation.
Having both <CODE>xgettext</CODE> and PO mode converging towards a uniform
way of representing equivalent strings would be useful, as the internal
normalization needed by PO mode could be automatically satisfied
when using <CODE>xgettext</CODE> from GNU <CODE>gettext</CODE>.  An explicit
PO mode normalization should then be only necessary for PO files
imported from elsewhere, or for when the convention itself evolves.

</P>
<P>
So, for achieving normalization of at least the strings of a given
PO file needing a canonical representation, the following PO mode
command is available:

</P>
<DL COMPACT>

<DT><KBD>M-x po-normalize</KBD>
<DD>
Tidy the whole PO file by making entries more uniform.

</DL>

<P>
The special command <KBD>M-x po-normalize</KBD>, which has no associated
keys, revises all entries, ensuring that strings of both original
and translated entries use uniform internal quoting in the PO file.
It also removes any crumb after the last entry.  This command may be
useful for PO files freshly imported from elsewhere, or if we ever
improve on the canonical quoting format we use.  This canonical format
is not only meant for getting cleaner PO files, but also for greatly
speeding up <CODE>msgid</CODE> string lookup for some other PO mode commands.

</P>
<P>
<KBD>M-x po-normalize</KBD> presently makes three passes over the entries.
The first implements heuristics for converting PO files for GNU
<CODE>gettext</CODE> 0.6 and earlier, in which <CODE>msgid</CODE> and <CODE>msgstr</CODE>
fields were using K&#38;R style C string syntax for multi-line strings.
These heuristics may fail for comments not related to obsolete
entries and ending with a backslash; they also depend on subsequent
passes for finalizing the proper commenting of continued lines for
obsolete entries.  This first pass might disappear once all oldish PO
files would have been adjusted.  The second and third pass normalize
all <CODE>msgid</CODE> and <CODE>msgstr</CODE> strings respectively.  They also
clean out those trailing backslashes used by XView's <CODE>msgfmt</CODE>
for continued lines.

</P>
<P>
Having such an explicit normalizing command allows for importing PO
files from other sources, but also eases the evolution of the current
convention, evolution driven mostly by aesthetic concerns, as of now.
It is easy to make suggested adjustments at a later time, as the
normalizing command and eventually, other GNU <CODE>gettext</CODE> tools
should greatly automate conformance.  A description of the canonical
string format is given below, for the particular benefit of those not
having Emacs handy, and who would nevertheless want to handcraft
their PO files in nice ways.

</P>
<P>
Right now, in PO mode, strings are single line or multi-line.  A string
goes multi-line if and only if it has <EM>embedded</EM> newlines, that
is, if it matches <SAMP>`[^\n]\n+[^\n]&acute;</SAMP>.  So, we would have:

</P>

<PRE>
msgstr "\n\nHello, world!\n\n\n"
</PRE>

<P>
but, replacing the space by a newline, this becomes:

</P>

<PRE>
msgstr ""
"\n"
"\n"
"Hello,\n"
"world!\n"
"\n"
"\n"
</PRE>

<P>
We are deliberately using a caricatural example, here, to make the
point clearer.  Usually, multi-lines are not that bad looking.
It is probable that we will implement the following suggestion.
We might lump together all initial newlines into the empty string,
and also all newlines introducing empty lines (that is, for <VAR>n</VAR>
&#62; 1, the <VAR>n</VAR>-1'th last newlines would go together on a separate
string), so making the previous example appear:

</P>

<PRE>
msgstr "\n\n"
"Hello,\n"
"world!\n"
"\n\n"
</PRE>

<P>
There are a few yet undecided little points about string normalization,
to be documented in this manual, once these questions settle.

</P>
<P><HR><P>
Go to the <A HREF="gettext_1.html">first</A>, <A HREF="gettext_1.html">previous</A>, <A HREF="gettext_3.html">next</A>, <A HREF="gettext_16.html">last</A> section, <A HREF="gettext_toc.html">table of contents</A>.
</BODY>
</HTML>