From 41ae7394123b8b0b13eb37ff1e9ed325374bbeba Mon Sep 17 00:00:00 2001 From: Bruno Haible Date: Thu, 19 Apr 2001 18:37:49 +0000 Subject: Automatically generated from gettext.texi. --- doc/gettext_2.html | 685 +++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 685 insertions(+) create mode 100644 doc/gettext_2.html (limited to 'doc/gettext_2.html') diff --git a/doc/gettext_2.html b/doc/gettext_2.html new file mode 100644 index 0000000..fcf1a05 --- /dev/null +++ b/doc/gettext_2.html @@ -0,0 +1,685 @@ + + + + +GNU gettext utilities - 2 PO Files and PO Mode Basics + + +Go to the first, previous, next, last section, table of contents. +


+ + +

2 PO Files and PO Mode Basics

+ +

+The GNU gettext toolset helps programmers and translators +produce, update and use translation files, mainly those +PO files which are textual, editable files. This chapter focuses on +the format of PO files, and contains a PO mode starter. The PO mode +description is spread throughout this manual instead of being concentrated +in one place. Here we present only the basics of PO mode. + +

+ + + +

2.1 Completing GNU gettext Installation

+ +

+Once you have received, unpacked, configured and compiled the GNU +gettext distribution, the `make install' command puts in +place the programs xgettext, msgfmt, gettext, and +msgmerge, as well as their available message catalogs. To +top off a comfortable installation, you might also want to make the +PO mode available to your Emacs users. + +

+

+During the installation of the PO mode, you might want to modify your +file `.emacs', once and for all, so it contains a few lines looking +like: + +

+ +
+(setq auto-mode-alist
+      (cons '("\\.po[tx]?\\'\\|\\.po\\." . po-mode) auto-mode-alist))
+(autoload 'po-mode "po-mode" "Major mode for translators to edit PO files" t)
+
+ +

+Later, whenever you edit some `.po', `.pot' or `.pox' +file, or any file having the string `.po.' within its name, +Emacs loads `po-mode.elc' (or `po-mode.el') as needed, and +automatically activates PO mode commands for the associated buffer. +The string PO appears in the mode line for any buffer for +which PO mode is active. Many PO files may be active at once in a +single Emacs session. + +

+

+If you are using Emacs version 20 or newer, and have already installed +the appropriate international fonts on your system, you may also tell +Emacs how to determine automatically the coding system of every PO file. +This will often (but not always) cause the necessary fonts to be loaded +and used for displaying the translations on your Emacs screen. For this +to happen, add the lines: + +

+ +
+(modify-coding-system-alist 'file "\\.po[tx]?\\'\\|\\.po\\."
+                            'po-find-file-coding-system)
+(autoload 'po-find-file-coding-system "po-mode")
+
+ +

+to your `.emacs' file. If, with this, you still see boxes instead +of international characters, try a different font set (via Shift Mouse +button 1). + +

+ + +

2.2 The Format of PO Files

+ +

+A PO file is made up of many entries, each entry holding the relation +between an original untranslated string and its corresponding +translation. All entries in a given PO file usually pertain +to a single project, and all translations are expressed in a single +target language. One PO file entry has the following schematic +structure: + +

+ +
+white-space
+#  translator-comments
+#. automatic-comments
+#: reference...
+#, flag...
+msgid untranslated-string
+msgstr translated-string
+
+ +

+The general structure of a PO file should be well understood by +the translator. When using PO mode, very little has to be known +about the format details, as PO mode takes care of them for her. + +

+

+Entries begin with some optional white space. Usually, when generated +through GNU gettext tools, there is exactly one blank line +between entries. Then comments follow, on lines all starting with the +character #. There are two kinds of comments: those which have +some white space immediately following the #, which comments are +created and maintained exclusively by the translator, and those which +have some non-white character just after the #, which comments +are created and maintained automatically by GNU gettext tools. +All comments, of either kind, are optional. + +

+

+After white space and comments, entries show two strings, namely +first the untranslated string as it appears in the original program +sources, and then, the translation of this string. The original +string is introduced by the keyword msgid, and the translation, +by msgstr. The two strings, untranslated and translated, +are quoted in various ways in the PO file, using " +delimiters and \ escapes, but the translator does not really +have to pay attention to the precise quoting format, as PO mode fully +takes care of quoting for her. + +

+

+The msgid strings, as well as automatic comments, are produced +and managed by other GNU gettext tools, and PO mode does not +provide means for the translator to alter these. The most she can +do is delete them, and only by deleting the whole entry. +On the other hand, the msgstr string, as well as translator +comments, are really meant for the translator, and PO mode gives her +the full control she needs. + +

+

+The comment lines beginning with `#,' are special because they are +not completely ignored by the programs as comments generally are. The +comma-separated list of flags is used by the msgfmt +program to give the user some better diagnostic messages. Currently +there are two forms of flags defined: + +

+
+ +
fuzzy +
+This flag can be generated by the msgmerge program or it can be +inserted by the translator herself. It shows that the msgstr +string might not be a correct translation (anymore). Only the translator +can judge if the translation requires further modification, or is +acceptable as is. Once satisfied with the translation, she then removes +this fuzzy attribute. The msgmerge program inserts this +when it combined the msgid and msgstr entries after fuzzy +search only. See section 6.3 Fuzzy Entries. + +
c-format +
+
no-c-format +
+These flags should not be added by a human. Instead, only the +xgettext program adds them. In an automated PO file processing +system as proposed here, the user's changes would be thrown away again as +soon as the xgettext program generates a new template file. + +In case the c-format flag is given for a string, the msgfmt +program does some more tests to check the validity of the translation. +See section 7.1 Invoking the msgfmt Program. + +
+ +
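+As a rough illustration of how a tool might read the flag list described above, here is a minimal Python sketch. The names parse_flags and is_fuzzy are hypothetical and are not part of GNU gettext; the sketch only assumes that a flag line starts with `#,' and that flags are separated by commas.

```python
def parse_flags(line):
    """Return the flags found on a '#,' comment line, or [] for other lines."""
    if not line.startswith("#,"):
        return []
    # Split on commas and drop surrounding white space and empty items.
    return [flag.strip() for flag in line[2:].split(",") if flag.strip()]


def is_fuzzy(comment_lines):
    """Tell whether any flag line of an entry carries the fuzzy flag."""
    return any("fuzzy" in parse_flags(line) for line in comment_lines)
```

+A caller would pass the comment lines of a single entry to is_fuzzy; lines of any other comment kind simply yield no flags.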

+A different kind of entry is used for translations which involve +plural forms. + +

+ +
+white-space
+#  translator-comments
+#. automatic-comments
+#: reference...
+#, flag...
+msgid untranslated-string-singular
+msgid_plural untranslated-string-plural
+msgstr[0] translated-string-case-0
+...
+msgstr[N] translated-string-case-n
+
+ +

+It happens that some lines, usually whitespace or comments, follow the +very last entry of a PO file. Such lines are not part of any entry, +and PO mode is unable to take action on those lines. By using the +PO mode function M-x po-normalize, the translator may get +rid of those spurious lines. See section 2.5 Normalizing Strings in Entries. + +

+

+The remainder of this section may be safely skipped by those using +PO mode, yet it may be interesting for everybody to have a better +idea of the precise format of a PO file. On the other hand, those +not having Emacs handy should carefully continue reading on. + +

+

+Each of untranslated-string and translated-string respects +the C syntax for a character string, including the surrounding quotes +and embedded backslashed escape sequences. When the time comes +to write multi-line strings, one should not use escaped newlines. +Instead, a closing quote should follow the last character on the +line to be continued, and an opening quote should resume the string +at the beginning of the following PO file line. For example: + +

+ +
+msgid ""
+"Here is an example of how one might continue a very long string\n"
+"for the common case the string represents multi-line output.\n"
+
+ +

+In this example, the empty string is used on the first line, to +allow better alignment of the H from the word `Here' +over the f from the word `for'. In this example, the +msgid keyword is followed by three strings, which are meant +to be concatenated. Concatenating the empty string does not change +the resulting overall string, but it is a way for us to comply with +the requirement that msgid be followed by a string on the same +line, while keeping the multi-line presentation left-justified, as +we find this to be a cleaner disposition. The empty string could have +been omitted, but only if the string starting with `Here' was +moved up to the first line, right after msgid.(2) It was not really necessary +either to switch between the two last quoted strings immediately after +the newline `\n'; the switch could have occurred after any +other character. We just did it this way because it is neater. + +

+

+One should carefully distinguish between end of lines marked as +`\n' inside quotes, which are part of the represented +string, and end of lines in the PO file itself, outside string quotes, +which have no effect on the represented string. + +
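+The concatenation rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: decode_field and unescape are hypothetical helper names, each PO line is assumed to hold at most one quoted string, and only the escapes `\n', `\t', `\"' and `\\' are decoded.

```python
import re

# One double-quoted C-style string, allowing backslash escapes inside it.
_QUOTED = re.compile(r'"((?:[^"\\]|\\.)*)"')
_ESCAPES = {"n": "\n", "t": "\t", '"': '"', "\\": "\\"}


def unescape(body):
    """Decode the few escapes handled here; other escaped characters pass through."""
    return re.sub(r"\\(.)", lambda m: _ESCAPES.get(m.group(1), m.group(1)), body)


def decode_field(po_lines):
    """Concatenate the quoted strings of one msgid or msgstr field."""
    parts = []
    for line in po_lines:
        match = _QUOTED.search(line)
        if match:
            parts.append(unescape(match.group(1)))
    return "".join(parts)
```

+With this sketch, a field written as an empty string followed by two quoted lines decodes to the same value as one where the first quoted line sits right after msgid, because line breaks outside the quotes contribute nothing.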

+

+Outside strings, white lines and comments may be used freely. +Comments start at the beginning of a line with `#' and extend +until the end of the PO file line. Comments written by translators +should have the initial `#' immediately followed by some white +space. If the `#' is not immediately followed by white space, +this comment is most likely generated and managed by specialized GNU +tools, and might disappear or be replaced unexpectedly when the PO +file is given to msgmerge. + +
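+The distinction between the two kinds of comments depends only on the character following `#', which a short Python sketch can make concrete. The function name classify_comment is hypothetical, and treating a bare `#' with nothing after it as a translator comment is an assumption not stated in the text above.

```python
def classify_comment(line):
    """Classify one PO file line by the character following '#'."""
    if not line.startswith("#"):
        return "not-a-comment"
    rest = line[1:]
    # Assumption: a bare '#' counts as an (empty) translator comment.
    if rest == "" or rest[0].isspace():
        return "translator"
    # Anything else ('#.', '#:', '#,' ...) is managed by the tools.
    return "automatic"
```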

+ + +

2.3 Main PO mode Commands

+ +

+After setting up Emacs with something similar to the lines in +section 2.1 Completing GNU gettext Installation, PO mode is activated for a window when Emacs finds a +PO file in that window. This makes the window read-only and establishes a +po-mode-map. PO mode is a genuine Emacs mode, and is not derived +from text mode in any way. Functions found on po-mode-hook, +if any, will be executed. + +

+

+When PO mode is active in a window, the letters `PO' appear +in the mode line for that window. The mode line also displays how +many entries of each kind are held in the PO file. For example, +the string `132t+3f+10u+2o' would tell the translator that the +PO file contains 132 translated entries (see section 6.2 Translated Entries), +3 fuzzy entries (see section 6.3 Fuzzy Entries), 10 untranslated entries +(see section 6.4 Untranslated Entries) and 2 obsolete entries (see section 6.5 Obsolete Entries). Zero-coefficient items are not shown. So, in this example, if +the fuzzy entries were unfuzzied, the untranslated entries were translated +and the obsolete entries were deleted, the mode line would merely display +`145t' for the counters. + +
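+The arithmetic behind the counter string can be checked with a small Python sketch. The helper names parse_counters and format_counters are hypothetical; the sketch only assumes the `NUMBERletter' items joined by `+' shown in the example, with the letters t, f, u and o.

```python
import re


def parse_counters(text):
    """Read a counter string such as '132t+3f+10u+2o' into a dictionary."""
    counts = {"t": 0, "f": 0, "u": 0, "o": 0}
    for number, kind in re.findall(r"(\d+)([tfuo])", text):
        counts[kind] = int(number)
    return counts


def format_counters(counts):
    """Rebuild the counter string, leaving out items whose count is zero."""
    return "+".join("%d%s" % (counts[k], k) for k in "tfuo" if counts[k])


counts = parse_counters("132t+3f+10u+2o")
# Unfuzzy the fuzzy entries, translate the untranslated, delete the obsolete.
counts["t"] += counts["f"] + counts["u"]
counts["f"] = counts["u"] = counts["o"] = 0
print(format_counters(counts))
```

+The 3 fuzzy and 10 untranslated entries join the 132 translated ones, while the 2 obsolete entries simply vanish, which is why the total is 145 and not 147.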

+

+The main PO commands are those which do not fit into the other categories of +subsequent sections. These allow for quitting PO mode or for managing windows +in special ways. + +

+
+ +
U +
+Undo last modification to the PO file. + +
Q +
+Quit processing and save the PO file. + +
q +
+Quit processing, possibly after confirmation. + +
O +
+Temporarily leave the PO file window. + +
? +
+
h +
+Show help about PO mode. + +
= +
+Give some PO file statistics. + +
V +
+Batch validate the format of the whole PO file. + +
+ +

+The command U (po-undo) interfaces to the Emacs +undo facility. See section `Undoing Changes' in The Emacs Editor. Each time U is typed, modifications which the translator +made to the PO file are undone a little more. For the purpose of +undoing, each PO mode command is atomic. This is especially true for +the RET command: the whole edit made by a single +use of this command is undone at once, even if the edit itself +implied several actions. However, while in the editing window, one +can undo the editing work in much finer steps. + +

+

+The commands Q (po-quit) and q +(po-confirm-and-quit) are used when the translator is done with the +PO file. The former is a bit less verbose than the latter. If the file +has been modified, it is saved to disk first. In both cases, and prior to +all this, the commands check if some untranslated message remains in the +PO file and, if yes, the translator is asked if she really wants to leave +off working with this PO file. This is the preferred way of getting rid +of an Emacs PO file buffer. Merely killing it through the usual command +C-x k (kill-buffer) is not the tidiest way to proceed. + +

+

+The command O (po-other-window) is another, softer way +to leave PO mode, temporarily. It just moves the cursor to some other +Emacs window, and pops one up if necessary. For example, if the translator +just got PO mode to show some source context in some other window, she might +discover some apparent bug in the program source that needs correction. +This command allows the translator to switch roles, become a programmer, +and have the cursor right in the window containing the program she +wants to modify. By later getting the cursor back +in the PO file window, or by asking Emacs to edit this file once again, +PO mode is then recovered. + +

+

+The command h (po-help) displays a summary of all available PO +mode commands. The translator should then type any character to resume +normal PO mode operations. The command ? has the same effect +as h. + +

+

+The command = (po-statistics) computes the total number of +entries in the PO file, the ordinal of the current entry (counted from +1), the number of untranslated entries, the number of obsolete entries, +and displays all these numbers. + +

+

+The command V (po-validate) launches msgfmt in verbose +mode over the current PO file. This command first offers to save the +current PO file on disk. The msgfmt tool, from GNU gettext, +has the purpose of creating a MO file out of a PO file, and PO mode uses +the features of this program for checking the overall format of a PO file, +as well as all individual entries. + +

+

+The program msgfmt runs asynchronously with Emacs, so the +translator regains control immediately while her PO file is being studied. +Error output is collected in the Emacs `*compilation*' buffer, +displayed in another window. The regular Emacs command C-x` +(next-error), as well as other usual compile commands, allow the +translator to reposition quickly to the offending parts of the PO file. +Once the cursor is on the line in error, the translator may decide on +any PO mode action which would help correct the error. + +

+ + +

2.4 Entry Positioning

+ +

+The cursor in a PO file window is almost always part of +an entry. The only exceptions are when the cursor +is after the last entry in the file, or when the PO file is +empty. The entry where the cursor is found is said to be the +current entry. Many PO mode commands operate on the current entry, +so moving the cursor does more than allow the translator to browse +the PO file; it also selects the entry on which commands operate. + +

+

+Some PO mode commands alter the position of the cursor in a specialized +way. A few of those special-purpose positioning commands are described here, +the others are described in following sections. + +

+
+ +
. +
+Redisplay the current entry. + +
n +
+Select the entry after the current one. + +
p +
+Select the entry before the current one. + +
< +
+Select the first entry in the PO file. + +
> +
+Select the last entry in the PO file. + +
m +
+Record the location of the current entry for later use. + +
r +
+Return to a previously saved entry location. + +
x +
+Exchange the current entry location with the previously saved one. + +
+ +

+Any Emacs command able to reposition the cursor may be used +to select the current entry in PO mode, including commands which +move by characters, lines, paragraphs, screens or pages, and search +commands. However, there is a kind of standard way to display the +current entry in PO mode, which usual Emacs commands moving +the cursor do not especially try to enforce. The command . +(po-current-entry) has the sole purpose of redisplaying the +current entry properly, after the current entry has been changed by +means external to PO mode, or the Emacs screen otherwise altered. + +

+

+It is yet to be decided if PO mode helps the translator, or otherwise +irritates her, by forcing a rigid window disposition while she +is doing her work. We originally had quite precise ideas about +how windows should behave, but on the other hand, anyone used to +Emacs is often happy to keep full control. Maybe a fixed window +disposition might be offered as a PO mode option that the translator +might activate or deactivate at will, so it could be offered on an +experimental basis. If nobody feels a real need for using it, or +a compulsion for writing it, we should drop this whole idea. +The incentive for doing it should come from translators rather than +programmers, as opinions from an experienced translator are surely +worth more than opinions from programmers thinking about +how others should do translation. + +

+

+The commands n (po-next-entry) and p +(po-previous-entry) move the cursor to the entry following, +or preceding, the current one. If n is given while the +cursor is on the last entry of the PO file, or if p +is given while the cursor is on the first entry, no move is done. + +

+

+The commands < (po-first-entry) and > +(po-last-entry) move the cursor to the first entry, or last +entry, of the PO file. When the cursor is located past the last +entry in a PO file, most PO mode commands will return an error saying +`After last entry'. Moreover, the commands < and > +have the special property of being able to work even when the cursor +is not within any PO file entry, and one may use them for nicely +correcting this situation. But even these commands will fail on a +truly empty PO file. There are development plans for the PO mode for it +to interactively fill an empty PO file from sources. See section 3.3 Marking Translatable Strings. + +

+

+The translator may decide, before working at the translation of +a particular entry, that she needs to browse the remainder of the +PO file, maybe for finding the terminology or phraseology used +in related entries. She can of course use the standard Emacs idioms +for saving the current cursor location in some register, and use that +register for getting back, or else, use the location ring. + +

+

+PO mode offers another approach, by which cursor locations may be saved +onto a special stack. The command m (po-push-location) +merely adds the location of current entry to the stack, pushing +the already saved locations under the new one. The command +r (po-pop-location) consumes the top stack element and +repositions the cursor to the entry associated with that top element. +This position is then lost, for the next r will move the cursor +to the previously saved location, and so on until no locations remain +on the stack. + +

+

+If the translator wants the position to be kept on the location stack, +maybe for taking a look at the entry associated with the top +element, then go elsewhere with the intent of getting back later, she +ought to use m immediately after r. + +

+

+The command x (po-exchange-location) simultaneously +repositions the cursor to the entry associated with the top element of +the stack of saved locations, and replaces that top element with the +location of the current entry before the move. Consequently, repeating +the x command toggles alternatively between two entries. +For achieving this, the translator will position the cursor on the +first entry, use m, then position to the second entry, and +merely use x for making the switch. + +
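+The behaviour of the location stack described above can be modelled with a minimal Python sketch. The class LocationStack and its method names are hypothetical stand-ins for the m, r and x commands, entries are represented by plain integers, and the behaviour on an empty stack (a Python IndexError here) is an assumption that does not reflect what PO mode itself reports.

```python
class LocationStack:
    """Toy model of the saved-location stack used by m, r and x."""

    def __init__(self):
        self._saved = []

    def push(self, current):
        """m: remember the current entry on top of the stack."""
        self._saved.append(current)

    def pop(self):
        """r: return the most recently saved entry and forget it."""
        return self._saved.pop()

    def exchange(self, current):
        """x: jump to the top saved entry, saving `current` in its place."""
        target = self._saved[-1]
        self._saved[-1] = current
        return target
```

+In this model, calling exchange twice in a row brings the cursor back where it started, which mirrors the toggling between two entries, and a pop immediately followed by a push of the returned value keeps the position on the stack.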

+ + +

2.5 Normalizing Strings in Entries

+ +

+There are many different ways for encoding a particular string into a +PO file entry, because there are so many different ways to split and +quote multi-line strings, and even, to represent special characters +by backslashed escape sequences. Some features of PO mode rely on +the ability for PO mode to scan an already existing PO file for a +particular string encoded into the msgid field of some entry. +Even if PO mode has internally all the built-in machinery for +implementing this recognition easily, doing it fast is technically +difficult. To facilitate a solution to this efficiency problem, +we decided on a canonical representation for strings. + +

+

+A conventional representation of strings in a PO file is currently +under discussion, and PO mode experiments with a canonical representation. +Having both xgettext and PO mode converging towards a uniform +way of representing equivalent strings would be useful, as the internal +normalization needed by PO mode could be automatically satisfied +when using xgettext from GNU gettext. An explicit +PO mode normalization should then be only necessary for PO files +imported from elsewhere, or for when the convention itself evolves. + +

+

+So, for achieving normalization of at least the strings of a given +PO file needing a canonical representation, the following PO mode +command is available: + +

+
+ +
M-x po-normalize +
+Tidy the whole PO file by making entries more uniform. + +
+ +

+The special command M-x po-normalize, which has no associated +keys, revises all entries, ensuring that strings of both original +and translated entries use uniform internal quoting in the PO file. +It also removes any crumb after the last entry. This command may be +useful for PO files freshly imported from elsewhere, or if we ever +improve on the canonical quoting format we use. This canonical format +is not only meant for getting cleaner PO files, but also for greatly +speeding up msgid string lookup for some other PO mode commands. + +

+

+M-x po-normalize presently makes three passes over the entries. +The first implements heuristics for converting PO files for GNU +gettext 0.6 and earlier, in which msgid and msgstr +fields were using K&R style C string syntax for multi-line strings. +These heuristics may fail for comments not related to obsolete +entries and ending with a backslash; they also depend on subsequent +passes for finalizing the proper commenting of continued lines for +obsolete entries. This first pass might disappear once all older PO +files have been adjusted. The second and third passes normalize +all msgid and msgstr strings respectively. They also +clean out those trailing backslashes used by XView's msgfmt +for continued lines. + +

+

+Having such an explicit normalizing command allows for importing PO +files from other sources, but also eases the evolution of the current +convention, evolution driven mostly by aesthetic concerns, as of now. +It is easy to make suggested adjustments at a later time, as the +normalizing command and eventually, other GNU gettext tools +should greatly automate conformance. A description of the canonical +string format is given below, for the particular benefit of those not +having Emacs handy, and who would nevertheless want to handcraft +their PO files in nice ways. + +

+

+Right now, in PO mode, strings are single line or multi-line. A string +goes multi-line if and only if it has embedded newlines, that +is, if it matches `[^\n]\n+[^\n]'. So, we would have: + +

+ +
+msgstr "\n\nHello, world!\n\n\n"
+
+ +

+but, replacing the space by a newline, this becomes: + +

+ +
+msgstr ""
+"\n"
+"\n"
+"Hello,\n"
+"world!\n"
+"\n"
+"\n"
+
+ +

+We are deliberately using a caricatural example, here, to make the +point clearer. Usually, multi-lines are not that bad looking. +It is probable that we will implement the following suggestion. +We might lump together all initial newlines into the empty string, +and also all newlines introducing empty lines (that is, for n +> 1, the n-1'th last newlines would go together on a separate +string), so making the previous example appear: + +

+ +
+msgstr "\n\n"
+"Hello,\n"
+"world!\n"
+"\n\n"
+
+ +
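+The rule currently in effect (a string goes multi-line if and only if it matches `[^\n]\n+[^\n]') can be tried out with a short Python sketch. It is not the PO mode implementation: format_field and escape are hypothetical names, only a handful of escapes are handled, and the suggested lumping of consecutive newlines is deliberately left out.

```python
import re


def escape(text):
    """Quote backslashes, double quotes, newlines and tabs as in C strings."""
    return (text.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n")
                .replace("\t", "\\t"))


def format_field(keyword, value):
    """Write one field on a single line, or split it after every newline."""
    # Single line unless a run of newlines sits between two other characters.
    if not re.search(r"[^\n]\n+[^\n]", value):
        return '%s "%s"' % (keyword, escape(value))
    # Cut after each newline; a last piece without a newline is kept as well.
    pieces = re.findall(r"[^\n]*\n|[^\n]+", value)
    lines = ['%s ""' % keyword] + ['"%s"' % escape(piece) for piece in pieces]
    return "\n".join(lines)


print(format_field("msgstr", "\n\nHello, world!\n\n\n"))
print(format_field("msgstr", "\n\nHello,\nworld!\n\n\n"))
```

+The first call keeps everything on one line because the leading and trailing newlines are not enclosed between other characters; the second call splits after each newline, reproducing the caricatural layout shown earlier.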

+There are a few yet undecided little points about string normalization, +to be documented in this manual, once these questions settle. + +

+

