This is gettext.info, produced by makeinfo version 4.0 from
gettext.texi.

INFO-DIR-SECTION GNU Gettext Utilities
START-INFO-DIR-ENTRY
* Gettext: (gettext).                           GNU gettext utilities.
* gettextize: (gettext)gettextize Invocation.   Prepare a package for gettext.
* msgfmt: (gettext)msgfmt Invocation.           Make MO files out of PO files.
* msgmerge: (gettext)msgmerge Invocation.       Update two PO files into one.
* xgettext: (gettext)xgettext Invocation.       Extract strings into a PO file.
END-INFO-DIR-ENTRY

This file provides documentation for GNU `gettext' utilities.  It also
serves as a reference for the free Translation Project.

Copyright (C) 1995, 1996, 1997, 1998, 2001 Free Software Foundation,
Inc.

Permission is granted to make and distribute verbatim copies of this
manual provided the copyright notice and this permission notice are
preserved on all copies.

Permission is granted to copy and distribute modified versions of this
manual under the conditions for verbatim copying, provided that the
entire resulting derived work is distributed under the terms of a
permission notice identical to this one.

Permission is granted to copy and distribute translations of this
manual into another language, under the above conditions for modified
versions, except that this permission notice may be stated in a
translation approved by the Foundation.


File: gettext.info,  Node: Manipulating,  Next: Binaries,  Prev: Updating,  Up: Top

Manipulating PO Files
*********************

Sometimes it is necessary to manipulate PO files in a way that is
better performed automatically than by hand.  GNU `gettext' includes a
complete set of tools for this purpose.

When merging two packages into a single package, the resulting POT
file will be the concatenation of the two packages' POT files.  Thus
the maintainer must concatenate the two existing package translations
into a single translation catalog, for each language.  This is best
performed using `msgcat'.
It is then the translators' duty to deal with any possible conflicts
that arose during the merge.

When a translator takes over the translation job from another
translator but uses a different character encoding in her locale, she
will convert the catalog to her character encoding.  This is best done
through the `msgconv' program.

When a maintainer takes a source file with tagged messages from
another package, he should also take the existing translations for this
source file (and not let the translators do the same job twice).  One
way to do this is through `msggrep'; another is to create a POT file
for that source file and use `msgmerge'.

When a translator wants to adjust some translation catalog for a
special dialect or orthography (for example, German as written in
Switzerland versus German as written in Germany), she needs to apply
some text processing to every message in the catalog.  The tool for
doing this is `msgexec'.

Another use of `msgexec' is to produce approximately the POT file for
which a given PO file was made.  This can be done through a filter
command like `msgexec sed -e d | sed -e '/^# /d''.  Note that the
original POT file may have had different comments and different plural
message counts; that is why it is better to use the original POT file
if it is available.

When third party tools create PO or POT files, sometimes duplicates
cannot be avoided.  But the GNU `gettext' tools give an error when they
encounter duplicate msgids in the same file and in the same domain.  To
merge duplicates, the `msguniq' program can be used.

`msgcomm' is a more general tool for keeping or throwing away
duplicates that occur in different files.

`msgcmp' can be used to check whether a translation catalog is
completely translated.

`msgattrib' can be used to select and extract only the fuzzy or
untranslated messages of a translation catalog.

`msgen' is useful as a first step for preparing English translation
catalogs.  It copies each message's msgid to its msgstr.

* Menu:

* msgcat Invocation::           Invoking the `msgcat' Program
* msgconv Invocation::          Invoking the `msgconv' Program
* msggrep Invocation::          Invoking the `msggrep' Program
* msgexec Invocation::          Invoking the `msgexec' Program
* msguniq Invocation::          Invoking the `msguniq' Program
* msgcomm Invocation::          Invoking the `msgcomm' Program
* msgcmp Invocation::           Invoking the `msgcmp' Program
* msgattrib Invocation::        Invoking the `msgattrib' Program
* msgen Invocation::            Invoking the `msgen' Program


File: gettext.info,  Node: msgcat Invocation,  Next: msgconv Invocation,  Prev: Manipulating,  Up: Manipulating

Invoking the `msgcat' Program
=============================

     msgcat [OPTION] [INPUTFILE]...

The `msgcat' program concatenates and merges the specified PO files.
It finds messages which are common to two or more of the specified PO
files.  By using the `--more-than' option, greater commonality may be
requested before messages are printed.  Conversely, the `--less-than'
option may be used to specify less commonality before messages are
printed (for example, `--less-than=2' will only print the unique
messages).  Translations, comments and extracted comments will be
cumulated, except that if `--use-first' is specified, they will be
taken from the first PO file to define them.  File positions from all
PO files will be cumulated.

Input file location
-------------------

`INPUTFILE ...'
     Input files.

`-f FILE'
`--files-from=FILE'
     Read the names of the input files from FILE instead of getting
     them from the command line.

`-D DIRECTORY'
`--directory=DIRECTORY'
     Add DIRECTORY to the list of directories.  Source files are
     searched relative to this list of directories.  The resulting `.po'
     file will be written relative to the current directory, though.

If INPUTFILE is `-', standard input is read.

Output file location
--------------------

`-o FILE'
`--output-file=FILE'
     Write output to specified file.

The results are written to standard output if no output file is
specified or if it is `-'.
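
The commonality thresholds of `msgcat' can be pictured as a count of
how many input files define each msgid.  The following Python sketch is
only an illustration, not how `msgcat' is implemented: it assumes each
catalog has already been reduced to a list of msgid strings, and the
function name `select_by_commonality' and its parameters are
hypothetical.  The result is sorted only to keep the sketch
deterministic.

```python
from collections import Counter
from typing import Iterable, List, Optional


def select_by_commonality(
    catalogs: Iterable[Iterable[str]],
    more_than: int = 0,
    less_than: Optional[int] = None,
) -> List[str]:
    """Return the msgids defined in more than `more_than` and in fewer
    than `less_than` of the catalogs (None means no upper limit)."""
    counts: Counter = Counter()
    for catalog in catalogs:
        # Count each msgid once per catalog.
        counts.update(set(catalog))
    selected = []
    for msgid, n in counts.items():
        if n > more_than and (less_than is None or n < less_than):
            selected.append(msgid)
    return sorted(selected)


first = ["Open", "Close"]
second = ["Close", "Quit"]
print(select_by_commonality([first, second]))               # everything
print(select_by_commonality([first, second], less_than=2))  # unique only
print(select_by_commonality([first, second], more_than=1))  # common only
```

With the default thresholds every message is kept, which corresponds
to plain concatenation; a `less_than' of 2 keeps only messages defined
in a single file.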

Message selection
-----------------

`-< NUMBER'
`--less-than=NUMBER'
     Print messages with less than NUMBER definitions, defaults to
     infinite if not set.

`-> NUMBER'
`--more-than=NUMBER'
     Print messages with more than NUMBER definitions, defaults to 0 if
     not set.

`-u'
`--unique'
     Shorthand for `--less-than=2'.  Requests that only unique messages
     be printed.

Output details
--------------

`-t'
`--to-code=NAME'
     Specify encoding for output.

`--use-first'
     Use first available translation for each message.  Don't merge
     several translations into one.

`--force-po'
     Always write an output file even if it contains no message.

`-i'
`--indent'
     Write the .po file using indented style.

`--no-location'
     Do not write `#: FILENAME:LINE' lines.

`-n'
`--add-location'
     Generate `#: FILENAME:LINE' lines (default).

`--strict'
     Write out a strict Uniforum conforming PO file.  Note that this
     Uniforum format should be avoided because it doesn't support the
     GNU extensions.

`-w NUMBER'
`--width=NUMBER'
     Set the output page width.  Long strings in the output files will
     be split across multiple lines in order to ensure that each line's
     width (= number of screen columns) is less than or equal to the
     given NUMBER.

`-s'
`--sort-output'
     Generate sorted output.  Note that using this option makes it much
     harder for the translator to understand each message's context.

`-F'
`--sort-by-file'
     Sort output by file location.

Informative output
------------------

`-h'
`--help'
     Display this help and exit.

`-V'
`--version'
     Output version information and exit.


File: gettext.info,  Node: msgconv Invocation,  Next: msggrep Invocation,  Prev: msgcat Invocation,  Up: Manipulating

Invoking the `msgconv' Program
==============================

     msgconv [OPTION] [INPUTFILE]

The `msgconv' program converts a translation catalog to a different
character encoding.

Input file location
-------------------

`INPUTFILE'
     Input PO file.

`-D DIRECTORY'
`--directory=DIRECTORY'
     Add DIRECTORY to the list of directories.
     Source files are searched relative to this list of directories.
     The resulting `.po' file will be written relative to the current
     directory, though.

If no INPUTFILE is given or if it is `-', standard input is read.

Output file location
--------------------

`-o FILE'
`--output-file=FILE'
     Write output to specified file.

The results are written to standard output if no output file is
specified or if it is `-'.

Conversion target
-----------------

`-t'
`--to-code=NAME'
     Specify encoding for output.

The default encoding is the current locale's encoding.

Output details
--------------

`--force-po'
     Always write an output file even if it contains no message.

`-i'
`--indent'
     Write the .po file using indented style.

`--no-location'
     Do not write `#: FILENAME:LINE' lines.

`--add-location'
     Generate `#: FILENAME:LINE' lines (default).

`--strict'
     Write out a strict Uniforum conforming PO file.  Note that this
     Uniforum format should be avoided because it doesn't support the
     GNU extensions.

`-w NUMBER'
`--width=NUMBER'
     Set the output page width.  Long strings in the output files will
     be split across multiple lines in order to ensure that each line's
     width (= number of screen columns) is less than or equal to the
     given NUMBER.

`-s'
`--sort-output'
     Generate sorted output.  Note that using this option makes it much
     harder for the translator to understand each message's context.

`-F'
`--sort-by-file'
     Sort output by file location.

Informative output
------------------

`-h'
`--help'
     Display this help and exit.

`-V'
`--version'
     Output version information and exit.


File: gettext.info,  Node: msggrep Invocation,  Next: msgexec Invocation,  Prev: msgconv Invocation,  Up: Manipulating

Invoking the `msggrep' Program
==============================

     msggrep [OPTION] [INPUTFILE]

The `msggrep' program extracts all messages of a translation catalog
that match a given pattern or belong to some given source files.

Input file location
-------------------

`INPUTFILE'
     Input PO file.

`-D DIRECTORY'
`--directory=DIRECTORY'
     Add DIRECTORY to the list of directories.  Source files are
     searched relative to this list of directories.  The resulting `.po'
     file will be written relative to the current directory, though.

If no INPUTFILE is given or if it is `-', standard input is read.

Output file location
--------------------

`-o FILE'
`--output-file=FILE'
     Write output to specified file.

The results are written to standard output if no output file is
specified or if it is `-'.

Message selection
-----------------

       [-N SOURCEFILE]... [-M DOMAINNAME]...
       [-K MSGID-PATTERN] [-T MSGSTR-PATTERN]

A message is selected if
   * it comes from one of the specified source files,

   * or if it comes from one of the specified domains,

   * or if `-K' is given and its key (msgid or msgid_plural) matches
     MSGID-PATTERN,

   * or if `-T' is given and its translation (msgstr) matches
     MSGSTR-PATTERN.

When more than one selection criterion is specified, the set of
selected messages is the union of the selected messages of each
criterion.

MSGID-PATTERN or MSGSTR-PATTERN syntax:
       [-E | -F] [-e PATTERN | -f FILE]...
PATTERNs are basic regular expressions by default, or extended regular
expressions if -E is given, or fixed strings if -F is given.

`-N SOURCEFILE'
`--location=SOURCEFILE'
     Select messages extracted from SOURCEFILE.

`-M DOMAINNAME'
`--domain=DOMAINNAME'
     Select messages belonging to domain DOMAINNAME.

`-K'
`--msgid'
     Start of patterns for the msgid.

`-T'
`--msgstr'
     Start of patterns for the msgstr.

`-E'
`--extended-regexp'
     Specify that PATTERN is an extended regular expression.

`-F'
`--fixed-strings'
     Specify that PATTERN is a set of newline-separated strings.

`-e PATTERN'
`--regexp=PATTERN'
     Use PATTERN as a regular expression.

`-f FILE'
`--file=FILE'
     Obtain PATTERN from FILE.

`-i'
`--ignore-case'
     Ignore case distinctions.

Output details
--------------

`--force-po'
     Always write an output file even if it contains no message.

`--indent'
     Write the .po file using indented style.

`--no-location'
     Do not write `#: FILENAME:LINE' lines.

`--add-location'
     Generate `#: FILENAME:LINE' lines (default).

`--strict'
     Write out a strict Uniforum conforming PO file.  Note that this
     Uniforum format should be avoided because it doesn't support the
     GNU extensions.

`-w NUMBER'
`--width=NUMBER'
     Set the output page width.  Long strings in the output files will
     be split across multiple lines in order to ensure that each line's
     width (= number of screen columns) is less than or equal to the
     given NUMBER.

`--sort-output'
     Generate sorted output.  Note that using this option makes it much
     harder for the translator to understand each message's context.

`--sort-by-file'
     Sort output by file location.

Informative output
------------------

`-h'
`--help'
     Display this help and exit.

`-V'
`--version'
     Output version information and exit.


File: gettext.info,  Node: msgexec Invocation,  Next: msguniq Invocation,  Prev: msggrep Invocation,  Up: Manipulating

Invoking the `msgexec' Program
==============================

     msgexec [OPTION] FILTER [FILTER-OPTION]

The `msgexec' program applies a filter to all translations of a
translation catalog.

Input file location
-------------------

`-i INPUTFILE'
`--input=INPUTFILE'
     Input PO file.

`-D DIRECTORY'
`--directory=DIRECTORY'
     Add DIRECTORY to the list of directories.  Source files are
     searched relative to this list of directories.  The resulting `.po'
     file will be written relative to the current directory, though.

If no INPUTFILE is given or if it is `-', standard input is read.

Output file location
--------------------

`-o FILE'
`--output-file=FILE'
     Write output to specified file.

The results are written to standard output if no output file is
specified or if it is `-'.

The filter
----------

The FILTER can be any program that reads a translation from standard
input and writes a modified translation to standard output.  A
frequently used filter is `sed'.
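
A filter does not have to be `sed'.  As a minimal sketch, the following
Python script could serve as a filter that adapts German translations
to Swiss orthography by replacing the sharp s.  The function name
`to_swiss_orthography' is hypothetical, and the script assumes that the
locale's encoding matches the encoding of the catalog.

```python
import sys


def to_swiss_orthography(text: str) -> str:
    """Replace the German sharp s, which Swiss orthography does not use."""
    return text.replace("ß", "ss")


def main() -> None:
    # Read one translation from standard input and write the modified
    # translation to standard output, as a filter is expected to do.
    # Assumption: the locale's encoding matches the catalog's encoding.
    sys.stdout.write(to_swiss_orthography(sys.stdin.read()))


if __name__ == "__main__":
    main()
```

Saved under a hypothetical name such as `swiss.py', the script would be
passed as the FILTER, for example
`msgexec -i de.po -o de_CH.po python3 swiss.py', where the file names
are placeholders.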

Useful FILTER-OPTIONs when the FILTER is `sed'
----------------------------------------------

`-e SCRIPT'
`--expression=SCRIPT'
     Add SCRIPT to the commands to be executed.

`-f SCRIPTFILE'
`--file=SCRIPTFILE'
     Add the contents of SCRIPTFILE to the commands to be executed.

`-n'
`--quiet'
`--silent'
     Suppress automatic printing of pattern space.

Output details
--------------

`--force-po'
     Always write an output file even if it contains no message.

`--indent'
     Write the .po file using indented style.

`--no-location'
     Do not write `#: FILENAME:LINE' lines.

`--add-location'
     Generate `#: FILENAME:LINE' lines (default).

`--strict'
     Write out a strict Uniforum conforming PO file.  Note that this
     Uniforum format should be avoided because it doesn't support the
     GNU extensions.

`-w NUMBER'
`--width=NUMBER'
     Set the output page width.  Long strings in the output files will
     be split across multiple lines in order to ensure that each line's
     width (= number of screen columns) is less than or equal to the
     given NUMBER.

`-s'
`--sort-output'
     Generate sorted output.  Note that using this option makes it much
     harder for the translator to understand each message's context.

`-F'
`--sort-by-file'
     Sort output by file location.

Informative output
------------------

`-h'
`--help'
     Display this help and exit.

`-V'
`--version'
     Output version information and exit.


File: gettext.info,  Node: msguniq Invocation,  Next: msgcomm Invocation,  Prev: msgexec Invocation,  Up: Manipulating

Invoking the `msguniq' Program
==============================

     msguniq [OPTION] [INPUTFILE]

The `msguniq' program unifies duplicate translations in a translation
catalog.  It finds duplicate translations of the same message ID.  Such
duplicates are invalid input for other programs like `msgfmt',
`msgmerge' or `msgcat'.  By default, duplicates are merged together.
When using the `--repeated' option, only duplicates are output, and all
other messages are discarded.
Comments and extracted comments will be cumulated, except that if
`--use-first' is specified, they will be taken from the first
translation.  File positions will be cumulated.  When using the
`--unique' option, duplicates are discarded.

Input file location
-------------------

`INPUTFILE'
     Input PO file.

`-D DIRECTORY'
`--directory=DIRECTORY'
     Add DIRECTORY to the list of directories.  Source files are
     searched relative to this list of directories.  The resulting `.po'
     file will be written relative to the current directory, though.

If no INPUTFILE is given or if it is `-', standard input is read.

Output file location
--------------------

`-o FILE'
`--output-file=FILE'
     Write output to specified file.

The results are written to standard output if no output file is
specified or if it is `-'.

Message selection
-----------------

`-d'
`--repeated'
     Print only duplicates.

`-u'
`--unique'
     Print only unique messages, discard duplicates.

Output details
--------------

`-t'
`--to-code=NAME'
     Specify encoding for output.

`--use-first'
     Use first available translation for each message.  Don't merge
     several translations into one.

`--force-po'
     Always write an output file even if it contains no message.

`-i'
`--indent'
     Write the .po file using indented style.

`--no-location'
     Do not write `#: FILENAME:LINE' lines.

`-n'
`--add-location'
     Generate `#: FILENAME:LINE' lines (default).

`--strict'
     Write out a strict Uniforum conforming PO file.  Note that this
     Uniforum format should be avoided because it doesn't support the
     GNU extensions.

`-w NUMBER'
`--width=NUMBER'
     Set the output page width.  Long strings in the output files will
     be split across multiple lines in order to ensure that each line's
     width (= number of screen columns) is less than or equal to the
     given NUMBER.

`-s'
`--sort-output'
     Generate sorted output.  Note that using this option makes it much
     harder for the translator to understand each message's context.

`-F'
`--sort-by-file'
     Sort output by file location.

Informative output
------------------

`-h'
`--help'
     Display this help and exit.

`-V'
`--version'
     Output version information and exit.


File: gettext.info,  Node: msgcomm Invocation,  Next: msgcmp Invocation,  Prev: msguniq Invocation,  Up: Manipulating

Invoking the `msgcomm' Program
==============================

     msgcomm [OPTION] [INPUTFILE]...

The `msgcomm' program finds messages which are common to two or more
of the specified PO files.  By using the `--more-than' option, greater
commonality may be requested before messages are printed.  Conversely,
the `--less-than' option may be used to specify less commonality before
messages are printed (for example, `--less-than=2' will only print the
unique messages).  Translations, comments and extracted comments will
be preserved, but only from the first PO file to define them.  File
positions from all PO files will be cumulated.

Input file location
-------------------

`INPUTFILE ...'
     Input files.

`-f FILE'
`--files-from=FILE'
     Read the names of the input files from FILE instead of getting
     them from the command line.

`-D DIRECTORY'
`--directory=DIRECTORY'
     Add DIRECTORY to the list of directories.  Source files are
     searched relative to this list of directories.  The resulting `.po'
     file will be written relative to the current directory, though.

If INPUTFILE is `-', standard input is read.

Output file location
--------------------

`-o FILE'
`--output-file=FILE'
     Write output to specified file.

The results are written to standard output if no output file is
specified or if it is `-'.

Message selection
-----------------

`-< NUMBER'
`--less-than=NUMBER'
     Print messages with less than NUMBER definitions, defaults to
     infinite if not set.

`-> NUMBER'
`--more-than=NUMBER'
     Print messages with more than NUMBER definitions, defaults to 1 if
     not set.

`-u'
`--unique'
     Shorthand for `--less-than=2'.  Requests that only unique messages
     be printed.

Output details
--------------

`--force-po'
     Always write an output file even if it contains no message.

`-i'
`--indent'
     Write the .po file using indented style.

`--no-location'
     Do not write `#: FILENAME:LINE' lines.

`-n'
`--add-location'
     Generate `#: FILENAME:LINE' lines (default).

`--strict'
     Write out a strict Uniforum conforming PO file.  Note that this
     Uniforum format should be avoided because it doesn't support the
     GNU extensions.

`-w NUMBER'
`--width=NUMBER'
     Set the output page width.  Long strings in the output files will
     be split across multiple lines in order to ensure that each line's
     width (= number of screen columns) is less than or equal to the
     given NUMBER.

`-s'
`--sort-output'
     Generate sorted output.  Note that using this option makes it much
     harder for the translator to understand each message's context.

`-F'
`--sort-by-file'
     Sort output by file location.

`--omit-header'
     Don't write header with `msgid ""' entry.

Informative output
------------------

`-h'
`--help'
     Display this help and exit.

`-V'
`--version'
     Output version information and exit.


File: gettext.info,  Node: msgcmp Invocation,  Next: msgattrib Invocation,  Prev: msgcomm Invocation,  Up: Manipulating

Invoking the `msgcmp' Program
=============================

     msgcmp [OPTION] DEF.po REF.pot

The `msgcmp' program compares two Uniforum style .po files to check
that both contain the same set of msgid strings.  The DEF.po file is an
existing PO file with the translations.  The REF.pot file is the last
created PO file, or a PO Template file (generally created by
`xgettext').  This is useful for checking that you have translated each
and every message in your program.  Where an exact match cannot be
found, fuzzy matching is used to produce better diagnostics.

Input file location
-------------------

`DEF.po'
     Translations.

`REF.pot'
     References to the sources.

`-D DIRECTORY'
`--directory=DIRECTORY'
     Add DIRECTORY to the list of directories.  Source files are
     searched relative to this list of directories.

Operation modifiers
-------------------

`-m'
`--multi-domain'
     Apply REF.pot to each of the domains in DEF.po.

Informative output
------------------

`-h'
`--help'
     Display this help and exit.

`-V'
`--version'
     Output version information and exit.


File: gettext.info,  Node: msgattrib Invocation,  Next: msgen Invocation,  Prev: msgcmp Invocation,  Up: Manipulating

Invoking the `msgattrib' Program
================================

     msgattrib [OPTION] [INPUTFILE]

The `msgattrib' program filters the messages of a translation catalog
according to their attributes, and manipulates the attributes.

Input file location
-------------------

`INPUTFILE'
     Input PO file.

`-D DIRECTORY'
`--directory=DIRECTORY'
     Add DIRECTORY to the list of directories.  Source files are
     searched relative to this list of directories.  The resulting `.po'
     file will be written relative to the current directory, though.

If no INPUTFILE is given or if it is `-', standard input is read.

Output file location
--------------------

`-o FILE'
`--output-file=FILE'
     Write output to specified file.

The results are written to standard output if no output file is
specified or if it is `-'.

Message selection
-----------------

`--translated'
     Keep translated messages, remove untranslated messages.

`--untranslated'
     Keep untranslated messages, remove translated messages.

`--no-fuzzy'
     Remove `fuzzy' marked messages.

`--only-fuzzy'
     Keep `fuzzy' marked messages, remove all other messages.

`--no-obsolete'
     Remove obsolete #~ messages.

`--only-obsolete'
     Keep obsolete #~ messages, remove all other messages.

Attribute manipulation
----------------------

Attributes are modified after the message selection/removal has been
performed.

`--set-fuzzy'
     Set all messages `fuzzy'.

`--clear-fuzzy'
     Set all messages non-`fuzzy'.

`--set-obsolete'
     Set all messages obsolete.

`--clear-obsolete'
     Set all messages non-obsolete.

`--fuzzy'
     Synonym for `--only-fuzzy --clear-fuzzy': It keeps only the fuzzy
     messages and removes their `fuzzy' mark.

`--obsolete'
     Synonym for `--only-obsolete --clear-obsolete': It keeps only the
     obsolete messages and makes them non-obsolete.

Output details
--------------

`--force-po'
     Always write an output file even if it contains no message.

`-i'
`--indent'
     Write the .po file using indented style.

`--no-location'
     Do not write `#: FILENAME:LINE' lines.

`-n'
`--add-location'
     Generate `#: FILENAME:LINE' lines (default).

`--strict'
     Write out a strict Uniforum conforming PO file.  Note that this
     Uniforum format should be avoided because it doesn't support the
     GNU extensions.

`-w NUMBER'
`--width=NUMBER'
     Set the output page width.  Long strings in the output files will
     be split across multiple lines in order to ensure that each line's
     width (= number of screen columns) is less than or equal to the
     given NUMBER.

`-s'
`--sort-output'
     Generate sorted output.  Note that using this option makes it much
     harder for the translator to understand each message's context.

`-F'
`--sort-by-file'
     Sort output by file location.

Informative output
------------------

`-h'
`--help'
     Display this help and exit.

`-V'
`--version'
     Output version information and exit.


File: gettext.info,  Node: msgen Invocation,  Prev: msgattrib Invocation,  Up: Manipulating

Invoking the `msgen' Program
============================

     msgen [OPTION] INPUTFILE

The `msgen' program creates an English translation catalog.  The input
file is the last created English PO file, or a PO Template file
(generally created by `xgettext').  Untranslated entries are assigned a
translation that is identical to the msgid, and are marked fuzzy.

Input file location
-------------------

`INPUTFILE'
     Input PO or POT file.

`-D DIRECTORY'
`--directory=DIRECTORY'
     Add DIRECTORY to the list of directories.  Source files are
     searched relative to this list of directories.  The resulting `.po'
     file will be written relative to the current directory, though.

If INPUTFILE is `-', standard input is read.

Output file location
--------------------

`-o FILE'
`--output-file=FILE'
     Write output to specified file.

The results are written to standard output if no output file is
specified or if it is `-'.

Output details
--------------

`--force-po'
     Always write an output file even if it contains no message.

`-i'
`--indent'
     Write the .po file using indented style.

`--no-location'
     Do not write `#: FILENAME:LINE' lines.

`--add-location'
     Generate `#: FILENAME:LINE' lines (default).

`--strict'
     Write out a strict Uniforum conforming PO file.  Note that this
     Uniforum format should be avoided because it doesn't support the
     GNU extensions.

`-w NUMBER'
`--width=NUMBER'
     Set the output page width.  Long strings in the output files will
     be split across multiple lines in order to ensure that each line's
     width (= number of screen columns) is less than or equal to the
     given NUMBER.

`-s'
`--sort-output'
     Generate sorted output.  Note that using this option makes it much
     harder for the translator to understand each message's context.

`-F'
`--sort-by-file'
     Sort output by file location.

Informative output
------------------

`-h'
`--help'
     Display this help and exit.

`-V'
`--version'
     Output version information and exit.


File: gettext.info,  Node: Binaries,  Next: Users,  Prev: Manipulating,  Up: Top

Producing Binary MO Files
*************************

* Menu:

* msgfmt Invocation::           Invoking the `msgfmt' Program
* msgunfmt Invocation::         Invoking the `msgunfmt' Program
* MO Files::                    The Format of GNU MO Files


File: gettext.info,  Node: msgfmt Invocation,  Next: msgunfmt Invocation,  Prev: Binaries,  Up: Binaries

Invoking the `msgfmt' Program
=============================

     msgfmt [OPTION] FILENAME.po ...

The `msgfmt' program generates a binary message catalog from a textual
translation description.

Input file location
-------------------

`FILENAME.po ...'

`-D DIRECTORY'
`--directory=DIRECTORY'
     Add DIRECTORY to the list of directories.  Source files are
     searched relative to this list of directories.
     The resulting `.po' file will be written relative to the current
     directory, though.

If an input file is `-', standard input is read.

Operation mode
--------------

`-j'
`--java'
     Java mode: generate a Java `ResourceBundle' class.

`--java2'
     Like `--java', and assume Java2 (JDK 1.2 or higher).

Output file location
--------------------

`-o FILE'
`--output-file=FILE'
     Write output to specified file.

`--strict'
     Direct the program to work strictly following the Uniforum/Sun
     implementation.  Currently this only affects the naming of the
     output file.  If this option is not given the name of the output
     file is the same as the domain name.  If the strict Uniforum mode
     is enabled the suffix `.mo' is added to the file name if it is not
     already present.

     We find this behaviour of Sun's implementation rather silly and so
     by default this mode is _not_ selected.

If the output FILE is `-', output is written to standard output.

Output file location in Java mode
---------------------------------

`-r RESOURCE'
`--resource=RESOURCE'
     Specify the resource name.

`-l LOCALE'
`--locale=LOCALE'
     Specify the locale name, either a language specification of the
     form LL or a combined language and country specification of the
     form LL_CC.

`-d DIRECTORY'
     Specify the base directory of classes directory hierarchy.

The class name is determined by appending the locale name to the
resource name, separated with an underscore.  The `-d' option is
mandatory.  The class is written under the specified directory.

Input file interpretation
-------------------------

`-c'
`--check'
     Perform all the checks implied by `--check-format',
     `--check-header', `--check-domain'.

`--check-format'
     Check language dependent format strings.

     If the string represents a format string used in a `printf'-like
     function, both strings should have the same number of `%' format
     specifiers, with matching types.  If the flag `c-format' or
     `possible-c-format' appears in the special comment <#,> for this
     entry, a check is performed.
     For example, the check will diagnose using `%.*s' against `%s', or
     `%d' against `%s', or `%d' against `%x'.  It can even handle
     positional parameters.

     Normally the `xgettext' program automatically decides whether a
     string is a format string or not.  This algorithm is not perfect,
     though.  It might regard a string as a format string though it is
     not used in a `printf'-like function and so `msgfmt' might report
     errors where there are none.

     To solve this problem the programmer can dictate the decision to
     the `xgettext' program (*note c-format::).  The translator should
     not consider removing the flag from the <#,> line.  This "fix"
     would be reversed again as soon as `msgmerge' is called the next
     time.

`--check-header'
     Verify presence and contents of the header entry.  *Note Header
     Entry::, for a description of the various fields in the header
     entry.

`--check-domain'
     Check for conflicts between domain directives and the
     `--output-file' option.

`-C'
`--check-compatibility'
     Check that GNU msgfmt behaves like X/Open msgfmt.  This will give
     an error when attempting to use the GNU extensions.

`--check-accelerators[=CHAR]'
     Check presence of keyboard accelerators for menu items.  This is
     based on the convention used in some GUIs that a keyboard
     accelerator in a menu item string is designated by an immediately
     preceding `&' character.  Sometimes a keyboard accelerator is also
     called "keyboard mnemonic".  This check verifies that if the
     untranslated string has exactly one `&' character, the translated
     string has exactly one `&' as well.  If this option is given with
     a CHAR argument, this CHAR should be a non-alphanumeric character
     and is used as keyboard accelerator mark instead of `&'.

`-f'
`--use-fuzzy'
     Use fuzzy entries in output.  Note that using this option is
     usually wrong, because fuzzy messages are exactly those which have
     not been validated by a human translator.

Output details
--------------

`-a NUMBER'
`--alignment=NUMBER'
     Align strings to NUMBER bytes (default: 1).

`--no-hash'
     Don't include a hash table in the binary file.  Lookup will be
     more expensive at run time (binary search instead of hash table
     lookup).

Informative output
------------------

`-h'
`--help'
     Display this help and exit.

`-V'
`--version'
     Output version information and exit.

`--statistics'
     Print statistics about translations.

`-v'
`--verbose'
     Increase verbosity level.


File: gettext.info,  Node: msgunfmt Invocation,  Next: MO Files,  Prev: msgfmt Invocation,  Up: Binaries

Invoking the `msgunfmt' Program
===============================

     msgunfmt [OPTION] [FILE]...

The `msgunfmt' program converts a binary message catalog to a Uniforum
style .po file.

Operation mode
--------------

`-j'
`--java'
     Java mode: input is a Java `ResourceBundle' class.

Input file location
-------------------

`FILE ...'
     Input .mo files.

If no input FILE is given or if it is `-', standard input is read.

Input file location in Java mode
--------------------------------

`-r RESOURCE'
`--resource=RESOURCE'
     Specify the resource name.

`-l LOCALE'
`--locale=LOCALE'
     Specify the locale name, either a language specification of the
     form LL or a combined language and country specification of the
     form LL_CC.

The class name is determined by appending the locale name to the
resource name, separated with an underscore.  The class is located
using the `CLASSPATH'.

Output file location
--------------------

`-o FILE'
`--output-file=FILE'
     Write output to specified file.

The results are written to standard output if no output file is
specified or if it is `-'.

Output details
--------------

`--force-po'
     Always write an output file even if it contains no message.

`-i'
`--indent'
     Write the .po file using indented style.

`--strict'
     Write out a strict Uniforum conforming PO file.  Note that this
     Uniforum format should be avoided because it doesn't support the
     GNU extensions.

`-w NUMBER'
`--width=NUMBER'
     Set the output page width.
     Long strings in the output files will be split across multiple
     lines in order to ensure that each line's width (= number of
     screen columns) is less than or equal to the given NUMBER.

`-s'
`--sort-output'
     Generate sorted output.  Note that using this option makes it
     much harder for the translator to understand each message's
     context.

Informative output
------------------

`-h'
`--help'
     Display this help and exit.

`-V'
`--version'
     Output version information and exit.

`-v'
`--verbose'
     Increase verbosity level.


File: gettext.info,  Node: MO Files,  Prev: msgunfmt Invocation,  Up: Binaries

The Format of GNU MO Files
==========================

   The format of the generated MO files is best described by a picture,
which appears below.

   The first two words serve to identify the file.  The magic number
will always signal GNU MO files.  The number is stored in the byte
order of the generating machine, so the magic number really is two
numbers: `0x950412de' and `0xde120495'.  The second word describes the
current revision of the file format.  For now the revision is 0.  This
might change in future versions, and ensures that the readers of MO
files can distinguish new formats from old ones, so that both can be
handled correctly.  The version is kept separate from the magic
number, instead of using different magic numbers for different
formats, mainly because `/etc/magic' is not updated often.  It might
be better to have magic separated from internal format version
identification.

   A number of pointers to later tables in the file follow, allowing
for the extension of the prefix part of MO files without having to
recompile programs reading them.  This might become useful for later
inserting a few flag bits, an indication of the charset used, new
tables, or other things.

   Then, at offset O and offset T in the picture, two tables of string
descriptors can be found.
In both tables, each string descriptor uses two 32-bit integers, one
for the string length, another for the offset of the string in the MO
file, counting in bytes from the start of the file.  The first table
contains descriptors for the original strings, and is sorted so the
original strings are in increasing lexicographical order.  The second
table contains descriptors for the translated strings, and is parallel
to the first table: to find the corresponding translation one has to
access the array slot in the second array with the same index.

   Having the original strings sorted enables the use of simple binary
search, for when the MO file does not contain a hashing table, or for
when it is not practical to use the hashing table provided in the MO
file.  This also has another advantage: GNU `gettext' usually
_translates_ the empty string in a PO file into some system
information attached to that particular MO file, and the empty string
necessarily becomes the first in both the original and translated
tables, making the system information very easy to find.

   The size S of the hash table can be zero.  In this case, the hash
table itself is not contained in the MO file.  Some people might
prefer this because a precomputed hashing table takes disk space, and
does not win _that_ much speed.  The hash table contains indices to
the sorted array of strings in the MO file.  Conflict resolution is
done by double hashing.  The precise hashing algorithm used is fairly
dependent on GNU `gettext' code, and is not documented here.

   As for the strings themselves, they follow the hash table, and each
is terminated with a <NUL>, and this <NUL> is not counted in the
length which appears in the string descriptor.  The `msgfmt' program
has an option selecting the alignment for MO file strings.  With this
option, each string is separately aligned so it starts at an offset
which is a multiple of the alignment value.  On some RISC machines, a
correct alignment will speed things up.
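
   The header words and the two descriptor tables described above can
be read with a handful of fixed-size unpack operations.  The following
Python sketch is illustrative only and is not part of GNU `gettext':
the helper names `build_mo', `read_header', `string_at' and `lookup'
are hypothetical, the sample catalog is invented, and the hash table
is ignored entirely (S is written as 0), so every lookup uses the
binary search over the sorted original strings.

```python
import struct

MAGIC = 0x950412DE          # value seen when file and reader share a byte order
MAGIC_SWAPPED = 0xDE120495  # value seen when the byte orders differ


def build_mo(catalog):
    """Build a tiny little-endian MO image without a hash table (S == 0)."""
    # Original strings must appear in increasing lexicographical order.
    items = sorted((k.encode("utf-8"), v.encode("utf-8"))
                   for k, v in catalog.items())
    n = len(items)
    o_off = 28                   # seven 32-bit header words come first
    t_off = o_off + 8 * n        # one 8-byte descriptor per original string
    strings_off = t_off + 8 * n  # strings follow the (absent) hash table
    o_table, t_table, blob = [], [], b""
    for column, table in ((0, o_table), (1, t_table)):
        for item in items:
            s = item[column]
            # Descriptor: length (NUL not counted), offset from file start.
            table.append(struct.pack("<II", len(s), strings_off + len(blob)))
            blob += s + b"\0"
    header = struct.pack("<7I", MAGIC, 0, n, o_off, t_off, 0, strings_off)
    return header + b"".join(o_table) + b"".join(t_table) + blob


def read_header(buf):
    """Return (byte order, N, O, T), using the magic number to pick the order."""
    magic = struct.unpack_from("<I", buf, 0)[0]
    if magic == MAGIC:
        order = "<"
    elif magic == MAGIC_SWAPPED:
        order = ">"
    else:
        raise ValueError("not a GNU MO file")
    _revision, n, o_off, t_off, _s, _h = struct.unpack_from(order + "6I", buf, 4)
    return order, n, o_off, t_off


def string_at(buf, order, table_off, index):
    """Fetch the string named by descriptor number `index` of one table."""
    length, offset = struct.unpack_from(order + "II", buf, table_off + 8 * index)
    return buf[offset:offset + length]


def lookup(buf, msgid):
    """Binary search in table O; the translation sits at the same index in T."""
    order, n, o_off, t_off = read_header(buf)
    key = msgid.encode("utf-8")
    lo, hi = 0, n
    while lo < hi:
        mid = (lo + hi) // 2
        current = string_at(buf, order, o_off, mid)
        if current == key:
            return string_at(buf, order, t_off, mid).decode("utf-8")
        if current < key:
            lo = mid + 1
        else:
            hi = mid
    return msgid  # no entry found: fall back to the original string


if __name__ == "__main__":
    mo = build_mo({"": "Project-Id-Version: demo\n",
                   "hello": "hallo", "world": "Welt"})
    print(lookup(mo, "hello"))
```

   Because the empty string sorts first, `lookup(mo, "")' in this
sketch returns the system information entry, which is the property the
text above points out.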

   Plural forms are stored by letting the plural of the original
string follow the singular of the original string, separated through a
<NUL> byte.  The length which appears in the string descriptor
includes both.  However, only the singular of the original string
takes part in the hash table lookup.  The plural variants of the
translation are all stored consecutively, separated through a <NUL>
byte.  Here also, the length in the string descriptor includes all of
them.

   Nothing prevents a MO file from having embedded <NUL>s in strings.
However, the program interface currently used already presumes that
strings are <NUL> terminated, so embedded <NUL>s are somewhat useless.
But the MO file format is general enough so other interfaces would be
later possible, if for example, we ever want to implement wide
characters right in MO files, where <NUL> bytes may accidentally
appear.  (No, we don't want to have wide characters in MO files.  They
would make the file unnecessarily large, and the `wchar_t' type being
platform dependent, MO files would be platform dependent as well.)

   This particular issue has been strongly debated in the GNU `gettext'
development forum, and it is to be expected that the MO file format
will evolve or change over time.  It is even possible that many
formats may later be supported concurrently.  But surely, we have to
start somewhere, and the MO file format described here is a good
start.  Nothing is cast in concrete, and the format may later evolve
fairly easily, so we should feel comfortable with the current
approach.

             byte
                  +------------------------------------------+
               0  | magic number = 0x950412de                |
                  |                                          |
               4  | file format revision = 0                 |
                  |                                          |
               8  | number of strings                        |  == N
                  |                                          |
              12  | offset of table with original strings    |  == O
                  |                                          |
              16  | offset of table with translation strings |  == T
                  |                                          |
              20  | size of hashing table                    |  == S
                  |                                          |
              24  | offset of hashing table                  |  == H
                  |                                          |
                  .                                          .
                  .    (possibly more entries later)         .
                  .                                          .
                  |                                          |
               O  | length & offset 0th string  ----------------.
           O + 8  | length & offset 1st string  ------------------.
                   ...                                    ...   | |
     O + ((N-1)*8)| length & offset (N-1)th string           |  | |
                  |                                          |  | |
               T  | length & offset 0th translation  ---------------.
           T + 8  | length & offset 1st translation  -----------------.
                   ...                                    ...   | | | |
     T + ((N-1)*8)| length & offset (N-1)th translation      |  | | | |
                  |                                          |  | | | |
               H  | start hash table                         |  | | | |
                   ...                                    ...   | | | |
       H + S * 4  | end hash table                           |  | | | |
                  |                                          |  | | | |
                  | NUL terminated 0th string  <----------------' | | |
                  |                                          |    | | |
                  | NUL terminated 1st string  <------------------' | |
                  |                                          |      | |
                   ...                                    ...       | |
                  |                                          |      | |
                  | NUL terminated 0th translation  <---------------' |
                  |                                          |        |
                  | NUL terminated 1st translation  <-----------------'
                  |                                          |
                   ...                                    ...
                  |                                          |
                  +------------------------------------------+


File: gettext.info,  Node: Users,  Next: Programmers,  Prev: Binaries,  Up: Top

The User's View
***************

   When GNU `gettext' has truly reached its goal, average users should
feel some kind of astonished pleasure, seeing the effect of that
strange kind of magic that just makes their own native language appear
everywhere on their screens.  As for naive users, they would ideally
have no special pleasure about it, merely taking their own language
for _granted_, and becoming rather unhappy otherwise.

   So, let's try to describe here how we would like the magic to
operate, as we want the users' view to be the simplest, among all ways
one could look at GNU `gettext'.  All other software engineers
(programmers, translators, maintainers) should work together in such a
way that the magic becomes possible.  This is a long and progressive
undertaking, and information is available about the progress of the
Translation Project.

   When a package is distributed, there are two kinds of users:
"installers" who fetch the distribution, unpack it, configure it,
compile it and install it for themselves or others to use; and "end
users" who call programs of the package, once these have been
installed at their site.  GNU `gettext' is offering magic for both
installers and end users.

* Menu:

* Matrix::                      The Current `ABOUT-NLS' Matrix
* Installers::                  Magic for Installers
* End Users::                   Magic for End Users


File: gettext.info,  Node: Matrix,  Next: Installers,  Prev: Users,  Up: Users

The Current `ABOUT-NLS' Matrix
==============================

   Languages are not equally supported in all packages using GNU
`gettext'.  To know if some package uses GNU `gettext', one may check
the distribution for the `ABOUT-NLS' information file, for some
`LL.po' files, often kept together in some `po/' directory, or for an
`intl/' directory.  Internationalized packages usually have many
`LL.po' files, where LL represents the language.  *Note End Users::
for a complete description of the format for LL.

   More generally, a matrix is available for showing the current state
of the Translation Project, listing which packages are prepared for
multi-lingual messages, and which languages are supported by each.
Because this information changes often, this matrix is not kept within
this GNU `gettext' manual.  This information is often found in the
file `ABOUT-NLS' from various distributions, but a copy found there is
only as recent as the distribution itself.  A recent copy of this
`ABOUT-NLS' file, containing up-to-date information, should generally
be found on the Translation Project sites, and also on most GNU
archive sites.


File: gettext.info,  Node: Installers,  Next: End Users,  Prev: Matrix,  Up: Users

Magic for Installers
====================

   By default, packages fully using GNU `gettext', internally, are
installed in such a way that they allow translation of messages.  At
_configuration_ time, those packages should automatically detect
whether the underlying host system already provides the GNU `gettext'
functions.  If not, the GNU `gettext' library should be automatically
prepared and used.  Installers may use special options at
configuration time for changing this behavior.
The command `./configure --with-included-gettext' bypasses system
`gettext' to use the included GNU `gettext' instead, while
`./configure --disable-nls' produces programs totally unable to
translate messages.

   Internationalized packages usually have many `LL.po' files.  Unless
translations are disabled, all those available are installed together
with the package.  However, the environment variable `LINGUAS' may be
set, prior to configuration, to limit the installed set.  `LINGUAS'
should then contain a space-separated list of two-letter codes,
stating which languages are allowed.


File: gettext.info,  Node: End Users,  Prev: Installers,  Up: Users

Magic for End Users
===================

   We consider here those packages using GNU `gettext' internally, and
for which the installers did not disable translation at _configure_
time.  Then, users only have to set the `LANG' environment variable to
the appropriate `LL_CC' combination prior to using the programs in the
package.  *Note Matrix::.  For example, let's presume a German site.
At the shell prompt, users merely have to execute `setenv LANG de_DE'
(in `csh') or `export LANG; LANG=de_DE' (in `sh').  They could even do
this from their `.login' or `.profile' file.


File: gettext.info,  Node: Programmers,  Next: Translators,  Prev: Users,  Up: Top

The Programmer's View
*********************

   One aim of the current message catalog implementation provided by
GNU `gettext' was to use the system's message catalog handling, if the
installer wishes to do so.  So we perhaps should first take a look at
the solutions we know about.  The people in the POSIX committee did
not manage to agree on one of the semi-official standards which we'll
describe below.  In fact they couldn't agree on anything, so they
decided only to include an example of an interface.  The major Unix
vendors are split between the two most important specifications:
X/Open's catgets vs. Uniforum's gettext interface.
We'll describe them both and later explain our solution to this
dilemma.

* Menu:

* catgets::                     About `catgets'
* gettext::                     About `gettext'
* Comparison::                  Comparing the two interfaces
* Using libintl.a::             Using libintl.a in own programs
* gettext grok::                Being a `gettext' grok
* Temp Programmers::            Temporary Notes for the Programmers Chapter