This is gettext.info, produced by makeinfo version 4.0 from
gettext.texi.

INFO-DIR-SECTION GNU Gettext Utilities
START-INFO-DIR-ENTRY
* Gettext: (gettext).                           GNU gettext utilities.
* gettextize: (gettext)gettextize Invocation.   Prepare a package for gettext.
* msgfmt: (gettext)msgfmt Invocation.           Make MO files out of PO files.
* msgmerge: (gettext)msgmerge Invocation.       Update two PO files into one.
* xgettext: (gettext)xgettext Invocation.       Extract strings into a PO file.
END-INFO-DIR-ENTRY

   This file provides documentation for GNU `gettext' utilities.  It
also serves as a reference for the free Translation Project.

   Copyright (C) 1995, 1996, 1997, 1998, 2001 Free Software Foundation,
Inc.

   Permission is granted to make and distribute verbatim copies of this
manual provided the copyright notice and this permission notice are
preserved on all copies.

   Permission is granted to copy and distribute modified versions of
this manual under the conditions for verbatim copying, provided that
the entire resulting derived work is distributed under the terms of a
permission notice identical to this one.

   Permission is granted to copy and distribute translations of this
manual into another language, under the above conditions for modified
versions, except that this permission notice may be stated in a
translation approved by the Foundation.


File: gettext.info,  Node: Manipulating,  Next: Binaries,  Prev: Updating,  Up: Top

Manipulating PO Files
*********************

   Sometimes it is necessary to manipulate PO files in a way that is
better performed automatically than by hand.  GNU `gettext' includes a
complete set of tools for this purpose.

   When merging two packages into a single package, the resulting POT
file will be the concatenation of the two packages' POT files.  Thus the
maintainer must concatenate the two existing package translations into
a single translation catalog, for each language.  This is best performed
using `msgcat'.  It is then the translators' duty to deal with any
possible conflicts that arose during the merge.

   When a translator takes over the translation job from another
translator, but she uses a different character encoding in her locale,
she will need to convert the catalog to her character encoding.  This
is best done through the `msgconv' program.

   When a maintainer takes a source file with tagged messages from
another package, he should also take the existing translations for this
source file (and not let the translators do the same job twice).  One
way to do this is through `msggrep', another is to create a POT file for
that source file and use `msgmerge'.

   When a translator wants to adjust some translation catalog for a
special dialect or orthography (for example, German as written in
Switzerland versus German as written in Germany), she needs to apply
some text processing to every message in the catalog.  The tool for
doing this is `msgexec'.

   Another use of `msgexec' is to produce approximately the POT file for
which a given PO file was made.  This can be done through a filter
command like `msgexec sed -e d | sed -e '/^# /d''.  Note that the
original POT file may have had different comments and different plural
message counts; that is why it is better to use the original POT file
if available.

   When third party tools create PO or POT files, sometimes duplicates
cannot be avoided.  But the GNU `gettext' tools give an error when they
encounter duplicate msgids in the same file and in the same domain.  To
merge duplicates, the `msguniq' program can be used.

   `msgcomm' is a more general tool for keeping or throwing away
duplicates occurring in different files.

   `msgcmp' can be used to check whether a translation catalog is
completely translated.

   `msgattrib' can be used to select and extract only the fuzzy or
untranslated messages of a translation catalog.

   `msgen' is useful as a first step for preparing English translation
catalogs.  It copies each message's msgid to its msgstr.

* Menu:

* msgcat Invocation::           Invoking the `msgcat' Program
* msgconv Invocation::          Invoking the `msgconv' Program
* msggrep Invocation::          Invoking the `msggrep' Program
* msgexec Invocation::          Invoking the `msgexec' Program
* msguniq Invocation::          Invoking the `msguniq' Program
* msgcomm Invocation::          Invoking the `msgcomm' Program
* msgcmp Invocation::           Invoking the `msgcmp' Program
* msgattrib Invocation::        Invoking the `msgattrib' Program
* msgen Invocation::            Invoking the `msgen' Program


File: gettext.info,  Node: msgcat Invocation,  Next: msgconv Invocation,  Prev: Manipulating,  Up: Manipulating

Invoking the `msgcat' Program
=============================

     msgcat [OPTION] [INPUTFILE]...

   The `msgcat' program concatenates and merges the specified PO files.
It finds messages which are common to two or more of the specified PO
files.  By using the `--more-than' option, greater commonality may be
requested before messages are printed.  Conversely, the `--less-than'
option may be used to specify less commonality before messages are
printed (i.e.  `--less-than=2' will only print the unique messages).
Translations, comments and extracted comments will be cumulated, except
that if `--use-first' is specified, they will be taken from the first
PO file to define them.  File positions from all PO files will be
cumulated.

Input file location
-------------------

`INPUTFILE ...'
     Input files.

`-f FILE'
`--files-from=FILE'
     Read the names of the input files from FILE instead of getting
     them from the command line.

`-D DIRECTORY'
`--directory=DIRECTORY'
     Add DIRECTORY to the list of directories.  Source files are
     searched relative to this list of directories.  The resulting `.po'
     file will be written relative to the current directory, though.

   If INPUTFILE is `-', standard input is read.

Output file location
--------------------

`-o FILE'
`--output-file=FILE'
     Write output to specified file.

   The results are written to standard output if no output file is
specified or if it is `-'.

Message selection
-----------------

`-< NUMBER'
`--less-than=NUMBER'
     Print messages with less than NUMBER definitions, defaults to
     infinite if not set.

`-> NUMBER'
`--more-than=NUMBER'
     Print messages with more than NUMBER definitions, defaults to 0 if
     not set.

`-u'
`--unique'
     Shorthand for `--less-than=2'.  Requests that only unique messages
     be printed.
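
   The selection rule behind `--more-than', `--less-than' and
`--unique' can be modelled in a few lines.  The following Python sketch
is only an illustration, not the code of `msgcat': it assumes that each
catalog is a plain dictionary mapping msgid to msgstr, and that the
number of "definitions" of a message is the number of input catalogs
that contain it.  The function name and the sample data are
hypothetical.

```python
from collections import Counter


def select_by_definition_count(catalogs, more_than=0, less_than=None):
    """Return the msgids defined in more than `more_than` and fewer
    than `less_than` catalogs.  less_than=None models "infinite"."""
    counts = Counter()
    for catalog in catalogs:
        # Count each msgid once per catalog that defines it.
        counts.update(set(catalog))
    selected = []
    for msgid, n in counts.items():
        if n > more_than and (less_than is None or n < less_than):
            selected.append(msgid)
    return sorted(selected)


# Hypothetical sample catalogs (msgid -> msgstr).
first = {"Open": "Oeffnen", "Close": "Schliessen"}
second = {"Open": "Oeffnen", "Quit": "Beenden"}

everything = select_by_definition_count([first, second])
unique_only = select_by_definition_count([first, second], less_than=2)
common_only = select_by_definition_count([first, second], more_than=1)
```

   With the defaults every message is selected; `less_than=2'
corresponds to `--unique', and `more_than=1' keeps only the messages
that occur in both catalogs.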

Output details
--------------

`-t NAME'
`--to-code=NAME'
     Specify encoding for output.

`--use-first'
     Use first available translation for each message.  Don't merge
     several translations into one.

`--force-po'
     Always write an output file even if it contains no message.

`-i'
`--indent'
     Write the .po file using indented style.

`--no-location'
     Do not write `#: FILENAME:LINE' lines.

`-n'
`--add-location'
     Generate `#: FILENAME:LINE' lines (default).

`--strict'
     Write out a strict Uniforum conforming PO file.  Note that this
     Uniforum format should be avoided because it doesn't support the
     GNU extensions.

`-w NUMBER'
`--width=NUMBER'
     Set the output page width.  Long strings in the output files will
     be split across multiple lines in order to ensure that each line's
     width (= number of screen columns) is less than or equal to the given
     NUMBER.

`-s'
`--sort-output'
     Generate sorted output.  Note that using this option makes it much
     harder for the translator to understand each message's context.

`-F'
`--sort-by-file'
     Sort output by file location.

Informative output
------------------

`-h'
`--help'
     Display this help and exit.

`-V'
`--version'
     Output version information and exit.


File: gettext.info,  Node: msgconv Invocation,  Next: msggrep Invocation,  Prev: msgcat Invocation,  Up: Manipulating

Invoking the `msgconv' Program
==============================

     msgconv [OPTION] [INPUTFILE]

   The `msgconv' program converts a translation catalog to a different
character encoding.

Input file location
-------------------

`INPUTFILE'
     Input PO file.

`-D DIRECTORY'
`--directory=DIRECTORY'
     Add DIRECTORY to the list of directories.  Source files are
     searched relative to this list of directories.  The resulting `.po'
     file will be written relative to the current directory, though.

   If no INPUTFILE is given or if it is `-', standard input is read.

Output file location
--------------------

`-o FILE'
`--output-file=FILE'
     Write output to specified file.

   The results are written to standard output if no output file is
specified or if it is `-'.

Conversion target
-----------------

`-t NAME'
`--to-code=NAME'
     Specify encoding for output.

   The default encoding is the current locale's encoding.
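
   Conceptually, the conversion decodes the catalog text from its
current encoding and encodes it again in the target encoding.  The
Python sketch below illustrates only this idea, including the fallback
to the locale's encoding; it is not how `msgconv' is implemented, and
the function name and the explicit FROM_CODE parameter are assumptions
made for the example.

```python
import locale


def convert_encoding(data, from_code, to_code=None):
    """Re-encode the bytes `data` from `from_code` to `to_code`.
    When `to_code` is omitted, the current locale's encoding is used,
    mirroring the default described above."""
    if to_code is None:
        to_code = locale.getpreferredencoding(False)
    return data.decode(from_code).encode(to_code)


# A translation stored as ISO-8859-1 bytes, converted to UTF-8.
latin1_bytes = "Gr\u00fc\u00dfe".encode("iso-8859-1")
utf8_bytes = convert_encoding(latin1_bytes, "iso-8859-1", "utf-8")
```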

Output details
--------------

`--force-po'
     Always write an output file even if it contains no message.

`-i'
`--indent'
     Write the .po file using indented style.

`--no-location'
     Do not write `#: FILENAME:LINE' lines.

`--add-location'
     Generate `#: FILENAME:LINE' lines (default).

`--strict'
     Write out a strict Uniforum conforming PO file.  Note that this
     Uniforum format should be avoided because it doesn't support the
     GNU extensions.

`-w NUMBER'
`--width=NUMBER'
     Set the output page width.  Long strings in the output files will
     be split across multiple lines in order to ensure that each line's
     width (= number of screen columns) is less than or equal to the given
     NUMBER.

`-s'
`--sort-output'
     Generate sorted output.  Note that using this option makes it much
     harder for the translator to understand each message's context.

`-F'
`--sort-by-file'
     Sort output by file location.

Informative output
------------------

`-h'
`--help'
     Display this help and exit.

`-V'
`--version'
     Output version information and exit.


File: gettext.info,  Node: msggrep Invocation,  Next: msgexec Invocation,  Prev: msgconv Invocation,  Up: Manipulating

Invoking the `msggrep' Program
==============================

     msggrep [OPTION] [INPUTFILE]

   The `msggrep' program extracts all messages of a translation catalog
that match a given pattern or belong to some given source files.

Input file location
-------------------

`INPUTFILE'
     Input PO file.

`-D DIRECTORY'
`--directory=DIRECTORY'
     Add DIRECTORY to the list of directories.  Source files are
     searched relative to this list of directories.  The resulting `.po'
     file will be written relative to the current directory, though.

   If no INPUTFILE is given or if it is `-', standard input is read.

Output file location
--------------------

`-o FILE'
`--output-file=FILE'
     Write output to specified file.

   The results are written to standard output if no output file is
specified or if it is `-'.

Message selection
-----------------

       [-N SOURCEFILE]... [-M DOMAINNAME]... [-K MSGID-PATTERN] [-T MSGSTR-PATTERN]

   A message is selected if
   * it comes from one of the specified source files,

   * or if it comes from one of the specified domains,

   * or if `-K' is given and its key (msgid or msgid_plural) matches
     MSGID-PATTERN,

   * or if `-T' is given and its translation (msgstr) matches
     MSGSTR-PATTERN.

   When more than one selection criterion is specified, the set of
selected messages is the union of the selected messages of each
criterion.
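
   The union rule can be pictured with a short Python sketch.  It is a
model of the selection logic only, under several assumptions: messages
are represented as dictionaries with hypothetical keys `msgid',
`msgstr', `sources' and `domain', and Python's `re' syntax stands in
for the basic regular expressions that `msggrep' uses by default.

```python
import re


def select_messages(messages, source_files=(), domains=(),
                    msgid_pattern=None, msgstr_pattern=None):
    """Keep a message as soon as any one criterion matches (union)."""
    selected = []
    for msg in messages:
        from_source = any(s in source_files for s in msg["sources"])
        from_domain = msg["domain"] in domains
        key_match = (msgid_pattern is not None
                     and re.search(msgid_pattern, msg["msgid"]) is not None)
        str_match = (msgstr_pattern is not None
                     and re.search(msgstr_pattern, msg["msgstr"]) is not None)
        if from_source or from_domain or key_match or str_match:
            selected.append(msg)
    return selected


# Hypothetical sample catalog.
messages = [
    {"msgid": "Open file", "msgstr": "Datei oeffnen",
     "sources": ["io.c"], "domain": "messages"},
    {"msgid": "Quit", "msgstr": "Beenden",
     "sources": ["main.c"], "domain": "messages"},
    {"msgid": "Help", "msgstr": "Hilfe",
     "sources": ["ui.c"], "domain": "messages"},
]

picked = select_messages(messages, source_files=["io.c"],
                         msgstr_pattern="^Be")
picked_ids = [m["msgid"] for m in picked]
```

   Here the first message is kept because of its source file and the
second because its translation matches the pattern; the third matches
neither criterion and is dropped.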

   MSGID-PATTERN or MSGSTR-PATTERN syntax:
       [-E | -F] [-e PATTERN | -f FILE]...
   PATTERNs are basic regular expressions by default, or extended
regular expressions if -E is given, or fixed strings if -F is given.

`-N SOURCEFILE'
`--location=SOURCEFILE'
     Select messages extracted from SOURCEFILE.

`-M DOMAINNAME'
`--domain=DOMAINNAME'
     Select messages belonging to domain DOMAINNAME.

`-K'
`--msgid'
     Start of patterns for the msgid.

`-T'
`--msgstr'
     Start of patterns for the msgstr.

`-E'
`--extended-regexp'
     Specify that PATTERN is an extended regular expression.

`-F'
`--fixed-strings'
     Specify that PATTERN is a set of newline-separated strings.

`-e PATTERN'
`--regexp=PATTERN'
     Use PATTERN as a regular expression.

`-f FILE'
`--file=FILE'
     Obtain PATTERN from FILE.

`-i'
`--ignore-case'
     Ignore case distinctions.

Output details
--------------

`--force-po'
     Always write an output file even if it contains no message.

`--indent'
     Write the .po file using indented style.

`--no-location'
     Do not write `#: FILENAME:LINE' lines.

`--add-location'
     Generate `#: FILENAME:LINE' lines (default).

`--strict'
     Write out a strict Uniforum conforming PO file.  Note that this
     Uniforum format should be avoided because it doesn't support the
     GNU extensions.

`-w NUMBER'
`--width=NUMBER'
     Set the output page width.  Long strings in the output files will
     be split across multiple lines in order to ensure that each line's
     width (= number of screen columns) is less than or equal to the given
     NUMBER.

`--sort-output'
     Generate sorted output.  Note that using this option makes it much
     harder for the translator to understand each message's context.

`--sort-by-file'
     Sort output by file location.

Informative output
------------------

`-h'
`--help'
     Display this help and exit.

`-V'
`--version'
     Output version information and exit.


File: gettext.info,  Node: msgexec Invocation,  Next: msguniq Invocation,  Prev: msggrep Invocation,  Up: Manipulating

Invoking the `msgexec' Program
==============================

     msgexec [OPTION] FILTER [FILTER-OPTION]

   The `msgexec' program applies a filter to all translations of a
translation catalog.

Input file location
-------------------

`-i INPUTFILE'
`--input=INPUTFILE'
     Input PO file.

`-D DIRECTORY'
`--directory=DIRECTORY'
     Add DIRECTORY to the list of directories.  Source files are
     searched relative to this list of directories.  The resulting `.po'
     file will be written relative to the current directory, though.

   If no INPUTFILE is given or if it is `-', standard input is read.

Output file location
--------------------

`-o FILE'
`--output-file=FILE'
     Write output to specified file.

   The results are written to standard output if no output file is
specified or if it is `-'.

The filter
----------

   The FILTER can be any program that reads a translation from standard
input and writes a modified translation to standard output.  A
frequently used filter is `sed'.
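
   The filter contract (one translation in on standard input, the
modified translation out on standard output) can be sketched in Python
as follows.  This only illustrates the contract and is not the code of
`msgexec'; the function name is hypothetical, and a small Python
one-liner is used as the filter program so that the example does not
depend on `sed' being installed.

```python
import subprocess
import sys


def apply_filter(translations, filter_command):
    """Run `filter_command` once per translation, feeding the text on
    standard input and collecting what the filter writes to standard
    output."""
    results = []
    for text in translations:
        completed = subprocess.run(filter_command, input=text,
                                   capture_output=True, text=True,
                                   check=True)
        results.append(completed.stdout)
    return results


# Stand-in filter: upper-cases whatever it reads from standard input.
UPPERCASE_FILTER = [
    sys.executable, "-c",
    "import sys; sys.stdout.write(sys.stdin.read().upper())",
]

filtered = apply_filter(["Datei", "Hilfe"], UPPERCASE_FILTER)
```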

Useful FILTER-OPTIONs when the FILTER is `sed'
----------------------------------------------

`-e SCRIPT'
`--expression=SCRIPT'
     Add SCRIPT to the commands to be executed.

`-f SCRIPTFILE'
`--file=SCRIPTFILE'
     Add the contents of SCRIPTFILE to the commands to be executed.

`-n'
`--quiet'
`--silent'
     Suppress automatic printing of pattern space.

Output details
--------------

`--force-po'
     Always write an output file even if it contains no message.

`--indent'
     Write the .po file using indented style.

`--no-location'
     Do not write `#: FILENAME:LINE' lines.

`--add-location'
     Generate `#: FILENAME:LINE' lines (default).

`--strict'
     Write out a strict Uniforum conforming PO file.  Note that this
     Uniforum format should be avoided because it doesn't support the
     GNU extensions.

`-w NUMBER'
`--width=NUMBER'
     Set the output page width.  Long strings in the output files will
     be split across multiple lines in order to ensure that each line's
     width (= number of screen columns) is less than or equal to the given
     NUMBER.

`-s'
`--sort-output'
     Generate sorted output.  Note that using this option makes it much
     harder for the translator to understand each message's context.

`-F'
`--sort-by-file'
     Sort output by file location.

Informative output
------------------

`-h'
`--help'
     Display this help and exit.

`-V'
`--version'
     Output version information and exit.


File: gettext.info,  Node: msguniq Invocation,  Next: msgcomm Invocation,  Prev: msgexec Invocation,  Up: Manipulating

Invoking the `msguniq' Program
==============================

     msguniq [OPTION] [INPUTFILE]

   The `msguniq' program unifies duplicate translations in a translation
catalog.  It finds duplicate translations of the same message ID.  Such
duplicates are invalid input for other programs like `msgfmt',
`msgmerge' or `msgcat'.  By default, duplicates are merged together.
When using the `--repeated' option, only duplicates are output, and all
other messages are discarded.  Comments and extracted comments will be
cumulated, except that if `--use-first' is specified, they will be
taken from the first translation.  File positions will be cumulated.
When using the `--unique' option, duplicates are discarded.
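
   The three modes described above can be modelled as follows.  The
Python sketch is an illustration under simplifying assumptions, not the
implementation of `msguniq': a catalog is taken to be a list of
(msgid, msgstr) pairs, merging is represented by collecting all
translations of a msgid in one list, and the function and mode names
are hypothetical.

```python
def unify(messages, mode="merge"):
    """Group translations by msgid.

    mode="merge"    models the default: every msgid appears once.
    mode="repeated" models --repeated: keep only duplicated msgids.
    mode="unique"   models --unique: keep only non-duplicated msgids.
    """
    grouped = {}
    for msgid, msgstr in messages:
        grouped.setdefault(msgid, []).append(msgstr)
    if mode == "repeated":
        return {k: v for k, v in grouped.items() if len(v) > 1}
    if mode == "unique":
        return {k: v for k, v in grouped.items() if len(v) == 1}
    return grouped


# Hypothetical catalog with one duplicated msgid.
catalog = [("Open", "Oeffnen"), ("Quit", "Beenden"), ("Open", "Aufmachen")]

merged = unify(catalog)
repeated = unify(catalog, "repeated")
unique = unify(catalog, "unique")
```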

Input file location
-------------------

`INPUTFILE'
     Input PO file.

`-D DIRECTORY'
`--directory=DIRECTORY'
     Add DIRECTORY to the list of directories.  Source files are
     searched relative to this list of directories.  The resulting `.po'
     file will be written relative to the current directory, though.

   If no INPUTFILE is given or if it is `-', standard input is read.

Output file location
--------------------

`-o FILE'
`--output-file=FILE'
     Write output to specified file.

   The results are written to standard output if no output file is
specified or if it is `-'.

Message selection
-----------------

`-d'
`--repeated'
     Print only duplicates.

`-u'
`--unique'
     Print only unique messages, discard duplicates.

Output details
--------------

`-t NAME'
`--to-code=NAME'
     Specify encoding for output.

`--use-first'
     Use first available translation for each message.  Don't merge
     several translations into one.

`--force-po'
     Always write an output file even if it contains no message.

`-i'
`--indent'
     Write the .po file using indented style.

`--no-location'
     Do not write `#: FILENAME:LINE' lines.

`-n'
`--add-location'
     Generate `#: FILENAME:LINE' lines (default).

`--strict'
     Write out a strict Uniforum conforming PO file.  Note that this
     Uniforum format should be avoided because it doesn't support the
     GNU extensions.

`-w NUMBER'
`--width=NUMBER'
     Set the output page width.  Long strings in the output files will
     be split across multiple lines in order to ensure that each line's
     width (= number of screen columns) is less than or equal to the given
     NUMBER.

`-s'
`--sort-output'
     Generate sorted output.  Note that using this option makes it much
     harder for the translator to understand each message's context.

`-F'
`--sort-by-file'
     Sort output by file location.

Informative output
------------------

`-h'
`--help'
     Display this help and exit.

`-V'
`--version'
     Output version information and exit.


File: gettext.info,  Node: msgcomm Invocation,  Next: msgcmp Invocation,  Prev: msguniq Invocation,  Up: Manipulating

Invoking the `msgcomm' Program
==============================

     msgcomm [OPTION] [INPUTFILE]...

   The `msgcomm' program finds messages which are common to two or more
of the specified PO files.  By using the `--more-than' option, greater
commonality may be requested before messages are printed.  Conversely,
the `--less-than' option may be used to specify less commonality before
messages are printed (i.e.  `--less-than=2' will only print the unique
messages).  Translations, comments and extracted comments will be
preserved, but only from the first PO file to define them.  File
positions from all PO files will be cumulated.

Input file location
-------------------

`INPUTFILE ...'
     Input files.

`-f FILE'
`--files-from=FILE'
     Read the names of the input files from FILE instead of getting
     them from the command line.

`-D DIRECTORY'
`--directory=DIRECTORY'
     Add DIRECTORY to the list of directories.  Source files are
     searched relative to this list of directories.  The resulting `.po'
     file will be written relative to the current directory, though.

   If INPUTFILE is `-', standard input is read.

Output file location
--------------------

`-o FILE'
`--output-file=FILE'
     Write output to specified file.

   The results are written to standard output if no output file is
specified or if it is `-'.

Message selection
-----------------

`-< NUMBER'
`--less-than=NUMBER'
     Print messages with less than NUMBER definitions, defaults to
     infinite if not set.

`-> NUMBER'
`--more-than=NUMBER'
     Print messages with more than NUMBER definitions, defaults to 1 if
     not set.

`-u'
`--unique'
     Shorthand for `--less-than=2'.  Requests that only unique messages
     be printed.

Output details
--------------

`--force-po'
     Always write an output file even if it contains no message.

`-i'
`--indent'
     Write the .po file using indented style.

`--no-location'
     Do not write `#: FILENAME:LINE' lines.

`-n'
`--add-location'
     Generate `#: FILENAME:LINE' lines (default).

`--strict'
     Write out a strict Uniforum conforming PO file.  Note that this
     Uniforum format should be avoided because it doesn't support the
     GNU extensions.

`-w NUMBER'
`--width=NUMBER'
     Set the output page width.  Long strings in the output files will
     be split across multiple lines in order to ensure that each line's
     width (= number of screen columns) is less than or equal to the given
     NUMBER.

`-s'
`--sort-output'
     Generate sorted output.  Note that using this option makes it much
     harder for the translator to understand each message's context.

`-F'
`--sort-by-file'
     Sort output by file location.

`--omit-header'
     Don't write header with `msgid ""' entry.

Informative output
------------------

`-h'
`--help'
     Display this help and exit.

`-V'
`--version'
     Output version information and exit.


File: gettext.info,  Node: msgcmp Invocation,  Next: msgattrib Invocation,  Prev: msgcomm Invocation,  Up: Manipulating

Invoking the `msgcmp' Program
=============================

     msgcmp [OPTION] DEF.po REF.pot

   The `msgcmp' program compares two Uniforum style .po files to check
that both contain the same set of msgid strings.  The DEF.po file is an
existing PO file with the translations.  The REF.pot file is the last
created PO file, or a PO Template file (generally created by
`xgettext').  This is useful for checking that you have translated each
and every message in your program.  Where an exact match cannot be
found, fuzzy matching is used to produce better diagnostics.
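
   At its core the comparison is a check on two sets of msgid strings.
The Python sketch below models just that core, assuming DEF.po is
available as a dictionary from msgid to msgstr and REF.pot as a list of
msgids; it does not attempt the fuzzy matching mentioned above, and the
function name is hypothetical.

```python
def compare_catalogs(def_catalog, ref_msgids):
    """Return (missing, unused): msgids the reference needs but the
    translation file lacks or leaves empty, and msgids the translation
    file carries but the reference no longer mentions."""
    missing = [m for m in ref_msgids if not def_catalog.get(m)]
    wanted = set(ref_msgids)
    unused = [m for m in def_catalog if m not in wanted]
    return missing, unused


# Hypothetical data: "Quit" is not translated, "Old" is not referenced.
missing, unused = compare_catalogs(
    {"Open": "Oeffnen", "Old": "Alt"},
    ["Open", "Quit"],
)
```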

Input file location
-------------------

`DEF.po'
     Translations.

`REF.pot'
     References to the sources.

`-D DIRECTORY'
`--directory=DIRECTORY'
     Add DIRECTORY to the list of directories.  Source files are
     searched relative to this list of directories.

Operation modifiers
-------------------

`-m'
`--multi-domain'
     Apply REF.pot to each of the domains in DEF.po.

Informative output
------------------

`-h'
`--help'
     Display this help and exit.

`-V'
`--version'
     Output version information and exit.


File: gettext.info,  Node: msgattrib Invocation,  Next: msgen Invocation,  Prev: msgcmp Invocation,  Up: Manipulating

Invoking the `msgattrib' Program
================================

     msgattrib [OPTION] [INPUTFILE]

   The `msgattrib' program filters the messages of a translation catalog
according to their attributes, and manipulates the attributes.

Input file location
-------------------

`INPUTFILE'
     Input PO file.

`-D DIRECTORY'
`--directory=DIRECTORY'
     Add DIRECTORY to the list of directories.  Source files are
     searched relative to this list of directories.  The resulting `.po'
     file will be written relative to the current directory, though.

   If no INPUTFILE is given or if it is `-', standard input is read.

Output file location
--------------------

`-o FILE'
`--output-file=FILE'
     Write output to specified file.

   The results are written to standard output if no output file is
specified or if it is `-'.

Message selection
-----------------

`--translated'
     Keep translated messages, remove untranslated messages.

`--untranslated'
     Keep untranslated messages, remove translated messages.

`--no-fuzzy'
     Remove `fuzzy' marked messages.

`--only-fuzzy'
     Keep `fuzzy' marked messages, remove all other messages.

`--no-obsolete'
     Remove obsolete #~ messages.

`--only-obsolete'
     Keep obsolete #~ messages, remove all other messages.

Attribute manipulation
----------------------

   Attributes are modified after the message selection/removal has been
performed.

`--set-fuzzy'
     Set all messages `fuzzy'.

`--clear-fuzzy'
     Set all messages non-`fuzzy'.

`--set-obsolete'
     Set all messages obsolete.

`--clear-obsolete'
     Set all messages non-obsolete.

`--fuzzy'
     Synonym for `--only-fuzzy --clear-fuzzy': It keeps only the fuzzy
     messages and removes their `fuzzy' mark.

`--obsolete'
     Synonym for `--only-obsolete --clear-obsolete': It keeps only the
     obsolete messages and makes them non-obsolete.
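
   The order of operations (selection first, attribute manipulation
second) is what makes `--fuzzy' equivalent to `--only-fuzzy
--clear-fuzzy'.  The Python sketch below models this for the fuzzy
attribute only.  It is an illustration, not the code of `msgattrib';
messages are assumed to be dictionaries with a boolean `fuzzy' key, and
the function names are hypothetical.

```python
def filter_and_modify(messages, only_fuzzy=False, no_fuzzy=False,
                      set_fuzzy=False, clear_fuzzy=False):
    # Step 1: message selection/removal.
    kept = []
    for msg in messages:
        if only_fuzzy and not msg["fuzzy"]:
            continue
        if no_fuzzy and msg["fuzzy"]:
            continue
        kept.append(dict(msg))  # copy, so the input is left untouched
    # Step 2: attribute manipulation on the surviving messages.
    for msg in kept:
        if set_fuzzy:
            msg["fuzzy"] = True
        if clear_fuzzy:
            msg["fuzzy"] = False
    return kept


def fuzzy_shorthand(messages):
    """Model of --fuzzy: keep only fuzzy messages, then clear the mark."""
    return filter_and_modify(messages, only_fuzzy=True, clear_fuzzy=True)


# Hypothetical sample catalog.
catalog = [
    {"msgid": "Open", "msgstr": "Oeffnen", "fuzzy": True},
    {"msgid": "Quit", "msgstr": "Beenden", "fuzzy": False},
]
result = fuzzy_shorthand(catalog)
```

   If the manipulation were applied before the selection, clearing the
mark first would leave `--only-fuzzy' with nothing to keep.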

Output details
--------------

`--force-po'
     Always write an output file even if it contains no message.

`-i'
`--indent'
     Write the .po file using indented style.

`--no-location'
     Do not write `#: FILENAME:LINE' lines.

`-n'
`--add-location'
     Generate `#: FILENAME:LINE' lines (default).

`--strict'
     Write out a strict Uniforum conforming PO file.  Note that this
     Uniforum format should be avoided because it doesn't support the
     GNU extensions.

`-w NUMBER'
`--width=NUMBER'
     Set the output page width.  Long strings in the output files will
     be split across multiple lines in order to ensure that each line's
     width (= number of screen columns) is less than or equal to the given
     NUMBER.

`-s'
`--sort-output'
     Generate sorted output.  Note that using this option makes it much
     harder for the translator to understand each message's context.

`-F'
`--sort-by-file'
     Sort output by file location.

Informative output
------------------

`-h'
`--help'
     Display this help and exit.

`-V'
`--version'
     Output version information and exit.


File: gettext.info,  Node: msgen Invocation,  Prev: msgattrib Invocation,  Up: Manipulating

Invoking the `msgen' Program
============================

     msgen [OPTION] INPUTFILE

   The `msgen' program creates an English translation catalog.  The
input file is the last created English PO file, or a PO Template file
(generally created by `xgettext').  Untranslated entries are assigned a
translation that is identical to the msgid, and are marked fuzzy.
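
   The rule just stated can be sketched in Python.  This is a model of
the described behaviour rather than the code of `msgen'; messages are
assumed to be dictionaries with hypothetical keys `msgid', `msgstr' and
`fuzzy', and an empty msgstr is taken to mean "untranslated".

```python
def make_english_catalog(messages):
    """Fill every untranslated entry with its own msgid and mark it
    fuzzy; entries that already have a translation are left alone."""
    result = []
    for msg in messages:
        entry = dict(msg)  # copy, so the input is left untouched
        if not entry["msgstr"]:
            entry["msgstr"] = entry["msgid"]
            entry["fuzzy"] = True
        result.append(entry)
    return result


# Hypothetical template with one entry already translated.
template = [
    {"msgid": "Open", "msgstr": "", "fuzzy": False},
    {"msgid": "Colour", "msgstr": "Color", "fuzzy": False},
]
english = make_english_catalog(template)
```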

Input file location
-------------------

`INPUTFILE'
     Input PO or POT file.

`-D DIRECTORY'
`--directory=DIRECTORY'
     Add DIRECTORY to the list of directories.  Source files are
     searched relative to this list of directories.  The resulting `.po'
     file will be written relative to the current directory, though.

   If INPUTFILE is `-', standard input is read.

Output file location
--------------------

`-o FILE'
`--output-file=FILE'
     Write output to specified file.

   The results are written to standard output if no output file is
specified or if it is `-'.

Output details
--------------

`--force-po'
     Always write an output file even if it contains no message.

`-i'
`--indent'
     Write the .po file using indented style.

`--no-location'
     Do not write `#: FILENAME:LINE' lines.

`--add-location'
     Generate `#: FILENAME:LINE' lines (default).

`--strict'
     Write out a strict Uniforum conforming PO file.  Note that this
     Uniforum format should be avoided because it doesn't support the
     GNU extensions.

`-w NUMBER'
`--width=NUMBER'
     Set the output page width.  Long strings in the output files will
     be split across multiple lines in order to ensure that each line's
     width (= number of screen columns) is less than or equal to the given
     NUMBER.

`-s'
`--sort-output'
     Generate sorted output.  Note that using this option makes it much
     harder for the translator to understand each message's context.

`-F'
`--sort-by-file'
     Sort output by file location.

Informative output
------------------

`-h'
`--help'
     Display this help and exit.

`-V'
`--version'
     Output version information and exit.


File: gettext.info,  Node: Binaries,  Next: Users,  Prev: Manipulating,  Up: Top

Producing Binary MO Files
*************************

* Menu:

* msgfmt Invocation::           Invoking the `msgfmt' Program
* msgunfmt Invocation::         Invoking the `msgunfmt' Program
* MO Files::                    The Format of GNU MO Files


File: gettext.info,  Node: msgfmt Invocation,  Next: msgunfmt Invocation,  Prev: Binaries,  Up: Binaries

Invoking the `msgfmt' Program
=============================

     msgfmt [OPTION] FILENAME.po ...

   The `msgfmt' program generates a binary message catalog from a
textual translation description.

Input file location
-------------------

`FILENAME.po ...'

`-D DIRECTORY'
`--directory=DIRECTORY'
     Add DIRECTORY to the list of directories.  Source files are
     searched relative to this list of directories.  The resulting
     binary file will be written relative to the current directory,
     though.

   If an input file is `-', standard input is read.

Operation mode
--------------

`-j'
`--java'
     Java mode: generate a Java `ResourceBundle' class.

`--java2'
     Like `--java', and assume Java2 (JDK 1.2 or higher).

Output file location
--------------------

`-o FILE'
`--output-file=FILE'
     Write output to specified file.

`--strict'
     Direct the program to work strictly following the Uniforum/Sun
     implementation.  Currently this only affects the naming of the
     output file.  If this option is not given the name of the output
     file is the same as the domain name.  If the strict Uniforum mode
     is enabled the suffix `.mo' is added to the file name if it is not
     already present.

     We find this behaviour of Sun's implementation rather silly and so
     by default this mode is _not_ selected.

   If the output FILE is `-', output is written to standard output.
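
   The naming rule described under `--strict' can be written down as a
small Python function.  It restates the text above and nothing more;
the function name is hypothetical and this is not how `msgfmt' itself
is written.

```python
def output_file_name(domain_name, strict=False):
    """Derive the output file name from the domain name.  In strict
    Uniforum mode the suffix ".mo" is added unless already present."""
    if strict and not domain_name.endswith(".mo"):
        return domain_name + ".mo"
    return domain_name


default_name = output_file_name("hello")
strict_name = output_file_name("hello", strict=True)
already_suffixed = output_file_name("hello.mo", strict=True)
```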

Output file location in Java mode
---------------------------------

`-r RESOURCE'
`--resource=RESOURCE'
     Specify the resource name.

`-l LOCALE'
`--locale=LOCALE'
     Specify the locale name, either a language specification of the
     form LL or a combined language and country specification of the
     form LL_CC.

`-d DIRECTORY'
     Specify the base directory of the classes directory hierarchy.

   The class name is determined by appending the locale name to the
resource name, separated with an underscore.  The `-d' option is
mandatory.  The class is written under the specified directory.
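
   As a one-line illustration of the class naming rule, here is a
Python sketch.  The function name and the sample resource name are
hypothetical; where the class file is placed below the `-d' directory
is not modelled.

```python
def java_class_name(resource, locale):
    """Append the locale name to the resource name, separated with an
    underscore, as described above."""
    return resource + "_" + locale


language_only = java_class_name("Messages", "de")
language_and_country = java_class_name("Messages", "de_CH")
```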

Input file interpretation
-------------------------

`-c'
`--check'
     Perform all the checks implied by `--check-format',
     `--check-header', `--check-domain'.

`--check-format'
     Check language dependent format strings.

     If the string represents a format string used in a `printf'-like
     function both strings should have the same number of `%' format
     specifiers, with matching types.  If the flag `c-format' or
     `possible-c-format' appears in the special comment <#,> for this
     entry a check is performed.  For example, the check will diagnose
     using `%.*s' against `%s', or `%d' against `%s', or `%d' against
     `%x'.  It can even handle positional parameters.

     Normally the `xgettext' program automatically decides whether a
     string is a format string or not.  This algorithm is not perfect,
     though.  It might regard a string as a format string though it is
     not used in a `printf'-like function and so `msgfmt' might report
     errors where there are none.

     To solve this problem the programmer can dictate the decision to
     the `xgettext' program (*note c-format::).  The translator should
     not consider removing the flag from the <#,> line.  This "fix"
     would be reversed again as soon as `msgmerge' is called the next
     time.

`--check-header'
     Verify presence and contents of the header entry.  *Note Header
     Entry::, for a description of the various fields in the header
     entry.

`--check-domain'
     Check for conflicts between domain directives and the
     `--output-file' option.

`-C'
`--check-compatibility'
     Check that GNU msgfmt behaves like X/Open msgfmt.  This will give
     an error when attempting to use the GNU extensions.

`--check-accelerators[=CHAR]'
     Check presence of keyboard accelerators for menu items.  This is
     based on the convention used in some GUIs that a keyboard
     accelerator in a menu item string is designated by an immediately
     preceding `&' character.  Sometimes a keyboard accelerator is also
     called "keyboard mnemonic".  This check verifies that if the
     untranslated string has exactly one `&' character, the translated
     string has exactly one `&' as well.  If this option is given with
     a CHAR argument, this CHAR should be a non-alphanumeric character
     and is used as the keyboard accelerator mark instead of `&'.

`-f'
`--use-fuzzy'
     Use fuzzy entries in output.  Note that using this option is
     usually wrong, because fuzzy messages are exactly those which have
     not been validated by a human translator.
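
   The rules stated for `--check-format' and `--check-accelerators'
can be sketched as below.  This is a simplified illustration, not the
code `msgfmt' uses: the function names are hypothetical, the regular
expression covers only common `printf' directives, and a `*' width or
precision is ignored in directives that use a positional parameter.

```python
import re

# Simplified printf directive: optional "N$" position, flags, width,
# precision, length modifier, conversion.  Not a full C format parser.
_DIRECTIVE = re.compile(
    r"%(?:(\d+)\$)?[-+ #0]*(\*|\d+)?(?:\.(\*|\d+)?)?"
    r"(hh|h|ll|l|L|z|j|t)?([diouxXeEfgGcsp%])"
)


def directive_types(text):
    """List the argument types a format string consumes, in argument order."""
    sequential, numbered = [], {}
    for match in _DIRECTIVE.finditer(text):
        position, width, precision, length, conv = match.groups()
        if conv == "%":
            continue  # "%%" prints a percent sign and consumes no argument
        kind = (length or "") + conv
        if position:
            numbered[int(position)] = kind
        else:
            # A "*" width or precision consumes an extra argument.
            sequential += ["*"] * [width, precision].count("*")
            sequential.append(kind)
    return [numbered[k] for k in sorted(numbered)] or sequential


def formats_agree(msgid, msgstr):
    """True if both strings consume the same argument types in the same order."""
    return directive_types(msgid) == directive_types(msgstr)


def accelerators_agree(msgid, msgstr, mark="&"):
    """If msgid has exactly one mark, msgstr must have exactly one as well."""
    if msgid.count(mark) == 1:
        return msgstr.count(mark) == 1
    return True


print(formats_agree("%.*s", "%s"))
print(formats_agree("%s has %d files", "%2$d Dateien in %1$s"))
print(accelerators_agree("&File", "Datei"))
```

   Comparing the argument lists rather than the raw directives is what
lets a translation reorder its arguments with positional parameters
and still pass.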

Output details
--------------

`-a NUMBER'
`--alignment=NUMBER'
     Align strings to NUMBER bytes (default: 1).

`--no-hash'
     Don't include a hash table in the binary file.  Lookup will be
     more expensive at run time (binary search instead of hash table
     lookup).

Informative output
------------------

`-h'
`--help'
     Display this help and exit.

`-V'
`--version'
     Output version information and exit.

`--statistics'
     Print statistics about translations.

`-v'
`--verbose'
     Increase verbosity level.


File: gettext.info,  Node: msgunfmt Invocation,  Next: MO Files,  Prev: msgfmt Invocation,  Up: Binaries

Invoking the `msgunfmt' Program
===============================

     msgunfmt [OPTION] [FILE]...

   The `msgunfmt' program converts a binary message catalog to a
Uniforum style .po file.

Operation mode
--------------

`-j'
`--java'
     Java mode: generate a Java `ResourceBundle' class.

Input file location
-------------------

`FILE ...'
     Input .mo files.

   If no input FILE is given or if it is `-', standard input is read.

Input file location in Java mode
--------------------------------

`-r RESOURCE'
`--resource=RESOURCE'
     Specify the resource name.

`-l LOCALE'
`--locale=LOCALE'
     Specify the locale name, either a language specification of the
     form LL or a combined language and country specification of the
     form LL_CC.

   The class name is determined by appending the locale name to the
resource name, separated with an underscore.  The class is located
using the `CLASSPATH'.

Output file location
--------------------

`-o FILE'
`--output-file=FILE'
     Write output to specified file.

   The results are written to standard output if no output file is
specified or if it is `-'.

Output details
--------------

`--force-po'
     Always write an output file even if it contains no message.

`-i'
`--indent'
     Write the .po file using indented style.

`--strict'
     Write out a strict Uniforum conforming PO file.  Note that this
     Uniforum format should be avoided because it doesn't support the
     GNU extensions.

`-w NUMBER'
`--width=NUMBER'
     Set the output page width.  Long strings in the output files will
     be split across multiple lines in order to ensure that each line's
     width (= number of screen columns) is less than or equal to the
     given NUMBER.

`-s'
`--sort-output'
     Generate sorted output.  Note that using this option makes it much
     harder for the translator to understand each message's context.

Informative output
------------------

`-h'
`--help'
     Display this help and exit.

`-V'
`--version'
     Output version information and exit.

`-v'
`--verbose'
     Increase verbosity level.


File: gettext.info,  Node: MO Files,  Prev: msgunfmt Invocation,  Up: Binaries

The Format of GNU MO Files
==========================

   The format of the generated MO files is best described by a picture,
which appears below.

   The first two words serve to identify the file.  The magic
number will always signal GNU MO files.  The number is stored in the
byte order of the generating machine, so the magic number really is two
numbers: `0x950412de' and `0xde120495'.  The second word describes the
current revision of the file format.  For now the revision is 0.  This
might change in future versions, and ensures that the readers of MO
files can distinguish new formats from old ones, so that both can be
handled correctly.  The version is kept separate from the magic number,
instead of using different magic numbers for different formats, mainly
because `/etc/magic' is not updated often.  It might be better to have
magic separated from internal format version identification.
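
   A reader can use the two possible values of the magic number to
discover the byte order of a file.  The following is a minimal sketch
of that idea; the function names are hypothetical and the prefixes
`<' and `>' are those of Python's `struct' module (little-endian and
big-endian respectively).

```python
import struct

MAGIC = 0x950412de


def mo_byte_order(data):
    """Return '<' or '>' depending on how the magic number was stored."""
    if len(data) < 8:
        raise ValueError("too short to be a GNU MO file")
    for prefix in ("<", ">"):
        (magic,) = struct.unpack_from(prefix + "I", data, 0)
        if magic == MAGIC:
            return prefix
    raise ValueError("not a GNU MO file")


def mo_revision(data):
    """Return the second word, the file format revision."""
    prefix = mo_byte_order(data)
    (revision,) = struct.unpack_from(prefix + "I", data, 4)
    return revision


little = struct.pack("<II", MAGIC, 0)
big = struct.pack(">II", MAGIC, 0)
print(mo_byte_order(little), mo_byte_order(big), mo_revision(big))
```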

   Next come a number of pointers to later tables in the file, allowing
for the extension of the prefix part of MO files without having to
recompile programs reading them.  This might become useful for later
inserting a few flag bits, an indication of the charset used, new
tables, or other things.

   Then, at offset O and offset T in the picture, two tables of string
descriptors can be found.  In both tables, each string descriptor uses
two 32-bit integers, one for the string length, another for the offset
of the string in the MO file, counting in bytes from the start of the
file.  The first table contains descriptors for the original strings,
and is sorted so the original strings are in increasing lexicographical
order.  The second table contains descriptors for the translated
strings, and is parallel to the first table: to find the corresponding
translation one has to access the array slot in the second array with
the same index.

   Having the original strings sorted enables the use of simple binary
search, for when the MO file does not contain a hashing table, or for
when it is not practical to use the hashing table provided in the MO
file.  This also has another advantage: the empty string in a PO file
is usually _translated_ by GNU `gettext' into some system information
attached to that particular MO file, and the empty string necessarily
becomes the first in both the original and translated tables, making
the system information very easy to find.
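
   The binary search over the two parallel tables might look like the
sketch below.  It works on two in-memory lists that stand in for the
string tables; the sample strings and the function name `lookup' are
invented for illustration.

```python
from bisect import bisect_left


def lookup(originals, translations, msgid):
    """Binary-search the sorted originals; return the parallel translation."""
    i = bisect_left(originals, msgid)
    if i < len(originals) and originals[i] == msgid:
        return translations[i]
    return None  # no translation stored for this string


# Originals are in increasing lexicographical (byte) order, so the empty
# string, whose "translation" is the system information, comes first.
originals = [b"", b"Cancel", b"Open", b"Save"]
translations = [
    b"Content-Type: text/plain; charset=UTF-8\n",
    b"Annuler",
    b"Ouvrir",
    b"Enregistrer",
]

print(lookup(originals, translations, b"Open"))
print(translations[0])
```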

   The size S of the hash table can be zero.  In this case, the hash
table itself is not contained in the MO file.  Some people might prefer
this because a precomputed hashing table takes disk space, and does not
win _that_ much speed.  The hash table contains indices to the sorted
array of strings in the MO file.  Conflict resolution is done by double
hashing.  The precise hashing algorithm used is fairly dependent on GNU
`gettext' code, and is not documented here.

   As for the strings themselves, they follow the hash table, and each
is terminated with a <NUL>, and this <NUL> is not counted in the length
which appears in the string descriptor.  The `msgfmt' program has an
option selecting the alignment for MO file strings.  With this option,
each string is separately aligned so it starts at an offset which is a
multiple of the alignment value.  On some RISC machines, a correct
alignment will speed things up.
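
   The effect of the alignment value on string offsets can be sketched
as follows.  The helper names are hypothetical, and the layout shown
(strings packed one after another from a given start offset) is an
assumption made for the example.

```python
def align_up(offset, alignment):
    """Round offset up to the next multiple of alignment (alignment >= 1)."""
    return (offset + alignment - 1) // alignment * alignment


def place_strings(strings, start, alignment=1):
    """Return one (offset, length) pair per string, laid out from start."""
    placements = []
    offset = start
    for s in strings:
        offset = align_up(offset, alignment)  # each string aligned separately
        placements.append((offset, len(s)))
        offset += len(s) + 1  # the NUL is stored but not counted in the length
    return placements


print(place_strings([b"ab", b"c"], start=5, alignment=1))
print(place_strings([b"ab", b"c"], start=5, alignment=4))
```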

   Plural forms are stored by letting the plural of the original string
follow the singular of the original string, separated by a <NUL> byte.
The length which appears in the string descriptor includes both.
However, only the singular of the original string takes part in the
hash table lookup.  The plural variants of the translation are all
stored consecutively, separated by a <NUL> byte.  Here also, the
length in the string descriptor includes all of them.
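
   Given the bytes covered by one pair of string descriptors, the
plural layout can be taken apart as in this sketch.  The function name
and the sample strings are made up for the example.

```python
def split_plural_entry(original, translation):
    """Split descriptor-length byte strings into their NUL-separated parts."""
    singular, sep, plural = original.partition(b"\0")
    forms = translation.split(b"\0")
    # Only the singular takes part in the lookup; the plural is None when
    # the original string has no plural part.
    return singular, (plural if sep else None), forms


print(split_plural_entry(b"%d file\0%d files", b"%d Datei\0%d Dateien"))
print(split_plural_entry(b"Save", b"Speichern"))
```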

   Nothing prevents a MO file from having embedded <NUL>s in strings.
However, the program interface currently used already presumes that
strings are <NUL> terminated, so embedded <NUL>s are somewhat useless.
But the MO file format is general enough so other interfaces would be
later possible, if for example, we ever want to implement wide
characters right in MO files, where <NUL> bytes may accidentally appear.
(No, we don't want to have wide characters in MO files.  They would
make the file unnecessarily large, and the `wchar_t' type being
platform dependent, MO files would be platform dependent as well.)

   This particular issue has been strongly debated in the GNU `gettext'
development forum, and it is to be expected that the MO file format
will evolve or change over time.  It is even possible that many formats
may later be supported concurrently.  But surely, we have to start
somewhere, and the MO file format described here is a good start.
Nothing is cast in concrete, and the format may later evolve fairly
easily, so we should feel comfortable with the current approach.

             byte
                  +------------------------------------------+
               0  | magic number = 0x950412de                |
                  |                                          |
               4  | file format revision = 0                 |
                  |                                          |
               8  | number of strings                        |  == N
                  |                                          |
              12  | offset of table with original strings    |  == O
                  |                                          |
              16  | offset of table with translation strings |  == T
                  |                                          |
              20  | size of hashing table                    |  == S
                  |                                          |
              24  | offset of hashing table                  |  == H
                  |                                          |
                  .                                          .
                  .    (possibly more entries later)         .
                  .                                          .
                  |                                          |
               O  | length & offset 0th string  ----------------.
           O + 8  | length & offset 1st string  ------------------.
                   ...                                    ...   | |
     O + ((N-1)*8)| length & offset (N-1)th string           |  | |
                  |                                          |  | |
               T  | length & offset 0th translation  ---------------.
           T + 8  | length & offset 1st translation  -----------------.
                   ...                                    ...   | | | |
     T + ((N-1)*8)| length & offset (N-1)th translation      |  | | | |
                  |                                          |  | | | |
               H  | start hash table                         |  | | | |
                   ...                                    ...   | | | |
       H + S * 4  | end hash table                           |  | | | |
                  |                                          |  | | | |
                  | NUL terminated 0th string  <----------------' | | |
                  |                                          |    | | |
                  | NUL terminated 1st string  <------------------' | |
                  |                                          |      | |
                   ...                                    ...       | |
                  |                                          |      | |
                  | NUL terminated 0th translation  <---------------' |
                  |                                          |        |
                  | NUL terminated 1st translation  <-----------------'
                  |                                          |
                   ...                                    ...
                  |                                          |
                  +------------------------------------------+
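
   The picture can be exercised with a small round trip: write a
catalog in the layout shown, then read it back through the two
descriptor tables.  This is a sketch under stated assumptions, not the
code used by `msgfmt' or `msgunfmt': the function names are
hypothetical, the writer always emits little-endian words, uses an
alignment of 1, and stores no hash table (S = 0).

```python
import struct

MAGIC = 0x950412de


def build_mo(catalog):
    """Serialize {original: translation} (bytes) into an MO image, S = 0."""
    originals = sorted(catalog)      # increasing lexicographical order
    n = len(originals)
    o_off = 28                       # the header is seven 4-byte words
    t_off = o_off + 8 * n            # one (length, offset) pair per string
    h_off = t_off + 8 * n            # H; with S = 0 the strings start here
    body = bytearray()

    def store(s):
        desc = (len(s), h_off + len(body))  # the length excludes the NUL
        body.extend(s + b"\0")
        return desc

    o_desc = [store(key) for key in originals]
    t_desc = [store(catalog[key]) for key in originals]
    header = struct.pack("<7I", MAGIC, 0, n, o_off, t_off, 0, h_off)
    tables = b"".join(struct.pack("<2I", *d) for d in o_desc + t_desc)
    return header + tables + bytes(body)


def read_mo(data):
    """Parse an MO image of either byte order back into a dictionary."""
    for prefix in ("<", ">"):
        if struct.unpack_from(prefix + "I", data, 0)[0] == MAGIC:
            break
    else:
        raise ValueError("bad magic number")
    revision, n, o_off, t_off, s, h = struct.unpack_from(prefix + "6I", data, 4)

    def string_at(table, i):
        length, offset = struct.unpack_from(prefix + "2I", data, table + 8 * i)
        return data[offset:offset + length]

    return {string_at(o_off, i): string_at(t_off, i) for i in range(n)}


catalog = {
    b"": b"Content-Type: text/plain; charset=UTF-8\n",
    b"Save": b"Speichern",
    b"Cancel": b"Abbrechen",
}
image = build_mo(catalog)
print(read_mo(image) == catalog)
```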


File: gettext.info,  Node: Users,  Next: Programmers,  Prev: Binaries,  Up: Top

The User's View
***************

   Once GNU `gettext' has truly reached its goal, average users
should feel some kind of astonished pleasure, seeing the effect of that
strange kind of magic that just makes their own native language appear
everywhere on their screens.  As for naive users, they would ideally
have no special pleasure about it, merely taking their own language for
_granted_, and becoming rather unhappy otherwise.

   So, let's try to describe here how we would like the magic to
operate, as we want the users' view to be the simplest, among all ways
one could look at GNU `gettext'.  All other software engineers:
programmers, translators, maintainers, should work together in such a
way that the magic becomes possible.  This is a long and progressive
undertaking, and information is available about the progress of the
Translation Project.

   When a package is distributed, there are two kinds of users:
"installers" who fetch the distribution, unpack it, configure it,
compile it and install it for themselves or others to use; and "end
users" that call programs of the package, once these have been
installed at their site.  GNU `gettext' is offering magic for both
installers and end users.

* Menu:

* Matrix::                      The Current `ABOUT-NLS' Matrix
* Installers::                  Magic for Installers
* End Users::                   Magic for End Users


File: gettext.info,  Node: Matrix,  Next: Installers,  Prev: Users,  Up: Users

The Current `ABOUT-NLS' Matrix
==============================

   Languages are not equally supported in all packages using GNU
`gettext'.  To know if some package uses GNU `gettext', one may check
the distribution for the `ABOUT-NLS' information file, for some `LL.po'
files, often kept together into some `po/' directory, or for an `intl/'
directory.  Internationalized packages usually have many `LL.po' files,
where LL represents the language.  *Note End Users:: for a complete
description of the format for LL.

   More generally, a matrix is available for showing the current state
of the Translation Project, listing which packages are prepared for
multi-lingual messages, and which languages are supported by each.
Because this information changes often, this matrix is not kept within
this GNU `gettext' manual.  This information is often found in the file
`ABOUT-NLS' of various distributions, but that copy is only as recent
as the distribution itself.  A recent copy of this `ABOUT-NLS' file,
containing up-to-date information, should generally be found on the
Translation Project sites, and also on most GNU archive sites.


File: gettext.info,  Node: Installers,  Next: End Users,  Prev: Matrix,  Up: Users

Magic for Installers
====================

   By default, packages fully using GNU `gettext', internally, are
installed in such a way that they allow translation of messages.  At
_configuration_ time, those packages should automatically detect
whether the underlying host system already provides the GNU `gettext'
functions.  If not, the GNU `gettext' library should be automatically
prepared and used.  Installers may use special options at configuration
time for changing this behavior.  The command `./configure
--with-included-gettext' bypasses system `gettext' to use the included
GNU `gettext' instead, while `./configure --disable-nls' produces
programs totally unable to translate messages.

   Internationalized packages usually have many `LL.po' files.  Unless
translations are disabled, all those available are installed together
with the package.  However, the environment variable `LINGUAS' may be
set, prior to configuration, to limit the installed set.  `LINGUAS'
should then contain a space separated list of two-letter codes, stating
which languages are allowed.
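
   The filtering performed by `LINGUAS' can be pictured with the sketch
below.  It only models the rule stated above; the function name is
hypothetical and the real selection is done by the package's
configuration and build scripts, not by code like this.

```python
def languages_to_install(available, linguas=None):
    """Return the languages to install, given LINGUAS (None when unset)."""
    if linguas is None:
        return list(available)       # no restriction: install everything
    allowed = set(linguas.split())   # space separated list of codes
    return [ll for ll in available if ll in allowed]


available = ["de", "fr", "ja", "sv"]
print(languages_to_install(available))
print(languages_to_install(available, "fr de"))
```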


File: gettext.info,  Node: End Users,  Prev: Installers,  Up: Users

Magic for End Users
===================

   We consider here those packages using GNU `gettext' internally, and
for which the installers did not disable translation at _configure_
time.  Then, users only have to set the `LANG' environment variable to
the appropriate `LL_CC' combination prior to using the programs in the
package.  *Note Matrix::.  For example, let's presume a German site.
At the shell prompt, users merely have to execute `setenv LANG de_DE'
(in `csh') or `export LANG; LANG=de_DE' (in `sh').  They could even do
this from their `.login' or `.profile' file.


File: gettext.info,  Node: Programmers,  Next: Translators,  Prev: Users,  Up: Top

The Programmer's View
*********************

   One aim of the current message catalog implementation provided by
GNU `gettext' was to use the system's message catalog handling, if the
installer wishes to do so.  So we perhaps should first take a look at
the solutions we know about.  The people in the POSIX committee did not
manage to agree on one of the semi-official standards which we'll
describe below.  In fact they couldn't agree on anything, so they
decided only to include an example of an interface.  The major Unix
vendors are split in the usage of the two most important
specifications: X/Open's catgets vs. Uniforum's gettext interface.
We'll describe them both and later explain our solution to this dilemma.

* Menu:

* catgets::                     About `catgets'
* gettext::                     About `gettext'
* Comparison::                  Comparing the two interfaces
* Using libintl.a::             Using libintl.a in own programs
* gettext grok::                Being a `gettext' grok
* Temp Programmers::            Temporary Notes for the Programmers Chapter