This is gettext.info, produced by makeinfo version 4.0 from
gettext.texi.

INFO-DIR-SECTION GNU Gettext Utilities
START-INFO-DIR-ENTRY
* Gettext: (gettext).                           GNU gettext utilities.
* gettextize: (gettext)gettextize Invocation.   Prepare a package for gettext.
* msgfmt: (gettext)msgfmt Invocation.           Make MO files out of PO files.
* msgmerge: (gettext)msgmerge Invocation.       Update two PO files into one.
* xgettext: (gettext)xgettext Invocation.       Extract strings into a PO file.
END-INFO-DIR-ENTRY

   This file provides documentation for GNU `gettext' utilities.  It
also serves as a reference for the free Translation Project.

   Copyright (C) 1995, 1996, 1997, 1998, 2001, 2002 Free Software
Foundation, Inc.

   Permission is granted to make and distribute verbatim copies of this
manual provided the copyright notice and this permission notice are
preserved on all copies.

   Permission is granted to copy and distribute modified versions of
this manual under the conditions for verbatim copying, provided that
the entire resulting derived work is distributed under the terms of a
permission notice identical to this one.

   Permission is granted to copy and distribute translations of this
manual into another language, under the above conditions for modified
versions, except that this permission notice may be stated in a
translation approved by the Foundation.


File: gettext.info,  Node: Top,  Next: Introduction,  Prev: (dir),  Up: (dir)

GNU `gettext' utilities
***********************

   This manual documents the GNU gettext tools and the GNU libintl
library, version 0.11.3.

* Menu:

* Introduction::                Introduction
* Basics::                      PO Files and PO Mode Basics
* Sources::                     Preparing Program Sources
* Template::                    Making the PO Template File
* Creating::                    Creating a New PO File
* Updating::                    Updating Existing PO Files
* Manipulating::                Manipulating PO Files
* Binaries::                    Producing Binary MO Files
* Users::                       The User's View
* Programmers::                 The Programmer's View
* Translators::                 The Translator's View
* Maintainers::                 The Maintainer's View
* Programming Languages::       Other Programming Languages
* Conclusion::                  Concluding Remarks

* Language Codes::              ISO 639 language codes
* Country Codes::               ISO 3166 country codes

* Program Index::               Index of Programs
* Option Index::                Index of Command-Line Options
* Variable Index::              Index of Environment Variables
* PO Mode Index::               Index of Emacs PO Mode Commands
* Autoconf Macro Index::        Index of Autoconf Macros
* Index::                       General Index

 --- The Detailed Node Listing ---

Introduction

* Why::                         The Purpose of GNU `gettext'
* Concepts::                    I18n, L10n, and Such
* Aspects::                     Aspects in Native Language Support
* Files::                       Files Conveying Translations
* Overview::                    Overview of GNU `gettext'

PO Files and PO Mode Basics

* Installation::                Completing GNU `gettext' Installation
* PO Files::                    The Format of PO Files
* Main PO Commands::            Main Commands
* Entry Positioning::           Entry Positioning
* Normalizing::                 Normalizing Strings in Entries

Preparing Program Sources

* Triggering::                  Triggering `gettext' Operations
* Preparing Strings::           Preparing Translatable Strings
* Mark Keywords::               How Marks Appear in Sources
* Marking::                     Marking Translatable Strings
* c-format Flag::               Telling something about the following string
* Special cases::               Special Cases of Translatable Strings

Making the PO Template File

* xgettext Invocation::         Invoking the `xgettext' Program

Creating a New PO File

* msginit Invocation::          Invoking the `msginit' Program
* Header Entry::                Filling in the Header Entry

Updating Existing PO Files

* msgmerge Invocation::         Invoking the `msgmerge' Program
* Translated Entries::          Translated Entries
* Fuzzy Entries::               Fuzzy Entries
* Untranslated Entries::        Untranslated Entries
* Obsolete Entries::            Obsolete Entries
* Modifying Translations::      Modifying Translations
* Modifying Comments::          Modifying Comments
* Subedit::                     Mode for Editing Translations
* C Sources Context::           C Sources Context
* Auxiliary::                   Consulting Auxiliary PO Files
* Compendium::                  Using Translation Compendia

Using Translation Compendia

* Creating Compendia::          Merging translations for later use
* Using Compendia::             Using older translations if they fit

Manipulating PO Files

* msgcat Invocation::           Invoking the `msgcat' Program
* msgconv Invocation::          Invoking the `msgconv' Program
* msggrep Invocation::          Invoking the `msggrep' Program
* msgfilter Invocation::        Invoking the `msgfilter' Program
* msguniq Invocation::          Invoking the `msguniq' Program
* msgcomm Invocation::          Invoking the `msgcomm' Program
* msgcmp Invocation::           Invoking the `msgcmp' Program
* msgattrib Invocation::        Invoking the `msgattrib' Program
* msgen Invocation::            Invoking the `msgen' Program
* msgexec Invocation::          Invoking the `msgexec' Program

Producing Binary MO Files

* msgfmt Invocation::           Invoking the `msgfmt' Program
* msgunfmt Invocation::         Invoking the `msgunfmt' Program
* MO Files::                    The Format of GNU MO Files

The User's View

* Matrix::                      The Current `ABOUT-NLS' Matrix
* Installers::                  Magic for Installers
* End Users::                   Magic for End Users

The Programmer's View

* catgets::                     About `catgets'
* gettext::                     About `gettext'
* Comparison::                  Comparing the two interfaces
* Using libintl.a::             Using libintl.a in own programs
* gettext grok::                Being a `gettext' grok
* Temp Programmers::            Temporary Notes for the Programmers Chapter

About `catgets'

* Interface to catgets::        The interface
* Problems with catgets::       Problems with the `catgets' interface?!

About `gettext'

* Interface to gettext::        The interface
* Ambiguities::                 Solving ambiguities
* Locating Catalogs::           Locating message catalog files
* Charset conversion::          How to request conversion to Unicode
* Plural forms::                Additional functions for handling plurals
* GUI program problems::        Another technique for solving ambiguities
* Optimized gettext::           Optimization of the *gettext functions

Temporary Notes for the Programmers Chapter

* Temp Implementations::        Temporary - Two Possible Implementations
* Temp catgets::                Temporary - About `catgets'
* Temp WSI::                    Temporary - Why a single implementation
* Temp Notes::                  Temporary - Notes

The Translator's View

* Trans Intro 0::               Introduction 0
* Trans Intro 1::               Introduction 1
* Discussions::                 Discussions
* Organization::                Organization
* Information Flow::            Information Flow

Organization

* Central Coordination::        Central Coordination
* National Teams::              National Teams
* Mailing Lists::               Mailing Lists

National Teams

* Sub-Cultures::                Sub-Cultures
* Organizational Ideas::        Organizational Ideas

The Maintainer's View

* Flat and Non-Flat::           Flat or Non-Flat Directory Structures
* Prerequisites::               Prerequisite Works
* gettextize Invocation::       Invoking the `gettextize' Program
* Adjusting Files::             Files You Must Create or Alter
* autoconf macros::             Autoconf macros for use in `configure.in'
* CVS Issues::                  Integrating with CVS

Files You Must Create or Alter

* po/POTFILES.in::              `POTFILES.in' in `po/'
* po/LINGUAS::                  `LINGUAS' in `po/'
* po/Makevars::                 `Makefile' pieces in `po/'
* configure.in::                `configure.in' at top level
* config.guess::                `config.guess', `config.sub' at top level
* mkinstalldirs::               `mkinstalldirs' at top level
* aclocal::                     `aclocal.m4' at top level
* acconfig::                    `acconfig.h' at top level
* config.h.in::                 `config.h.in' at top level
* Makefile::                    `Makefile.in' at top level
* src/Makefile::                `Makefile.in' in `src/'
* lib/gettext.h::               `gettext.h' in `lib/'

Autoconf macros for use in `configure.in'

* AM_GNU_GETTEXT::              AM_GNU_GETTEXT in `gettext.m4'
* AM_GNU_GETTEXT_VERSION::      AM_GNU_GETTEXT_VERSION in `gettext.m4'
* AM_ICONV::                    AM_ICONV in `iconv.m4'

Integrating with CVS

* Distributed CVS::             Avoiding version mismatch in distributed development
* Files under CVS::             Files to put under CVS version control
* autopoint Invocation::        Invoking the `autopoint' Program

Other Programming Languages

* Language Implementors::       The Language Implementor's View
* Programmers for other Languages::  The Programmer's View
* Translators for other Languages::  The Translator's View
* Maintainers for other Languages::  The Maintainer's View
* List of Programming Languages::  Individual Programming Languages
* List of Data Formats::        Internationalizable Data

The Translator's View

* c-format::                    C Format Strings
* python-format::               Python Format Strings
* lisp-format::                 Lisp Format Strings
* elisp-format::                Emacs Lisp Format Strings
* librep-format::               librep Format Strings
* smalltalk-format::            Smalltalk Format Strings
* java-format::                 Java Format Strings
* awk-format::                  awk Format Strings
* object-pascal-format::        Object Pascal Format Strings
* ycp-format::                  YCP Format Strings
* tcl-format::                  Tcl Format Strings

Individual Programming Languages

* C::                           C, C++, Objective C
* sh::                          sh - Shell Script
* bash::                        bash - Bourne-Again Shell Script
* Python::                      Python
* Common Lisp::                 GNU clisp - Common Lisp
* clisp C::                     GNU clisp C sources
* Emacs Lisp::                  Emacs Lisp
* librep::                      librep
* Smalltalk::                   GNU Smalltalk
* Java::                        Java
* gawk::                        GNU awk
* Pascal::                      Pascal - Free Pascal Compiler
* wxWindows::                   wxWindows library
* YCP::                         YCP - YaST2 scripting language
* Tcl::                         Tcl - Tk's scripting language
* Perl::                        Perl
* PHP::                         PHP Hypertext Preprocessor
* Pike::                        Pike

Internationalizable Data

* POT::                         POT - Portable Object Template
* RST::                         Resource String Table
* Glade::                       Glade - GNOME user interface description

Concluding Remarks

* History::                     History of GNU `gettext'
* References::                  Related Readings


File: gettext.info,  Node: Introduction,  Next: Basics,  Prev: Top,  Up: Top

Introduction
************

     This manual is still in _DRAFT_ state.  Some sections are still
     empty, or almost.  We keep merging material from other sources
     (essentially e-mail folders) while the proper integration of this
     material is delayed.

   In this manual, we use _he_ when speaking of the programmer or
maintainer, _she_ when speaking of the translator, and _they_ when
speaking of the installers or end users of the translated program.
This is only a convenience for clarifying the documentation.  It is
_absolutely_ not meant to imply that some roles are more appropriate to
males or females.  Besides, as you might guess, GNU `gettext' is meant
to be useful for people using computers, whatever their sex, race,
religion or nationality!

   This chapter explains the goals sought in the creation of GNU
`gettext' and the free Translation Project.  Then, it explains a few
broad concepts around Native Language Support, and positions message
translation with regard to other aspects of national and cultural
variance, as they apply to programs.  It also surveys those files
used to convey the translations.  It explains how the various tools
interact in the initial generation of these files, and later, how the
maintenance cycle should usually operate.

   Please send suggestions and corrections to:

     Internet address:
         bug-gnu-gettext@gnu.org

Please include the manual's edition number and update date in your
messages.

* Menu:

* Why::                         The Purpose of GNU `gettext'
* Concepts::                    I18n, L10n, and Such
* Aspects::                     Aspects in Native Language Support
* Files::                       Files Conveying Translations
* Overview::                    Overview of GNU `gettext'


File: gettext.info,  Node: Why,  Next: Concepts,  Prev: Introduction,  Up: Introduction

The Purpose of GNU `gettext'
============================

   Usually, programs are written and documented in English, and use
English at execution time to interact with users.  This is true not
only of GNU software, but also of a great deal of commercial and free
software.  Using a common language is quite handy for communication
between developers, maintainers and users from all countries.  On the
other hand, most people are less comfortable with English than with
their own native language, and would prefer to use their mother tongue
for day-to-day work, as far as possible.  Many would simply _love_ to
see their computer screen showing a lot less of English, and far more
of their own language.

   However, to many people, this dream might appear so far fetched that
they may believe it is not even worth spending time thinking about it.
They have no confidence at all that the dream might ever come true.
Yet some have not lost hope, and have organized themselves.  The
Translation Project is a formalization of this hope into a workable
structure, which has a good chance to get all of us nearer the
achievement of a truly multi-lingual set of programs.

   GNU `gettext' is an important step for the Translation Project, as
it is an asset on which we may build many other steps.  This package
offers to programmers, translators and even users, a well integrated
set of tools and documentation.  Specifically, the GNU `gettext'
utilities are a set of tools that provides a framework within which
other free packages may produce multi-lingual messages.  These tools
include

   * A set of conventions about how programs should be written to
     support message catalogs.

   * A directory and file naming organization for the message catalogs
     themselves.

   * A runtime library supporting the retrieval of translated messages.

   * A few stand-alone programs to massage in various ways the sets of
     translatable strings, or already translated strings.

   * A special mode for Emacs(1) which helps in preparing these sets and
     bringing them up to date.

   GNU `gettext' is designed to minimize the impact of
internationalization on program sources, keeping this impact as small
and hardly noticeable as possible.  Internationalization has better
chances of succeeding if it is very lightweight, or at least appears
to be so, when looking at program sources.
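
   As a minimal sketch of how small this impact can be, the following C
fragment marks one string for translation.  The domain name
`hello-sketch' and the catalog directory are hypothetical values chosen
for illustration, and `_' is only a conventional shorthand defined
here, not something the library provides by itself.

```c
#include <assert.h>
#include <libintl.h>
#include <locale.h>
#include <stddef.h>

/* Hypothetical values, for illustration only. */
#define PACKAGE   "hello-sketch"
#define LOCALEDIR "/usr/share/locale"

/* Conventional shorthand that keeps marked strings unobtrusive. */
#define _(String) gettext (String)

/* Select the user's locale and the message catalog of this package. */
void
init_nls (void)
{
  const char *domain;

  setlocale (LC_ALL, "");              /* honor the user's environment */
  bindtextdomain (PACKAGE, LOCALEDIR); /* where catalogs are looked up */
  domain = textdomain (PACKAGE);       /* default domain for gettext() */
  assert (domain != NULL);
  (void) domain;
}

/* The only change to the message itself is the surrounding _( ). */
const char *
greeting (void)
{
  return _("Hello world!");
}
```

   When no catalog for the domain can be found, `gettext' returns its
argument unchanged, so a program prepared this way still prints the
original English message.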

   The Translation Project also uses the GNU `gettext' distribution as
a vehicle for documenting its structure and methods.  This goes beyond
the strict technicalities of documenting the GNU `gettext' proper.  By
so doing, translators will find in a single place, as far as possible,
all they need to know for properly doing their translating work.  Also,
this supplemental documentation might help programmers, and even
curious users, in understanding how GNU `gettext' is related to the
remainder of the Translation Project, and consequently, have a glimpse
at the _big picture_.

   ---------- Footnotes ----------

   (1) In this manual, all mentions of Emacs refer to either GNU Emacs
or to XEmacs, which people sometimes call FSF Emacs and Lucid Emacs,
respectively.


File: gettext.info,  Node: Concepts,  Next: Aspects,  Prev: Why,  Up: Introduction

I18n, L10n, and Such
====================

   Two long words appear all the time when we discuss support of native
language in programs, and these words have a precise meaning, worth
being explained here, once and for all in this document.  The words are
_internationalization_ and _localization_.  Many people, tired of
writing these long words over and over again, took the habit of writing
"i18n" and "l10n" instead, quoting the first and last letter of each
word, and replacing the run of intermediate letters by a number merely
telling how many such letters there are.  But in this manual, for the
sake of clarity, we will patiently write the names in full, each time...
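
   The abbreviation rule just described can be sketched in a few lines
of C.  The function name `numeronym' and its return convention are
assumptions made for this illustration only.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write the abbreviation of WORD into OUT: first letter, count of the
   letters in between, last letter.  Returns 0 on success, or -1 if
   WORD has fewer than three letters or OUT is too small.  */
int
numeronym (const char *word, char *out, size_t out_size)
{
  size_t len;
  int written;

  assert (word != NULL && out != NULL);
  len = strlen (word);
  if (len < 3)
    return -1;

  written = snprintf (out, out_size, "%c%zu%c",
                      word[0], len - 2, word[len - 1]);
  return (written > 0 && (size_t) written < out_size) ? 0 : -1;
}
```

   Applied to `internationalization', which has twenty letters, this
yields `i18n'; applied to `localization', it yields `l10n'.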

   By "internationalization", one refers to the operation by which a
program, or a set of programs turned into a package, is made aware of
and able to support multiple languages.  This is a generalization
process, by which the programs are untied from calling only English
strings or other English specific habits, and connected to generic ways
of doing the same, instead.  Program developers may use various
techniques to internationalize their programs.  Some of these have been
standardized.  GNU `gettext' offers one of these standards.  *Note
Programmers::.

   By "localization", one means the operation by which, in a set of
programs already internationalized, one gives the program all needed
information so that it can adapt itself to handle its input and output
in a fashion which is correct for some native language and cultural
habits.  This is a particularisation process, by which generic methods
already implemented in an internationalized program are used in
specific ways.  The programming environment puts several functions at
the programmer's disposal which allow this runtime configuration.  The
formal description of a specific set of cultural habits for some country,
together with all associated translations targeted to the same native
language, is called the "locale" for this language or country.  Users
achieve localization of programs by setting proper values to special
environment variables, prior to executing those programs, identifying
which locale should be used.

   In fact, locale message support is only one component of the cultural
data that makes up a particular locale.  There are a whole host of
routines and functions provided to aid programmers in developing
internationalized software and which allow them to access the data
stored in a particular locale.  When someone presently refers to a
particular locale, they are obviously referring to the data stored
within that particular locale.  Similarly, if a programmer is referring
to "accessing the locale routines", they are referring to the complete
suite of routines that access all of the locale's information.

   One uses the expression "Native Language Support", or merely NLS,
for speaking of the overall activity or feature encompassing both
internationalization and localization, allowing for multi-lingual
interactions in a program.  In a nutshell, one could say that
internationalization is the operation by which further localizations
are made possible.

   Also, very roughly said, when it comes to multi-lingual messages,
internationalization is usually taken care of by programmers, and
localization is usually taken care of by translators.


File: gettext.info,  Node: Aspects,  Next: Files,  Prev: Concepts,  Up: Introduction

Aspects in Native Language Support
==================================

   For a totally multi-lingual distribution, there are many things to
translate beyond output messages.

   * As of today, GNU `gettext' offers a complete toolset for
     translating messages output by C programs.  Perl scripts and shell
     scripts will also need to be translated.  Even if there are today
     some hooks by which this can be done, these hooks are not
     integrated as well as they should be.

   * Some programs, like `autoconf' or `bison', are able to produce
     other programs (or scripts).  Even if the generating programs
     themselves are internationalized, the generated programs they
     produce may need internationalization on their own, and this
     indirect internationalization could be automated right from the
     generating program.  In fact, quite usually, generating and
     generated programs could be internationalized independently, as
     the effort needed is fairly orthogonal.

   * A few programs include textual tables which might need translation
     themselves, independently of the strings contained in the program
     itself.  For example, RFC 1345 gives an English description for
     each character which the `recode' program is able to reconstruct
     at execution.  Since these descriptions are extracted from the RFC
     by mechanical means, translating them properly would require a
     prior translation of the RFC itself.

   * Almost all programs accept options, which are often worded so as
     to be descriptive for English readers; one might want to
     consider offering translated versions for program options as well.

   * Many programs read, interpret, compile, or are somewhat driven by
     input files which are texts containing keywords, identifiers, or
     replies which are inherently translatable.  For example, one may
     want `gcc' to allow diacriticized characters in identifiers or use
     translated keywords; `rm -i' might accept something other than `y'
     or `n' for replies, etc.  Even if the program will eventually make
     most of its output in the foreign languages, one has to decide
     whether the input syntax, option values, etc., are to be localized
     or not.

   * The manual accompanying a package, as well as all documentation
     files in the distribution, could surely be translated, too.
     Translating a manual, with the intent of later keeping up with
     updates, is a major undertaking in itself, generally.


   As we already stressed, translation is only one aspect of locales.
Other internationalization aspects are system services and are handled
in GNU `libc'.  There are many attributes that are needed to define a
country's cultural conventions.  These attributes include, besides the
country's native language, the formatting of the date and time, the
representation of numbers, the symbols for currency, etc.  These local
"rules" are termed the country's locale.  The locale represents the
knowledge needed to support the country's native attributes.

   There are a few major areas which may vary between countries and
hence, define what a locale must describe.  The following list helps
put multi-lingual messages into the proper context of other tasks
related to locales.  See the GNU `libc' manual for details.

_Characters and Codesets_
     The codeset most commonly used throughout the USA and most English
     speaking parts of the world is the ASCII codeset.  However, there
     are many characters needed by various locales that are not found
     within this codeset.  The 8-bit ISO 8859-1 codeset has most of
     the special characters needed to handle the major European
     languages.  However, in many cases, the ISO 8859-1 codeset is not
     adequate: it doesn't even handle the major European currency.
     Hence each locale will need to specify which codeset they need to
     use and will need to have the appropriate character handling
     routines to cope with the codeset.

_Currency_
     The symbols used vary from country to country as does the position
     used by the symbol.  Software needs to be able to transparently
     display currency figures in the native mode for each locale.

_Dates_
     The format of date varies between locales.  For example, Christmas
     day in 1994 is written as 12/25/94 in the USA and as 25/12/94 in
     Australia.  Other countries might use ISO 8601 dates, etc.

     Time of the day may be noted as HH:MM, HH.MM, or otherwise.  Some
     locales require time to be specified in 24-hour mode rather than
     as AM or PM.  Further, the nature and yearly extent of the
     Daylight Saving correction vary widely between countries.

_Numbers_
     Numbers can be represented differently in different locales.  For
     example, the following numbers are all written correctly for their
     respective locales:

          12,345.67       English
          12.345,67       German
           12345,67       French
          1,2345.67       Asia

     Some programs could go further and use different unit systems, like
     English units or Metric units, or even take into account variants
     about how numbers are spelled in full.

_Messages_
     The most obvious area is the language support within a locale.
     This is where GNU `gettext' provides the means for developers and
     users to easily change the language that the software uses to
     communicate to the user.

   Components of locale outside of message handling are standardized in
the ISO C standard and the SUSV2 specification.  GNU `libc' fully
implements these, and most other modern systems provide more or less
reasonable support for at least some of the missing components.
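
   As a small illustration of these standardized, non-message
components, the following C sketch queries the decimal point and
formats a date according to the current locale.  The helper names
`decimal_point' and `format_date' are hypothetical; only `localeconv',
`mktime' and `strftime' come from ISO C.

```c
#include <assert.h>
#include <locale.h>
#include <stddef.h>
#include <time.h>

/* Return the decimal-point string of the current numeric locale. */
const char *
decimal_point (void)
{
  return localeconv ()->decimal_point;
}

/* Format the given calendar date into OUT using the locale's preferred
   date representation (%x).  Returns the number of characters written,
   or 0 if OUT is too small.  */
size_t
format_date (char *out, size_t out_size, int year, int month, int day)
{
  struct tm tm = {0};

  assert (out != NULL && out_size > 0);
  tm.tm_year = year - 1900;   /* struct tm counts years from 1900 */
  tm.tm_mon = month - 1;      /* and months from 0 */
  tm.tm_mday = day;
  tm.tm_hour = 12;            /* midday, away from DST transitions */
  tm.tm_isdst = -1;           /* let the library decide about DST */
  mktime (&tm);               /* fills in the day of the week */

  return strftime (out, out_size, "%x", &tm);
}
```

   A program inherits the user's conventions only after calling
`setlocale (LC_ALL, "")'; until then it runs in the `C' locale, where
the decimal point is `.' and `%x' formats Christmas day in 1994 as
12/25/94.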


File: gettext.info,  Node: Files,  Next: Overview,  Prev: Aspects,  Up: Introduction

Files Conveying Translations
============================

   The letters PO in `.po' files mean Portable Object, to distinguish
them from `.mo' files, where MO stands for Machine Object.  This
paradigm, as well as the PO file format, is inspired by the NLS
standard developed by Uniforum, and first implemented by Sun in their
Solaris system.

   PO files are meant to be read and edited by humans, and associate
each original, translatable string of a given package with its
translation in a particular target language.  A single PO file is
dedicated to a single target language.  If a package supports many
languages, there is one such PO file per language supported, and each
package has its own set of PO files.  These PO files are best created by
the `xgettext' program, and later updated or refreshed through the
`msgmerge' program.  Program `xgettext' extracts all marked messages
from a set of C files and initializes a PO file with empty
translations.  Program `msgmerge' takes care of adjusting PO files
between releases of the corresponding sources, commenting obsolete
entries, initializing new ones, and updating all source line
references.  Files ending with `.pot' are a kind of base translation
file found in distributions, in PO file format.

   MO files are meant to be read by programs, and are binary in nature.
A few systems already offer tools for creating and handling MO files as
part of the Native Language Support coming with the system, but the
format of these MO files is often different from system to system, and
non-portable.  The tools already provided with these systems don't
support all the features of GNU `gettext'.  Therefore GNU `gettext'
uses its own format for MO files.  Files ending with `.gmo' are really
MO files, when it is known that these files use the GNU format.


File: gettext.info,  Node: Overview,  Prev: Files,  Up: Introduction

Overview of GNU `gettext'
=========================

   The following diagram summarizes the relation between the files
handled by GNU `gettext' and the tools acting on these files.  It is
followed by somewhat detailed explanations, which you should read while
keeping an eye on the diagram.  Having a clear understanding of these
interrelations will surely help programmers, translators and
maintainers.

     Original C Sources ---> PO mode ---> Marked C Sources ---.
                                                              |
                   .---------<--- GNU gettext Library         |
     .--- make <---+                                          |
     |             `---------<--------------------+-----------'
     |                                            |
     |   .-----<--- PACKAGE.pot <--- xgettext <---'   .---<--- PO Compendium
     |   |                                            |             ^
     |   |                                            `---.         |
     |   `---.                                            +---> PO mode ---.
     |       +----> msgmerge ------> LANG.po ---->--------'                |
     |   .---'                                                             |
     |   |                                                                 |
     |   `-------------<---------------.                                   |
     |                                 +--- New LANG.po <------------------'
     |   .--- LANG.gmo <--- msgfmt <---'
     |   |
     |   `---> install ---> /.../LANG/PACKAGE.mo ---.
     |                                              +---> "Hello world!"
     `-------> install ---> /.../bin/PROGRAM -------'

   The indication `PO mode' appears in two places in this picture, and
you may safely read it as merely meaning "hand editing", using any
editor of your choice, really.  However, for those of you being the
lucky users of Emacs, PO mode has been specifically created for
providing a cozy environment for editing or modifying PO files.  While
editing a PO file, PO mode allows for the easy browsing of auxiliary
and compendium PO files, as well as for following references into the
set of C program sources from which PO files have been derived.  It has
a few special features, among which are the interactive marking of
program strings as translatable, and the validation of PO files with
easy repositioning to PO file lines showing errors.

   As a programmer, the first step to bringing GNU `gettext' into your
package is identifying, right in the C sources, those strings which are
meant to be translatable, and those which are untranslatable.  This
tedious job can be done a little more comfortably using Emacs PO mode,
but you can use any means familiar to you for modifying your C sources.
Besides this, some other simple, standard changes are needed to properly
initialize the translation library.  *Note Sources::, for more
information about all this.

   For newly written software the strings of course can and should be
marked while writing it.  The `gettext' approach makes this very easy.
Simply put the following lines at the beginning of each file or in a
central header file:

     #define _(String) (String)
     #define N_(String) String
     #define textdomain(Domain)
     #define bindtextdomain(Package, Directory)

Doing this allows you to prepare the sources for internationalization.
Later when you feel ready for the step to use the `gettext' library
simply replace these definitions by the following:

     #include <libintl.h>
     #define _(String) gettext (String)
     #define gettext_noop(String) String
     #define N_(String) gettext_noop (String)

and link against `libintl.a' or `libintl.so'.  Note that on GNU
systems, you don't need to link with `libintl' because the `gettext'
library functions are already contained in GNU libc.  That is all you
have to change.
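
   As a minimal sketch of the first stage, the fragment below uses the
stub definitions shown above, so it compiles without any `gettext'
library.  The function names and the domain name `example-package' are
hypothetical, chosen only for this illustration.

```c
/* Stage one: stub definitions, exactly as listed in the text above.
   With these, the marked sources build without libintl.  */
#define _(String) (String)
#define N_(String) String
#define textdomain(Domain)
#define bindtextdomain(Package, Directory)

/* N_ only marks a string for extraction; it is usable in static
   initializers, where a function call would not be allowed.  */
static const char *const weekday_names[] = {
    N_("Monday"),
    N_("Tuesday")
};

/* Hypothetical helper: the lookup happens later, through _().  */
static const char *weekday_name(int index)
{
    return _(weekday_names[index]);
}

/* Hypothetical helper: a string marked at its point of use.  */
static const char *greeting(void)
{
    return _("Hello world!");
}

/* With the stubs, both calls expand to empty statements.  The domain
   name and directory are placeholders, not values from this manual.  */
static void init_i18n(void)
{
    bindtextdomain("example-package", "/usr/share/locale");
    textdomain("example-package");
}
```

   Replacing the four stub definitions with the second set of
definitions should leave the rest of such a file unchanged.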

   Once the C sources have been modified, the `xgettext' program is
used to find and extract all translatable strings, and create a PO
template file out of all these.  This `PACKAGE.pot' file contains all
original program strings.  It has sets of pointers to exactly where in
C sources each string is used.  All translations are set to empty.  The
letter `t' in `.pot' marks this as a Template PO file, not yet oriented
towards any particular language.  *Note xgettext Invocation::, for more
details about how one calls the `xgettext' program.  If you are
_really_ lazy, you might be interested in working a lot more right
away, and preparing the whole distribution setup (*note Maintainers::).
By doing so, you spare yourself typing the `xgettext' command, as
`make' should now generate the proper things automatically for you!

   The first time through, there is no `LANG.po' yet, so the `msgmerge'
step may be skipped and replaced by a mere copy of `PACKAGE.pot' to
`LANG.po', where LANG represents the target language.  See *Note
Creating:: for details.

   Then comes the initial translation of messages.  Translation in
itself is a whole matter, still exclusively meant for humans, and whose
complexity far overwhelms the level of this manual.  Nevertheless, a
few hints are given in some other chapter of this manual (*note
Translators::).  You will also find there indications about how to
contact translating teams, or become part of them, for sharing your
translating concerns with others who target the same native language.

   While adding the translated messages into the `LANG.po' PO file, if
you do not have Emacs handy, you are on your own for ensuring that your
efforts fully respect the PO file format, and quoting conventions
(*note PO Files::).  This is surely not an impossible task, as this is
the way many people have handled PO files already for Uniforum or
Solaris.  On the other hand, by using PO mode in Emacs, most details of
PO file format are taken care of for you, but you have to acquire some
familiarity with PO mode itself.  Besides main PO mode commands (*note
Main PO Commands::), you should know how to move between entries (*note
Entry Positioning::), and how to handle untranslated entries (*note
Untranslated Entries::).

   If some common translations have already been saved into a compendium
PO file, translators may use PO mode for initializing untranslated
entries from the compendium, and also save selected translations into
the compendium, updating it (*note Compendium::).  Compendium files are
meant to be exchanged between members of a given translation team.

   Programs, or packages of programs, are dynamic in nature: users write
bug reports and suggestions for improvement, and maintainers react by
modifying programs in various ways.  The fact that a package has
already been internationalized should not make maintainers shy of
adding new strings, or modifying strings already translated.  They just
do their job the best they can.  For the Translation Project to work
smoothly, it is important that maintainers do not carry translation
concerns on their already loaded shoulders, and that translators be
kept as free as possible of programming concerns.

   The only concern maintainers should have is carefully marking new
strings as translatable, when they should be; they need not otherwise
worry about them being translated, as this will come in due time.
Consequently, when programs and their strings are adjusted in various
ways by maintainers, and for matters usually unrelated to translation,
`xgettext' would construct `PACKAGE.pot' files which are evolving over
time, so the translations carried by `LANG.po' are slowly fading out of
date.

   It is important for translators (and even maintainers) to understand
that package translation is a continuous process in the lifetime of a
package, and not something which is done once and for all at the start.
After an initial burst of translation activity for a given package,
interventions are needed once in a while, because here and there,
translated entries become obsolete, and new untranslated entries
appear, needing translation.

   The `msgmerge' program has the purpose of refreshing an already
existing `LANG.po' file, by comparing it with a newer `PACKAGE.pot'
template file, extracted by `xgettext' out of recent C sources.  The
refreshing operation adjusts all references to C source locations for
strings, since these strings move as programs are modified.  Also,
`msgmerge' comments out as obsolete, in `LANG.po', those already
translated entries which are no longer used in the program sources
(*note Obsolete Entries::).  It finally discovers new strings and
inserts them in the resulting PO file as untranslated entries (*note
Untranslated Entries::).  *Note msgmerge Invocation::, for more
information about what `msgmerge' really does.

   Whatever route or means taken, the goal is to obtain an updated
`LANG.po' file offering translations for all strings.

   The temporal mobility, or fluidity of PO files, is an integral part
of the translation game, and should be well understood, and accepted.
People resisting it will have a hard time participating in the
Translation Project, or will give a hard time to other participants!  In
particular, maintainers should relax and include all available official
PO files in their distributions, even if these have not recently been
updated, without exerting pressure on the translator teams to get the
job done.  The pressure should rather come from the community of users
speaking a particular language, and maintainers should consider
themselves fairly relieved of any concern about the adequacy of
translation files.  On the other hand, translators should reasonably
try updating the PO files they are responsible for, while the package
is undergoing pretest, prior to an official distribution.

   Once the PO file is complete and dependable, the `msgfmt' program is
used for turning the PO file into a machine-oriented format, which may
yield efficient retrieval of translations by the programs of the
package, whenever needed at runtime (*note MO Files::).  *Note msgfmt
Invocation::, for more information about all modes of execution for the
`msgfmt' program.

   Finally, the modified and marked C sources are compiled and linked
with the GNU `gettext' library, usually through the operation of
`make', given a suitable `Makefile' exists for the project, and the
resulting executable is installed somewhere users will find it.  The MO
files themselves should also be properly installed.  Given the
appropriate environment variables are set (*note End Users::), the
program should localize itself automatically, whenever it executes.
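
   The runtime side of this last step might be sketched as follows.
This is only an illustration under stated assumptions: the domain name
`example-package' and the directory `/usr/share/locale' are
placeholders, and the helper names are hypothetical.  On systems other
than GNU, linking with `libintl' would also be needed, as explained
earlier.

```c
#include <libintl.h>
#include <locale.h>

#define _(String) gettext (String)

/* Placeholder values for this sketch; a real package would normally
   receive them from its build system.  */
#define PACKAGE "example-package"
#define LOCALEDIR "/usr/share/locale"

/* Select the user's locale from the environment, then tell the
   library where the MO files of this package are installed.  */
static void init_i18n(void)
{
    setlocale(LC_ALL, "");
    bindtextdomain(PACKAGE, LOCALEDIR);
    textdomain(PACKAGE);
}

/* Returns the translation when a matching MO file is found, and the
   original string otherwise.  */
static const char *greeting(void)
{
    return _("Hello world!");
}
```

   When no catalog for the chosen domain is installed, `gettext'
returns its argument unchanged, so the program still prints the
original messages.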

   The remainder of this manual has the purpose of explaining in depth
the various steps outlined above.


File: gettext.info,  Node: Basics,  Next: Sources,  Prev: Introduction,  Up: Top

PO Files and PO Mode Basics
***************************

   The GNU `gettext' toolset helps programmers and translators in
producing, updating and using translation files, mainly those PO files
which are textual, editable files.  This chapter stresses the format of
PO files, and contains a PO mode starter.  PO mode description is
spread throughout this manual instead of being concentrated in one
place.  Here we present only the basics of PO mode.

* Menu:

* Installation::                Completing GNU `gettext' Installation
* PO Files::                    The Format of PO Files
* Main PO Commands::            Main Commands
* Entry Positioning::           Entry Positioning
* Normalizing::                 Normalizing Strings in Entries


File: gettext.info,  Node: Installation,  Next: PO Files,  Prev: Basics,  Up: Basics

Completing GNU `gettext' Installation
=====================================

   Once you have received, unpacked, configured and compiled the GNU
`gettext' distribution, the `make install' command puts in place the
programs `xgettext', `msgfmt', `gettext', and `msgmerge', as well as
their available message catalogs.  To top off a comfortable
installation, you might also want to make the PO mode available to your
Emacs users.

   During the installation of the PO mode, you might want to modify your
file `.emacs', once and for all, so it contains a few lines looking
like:

     (setq auto-mode-alist
           (cons '("\\.po\\'\\|\\.po\\." . po-mode) auto-mode-alist))
     (autoload 'po-mode "po-mode" "Major mode for translators to edit PO files" t)

   Later, whenever you edit some `.po' file, or any file having the
string `.po.' within its name, Emacs loads `po-mode.elc' (or
`po-mode.el') as needed, and automatically activates PO mode commands
for the associated buffer.  The string _PO_ appears in the mode line
for any buffer for which PO mode is active.  Many PO files may be
active at once in a single Emacs session.

   If you are using Emacs version 20 or newer, and have already
installed the appropriate international fonts on your system, you may
also tell Emacs how to determine automatically the coding system of
every PO file.  This will often (but not always) cause the necessary
fonts to be loaded and used for displaying the translations on your
Emacs screen.  For this to happen, add the lines:

     (modify-coding-system-alist 'file "\\.po\\'\\|\\.po\\."
                                 'po-find-file-coding-system)
     (autoload 'po-find-file-coding-system "po-mode")

to your `.emacs' file.  If, with this, you still see boxes instead of
international characters, try a different font set (via Shift Mouse
button 1).


File: gettext.info,  Node: PO Files,  Next: Main PO Commands,  Prev: Installation,  Up: Basics

The Format of PO Files
======================

   A PO file is made up of many entries, each entry holding the relation
between an original untranslated string and its corresponding
translation.  All entries in a given PO file usually pertain to a
single project, and all translations are expressed in a single target
language.  One PO file "entry" has the following schematic structure:

     WHITE-SPACE
     #  TRANSLATOR-COMMENTS
     #. AUTOMATIC-COMMENTS
     #: REFERENCE...
     #, FLAG...
     msgid UNTRANSLATED-STRING
     msgstr TRANSLATED-STRING

   The general structure of a PO file should be well understood by the
translator.  When using PO mode, very little has to be known about the
format details, as PO mode takes care of them for her.

   A simple entry can look like this:

     #: lib/error.c:116
     msgid "Unknown system error"
     msgstr "Error desconegut del sistema"

   Entries begin with some optional white space.  Usually, when
generated through GNU `gettext' tools, there is exactly one blank line
between entries.  Then comments follow, on lines all starting with the
character `#'.  There are two kinds of comments: those which have some
white space immediately following the `#', which comments are created
and maintained exclusively by the translator, and those which have some
non-white character just after the `#', which comments are created and
maintained automatically by GNU `gettext' tools.  All comments, of
either kind, are optional.
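
   The distinction between the two kinds of comments can be sketched in
C as follows.  The type and function names are hypothetical, and
treating a lone `#' as a translator comment is an assumption of this
sketch, not something this manual specifies.

```c
#include <ctype.h>

enum po_comment_kind {
    PO_NOT_A_COMMENT,
    PO_TRANSLATOR_COMMENT,   /* `#' followed by white space */
    PO_AUTOMATIC_COMMENT     /* `#' followed by a non-white character */
};

/* Classify one PO file line by looking at its first two characters.  */
static enum po_comment_kind classify_po_comment(const char *line)
{
    if (line[0] != '#')
        return PO_NOT_A_COMMENT;
    /* Assumption: a bare `#' counts as an empty translator comment.  */
    if (line[1] == '\0' || isspace((unsigned char) line[1]))
        return PO_TRANSLATOR_COMMENT;
    return PO_AUTOMATIC_COMMENT;
}
```

   For instance, `#: lib/error.c:116' would be classified as automatic,
while a line starting with `# ' would be left to the translator.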

   After white space and comments, entries show two strings, namely
first the untranslated string as it appears in the original program
sources, and then, the translation of this string.  The original string
is introduced by the keyword `msgid', and the translation, by `msgstr'.
The two strings, untranslated and translated, are quoted in various
ways in the PO file, using `"' delimiters and `\' escapes, but the
translator does not really have to pay attention to the precise quoting
format, as PO mode fully takes care of quoting for her.

   The `msgid' strings, as well as automatic comments, are produced and
managed by other GNU `gettext' tools, and PO mode does not provide
means for the translator to alter these.  The most she can do is merely
deleting them, and only by deleting the whole entry.  On the other
hand, the `msgstr' string, as well as translator comments, are really
meant for the translator, and PO mode gives her the full control she
needs.

   The comment lines beginning with `#,' are special because they are
not completely ignored by the programs as comments generally are.  The
comma separated list of FLAGs is used by the `msgfmt' program to give
the user some better diagnostic messages.  Currently there are two
forms of flags defined:

`fuzzy'
     This flag can be generated by the `msgmerge' program or it can be
     inserted by the translator herself.  It shows that the `msgstr'
     string might not be a correct translation (anymore).  Only the
     translator can judge if the translation requires further
     modification, or is acceptable as is.  Once satisfied with the
     translation, she then removes this `fuzzy' attribute.  The
     `msgmerge' program inserts this flag when it has paired the `msgid'
     and `msgstr' entries by fuzzy search only.  *Note Fuzzy Entries::.

`c-format'
`no-c-format'
     These flags should not be added by a human.  Instead only the
     `xgettext' program adds them.  In an automated PO file processing
     system as proposed here the user changes would be thrown away
     again as soon as the `xgettext' program generates a new template
     file.

     If the `c-format' flag is given for a string, the `msgfmt' program
     performs some more tests to check the validity of the translation.
     *Note msgfmt Invocation::.
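
   As an illustration of how a tool might read the comma separated list
on a `#,' line, here is a small sketch.  The function name is
hypothetical and this is not the parser used by `msgfmt' or `msgmerge';
it only assumes that flags are separated by commas and white space.

```c
#include <string.h>

/* Return 1 if LINE is a `#,' comment listing FLAG, and 0 otherwise.
   Whole flags are compared, so `c-format' does not match inside
   `no-c-format'.  */
static int po_flag_line_has(const char *line, const char *flag)
{
    size_t flag_len = strlen(flag);
    const char *p;

    if (flag_len == 0 || strncmp(line, "#,", 2) != 0)
        return 0;

    p = line + 2;
    while (*p != '\0') {
        size_t len;

        p += strspn(p, ", \t\n");      /* skip separators */
        len = strcspn(p, ", \t\n");    /* length of the next flag */
        if (len == flag_len && strncmp(p, flag, len) == 0)
            return 1;
        p += len;
    }
    return 0;
}
```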

   A different kind of entry is used for translations which involve
plural forms.

     WHITE-SPACE
     #  TRANSLATOR-COMMENTS
     #. AUTOMATIC-COMMENTS
     #: REFERENCE...
     #, FLAG...
     msgid UNTRANSLATED-STRING-SINGULAR
     msgid_plural UNTRANSLATED-STRING-PLURAL
     msgstr[0] TRANSLATED-STRING-CASE-0
     ...
     msgstr[N] TRANSLATED-STRING-CASE-N

   Such an entry can look like this:

     #: src/msgcmp.c:338 src/po-lex.c:699
     #, c-format
     msgid "found %d fatal error"
     msgid_plural "found %d fatal errors"
     msgstr[0] "s'ha trobat %d error fatal"
     msgstr[1] "s'han trobat %d errors fatals"

   It sometimes happens that lines, usually whitespace or comments, follow
the very last entry of a PO file.  Such lines are not part of any entry,
and PO mode is unable to take action on those lines.  By using the PO
mode function `M-x po-normalize', the translator may get rid of those
spurious lines.  *Note Normalizing::.

   The remainder of this section may be safely skipped by those using
PO mode, yet it may be interesting for everybody to have a better idea
of the precise format of a PO file.  On the other hand, those not
having Emacs handy should carefully continue reading on.

   Each of UNTRANSLATED-STRING and TRANSLATED-STRING respects the C
syntax for a character string, including the surrounding quotes and
embedded backslashed escape sequences.  When the time comes to write
multi-line strings, one should not use escaped newlines.  Instead, a
closing quote should follow the last character on the line to be
continued, and an opening quote should resume the string at the
beginning of the following PO file line.  For example:

     msgid ""
     "Here is an example of how one might continue a very long string\n"
     "for the common case the string represents multi-line output.\n"

In this example, the empty string is used on the first line, to allow
better alignment of the `H' from the word `Here' over the `f' from the
word `for'.  In this example, the `msgid' keyword is followed by three
strings, which are meant to be concatenated.  Concatenating the empty
string does not change the resulting overall string, but it is a way
for us to comply with the necessity of `msgid' to be followed by a
string on the same line, while keeping the multi-line presentation
left-justified, as we find this to be a cleaner disposition.  The empty
string could have been omitted, but only if the string starting with
`Here' was promoted on the first line, right after `msgid'.(1) It was
not really necessary either to switch between the two last quoted
strings immediately after the newline `\n'; the switch could have
occurred after _any_ other character.  We just did it this way because
it is neater.

   One should carefully distinguish between end of lines marked as `\n'
_inside_ quotes, which are part of the represented string, and end of
lines in the PO file itself, outside string quotes, which have no
effect on the represented string.
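
   Since PO strings follow the C syntax, the same distinction can be
observed in C itself, where adjacent string literals are concatenated
by the compiler.  The following sketch, with hypothetical function
names, reproduces the example above and counts only the newlines that
belong to the string.

```c
#include <stddef.h>

/* Three adjacent literals forming one string.  The line breaks
   between the literals are layout only; the two `\n' escapes are
   part of the string.  */
static const char *long_message(void)
{
    return ""
           "Here is an example of how one might continue a very long string\n"
           "for the common case the string represents multi-line output.\n";
}

/* Count the newline characters contained in the string itself.  */
static size_t count_newlines(const char *s)
{
    size_t n = 0;

    for (; *s != '\0'; s++)
        if (*s == '\n')
            n++;
    return n;
}
```

   Here `count_newlines (long_message ())' yields 2, however many
source lines the literals are spread over.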

   Outside strings, white lines and comments may be used freely.
Comments start at the beginning of a line with `#' and extend until the
end of the PO file line.  Comments written by translators should have
the initial `#' immediately followed by some white space.  If the `#'
is not immediately followed by white space, this comment is most likely
generated and managed by specialized GNU tools, and might disappear or
be replaced unexpectedly when the PO file is given to `msgmerge'.

   ---------- Footnotes ----------

   (1) This limitation is not imposed by GNU `gettext', but is for
compatibility with the `msgfmt' implementation on Solaris.