$Revision: 1.2 $ $Date: 2001/02/20 15:21:13 $ 20 !!! README for gemtext. !!! If you think the above mentioned date is old check out http://stud.uni-sb.de/~gufl0000/atari/gemtext. You will find the latest version there. You will also find links to ftp sites from where you can download gemtext via anonymous ftp. See the file INSTALL for installing gemtext. See the file COPYING for licensing. See the file NEWS for information that is not covered by either this file or the provided manual pages. All information concerning features of gemtext included in any text file that comes along with this package is subject to change without notice! ========================================================================== This is the gemtext package. The gemtext package offers standardized i18n (internationalization) features for GEM programs. The package is intended to help programmers to internationalize their programs. If you aren't a programmer you might want to read the file ABOUT-NLS to get an idea of national language support. But you won't need gemtext. GEM is a registered trademark of Digital Research Company. GEM is mainly used on Atari computers or the various emulations available today on other platforms. You can also find the so-called PC-GEM for IBM compatible machines. If you neither use an Atari nor emulate the Atari's OS you won't need gemtext too. Still it might be a good idea to have a look at the file ABOUT-NLS. All programs and libraries contained in this package will work on their own, only depending on some executables and libraries that can be found in most Unix-like environments. Yet, you will only benefit from gemtext if you have the GNU gettext package or another internationalization package that can be compared with gettext already installed. You will find the latest version of GNU gettext at ftp://ftp.gnu.org/pub/gnu It compiles without any problems with the GNU C-compiler and MiNTLib PL46. (Well, there's a little problem: If your msgfmt keeps on crashing with a bus error try to replace the module obstack.o in your MiNTLib with the obstack.c that comes along with GNU gettext. But this is a bug in the MiNTLib and not in GNU gettext.) In the following I will refer to any computer that can make use of xgemtext as 'Atari'. I also expect you to be familiar with the C programming language and with the well known data structures and function calls that are necessary for programming GEM applications. He, she, they... Quote from the gettext info file: "In this manual, we use *he* when speaking of the programmer or maintainer, *she* when speaking of the translator, and *they* when speaking of the installers or end users of the translated program. This is only a convenience for clarifying the documentation. It is *absolutely* not meant to imply that some roles are more appropriate to males or females. Besides, as you might guess, GNU `gettext' is meant to be useful for people using computers, whatever their sex, race, religion or nationality!" Last but not least: xgemtext is written in C++. Hence you will need a C++ compiler to compile the package. GNU C++ is a good choice if you don't already have one. If you don't have the system resources to run the compiler you will have to try to get an already compiled version. Good news: The library (which is the more important part of the gemtext package) is entirely written in plain C to allow easy porting to different compiler systems. CONTENTS ======== 1. Why gemtext? 2. Alternatives for i18n 3. How to use the GNU gettext package 4. How to use the gemtext package 5. The library rintl 6. Charsets: Atari versus ISO-Latin 1 7. Miscellanea 1. Why gemtext? =============== The Atari's GUI (Graphical User Interface) offers possibilities for native language support for a long time already. All necessary graphical data and text for the user interface is usually kept in a so-called resource file. For example the program foobar.app will normally look for the resource file foobar.rsc in the same directory. A programmer aware of i18n will usually provide several of these resource file each of them containing the entire data in different languages, called e. g. en.rsc, fr.rsc, it.rsc and so on. The users have to rename the particular resource file in their language to foobar.rsc and the program will still behave the same but will display its messages in the chosen language. This procedere is suboptimal for various reasons. - The most obvious one is that you keep loads of redundant data in the extra provided resouce files. Usually the strings in the resource files occupy very little space in comparison to the graphical data. - GEM applications usually expect their resource files to reside at a fixed location and to have a fixed name. Hence, in a multi-user environment you have to choose one particular language for each application or you have to keep different copies of both your resource and other data files. - Once you have decided to include your resources into your source code you have to say goodbye to the idea of i18n. - Probably the most important disadvantage of the current system is the effort you have to take in updating your applications. Whenever you have changed the tree structure in your resource file you will have to edit every other resource file immediately if you do the translation yourself. If you let somebody else do your translations she will have enormous difficulties. She will have to click through the entire resource tree, taking care not to change the data structures unwillingly, and picking out the texts that have changed. If you haven't found some trickier system it is probably easier to restart the translation from the beginning. - Whenever your application has to display a message you have to edit your resource file even for little ones like: form_alert (1, "[1][ Memory exhausted! ][ Damned ]"); Every string that has to be translated has to appear in the resource file. Considering that most resource file formats are restricted to 64 kB of length and that most GEM libraries can handle only one single resource file this is often a severe restriction in the design of the GUI. 2. Alternatives for i18n ======================== As far as programs are concerned that have to be run via a command line interpreter, there is already a portable, standardized system for message internationalization, the GNU gettext package. It offers tools for preparing all kinds of source files for i18n, tools that help both the programmer and translator to keep track of updated versions (and the maintainers, too) and it comes along with a free C library that gives access to the translated messages at runtime of the program depending on the setting of documented environment variables. You should get a copy of the GNU gettext package now. Yet, if you are too curious to download it at once, just go ahead but you might want to reread this file later as it refers to information that is available only with the GNU gettext package. The gemtext package should also be useful with NLS-packages (Native Language Support) other than GNU gettext to the degree that GNU gettext itself is compatible with other standardized packages. Yet, it has been only tested with GNU gettext as it is the only available one for Atari. 3. How to use the GNU gettext package ===================================== Forget about the whole GUI thing for a short while to get roughly an idea of programming with the GNU gettext package. We'll come back to the GUI shortly. NOTE: This section isn't meant to be an exhaustive description of the GNU gettext package. It is intended to give you an overview. If you already know the GNU gettext package you might safely skip this section. Localizing messages in a C program is very easily done. At first you need some (usually three) lines of extra code to tell the NLS library functions where to find the translated messages. Then you have to go through your code and decide for every string whether you want to get it translated or not. Strings that should be translated have to be embedded in a function call to gettext (). Thus instead of writing: printf ("Hello world!\n"); you would now write printf (gettext ("Hello world!\n")); When you're fed up with typing g-e-t-t-e-x-t over and over again you will follow the habit of most of your colleagues and #define _(s) gettext (s) well knowing that now only three extra keystrokes will make the rest of the world happy. Very handy, very easy, isn't it? Unfortunately it doesn't work if you use strings for example in initialized structures or arrays. Really? Just to make sure that you don't abandon the whole thing already, I will tell you a secret: It still works! Read the gettext info file and follow the nodes Sources -> Special cases. After having marked all translatable strings in your sources you will want to extract them for translation by xgettext *.c producing a file messages.po that contains both the original strings and space for the translations. After the file being translated you will compile it into a binary by a call to the program msgfmt. This binary file will then usually be moved to /usr/(local/)share/locale/de/LC_MESSAGES/foobar.mo provided that your program has been translated into German ('de' is the ISO-639 2-letter country code for Germany). Now you set one of the envariables LANGUAGE, LANG or LC_ALL to 'de' and recompile your sources. Your linker will then complain about the reference to 'gettext' not being resolved and you will fix the problem by linking the program with the 'intl'-library that comes along with gettext (and gemtext, too). Surprise, surprise! Your program has learned to say hello to the world in yet another language. After long and satisfying experiences with your chef-d'oeuvre you might feel that it is not enough to say hello to the world. You want to say hello to the whole universe, too. So what now? Redo the whole thing? No! You simply update your sources and extract a new messages file. Before translating the file entirely again you should have msgmerge have a glimpse at both the original and the updated file. The program msgmerge is very clever and will probably guess what you have done. It will produce a new .po file that will already contain the strings you have translated the last time. It will also not cast away those strings that you do not care to translate any longer. It will comment them out instead, thus allowing you to change your mind again later. This feature is also very handy for corrected misspellings or minor modifications to strings. If msgmerge encounters such a case it will mark the maybe incorrect translation it guesses to be best as a "fuzzy" translation. If you agree with msgmerge, simply remove the "fuzzy" comment. Well, that's it in brief. Again, if you want to understand the GNU gettext package in its whole powerfulness you can't help reading the documentation. 4. How to use the gemtext package ================================= Probably you have already seen that it's no longer necessary to put strings for alert boxes or other free strings in your resource files. What has worked with printf will work with form_alert, too. And what about resources integrated in your source files? Why not? Most resource construction sets offer the possibility to produce a C source file as output and running a simple awk script on this file will already do the whole job for you. (An important advantage of integrated resources is that you're not restricted to 64 kB anymore). But if you don't bother to write all the routines to fix adresses of object trees and so on you will estimate the help of xgemtext from this package. Running xgemtext *.rsc in your source directory will extract all strings from your resource files placing them in a file messages.c looking more or less like this: /* This file was automatically produced by xgemtext */ #ifndef gettext_noop #define gettext_noop(s) (s) #endif const char* volatile nls_rsc_strings[] = { gettext_noop (" Foobar"), gettext_noop (" File"), gettext_noop (" Options"), gettext_noop (" About Foobar..."), gettext_noop (" Open"), ... /* All other strings in your resource. */ 0L } It looks like a real C source file and it actually is one. Yet, usually you will never compile it nor link the corresponding object file. But you never know... Various options control xgemtext's behavior while extracting the strings from your resources, see the manual page for detailed information. Of course it does not extract only strings from objects of type G_TEXT, G_STRING etc. but from all kinds of objects containing text, including icons and bitmaps. So what's the purpose of messages.c if not compiling? You can run it (together with your other sources) through xgettext and now you have your resource strings in a .po file that can be processed with all the tools from the GNU gettext package. 5. The library rintl ==================== Your resource file hasn't been patched by xgemtext. So, how to get the translated strings into the objects where they should go? You load your resource as usual. Now you simply have to take the habit to call gettree() for each object tree after fixing the object addresses via rsrc_gaddr(). The library function gettree() will walk thru the specified object tree replacing each pointer to a string with a pointer to the translated versions. It will also modify some members of the internal data structures (string length etc.) to the appropriate values. For simple objects this is all that you have to do. For more complicated ones translating the strings might leave your carefully designed objects a mess, beginning with menues. Remember that all string lengths may have been arbitrarily changed, resulting in objects that might hide each other or ugly looking gaps between objects that were meant to be close to each other. You will find numerous functions in the library rintl that will try to bring your objects back to a tolerable outer appearance again, each of them being called rnl(). You will also find a function called rnltree() that will try to do the job for entire object trees. You can control the behavior of these adapt functions in numerous ways that should meet as many requirements as possible. You can specify spaces to be left between objects, margins to the tree borders, minimum widths of button objects and so on. You can also tell the functions not to walk thru the whole object tree but stopping at a certain level, thus allowing you to process different parts of your objects each with individual values for the control flags and variables. Menus usually require special treatment. rnlmenu() will usually to the job in a satisfying manner. As you might guess the result still can't be perfect. Your objects might need a little (or a whole lot of) fine tuning after being translated. Special care is also expected from the translator. She has to follow some simple rules while doing her job. You yourself, the programmer, should follow some rules when designing your object trees. The rnltree() function will work best if it finds all object children in rows and columns of constant height. This can be easily achieved by putting object children belonging logically together into visible or invisible boxes. For each library function you will find a manual page describing its functionality in detail. In the subdirectory ``example'' after building the binaries you will find a simple example program showing all this in a simple manner. Set the envariable LANGUAGE to one of the values de (German), it (Italian), fr (French), nl (Dutch), ga (Irish) or pt (portuguese) to select your preferred language. Setting the envariable to an unknown value will make the example program choose the default values ``C'' (in this case English). Select the menu entry ``International...'' to see a simple text displayed in your selected language. The entry ``Language...'' might allow you to change the language even at runtime but this depends on the existence of the putenv() function in your standard C library. If putenv() is not found the entry will be disabled. You have to change the language on the command line. In the ``Options'' menu you can follow the metamorphosis of a GEM dialog, first without any treatment, then after a call to rnltree() and finally after some additional beauty cures. See the source and the manual pages rnltree(3) and rnlpush(3) for details. 6. Charsets: Atari versus ISO-Latin 1 ===================================== There's a dilemma with the i18n thing. On other platforms (especially of the workstation sector) you can not only localize the language but also the charset to utilize. AFAIK there's no C library available that allows you to do so on Atari. A couple of years ago there was but one choice for a charset in the Atari sector, the charset of the Atari's built-in system font. Now more and more Atari users have switched to the universally used ISO-Latin 1 charset also known as i-8859-1, especially when using a command line interpreter. If you use already internationalized GNU packages you will come across this charset, too. (Note that both ISO-Latin 1 and the Atari charset are intended for use with western european languages, the Atari's charset to a certain extent also for Hebrew and Greek. Hence the following refers only to those languages). I have chosen the following way: As it is very unlikely that you use an ISO-Latin 1 charset with GEM, the files containing the messages for the example program - which has a GUI - have to be read with the Atari charset. But these files shouldn't go into your locale directory. The whole thing is provided as an example and should stay where it is (i. e. it isn't installed). For xgemtext you have the choice. If you use an ISO-Latin-1 font (as you should do) simply follow the instructions in ABOUT-NLS. If you insist on using a font with an Atari codeset set the environment variable LANGUAGE (or LC_ALL or LC_xxx) to .atarist where should specify the language you want to use. When installing xgemtext the appropriate directories in your locale directory get created. They will be called /.atarist/LC_MESSAGES. 7. Miscellanea ============== Now something that might leave you a little uneasy: The translating project relies on the original programs being written in English. Say you have written your program in Rhaeto-Romance and you want to have it translated into Irish. Well, the diligent translator from the Aron Islands is lucky enough to understand English and she probably won't bother to learn Rhaeto-Romance. The only possibility to solve the problem is to agree upon one base language and this language is English. If you have made it until here through this document I'm sure you will manage to write the messages in your sources and resources in English, too. Of course you will have to explain your users how to benefit from the new i18n features of your program. Include the file ABOUT-NLS that is part of the GNU gettext distribution. This will tell your users the necessary measures to take. You should also include a copy of the file ATARI.NLS in your sources. This file is intended to be read by translators. They will find some useful hints for translating messages for GEM programs. Note that gemtext is *not* part of the GNU project! Bug reports, comments, compliments should be sent to the author, flames and criticism to /dev/null. Please also note that everything you find in the directory intl *is* part of the GNU gettext package. Thus, any comments concerning files in this subdirectory should be directed to the maintainer of the GNU gettext package. There are some other files that consist partly or entirely of copyrighted material by the Free Software Foundation. If in doubt check the header texts. Hope you have fun with gemtext! Guido Flohr .