This is the file pyhyphen.txt of the CJK macro package ver. 4.8.5 (16-Oct-2021). Hyphenation patterns for pinyin syllables with and without tone markers ----------------------------------------------------------------------- Sometimes it makes sense to use unaccented pinyin syllables for common names and phrases which are repeated frequently; sometimes you are in an environment which doesn't allow accented pinyin syllables at all. For such cases it is desirable to have correct hyphenation, avoiding manually added hints using e.g., `\-' between the syllables. Fortunately, due to the limited numbers of Chinese pinyin syllables (407 for Mandarin), it is easy to create hyphenation patterns. The logical consequence is to add a new `language' to the Babel package, and exactly this can be found in the directory utils/pyhyphen. Installation ------------ [Note: The hyphenation patterns have been submitted to the `tex-hyphen' project. In recent TeX distributions it is no longer necessary to set up pinyin support manually since everything should work out of the box.] This is fairly straightforward. Move the Babel language definition file pinyin.ldf file to a place found by TeX. If you e.g., maintain a local TEXMF tree, a good place would be $TEXMFLOCAL/tex/generic/babel/pinyin.ldf. Similarly, move the pinyin hyphenation pattern files hyph-zh-latn-pinyin.tex and hyph-zh-latin-tonepinyin.tex into your (local) TEXMF tree: The analogous place would be directory $TEXMFLOCAL/tex/generic/hyphen. Now run texconfig (or a similar tool) to add hyph-zh-latn-pinyin.tex and hyph-zh-latin-tonepinyin.tex to the used hyphenation patterns. In the usual case you have to add a line saying pinyin hyph-zh-latn-tonepinyin.tex for Unicode-aware TeX engines (XeTeX, luatex, upTeX) or pinyin hyph-zh-latn-pinyin.tex for all other engines to the hyphenation configuration file language.dat. Finally, build a new format file (usually the command `initex latex.ltx'); in most cases this happens automatically. Using Babel ensures that it works both with LaTeX and Plain TeX. Usage ----- Unicode-aware engines - - - - - - - - - - - Do something like this: \documentclass[...]{...} \usepackage[pinyin,german,english]{babel} ... \begin{document} ... \foreignlanguage{pinyin}{pinyin with tone markers like Běijīng} ... \end{document} Other engines - - - - - - - Do something like this: \documentclass[...]{...} \usepackage[T1]{fontenc} \usepackage[pinyin,german,english]{babel} ... \begin{document} ... \foreignlanguage{pinyin}{pinyin without tone markers like Beijing} ... \end{document} Note 1: pinyin.ldf is intentionally very minimal. Don't expect that e.g., \chapter yields a pinyin version of the Chinese word for `chapter'. It might be useful to define a shorthand macro like the following: \newcommand{\py}[1]{\foreignlanguage{pinyin}{#1}} Now you can simply say \py{Beijing} Note 2: The apostrophe character `'' is used as to precede syllables starting with `a', `e', or `u'. This is needed to resolve ambiguities like this: Xi'an -> Xi-an (two syllables) Xian -> Xian (one syllable) If a line break occurs, the apostrophe should vanish, e.g., Tian'anmen -> Tian- anmen Use the Babel shorthand "' to enter '; this internally resolves to \discretionary. The shorthand `"u' (as used in German) is available to input `umlaut u'. Note 3: Most Babel language support files define a `.sty' file also. This is not true for pinyin! pinyin.sty is used for accented pinyin syllables which don't need a special hyphenation support. (pinyin.sty works with Plain TeX also.) Technical details ----------------- The dictionary used to construct the hyphenation patterns has been created with a small C program `pinyin.c' which simply combines all existing Chinese syllable pairs that don't need an apostrophe inbetween. Then, `patgen' has been run on the dictionary; `pinyin.tr' defines the used character set. Due to the regularity of the word combinations, only two-letter patterns of the first and second level are needed to find all possible breaks without a single error or omission. ---End of pyhyphen.txt---