This is the file pyhyphen.txt of the CJK macro package ver. 4.8.5
(16-Oct-2021).

Hyphenation patterns for pinyin syllables with and without tone markers
-----------------------------------------------------------------------

Sometimes it makes sense to use unaccented pinyin syllables for common names
and phrases which are repeated frequently; sometimes you are in an
environment which doesn't allow accented pinyin syllables at all. For such
cases it is desirable to have correct hyphenation, avoiding manually added
hints using e.g., `\-' between the syllables.

Fortunately, due to the limited numbers of Chinese pinyin syllables (407 for
Mandarin), it is easy to create hyphenation patterns. The logical
consequence is to add a new `language' to the Babel package, and exactly
this can be found in the directory utils/pyhyphen.


Installation
------------

[Note: The hyphenation patterns have been submitted to the `tex-hyphen'
       project. In recent TeX distributions it is no longer necessary to set
       up pinyin support manually since everything should work out of the
       box.]

This is fairly straightforward. Move the Babel language definition file
pinyin.ldf file to a place found by TeX. If you e.g., maintain a local
TEXMF tree, a good place would be $TEXMFLOCAL/tex/generic/babel/pinyin.ldf.
Similarly, move the pinyin hyphenation pattern files hyph-zh-latn-pinyin.tex
and hyph-zh-latin-tonepinyin.tex into your (local) TEXMF tree: The analogous
place would be directory $TEXMFLOCAL/tex/generic/hyphen.

Now run texconfig (or a similar tool) to add hyph-zh-latn-pinyin.tex and
hyph-zh-latin-tonepinyin.tex to the used hyphenation patterns. In the usual
case you have to add a line saying

  pinyin    hyph-zh-latn-tonepinyin.tex

for Unicode-aware TeX engines (XeTeX, luatex, upTeX) or

  pinyin    hyph-zh-latn-pinyin.tex

for all other engines to the hyphenation configuration file language.dat.
Finally, build a new format file (usually the command `initex latex.ltx');
in most cases this happens automatically.

Using Babel ensures that it works both with LaTeX and Plain TeX.


Usage
-----

Unicode-aware engines
- - - - - - - - - - -

Do something like this:

  \documentclass[...]{...}

  \usepackage[pinyin,german,english]{babel}
  ...

  \begin{document}
  ...
  \foreignlanguage{pinyin}{pinyin with tone markers like Běijīng}
  ...
  \end{document}


Other engines
- - - - - - -

Do something like this:

  \documentclass[...]{...}

  \usepackage[T1]{fontenc}
  \usepackage[pinyin,german,english]{babel}
  ...

  \begin{document}
  ...
  \foreignlanguage{pinyin}{pinyin without tone markers like Beijing}
  ...
  \end{document}


Note 1: pinyin.ldf is intentionally very minimal. Don't expect that e.g.,
        \chapter yields a pinyin version of the Chinese word for `chapter'.
        It might be useful to define a shorthand macro like the following:

          \newcommand{\py}[1]{\foreignlanguage{pinyin}{#1}}

        Now you can simply say

          \py{Beijing}

Note 2: The apostrophe character `'' is used as to precede syllables
        starting with `a', `e', or `u'. This is needed to resolve
        ambiguities like this:

          Xi'an -> Xi-an (two syllables)
          Xian  -> Xian (one syllable)

        If a line break occurs, the apostrophe should vanish, e.g.,

          Tian'anmen -> Tian-
                        anmen

        Use the Babel shorthand "' to enter '; this internally resolves to
        \discretionary.

        The shorthand `"u' (as used in German) is available to input
        `umlaut u'.

Note 3: Most Babel language support files define a `<language>.sty' file
        also. This is not true for pinyin! pinyin.sty is used for accented
        pinyin syllables which don't need a special hyphenation support.
        (pinyin.sty works with Plain TeX also.)


Technical details
-----------------

The dictionary used to construct the hyphenation patterns has been created
with a small C program `pinyin.c' which simply combines all existing Chinese
syllable pairs that don't need an apostrophe inbetween. Then, `patgen' has
been run on the dictionary; `pinyin.tr' defines the used character set.

Due to the regularity of the word combinations, only two-letter patterns of
the first and second level are needed to find all possible breaks without a
single error or omission.

---End of pyhyphen.txt---