shithub: riscv

ref: 85728b93ec89ec80e4dfc2aa36a490ec5dda2452
dir: /sys/man/1/ktrans/

View raw version
.TH KTRANS 1
.SH NAME
ktrans \- language transliterator
.SH SYNOPSIS
.B ktrans
[
.B -G
]
[
.B -l
.I lang
]
[
.I kbdtap
]
.SH DESCRIPTION
.I Ktrans
transliterates a stream of keyboard
events. Without any arguments,
.I ktrans
reads events from standard input
and writes out converted events to stdout.
If a
.I kbdtap
file is given, it is used for both
input and output instead.
.I Ktrans
starts in a passthrough mode, echoing out
the input with no conversions. Control characters
are used to give instructions, the following
control sequences are used to switch between languages:
.TP
.B ctl-t
English (Passthrough).
.TP
.B ctl-n
Japanese Hiragana.
.TP
.B ctl-k
Japanese Katakana.
.TP
.B ctl-c
Chinese.
.TP
.B ctl-r
Russian.
.TP
.B ctl-o
Greek.
.TP
.B ctl-s
Korean.
.TP
.B ctl-v
Vietnamese.
.SH CONVERSION
Conversion is done in two layers, an implicit
layer for unambiguous mappings, and an explicit
layer for selecting one match out of a list of
ambiguous matches. The following control characters
are used for conversion instructions.
.TP
.B ctl-\e
Explicitly match the current input, consecutive inputs of ctl-\e
will cycle through all the possible options.
.TP
.B ctl-l
Reset the current input buffer.
.PP
The implicit layer happens automatically as characters
are input, transforming a consecutive set of key strokes
in to their rune counterpart. A series of runes may then
be explicitly matched by cycling through a list of options.
.I Ktrans
automatically maintains a buffer of the current series of
key strokes being considered for an explicit match, and resets
that buffer on logical "word" breaks depending on the language.
However manual hints of when to reset this buffer will likely
still be required.
.PP
Input is always passed along, when a match is found
.I Ktrans
will emit backspaces to clear the input sequence and replace
it with the matched sequence.
.SH DISPLAY
.I Ktrans
will provide a graphical display of current explicit conversion
candidates as implicit conversion is done. Candidates are highlighted
as a user cycles through them. At the bottom of the list is an exit
button for quitting the program. Keyboard input typed in to the window is
transliterated but discarded, providing a scratch input space. The 
.B -G
option disables this display.
.SH "KEY MAPPING"
For convenience, the control characters used by
.I ktrans
can be mapped directly to physical keys through modifications
of the kbmap (see
.IR kbdfs (8)).
The
.B /sys/lib/kbmap/jp
mapping will turn language input keys
present on Japanese A01/106/109(A) in to control
sequences matching their label:
.TP
.B Henkan
Convert to Kanji (ctl-\e)
.TP
.B Muhenkan
Clear Kanji buffer (ctl-l)
.TP
.B Hiragana / Katakana
Switch to Hiragana (ctl-n)
.TP
.B Shift + Hiragana / Katakana
Switch to Katakana (ctl-v)
.TP
.B Hankaku / Zenkaku
Switch to Hiragana (ctl-n)
.TP
.B Shift + Hankaku / Zenkaku
Switch to passthrough (ctl-t)
.TP
.B Shift + Space
Convert to Kanji (ctl-\e).
This is a fallback for keyboards without a physical Henkan key.
.SH JAPANESE
The Hiragana and Katakana modes implicitly turn Hepburn representations
in to their Kana counterparts. Explicit conversions combine sequences
of Hiragana in to Kanji.
.PP
Capital Latin input is used for hinting. For adjectives and verbs, a single
capital is used as an Okurigana hint. For example,
.ft Jp
動かす
.ft
is typed as 'ugoKasu[^\e]'. The hint serves two purposes, it is
provided as part of the explicit sequence for Kanji lookup and denotes
that the following runes are Okurigana. 
.PP
For particles, the entire Kana may be input in upper case. This similarly
denotes the end of the Kanji portion of the sequence, but is not used
as part of the lookup sequence itself. So to write
.ft Jp
私の猫
.ft
the user types "watashiNO[^\e]neko[^\e]". Note that in both cases
we have successfully communicated to krans when to reset the explicit
match buffer without needing to explicitily give a ctl-l character.
.SH CHINESE
The Wubizixing input method is used. No implicit conversion is done,
explicit conversion interprets Latin characters as their Wubi counterparts
to do lookup of Hanzi.
.SH RUSSIAN
Implicit layer converts latin to Cyrillic; the transliteration is mostly
phonetic, with
.B '
for
.IR myagkij-znak
(ь),
.B ''
for
.I tverdyj-znak
(ъ)
.I yo
for ё,
.B j
for
.IR i-kratkaya
(й).
.SH VIETNAMESE
Implicit conversion is modeled after Telex, supporting
standard diacritic suffixes.
.SH KOREAN
Mapping is done by emulating a Dubeolsik layout, with each latin
character mapping to a single Jamo. Sequences of up to three Jamo
are automatically converted to Hangul syllables.
.SH EXAMPLES
To type the following Japanese text:

.ft Jp
私は毎日35分以上歩いて、 更に10分電車に乗って学校に通います。
 健康の維持にも役だっていますが、 なかなかたのしいものです。
.ft

your keyboard typing stream should be:

watashiHA[^\e]mainichi[^\e]35[^l]fun[^\e]ijou[^\e]aruIte,[^\e]
saraNI[^\e]10[^l]fun[^\e]denshaNI[^\e]noTte[^\e]gakkouNI[^\e]
kayoImasu.[\en]kenkouNO[^\e]ijiNImo[^\e]yakuDAtteimasuga,[^\e]
nakanakatanoshiImonodesu.[\en]

where [^\e] and [^l] indicate 'ctl-\e' and 'ctl-l',
respectively.
.SH SOURCE
.B /sys/src/cmd/ktrans
.SH SEE ALSO
.IR rio (4)
.IR kbdfs (8)
.SH BUGS
.PP
There is no hint from rio when the user moves the cursor, as such
moving it is unlikely to result in what the user expects.
.PP
Plan9 lacks support for rendering combinational Unicode sequences,
limiting the use of some code ranges.
.SH HISTORY
Ktrans was originally written by Kenji Okamoto in August of 2000 for
the 2nd edition of Plan 9.  It was imported in to 9front in July of
2022, with patches by several contributors. It was towed inside
the environment during the 2022 9front hackathon.