JyutPing and SPPAS-dict

Cantonese is a major Chinese dialect spoken by tens of millions of people in the provinces of Guangdong and Guangxi, the neighboring regions of Hong Kong and Macau, and many overseas Chinese communities. In Cantonese, each Chinese character is pronounced as a monosyllable carrying a specific lexical tone. Each syllable can be divided into two parts: the Initial (onset) and the Final (rime). The Initial is typically a consonant, while the Final contains a vowel nucleus followed by an optional consonant coda. There are 20 Initials and 53 Finals in Cantonese, which combine into over 600 legitimate base syllables. Each base syllable can be associated with different tones. If the tone is changed, the syllable generally refers to another character with a different meaning [1].
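The Initial/Final/tone decomposition described above can be sketched in code. The following is a minimal illustration, not part of SPPAS: it assumes syllables are written in the LSHK romanisation with a trailing tone digit from 1 to 6, and the names INITIALS and split_syllable are hypothetical. The list holds the 19 onset spellings of the LSHK scheme, and an empty onset is also accepted. The Final is not checked against the inventory of legitimate Finals.

```python
import re

# Onset spellings of the LSHK romanisation (assumption of this sketch).
# Two-letter onsets come first so that "gw" is tried before "g".
INITIALS = ["gw", "kw", "ng", "b", "p", "m", "f", "d", "t", "n", "l",
            "g", "k", "h", "w", "z", "c", "s", "j"]

# A toned syllable: lowercase letters followed by one tone digit (1-6).
SYLLABLE = re.compile(r"^([a-z]+)([1-6])$")


def split_syllable(syllable):
    """Split a toned syllable such as 'gwong2' into (initial, final, tone)."""
    match = SYLLABLE.match(syllable.lower())
    if match is None:
        raise ValueError(f"not a toned syllable: {syllable!r}")
    base, tone = match.group(1), int(match.group(2))
    for initial in INITIALS:
        rest = base[len(initial):]
        # Accept the onset only if a vowel letter follows it; this keeps
        # the syllabic nasals "m" and "ng" whole instead of splitting them.
        if base.startswith(initial) and rest and rest[0] in "aeiouy":
            return initial, rest, tone
    # No onset matched: the whole base syllable is the Final.
    return "", base, tone


if __name__ == "__main__":
    for s in ["jyut6", "ping3", "gwong2", "aa3", "ng5"]:
        print(s, split_syllable(s))
```

Changing only the tone digit (for example 'si1' versus 'si6') leaves the Initial and Final unchanged, which mirrors the point that one base syllable can carry several tones.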

JyutPing is a romanisation system for Cantonese developed by the Linguistic Society of Hong Kong (LSHK) in 1993. Table 1 shows the mapping between JyutPing symbols and IPA symbols: the left columns give the JyutPing symbols and the right columns give the corresponding IPA symbols [2].

Table 1. JyutPing symbols and IPA symbols.
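A symbol mapping of this kind is naturally stored as a lookup table. The sketch below shows one possible way to do that in Python. The entries are a small illustrative subset of commonly cited correspondences for Initials; they are an assumption of this example, are not copied from Table 1, and the names JYUTPING_TO_IPA_INITIALS and initial_to_ipa are hypothetical.

```python
# Illustrative subset of Initial -> IPA correspondences (assumed for this
# sketch, not copied from Table 1).
JYUTPING_TO_IPA_INITIALS = {
    "b": "p", "p": "pʰ", "d": "t", "t": "tʰ", "g": "k", "k": "kʰ",
    "m": "m", "n": "n", "ng": "ŋ", "f": "f", "s": "s", "h": "h", "l": "l",
}


def initial_to_ipa(initial):
    """Return the IPA symbol for an Initial; '' means the syllable has no onset."""
    if initial == "":
        return ""
    try:
        return JYUTPING_TO_IPA_INITIALS[initial]
    except KeyError:
        # Fail loudly rather than guess a symbol missing from the table.
        raise KeyError(f"no IPA entry for initial {initial!r}") from None


if __name__ == "__main__":
    for symbol in ["b", "p", "ng"]:
        print(symbol, "->", initial_to_ipa(symbol))
```

Raising an error for unknown symbols, rather than passing them through unchanged, makes gaps in an incomplete table visible early.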

In contrast to the Initial-Final model, Cantonese syllables can also be described at the phone level: 32 phones (13 vowels and 19 consonants) are used to form syllables, and each phone can be represented by an SPPAS-dict symbol [3]. Table 2 shows the mapping between SPPAS-dict symbols and IPA symbols.

Table 2. SPPAS-dict symbols and IPA symbols.

[1] Tan Lee, Yuanyuan Liu, Yu Ting Yeung, Thomas K.T. Law, Kathy Y.S. Lee, "Predicting
severity of voice disorder from DNN-HMM acoustic posteriors", Proceedings of Interspeech,
San Francisco, USA, September 8-12, 2016.

[2] P. C. Ching, T. Lee, W. K. Lo, et al., "Cantonese speech recognition and synthesis",
Advances in Chinese Spoken Language Processing, 2006, pp. 365-386.

[3] R. Fung, B. Bigi, "Automatic word segmentation for spoken Cantonese", Proceedings of the
2015 International Conference Oriental COCOSDA held jointly with 2015 Conference on Asian
Spoken Language Research and Evaluation (O-COCOSDA/CASLRE), IEEE, 2015, pp. 196-201.