NAME
    cxsearch

SYNOPSIS
    cxsearch -spkr spkrdir [-so] [-sp] xscript

DESCRIPTION
    A utility for searching files on the CDROM of the CU Corpora using
    regular expressions. The items to be searched can be either the
    phonetic transcription or the orthographic transcription (BIG5 text)
    of the data.

    [Note: This utility is not applicable to CUSYL.]

OPTIONS
    The following options are supported:

    -spkr spkrdir
        "spkrdir" is the speaker directory on the CDROM of the CU Corpora.
        At the time of writing this program, the corpora known to be
        applicable are: CUWORD, CUDIGIT, CUCMD, CUSENT.

    -so
        Search by the orthographic transcription. The transcription in
        the CU Corpora is Chinese text in BIG5 code. This option and the
        -sp option are mutually exclusive. If both are specified on the
        command line, the last one takes effect.

    -sp
        Search by the phonemic transcription. The transcription in the
        CU Corpora follows the LSHK scheme. This option is the default.
        This option and the -so option are mutually exclusive. If both
        are specified on the command line, the last one takes effect.

    xscript
        The pattern to be searched for. It can simply be the orthographic
        (BIG5 text) or phonemic transcription you are interested in.
        The pattern is in fact treated as a regular expression (RE), so
        you may describe the transcriptions you are interested in as an
        RE, which makes this a powerful search utility for the corpora.

EXAMPLE
    Normal string matching

        cxsearch -spkr CC04M -sp hoi1

    This retrieves the names of all files containing the syllable /hoi1/.

    Regular expression matching

        cxsearch -spkr CC04M -sp oi[12]

    This retrieves the names of all files containing a syllable with the
    final /oi/ in either tone 1 or tone 2.

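The two matching modes in the examples can be illustrated outside the tool. The following is a minimal sketch in Python, assuming that cxsearch patterns behave like ordinary regular expressions; it uses Python's re module as a stand-in, not the matcher cxsearch actually uses. The file names, the transcriptions and the search() helper are all hypothetical and are not taken from the CU Corpora.

```python
import re

# Hypothetical file name -> LSHK transcription pairs (illustration only,
# not data from the CU Corpora).
transcripts = {
    "utt001": "hoi1 sam1",
    "utt002": "coi2 daan1",
    "utt003": "hoi6 ci2",
}

def search(pattern, table):
    """Return the names whose transcription contains a match for pattern."""
    rx = re.compile(pattern)
    return [name for name, text in table.items() if rx.search(text)]

# Plain string matching: only transcriptions containing "hoi1" match.
print(search("hoi1", transcripts))

# Regular expression matching: final /oi/ followed by tone 1 or tone 2.
print(search("oi[12]", transcripts))
```

With this sample table, the first call selects only utt001, while the second selects utt001 and utt002; utt003 is skipped because its /oi/ syllable carries tone 6.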
REGULAR EXPRESSION SHORT NOTE
    *     matches zero or more occurrences of the preceding character
    +     matches one or more occurrences of the preceding character
    []    matches any one of the characters inside the brackets

AUTHOR
    HSIUNG Chun-Fat and LO Wai-Kit (wklo@ieee.org)

CONTACT
    For more information, please visit http://dsp.ee.cuhk.edu.hk/speech

ACKNOWLEDGEMENT
    cxsearch is part of the Cantonese Speech Database project carried out
    by the Digital Signal Processing Laboratory (DSPLAB), Department of
    Electronic Engineering, the Chinese University of Hong Kong. This
    project (AF/20/97) is supported by the Industrial Support Fund (ISF),
    Industry Department, Hong Kong Special Administrative Region
    government, China.

DISCLAIMER
    cxsearch is distributed in the hope that it will be useful, but
    WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

    Any opinions, findings, conclusions or recommendations expressed in
    this material/event (or by members of the project team) do not
    reflect the views of the Government of the Hong Kong Special
    Administrative Region, the Industry Department or the Industry and
    Technology Development Council.

COPYRIGHT
    (C) 1999 The Chinese University of Hong Kong. All Rights Reserved.