Research Interests
Computational linguistics and NLP
My computational interests center on applying computational morphology and phonology to problems in natural language processing. While linguistic representations have long been employed in a variety of NLP tasks, these representations have been primarily morphological, syntactic, semantic, and discourse-analytic. My goal has been to expand the role of morphological, and especially phonological, representations in NLP. Specific research directions include the following:
- Phonological representations for cross-lingual transfer learning
- Phonological representations (segments, features) in named entity recognition (NER)
- Predicting the forms of loanwords from phonological models
- Approximate phonological matching and entity linking
- The implementation and application of automatic (rule-based, unsupervised, and semi-supervised) syllabification
- Tools for compiling metarules (with phonological feature specifications) into finite-state transducers
My interests in computational morphology are perhaps more straightforward:
- Developing more user-friendly and expressive frameworks for writing morphological analyzers, particularly for languages with non-concatenative morphology
- Rich morphological representations for machine translation from morphologically rich source languages
- Machine translation into morphologically rich target languages
Finally, I have long-standing interests in computational historical phonology:
- Testing hypothetical proto-language reconstructions and sequences of sound changes by computational means
- The cognate prediction problem: given corpora of two related languages and a word from one corpus, predict the form of its cognate in the other language, even if that cognate is not attested in the corpus
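To make the first of these directions concrete, here is a minimal sketch of how a reconstruction might be tested: an ordered list of sound changes is applied to each proposed proto-form, and the predicted reflex is compared with the attested form. Everything in it is an assumption made for illustration. The rules, the proto-forms, the attested forms, and the names `SOUND_CHANGES`, `TOY_DATA`, `apply_changes`, and `evaluate` are all invented and do not come from any real language or from my own tools. Plain regular expressions stand in for a proper rule formalism.

```python
import re

# Hypothetical ordered sound changes as (pattern, replacement) pairs.
# These are invented for illustration, not taken from any reconstruction.
SOUND_CHANGES = [
    (r"p(?=[aeiou])", "f"),  # p > f before a vowel
    (r"k(?=[ie])", "tʃ"),    # k > tʃ before a front vowel
    (r"[aeiou]$", ""),       # loss of word-final vowels
]

# Invented proto-forms paired with invented "attested" reflexes.
TOY_DATA = {
    "pata": "fat",
    "kina": "tʃin",
    "kupa": "kuf",
    "tipi": "tip",  # deliberately not derivable from the rules above
}


def apply_changes(proto_form, changes):
    """Apply each sound change in order and return the predicted reflex."""
    form = proto_form
    for pattern, replacement in changes:
        form = re.sub(pattern, replacement, form)
    return form


def evaluate(reconstructions, changes):
    """Compare predicted reflexes with attested forms.

    Returns a list of (proto, predicted, attested, matches) tuples.
    """
    results = []
    for proto, attested in reconstructions.items():
        predicted = apply_changes(proto, changes)
        results.append((proto, predicted, attested, predicted == attested))
    return results


if __name__ == "__main__":
    for proto, predicted, attested, matches in evaluate(TOY_DATA, SOUND_CHANGES):
        status = "ok" if matches else "MISMATCH"
        print(f"*{proto} > {predicted} (attested: {attested}) {status}")
```

A mismatch, such as the one for the invented form *tipi, points to a flaw in the proto-form, in a rule, or in the rule ordering. Because the rules apply in sequence, reordering them can change the predictions, which is what makes it possible to test competing relative chronologies.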
Theoretical and descriptive linguistics
I have long worked on languages of East and Southeast Asia. The specific languages and groups that I am interested in and have worked on are as follows:
- Hmong-Mien
  - Western Hmongic
- Tibeto-Burman
  - Tangkhulic
  - Kuki-Chin
  - Jingpho
Aside from specific languages, I am interested in a variety of linguistic subfields and issues. Here is a representative outline of my theoretical interests:
- Phonology
  - Tone
  - Phonation type/register
  - Chain shifts and other counterfeeding opacity
  - Abstractness of phonological relationships
  - Phonetics-Phonology interface
  - Phonology-Morphology interface
- Morphology
  - Compounding
  - Process morphology
  - Reduplication
  - Phonological constraints on morphotactics
  - Affix ordering
- Historical Linguistics
  - Comparative reconstruction
  - Reconstructing phonological grammars
  - Speaker misunderstanding and misinterpretation as a source of linguistic innovation
  - Language contact
- Language Description