• dmortens@cs.cmu.edu

Current Research

KAIROS

My team (PI: Teruko Mitamura) has been awarded a contract (pending negotiations) for the DARPA KAIROS program. I will contribute to techniques and protocols for the curation of schema libraries.

LORELEI

Much of my research over the past five years has centered on DARPA LORELEI, a program that seeks to bring language technologies to low-resource languages, with a particular focus on humanitarian assistance/disaster relief (HA/DR) scenarios. The central task is discovering “situation frames” (SFs), tuples consisting of a humanitarian need, a place (location or geopolitical entity), and other information of value to mission planners. These may be found in speech or text. Discovering them involves a set of ancillary tasks: machine translation (MT), named entity recognition (NER), and entity discovery and linking (EDL). My research output has contributed to all of these tasks.
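As a rough illustration of the tuple structure described above, a situation frame could be modeled as a small record. This is only a sketch: the class name, field names, and example values are hypothetical and are not taken from the LORELEI specification.

```python
from dataclasses import dataclass, field


@dataclass
class SituationFrame:
    """Hypothetical container for one situation frame (SF)."""
    need: str        # the humanitarian need (label is illustrative)
    place: str       # name of the place mentioned in the speech or text
    place_kind: str  # "location" or "geopolitical entity"
    # Any other information of value to mission planners (keys are assumptions).
    extras: dict = field(default_factory=dict)


# Example with made-up values.
sf = SituationFrame(
    need="water",
    place="Example City",
    place_kind="geopolitical entity",
    extras={"source": "text"},
)
```

Keeping the open-ended information in a dictionary reflects the fact that the description above leaves that part of the tuple unspecified.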

Computational Phonology

I have made it my mission to bring the power of 1960s phonology to the NLP researcher of today. I have already completed two of the three legs of this stool. The first is Epitran, an orthography-to-IPA converter for over 60 languages. The second is PanPhon, a Python library for extracting articulatory features from IPA representations and then manipulating these feature representations in various useful ways. The third leg of the stool, PhonFST (currently in progress), is a Python library that compiles phonological rules, written in the rich notation used by phonologists, into finite-state transducers (using OpenFST as a backend). PhonFST will incorporate the feature representations of PanPhon, allowing the construction of expressive and general “meta-rules” that have not been possible in earlier FST regular expression notations like those from XFST and Pynini.
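To give a sense of what a feature-based rule buys over a segment-by-segment rule, here is a minimal sketch of word-final obstruent devoicing, [-son] -> [-voi] / _ #, stated once over features rather than once per segment. It does not use the Epitran, PanPhon, or PhonFST APIs and does not compile anything to a transducer; the tiny feature table, the feature names, and all function names are assumptions made for illustration.

```python
# Toy feature table (NOT PanPhon data): a few segments with four features each.
FEATURES = {
    "p": {"son": "-", "cont": "-", "voi": "-", "place": "lab"},
    "b": {"son": "-", "cont": "-", "voi": "+", "place": "lab"},
    "t": {"son": "-", "cont": "-", "voi": "-", "place": "cor"},
    "d": {"son": "-", "cont": "-", "voi": "+", "place": "cor"},
    "k": {"son": "-", "cont": "-", "voi": "-", "place": "dor"},
    "g": {"son": "-", "cont": "-", "voi": "+", "place": "dor"},
    "s": {"son": "-", "cont": "+", "voi": "-", "place": "cor"},
    "z": {"son": "-", "cont": "+", "voi": "+", "place": "cor"},
    "n": {"son": "+", "cont": "-", "voi": "+", "place": "cor"},
    "a": {"son": "+", "cont": "+", "voi": "+", "place": "dor"},
}


def matches(segment, pattern):
    """True if the segment carries every feature value in the pattern."""
    feats = FEATURES[segment]
    return all(feats[name] == value for name, value in pattern.items())


def apply_change(segment, change):
    """Return the segment whose features equal the original plus the change."""
    target = {**FEATURES[segment], **change}
    for candidate, feats in FEATURES.items():
        if feats == target:
            return candidate
    return segment  # no matching segment in the toy inventory: leave unchanged


def final_devoicing(word):
    """Apply [-son] -> [-voi] / _ # to a word written one segment per character."""
    if not word:
        return word
    last = word[-1]
    if matches(last, {"son": "-"}):
        return word[:-1] + apply_change(last, {"voi": "-"})
    return word


print(final_devoicing("bad"))
print(final_devoicing("ban"))
```

The single pattern {"son": "-"} covers b, d, g, and z at once, which is the kind of generality that feature-aware rules provide; a segment-level notation would need a separate mapping for each voiced obstruent.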