Although there has been recent work on building natural language interfaces to databases, either through light schema annotation (Popescu et al.) or through machine learning from pairs of natural language questions and logical queries (Mooney et al. and Collins et al.), the fact remains that users will not widely adopt such interfaces until they exhibit very high levels of breadth and robustness. And while annotation and learning techniques can and should be used to bootstrap configuration, only through the attention and patching actions of a watchful human administrator can systems be engineered into reliable, dominant solutions.
This seminar reviews work I have done with my students toward building full-featured natural language interfaces (NLIs) to databases. We have built up a fair bit of infrastructure around the core problem of parsing single-shot questions into logical queries expressed in a higher-order version of Codd's tuple calculus. Additionally, we treat the inverse problem of generating natural language paraphrases of such tuple-calculus expressions. A particular focus has been on enabling the typical database administrator to configure and maintain such systems over standard relational databases. After demonstrating our current capabilities, we give a glimpse into our plans to evaluate alternative grammar formalisms and to develop techniques that reduce users' feelings of intimidation. Our ultimate goal is to identify and develop application niches where natural language interfaces to databases exhibit marked advantages over approaches based on standard forms, hypertext navigation, information retrieval, and formal query languages.
Michael Minock obtained a Ph.D. in Computer Science from UCLA in 1997 and worked at Microelectronics and Computer Technology Corporation (MCC) in Austin, TX, until 2000. He has been a senior lecturer at the Department of Computing Science at Umeå University, Sweden, since 2001.