30 January 2007
University of Pisa
Giuseppe Attardi

The development of statistical parsers is one of the biggest breakthroughs in natural language processing in recent years. Statistical dependency parsers use knowledge of a language learned from an annotated corpus to produce a dependency parse tree for each sentence, without requiring a phrase structure grammar for that language. Dependency trees represent predicate-argument relations between words; these are both easy for human annotators to understand and convenient for later processing stages. I developed a multilingual dependency parser that achieved good accuracy scores at the CoNLL-X shared task, which covered 13 different languages. The parser is designed as a one-pass deterministic Shift/Reduce parser capable of handling non-projective sentences, which occur more frequently in languages other than English. The parser processes over 200 sentences per second, making it suitable for large-scale deployment. I will describe uses of the parser in Question Answering, Intent Analysis and other document analysis applications.
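
To make the one-pass Shift/Reduce idea concrete, the following is a minimal sketch in Python, not the parser described in the talk. It assumes a simplified three-action scheme (SHIFT, LEFT, RIGHT) and replaces the trained statistical classifier with an oracle that reads the correct heads directly; the function name oracle_parse and the head encoding are hypothetical. It handles projective trees only and does not attempt the non-projective handling mentioned above.

```python
from typing import List, Tuple


def oracle_parse(heads: List[int]) -> Tuple[List[str], List[Tuple[int, int]]]:
    """Replay a shift/reduce derivation for a projective dependency tree.

    heads[i - 1] is the head of token i (tokens are numbered from 1);
    a head of 0 means the token attaches to the artificial ROOT.
    Returns the action sequence and the arcs as (head, dependent) pairs.
    Assumption: a real parser would choose each action with a trained
    classifier; here the known heads stand in for that classifier.
    """
    n = len(heads)
    head_of = [-1] + list(heads)       # index 0 is ROOT, which has no head
    pending = [0] * (n + 1)            # dependents not yet attached, per token
    for h in heads:
        pending[h] += 1

    stack = [0]                        # starts with ROOT only
    buffer = list(range(1, n + 1))     # tokens still to be read, left to right
    actions: List[str] = []
    arcs: List[Tuple[int, int]] = []

    while buffer or len(stack) > 1:
        if len(stack) >= 2:
            top, below = stack[-1], stack[-2]
            # LEFT: the token below the top is a finished dependent of the top.
            if below != 0 and head_of[below] == top and pending[below] == 0:
                arcs.append((top, below))
                pending[top] -= 1
                del stack[-2]
                actions.append("LEFT")
                continue
            # RIGHT: the top is a finished dependent of the token below it.
            if head_of[top] == below and pending[top] == 0:
                arcs.append((below, top))
                pending[below] -= 1
                stack.pop()
                actions.append("RIGHT")
                continue
        if not buffer:
            # No action applies: the tree is non-projective or malformed.
            raise ValueError("tree cannot be derived with these three actions")
        stack.append(buffer.pop(0))
        actions.append("SHIFT")

    return actions, arcs


if __name__ == "__main__":
    # "She reads books": She <- reads -> books, with "reads" attached to ROOT.
    acts, deps = oracle_parse([2, 0, 2])
    print(acts)
    print(deps)
```

Each token is shifted exactly once and removed from the stack exactly once, so the number of actions is linear in sentence length, which is what makes a deterministic parser of this kind fast. A tree with crossing arcs makes this sketch raise an error, which illustrates why a parser aimed at non-projective sentences needs more than these three actions.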