| Position | Research Associate |
| Degree | Ph.D., University of Tokyo |
| Research | Parsing algorithms, grammar acquisition, probabilistic models for parse disambiguation |
| Publication | publication page |
| Contact | Department of Computer Science, Faculty of Information Science and Technology, University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, 113-0033 Tokyo, JAPAN; E-mail: yusuke@is.s.u-tokyo.ac.jp; Office: Room 401, 7th Building of Faculty of Science; Tel: +81/0 3 5841 4088; Fax: +81/0 3 5802 8872 |
I am a Research Associate at the University of Tokyo and have been working on computational linguistics and natural language processing since 1998. I studied computer science and computational linguistics in the Department of Information Science at the University of Tokyo, where I received a B.Sc. in 1998, an M.Sc. in 2000, and a Ph.D. in 2006. My research focused on the efficient processing of disjunctive feature structures and on probabilistic models for unification-based parsing. Currently I am interested in grammar engineering based on grammar acquisition from large corpora.
I am working on linguistic and mathematical models of natural language and their application to real-world texts. While recent studies have succeeded in developing various practical NLP techniques, my interest is rather in the mathematical modeling of language. I first worked on efficient algorithms for HPSG parsing, and I observed that many problems remain in the theories and in their implementations. This observation led me to my current research on corpus-oriented grammar development. A grammar developed by this method can compute linguistically sound analyses of real-world texts, and it also lets us evaluate the validity of grammar theories against real-world texts. My research interests also include probabilistic models for parse disambiguation, which are essential because wide-coverage grammars produce highly ambiguous parse candidates. These methods have been applied to the development of Enju, a wide-coverage HPSG parser. This parser is currently used in various applications, including MEDIE and Info-PubMed.