Daisuke Okanohara

Position D3
Research Area Statistical Natural Language Processing, Machine Learning, Data Structures, and Algorithms
Publication English | Japanese
Contact Department of Computer Science, Faculty of Information Science and Technology, University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo, 113-0033, JAPAN
Office Room 615, 7th Building of Faculty of Science
E-mail hillbig at is.s.u-tokyo.ac.jp (replace at with @)

Pen picture

I am a third-year PhD student working on statistical natural language processing, with a particular interest in methods that use very large corpora. I also study data indexing, algorithms for large data, statistical learning theory, and information theory.

Events (past 12 months, international events only)

  • I will give a talk at ALENEX 2010 (Jan. 2010)
  • I gave a talk at SPIRE 2009 (Aug. 2009)
  • I gave a poster presentation at NAACL-HLT 2009 (May 2009)
  • I gave a poster presentation at SDM 2009 (May 2009) poster (ppt)
  • I gave a talk at ESA 2008 (Sep. 2008)

Selected Publications

NLP / Machine Learning

  • "Learning Combination Features with L1 Regularization", D. Okanohara and J. Tsujii. In NAACL-HLT, June 2009. pdf ppt
  • "Text Categorization with All Substring Features", D. Okanohara and J. Tsujii. In the SIAM International Conference on Data Mining (SDM), April 2009. pdf ppt
  • "A Discriminative Language Model with Pseudo-Negative Samples", D. Okanohara and J. Tsujii. In Proc. of ACL 2007. pdf
  • "Improving the Scalability of Semi-Markov Conditional Random Fields for Named Entity Recognition", D. Okanohara, Y. Miyao, Y. Tsuruoka and J. Tsujii. In Proc. of ACL 2006, Sydney, Australia, July 2006. pdf
  • "Assigning Polarity Scores to Reviews Using Machine Learning Techniques", D. Okanohara and J. Tsujii. In IJCNLP 2005, LNCS 3651, Springer-Verlag, Jeju Island, Korea, October 2005. pdf

Algorithms / Data Structures

  • "Conjunctive Filter: Breaking the Entropy Barrier", D. Okanohara and Y. Yoshida. In Proc. of ALENEX 2010. (pdf, pptx (slides), pdf (slides))
  • "A Linear-Time Burrows-Wheeler Transform using Induced Sorting", D. Okanohara and K. Sadakane. In Proc. of the 16th String Processing and Information Retrieval Symposium (SPIRE), Aug. 2009. (pdf (draft))
  • "An Online Algorithm for Finding the Longest Previous Factors", D. Okanohara and K. Sadakane. In the 16th European Symposium on Algorithms (ESA), Sep. 2008. (ppt, pdf)
  • "Practical Entropy-Compressed Rank/Select Dictionary", D. Okanohara and K. Sadakane. In Proc. of ALENEX 2007, New Orleans, Louisiana, January 2007. (ppt, pdf)
  • "Partially Decodable Compression with Static PPM", D. Okanohara. In the Data Compression Conference 2005 poster session, Snowbird, UT, USA, March 2005.

Software

  • Minise MIni Search Engine. A compact full-text search engine supporting sequential search and three index types: inverted file index, N-gram index, and suffix arrays.
  • Ohmm Online EM algorithm for Hidden Markov Models
  • OLL Online Machine Learning Library
  • Bep Associative Arrays for very large collections (and a minimal perfect hash function library)
  • Tx Succinct Trie Data Structure

Principal Developer in Exploratory Software Project (2002-2005)

(Mitou Software Souzou Jigyou)

  • A New Data Compression Algorithm using Word Extraction Method (2002)
  • Universal Probabilistic Language Models (2003)
  • Document Classification using Context Information (2004-2005)

This software is used at Preferred Infrastructure.

Books, Articles

  • "Data Compression Handbook", Shuwa System, 2003 (Japanese)
  • "Compression Algorithms", C Magazine, Softbank Creative, January 2006 (Japanese)

Awards

  • Genome4 Bioinformatics Programming Contest, Problem 2, Best Award, 2004
  • Exploratory Software Project, Super Creator Award, 2005
  • YANS 2006, Best Presentation Award
  • President's Prize of the University of Tokyo, 2007 link (Japanese)
  • YANS 2007, Best Presentation Award
  • IBIS 2008, Prize for Encouragement

Link