
GENIAcorpus3.0p

This is the part-of-speech (POS) annotated version of the GENIA corpus, version 3.0 (2,000 abstracts).

As in version 2.1, the tag set is essentially the Penn Treebank (PTB) POS tag set, with the following major differences.

The corpus is available in three formats.

The XML files have been checked with Internet Explorer 6.0 and Mozilla 1.1.

The corpus is available from the download page.


The pages were last updated on 17 March 2003 by Tateisi Yuka.

Department of Information Science, Faculty of Science, University of Tokyo, Hongo 7-3-1, Bunkyo-ku, Tokyo 113, Japan.