A computer science text corpus/search engine x-TeC and its applications

Takehiro Tokuda, Yusuke Soyama, Tetsuya Suzuki

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

We built a computer science text corpus/search engine called X-Tec. We automatically collected 2.98 million sentences (68.9 million words) from carefully chosen English computer science documents on the Web, which took 678 hours. We also built an interactive sample sentence query system and an automatic expression diagnostic system for graduate students. Our computer science text corpus/search engine can also be used for knowledge search and word co-occurrence frequency retrieval.

Original language: English
Title of host publication: Information Modelling and Knowledge Bases XVII
Editors: Yasushi Kiyoki, Jaak Henno, Hannu Jaakkola, Hannu Kangassalo
Publisher: IOS Press BV
Pages: 253-259
Number of pages: 7
ISBN (Electronic): 1586035916, 9781586035914
Publication status: Published - 2006
Externally published: Yes
Event: 15th European-Japanese Conference on Information Modelling and Knowledge Bases, EJC 2005 - Tallinn, Estonia
Duration: 2005 May 16 → 2005 May 20

Publication series

Name: Frontiers in Artificial Intelligence and Applications
Volume: 136
ISSN (Print): 0922-6389

Conference

Conference: 15th European-Japanese Conference on Information Modelling and Knowledge Bases, EJC 2005
Country/Territory: Estonia
City: Tallinn
Period: 05/5/16 → 05/5/20

ASJC Scopus subject areas

  • Artificial Intelligence
