Information of NLPRS2001

Cover
page I

Brief Contents
page III

Greeting from the General Chair
Jun'ichi Tsujii (University of Tokyo, Japan)
page V

Message from the Program Chair
Keh-Yih Su (Behavior Design Corporation, Taiwan)
page VII

Message from the Local Organization Chair
Hiroshi Nakagawa (University of Tokyo, Japan)
page IX

Committees
page XI

Table of Contents
page XV


Invited Talks

Audio Browsing and Search in the Voicemail Domain
Julia Hirschberg (AT&T Labs-Research, USA), Michiel Bacchiani and Phil Isenhour
pages 3-8

Corpus, Information Mining and the New Global Village
Benjamin K. Tsou (City University of Hong Kong, Hong Kong SAR)
pages 9-18

From Read Speech Recognition to Spontaneous Speech Understanding
Sadaoki Furui (Tokyo Institute of Technology, Japan)
pages 19-25

Just-In-Time Question Answering
Sanda M. Harabagiu (University of Texas, USA)
pages 27-34


Panel discussion

Are NLP technologies really ready for application?
Jun'ichi Tsujii (Organizer; University of Tokyo, Japan)
pages 37-38


Word Sense Disambiguation (I)

Ensembling based on Feature Space Restructuring with Application to WSD
Hiroya Takamura, Hiroyasu Yamada, Taku Kudoh, Kaoru Yamamoto and Yuji Matsumoto
pages 41-48

Word Sense Disambiguation Using Vectors of Co-occurrence Information
Saim Shin, Yong-soek Choi and Key-Sun Choi
pages 49-55

A Bayesian Approach to Semi-Supervised Learning
Rebecca Bruce
pages 57-64


Word Sense Disambiguation (II)

Disambiguation of Compound Noun Translations Extracted from Bilingual Comparable Corpora
Hiroshi Nakagawa
pages 67-74

Word Sense Disambiguation with a Corpus-Based Semantic Network
Qujiang Peng, Takeshi Ito and Teiji Furugori
pages 75-82

Automatic Sense Tagging Using Parallel Corpora
Nancy Ide, Tomaž Erjavec and Dan Tufis
pages 83-90


Summarization

Evaluating Automatic Text Summarization with Summary Results by the Experienced and the Inexperienced
Kai Ishikawa, Shinichi Ando and Akitoshi Okumura
pages 93-100

Analysis of Linguistic Features for Identifying Information Constituents of a Concept
Haeseung Paik, Young-Soo Kang and Key-Sun Choi
pages 101-107

WordNet and Automated Text Summarization
Rui Pedro Chaves
pages 109-116


Lexical Database

Papillon Lexical Database Project: Monolingual Dictionaries & Interlingual Links
Gilles Sérasset and Mathieu Mangeot
pages 119-125

Relative Synonymy and Conceptual Vectors
Matthieu Lafourcade and Violaine Prince
pages 127-134

Integration of heterogeneous language resources: A monolingual dictionary and a thesaurus
Takenobu Tokunaga, Yasuhiro Syotu, Hozumi Tanaka and Kiyoaki Shirai
pages 135-142


Language Modeling (I)

A Simple Closed-Class/Open-Class Factorization for Improved Language Modeling
Fuchun Peng and Dale Schuurmans
pages 145-152

Learning Strategies In A Grammar Induction Framework
Chin-Chung Wong, Helen Meng and Kai-Chung Siu
pages 153-157

Corpus-Based Acquisition of Sentence Readability Ranking Models for Deaf People
Kentaro Inui and Satomi Yamamoto
pages 159-166


Language Modeling (II)

A New Prosodic Phrasing Model for Chinese TTS Systems
Weijun Chen, Fuzong Lin, Jianmin Li and Bo Zhang
pages 169-175

Personalization of Text Entry Systems for Mobile Phones
Kumiko Tanaka-Ishii, Yusuke Inutsuka and Masato Takeichi
pages 177-184

Enhancing the Robustness of DOP for Speech Understanding
Khalil Sima'an
pages 185-192


Paraphrasing

Paraphrasing Utterances by Reordering Words Using Semi-Automatically Acquired Patterns
Yujie Zhang, Kazuhide Yamamoto, Chengqing Zong and Masashi Sakamoto
pages 195-202

Paraphrasing Spoken Japanese for Untangling Bilingual Transfer
Kazuhide Yamamoto
pages 203-210

An Unsupervised Method for Canonicalization of Japanese Postpositions
Kentaro Torisawa
pages 211-218


Nominal Extraction

Re-interpretation of Rules for Named Entity Task by Probability Assignment
Tatsunori Mori
pages 221-228

Named Entity Recognition using Machine Learning Methods and Pattern-Selection Rules
Choong-Nyoung Seon, Youngjoong Ko, Jeong-Seok Kim and Jungyun Seo
pages 229-236

An Efficient Method for Korean Noun Extraction Using Noun Occurrence Characteristics
Do-Gil Lee, Sang-Zoo Lee and Hae-Chang Rim
pages 237-244


Syntax

A Case Study of Free Word Order Grammar Development in DG, TAG and LFG
Mark Pedersen and Helen Purchase
pages 247-254

A Separate-and-Learn Approach to EM Learning of PCFGs
Taisuke Sato, Shigeru Abe, Yoshitaka Kameya and Kiyoaki Shirai
pages 255-262

Resolving Ambiguity in Inter-chunk Dependency Parsing
Mi-Young Kim, Sin-Jae Kang and Jong-Hyeok Lee
pages 263-270


Topic Detection

Topic Segmentation: A First Stage to Dialog-Based Information Extraction
Narjes Boufaden, Guy Lapalme and Yoshua Bengio
pages 273-279

kNN vs. Linear Support Vector Machine in Event Tracking
Weiquan Liu and Joe F Zhou
pages 281-288

Topic-Word Selection Based on Combinatorial Probability
Toru Hisamitsu and Yoshiki Niwa
pages 289-296


IR/IE

Hierarchical Concept Description and Learning for Information Extraction
Luo Xiao, Dieter Wissmann, Michael Brown and Stefan Jablonski
pages 299-306

Linguistic Techniques to Improve the Performance of Automatic Text Categorization
Akiko Aizawa
pages 307-314

Classification of Open-Ended Questionnaires based on Surface Information in Sentence Structure
Hiroko Inui, Masaki Murata, Kiyotaka Uchimoto and Hitoshi Isahara
pages 315-322


Morphology & POS Tagging

Unknown Word Guessing and Part-of-Speech Tagging Using Support Vector Machines
Tetsuji Nakagawa, Taku Kudoh and Yuji Matsumoto
pages 325-331

A Maximum Entropy Tagger with Unsupervised Hidden Markov Models
Jun'ichi Kazama, Yusuke Miyao and Jun'ichi Tsujii
pages 333-340

Partially Supervised Learning of Morphology with Stochastic Transducers
Alexander Clark
pages 341-348


Parsing

Incremental CFG Parsing with Statistical Lexical Dependencies
Takahisa Murase, Shigeki Matsubara, Yoshihide Kato and Yasuyoshi Inagaki
pages 351-358

GLR Parser with Conditional Action Model (CAM)
Yong-Jae Kwak, Young-Sook Hwang, Hoo-Jung Chung, So-Young Park, Sang-Zoo Lee and Hae-Chang Rim
pages 359-366

Abstract Left-corner Parsing for Unification Grammars
Noriko Tomuro and Steven L. Lytinen
pages 367-374


Multilingual Applications

Hierarchical Phrase Alignment Harmonized with Parsing
Kenji Imamura
pages 377-384

Segmented LSI for Fully Automated Large-scale Cross-Language Information Retrieval
Tatsunori Mori
pages 385-392

Automatically Harvesting Katakana-English Term Pairs from Search Engine Query Logs
Eric Brill, Gary Kacmarcik and Chris Brockett
pages 393-399


Misc. Topics

A Probabilistic Model for Japanese Zero Pronoun Resolution Integrating Syntactic and Semantic Features
Kazuhiro Seki, Atsushi Fujii and Tetsuya Ishikawa
pages 403-410

Discovery of Definition Patterns by Compressing Dictionary Sentences
Masatoshi Tsuchiya, Sadao Kurohashi and Satoshi Sato
pages 411-418

Korean Text Generation from Database for Homeshopping Sites
Ji-Eun Roh, Sin-Jae Kang and Jong-Hyeok Lee
pages 419-426


Project Notes: IR/Text Searching

Japanese Information Retrieval Method Using Syntactic and Statistical Information
Tsunenori Mine, Hiroki Fujitani and Makoto Amamiya
pages 429-434

Migemo: Incremental Search Method for Languages with Many Character Faces
Satoru Takabayashi, Hiroyuki Komatsu and Toshiyuki Masui
pages 435-438


Project Notes: Text Analysis

Language Independent Text Categorization
Huang Xuanjing, Wu Lide, Xu Guowei and Hiroyuki Ishizaki
pages 441-446

Acquisition of Sentence Reduction Rules for Improving Quality of Text Summaries
Kazuhiro Takeuchi and Yuji Matsumoto
pages 447-452


Project Notes: Applications

Finding Target Language Correspondence for Lexicalized EBMT System
Wei Wang, Jin-Xia Huang, Ming Zhou and Chang-Ning Huang
pages 455-460

Generation of 3D CG Animations from Recipe Sentences
Hideki Uematsu, Akira Shimazu and Manabu Okumura
pages 461-466


Project Notes: Morphological & Syntactic Analyses

A Usability Case Study of Grammar Formalisms for Free Word Order Natural Language Processing
Mark Pedersen and Helen Purchase
pages 469-474

A Hierarchical EM Approach to Word Segmentation
Fuchun Peng and Dale Schuurmans
pages 475-480

Statistical Parsing of Dutch using Maximum Entropy Models with Feature Merging
Tony Mullen, Rob Malouf and Gertjan van Noord
pages 481-486

Harmonised Morphosyntactic Tagging for Seven Languages and Orwell's 1984
Tomaž Erjavec
pages 487-492

Grapheme-to-Phoneme for Thai
Pongthai Tarsaku, Virach Sornlertlamvanich and Rachod Thongprasirt
pages 493-498


Project Notes: Entity Extraction

Application and Difficulty of Natural Language Processing in Chinese Temporal Information Extraction
Wenjie Li, Kam-Fai Wong and Chunfa Yuan
pages 501-506

Modality Expressions in Japanese and Their Automatic Paraphrasing
Toshifumi Tanabe, Kenji Yoshimura and Kosho Shudo
pages 507-512

Analysis and Reconstruction of Dictionary Definition Units
Chung-Won Seo and Key-Sun Choi
pages 513-518

Improving accuracy in Base Noun Phrase Identification with the variation of context length and position
In-Ho Kang, Yeojin Lee and GilChang Kim
pages 519-524

A Modular Approach to Turkish Noun Compounding: The Integration of a Finite-State Model
Aysenur Akyuz Birturk and Sandiway Fong
pages 525-530


Project Notes: Generation/Text Planning

A summary planner based on a three-level discourse model
Thiago Alexandre Salgueiro Pardo and Lucia Helena Machado Rino
pages 533-538

Pruning UNL texts for Summarizing Purposes
Camilla Brandel Martins and Lucia Helena Machado Rino
pages 539-544

Design of a Generation Component for a Spoken Dialogue System
Graham Wilcock and Kristiina Jokinen
pages 545-550

Approach to Spoken Chinese Paraphrasing Based on Feature Extraction
Chengqing Zong, Yujie Zhang, Kazuhide Yamamoto, Masashi Sakamoto and Satoshi Shirai
pages 551-556

Centering as an Anaphora Generation Algorithm: A Language Learning Aid Perspective
Mitsuko Yamura-Takei, Miho Fujiwara and Teruaki Aizawa
pages 557-562


Posters I

Speaker Determination in Video News by Using Acoustic Features and Transcripts
Yasuhiko Watanabe, Shigeru Toji and Yoshihiro Okada
pages 565-570

Cross-Language Information Retrieval of Proper Nouns using Context Information
Isao Goto, Noriyoshi Uratani and Terumasa Ehara
pages 571-578

Thai Text Entry with Digits
Yusuke Inutsuka, Kumiko Tanaka-Ishii and Masato Takeichi
pages 579-584

Sentence Extraction by Questioning
Zenshiro Kawasaki, Keiji Shibata and Masato Tajima
pages 585-591

Summarizing Newspaper Articles Using Extracted Informative and Functional Words
Mamiko Hatayama, Yoshihiro Matsuo and Satoshi Shirai
pages 593-600

Monologue Summarization: Extraction of Important Sentences for TV News Commentary Programs
Takahiro Ito, Kenji Matsumoto, Yasuo Tanida, Hideki Kashioka and Hideki Tanaka
pages 601-608

An E-mail Filtering System Based on Personal Profiles
Masami Shishibori, Kazuaki Ando and Jun-ichi Aoe
pages 609-616

Multilingual Document Alignment - A Study with Chinese and Japanese
Md Maruf Hasan and Yuji Matsumoto
pages 617-623

Acquiring word meanings with questions, answers and a dynamic environment - generalizing with neural networks
Mats U. Nystrand, Naoto Takahashi and Kazuhiro Ueda
pages 625-630

Usability Word-Pair Features for Probabilistic Text Classifiers
Sang-Bum Kim, Gum-Won Hong, Sang-Zoo Lee, Hae-Chang Rim and Myoung-Sook Ko
pages 631-638

Korean to English TV Caption Translator: "CaptionEye/KE"
Seong-il Yang, Young-Kil Kim, Young-Ae Seo, Sung-Kwon Choi and Sang-Kyu Park
pages 639-645

Long Sentence Partitioning using Structure Analysis for Machine Translation
Yoon-Hyung Roh, Young-Ae Seo, Ki-Young Lee and Sung-Kwon Choi
pages 646-652

Headline Generation for Summaries from Multiple Online Sources
Hong-Jia Wong, June-Jei Kuo and Hsin-Hsi Chen
pages 653-660


Posters II

Use of a Lexical Feature Database for Partial Parsing of Chinese
Elliott Franco Drábek and Qiang Zhou
pages 663-668

Automatic Corpus-Based Extraction of Chinese Legal Terms
Oi Yee Kwong and Benjamin K. Tsou
pages 669-676

Span-based Statistical Dependency Parsing of Chinese
Tom B.Y. Lai, Chang-Ning Huang, Ming Zhou, Jiangbo Miao and Tony K.C. Siu
pages 677-684

Unsupervised Improvement of Morphological Analyzer for Inflectionally Rich Languages
Akshar Bharati, Rajeev Sangal, Sushma Bendre, Pavan Kumar and Aishwarya
pages 685-692

Chunking with Decision Trees
Dirk Lüdtke
pages 693-699

Estimating Reliability of Contextual Evidences in Decision-List Classifiers under Bayesian Learning
Yoshimasa Tsuruoka and Takashi Chikayama
pages 701-707

The Grammatical Function Analysis between Adnoun Clause and Noun Phrase in Korean
Songwook Lee, Tae-Yeoub Jang and Jungyun Seo
pages 709-713

Word Sense Disambiguation Using Neural Networks with Concept Co-occurrence Information
You-Jin Chung, Sin-Jae Kang, Kyoung-Hi Moon and Jong-Hyeok Lee
pages 715-722

Korean Adverb Ordering in English-Korean Machine Translation Using Clustering
Shin Won Lee, Dong Un An and Seong Jong Chung
pages 723-728

Automatic Segmentation of Words using Syllable Bigram Statistics
Seung-Shik Kang and Chong-Woo Woo
pages 729-732

An Empirical Study of Feature Set Selection for Text Chunking
Young-Sook Hwang, Yong-Jae Kwak, Hoo-Jung Chung, So-Young Park and Hae-Chang Rim
pages 733-740

Efficient Parsing for Word Structure
Anna Maria Di Sciullo and Sandiway Fong
pages 741-748

Vietnamese Word Segmentation
Dinh Dien, Hoang Kiem and Nguyen Van Toan
pages 749-756


Exhibitions and Demonstrations of NLP technologies

A Korean Language Processing Workbench
Key-Sun Choi
pages 759-760

SummaryBIFF: An E-mail Summarizer for Mobile Phones
Takaaki Hasegawa, Takefumi Yamazaki and Yoshihiko Hayashi
pages 761-762

Kura: A Lexico-Structural Paraphrasing Engine
Ryu Iida, Kentaro Inui, Tomoya Iwakura, Atsushi Fujita and Tetsuro Takahashi
pages 763-764

LiLFeS/GENIA Project --- NLP Tools and A Biology Domain Corpus ---
Jun'ichi Tsujii
pages 765-766

XML Transformation-based three-stage pipelined Natural Language Generation System
Yohei Seki
pages 767-768

Collaborative Translation Environment 'Yakushite.Net'
Tatsuya Sukehiro, Mihoko Kitamura and Toshiki Murata
pages 769-770

Associative information access using DualNAVI
Akihiko Takano, Yoshiki Niwa, Shingo Nishioka, Toru Hisamitsu, Makoto Iwayama and Osamu Imaichi
pages 771-772

Text Mining and Site Outlining Projects
Koichi Takeda, Hiroshi Nomiyama, Tetsuya Nasukawa, Mei Kobayashi, Takashi Sakairi, Hirofumi Matsuzawa, Tohru Nagano, Akiko Murakami and Hironori Takeuchi
pages 773-774

JSPS project: Natural Language Understanding and Action Control
Hozumi Tanaka
pages 775-776

AZALEA: A KE-Free ICALL (intelligent computer assisted language learning) System for Japanese-English Translation
Naoyuki Tokuda, Liang Chen and Qing Zhong
pages 777-778

Tools for Exploring Natural Language
Masao Utiyama and Hitoshi Isahara
pages 779-780

Text Entry/Conversion Systems for Mobile Phones
Toshihiko Watanabe, Yusuke Inutsuka, Kumiko Tanaka-Ishii and Hiroshi Nakagawa
pages 781-782


Author Index

Author Index
pages 785-787