English Corpora and Automated Grammatical Analysis (英語語料庫與自動語法分析)
Price: 38 yuan
- Author: Fang Chengyu (方稱宇)
- Publication date: 2007/11/1
- ISBN: 9787100056595
- Publisher: The Commercial Press (商務印書館)
- Chinese Library Classification: H314
- Pages: 225
- Paper: offset paper
- Edition: 1
- Format: 16K
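Corpus linguistics and computational linguistics are two foundational disciplines driving the rapid development of natural language processing technology. English Corpora and Automated Grammatical Analysis is a monograph spanning both fields. Set against the background of the International Corpus of English, it focuses on the grammatical analysis of large corpora, in particular the difficulties that spoken English material poses for automatic computer processing. The book covers two main techniques, probability-based automatic word class identification and instance-based automatic syntactic parsing. It devotes dedicated chapters to the evaluation of parsing and presents an in-depth quantitative evaluation of the actual performance of two software systems, AUTASYS and The Survey Parser. It also discusses the automatic analysis of prepositional phrases, especially the automatic determination of their syntactic functions, and explains the application of automatic grammatical analysis to speech synthesis and speech recognition.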
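The book's central idea is to turn an already analysed corpus into a syntactic knowledge base, extract phrase structure grammar rules from it, and then use instance-based methods to retrieve from that knowledge base the best syntactic tree for each sentence to be parsed. The book describes the research on each of these components in detail, presents an in-depth quantitative evaluation of the system's actual performance, and devotes dedicated chapters to the evaluation of parsing. In addition, it discusses the automatic analysis of prepositional phrases, especially the automatic determination of their syntactic functions, because this research is closely related to the analysis of syntactic similarity. The book also introduces and explains the application of automatic grammatical analysis to speech synthesis and speech recognition, in the hope that this will be useful to readers.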
Preface
Preface (in Chinese)
List of Figures
List of Tables
Abstract
1. Introduction
1.1. What is Parsing?
1.2. The Introspective View
1.3. The Retrospective View
1.4. Data-Oriented Parsing
1.5. General Problems
1.6. The Proposed Research
1.6.1. Background to the Proposed Research
1.6.2. The Basic Approach of the Proposed Research
1.6.3. The Strengths and Novelties of the Proposed Approach
1.6.3.1. Automated Grammar Generation
1.6.3.2. De-Lexicalised Terminal Nodes
1.6.3.3. Global Parse with Subcategorisation Features
1.6.3.4. High-Quality Partial Parse
1.6.3.5. Intrinsic Ability to Learn
1.7. The Organisation of the Book
2. The Automatic Analysis of English Word Classes
2.1. An Overview of Word Class Tagging
2.2. Major Word Class Tagging Schemes
2.2.1. The Lancaster-Oslo/Bergen Tagging Scheme
2.2.1.1. The Lancaster-Oslo-Bergen Corpus
2.2.1.2. The Lancaster-Oslo-Bergen Tag Set
2.2.1.3. Summary
2.2.2. The International Corpus of English Tagging Scheme
2.2.2.1. The International Corpus of English
2.2.2.2. The International Corpus of English Tag Set
2.2.3. A Comparison of LOB and ICE
2.3. Word Class Tagging Methodologies
2.3.1. The Rule-Based Approach
2.3.2. The Probabilistic Approach
2.4. AUTASYS: A Hybrid Tagging System
2.4.1. A Probabilistic Approach Using the LOB Tag Set
2.4.1.1. The Tag Assignment Module
2.4.1.1.1. Tokenisation
2.4.1.1.2. The treatment of "."
2.4.1.1.3. The treatment of "'"
2.4.1.1.4. Sentence boundary markers
2.4.1.2. Orthographic Analysis
2.4.1.3. Lexicon Lookup
2.4.1.3.1. The lexicon
2.4.1.3.2. The coverage of the lexicon
2.4.1.4. Morphological Analysis
2.4.2. The Idiom Identification Module
2.4.3. The Probabilistic Tag Selection Module
2.4.3.1. The Bigram Probabilistic Matrix
2.4.3.2. Implementing Probabilistic Tag Selection
2.4.4. The Rule-Based Refinement Module
2.4.5. Empirical Evaluation
2.4.6. Permissive AUTASYS-LOB Disagreements
2.4.6.1. NNP-NPT
2.4.6.2. JJ-JJB
2.4.6.3. NNP-NPL
2.4.6.4. RB-NN
2.4.7. Summary
2.5. A Rule-Based Approach towards LOB to ICE Translation
2.5.1. Solutions for Verbs
2.5.1.1. Auxiliary vs. Lexical
2.5.1.2. Monotransitive vs. Complex Transitive
2.5.1.3. Finite vs. Nonfinite
2.5.2. Closed Sets
2.5.3. Initial Results
2.5.4. Problems
2.5.5. Summary
3. The Automatic Induction of a Formal Grammar
4. Robust Practical Analogy-Based Parsing
5. Extensive Evaluations of the Survey Parser
6. The Resolution of Prepositional Phrases
7. Conclusions and Further Work
References
Appendix A: A List of LOB Tags
Appendix B: A List of ICE Tags
Appendix C: A List of AUTASYS Idioms
Appendix D: A List of ICE Parsing Symbols
Appendix E: A List of ICE Prepositions in Descending Frequency Order
Appendix F: A Distributional Profile of ICE-GB Prepositions
Index