TY  - BOOK
AU  - Baeza-Yates, Ricardo
AU  - Ribeiro-Neto, Berthier
TI  - Modern information retrieval
SN  - 020139829X
U1  - 025.524
PY  - 1999///
CY  - New York
PB  - Addison-Wesley
KW  - Information storage and retrieval systems
N1  - 1 Introduction 1.1 Motivation 1.1.1 Information versus Data Retrieval 1.1.2 Information Retrieval at the Center of the Stage 1.1.3 Focus of the Book 1.2 Basic Concepts 1.2.1 The User Task 1.2.2 Logical View of the Documents 1.3 Past, Present, and Future 1.3.1 Early Developments 1.3.2 Information Retrieval in the Library 1.3.3 The Web and Digital Libraries 1.3.4 Practical Issues 1.4 The Retrieval Process 1.5 Organization of the Book 1.5.1 Book Topics 1.5.2 Book Chapters 1.6 How to Use this Book 1.6.1 Teaching Suggestions 1.6.2 The Book's Web Page 1.7 Bibliographic Discussion 2 Modeling 2.1 Introduction 2.2 A Taxonomy of Information Retrieval Models 2.3 Retrieval: Ad hoc and Filtering 2.4 A Formal Characterization of IR Models 2.5 Classic Information Retrieval 2.5.1 Basic Concepts 2.5.2 Boolean Model 2.5.3 Vector Model 2.5.4 Probabilistic Model 2.5.5 Brief Comparison of Classic Models 2.6 Alternative Set Theoretic Models 2.6.1 Fuzzy Set Model 2.6.2 Extended Boolean Model 2.7 Alternative Algebraic Models 2.7.1 Generalized Vector Space Model 2.7.2 Latent Semantic Indexing Model 2.7.3 Neural Network Model 2.8 Alternative Probabilistic Models 2.8.1 Bayesian Networks 2.8.2 Inference Network Model 2.8.3 Belief Network Model 2.8.4 Comparison of Bayesian Network Models 2.8.5 Computational Costs of Bayesian Networks 2.8.6 The Impact of Bayesian Network Models 2.9 Structured Text Retrieval Models 2.9.1 Model Based on Non-Overlapping Lists 2.9.2 Model Based on Proximal Nodes 2.10 Models for Browsing 2.10.1 Flat Browsing 2.10.2 Structure Guided Browsing 2.10.3 The Hypertext Model 2.11 Trends and Research Issues 2.12 Bibliographic Discussion
3 Retrieval Evaluation 3.1 Introduction 3.2 Retrieval Performance Evaluation 3.2.1 Recall and Precision 3.2.2 Alternative Measures 3.3 Reference Collections 3.3.1 The TREC Collection 3.3.2 The CACM and ISI Collections 3.3.3 The Cystic Fibrosis Collection 3.4 Trends and Research Issues 3.5 Bibliographic Discussion 4 Query Languages 4.1 Introduction 4.2 Keyword-Based Querying 4.2.1 Single-Word Queries 4.2.2 Context Queries 4.2.3 Boolean Queries 4.2.4 Natural Language 4.3 Pattern Matching 4.4 Structural Queries 4.4.1 Fixed Structure 4.4.2 Hypertext 4.4.3 Hierarchical Structure 4.5 Query Protocols 4.6 Trends and Research Issues 4.7 Bibliographic Discussion 5 Query Operations 5.1 Introduction 5.2 User Relevance Feedback 5.2.1 Query Expansion and Term Reweighting for the Vector Model 5.2.2 Term Reweighting for the Probabilistic Model 5.2.3 A Variant of Probabilistic Term Reweighting 5.2.4 Evaluation of Relevance Feedback Strategies 5.3 Automatic Local Analysis 5.3.1 Query Expansion Through Local Clustering 5.3.2 Query Expansion Through Local Context Analysis 5.4 Automatic Global Analysis 5.4.1 Query Expansion based on a Similarity Thesaurus 5.4.2 Query Expansion based on a Statistical Thesaurus 5.5 Trends and Research Issues 5.6 Bibliographic Discussion 6 Text and Multimedia Languages and Properties 6.1 Introduction 6.2 Metadata 6.3 Text 6.3.1 Formats 6.3.2 Information Theory 6.3.3 Modeling Natural Language 6.3.4 Similarity Models 6.4 Markup Languages 6.4.1 SGML 6.4.2 HTML 6.4.3 XML 6.5 Multimedia 6.5.1 Formats 6.5.2 Textual Images 6.5.3 Graphics and Virtual Reality 10.6.2 Query Term Hits Within Document Content 10.6.3 Query Term Hits Between Documents 10.6.4 SuperBook: Context via Table of Contents
10.6.5 Categories for Results Set Context 10.6.6 Using Hyperlinks to Organize Retrieval Results 10.6.7 Tables 10.7 Using Relevance Judgements 10.7.1 Interfaces for Standard Relevance Feedback 10.7.2 Studies of User Interaction with Relevance Feedback Systems 10.7.3 Fetching Relevant Information in the Background 10.7.4 Group Relevance Judgements 10.7.5 Pseudo-Relevance Feedback 10.8 Interface Support for the Search Process 10.8.1 Interfaces for String Matching 10.8.2 Window Management 10.8.3 Example Systems 10.8.4 Examples of Poor Use of Overlapping Windows 10.8.5 Retaining Search History 10.8.6 Integrating Scanning, Selection, and Querying 10.9 Trends and Research Issues 10.10 Bibliographic Discussion 11 Multimedia IR: Models and Languages 11.1 Introduction 11.2 Data Modeling 11.2.1 Multimedia Data Support in Commercial DBMSs 11.2.2 The MULTOS Data Model 11.3 Query Languages 11.3.1 Request Specification 11.3.2 Conditions on Multimedia Data 11.3.3 Uncertainty, Proximity, and Weights in Query Expressions 11.3.4 Some Proposals 11.4 Trends and Research Issues 11.5 Bibliographic Discussion 12 Multimedia IR: Indexing and Searching 12.1 Introduction 12.2 Background - Spatial Access Methods 12.3 A Generic Multimedia Indexing Approach 12.4 One-dimensional Time Series 12.4.1 Distance Function 12.4.2 Feature Extraction and Lower-bounding 12.4.3 Experiments 12.5 Two-dimensional Color Images 12.5.1 Image Features and Distance Functions 12.5.2 Lower-bounding 12.5.3 Experiments 12.6 Automatic Feature Extraction 12.7 Trends and Research Issues 12.8 Bibliographic Discussion 13 Searching the Web 13.1 Introduction 13.2 Challenges 13.3 Characterizing the Web 13.3.1 Measuring the Web 13.3.2 Modeling the Web 13.4 Search Engines 13.4.1 Centralized Architecture 13.4.2 Distributed Architecture 13.4.3 User Interfaces 13.4.4 Ranking 13.4.5 Crawling the Web
13.4.6 Indices 13.5 Browsing 13.5.1 Web Directories 13.5.2 Combining Searching with Browsing 13.5.3 Helpful Tools 13.6 Metasearchers 13.7 Finding the Needle in the Haystack 13.7.1 User Problems 13.7.2 Some Examples 13.7.3 Teaching the User 13.8 Searching using Hyperlinks 13.8.1 Web Query Languages 13.8.2 Dynamic Search and Software Agents 13.9 Trends and Research Issues 13.10 Bibliographic Discussion 14 Libraries and Bibliographical Systems 14.1 Introduction 14.2 Online IR Systems and Document Databases 14.2.1 Databases 14.2.2 Online Retrieval Systems 14.2.3 IR in Online Retrieval Systems 14.2.4 'Natural Language' Searching 14.3 Online Public Access Catalogs (OPACs) 14.3.1 OPACs and Their Content 14.3.2 OPACs and End Users 14.3.3 OPACs: Vendors and Products 14.3.4 Alternatives to Vendor OPACs 14.4 Libraries and Digital Library Projects 14.5 Trends and Research Issues 14.6 Bibliographic Discussion 15 Digital Libraries 15.1 Introduction 15.2 Definitions 15.3 Architectural Issues 15.4 Document Models, Representations, and Access 15.4.1 Multilingual Documents 15.4.2 Multimedia Documents 15.4.3 Structured Documents 15.4.4 Distributed Collections 15.4.5 Federated Search 15.4.6 Access 15.5 Prototypes, Projects, and Interfaces 15.5.1 International Range of Efforts 15.5.2 Usability 15.6 Standards 15.6.1 Protocols and Federation 15.6.2 Metadata 15.7 Trends and Research Issues 15.8 Bibliographical Discussion
ER  - 