TYPES OF INFORMATION SYSTEMS
There are three main types of information systems: data base management systems, bibliographic reference retrieval systems, and question-answering systems.
Data Base Management Systems.
Data base management deals with the processing of simple files of the type normally used in business. Each file contains records of one kind, for example, employee records or records of items in inventory control. In turn, each record stores certain information; for example, an employee may be identified by name, address, job classification, and salary category. A data management file can thus be represented by a table in which the rows identify individual records and the columns contain the data items stored in those records. Data base management then consists in relating tables to records for specific purposes. For example, a data base management system can determine how many employees 35 years old or older fall into certain job categories.
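As a rough illustration of the table view described above, the following Python sketch stores a few employee records as rows and counts those aged 35 or older in each job classification. The field names (`name`, `age`, `job_class`), the sample rows, and the function `count_by_job_class` are hypothetical and chosen only for this example; they do not describe any particular data base management system.

```python
from collections import Counter

# Hypothetical employee file: each dict is one record (a table row),
# and each key is a field (a table column).
EMPLOYEES = [
    {"name": "Adams", "age": 41, "job_class": "clerk"},
    {"name": "Baker", "age": 29, "job_class": "clerk"},
    {"name": "Clark", "age": 35, "job_class": "engineer"},
    {"name": "Davis", "age": 52, "job_class": "engineer"},
    {"name": "Evans", "age": 33, "job_class": "manager"},
]


def count_by_job_class(records, min_age):
    """Count records with age >= min_age, grouped by job classification."""
    return Counter(r["job_class"] for r in records if r["age"] >= min_age)


counts = count_by_job_class(EMPLOYEES, 35)
print(dict(counts))
```

A job classification with no qualifying employees simply does not appear in the result; looking it up in the `Counter` returns 0.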
Bibliographic Reference Retrieval Systems.
These systems normally deal with text, such as titles and abstracts of books and articles, and retrieve citations to items in a specific subject area stored in a library file. For example, a user might request all bibliographic references dealing with "the design of modern information retrieval systems." To identify particular citations, the content of the items on file must first be specified. Hence content analysis, or indexing, plays an important role in reference retrieval systems. Also, since bibliographic files may contain millions of items, rapid access to individual items is indispensable. Efficient methods for searching the files are therefore important. In many cases, queries and answers are expressed in English or another natural language rather than in machine (computer) language.
Question-Answering Systems.
These systems furnish direct answers to queries, which are often submitted in a natural language. Question-answering systems combine features of both data base management and bibliographic retrieval systems. Since factual queries are answered directly, a question-answering system needs linguistic know-how, detailed information about particular fields, and a fund of general knowledge. For this reason, such systems are used only in special circumstances and in certain subject areas.
STORAGE AND RETRIEVAL PROCESSES
In principle, a request for information could be compared with the file contents and the best match made. In practice, however, the content of both the query and of the items of stored information must first be more clearly identified. Thus, in data base management, the content of each record is rated according to a scale of values; in reference retrieval, a document is represented by a set of terms, each of which carries a value (weight) depending on its importance in each document.
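The idea of representing each document by weighted terms and picking the best match for a query can be sketched as follows. This is a minimal example under stated assumptions: the text does not name a matching measure, so cosine similarity is used here as one common choice, and the documents, weights, and function names (`cosine`, `best_match`) are hypothetical.

```python
import math

# Hypothetical weighted term sets: term -> weight (importance in that document).
DOCUMENTS = {
    "doc1": {"information": 0.8, "retrieval": 0.9, "design": 0.4},
    "doc2": {"inventory": 0.9, "control": 0.7, "design": 0.2},
    "doc3": {"information": 0.5, "systems": 0.6},
}


def cosine(a, b):
    """Cosine similarity between two sparse term-weight vectors."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_match(query, documents):
    """Return the id of the document most similar to the query."""
    return max(documents, key=lambda doc_id: cosine(query, documents[doc_id]))


query = {"information": 1.0, "retrieval": 1.0}
print(best_match(query, DOCUMENTS))
```

With these sample weights the query shares two heavily weighted terms with `doc1`, one term with `doc3`, and none with `doc2`, so `doc1` is returned.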
An information storage and retrieval process consists of certain indexing, filing, query formulation, and search and retrieval operations carried out on stored records to answer a request for information.
Indexing is generally done by hand. An index may contain most of the terms found in a natural language or may be restricted to certain special terms. A dictionary of special terms will also identify terms of greater scope than a given term, as well as narrower terms, synonymous terms, and so on. From 6 to about 20 terms are assigned to a document. Manual indexing is an art, and consistency between individual indexers is not to be expected.
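A dictionary of special terms of the kind just described might be modeled as shown below. The entries, the relation names (`broader`, `narrower`, `synonyms`), and the helper `related_terms` are hypothetical and serve only to illustrate the structure; real controlled vocabularies are far larger and are maintained by subject specialists.

```python
# Hypothetical controlled-vocabulary entries: each term lists its
# broader terms, narrower terms, and synonyms.
THESAURUS = {
    "information retrieval": {
        "broader": ["information science"],
        "narrower": ["reference retrieval", "question answering"],
        "synonyms": ["document retrieval"],
    },
    "reference retrieval": {
        "broader": ["information retrieval"],
        "narrower": [],
        "synonyms": ["bibliographic retrieval"],
    },
}


def related_terms(term, relation):
    """Return the terms linked to `term` by `relation`, or [] if unknown."""
    return THESAURUS.get(term, {}).get(relation, [])


print(related_terms("information retrieval", "narrower"))
print(related_terms("reference retrieval", "synonyms"))
```

An indexer, or a program assisting one, could consult such a structure to replace a synonym with the preferred term or to suggest a broader term when a narrow one retrieves too little.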
Various automatic indexing techniques have been developed. In the simplest of these, each word in a document excerpt is used for indexing purposes except such common words as "and," "of," "or," and "but." More sophisticated systems choose and weight index terms according to the frequency of occurrence of certain words or word phrases in individual documents; the higher the frequency of occurrence of a given word in a document, the greater the weight assigned to it. Words that appear frequently throughout the collection are not good index terms, since they cannot be relied upon to retrieve certain stored items in preference to certain others. In automatic indexing, up to 100 terms may represent the content of a single item.
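The two automatic indexing ideas above, excluding common words and weighting by frequency, can be sketched in a few lines of Python. This is an illustrative sketch, not a description of any particular system: the stop-word list extends the four words named in the text with a few more, and the `max_fraction` cutoff for dropping collection-wide terms is an assumed parameter.

```python
import re
from collections import Counter

# Common words excluded from indexing: the four named in the text,
# plus a few more added here as an assumption.
STOP_WORDS = {"and", "of", "or", "but", "the", "a", "in", "is", "to"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def index_document(text):
    """Use every non-stop word as an index term, weighted by its count."""
    return Counter(w for w in tokenize(text) if w not in STOP_WORDS)


def discriminating_terms(indexes, max_fraction=0.5):
    """Drop terms that occur in more than max_fraction of the documents."""
    doc_freq = Counter(term for idx in indexes for term in idx)
    limit = max_fraction * len(indexes)
    return [
        {t: w for t, w in idx.items() if doc_freq[t] <= limit}
        for idx in indexes
    ]


docs = [
    "The design of retrieval systems and the evaluation of retrieval",
    "Inventory control systems in business",
    "Indexing and searching of bibliographic systems",
]
indexes = [index_document(d) for d in docs]
filtered = discriminating_terms(indexes)
print(filtered[0])
```

In this sample, "retrieval" occurs twice in the first document and so receives the highest weight there, while "systems" occurs in every document and is dropped as a poor discriminator.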
Queries must use terms likely to match the index terms assigned to a relevant document. Query formulations are often complex. Thus, the query "A and B" means that documents containing both term A and term B are to be retrieved; "A or B" asks for documents containing either term A or term B. In conventional retrieval systems, only documents whose terms exactly match those of the query are retrieved. In more advanced systems, query formulations are automatically constructed from formulations supplied by the user in a natural language. These formulations are then used to identify documents on the basis of similarity of terms.
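The exact-match behavior of "A and B" and "A or B" queries can be illustrated with set operations. The sketch below assumes a small hypothetical index that maps document identifiers to their assigned terms; the names `INDEX`, `docs_with`, `query_and`, and `query_or` are invented for this example.

```python
# Hypothetical index: document id -> set of assigned index terms.
INDEX = {
    "doc1": {"information", "retrieval", "design"},
    "doc2": {"information", "inventory"},
    "doc3": {"retrieval", "indexing"},
}


def docs_with(term):
    """Return the set of document ids indexed under `term`."""
    return {doc_id for doc_id, terms in INDEX.items() if term in terms}


def query_and(a, b):
    """'A and B': documents containing both terms."""
    return docs_with(a) & docs_with(b)


def query_or(a, b):
    """'A or B': documents containing either term."""
    return docs_with(a) | docs_with(b)


print(sorted(query_and("information", "retrieval")))
print(sorted(query_or("information", "retrieval")))
```

Because matching is exact, a document indexed under a synonym of a query term is not retrieved; that gap is what similarity-based matching in more advanced systems is meant to narrow.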
Prof. Ashay Dharwadker