MEDLINE

DIS 801
Dr. Richard Smiraglia

Louiza Patsis
December 5, 2002


ABSTRACT
       MEDLARS Online, or Medline, is the world's most heavily used medical database, complete with its own controlled vocabulary called Medical Subject Headings (MeSH). In 1971 Medline became one of the first online databases for information retrieval. This paper reviews the research that has taken place over the years on information retrieval and interface issues with Medline. These studies provide useful information for many online databases and their ongoing improvement.

 

INTRODUCTION
       Medline is a bibliographic database published by the National Library of Medicine (NLM). Medline contains records that describe medical documents. It now contains more than 10 million records from more than 4,200 journals that publish information about the causes, prevention and treatment of disease and injury (Katcher 1999). It is accessed more than 18,000 times a day (Kingsland III 1993). Each record has been read by a skilled indexer, who has assigned to it roughly a dozen subject headings drawn from a controlled vocabulary of more than 19,000 Medical Subject Headings (MeSH). Chimoshey and Norris (1999) conducted a study that confirmed that even rural physicians in Washington State use Medline, and 70% of them say that they benefit from using it.
Medline, which can serve as an example for other large online databases, began with one man, John Shaw Billings. Billings was a medical student at the Medical College of Ohio. He needed information about epilepsy for his graduate thesis. He spent six months in Cincinnati, New York City and Philadelphia. In 1864, at 27 years old, he was transferred from field duty as a military surgeon to the Surgeon General's Office in Washington, DC, where one of his duties was to care for the Surgeon General's collection of books and journals. In the following years, Billings expanded the collection, building it into the country's largest medical library. He built up a collection of 52,000 records by 1876. He indexed the records by author and title. Records included books, journal articles, pamphlets, reports and theses.
       Billings developed an elaborate system for cataloguing everything within the Library. Cataloguing records by subject was not practiced by most libraries at the time. Users located books by author and then searched the indices of individual books to find information about a particular topic. Billings designed a cataloguing system that included both author and subject names in his Index Catalogue of the Library of the Surgeon General's Office. Because it was so extensive, it came to be regarded as an index to the world's medical literature. Billings also included the growing number of medical journals in his collection. The Index Medicus, an index to articles published in medical journals, was published monthly starting in 1879.
       Billings became the Director of the New York Public Library in 1895 after retiring from the army. In 1927 Index Medicus was merged with the American Medical Association's competing bibliography and renamed the Quarterly Cumulative Index Medicus.
In 1949 Colonel Frank Bradway "Brad" Rogers became director of the Army Medical Library, formerly the Library of the Office of the Surgeon General of the Army. He was sent by the army to obtain a master's degree in librarianship. He produced a standardized list of subject headings for the Current List of Medical Literature. In 1956 the United States government gave the Armed Forces Medical Library statutory authority as the NLM and made it a separate institution within the United States Public Health Service. This would later become the National Institutes of Health. In 1960 Rogers guided the publication of the newly revived monthly Index Medicus with a freshly revised and expanded list of standardized subject headings. This list was called the Medical Subject Headings, or MeSH. The list consists of single- or multi-word terms used to catalogue and to index the medical literature.
       Billings had been busy designing the Johns Hopkins Hospital, shaping the curriculum of the Johns Hopkins Medical School and presiding over the development of the New York Public Library system. He also worked on the United States Census. He suggested recording information as holes punched in cards so that the cards could later be sorted according to the alignment of the holes.
Herman Hollerith, a young engineer who was working on the eleventh census of 1890, perfected the cards and developed a machine for sorting them. The cards became known as Hollerith Cards, although, as even Hollerith admitted, they were Billings's idea.
       In 1896 Hollerith set up the Tabulating Machine Company, which was eventually absorbed into International Business Machines (IBM). The cards became known as IBM punch cards and started to be made with punch holes corresponding to the Medical Subject Headings (MeSH) in the Index Medicus. MeSH was one of the first vocabularies with an explicitly tagged database, content-free identifiers, polyhierarchy, extensive cross-references and text definitions.
       In 1960 the Index Medicus had three volumes. In 1999 the author entries filled six volumes and the subject entries filled ten volumes. Prior to MeSH and punched cards, there was no way to search for entries with two authors or two subjects.
In 1960 the NLM began development of the Medical Literature Analysis and Retrieval System (MEDLARS), which used MeSH. MeSH is updated annually. The 1963 edition had the first version of MeSH "tree structures" with their headings and subheadings. A main heading can appear in more than one subcategory. For instance, "hepatic" appears in the Infectious Diseases tree under "tuberculosis" and in the Digestive System Diseases tree under "liver disease".
       By 1965, searches could be submitted to trained librarians at the NLM or its branches. The librarians would formulate each search and then submit it to the MEDLARS Search Center, where punched cards were fed into a computer and the printout was shipped back by mail. The process took four to six weeks. Up to three search statements were allowed. Some searches yielded too few hits or too many hits, or retrieved irrelevant documents, and scientists would have to resubmit their queries. These issues are still pertinent today.
       In 1968, the first real-time, online bibliographic retrieval system was inaugurated at the State University of New York Biomedical Communication Network, headquartered in the SUNY Upstate Medical Center Library in Syracuse, New York. At nine medical libraries, teletypewriter terminals searched 90,000 references. Results were printed offline and mailed. In 1971 Medline was introduced online by the NLM. Searchers took a two-week course in online searching. Connect time was expensive. Searchers had to know details such as whether search terms were singular or plural or whether the adjective preceded the noun. A team of indexers analyzed each article and assigned eight to ten subject headings to it. Two to four of these subject headings were pertinent to the article's appearance in Index Medicus. Indexing was always prone to human error. Indexers may assign a wrong term or forget to assign a term or subheading.
       Searchers are also prone to error. They may misspell a word or not know the proper MeSH term, and so retrieve no documents, too few or too many.
Medline now includes the Index Medicus, Index to Dental Literature and the International Nursing Index. Documents can be searched by author, title, subject, journal, institution, author's address, year of publication and more.
       Ralph T. Esterquist, of the 1963 Clinic on Library Applications of Data Processing, said at a 1963 symposium: "The impact of MEDLARS in the medical library world is not that of the familiar metaphor - the pebble dropped into the pond, casting concentric circles that reach many points on the shore. The impact is no pebble for sure. It is a mighty rock. The waves it will cause will surge and splash for a long time to come." (Coletti and Bleich 2002, 1-2)
INFORMATION RETRIEVAL AND MEDLINE
       Many typical issues of information retrieval come up with Medline, which is a widely-used database. Studies on Medline information retrieval can shed light on these issues. The Pew Internet Project of 2002 (Clarke and Greaves 1997) estimated that 62% of Internet users (73 million people living in the United States) search the Internet for health information. The study showed that 93% of health information seekers surveyed looked for information about a specific illness, 65% sought information on exercise, nutrition or weight control, 64% for prescription drugs and 33% for sensitive health information. More than half of the respondents reported using the Internet for health information every few months or less frequently.
       Many types of searchers search Medline. Some are clinicians, some are students. Some know exactly what they are looking for and others do not. There has been an increase in use of Medline by physicians (Spink and Yang 2001).
It is best to use MeSH headings (Appendix I) and subheadings (Appendix II). Often searching by text words (Appendix III) brings up too few or too many results. Often a person conducts a search and sees that a new search with different terms is needed (Feinglos 1985, 18).
Misspelling
       Falleis and Fricke (2002) researched the misspelling of medical terms. They saw that many users misspell words and that they had specific questions that were beyond the NLM website. Ray and Vermeulen (1996) misspelled ten popular text words on purpose. Examples are "hamorrhage" for "haemorrhage" and "myocardial infartion" for "myocardial infarction". Surprisingly, in some cases, articles that would not have been pulled up had the word been spelled correctly were pulled up because the word was spelled incorrectly in the article. Overall, of course, many articles were missed. Clarke and Greaves (1997) misspelled the methodological term "random" as "randon" and had similar results: some articles that would not have been retrieved had the term been spelled correctly were retrieved, but overall, articles were missed. They also found that searching with MeSH terms yields more relevant information.
       Often Medline searchers, including experienced users, do not know the MeSH term for their topic of interest. They will then enter a text word. For instance, they may enter "high blood pressure" instead of the MeSH term "hypertension" or "heart attack" instead of the MeSH term "myocardial infarction". For such common text words, the term will be mapped to the MeSH term. For other less common terms, the relevant articles may not be retrieved. Cullinan et al. (2001) studied what the term "bloody nose", as opposed to the MeSH term "epistaxis", would retrieve. They searched in the Medline search engines PubMed and Ovid. They also studied the lexical variants "pink eye" and "pinkeye" and "color blindness" and "colorblind". They concluded that Medline is not designed for consumers. For instance, entering "bloody nose" in Ovid under the subject heading option yielded terms that consumers or new users may not be familiar with, such as "glycemic", "herbicides" and "surface-active agents". On PubMed the entry "nosebleed" resulted in 1,849 entries. Clicking on the details box showed that the term had been linked to "epistaxis". A searcher would need to know how to narrow down the search to obtain fewer, more relevant articles, and would need to know what they are searching for. Narrowing down a search will be explored later in this review. Another search problem was that "bloody nose" resulted in "No term found" in PubMed; it was not mapped to the MeSH term "epistaxis". Results for the different terms entered are shown in Table 3 (Cullinan et al. 2001, 69). "Colorblindness" and "pink eye", with all of their variations, were mapped to MeSH terms, including "conjunctivitis" and "keratoconjunctivitis", and retrieved more articles. Cullinan et al. concluded that more consumer terms should be mapped to their MeSH counterparts.
       Clearly, as well indexed as Medline is, improvements can be made. This comes up often. There is also the issue of new medical terms that are added to MeSH, such as new diseases and drugs.
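The mapping of consumer terms to MeSH headings described above can be illustrated with a small lookup table. This is only a minimal sketch: the table `CONSUMER_TO_MESH` and the functions `normalize` and `map_to_mesh` are hypothetical names invented for this illustration, and neither PubMed nor Ovid is claimed to work this way.

```python
import re

# Hypothetical consumer-term table (an assumption for illustration only).
# A real system would draw on a far larger synonym source.
CONSUMER_TO_MESH = {
    "bloody nose": "epistaxis",
    "nosebleed": "epistaxis",
    "high blood pressure": "hypertension",
    "heart attack": "myocardial infarction",
    "pinkeye": "conjunctivitis",
}


def normalize(term):
    """Lowercase the term and collapse runs of whitespace."""
    return re.sub(r"\s+", " ", term.strip().lower())


def map_to_mesh(term):
    """Return the MeSH heading for a consumer term, or None if unmapped."""
    key = normalize(term)
    if key in CONSUMER_TO_MESH:
        return CONSUMER_TO_MESH[key]
    # Try the closed-up lexical variant, e.g. "pink eye" -> "pinkeye".
    return CONSUMER_TO_MESH.get(key.replace(" ", ""))


print(map_to_mesh("Bloody  Nose"))
print(map_to_mesh("pink eye"))
print(map_to_mesh("stuffy nose"))
```

An unmapped term returns None, which corresponds to the "No term found" case: the search fails unless someone has added the consumer term to the table.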
Types of Searches
       These considerations need to be combined with the search techniques of end users. Spink and Yang (2001) studied medical and health consumer searches, albeit not on Medline but on the Excite and FAST commercial Web search engines. They concluded that most consumers fail to understand the limitations of the Web search process, ascribe human and advice-seeking abilities to Web search engines, or do not understand how a Web search engine works, and that medical information models need to be tailored to less-educated consumers.
       In 1997 Greenhalgh conducted a study of Medline retrieval. Different suffixes can be used according to how one is searching (Table 2, page 181). For instance, if one is trying to find a paper called "Confidentiality and Patients' Casenotes" and one knows that it is in the British Journal of General Practice, this sequence should be typed:
1. confidentiality.ti
2. british journal of general practice.jn
3. 1 and 2
Or in one step:
       confidentiality.ti and british journal of general practice.jn
Greenhalgh then studied a sample search for answering a specific question: "Is there any evidence that taking oral contraceptives in these circumstances really prevents long term bone loss?"
       Greenhalgh searched in the Ovid Medline and SilverPlatter search engines. One can type "anorexia nervosa" or "anorexia nervosa.tw". ("tw" stands for text word.) In the first case, the request will be mapped to a standard MeSH term. In SilverPlatter, the "suggest" option is clicked after one enters the query term. One can then choose "anorexia nervosa" or "eating disorders". Then the searcher is asked whether to "restrict to focus" to get not just articles that mention anorexia nervosa, but those that are about it. Another option is to search subheadings. The following search can be entered:
*anorexia nervosa/
"*" shows that it is a major focus and "/" that it is a MeSH term.
       The term "osteoporosis/" yielded 2200 hits and "contraceptives, oral/" yielded 1200 hits. The symbol "*" was not used with "osteoporosis" because it was not the major focus of the article.
       This combination can be used: "*anorexia nervosa/ and osteoporosis/ and contraceptives, oral/". Over 4,000 articles were retrieved. To focus on more specific articles, MeSH subheadings can be used. Greenhalgh claims that 50% of Medline articles are inadequately or incorrectly classified by subheading. If a searcher is experienced or is sure of the subheadings, then they can be used. As will be covered later, some Medline search engines have options for MeSH assistance on their interface.
One problem has been that articles indexed before a term is introduced are not indexed under that term. The seminal paper about a new topic is therefore usually not indexed under the term for that topic. For example, the first article on percutaneous transluminal coronary angioplasty was first indexed under angiography, catheterization, heart catheterization and coronary vessels. It was later indexed under angioplasty, balloon, and, in 1989, under angioplasty, transluminal, percutaneous coronary (Coletti and Bleich 2001).
Boolean Searching
       Boolean logic is named after George Boole, the nineteenth century mathematician who described this system of logic. Boolean logic can be used in Medline. "AND" is used to look for both terms in an article. Table 4 shows the intersection of the MeSH terms osteoarthritis and ibuprofen. "OR" can be used to find either term in an article. For instance, one can look for osteoarthritis and (aspirin or ibuprofen). This would search for osteoarthritis and either aspirin or ibuprofen. Figures 1 and 2 are simple depictions of the "AND" and "OR" concepts. "NOT" is used to exclude a specific search term. For instance, if too many papers are retrieved, one could enter "NOT letters" to exclude all letters.
       Susan J. Feinglos claims that "AND" is a better way to limit results. "AND human" would exclude papers that focus only on animals but would not exclude those dealing with both human beings and animals. Greenhalgh (2001) used "surrogate not mother$.tw" to search for surrogate endpoints in clinical pharmacology research because she wanted to exclude anything on motherhood. The "$" truncation symbol matches "mother" with any ending, such as "s" or "hood".
       Using the "adj" operator also helps to narrow a search. For instance, one can look for "home help" as "home adj help.tw".
Sometimes no articles or too few articles are retrieved. For a less common search topic such as the psychology of diabetes, the search "diabet$.tw and psychol$.tw" would be useful.
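The Boolean operators correspond directly to set operations on the lists of records indexed under each term. The following minimal sketch uses Python sets; the record numbers in `index` are invented for illustration and do not come from Medline.

```python
# Hypothetical record numbers indexed under each term (invented data).
index = {
    "osteoarthritis": {1, 2, 3, 4},
    "aspirin": {2, 5},
    "ibuprofen": {3, 6},
    "letter": {3},
}

# osteoarthritis AND (aspirin OR ibuprofen):
# "|" is set union (OR), "&" is set intersection (AND).
hits = index["osteoarthritis"] & (index["aspirin"] | index["ibuprofen"])

# ... NOT letter: set difference removes the excluded records.
hits_without_letters = hits - index["letter"]

print(sorted(hits))
print(sorted(hits_without_letters))
```

The sketch also shows why "NOT" must be used with care: record 3 is dropped because it is a letter, even though it is indexed under both osteoarthritis and ibuprofen.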
       The "explode" strategy is also helpful for preventing incomplete searches. For a broad topic like asthma, the MeSH "asthma tree" would have many subdivisions such as "asthma in children", "occupational asthma" and more. A search for "asthma" may miss these terms. One can explode the search: "exp asthma/".
       Sometimes one does not know where to start searching, as for a term like "stress" where there could be types of stress that a searcher would not know about. In that case one can enter: ptx stress.
       Options such as "post-traumatic stress disorders", "stress fracture" and "oxidative stress" are shown.
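Exploding a heading amounts to collecting the heading together with everything beneath it in the tree. The sketch below assumes a tiny, hypothetical tree fragment stored as a dictionary (`TREE`) and a hypothetical function `explode`; it does not reproduce the real MeSH tree or Ovid's implementation.

```python
# Hypothetical fragment of a MeSH-style tree: heading -> narrower headings.
TREE = {
    "asthma": ["asthma in children", "occupational asthma"],
    "asthma in children": [],
    "occupational asthma": [],
}


def explode(heading, tree):
    """Return the heading plus every narrower heading beneath it.

    Each heading is listed once, even if it is reachable by more than
    one path, as can happen in a polyhierarchy.
    """
    seen = []
    stack = [heading]
    while stack:
        term = stack.pop()
        if term not in seen:
            seen.append(term)
            # Reverse so that children are visited in their listed order.
            stack.extend(reversed(tree.get(term, [])))
    return seen


print(explode("asthma", TREE))
```

A search engine would then OR together the records indexed under every heading in the returned list.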
If your subject is a MeSH term, use the tree command.
For instance, "tree epilepsy" shows you where epilepsy is in the MeSH index, and terms such as "generalized epilepsy", "partial epilepsy" and "post-traumatic epilepsy" are shown.
Limiting a set, though, does not guarantee that you will retrieve all of the important articles and not get any irrelevant articles or articles of low methodological quality.
       Evidence-based quality filters (EBQFs), which will be covered later, are complex search strategies developed by experienced medical information experts. "AND"ing for certain articles, such as those covering randomized clinical trials, can also help to narrow results and to get relevant articles at the same time.
The Unified Medical Language System™
       In the area that is most important in Medline searching - the assigning of MeSH terms - the Medline database is remarkably accurate (Coletti and Bleich 2001). The Unified Medical Language System™ (UMLS) was developed in the 1980s by medical informatics specialists. The NLM and Lexical Technologies in Alameda, California, built the UMLS Knowledge Sources to improve the ability of computer programs to understand the biomedical meaning of user inquiries and to use this understanding to retrieve and to integrate relevant medical information from the Internet (Yu et al. 2002). The Integrated Advanced Information Management Systems (IAIMS) were formed in the 1980s to link automated clinical data and knowledge-based information to support health care, research and education. UMLS components provide some of the infrastructure for the integrated informatics systems that are the focus of the IAIMS. The UMLS was also initiated to facilitate the development of IAIMS systems that can link and integrate different types of machine-readable biomedical information like patient records, biomedical literature, factual databases and expert systems (Yu et al. 2002).
       People who work on the UMLS build intellectual "middleware" - electronic knowledge sources and related lexical programs - to help systems developers build applications that can interpret user queries and find relevant information. The NLM continually expands UMLS products to update them and to improve their utility.
       The UMLS was built to overcome two important barriers to the development of information systems: disparity of terminologies used in different information sources and by different users and the sheer number and distribution of machine-readable sources that can be relevant to any user inquiry (Humphreys 1998).
The UMLS supports the development of user-friendly systems for information retrieval. The UMLS project has produced and widely disseminated four multi-purpose knowledge sources designed for system developers: the Metathesaurus, the Semantic Network, the Information Sources Map and the SPECIALIST Lexicon. More information on the UMLS can be found at this URL: http://ww.nlm.nih.gov/pubs/factsheets/umlskss.html
The Metathesaurus links MeSH to text words and to other medical thesauri. It draws on contributions from experts in medicine, biomedical sciences, medical informatics, computer science, library and information science and linguistics. It preserves the names, meanings, hierarchical contexts, attributes and inter-term relationships present in the source vocabularies, adds basic information to each concept and establishes new relationships between terms from different source vocabularies. With Metathesaurus information, computer programs can interpret user inquiries, interact with users to refine queries and questions, identify relevant databases and link alternate names such as abbreviations, lexical variants, synonyms and translations for the same concept.
       The Semantic Network has 134 semantic types and provides a consistent categorization of all concepts represented in the Metathesaurus. There are 54 links that provide the structure for the Network and represent important relationships in the biomedical domain.
       The SPECIALIST Lexicon provides access to lexical records. Lexical entries, which may be single words or multi-word terms, record syntactic, morphological and orthographic information, including inflectional variation such as the singular and plural forms of nouns, the conjugation of verbs and the positive, comparative and superlative forms of adjectives and adverbs. The table LRAGR lists all variant forms for each entry in the lexicon.
Version 2.0 of the UMLS was designed for:
• extensibility for ease of new feature incorporation
• scalability in handling ever-increasing user loads and increasing numbers of the UMLS vocabularies
• performance considerations permitting faster access to the UMLS data
• flexibility in access modes
• set with access to all of the UMLS data
• ease of administration by NLM staff and contractors
• limited system interruptions during system software upgrades
Version 3.0 will have the following additions:
• an associated object model for accessing/representing the Semantic Network
• object model equals/equivalence checking allowing instances of object model classes to be compared to each other
• online access to the features/functions of the UMLS MetamorphoSys utility
Term Disambiguation
       Liu et al. (2002) proposed a method to disambiguate terms that possess multiple UMLS concepts. They constructed sense-tagged corpora for almost all ambiguous terms in the UMLS using Medline abstracts, because manual methods for this can be expensive. For a term W that represents multiple UMLS concepts, a collection of Medline abstracts that contain W is extracted. For each abstract, occurrences of concepts S that have relations with W as defined in the UMLS are automatically identified. A corpus tagged with annotated senses of W is derived from the identified concepts. This method was evaluated on a set of 35 frequently occurring ambiguous biomedical abbreviations using a gold standard set that was automatically derived. Precision and recall were used to measure the quality of the derived sense-tagged corpus.
The results were an overall precision of 92.9% and an overall recall of 47.4%. After excluding rare senses and ignoring abbreviations with closely related senses, the overall precision was 96.8% and the overall recall was 50.6%.
       This study addressed the problem of the sometimes inadequate interpretation of free text by biomedical natural language processing applications. Terms in free text can be ambiguous. For instance, "capsule" can be a unit of medication or a body region. Abbreviations can also be ambiguous, like "hr" for "hour" or "heart rate". This can pose a problem for Medline searches: too much information or irrelevant information can be retrieved.
The Metathesaurus is organized by concept. Each distinct concept is assigned a unique concept identifier (CUI). Many concept names can have one CUI, for instance, "congestive heart failure" and "biventricular heart failure". Furthermore, each concept name has a term status to indicate whether it is the preferred concept name of the corresponding concept or if it is suppressed, i.e. abbreviated or problematic.
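The concept-oriented organization can be pictured as a table of concept names keyed by CUI. The sketch below is a simplified illustration only: the class `ConceptName`, the identifier "C-EXAMPLE-1" and the choice of which name is preferred are all assumptions made for the example and are not taken from the actual Metathesaurus files.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConceptName:
    cui: str                  # concept identifier shared by all synonyms
    name: str                 # one of possibly many names for the concept
    preferred: bool = False   # term status: preferred name of the concept
    suppressed: bool = False  # term status: suppressed name


# Hypothetical rows; the CUI and the preferred flag are invented.
NAMES = [
    ConceptName("C-EXAMPLE-1", "congestive heart failure", preferred=True),
    ConceptName("C-EXAMPLE-1", "biventricular heart failure"),
]


def cuis_for(name, names=NAMES):
    """Return the set of CUIs that a concept name belongs to."""
    return {n.cui for n in names if n.name == name.lower()}


def preferred_name(cui, names=NAMES):
    """Return the preferred name of a concept, or None if none is marked."""
    for n in names:
        if n.cui == cui and n.preferred:
            return n.name
    return None


print(cuis_for("Biventricular heart failure"))
print(preferred_name("C-EXAMPLE-1"))
```

Because `cuis_for` returns a set, a name that belongs to more than one concept would return several CUIs, which is exactly the conceptual ambiguity that term disambiguation has to resolve.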
       There are hundreds of thousands of concepts and concept names listed in the table MRLON. Table MRREL lists relationships between UMLS concepts. There are millions of entries and nine relationship types, such as broader (RC), narrower (RN) and similar (RL). Two concepts may have multiple relationships.
In the Semantic Network, each CUI has been assigned to one or more semantic categories.
       There are two kinds of ambiguity present in the UMLS: conceptual, which refers to ambiguity due to multiple concepts of terms, and semantic, which refers to ambiguity due to multiple semantic categories of terms. There is an ambiguous term table, AMBIG.SUI, in the UMLS. The method proposed by Liu et al. (2002) was concerned with conceptual ambiguity. It utilized conceptual relations defined in the UMLS to automatically derive sense-tagged corpora for ambiguous terms. Word sense disambiguation (WSD) classifiers were then automatically constructed using the sense-tagged corpora. This work, unlike previous work, used the UMLS as the conceptually oriented knowledge source.
       In this study, precision was the ratio of the number of abstracts with correctly identified sense to the number of abstracts that were sense-tagged using conceptual relatives. Recall was the ratio of the number of abstracts with correctly identified sense to the total number of abstracts in the gold standard set (GSS).
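These two definitions can be written out as a short calculation. The function name `precision_recall` and the counts in the example are hypothetical and are not figures from Liu et al.

```python
def precision_recall(correct, tagged, gold_total):
    """Compute precision and recall as defined in the text.

    correct    -- abstracts whose sense was correctly identified
    tagged     -- abstracts that were sense-tagged using conceptual relatives
    gold_total -- all abstracts in the gold standard set (GSS)
    """
    if tagged <= 0 or gold_total <= 0:
        raise ValueError("tagged and gold_total must be positive")
    precision = correct / tagged
    recall = correct / gold_total
    return precision, recall


# Invented counts: 50 of 100 gold-standard abstracts were tagged,
# and 45 of those tags were correct.
precision, recall = precision_recall(correct=45, tagged=50, gold_total=100)
print(f"precision = {precision:.1%}, recall = {recall:.1%}")
```

With these invented counts the method is right almost every time it assigns a sense, yet it leaves half of the abstracts untagged. This is the same pattern of high precision with modest recall that the reported results show.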
The WSD classifiers trained on sense-tagged corpora with high precision performed better than those trained on sense-tagged corpora with low precision. Liu et al.'s CRMap performed better than MetaMap, the program that maps biomedical text to concepts in the Metathesaurus, with respect to the quality of the derived sense-tagged corpora, except for APC and BSA. CRMap was also superior to MetaMap with respect to the performance of the WSD classifier for all but two abbreviations.
       The sense-tagged corpus derived using CRMap had better precision than that derived using MetaMap for all but one abbreviation. Liu et al. found that the causes of low precision were relatedness among different senses (such as EMG standing for electromyograph, electromyography, electromyogram and exomphalos macroglossia gigantism), the existence of poor conceptual relatives and, in the case of the WSD classifier, a lack of sufficient training of the searcher.
       This study shows that, although the UMLS is detailed and extensive, improvement can be made in resolution of ambiguous terms. This would lead to higher precision and recall in the Medline information retrieval.
       The goal of CRMap is to match only conceptual relatives. MetaMap has difficulty finding conceptual relatives that contain prepositional noun phrases, and CRMap does not have this limitation. For example, MetaMap fails to identify "persistent pulmonary hypertension of the newborn", which is a sibling of MAS (meconium aspiration syndrome), as a relative of MAS in abstracts that contain it. Liu et al. plan to further investigate relations defined in different sources, to formulate a new sense assignment scheme and to use clustering techniques to find instances that are associated with rare senses or unknown senses.
Indexing
       Kim et al. (2000) studied the extraction of useful phrases from Medline records and from the UMLS with abstracts or entry dates from 1996 by statistical methods in order to leverage human effort by providing preprocessed phrase lists with a high percentage of relevant information.
       They developed six scoring methods based on different aspects of phrase occurrence. They focused on the statistical properties of word pairs and triples that can be obtained from a large database. The UMLS was used as a gold standard for validating the methods. The authors found that the six scoring methods can prove effective for identifying UMLS-quality phrases in Medline.
They concluded that statistical scoring methods provide a promising approach to the extraction of useful phrases from a natural language database for linking or providing hyperlinks in text.
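The raw material for such scoring is a count of how often each word pair occurs across the database. The sketch below counts adjacent word pairs with a simple frequency score. Raw frequency is used here only as a stand-in: it is not claimed to be one of the six scoring methods of Kim et al., and the two sample sentences are invented.

```python
import re
from collections import Counter


def word_pairs(text):
    """Yield adjacent word pairs from a lowercased text."""
    words = re.findall(r"[a-z]+", text.lower())
    return zip(words, words[1:])


def score_pairs(documents):
    """Count how often each adjacent word pair occurs across documents."""
    counts = Counter()
    for doc in documents:
        counts.update(word_pairs(doc))
    return counts


# Invented sample texts standing in for Medline abstracts.
docs = [
    "Lung cancer risk rises with smoking.",
    "Screening for lung cancer saves lives.",
]
scores = score_pairs(docs)
print(scores.most_common(1))
```

A pair such as ("lung", "cancer") that recurs across documents rises to the top of the list, and it is candidates of this kind that a human reviewer would then check against the UMLS.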
       For a large database like Medline, people often can use many terms for one topic. Indexing can alleviate this problem by expanding the list of terms to access a document.
       One path to improve indexing is to obtain a list of terms sufficient to include a high percentage of the terms that people will use in querying a database and to add enough synonymy information to allow a query to access documents that are indexed with an expression that is synonymous with a query. Kim et al. hypothesized that statistical information about the occurrence of phrases in Medline can provide a useful screen for candidate phrases that are of similar quality to the material already in the UMLS.
       Jones et al. had already studied the frequency of phrases, and maintained that the words that compose them are important in the phrase extraction method. Other studies, like the one by Harter in 1975, looked at the distribution of frequencies of a term within a document.
       The methods developed can serve as a screen for the extraction of useful phrases and can also form part of a system for marking useful phrases in text. The methods have limitations: because stopwords are excluded, phrases like "vitamin A" cannot be detected, and phrases with more than three words are excluded. Phrases such as "cancer of the lung" may have to be rearranged to "lung cancer", for instance.
       In the future Jones et al. will examine two ways of improving the system: 1. allow phrases that are longer than three words; and 2. find a way to score phrases more accurately according to how laden with content or subject matter they are.
Searchers can look up patient records in Medline. Cooper et al. (1998) developed and evaluated PostDoc, a lexical indexing system, and Pindex, a statistical indexing system, separately and then as a hybrid. Each system takes as input a portion of free text from a patient record and then returns a list of MeSH terms to formulate a Medline search that includes the concepts in the text. The ability of PostDoc to carry out synonymy mapping was dependent on the quality of the lexical variant terms included in the Metathesaurus. Pindex uses a hash table of phrases to which it assigns MeSH terms automatically.
       The patient records were six radiology reports, six pathology reports and six discharge summaries. Blinded assessment by the authors determined the extent to which a system-derived list of MeSH terms captured the relevant concepts in these documents. Pindex captured more relevant report concepts compared to PostDoc: 40% versus 45%.
       The results suggest a new way to reduce the number of terms output while maintaining the percentage of terms captured, including the use of the UMLS semantic types to constrain the output list to only clinically relevant MeSH terms. This study was a step toward the realization of systems that assist healthcare personnel in using the electronic medical record to help construct patient-specific searches of Medline.
       Two of the authors acted as raters and assigned MeSH terms themselves to annotate each report. Precision was defined as the fraction of MeSH terms output by a system for the report that were used in that annotation to represent one or more concepts. Recall was defined as the fraction of concepts in the annotation that were adequately represented by the MeSH terms output.
Some results were:
PostDoc: 40% - 50% of MeSH terms output by PostDoc were used to represent one or more concepts, and
Pindex: 15% - 20% of MeSH terms output by Pindex were used to represent one or more concepts.
       When PostDoc and Pindex outputs were taken together to create the Union System, recall was 60% and precision was 20%. The union of the two provided better recall. Cooper et al. concluded that both could be refined to produce better performance. PostDoc has been using version 1.1 of the UMLS Metathesaurus since it was developed in 1991 - 1992. Cooper et al. hypothesize that the use of the most current Metathesaurus, with more lexical variants and synonyms, would alter the performance of PostDoc. The increased coverage of the current Metathesaurus would be expected to lead to an increase in PostDoc recall and a possible decrease in precision. If the probability threshold at which Pindex includes terms in its output list were increased, recall would be traded off for precision, if needed to perform a search. PostDoc and Pindex output MeSH terms from among those in the entire MeSH vocabulary. Precision would be increased by a postprocessor that would constrain their output. This study touches on a subject to be discussed later in the paper: how precision and recall often cannot both be high in a search. Cooper et al. plan to study other clinical reports in the future and to find better systems for indexing in Medline.
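       The precision and recall definitions used by Cooper et al., and the effect of taking the union of two systems' outputs, can be illustrated with a minimal Python sketch. It is a simplification: each annotated concept is treated as a single MeSH term, and the term lists and the names lexical_output and statistical_output are invented for illustration. They are not output from PostDoc or Pindex.

```python
def precision_recall(output_terms, annotated_terms):
    """Return (precision, recall) of a system's term list against an annotation."""
    output_terms = set(output_terms)
    annotated_terms = set(annotated_terms)
    hits = output_terms & annotated_terms
    precision = len(hits) / len(output_terms) if output_terms else 0.0
    recall = len(hits) / len(annotated_terms) if annotated_terms else 0.0
    return precision, recall


# Toy data (assumed for illustration only).
annotation = {"Pneumonia", "Lung", "Radiography, Thoracic",
              "Pleural Effusion", "Fever"}
lexical_output = {"Pneumonia", "Lung", "Thorax", "Patients"}
statistical_output = {"Pleural Effusion", "Fever", "Lung", "Humans",
                      "Adult", "Hospitals", "Time Factors", "Risk"}
union_output = lexical_output | statistical_output

for name, terms in [("lexical", lexical_output),
                    ("statistical", statistical_output),
                    ("union", union_output)]:
    p, r = precision_recall(terms, annotation)
    print(f"{name}: precision={p:.2f} recall={r:.2f}")
```

       In this toy case the union reaches a higher recall than either list alone, while its precision falls slightly below both, which is the direction of the trade-off reported for the Union System.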
Abbreviations
       Chang et al. in 2002 wrote that the amount of literature in biomedicine is exploding as Medline "grows by 400,000 citations each year" (page 2). They defined an abbreviation as "all strings that are shortened forms of sequences of words (its long form)" (page 2), as opposed to only acronyms, which are typically defined as formed from the initial letters of words. They built an automatically generated and maintained online dictionary of abbreviations from Medline. Their algorithm matched abbreviations in text with their expansions. Such algorithms of course already existed. With the growth of biomedical literature, such algorithms can be improved or new ones can be invented to increase the retrieval of relevant documents and to reduce ambiguity.
       Their method used logistic regression. Their algorithm was applied to Medstract, a corpus of Medline, because it is easily available, eliminates the need to develop an alternate standard and provides a reference point for comparing methods. They tested their algorithm against an independently created list of abbreviations from the China Medical Tribune. They measured the precision and recall of the algorithm in identifying abbreviations from the Medstract corpus. Their algorithm is available at http://abbreviationstanford.edu.
       Recall was defined as the number of correct abbreviations found divided by the number of all correct abbreviations. Precision was defined as the number of correct abbreviations found divided by the number of all predictions. Recall was 83% and precision was 80%.
Chang et al. believe that automated methods for finding abbreviations are of greater value than manual ones, which they claim suffer from problems of completeness and timeliness. The article presents "a novel algorithm for identifying abbreviations, a set of feature descriptive of various types of abbreviation server containing all abbreviation definitions found in Medline" (page 3).
       The measured precision was reduced because some abbreviations were missing from the gold standard. The largest number of errors occurred because the gold standard included synonyms, words and phrases with identical meanings, for which the algorithm could not find correspondences between letters. This indicates a fundamental limitation of letter-matching techniques. Another source of error was the strong assumption that the abbreviation must be inside parentheses and the long form must be outside of parentheses. The study showed that linking to external dictionaries of abbreviations can augment the ability of automated methods to assign definitions that are not indicated in the text.
       Yu et al. (2002) developed two methods of mapping defined abbreviations (those paired with their full form in the article) and undefined abbreviations. AbbRE (short for abbreviation recognition and extraction) was the software program in which pattern-matching rules to match abbreviations and their full forms were implemented. Undefined abbreviations were mapped to any of four public abbreviation databases that map gene and protein abbreviations in LRABR of the UMLS Specialist Lexicon, GenBank, LocusLink, SWISS-PROT and BioABACUS. The opinions of domain experts were used as a gold standard. Recall was defined as the number of correct abbreviations present in the reference standard and found by AbbRE divided by the number of abbreviations in the reference standard. The recall was 0.70 and the precision was 0.95 for defined abbreviations. They found that only 25% of abbreviations were defined in biomedical articles and that 68% of the undefined ones could be mapped to any of four abbreviation databases.
       Yu et al. thus developed yet another program to successfully map abbreviations. This can be useful to Medline searchers, especially when abbreviations are undefined. AbbRE: 1. handles full biomedical articles; 2. searches for parenthetical expressions for paired abbreviations and for full forms; 3. does not break up words into components; 4. relies on a set of pattern-matching rules for mapping an abbreviation to its full form; and 5. has been evaluated by domain experts.
       The five biomedical journals used were: Cell, Science, Trends in Neuroscience (TNS), Proceedings of the National Academy of Sciences (PNAS) and the Journal of Biological Chemistry (JBC). The five medical journals used were the New England Journal of Medicine, CA: A Cancer Journal for Clinicians, the Journal of the National Cancer Institute (JNCI), the Journal of the American Medical Association (JAMA) and Lancet.
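       The general idea of searching parenthetical expressions and matching letters, discussed in this section, can be sketched in a few lines of Python. This is only an illustrative simplification under stated assumptions: it is neither the logistic regression method of Chang et al. nor the rule set of AbbRE, and the function name find_defined_abbreviations and the sample sentence are invented. It assumes that the abbreviation sits inside parentheses and that each of its letters is the initial of one of the words immediately before the parenthesis.

```python
import re

# A short alphabetic string inside parentheses is treated as a candidate.
PARENTHETICAL = re.compile(r"\(([A-Za-z]{2,10})\)")


def find_defined_abbreviations(text):
    """Map each parenthetical abbreviation to the preceding words whose
    initials spell it out; candidates without such a match are skipped."""
    pairs = {}
    for match in PARENTHETICAL.finditer(text):
        abbreviation = match.group(1)
        preceding_words = re.findall(r"[A-Za-z]+", text[:match.start()])
        candidate = preceding_words[-len(abbreviation):]
        if len(candidate) == len(abbreviation) and all(
            word[0].lower() == letter.lower()
            for word, letter in zip(candidate, abbreviation)
        ):
            pairs[abbreviation] = " ".join(candidate)
    return pairs


sample_text = (
    "Patients with acute myocardial infarction (AMI) were compared with "
    "controls. Magnetic resonance imaging (MRI) was used. "
    "Vitamin C (ascorbate) was given."
)
print(find_defined_abbreviations(sample_text))
```

       The sketch pairs AMI and MRI with their long forms but finds nothing for "(ascorbate)", a synonym whose letters do not correspond to the preceding words. That is the kind of case described above as a fundamental limitation of letter-matching techniques.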
       The gold standard was 45 medical expert abbreviations and 51 biological expert abbreviations. Most abbreviations that failed to be recognized by AbbRE were not associated with their full forms.
       Recall and precision were high, but only 68% of the undefined abbreviations could be mapped to any of four databases. AbbRE had an average recall of 0.70 and an average precision of 0.95 for defined abbreviations. On average, only 25% of abbreviations were defined in biomedical articles. The authors found that many abbreviations are ambiguous, i.e. they map to more than one full form in abbreviation databases. They concluded that AbbRE is efficient for mapping defined abbreviations. They agree that, to couple AbbRE with abbreviation databases for the mapping of undefined abbreviations, exhaustive abbreviation databases and a method to resolve the ambiguity of abbreviations in the databases are needed. In addition, agreement among medical and biological experts was higher for defined than for undefined abbreviations.
       Yu et al. plan to develop and expand AbbRE and to apply it to all Medline abstracts in PubMed and to study AbbRE in other databases. A program to more accurately define abbreviations with more than one meaning will be developed. Mapping an abbreviation to its full form facilitates natural language processing and is important for information retrieval. If the full form of an abbreviation is missing in an article and a program like AbbRE is not used, a searcher may miss relevant articles.
Search Filters
       Search filters are collections of search terms intended to capture frequently sought research methods and study designs in Medline. They can be used to locate systematic reviews of the effectiveness of health interventions. Systematic reviews identify, access and combine the evidence from primary research studies; in the study described here they were included if they assessed causation, diagnosis, treatment or prognosis of disease. Non-systematic reviews present a summary of results and conclusions of studies, but do not contain a statement of methods, objectives or materials. Much research has gone into effective search filters. White et al. (2001) set out to improve previously developed methods to derive a more objective search strategy to identify systematic reviews in Medline. Known systematic reviews made up a quasi-gold standard.
       The frequency of words within a subset of the "quasi-gold standard" was calculated and then statistical analysis of the most frequently occurring words was undertaken. The analysis determined which terms could best be used to distinguish between systematic reviews, non-systematic reviews and non-reviews.
       Wolf et al. in 2002 and Boynton et al. in 1998 had previously shown that systematic reviews are not easy to find in Medline and are hidden among other studies that are called reviews. White et al. (page 358) wanted to expand on Boynton et al.'s search strategy because they thought that it had weaknesses: 1. the analysis of words in records did not capture phrases or properly analyze multi-term MeSH headings and publication types; 2. the analysis was univariate, analyzing the value of each term alone and not jointly with other terms; and 3. the sensitivity of new strategies was tested against the original "quasi-gold standard" records used to derive the search strategies and may be overestimated.
       In the White et al. study, the journals used were: Annals of Internal Medicine, British Medical Journal, Journal of the American Medical Association and the Lancet. The records identified were: 110 systematic reviews, 110 non-systematic reviews and 125 non-reviews. Sensitivity or recall was defined as the number of systematic reviews correctly classified times 100, divided by the total number of systematic reviews. Specificity was defined as the number of records correctly classified as not systematic reviews times 100, divided by the total number of records that are not systematic reviews. Precision was defined as the number of correctly classified systematic reviews divided by the number of records retrieved by the search.
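       These three definitions can be written out as a short Python sketch. The record totals (110 systematic reviews and 235 other records) follow the study; the split into correctly and incorrectly classified records is assumed purely for illustration and is not a result reported by White et al. The function name filter_performance is likewise invented, and precision is returned as a percentage for consistency with the other two measures.

```python
def filter_performance(tp, fp, fn, tn):
    """Sensitivity, specificity and precision (all in %) from classification counts.

    tp: systematic reviews correctly classified as systematic reviews
    fn: systematic reviews missed
    tn: other records correctly classified as not systematic reviews
    fp: other records wrongly classified as systematic reviews
    """
    return {
        "sensitivity": 100 * tp / (tp + fn),
        "specificity": 100 * tn / (tn + fp),
        "precision": 100 * tp / (tp + fp),
    }


# Hypothetical outcome for 110 systematic reviews and 235 other records.
example = filter_performance(tp=99, fp=47, fn=11, tn=188)
for measure, value in example.items():
    print(f"{measure}: {value:.1f}%")
```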
Different models with different objectives were produced.
The sensitivities and specificities for the models were as follows:
           Sensitivity (%)   Specificity (%)
Model A    98.3              73.4
Model B    94.9              67.1
Model C    51.9              99.4
Model D    87.1              89.2
Model E    77.2              94.9

       Model A was formed to discriminate between systematic, non-systematic and non-reviews. It would be the best model for searchers who want to retrieve and sift through a large number of systematic reviews. Model B investigated the importance of frequency of occurrence of terms in records by exploring the effect of ignoring frequency and only using the presence or absence of terms in records. Sensitivity and specificity were lower than for Model A. This indicates that the frequency of occurrence of terms does have an effect on the classification of systematic reviews. Model C tested parsimony and test validity. The low sensitivity shows that despite face validity a statistical approach can increase sensitivity. The specificity was high; only one record was wrongly classified as a systematic review by the model. This shows that the five terms are highly focused and do not retrieve a large number of other record types. Model D was developed to discriminate between non-reviews and any sort of review - systematic or non-systematic. Model D gave the same sensitivity of detecting systematic reviews as Model A but had lower specificity. Model E was designed to distinguish between systematic reviews and all other types of records (non-systematic reviews and non-reviews). Model E had high sensitivity and high specificity.
       Model A would be best for a researcher who wants to be sure of retrieving a high proportion of systematic reviews and who is willing to sift through many irrelevant records, while Model C would be best for a researcher who wants to find a high proportion of relevant records quickly. After viewing the results of each model in this study, a researcher could choose a compromise among the search filter models according to the goals of the search.
       White et al. feel that they expanded on previous search filters and showed that the number of times that a term occurs and the combination of terms can help to identify systematic reviews. The authors think that future database interface development will allow searching by frequency of terms and the weighting of these terms. Other databases and interfaces will be studied. One limitation of the study was that results were based on English medical journals, which tend to have a more rigorous peer review and may demand a higher standard of reporting or research methods than other journals. Objectivity can be improved in the selection of terms to discard and the cut-off points for frequency analysis.
       Ingui et al. in 2001 derived and validated an optimal search filter for retrieving clinical prediction rules using Medline. Clinical prediction rules are tools designed to assist health care professionals in making decisions. They comprise variables obtained from the history, physical examination and simple diagnostic tests of patients. Inconsistent terminology makes them difficult to index and to retrieve by computer systems. The "gold standard" was established by a manual search of all articles from print journals from 1991 to 1998, identifying articles covering various aspects of clinical prediction rules such as derivation, validation and evaluation. The filter "predict$ OR clinical$ OR outcome$ OR risk$" retrieved 98% of clinical prediction rules. "Predict$ and rules$" retrieved 99.97%. Sensitivity and specificity were both above 90%. Sensitivity was defined as the proportion of articles with clinical prediction rules that were retrieved by the filter. The positive predictive value was defined as the proportion of retrieved articles that contained clinical prediction rules. The positive likelihood ratio was defined as the ratio of sensitivity to one minus specificity. The number of search filters studied was 694. The highest sensitivity was 98%. Four filters had sensitivity and specificity higher than 90%. Filters with higher positive predictive values and positive likelihood ratios had low sensitivities. The filter "predict$ OR clinical$ OR outcome$ OR risk$" yielded the highest sensitivity - 98.4%. The filter with the highest specificity - 78.6% - was "predict$.ti AND rule$". Its sensitivity was 16.1%. The single term with the highest sensitivity was "predict$". Ingui et al. concluded that one search filter could not meet the needs of researchers, clinicians and students. For the person who wants to quickly retrieve a clinical prediction rule for illustrative purposes, the use of predict$.ti AND rule$ yields three relevant articles for four irrelevant articles.
The positive predictive value was 75% in the validation set. Optimal information retrieval was found to include population, intervention, comparison, outcome and translation into a searchable strategy. Ingui et al. concluded: "Optimal retrieval of the best evidence is based on the formulation of a well-defined question, which includes population, intervention, comparison and outcome, and its translation into searchable" (page 397).
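       How truncated-term filters of this kind trade sensitivity against positive predictive value can be illustrated with a minimal Python sketch. It assumes that a trailing "$" means right truncation (any word beginning with the stem) and that ".ti" restricts a term to the title. The five records, their relevance labels and the helper names are invented for illustration; the figures it produces are not those of Ingui et al.

```python
import re


def term_pattern(term):
    """Compile a search term; a trailing '$' is treated as right truncation."""
    if term.endswith("$"):
        body = re.escape(term[:-1]) + r"\w*"
    else:
        body = re.escape(term)
    return re.compile(r"\b" + body + r"\b", re.IGNORECASE)


def has_term(text, term):
    return term_pattern(term).search(text) is not None


def broad_filter(record):
    """predict$ OR clinical$ OR outcome$ OR risk$ (title and abstract)."""
    text = record["title"] + " " + record["abstract"]
    return any(has_term(text, t)
               for t in ("predict$", "clinical$", "outcome$", "risk$"))


def narrow_filter(record):
    """predict$.ti AND rule$ (first term restricted to the title)."""
    text = record["title"] + " " + record["abstract"]
    return has_term(record["title"], "predict$") and has_term(text, "rule$")


def evaluate(search_filter, records):
    """Return (sensitivity, positive predictive value) on labelled records."""
    retrieved = [r for r in records if search_filter(r)]
    relevant = [r for r in records if r["is_rule"]]
    hits = [r for r in retrieved if r["is_rule"]]
    sensitivity = len(hits) / len(relevant)
    ppv = len(hits) / len(retrieved) if retrieved else 0.0
    return sensitivity, ppv


# Toy records (assumed); is_rule marks a "gold standard" prediction-rule article.
records = [
    {"title": "A clinical prediction rule for ankle fractures",
     "abstract": "We derived and validated a rule.", "is_rule": True},
    {"title": "Predicting mortality after stroke",
     "abstract": "A risk score was derived from clinical variables.",
     "is_rule": True},
    {"title": "Outcome of hip replacement",
     "abstract": "Patients were followed for five years.", "is_rule": False},
    {"title": "Decision aid for chest pain triage",
     "abstract": "A scoring rule predicted outcome in the emergency department.",
     "is_rule": True},
    {"title": "Genetics of yeast",
     "abstract": "Gene expression was measured.", "is_rule": False},
]

print("broad: ", evaluate(broad_filter, records))
print("narrow:", evaluate(narrow_filter, records))
```

       On this toy set the broad filter retrieves every prediction-rule article along with one irrelevant record, while the narrow filter retrieves only relevant material but misses most of it.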
       Haynes et al. (1994) developed search filters and strategies for retrieving sound clinical studies in Medline. They performed an analytic survey of operating characteristics of search strategies developed by computerized combination of MeSH and text terms selected to detect studies meeting basic methodological criteria for direct clinical use in adult general medicine. Sensitivities, specificities, precision and accuracy of 134,264 unique combinations of search terms were calculated and compared to the manual review of articles or "gold standard". Ten internal medicine and ten general medicine journals in 1986 and in 1991 were searched.
       Combinations of search terms in 1991 reached peak sensitivities of 82% for studies of etiology, 92% for studies of prognosis, 92% for studies of diagnosis and 99% for studies of therapy. Multiple terms, compared to single terms, increased sensitivity by more than 30%, with some loss of specificity. For 1986, the peak sensitivities were 72% for studies of etiology, 95% for studies of prognosis, 86% for studies of diagnosis and 98% for studies of therapy.
When search terms were combined to maximize specificity, over 93% specificity was achieved for all purpose categories in both years. High accuracy was achieved by combining terms. Peak accuracies of over 90% were reached for therapy in 1986 and in 1991.
       Haynes et al. attributed some of the difficulties in using Medline to: the large number of postings in Medline (several million), the low prevalence of clinically applicable studies, the well-documented limitations of indexing and retrieval in Medline from its inception and the imprecise search skills of clinical end users.
       Seven of the 12 search strategies from 1991 could not be run in 1986 because terms used in 1991 were not available in 1986. Now, with even more new terms, search strategies like these need to be revised and updated and new ones need to be invented. The search strategy that yielded the best sensitivity (99%) for treatment in 1991 was "randomized controlled trial (pt)" or "drug therapy (sh)" or "therapeutic use (sh)" or all random (tw). "AND NOT" comment, letter and news can include other journals. Search strategies to maximize both sensitivity and accuracy outperformed other strategies.
Wood et al. (1999) developed the Large Scale Vocabulary Test (LSVT) to allow participants to search local terms and concepts in the Metathesaurus. The hypothesis was that a combination of existing terminologies will cover the majority of the concepts needed for a broad range of health information systems. The two largest vocabularies in the test - SNOMED International and the Read Codes - had the highest percentage (more than 60%) of the exact meaning matches. The study showed that most of the concepts and qualifiers needed to record data about patient conditions are already included in one or more of the UMLS vocabularies. The authors feel that their test could be used to enhance controlled vocabularies and for other collaborative informatics research and for design of efficient clinical data entry systems.
       Another possible problem with Medline information retrieval was addressed by Ojasoo et al. (2001). They analyzed the publication trends (PTs) of clinical medicine records in Medline and found that there were periods of erratic activity or quirks. In the late 1980s there was a greater interest in randomized clinical trials (RCTs), the gold standard of clinical investigation. Medical journals encouraged their publication, and sometimes grants and career advancement depended on it. A searcher looking for a term or answer may not use the term RCT or, upon using it, may come up with too many records.
       Bachmann et al. in 2002 constructed and validated a better search strategy to identify diagnostic articles recorded on Medline with special emphasis on precision. They set out to develop a more precise search strategy for selecting publications on diagnostic test evaluations without losing sensitivity. Medical journals in 1989, 1994 and 1999 were hand-searched. A word frequency analysis of the abstracts identified text words for search strategies. Sensitivity, precision and number needed to read (1/precision) of every candidate term were calculated. Sensitivity was the number of gold standard articles retrieved as a proportion of all gold standard articles. Precision was the number of gold standard articles retrieved as a proportion of all articles retrieved. The currently used PubMed filter, Clinical Queries, which was based on the work of Haynes et al. (1994), was the "gold standard".
       Bachmann et al. concluded that the performance of Clinical Queries may be overstated. The filter developed by Bachmann et al. performed slightly better than the currently available one and better with regard to precision in the 1994 subset. The sensitivity and precision of Clinical Queries for 1994 and 1999 were:
               1994     1999
Sensitivity    95.1%    88.8%
Precision      8.2%     4.3%

The sensitivity and precision of the new search filter for the years 1994 and 1999 are as follows:
               1994     1999
Sensitivity    98.1%    95.1%
Precision      12.0%    4.3%

Diagnostic studies were defined as having content pertaining directly to the evaluation of disease process usually through comparing methods of arriving at a diagnosis. Tests were defined as procedures used to change the estimate of the likelihood of disease presence.
       Inconsistent terminology used in diagnostic studies makes them difficult to index and to retrieve in electronic databases. Using Clinical Queries, Bachmann et al. found sensitivities of between 77% and 92% for all recorded material on Medline, at the price of having to sift through 12.5 records to find one article that refers to diagnosis. This does not seem bad until one sees that 625 records may have to be dealt with to find 50 relevant records. Time could be saved by relying on the filter with the highest specificity, but then many relevant records may be lost. This is especially risky in searching for diagnostic research, where there is a high variability in study outcomes. Bachmann et al. do not recommend the PubMed high-specificity filter. The term that performed best in their search was predict$, which was not evaluated as a text word by Haynes et al. (1994).
Four factors that can influence a filter's reproducibility, according to Bachmann et al., are: 1. the selection of journals; 2. the way in which abstracts are written may change over time; 3. editorial processing may change over time and may lead to different wording in abstracts; and 4. variation in indexing quality in Medline over time.
       Bachmann et al. plan in the future to evaluate their filter further in terms of time, cost and missed relevant records, to study the impact of language restrictions on summary measures, different search filters and search strategies, and to evaluate the conclusions of diagnostic reviews.
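       The arithmetic behind these figures follows from the definition of number needed to read as 1/precision. The short Python sketch below works it through; a precision of 8% is the value implied by 12.5 records per relevant article, and the function names are chosen here for illustration.

```python
def number_needed_to_read(precision_percent):
    """Records that must be read, on average, to find one relevant article."""
    return 100 / precision_percent


def records_to_read(precision_percent, relevant_wanted):
    """Expected number of records to sift through for a target number of hits."""
    return number_needed_to_read(precision_percent) * relevant_wanted


print(number_needed_to_read(8.0))   # records per relevant article at 8% precision
print(records_to_read(8.0, 50))     # records to sift through for 50 relevant articles

# The same calculation for the precision values tabulated for the two filters.
for precision in (8.2, 12.0, 4.3):
    print(f"precision {precision}%: about "
          f"{number_needed_to_read(precision):.1f} records per relevant article")
```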
       Information seekers in Medline have the option to search in core and non-core journals. McCain (1994) conducted a study, not to develop the definitive core list of biotechnology journals, but to explore the relationships among biotechnology (narrowly construed) and the several other fields participating in biotechnology R & D (exporting basic research or importing applications) through the citation and publication patterns of the formal literature. Her ranking-based selection technique weights the number of citations received by one journal from another by the proportion of all citations received and the size of both journals and ranks the titles based on this citation weight. "Cocitation" was defined as occurring when a minimum of one article from each of two journals is jointly cited. Journals with high intercorrelation are grouped together. This is cluster analysis.
The database-filtering approach developed combines citation and coverage analyses and can identify core journals in biotechnology based on the aggregate citation choices of authors and distinguish those that best cover biotechnology research from titles publishing an occasional article. McCain concluded that Medline searchers can identify useful core journals for their search and, if relevant information is not retrieved, could search in non-core journals.
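       The notion of journal cocitation can be made concrete with a small Python sketch that counts how often two journals are cited together by the same article. It shows only the counting step, not McCain's citation weighting or cluster analysis, and the reference lists and journal names are placeholders invented for illustration.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(reference_lists):
    """Count, for each pair of journals, the citing articles that cite both."""
    counts = Counter()
    for cited_journals in reference_lists:
        for pair in combinations(sorted(set(cited_journals)), 2):
            counts[pair] += 1
    return counts


# Each inner list holds the journals cited by one (hypothetical) article.
toy_reference_lists = [
    ["Journal A", "Journal B", "Journal C"],
    ["Journal A", "Journal C"],
    ["Journal B", "Journal D"],
    ["Journal A", "Journal C", "Journal D"],
]

counts = cocitation_counts(toy_reference_lists)
for pair, n in counts.most_common():
    print(pair, n)
```

       Pairs with high counts, such as Journal A and Journal C in this toy data, are the ones a cluster analysis would tend to group together.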
Sensitivity and Precision
       Boynton et al. (1998) designed search strategies based on a more objective approach to strategy construction to search for systematic reviews. A high sensitivity level of 98% and a relatively high precision level of 20% were achieved. The study showed that a frequency analysis approach can be used to construct highly sensitive strategies that have adequate levels of precision for retrieving systematic reviews. Medline was the test database. The authors state these problems with search strategies that rely on indexing terms alone: poor description of research design by the author, alternative terms or synonyms, lack of appropriate indexing terms in the Metathesaurus and inaccuracies in assigning index terms.
The "quasi-gold standard" was made up of 288 terms from the Annals of Internal Medicine, Archives of Internal Medicine, British Medical Journal, Journal of the American Medical Association, Lancet and the New England Journal of Medicine. Boynton et al. concluded that different search strategies could be used according to research needs. One surprising thing found was that using high sensitivity terms resulted in much lower overall or cumulative sensitivity than searches using lower sensitivity terms. Choosing various combinations of sensitivity and precision resulted in the optimal search strategy obtained by the group. This study could be conducted on other journals, including non-English ones and ones with different publication dates. Also, the unique contribution of individual terms to the overall sensitivity and precision of each strategy has not been estimated.
       Patel et al. (1998) studied medical informatics and Medline and concluded that cognitive science can contribute to objectives that concern researchers and practitioners in medical informatics. They expanded on research in computer-mediated communication. According to the authors, theories and methods from cognitive science can inform medical informatics by addressing important issues such as the usability of systems, the process of medical decision making and the training of physicians and end users.
       Westberg et al. (1999) worked with the UMLS to represent and link information needs for clinical practitioners to look up patient information. From 1991 to 1996 Cimino et al. examined common semantic and syntactic patterns and identified a set of general-purpose questions called "generic queries" that are tailored for user information needs. They hypothesized that the use of generic queries in clinical applications could facilitate determination of users' information needs and simplify the selection of potentially relevant information resources. They combined manual review by librarians with natural language processing and derived 37 generic queries that captured the essence of all queries in the study. They integrated what they learned into Medline.
       Ribeiro-Neto et al. in 2000 devised an automatic algorithm that categorizes medical documents by assigning an International Classification of Diseases (ICD) code to medical documents, which in the study were 77 discharge summaries. An average level of precision of 70% - 80% for category coding and 60% - 70% for subcategory coding was achieved. The algorithm made 25% of the mistakes of human specialists. Ribeiro-Neto et al. focused on medical records and ICD-9, the ninth version of ICD. Distinctions learned from this algorithm can be extended to Medline information retrieval. Ribeiro-Neto et al. set out to improve precision at high recall by taking advantage of the hierarchical structure of MeSH.
       Hersh and Greenes in 1990 and Hersh and Hickam in 1995 constructed the project SAPHIRE (Semantic and Probabilistic Heuristic Information Retrieval Environment) that developed methods for indexing and searching collections of medical documents and establishing reference collections that compare distinct information retrieval systems. SAPHIRE proposes to index and to search medical collections by using a semantic network of medical concepts and terms. The semantic network is based on UMLS. Spyns in 1996 gave an overview of categorization algorithms based on the idea of codes treated as concepts of the medical language whose meanings are defined through sentences in natural language.
       Some documents need to be assigned codes manually or semiautomatically. For instance, if a patient suffered from encephalomyelopathy, mitochondrial, and the corresponding entry in the ICD-9 alphabetical index is encephalomyelitis, the specialist would add an annotation to her ICD-9 alphabetical index indicating that encephalomyelopathy, mitochondrial constitutes an alternative path to the code.
Ribeiro-Neto et al. found that, in some cases, the ICD-9 index is not complete: 1. not all medical semantics are represented within the index; 2. the specialists opted for a code that is distinct from the default code recommended by the index; and 3. the specialists used knowledge about the semantics of a specific term that does not appear in the index and deduced the code using additional information in the text of the medical document.
In 1985 the NLM reviewed 2,000 literature search request forms submitted from the NIH and created a database of 155 representative queries for experimentation in bibliographic retrieval.
       A group at Yale University from 1989 to 1992 developed two knowledge-based programs designed to help clinicians find relevant literature references: PsychTopix for psychiatry and HepaTopix for hepatology. Selecting a topic generates an automatic Medline search using MeSH logic from the program's knowledge base.
       According to Westberg and Miller's review article, there have been varying reports of words in patient records mapping to MeSH and to UMLS. They cite these problems: difficulty in interpreting user queries, incomplete coverage of primary care concepts, intervocabulary mapping difficulties and inconsistencies within the Metathesaurus.
Human Computer Interaction
       Programs such as COACH™ (Kingsland III et al. 1993) help end users solve search problems by emulating the approach of the end user and applying specialized knowledge to help resolve them. COACH™ analyzes the user's search, interacts with the user by applying or suggesting alternative mappings from its knowledge sources - which include the Metathesaurus - and returns modified search results to the end user. COACH™ does not use Boolean searching but can help in a Boolean search. One common problem that end users face is entering an "AND" search for which few or no results are retrieved. For instance, one searching for stress fractures of the spine may enter "(fractures, stress) and spine". This may give few hits. COACH™ assumes that the user wants more hits. COACH™ uses the fact that "spine" has seven narrower terms in MeSH and explodes the command. The user can select which term to use. COACH™ can also narrow down a search. For instance, if "AZT and AIDS" is entered, over 800 hits are obtained. The searcher may be looking for specific therapeutic uses. Upon the searcher's request, COACH™ can add subheadings such as "therapeutic use" as qualifiers and synonyms to narrow down the search. COACH™ also limits searches by language, publication types, search years, current month (SDLINE), check tags and age group. A Metathesaurus concept pick list is presented on the screen. Programs like COACH™ can be very helpful, especially for new Medline users, for those not well-versed in the topic that they are searching or for new medical topics.
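       The "explode" operation in the stress fracture example can be sketched in Python as follows. This is not COACH™ code: the function names are invented, and the small hierarchy is a toy stand-in for the MeSH tree that lists only a few narrower terms rather than the seven mentioned above.

```python
# Toy hierarchy (assumed for illustration; not the full MeSH tree).
TOY_NARROWER_TERMS = {
    "spine": ["cervical vertebrae", "lumbar vertebrae"],
    "cervical vertebrae": ["atlas"],
}


def explode(term, narrower):
    """Return the term followed by all of its narrower terms, depth first."""
    expanded = [term]
    for child in narrower.get(term, []):
        expanded.extend(explode(child, narrower))
    return expanded


def build_query(fixed_term, exploded_term, narrower):
    """AND a fixed term with the OR of an exploded term and its descendants."""
    alternatives = " OR ".join(explode(exploded_term, narrower))
    return f"({fixed_term}) AND ({alternatives})"


print(build_query("fractures, stress", "spine", TOY_NARROWER_TERMS))
```

       Broadening one side of an "AND" search with narrower terms in this way is what allows a search that returned few hits to retrieve records indexed under more specific headings.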
       Wood et al. (1998) studied Internet end-to-end performance of pathways used to access information in the NLM databases and, by extension, other Internet biomedical resources. Quick median time to conduct standardized searches and get results form PubMed was 2 to 14 seconds for PubMed. In 1997 the NLM established an Internet connectivity evaluation project. Intercept to improve the understanding of the role that end-to-end Internet performance plays in facilitating (or hindering) access to the NLM and other biomedical databases.
       Bulk transfer capacity (BTC) measured the data transmission capacity of the Internet pathway between two locations; "ping" round-trip time as an indicator of the latency or propagation delay in the network for data traveling from one end of the network to the other and back; and the number and sequencing of links or hops form origin to destination (network routing). Packet loss, percentage of data packets for which the testing software did not receive an acknowledgment of successful transmission. Good for describing the performance of congested networks. The overall error rate is 1%. There was a slow response, no response or problems such as: "network error occurred, reset by peer", and "unexpected error has caused break in correction". The authors wrote that a multilevel approach with multiple tools, methods and metrics is probably needed to study end-to-end Internet performance, and that building and designing a significant margin of excess transmission capacity ma help to minimize peak. Peak hour delays were found.
       Joubert et al. (1998) showed that the conceptual graphs formalism offers powerful capabilities for the semantic integration of information databases using the UMLS knowledge sources. The authors concluded that, when data are structured by means of semantic relationships, the matching process is successful. It can also operate well if the records of a database are not structured, provided that it is nonetheless possible to identify semantic relationships between concepts. The use of conceptual queries for information retrieval is significant here. The Metathesaurus has an explicit hierarchy of instances of queries that can be immediately exploited by applications and used to implement the mechanism that is the basis for conceptual graphs systems. The authors would not refine concepts in the Semantic Network but would enhance the definitions of the core concepts that the Metathesaurus registers by means of contextual knowledge. The conceptual graph theory is able to represent concepts, instances of concepts in medical contexts and associations by means of semantic relationships.
Interfaces
       Medline is available through several search interfaces. The most popular one is PubMed. There are also Ovid, SilverPlatter (Webspirs) and FirstSearch.
Sandi Parker (2000) conducted a study comparing PubMed, Ovid, SilverPlatter and FirstSearch. The stars assigned to each one were:
PubMed 4.5
Ovid 4
Webspirs 3.5
FirstSearch 2.25

       She considered PubMed to be the best choice. It is the only one that is free. It is quick and can be easy or complex depending on the searcher and the type of search. It has access to Old Medline, which contains articles dating back to 1961; PreMedline, which contains records before they are officially indexed into Medline; PubMed Central; and the Nucleotide, Protein, Genome, Structure, Popset, Taxonomy, OMIM, SNP and UniGene databases.

The following are the ratings that Parker gave each interface:


                   FirstSearch   Ovid   PubMed   Webspirs
Composite          2.25          4      4.5      3.5
Content            3             4      5        3
Searchability      2             4      4        3
Pricing            2             4      N/A      4
Contract Options   2             4      N/A      4
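
The composite ratings appear consistent with an unweighted mean of the criterion scores, with criteria marked N/A left out. That rule is inferred from the numbers here and is not stated as Parker's method. A short check under that assumption (the names RATINGS and composite are hypothetical):

```python
from statistics import mean

# Criterion scores as listed above; None stands for "N/A".
RATINGS = {
    "FirstSearch": {"Content": 3, "Searchability": 2, "Pricing": 2, "Contract Options": 2},
    "Ovid": {"Content": 4, "Searchability": 4, "Pricing": 4, "Contract Options": 4},
    "PubMed": {"Content": 5, "Searchability": 4, "Pricing": None, "Contract Options": None},
    "Webspirs": {"Content": 3, "Searchability": 3, "Pricing": 4, "Contract Options": 4},
}


def composite(scores):
    """Unweighted mean of the criteria that were actually rated (assumed rule)."""
    rated = [value for value in scores.values() if value is not None]
    return mean(rated)


for interface, scores in RATINGS.items():
    print(interface, composite(scores))
```

Under this assumption the four FirstSearch scores average to 2.25, which matches the star rating listed for FirstSearch earlier in this section.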

Cost
PubMed is free. For FirstSearch, libraries can purchase a bank of searches or subscribe annually. For Ovid, the policy is "pay as you go". Also offered are Web access for a locally installed Ovid client-server system, fixed-fee Ovid Online via Web access and site licensing of databases. SilverPlatter costs differ according to the user, the user's needs and the parts of Medline accessed.

Searchability
Criteria used were:
1. Access to MeSH features: explode, focus, mapping and subheadings
2. Filtering results
3. Searching levels
4. Save features and support options
5. Document delivery
6. Currency and update options
7. Additional features

       Parker wrote that Ovid is great for both easy and advanced searches. Webspirs offers the Search Builder to design a complex search strategy. PubMed goes straight to a basic search but offers Limits, where the searcher can limit the search to year, database, language, publication type, human or animal, gender and publication date.
       Each interface utilizes MeSH headings and subheadings. Two of them, PubMed and Ovid, map the searcher's terms to MeSH. In PubMed, when the searcher presses "Details", the MeSH terms are shown. In FirstSearch one can search one index at a time unless terms are combined, in which case one can search multiple indices with up to four search statements. In SilverPlatter, one can search in all fields of records but there is no advice on MeSH terms to get more relevant retrieval.
Exploding/Focus
       FirstSearch offers no explode; its focus option is the central concept. PubMed automatically explodes, which is good because the searcher gets an idea of how much information there is on a topic and can focus.
       The way to focus in PubMed is not too obvious. In order to "focus" a search so that the selected MeSH term is one of the main topics discussed in the article, the MeSH Browser feature (also found on the sidebar) can be used. In the "detailed display" of the selected term, the box "Restrict Search to Major Topic headings only" can be checked. Turning on this feature narrows the focus of a search and may significantly decrease retrieval. If one is, however, typing a search into the query box, one can limit the MeSH term by tagging it with [majr] instead of [mesh].
       Ovid explodes and focuses. In SilverPlatter one can focus from Search Builder and Thesaurus. One can click on "major MeSH heading" which is the same as focus. Someone not familiar with the interface may not be able to find these options.


Subheadings
In Ovid it is easier to find this command than in PubMed, where one would have to go to Detail Display and then Browse MeSH. Webspirs gives this option in the Thesaurus window.
In PubMed, one can add the appropriate subheading directly to a term, e.g., nursing/manpower, or nursing/ma using the 2-letter short form.
One can also add a subheading without attaching it to a specific term, a less exacting strategy, e.g.:

Anti-Inflammatory Agents, Non-Steroidal AND alcoholic beverages AND (adverse effects [sh] OR poisoning [sh]).
Searches of this type can also be entered with the 2-letter short forms of the subheadings:
Anti-Inflammatory Agents, Non-Steroidal AND alcoholic beverages AND (ae [sh] OR po [sh])
To determine which subheadings can be used with each MeSH vocabulary term, one can use the MeSH Browser, accessed from the sidebar (click on detailed display for allowable subheadings), or go to the more detailed MeSH Browser at http://www.nlm.nih.gov/mesh/MBrowser.html.
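The two ways of adding subheadings described above follow a regular pattern, so the query strings can be assembled mechanically. The sketch below only builds text in the syntax shown in this section (a slash for an attached subheading, the [sh] tag for an unattached one). It does not contact PubMed, and the function names are hypothetical.

```python
def with_subheading(term, subheading):
    """Attach a subheading directly to a term, e.g. nursing/manpower."""
    return f"{term}/{subheading}"


def floating_subheadings(subheadings):
    """Combine unattached subheadings with OR, each tagged with [sh]."""
    tagged = [f"{name} [sh]" for name in subheadings]
    return "(" + " OR ".join(tagged) + ")"


def build_query(terms, subheadings=()):
    """AND the terms together, then AND the unattached subheadings, if any."""
    parts = list(terms)
    if subheadings:
        parts.append(floating_subheadings(subheadings))
    return " AND ".join(parts)


attached = with_subheading("nursing", "ma")
query = build_query(
    ["Anti-Inflammatory Agents, Non-Steroidal", "alcoholic beverages"],
    ["ae", "po"],
)
print(attached)
print(query)
```

The second string reproduces the short-form example query given above.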
Filtering
Ovid offered the most advanced filtering options. Then came PubMed, while FirstSearch and SilverPlatter did not offer many choices.
Saving
All but FirstSearch offered a save option. In PubMed it is not too obvious.
Help and Support Features
All four search engines offer these features.
Fulltext Links
All four search engines offer fulltext links.
Updating
Medline is updated monthly on FirstSearch. PubMed is updated weekly, and PreMedline is updated daily. Ovid and SilverPlatter are updated weekly or monthly depending on price.
Unique Interfaces
Each interface has unique features. Ovid has an "Ask the Librarian" button that links every page to a comment form that allows the searcher to send a question to the local library. Ovid also has the "Recover" option to make sure that a searcher can get back to a lost search whenever the "logoff" has not been initiated. PubMed has the hyperlink "Related Articles" that will retrieve a pre-calculated set of articles in Medline that closely relate to the selected article. PubMed also has the "Consumer Health" button, which links to an NLM Web site called MedlinePlus, an exceptional database designed for the health consumer. It includes health information and will also run a preformulated Medline search.
FirstSearch has a familiar interface for library clients and the flexibility to use the interface to search other databases in the FirstSearch family. Libraries subscribing to Electronic Collections Online take advantage of links to full text. Parker says about SilverPlatter that it is a "sophisticated Medline search interface with all the capability of the Ovid Advanced Search mode" (page 9) if the searcher goes into the Build a Search or the Thesaurus. She also wrote: "I was also somewhat frustrated by the way Webspirs opens separate windows for each search or function selected. But there may be other versions that do not perform that way."
Gruwell and Littleton (2002) provided several links to Medline tutorials:
For PubMed: http://www.nlm.nih.gov/bsd/pubmed_tutorial/m1001.html
http://www.stanford.edu/~cstave/pubmed/pubmed3.html
http://www.library.health.ufl.edu/pubmed/pubmed2
For Ovid: http://www.mclibrary.duke.edu/respub/guides/ovidtut
http://www.health.library.mcgill.ca/eguides/tutorial/index.htm


Online Publishing

Barry P. Markovitz, MD, is an advocate of making biomedical information available for free on the Internet. He believes that writers, not publishers, should have the rights to their own work. PubMed Central, a freely accessible preprint and postprint "eprint" archive, is available on the Internet, can be accessed from the PubMed interface and includes non-peer-reviewed material. Some non-peer-reviewed articles are preprints that will be subjected to formal peer review by journal editorial boards. Once scientists sell copyright to a journal, they have to pay the publisher in order to be able to distribute their material elsewhere. Reaching the largest possible audience would therefore cost the scientist a lot of money. PubMed Central bypasses this. Certain journals, like BMJ, already make their full text articles available online with little or no lag time from publication. Markovitz believes that most scientists and institutions would be content to pay page prices for online publishing because this would be cheaper than print. Advertising would cover costs. Markovitz maintains that the current biomedical publishing industry appears unable to think "outside the box" of the reader pays, restricted access, commercial publishing model.
Hersh et al. (2000) wrote that there are several reasons that online publishing of scientific journals would be a good idea: the cost of print journals is rising, libraries are buying fewer of them so there is less access to them, and there are more journals and conference proceedings.
Liz Pope of Haworth Press writes that PubMed Central "democratizes a process now almost exclusively the preserve of commercial publishers and learned societies" and "providing the options for depositing both peer-reviewed journal articles and non-peer-reviewed preprints, PubMed Central proves an important addition to the way scientific findings are communicated" (page 189).
Maxine Hattery, editor of Information Retrieval & Library Automation, wrote about how several university and research institute scientists in the fall of 2000 vowed to refuse to submit articles to any publisher that did not agree to deposit its journal articles, six months after publication, in PubMed Central or a similar archive for free public access. That year scientists from around the world, including Nobel laureates, sent letters to publishers expressing how they felt about maintaining the rights to their work and allowing free online publication and information exchange.
Delamothe and Smith, in an editorial in BMJ (2001), wrote: "But PubMed Central is the first initiative really to take account of how fundamentally the worldwide Web has changed the landscape of scientific publishing" (page 323). They believe that "authors want their work to have as wide a circulation as possible" and that PubMed Central will help them to achieve this (page 323).
Michael W. Jacobson, MD (2001) wrote that "the flow of information will indeed be enhanced and liberated, and the cost to consumers and researchers and libraries for access to information will drop substantially" (page 233) but that "The reality is that the biomedical press is too powerful and too integral a part of the research industry to have its foundations threatened by well-meaning scientists" (page 233). He does point out that "Allowing free access does not require giving up possession (just as museums the world over allow visitors to view their artworks, often for free, while retaining the rights to reproductions of work in their possession)" (page 232).

Works Cited
Bianchi, S. 2002. Database Reviews and Reports. PubMed: For More Than Just Medicine, This Is One of the World's Greatest Databases. Issues in Science and Technology Librarianship Spring 2002: 1 - 4.
Boynton, J. et al. 1998. Identifying Systematic Reviews in Medline: Developing an Objective Approach to Search Strategy Design. Journal of Information Science 24:137 - 157.
Brachmann, L. M. et al. 2002. Identifying Diagnostic Studies in Medline: Reducing the Number Needed to Read. Journal of the American Medical Informatics Association 9: 653-658.
Brennan, P. F. and Strombom, I. 1998. Improving Healthcare by Understanding Patient Preferences. Journal of the American Medical Informatics Association 2: 257 - 262.
Chang, J. T., Schutze, H. and Altman, R. B. 2002. Creating an Online Dictionary of Abbreviations from Medline. Journal of the American Medical Informatics Association 9: 612 - 650.
Chimoskey, S. J. and Norris, T. E. 1999. Use of Medline by Rural Physicians in Washington State. Journal of the American Medical Informatics Association 6: 332 - 333.
Clarke, M. 1997. MeSH Terms Must Be Used in Medline Searches. British Medical Journal 314: 1203 - 1204.
Clarke, M. and Oxman, A. 1999. Cochrane Reviews Will Be in Medline. British Medical Journal 319: 1435 - 1436.
Cockerill, M. 2002. Biological and Medical Publishing Via the Internet. Information Services and Use 21: 33 - 42.
Coletti, M. H. and Bleich, H. L. 2001. Medical Subject Headings Used to Search the Biomedical Literature. Journal of the American Medical Informatics Association 8: 317 - 323.
Cooper, G. F. and Miller, R. A. 1998. An Experiment Comparing Lexical and Statistical Methods for Extracting MeSH Terms From Clinical Free Text. Journal of the American Medical Informatics Association 5: 62 - 75.
Corn, M. 1998. Funding for Nursing Vocabularies. Journal of the American Medical Informatics Association 5: 391 - 392.
Delamothe, T. 2001. PubMed Central Increases Its Appeal. British Medical Journal 322: 818.
Dixon, Laura. A Quiver Full of Arrows: Recommended Web-Based Tutorials for PubMed, Powerpoint, Ovid Medline and Frontpage. Medical Reference Services Quarterly 21(2): 55 - 63.
Eberle, M. 2000. Current Awareness Using PubMed: Current WebServices and Possibilities for Local Solutions. Internet Reference Services Quarterly 5(2): 21 - 29.
Feinglos, S. F. 1985. MEDLINE: A Basic Guide to Searching. Chicago: Medical Library Association, Inc.
Goossen, W. T. F. et al. 1998. A Comparison of Nursing Minimal Data Sets. Journal of the American Medical Informatics Association 5: 152 - 163.
Greaves, L. and James, S. 1997. MeSH Terms Must Be Used in Medline Searching. British Medical Journal 314: 1203.
Greenhalgh, T. 1997. How to Read a Paper: The Medline Database. British Medical Journal 315: 180 - 183.
Greisdorf, H. and Spink, A. 2001. Median Measure: An Approach to IR System Evaluation. Information Processing and Management 37: 843 - 857.
Grogg, J.E. 2002. EBSCO Publishing Offers Full Text Through PubMed's LinkOut. InformationToday Spring 2002.
Hattery, M. 2001. The Public Library of Science Research 37 (@): 1 - 3.
Haynes, R. B. et al. 1994. Developing Optimal Search Strategies for Detecting Clinically Sound Studies in Medline. Journal of the American Medical Informatics Association 1(6): 447 - 456.
Hersh, W. R. and Rindfleisch, T. C. 2000. Electronic Publishing of Scholarly Communication in the Biomedical Sciences. Journal of the American Medical Informatics Association 7: 324 - 325.
Hersh, W. R. and Greenes, R. A. 1990. SAPHIRE - An Information Retrieval System Featuring Concept Matching, Automatic Indexing, Probabilistic Retrieval, and Hierarchical Relationships. Computers and Biomedical Research 23: 410 - 425.
Hersh, W. R. and Hickam, D. H. 1995. Information Retrieval in Medicine: The SAPHIRE Experience. Journal of the American Society for Information Science 46(10): 743 - 747.
Humphreys, B. L. 2000. Electronic Health Record Meets Digital Library. Journal of the American Medical Informatics Association 7: 444 - 452.
Humphreys, B. L. et al. 1997. Evaluating the Coverage of Controlled Health Data Terminologies. Journal of the American Medical Informatics Association 4: 484 - 500.
Humphreys, B. L. et al. 1998. The Unified Medical Language System: An Informatics Research Collaboration. Journal of the American Medical Informatics Association 5: 1 - 11.
Impicciatore, P. 1997. Reliability of Health Information for the Public on the World Wide Web: Systematic Survey of Advice on Managing Fever in Children at Home. British Medical Journal 314: 1875 - 1881.
Ingui, B. J. and Rogers, M. A. M. 2001. Searching for Clinical Prediction Rules in Medline. Journal of the American Medical Informatics Association 8: 391 - 397.
Jacobson, M. 2000. Biomedical Publishing and the Internet. Journal of the American Medical Informatics Association 7: 230 - 233.
Jones, R. et al. 1999. Randomized Trial of Personalized Computer Based Information for Cancer Patients. British Medical Journal 319: 1241 - 1248.
Katcher, B. S. 1999. MEDLINE: A Guide to Effective Searching. San Francisco: The Ashbury Press.
Kim, W. and Wilbur, W. J. 2000. Corpus-Based Statistical Screening for Phrase Identification. Journal of the American Medical Informatics Association 7: 499 - 511.
Kingsland III, L. C. 1993. Coach™: Applying UMLS Knowledge Sources in an Expert Searcher Environment. Bulletin of the Medical Library Association 81(2): 178 - 183.
Kotzin, S. 2002. Medline and PubMed Will Be Able to Synthesize Clinical Data. British Medical Journal 324: 791.
Liu, H., Johnson, S. B. and Friedman, C. 2002. Automatic Resolution of Ambiguous Terms Based on Machine Learning and Conceptual Relations in the UMLS. Journal of the American Medical Informatics Association 9: 621 - 636.
Markovitz, B. P. 2000. Biomedicine's Electronic Publishing Paradigm Shift: Copyright Policy and PubMed Central. Journal of the American Medical Informatics Association 7(3): 222 - 229.
Masys, D. R. 1998. Presentation of the Morris F. Collen Award to Donald A. B. Lindberg, MD. Journal of the American Medical Informatics Association 5: 214 - 216.
McCain, K. W. Biotechnology in Context: A Database-Filtering Approach to Identifying Core and Productive Non-Core Journals Supporting Multidisciplinary R & D. Journal of the American Society for Information Science 46(4): 306 - 317.
McCain, K. W. and Morris, T. The Structure of Medical Informatics Journal Literature. Journal of the American Medical Informatics Association 5: 448 - 466.
Morris, T. A. et al. 1997. Approaching Equity in Health Information Delivery. Journal of the American Medical Informatics Association 4: 6 - 13.
Notess, G. R. 2000. PubScience: Evolution or Devolution. Econtent February/March: 64 - 66.
Ojasoo, T., Maisonneuve, H. and Jean-Christophe, D. 2000. Evaluating Publication Trends in Clinical Research. How Reliable Are Medical Databases? Scientometrics 50(3): 391 - 404.
Oxman, A. D. et al. 1994. Users' Guide to the Medical Literature VI. How to Use an Overview. Journal of the American Medical Association 272(17): 1367 - 1371.
Parker, S. 2001. Medline: Comparative Review on Ovid, Silverplatter, FirstSearch and PubMed. Denison Memorial Library, University of Colorado Health Sciences Center.
Pope, L. 2001. PubMed Central: A Barrier-Free Repository for the Life Sciences. The Serials Librarian 40(1/2): 183 - 190.
Ra, J. G. and Vermuelen, M. J. 1996. Mizspellin and Medline. British Medical Journal 313: 1658 - 1659.
Ribeiro-Neto, B., Laender, B. F. and Lima, L. R. S. 2000. An Experimental Study in Automation - Categorizing Medical Documents. Journal of the American Society for Information Science 52(5): 391 - 401.
Smith, R. 2001. Britain's Gift: a "Medline" of Synthesized Evidence. British Medical Journal 323: 694 - 696.
Sievert, M. C. et al. 2001. Need a Bloody Nose Be a Nosebleed? Or, Lexical Variants Cause Surprising Results. Bulletin of the Medical Library Association 89(1): 68 - 71.
Spink, A. and Yang, Y. 2001. Medical and Health Web Searching: An Exploratory Study. School of Information Sciences and Technology. 1 - 29.
Spyns, P. 1996. Natural Language Processing in Medicine: An Overview. Methods of Information in Medicine 35(4): 285 - 301.
Treweek, S. P. et al. Computer-Generated Patient Education Materials: Do They Affect Professional Practice? Journal of the American Medical Informatics Association 9: 346 - 358.
Westberg, E. E. and Miller, R. A. 1999. The Basis for Using the Internet to Support the Information Needs of Primary Care. Journal of the American Medical Informatics Association 6: 6 - 25.
Wolf, F. M. et al. A Trend Analysis and Search Strategies for the Identification of Meta-Analyses in Medline. (Abstract) In: Fourth International Cochrane Colloquium, Adelaide. 1996. Available at: www.cochrane.org/cochrane/abpos22.htm (last accessed 8 November 2001).
Woods, D. and Trewsheetlar, K. 1998. Medline and Embase Complement Each Other in Literature Searches. British Medical Journal 316: 1166.
Yu, H., Hripcsak, G. and Friedman, C. 2002. Mapping Abbreviations to Full Forms in Biomedical Articles. Journal of the American Medical Informatics Association 9: 262 - 272.

Louiza Patsis, M.S.