DIS 801
Dr. Richard Smiraglia
Louiza Patsis
December 5, 2002
ABSTRACT
MEDLARS online, or Medline, is the world's most heavily used medical database,
complete with its own controlled vocabulary called Medical Subject Headings
(MeSH). In 1971 Medline became one of the first online databases for information
retrieval. This paper reviews the research that has taken place through the
years regarding information retrieval and interface issues with Medline. These
studies provide useful information for many online databases and their ongoing
improvement in the future.
INTRODUCTION
Medline is a bibliographic database published by the National Library of Medicine
(NLM). Medline contains information about the medical documents or records.
Medline now contains more than 10 million records from more than 4,200 journals
that publish information about the causes, prevention and treatment of disease
and injury (Katcher 1999). It is accessed more than 18,000 times a day (Kingsland
III 1993). Each record has been read by a skilled indexer, who has assigned
to each record roughly a dozen subject headings drawn from a controlled vocabulary
of more than 19,000 Medical Subject Headings (MeSH). Chimoshey and Norris (1999)
conducted a study that confirmed that even rural physicians in Washington State
use Medline and 70% of them say that they benefit from using Medline.
The beginning of Medline, which can serve as an example for other large online
databases, started with one man, John Shaw Billings. Billings was a medical
student at the Medical College of Ohio. He needed information about epilepsy
for his graduate thesis. He spent six months in Cincinnati, New York City and
Philadelphia. In 1864, at 27 years old, he was transferred from field duty as
a military surgeon to the Surgeon General's Office in Washington, DC. One of
his duties was to care for the Surgeon General's collection of books and journals.
In the following years, Billings expanded the collection, building it into the
country's largest medical library. He built up a collection of 52,000 records
by 1876. He indexed the records by author and title. Records included books,
journal articles, pamphlets, reports and theses.
Billings developed an elaborate system for cataloguing everything within the
Library. Cataloguing records by subject was not practiced by
most libraries at the time. Users located books by author and then searched
the indices of individual books to find information about a particular topic.
He designed a cataloguing system that included both author and subject names
in his Index Catalogue of the Library of the Surgeon General's Office. Because
it was so extensive, it came to be regarded as an index to the world's medical
literature. Billings included the growing number of medical journals in his
collection. The Index Medicus, an index to articles published in medical
journals, appeared monthly starting in 1879.
Billings became the Director of the New York Public Library in 1895 after retiring
from the army. In 1927 Index Medicus was merged with the American Medical
Association's competing bibliography and renamed Quarterly Cumulative Index Medicus.
In 1949 Colonel Frank Bradway "Brad" Rogers became director of the
Army Medical Library, formerly the Library of the Office of the Surgeon General
of the Army. He was sent by the army to obtain a master's degree in librarianship. He
produced a standardized list of subject headings for the Current List of Medical
Literature. In 1956 the United States government gave the Armed Forces Medical
Library statutory authority as the NLM and made it a separate institution within
the United States Public Health Service. This would later become the National
Institutes of Health. In 1960 Rogers guided the publication of the newly revived
monthly Index Medicus with a freshly revised and expanded list of standardized
subject headings. This list was called the Medical Subject Headings or MeSH.
The list consists of single- or multi-word terms used to catalogue and to index
the medical literature.
Billings had been busy designing the Johns Hopkins Hospital, shaping the curriculum
of the Johns Hopkins Medical School and presiding over the development of the
New York Public Library system. He also began to work on the United States Census.
He suggested recording information as punched holes on cards so that the cards
could later be sorted according to the alignment of the holes.
Herman Hollerith, a young engineer who was working on the eleventh census of 1890,
perfected the cards and developed a machine for sorting them. The cards became
known as Hollerith Cards, although, as even Hollerith admitted, they were Billings's
idea.
In 1896 Hollerith set up the Tabulating Machine Company, which was eventually
absorbed into International Business Machines (IBM). The cards became known
as IBM punch cards and started to be made with punch holes corresponding to
the Medical Subject Headings (MeSH) in the Index Medicus. MeSH was one of the
first vocabularies with an explicitly tagged database, content-free identifiers,
polyhierarchy, extensive cross-references and text definitions.
In 1960 the Index Medicus had three volumes. In 1999 the author entries filled
six volumes and the subject entries filled ten volumes. Prior to MeSH and punched
cards, there was no way to search for entries with two authors or two subjects.
In 1960 the NLM began development of the Medical Literature Analysis and Retrieval
System (MEDLARS), which used MeSH. MeSH is updated annually. The 1963 edition
had the first version of MeSH "tree structures" with their headings
and subheadings. A main heading can appear in more than one subcategory. For
instance, "hepatic" appears in the Infectious Diseases tree under
"tuberculosis" and in the Digestive System Diseases tree under "liver
disease".
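The idea that one heading can sit in more than one tree can be pictured with a
short sketch. The headings and tree names below are illustrative assumptions,
not actual MeSH data; the sketch only shows that a single heading may be filed
under several trees at once.

```python
# Hypothetical mini-vocabulary: each heading maps to every tree path
# under which it is filed. Names are illustrative, not real MeSH entries.
TREE_PATHS = {
    "tuberculosis, hepatic": [
        ("Infectious Diseases", "tuberculosis"),
        ("Digestive System Diseases", "liver disease"),
    ],
    "asthma": [
        ("Respiratory Tract Diseases", "bronchial diseases"),
    ],
}


def trees_for(heading):
    """Return the sorted top-level trees in which a heading appears."""
    return sorted({path[0] for path in TREE_PATHS.get(heading, [])})


print(trees_for("tuberculosis, hepatic"))
```

A searcher who browses either tree would reach the same heading, which is the
practical benefit of a polyhierarchy.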
By 1965, searches could be submitted to trained librarians at the NLM or its
branches. The librarians would formulate each search and then submit it to the
MEDLARS Search Center, where punched cards were fed into a computer and the
printout was shipped back by mail. The process took four to six weeks. Up to
three search statements were allowed. Some searches yielded too few or too many
hits, or retrieved irrelevant documents, so scientists would have to resubmit
queries. These issues are still pertinent today.
In 1968, the first real-time, online bibliographic retrieval system was inaugurated
at the State University of New York Biomedical Communication Network, headquartered
at the SUNY Upstate Medical Center Library in Syracuse, New York. At nine medical
libraries, teletypewriter terminals searched 90,000 references.
Results were printed offline and mailed. In 1971 Medline was introduced online
by the NLM. Searchers took a two-week course in online searching. Connect time
was expensive. Searchers had to know details such as whether search terms were
singular or plural or whether the adjective preceded the noun. A team of indexers analyzed
each article and assigned eight or ten subject headings to each article. Two
or four subject headings were pertinent for the article to appear in Index Medicus.
Indexing was always prone to human error. Indexers may assign a wrong term or
forget to assign a term or subheading.
Searchers are also prone to error. They may misspell a word or not know the proper
MeSH term, and so retrieve no documents, too few or too many.
Medline now includes the Index Medicus, Index to Dental Literature and the International
Nursing Index. Documents can be searched by author, title, subject, journal,
institution, author's address, year of publication and more.
Ralph T. Esterquist, of the 1963 Clinic on Library Applications of Data Processing,
said at a 1963 symposium: "The impact of MEDLARS in the medical library
world is not that of the familiar metaphor - the pebble dropped into the pond,
casting concentric circles that reach many points on the shore. The impact is
no pebble for sure. It is a mighty rock. The waves it will cause will surge
and splash for a long time to come." (Coletti and Bleich 2002, 1-2)
INFORMATION RETRIEVAL AND MEDLINE
Many typical issues of information retrieval come up with Medline, which is
a widely-used database. Studies on Medline information retrieval can shed light
on these issues. The Pew Internet Project of 2002 (Clarke and Greaves 1997)
estimated that 62% of Internet users (73 million people living in the United
States) search the Internet for health information. The study showed that 93%
of health information seekers surveyed looked for information about a specific
illness, 65% sought information on exercise, nutrition or weight control, 64%
for prescription drugs and 33% for sensitive health information. More than half
of the respondents reported using the Internet for health information every
few months or less frequently.
Many types of searchers search Medline. Some are clinicians, some are students.
Some know exactly what they are looking for and others do not. There has been
an increase in use of Medline by physicians (Spink and Yang 2001).
It is best to use MeSH headings (Appendix I) and subheadings (Appendix II).
Often searching by text words (Appendix III) brings up too few or too many results.
Often a person conducts a search and sees that a new search with different terms
is needed. (Feinglos 1985, 18)
Misspelling
Falleis and Fricke (2002) researched the misspelling of medical terms. They
saw that many users misspell words and that they had specific questions that
were beyond the NLM website. Ray and Vermeulen (1996) misspelled ten popular
text words on purpose. Examples are "hamorrhage" for "haemorrhage"
and "myocardial infartion" for "myocardial infarction".
Surprisingly, in some cases, articles that would not have been pulled up had
the word been spelled correctly were pulled up because the word was spelled
incorrectly in the article. Overall, of course, many articles were missed. Clarke
and Greaves (1997) misspelled the methodological term "random" to
"randon" and had similar results: some articles that would have not
been retrieved had the term been spelled correctly were retrieved. But overall,
articles were missed. They also found that searching with MeSH terms yields more
relevant information.
Often Medline searchers, including experienced users, do not know the MeSH term
for their topic of interest. They will then enter a text word. For instance,
they may enter "high blood pressure" instead of the MeSH term "hypertension"
or "heart attack" instead of the MeSH term "myocardial infarction".
Such common text words are mapped to the MeSH terms. For less common terms,
the relevant articles may not be retrieved. Cullinan et al. (2001) studied what
the term "bloody nose", as opposed to the MeSH term "epistaxis", would retrieve.
They searched in the Medline search engines PubMed and Ovid. They also studied
the lexical variants "pink eye" and "pinkeye" and "color blindness" and
"colorblind". They concluded that Medline is not designed for consumers. For
instance, entering "bloody nose" in Ovid under the subject heading option
yielded terms that consumers or new users may not be familiar with, such as
"glycemic", "herbicides" and "surface-active agents". On PubMed the entry
"nosebleed" resulted in 1,849 entries. Clicking on the details box showed that
the term had been linked to "epistaxis". A searcher would need to know what they
are searching for and how to narrow down the search to obtain fewer, more
relevant articles. Narrowing down a search will be explored later in this
review. Another search problem is that "bloody nose" resulted in "No term found"
in PubMed; it was not mapped to the MeSH term "epistaxis". Results of the
different terms entered are shown in Table 3 (Cullinan et al. 2001, 69).
"Colorblindness" and "pink eye", with all of their variations, were mapped to
MeSH terms, and "conjunctivitis" and "keratoconjunctivitis" retrieved more
articles. Cullinan et al. concluded that more consumer terms should be mapped
to MeSH counterparts.
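The mapping of consumer terms to MeSH counterparts can be illustrated with a
minimal sketch. The synonym table below is a hypothetical sample, and the use
of Python's difflib to tolerate small misspellings is an assumption made for
illustration; it is not a description of how PubMed or Ovid maps terms.

```python
import difflib

# Hypothetical synonym table; a real system would draw on a much larger
# source. Keys are consumer terms, values are MeSH headings.
CONSUMER_TO_MESH = {
    "nosebleed": "epistaxis",
    "bloody nose": "epistaxis",
    "high blood pressure": "hypertension",
    "heart attack": "myocardial infarction",
    "pink eye": "conjunctivitis",
    "pinkeye": "conjunctivitis",
}


def map_to_mesh(query, cutoff=0.8):
    """Map a consumer term to a MeSH heading, tolerating small misspellings."""
    key = query.strip().lower()
    if key in CONSUMER_TO_MESH:
        return CONSUMER_TO_MESH[key]
    # Fall back to the closest known consumer term, if any is close enough.
    close = difflib.get_close_matches(key, list(CONSUMER_TO_MESH), n=1, cutoff=cutoff)
    return CONSUMER_TO_MESH[close[0]] if close else None


print(map_to_mesh("Bloody nose"))
print(map_to_mesh("hart attack"))
```

A term with no close match returns None, which corresponds to the "No term
found" outcome described above.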
Clearly, as well indexed as Medline is, improvements can be made. This comes
up often. There is also the issue of new medical terms that are added to MeSH
such as new diseases and drugs.
Types of Searches
These considerations need to be combined with the search technique of end users.
Spink and Yang (2001) studied medical and health consumer searches, not on
Medline but on the Excite and FAST commercial Web search engines. They concluded
that most consumers fail to understand the limitations of the Web search process,
ascribe human and advice-seeking abilities to Web search engines or do not
understand how a Web search engine works, and that medical information models
need to be tailored to less-educated consumers.
In 1997 Greenhalgh conducted a study of Medline retrieval. Different suffixes
can be used according to how one is searching (Table 2, page 181). For instance,
if one is trying to find a paper called "Confidentiality and Patients'
Casenotes" and one knows that it is in the British Journal of General Practice,
this sequence should be typed:
1. confidentiality.ti
2. british journal of general practice.jn
3. 1 and 2
Or in one step:
confidentiality.ti and british journal of general practice.jn
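The way numbered search statements build on one another can be sketched in a
few lines. The records and the search_field helper below are hypothetical and
only mimic field-restricted searching; they are not the Ovid implementation.

```python
# Hypothetical records; "ti" is the title field and "jn" the journal field.
RECORDS = [
    {"id": 1, "ti": "Confidentiality and patients' casenotes",
     "jn": "British Journal of General Practice"},
    {"id": 2, "ti": "Confidentiality in nursing",
     "jn": "Journal of Advanced Nursing"},
    {"id": 3, "ti": "Asthma in children",
     "jn": "British Journal of General Practice"},
]


def search_field(term, field):
    """Return the ids of records whose given field contains the term."""
    return {r["id"] for r in RECORDS if term.lower() in r[field].lower()}


set1 = search_field("confidentiality", "ti")                      # statement 1
set2 = search_field("british journal of general practice", "jn")  # statement 2
set3 = set1 & set2                                                # statement 3: 1 and 2
print(sorted(set3))
```

Each statement yields a set of records, and the third statement simply
intersects the first two.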
Greenhalgh then studied a sample search for answering a specific question: "Is
there any evidence that taking oral contraceptives in these circumstances really
prevents long term bone loss?"
Greenhalgh searched in the Ovid Medline and SilverPlatter search engines. One can
type "anorexia nervosa" or "anorexia nervosa.tw" ("tw" stands for text word).
In the first case, the request will be mapped to a standard MeSH term. In
SilverPlatter, "suggest" is clicked after one enters the query term. One can
then choose "anorexia nervosa" or "eating disorders". Then the searcher is asked
whether to "restrict to focus" to get only articles that are about anorexia
nervosa, not those that merely mention it. Another option is to search
subheadings. The following search can be entered:
*anorexia nervosa/
"*" shows that it is a major focus and "/" that it is a
MeSH term.
The term "osteoporosis/" yielded 2200 hits and "contraceptives,
oral/" yielded 1200 hits. The symbol "*" was not used with "osteoporosis"
because it was not the major focus of the article.
This combination can be used: "*anorexia nervosa/ and osteoporosis/ and
contraceptives, oral/". Over 4,000 articles were retrieved. To focus on more specific articles,
MeSH subheadings can be used. Greenhalgh claims that 50% of Medline articles
are inadequately or incorrectly classified by subheading. If a searcher is experienced
or is sure of subheadings, then they can be used. As will be covered later,
some Medline search engines have options for MeSH assistance on their interface.
One problem has been that articles indexed before a term is introduced are not
indexed under that term. The seminal paper about a new topic is therefore usually
not indexed under that term. For example, the first article on percutaneous
transluminal coronary angioplasty was indexed under angiography, catheterization,
heart catheterization and coronary vessels. The topic was later indexed under
angioplasty, balloon and, from 1989, under angioplasty, transluminal,
percutaneous coronary (Coletti and Bleich 2001).
Boolean Searching
Boolean logic is named after George Boole, the nineteenth century mathematician
who described this system of logic. Boolean logic can be used in Medline. "AND"
is used to look for both terms in an article. Table 4 shows the intersection
of MeSH terms osteoarthritis and ibuprofen. "OR" can be used to find
either term in an article. For instance, one can look for osteoarthritis
and (aspirin or ibuprofen). This would search for osteoarthritis and either
aspirin or ibuprofen. Figures 1 and 2 are simple depictions of the "AND"
and "OR" concepts. "NOT" is used to exclude a specific search
term. For instance, if too many papers are retrieved, one could enter "NOT
letters" to exclude all letters.
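The three Boolean operators correspond to set operations on the lists of
articles indexed under each term. The following is a minimal sketch with
made-up article numbers, not data from Medline.

```python
# Hypothetical index: each term maps to the ids of articles indexed with it.
INDEX = {
    "osteoarthritis": {1, 2, 3, 4},
    "aspirin": {2, 5},
    "ibuprofen": {3, 4, 6},
    "letter": {4, 6},
}

# AND: articles indexed with both terms (set intersection).
both = INDEX["osteoarthritis"] & INDEX["ibuprofen"]

# OR: articles indexed with either term (set union).
either_drug = INDEX["aspirin"] | INDEX["ibuprofen"]

# osteoarthritis and (aspirin or ibuprofen)
combined = INDEX["osteoarthritis"] & either_drug

# NOT: drop articles that are letters (set difference).
without_letters = combined - INDEX["letter"]

print(sorted(both), sorted(combined), sorted(without_letters))
```

"AND" and "NOT" can only shrink a result set, while "OR" can only enlarge it,
which is why "OR" is used inside the parentheses to gather alternatives first.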
Susan J. Feinglos claims that "AND" is a better way to limit results.
"AND human" would exclude papers that focus only on animals but would
not exclude those dealing with both human beings and animals. Greenhalgh (2001)
used "surrogate not mother$.tw" to search for surrogate endpoints in clinical
pharmacology research. She wanted to exclude anything on motherhood. The
truncation symbol "$" matches "mother" with any ending, such as "s" or "hood".
Using the "adj" operator also helps to narrow a search. For instance,
one can look for "home help" as "home adj help.tw".
Sometimes no or too few articles are retrieved. For a less common search topic
such as the psychology of diabetes, the search "diabet$.tw and psychol$.tw"
would be useful.
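Truncation and adjacency can be approximated with regular expressions. The
sketch below is an analogy under stated assumptions: the helper names are
hypothetical, and the patterns only imitate the behaviour of "$" and "adj" on
a single string rather than reproducing any search engine's syntax.

```python
import re


def truncation(stem):
    """Pattern for a stem followed by any ending, in the spirit of 'mother$'."""
    return re.compile(r"\b" + re.escape(stem) + r"\w*", re.IGNORECASE)


def adjacent(first, second):
    """Pattern for two words next to each other, in the spirit of 'home adj help'."""
    return re.compile(
        r"\b" + re.escape(first) + r"\s+" + re.escape(second) + r"\b",
        re.IGNORECASE,
    )


title = "Psychological support for diabetic patients receiving home help"

# Both truncated stems must match for the combined search to succeed.
found = bool(truncation("diabet").search(title)) and bool(truncation("psychol").search(title))
print(found)
print(truncation("mother").findall("Mothers and motherhood"))
print(bool(adjacent("home", "help").search(title)))
```

Truncation widens a search by accepting any ending, while adjacency narrows it
by requiring the words to appear side by side.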
The "explode" strategy is also helpful for preventing incomplete searches.
For a broad topic like asthma, the MeSH "asthma tree" would have many
subdivisions such as "asthma in children", "occupational asthma"
and more. A search for "asthma" may miss these terms. One can explode
the search: "exp asthma/".
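Exploding a heading amounts to collecting the heading together with everything
beneath it in the tree. A minimal sketch, assuming a small hypothetical
fragment of the asthma tree:

```python
# Hypothetical fragment of a subject tree: heading -> narrower headings.
NARROWER = {
    "asthma": ["asthma in children", "occupational asthma"],
    "asthma in children": [],
    "occupational asthma": [],
}


def explode(heading):
    """Return the heading plus all of its descendants in the tree."""
    result = [heading]
    for child in NARROWER.get(heading, []):
        result.extend(explode(child))
    return result


print(explode("asthma"))
```

A search on the exploded list retrieves articles indexed under any of the
narrower headings as well as the broad one.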
Sometimes one does not know where to start searching, as for a term like "stress",
where there could be types of stress that a searcher would not know about. The
following can be entered:
ptx stress
Options such as "post-traumatic stress disorders", "stress fracture"
and "oxidative stress" are shown.
If your subject is a MeSH term, use the tree command.
For instance, "tree epilepsy" shows you where epilepsy is in the MeSH
index, and terms such as "generalized epilepsy", "partial epilepsy"
and "post-traumatic epilepsy" are shown.
Limiting a set, though, does not guarantee that you will retrieve all of the
important articles and not get any irrelevant articles or articles of low methodological
quality.
Evidence-based quality filters (EBQFs), which will be covered later, are complex
search strategies developed by experienced medical information experts. "AND"ing
for certain articles, such as those covering randomized clinical trials, can
also help to narrow results and to get relevant articles at the same time.
The Unified Medical Language System
In the area that is most important in Medline searching - the assigning of MeSH
terms - the Medline database is remarkably accurate (Coletti and Bleich 2001).
The Unified Medical Language System (UMLS) was developed in the 1980s
by medical informatics specialists. The NLM and Lexical Technologies in Alameda,
California, built the UMLS Knowledge Sources to improve the ability of computer
programs to understand the biomedical meaning of user inquiries and to use this
understanding to retrieve and to integrate relevant medical information from
the Internet (Yu et al. 2002). The Integrated Advanced Information Management
Systems (IAIMS) were formed in the 1980s to link automated clinical data and
knowledge-based information to support health care, research and education.
UMLS components provide some of the infrastructure for the integrated informatics
systems that are the focus of the IAIMS. UMLS was also initiated to facilitate the
development of IAIMS systems that can link and integrate different types of machine-readable
biomedical information like patient records, biomedical literature, factual
databases and expert systems (Yu et al. 2002).
People who work on the UMLS build intellectual "middleware" - electronic
knowledge sources and related lexical programs - to help systems developers
build applications that can interpret user queries and find relevant information.
The NLM continually expands UMLS products to update them and to improve their
utility.
The UMLS was built to overcome two important barriers to the development of
information systems: disparity of terminologies used in different information
sources and by different users and the sheer number and distribution of machine-readable
sources that can be relevant to any user inquiry (Humphreys 1998).
The UMLS supports the development of user-friendly systems for information retrieval.
The UMLS project has produced and widely disseminated four multi-purpose knowledge
sources designed for system developers: the Metathesaurus, the Semantic Network,
the Information Sources Map and the SPECIALIST Lexicon. More information
on the UMLS is available at: http://www.nlm.nih.gov/pubs/factsheets/umlskss.html
The Metathesaurus links MeSH to text words and to other medical thesauri. It
draws on contributions from experts in medicine, biomedical sciences, medical
informatics, computer science, library and information science and linguistics.
It preserves the names, meanings, hierarchical contexts, attributes and
inter-term relationships present in source vocabularies, adds basic information
to each concept and establishes new relationships between terms from different
source vocabularies. With Metathesaurus information, computer programs can
interpret user inquiries, interact with users to refine queries and questions,
identify relevant databases and link alternate names such as abbreviations,
lexical variants, synonyms and translations for the same concept.
The Semantic Network has 134 semantic types and provides a consistent categorization
of all concepts represented in the Metathesaurus. There are 54 links that provide
the structure for the Network and represent important relationships in the biomedical
domain.
The SPECIALIST Lexicon provides access to lexical records. Lexical entries,
which may be single words or multi-words, record syntactic, morphological and
orthographic information and inflectional variation such as singular and plural
forms of nouns, conjugation of verbs and the positive, comparative and superlative
forms of adjectives and adverbs. The table LRAGR lists all variant forms for each
entry in the lexicon.
Version 2.0 of the UMLS was designed for:
- extensibility for ease of new feature incorporation
- scalability in handling ever-increasing user loads and increasing numbers
  of the UMLS vocabularies
- performance considerations permitting faster access to the UMLS data
- flexibility in access modes
- set with access to all of the UMLS data
- ease of administration by NLM staff and contractors
- limited system interruptions during system software upgrades
Version 3.0 will have the following additions:
- an associated object model for accessing/representing the Semantic Network
- object model equals/equivalence checking allowing instances of object
  model classes to be compared to each other
- online access to the features/functions of the UMLS MetamorphoSys utility
Term Disambiguation
Liu et al. (2002) proposed a method to disambiguate terms that correspond to
multiple UMLS concepts. They constructed sense-tagged corpora for almost all
ambiguous terms in the UMLS using Medline abstracts. Manual methods for this can
be expensive. For a term W that represents multiple UMLS concepts, a collection
of Medline abstracts that contain W is extracted. For each abstract, occurrences
of concepts S that have relations with W as defined in the UMLS are automatically
identified. A corpus tagged with annotated senses of W is derived from the
identified concepts. This method was evaluated on a set of 35 frequently occurring
ambiguous biomedical abbreviations using a gold standard set that was
automatically derived. Precision and recall were used to measure the quality of
the derived sense-tagged corpus. The results were an overall precision of 92.9%
and an overall recall of 47.4%. After excluding rare senses and ignoring
abbreviations with closely related senses, the overall precision was 96.8% and
the overall recall was 50.6%.
This study addressed the problem of the sometimes inadequate interpretation
of free text by biomedical natural language processing applications.
Terms in free text can be ambiguous. For instance, "capsule" can be a unit of
medication or a body region. Abbreviations can also be ambiguous, like "hr"
for "hour" or "heart rate". This can pose a problem for Medline searches: too
much information or irrelevant information can be retrieved.
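The sense-tagging idea of Liu et al. described above can be sketched in a
simplified form. The sets of related concepts below are hypothetical, and the
rule of tagging an abstract only when exactly one sense has a relative present
is an assumption made for illustration rather than the published algorithm.

```python
# Hypothetical conceptual relatives for two senses of the abbreviation "hr".
RELATIVES = {
    "heart rate": {"pulse", "tachycardia", "bradycardia"},
    "hour": {"minute", "day", "week"},
}


def tag_sense(abstract, relatives=RELATIVES):
    """Return the single sense whose relatives occur in the abstract, else None."""
    words = set(abstract.lower().split())
    matched = [sense for sense, related in relatives.items() if words & related]
    # Tag only when the evidence points to exactly one sense.
    return matched[0] if len(matched) == 1 else None


print(tag_sense("mean hr rose with tachycardia"))
print(tag_sense("hr was recorded"))
```

Abstracts with no relatives, or with relatives of several senses, are left
untagged, which is one way to see why such a method can reach high precision
while recall stays much lower.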
The Metathesaurus is organized by concept. Each distinct concept is assigned
a unique concept identifier (CUI). Many concept names can have one CUI, for
instance, "congestive heart failure" and "biventricular heart
failure". Furthermore, each concept name has a term status to indicate
whether it is the preferred concept name of the corresponding concept or if
it is suppressed, i.e. abbreviated or problematic.
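The relation between concept names, identifiers and term status can be
pictured as rows in a small table. The identifiers and the status labels in
this sketch are made up for illustration and do not come from the UMLS files.

```python
from collections import namedtuple

# Each row pairs a concept name with its concept identifier and term status.
# The identifiers and status labels here are hypothetical.
ConceptName = namedtuple("ConceptName", ["name", "cui", "status"])

ROWS = [
    ConceptName("congestive heart failure", "CUI-0001", "preferred"),
    ConceptName("biventricular heart failure", "CUI-0001", "other"),
    ConceptName("hypertension", "CUI-0002", "preferred"),
]


def cui_for(name):
    """Look up the concept identifier for a concept name, or None."""
    for row in ROWS:
        if row.name == name.lower():
            return row.cui
    return None


def same_concept(name_a, name_b):
    """True when both names are known and share one concept identifier."""
    cui_a, cui_b = cui_for(name_a), cui_for(name_b)
    return cui_a is not None and cui_a == cui_b


def preferred_name(cui):
    """Return the preferred name recorded for a concept identifier."""
    for row in ROWS:
        if row.cui == cui and row.status == "preferred":
            return row.name
    return None


print(same_concept("congestive heart failure", "biventricular heart failure"))
```

Because both names resolve to one identifier, a query phrased with either name
can be treated as a query about the same concept.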
There are hundreds of thousands of concepts and concept names listed in the
table MRCON. Table MRREL lists relationships between UMLS concepts. There are
millions of entries and nine relationship types, such as broader (RB), narrower
(RN) and similar (RL). Two concepts may have multiple relationships.
In the Semantic Network, each CUI has been assigned to one or more semantic
categories.
There are two kinds of ambiguity present in the UMLS: conceptual, which
refers to ambiguity due to multiple concepts of terms, and semantic, which refers
to ambiguity due to multiple semantic categories of terms. There is an ambiguous
term table, AMBIG.SUI, in the UMLS. The method proposed by Liu et al. (2002)
was concerned with conceptual ambiguity. It utilized conceptual relations
defined in the UMLS to automatically derive sense-tagged corpora for ambiguous
terms. Word sense disambiguation (WSD) classifiers were then automatically
constructed using the sense-tagged corpora. This work, unlike previous work,
used the UMLS as the conceptually oriented knowledge source.
In this study, precision was the ratio of the number of abstracts with correctly
identified sense to the number of abstracts that were sense-tagged using conceptual
relatives. Recall was the ratio of the number of abstracts with correctly identified
sense to the total number of abstracts in the gold standard set (GSS).
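These two definitions can be written out directly. The sketch below uses a
tiny made-up gold standard set; the numbers are illustrative and are not the
results reported by Liu et al.

```python
def precision_recall(tagged, gold):
    """Compute precision and recall for sense tags.

    tagged: abstract id -> sense assigned by the method (untagged ids absent)
    gold:   abstract id -> correct sense for every abstract in the gold set
    """
    correct = sum(1 for key, sense in tagged.items() if gold.get(key) == sense)
    precision = correct / len(tagged) if tagged else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


gold = {1: "hour", 2: "heart rate", 3: "heart rate", 4: "hour"}
tagged = {1: "hour", 2: "heart rate", 3: "hour"}  # abstract 4 was left untagged

precision, recall = precision_recall(tagged, gold)
print(round(precision, 3), round(recall, 3))
```

Here two of the three tagged abstracts are correct, so precision is about 0.67,
while only two of the four gold standard abstracts are correctly tagged, so
recall is 0.5.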
The WSD classifiers trained on sense-tagged corpora with high precision performed
better than those trained on sense-tagged corpora with low precision. Liu et al.'s
CRMap performed better than MetaMap, the UMLS program that maps biomedical text
to concepts in the Metathesaurus, with respect to the quality of the derived
sense-tagged corpora, except for APC and BSA. It was also superior to MetaMap
with respect to the performance of the WSD classifier for all but two abbreviations.
The sense-tagged corpus derived using CRMap had better precision than that
derived using MetaMap for all but one abbreviation. Liu et al. found that the
causes of low precision were relatedness among different senses (such as EMG
standing for electromyograph, electromyography, electromyogram and exomphalos
macroglossia gigantism), the existence of poor conceptual relatives and, in the
case of the WSD classifier, lack of enough training of the searcher.
This study shows that, although the UMLS is detailed and extensive, improvement
can be made in resolution of ambiguous terms. This would lead to higher precision
and recall in Medline information retrieval.
The goal of CRMap is to match only conceptual relatives. MetaMap has difficulty
finding conceptual relatives that contain prepositional noun phrases, a
limitation that CRMap does not have. For example, MetaMap fails to identify
persistent pulmonary hypertension of the newborn, which is a sibling of MAS
(meconium aspiration syndrome), as a relative of MAS in abstracts that contain
it. Liu et al. plan to further investigate relations defined in different
sources, to formulate a new sense assignment scheme and to use clustering
techniques to find instances that are associated with rare senses or unknown senses.
Indexing
Kim et al. (2000) studied the extraction of useful phrases by statistical methods
from the UMLS and from Medline records with abstracts or entry dates from 1996.
The aim was to leverage human effort by providing preprocessed phrase lists with
a high percentage of relevant information.
They developed six scoring methods based on different aspects of phrase occurrence.
They focused on the statistical properties of word pairs and triples that can
be obtained from a large database. The UMLS was used as a gold standard for
validating methods. The authors found six different scoring methods that can
prove effective for identifying the UMLS quality phrases in Medline.
They concluded that statistical scoring methods provide a promising approach
to the extraction of useful phrases from a natural language database for linking
or providing hyperlinks in text.
For a large database like Medline, people often can use many terms for one topic.
Indexing can alleviate this problem by expanding the list of terms to access
a document.
One path to improve indexing is to obtain a list of terms sufficient to include
a high percentage of the terms that people will use in querying a database and
to add enough synonymy information to allow a query to access documents that
are indexed with an expression that is synonymous with a query. Kim et al. hypothesized
that statistical information about the occurrence of phrases in Medline can
provide a useful screen for candidate phrases that are of similar quality to
the material already in the UMLS.
Jones et al. had already studied the frequency of phrases, and maintained that
words that compose them are important in the phrase extraction method. Other
studies, like the one by Harter in 1975, looked at the distribution of frequencies
of a term within a document.
The methods developed can serve as a screen for the extraction of useful phrases
and can also form part of a system for marking useful phrases in text. There are
limitations: because stopwords are excluded, phrases like "vitamin A" cannot be
detected, and phrases with more than three words are excluded. Phrases such as
"cancer of the lung" may have to be rearranged to "lung cancer", for
instance.
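The general idea of screening word pairs by how often they occur, and the
stopword limitation just described, can be illustrated with a short sketch.
Raw frequency counting is used here as a stand-in; it is not one of the six
scoring methods of Kim et al., and the stopword list and sample titles are
made up.

```python
from collections import Counter

# Hypothetical stopword list for the illustration.
STOPWORDS = {"of", "the", "in", "a", "and", "with", "for"}


def candidate_pairs(documents):
    """Count adjacent word pairs in which neither word is a stopword."""
    counts = Counter()
    for doc in documents:
        words = doc.lower().split()
        for first, second in zip(words, words[1:]):
            if first not in STOPWORDS and second not in STOPWORDS:
                counts[(first, second)] += 1
    return counts


docs = [
    "treatment of lung cancer in adults",
    "lung cancer screening with imaging",
    "risk of lung cancer",
]

print(candidate_pairs(docs).most_common(1))
print(candidate_pairs(["vitamin a deficiency"]))
```

The pair "lung cancer" stands out because it recurs across titles, whereas
"vitamin a" never becomes a candidate because "a" is treated as a stopword.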
In the future Jones et al. will examine two ways of improving the system: 1.
allow phrases that are longer than three words; and 2. find a way to score phrases
more accurately according to how laden with content or subject matter they are.
Patient records can also be used to formulate Medline searches. Cooper et al. (1998) developed
and evaluated PostDoc, a lexical indexing system, and Pindex, a statistical
indexing system, separately and then as a hybrid. Each system takes as input
a portion of free text from a patient record and then returns a list of MeSH
terms to formulate a Medline search that includes concepts in a text. The ability
of PostDoc to carry out synonymy mapping was dependent on the quality of the
lexical variant terms included in the Metathesaurus. Pindex uses a hash table
of phrases for which it assigns MeSH terms automatically.
The patient records were six radiology reports, six pathology reports and six
discharge summaries. Blinded assessment by the authors determined the extent
to which a system-derived list of MeSH terms captured the relevant concepts
in these documents. Pindex captured more relevant report concepts than
PostDoc: 45% versus 40%.
The results suggest a new way to reduce the number of terms output while maintaining
the percentage of terms captured, including the use of the UMLS semantic types
to constrain the output list to have only clinically relevant MeSH terms. This
study was a step toward the realization of systems that assist healthcare personnel
in using the electronic medical record to help construct patient-specific searches
of Medline.
Two author raters did their own assigning of MeSH terms. Precision was defined
as the fraction of MeSH terms output by a system for the report that were used
in that annotation to represent one or more concepts. Recall was defined as
the fraction of concepts in the annotation that were adequately represented
by the MeSH term.
Some results were:
- PostDoc: 40% to 50% of MeSH terms output by PostDoc were used to represent one
  or more concepts
- Pindex: 15% to 20% of MeSH terms output by Pindex were used to represent one
  or more concepts
When PostDoc and Pindex outputs were taken together to create the Union System,
recall was 60% and precision was 20%. The union of the two provided better recall.
Cooper et al. concluded that both could be refined to produce better performance.
PostDoc has been using version 1.1 of the UMLS Metathesaurus since it was developed
in 1991 - 1992. Cooper et al. hypothesize that the use of the most current Metathesaurus,
with more lexical variants and synonyms, would alter the performance of PostDoc:
the increased coverage would likely increase PostDoc's recall and possibly decrease
its precision. If the probability threshold at which Pindex includes terms in
its output list were increased, recall would be traded off for precision, should
a search require it. PostDoc and Pindex output MeSH terms from among those in
the entire MeSH vocabulary. Precision would be increased by a postprocessor that
constrains their output. This study touches on a subject to be discussed later
in the paper: precision and recall (sensitivity) often cannot both be high in
a search. Cooper et al. plan to study other clinical reports in the future and
to find better systems for indexing in Medline.
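The precision and recall definitions used in this study, and the effect of pooling
the output of two systems into a union, can be illustrated with a minimal sketch.
The concepts, MeSH terms and system outputs below are invented for illustration
and are not data from Cooper et al.; the function names are assumptions as well.

```python
def precision(output_terms, annotation):
    """Fraction of output MeSH terms that the annotation uses for some concept."""
    output = set(output_terms)
    if not output:
        return 0.0
    used = set().union(*annotation.values())
    return len(output & used) / len(output)


def recall(output_terms, annotation):
    """Fraction of annotated concepts represented by at least one output term."""
    output = set(output_terms)
    if not annotation:
        return 0.0
    covered = sum(1 for terms in annotation.values() if terms & output)
    return covered / len(annotation)


# Hypothetical rater annotation: concept in the report -> acceptable MeSH terms.
annotation = {
    "pneumonia": {"Pneumonia"},
    "chest film": {"Radiography, Thoracic"},
    "fever": {"Fever"},
}

# Hypothetical outputs of a lexical and a statistical indexing system.
lexical_output = ["Pneumonia", "Lung"]
statistical_output = ["Fever", "Patients", "Hospitals"]

# The union pools both term lists, as in the Union System described above.
union_output = set(lexical_output) | set(statistical_output)

for name, terms in [("lexical", lexical_output),
                    ("statistical", statistical_output),
                    ("union", union_output)]:
    print(name, round(precision(terms, annotation), 2),
          round(recall(terms, annotation), 2))
```

In this toy case the union covers more concepts than either system alone, while
its precision is limited by the irrelevant terms contributed by both systems.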
Abbreviations
Chang et al. (2002) wrote that the amount of literature in biomedicine is exploding
as Medline "grows by 400,000 citations each year" (page 2). They defined an
abbreviation as "all strings that are shortened forms of sequences of words
(its long form)" (page 2), as opposed to only acronyms, which are typically
defined as strings formed from the initial letters of words. They created an
automatically generated and maintained online dictionary of abbreviations from
Medline. Their algorithm matched abbreviations in text with their expansions.
Such algorithms already existed. With the growth of biomedical literature, such
algorithms can be improved or new ones can be invented to increase the retrieval
of relevant documents and to reduce ambiguity among abbreviations.
Their method used logistic regression. Their algorithm was applied to Medstract,
a corpus drawn from Medline, because it was easily available, eliminated the need
to develop an alternate standard and provided a reference point for comparing methods.
They also tested their algorithm against an independently created list of abbreviations
from the China Medical Tribune. They measured the precision and recall of the
algorithm in identifying abbreviations from the Medstract corpus. Their algorithm
is available at http://abbreviation.stanford.edu.
Recall was defined as the number of correct abbreviations found divided by the
number of all correct abbreviations in the gold standard. Precision was defined
as the number of correct abbreviations found divided by the number of all predictions.
Recall was 83% and precision was 80%.
Chang et al. believe that automated methods for finding abbreviations are of
greater value than manual ones, which they claim suffer from problems of completeness
and timeliness. The article presents "a novel algorithm for identifying
abbreviations, a set of feature descriptive of various types of abbreviation
server containing all abbreviation definitions found in Medline" (page
3).
The measured precision was lowered by some abbreviations missing from
the gold standard. The largest number of errors occurred because the gold standard
included synonyms, words and phrases with identical meanings, for which the algorithm
could not find correspondences between letters. This indicates a fundamental
limitation of letter-matching techniques. Another source of error was the strong
assumption that the abbreviation must be inside parentheses and the long form
must be outside of parentheses. The study showed that linking to external dictionaries
of abbreviations can augment the ability of automated methods to assign definitions
that are not indicated in the text.
Yu et al. (2002) developed two methods of
mapping defined (abbreviations paired with their full form in the article) and
undefined abbreviations. AbbRE (short for abbreviation recognition and extraction)
was the software program in which pattern-matching rules to match abbreviations
and their full forms were implemented. Undefined abbreviations were mapped to
any of four public abbreviation databases that map gene and protein abbreviations
in LRABR of the UMLS Specialist Lexicon, GenBank, LocusLink, SWISS-PROT and BioABACUS.
The opinions of domain experts were used as a gold standard. Recall was defined
as the number of correct abbreviations present in the reference standard and
found by AbbRE divided by the number of abbreviations in the reference standard.
The recall was 0.70 and the precision was 0.95 for defined abbreviations. They
found that only 25% of abbreviations were defined in biomedical articles and that
68% of the undefined abbreviations could be mapped to any of the four abbreviation
databases.
Yu et al. thus developed yet another program that successfully maps abbreviations.
This can be useful to Medline searchers, especially when abbreviations are undefined.
AbbRE: 1. handles full biomedical articles; 2. searches for parenthetical expressions
for paired abbreviations and for full forms; 3. does not break up words into
components; 4. relies on a set of pattern-matching rules for mapping an abbreviation
to its full form; and 5. has been evaluated by domain experts.
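The general idea of pairing a parenthetical abbreviation with the words that precede
it can be shown with a minimal sketch. The regular expression and the initial-letter
rule below are illustrative assumptions; they are not the pattern-matching rules
of AbbRE or the logistic-regression method of Chang et al.

```python
import re

# Assumption: a short form is 2-10 characters inside parentheses.
PAREN = re.compile(r"\(([A-Za-z][A-Za-z0-9-]{1,9})\)")


def find_defined_abbreviations(text):
    """Return (short form, long form) pairs found by initial-letter matching."""
    pairs = []
    for match in PAREN.finditer(text):
        short = match.group(1)
        # Take as many preceding words as the short form has characters.
        words = text[:match.start()].split()
        candidate = words[-len(short):]
        if len(candidate) == len(short) and all(
            word[0].lower() == char.lower()
            for word, char in zip(candidate, short)
        ):
            pairs.append((short, " ".join(candidate)))
    return pairs


print(find_defined_abbreviations(
    "Patients with chronic obstructive pulmonary disease (COPD) were enrolled."
))
# A short form whose letters do not line up with word initials is missed,
# which illustrates the limitation of pure letter matching.
print(find_defined_abbreviations(
    "Levels of tumor necrosis factor alpha (TNF-a) rose."
))
```

Such a rule finds only abbreviations that are defined in the text, which is why
undefined abbreviations have to be looked up in external databases.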
The five biomedical journals used were: Cell, Science, Trends in Neuroscience
(TNS), Proceedings of the National Academy of Sciences (PNAS) and the Journal
of Biological Chemistry (JBC). The five medical journals used were the New England
Journal of Medicine, CA: A Cancer Journal for Clinicians, the Journal of the National
Cancer Institute (JNCI), the Journal of the American Medical Association (JAMA)
and Lancet.
The gold standard was 45 medical expert abbreviations and 51 biological expert
abbreviations. Most abbreviations that failed to be recognized by AbbRE were
not associated with their full forms.
Recall and precision were high for defined abbreviations: AbbRE had an average
recall of 0.70 and an average precision of 0.95. On average, however, only 25%
of abbreviations were defined in biomedical articles, and only 68% of a randomly
selected subset of undefined abbreviations could be mapped to any of the four
databases. The authors found that many abbreviations are ambiguous, i.e. they
map to more than one full form in abbreviation databases. They concluded that
AbbRE is efficient for mapping defined abbreviations. They note that coupling
AbbRE with abbreviation databases for the mapping of undefined abbreviations
requires exhaustive abbreviation databases and a method to resolve the ambiguity
of abbreviations in the databases. In addition, the medical and biological experts
agreed more on defined than on undefined abbreviations.
Yu et al. plan to develop and expand AbbRE and to apply it to all Medline abstracts
in PubMed and to study AbbRE in other databases. A program to more accurately
define abbreviations with more than one meaning will be developed. Mapping an
abbreviation to its full form facilitates natural language processing and is
important for information retrieval. If the full form of an abbreviation is
missing in an article and a program like AbbRE is not used, a searcher may miss
relevant articles.
Search Filters
Search filters are collections of search terms intended to capture frequently
sought research methods and study designs in Medline. They can be
used to locate systematic reviews of the effectiveness of health interventions.
Systematic reviews identify, assess and combine the evidence from primary research
studies; in the study described next, they were included if they assessed causation,
diagnosis, treatment or prognosis of disease. Non-systematic reviews present a summary
of results and conclusions of studies, but do not contain a statement of methods,
objectives or materials. Much research has gone into effective search filters.
White et al. (2001) set out to improve previously developed methods to derive
a more objective search strategy to identify systematic reviews in Medline. Known
systematic reviews made up a quasi-gold standard.
The frequency of words within a subset of the "quasi-gold standard"
was calculated and then a statistical analysis of the most frequently occurring
words was undertaken. The analysis determined which terms could best be used
to distinguish between systematic reviews, non-systematic reviews and non-reviews.
Wolf et al. in 2002 and Boynton et al. in 1998 had previously shown that systematic
reviews are not easy to find in Medline and are hidden among other studies that
are called reviews. White et al. (page 358) wanted to expand on Boynton et al.'s
search strategy because they thought that it had weaknesses: 1. the analysis
of words in records did not capture phrases or properly analyze multi-term MeSH
headings and publication types; 2. the analysis was univariate, analyzing the value
of each term alone and not jointly with other terms; and 3. the sensitivity of new
strategies was tested against the original "quasi-gold standard" records
used to derive the search strategies and may be overestimated.
In the White et al. study, the journals used were: Annals of Internal Medicine,
British Medical Journal, Journal of the American Medical Association and the
Lancet. The records identified were: 110 systematic reviews, 110 non-systematic
reviews and 125 non-reviews. Sensitivity or recall was defined as the number
of systematic reviews correctly classified times 100, divided by the total number
of systematic reviews. Specificity was defined as the number of records correctly
classified as not systematic reviews times 100, divided by the total number of
records that are not systematic reviews. Precision was defined as the number of
correctly classified systematic reviews divided by the number of records retrieved
by the search.
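These three measures can be computed from the four cells of a classification table.
The sketch below follows the definitions above, except that precision is also
reported as a percentage; the counts are hypothetical and chosen only so that
they add up to 110 systematic reviews and 235 other records. They are not results
from White et al.

```python
def filter_metrics(true_pos, false_pos, false_neg, true_neg):
    """Return (sensitivity, specificity, precision), each as a percentage.

    true_pos:  systematic reviews correctly classified as systematic reviews
    false_pos: other records wrongly classified as systematic reviews
    false_neg: systematic reviews the filter missed
    true_neg:  other records correctly classified as not systematic reviews
    """
    sensitivity = 100 * true_pos / (true_pos + false_neg)
    specificity = 100 * true_neg / (true_neg + false_pos)
    precision = 100 * true_pos / (true_pos + false_pos)
    return sensitivity, specificity, precision


# Hypothetical counts: 110 systematic reviews, 235 non-systematic or non-reviews.
sens, spec, prec = filter_metrics(true_pos=100, false_pos=47,
                                  false_neg=10, true_neg=188)
print(f"sensitivity {sens:.1f}%, specificity {spec:.1f}%, precision {prec:.1f}%")
```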
Different models with different objectives were produced.
The sensitivities and specificities for the models were as follows:
          Sensitivity (%)   Specificity (%)
Model A   98.3              73.4
Model B   94.9              67.1
Model C   51.9              99.4
Model D   87.1              89.2
Model E   77.2              94.9
Model A was formed to discriminate between systematic, non-systematic and non-reviews.
It would be the best model for searchers who want to retrieve and sift through
a large number of systematic reviews. Model B investigated the importance of
frequency of occurrence of terms in records by exploring the effect of ignoring
frequency and only using the presence or absence of terms in records. Sensitivity
and specificity were lower than for Model A. This indicates that the frequency
of occurrence of terms does have an effect on the classification of systematic
reviews. Model C tested parsimony and face validity. The low sensitivity shows
that, despite face validity, a statistical approach can increase sensitivity.
The specificity was high; only one record was wrongly classified as a systematic
review by the model. This shows that the five terms are highly focused and do
not retrieve a large number of other record types. Model D was developed to
discriminate between non-reviews and any sort of review - systematic or non-systematic.
Model D gave the same sensitivity of detecting systematic reviews as Model A
but had lower specificity. Model E was designed to distinguish between systematic
reviews and all other types of records (non-systematic reviews and non-reviews).
Model E had high sensitivity and high specificity.
Model A would be best for a researcher who wants to be sure of retrieving a
high proportion of systematic reviews and who is willing to sift through many
irrelevant records, while Model C would be best for a researcher who wants
to quickly find a set of records of which a high proportion are relevant. A
compromise among the models could be chosen according to the goals of the researcher
after viewing the results of each model in this study.
White et al. feel that they expanded on previous search filters and showed that
the number of times that a term occurs and the combination of terms can help
to identify systematic reviews. The authors think that future database interface
development will allow searching by frequency of terms and the weighting of
these terms. Other databases and interfaces will be studied. One limitation of
the study was that the results were based on English medical journals, which tend
to have more rigorous peer review and may demand a higher standard of reporting
or research methods than other journals. Objectivity can also be improved in the
selection of terms to discard and the cut-off points for frequency analysis.
Ingui et al. (2001) derived and validated an optimal search filter for retrieving
clinical prediction rules using Medline. Clinical prediction rules are tools
designed to assist health care professionals in making decisions. They comprise
variables obtained from the history, physical examination and simple diagnostic
tests of patients. Inconsistent terminology makes them difficult to index and
to retrieve by computer systems. The "gold standard" was established
by a manual search of all articles from print journals from 1991 to 1998, identifying
articles covering various aspects of clinical prediction rules such as derivation,
validation and evaluation.
Sensitivity was defined as the proportion of articles with clinical prediction
rules that were retrieved by the filter. The positive predictive value was defined
as the proportion of retrieved articles that contained clinical prediction rules.
The positive likelihood ratio was defined as the ratio of sensitivity to (1 - specificity).
The number of search filters studied was 694. The filter "predict$ OR clinical$
OR outcome$ OR risk$" yielded the highest sensitivity, 98.4%. Predict$ and rules$
retrieved 99.97%. Four filters had sensitivity and specificity higher than 90%.
Filters with higher positive predictive values and positive likelihood ratios
had low sensitivities. The filter with the highest specificity - 78.6% - was
"predict$.ti AND rule$"; its sensitivity was 16.1%. The single term with the
highest sensitivity was "predict$".
Ingui et al. concluded that one search filter could not meet the needs of researchers,
clinicians and students. For the person who wants to quickly retrieve a clinical
prediction rule for illustrative purposes, the use of predict$.ti AND rule$ yields
three relevant articles for every four retrieved; its positive predictive value
was 75% in the validation set. Ingui et al. concluded: "Optimal
retrieval of the best evidence is based on the formulation of a well-defined
question, which includes population, intervention, comparison and outcome, and
its translation into searchable" (page 397).
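The behavior of truncated text-word filters such as these can be imitated outside
a search interface. The sketch below treats a trailing $ as "any word beginning
with this stem" and applies OR and AND combinations to three invented titles. It
is an illustration only: field qualifiers such as .ti are not modeled, and the
function names are assumptions rather than part of any Medline interface.

```python
import re


def term_to_regex(term):
    """Compile a search term; a trailing '$' means right-hand truncation."""
    if term.endswith("$"):
        return re.compile(r"\b" + re.escape(term[:-1]) + r"\w*", re.IGNORECASE)
    return re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)


def matches_any(text, terms):
    """OR filter: at least one term occurs in the text."""
    return any(term_to_regex(term).search(text) for term in terms)


def matches_all(text, terms):
    """AND filter: every term occurs in the text."""
    return all(term_to_regex(term).search(text) for term in terms)


# Invented titles, used only to show the trade-off between the two filters.
titles = [
    "A clinical prediction rule for ankle fractures",
    "Outcomes after hip replacement",
    "Histology of the liver",
]
broad_filter = ["predict$", "clinical$", "outcome$", "risk$"]
narrow_filter = ["predict$", "rule$"]

print([matches_any(title, broad_filter) for title in titles])
print([matches_all(title, narrow_filter) for title in titles])
```

The broad OR filter retrieves the prediction rule but also an unrelated outcomes
study, while the narrow AND filter retrieves only the prediction rule, which is
the sensitivity-versus-precision trade-off described above.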
Haynes et al. (1994) developed search filters and strategies for retrieving
sound clinical studies in Medline. They performed an analytic survey of operating
characteristics of search strategies developed by computerized combination of
MeSH and text terms selected to detect studies meeting basic methodological
criteria for direct clinical use in adult general medicine. Sensitivities, specificities,
precision and accuracy of 134,264 unique combinations of search terms were calculated
and compared to a manual review of articles, the "gold standard".
Ten internal medicine and ten general medicine journals in 1986 and in 1991
were searched.
Combinations of search terms in 1991 reached peak sensitivities of 82% for studies
of etiology, 92% for studies of prognosis, 92% for studies of diagnosis and
99% for studies of therapy. Multiple terms, compared to single terms, increased
sensitivity by more than 30%, with some loss of specificity. For 1986, peak
sensitivities were 72% for studies of etiology, 95% for studies of prognosis,
86% for studies of diagnosis and 98% for studies of therapy.
When search terms were combined to maximize specificity, over 93% specificity was
achieved for all purpose categories in both years. High accuracy was achieved
by combining terms. Peak accuracies of over 90% were reached for therapy in
1986 and in 1991.
Haynes et al. attributed some of the difficulties in using Medline to: the
large number of postings in Medline (several million), the low prevalence of
clinically applicable studies, the well-documented limitations of indexing and
retrieval in Medline from its inception and the imprecise search skills of clinical
end users.
Seven of the 12 search strategies from 1991 could not be run in 1986 because
terms used in 1991 were not available in 1986. Now, with even more new terms,
search strategies like these need to be revised and updated and new ones need
to be invented. The search strategy that yielded the best sensitivity (99%)
for treatment in 1991 was "randomized controlled trial (pt)" or "drug
therapy (sh)" or "therapeutic use (sh)" or all random (tw). "AND
NOT" comment, letter and news can include other journals. Search strategies
to maximize both sensitivity and accuracy outperformed other strategies.
Wood et al. (1999) developed the Large Scale Vocabulary Test (LSVT) to allow
participants to search local terms and concepts in the Metathesaurus. The hypothesis
was that a combination of existing terminologies will cover the majority of
the concepts needed for a broad range of health information systems. The two
largest vocabularies in the test - SNOMED International and the Read Codes -
had the highest percentage (more than 60%) of the exact meaning matches. The
study showed that most of the concepts and qualifiers needed to record data
about patient conditions are already included in one or more of the UMLS vocabularies.
The authors feel that their test could be used to enhance controlled vocabularies
and for other collaborative informatics research and for design of efficient
clinical data entry systems.
Another possible problem with Medline information retrieval was addressed by
Ojasoo et al. (2001). They analyzed the publication trends (PTs) of clinical
medicine records in Medline and found that there were periods of erratic activity
or quirks. In the late 1980s there was a greater interest in randomized clinical
trials (RCTs), the gold standard of clinical investigation. Medical journals
encouraged their publication, and sometimes grants and career advancement depended
on it. A searcher looking for a term or answer may not use the term RCT or, upon
using it, may retrieve too many records.
Bachmann et al. in 2002 constructed and validated a better search strategy to
identify diagnostic articles recorded on Medline with special emphasis on precision.
They set out to develop a more precise search strategy for selecting publications
on diagnostic test evaluations without losing sensitivity. Medical journals
in 1989, 1994 and 1999 were hand-searched. A word frequency analysis of the
abstracts identified text words for search strategies. The sensitivity, precision
and number needed to read (1/precision) of every candidate term were calculated.
Sensitivity was the number of gold standard articles retrieved as a proportion
of all gold standard articles. Precision was the number of gold standard articles
retrieved as a proportion of all articles retrieved. The currently used PubMed
Clinical Queries filter, which was based on the work of Haynes et al. (1994),
was the standard for comparison.
Bachmann et al. concluded that the performance of Clinical Queries may be overstated.
The filter developed by Bachmann et al. performed slightly better than the currently
available one and better with regard to precision in the 1994 subset. The sensitivity
and precision of Clinical Queries for 1994 and 1999 were:
              1994     1999
Sensitivity   95.1%    88.8%
Precision     8.2%     4.3%
The sensitivity and precision of the new search filter for 1994 and 1999 were:
              1994     1999
Sensitivity   98.1%    95.1%
Precision     12.0%    4.3%
Diagnostic studies were defined as having content pertaining directly to the evaluation
of disease process usually through comparing methods of arriving at a diagnosis.
Tests were defined as procedures used to change the estimate of the likelihood
of disease presence.
Inconsistent terminology used in diagnostic studies makes them difficult to
index and to retrieve in electronic databases. Using Clinical Queries, Bachmann
et al. found sensitivities between 77% and 92% for all recorded material on Medline,
at the price of having to sift through 12.5 records to find one article that refers
to diagnosis. This does not seem bad until one sees that 625 records may have
to be dealt with to find 50 relevant records. Time could be saved by relying
on the filter with the highest specificity, but then many relevant records may
be lost. This is especially risky in searching for diagnostic research, where
there is a high variability in study outcomes. Bachmann et al. do not recommend
the PubMed high-specificity filter. The term that performed best in their search
was predict$, which was not evaluated as a text word by Haynes et al. (1994).
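The arithmetic behind the figures of 12.5 and 625 records follows directly from
the definition of number needed to read as 1/precision. A minimal sketch, assuming
a precision of 8% (which is what a number needed to read of 12.5 implies); the
function names are illustrative:

```python
def number_needed_to_read(precision):
    """Records that must be read, on average, to find one relevant record."""
    if not 0 < precision <= 1:
        raise ValueError("precision must be a fraction in (0, 1]")
    return 1 / precision


def records_to_screen(relevant_wanted, precision):
    """Expected number of retrieved records to read to find the wanted number."""
    return relevant_wanted * number_needed_to_read(precision)


nnr = number_needed_to_read(0.08)
total = records_to_screen(50, 0.08)
print(f"number needed to read: {nnr:.1f}; records for 50 relevant: {total:.0f}")
```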
Four factors that can influence a filter's reproducibility, according to Bachmann
et al., are: 1. the selection of journals; 2. the way in which abstracts are
written may change over time; 3. editorial processing may change over time and
may lead to different wording in abstracts; and 4. variation in indexing quality
in Medline over time.
Bachmann et al. plan to evaluate their filter further in terms of time, cost
and missed relevant records, to study the impact of language restrictions on
summary measures, to compare different search filters and search strategies and
to evaluate the conclusions of diagnostic reviews.
Information seekers in Medline have the option to search in core and non-core
journals. McCain (1994) conducted a study, not to develop the definitive core
list of biotechnology journals, but to explore the relationships among biotechnology
(narrowly construed) and those several other fields participating in biotechnology
R & D (exporting basic research or importing applications) through the citation
and publication patterns of the formal literature. Her ranking-based selection
technique weights the number of citations received by one journal from another
by the proportion of all citations received and the size of both journals and
ranks the titles based on this citation weight. "Cocitation" was defined
as the joint citation of at least one article from each of two journals.
Journals with high intercorrelation are grouped together by cluster analysis.
The database-filtering approach developed combines citation and coverage analyses
and can identify core journals in biotechnology based on the aggregate citation
choices of authors and distinguish those that best cover biotechnology research
from titles publishing an occasional article. McCain concluded that Medline
searchers can identify useful core journals for their search and, if relevant
information is not retrieved, could search in non-core journals.
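Journal cocitation as defined above can be counted from the reference lists of
citing articles. The sketch below counts, for each pair of journals, the number
of citing articles that cite both. The journal labels and reference lists are
invented, and the sketch covers only the counting step, not McCain's weighting
or cluster analysis.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(reference_lists):
    """Count citing articles in which each pair of journals is cited together.

    reference_lists: one list of cited journal names per citing article.
    """
    counts = Counter()
    for journals in reference_lists:
        # A journal cited several times by one article still counts once.
        for pair in combinations(sorted(set(journals)), 2):
            counts[pair] += 1
    return counts


# Hypothetical reference lists of three citing articles.
reference_lists = [
    ["Journal A", "Journal B", "Journal C", "Journal A"],
    ["Journal A", "Journal C"],
    ["Journal B"],
]
for pair, count in sorted(cocitation_counts(reference_lists).items()):
    print(pair, count)
```

Pairs with high counts are candidates for being grouped together; a real analysis
would normalize the counts before clustering.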
Sensitivity and Precision
Boynton et al. (1998) designed search strategies based on a more objective approach
to strategy construction to search for systematic reviews. A high sensitivity
level of 98% and a relatively high precision level of 20% were achieved. The
study showed that a frequency analysis approach can be used to construct highly
sensitive strategies that have adequate levels of precision for retrieving systematic
reviews. Medline was the test database. The authors state these problems with
search strategies that rely on indexing terms alone: poor description of research
design by the author, alternative terms or synonyms, lack of appropriate indexing
terms in the Metathesaurus and inaccuracies in assigning index terms.
The "quasi-gold standard" was made up of 288 terms from the Annals
of Internal Medicine, Archives of Internal Medicine, British Medical Journal,
Journal of the American Medical Association, Lancet and the New England Journal
of Medicine. Boynton et al. concluded that different search strategies could
be used according to research needs. One surprising finding was that using
high sensitivity terms resulted in much lower overall or cumulative sensitivity
than searches using lower sensitivity terms. Choosing various combinations of
sensitivity and precision resulted in the optimal search strategy obtained by
the group. This study could be conducted on other journals, including non-English
ones and ones with different publication dates. Also, the unique contribution
of individual terms to the overall sensitivity and precision of each strategy
has not been estimated.
Patel et al. (1998) studied medical informatics and Medline and concluded that
cognitive science can contribute to objectives that concern researchers and
practitioners in medical informatics. They expanded on research in computer-mediated
communication. According to the authors, theories and methods from cognitive
science can inform medical informatics by addressing important issues such as
the usability of systems, the process of medical decision making and the training
of physicians and end users.
Westberg et al. (1999) worked with the UMLS to represent and link information
needs for clinical practitioners to look up patient information. From 1991 to
1996 Cimino et al. examined common semantic and syntactic patterns and identified
a set of general-purpose questions called "generic queries" that are
tailored for user information needs. They hypothesized that the use of generic
queries in clinical applications could facilitate determination of users' information
needs and simplify the selection of potentially relevant information resources.
They combined manual review by librarians with natural language processing and
derived 37 generic queries that captured the essence of all queries in the study.
They integrated what they learned into Medline.
Ribeiro-Neto et al. (2000) devised an automatic algorithm that categorizes
medical documents by assigning International Classification of Diseases (ICD)
codes to them; in the study the documents were 77 discharge summaries. An average level
of precision of 70% - 80% for category coding and 60% - 70% for subcategory
coding was achieved. The algorithm made 25% of the mistakes of human specialists.
Ribeiro-Neto et al. focused on medical records and ICD-9, the ninth version
of ICD. Distinctions learned from this algorithm can be extended to Medline
information retrieval. Ribeiro-Neto et al. set out to improve precision at high
recall by taking advantage of the hierarchical structure of MeSH.
Hersh and Greenes in 1990 and Hersh and Hickam in 1995 constructed the project
SAPHIRE (Semantic and Probabilistic Heuristic Information Retrieval Environment)
that developed methods for indexing and searching collections of medical documents
and establishing reference collections that compare distinct information retrieval
systems. SAPHIRE proposes to index and to search medical collections by using
a semantic network of medical concepts and terms. The semantic network is based
on UMLS. Spyns in 1996 gave an overview of categorization algorithms based on
the idea of codes treated as concepts of the medical language whose meanings
are defined through sentences in natural language.
Some documents need to be assigned codes manually or semiautomatically. For
instance, if a patient suffered from mitochondrial encephalomyelopathy and the
corresponding ICD-9 alphabetical index entry is encephalomyelitis, the specialist
would add an annotation to her ICD-9 alphabetical index indicating that
encephalomyelopathy, mitochondrial constitutes an alternative path to the code.
Ribeiro-Neto et al. found that, in some cases, the ICD-9 index is not complete:
1. not all medical semantics is represented within the index; 2. the specialists
opted for a code distinct from the default code recommended by the
index; and 3. the specialists used knowledge about the semantics of a specific
term that does not appear in the index and deduced the code
using additional information in the text of the medical document.
In 1985 the NLM reviewed 2,000 literature search request forms submitted from
the NIH and created a database of 155 representative queries for experimentation
in bibliographic retrieval.
A group at Yale University from 1989 to 1992 developed two knowledge-based programs
designed to help clinicians find relevant literature references: PsychTopix for
psychiatry and HepaTopix for hepatology. Selecting a topic generates an automatic
Medline search using MeSH logic from the program's knowledge base.
According to Westberg and Miller's review article, there have been varying reports
of words in patient records mapping to MeSH and to UMLS. They cite these problems:
difficulty in interpreting user queries, incomplete coverage of primary care
concepts, intervocabulary mapping difficulties and inconsistencies within the
Metathesaurus.
Human Computer Interaction
Programs such as COACH (Kingsland III et al. 1993) help end users solve search
problems by emulating the approach of the end user and applying specialized knowledge
to help resolve them. COACH analyzes the user's search, interacts with the user
by applying or suggesting alternative mappings from its knowledge sources - which
include the Metathesaurus - and returns modified search results to the end user.
COACH does not use Boolean searching but can help in a Boolean search. One common
problem that end users face is entering an "AND" search that retrieves few or
no results. For instance, one searching for stress fractures of the spine may
enter "(fractures, stress) and spine". This may give few hits. COACH assumes
that the user wants more hits. It uses the fact that "spine" has seven narrower
terms in MeSH and explodes the term. The user can select which term to use. COACH
can also narrow down a search. For instance, if "AZT and AIDS" is entered, over
800 hits are obtained. The searcher may be looking for specific therapeutic uses.
Upon the searcher's request, COACH can add subheadings such as "therapeutic
use" as qualifiers and synonyms to narrow down the search. COACH also limits
searches by language, publication types, search years, current month (SDILINE),
check tags and age group. A Metathesaurus concept pick list is presented on the
screen. Programs like COACH can be very helpful, especially for new Medline users,
for those not well-versed in the topic that they are searching or for new medical
topics.
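The idea of exploding a heading into its narrower terms can be sketched as follows.
The hierarchy here is a toy example with only a few entries and has not been checked
against MeSH; the query syntax and function names are likewise assumptions and
do not reproduce how COACH works internally.

```python
def explode(term, narrower):
    """Return the term followed by all of its narrower terms, depth first."""
    expanded = [term]
    for child in narrower.get(term, []):
        expanded.extend(explode(child, narrower))
    return expanded


def or_group(terms):
    """Join terms into a parenthesized OR expression."""
    return "(" + " OR ".join(terms) + ")"


# Toy hierarchy (hypothetical subset, for illustration only).
narrower = {
    "Spine": ["Cervical Vertebrae", "Lumbar Vertebrae"],
    "Cervical Vertebrae": ["Atlas"],
}

exploded = explode("Spine", narrower)
query = "Fractures, Stress AND " + or_group(exploded)
print(query)
```

Because each narrower term is added with OR, the exploded search can only retrieve
the same records or more, which matches the goal of widening a search that returned
too few hits.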
Wood et al. (1998) studied Internet end-to-end performance of pathways used
to access information in the NLM databases and, by extension, other Internet
biomedical resources. The median time to conduct standardized searches and
get results from PubMed was 2 to 14 seconds. In 1997 the NLM established
an Internet connectivity evaluation project to improve the understanding
of the role that end-to-end Internet performance plays in facilitating (or hindering)
access to the NLM and other biomedical databases.
Several metrics were used. Bulk transfer capacity (BTC) measured the data
transmission capacity of the Internet pathway between two locations. "Ping"
round-trip time served as an indicator of the latency or propagation delay in
the network for data traveling from one end of the network to the other and
back. The number and sequencing of links or hops from origin to destination
described the network routing. Packet loss, the percentage of data packets for
which the testing software did not receive an acknowledgment of successful
transmission, is good for describing the performance of congested networks.
The overall error rate was 1%. Errors included a slow response, no response
or problems such as: "network error occurred, reset by peer",
and "unexpected error has caused break in correction". Peak hour delays
were found. The authors wrote that a multilevel approach with multiple tools,
methods and metrics is probably needed to study end-to-end Internet performance,
and that building and designing a significant margin of excess transmission
capacity may help to minimize peak hour delays.
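Two of the metrics above, median response time and packet loss, are simple to compute once raw measurements exist. The following Python sketch shows the arithmetic only. It is not the tooling that Wood et al. used, the function names are invented here, and the sample measurements are hypothetical.

```python
from statistics import median


def packet_loss_percent(sent, acknowledged):
    """Percentage of packets sent that were never acknowledged."""
    if sent <= 0:
        raise ValueError("sent must be positive")
    return 100.0 * (sent - acknowledged) / sent


def summarize(round_trip_ms, sent, acknowledged):
    """Summarize one test run: median round-trip time and packet loss."""
    return {
        "median_rtt_ms": median(round_trip_ms),
        "packet_loss_pct": packet_loss_percent(sent, acknowledged),
    }


if __name__ == "__main__":
    # Hypothetical measurements, invented purely for illustration.
    print(summarize([80.0, 95.0, 120.0, 400.0, 90.0], sent=200, acknowledged=198))
```

The median is used rather than the mean so that one very slow response, such as the 400 ms sample here, does not dominate the summary.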
Joubert et al. (1998) showed that the conceptual graphs formalism allows powerful
capabilities to operate a semantic integration of information databases using
the UMLS knowledge sources. The authors concluded that, when data are structured
by means of semantic relationships, the matching process is successful. It can
also operate well if records of a database are not structured, but it is nonetheless
possible to identify semantic relationships between concepts. The use of conceptual
queries for information retrieval is significant here. The Metathesaurus has
an explicit hierarchy of instances of queries that can be immediately exploited
by applications and used to implement the mechanism that is the basis for conceptual
graphs systems. The authors would not refine concepts in the Semantic Network
but would enhance the definitions of the core concepts that the Metathesaurus
registers by means of contextual knowledge. The conceptual graph theory is able
to represent concepts, instances of concepts in medical contexts an associations
by means of semantic relationships.
Interfaces
Medline is available through several search interfaces. The most popular one
is PubMed. There are also Ovid, Silverplatter (Webspirs) and FirstSearch.
Sandi Parker (2000) conducted a study comparing PubMed, Ovid, Silverplatter
and FirstSearch. The stars assigned to each one were:
PubMed 4.5
Ovid 4
Webspirs 3.5
Firstsearch 2.25
She considered PubMed to be the best choice. It is the only one that is free.
It is quick and can be easy or complex depending on the searcher and the type
of search. It has access to Old Medline, which contains articles dating back
to 1961, PreMedline, which contains records before they are officially indexed
into Medline, PubMed Central and the databases Nucleotide, Protein, Genome,
Structure, Popset, Taxonomy, OMIM, SNP and UniGene.
The following are the ratings that Parker gave each interface:
                  FirstSearch  Ovid  PubMed  Webspirs
Composite         2.4          4     4.5     3.5
Content           3            4     5       3
Searchability     2            4     4       3
Pricing           2            4     N/A     4
Contract Options  2            4     N/A     4
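The composite ratings appear consistent with an unweighted mean of the four category ratings, skipping N/A entries. That is an inference from the numbers, not a method that Parker is reported to state. The short Python check below applies that assumption. Under it, FirstSearch comes out at 2.25, which matches the star rating listed earlier rather than the 2.4 shown in the table.

```python
from statistics import mean

# Category ratings in table order: Content, Searchability, Pricing,
# Contract Options. None marks an N/A entry.
RATINGS = {
    "FirstSearch": [3, 2, 2, 2],
    "Ovid": [4, 4, 4, 4],
    "PubMed": [5, 4, None, None],
    "Webspirs": [3, 3, 4, 4],
}


def composite(scores):
    """Unweighted mean of the ratings that are present (N/A skipped).

    Assumption: the composite is a plain average; the source does not
    say how it was actually derived.
    """
    present = [s for s in scores if s is not None]
    return mean(present)


if __name__ == "__main__":
    for name, scores in RATINGS.items():
        print(f"{name}: {composite(scores):.2f}")
```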
Cost
PubMed is free. For FirstSearch, libraries can purchase a bank of searches or
subscribe annually. For Ovid, the policy is "pay as you go". Also
offered is Web access for a locally installed Ovid client-server system or
fixed-fee Ovid online via Web Access site licensing of databases. Silverplatter
costs are different for different users, needs and parts of Medline accessed.
Searchability
Criteria used were:
1. Access to MeSH features: explode, focus, mapping and subheadings
2. Filtering results
3. Searching levels
4. Save features and support options
5. Document delivery
6. Currency and update options
7. Additional features
Parker wrote that Ovid is great for both easy and advanced searches. Webspirs offers
the Search Builder to design a complex search strategy. PubMed goes straight
to a basic search but offers Limits, where the searcher can limit the search
by year, database, language, publication type, human or animal, gender and publication
date.
Each interface utilizes MeSH headings and subheadings. Two of them, PubMed
and Ovid, map the searcher's terms to MeSH. In PubMed, when the searcher
presses "Details", the MeSH terms are shown. In FirstSearch one can search
one index at a time unless terms are combined, in which case one can search
multiple indices with up to four search statements. In SilverPlatter, one can
search in all fields of records but there is no advice on MeSH terms to get
more relevant retrieval.
Exploding/Focus
Firstsearch offers no explode, although its central concept option focuses a
search. PubMed automatically explodes, which is good because the searcher gets
an idea of how much information there is on a topic and can then focus.
The way to focus on PubMed is not too obvious. In order to "focus"
a search, so that the selected MeSH term is one of the main topics discussed
in the article, the MeSH Browser feature (also found on the sidebar) can be used.
In the "detailed display" of the selected term, the box "Restrict
Search to Major Topic headings only" can be used. Turning on this feature
narrows the focus of a search and may significantly decrease retrieval. If one
is, however, typing a search into the query box, one can limit the MeSH term
by tagging it with [major] instead of [mesh].
Ovid explodes and focuses. In SilverPlatter one can focus from Search Builder
and Thesaurus. One can click on "major MeSH heading" which is the
same as focus. Someone not familiar with the interface may not be able to find
these options.
Subheadings
In Ovid it is easier to find this command than in PubMed, where one would have
to go to Detail Display and then Browse MeSH. Webspirs gives this option in
the Thesaurus window.
In PubMed, one can add the appropriate subheading directly to a term, e.g.,
nursing/manpower or nursing/ma (using the 2-letter short form).
One can also add a subheading without attaching it to a specific term, a less
exacting strategy, e.g.:
Anti-Inflammatory Agents, Non-Steroidal AND alcoholic beverages AND (adverse effects [sh] OR poisoning [sh])
The above type of search also can be entered with the 2-letter short form of subheadings:
Anti-Inflammatory Agents, Non-Steroidal AND alcoholic beverages AND (ae [sh] OR po [sh])
To determine what subheading can be used with each MeSH vocabulary term, the
MeSH Browser can be used, accessed from the sidebar (click on detailed display
for allowable subheadings) or one can go to the more detailed MeSH Browser at
http://www.nlm.nih.gov/mesh/MBrowser.html.
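The query forms described above (tagging a term with [mesh] or [major], attaching a subheading to a term, and adding free-floating [sh] subheadings) can be assembled mechanically. The Python sketch below only builds query strings and does not contact PubMed. The helper names are invented for illustration, and the tag spellings and two-letter abbreviations are copied from the text above, so they should be checked against current PubMed help before use.

```python
# Two-letter subheading abbreviations that appear in the text above.
SUBHEADING_ABBREVIATIONS = {
    "adverse effects": "ae",
    "poisoning": "po",
    "manpower": "ma",
}


def tag_heading(term, major_only=False):
    """Tag a term as a MeSH heading, or as a major topic to focus it."""
    return f"{term} [major]" if major_only else f"{term} [mesh]"


def attach_subheading(term, subheading, short=False):
    """Attach a subheading directly to a term, e.g. nursing/manpower."""
    sub = SUBHEADING_ABBREVIATIONS[subheading] if short else subheading
    return f"{term}/{sub}"


def floating_subheadings(subheadings, short=False):
    """OR together subheadings that are not tied to a specific term."""
    parts = []
    for name in subheadings:
        sub = SUBHEADING_ABBREVIATIONS[name] if short else name
        parts.append(f"{sub} [sh]")
    return "(" + " OR ".join(parts) + ")"


def build_query(terms, subheadings, short=False):
    """AND the terms together with one floating-subheading group."""
    return " AND ".join(list(terms) + [floating_subheadings(subheadings, short)])


if __name__ == "__main__":
    terms = ["Anti-Inflammatory Agents, Non-Steroidal", "alcoholic beverages"]
    print(build_query(terms, ["adverse effects", "poisoning"]))
    print(build_query(terms, ["adverse effects", "poisoning"], short=True))
    print(attach_subheading("nursing", "manpower", short=True))
    print(tag_heading("nursing", major_only=True))
```

Keeping the abbreviations in one table means the long and short forms of a query are always generated from the same list of subheadings.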
Filtering
Ovid offered the most advanced filtering options. Then came PubMed, while Firstsearch
and SilverPlatter did not offer many choices.
Saving
All but Firstsearch offered a save option. On PubMed it is not too obvious.
Help and Support Features
All four search engines offer these features.
Fulltext Links
All four search engines offer fulltext links.
Updating
Medline is updated monthly on Firstsearch. PubMed is updated weekly and PreMedline
is updated daily. Ovid and SilverPlatter are updated weekly or monthly depending
on price.
Unique Interfaces
Each interface has unique features. Ovid has an "Ask the Librarian"
button that links every page to a comment form that allows the searcher to
send a question to the local library. Ovid also has the "Recover"
option to make sure that a searcher can get back to a lost search whenever the
"logoff" has not been initiated. PubMed has the hyperlink "Related
Articles" that will retrieve a pre-calculated set of articles in Medline
that closely relate to the selected article. PubMed also has the "Consumer
Health" button which links to an NLM Web site called MedlinePlus, which
is an exceptional database designed for the health consumer. It includes health
information and will also run a preformulated Medline search.
Firstsearch has a familiar interface for library clients and the flexibility
to use the interface to search other databases in the Firstsearch family. Libraries
subscribing to Electronic Collections Online take advantage of links to full
text. Parker says that SilverPlatter is a "sophisticated Medline
search interface with all the capability of the Ovid Advanced Search mode" (page
9) if the searcher goes into the Build a Search or the Thesaurus. She also wrote:
"I was also somewhat frustrated by the way Webspirs opens separate windows
for each search or function selected. But there may be other versions that do
not perform that way."
Gruwell and Littleton (2002) provided several links to Medline tutorials:
For PubMed: http://www.nlm.nih.gov/bsd/pubmed_tutorial/m1001.html
http://www.stanford.edu/~cstave/pubmed/pubmed3.html
http://www.library.health.ufl.edu/pubmed/pubmed2
For Ovid: http://www.mclibrary.duke.edu/respub/guides/ovidtut
http://www.health.library.mcgill.ca/eguides/tutorial/index.htm
Online Publishing
Barry P. Markovitz, MD, is an advocate of making biomedical information available
for free on the Internet. He believes that writers, not publishers, should have
the rights to their own work. PubMed Central, a freely accessible preprint and
postprint "eprint" archive, is available on the Internet, can be accessed
from the PubMed interface and includes non-peer-reviewed material. Some non-peer-reviewed
articles are preprints that will be subjected to peer review by journal editorial
boards. Once scientists sell copyright to a journal, they have to pay the
publisher to be able to distribute their material elsewhere. Reaching the widest
possible audience would therefore cost the scientist a lot of
money. PubMed Central bypasses this. Certain journals, like BMJ, already make
their full text articles available online with little or no lagtime from publication.
Markovitz believes that most scientists and institutions would be content to
pay page charges for online publishing because this would be cheaper than print.
Advertising would cover costs. Markovitz maintains that the current biomedical
publishing industry appears unable to think "outside the box" of the
reader pays, restricted access, commercial publishing model.
Hersh and Rindfleisch (2000) wrote that there are several reasons that online publishing
of scientific journals would be a good idea: the cost of print journals is rising,
libraries are buying fewer of them so there is less access to them, and there
are more journals and conference proceedings.
Liz Pope of Haworth Press writes that PubMed Central "democratizes a process
now almost exclusively the preserve of commercial publishers and learned societies"
and "providing the options for depositing both peer-reviewed journal articles
and non-peer-reviewed preprints, PubMed Central proves an important addition
to the way scientific findings are communicated" (page 189).
Maxine Hattery (2001), editor of Information Retrieval and Library Automation, wrote about
how several university and research institute scientists in the fall of 2000
vowed to refuse to submit articles to any publisher that did not agree to deposit
its journal articles six months after publication to PubMed Central or a similar
archive for free public access. That year scientists from around the world,
including Nobel laureates, sent letters to publishers expressing how they felt
about maintaining the rights to their work and allowing free online publication
and information exchange.
Delamothe and Smith, in an editorial in BMJ (2001), wrote: "But PubMed
Central is the first initiative really to take account of how fundamentally
the worldwide Web has changed the landscape of scientific publishing" (page
323). They believe that "authors want their work to have as wide a circulation
as possible" and that PubMed Central will help them to have this (page
323).
Michael W. Jacobson, MD (2000) wrote that "the flow of information will
indeed be enhanced and liberated, and the cost to consumers and researchers
and libraries for access to information will drop substantially" (page 233)
but that "The reality is that the biomedical press is too powerful and
too integral a part of the research industry to have its foundations threatened
by well-meaning scientist" (page 233). He does point out that "Allowing
free access does not require giving up possession (just as museums the world
over allow visitors to view their artworks, often for free, while retaining
the rights to reproductions of work in their possession)" (page 232).
Works Cited
Bianchi, S. 2002. Database Reviews and Reports. PubMed: For More Than Just Medicine,
This Is One of the World's Greatest Databases. Issues in Science and Technology
Librarianship Spring 2002: 1 - 4.
Boynton, J. et al. 1998. Identifying Systematic Reviews in Medline: Developing
an Objective Approach to Search Strategy Design. Journal of Information Science
24:137 - 157.
Brachmann, L. M. et al. 2002. Identifying Diagnostic Studies in Medline: Reducing
the Number Needed to Read. Journal of the American Medical Informatics Association
9: 653-658.
Brennan, P. F. and Strombom, I. 1998. Improving Healthcare by Understanding
Patient Preferences. Journal of the American Medical Informatics Association 2: 257
- 262.
Chang, J. T., Schutze, H. and Altman, R. B. 2002. Creating an Online Dictionary
of Abbreviations from Medline. Journal of the American Medical Informatics Association
9: 612-650.
Chimoskey, S. J. and Norris, T. E. 1999. Use of Medline by Rural Physicians
in Washington State. Journal of the American Medical Informatics Association 6:
332 - 333.
Clarke, M. 1997. MeSH Terms Must Be Used in Medline Searches. British
Medical Journal 314: 1203 - 1204.
Clarke, M. and Oxman, A. 1999. Cochrane Reviews Will Be in Medline. British
Medical Journal 319: 1435 - 1436.
Cockerill, M. 2002. Biological and Medical Publishing Via the Internet.
Information Services and Use. 21: 33 - 42.
Coletti, M. H. and Bleich, H. L. 2001. Medical Subject Headings Used to
Search the Biomedical Literature. Journal of the American Medical Informatics
Association. 8: 317 - 323.
Cooper, G. F. and Miller, R. A. 1998. An Experiment Comparing Lexical and
Statistical Methods for Extracting MeSH Terms From Clinical Free Text.
Journal of the American Medical Informatics Association 5: 62 - 75.
Corn, M. 1998. Funding for Nursing Vocabularies. Journal of the American Medical
Informatics Association 5: 391 - 392.
Delamothe, T. 2001. PubMed Central Increases Its Appeal. British Medical Journal
322: 818.
Dixon, L. A Quiver Full of Arrows: Recommended Web-Based Tutorials for PubMed,
Powerpoint, Ovid Medline and Frontpage. Medical Reference Services Quarterly
21(2): 55 - 63.
Eberle, M. 2000. Current Awareness Using PubMed: Current WebServices and
Possibilities for Local Solutions. Internet Reference Services Quarterly 5(2):
21 - 29.
Feinglos, S. F. 1985. MEDLINE: A Basic Guide to Searching. Chicago:
Medical Library Association, Inc.
Goossen, W. T. F. et al. 1998. A Comparison of Nursing Minimal Data Sets.
Journal of the American Medical Informatics Association 5: 152 - 163.
Greaves, L. and James, S. 1997. MeSH Terms Must Be Used in Medline
Searching. British Medical Journal 314: 1203.
Greenhalgh, T. 1997. How to Read a Paper: The Medline Database. British Medical
Journal 315: 180 - 183.
Greisdorf, H. and Spink, A. 2001. Median Measure: An Approach to IR System Evaluation.
Information Processing and Management 37: 843 - 857.
Grogg, J.E. 2002. EBSCO Publishing Offers Full Text Through PubMed's LinkOut.
Information Today Spring 2002.
Hattery, M. 2001. The Public Library of Science Research 37 (2): 1 - 3.
Haynes, R. B. et al. 1994. Developing Optimal Search Strategies for Detecting
Clinically Sound Studies in Medline. Journal of the American Medical Informatics
Association 1(6): 447 - 456.
Hersh, W. R. and Rindfleisch, T. C. 2000. Electronic Publishing of Scholarly
Communication in the Biomedical Sciences. Journal of the American Medical Informatics
Association 7: 324 - 325.
Hersh, W. R. and Greenes, R. A. 1990. SAPHIRE - An Information Retrieval System
Featuring Concept Matching, Automatic Indexing, Probabilistic Retrieval, and
Hierarchical Relationships. Computers and Biomedical Research 23: 410 - 425.
Hersh, W. R. and Hickam, D. H. 1995. Information Retrieval in Medicine: The
SAPHIRE Experience. Journal of the American Society for Information Science
46(10): 743 - 747.
Humphreys, B. L. 2000. Electronic Health Record Meets Digital Library. Journal
of the American Medical Informatics Association 7: 444 - 452.
Humphreys, B. L. et al. 1997. Evaluating the Coverage of Controlled Health Data
Terminologies. Journal of the American Medical Informatics Association 4: 484
- 500.
Humphreys, B. L. et al. 1998. The Unified Medical Language System: An Informatics
Research Collaboration. Journal of the American Medical Informatics Association
5: 1 - 11.
Impicciatore, P. 1997. Reliability of Health Information for the Public on the
World Wide Web: Systematic Survey of Advice on Managing Fever in Children at
Home. British Medical Journal 314: 1875 - 1881.
Ingui, B. J. and Rogers, M. A. M. 2001. Searching for Clinical Prediction Rules
in Medline. Journal of the American Medical Informatics Association 8: 391 -
397.
Jacobson, M. 2000. Biomedical Publishing and the Internet. Journal of the American
Medical Informatics Association 7: 230 - 233.
Jones, R. et al. 1999. Randomized Trial of Personalized Computer Based Information
for Cancer Patients. British Medical Journal 319: 1241 - 1248.
Katcher, B. S. 1999. MEDLINE: A Guide to Effective Searching. San Francisco:
The Ashbury Press.
Kim, W. and Wilbur, W. J. 2000. Corpus-Based Statistical Screening for Phrase
Identification. Journal of the American Medical Informatics Association 7: 499
- 511.
Kingsland III, L. C. 1993. Coach: Applying UMLS Knowledge Sources in
an Expert Searcher Environment. Bulletin of the Medical Library Association
81(2): 178 - 183.
Kotzin, S. 2002. Medline and PubMed Will Be Able to Synthesize Clinical
Data. British Medical Journal 324: 791.
Liu, H., Johnson, S. B. and Friedman, C. 2002. Automatic Resolution of
Ambiguous Terms Based on Machine Learning and Conceptual Relations in the
UMLS. Journal of the American Medical Informatics Association 9: 621 - 636.
Markovitz, B. P. 2000. Biomedicine's Electronic Publishing Paradigm Shift:
Copyright Policy and PubMed Central. Journal of the American Medical
Informatics Association 7(3): 222 - 229.
Masys, D. R. 1998. Presentation of the Morris F. Collen Award to Donald A. B.
Lindberg, MD. Journal of the American Medical Informatics Association 5: 214
- 216.
McCain, K. W. Biotechnology in Context: A Database - Filtering Approach to
Identifying Core and Productive Non-Core Journals Supporting Multidisciplinary
R & D. Journal of the American Society for Information Science 46(4): 306
- 317.
McCain, K. W. and Morris, T. The Structure of Medical Informatics Journal
Literature. Journal of the American Medical Informatics Association 5: 448 -
466.
Morris, T. A. et al. 1997. Approaching Equity in Health Information Delivery.
Journal of the American Medical Informatics Association. 4: 6 - 13.
Notess, G.R. 2000. PubScience: Evolution or Devolution. Econtent
February/March 64 - 66.
Ojasoo, T., Maisonneuve, H. and Jean-Christophe, D. 2000. Evaluating
Publication Trends in Clinical Research. How Reliable Are Medical Databases?
Scientometrics. 50 (3) 391 - 404.
Oxman, A. D. et al. 1994. Users' Guide to the Medical Literature VI. How to
Use an Overview. Journal of the American Medical Association 272(17): 1367 -
1371.
Parker, S. 2001. Medline: Comparative Review on Ovid, Silverplatter,
FirstSearch and PubMed. Denison Memorial Library, University of Colorado Health
Sciences Center.
Pope, L. 2001. PubMed Central: A Barrier-Free Repository for the Life
Sciences. The Serials Librarian 40 (1/2): 183 - 190.
Ra, J. G. and Vermuelen, M.J. 1996. Mizspellin and Medline. British Medical
Journal 313: 1658 - 1659.
Ribeiro-Neto, B., Laender, B. F. and Lima, L. R. S. 2000. An Experimental
Study in Automatically Categorizing Medical Documents. Journal of the
American Society for Information Science 52(5): 391 - 401.
Smith, R. 2001. Britain's Gift: a "Medline" of Synthesized Evidence.
British Medical Journal 323: 694 - 696.
Sievert, M. C. et al. 2001. Need a Bloody Nose Be a Nosebleed? Or,
Lexical Variants Cause Surprising Results. Bulletin of the Medical Library Association
89(1): 68 - 71.
Spink, A. and Yang, Y. 2001. Medical and Health Web Searching: An Exploratory
Study. School of Information Sciences and Technology. 1 - 29.
Spyns, P. 1996. Natural Language Processing in Medicine: An Overview. Methods of Information
in Medicine 35(4): 285 - 301.
Treweek, S. P. et al. Computer-Generated Patient Education Materials: Do They
Affect Professional Practice? Journal of the American Medical Informatics
Association 9: 346 - 358.
Westberg, E. E. and Miller, R. A. 1999. The Basis for Using the Internet to
Support the Information Needs of Primary Care. Journal of the American
Medical Informatics Association 6: 6 - 25.
Wolf, F. M. et al. A Trend Analysis and Search Strategies for the Identification
of Meta-Analyses in Medline. (Abstract) In: Fourth International Cochrane
Colloquium, Adelaide. 1996. Available at:
www.cochrane.org/cochrane/abpos22.htm (last accessed 8 November 2001).
Woods, D. and Trewsheetlar, K. 1998. Medline and Embase Complement Each
Other in Literature Searches. British Medical Journal 316: 1166.
Yu, H., Hripcsak, G. and Friedman, C. 2002. Mapping Abbreviations to Full
Forms in Biomedical Articles. Journal of the American Medical Informatics Association
9: 262 - 272.
Louiza Patsis, M.S.