Information Retrieval Louiza Patsis
DIS 812 May 4, 2005
An Evaluation of User Information Retrieval on www.CDC.gov
ABSTRACT
Novice and expert users look to the Internet to find information on
various health topics. One of the most popular health websites is
www.cdc.gov, the website of the Centers for Disease Control and Prevention. Five tasks
were given to five users. The purpose of the study was to assess the effects
of user characteristics such as experience, academic background, terms used
and information goal on information seeking (IS) and search success. Pre- and
post-test questionnaires were given to the users to assess their thoughts and
input on the website and the tasks. Evaluation of the results of the tasks and
the pre- and post-test questionnaires was undertaken, and included relevance of
material found, search methods, and users' reactions and thoughts.
INTRODUCTION
Many studies have been conducted to show user influence on information seeking
(IS) and information retrieval (IR). These studies have revealed that several
factors associated with users influence their behavior and search success. These
factors include user academic experience, domain, system, website and IS experience,
and information goals. A search process can be complex and iterative. Information
need may change due to documents found that shift user focus and lead to query
expansion, and due to new knowledge accumulated by the user. Ellis (1995) developed
an IS behavior model with the following generic features: starting, browsing, chaining,
differentiating, monitoring, extracting, verifying, networking and information
managing. Starting refers to starting a search. Chaining refers to identifying
new sources of information from bibliographies and retrieved documents. Accessing
refers to identifying and locating sources of information. Browsing refers to
looking for information in areas of potential interest. Differentiating refers
to filtering the amount of information obtained using nature, quality, relative
importance and usefulness. Monitoring refers to maintaining awareness of developments
in a topic. Extracting refers to identifying relevant material from a document.
Academic background and user experience can influence all of these factors.
User Experience
User experience refers to experience that users have in the domain, in online
IS in general and in IS on the Internet or a particular search engine. Studies
have been conducted on how user experience affects search terms used. These
terms can influence IR. Lancaster et al. (1972) conducted a study of searching
the Epilepsy Abstracts Retrieval System, in order to determine how much use
was made of the system, how successfully it was used, what problems were
encountered and what the general user reaction to searching in the system was
(Lancaster et al. 1972, 223). The authors attributed search failures to the use
of incorrect terms or illogical strategies, the level of strategy adopted and
failure to cover all possible approaches to retrieval (Lancaster et al. 1972, 231).
Failures were not due to indexing failures or the delegation of searches to
intermediaries.
The primary reason for the success of system experts was their use of more search
terms. The authors wrote that more searching aids may decrease recall failures.
Recall failures were often due to overexhaustive or overspecific strategies
(Lancaster et al. 1972, 237), often conducted by novice users. The authors concluded
that training in the form of novice users observing expert users, intermediary
help and instruction, print materials, online instruction and an instructional film
is needed to help users know what search terms to use.
Fenichel (1981) and Howard (1982) found that users with search and database
experience were more effective in their searching. Howard (1982) studied DIALOG
and ERIC users of five different experience levels: novice, moderately experienced
with ERIC, moderately experienced without ERIC, very experienced with ERIC, and
very experienced without ERIC. She conducted interviews to find out about their
academic background and online experience and training. She measured effectiveness,
error rate, command language used, procedural errors and dubious practices. She
also measured cost effectiveness in searcher minutes, dollars per reference
retrieved and direct costs. Searchers were categorized into very and moderately
experienced online and ERIC searchers, and very and moderately experienced online
and non-ERIC searchers. She compared ERIC system experienced users with and
without domain experience. Very experienced ERIC users conducted more restricted
searches that were higher in precision. Howard (1982, 323) found that they viewed
the most sets, searched the most terms, spent the least time preparing before
going online, achieved the highest speed scores, made fewer errors, and performed
searches in close correlation to the online-experienced, non-ERIC-experienced groups.
Novices took more time to search, but the difference was not significant. Very
experienced non-ERIC users achieved high recall and low precision. Users with ERIC
experience searched more thesaurus terms than users without ERIC experience
(Howard 1982, 320). Users with no ERIC experience performed the most free-text
searching. The very experienced ERIC group performed the most cost-effective
searches and achieved the highest precision ratio.
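The precision and recall measures referred to in these studies can be illustrated
with a minimal sketch. It uses the standard definitions (precision is the share
of retrieved documents that are relevant; recall is the share of relevant documents
that are retrieved). The function name and the document identifiers are hypothetical
and invented for illustration; they are not data from Howard (1982).

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a single search.

    retrieved: set of document ids returned by the search
    relevant:  set of document ids judged relevant to the information need
    """
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical document ids, invented purely for illustration.
relevant = {"d1", "d2", "d3", "d4"}
narrow_search = {"d1", "d2"}                          # restricted query
broad_search = {"d1", "d2", "d3", "d5", "d6", "d7"}   # free-text query

print("narrow:", precision_recall(narrow_search, relevant))
print("broad: ", precision_recall(broad_search, relevant))
```

In this invented example the restricted search scores higher on precision and the
broad search higher on recall, which is the kind of trade-off described above.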
Borgman (1989) found that user experience with computers and task domains affected
their searches and search outcomes. In her study of the search abilities of
undergraduates, technical aptitudes affected search results, and were found
to cluster.
Marchionini et al. (1993, 65) wrote that IS involves several skills that get
polished with practice: procedural skills, such as eliciting problem clarification
from users, determining terminology and mapping it to databases, applying search
strategies in opportunistic ways, interpreting feedback from systems, and assessing
relevance of items according to expressed or anticipated user needs. Marchionini
et al. (1993) compared domain experts to systems experts. Domain experts were
content-driven, sometimes using technical terms based on their knowledge, and
formed expectations related to possible answers. Systems experts were problem-driven,
used more system features, focused on documents, and browsed more than they
would when using online databases. Information seekers were evaluated using
questionnaires and think-aloud, and interviews were conducted. Factors studied
included: problem definition, experience, query formulation, reflection, stopping,
monitoring and system effectiveness. Many of the domain experts had system experience.
They made better inferences and used acronyms. Subject experts focused on query
formulation and system manipulation, and were search driven. Many used different
databases. Marchionini et al. (1993, 61) found that the chance of search success
depends on the information seeker's experience with the task domain, general
Internet experience and expectations of what the answer will be. Both domain and
search experts used systems features to narrow down searches by date and to
limit searches by language. Systems experts used more special, advanced features
than domain experts. These features included author nationality, thesaurus,
verb tenses, field limits and a wider variety of search models and strategies.
Their conclusion was that designers can improve systems by reducing responsibilities
for search strategies or by using short cuts.
In the law system evaluation, for instance, the four users were: the attorney
with government contracts expertise (AE); the attorney with less government
contract expertise (AN); the search expert with government contracts expertise
(IE); and the search expert with little government contracts expertise (IN).
Marchionini et al. (1993, 56-57) found that AE and IE were analytic, confident
and had workmanlike ways of searching. Users with domain and subdomain expertise
formulated precise and straightforward queries. AN and IN tried to compensate
for low domain knowledge with exploratory probes, varied attacks and wishful
reasoning.
Lazonder (2000) conducted a study to find out about user World Wide Web experience
and its influence on user searches for websites and information on websites.
Users were 25 second-graders, and were asked questions on Dutch literature,
the domain. Novice and experienced users had equal knowledge of the domain.
Factors measured were: success, time, effectiveness and efficiency. Users with
higher experience had higher success rates. The time for search was equal among
novice and experienced users. Experienced users were more likely to keep searching
and home in on the desired documents. Experts were not better at browsing. The
authors found that experts were better at locating needed websites, but that
experts and novices had about the same success rates in locating information
on websites. Lazonder (2000, 579) found that experts, in locating websites,
were overall faster, more efficient and effective, had higher performance scores,
and used fewer actions to come up with successful results. Regarding browsing,
they concluded that little training is needed; even novices pick up how to browse
quickly. Efficiency was the ratio of the number of successfully completed tasks
to the time taken to complete the tasks. Effectiveness was the overall number of
actions to complete a task.
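As a minimal sketch of the efficiency measure as it is described here (successfully
completed tasks divided by time), the ratio could be computed as follows. The
function name and the figures are hypothetical, chosen only to show the arithmetic;
they are not values reported by Lazonder (2000).

```python
def efficiency(tasks_completed: int, total_time_min: float) -> float:
    """Successfully completed tasks per minute of search time."""
    if total_time_min <= 0:
        raise ValueError("total_time_min must be positive")
    return tasks_completed / total_time_min


# Hypothetical figures, invented for illustration only.
expert = efficiency(tasks_completed=4, total_time_min=20)
novice = efficiency(tasks_completed=4, total_time_min=40)
print(f"expert: {expert} tasks/min, novice: {novice} tasks/min")
```

Under this definition, two users who complete the same number of tasks differ in
efficiency only through the time they take.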
Slone (2002) conducted a study of user searches on Internet search engines,
a website, and an online catalog. She found that novices and experts exhibited
similar scrolling behavior, novices used less backtracking and experts used
a wider variety of mental models. She found that both novice and expert users
relied on serendipity, linking and other tasks that were not cognitively overbearing.
Users abandoned the task when it became too difficult. Slone (2002) found little
difference in the browsing and scrolling behavior of novices and experts searching
library online or Web-based catalogs.
Searching versus Browsing
When it comes to Internet IR, some users prefer to search while others prefer
to browse. Often, searching involves looking for something specific, while browsing
involves looking for something new and interesting. Hauck et al. (2000, 209)
referenced several studies that showed that, in general, expert users prefer
to search while novice users prefer to browse. Lazonder et al. (2000) found that
expert users tended to search more than browse.
Choo, Detlor and Turnbull (2000) studied employees who searched for information
on the Web. They categorized browsing into: directed browsing, which occurs
in a systematic, focused way and is directed by a specific object or target;
semi-directed browsing, which occurs in a predictive or generally purposeful
way, as when the target is less definite; and undirected browsing, which
occurs when there is no real goal and very little focus. Information need and
purpose influenced browsing and searching.
Academic Background
Borgman (1989) studied UCLA undergraduates of different majors to see how their
personality and academic background influenced IR. Borgman (1989, 238) cited
previous studies that had shown that users with a math or technical background,
and a high technical aptitude were better in IR. Technical aptitude referred
to factors such as spatial and reasoning aptitudes and background. Previous
studies had also shown that users involved with social sciences, humanities
and sciences have different IR behavior. Borgman hypothesized that users'
personality and technical aptitude lead to a choice of major, which then was
an intervening variable in IR. Borgman (1989, 243-244) used Carl Jung's
personality trait classifications of concrete experience versus abstract conceptualization
and active experimentation versus reflective observation. Computer experience
was controlled. Math and science majors were better and faster at completing
initial search tasks. Borgman (1989, 246) found that engineering students had
a search pattern similar to programmers' patterns in other studies, and
English majors had search patterns similar to graduate library school students
in other studies. Students with the highest technical aptitudes had personality
characteristics similar to skilled searchers and to programmers. Borgman (1989,
248) concluded that personality characteristics, academic background and technical
aptitudes influenced IR performance. No strong link was found between personality
and technical aptitudes, as there was between major and technical aptitudes. She concluded
that a more extensive study needs to be conducted to learn exactly what behavior
is influenced by each of the factors studied.
Hill (1997, 235) wrote that IS involves thinking (planning and organization),
acting (browsing and searching), integrating (differentiating and monitoring),
transforming (extracting), and reaching resolution (decision making and monitoring).
She found that domain and system knowledge affect search process and success.
Research Questions
Research questions in this study are:
1. How do user domain, website and system experience influence search success,
including time to finish?
2. How does user academic background affect search success?
METHODOLOGY
Five users were chosen. User 1 is a domain novice, and has Internet expertise
mostly for personal purposes such as shopping and email. User 3 is a nurse with
domain, Internet and website expertise. User 2 and User 4 are domain novices
with Internet expertise, and User 2 has used the website once. User 5 is a
gastroenterologist with domain experience and average system experience. He usually
conducts textbook research for questions in health. Users 4 and 5 had no website experience.
Pre-Test Questionnaire
A pre-test questionnaire was emailed to the users to get information on their
gender, age, academic and experience background, and on their search behavior
(Appendix I). The first two questions were about the gender and age of the users.
The study was not specifically designed to obtain the effects of gender and
age on user searching. The third question asked what degree the user holds,
to get an idea of their academic background. The fifth and sixth questions,
for the same aim, asked about their major of focus and extent of experience
in biology. Questions 7 through 9 asked about the extent of user Internet experience,
including how often they search and what type of searches (topic of interest,
academic or corporate searches) they conduct. This was to get a further idea
of their domain and systems experience. Question 11 asked how confident searchers
were on taking the test, to see if this influenced their searching in any way.
Question 12 asked if users stayed on target when searching or browsing the Internet.
The purpose of this question was to get an idea of their conciseness in their
Internet IS. Question 13 asked if users found pictures and graphics useful.
The effect of the answers to the last three questions on search success was
beyond the scope of this study, but were included in the questionnaire to get
a better idea of user background. Questions 14 and 15 were asked to provide
an idea of how social effects of family background and cognitive style influenced
users. However, these are beyond the scope of this study, and could be used
in future research to assess how cultural-social and cognitive styles affect
IR.
When the questionnaire was first made up, the scope of the study was to look
at more factors of user IR, including cultural, social, personality and cognitive
style factors. However, the questions and design of the study did not allow
for these factors to be studied. Question 14 pertained to cultural and social
background factors that may influence IR. Question 15 was written to get an
idea of the cognitive skills of the users.
Search Tasks
The same five questions were emailed to each user (Appendix II). Answers were
to be found only on www.cdc.gov. The questions started from the most specific,
finding answers to specific questions, and became broader. The last question
allowed the users to search and/or browse on a topic of interest. After each
question, users were asked to indicate the search time, how many sittings it
took to answer the question, if they asked a person or the computer for help,
if they quit, if they searched or browsed, and, if both, which yielded the right
answer. They were asked to note the query terms, links and URLs they used. This
set of questions after each main task was given in order to assess the time
and ease with which the users undertook each task, whether they searched or browsed,
and how proficient they were in formulating queries.
Formulation of Questions
The questions were written in order of difficulty, the first one being the most
difficult. The answer to Question 1 could be found by searching or browsing,
but search terms were not so obvious to get. The answer to Question 2 could
be obtained by searching or browsing with perhaps some difficulty for a novice
user. Then browsing was required to find the answer inside the article. Care
had to also be taken to pick search terms for this question, but using the full
title, especially with the authors, would yield the result. This question was
a little easier than Question 1. Question 3 was easy and straightforward to
answer by browsing or searching. Question 4 allowed the users to choose what
facts to write. Question 5 gave users the most freedom; they were allowed to search
for what was of interest to them.
Question 1 was chosen for the following reasons: the right query terms would
have to be chosen, such as "women and race and heart disease", so
as not to obtain irrelevant information. If the user had chosen to browse, they
would have to click on the appropriate link on the left, "Diseases and
Conditions", and then scroll to the appropriate link, "Heart Disease".
Then they would have to find the tenth link on the page, "Women and Heart
Disease: An Atlas of Racial and Ethnic Disparities in Mortality", and
find in this PDF the appropriate information. It was assumed that some amount
of domain and systems sophistication and experience would be needed for this
search.
Question 2 was chosen for the following reasons: searchers had a variety of
ways to search or browse for the article and information in the article. They
could have chosen the journal link on the left of the website, or chosen to
look up the article by title or author. Looking it up by one author does not
yield an immediate result. The searchers would have to have patience if their
first result did not work and look up by both authors or by authors and title.
Searchers who were the most experienced in the website and domain searching
probably would be better at formulating query terms for the search. Part b was
included to verify that the users did find the article.
Question 3 was chosen because the results could have been found out in a variety
of ways by browsing or searching. It was assumed that the more experienced users
would stay on the mark and find results the fastest, without getting discouraged
from lack of results at first. Information on rapid HIV testing and CLIA can
easily be found by searching. Alternatively, the user can use the Diseases and
Conditions link, then the HIV/AIDS link, and then the first link, "Advancing HIV
Prevention: New Strategies for a Changing Epidemic". The PDF would then be browsed
for answers to parts a, b and c. To find part b by searching, an appropriate query
would have to be formulated.
On Question 4, the users were able to look for themselves for three facts, and
had more freedom in how to search or browse. This question was chosen to provide
insight into user style (what users would do if they had freedom to search
or browse and look for what they wanted on a topic) according to user
experience.
Question 5 was chosen for the same reason. Even more freedom was given to users.
This time they could choose what they wished to look for. The information need
would be their own. The researcher wanted to see if searching or browsing for
a topic of interest would make the search more successful or quicker. According
to Slone (2002), searchers pursuing recreational goals would rely more on
serendipity and would be likely to give up on difficult tasks. Experienced users
were more likely to have mental models of how the Internet works, which aided
their searches. Novices did not link heavily and generated fewer terms.
This question was used to see if users less experienced in the domain, system
and website would be more likely to give up on a recreational task.
Post-Test Questionnaire
A post-test questionnaire was given at the same time, to be completed after the five
questions (Appendix III). The first question was asked to determine the extent
of website experience. The second question was asked to determine if the environment
in any way impeded their IS. Environment may have affected search times. If
so, experience and academic background may not have been the only factors that
influenced search times. Questions 3 to 5 were asked to see if a certain type
of question was found most difficult by the users, and to assess question difficulty
in general. Question 6 was asked to see if users completed the task on their
own, or if they used the computer or someone for help. Questions 7 a-d were
asked to get an idea of what users thought of the website. These questions were
not used in the final analysis since the intention of the research was to find
how user academic background and experience affect IR. Question 7e was asked
to assess question difficulty in general. Questions 8 and 9 were asked to learn
even more about what users thought of the website. Question 10 was written to
see how the users felt about the easiness of answering each question in their
own words. These questions were not used for the final evaluation. Question
11 was asked to assess user confidence during the test, in order to see if this
hampered their IS in any way. It was assumed that the less experienced users
would have less confidence. Questions 12 and 13 were asked to see what users
felt about the interface and website. Question 14 was asked to see if the users
felt lost during the test. Question 15 was asked to determine if users usually
give up on Internet IS tasks. Questions 16-20 were asked to see what users thought
of the worth of answering each question in terms of time. These questions were
not used for the final evaluation. Question 21 was asked to see if users strayed
away from answering questions to browse or to use another website. If users
strayed, their time results may not have been accurate. Question 22 was asked
to see of what value answering each question was to the users. This question
was not used for the final evaluation. Question 23 was asked to allow users
to write their comments in their own words, in case the questions did not cover
all of the pertinent facts about their IS behavior.
RESULTS AND DISCUSSION
Following is a review of the pre-test questionnaire, the questions and the post-test
questionnaire.
Pre-test Questionnaire
Gender and Age
Users 2 and 5 were male. User 1 is 28 years old, User 2 is 49 years old, User
3 is 50 years old, User 4 is 40 years old and User 5 is 45 years old. There
is no huge difference among ages, except for User 1, the most inexperienced
user. Implications of age were beyond the scope of this study.
Academic Background
User 1 has the least amount of college education, having earned an Associate's
degree. User 2 has a Bachelor of Science in computer programming. User 3 has the
most relevant academic background; she is a nurse. User 4 has a Bachelor of Arts
in Computer Science and a Master of Science in computer information systems. User 5
has the most education measured in years; he is a gastroenterologist who served
for some months as the head of gastroenterology at Northshore University
Hospital in Manhasset, Long Island.
User Domain, System and Website Experience
Everyone used the Internet daily, and User 1 used it more for recreational purposes
and less for professional purposes. Other people tended to use it for both.
User 2 and User 4 had computer system training, yet their average search times
differed by almost three minutes. User 3 had the most of all types of experience;
domain, website and systems. User 1 had little domain experience, and used the
Internet mostly for personal and recreational uses such as email. Users 2 and
4 had used the website once each. It may be concluded, although not definitively,
that combined domain, website and Internet experience does make a difference
in search times.
User Confidence
User 1 had average confidence. User 2 was "very confident". User
3 was confident. When it comes to confidence, User 4 said "not very"
and User 5 said "Okay". No correlation was found between confidence and
search times, except that perhaps User 4's low confidence led her to take the
longest to answer some of the questions, despite her computer experience. It
may also have had something to do with why User 5, who is a physician, took longer
to answer some questions than User 3. Effects of confidence were inconclusive
and beyond the scope of this study; confidence did not seem to
affect IR.
Staying on Target, Pictures and Graphics and Family Background
Everyone agreed that they stay on target and avoid irrelevant information when
browsing or searching. All but User 4 and User 5 agreed that pictures and graphics
are helpful. User 5 answered "neither" and User 4 answered "disagree".
No one reported that their family affected their searching. User 2's family
has a technical background.
Questions
All users answered the questions correctly. Average times to answer questions
were calculated using the first four questions. The following are the average
task times:
               User 1    User 2    User 3    User 4    User 5
               (min)     (min)     (min)     (min)     (min)
Question 1     15        12.5      3         20        6
Question 2     8         4.1       2         1         5
Question 3     20        3.12      1         10        8
Question 4     15        3.54      6         3         8
Question 5     5         4         4         5         2
Average Time   14.5      6         3         8.5       7
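The averaging rule stated above (the mean of the first four questions only) can
be checked with a short sketch. The times are copied from the table; the variable
names are illustrative, and the sketch assumes that entries such as 3.12 are
decimal minutes, which the paper does not state.

```python
from statistics import mean

# Task times in minutes for Questions 1-5, copied from the table above.
# Assumption: values such as 3.12 are decimal minutes (not min.sec).
times = {
    "User 1": [15, 8, 20, 15, 5],
    "User 2": [12.5, 4.1, 3.12, 3.54, 4],
    "User 3": [3, 2, 1, 6, 4],
    "User 4": [20, 1, 10, 3, 5],
    "User 5": [6, 5, 8, 8, 2],
}

# Average over the first four questions only, as described in the text.
averages = {user: mean(t[:4]) for user, t in times.items()}

# List users from fastest to slowest average.
for user, avg in sorted(averages.items(), key=lambda kv: kv[1]):
    print(f"{user}: {avg:.2f} min")
```

Under that assumption the computed averages for Users 1, 3 and 4 match the table
exactly, while the tabulated 6 and 7 for Users 2 and 5 appear to be rounded from
about 5.8 and 6.75.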
User 3 may have been the quickest because of her combined academic background,
and domain, website and Internet experience. The academic backgrounds and experience
of Users 2 and 4 were similar, although User 4 also holds a Master's degree in the
computer field. Yet User 2 had an average search time that was 2.5 minutes shorter
than that of User 4. It may be that User 2 was more confident. User 2's extensive
Internet experience and confidence may have been the reasons that he took a
shorter amount of time to answer than User 5. All users were successful on Question
5, and times were 5 minutes or less. Changing the nature of the question to one where
users look for a topic of choice did not seem to affect search success or time,
regardless of user academic background and experience.
Everyone completed each question at one sitting and did not ask for help. Users
used similar links. User 3 used the most URLs. No correlation was found between
links and academic background or experience. Users for the most part did not
note links used when browsing. The questionnaire may have been confusing. Some
users may have confused "links" with "URLs".
The following is a table of the search and browsing behavior of the users.
              User 1        User 2        User 3        User 4        User 5
Question 1    Search        Both/Browse   Search        Both/Browse   Search
Question 2    Search        Search        Search        Search        Search
Question 3a   Search        Search        Search        Search        Search
Question 3b   Browse        Browse        Browse        Browse        Browse
Question 4    Both/Search   Search        Search        Search        Search
Question 5    Both/Browse   Search        Both/Search   Browse        Search
Past studies have shown that experienced users tend to search. There was no
correlation in this study. User 5 searched except for Question 3b. Other than
that, all users searched more often than browsed. User 1 and User 4 browsed
the most. They also took the most time to search. Their academic backgrounds
and Internet experience are different. They have low domain experience and little
or no website experience.
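The observation about which users browsed the most can be reproduced by tallying
the entries of the search and browsing table. This is a minimal sketch: the
dictionary layout and function name are illustrative, and it assumes that an entry
such as "Both/Browse" means both methods were used and browsing yielded the answer,
following the task instructions described under Search Tasks.

```python
# Entries per user for Questions 1, 2, 3a, 3b, 4 and 5, copied from the table.
behavior = {
    "User 1": ["Search", "Search", "Search", "Browse", "Both/Search", "Both/Browse"],
    "User 2": ["Both/Browse", "Search", "Search", "Browse", "Search", "Search"],
    "User 3": ["Search", "Search", "Search", "Browse", "Search", "Both/Search"],
    "User 4": ["Both/Browse", "Search", "Search", "Browse", "Search", "Browse"],
    "User 5": ["Search", "Search", "Search", "Browse", "Search", "Search"],
}


def browse_count(entries):
    """Count entries in which browsing was used at all (alone or with searching)."""
    return sum(1 for e in entries if e.startswith(("Browse", "Both")))


counts = {user: browse_count(entries) for user, entries in behavior.items()}
print(counts)
```

Counted this way, Users 1 and 4 each browsed on three of the six entries, more
than any other user, and User 5 browsed only on Question 3b.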
Other Questions after Each Task
No user quit or asked for help. All users answered each question in one sitting.
Most users did not put down search terms. When they did, the terms were similar or
the same among all users. No user provided links. They may have mistaken
links for URLs. Not all users provided URLs every time. When they did,
the URLs were similar or the same, except for User 3, who provided more URLs. This
was mainly because she provided separate URLs for each step. For instance, for
Question 1, she put down both: http://www.dcd.gov/node.do/id/0900f3ec8000e035
and http://www.cdc.gove/doc.do/id/0900f3ec802720b8.
Post-Test Questionnaire
User 4 and User 5 had never used www.cdc.gov before. Other users had used it
once, except for User 3, who uses it often. Environment did not affect searching.
No one quit or asked for help.
Question 3
User 1    2; easiest to locate
User 2    5; his choice
User 3    All easy
User 4    2; easiest to locate; used title
User 5    All easy; his own #5
For Question 3, User 2 and User 5 found Question 5 to be the easiest. User 3
thought that all were easy, as was expected. User 1 and User 4 found the second
question easiest, looking for a specific title and article. These users had
different backgrounds.
Question 4
User 1    #3; had to browse a lot
User 2    None
User 3    #4; question is less structured
User 4    #1; took the longest
User 5    #3b
For Question 4, it is notable but inconclusive that User 1 found
Question 3 the hardest, stating that she had to browse a lot, since this is
a straightforward question and browsing for specific definitions is not necessary.
It is not the easiest way to complete the task. Perhaps if she had been
a more experienced Internet user in terms of science or professional activity,
she would have searched differently. This question took her the longest, 20
minutes. User 3 found Question 4 hardest to answer, since it was less structured.
It took her longer than any other question she answered: 6
minutes. It is surprising that she found any questions difficult. User 4 found
Question 1, the most difficult question, the hardest to answer. It took her
a longer time to answer this question than any other question. User 5 surprisingly
found one question hard: the HIV question. It took him the same amount
of time, 8 minutes, to answer this one as Question 4. It is surprising
that User 5, a physician, and User 1, the most novice, found this question the
hardest. For Questions 5 and 6, no one gave up on questions or asked for help
from a person or the website.
Question 7
User 1    1-3 Agree; 4 Disagree; 5 Neither
User 2    All Agree
User 3    All Agree
User 4    1-4 Agree; 5 Disagree
User 5    All Agree
For Question 7, Users 2, 3 and 5 agreed that all questions were easy to answer.
This was surprising because User 3 did state that Question 4 was less structured.
User 1 thought Questions 1 through 3 were easy to answer, 4 was difficult and
marked "neither" for 5. User 4 thought that all but Question 5 were easy
to answer. It is not possible to determine for sure whether user experience and
academic background influenced how they viewed the questions.
No one answered Question 8, to describe in their own words what they thought
of answering each question in terms of easiness. There were discrepancies between
the answers to Questions 4 and 7. For instance, User 1 found test Question 3
difficult, while agreeing on Question 7 that test Questions 1 through 3 were easy
to answer. User 3 found test Question 4 the hardest, while answering on Question 7
that all questions were easy to answer. User 4 answered on Question 4 that test
Question 1 was the hardest while agreeing that test Question 1 was easy to answer.
User 5 found that test Question 3b was hard to answer, while agreeing on Question
7 that all test questions were easy to answer.
For Question 9, all users were confident, except that User 5 wrote “Okay”
and User 4 wrote that her confidence rose as she took the test. Their confidence
levels were not as high as they could have been. Perhaps this is why User 5,
a physician, and User 4, a computer specialist, were not as quick as User 3
and User 2 respectively.
For Question 10, no one felt lost except for User 1 once. For Question 11, on
whether they would give up once Internet browsing and searching gets too tough,
User 1 disagreed (unexpected), User 3 and User 2 strongly disagreed (expected)
and User 4 and User 5 agreed (unexpected). No one gave up on this test. Perhaps
the questions were easy or the website was easy to use. For Question 12, no
one strayed or asked for help. This may mean that questions were easy for different
types of users.
Limitations of the study included limited time; the absence of observations, think-aloud
protocols and interviews with the users; the small sample of only five users; and the fact
that no user was a complete systems novice.
CONCLUSION
Findings in this study were not conclusive. Academic background and experience
were not found to play a big role in IR, IS and task completion time. The exception
is that User 3, the nurse with the most domain, website and Internet experience,
completed the tasks more quickly and used the most search links. Terms and URLs used
did not differ enough to allow for conclusions on user academic background and
experience and terms used. Conducting searches on topics of interest while answering
Question 5 did not seem to be affected by academic background or experience;
all users found what they were looking for and found it in 5 minutes or less.
Reasons that more conclusions cannot be drawn from this study may be that the
website is relatively easy to use, even for those without much domain and system
experience. Another reason may be that questions were too easy. Unlike in Slone
(2002), users did not abandon the task when it became too difficult. Most users
did not think that any of the questions were very difficult. All of the users
answered the questions correctly. User 3, the user with the highest level of
domain, website and Internet experience, did complete the questions fastest.
All users had a similar amount of confidence in using the website. User 1 reported
average to normal confidence; this may be related to her
slowest average search time. Academic factors may have made a difference since
User 3 answered the quickest and User 1 answered the slowest. Users used similar
search terms. All users found the correct answers.
FUTURE STUDY
In the future, users will include ones with domain experience but little or
no website and systems experience. Fewer questions and question tasks will
be asked, so as not to be so demanding. Perhaps then the users would not leave
any blanks. Interviews will take place before and after the test to really get
an idea of users’ opinions, experience and reactions. Users will be observed.
Interviews and observations will allow a better picture of user moves, and will
solve the problem of users not completing all questions. There were disparities
in the post-test questionnaire between Questions 3 and 4, and Question 7, on which questions
the users viewed as the easiest or hardest. With interviews, such discrepancies
could be clarified. Questions on domain, website and Internet experience will contain
a range of time choices such as “2 to 5 times per day” to get a more
accurate idea of years of academic experience, and of how often a website or the
Internet is used.
In future questionnaires, choices will be offered for users to indicate their
age in an age range, so as not to be so intrusive. Some questions, such as the
one asking if questions were easy to answer will not have a “neither”
option. If checked, this is difficult to interpret. “Links” will
be changed to “URLs” in the question: “Note the links that
you used” and “Note the link(s) that you used to find the answer”.
User 1 was confused and answered only under “Please provide the URL(s)
of where you found your answer(s)”. This was a repetitive question. In future
questions, the differentiation between links used during browsing and a URL
will be made apparent.
In a future study, users with even less domain, website and Internet experience
will be sought to produce a wider variety of results. Questions will be made
tougher to answer and a less user-friendly website will be used. These changes
may allow factors such as domain and system experience to weigh in more.
Future studies may include an analysis based on cognitive style, personality,
and cultural and social background and factors. Personality and psychology tests
would be administered and trained personnel would interpret them.
Appendix I
Pre-Test Questionnaire
Please provide us with the following information:
1. What is your gender?
2. What is your age?
3. What degree(s) do you hold?
4. How many years of college have you had?
5. What was your major or field of focus?
6. What is the extent of your experience in biology?
7. What is your extent of experience with the Internet?
8. How many times a day or week do you use the Internet?
9. What is the purpose of your use of the Internet – recreational browsing,
specific searching (if so, in what topic) or email?
10. What type of website do you usually visit? (academic, library, corporate, art, etc.)
11. How confident were you about taking the test before taking it?
12. Please put an “X” in the appropriate spot:
When searching or browsing the Internet, I usually stay on target and avoid irrelevant information.
Strongly Agree Agree Neither Agree or Disagree Disagree Strongly Disagree
13. Please put an “X” in the appropriate spot:
When searching or browsing the Internet, I find pictures and graphics useful in making the search easier.
Strongly Agree Agree Neither Agree or Disagree Disagree Strongly Disagree
14. Please provide any comments about how your family, cultural and social background prepared or hindered you from becoming a confident person in biology and Internet searching, or in general.
15. Put an “X” next to the sentence that most characterizes your
beliefs:
Knowledge is known, and there is a right and wrong for everything.
Most knowledge is known, and there is probably a right way to find answers.
Usually, there is certainty about knowledge.
Knowledge is contextual; it is disconnected from any absolute truth.
16. Please feel free to provide any other comments.
Appendix II
Questions
To answer these questions, you will use ONLY www.cdc.gov. Please take the time to answer each question. There are questions at the end of each section that you will need to answer. Look at them first before beginning to answer the question. If it gets too difficult, you can ask for help, or you can stop and not answer. Note this. You can do the questions in more than one sitting. Whether you answer each question all at once, or at more than one sitting, please note the total time it took to answer each question. If you stop, note how long it took you before you stopped. Also note if you browsed or searched. Also note if you reformulated your query terms. Thank you.
Question 1
Search or browse to find the United States 1991-1995 statistics of what racial
group of women ages 35 and up experienced the most heart disease.
If you find a graph, what other racial groups were depicted in the graph?
Answer
Time to finish _____________
How many sittings__________
Asked for help___________
Online help yes/no
By whom_________________
For what_______________
Did you stop completely? _________
Time before stopping________
Did you search or browse? _____________
If both, which yielded the successful answer? ______________
If you searched, what query terms/questions did you use, in chronological order
Note the links that you used
Note the link(s) you used to find the answer(s).
Did you read material in another language? Why/why not?
If you read material in another language, which language and what material?
Please provide the URL(s) of where you found your answer(s).
Question 2
Muin J. Khoury and George A. M. Mensah recently wrote the journal article “Genomics and the Prevention and Control of Common Chronic Diseases: Emerging Priorities for Public Health Action” for the journal Preventing Chronic Disease: Public Health, Research and Policy.
a. Please find the article by browsing or searching. Indicate if you browsed or searched. If you searched, indicate the query term for each search and the query term for the successful search.
b. Find in the article the number two fact about family disease history that makes it ideal for public health practice, and write it here: ________________________________________________________________________________________________________________________________________________
c. Did you search or scroll to find the answer?
Time to finish _____________
How many sittings__________
Asked for help___________
Online help yes/no
By whom_________________
For what_______________
Did you stop completely? _________
Time before stopping________
Did you search or browse? _____________
If both, which yielded the successful answer? ______________
If you searched, what query terms/questions did you use, in chronological order
Note the links that you used
Note the link(s) you used to find the answer(s).
Did you read material in another language? Why/why not?
If you read material in another language, which language and what material?
Please provide the URL(s) of where you found your answer(s).
Question 3
a. Find information on rapid HIV testing. Provide the URL.
________________________________________________________________________
b. When is it recommended that pregnant women first get tested for HIV?
________________________________________________________________________
c. For what does CLIA stand?
________________________________________________________________________
Time to finish _____________
How many sittings__________
Asked for help___________
Online help yes/no
By whom_________________
For what_______________
Did you stop completely? _________
Time before stopping________
Did you search or browse? _____________
If both, which yielded the successful answer? ______________
If you searched, what query terms/questions did you use, in chronological order
Note the links that you used
Note the link(s) you used to find the answer(s).
Did you read material in another language? Why/why not?
If you read material in another language, which language and what material?
Please provide the URL(s) of where you found your answer(s).
Question 4
Find three facts about cholesterol and write them down.
__________________________
__________________________
__________________________
Time to finish _____________
How many sittings__________
Asked for help___________
Online help yes/no
By whom_________________
For what_______________
Did you stop completely? _________
Time before stopping________
Did you search or browse? _____________
If both, which yielded the successful answer? ______________
If you searched, what query terms/questions did you use, in chronological order
Note the links that you used
Note the link(s) you used to find the answer(s).
Did you read material in another language? Why/why not?
If you read material in another language, which language and what material?
Please provide the URL(s) of where you found your answer(s).
Question 5
Browse or search for your topic of interest, formulate a question, and find the answer to the question.
Topic of interest ________________________________________________________________________
Question ________________________________________________________________________________________________________________________________________________
Answer
________________________________________________________________________
________________________________________________________________________
________________________________________________________________________
Time to finish _____________
How many sittings__________
Asked for help___________
Online help yes/no
By whom_________________
For what_______________
Did you stop completely? _________
Time before stopping________
Did you search or browse? _____________
If both, which yielded the successful answer? ______________
If you searched, what query terms/questions did you use, in chronological order
Note the links that you used
Note the link(s) you used to find the answer(s).
Did you read material in another language? Why/why not?
If you read material in another language, which language and what material?
Please provide the URL(s) of where you found your answer(s).
Appendix III
Post-Test Questionnaire
1. Have you used www.cdc.gov before? If you have, when, how often, and for what did you use www.cdc.gov?
2. Did your environment affect your taking the test? i.e.
a. Was there time pressure?
b. Did you think that others were watching you?
c. Did you want to impress them?
d. Did you want to impress the test-administrator?
3. Which question did you find the easiest and why?
4. Which question did you find the hardest and why?
5. Did you give up on any questions and why?
6. Did you ask for help from the website or another person for any questions? How did it or they help? Were you comfortable asking for help?
7. Please put an “X” in the appropriate spot:
a. www.cdc.gov is a website that is easy to navigate.
Strongly Agree Agree Neither Agree or Disagree Disagree Strongly Disagree
b. www.cdc.gov is a website that is easy to search.
Strongly Agree Agree Neither Agree or Disagree Disagree Strongly Disagree
c. I found my experience taking this test pleasant.
Strongly Agree Agree Neither Agree or Disagree Disagree Strongly Disagree
d. I would use this website in the future for personal use.
Strongly Agree Agree Neither Agree or Disagree Disagree Strongly Disagree
e. For each question, answer: This question was easy to answer.
Strongly Agree Agree Neither Agree or Disagree Disagree Strongly Disagree
Strongly Agree Agree Neither Agree or Disagree Disagree Strongly Disagree
Strongly Agree Agree Neither Agree or Disagree Disagree Strongly Disagree
Strongly Agree Agree Neither Agree or Disagree Disagree Strongly Disagree
8. Please put an “X” in the appropriate spot:
I found www.cdc.gov a website that clearly presents information.
Strongly Agree Agree Neither Agree or Disagree Disagree Strongly Disagree
9. Please put an “X” in the appropriate spot:
I found www.cdc.gov a complex website.
Strongly Agree Agree Neither Agree or Disagree Disagree Strongly Disagree
Please provide any additional comments.
10. In your own words, what do you think of the easiness of answering each of the test’s questions:
11. How confident were you about taking the test during and after taking it?
12. What did you think of the www.cdc.gov interface?
13. In your own words, what do you think of www.cdc.gov?
14. Did you at any time during the test feel “lost”? While answering which question(s) did you experience this?
15. Please put an “X” in the appropriate spot:
I usually give up if Internet search/browsing tasks get too tough.
Strongly Agree Agree Neither Agree or Disagree Disagree Strongly Disagree
16. Please put an “X” in the appropriate spot:
I think the time it took to answer Question 1 was worth it.
Strongly Agree Agree Neither Agree or Disagree Disagree Strongly Disagree
17. Please put an “X” in the appropriate spot:
I think the time it took to answer Question 2 was worth it.
Strongly Agree Agree Neither Agree or Disagree Disagree Strongly Disagree
18. Please put an “X” in the appropriate spot:
I think the time it took to answer Question 3 was worth it.
Strongly Agree Agree Neither Agree or Disagree Disagree Strongly Disagree
19. Please put an “X” in the appropriate spot:
I think the time it took to answer Question 4 was worth it.
Strongly Agree Agree Neither Agree or Disagree Disagree Strongly Disagree
20. Please put an “X” in the appropriate spot:
I think the time it took to answer Question 5 was worth it.
Strongly Agree Agree Neither Agree or Disagree Disagree Strongly Disagree
21. At any time during your test, did you stray from answering a question to browse the CDC website or another website? If so, when, why and for how long?
*22. Please put an “X” next to the appropriate phrase for each question:
Question 1
For me, this search was:
___of major value
___of considerable value
___of minor value
___of no value
Question 2
For me, this search was:
___of major value
___of considerable value
___of minor value
___of no value
Question 3
For me, this search was:
___of major value
___of considerable value
___of minor value
___of no value
Question 4
For me, this search was:
___of major value
___of considerable value
___of minor value
___of no value
Question 5
For me, this search was:
___of major value
___of considerable value
___of minor value
___of no value
* adopted from Lancaster 1972
23. Please feel free to provide any other comments.
Bibliography
Borgman C. L. 1989. All users of information retrieval systems are not created
equal: An exploration into individual differences. Information Processing &
Management 25(3):237-251.
Cooper W. S. 1968. Expected search length: A single measure of retrieval effectiveness
based on the weak ordering action of retrieval systems. American Documentation
January 1968:30-41.
Cooper W. S. 1976. The paradoxical role of unexamined documents in the evaluation
of retrieval effectiveness. Information Processing & Management 12:367-375.
Elkerton, J., & Williges, R. C. (1984). Information retrieval strategies
in a file-search environment. Human Factors 26:171-184.
Fenichel C. H. 1981. Online searching: Measures that discriminate among users
with different types of experiences. Journal of the American Society for Information
Science 32(1):23-32.
Friede A, Rosen D. H. and Reid J. A. CDC Wonder: A cooperative processing architecture
for public health. Journal of the American Medical Informatics Association 1(4):303-312.
Hauck R. V., Sewell R. R., Ng T. D. and Chen H. 2001. Concept-based searching
and browsing: A geoscience experiment. Journal of Information Science 27(4):199-210.
Hill, J. R. The world wide web as a tool for information retrieval: An exploratory
study of users’ strategies in an open-ended system. School Library Media
Quarterly Summer 1997:229-236.
Howard, H. 1982. Measures that discriminate among online searchers with different
training and experience. Online Review 6(4):315-326.
Lancaster F. W., Rapport R. L., and Penry J.K. 1972. Evaluating the effectiveness
of an on-line, natural language retrieval system. Information Storage and Retrieval
8:223-245.
Lazonder A. W., Biemans H. J. A. and Wopereis G. J. H. 2000. Differences between
novice and experienced users in searching information on the World Wide Web.
Journal of the American Society for Information Science 51(6):576-581.
Lesk, M.E. and Salton, G. 1969. Relevance assessments and retrieval system evaluation.
Information Storage and Retrieval 4:343-359.
Marchionini G., Dwiggins S., Katz A and Lin X. 1993. Library & Information Science Research 15:35-69.
Marchionini, G. M. 1995. Information Seeking in Electronic Environments. Cambridge, United Kingdom: Cambridge University Press.
Salton Gerard. 1992. The state of retrieval system evaluation. Information
Processing & Management 28(4):441-449.
Saracevic T., Kantor P., Chamis A., and Trivison D. 1988. A study of information seeking and retrieving. I.
Background and methodology. Journal of the American Society for Information
Science 39(3):161-176.
Saracevic T. and Kantor P. 1988. A study of information seeking and retrieving.
II. Users, questions, and effectiveness. Journal of the American Society for
Information Science 39(3):177-196.
Slone D. J. 2002. The influence of mental models and goals on search patterns
during Web interaction. Journal of the American Society for Information Science
and Technology 53(13):1152-1169.
Spink A. 2004. Multitasking information behavior and information task switching:
An exploratory study. Journal of Documentation 60(4):336-351.
Swanson, D. R.1977. Information retrieval as a trial-and-error process. Library
Quarterly 47(2):128-148.
Swets J. A. 1969. Effectiveness of information retrieval methods. American
Documentation January 1969:72-89.
Tenopir C. 1983. Dialog's knowledge index and BRS/After Dark: Database searching
on personal computers. Library Journal March 1, 1983:471-473.
Yang, C. C. 2004. Content-based image retrieval: A comparison between query
by example and image browsing map approaches. Journal of the American Society
for Information Science and Technology 30(3):254-267.
Banwell, L. and Graham, C. 2004.
http://www.informationr.net/ir/9-2/paper167.html.
Accessed on March 30, 2005.
Choo, C. W., Detlor, B. and Turnbull, D. Information seeking on the Web: An integrated model of
browsing and searching.
http://www.firstmonday.dk/issues/issue5_2/choo/index.html
Accessed on March 30, 2005.
Griffiths, J. R. and Brophy P. Student searching behavior in the JISC information
environment
http://www.ariadne.ac.uk/issue33/edner/
Accessed on March 30, 2005
Moss, N. and Hale, G. 1999. Cognitive style and its effect on internet searching:
A qualitative investigation.
http://www.leeds.ac.uk/educaol/documents/000001189.htm
Accessed on March 30, 2005.
Turnbull, D. 2003. New approaches for studying and building IS models: A possible
hybrid approach.
http://www.ischool.utexas.edu/~donturn/research/SIGUSE2003Worksheet.html
Accessed on March 30, 2005.
Wilson T. D. Recent trends in user studies: Action research and qualitative
methods
http://infomrationr.net/ir/5-3/paper76.html
Accessed on March 30, 2005.