Information Retrieval Louiza Patsis
DIS 812 May 4, 2005

An Evaluation of User Information Retrieval on www.CDC.gov

ABSTRACT

Novice and expert users look to the Internet to find information on various health topics. One of the most popular health websites is www.cdc.gov, the website of the Centers for Disease Control and Prevention. Five tasks were given to five users. The purpose of the study was to assess the effects of user characteristics such as experience, academic background, terms used and information goal on information seeking (IS) and search success. Pre- and post-test questionnaires were given to the users to assess their thoughts and input on the website and the tasks. The results of the tasks and of the pre- and post-test questionnaires were evaluated; the evaluation covered the relevance of material found, search methods, and users' reactions and thoughts.
INTRODUCTION
Many studies have been conducted to show user influence on information seeking (IS) and information retrieval (IR). These studies have revealed that several factors associated with users influence their behavior and search success. These factors include user academic experience; domain, system, website and IS experience; and information goals. A search process can be complex and iterative. Information need may change as documents found shift the user's focus and lead to query expansion, and as the user accumulates new knowledge. Ellis (1995) developed an IS behavior model with the following generic features: starting, browsing, chaining, differentiating, monitoring, extracting, verifying, networking and information managing. Starting refers to starting a search. Chaining refers to identifying new sources of information from bibliographies and retrieved documents. Accessing refers to identifying and locating sources of information. Browsing refers to looking for information in areas of potential interest. Differentiating refers to filtering the amount of information obtained using nature, quality, relative importance and usefulness. Monitoring refers to maintaining awareness of developments in a topic. Extracting refers to identifying relevant material from a document. Academic background and user experience can influence all of these factors.
User Experience
User experience refers to experience that users have in the domain, in online IS in general and in IS on the Internet or a particular search engine. Studies have been conducted on how user experience affects search terms used. These terms can influence IR. Lancaster et al. (1972) conducted a study of searching in the Epilepsy Abstracts Retrieval System, in order to determine how much use was made of the system, how successfully it was used, what problems were encountered and what the general user reaction to searching in the system was (Lancaster et al. 1972, 223). The authors attributed search failures to the use of incorrect terms or illogical strategies, failures due to the level of strategy adopted, and failure to cover all possible approaches to retrieval (Lancaster et al. 1972, 231).
Failures were not due to indexing failures or to the delegation of searches to intermediaries.
The primary reason for the success of system experts was their use of more search terms. The authors wrote that more searching aids may decrease recall failures. Recall failures were often due to overexhaustive or overspecific strategies (Lancaster et al. 1972, 237), often conducted by novice users. The authors concluded that training is needed to help users know what search terms to use, in the form of novice users observing expert users, intermediary help and instruction, print materials, online instruction and an instructional film.
Fenichel (1981) and Howard (1982) found that users with search and database experience were more effective in their searching. Howard (1982) studied DIALOG and ERIC users of five different experience levels: novice, moderately experienced with ERIC, moderately experienced without ERIC, very experienced with ERIC, and very experienced without ERIC. She conducted interviews to find out about their academic background and online experience and training. She measured effectiveness, error rate, command language used, procedural errors and dubious practices. She also measured cost effectiveness in searcher minutes, dollars per reference retrieved and direct costs. Searchers were categorized into very and moderately experienced online and ERIC searchers, and very and moderately experienced online and non-ERIC searchers. She compared ERIC system experienced users with and without domain experience. Very experienced ERIC users conducted more restricted searches that were higher in precision. Howard (1982, 323) found that they viewed the most sets, searched the most terms and spent the least time preparing before going online, achieved the highest speed scores, made fewer errors, and performed searches in close correlation with the online-experienced, non-ERIC-experienced groups. Novices took more time to search, but the difference was not significant. Very experienced non-ERIC users achieved high recall and low precision. Users with ERIC experience searched more thesaurus terms than users without ERIC experience (Howard 1982, 320). Users with no ERIC experience performed the most free-text searching. The very experienced ERIC group performed the most cost-effective searches and achieved the highest precision ratio.
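Precision and recall, the two measures used to compare the searcher groups above, can be illustrated with a short sketch. It uses the standard definitions (precision is the share of retrieved documents that are relevant; recall is the share of relevant documents that are retrieved). The function name and the document IDs are hypothetical and are not taken from Howard (1982).

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one search, given collections of document IDs."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant  # relevant documents that the search actually returned
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example: four documents are relevant to the information need.
relevant = {"d1", "d2", "d3", "d4"}

# A restricted search returns few documents, all of them relevant.
narrow = precision_recall({"d1", "d2"}, relevant)

# A broad search returns every relevant document plus four irrelevant ones.
broad = precision_recall({"d1", "d2", "d3", "d4", "d5", "d6", "d7", "d8"}, relevant)

print("narrow search:", narrow)
print("broad search:", broad)
```

In this made-up case the restricted search has high precision and lower recall, while the broad search has high recall and lower precision, which is the kind of trade-off described for the very experienced ERIC and non-ERIC groups.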
Borgman (1989) found that user experience with computers and task domains affected their searches and search outcomes. In their study of the search abilities of undergraduates, technical aptitudes affected search results, and were found to cluster.
Marchionini et al. (1993, 65) wrote that IS involves several skills that get polished with practice: procedural skills, such as eliciting problem clarification from users, determining terminology and mapping it to databases, applying search strategies in opportunistic ways, interpreting feedback from systems, and assessing relevance of items according to expressed or anticipated user needs. Marchionini et al. (1993) compared domain experts to systems experts. Domain experts were content-driven, sometimes using technical terms based on their knowledge, and formed expectations related to possible answers. Systems experts were problem-driven, used more system features, focused on documents, and browsed more than they would when using online databases. Information seekers were evaluated using questionnaires and think-aloud protocols, and interviews were conducted. Factors studied included problem definition, experience, query formulation, reflection, stopping, monitoring and system effectiveness. Many of the domain experts had system experience. They made better inferences and used acronyms. Subject experts focused on query formulation and system manipulation, and were search driven. Many used different databases. Marchionini et al. (1993, 61) found that the chance of search success depends on the information seeker’s experience with the task domain, general Internet experience and expectations of what the answer will be. Both domain and search experts used system features to narrow down searches by date and to limit searches by language. Systems experts used more special, advanced features than domain experts. These features included author nationality, thesaurus, verb tenses, field limits and a wider variety of search models and strategies. Their conclusion was that designers can improve systems by reducing responsibilities for search strategies or by using shortcuts.
In the law system evaluation, for instance, the four users were: the attorney with government contracts expertise (AE); the attorney with less government contract expertise (AN); the search expert with government contracts expertise (IE); and the search expert with little government contracts expertise (IN). Marchionini et al (1993, 56-57) found that AE and IE were analytic, confident and had workmanlike ways of searching. Users with domain and subdomain expertise formulated precise and straightforward queries. AN and IN tried to compensate for low domain knowledge with exploratory probes, varied attacks and wishful reasoning.
Lazonder (2000) conducted a study of users' World Wide Web experience and its influence on their searches for websites and for information on websites. Users were 25 second-graders, and were asked questions on Dutch literature, the domain. Novice and experienced users had equal knowledge of the domain. Factors measured were success, time, effectiveness and efficiency. Users with more experience had higher success rates. The time for search was equal among novice and experienced users. Experienced users were more likely to keep searching and home in on the desired documents. Experts were not better at browsing. The authors found that experts were better at locating needed websites, but that experts and novices had about the same success rates in locating information on websites. Lazonder (2000, 579) found that experts, in locating websites, were overall faster, more efficient and effective, had higher performance scores, and used fewer actions to reach successful results. Regarding browsing, they concluded that little training is needed; even novices pick up how to browse quickly. Efficiency was the ratio of the number of successfully completed tasks to the time taken to complete them. Effectiveness was the overall number of actions needed to complete a task.
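The two measures defined at the end of the paragraph above can be sketched as follows. This is only an illustration of the definitions as stated here; the function names, the action labels and all numbers are hypothetical and do not come from Lazonder (2000).

```python
def efficiency(tasks_completed, minutes_spent):
    """Efficiency: successfully completed tasks divided by the time spent on them."""
    if minutes_spent <= 0:
        raise ValueError("minutes_spent must be positive")
    return tasks_completed / minutes_spent


def effectiveness(actions):
    """Effectiveness: the number of actions a user needed to complete one task."""
    return len(actions)


# Hypothetical logs: the expert finishes more tasks in less time.
expert_efficiency = efficiency(tasks_completed=4, minutes_spent=20)
novice_efficiency = efficiency(tasks_completed=3, minutes_spent=30)

# Hypothetical action logs for a single task (fewer actions is better).
expert_actions = effectiveness(["query", "open result"])
novice_actions = effectiveness(["query", "open result", "back", "new query", "open result"])

print(expert_efficiency, novice_efficiency)
print(expert_actions, novice_actions)
```

Under these definitions a higher efficiency value and a lower action count both indicate better performance.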
Slone (2002) conducted a study of user searches on Internet search engines, a website, and an online catalog. She found that novices and experts exhibited similar scrolling behavior, novices used less backtracking and experts used a wider variety of mental models. She found that both novice and expert users relied on serendipity, linking and other tasks that were not cognitively overbearing. Users abandoned the task when it became too difficult. Slone (2002) found little difference in the browsing and scrolling behavior of novices and experts searching library online or Web-based catalogs.

Searching versus Browsing

When it comes to Internet IR, some users prefer to search while others prefer to browse. Often, searching involves looking for something specific, while browsing involves looking for something new and interesting. Hauck et al. (2000, 209) referenced several studies that showed that, in general, expert users prefer to search while novice users prefer to browse. Lazonder et al. (2000) found that expert users tended to search more than browse.
Choo, Detlor and Turnbull (2000) studied employees who searched for information on the Web. They categorized browsing into: directed browsing, which occurs in a systematic, focused way and is directed by a specific object or target; semi-directed browsing, which occurs in a predictive or generally purposeful way, as when the target is less definite; and undirected browsing, which occurs when there is no real goal and very little focus. Information need and purpose influenced browsing and searching.
Academic Background
Borgman (1989) studied UCLA undergraduates of different majors to see how their personality and academic background influenced IR. Borgman (1989, 238) cited previous studies that had shown that users with a math or technical background, and a high technical aptitude were better in IR. Technical aptitude referred to factors such as spatial and reasoning aptitudes and background. Previous studies had also shown that users involved with social sciences, humanities and sciences have different IR behavior. Borgman hypothesized that user’s personality and technical aptitude lead to a choice of major, which then was an intervening variable in IR. Borgman (1989, 243-244) used Carl Jung’s personality trait classifications of concrete experience versus abstract conceptualization and active experimentation versus reflective observation. Computer experience was controlled. Math and science majors were better and faster at completing initial search tasks. Borgman (1989, 246) found that engineering students had a search pattern similar to programmers’ patterns in other studies, and English majors had search patterns similar to graduate library school students in other studies. Students with the highest technical aptitudes had personality characteristics similar to skilled searchers and to programmers. Borgman (1989, 248) concluded that personality characteristics, academic background and technical aptitudes influenced IR performance. No strong link was found between personality and technical aptitudes, as between major and technical aptitudes. She concluded that a more extensive study needs to be conducted to learn exactly what behavior is influenced by each of the factors studied.
Hill (1997, 235) wrote that IS involves thinking (planning and organization), acting (browsing and searching), integrating (differentiating and monitoring), transforming (extracting), and reaching resolution (decision making and monitoring). She found that domain and system knowledge affect search process and success.
Research Questions
Research questions in this study are:
1. How do user domain, website and system experience influence search success, including time to finish?
2. How does user academic background affect search success?
METHODOLOGY
Five users were chosen. User 1 is a domain novice, and has Internet expertise mostly for personal purposes such as shopping and email. User 3 is a nurse with domain, Internet and website expertise. User 2 and User 4 are domain novices, have Internet expertise, and User 2 has used the website once. User 5 is a gastroenterologist with domain experience, and average system experience. He usually conducts textbook research for questions in health. Users 4 and 5 had no website experience.
Pre-Test Questionnaire
A pre-test questionnaire was emailed to the users to get information on their gender, age, academic and experience background, and on their search behavior. (Appendix I) The first two questions were about the gender and age of the users. The study was not specifically designed to obtain the effects of gender and age on user searching. The third question asked what degree the user holds, to get an idea of their academic background. The fifth and sixth questions, for the same aim, asked about their major of focus and extent of experience in biology. Questions 7 through 9 asked about the extent of user Internet experience, including how often they search and what type of searches (topic of interest, academic or corporate searches) they conduct. This was to get a further idea of their domain and systems experience. Question 11 asked how confident searchers were on taking the test, to see if this influenced their searching in any way. Question 12 asked if users stayed on target when searching or browsing the Internet. The purpose of this question was to get an idea of their conciseness in their Internet IS. Question 13 asked if users found pictures and graphics useful. The effect of the answers to the last three questions on search success were beyond the scope of this study, but were included in the questionnaire to get a better idea of user background. Questions 14 and 15 were asked to provide an idea of how social effects of family background and cognitive style influenced users. However, these are beyond the scope of this study, and could be used in future research to assess how cultural-social and cognitive styles affect IR.
When the questionnaire was first made up, the scope of the study was to look at more factors of user IR, including cultural, social, personality and cognitive style factors. However, the questions and design of the study did not allow for these factors to be studied. Question 14 pertained to cultural and social background factors that may influence IR. Question 15 was written to get an idea of the cognitive skills of the users.


Search Tasks
The same five questions were emailed to each user (Appendix II). Answers were to be found only on www.cdc.gov. The questions started from the most specific, finding answers to specific questions, and became broader. The last question allowed the users to search and/or browse on a topic of interest. After each question, users were asked to indicate the search time, how many sittings it took to answer the question, whether they asked a person or the computer for help, whether they quit, whether they searched or browsed, and, if both, which yielded the right answer. They were asked to note the query terms, links and URLs they used. This set of questions after each main task was given in order to assess the time and ease with which the users undertook each task, whether they searched or browsed, and how proficient they were in formulating queries.
Formulation of Questions
The questions were written in order of difficulty, the first one being the most difficult. The answer to Question 1 could be found by searching or browsing, but the right search terms were not obvious. The answer to Question 2 could be obtained by searching or browsing, with perhaps some difficulty for a novice user; browsing was then required to find the answer inside the article. Care also had to be taken in picking search terms for this question, but using the full title, especially with the authors, would yield the result. This question was a little easier than Question 1. Question 3 was easy and straightforward to answer by browsing or searching. Question 4 allowed the users to choose which facts to write down. Question 5 gave users the most freedom; they were allowed to search for what was of interest to them.
Question 1 was chosen for the following reasons: the right query terms would have to be chosen, such as “women and race and heart disease” so as not to obtain irrelevant information. If the user had chosen to browse, they would have to click on the appropriate link on the left, “Diseases and Conditions”, and then scroll to the appropriate link “Heart Disease”. Then they would have to find the tenth link on the page “Women and Heart Disease: An Atlas of Racial and Ethnic Disparities in Mortality”, and find in this PDF the appropriate information. It was assumed that some amount of domain and systems sophistication and experience would be needed for this search.
Question 2 was chosen for the following reasons: searchers had a variety of ways to search or browse for the article and information in the article. They could have chosen the journal link on the left of the website, or chosen to look up the article by title or author. Looking it up by one author does not yield an immediate result. The searchers would have to have patience if their first result did not work and look up by both authors or by authors and title. Searchers who were the most experienced in the website and domain searching probably would be better at formulating query terms for the search. Part b was included to verify that the users did find the article.
Question 3 was chosen because the results could have been found in a variety of ways by browsing or searching. It was assumed that the more experienced users would stay on the mark and find results the fastest, without getting discouraged by a lack of results at first. Information on rapid HIV testing and CLIA can easily be found by searching. Alternatively, the user can use the “Diseases and Conditions” link, then the “HIV/AIDS” link, and then the first link, “Advancing HIV Prevention: New Strategies for a Changing Epidemic”. The PDF would then be browsed for answers to parts a, b and c. To find part b by searching, an appropriate query would have to be formulated.
On Question 4, the users were able to look for themselves for three facts, and had more freedom in how to search or browse. This question was chosen to provide insight into user style – what users would do if they had freedom to search or browse and look for what they wanted on a topic – according to user experience.
Question 5 was chosen for the same reason. Even more freedom was given to users. This time they could choose what they wished to look for. The information need would be their own. The researcher wanted to see if searching or browsing for a topic of interest would make the search more successful or quicker. According to Slone (2002), searchers pursuing recreational goals would rely more on serendipity and would be likely to give up on difficult tasks. Experienced users were more likely to have mental models of how the Internet works, which aided their searches. Novices did not link heavily and generated fewer terms. This question was used to see if users less experienced in the domain, system and website would be more likely to give up on a recreational task.
Post-Test Questionnaire
A post-test questionnaire was given at the same time, to be completed after the five questions (Appendix III). The first question was asked to determine the extent of website experience. The second question was asked to determine if the environment in any way impeded their IS. Environment may have affected search times. If so, experience and academic background may not have been the only factors that influenced search times. Questions 3 to 5 were asked to see if a certain type of question was found most difficult by the users, and to assess question difficulty in general. Question 6 was asked to see if users completed the task on their own, or if they used the computer or someone for help. Questions 7 a-d were asked to get an idea of what users thought of the website. These questions were not used in the final analysis since the intention of the research was to find how user academic background and experience affect IR. Question 7e was asked to assess question difficulty in general. Questions 8 and 9 were asked to learn even more about what users thought of the website. Question 10 was written to see how the users felt, in their own words, about the ease of answering each question. These questions were not used for the final evaluation. Question 11 was asked to assess user confidence during the test, in order to see if this hampered their IS in any way. It was assumed that the less experienced users would have less confidence. Questions 12 and 13 were asked to see what users felt about the interface and website. Question 14 was asked to see if the users felt lost during the test. Question 15 was asked to determine if users usually give up on Internet IS tasks. Questions 16-20 were asked to see what users thought of the worth of answering each question in terms of time. These questions were not used for the final evaluation. Question 21 was asked to see if users strayed away from answering questions to browse or to use another website. If users strayed, their time results may not have been accurate. Question 22 was asked to see of what value answering each question was to the users. This question was not used for the final evaluation. Question 23 was asked to allow users to write their comments in their own words, in case the questions did not cover all of the pertinent facts about their IS behavior.
RESULTS AND DISCUSSION
Following is a review of the pre-test questionnaire, the questions and the post-test questionnaire.

Pre-test Questionnaire
Gender and Age
Users 2 and 5 were male. User 1 is 28 years old, User 2 is 49 years old, User 3 is 50 years old, User 4 is 40 years old and User 5 is 45 years old. There is no huge difference among ages, except for User 1, the most inexperienced user. Implications of age were beyond the scope of this study.
Academic Background
User 1 has the least amount of college education, having earned an Associate’s degree. User 2 has a Bachelor of Science in computer programming. User 3 has the most relevant academic background; she is a nurse. User 4 has a Bachelor of Arts in computer science and a Master of Science in computer information systems. User 5 has the most education measured in years; he is a gastroenterologist who served for some months as the head of gastroenterology at Northshore University Hospital in Manhasset, Long Island.
User Domain, System and Website Experience
Everyone used the Internet daily, and User 1 used it more for recreational purposes and less for professional purposes. Other people tended to use it for both. User 2 and User 4 had computer system training, yet their average search times differed by almost three minutes. User 3 had the most of all types of experience; domain, website and systems. User 1 had little domain experience, and used the Internet mostly for personal and recreational uses such as email. Users 2 and 4 had used the website once each. It may be concluded, although not definitively, that combined domain, website and Internet experience does make a difference in search times.

User Confidence
User 1 had average confidence. User 2 was “very confident”. User 3 was confident. User 4 said “not very” and User 5 said “Okay”. No correlation was found between confidence and search times, except that User 4’s low confidence may explain why she took the longest to answer some of the questions, despite her computer experience. It may also have had something to do with why User 5, who is a physician, took longer to answer some questions than User 3. Effects of confidence were inconclusive and were beyond the scope of this study; overall, confidence did not seem to affect IR.
Staying on Target, Pictures and Graphics and Family Background
Everyone agreed that they stay on target and avoid irrelevant information when browsing or searching. All but User 4 and User 5 agreed that pictures and graphics are helpful. User 5 answered “neither” and User 4 answered “disagree”. No one reported that their family affected their searching. User 2’s family has a technical background.
Questions
All users answered the questions correctly. Average times to answer questions were calculated using the first four questions. The following are the average task times:
Time (min)      User 1   User 2   User 3   User 4   User 5
Question 1      15       12.5     3        20       6
Question 2      8        4.1      2        1        5
Question 3      20       3.12     1        10       8
Question 4      15       3.54     6        3        8
Question 5      5        4        4        5        2
Average Time    14.5     6        3        8.5      7
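The averages in the table can be rechecked with a few lines of code. The sketch below recomputes each user's mean time over the first four questions from the times reported above; the variable and function names are illustrative only.

```python
# Reported task times in minutes for Questions 1 to 5, copied from the table above.
times = {
    "User 1": [15, 8, 20, 15, 5],
    "User 2": [12.5, 4.1, 3.12, 3.54, 4],
    "User 3": [3, 2, 1, 6, 4],
    "User 4": [20, 1, 10, 3, 5],
    "User 5": [6, 5, 8, 8, 2],
}


def average_first_four(task_times):
    """Mean time over Questions 1 to 4; Question 5 is excluded, as in the study."""
    first_four = task_times[:4]
    return sum(first_four) / len(first_four)


averages = {user: average_first_four(t) for user, t in times.items()}

for user, avg in averages.items():
    print(f"{user}: {avg:.2f} min")
```

The unrounded averages for User 2 and User 5 come out at roughly 5.8 and 6.75 minutes, which the table reports as 6 and 7. The gap between User 4 and User 2 is therefore a little under 2.7 minutes.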

User 3 may have been the quickest because of her combined academic background and her domain, website and Internet experience. The academic backgrounds and experience of Users 2 and 4 were similar, although User 4 did have a Master’s degree in the computer field. Yet User 4’s average search time was about 2.5 minutes slower than that of User 2. It may be that User 2 was more confident. User 2’s extensive Internet experience and confidence may have been the reasons that he took a shorter amount of time to answer than User 5. All users were successful on Question 5, and times were 5 minutes or less. Changing the nature of the question to one where users look for a topic of their choice did not seem to affect search success or time, regardless of user academic background and experience.
Everyone completed each question at one sitting and did not ask for help. Users used similar links. User 3 used the most URLs. No correlation was found between links and academic background or experience. Users for the most part did not note links used when browsing. The questionnaire may have been confusing. Some users may have confused “links” with “URLs”.
The following is a table of the search and browsing behavior of the users.
Task            User 1        User 2        User 3        User 4        User 5
Question 1      Search        Both/Browse   Search        Both/Browse   Search
Question 2      Search        Search        Search        Search        Search
Question 3a     Search        Search        Search        Search        Search
Question 3b     Browse        Browse        Browse        Browse        Browse
Question 4      Both/Search   Search        Search        Search        Search
Question 5      Both/Browse   Search        Both/Search   Browse        Search

Past studies have shown that experienced users tend to search. There was no correlation in this study. User 5 searched except for Question 3b. Other than that, all users searched more often than browsed. User 1 and User 4 browsed the most. They also took the most time to search. Their academic backgrounds and Internet experience are different. They have low domain experience and little or no website experience.
Other Questions after Each Task
No user quit or asked for help. All users answered each question in one sitting. Most users did not put down search terms. When they did, the terms were similar or the same among all users. No user provided links; they may have mistaken links for URLs. Not all users provided URLs every time. When they did, the URLs were similar or the same, except for User 3, who provided more URLs, mainly because she provided separate URLs for each step. For instance, for Question 1, she put down both: http://www.dcd.gov/node.do/id/0900f3ec8000e035 and http://www.cdc.gove/doc.do/id/0900f3ec802720b8.
Post-Test Questionnaire
User 4 and User 5 had never used www.cdc.gov before. The other users had used it once, except for User 3, who uses it often. Environment did not affect searching. No one quit or asked for help.
Question 3
User 1   2; easiest to locate
User 2   5; his choice
User 3   All easy
User 4   2; easiest to locate; used title
User 5   All easy; his own #5

For Question 3, User 2 and User 5 found Question 5 to be the easiest. User 3 thought that all were easy, as was expected. User 1 and User 4 found the second question easiest, looking for a specific title and article. These users had different backgrounds.
Question 4
User 1   #3; had to browse a lot
User 2   None
User 3   #4; question is less structured
User 4   #1; took the longest
User 5   #3b

For Question 4, it is notable, though inconclusive, that User 1 found Question 3 the hardest, stating that she had to browse a lot. This is a straightforward question, and browsing for specific definitions is neither necessary nor the easiest way to complete the task. Perhaps if she had been a more experienced Internet user in terms of science or professional activity, she would have searched differently. This question took her the longest, 20 minutes. User 3 found Question 4 hardest to answer, since it was less structured. It took her longer than any other question she answered, 6 minutes. It is surprising that she found any questions difficult. User 4 found Question 1, the most difficult question, the hardest to answer. It took her longer to answer this question than any other question. User 5 surprisingly found one question hard, the HIV question. It took him the same amount of time, 8 minutes, to answer this one as Question 4. It is surprising that User 5, a physician, and User 1, the most novice user, both found this question the hardest. For Questions 5 and 6, no one gave up on questions or asked for help from a person or the website.
Question 7
User 1   1-3 Agree; 4 Disagree; 5 Neither
User 2   All Agree
User 3   All Agree
User 4   1-4 Agree; 5 Disagree
User 5   All Agree

For Question 7, Users 2, 3 and 5 agreed that all questions were easy to answer. This was surprising because User 3 did state that Question 4 was less structured. User 1 thought Questions 1 through 3 were easy to answer and Question 4 was difficult, and marked “neither” for Question 5. User 4 thought that all but Question 5 were easy to answer. It is not possible to determine for sure whether user experience and academic background affected how users viewed the questions.
No one answered Question 8, which asked them to describe in their own words how easy they found each question. There were discrepancies between the answers to Questions 4 and 7. For instance, User 1 found test Question 3 difficult, while agreeing on Question 7 that Questions 1 through 3 were easy to answer. User 3 found test Question 4 the hardest, while answering on Question 7 that all questions were easy to answer. User 4 answered on Question 4 that test Question 1 was the hardest, while agreeing on Question 7 that test Question 1 was easy to answer. User 5 found test Question 3b hard to answer, while agreeing on Question 7 that all test questions were easy to answer.
For Question 9, all users were confident, except that User 5 wrote “Okay” and User 4 wrote that her confidence rose as she took the test. Their confidence levels were not as high as they could have been. Perhaps this is why User 5, a physician, and User 4, a computer specialist, were not as quick as User 3 and User 2 respectively.
For Question 10, no one felt lost except for User 1 once. For Question 11, on whether they would give up once Internet browsing and searching gets too tough, User 1 disagreed (unexpected), User 3 and User 2 strongly disagreed (expected) and User 4 and User 5 agreed (unexpected). No one gave up on this test. Perhaps the questions were easy or the website was easy to use. For Question 12, no one strayed or asked for help. This may mean that questions were easy for different types of users.
Limitations of the study included limited time; the absence of observations, think-aloud protocols and interviews with the users; a sample of only five users; and the fact that no user was a complete systems novice.
CONCLUSION
Findings in this study were not conclusive. Academic background and experience were not found to play a big role in IR, IS and task completion time. The exception is that User 3, the nurse with the most domain, website and Internet experience, completed the tasks more quickly and used the most search links. Terms and URLs used did not differ enough to allow conclusions about the relationship between user academic background and experience and the terms used. Searches on topics of interest for Question 5 did not seem to be affected by academic background or experience; all users found what they were looking for and found it in 5 minutes or less.
One reason that more conclusions cannot be drawn from this study may be that the website is relatively easy to use, even for those without much domain and system experience. Another reason may be that the questions were too easy. Unlike in Slone (2002), users did not abandon the task when it became too difficult. Most users did not think that any of the questions were very difficult. All of the users answered the questions correctly. User 3, the user with the highest level of domain, website and Internet experience, did complete the questions fastest.
All users had a similar amount of confidence in using the website. User 1 reported average to normal confidence; this may be related to her having the slowest average search time. Academic factors may have made a difference, since User 3 answered the quickest and User 1 the slowest. Users used similar search terms. All users found the correct answers.
FUTURE STUDY
In a future study, users will include some with domain experience and little or no website or systems experience. Fewer questions and question tasks will be asked, so that the test is less demanding. Perhaps then the users would not leave any blanks. Interviews will take place before and after the test to get a clearer idea of users’ opinions, experience and reactions. Users will be observed. Interviews and observations will give a better picture of user moves, and will address the problem of users not completing all questions. There were disparities in the post-test questionnaire among Questions 3, 4 and 7 on which questions the users viewed as the easiest or hardest. With interviews, such discrepancies could be clarified with the user. Questions on domain, website and Internet experience will offer ranges to choose from, such as “2 to 5 times per day,” to give a more accurate picture of years of academic experience and of how often a website or the Internet is used.
In future questionnaires, users will be offered age ranges to choose from, so that the question is less intrusive. Some questions, such as the one asking if questions were easy to answer, will not have a “neither” option; when checked, this response is difficult to interpret. “Links” will be changed to “URLs” in the prompts “Note the links that you used” and “Note the link(s) you used to find the answer(s).” User 1 was confused and answered only under “Please provide the URL(s) of where you found your answer(s).” This was a repetitive question. In future questions, the difference between links used during browsing and a URL will be made apparent.
In a future study, users with even less domain, website and Internet experience will be sought to produce a wider variety of results. Questions will be made tougher to answer and a less user-friendly website will be used. These changes may allow factors such as domain and system experience to weigh in more.
Future studies may include an analysis based on cognitive style, personality, and cultural and social background and factors. Personality and psychology tests would be administered and trained personnel would interpret them.


Appendix I

Pre-Test Questionnaire

Please provide us with the following information:


1. What is your gender?

2. What is your age?

3. What degree(s) do you hold?

4. How many years of college have you had?

5. What was your major or field of focus?

6. What is the extent of your experience in biology?

7. What is your extent of experience with the Internet?

8. How many times a day or week do you use the Internet?

9. What is the purpose of your use of the Internet: recreational browsing, specific searching (if so, in what topic) or email?

10. What type of website do you usually visit? (academic, library, corporate, art, etc.)

11. How confident were you about taking the test before taking it?

12. Please put an “X” in the appropriate spot:

When searching or browsing the Internet, I usually stay on target and avoid irrelevant information.

Strongly Agree Agree Neither Agree or Disagree Disagree Strongly Disagree

13. Please put an “X” in the appropriate spot:

When searching or browsing the Internet, I find pictures and graphics useful in making the search easier.

Strongly Agree Agree Neither Agree or Disagree Disagree Strongly Disagree

14. Please provide any comments about how your family, cultural and social background prepared or hindered you from becoming a confident person in biology and Internet searching, or in general.

15. Put an “X” next to the sentence that most characterizes your beliefs:
Knowledge is known, and there is a right and wrong for everything.
Most knowledge is known, and there is probably a right way to find answers.
Usually, there is certainty about knowledge.
Knowledge is contextual; it is disconnected from any absolute truth.

16. Please feel free to provide any other comments.

Appendix II

Questions

To answer these questions, you will use ONLY www.cdc.gov. Please take the time to answer each question. There are questions at the end of each section that you will need to answer. Look at them first before beginning to answer the question. If it gets too difficult, you can ask for help, or you can stop and not answer. Note this. You can do the questions in more than one sitting. Whether you answer each question all at once or in more than one sitting, please note the total time it took to answer each question. If you stop, note how long it took you before you stopped. Also note if you browsed or searched. Also note if you reformulated your query terms. Thank you.

Question 1

Search or browse to find the United States 1991-1995 statistics of what racial group of women ages 35 and up experienced the most heart disease.
If you find a graph, what other racial groups were depicted in the graph?

Answer


Time to finish _____________
How many sittings__________
Asked for help___________
Online help yes/no
By whom_________________
For what_______________
Did you stop completely? _________
Time before stopping________
Did you search or browse? _____________
If both, which yielded the successful answer? ______________
If you searched, what query terms/questions did you use, in chronological order

Note the links that you used

Note the link(s) you used to find the answer(s).


Did you read material in another language? Why/why not?
If you read material in another language, which language and what material?
Please provide the URL(s) of where you found your answer(s).

Question 2

Muin J. Khoury and George A. M. Mensah recently wrote the journal article “Genomics and the Prevention and Control of Common Chronic Diseases: Emerging Priorities for Public Health Action” for the journal Preventing Chronic Disease: Public Health Research, Practice, and Policy.

a. Please find the article by browsing or searching. Indicate if you browsed or searched. If you searched, indicate the query term for each search and the query term for the successful search.

b. Find in the article the number two fact about family disease history that makes it ideal for public health practice, and write it here: ________________________________________________________________________________________________________________________________________________

c. Did you search or scroll to find the answer?
Time to finish _____________
How many sittings__________
Asked for help___________
Online help yes/no
By whom_________________
For what_______________
Did you stop completely? _________
Time before stopping________
Did you search or browse? _____________
If both, which yielded the successful answer? ______________
If you searched, what query terms/questions did you use, in chronological order


Note the links that you used

Note the link(s) you used to find the answer(s).

Did you read material in another language? Why/why not?
If you read material in another language, which language and what material?
Please provide the URL(s) of where you found your answer(s).

Question 3

a. Find information on rapid HIV testing. Provide the URL.
________________________________________________________________________

b. When is it recommended that pregnant women first get tested for HIV?
________________________________________________________________________

c. For what does CLIA stand?

________________________________________________________________________

Time to finish _____________
How many sittings__________
Asked for help___________
Online help yes/no
By whom_________________
For what_______________
Did you stop completely? _________
Time before stopping________
Did you search or browse? _____________
If both, which yielded the successful answer? ______________
If you searched, what query terms/questions did you use, in chronological order


Note the links that you used

Note the link(s) you used to find the answer(s).


Did you read material in another language? Why/why not?
If you read material in another language, which language and what material?
Please provide the URL(s) of where you found your answer(s).


Question 4

Find three facts about cholesterol and write them down.

__________________________

__________________________

__________________________

Time to finish _____________
How many sittings__________
Asked for help___________
Online help yes/no
By whom_________________
For what_______________
Did you stop completely? _________
Time before stopping________
Did you search or browse? _____________
If both, which yielded the successful answer? ______________
If you searched, what query terms/questions did you use, in chronological order


Note the links that you used

Note the link(s) you used to find the answer(s).


Did you read material in another language? Why/why not?
If you read material in another language, which language and what material?
Please provide the URL(s) of where you found your answer(s).


Question 5

Browse or search for your topic of interest, formulate a question, and find the answer to the question.

Topic of interest ________________________________________________________________________

Question ________________________________________________________________________________________________________________________________________________

Answer
________________________________________________________________________
________________________________________________________________________
________________________________________________________________________

Time to finish _____________
How many sittings__________
Asked for help___________
Online help yes/no
By whom_________________
For what_______________
Did you stop completely? _________
Time before stopping________
Did you search or browse? _____________
If both, which yielded the successful answer? ______________
If you searched, what query terms/questions did you use, in chronological order


Note the links that you used

Note the link(s) you used to find the answer(s).


Did you read material in another language? Why/why not?
If you read material in another language, which language and what material?
Please provide the URL(s) of where you found your answer(s).


Appendix III

Post-Test Questionnaire

1. Have you used www.cdc.gov before? If you have, when, how often, and for what did you use www.cdc.gov?

2. Did your environment affect your taking the test? i.e.

a. Was there time pressure?

b. Did you think that others were watching you?

c. Did you want to impress them?

d. Did you want to impress the test-administrator?

3. Which question did you find the easiest and why?

4. Which question did you find the hardest and why?

5. Did you give up on any questions and why?

6. Did you ask for help from the website or another person for any questions? How did it or they help? Were you comfortable asking for help?

7. Please put an “X” in the appropriate spot:

a. www.cdc.gov is a website that is easy to navigate.

Strongly Agree Agree Neither Agree or Disagree Disagree Strongly Disagree

b. www.cdc.gov is a website that is easy to search.

Strongly Agree Agree Neither Agree or Disagree Disagree Strongly Disagree

c. I found my experience taking this test pleasant.

Strongly Agree Agree Neither Agree or Disagree Disagree Strongly Disagree

d. I would use this website in the future for personal use.

Strongly Agree Agree Neither Agree or Disagree Disagree Strongly Disagree

e. For each question, answer: This question was easy to answer.
Strongly Agree Agree Neither Agree or Disagree Disagree Strongly Disagree


Strongly Agree Agree Neither Agree or Disagree Disagree Strongly Disagree


Strongly Agree Agree Neither Agree or Disagree Disagree Strongly Disagree


Strongly Agree Agree Neither Agree or Disagree Disagree Strongly Disagree

8. Please put an “X” in the appropriate spot:

I found www.cdc.gov a website that clearly presents information.

Strongly Agree Agree Neither Agree or Disagree Disagree Strongly Disagree

9. Please put an “X” in the appropriate spot:

I found www.cdc.gov a complex website.

Strongly Agree Agree Neither Agree or Disagree Disagree Strongly Disagree


Please provide any additional comments.

10. In your own words, what do you think of the easiness of answering each of the test’s questions:


11. How confident were you about taking the test during and after taking it?

12. What did you think of the www.cdc.gov interface?

13. In your own words, what do you think of www.cdc.gov?

14. Did you at any time during the test feel “lost”? While answering which question(s) did you experience this?

15. Please put an “X” in the appropriate spot:

I usually give up if Internet search/browsing tasks get too tough.

Strongly Agree Agree Neither Agree or Disagree Disagree Strongly Disagree

16. Please put an “X” in the appropriate spot:

I think the time it took to answer Question 1 was worth it.

Strongly Agree Agree Neither Agree or Disagree Disagree Strongly Disagree

17. Please put an “X” in the appropriate spot:

I think the time it took to answer Question 2 was worth it.

Strongly Agree Agree Neither Agree or Disagree Disagree Strongly Disagree

18. Please put an “X” in the appropriate spot:

I think the time it took to answer Question 3 was worth it.

Strongly Agree Agree Neither Agree or Disagree Disagree Strongly Disagree

19. Please put an “X” in the appropriate spot:

I think the time it took to answer Question 4 was worth it.

Strongly Agree Agree Neither Agree or Disagree Disagree Strongly Disagree

20. Please put an “X” in the appropriate spot:

I think the time it took to answer Question 5 was worth it.

Strongly Agree Agree Neither Agree or Disagree Disagree Strongly Disagree

21. At any time during your test, did you stray from answering a question to browse the CDC website or another website? If so, when, why and for how long?

*22. Please put an “X” next to the appropriate phrase for each question:

Question 1
For me, this search was:
___of major value
___of considerable value
___of minor value
___of no value

Question 2
For me, this search was:
___of major value
___of considerable value
___of minor value
___of no value

Question 3
For me, this search was:
___of major value
___of considerable value
___of minor value
___of no value

Question 4
For me, this search was:
___of major value
___of considerable value
___of minor value
___of no value

Question 5
For me, this search was:
___of major value
___of considerable value
___of minor value
___of no value

* adapted from Lancaster 1972


23. Please feel free to provide any other comments.

Bibliography


Borgman C. L. 1989. All users of information retrieval systems are not created equal: An exploration into individual differences. Information Processing & Management 25(3):237-251.

Cooper W. S. 1968. Expected search length: A single measure of retrieval effectiveness based on the weak ordering action of retrieval systems. American Documentation January 1968:30-41.

Cooper W. S. 1976. The paradoxical role of unexamined documents in the evaluation of retrieval effectiveness. Information Processing & Management 12:367-375.

Elkerton J. and Williges R. C. 1984. Information retrieval strategies in a file-search environment. Human Factors 26:171-184.

Fenichel C. H. 1981. Online searching: Measures that discriminate among users with different types of experiences. Journal of the American Society for Information Science 32(1):23-32.

Friede A, Rosen D. H. and Reid J. A. CDC Wonder: A cooperative processing architecture for public health. Journal of the American Medical Informatics Association 1(4):303-312.

Hauck R. V., Sewell R. R., Ng T. D. and Chen H. 2001. Concept-based searching and browsing: A geoscience experiment. Journal of Information Science 27(4):199-210.

Hill, J. R. The world wide web as a tool for information retrieval: An exploratory study of users’ strategies in an open-ended system. School Library Media Quarterly Summer 1997:229-236.

Howard H. 1982. Measures that discriminate among online searchers with different training and experience. Online Review 6(4):315-326.

Lancaster F. W., Rapport R. L., and Penry J. K. 1972. Evaluating the effectiveness of an on-line, natural language retrieval system. Information Storage and Retrieval 8:223-245.

Lazonder A. W., Biemans H. J. A. and Wopereis G. J. H. 2000. Differences between novice and experienced users in searching information on the World Wide Web. Journal of the American Society for Information Science 51(6):576-581.

Lesk, M.E. and Salton, G. 1969. Relevance assessments and retrieval system evaluation. Information Storage and Retrieval 4:343-359.

Marchionini G., Dwiggins S., Katz A and Lin X. 1993. Library & Information Science Research 15:35-69.

Marchionini G. M. 1995. Information Seeking in Electronic Environments. Cambridge, United Kingdom: Cambridge University Press.

Salton Gerard. 1992. The state of retrieval system evaluation. Information Processing & Management 28(4):441-449.

Saracevic T., Kantor P., Chamis A., and Trivison D. 1988. A study of information seeking and retrieving. I. Background and methodology. Journal of the American Society for Information Science 39(3):161-176.

Saracevic T. and Kantor P. 1988. A study of information seeking and retrieving. II. Users, questions, and effectiveness. Journal of the American Society for Information Science 39(3):177-196.

Slone D. J. 2002. The influence of mental models and goals on search patterns during Web interaction. Journal of the American Society for Information Science and Technology 53(13):1152-1169.

Spink A. 2004. Multitasking information behavior and information task switching: An exploratory study. Journal of Documentation 60(4):336-351.

Swanson D. R. 1977. Information retrieval as a trial-and-error process. Library Quarterly 47(2):128-148.

Swets J. A. 1969. Effectiveness of information retrieval methods. American Documentation January 1969:72-89.

Tenopir C. 1983. Dialog's knowledge index and BRS/After Dark: Database searching on personal computers. Library Journal March 1, 1983:471-473.

Yang, C. C. 2004. Content-based image retrieval: A comparison between query by example and image browsing map approaches. Journal of the American Society for Information Science and Technology 30(3):254-267.

Banwell, L. and Graham, C. 2004.
http://www.informationr.net/ir/9-2/paper167.html.
Accessed on March 30, 2005.

Choo C. W., Detlor B. and Turnbull D. Information seeking on the Web: An integrated model of browsing and searching.
http://www.firstmonday.dk/issues/issue5_2/choo/index.html
Accessed on March 30, 2005.

Griffiths, J. R. and Brophy P. Student searching behavior in the JISC information environment
http://www.ariadne.ac.uk/issue33/edner/
Accessed on March 30, 2005

Moss, N. and Hale, G. 1999. Cognitive style and its effect on internet searching: A qualitative investigation.
http://www.leeds.ac.uk/educol/documents/000001189.htm
Accessed on March 30, 2005.

Turnbull D. 2003. New approaches for studying and building IS models: A possible hybrid approach.
http://www.ischool.utexas.edu/~donturn/research/SIGUSE2003Worksheet.html
Accessed on March 30, 2005.

Wilson T. D. Recent trends in user studies: Action research and qualitative methods
http://informationr.net/ir/5-3/paper76.html
Accessed on March 30, 2005.