A Study of Sixth Graders’ Critical Evaluation of Internet Sources
Abstract
This study was a descriptive, task-based analysis to determine how sixth-grade students approach
the cognitive task of critically evaluating Internet sources. Pairs of sixth-grade students in an
Information Literacy course evaluated four preselected Internet sites to determine their credibility and
appropriateness for two specific research scenarios. Data for analysis included written responses,
screencasts, and video of students while completing the task. Results suggest that these students tended
toward simplistic modes of evaluation in the face of increased cognitive load, though some moved
toward a more critical stance and many applied basic metacognitive strategies. The study points to the
importance of instructional approaches that teach students to flexibly apply evaluation criteria in ill-
structured environments, that teach advanced metacognitive strategies, and that instill habits of mind for
critical inquiry. Instruction that empowers students to practice healthy skepticism even in the face of
authority is also essential.
Introduction
Electronic communication is fundamental to life in the digital age. On average, Americans spend 60
hours a week online, with 42% of that time spent viewing information sources outside of social networking and
communications like email (Smith, 2010). Ninety percent of school-age students use the Internet
(Greenhow, Robelia, & Hughes, 2009). School districts are replacing traditional textbooks with digital
devices and libraries are exchanging books for digital texts (Christensen, Johnson, & Horn, 2008; Levy,
2007; Postal, 2011; Reynolds, 2011; Williams, 2006). An executive summary of the National
Educational Technology Plan recommends the integration of technology in all areas of education,
including “learning resources” and “technology-based content” (United States Department of Education,
2010). Technology, information, teaching, and learning are deeply intertwined.
Meanwhile, the speed and ease with which we access vast quantities of ever-changing
information have exploded (The Economist, 2010). Traditional literacy empowered people through
access to information not otherwise at their disposal; literacy in the digital age requires that we cull from an
endless data stream the information that serves us toward personal fulfillment and productive social
participation. While literacy once solved the problem of too little information, it must now solve the
problem of too much. In light of such abundance, a capacity to sort the “wheat from the chaff” of online
information sources is an essential 21st century skill. I question whether my students possess that skill.
What is more, students’ failure to critically evaluate Internet sources appears not to be a local phenomenon:
many researchers have observed that students often overlook issues of authorship, bias, or authority when
selecting Internet sources (Chen, 2003; Coombes, 2008; Hirsh, 1999; Kuiper, Volman, & Terwel, 2005).
Purpose
Given the prevalence of digital communications and the certain shift from traditional print to
digital information sources in both libraries and classrooms, educational systems need to support the
development of students’ critical thinking in digital environments. Specifically, students should learn to
apply critical evaluative processes to determine the credibility and appropriateness of Internet resources
for their information needs. A prerequisite to the design of instructional lessons with this outcome is a
clear understanding of what our students currently do when they assess online information. This study
aims to address that need—to examine how sixth grade students determine the credibility and
appropriateness of the information they encounter on the Internet. Because eleven-year-olds are on the
verge of thinking critically about many things they encounter, sixth grade is an appropriate time to
introduce this skill.
Related Research
The extent to which young people possess effective critical evaluation skills for the Internet
environment has been debated. On the one hand, students today are considered “digital natives” for
whom a technology infused life is natural and intuitive, while those born before the Internet are “digital
immigrants” who must apply old world skills to a strange new environment (Prensky, 2001). Many
empirical studies suggest, however, that digital natives are not as critically literate as the characterization
depicts. Coombes (2008) likens today’s youth to “digital refugees” whose outward confidence in
adopting new technologies for personal needs belies their impatience with critically reading search
results, their assumption that what they cannot find quickly does not exist, and (most relevant to this
discussion) their equating of relevance with authority. In fact, Bilal (2001), Griffiths and Brophy
(2005), Hirsh (1999), Kuiper et al. (2005), and Sutherland-Smith (2002), among others, have all
documented students’ failure to question the authority of Internet resources. Hirsh (1999) found that
students showed little concern for authority and authorship while doing research and accepted the
information they found without question, while Kim and Sin (2011) and Walraven, Brandgruwel, and
Boshuizen (2009) found students evaluated web sites superficially, relying on surface features or
branding. Griffiths and Brophy (2005) found that college students looked only at the first page of search
engine results and were “satisfied that these initial ten or so results (were) good enough” (p. 551).
Fallows (2005, in Coombes, 2008) found that students knew little about how search engines work, yet
were confident in using them; they “stop searching once they think they have found an answer and have
a tendency to rely on single sources” (p. 3). Several studies have found evidence of shallow reading and
minimal attention to online text, two practices that might logically impede critical evaluation of
information. A study of New Zealand students found little evidence of students going beyond basic
fact-finding (Ladbrook & Probert, 2011), and others found students “bouncing” from site to site without
reading any one in depth (Bell, 2010; Nicholas, Huntington, Williams, & Dobrowolski, 2004). Kuiper
et al. (2005) concluded that, in general, student Internet use resulted in “insufficient knowledge,
understanding, and insight” (p. 309) and recommended “specific attention … be paid to learning to
assess the relevance and reliability of information” (p. 309).
Yet, some recent evidence suggests that students are becoming more critically literate as
technology continues to infuse their lives. One comprehensive investigation of students’ Internet
evaluation skills was a MacArthur Foundation study conducted by Flanagin and Metzger (2010). In
their extensive survey of 2,747 U.S. children between the ages of eleven and eighteen, students showed
considerable skill in assessing Internet sources:
(Students) demonstrated an understanding of the potential negative consequences of believing
false information online, a tendency to question information that comes from deceptive sources
like hoax Web sites, the ability to differentiate between one-sided and two-sided information
presentations, general feelings of distrust toward strangers on the Internet, and the inclination to
put more effort into assessing the credibility of highly consequential information (e.g., health
information) than less consequential information (e.g., entertainment information). (pp. 105-106)
Although I hesitate to describe survey data as “demonstrating” a skill, the results are encouraging and
were corroborated by Head and Eisenberg (2010), whose survey of college students found currency and
author credentials among the most important criteria used for evaluating sites for academic research.
Likewise, Pareira (2009) found college students showed greater concern for textual than for visual cues
while evaluating sources, and Nicolaidou et al. (2011) found students showed a general distrust of
Internet sources. In Flanagin and Metzger (2010) many students stated the importance of credibility and
reported that they found analytical processes more effective than heuristic “hasty and feeling-based”
evaluation. Yet in practice they did not always base evaluations on the analytical processes they
claimed were preferential. A tendency to knowingly exchange source quality for convenience, known in
the research as satisficing, has been noted by several researchers and is one possible explanation
(Hargittai, Fullerton, Menchen-Trevino, & Thomas, 2010; Kim & Sin, 2011; Mothe & Shut, 2011). In
any case, the inconsistency between students’ words and actions raises the question of whether they failed to
critically evaluate because they chose not to or because they were unable to do so.
It may also be that experience prompts closer evaluation. More experienced Internet users were
more likely to use analytic strategies to evaluate credibility and those who reported having a negative
experience as a result of false Internet information (or hearing of someone else having a negative
experience) were more concerned about Internet credibility (Flanagin & Metzger, 2010). These
students used “more cognitively demanding tools” in their assessment of credibility (p. 108). Coiro and
Dobler (2007) also found that students who spent more time online were more critical of sources. Both
suggest that experience with the Internet leads to more careful and purposeful evaluation.
Theoretical Framework
Two theoretical bodies of literature informed this study. The first of these was the study of new
literacies. While traditional definitions of literacy emphasize the decoding and comprehension of
language in its written, alphabetic form, new literacies expand the definition to include the cognitive
processes of communication in a wide variety of formats, both written and visual (Leu, Kinzer, Coiro, &
Cammack, 2004). The comprehension of messages in multimedia formats requires linguistic, semiotic,
and representational assimilations (Leu et al., 2004).
In addition, because textual contexts are more frequently digital, and because digital formats are
by nature highly mutable, new literacies are by definition deictic. In other words, because they are
dependent upon the media through which they work, literacy activities change constantly, and our
definition of literacy must be open to change as well. Texts in this landscape become highly situative,
and their understanding requires repeated contextualization (Coiro et al., 2008; Leu et al., 2004). Coiro
et al. (2008) posited that
Literacy is no longer a static construct from the standpoint of its defining technology for the past
500 years; it has now come to mean a rapid and continuous process of change in the ways in
which we read, write, view, listen, compose, and communicate information. (p. 5)
This changing nature of literacy also requires a new kind of engagement. Mackey and Jacobson
(2011) argued for a new literacies approach that is “grounded in the idea that emerging technologies are
inherently different from print and require active engagement with multiple information formats through
different media modalities” (p. 68). While new literacies share with traditional literacies the processes
of decoding, word recognition, and comprehension, they make additional demands that complicate
reading (Coiro, 2003; Coiro & Dobler, 2007; Duke, Schmar-Dobler, & Zhang, 2006). Hartman,
Morsink, and Zheng (2010) posited that one complication resides in the “multiple plurals” (p. 140) of
online texts; that is, the various elements that combine to establish meaning—for example, reader,
author, task, context, and so forth—are in themselves plural and continually shifting, and therefore
confound the act of constructing meaning online exponentially. The authors suggest that a reader must
integrate three types of knowledge in comprehending online text that are far less influential while
reading offline text. These include knowledge of identity—knowing who wrote a text and how authors
“construct, represent, and project online identities” (p. 146); knowledge of location—knowing how to
“orient oneself in a website” and “in cyberspace” (p. 148); and knowledge of one’s own goal—knowing
why one is reading and remaining focused on that goal. The act of evaluation presents itself in each of
these respectively as assessment of an author’s trustworthiness, of a site’s effectiveness in helping to
orient the reader, and of a site’s match to one’s goals for reading. So, for example, a reader attends to
the identity of an author; the author’s interests, intent, and agenda; his intended audience; his ability to
utilize rhetorical and technological means to his end; and his treatment of the content in relation to
others’ treatment of it—all of these place the text within a broader map of relationships through which
its value for the reader, in his own multi-faceted context, must be defined. This process might be
likened to a textual GPS (TPS?), a kind of “text positioning” in the global-digital landscape that
provides a route to the determination of a text’s value.
In this vein, Leu et al. (2004) asserted that “Multiple, critical literacies populate the new
literacies of the Internet, requiring new skills, strategies, and insights to successfully exploit the rapidly
changing information and media technologies continuously emerging in our world” (p. 1596).
According to Hartman et al. (2010), these skills reside largely in the realm of
metacognition. Critical evaluation becomes one important aspect of applying metacognitive skills in the
online environment, requiring active engagement, the regulation and application of strategies for
comprehending online text, and attention to the affordances and constraints of the technology in which a
text resides.
In addition, the theory of new literacies is inclusive of several alternative perspectives for
approaching critical evaluation. These include views of critical literacy as extensions of media literacy,
grounded in the study of film, television, advertising, and other visual media (Daley, 2003); and as a
sociocultural force that develops social consciousness and empowers individuals in democracies
(Alvermann, 2008; Elmborg, 2006; Fabos, 2009). Leu et al. (2004) and Coiro et al. (2008) defined new
literacies broadly to include the influence of these perspectives. By grounding this study in new
literacies I consider their views important contributions to my own.
A second theory informing this study is Cognitive Flexibility Theory (CFT), posited by Spiro,
Feltovich, Jacobson and Coulson (1992), which examines the difficulty of knowledge transfer from
structured to ill-structured environments. Recognizing that oversimplification often leads to a
“reductive bias” (p. 61) that makes learning transfer problematic, the authors argue that cognitive
flexibility is required for successful knowledge transfer. It allows for the application of knowledge in
varied and irregular contexts, and equips one to reassemble existing knowledge in flexible ways that
adapt to those unique contexts. Since the Internet is by nature nonlinear, complex, and ill-structured, it
presents the very challenges that cognitive flexibility equips the reader to meet. Readers without
cognitive flexibility face difficulty in the transferal of reading skills from traditional print environments
to online settings, and in developing and applying new literacy skills. The evaluation of Internet
sources, therefore, requires cognitive flexibility.
In this report the terms critical evaluation and Internet evaluation are used interchangeably to refer
to cognitive skills employed by students to determine the credibility and relevance of a given site for a
specifically designated purpose. Those skills involve analysis and inquiry regarding authorship,
topicality, usability, currency, organization, attractiveness, accuracy, or any other criterion by which
students decide whether a site is or is not a good choice. Which criteria students employ while making
such choices is the focus of this study: How do students make judgments about the appropriateness and
credibility of Internet sources?
Finally, I chose to focus on the evaluation of Internet sites without the influence of pedagogical
intervention. Before evaluation skills can effectively be taught, we should know the extent to which our
students already practice them. Knowing “where they are” is a necessary prerequisite to meeting them
there with appropriate instruction.
Method
This study was a descriptive, task-based analysis to determine how sixth grade students
approached the cognitive task of critically evaluating Internet sources. I designed an evaluation task in
which students worked in pairs to critically evaluate four pre-selected Internet sites to determine their
credibility and appropriateness for two specific school research scenarios. Students received written
explanations of the purpose of each scenario and were directed to the sites via links on the school media
center web site. Students were instructed to work through the record sheet for each site, first examining
the site, then discussing the strengths and weaknesses of the site for the purpose explained in the
scenario, and then recording the strengths and weaknesses of the site for that purpose. After listing
strengths and weaknesses, the students rated each site on a 5-star scale and wrote a one-sentence
explanation for their rating. They completed this process for each of the sites and then answered four
open response questions comparing the sites. Students were free to click anywhere on the sites as they
conducted their evaluations and were free to move back and forth between the sites in their final
comparisons. They were encouraged to discuss their opinions openly with their partners. While one of
the students navigated the sites using the computer, the other recorded answers.
I designed the evaluation task to reveal students’ thinking about Internet source credibility as
authentically as possible. To this end, I refrained from providing Internet evaluation checklists or other
evaluation forms for students to utilize. Walraven et al. (2009) found that students articulate evaluation
criteria more successfully than they use them, explaining this phenomenon with reasons related to “time
pressures, motivation, and convenience” (p. 244). If students choose not to apply criteria they can
articulate, it is unlikely they will apply extensive checklists in authentic situations. Meola (2004) agreed,
suggesting that students “do not want to do any more work than necessary” (p. 334) and for this reason
checklists may be unrealistic. In addition, Meola (2004) and Ostenson (2009) both suggested that
checklists are misleadingly positivistic and may discourage critical thinking by creating the impression
that evaluation is a simple, objective process, an example of the reductive bias Spiro et al. (1992) caution
against. In addition, at least one recent study showed that such lists do influence student thinking in
designed studies and therefore obscure or transform the cognitive processes that would authentically
take place without them (Gerjets, Kammerer, & Werner, 2010). Because my interest was in learning
how students judge Internet sources in settings in which such checklists are not available or not likely to
be used, I did not include scaffolds like these in the study.
Site and Participants
As the media specialist in a middle school comprising grades six through eight, I taught an
Information Literacy course to sixth graders for two class periods each day. All sixth graders
(approximately 220 students) rotated through this seven-week course. Although my choice of sixth
graders for this study was in part a matter of convenience, research shows adolescents are likely to
accept faulty reasoning or weak evidence, especially when they are in general agreement with an
opinion (Steinberg, 2005). This makes early adolescence a logical time to introduce critical evaluation
skills. Flanagin and Metzger (2010) found that, among 11- to 18-year-olds, the youngest students (sixth
graders) had the greatest difficulty recognizing false information on the Web. Therefore, to get a sense
of what cognitive discrepancies or alternative reasoning might underlie students’ lack of quality
discernment, I chose sixth graders in the hope that they would provide greater opportunity to observe
these processes. Students in the middle grades also experience considerable physical, intellectual,
social, and emotional transformation. It is logical, then, to consider this an important time for the
teaching and acquisition of critical evaluation skills.
Our district combines all of its sixth graders in one middle school, but most of these
students attended one of three elementary schools in the district prior to their arrival in our
building. The district is predominately white, middle-class and a mix of rural-suburban, with an average
family income of approximately $42,000 in 2009. The rate of students qualifying for free and reduced
lunch was 28% and the racial make-up of the district was 87% white, 6% African-American, 2% Asian-
American, 4% Hispanic, and 1% Native American at the time of the study.
This convenience sample of 42 male and 23 female students comprised 29 male and 15
female students in the third month of their sixth grade year, and 13 male and eight female students in the
fifth month of their sixth grade year. Three male students were special education students with learning
disabilities in reading, and one male student received Title I services in reading remediation. The
sample included three sections of the course Information Literacy, which I taught each afternoon in two
consecutive class periods. Because our district has no elementary curriculum in information literacy,
this course presented the first opportunity to deliver a consistent skill set in these areas to all students
within a grade level. Sixth graders enter the middle school with varying levels of competence in these
areas, and I have a strong interest in gauging their competence at the beginning of the course to
determine relevant goals for their growth.
This study was grounded in my own practice and involved students assigned to my course; it is
therefore not generalizable to all sixth graders. However, the total sample of 65 students did provide
sufficient variation to reveal the thinking of a group of sixth graders in a fairly typical American middle
class community as they evaluated Internet sources for school research. It may, therefore, be useful to
researchers and practitioners in similar cultural and educational settings where students conduct research
online for school assignments.
Internet Site Selection
Four Internet sites were included in the evaluation task. I piloted a total of eight sites related to
four different scenarios to determine which sites were appropriately accessible to sixth graders and how
much time their evaluation required. I did this with two classes of students at the beginning of the
school year. Based on pilot results I chose two pairs of sites for the final evaluation task. In this study
the term Internet sources refers to several types of Web sites available on the Internet. Because of my
interest in examining the students’ cognitive processes while they encountered a variety of Internet
source types and problems, I sought a variety of sites for the task. I selected sites based on the following
criteria: (a) type of site—I included sites students would be likely to encounter in a typical search for
academic purposes and for extracurricular purposes; (b) type of problem—I sought sites problematic for
evaluation in different ways including problems with authorship, currency, relevance to assignment,
consistency of information, bias, and commercialism; and (c) suitability—I balanced the two criteria above
with concerns for appropriateness of subject content, reading level, and student interest. Table 1 lists the sites with
explanation of the unique evaluation issues each presented.
Table 1
Internet Sites Included in Evaluation Task

Task 1, Site 1: All About Explorers: Samuel de Champlain
(http://allaboutexplorers.com/explorers/champlain/)
Type of site: Hoax site made by teachers to teach students Internet evaluation skills
Relevant evaluation issues: Authority of authors; reliability of information

Task 1, Site 2: Wikipedia—Samuel de Champlain
(http://en.wikipedia.org/wiki/Samuel_de_Champlain)
Type of site: Open-source encyclopedia
Relevant evaluation issues: Ability of any reader to edit; appropriateness of text for sixth grade reading level

Task 2, Site 1: Fresh Healthy Vending
(http://www.freshhealthyvending.com/healthy-vending/fast-food-meals-for-kids-worsen-obesity-in-america/)
Type of site: Commercial site for company franchising healthy food vending machines
Relevant evaluation issues: Authority of authors; possible bias; purpose is commercial rather than informational

Task 2, Site 2: Let’s Move!
(http://www.letsmove.gov/)
Type of site: Government informational site
Relevant evaluation issues: Authority of authors; site affiliation
Each group of students examined two scenarios with two sites for each. One scenario placed
student research in the context of a social studies class; the other in the context of a student council
advising a principal. Both described the context of the information need, the specific information
sought, and the audience for the assignment. Because I was interested in examining how student
evaluation is influenced by contextual factors including purpose, goals, and audience, it was important
to include detail in this regard. Students read and examined the sites together, discussed them, and then
listed the qualities supporting or discouraging use of each site (“strengths” and “weaknesses”). Students then
rated the sites for use on the project using a 5-star scale. After evaluating each pair of sites, students
responded to four questions in open response format comparing their ratings of the two sites. Appendix
A contains student instructions and task completion forms.
The decision to pre-select sites for evaluation instead of having students conduct their own
searches was carefully deliberated. Past studies have shown evaluation is an important element of the
search process (Coiro & Dobler, 2007; Leu et al., 2004). However, Cho (2011) differentiated between
anticipatory evaluation strategies used during search and selection, and confirmatory evaluation
strategies “based on an understanding of both internal and external features of texts” that occur once a
site is selected (p. 322). Anticipatory evaluation is frequently based on surface features (Cho, 2011)
while confirmatory evaluation involves closer reading (Goldman et al., in press; Kiili, Laurinen, &
Marttunen, 2007). Because search difficulties consume cognition that might otherwise be used for
evaluation (Cho, 2011) and monopolize students’ time (Kiili, Laurinen, & Marttunen, 2007; Walraven,
Brandgruwel, & Boshuizen, 2009), eliminating the distraction of search might shed light on
confirmatory evaluation, which would more likely include the practices of deep reading, the application
of cognitive flexibility to different sites, and the construction of knowledge from textual and contextual
clues. In addition, in studies by Walraven et al. (2009) and Kiili et al. (2007) in which students made
their own site selections, students often rejected sites silently and without justification, thereby
obscuring the criteria they utilized in doing so. Although I recognized that students in an inauthentic
task might attend to evaluation more closely than they would in a more authentic situation, it was not my
intention to measure the amount of time spent evaluating sites, but instead to examine the criteria
students could and did apply when given adequate time to evaluate. In addition, choosing specific sites
provided the opportunity for a finely tuned examination of specific evaluation criteria that were—or
were not—applied and to examine and compare these in the context of specific site genres. For
example, in the first scenario students evaluated sites presenting information in a traditional
encyclopedic format for a traditional school assignment, but one was a spoof site and the other an open
source site. These choices lent themselves to different questions of authorship, allowing for examination
of the students’ approach to an important issue affecting a site’s general trustworthiness. In the second
scenario students evaluated sites presenting information in blog format; the first was corporate-
sponsored and the second government-sponsored. In the former the corporate sponsor was fairly
obscure (a vending machine company), while in the second the sponsor was high profile (Michelle
Obama’s Let’s Move! campaign). I hoped this comparison would encourage discussion of
trustworthiness with regard to author purpose, sponsorship, and site format.
Reading Measure
Previous research indicates that offline reading ability is related to online reading ability, and by
extension, to critical evaluation as an element of new literacy (Coiro & Dobler, 2007; Duke & Zhang,
2008; Ostenson, 2009). As a measure of offline reading performance I used a state standardized test of
reading comprehension. The Michigan Educational Assessment Program (MEAP), though admittedly a
limited view of reading ability, is considered a fairly dependable measure of offline comprehension
allowing for the consistent comparison of scores between students and across classes statewide. It was
also readily available. The MEAP gives each student a scaled score and places the student in one
of four proficiency categories: highly proficient (1), proficient (2), partially proficient (3), or not
proficient (4). For the purpose of this study these student proficiency ratings were used to divide
students into four offline comprehension proficiency groups. Within those proficiency groups I paired
each student with another whose score was as close to his or her own as possible. Reading ability does
not in itself equate with critical evaluation skill; however, since decoding and comprehension are to
some extent required for evaluation, the absence of these skills could prevent or inhibit effective
evaluation of sources. Likewise, students who excel as readers may be better equipped to make
inferences that contribute to effective evaluation. In dividing the students into MEAP level pairs, I
hoped to highlight behaviors that would be more or less prevalent at different reading levels. If patterns
existed, these might inform the design of instruction to address student needs.
Student Pairing
Students completed this task in pairs while sharing a laptop computer with Internet access. I chose
to pair students to encourage the verbalization of criteria they used in judging the quality of Internet
sources. Each pair conducted the task together, was encouraged to share both conflicting and common
opinions regarding the task, and was required to come to consensus in their answers. Since pairs
completed one set of responses together, verbalization of their reasoning would be likely.
Reinking, Malloy, Rogers, and Robbins (2007) found that the grouping of students influenced the
extent to which they exchanged ideas and strategies for completing a common task. In an effort to
control confounding peer influences, I sought the input of the core sixth grade teachers when creating
pairs. Specifically, I consulted the language arts teachers who were familiar enough with the students by
the time of the study (November) to know their personalities and friendships. In addition, I applied the
following criteria while pairing students, in order of importance:
1. MEAP score. Wherever possible, the two students in a pair shared the same proficiency rating
in MEAP reading comprehension, and each was matched with a partner whose scaled score was as
close to his or her own as possible. This allowed for examination of the relationship between MEAP
reading comprehension proficiency and critical evaluation. Additional goals were to minimize
feelings of inadequacy, the monopolization of conversation by one person, the suppression of
vocalized thoughts by one person, or the influence of a more able reader’s approach to a text
on a less able reader.
2. Ability of the students to communicate effectively with one another. Because the collection of
data regarding students’ cognitive processes was dependent on their verbal articulation, it was
important the two students were generally friendly toward one another. Therefore, I avoided
creating pairs with personality conflicts or from competing social cliques in favor of creating
amicable pairs.
Although it was not possible to balance pairs perfectly, I made every effort to avoid pairs with
confounding issues of competition, dominance, or submission.
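To make the pairing logic concrete, the sketch below (illustrative Python, not an artifact of the study; the record format and field names are hypothetical) pairs students within MEAP proficiency levels by sorting on scaled score and pairing adjacent students. The actual pairing also incorporated teacher judgments about compatibility that no script can capture.

    from collections import defaultdict

    def pair_by_meap(students):
        """Pair students within MEAP proficiency levels by closest scaled score.

        `students` is a list of (name, proficiency_level, scaled_score)
        tuples; these fields are hypothetical stand-ins for the study's
        records.
        """
        groups = defaultdict(list)
        for name, level, score in students:
            groups[level].append((score, name))

        pairs, leftovers = [], []
        for level in sorted(groups):
            members = sorted(groups[level])  # ascending by scaled score
            # Pairing adjacent students minimizes within-pair score gaps.
            for i in range(0, len(members) - 1, 2):
                pairs.append((members[i][1], members[i + 1][1]))
            if len(members) % 2:
                # Leftover student from an odd-sized group is matched by
                # hand, at most one proficiency level apart.
                leftovers.append(members[-1][1])
        return pairs, leftovers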
Procedure
Students completed the task in the library media center, a large space allowing students to spread
out but also allowing for general oversight. No other students were present during task administration.
Students completed the task within the first five days of their Information Literacy class during two
regular periods between 12:00 and 12:50 p.m. or between 12:55 and 1:45 p.m. The first two classes
began the course in November, the third class in January. Students were allowed 30 minutes to
complete the first part of the task (Day 1) and 30 minutes to complete the second part of the task (Day
2), but three pairs were given up to ten additional minutes because they needed more time to finish.
Students in each pair were assigned one of two jobs: computer operator or recorder. The operator
managed computer movements during the task; the recorder wrote the pair’s answers with paper and
pencil. I sought the input of classroom teachers to determine if one of the students in each pair was
especially skilled or comfortable using a computer. If so, I assigned that student to recording rather than
computer operating. This encouraged both students to engage in the task by discouraging the computer
operator from navigating the task independently or without verbalizing his or her thoughts. Because
sites were preselected, the task required minimal navigation skills. I decided, therefore, that the
possibility of a less skilled computer operator confounding results was less likely than the possibility of
a skilled operator completing the task independently and without input.
Data Collection
Four types of data provided evidence for the research. First, each student pair completed one set
of written response sheets (see Appendix A). These included directions, an explanation of the research
scenario, a chart for listing both strengths and weaknesses for each of the two sites, a place for rating
each site and writing a one-sentence explanation of the rating, and a final page with questions asking
students to compare the two sites. Second, recorded screencasts produced a running record of the
students’ Internet navigation with the cursor viewable. Third, the screencasts recorded audio of the
students’ conversations during the task. Finally, the webcam recorded the faces of the students as they
worked. An online screencast recorder, Screencast-O-Matic, was used for screencast, audio, and webcam
recordings. Twelve computer stations were set up in the media center and adjoining offices with as
much distance between them as was practical to reduce background noise and prevent pairs from
influencing each other during the task while still allowing for oversight. Students completed the task in
two consecutive class days.
Two types of problems occurred during data collection. Although pairs were predetermined,
several were reconfigured ad hoc due to student absences. This occurred for both tasks and in every
class, resulting in a total of 65 participating students being configured into 35 pairs over the course of
the study. In some cases pairs evaluating the first two sites were not identical to those evaluating the
second two sites because pairs were reconfigured due to absences on the second day of the task. In all,
there were 25 pairs remaining consistent through both tasks and ten pairs in which the members
evaluated just one of the two sets of sites. Because of this complication, a few pairs contained students
with different MEAP score levels, though students in a pair never differed by more than one level. In
those cases I identified the group MEAP level as the higher of the two, since the task was shared and
therefore the reading skill applied by the group would likely equal the skill of the highest level reader.
Video footage of the students completing the task corroborated this choice.
A second problem arose in the collection of audio and video data. Although the Screencast-O-
Matic program was opened and set to record when the students arrived, several groups inadvertently
stopped the recording or closed it before video could be saved. In other sessions the students mistakenly
turned off the computer microphones so no audio was captured. In three sessions the Screencast-O-
Matic program stopped recording inset video without warning and in two cases it completely closed
mid-recording. In all, the 35 pairs conducted 60 comparative site evaluations, each recorded separately.
Written responses were collected for all of these, but 17 videos were lost or unusable. This left 43
videos for analysis of 30 successfully recorded pairs.
Analysis of Written Data
The combination of data in varying formats provided the following datasets for analysis: written
responses to task questions, screencasts of students’ online navigation, audio of conversations between
students while completing the task, and video of students as they interacted. In groups with complete
datasets, triangulation provided varying perspectives from which to examine the students’ critical
evaluation processes.
In analyzing the datasets, written responses were the primary focus of investigation, as these
provided a view of all students’ evaluation criteria. Therefore, I analyzed all written data before viewing
the videos. Then, after completing the video analysis described below, I revised codes where the video
data justified doing so. In only six instances were written codes changed based on insight gained by
watching video, and these revisions were retained in the written data. Because students recorded site
strengths and weaknesses in chart form, with each box containing a single strength or weakness, these
were considered independent units of analysis. Responses to other questions were parsed according to
units of meaning. So, if a student summarized his site rating in a single sentence by stating, “The site is
nice looking and has tons of information,” the compound sentence was divided into two units.
student wrote a sentence containing a list, I parsed the list into units of meaning separated by commas.
If a student conveyed a single idea in each sentence of a response, the entire sentence was considered
one unit of analysis. Complex sentences were divided into two units of analysis if each clause identified
a separate evaluation criterion, but were considered single units if clauses further explicated a unit of
meaning already expressed in the sentence.
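As a rough illustration of these parsing rules, the following sketch (minimal Python; the actual parsing was done by hand and applied judgments, such as recognizing explicating clauses, that these simple rules cannot reproduce) splits a response into candidate units at list commas and coordinating conjunctions:

    import re

    def parse_units(response):
        """Naively split a written response into candidate units of meaning.

        Splits at list commas and at the conjunctions 'and'/'but'; hand
        coding kept clauses together when one merely explicated another,
        which this sketch cannot detect.
        """
        parts = re.split(r",|\band\b|\bbut\b", response)
        return [p.strip() for p in parts if p.strip()]

    # The compound sentence from the example above yields two units:
    print(parse_units("The site is nice looking and has tons of information"))
    # ['The site is nice looking', 'has tons of information']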
I began analysis by reading through all responses and making general notes regarding common
themes. I followed the general principles of grounded theory articulated by Glaser and Strauss (in
Rollag, 1998) with attention to both inductive and deductive processes. In this tradition the researcher
may approach the data with a “best guess” coding system, but the final codes emerge organically. Using
this method, I began by constructing a preliminary list of codes from those used by Kiili et al. (2007),
Walraven et al. (2009), and Ostenson (2009), and grouped these into general categories commonly
found in Internet evaluation studies. In a preliminary pass of the written data I assigned a code to any
unit of analysis appropriately described by a code on this list. When a data unit presented content not
represented by the existing list but relating to the critical evaluation process, I created and applied a new
code. Because my goal was to examine what criteria students actually applied in their evaluations, I
allowed additional codes to emerge from the data. Categories emerged through constant comparative
analysis and responses were attributed to existing codes until the need for new codes was exhausted.
When all units were coded, I compared codes across groups to fine-tune and verify codes and sub-codes.
This process was repeated three more times, until I was confident codes had been applied consistently
within and across pairs. Finally, I grouped codes into three general categories: Engagement,
Usefulness, and Trustworthiness. Most student responses could be ascribed to one of these, though two
additional categories were required: Social Appropriateness and General Value Judgments (evaluative statements too general to categorize).
To assess the reliability of coding, a second coder with significant experience in online literacy
received three hours of training during which the coding system was explained and a single group was
randomly selected and coded by the second coder. Differences were discussed until agreement in the
application of codes was achieved. The second coder then coded written data from eight student pairs
randomly selected from the total 35 pairs. Inter-rater agreement was 84%.
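Percent agreement of this kind is simply the share of units to which both coders assigned the same code. The sketch below (illustrative Python with hypothetical code labels; the study reports raw percent agreement rather than a chance-corrected statistic such as Cohen's kappa) shows the calculation:

    def percent_agreement(codes_a, codes_b):
        """Proportion of units assigned identical codes by two coders."""
        if len(codes_a) != len(codes_b):
            raise ValueError("Both coders must code the same units")
        matches = sum(a == b for a, b in zip(codes_a, codes_b))
        return matches / len(codes_a)

    # Hypothetical example: agreement on 21 of 25 units yields 84%.
    coder_1 = ["pictures"] * 21 + ["info_quantity"] * 4
    coder_2 = ["pictures"] * 21 + ["authorship"] * 4
    print(f"{percent_agreement(coder_1, coder_2):.0%}")  # 84%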
Video Data Analysis
After coding and compiling written responses, I viewed videos for the purposes of (a) reinforcing
codes in the written data; (b) revising codes in the written data; (c) adding and coding evaluative
comments that were spoken but not recorded in writing; and (d) noting interesting or unexpected
behaviors related to the evaluation process. I did not transcribe the videos in their entirety. Instead, I
transcribed only comments that fell into one of the four categories above. Comments were parsed into
units of analysis in the same way written comments were parsed. However, an ongoing verbal exchange
of several turns was counted as a single unit as long as the evaluation criteria discussed remained the
same throughout the conversation—in this way, a criterion discussed by the students would be noted as
a criterion they considered, but would not be over-counted simply because they were “thinking aloud.”
If a comment reinforced an existing code—that is, it clearly was spoken in the context of writing a
specific response—I transcribed the comment in a column adjacent to the original written comment to
explicate the initial code but did not code the transcribed addition. This was a common occurrence. If a
comment prompted the revision of a code in the written data—that is, the discussion surrounding the
students’ written response indicated the initial code was inaccurate—I transcribed the comment in a
column adjacent to the original written comment and revised the initial code to reflect the added
information. This was an uncommon occurrence; as noted above, just six original codes were revised in
the entire data set based on video data. If an evaluative comment on the video was not included in the
written responses—for example, when an evaluative statement was made by one student, was followed
by a conversation about whether to include it on the written form, and the final consensus was not to
include it—I transcribed these additional comments, added them to the written dataset as separate units
to be analyzed, and coded them. In other cases the students discussed evaluation criteria but did not
record them, presumably because they either forgot to write or felt they had already written enough.
These additions accounted for some differences between the paper-only results and the paper-and-video
results. In groups where students discussed the sites at length, additions were considerable; in groups
where students discussed little, additions were minimal. To determine the rates at which students used
different criteria, I counted the frequency of pairs using each code and divided these by the total number
of pairs. I followed this process for the print data alone and again for print and video data combined.
Where interesting or unexpected comments or behaviors were noted, I transcribed or described these for
future reference but did not code them as evaluation criteria.
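The rate calculation described above amounts to counting, for each code, the pairs that applied it at least once and dividing by the total number of pairs. The following sketch (illustrative Python; the pair IDs and shorthand code names are hypothetical) makes the arithmetic explicit:

    from collections import Counter

    def criteria_rates(coded_pairs):
        """Proportion of pairs applying each code at least once.

        `coded_pairs` maps a pair ID to the set of codes that pair used;
        counting each code once per pair avoids over-weighting pairs
        that repeated the same criterion.
        """
        counts = Counter()
        for codes in coded_pairs.values():
            counts.update(set(codes))
        total = len(coded_pairs)
        return {code: count / total for code, count in counts.items()}

    # Hypothetical data for three pairs:
    pairs = {
        "P01": {"pictures", "info_quantity"},
        "P02": {"pictures", "authorship"},
        "P03": {"info_quantity"},
    }
    print(criteria_rates(pairs))
    # pictures and info_quantity each appear in 2 of 3 pairs (about 0.67);
    # authorship appears in 1 of 3 (about 0.33)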
Findings
The following section contains the findings of this study, beginning with discussion of evaluation
categories that emerged through coding and analysis. This is followed by discussion of evaluation
criteria frequencies gathered from written data and finally by discussion of further insight gained from
video data.
Categories of Evaluation Criteria
Appendix B shows the final coding scheme for this study with questions defining each code and
examples of student responses ascribed to each code. The first category, Engagement, contained
evaluation criteria affecting a student’s engagement with a site. Engagement in this case included initial
surface-level criteria influencing whether the student was likely to remain on the site or to click away;
six general criteria emerged as influencing Engagement:
• visual elements, subdivided further into (a) pictures and images, (b) color and graphics, (c)
text qualities, and (d) general evaluative statements regarding appearance of a site;
• advertising, when viewed as a positive or attractive site element;
• interactivity, subdivided further into (a) multimedia, and (b) opportunities for social
networking or user response posting;
• links to internal or external pages that were promising for their potential;
• interest in the page’s informational content; and
• evaluative statements related to Engagement but without specific detail to reveal exact
criteria for Engagement evaluation.
Links were considered a subcategory of Engagement because, rather than providing information (in
which case they would be considered a criterion of Usefulness), they were viewed as potential sources
and students were therefore more likely to remain on sites containing those links. This code applied to
students who viewed links in a positive light and were therefore more engaged by them.
The last category above contained statements such as, “I just find this page really blah” or “This
site really hooks me.” In these examples it was difficult to determine why the students were or were not
attracted to the sites, but their comments suggested they valued the site’s ability to engage them and
were therefore coded as Engagement. Likewise, in the visual elements category, statements such as
“Oh, this is pretty!” were coded as general evaluative statements regarding appearance; they suggested
that appearance was valued but were not specific enough to determine how appearance was judged.
A second theme that emerged from the data was Usefulness. This theme might also be viewed as
“need-meeting” in that it reflected criteria valuing a site’s ability to meet the needs of the user. In this
regard needs included both informational needs—providing the information that a user needed; and
usability needs—providing information in a way that was accessible and user friendly. Usefulness
criteria included
• match to information need, subdivided further into (a) information quantity, (b) information
specificity or breadth, (c) information novelty, and (d) general evaluative statements
regarding topicality;
• currency of site;
• intended audience of the site;
• language or comprehension tools available on the site;
• organization of the site pages and/or navigation tools available on the site;
• speed, loading time, or access features of the site;
• extent to which site contained features likely to distract the reader, including links as useless
distractions or impediments to usability;
• evaluative statements related to Usefulness based on comparison to another site or source;
and
• evaluative statements related to Usefulness but without specific detail to reveal exact criteria
for Usefulness evaluation.
Comparison to other sites emerged from the data as a category for placement of statements in which
students indicated the usefulness of one site in comparison to another. This subcategory provided a
place for statements using comparative usefulness as an evaluative tool without the detail specific
enough to place them in another category. For example, “this site has more information than the other”
was coded under information quantity as the primary measure of usefulness. “This site is better for what
we need than the first one,” however, was coded as a general comparison. Therefore, evaluation by
comparison was assigned when a comparison had been made but no evidence of other evaluation
criteria existed.
As discussed previously, general evaluative statements were assigned the most specific criteria
reasonably determined by the statements. So, for example, if a student said a site “gives us exactly what
we’re looking for,” it was coded as match to information need (d) because it lacked the specificity
required to determine exactly how the information need was met. Likewise, the statement “This one is
really user friendly” was recognized as belonging in the Usefulness category but could not be assigned a
specific code within it. A “user friendly” site might be one created for a student audience, one providing
language tools, one that is well organized, and so forth; such statements therefore landed in the last category above.
The third theme that emerged from the data was Trustworthiness. This term has been used by
researchers (Bråten, Strømsø, & Britt, 2009, 2011; Jessen & Jorgensen, 2012) to include criteria
affecting a user’s confidence in a site or in the information it provides. Trustworthiness encompasses
reliability and to some extent quality as perceived by the user. Criteria for Trustworthiness included
• whether references or sources were provided;
• whether information matched the prior knowledge of the user;
• site type or genre;
• authorship;
• site reputation or prior experience with site;
• purpose or intended audience of the site;
• cross-textual or outside verification of a site’s trustworthiness; and
• evaluative statements related to Trustworthiness but without specific detail to reveal exact
criteria for Trustworthiness evaluation.
Consistent with rationale described for the previous two themes, the final category above contained
statements such as “I don’t know if I believe this,” which could clearly be assigned to the
Trustworthiness theme but were not detailed enough to be ascribed to a specific criterion within it.
One minor category and one general category emerged from the data in addition to the above three
categories. Social Appropriateness was added to encompass statements in which students made value
judgments of a site based on whether they perceived its content as socially, morally, or school
appropriate. For example, in response to the photograph of an obese child on one site, a student
described a weakness by noting, “It’s rude.” In this case, it appeared the student objected to the picture
because it inconsiderately drew attention to a negative quality, and having it posted on the Internet could
cause emotional harm to the obese child. In another case a student described a site as “inappropriate”
for school because it contained information about the murder of an English monarch. Evaluative
statements like these were not common but did require the addition of a fourth category.
The final category encompassed evaluative statements that could not be ascribed to any specific
thematic group. These statements were fairly frequent. I assigned statements to this category when they
indicated a student was evaluating a site without indicating the criteria he was using, for example, “I just
really don’t like this site” or “This is great information!” Although it contained the word ‘information,’
the latter of these could not be placed in the general Usefulness category because it was not clear
whether the student viewed the site as ‘great’ because the information was engaging (Engagement),
whether it met a need (Usefulness), or whether it was trustworthy (Trustworthiness). This fifth category
emerged for such very general evaluative statements.
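For readers who want the category structure at a glance, the final scheme can be summarized as a simple mapping from general categories to their criteria (a condensed paraphrase of the lists above, in Python; Appendix B gives the full defining questions and sample responses for each code):

    # Condensed summary of the final coding scheme (see Appendix B).
    CODING_SCHEME = {
        "Engagement": [
            "visual elements (pictures/images, color/graphics, text qualities, general appearance)",
            "advertising viewed positively",
            "interactivity (multimedia, social networking/response posting)",
            "promising links",
            "interesting content",
            "general engagement, unspecified",
        ],
        "Usefulness": [
            "match to information need (quantity, specificity/breadth, novelty, general topicality)",
            "currency",
            "intended audience",
            "language/comprehension tools",
            "organization/navigation tools",
            "speed/access",
            "distracting features",
            "usefulness by comparison",
            "general usefulness, unspecified",
        ],
        "Trustworthiness": [
            "references/sourcing",
            "match to prior knowledge",
            "site type/genre",
            "authorship",
            "site reputation/prior experience",
            "site purpose or intended audience",
            "cross-textual/outside verification",
            "general trustworthiness, unspecified",
        ],
        "Social Appropriateness": [
            "social, moral, or school appropriateness",
        ],
        "General Value Judgment": [
            "evaluative statement too general to categorize",
        ],
    }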
Frequencies of Evaluation Criteria
Table 2 shows the percentage of pairs who applied criteria from each of the general categories at
least once in their written responses as well as the percentage of pairs evaluating sites by each of the
sub-categories within the broader categories. Figure 1 shows the frequencies of all criteria ranked from
most to least frequent and color-coded by general category. All pairs cited criteria falling into the
Usefulness category. More than three-quarters of the pairs (83%) cited criteria relating to Engagement,
54% of the pairs cited criteria relating to Trustworthiness, and 54% made evaluative statements that
were too general to categorize. Finally, 14% of pairs cited criteria relating to Social Appropriateness.
Following is a discussion of specific findings within each of the five categories in order of most
frequently appearing category (Usefulness) to least frequently appearing category (Social
Appropriateness).
Table 2
Use of Evaluation Criteria as a Percentage of Pairs Applying Criteria

Engagement: 83%
    Pictures: 80%
    Promising Links: 43%
    Social Networking/Response Forum: 23%
    General Attractiveness Unspecified: 20%
    Interesting Content: 17%
    Color: 14%
    Advertising (Viewed Positively): 11%
    Multimedia: 11%
    Text Appearance: 6%
    General Engagement Unspecified: 3%

Usefulness: 100%
    Information Specificity/Breadth: 94%
    Information Quantity: 91%
    Organization/Navigational Tools: 66%
    Information Usefulness Unspecified: 57%
    Language/Comprehension Tools: 46%
    Usefulness by Comparison: 26%
    Distracting Features: 17%
    Intended Audience: 17%
    Speed/Access: 17%
    Currency: 11%
    Information Novelty: 11%
    General Usefulness Unspecified: 6%

Trustworthiness: 66%
    General Trustworthiness Unspecified: 31%
    Authorship: 26%
    Prior Knowledge of Topic: 14%
    Site Reputation: 11%
    References/Sourcing: 6%
    Site Genre: 3%
    Site Purpose: 3%

Social Appropriateness: 11%

General Value Judgment Unspecified: 54%
Figure 1
Evaluation Criteria as a Percentage of Groups
[Figure 1 is a horizontal bar chart ranking all evaluation criteria from most to least frequently applied, from Information Specificity/Breadth (94%) down to General Engagement Unspecified (3%). Bars are color-coded by general category: Usefulness, Engagement, Trustworthiness, Social Appropriateness, and General Value Judgment. The horizontal axis shows the percentage of groups, from 0 to 100.]
Usefulness. Usefulness was the most common criterion students used to evaluate the sites. Not
surprisingly, whether a site provided information matching a need was the most common gauge of its
usefulness for students. Every pair in the study evaluated sites according to whether they met an
information need at some point during the task. Of subcategories relating to information need, three
were most commonly cited. Comments expressing interest in the specificity/breadth of information
available were most common (94% of pairs); for example, students wrote that a source gave “more
detail,” gave “important dates of time,” or more specifically told “when he was born, died, and his
religion.” When students commented on the quantity of information (91% of pairs), they wrote
statements such as “there’s way too many facts,” or “too short about him.” General statements of
information need (57%) included responses such as “no facts about him” or “gave us the information we
needed.” Occasionally students considered the novelty of information provided (11%), as in “tells us
new stuff.”
Other criteria commonly cited in the Usefulness category were organization/navigation,
language/comprehension tools, and comparison to other sites. Organization/navigation was a broad sub-
category encompassing text features such as subtitles, headings, captions or footnotes; organizational
tools such as contents, indexes, tags and categories; or navigational tools such as search boxes, tabs or
site maps. About two thirds of pairs (66%) cited some element of these in determining a site’s
usefulness. Statements indicating an interest in organization were those such as “a lot of captions,”
“organized in order by year,” “they’re put in paragraphs with directing headlines,” or “needs to be more
categorized.” Language/comprehension tools, cited by 46% of pairs, encompassed site elements that
affected one’s ability to read and comprehend the information presented. Here students most commonly
referenced reading level or difficulty of vocabulary, but many also noted within-text hyperlinks as useful
for looking up the meaning of a word or clarifying a concept. For example, one student wrote, “it says
how to pronounce his name in French,” and another wrote, “it gives the definition of places and people.”
About one quarter of the student pairs (26%) determined usefulness by comparing to other sites but
without citing specific criteria related to usefulness. The least common categories within Usefulness
were usefulness based on intended audience of the site (17%), “not as good for kids”; presence of
distracting features (17%), “It has links and advertisements that might distract us from are (sic) work”;
speed or access issues (17%), “You have to be a member to get more information”; currency of the site
(11%), “not a new website”; and general evaluative statements about Usefulness (6%), “needs to be
more user friendly.”
Engagement. The second most commonly cited evaluation category was Engagement. Of all
criteria within Engagement, reference to pictures was the most common, with 80% of groups making
mention of pictures in their assessments. Students frequently wrote that there were “lots of pictures” or
“not that many pictures” as strengths or weaknesses. The existence or quantity of promising links on a
page or site emerged in 43% of groups. Here students’ written comments included “links to other web
sites,” “gives you other places to look,” and “more places to get information.”
The category of interactivity was subdivided into two codes: (a) multimedia; and (b) opportunities
for social networking, response/participatory forums. The second of these was the third most utilized
code under Engagement, with 23% of groups including it in their criteria. Almost one quarter of the
students, then, viewed the opportunity to engage with other users or with the authors as a site strength.
These students wrote comments such as “able to write back” or “it has a way to contact them.” General
attractiveness was mentioned by 20% of groups, as was evidenced by comments such as “visually
attractive” or “boring and plain.” Another 17% of groups indicated that their interest in the content of a
page was a factor in their evaluation. These students wrote that “the article was interesting” or “one had
boring info.” The subcategory of multimedia content and the category of advertisements each appeared in
11% of the groups. In the multimedia subcategory students wrote it was a strength when a site “gave
you treasure hunts,” “show(ed) videos,” or had “slide shows.” In the advertisements category one pair
indicated they viewed advertising positively by listing “good advertising” as a strength. Engagement
categories emerging from the data least frequently were use of color (14%), “really white-hurts eyes”;
visual features of text (6%), “the fonts are small”; and general evaluative statements regarding
Engagement (3%), “needs to be more exciting.”
Trustworthiness. Trustworthiness was the third most cited category, with 66% of groups
including criteria related to trustworthiness at least once in their written responses. General evaluative
statements of trustworthiness appeared most frequently (31%). These included comments such as, “we
don’t know what’s accurate” and “correct information.” The most common criterion by which
trustworthiness was determined was authorship, with 26% of groups evaluating at least one site by
considering issues of authorship. Many of those were in reference to Wikipedia. For example, it was
common for students to write statements like, “Anyone can write on site” or “Wikipedia, anyone can
change it.” Neither example references authorship specifically, but both indicate the students were
concerned that issues of authorship called into question the site’s trustworthiness. A second criterion
students used to evaluate trustworthiness was the students’ prior knowledge of the site’s topic. This
code was applied when students made evaluative comments suggesting that an inconsistency between
the text and their prior understanding of the topic called into question the site’s trustworthiness.
Comments like these were made only when students evaluated All About Explorers, a spoof site created
by media specialists to test students’ critical evaluation skills. The All About Explorers page about
Samuel de Champlain contained gross errors and anachronisms intended to be clues to the site’s
questionable reliability, but only 14% of groups noted those inconsistencies in their written evaluations.
In response to the article’s assertion that Samuel de Champlain owned a National Hockey Team, one
pair wrote, “How could he be on a hockey team?” and another wrote, “how did he get a hockey league
back in the 1800’s (sic).” Another pair wrote, “Seems to go back and forth between times,” and still
another wrote, “dates and events are scrambled.” Evaluation based on site reputation emerged in 11% of
groups in response to both Wikipedia and the Let’s Move! site. When students wrote, “not a trusted web
site” or “On Wikipedia not everything’s true,” their comments were assigned to the subcategory site
reputation because they recognized Wikipedia as a site that might not be trustworthy, but did not
necessarily understand that Wikipedia can be authored by its users. Only 6% of groups mentioned
references or sourcing as criteria for evaluation of trustworthiness, with one group noting “it doesn’t tell
where the info came from.” Finally, written data showed just one group (3%) based its evaluation on site
purpose, writing that the site was “not really for a vending machine” while apparently working from the
assumption that seeking a vending machine was a purpose of the exercise.
The fourth category, General Value Judgments, emerged from the data to include
general evaluative statements that could not be assigned to a specific category because they were not
based on any clear criteria. Of all pairs, 54% made comments falling into this category. These were
statements such as, “good facts,” “They have a president’s challenge—that’s really good!” or “not 100%
comfortable with this site.”
The least common category emerging from the data was Social Appropriateness; 11% of groups
made comments attributable to this category. Evidence of students evaluating sites based on Social
Appropriateness included comments describing site content as “rude” or “inappropriate.”
Video Analysis
Analysis of the video data revealed several themes that enriched the findings regarding the evaluation criteria students used and provided further insight into their behaviors as they evaluated Internet sites.
The analysis below begins with discussion of certain patterns of reading behaviors practiced by the
students, followed by discussion of noteworthy conversations and behaviors that accompanied the
students’ application of several types of criteria.
Reading behaviors. Video provided insight into several patterns of reading behaviors practiced by
students during the evaluation task. The purposes of the two tasks were to collect information for a
school presentation and to provide input on healthy snacks to the school principal. Both required
evaluation of the information provided on the sites, which in turn required some level of reading
comprehension. The extent to which students read the information on each web page was therefore
relevant to their evaluation of those sites.
The combination of screencast with cursor highlighted, audio, and video inset of students viewing
the screen was adequate to determine whether students read the paragraphed text aloud, read the text
silently, or glanced at the pages without reading in depth. Pairs who could be seen and heard reading the
text aloud were assigned a two; pairs whose visual focus, scrolling patterns, and conversation indicated
they were reading at least part of the text silently were assigned a one; and those whose visual focus,
scrolling patterns, and conversation indicated they were glancing at titles or not reading at all were
assigned a zero. Students following this last pattern scrolled down the page too quickly to do more than
glance at titles, looked away from the screen, or began speaking while glancing. Table 3 shows the
number of videotaped pairs following each reading pattern for each of the four web sites along with the
MEAP level of each of those pairs, while Table 4 shows the percentage of those pairs who practiced
each of the three reading patterns while evaluating the four sites.
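To make the tabulation behind Table 4 concrete, the following minimal sketch converts the 2/1/0 reading-pattern codes assigned to videotaped pairs for a single site into the proportions Table 4 reports. The scores listed here are invented for demonstration; the study’s actual per-pair codes appear in Table 3.

```python
from collections import Counter

# Invented reading-pattern codes for one site, one code per videotaped pair:
# 2 = read aloud, 1 = read silently, 0 = glanced / did not read.
scores_for_site = [2, 0, 1, 0, 0, 2, 1, 0, 2, 0]

def reading_pattern_shares(scores):
    """Proportion of pairs assigned each reading-pattern code."""
    counts = Counter(scores)
    labels = {2: "Read Aloud", 1: "Read Silently", 0: "Glanced/Did Not Read"}
    return {labels[code]: counts.get(code, 0) / len(scores)
            for code in (2, 1, 0)}

# e.g. {'Read Aloud': 0.3, 'Read Silently': 0.2, 'Glanced/Did Not Read': 0.5}
print(reading_pattern_shares(scores_for_site))
```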
Table 4
Reading Patterns of Each Site by Percentage of Videotaped Pairs Applying Them
_______________________________________________________________________________________________________
Web Site Read Aloud Read Silently Glanced/Did Not Read
_______________________________________________________________________________________________________
Task 1
All About Explorers .29 .24 .48
Wikipedia .14 .05 .81
Task 2
Fresh Healthy Vending .36 .23 .41
Let’s Move! .27 .14 .59
Engagement. The first category in which video data provided deeper insight into evaluation
criteria was in the category of Engagement. Within this category, discussion is warranted regarding
visual aspects of web sites and site interactivity.
Visual aspects. Frequencies in the written data showed that visual aspects of web sites were
important to students, but videos suggested this was true even among better readers. As discussed
earlier, MEAP reading scores were used to assess offline reading ability before students were paired.
Raw scores ranged from 596, the highest recorded score in the class, to 489, the lowest recorded score in
the class. MEAP scores of 538 and above were considered “highly proficient” according to Michigan
State Department of Education scoring. Seventeen pairs whose MEAP scores indicated highly
proficient (Level 1) reading ability evaluated at least one site based on visual aspects rather than on
written content, as indicated by their not reading at least one of the four sites while evaluating. For
example, Dawson, whose MEAP score was a 559, did not read the text on any of the sites but expressed
preference for one site over another because “It's a whole heckuva lot more colorful, instead of having a
whole bunch of words on it . . . that isn’t very cool, man . . .” and later reiterated, “It just doesn't have
very many pictures; it's more about the information, which is just kinda not cool; I like pictures.”
Another group, Jada and Mark, whose MEAP reading scores were 552 and 578, also did not read the
text on either of the two sites they evaluated. In discussing one of them Jada said, “I think it's good
‘cause it's got pictures that help demonstrate things. Well, maybe not demonstrate but show what it's
talking about, ‘cause if it's just a web site with, like, two full pages of words, you don't really wanna
read that.” Despite their strong offline reading ability, these students chose not to read the sites and
instead prioritized visual aspects over site content. Only one student in the videos made a comment
suggesting pictures were less important than content. Erin responded to her partner, Ethan, in the
following exchange:
Ethan: There isn't many pictures…
Erin: Yeah, but you don't really need many pictures, and there’s more than the other page.
Erin and Ethan were one of the pairs who did choose to read the text on the page before evaluating it, which is not surprising in light of Erin’s recognition that content was more important to her evaluation than pictures.
Interestingly, many students showed concern not just for the presence of pictures but also for their
quality. Austin, a special education student with a reading disability, and Noah, a Title I student, did not
read the text on the web sites but were careful to note differences between the quality of the pictures.
Austin said, “It didn’t have enough pics, like colorful. They’re all black and white.” Later he reacted to
a colorful portrait on the Wikipedia site: “See, that’s a good picture there.” Mark was critical of the All
About Explorers site because its portrait of Samuel de Champlain was a black and white drawing: “just
sketches, not actual pictures.” He preferred the Wikipedia site with its painted portrait and photographs
of geographical sites related to de Champlain, explaining his preference for “actual pictures, and pictures
of statues.” Addison, referring to the same drawn portrait, listed it as a weakness because “the first
picture was a cartoon.” Another pair, Lukas and José, had the following exchange:
Lukas: Do you think it would be better if they took a real picture of him?
José: I think a camera was made in the . . . (inaudible)
Lukas: Okay that's true, but if it wasn't true, would you be more comfortable with a . . . camera
picture or a drawn picture?
José: A camera picture.
In these examples students expressed preference for color and realism over black and white drawings
and expected high quality photographic images on a site even if the topic was historical.
Interactivity. A second theme surfaced in the students’ discussions of site interactivity. About a
quarter of the students found opportunities for social networking, response postings, or interactive
features important, but most comments did not reveal a thorough investigation of these features or their
functions. For example, Bailey saw that All About Explorers had a page with treasure hunts. She
glanced momentarily at the screen while scrolling down far enough to read the subtitle “Your Mission”
and said to her partner, Karlos,
Bailey: “Oh, it gives you treasure hunts; that’s a good thing.”
Karlos: “Gives you something to do.”
Their discussion did not go beyond this exchange, and they quickly left the Treasure Hunts page. Dawson mentioned the treasure hunt as a positive feature when he said to Mark, “And it has a treasure
hunt, which I'm not really sure what that is… whoa, I see what this is. Mark, you can put down treasure
hunts. It has like stuff that you can do for that.” Like Bailey, Dawson did not read the Treasure Hunts
page, but skimmed it enough to feel confident that he understood it, and regarded it as a strength.
Whether they thoroughly investigated them or not, most students viewed opportunities for
interactivity as positive. Dawson and Kaelie mentioned that “You can share information; you can blog”
on the Fresh Healthy Vending site and when they noticed the social networking icons in the lower right
corner of the screen, Dawson said, “you can talk about this on Facebook . . . and then you can Twitter on
here…there's Twitter, My Space, Facebook, and Youtube.” However, there was no further discussion of
what it means to “Twitter” or how these social networking sites would be useful to their ends.
Jada and Mikaela justified their interest in interactivity more clearly in the following exchange:
Mikaela: It should be more child enjoyable where kids can actually do stuff on it instead of just
read.
Jada: Yeah, I remember a social studies site where you could, like, pretend you were an explorer
and choose if you wanted to, like, trade with the Indians and stuff like that... I learned a lot more. I
got a 100% on that test. I would use that instead of this. But if I didn't have that, I'd probably use
this.
Here Mikaela and Jada relayed a clearer vision of what interactive features meant to them, emphasizing
the ability to interact not just with the creators of the site and with other users, but with the information
as well.
Other students grappled with aspects of interactivity to determine if they were strengths or
weaknesses. Kiki noticed that Fresh Healthy Vending was in blog format and allowed for readers to
post responses to the page. She was not sure whether this was positive or negative: “Maybe another strength is
that you can post your responses on the site…also could be a weakness, too, but… (trails off)”
Likewise, Mercelia and Erik noticed the Facebook link on the Let’s Move! site and contemplated its
value:
Mercelia: Oh look--they have a Facebook! They have a link on Facebook…oh wait, that could be
a bad thing. No, don't put that one down, because Facebook could be good and bad.
Erik: Some people like it and some people see it as... (trails off)
Notably, in these cases the students were uncertain whether opportunities for social
networking should be viewed as strengths or weaknesses within the context of a school research
assignment.
Social Appropriateness. A second category in which video data provided insight into the
students’ application of the criteria in question was in the category of Social Appropriateness. In this
case comments were made almost exclusively about the Fresh Healthy Vending site, which had on its
page an impactful photograph of an obese child eating at McDonald’s. The photograph caught the
immediate attention of many students, and some clearly viewed it as inappropriate. Three of the seven
videotaped groups that attended to Social Appropriateness in their evaluations did so by commenting on this
photograph, all of them negatively. Bray and Alex said the photo was “rude,” John said it was “mean,”
and Kaemin thought users would have “rude responses” to it. These students felt sympathy for the boy
in the photo and believed it was inconsiderate of site creators to publicize his obesity via a web site
photo. However, two groups found the same web site impactful in a positive way, commenting that it
would “inspire kids to go out and play,” and that it “makes you want to eat healthy.” One group
commented on the social appropriateness of the All About Explorers site, stating, “there’s a whole
chapter on how he was murdered . . . not really school appropriate.” In summary, seven of the 30
videotaped groups did consider the social appropriateness of the web sites, bringing to their evaluations
personal beliefs regarding online content they viewed as appropriate or inappropriate for certain social
settings or uses.
Usefulness. In the third category where video enriched frequency data, students discussed criteria
related to Usefulness. Specifically, the videos provided additional understanding of students’ concern with whether a web site matched an information need and of their attention to organizational features, particularly traditional and online text features. I divide the following discussion into three relevant points of
interest: match to information need, traditional offline text features, and online text features.
Match to information need. Video of students during the evaluation task also revealed patterns
regarding how the students determined the usefulness of the web sites they encountered. As frequencies
show, the students were largely concerned with whether sites met their information needs. However,
their discussions provided further insight into specific evaluation criteria when viewed in the context of
reading behaviors. For example, although all the students showed concern for whether web sites met
their information needs, many did so without reading the text in question. Of the total 30 pairs who
were successfully videotaped, 21 pairs evaluated at least one site without reading its paragraphed text,
and in 18 of those pairs the students made evaluative statements regarding the site’s relevance to
their information needs. Their comments included, “It shows, like, the complete biographies and
autobiographies about them,” “It tells you what to eat,” or “It shows many food choices and is very
detailed.” In these cases, the determination of whether a site matched the information needs of its user
was made by quickly glancing, scanning headings, or by assessing quantity of information—that is,
through methods other than close reading.
Not surprisingly, students who did not read closely often judged the usefulness of information by its quantity rather than its content. In many cases quantity was viewed positively. For example, Mark
favored the Wikipedia site over the All About Explorers site: “This has a LOT better stuff, I mean, look
at this--it's huge!” Alex also favored Wikipedia because “It gave way more information,” as did
Dawson: “This has a lot better information, I mean, look at this, it’s HUGE!” And Bailey viewed All
About Explorers less favorably: “I wouldn’t use that site, it just doesn't have enough information, look.”
Liz also thought it was a problem that the site had “not that many paragraphs.” Mercelia, however,
favored All About Explorers. She responded to it affirmatively by exclaiming, “Look at all this! We
could so do a total report on this!” When Kyle and Logan evaluated Fresh Healthy Vending, they
decided it didn’t “have a whole lot of information… yeah, for research it's not enough.” Kaelie shared a
similar view in the context of a conversation about weaknesses of the same site when she said, “There
isn’t much writing.” In all of the cases above the students did not read the sites closely but evaluated
them favorably when it appeared there was a large amount of information to read.
There were some students, however, who viewed large amounts of information less favorably.
These students also did not read the text closely, but made negative evaluative statements based on
length. Addison saw how long the Wikipedia page was and responded negatively: “Geez! Dude, this is
WAY too long.” Rose Mary agreed that “There's too much words; they need to, like, narrow it down.”
Steve viewed the All About Explorers site as more manageable in length, stating, “It's the perfect amount
of information for a site; not too much and not too little.”
Other students didn’t completely agree on how much information was ideal. Ben and Akasha have
this brief conversation regarding the length of the Wikipedia page:
Akasha: Too long . . . (scrolling down the page)
Ben: Well, nothing is TOO long.
Akasha: Really? What about a report? That would be WAY too long.
Likewise, Mercelia and Erik discussed how much information was too much when evaluating Let’s Move!:
Mercelia: This is a lot of facts!
Erik: Too much facts?
Mercelia: No, it’s good!
Erik: Too much facts isn’t good.
Mercelia: Yes, but this isn’t too much facts.
These students struggled to determine whether more information was better when it came to evaluating
the sites and sometimes, like Steve, compared the lengths of the two sites to help determine which