ONPAR High School Science Test Overview
Assessing REAL Science on a Large-Scale Assessment: The Promise of Computer-Interactive Items for High School Students with Language Challenges
WIDA and the Center for Applied Linguistics, in partnership with Pacific Metrics Corporation, the Virginia Department of Education, and the New Jersey and North Dakota Departments of Education, are working on a project to improve the assessment of complex science knowledge and skills in two end-of-semester benchmark tests in Chemistry and Biology for all high school students, and especially those with language challenges (i.e., less English-proficient English language learners, students with learning disabilities in reading, and students with hearing impairments). We are developing and studying computer-based interactive item prototypes, and considering when these kinds of items are comparable to traditional item approaches.
The dynamic items use the computer's capabilities to replace large amounts of language: animation and interactive techniques present the items, and students demonstrate their skills by interacting with stimuli, assembling, modeling, and drawing. Some of the cognitively complex interactive items also use programmed algorithms to present sequenced items, in which students' responses to a first set of questions determine how they move through the item to a common final screen. Comparability of the interactive items with language-intensive traditional items will be investigated by studying how students with language challenges and native English speakers with no IEPs perform on pairs of traditional and interactive items that measure the same target content at the same grain size.
Because of the complex comparability issues that arise when different kinds of items, forms and tests are used in a state's academic testing system, the project will convene a cognitive panel to develop a defensible codification system that will define comparability arguments. This codification system will delineate the benefits and limits of different types of observations and explicate the kinds of evidence needed to defend common score inferences when the skills of different students are measured with different instruments, or when item types in the assessment system change over time.