Assessment is a great tool for learning if it is used properly. But when assessments are developed poorly or implemented improperly, the results can be pretty scary. There are many ways to make mistakes, but we are limiting our discussion to five areas to avoid scaring you too much.
Test Misuse- Number one on our list is the use of tests for purposes for which they were not intended and have not been validated. Tests are designed for specific purposes; as with many things, a test cannot be all things to all people. If a test is well developed, it is likely to meet its intended purposes well. The problems arise when that test is then used for a purpose other than the one for which it was intended.
Before you can use a test for a new purpose, you need to find out whether it is well suited to that purpose. Measurement experts refer to this as validation: a test must be validated for each purpose for which it is used, and the test user must gather evidence supporting the new use. Here are a couple of examples of how tests can be misused:
- A college admissions test is designed to predict how students will do in their freshman year of college. While the test has been shown to be valid for that purpose, a college proceeds to use it for placement into freshman English courses without finding out whether it is useful for that purpose.
- A publisher provides a school with a formative math assessment designed to give students ongoing feedback, and the assessment has been shown to be effective for that purpose. The school then decides to use the test as a high-stakes assessment for deciding whether students should be promoted to the next grade.
In short, just because a test works well for one purpose does not mean it is useful for other purposes. If you think it can be used in new ways, check it out first!
Test Unreliability- Fundamentally, you need to be able to count on a test giving you consistent information from time to time and from form to form. If you can't get the same information (e.g., score) on a test from one day to the next, then how can you be confident in the information you are getting? If the test is not reliable, you really are not getting useful information, and you certainly would not want to make important decisions based on an unreliable test.
There are several ways to evaluate the reliability of a test; SEG, or any other reputable testing organization, can help you make this determination. The bottom line: make sure the test is reliable; don't just assume it is consistent enough to use for making important decisions.
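To make the idea concrete, here is a minimal sketch, in Python, of one common reliability check, test-retest reliability, estimated as the correlation between two administrations of the same test; the scores and the 0.80 rule of thumb below are illustrative assumptions, not SEG's method.

```python
# Minimal sketch: estimate test-retest reliability as the correlation between
# scores from two administrations of the same test. All scores are invented.
from statistics import correlation  # requires Python 3.10+

scores_first_administration = [72, 85, 90, 64, 78, 88, 70, 95]
scores_second_administration = [75, 83, 91, 60, 80, 86, 73, 94]

# A correlation near 1.0 suggests the test gives consistent information from
# one administration to the next; values well below ~0.80 (a common rule of
# thumb, not a universal standard) would be a warning sign before using the
# test for important decisions.
r = correlation(scores_first_administration, scores_second_administration)
print(f"Test-retest reliability estimate: r = {r:.2f}")
```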
Poor Alignment of Content- All too often we see tests that are not well aligned to the content. A test may otherwise follow good test development practice but fail to align properly with the content that is being taught. Again, if a test is not closely aligned to the content domain you are interested in, the information you get from that test is likely not telling you what you need to know.
This often happens when the test developer identifies the domain to be covered only in general terms, then rushes forward, writing items that do not really reflect what is being taught. For example, we once saw a test labeled "American History" that measured the range of important content from the inception of the nation through the present day. But the text/course it was designed to measure only covered content up until 1900. This misalignment severely limited the value of the test.
The best way to avoid poor alignment is to clearly identify the content to be measured, using detailed objectives, competencies, a content outline, or another definitional approach, and to clearly specify the extent to which each of these definitional elements should be covered.
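As a concrete illustration, here is a minimal sketch, in Python, of checking written items against a content blueprint; the objectives, target weights, and item counts are hypothetical, loosely inspired by the American History example above.

```python
# Minimal sketch: compare the share of items actually written for each objective
# against the target share specified in a content blueprint. All values are invented.
blueprint = {
    "Founding era through 1800": 0.30,
    "Expansion and Civil War, 1800-1877": 0.40,
    "Industrialization, 1877-1900": 0.30,
}

items_written = {
    "Founding era through 1800": 12,
    "Expansion and Civil War, 1800-1877": 18,
    "Industrialization, 1877-1900": 10,
}

total_items = sum(items_written.values())
for objective, target_share in blueprint.items():
    actual_share = items_written[objective] / total_items
    flag = "" if abs(actual_share - target_share) <= 0.02 else "  <-- off target"
    print(f"{objective}: target {target_share:.0%}, actual {actual_share:.0%}{flag}")
```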
Mistaking the Map for the Territory- Tests are proxies. No matter how thorough the test, no matter what types of items it uses, and no matter how good it is, a test is still a representation of the underlying knowledge, skill, attitude, or other construct about which you are trying to get information. Tests are a useful, but imperfect, representation of knowledge or skills, much like a map is a useful, but imperfect, representation of the territory it pictures.
Remembering this is critical. A student who gets an "80" on a test is not an "80." The score should be used in conjunction with other information in order to get a more complete picture of the student's knowledge or skills.
Construct-Irrelevant Variance- This is a great term to use when you want to impress people at cocktail parties. It is a fancy way of saying that some of what you are measuring with a test is the result of something other than the knowledge or skills you intended to measure. When this happens, part of the information you get from the test is information about something other than the content and skills you thought you were measuring.
There are many possible sources of construct-irrelevant variance. Common examples include speededness (the effect of not having enough time to complete the test properly), misunderstanding the instructions, unintended trickiness of the questions, and a lack of familiarity with the type or style of the questions. The bottom line: take reasonable steps to eliminate any extraneous influences on test scores, so you can be sure that the results are truly a reflection of the test takers' knowledge and skill.
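If a quick illustration helps, here is a small, hypothetical simulation in Python showing how an irrelevant factor, such as reading speed on a heavily speeded math test, can leak into observed scores; the variables, effect sizes, and sample size are invented for illustration.

```python
# Hypothetical simulation of construct-irrelevant variance: observed scores mix the
# intended construct (math ability) with an irrelevant one (reading speed).
import random
from statistics import correlation  # requires Python 3.10+

random.seed(0)
n = 500
math_ability = [random.gauss(0, 1) for _ in range(n)]   # construct we intend to measure
reading_speed = [random.gauss(0, 1) for _ in range(n)]  # irrelevant to that construct
noise = [random.gauss(0, 0.5) for _ in range(n)]        # random measurement error

# Observed scores reflect math ability, but also reading speed (e.g., wordy items
# on a tightly timed test).
observed = [m + 0.6 * s + e for m, s, e in zip(math_ability, reading_speed, noise)]

print(f"Correlation with math ability:  {correlation(observed, math_ability):.2f}")
print(f"Correlation with reading speed: {correlation(observed, reading_speed):.2f}")
# The nonzero correlation with reading speed is score variance that has nothing to do
# with math knowledge: construct-irrelevant variance.
```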
By engaging the help of a professional assessment development organization, such as SEG Measurement, you can avoid these and other pitfalls.
SEG has worked with many educational publishers and technology providers, from start-ups to the largest industry players, to develop high-quality assessment programs. With nearly 40 years of experience in research, we know what it takes to conduct sound efficacy research and create high-quality assessments. Please email us to discuss your needs at selliot@segmeasurement.com or call us at 800.254.7670, ext. 102.