Standardized Testing Validity


For centuries, assessments have been used to gauge students' knowledge and understanding. In recent years, testing has become a major topic of debate: some believe testing students is the best way to measure student growth and performance, as well as teacher effectiveness, while others believe that increased standardized testing lacks validity and reliability. Quite often, factors such as learning disabilities, socioeconomic status, race, and testing in a second language are not taken into account. Students whose first language is not English encounter issues not only on language tests such as the TOEFL or placement exams, but on assessments of all kinds. It is questioned whether such tests remain valid and reliable for these students.

Validity concerns whether a test measures what it is supposed to measure; a test of reading comprehension, for instance, should not actually be measuring listening ability. Three different forms of validity further characterize an assessment: face validity, content validity, and criterion-related validity. Face validity refers to whether the test looks, on its surface, like that type of assessment should; content validity asks whether the questions actually cover the skill being tested; and criterion-related validity compares test results with an outside criterion. Major standardized tests such as the TOEFL and IELTS are often considered valid, but many disagree, since problems can always arise in how a test is built. Reliability is another contested element of assessment: a reliable test will produce the same scores when given repeatedly to the same group of people. With standardized tests, reliability is more difficult to ensure because test questions can be biased, testing conditions can vary, and students' mental state can differ from one sitting to the next. These elements can skew results, which is why many are turning away from standardized assessments and looking further into improved test development or other non-standardized testing methods.
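To make reliability concrete, the short Python sketch below estimates test-retest reliability as the correlation between two sittings of the same test. The scores are invented purely for illustration; real psychometric work uses much larger samples and additional statistics, such as Cronbach's alpha for internal consistency.

```python
# A minimal sketch of test-retest reliability, assuming two hypothetical
# administrations of the same test to the same five students.
# The scores below are invented for illustration only.
from statistics import correlation  # Python 3.10+

first_sitting = [78, 85, 62, 90, 71]
second_sitting = [80, 83, 65, 88, 70]

# Pearson correlation: values near 1.0 suggest the test ranks the same
# students consistently across sittings, i.e., it is reliable.
r = correlation(first_sitting, second_sitting)
print(f"test-retest reliability estimate: r = {r:.2f}")
```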

Test developers have to be sure that test questions do not contain any type of bias. This is especially difficult when considering test takers whose first language is not English. While some questions and answers may seem fairly straightforward, many can pose a problem for those from other countries with differing traditions, customs, and values. Traditionally, assessment creators have followed a set structure to help them create the materials: writing the skill, writing the specification, writing the item, assembling the test, and piloting the test, while gaining critical feedback throughout the process (Coombe et al., 2012). While straightforward, this method ran into several problems when applied to language testing. A single test specification can produce multiple questions on multiple assessments, but the writers are often restricted to a certain type or format of question, as sketched below. For language learners, this model did not accurately represent the desired outcomes of language programs, and it is questioned whether the test items accurately reflect the objectives of the test creator and the language teacher. Test security also becomes an issue, creating a catch-22: to make the test items more accurate, more people should provide feedback, but the more people who view the test, the less secure it becomes.
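As a rough illustration (not drawn from Coombe et al.), the Python sketch below shows how a single hypothetical specification, locked to one item format, can generate several items for a test form; the skill, template, and word list are all invented for this example.

```python
# A hypothetical sketch of the specification-driven test model:
# one specification, fixed to a single item format, producing several
# items. The skill, template, and word bank are invented for illustration.
import random

spec = {
    "skill": "reading vocabulary in context",       # what the items test
    "format": "multiple choice, four options",      # writers are held to this
    "template": "Choose the word closest in meaning to '{word}'.",
}

word_bank = ["rapid", "fragile", "obscure"]  # hypothetical content pool

def write_item(word: str) -> str:
    """Produce one item from the shared specification."""
    return spec["template"].format(word=word)

# One specification yields multiple items for multiple test forms,
# but every item is constrained to the same type and format.
test_form = [write_item(w) for w in random.sample(word_bank, k=3)]
for item in test_form:
    print(item)
```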
