Versant

The Versant suite consists of computerized tests of spoken language available from Pearson PLC. Versant tests were the first fully automated tests of spoken language to use advanced speech processing technology (including speech recognition) to assess the spoken language skills of non-native speakers. The Versant language suite includes tests of English, Spanish, Dutch, French, and Arabic. Versant technology has also been applied to the assessment of Aviation English, children's oral reading assessment, and adult literacy assessment.

History

In 1996, Jared Bernstein and Brent Townshend founded Ordinate Corporation to develop a system that would combine speech processing technology with linguistic and test theory to provide an automatically delivered and automatically scored spoken language test. The first English test was called PhonePass. It was the first fully computerized test of spoken language using speech recognition technology. In 2002, the name PhonePass was changed to PhonePass SET-10 (Spoken English Test), or simply SET-10. In 2003, Ordinate was acquired by Harcourt Assessment, and in 2005 the test was given its current name, Versant. In January 2008, Harcourt Assessment (including Ordinate Corporation) was acquired by Pearson, and Ordinate Corporation became part of the Knowledge Technologies group of Pearson. In June 2010, the Versant Pro Speaking Test and the Versant Pro Writing Test were launched.

Product description

Versant tests are typically fifteen-minute tests of speaking and listening skills for adult language learners; test length varies slightly depending on the test. The test is delivered over the telephone or on a computer and is scored by computer using pre-determined data-driven algorithms. During the test, the system presents a series of recorded prompts at a conversational pace and elicits oral responses from the test-taker.^[1] The Versant tests are available as several products:

Versant English Test
Versant English - Placement Test
Versant English - Writing Test
Versant Spanish Test
Versant Arabic Test
Versant French Test
Versant Aviation English

Additionally, several domain-specific tests have been created using the Versant framework in collaboration with other organizations. These tests include the Versant Aviation English Test (for aviation personnel), the Versant Junior English Test (for learners of English, ages 5 to 12), and the Dutch immigration test (exclusively available through Dutch Embassies). The Versant scoring system also provides automated scoring of the spoken portion of the four-skills test, Pearson Test of English, available in late 2009.

Versant test construct

Versant tests measure "facility in a spoken language", defined as the ability to understand spoken language on everyday topics and to respond appropriately at a native-like conversational pace. While keeping up with the conversational pace, a person has to track what is being said, extract meaning as speech continues, and formulate and produce a relevant and intelligible response.^[2] The Versant tests are designed to measure these real-time psycholinguistic aspects of spoken performance in a second language.

Test format and tasks

Versant tests typically have six tasks: Reading, Repeats, Short Answer Questions, Sentence Builds, Story Retelling, and Open Questions.

Task	Description
A. Reading	Test takers read printed, numbered sentences. The sentences are relatively simple in structure and vocabulary, so they can be read easily and in a fluent manner by literate speakers of the target language.
B. Repeats	Test takers repeat sentences verbatim. Sentences are presented in order of increasing difficulty. The audio item prompts are delivered at a natural conversational pace.
C. Short Answer Questions	Test takers listen to questions and answer each of these questions with a single word or short phrase. Each question asks for basic information, or requires simple inferences based on time, sequence, number, lexical content, or logic.
D. Sentence Builds	Test takers are presented with three short phrases. The phrases are presented in random order and the test taker is asked to rearrange them into a sentence.
E. Story Retelling	Test takers listen to a story and are then asked to describe what happened in their own words. Test takers are encouraged to tell as much of the story as they can, including the situation, characters, actions and ending.
F. Open Questions	Test takers listen to a question asking for an opinion and provide an answer with an explanation. The questions deal either with family life or with the test taker's preferences and choices. This task is used to collect a spontaneous speech sample. The test taker's responses are not scored automatically, but these responses are available for human review by authorized listeners.

Versant technology

Automated administration

Versant tests can be administered over the telephone or on a computer. Test takers can access and complete the tests from any location where there is a landline telephone or an internet connection.

Test takers are given a Test Identification Number and listen to instructions spoken by a recorded examiner's voice; the same instructions are printed verbatim on the test paper or computer screen. Throughout the test, test takers listen to recorded item prompts read by a variety of native speakers. Because the test is automated, large numbers of tests can be administered and scored very rapidly.

Automated scoring technology

Versant test scores are posted on-line within minutes of the completed test. Test administrators and test takers can view and print out their test results by entering their Test Identification Number on the Versant website. The Versant score report is composed of an Overall score (a weighted combination of the subscores) and four diagnostic subscores: Sentence Mastery (i.e., grammar), Vocabulary, Fluency, and Pronunciation. The Overall score and subscores are reported on a scale from 20 to 80.
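The score report structure described above can be illustrated with a minimal sketch. The weights below are hypothetical placeholders (the text states only that the Overall score is a weighted combination of the four subscores), and rounding to a whole number is likewise an assumption.

```python
# Hypothetical, equal weights: the article does not give the real weighting.
WEIGHTS = {
    "sentence_mastery": 0.25,
    "vocabulary": 0.25,
    "fluency": 0.25,
    "pronunciation": 0.25,
}

# Reporting scale stated in the article.
SCALE_MIN, SCALE_MAX = 20, 80


def overall_score(subscores, weights=WEIGHTS):
    """Combine four subscores into one Overall score on the 20-80 scale."""
    if set(subscores) != set(weights):
        raise ValueError("subscores must match the weighted subscore names")
    for name, value in subscores.items():
        if not SCALE_MIN <= value <= SCALE_MAX:
            raise ValueError(f"{name} is outside the {SCALE_MIN}-{SCALE_MAX} scale")
    total_weight = sum(weights.values())
    combined = sum(weights[n] * subscores[n] for n in weights) / total_weight
    # Rounding to a whole number is an assumption made for this sketch.
    return round(combined)


report = {
    "sentence_mastery": 40,
    "vocabulary": 50,
    "fluency": 60,
    "pronunciation": 70,
}
print(overall_score(report))
```

Because every subscore lies on the 20 to 80 scale and the weights are normalized, the combined value stays on the same scale.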

The automated scoring technology is optimized using a large number of speech samples from both native and non-native speakers. Extensive data collection is typically carried out to collect a sufficient amount of such speech samples. These spoken responses are then transcribed to train an automatic speech recognition system.

Each incoming response is then processed automatically by the speech recognizer that has been optimized for non-native speech. The words, pauses, syllables and phones are located in the recorded signal. The content of the response is scored according to the presence or absence of expected correct words in correct sequences as well as the pace, fluency, and pronunciation of those words in phrases and sentences. Base measures are then derived from the segments, syllables and words based on statistical models of native and non-native speakers.^[3] Much documentation has been produced regarding the accuracy of Versant's automated scoring system.
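The idea of scoring content by "expected correct words in correct sequences" can be sketched roughly as follows. This is not Versant's algorithm, which the article does not specify; it substitutes a plain longest-common-subsequence comparison between an expected sentence and a recognizer transcript, and the function names are invented for illustration.

```python
def words_in_sequence(expected, recognized):
    """Length of the longest common subsequence of two word lists."""
    rows, cols = len(expected) + 1, len(recognized) + 1
    table = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        for j in range(1, cols):
            if expected[i - 1] == recognized[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[-1][-1]


def content_score(expected_text, recognized_text):
    """Fraction of expected words that appear, in order, in the transcript."""
    expected = expected_text.lower().split()
    recognized = recognized_text.lower().split()
    if not expected:
        raise ValueError("expected_text must contain at least one word")
    return words_in_sequence(expected, recognized) / len(expected)


print(content_score("the cat sat on the mat", "the cat on mat"))
```

A real system would work from recognizer output with timing information and would also model pace, fluency, and pronunciation, which this sketch ignores.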

Score use

Versant tests are currently used by academic institutions, corporations, and government agencies around the world. Versant tests provide information that can be used to determine if employees or students have the necessary spoken language skills to interact effectively. For example, the Versant English Test was used in the 2002 World Cup Korea/Japan to measure the English skills of over 15,000 volunteers and assign the appropriate workers to the most English-intensive tasks.^[4] The Versant Spanish Test was used in a study by Blake et al. (2008)^[5] to evaluate whether, with respect to oral proficiency, distance-learning courses are as valid a way to start learning a foreign language as traditional face-to-face classes that meet five times a week.

Validation

Relationship to other tests

Versant test scores have been aligned with the Common European Framework of Reference (CEFR). Below are the mappings of Versant scores and other tests' scores to the CEFR. Versant English overall scores can be used to predict CEFR levels on the CEFR scale of Oral Interaction Skills with reasonable accuracy.^[6]

Versant^[7]	CEFR^[8]	IELTS^[9]	TOEIC^[10]	TOEFL iBT Speaking^[11]	TOEFL iBT^[12]
20-80	A1-C2	0-9	10-990	0-30	0-120
20-25	<A1	0
26-35	A1	1-2		8-13
36-46	A2	3		13-19
47-57	B1	3.5-4.5	550+	19-23	57-86
58-68	B2	5-6		23-28	87-109
69-78	C1	6.5-7	880+	28+	110-120
79-80	C2	7.5+
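The Versant and CEFR columns of the table above amount to a simple range lookup, sketched below. The band boundaries are copied from the table; the function name is invented for illustration.

```python
# (lowest score, highest score, CEFR level), taken from the table above.
CEFR_BANDS = [
    (20, 25, "<A1"),
    (26, 35, "A1"),
    (36, 46, "A2"),
    (47, 57, "B1"),
    (58, 68, "B2"),
    (69, 78, "C1"),
    (79, 80, "C2"),
]


def versant_to_cefr(score):
    """Return the CEFR level whose Versant band contains the given score."""
    for low, high, level in CEFR_BANDS:
        if low <= score <= high:
            return level
    raise ValueError("Versant scores are reported on a scale from 20 to 80")


print(versant_to_cefr(52))  # B1
```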

A series of validation studies has found that the Versant English Test correlates reasonably with other measures of spoken English skills. For example, the correlation between the Versant English Test and TOEFL iBT Speaking is r=0.75 and the correlation between the Versant English Test and IELTS Speaking is r=0.77.
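For readers unfamiliar with the statistic, the r values quoted above are Pearson correlation coefficients between paired scores from two tests. A minimal sketch of the calculation follows; the sample lists are arbitrary numbers chosen to exercise the function and are not data from any validation study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length, with at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return covariance / math.sqrt(var_x * var_y)


# Arbitrary illustrative numbers, not real test scores.
test_a = [1, 2, 3, 4]
test_b = [2, 4, 6, 8]
print(pearson_r(test_a, test_b))
```

A value near 1 means the two tests rank test takers almost identically, while a value near 0 means scores on one test say little about scores on the other.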

Machine-human correlation

One of the common criticisms of the Versant tests is that a machine cannot evaluate speaking skills as well as a human can. Knowledge Technologies, the company that produces and administers the test, claims that the Versant English Test's machine-generated scores are virtually indistinguishable from scores given by repeated independent human raters at the Overall level.^[13]

Another criticism is that the Versant tests do not measure communicative abilities because there are no interactive exchanges between live participants. Versant's response in Downey et al. (2008) claims that the psycholinguistic competencies assessed in the tests underlie broader spoken language performance. This claim is supported by concurrent validity data showing that Versant test scores correlate highly with well-known oral proficiency interview tests such as ACTFL OPIs or ILR OPIs.

The usefulness of Versant products has been challenged by a third party.^[14]

Management

Alistair Van Moere, President
Ryan Down, Director, Product Management

References

^ Balogh, J., Barbier, I., Bernstein, J., Suzuki, M., & Harada, Y. (2005). A Common Framework for Developing Automated Spoken Language Tests in Multiple Languages. JART Journal, 1(1), 67-79.
^ Bernstein, J. & Cheng, J. (2007). Logic and validation of fully automatic spoken English test. In M. Holland & F.P. Fisher (Eds.), The path of speech technologies in computer assisted language learning: From research toward practice (pp. 174-194). Florence, KY: Routledge.
^ Balogh, J. & Bernstein, J. (2006). Workable models of standard performance in English and Spanish. In Y. Matsumoto, D.Y. Oshima, O.R. Robinson, & P. Sells (Eds.), Diversity in language: Perspective and implications (pp. 20-41). Stanford, CA: Center for the Study of Language and Information Publications.
^ "Korean Organizing Committee For the 2002 FIFA World Cup-Korea/Japan Has Chosen Ordinate's..." AllBusiness.com. 2 Nov 2008.
^ Blake, R., Wilson, N.L., Pardo-Ballestar, C., & Cetto, M. (2008). Measuring oral proficiency in distance, face-to-face, and blended classrooms. Language Learning and Technology, 12(3), pp.114-127.
^ Bernstein, J. & De Jong, J. H.A.L. (2001). An experiment in predicting proficiency within the Common Europe Framework Level Descriptors. In Y.N. Leung et al. (Eds.), Selected Papers from the Tenth International Symposium on English Teaching (pp. 8-14). Taipei, ROC: The Crane Publishing.
^ Bernstein, J. De Jong, J. Pisoni, D. & Townshend, B. (2000). Two Experiments on Automatic Scoring of Spoken Language Proficiency. In P. Delcloque (Ed.) Proceedings of InSTIL2000: Integrating Speech Technology in Learning (pp. 57-61). University of Abertay Dundee, Scotland.
^ Council of Europe (2001). "Common European Framework of Reference for Languages" (PDF). Cambridge University Press.
^ Relating IELTS scores to the Council of Europe's Common European Framework. "Archived copy" (PDF). Archived from the original (PDF) on February 3, 2007. Retrieved January 20, 2009.
^ Tannenbaum, Richard and Caroline E. Wylie (2005). "Research Reports: Mapping English Language Proficiency Test Scores onto the Common European Framework" (PDF). Educational Testing Services. Archived from the original (PDF) on November 21, 2008. Retrieved January 20, 2009.
^ "Mapping TOEFL iBT on the Common European Framework of Reference" (PDF). 2007. Archived from the original (PDF) on February 6, 2009. Retrieved January 20, 2009.
^ "Mapping TOEFL iBT on the Common European Framework of Reference" (PDF). 2007. Archived from the original (PDF) on February 6, 2009. Retrieved January 20, 2009.
^ Downey, R., Farhady, H., Present-Thomas, R., Suzuki, M., & Van Moere, A. (2008). Evaluation of the usefulness of the Versant for English test: A response. Language Assessment Quarterly, 5, 160-167.
^ Chun, C.W. (2008). Comments on "Evaluation of the Usefulness of the Versant for English Test: A Response": The Author Responds. Language Assessment Quarterly, 5(2), 168-172.
