iPads for Assessment – lowering barriers with greater authenticity

The phrase “performance assessment” can mean many different things to different people. (A Google search for the quoted phrase returns over a million hits…)

The lack of “auto-complete” when I check the URL for scale.stanford.edu suggests that the topic may not be as widely researched as I would have guessed, or that this is not a major source or force associated with the phrase. But to my thinking “performance assessment” conflates and represents multiple trends in assessment of interest to teachers, eduministrators and educrats alike: assessment that is more “authentic”, that somehow gets at “critical thinking” skills, and that relies on evidence produced by an examinee that demonstrates what the examinee “knows and understands”, along with their ability to apply that knowledge to novel situations for critical analysis and problem solving.

An example often cited for these sorts of assessments is the CLA+ from the Council for Aid to Education. The Collegiate Learning Assessment has been around for some years, and has been joined by a College and Work Readiness Assessment. But the CLA+ would by almost any definition be a great “poster child” for performance assessment. As they say here:

“CAE has pioneered the use of performance-based tasks in our Collegiate Learning Assessment to evaluate critical thinking skills of college students. CLA+ measures critical thinking, problem solving, scientific and quantitative reasoning, writing, and the ability to critique and make arguments.”

Which I’d say pretty much echoes my own definition above.  But of course this sort of assessment has many challenges for use at scale — from the expense and complexity of development and administration to the reliability and validity of scoring.  [In the interest of being even-handed, I will also point to folks that are less enamored of CAE’s success in achieving their own lofty goals, for example here.]

I will save discussion of the broader topic for another day; my focus here is the challenge of developing “accessible” assessments — that is, tests (aka “educational measurement instruments”) for use by students with different abilities and special needs which make traditional assessments less well suited to them.

It seems self-evident that students who study and learn with assistive technology (AT), particularly technology to assist the visually impaired but including a broad spectrum of accommodations mandated by regulation and law, are going to be best served by assessments that use that same assistive technology. The whole idea of “construct-irrelevant variance” and the importance of validity (and reliability) have been formalized around the notion that any aspect of an assessment which creates variability in the results, but which does not arise from the thing being measured, must surely be a bad thing, and is to be avoided as much as possible by a valid assessment. A written biology exam that requires visual acuity, reading skills, or what have you, might work well for a population of students with relatively consistent capabilities in these areas. For the same assessment to work for an examinee with different abilities in reading or sight, however, assistive technology should mediate the assessment content so that the differently-abled student will be measured for the same degree of mastery and understanding of the biology domain, and for possessing the same knowledge and skills, as the majority-ability group is with the standard assessment.
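One way to make the “construct-irrelevant variance” idea concrete is a simple additive score model. This is only a textbook-style sketch, not a formula taken from any particular standard; the symbols X, T_c, T_i and E are introduced here purely for illustration, and the three components are assumed to be mutually uncorrelated.

```latex
X = T_c + T_i + E
\qquad\Longrightarrow\qquad
\operatorname{Var}(X) = \operatorname{Var}(T_c) + \operatorname{Var}(T_i) + \operatorname{Var}(E)
```

Here X is the observed score, T_c is the part driven by the construct being measured (biology knowledge, in the example above), T_i is the part driven by something else (visual acuity, reading skill, or familiarity with the tool), and E is random error. Under these assumptions, the job of well-matched assistive technology is to keep Var(T_i) as close to zero as possible for every examinee.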

Providing an examinee who requires accommodated assessment with “assistive technology” they are unfamiliar with clearly presents additional challenges to demonstrating competence. A student with no special needs could learn to use a Dvorak or Colemak keyboard, for example, but to force them to use one on an assessment for typing an essay, rather than the QWERTY keyboard they use in their normal schoolwork, would be objectionable (on at least measurement grounds) to just about anyone.

Which is a long way of getting to this point: as large-scale standardized testing continues to grow in adoption across the entire spectrum of K12 education, and statutes and practice ensure that testing is largely inclusive and provides accommodation, the challenge of providing reasonable accommodation in assessment to the broadest spectrum of users at the least possible cost has been driving the large-scale assessment companies to find a solution that, stepping back, seems to be largely divorced from “teaching and learning” as we know it.

The laudable goals of “IMS Global”, or rather, the “IMS Global Learning Consortium”, in their standardization efforts around both “assessment interoperability” and “assessment accessibility”, have led to what can generously be described as a “complex” solution. The APIP Standard and the “standardization process” itself have moved in fits and starts on this front, and the current draft APIP standard may well be abandoned for a fresh attempt. Meanwhile, a very large number of states are moving ever closer to adopting CCSS (or CCSS-inspired) exams across K12, and more and more of these will be computer-delivered.

At the same time, Apple, with its introduction of advanced “accessibility” features in iOS starting several versions (and years) ago, has produced probably the largest and most widely adopted assistive technology in K12 to date. And one might expect others in the device space like Microsoft and Android (Google?) to at least try to catch up at some point (Microsoft, for example, became a sponsor of a new “accessibility professionals” organization, perhaps signaling an increased focus on AT).

So my thought is this:

Would not a simple, native “iOS” test delivery system that was designed from the ground up to harness the power of Apple’s accessibility solution — and the ecosystem of devices and software already integrated with it — provide a far more elegant and cost-effective solution to the challenges of “accessible assessment” than some tacked-on, custom-coded APIP-enabled solution that would attempt to create seamless, cross-platform, reusable accessible content in a world where even standard test items are rarely “interoperable” without custom conversion routines?  

And while automated validation of an XML-based encoding standard might seem like a nifty way of assuring interoperability, there is a considerable distance between validating the APIP encoding of test content files against a markup-schema definition and demonstrating a verified, psychometrically equivalent delivery experience for different students on different platforms. That distance seems like a potentially ruinous chasm.

It is easy to google and find comparisons of iOS to Android on the accessibility front, and easy to find many AT users who are converts to iOS (and not so much vice versa). Microsoft’s offerings seem to have a long way to go, and when I see third-party apps offering to add accessibility features to Windows Phone 8 or Android, it seems pretty clear they are well behind the curve.

Apple’s long-standing support for a user’s choice of language has made Apple products much preferred by bilingual households since forever; the decade-plus lead they have over Microsoft’s clumsy, slam-in-a-different-language / “code page” heritage is still reflected in the relative elegance (to say nothing of the price!) of their solution. And so it seems to me for accessibility. Apple took the time and thought about the problem as an integral part of how they have built their product, and it shows.

I will watch with interest for the release of the Smarter Balanced Assessment Consortium’s open source test delivery solution. Perhaps practical, simple and interoperable APIP-powered item authoring, test creation, delivery and scoring solutions will enter the marketplace and make accessible assessments in K12 an easy and elegant reality as more and more of these tests go online. But that seems unlikely in the first instance; and even if these new systems work, they will require students to be trained, and to practice, so that their specific skills with the assistive technology (AT) come test day will not interfere, or will only minimally interfere, with the measurement itself.

So why create one more barrier if the whole point of AT is inclusiveness? And “authenticity” in testing is about having kids demonstrate what they have learned on a test in the same way that they actually learn every other day in school, and use that knowledge in “real life” — not on some artificial, built-for-assessment “testing platform”.

Perhaps this is a project worth pursuing.
