Building Better E-Assessments
By Margaret Driscoll

Online testing offers untapped potential for instructional designers building e-courses.

Assessments are the foundation of effective instructional practices and return-on-investment studies. The power of tests and assessments will become exponentially more important with the advent of content management systems and learning management systems. Once those applications become AICC standard compliant, course developers can take advantage of assessments as a means of driving advanced learning practices, such as branching and personalization.

Beyond supporting a variety of instructional practices, test data can be used to conduct item analysis and strengthen courses. Indeed, test questions often drive the following instructional practices:

Posttests assess a learner's mastery of content at course completion. For example, after taking a course on basic banking principles, learners are given a 25-item test to assess their understanding of key concepts. Test data is used to determine whether the learner mastered the principles, and a comparative review of scores can determine course effectiveness.

Pretests assess the learner's readiness for instruction and can exempt learners from studying material they already know. Before starting a course on Lotus Notes, the learner is given a test to determine what he or she already knows about using mail, bookmarks, calendar, Web access, and more. Test data will let learners skip modules they already understand.
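The exemption logic can be sketched in a few lines of Python. This is only an illustration: the 80 percent cutoff, the function name modules_to_skip, and the per-module score format are assumptions, not features of any particular delivery system.

```python
from typing import Dict, List

# Hypothetical mastery cutoff; the right value depends on the course.
MASTERY_THRESHOLD = 0.8


def modules_to_skip(pretest_scores: Dict[str, float],
                    threshold: float = MASTERY_THRESHOLD) -> List[str]:
    """Return the modules whose pretest score meets the mastery cutoff."""
    return [module for module, score in pretest_scores.items()
            if score >= threshold]


# Fraction of pretest items answered correctly, per module (made-up data).
scores = {"mail": 0.9, "bookmarks": 0.5, "calendar": 0.8, "web access": 0.4}
print(modules_to_skip(scores))  # ['mail', 'calendar']
```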

Remediation test items can pinpoint knowledge that the learner lacks and branch to additional material. Based on test items that the learner missed, the system determines which sections need additional instruction.
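A minimal sketch of that branching decision follows. It assumes each test item has been tagged with the course section it covers; the item IDs, section names, and the function sections_to_review are hypothetical.

```python
from typing import Dict, List


def sections_to_review(item_to_section: Dict[str, str],
                       responses: Dict[str, bool]) -> List[str]:
    """List each section tied to a missed item, once, in the order missed."""
    needed: List[str] = []
    for item_id, correct in responses.items():
        section = item_to_section[item_id]
        if not correct and section not in needed:
            needed.append(section)
    return needed


# Made-up mapping of items to sections, and one learner's results.
item_map = {"q1": "interest", "q2": "interest", "q3": "loans", "q4": "deposits"}
results = {"q1": False, "q2": False, "q3": True, "q4": False}
print(sections_to_review(item_map, results))  # ['interest', 'deposits']
```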

Adaptation questions get a feel for the learner's strengths and weaknesses. For instance, after taking several lessons and completing the posttests, the delivery system (based on programming decisions made by course developers) determines what different scores imply. If the system observes that the learner consistently achieves a perfect score, it may change the pace of the program, provide fewer simple practice exercises, and introduce moderate or difficult examples more quickly.
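One way to picture such a programming decision is the sketch below. The three level names come from the paragraph above; the function next_difficulty, the thresholds, and the rule for stepping back down after consistently low scores are assumptions added for illustration.

```python
from typing import Sequence

LEVELS = ["simple", "moderate", "difficult"]


def next_difficulty(current: str,
                    recent_scores: Sequence[float],
                    raise_at: float = 1.0,
                    lower_at: float = 0.6) -> str:
    """Pick the next difficulty level from recent posttest scores (0.0 to 1.0)."""
    if not recent_scores:
        return current
    index = LEVELS.index(current)
    if all(score >= raise_at for score in recent_scores):
        # Consistently perfect scores: move to harder material sooner.
        index = min(index + 1, len(LEVELS) - 1)
    elif all(score < lower_at for score in recent_scores):
        # Assumed rule (not from the article): ease off after low scores.
        index = max(index - 1, 0)
    return LEVELS[index]


print(next_difficulty("simple", [1.0, 1.0, 1.0]))  # moderate
```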

Personalization can be used in combination with adaptation questions to create programs that offer individualized instruction. In this case, a learner answers a series of questions to build a learner profile, including name, job title, primary language, and location. Using profile and testing data, the system can tailor a lesson, providing practice exercises that are the right level of difficulty and delivering more relevant examples.

Beware easy test-building software

You don't have to search the Internet very long to find stunning examples of bad tests. Meanwhile, test-building tools have never been easier to use or more accessible, making it possible for anyone to author a test. But ease-of-use and accessibility come with a risk: Technically appealing tests may not be good educationally.

Online tests have the same problems as traditional pencil-and-paper tests, plus some new problems introduced by technology. The most common problem with online tests is failure to link questions to the course objectives. It's important to determine whether an objective that reads "the learner will be able to use, apply, calculate, build" is tested with an item that assesses the learner's ability to use, apply, calculate, and build. However, it's challenging to verify whether objectives link to test items because maneuvering within a course can be difficult and time consuming.

An obvious problem with online tests is the inadequate number of questions that assess the learner's mastery of the content. Many online courses use only one or two questions for each objective, which often are recycled from the practice items. Too few questions may fail to accurately assess the learner's level of comprehension. The questions must proportionally test all content presented, and questions should range from simple to complex. That kind of breadth provides a fair assessment of the learner's mastery.
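If each test item is tagged with the objective it assesses, a quick count can flag objectives that are untested or thinly tested. The sketch below is one hypothetical way to do that; the minimum of three items per objective and the name under_tested are assumptions, not a standard.

```python
from collections import Counter
from typing import Dict, Iterable


def under_tested(objectives: Iterable[str],
                 item_objectives: Iterable[str],
                 minimum: int = 3) -> Dict[str, int]:
    """Map each objective with fewer than `minimum` items to its item count."""
    counts = Counter(item_objectives)  # missing objectives count as 0
    return {obj: counts[obj] for obj in objectives if counts[obj] < minimum}


# Made-up course objectives and the objective tag on each test item.
course_objectives = ["calculate interest", "open an account", "apply fees"]
item_tags = ["calculate interest"] * 4 + ["open an account"]
print(under_tested(course_objectives, item_tags))
# {'open an account': 1, 'apply fees': 0}
```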

Another common problem is that online tests tend to disproportionately assess memorization. Many questions simply ask learners to identify, find, select, and define items using true-and-false or multiple-choice answers. It's easier to write those types of questions, but many courses require learners to gain high-level skills, such as applying, analyzing, synthesizing, and evaluating. When high-level skills are the goal, memorization questions are inadequate at assessing the learner's mastery.

These problems are compounded when course developers choose to use bells and whistles without understanding how those features affect the assessment. For example, many systems offer a timed-test option, but course developers aren't clear about when to use it. Basically, if a timed performance isn't required, developers shouldn't use it. Likewise, graphics and animation may be interesting graphically, but they're a distraction if they don't add content to the test.

Building online tests

Test-building software is simply a tool. The basic principles are the same when writing online or paper-and-pencil tests. But software to create online assessments is getting more sophisticated, as are the reports it generates. If you want to perform statistical tests for reliability and validity, these programs offer useful tools. However, they also have the potential to increase the GIGO quotient (garbage in/garbage out).
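To give a sense of what such reports compute, here is a minimal sketch of two common item-analysis statistics: item difficulty (the proportion of learners answering correctly) and a simple upper-lower discrimination index. Comparing the top and bottom halves of the group is a simplifying assumption, as are the function names and the 0/1 response format; commercial tools may calculate these figures differently.

```python
from typing import List


def item_difficulty(responses: List[List[int]], item: int) -> float:
    """Proportion of learners who answered the item correctly (1 = correct)."""
    return sum(row[item] for row in responses) / len(responses)


def item_discrimination(responses: List[List[int]], item: int) -> float:
    """Proportion correct in the top-scoring half minus the bottom half."""
    ranked = sorted(responses, key=sum, reverse=True)  # best total score first
    half = len(ranked) // 2  # requires at least two learners
    upper, lower = ranked[:half], ranked[-half:]
    p_upper = sum(row[item] for row in upper) / half
    p_lower = sum(row[item] for row in lower) / half
    return p_upper - p_lower


# One row per learner, one column per item (made-up data).
data = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
print(item_difficulty(data, 0))      # 0.75
print(item_discrimination(data, 1))  # 1.0
```

An item that nearly everyone gets right, or one that strong learners miss as often as weak learners, is a candidate for rewriting.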

Here are some tips for writing solid test items.

True-and-false questions are best for testing memorization of factual information. These questions are generally easy to write. If you find you have several true-and-false questions, check whether memorization is a course objective. If memorization isn't the goal, consider developing other test items. To develop true-and-false questions:

  • write one true-and-false statement for each fact the learner must memorize
  • test only significant facts rather than trivia
  • use statements that are unequivocally true or false
  • write short statements
  • eliminate unnecessary detail that may confuse learners
  • avoid negative statements and double negatives, which can add an unnecessary cognitive burden.

Multiple-choice questions test facts and the application of knowledge. These questions are more difficult to write, but they're the foundation for branching, item analysis, adaptive learning, and personalization. To develop multiple-choice questions:

  • test only one piece of information per question
  • use only plausible distracters
  • use four or five distracters
  • avoid using answers that can be easily eliminated
  • review questions for inadvertent clues
  • be sure the software supports partial credit answers
  • avoid using "all of the above" and "none of the above" answers as distracters. If you use them, make "all of the above" and "none of the above" plausible answers.

Fill-in-the-blank questions test facts or the application of knowledge. These questions are challenging to write because limitations in some packages accept only an exact word or phrase. To develop fill-in-the-blank questions:

  • write questions that have a single answer
  • keep in mind the software's limitations
  • use only one blank per question
  • use fill-in-the-blank when you want the learner to recall information rather than select or identify information
  • consider the risk of spelling and debatable answers. If the system can support a range of answers, be prepared to manually review and grade responses for fairness.

Matching questions test for memorization, and are considered a variation on multiple-choice questions. To develop matching questions:

  • check the screen layout; items should fit on the screen without scrolling
  • consider the effect of different screen resolutions
  • avoid using software that draws connecting lines between items
  • note whether learners can use the same distracter multiple times
  • provide more possible distracters (answers) than stems (terms to be defined). When the columns are of equal length, learners can guess the answers through process of elimination.

Test-building software options

Whether you own software for generating online tests or are thinking about buying software, consider whether you might use the following features:

Test banks and randomization. Test banks are a warehouse for individual questions that use metatags. Metatags provide information about each question, such as objective, author, difficulty, and so forth. Test banks are an important feature for creating random tests that provide each learner with a unique set of questions.
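As a rough illustration, a test bank can be modeled as a list of questions that carry their metatags, with a random draw per objective. The Question fields mirror the metatags named above; the draw_test function and its per_objective parameter are hypothetical.

```python
import random
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Question:
    text: str
    objective: str   # metatag: objective the item assesses
    author: str      # metatag: who wrote the item
    difficulty: str  # metatag: e.g. "simple", "moderate", "difficult"


def draw_test(bank: List[Question], per_objective: int,
              rng: Optional[random.Random] = None) -> List[Question]:
    """Randomly draw up to `per_objective` questions for every objective."""
    if rng is None:
        rng = random.Random()
    by_objective: Dict[str, List[Question]] = {}
    for question in bank:
        by_objective.setdefault(question.objective, []).append(question)
    test: List[Question] = []
    for questions in by_objective.values():
        count = min(per_objective, len(questions))
        test.extend(rng.sample(questions, count))
    rng.shuffle(test)  # mix objectives so the order also varies per learner
    return test
```

Drawing by objective keeps every random test proportional to the content, so two learners see different questions but are assessed on the same objectives.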

Scoring. In most software programs, course developers have the ability to control the value for correct, incorrect, and blank answers. Variable scoring gives you greater control over weighing items. When using variable scoring, be sure to provide learners with clear directions regarding how answers will be scored.
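A minimal sketch of variable scoring appears below. The point values, including the quarter-point penalty for a wrong answer, are illustrative assumptions, as is the function name score_test.

```python
from typing import Dict, Optional


def score_test(answers: Dict[str, Optional[str]],
               key: Dict[str, str],
               correct: float = 1.0,
               incorrect: float = -0.25,
               blank: float = 0.0) -> float:
    """Total a test using separate values for correct, incorrect, and blank."""
    total = 0.0
    for item_id, right_answer in key.items():
        given = answers.get(item_id)  # None means the item was left blank
        if given is None:
            total += blank
        elif given == right_answer:
            total += correct
        else:
            total += incorrect
    return total


answer_key = {"q1": "a", "q2": "b", "q3": "c"}
learner = {"q1": "a", "q2": "d"}  # q3 left blank
print(score_test(learner, answer_key))  # 0.75
```

Because a wrong answer costs points here while a blank does not, learners need to be told so before they start.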

More important, designers should take a look at how test-item types are implemented. Fill-in-the-blank, drag-and-drop, and matching formats differ dramatically among test-building tools. For example, for fill-in-the-blank items, some systems accept only one right answer, others accept five possible answers, and some use fuzzy logic. Consider the question, "What is the capital of the United States?" Correct answers can be Washington D.C.; Washington DC; Washington, DC; DC; and so forth.
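The sketch below shows one way a system might accept those variations: it normalizes case, punctuation, and spacing, then optionally tolerates small misspellings using Python's difflib. It is an illustration only, not how any particular package implements fuzzy logic, and the 0.85 similarity cutoff is an assumption.

```python
import difflib
import re
from typing import Iterable


def normalize(text: str) -> str:
    """Lowercase, drop punctuation, and collapse runs of whitespace."""
    without_punctuation = re.sub(r"[^\w\s]", "", text.lower())
    return " ".join(without_punctuation.split())


def is_correct(response: str, accepted: Iterable[str],
               fuzzy_cutoff: float = 0.85) -> bool:
    """Accept an exact normalized match or a close misspelling."""
    candidates = [normalize(answer) for answer in accepted]
    given = normalize(response)
    if given in candidates:
        return True
    # Tolerate small typos; a higher cutoff means a stricter match.
    return bool(difflib.get_close_matches(given, candidates, n=1,
                                          cutoff=fuzzy_cutoff))


accepted_answers = ["Washington DC", "DC"]
print(is_correct("Washington, D.C.", accepted_answers))  # True
print(is_correct("Boston", accepted_answers))            # False
```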

Timing. Some programs can track latency, or the time it takes a learner to complete a single item or the overall test. The ability to control and monitor time is useful only when testing a time-dependent skill.
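For readers curious about the mechanics, latency tracking amounts to recording a timestamp when an item appears and another when it is answered. The LatencyTracker class below is a hypothetical sketch; the clock is passed in so the logic can be checked without waiting in real time.

```python
import time
from typing import Callable, Dict


class LatencyTracker:
    """Record how long a learner spends on each item, in seconds."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._started: Dict[str, float] = {}
        self.latencies: Dict[str, float] = {}

    def show(self, item_id: str) -> None:
        """Call when the item is displayed to the learner."""
        self._started[item_id] = self._clock()

    def answer(self, item_id: str) -> float:
        """Call when the learner answers; returns the item's latency."""
        elapsed = self._clock() - self._started.pop(item_id)
        self.latencies[item_id] = elapsed
        return elapsed

    def total(self) -> float:
        """Time spent across all answered items."""
        return sum(self.latencies.values())
```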

Clear instructions. When asking learners to answer fill-in-the-blank or multiple-choice questions, be sure to provide directions. For example, state whether spelling is a scoring factor or if they can select more than one answer on multiple-choice questions.

Reviewing and submitting answers. Programs offer various options for submitting completed tests. For example, questions can be presented sequentially so that the learner is required to answer them in a specific order before moving forward. Another option is to break the test into sections relative to the content and have sections submitted one at a time.

An alternative scheme is to have learners scroll through questions, allowing them to return to ones they don't know. In this scheme, learners submit the entire test when completed. Scrolling through a long test may allow more consideration before final submission, but it can also become unmanageable for the learner. Overall, it's desirable to be able to return to unanswered questions. Good design may provide cueing devices or indicators to identify unanswered questions.
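A cueing device can be as simple as a list of the items still unanswered, shown before the learner submits. The function below is a hypothetical sketch that treats a missing or whitespace-only response as unanswered.

```python
from typing import Dict, List, Optional


def unanswered(item_ids: List[str],
               answers: Dict[str, Optional[str]]) -> List[str]:
    """Return the items with no response, in test order."""
    return [item_id for item_id in item_ids
            if not (answers.get(item_id) or "").strip()]


test_items = ["q1", "q2", "q3", "q4"]
so_far = {"q1": "b", "q3": "  ", "q4": "true"}
print(unanswered(test_items, so_far))  # ['q2', 'q3']
```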

Testing is a powerful online tool, but there's a long way to go before online testing is used to its full potential. If you're designing online tests, review guidelines for developing effective pencil-and-paper test items. Determine how test data can drive sophisticated practices, such as branching, remediation, and personalization. The next step is to move from test design to implementation of online testing. Consider scoring options, analysis reports, and how to link test scores with learner profiles. Finally, learn as much as you can about the features of the test-building software you plan to use.


Margaret Driscoll is director of strategy and ventures at IBM Mindspan Solutions and author of Web-Based Training from Jossey-Bass. She also is an adjunct faculty member at the University of Massachusetts and Suffolk University in Boston.

 

 
 