Grading the 2010 AP English Language Exam: Eyesores

During a week in Louisville, I spent 53 hours reading student essays that were recorded in illegible scrawls requiring intense eyestrain to decipher. During that time, I graded more than 2,000 exams, spending a little less than a minute on each essay. I quickly grew tired of reading about Jon Stewart, Wanda Sykes, Chris Rock, Larry the Cable Guy, Tina Fey, and a slew of humorists I had never heard of before my arrival in Louisville. The only thing that pulled me through this slog of essays was the occasional gem in the rough, an essay whose unintentional comedy would lead to laughter. Let me share with you the last of these gems which students thought would impress exam readers:

The following are excerpts from actual exams; each excerpt is in italics, with my commentary in normal typeface.

I think that humorists are to entertain and nothing else. If they were trying to send a message, wouldn’t they get a reply?

Because television hosts like Stewart and Colbert are seen by millions, they know what they’re talking about. Sure—and because the National Enquirer is read by millions, I believe that aliens have abducted Britney Spears.
Glenn Beck is a giant jerk.
If he hear me tell a joke like that, he slap me faster than two jiggles of a jackrabbit’s ass. HUH?
My brother and I are demon hunters who drive around the country in our 1967 Impala fighting the forces of evil. You know, suddenly the National Enquirer is looking a lot more credible.
There are those who don’t like comedians because they take offense and one should not be so touch-e. You know what? Touché.
There’s songs out that reveal people are devil worshippers. What is it with the demons and devils? Were these students possessed?
For example, “Mary! Mary! How does your garden grow, filled with trash and gum wads.” This portrays how her sidewalks are filled litter. Dude—quit forcing Captain Planet onto Mother Goose!
In my lifetime I have lived in a family of foolish people. Which explains you.
Once a stand up comic such as Adam Sandler expressed the fact of him never being able to make a woman orgasim this put many people to understand that ‘your not alone.’ (Shaking head)
Like a political cartoon on Obama that had mooses and elephants to represent the Republican and Democrat Party. Donkeys—Elephants and donkeys. And the plural is moose.
• But my favorite: Comedians point out the ugly truth so that even the airhead bimbo who wrote a similar paper to the one your reading now can understand the subliminal message. Their ability to humorously attack the wishy-washy statements of government without getting pimpslapped by federal agents is what makes humor’s role in society extremely vital. Now you’re talking—students who are bimbos writing about comedians getting pimpslapped—that’s more like academic discourse!

Grading the 2010 AP English Language Exam: Prepare to be Assimilated

While the grading standards set forth in the official grading rubric for each essay question might seem to be straightforward, you’ll find that most graders disagree strongly as to what makes for an “adequate” essay versus an “inadequate” essay, and that those disagreements grow even more strident when you’re discussing minor variations: What distinguishes an inadequate 3 from an inadequate 4? An adequate 6 from an adequate 7?

The sorts of natural disagreements that any two individuals might have over these sorts of questions are complicated by grader demographics. I would estimate that approximately 50% of the 2010 English Language exam graders were high school teachers; another 35% or so were teachers at the community college level, and the remaining 15% were either graduate students or faculty members at major universities. Think, for a moment, about the implications of that spread: a university professor almost certainly grades papers of a higher quality than a community college professor does, and a community college professor almost certainly grades papers of a higher quality than most high school teachers do. This means that graders come into the process with widely divergent expectations which must be reconciled so that scores will be standardized and no student’s scores will be skewed by a single grader’s prejudice against writers who regularly split their infinitives.

In this sense, the “Reading,” which is how ETS refers to the week-long grading process, is a lot like the Borg: if you’re not prepared to be assimilated into a greater collective, you’re in for a rude awakening. The “Chief Reader” presides over the grading process; for the 2010 English Language exam, this was a BYU professor named Gary Hatch who, tragically, died just a month before the Reading was to convene and was replaced by a University of professor named David Joliffe. Joliffe oversees the entire process of grading the three essays, but three “Question Leaders” are also designated, one to oversee the grading of each question. These question leaders oversee between 300 and 400 graders, who are grouped into tables of ten, and each table is presided over by a “Table Leader.” When more than 1,100 graders descended on the 2010 AP English Language exam on June 11, 2010, they were greeted by table leaders who had already been onsite for two days deriving a consensus as to which essays merited a score of 3, which a score of 4, etc.

These table leaders, in conjunction with the question leader, had copied sample essays that reflected the entire range of scoring possibilities to help graders develop standardized scoring criteria. But graders had just four hours to fall in line with the standards that table leaders had spent two days developing. Naturally, this would produce heated disagreements at each table as to why one sample essay deserved a 4 when a university professor saw it as a 2 and a high school teacher saw it as a 6. For four hours we haggled over sample essays while the question leader periodically polled the room to determine whether we were arriving at consensus. When I raised concerns over the last sample essay before graders would switch over to “live books” of ungraded exams, my (wonderful) table leader stared at me with exasperation: “Monk, we have no more time for disagreement. This is a 7. See it as a 7. Be assimilated.”

So I abandoned my individual will and became part of the Borg.

Of course, readers were still adjusting to the grading standards at this point, so table leaders periodically spot-checked every reader at their table during the first two days, re-grading every fifth essay or so. When table leaders felt that their charges were straying too far from the established standards (a scoring difference of more than one point), they pulled that reader aside and explained why the essay he or she had given an 8 was really a 4. My weak, fleshy brain was repeatedly disciplined for not adopting the mechanical correctness of the Borg. I resolved to do better.

Two problems in grading the exam were particularly difficult for me. The first arose when students made statements that were clever (or at least required thought), but I wasn’t sure whether the subtleties of their prose were intentional. For instance:

Humorists are a big joke. How is one to interpret this?
Humorists are like Santa Claus on Christmas Eve. He may not be real or the truth but he brings smiles to all. In an essay claiming that humorists are important players in society, how much credence can I give to the ironic undertones here?
Harassment charges would be brought down on sexually explicit comics like Thor’s hammer. Maybe—but more to the point, is it wrong to invoke the god of thunder in an academic essay?
Humorists are why we aren’t a communist nation. They keep us divided. This might be true . . . but does the student actually understand this argument?
Humorists don’t wear the condom of censorship while breeding out the beautiful baby known as the naked truth. Well, when you put it that way, I guess they don’t. But do I reward you for a sophisticated metaphor or punish you for using informal language?

The second problem was a student tendency to describe works that I considered “serious” as “humorous” because they did political work—and the students understood that de Botton wanted them to talk about the political function of humor.

• As a result, I got students who cited the following works as “humorous” literature: Stowe’s Uncle Tom’s Cabin (nothing like slavery for a good laugh!), Shakespeare’s Macbeth, Hamlet, and King Lear; Machiavelli’s The Prince (ah, the humor of despotism!); Orwell’s 1984; Miller’s The Crucible (and repression!); Stephenie Meyer’s Twilight series (and ineptitude! Okay, that was a bit harsh); Conrad’s Heart of Darkness (did he even think about the title?); Thoreau’s Walden; Sinclair’s The Jungle (nothing funnier than drowning in a vat of boiling fat); Tolkien’s Lord of the Rings series (I admit that hobbits are funny); Ellison’s Invisible Man; Golding's Lord of the Flies (whose macabre depictions of adolescent cruelty are NOT funny); Melville’s Bartleby (nothing like depression and suicide for a good laugh!); Ayn Rand’s The Fountainhead; and Dante’s Divine Comedy. At least this last one had “comedy” in the title, but every one of these books is far more dark than comic, more tragic than titillating.
• We also had students who suggested that films were funny, including The Dark Knight and The Godfather. Yup—a barrel of laughs, those two. "Why so serious?"
• Perhaps most inexplicable was the list of political figures that students described as “humorists.” These included Gandhi (!), Martin Luther King Jr., John Locke, and Thomas Hobbes. Nothing funnier than Leviathan, let me tell you.

As a grader I wanted to reward students for what were, occasionally, intelligent analyses of challenging texts—but I also had to consider the fact that these students failed to understand the basic point of de Botton’s argument that humor makes political statements possible in circumstances when serious works such as the ones above would have been repressed or censored. That was a tough balancing act. Similarly, should I reward students whose arguments were sound but whose facts were faulty?

Kurt Vonnegut, in his novel Animal Farm, satirizes communism. Well, no. Orwell satirizes communism—so does the student get credit or not?
Dickens satirizes the French government in Les Miserables. Well, no. Victor Hugo does—credit or not?
Satirical writers have been around since we came to North America. In Praise of Folly is one of the greats. The writer shows that no changes are ever occurring and we are a corrupt nation. First of all, In Praise of Folly was written by a Dutchman only a few years after Columbus died, so there was no “nation.” But the rest of the argument was sound . . .
Like a woman trying to cover up her blemish, society attempts to cover up its mistakes using a little puff of powder. Back to the ways in which comedians are pimples . . .
Mark Twain wants to have someone institute an emancipation policy on slavery in Huckleberry Finn. Well—no, he doesn’t, because Abraham Lincoln emancipated the slaves 20 years before Huck Finn was ever published! But he does criticize slavery as an historical institution . . .

It was hard to know which students deserved the benefit of the doubt, and that question often made a significant difference in score.

Grading the 2010 AP English Language Exam: The Rubric

After the prompt, the official grading rubric is the most important document for any grader in assessing the quality of a given exam. Each essay is scored on a scale from 1 to 9, but graders are encouraged to interpret that range as a series of decisions. The first question every grader is supposed to ask: “Is this an upper-half or lower-half paper?” Lower-half papers include scores 1-4; upper-half papers include scores 5-9.

Once the grader has identified the essay as “upper-half” or “lower-half,” they break it into a further subset. Lower-half papers are further divided into two groups. Those that are “Inadequate” have evidence that is “inappropriate, insufficient, or less convincing” or the argument is “inadequately developed.” Inadequate papers receive a score of 4; inadequate papers that demonstrate “less success” in responding to de Botton receive a score of 3. The other major category for lower-half papers is “Little Success,” for papers that “misunderstand the prompt, or substitute a simpler task by responding to the prompt tangentially with unrelated, inaccurate, or inappropriate explanation.” Papers that demonstrate “little success” receive a score of 2; papers that demonstrate “little success” but are “especially simplistic” receive a score of 1.

On the upper half of the equation were two similar divisions; papers receiving a score of 8 or 9 were those deemed “effective” and those receiving a score of 6 or 7 were deemed “adequate.” But the hardest score, by far, was the upper-half number that hovered somewhere between an “inadequate” 4 and an “adequate” 6; essays designated with a score of 5 are supposed to “convey the writer’s ideas” but in an “uneven, inconsistent, or limited manner.” Telling the difference between an inadequate 4, an uneven, inconsistent, or limited 5, and an adequate 6 was the most difficult distinction for any reader to make.

Graders were expected to “take everything into account: content, organization, diction, sentence structure, spelling—everything.” Everything, that is, except handwriting, because if we were allowed to factor handwriting into the grading, there would be few if any upper-half scores. Over a week of grading, nothing required more effort to grade than a superior essay hidden behind indecipherable handwriting—except maybe a bad essay hidden behind bad handwriting. Nonetheless, I resolved not to penalize bad handwriting, lest the ghosts of my youth—when I routinely received failing grades for my handwriting—come back to haunt me.

I’m sure that many exam-takers worry about typos and grammatical errors submarining their scores, but those issues were largely ignored by graders who recognized that every single one of these essays was a first draft written under pressure. Only if a paper contained “many and distracting errors in grammar and mechanics” did the rubric instruct graders that it could not receive a score “higher than a 2.” I ignored the vast majority of grammatical and mechanical errors that I saw over the past week, but some are just too delicious not to share.

• Without appreciating the irony of his mistake, one student misspelled Ellen DeGeneres’ name as Degenderous. Do you think, perhaps, that he appreciated her stand-up routines on a subliminal level?
• In what might have been the single most wretched sentence of the entire exam, one student stated that, For example, writer of the pervious centreryes (100 years) relayed on satire in play, opera, book, and poemty. Holy Hannah! I’d rather be struck blind than read book and poemty written by this student for an entire centrery.
• Then there was the rather defensive—and lazy—student who asserted, I know how to do an argument essay, to bored to do it. Also, this gym is cluster phoebis to the max, I can’t work in this compact of a place. Hey, if you’re to tired, far be it from me to beg for more.
• This one would have been pretty clever, if only it had been intentional: The role of humorists in society can be characterized by bringing in to light the fears and destresses of society. Nothing to relieve distress like a little de-stressing comedy.
. . . as for a humorist, they just wanted a good massage to be pass down. Hey, me too—after a day spent unmoving in these chairs, I could use a good massage myself.
These humorists say unspeakle things. Yup—and so do you.
• Then there’s the student who said exactly what I think about most of the vulgar comedians I read over and over about during the week: There are comedic writers in every orafice of the entertainment business. You got that right—if only they’d stay in those orifices, instead of popping out, like pimples on . . . You know what? I think we should end this analogy.
Many cartoonists are key players in hiddening hidden messages. No comment.
• I frequently heard about people that no one in their right mind would consider a humorist—including radio alienators such as Rush Limbaugh. The student, apparently, remembered that Limbaugh made Obama out to be an illegal alien. Or, at least, I assume that’s what s/he was thinking.
• Other conservative figures invoked by students included the Christian comic Brian Regan: Comedy is a tool Brian uses to evangulate people. I can only assume that evangulate is a composite of evangelize and strangulate—which, in all fairness, is a combination that I think does justice to Regan’s routines.
• In another Christian-themed gaffe, one student wrote, Screaming “Jesus sucks in church” is very inappropriate. Yes—and so is screaming “Jesus sucks” in church, which is what I think she meant.
• Many students had spent too much time listening to Stephen Colbert’s “Word of the Day,” and apparently trusted overmuch in his “truthiness”: Comedians show the idiocracy in subjects that must be addressed.
• Of course, such baffling assertions were often followed by a recognition of the students' own uncertainty. As though an AP grader might pop out of the paper and respond, students frequently asked, Does this make any sense?
• Sometimes student responses did make sense—especially when they stated their point more than once: Protaining to this topic not only do a person like my self agree with Botton’s claims . . . but I also support it. Ok; glad we confirmed that one.
• Perhaps the single funniest spelling error was made in an essay describing Jonathan Swift’s famous “Modest Proposal” in which he satirically suggests that Irish babies be raised as food to alleviate Irish poverty; describing this cannibalistic suggestion, one student called Swift’s essay an outrageous and digesting solution to the problem. And, I’m sure the student would be quick to point out, Swift’s solution was also disgusting.
• But my favorite typo—bar none—came at the end of an essay, as the student concluded his argument with a rhetorical flourish: How else could this point be better statted? Err . . . do you really want me to answer that?

Grading the 2010 AP English Language Exam: The Prompt

Educational Testing Service (ETS) administers two Advanced Placement (AP) English exams, one that assesses students’ ability to write (English Language) and one that assesses student knowledge of famous works of literature (English Literature). On the appointed day in May every year, high school students across the country line up to take these exams, hoping that their score, on a 1-5 scale, will allow them to test out of freshman composition in their college of choice; for most students this means that they need to score at least a 3 on the exam. Once students have completed the exam, ETS assembles the exams and transports them to a single location, where graders from across the country will assemble to read the essay portion. This year, that assembly took place in Louisville, Kentucky, where between 1,100 and 1,200 educational professionals (ranging from high school teachers to adjunct faculty at community colleges to graduate students to tenured faculty at Research I universities) descended to grade more than 350,000 English Language exams.

The AP English Language exam includes a series of multiple-choice questions and three essay questions. Each essay question asks students to do something slightly different: the first asks students to synthesize and summarize three different sources of information; the second asks students to analyze the language of a selected passage of prose, usually a speech or persuasive essay; and the third asks students to construct an argument. Students write all three essays by hand in an exam booklet, and their final score (1-5) is determined by averaging the scores assigned to each essay and to the multiple-choice questions.

While every student writes three essays, a different grader reads each essay (to make sure that no student’s score is overly dependent on a single perspective) and each grader is assigned to a specific question. When I arrived at the first day of grading and registered as a reader, on Friday, June 11, 2010, I was informed that I would be grading essay question three: argument. I had already read the prompt for each of the essay questions as part of my preparation, but before I arrived in the cavernous hall that would be my home for the next week I re-read the prompt:

“In his 2004 book, Status Anxiety, Alain de Botton argues that the chief aim of humorists is not merely to entertain but ‘to convey with impunity messages that might be dangerous or impossible to state directly.’ Because society allows humorists to say things that other people cannot or will not say, de Botton sees humorists as serving a vital function in society. Think about the implications of de Botton’s view of the role of humorists (cartoonists, stand-up comics, satirical writers, hosts of television programs, etc.). Then write an essay that defends, challenges, or qualifies de Botton’s claim about the vital role of humorists. Use specific, appropriate evidence to develop your position.”

Rereading the prompt, I was excited: this was, I thought, clearly the most interesting of the three questions, and I looked forward to reading essays about funny people, events, and art for the next week. What I didn’t consider was the fact that many of my students would not understand the prompt at a basic level. I would definitely be laughing as I read these essays over the next week, but most of my laughter would be prompted by the unintentional comedy of students’ misunderstandings and misstatements.

The first problem that students seemed to have was coming to terms with the definition of the word “humorist”—despite the root word “humor” and the many examples provided in the prompt. For instance, I had students who wrote:

Humorist hmm . . . what is your opinion about this people? Most likely your thinking that they are humans who work to make other people laugh. Yes. You’re right. I was thinking that they are humans.
[Humorists’] feelings are usually pure, unadulterated and unedited, stripped away of stupid niceties and fluffy language, the voice of humorists become the voice of reason. Um . . . right. This is how I’ve always thought of Robin Williams—the pure and unadulterated voice of reason.
Although many satirists and comedians receive a lump sum of money after a days work, the majority do so out of compassion, and love for the people. So now humorists are both the voice of reason AND Christ figures? Who also seem to have a lot in common with Judas? I’m confused.
Humorist can basically be your second parents. Methinks this student watched Adam Sandler’s Big Daddy one too many times . . .
For example, the founding fathers of the United States would be considered as humorists. You know, I always thought that Washington was a funny guy.
Humorists are emotionless and do not care for other people’s feelings. Of course not—that’s why they spend their lives making other people laugh.
Commercials are also most of the time humorist. Um . . . you mean humorous?
Humorists play a vital role in society along with all the other organisms in today’s world. Yup, humorists and gut bacteria—vital organisms.
Pearl in The Scarlet Letter is a humorist. Yes, I always thought that Nathaniel’s novel about adultery, sin, and the Puritan culture of shame was hilarious. This, however, was only my second favorite Scarlet Letter reference—I couldn’t stop myself from laughing out loud when I read about the humorous circumstances of “Heather” (instead of Hester) and “Ruby” (instead of Pearl). At least the student remembered that the daughter’s name was a precious stone of some sort.

The second problem that students had with the prompt revolved around their understanding of who Alain de Botton was and what they needed to say about him. I had students who wrote:

Botton will make the audience to be active because of his humor. He will not bore them and will not make them fall asleep. The purpose of being like Botton is to aim what people wants to hear. Me too. I want to be funny like de Botton too.
Maybe Alain also believes in a better tomorrow; one where presidents can be safe from flying shoes or where chickens can cross roads without being questioned about their motives. Anonymous student, this is a tomorrow that I want to live in.
Mark Twain and Alain de Botton sound similar to me. Me too. But please, continue: As soon as I read that Botton is a humorist writter Twain instintly popped into my head and that is a excellent writter. Yes, Twain writtes almost as well as you do. Botton may have been close to Twain, they may have been best friends. Well maybe they would have been friends—if they had lived in the same century!
If people like Botton don’t like it then boo-hoo build a bridge, get over it! The world doesn’t revolve around you. People like him are so stupid. I hate people like him. Yikes! I hope they never use my name in an AP prompt.

The last prevalent misunderstanding of the prompt involved a failure to comprehend the word “impunity.” I could have pulled any number of samples just like these:

Alain de Botton is against humorists things because they impunity message. Okay . . . misunderstanding the word impunity clearly wasn’t the only problem here.
Do some of the things we hear, see, or read give us impunitive messages that can be harmful or dangerous? This one was fun to think about—how would you define impunitive?

