01 December 2015

Mail Time: What is your test format in AP?

From Josh:

I'm a new AP physics teacher and have been struggling with designing a set format for my summative assessments.*  I've read a lot about what others do, but I'd like to see if there's anything particular that you do in class.  Some teachers have suggested making the entire test out of 45 points, with 15 MC and an FRQ or two.

* i.e. TESTS

Also, do you scale your tests?  I've read that a lot of teachers do, but personally I find that it skews the data on the learning objectives and hides what students have learned.  I can see scaling an actual AP exam given in class.  The students at my school get a bump in GPA for taking an AP course, so scaling on top of that provides a huge jump, and I'm having a hard time justifying that.

Hey, Josh... you've asked a couple of million dollar questions.  There's no one best answer, obviously. I'll give some detailed thoughts below about what I do.

But first, the disclaimer: while it's important to assign a fair grade to a fair test, that's NOT the fundamental point of testing.  Our overriding goal is for students to get the right feedback in the right context to improve their long-term understanding of physics.  A test is the standard, useful tool for that feedback; grades and GPA are powerful motivators. But we don't want students lawyering for points at the expense of figuring out concepts.  To that end, I think you've got the right idea: come up with a standard approach for your class, so that the "rules of the game" are fixed and known.

So how do I structure AP Physics 1 tests?

I try to test during lab periods, to get the longest chunk of time possible.  Students do better on longer tests (because they have more opportunity to see problems they can handle, and because they can knock off an easy-for-them problem quickly and have more time for the difficult problems), so I give as long a test as I can arrange.  When there's not a convenient long period, I've given, say, 40 minutes of multiple choice one day, then 40 minutes of free response the next to make a full test.

In constructing the test, I try to use exclusively old AP items, even though that means using Physics B items as well as released Physics 1 items.  That's fine for me, 'cause I figured out that it's best to start my class targeting the old Physics B exam, then introduce more writing and description as the course continues.  I've decided to include "short answer" items designed to take about three minutes to answer, as well as free response and multiple choice.  These short answer items also come straight from released AP items: either multiple choice with "justify your answer," or a single part of a free response question.

Whatever you do, I suggest being very consistent with the timing: new AP Physics 1 multiple choice should be 1:45 or so per question, and free response should be 2 minutes per point, with 7-point and 12-point questions.  Students may run out of time on a test early in the year, and that's a fine learning experience.  As the year goes on, if you're consistent with test structure, students will learn the correct pace.  And then they will know exactly how to deal with time on the authentic AP exam.
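If it helps to see the arithmetic, here's a rough sketch of how those pacing numbers add up to a test length.  The function name and the sample test (15 multiple choice, one 7-point and one 12-point free response) are just my illustration, not a required format.

```python
# Pacing rules from the paragraph above; names are illustrative only.
MC_SECONDS_PER_QUESTION = 105   # about 1:45 per multiple choice question
FR_MINUTES_PER_POINT = 2        # 2 minutes per free response point


def planned_minutes(num_mc, fr_points):
    """Return the total minutes to allow for a test.

    num_mc    -- number of multiple choice questions
    fr_points -- list of point values, one per free response question
    """
    mc_minutes = num_mc * MC_SECONDS_PER_QUESTION / 60
    fr_minutes = sum(fr_points) * FR_MINUTES_PER_POINT
    return mc_minutes + fr_minutes


# Hypothetical test: 15 MC plus one 7-point and one 12-point free response.
print(planned_minutes(15, [7, 12]))  # 64.25
```

So a test like that wants a bit over an hour, which is why a lab period works better than a regular class period.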

As for scaling the tests... I wouldn't think in terms of "learning objectives" -- just teach physics.  The tests should reflect how the AP exam tests the material you're covering in class.  If you give authentic AP items -- thus approximately controlling the difficulty of the tests -- then you can make a reasonable guess that the approximate percentages from last year will hold.  Last year, about 70% was a 5, 55% a 4, 40% a 3.  [On the Physics B exam, the scale was five points lower -- that is, 65% was a 5.]  
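Here's a small sketch of that lookup, assuming the approximate cutoffs quoted above.  I haven't listed cutoffs for 2s and 1s, so the function simply returns `None` below the 3 line; the function name and the `physics_b` flag are made up for illustration.

```python
# Approximate raw-percentage cutoffs quoted above (not official values).
AP_CUTOFFS = [(70, 5), (55, 4), (40, 3)]


def estimated_ap_score(raw_percent, physics_b=False):
    """Estimate an AP score from a raw test percentage.

    With physics_b=True every cutoff is lowered by five points,
    matching the note that the Physics B scale was five points lower.
    Returns None below the cutoff for a 3, since no cutoffs for
    2s and 1s are given here.
    """
    shift = 5 if physics_b else 0
    for cutoff, score in AP_CUTOFFS:
        if raw_percent >= cutoff - shift:
            return score
    return None


print(estimated_ap_score(72))                  # 5
print(estimated_ap_score(65))                  # 4
print(estimated_ap_score(65, physics_b=True))  # 5
```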

When I convert the raw percentage on a test to a publishable school scale, I have two considerations:

(1) Corrections contribute half-credit back to the raw percentage.  See this post and search the blog for "corrections."

(2) Ideally, a 5 with corrections converts to an A; a 4 converts to a B or B+; and a 3 converts to a B or B-.  2s become Cs, 1s become Ds or Fs.
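For consideration (1), here's one way the half-credit arithmetic could look.  This is a sketch under an assumption: that a student earns back half of the missed points that he corrects successfully.  The function name and the `fraction_corrected` parameter are hypothetical.

```python
def corrected_percent(raw_percent, fraction_corrected=1.0):
    """Return the test percentage after corrections.

    Assumption: each missed point that is corrected successfully
    earns half credit back.

    raw_percent        -- original score, 0 to 100
    fraction_corrected -- share of the missed points corrected, 0 to 1
    """
    missed = 100 - raw_percent
    return raw_percent + 0.5 * fraction_corrected * missed


print(corrected_percent(60))       # 80.0 (all misses corrected)
print(corrected_percent(60, 0.5))  # 70.0 (half the misses corrected)
```

Under that assumption, a raw 60% with complete corrections becomes an 80%.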

In the past I've used a "square root curve" to convert from the corrected test score to a 90-80-70-60 school scale.  Lately, especially as the raw standard for an AP score has increased, I've gone to a scale based on the New York Regents exam -- there, about an 85% converts to an A- equivalent, a 70% converts to a B- equivalent.
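For anyone who hasn't seen a square root curve, here's a minimal sketch, assuming the usual version: take the square root of the percentage and multiply by 10.  The letter-grade helper assumes a plain 90-80-70-60 scale without plus and minus grades; both function names are my own.

```python
import math


def square_root_curve(percent):
    """Apply the common square root curve: 10 * sqrt(percent)."""
    return 10 * math.sqrt(percent)


def letter_grade(scaled):
    """Map a scaled score to a letter on a plain 90-80-70-60 scale."""
    for cutoff, letter in [(90, "A"), (80, "B"), (70, "C"), (60, "D")]:
        if scaled >= cutoff:
            return letter
    return "F"


for corrected in (81, 64, 49):
    scaled = square_root_curve(corrected)
    print(corrected, scaled, letter_grade(scaled))
# 81 90.0 A
# 64 80.0 B
# 49 70.0 C
```

Notice that the conversion depends only on the student's own score, never on how the rest of the class did.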

Exactly how you convert doesn't matter... the only important part here is that the conversion from raw scores to a "school scale" must be identical throughout the year, and NOT EVER based on performance.  "Curving" a test by, say, making the highest score an A creates a perverse set of incentives for students to tank, or for high-performing students to be ostracized.  (Would a baseball team ever encourage their star player to strike out to help everyone else's batting average look good?)  If the whole class earns 3s and 4s rather than 4s and 5s, then they get Bs not As -- oh well.  If everyone gets 5s and As, that's fine too.  The beauty of AP is that it gives you an external standard to aim for, one that you can blame on the "evil" College Board.  You're not the author of the students' grade, just the publisher.  

So what happens if your test scores are lower than you hoped?  Well, it might happen occasionally.  Just like a football team shouldn't fire the coach and change their team identity because they lose the first two games of a 16-game season, you shouldn't panic or change based on a few poor performances.  At final grade issuing time, you can adjust grades based on in-class work like quizzes and homework.  I always recommend adjusting the entire class -- that is, don't take pity on a borderline student and bump him from a B+ to an A-, but rather drop an extra quiz such that EVERYONE's course grade rises slightly.  This will achieve the same bump for borderline students without any perceived or real favoritism.  And if you find yourself bumping people who don't deserve the bump, then perhaps the bump you're considering is a bad idea for everyone.  :-)  I find that there is a nearly 100% long-term correlation between performance on homework/quizzes/lab work and performance on tests -- so over the course of a year, small decisions about grades balance out to give fair overall grades.
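The "drop an extra quiz for everyone" adjustment is easy to sketch.  The function name and the sample scores below are invented for illustration; the point is only that the same rule gets applied to every student's list of quiz scores.

```python
def quiz_average(scores, drop_lowest=0):
    """Average a student's quiz scores after dropping the lowest few.

    The same drop_lowest value should be used for every student,
    so that the whole class is adjusted by one rule.
    """
    if drop_lowest >= len(scores):
        raise ValueError("cannot drop every quiz")
    kept = sorted(scores)[drop_lowest:]
    return sum(kept) / len(kept)


# Hypothetical student with one bad quiz.
scores = [90, 80, 50, 100]
print(quiz_average(scores))                 # 80.0
print(quiz_average(scores, drop_lowest=1))  # 90.0
```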

Good luck... if you come to one of my summer institutes, we can talk a lot, lot more about how to structure testing for maximum benefit.  And, the beauty of the institute is that it involves a bunch of other teachers, too, who can share their ideas.  


  1. My first few assessments every year are not quite "AP style" but as we move further along they become 100% AP and are graded on a similar scale. The students are quite proud when I tell them they are ready for an AP style test.