## 21 June 2011

### Bad Graphs part II: don't force the best-fit through the origin

In today's episode of Bad Graphs, we begin with another poor scale.

BAD GRAPH #4:  Scaled to less than one-quarter page
As you can see -- or maybe you can't, it's so small -- these data points are plotted correctly, but in a teeny weeny portion of the page provided.  A proper graph takes up well over half the available room on the page.  The standard for credit on this particular AP problem (2010 B2) was for the graph to take up more than 1/4 page.  Scaling across a whole page is a skill that must be taught -- it does not come naturally out of math classes.

BAD GRAPH #5:  Can't see the data points without a magnifying glass
If you want to get extra-technical, the size of the data points on the graph should reflect the experimental uncertainty in each quantity measured.  That is, if you could measure to the nearest milliliter, then the data points should be as big as half of a box in the vertical direction.  (And if large uncertainty would make the points ridiculously big, then you're supposed to use error bars.)
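To make the "half of a box" arithmetic concrete, here is a minimal sketch. It assumes the vertical axis is scaled at 2 mL per box, which is my assumption for illustration and not a value taken from the exam; the function name is made up as well.

```python
def marker_height_in_boxes(uncertainty, units_per_box):
    """How tall a data point should be drawn, measured in grid boxes.

    uncertainty   -- measurement uncertainty, in the axis units (e.g. mL)
    units_per_box -- how many axis units one grid box represents
    """
    return uncertainty / units_per_box


# Assumed scale: 2 mL per box, volume readable to the nearest 1 mL.
height = marker_height_in_boxes(uncertainty=1.0, units_per_box=2.0)
print(height)  # 0.5, i.e. half of a box
```

With a coarser scale the same uncertainty shrinks to a smaller dot; with a much larger uncertainty the dot would swallow several boxes, which is when error bars take over.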

For the purposes of AP exam questions or labs within my course, all I ask is that the data points be clearly visible, as in all of the other BAD GRAPHS shown in this post.  The graph above, though, shows itty bitty dots and a nice best-fit.  Sure, the best-fit will yield a reasonable slope, but without easily seen evidence of where that slope came from.

BAD GRAPH #6:  best-fit line forced through the origin
A best-fit line should reasonably indicate the trend of the data.  There is no one "best" best-fit, but rather a range of allowable best-fits.  I've occasionally had my class draw the steepest possible best-fit, then the shallowest, and note that the value of the slope is somewhere between these two extremes.
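Here is one rough way to put numbers on the steepest and shallowest lines. This is a sketch of a common lab shortcut, not the way my class does it (they draw by eye): tilt a line between the first and last data points, pushed up or down by the uncertainty. The data and the 2 x 10^-6 m^3 uncertainty are invented for illustration.

```python
# Hypothetical data, invented for illustration (not the AP exam values).
masses = [0.02, 0.04, 0.06, 0.08, 0.10]               # added mass, kg
volumes = [3.5e-5, 5.4e-5, 7.3e-5, 9.6e-5, 1.13e-4]   # displaced volume, m^3
volume_uncertainty = 2e-6                             # assumed, m^3


def slope_range(xs, ys, dy):
    """Return (shallowest, steepest) slopes for lines through the end points.

    The steepest line runs from the first point lowered by dy to the last
    point raised by dy; the shallowest line does the opposite.
    """
    run = xs[-1] - xs[0]
    steepest = ((ys[-1] + dy) - (ys[0] - dy)) / run
    shallowest = ((ys[-1] - dy) - (ys[0] + dy)) / run
    return shallowest, steepest


shallowest, steepest = slope_range(masses, volumes, volume_uncertainty)
print(f"slope is between {shallowest:.3e} and {steepest:.3e} m^3/kg")
```

The point is the same as in the hand-drawn version: report the slope as a range, not as one magic number.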

The problem with the graph above is much more subtle than with some of the other BAD GRAPHs.  This student has drawn the best-fit line by starting at the origin of coordinates, and only then trying to approximate the trend of the graph.  Problem is, for one thing, the origin is not a special spot on the graph.  The point (0 kg, 0 m³) is no more important than the point (0.04 kg, 0.000054 m³).  Even in the case where (0,0) is a data point, it's a data point like any other.  Would you insist that the best-fit line always go through the third data point?

In this particular experiment from the 2010 AP exam, the y-intercept of the graph was explicitly non-zero.  (In fact, the last part of the question required students to figure out that the y-intercept represented the volume of fluid displaced by the floating cup alone, without any additional mass.)  Forcing the best-fit through the origin not only artificially steepens the graph's slope, but it also obscures the physically meaningful y-intercept.
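If you'd like to see the steepening in numbers, the sketch below compares an ordinary least-squares line (slope and intercept both free) with a line forced through the origin. The data are invented for illustration and are not the exam's values; the function names are mine, and a computed least-squares fit stands in for the hand-drawn best-fit a student would actually produce.

```python
# Hypothetical data, invented for illustration (not the AP exam values).
masses = [0.02, 0.04, 0.06, 0.08, 0.10]               # added mass, kg
volumes = [3.5e-5, 5.4e-5, 7.3e-5, 9.6e-5, 1.13e-4]   # displaced volume, m^3


def fit_with_intercept(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean


def fit_through_origin(xs, ys):
    """Least squares for y = slope * x, with the intercept pinned at zero."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


free_slope, free_intercept = fit_with_intercept(masses, volumes)
forced_slope = fit_through_origin(masses, volumes)

print(f"free fit:   slope = {free_slope:.3e} m^3/kg, "
      f"intercept = {free_intercept:.3e} m^3")
print(f"forced fit: slope = {forced_slope:.3e} m^3/kg, intercept = 0")
```

With these made-up numbers the forced line comes out noticeably steeper than the free one, and the positive intercept (the part that carries the physics in this experiment) vanishes from the forced fit entirely.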

Of course, forcing best-fits through the origin isn't always as subtle.  Trust me.  When we graded this problem, we saw the not-totally-unreasonable version above, but we also saw plenty of these:

BAD GRAPH #7:  Curved to get to the origin

Yuk.  But this one takes the cake...

BAD GRAPH #8:  Forced through the origin that isn't even the origin
It's perfectly acceptable, and sometimes desirable, not to begin an axis at zero.  However, you gotta recognize that what looks like the origin isn't necessarily the actual origin, in that case.  This grapher would have been fine, except for forcing that line through the origin that, after all, isn't the origin.  Boux.

One more set of BAD GRAPHs tomorrow.  But I promise, I'll include a couple of GOOD GRAPHs as well.


1. You are wrong here---there is a strong theoretical justification for 0 mass at 0 volume that does not include experimental error. It is much easier to justify fitting a line through the origin than it is to justify an arbitrary line. You need some phenomenon to invoke to justify an offset other than 0 (like bias in your measuring equipment).

I would have given more points to a student who recognized that the model they were fitting had one free parameter (density) rather than 2.

If you are trying to justify a model in which density is a constant, then even the straight line is almost assuming your conclusion, so I see no advantage to the 2-parameter model.

2. Gas Station, please do see the original AP exam question, 2010 #2. The y-intercept has nothing to do with experimental error, but represents the volume of fluid displaced by the empty cup.

GCJ

3. I apologize---I had not seen the question itself, and so did not realize that the volume and mass measurements both had offsets in the experimental setup.

Indeed for the experiment described there it is necessary to include the second parameter, and forcing the line through zero is (as you said) a serious mistake.

Sigh, I should learn to follow all the links before I open my big mouth.

4. Not a problem! I really wish I could post the problem itself here to prevent confusion. Silly copyright. :-)

5. I need to get into that Greg Jacob's science class!! My future depends on it.