This is excellent. Last night a friend sent us this post by redditor TheIndieArmy, who observes that "the popularity graph for the name Bruce looks awfully familiar."
We love coincidences, but the resemblance of this graph to the Big Black Bat — the looming stature, the pointy headgear, the long, billowing cape — was too uncanny not to investigate further.
The good news is that the graph totally checks out. The labels on the graph, though, are a little confusing (see below), which made double-checking its truthiness a little tough. Its title claims the graph shows how many babies were named "Bruce" over time, but what's with the "Ranking" radio button? What do the numbers 0 through 15,000 (along the y-axis) have to do with ranking?
Well, technically they have everything to do with one another, insofar as the number of children named "Bruce" in a given year will affect the name's position on a rank of popular baby names — but associating rank and total number with a single axis on one chart is a confusing mistake.
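To make the count-versus-rank relationship concrete, here is a minimal sketch in Python. The names and numbers are invented purely for illustration (they are not SSA figures), and `rank_of` and `count_and_rank_series` are hypothetical helpers of our own. The point is only that when a name's count climbs, its rank number shrinks, so the two quantities cannot sensibly share one axis.

```python
# Hypothetical yearly counts, invented purely for illustration (NOT SSA data).
COUNTS_BY_YEAR = {
    1920: {"John": 900, "James": 800, "Bruce": 100, "Carl": 300},
    1950: {"John": 900, "James": 850, "Bruce": 700, "Carl": 200},
    1980: {"John": 600, "James": 650, "Bruce": 50, "Carl": 100},
}


def rank_of(name, counts):
    """Return the 1-based rank of `name`; 1 means the most common name."""
    ordered = sorted(counts, key=counts.get, reverse=True)
    return ordered.index(name) + 1


def count_and_rank_series(name, counts_by_year):
    """Return (year, count, rank) tuples in chronological order."""
    return [
        (year, counts[name], rank_of(name, counts))
        for year, counts in sorted(counts_by_year.items())
    ]


if __name__ == "__main__":
    for year, count, rank in count_and_rank_series("Bruce", COUNTS_BY_YEAR):
        print(year, count, rank)
```

In this toy data the count for "Bruce" goes 100, 700, 50 while its rank goes 4, 3, 4: a spike in one series is a dip in the other.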
The graph (which, I later learned, came from Disney's Baby Zone) gets its data from the Social Security Administration, an agency known for managing America's social insurance program and keeping a tally of important things like baby names. A visit to the SSA's website confirms there was a dramatic rise and subsequent fall in the popularity of the name "Bruce" between around 1920 and 1970.
But if the above graph charted the name's rank over time, and if we assume that higher ranks are represented with lower numbers (the highest rank being 1st place), we would expect a dip in the chart spanning those time points, not a spike. And on an accurately labeled graph, that's exactly what we see. Here's what the popularity graph of the male name "Bruce" looks like over time, if your y-axis corresponds to the name's rank on a list of popular names (data via SSA, graph via WolframAlpha):
The lesson here is that when it comes to graphs, the way you present your data matters. This holds doubly true if your goal is to accurately depict an issue fraught with contention and heavy with social significance, like global warming, or a plot resembling the outline of Batman. For instance: graphing the popularity of the name "Bruce" over time according to its rank among other baby names gives the graph above. It looks nothing like Batman. Graphing the name's popularity over time according to the fraction of newborns named Bruce in any given year also looks wonky, sort of like a Batman ice-sculpture that's about 12 hours past its prime:
Narrowing in on the perfect Batman graph is not easy. Plot the fraction of newborns named "Bruce" on a logarithmic scale and the resemblance vanishes entirely. Plot the estimated current age distribution, and you get something that looks a little more like Batman, but with a droopy-looking right ear. The Name Tracker applet on the Family Education Network also gets its data from the Social Security Administration, but appears to use only half a dozen data points for 80 years' worth of naming trends, which gives us the funny-looking graph pictured on the left (its y-axis also uses a unit we haven't encountered yet: the name's ranking per 1 million babies). The upshot: the units you choose, and how you choose to present those units, have a dramatic effect on how the information you're trying to communicate is perceived.
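A rough sketch of why the choice of unit reshapes the picture: the Python below takes the same made-up births and expresses them as a raw count, a fraction of all births, a per-million rate, and a log-scaled fraction. The figures in `SAMPLE` are invented (not SSA data), `in_each_unit` is a hypothetical helper, and the per-million column is only our assumption about what such a unit might mean, not a description of how the Name Tracker applet computes its axis.

```python
import math

# Hypothetical (babies named Bruce, total births) per year.
# Invented for illustration only; NOT SSA data.
SAMPLE = {
    1920: (1_000, 1_000_000),
    1950: (12_000, 2_000_000),
    1980: (1_500, 1_500_000),
}


def in_each_unit(bruces, total_births):
    """Express one year's data in several candidate y-axis units."""
    fraction = bruces / total_births
    return {
        "count": bruces,
        "fraction": fraction,
        "per_million": fraction * 1_000_000,
        "log10_fraction": math.log10(fraction),
    }


if __name__ == "__main__":
    for year, (bruces, total) in sorted(SAMPLE.items()):
        print(year, in_each_unit(bruces, total))
```

With these toy numbers the 1950 peak is 12 times the 1920 value when measured as a count, but only 6 times as large when measured as a fraction, because total births doubled. On a log scale the same gap shrinks to less than one unit, which is how a tall, pointy cowl can flatten into a gentle bump.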
But back to Batman. How do we get an outline of the Dark Knight's signature cowl? And how do we make it as symmetric as possible?
As the graph on Disney's Baby Zone (i.e. the graph submitted to reddit by TheIndieArmy) attempts to communicate, this is accomplished by plotting the annual number of babies named "Bruce" over time:
The graphs' peaks, and the units on their axes, match up. If you notice that the Batman outline in the WolframAlpha graph looks a little shorter and fatter than the one from Disney's Baby Zone, that's because the units on the former's x-axis have been spaced further apart.
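The "shorter and fatter" effect is just arithmetic on axis spacing, which the sketch below illustrates. The 50-year span and the 15,000 ceiling are taken from the figures mentioned above; the pixel spacings are invented, and `drawn_aspect_ratio` is a hypothetical helper, not anything either site actually uses.

```python
def drawn_aspect_ratio(year_span, count_span, px_per_year, px_per_baby):
    """Width-to-height ratio of the plotted shape, measured in pixels."""
    width = year_span * px_per_year
    height = count_span * px_per_baby
    return width / height


if __name__ == "__main__":
    # Same data range, same vertical spacing; only the x spacing differs.
    narrow = drawn_aspect_ratio(50, 15_000, px_per_year=4, px_per_baby=0.02)
    wide = drawn_aspect_ratio(50, 15_000, px_per_year=8, px_per_baby=0.02)
    print(narrow, wide)
```

Doubling the spacing between years doubles the width-to-height ratio, so the identical data draws a squatter Batman.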
This is all to say that a graph's formatting can provide many different pictures of the information it presents. Does the graph submitted to reddit by TheIndieArmy check out? Absolutely. But it's just one way of looking at the data, a fact that allows for manipulation and, in the case of contemporary issues of greater import than the popularity of a baby name over time (global warming, for example), misrepresentation. Learn how to distinguish good graphs from bad graphs and you'll be better equipped to identify deliberate distortions of information. (Or verify the integrity of a sourceless chart resembling the outline of a comic book character. Whichever.)