The American Historian

U.S. History in a Statistical Age

John Majewski

We live in a golden age of quantification. Statistics have revolutionized sports (think Michael Lewis’s Moneyball: The Art of Winning an Unfair Game, 2003) and politics (think Nate Silver’s FiveThirtyEight.com). Perhaps no single work better reflects the growing influence of statistical thinking than Thomas Piketty’s Capital in the Twenty-First Century (2014), an unlikely bestseller with more than one hundred tables and graphs. The prominence of statistics in political and policy discourse reflects the fact that many of today’s great issues hinge on quantitative analysis. What is happening with income inequality? How fast is the planet warming? Is the Affordable Care Act viable? Answering these questions persuasively requires sophisticated statistical analysis and the ability to explain that analysis to nonspecialists. The Internet, which is ideal for presenting dynamic and provocative statistical graphics, has contributed to the growing popularity of quantification. News organizations—whether the venerable New York Times or start-ups such as Vox.com—fill their publications and websites with attractive graphs, maps, and other forms of visual data.

While public discourse is becoming more statistical, historians of the United States are becoming less so. To quickly ascertain the degree to which U.S. historians use statistics, I reviewed the last five years of the Journal of American History (JAH), looking to see how many articles contained at least one table, graph, or map based on statistical data. Such graphics usually indicate significant primary research to compile new data, and constitute a clearly visible and easily measurable manifestation of statistical thinking. The results are striking. Only 10 percent of recent JAH articles included any type of statistical graphic at all, and only three articles used quantitative data as a major piece of evidence. Statistics were more commonly embedded in the text, where they served as context and background. When putting their work on a big professional stage, most U.S. historians are more comfortable making statistics bit players rather than giving them leading roles.

Historians of the United States have not always been so reluctant to put statistical data center stage. In the 1960s and 1970s, historians believed that quantitative analysis was one way of understanding the behavior of ordinary men and women. Quantitative history seemed more democratic than traditional history, which typically focused on prominent politicians, diplomats, and statesmen; ordinary men and women literally counted in a statistical table. Quantitative history seemed destined to grow as social history became more entrenched. Counting statistical graphics for articles published in the JAH between 1969 and 1974—the heyday of the new social history and new political history—reveals that 23 percent of articles contained at least one table or data-driven map. In 1974, Robert P. Swierenga wrote in the JAH that quantitative historians increasingly displayed “a growing self-confidence” and that “there is now a feeling that historians can make a positive contribution in their own right to sociological, political, economic, and statistical theory.”[1]

Forty years of hindsight shows that Swierenga’s optimism was misplaced. Precisely at the time Swierenga was writing, quantitative history began migrating to more overtly social-scientific disciplines, where it would find a congenial home. The trend was especially pronounced in economic history, which developed a distinct brand of advanced econometrics (often called cliometrics) that dominated the Journal of Economic History (JEH). The models and econometrics of the JEH appeal mostly to highly trained specialists. A few prominent scholars (such as Naomi R. Lamoreaux and Gavin Wright) write economic history that appeals to both economists and historians, but for the most part quantitative economic historians constitute a tiny minority within history departments. As economics grew more imperialistic and boldly ventured into demography, politics, and sociology, so too did economic history. The cliometrics revolution ended up incorporating many forms of demographic and quantitative political history.

To understand why that shift occurred so decisively, it is important to understand the limited nature of quantitative analysis in the 1960s and 1970s. Historians from that generation liked to count, but their statistical tools were usually limited to basic averages and percentages. Descriptive statistics such as means and medians were and are important tools, but they only take historians so far. To better understand causality among different statistical variables, historians needed to use more complex methods such as regression analysis, a powerful tool that is foundational to social science statistics. Regressions can help public health specialists determine smoking’s impact on cancer rates and help political scientists understand the influence of race and ethnicity on voting behavior. Regressions and other advanced statistical techniques, though, require substantial training, which has a high cost in forgone opportunities. Faced with the choice of doubling down on more advanced statistical training or retreating from quantitative history, historians ended up ceding scholarly territory to social scientists.
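
To make the idea concrete, here is a minimal sketch of the kind of regression a quantitatively inclined historian might run. It is an illustration only: the county-level figures, the variable names, and the question itself (does proximity to a railroad predict farm values?) are invented for this example rather than drawn from any study discussed here.

```python
# A hypothetical illustration: does proximity to a railroad predict farm values?
# All numbers are invented; a real study would draw on census and county records.
import numpy as np

# Hypothetical county-level observations
rail_access = np.array([1, 0, 1, 1, 0, 0, 1, 0, 1, 0])        # 1 = railroad within 15 miles
literacy = np.array([0.82, 0.71, 0.88, 0.79, 0.65, 0.70, 0.90, 0.68, 0.85, 0.72])
farm_value = np.array([34.0, 21.0, 40.0, 31.0, 18.0, 20.0, 44.0, 19.0, 38.0, 22.0])  # dollars per acre

# Ordinary least squares: farm_value ~ intercept + rail_access + literacy
X = np.column_stack([np.ones_like(farm_value), rail_access, literacy])
coefficients, residuals, rank, _ = np.linalg.lstsq(X, farm_value, rcond=None)

intercept, b_rail, b_literacy = coefficients
print(f"intercept: {intercept:.2f}")
print(f"railroad access is associated with {b_rail:.2f} more dollars per acre, holding literacy constant")
print(f"each additional point of literacy is associated with {b_literacy:.2f} dollars per acre")
```

Even a toy example makes the trade-off clear: the arithmetic itself is simple, but specifying the variables, assembling the underlying data, and interpreting the coefficients is where the real training lies.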

From a pragmatic standpoint, the move away from quantitative history made sense. Consider the decision facing a graduate student in a history department. Graduate students learning foreign languages, reading for field exams, and conducting archival research are hard-pressed to find time to master advanced statistics. Learning statistical methods, moreover, is only the beginning of quantitative research. To usefully employ such methods, a graduate student would have to take the time to construct large data sets to answer historically significant questions. Given these heavy investments, scholars with an inclination toward statistical analysis naturally gravitate toward economics, while scholars more interested in analyzing texts and other nonquantitative sources naturally gravitate toward history. GRE scores over the last four years—summarized in the GRE Guide to the Use of Scores—confirm this disciplinary sorting. On average, students who applied to history programs had higher verbal reasoning and analytical writing scores than students who applied to economics programs, but prospective history students lagged well behind prospective economics students in quantitative reasoning. Prospective history graduate students, in fact, scored lower on quantitative reasoning than applicants in any other discipline in the social sciences and the humanities, including English and art.[2] Not surprisingly, students applying to history Ph.D. programs consider their quantitative reasoning scores to be relatively unimportant. As one poster wrote on a Grad Cafe online forum, “For History, I don't see how your [GRE] math score would matter at all.”[3]

One might view the divide between economists and historians as an efficient division of scholarly labor. The migration of quantitative history to the social sciences, though, sometimes had the quality of a nasty divorce rather than a natural parting of the ways. The heated debate over Robert Fogel and Stanley Engerman’s Time on the Cross: The Economics of American Negro Slavery (1974) symbolized how quantitative history became enmeshed in wider ideological issues. In Time on the Cross Fogel and Engerman ambitiously sought to bring cliometrics to the study of U.S. slavery. Their arguments generated plenty of controversy: slaveholding planters made substantial profits (hence slavery was a capitalistic institution); slaves were well fed and rarely beaten (cruel treatment of slaves was bad for business); and many slaves displayed a bourgeois work ethic (how else could planters have made so much money?). Historians would generally agree that plantation agriculture was indeed highly profitable for the slaveholders, but the book’s arguments about slave culture were anathema to historians who focused on the resistance of the enslaved. Fogel and Engerman aggressively trumpeted the superiority of advanced statistical techniques over what they labeled the “traditionalist interpretation.” The dichotomy between quantifiers and traditionalists was never really accurate—statistically minded economic historians were among the most vocal critics of Time on the Cross—but such rhetoric nevertheless created an “us” versus “them” divide that made it all too easy for many historians to dismiss quantification, in the words of Herbert G. Gutman, as a “numbers game.”[4]

Just as many historians were finding quantification increasingly alien and alienating, a new emphasis on meaning, representation, and language offered an intellectually exciting paradigm—one that tended to view numbers not as representations of a reality, but as another set of symbols that can obscure as much as they can illuminate. An excellent example of this kind of cultural approach is Walter Johnson’s widely acclaimed Soul by Soul: Life inside the Antebellum Slave Market (1999). Soul by Soul covers some of the same ground as Time on the Cross, but it is as different as one could imagine. Whereas economic historians use slave prices to measure the economic value and productivity of slaves—implicitly assuming that prices reflected objective values—Johnson is interested in the cultural significance of the prices that slave traders so coldly recorded in their account books. For Johnson, those account books constitute “a history of back and forth estimations, crass manipulations, hazy connections, and occasional revolts—the daily history of the slave trade between the prices.”[5]

Soul by Soul presaged a new history of capitalism in which historians have become increasingly interested in the origins of commercial markets and financial institutions. Some scholars working in this emerging field (including Johnson himself) tend to be suspicious of quantitative economic history. Johnson’s latest work, River of Dark Dreams: Slavery and Empire in the Cotton Kingdom (2013), continues the criticism of Fogel and Engerman nearly forty years after the publication of Time on the Cross. As scholars seek to understand capitalism, though, it is not clear that they can ignore the material reality that quantification can usefully uncover. Many of the same issues that make quantification important among today’s public intellectuals—such as measuring economic inequality—are also important in thinking about the rise of capitalism in the nineteenth century. There are signs, moreover, that historians of capitalism are becoming more willing to join “the numbers game.” Cornell University’s History of Capitalism Initiative, for example, sponsored a two-week summer camp that sought to give participants “confidence in applying quantitative methods and economic theory to historical research.”[6]

Where, then, should the historical literature go? The common-sense answer is that historians should combine quantitative and cultural approaches to understand the origins and impact of capitalism. That appears to be precisely what is happening, especially in recent work on slavery and the South. Caitlin Rosenthal’s award-winning Harvard University dissertation “From Memory to Mastery: Accounting for Control in America, 1750–1880” (2013) examines how slaveholders pioneered new accounting methods: because they could exert greater control over their labor force, slaveholders often had more to gain from such methods. In some ways, this is careful empirical work. Rosenthal calculates the spread of numeracy and accounting methods in ways familiar to traditional quantitative history. Her focus on “mastery” and “control,” though, suggests a cultural approach that examines the moral system that accounting methods represented. In somewhat similar fashion, William G. Thomas’s The Iron Way: Railroads, the Civil War, and the Making of Modern America (2011) uses geographic information systems (GIS) to calculate the percentage of the population in each state that lived within fifteen miles of a railroad station in 1860. That data provides the groundwork for understanding how frequent contact with railroads shaped the political consciousness and military strategies of the Civil War period.
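
For readers curious about the mechanics, a toy version of the distance calculation underlying Thomas’s fifteen-mile measure might look like the sketch below. The coordinates, populations, and station locations are all invented for illustration; an actual analysis would rest on census records and a complete set of mapped stations.

```python
# A toy version of a GIS-style proximity calculation.
# All coordinates and populations are invented for illustration.
from math import radians, sin, cos, asin, sqrt

def miles_between(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles, using the haversine formula."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 3959 * 2 * asin(sqrt(a))

# Hypothetical settlements: (latitude, longitude, population)
settlements = [(39.10, -84.51, 1200), (38.95, -84.20, 300), (39.40, -85.00, 450)]
# Hypothetical railroad stations: (latitude, longitude)
stations = [(39.10, -84.50), (39.30, -85.10)]

# Sum the population living within fifteen miles of the nearest station
near_rail = sum(pop for lat, lon, pop in settlements
                if min(miles_between(lat, lon, s_lat, s_lon) for s_lat, s_lon in stations) <= 15)
total = sum(pop for _, _, pop in settlements)
print(f"{100 * near_rail / total:.1f}% of the population lived within 15 miles of a station")
```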

The GIS methods that Thomas employs have great potential to merge quantitative methods and cultural analysis. Mapping data, for example, allows historians to analyze the cultural construction of space. The most quantitative article in the last five years of the Journal of American History—Cameron Blevins’s “Space, Nation, and the Triumph of Region: A View of the World from Houston” in the June 2014 issue—uses this approach. Blevins computed the number of place names used in an ordinary late nineteenth-century newspaper (the Houston Daily Post) to reconstruct how the newspaper imagined Houston’s geography. Houston, it turns out, was in the Midwest, not the South, a finding that Blevins attributed to the growth of the national railroad network that made midwestern markets important for Houston’s business community. Blevins’s research is part of Stanford’s Spatial History Project, which sponsors a number of projects that involve mapping quantitative data to understand constructions of space.
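
A simplified sketch of the place-name counting behind such a reading appears below. The gazetteer and the sample passage are invented for illustration; Blevins’s actual workflow, applied to a full run of the Houston Daily Post, is of course far more elaborate.

```python
# A simplified illustration of counting place-name mentions in newspaper text.
# The gazetteer and the sample passage are invented for this example.
from collections import Counter
import re

gazetteer = ["Kansas City", "St. Louis", "Chicago", "Galveston", "New Orleans", "Dallas"]

sample_text = """Wheat receipts at Kansas City were heavy yesterday, and Chicago
quotations firmed. Galveston reported light cotton movement; St. Louis and
Kansas City buyers remained active."""

counts = Counter()
for place in gazetteer:
    # Count whole-phrase matches of each place name in the article text
    counts[place] = len(re.findall(re.escape(place), sample_text))

for place, n in counts.most_common():
    if n:
        print(f"{place}: {n} mention(s)")
```

Aggregated over tens of thousands of pages, counts like these are what allow a historian to say which regions loomed largest in a newspaper’s imagined geography.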

It is too early to tell how widespread these methods will become, but the moment seems right for historians to once again use quantitative analysis to enrich conversations about the past. We should not expect statistical analysis to somehow produce a more objective, scientific view of the past. Today’s most thoughtful quantifiers, in fact, clearly recognize that statistical analysis, however big the data sets and however sophisticated the methods, does not produce unimpeachable truths. As Nate Silver cautions in The Signal and the Noise: Why So Many Predictions Fail—but Some Don’t (2012), statistics do not allow us to reach an objective, unimpeachable conception of Truth. Rather, the real lesson of statistical thinking is that we “must become more comfortable with probability and uncertainty.”[7] Seen from this perspective, there is no reason to consider history inherently anti-quantitative. No discipline, in fact, is better able to analyze statistical evidence in terms of uncertainty, contingency, and context. Historians have much to gain from the better integration of statistics into their narratives, and statistical analysis has much to gain from historians.

JOHN MAJEWSKI is the interim dean of Humanities and Fine Arts at the University of California, Santa Barbara, and a professor in the department of history. He is the author of Modernizing a Slave Economy: The Economic Vision of the Confederate Nation (2009).

 

[1] Robert P. Swierenga, “Computers and American History: The Impact of the ‘New’ Generation,” Journal of American History, 60 (March 1974), 1046, 1070.

[2] GRE Guide to the Use of Scores, http://www.ets.org/s/gre/pdf/gre_guide.pdf, table 4, p. 31.

[3] “GRE Scores?” forum, Grad Cafe Forums, http://forum.thegradcafe.com/topic/36023-gre-scores/.

[4] Herbert G. Gutman, Slavery and the Numbers Game: A Critique of Time on the Cross (1975).

[5] Walter Johnson, Soul by Soul: Life inside the Antebellum Slave Market (1999), 47.

[6] Cornell University ILR School, “Summer Camp,” http://hoc.ilr.cornell.edu/summer-camp.

[7] Nate Silver, The Signal and the Noise: Why So Many Predictions Fail—but Some Don’t (2012), 15.