Visceralizing Uncertainty With Data Science

When numbers are fickle, scientists turn to new ways to represent uncertainty.

March 2020

Excerpted from Data Feminism by Catherine D’Ignazio and Lauren F. Klein. Reprinted with permission from the MIT Press. Copyright 2020.


Scientific researchers are now proving by experiment what designers and artists have known through practice: activating emotion, leveraging embodiment, and creating novel presentation forms help people grasp and learn more from data-driven arguments, as well as remember them more fully.

As it turns out, visceralizing data may help designers solve one particularly pernicious problem in the visualization community: how to represent uncertainty in a medium that’s become rhetorically synonymous with the truth. To this end, designers have created a huge array of charts and techniques for quantifying and representing uncertainty. These include box plots, violin plots (figure 3.6), gradient plots, and confidence intervals.
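To make these chart types concrete, here is a minimal sketch (not taken from the book) that draws a box plot, a violin plot, and a gradient-style band plot from the same simulated sample, using matplotlib and NumPy; the data and all variable names are illustrative assumptions.

```python
# A minimal sketch (not from the book) of a few uncertainty charts,
# built on simulated data; everything here is illustrative.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
samples = rng.normal(loc=50, scale=10, size=1000)  # hypothetical measurements

fig, axes = plt.subplots(1, 3, figsize=(12, 3))

# Box plot: median, quartiles, and outliers.
axes[0].boxplot(samples)
axes[0].set_title("Box plot")

# Violin plot: the full estimated distribution of the samples.
axes[1].violinplot(samples, showmedians=True)
axes[1].set_title("Violin plot")

# Gradient-style plot: nested percentile bands around a point estimate,
# drawn as progressively lighter fills.
median = np.median(samples)
for lo, hi, alpha in [(25, 75, 0.5), (5, 95, 0.2)]:
    low, high = np.percentile(samples, [lo, hi])
    axes[2].fill_between([0, 1], low, high, alpha=alpha, color="tab:blue")
axes[2].hlines(median, 0, 1, color="tab:blue")
axes[2].set_title("Gradient bands")

plt.tight_layout()
plt.show()
```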

Unfortunately, however, people are terrible at recognizing uncertainty in data visualizations, even when they’re explicitly told that something is uncertain. This remains true even for some researchers who use data themselves!

For example, let’s consider the Total Electoral Votes graphic displayed as part of the New York Times’s live online coverage of the 2016 presidential election (figure 3.7). The blue and red lines represent the New York Times’s best guess at the outcomes over the course of election night and into the following day.



The gradient areas show the degree of uncertainty that surrounded those guesses, with the darker inner area showing electoral vote outcomes that came up 25 percent to 75 percent of the time, and the lighter outer areas showing outcomes that came up 75 percent to 95 percent and 5 percent to 25 percent of the time, respectively. If you look closely at the far left of the graphic, which represents election night (everything prior to the 12:00 a.m. axis label), the outcome of Trump winning and Clinton losing easily falls within the 5 to 25 percent likelihood range.
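As a rough sketch of how such nested bands can be produced (this is not the New York Times’s model or code, and all numbers are invented for illustration), one can run many simulated outcomes at each time step and shade the 25th to 75th and 5th to 95th percentile ranges around the median:

```python
# Hypothetical forecast bands in the style of figure 3.7: many simulated
# electoral-vote outcomes per time step, shaded by percentile range.
# All values are made up for illustration.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)
time_steps = 50       # e.g., updates over the course of the night
n_simulations = 2000  # simulated outcomes per update

# Assume uncertainty narrows as more results come in.
center = np.linspace(270, 300, time_steps)
spread = np.linspace(60, 5, time_steps)
outcomes = rng.normal(center, spread, size=(n_simulations, time_steps))

median = np.percentile(outcomes, 50, axis=0)
p5, p25, p75, p95 = (np.percentile(outcomes, q, axis=0) for q in (5, 25, 75, 95))

x = np.arange(time_steps)
fig, ax = plt.subplots(figsize=(8, 4))
ax.fill_between(x, p5, p95, color="tab:blue", alpha=0.2,
                label="5th to 95th percentile")   # lighter outer band
ax.fill_between(x, p25, p75, color="tab:blue", alpha=0.5,
                label="25th to 75th percentile")  # darker inner band
ax.plot(x, median, color="tab:blue", label="median forecast")
ax.axhline(270, color="gray", linestyle="--", label="270 votes to win")
ax.set_xlabel("time step")
ax.set_ylabel("projected electoral votes")
ax.legend()
plt.show()
```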

Although many election postmortems pronounced the 2016 election the Great Failure of Data and Statistics because most simulations and other statistical models had suggested that Clinton would win, most forecasts did in fact include the possibility of a Trump victory. The underlying problem was not a failure of data but the difficulty of depicting uncertainty in visual form.





