Data Visualization: You Must ‘C’ It to Believe It

Ultimately, what makes data visualizations so effective at conveying a point is that they demand little analysis from the viewer: the thinking has already been done for you. That’s both seductive and potentially misleading.

That’s exactly why we have to be careful not to accept a visually expressed story at face value. Any data visualization should be subjected to a triple “C” test.

Context: This means the background information for a graph, which sometimes reveals that the results visualized are outliers rather than typical results. Getting the context also means getting the baseline for the underlying survey, including the timelines, the locations, and the size and type of the population used to get the numbers.

Because data visualization tools include ways to slice and dice your data, it is not difficult to zero in on just the segment that yields the results you want. So you need to know the larger context, as well as any data points that were added in from outside that context.
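To make the slicing point concrete, here is a minimal Python sketch. The monthly figures, the `percent_change` helper, and the choice of months are all invented for illustration; they do not come from any real survey. The same series tells opposite stories depending on which window is shown.

```python
# Hypothetical monthly figures, invented purely for illustration.
monthly_sales = [120, 115, 110, 104, 98, 95, 97, 101, 106, 99, 93, 90]


def percent_change(values):
    """Percent change from the first value in the list to the last."""
    return (values[-1] - values[0]) / values[0] * 100


# The full year shows a decline.
full_year = percent_change(monthly_sales)

# Months 6 through 9 alone (list indices 5 to 8) show growth.
cherry_picked = percent_change(monthly_sales[5:9])

print(f"Full year:  {full_year:+.1f}%")
print(f"Months 6-9: {cherry_picked:+.1f}%")
```

A chart built only from months 6 through 9 would be accurate as far as it goes, yet it would suggest growth in a year that actually ended lower than it began. That is why the baseline matters.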

Correlation: This is the supposed strong point of visualizations: showing correlations. But correlations are easily manipulated and can mislead, as many things move together over time without being causally connected, though a visualization can make it appear that they are.
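One rough way to check whether a correlation is merely a shared trend over time is to compare the correlation of the raw values with the correlation of their period-to-period changes. The sketch below assumes two made-up yearly series (`series_a` and `series_b` are hypothetical, as are the helper names) and computes Pearson's correlation by hand so it needs nothing beyond the standard library.

```python
from math import sqrt

# Two hypothetical yearly series, invented for illustration.
# Both simply rise over time; neither is built from the other.
series_a = [10, 12, 13, 16, 18, 19, 22, 24]
series_b = [5, 7, 8, 10, 11, 13, 14, 15]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def changes(values):
    """Differences between consecutive values (removes a steady trend)."""
    return [later - earlier for earlier, later in zip(values, values[1:])]


# Correlation of the raw levels: dominated by the shared upward trend.
r_levels = pearson(series_a, series_b)

# Correlation of year-over-year changes: what is left without the trend.
r_changes = pearson(changes(series_a), changes(series_b))

print(f"Levels:  r = {r_levels:.2f}")
print(f"Changes: r = {r_changes:.2f}")
```

With these invented numbers the levels are almost perfectly correlated, while the year-over-year changes are essentially uncorrelated. Plotted side by side, the two lines would look tightly linked, but the link is only that both rise over the same period. A low correlation in the changes does not prove independence, and a high one does not prove causation; it is just one quick sanity check.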

Causation: This is what real insight is all about: finding out what causes what. There is no substitute for thinking this through, no matter how seductive it may be to simply go with the correlations presented by the visualization.

If you’ve assessed the data visualizations in context and questioned whether the correlations reflect a direct link, then you may be able to conclude that there is direct causation. But you do have to do your own thinking for that.

With the proliferation of data visualization tools, which make it easier for anyone to graphically present numbers they may have cherry-picked for the occasion, you can’t just believe what you see. You have to “C” it through your own analysis.