The increased use of data in journalism has paralleled the overall rise of big data. Today’s strongest stories use real data that can enhance, corroborate or even challenge facts. In Newhouse’s Communications@Syracuse program, we are exploring this very topic.
Based on some of the key takeaways from our data-driven journalism course, we’ve compiled best practices for making the most of your data.
1. Find data that supports your interests
There is no shortage of large datasets floating around on the Internet, but fishing for stories in random numbers won’t produce your best work. Instead, figure out the story you want to tell and then look for data that supports or challenges that story. The figures will be easier to interpret if you understand the data and it reflects your interests.
2. Know what is being measured and how
The names of many data points are vague. “Income,” for instance, could be net, gross, household, individual, monthly or yearly. Take the time to read the “About the Data” documents that accompany most reliable datasets. These will help you to understand the data and ensure you aren’t comparing apples to oranges.
3. Interview your data by asking it questions
Let’s say you have a dataset that lists professions, incomes, injury rates and a bunch of other variables. You could ask it: “What’s the relationship between a college degree and income? What profession has the highest annual pay? Which profession has the highest on-the-job injury rate?”
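As a rough sketch of what "interviewing" such a dataset can look like in practice, the Python snippet below puts those three questions to a tiny table. Every row, field name (`degree`, `income`, `injuries_per_100`) and number is invented for illustration; a real dataset would need its own documentation checked first.

```python
from statistics import mean

# Invented sample rows; field names and values are assumptions, not real figures.
rows = [
    {"profession": "Nurse", "degree": True, "income": 75000, "injuries_per_100": 5.2},
    {"profession": "Roofer", "degree": False, "income": 48000, "injuries_per_100": 9.7},
    {"profession": "Software developer", "degree": True, "income": 110000, "injuries_per_100": 0.4},
    {"profession": "Truck driver", "degree": False, "income": 52000, "injuries_per_100": 7.1},
]

# Question 1: what is the relationship between a college degree and income?
# Here we simply compare the average income of each group.
income_by_degree = {
    has_degree: mean(r["income"] for r in rows if r["degree"] is has_degree)
    for has_degree in (True, False)
}

# Question 2: what profession has the highest annual pay?
highest_paid = max(rows, key=lambda r: r["income"])

# Question 3: which profession has the highest on-the-job injury rate?
most_dangerous = max(rows, key=lambda r: r["injuries_per_100"])

print("Average income with degree:", income_by_degree[True])
print("Average income without degree:", income_by_degree[False])
print("Highest paid:", highest_paid["profession"])
print("Highest injury rate:", most_dangerous["profession"])
```

Each question becomes one short expression, which makes it cheap to keep asking follow-ups as the answers suggest new angles.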
Recently, many stories have been written about how location affects online search results. Why? Because Google can track searches by location. The authors of those stories asked the data, “In what way does where I live change what I search for online?”
4. Examine the data from all angles
Most big datasets have outliers at the extremes, trends in the middle and correlations that make sense. Each of these features often has a great story to tell.
For example, take baseball player salaries. While the salary of the average baseball player isn’t nearly as interesting as the salary of the highest-paid player, the trend in average player salaries over time is newsworthy. So, too, would be an analysis of whether salary correlates with performance on the field. Looking at the data for these sorts of less-than-obvious correlations may bring new life to a tired story or lead you to a new one.
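To make the three angles concrete, here is a minimal Python sketch that looks at an outlier, a trend and a correlation in turn. The players, the salaries (in millions of dollars), the `performance_score` field and the yearly averages are all made up for the example, and the correlation is a plain Pearson coefficient written by hand.

```python
from math import sqrt

# Invented data: salaries in millions of dollars, hypothetical performance scores.
players = [
    {"name": "Player A", "salary": 0.7, "performance_score": 1.1},
    {"name": "Player B", "salary": 1.5, "performance_score": 0.4},
    {"name": "Player C", "salary": 4.0, "performance_score": 2.5},
    {"name": "Player D", "salary": 9.0, "performance_score": 2.0},
    {"name": "Player E", "salary": 30.0, "performance_score": 5.5},
]
# Invented league-wide average salary per year.
avg_salary_by_year = {2019: 4.0, 2020: 4.2, 2021: 4.1, 2022: 4.4, 2023: 4.9}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Angle 1: the outlier at the extreme.
top_earner = max(players, key=lambda p: p["salary"])

# Angle 2: the trend in the middle, as percent change from first to last year.
years = sorted(avg_salary_by_year)
first, last = avg_salary_by_year[years[0]], avg_salary_by_year[years[-1]]
pct_change = (last - first) / first * 100

# Angle 3: does salary correlate with performance?
r = pearson(
    [p["salary"] for p in players],
    [p["performance_score"] for p in players],
)

print("Top earner:", top_earner["name"])
print(f"Average salary change {years[0]}-{years[-1]}: {pct_change:.1f}%")
print(f"Salary vs. performance correlation: {r:.2f}")
```

With only five rows the coefficient means little on its own; the point is that each angle is a separate, simple question asked of the same table.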
5. Watch out for pitfalls!
Always ask, “Does this finding make sense?” Some items will always correlate without having a causal relationship with each other.
The classic example comes from statistics class. As people eat more ice cream, the crime rate goes up. Of course, there’s no causal relationship between ice cream consumption and crime; they both just happen to be things that go up when another variable, temperature, rises. Always beware of false conclusions drawn from correlation alone.
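The ice cream example can be checked numerically. The sketch below uses invented figures, deliberately built so that ice cream sales and crime are linked only through temperature, and compares the raw correlation with a first-order partial correlation that holds temperature fixed. The partial correlation formula is the standard textbook one; the variable names and data are assumptions for illustration.

```python
from math import sqrt

# Invented observations, constructed so that temperature drives both series.
temperature = [10, 10, 20, 20, 30, 30, 40, 40]
ice_cream_sales = [25, 15, 45, 35, 65, 55, 85, 75]
crime_reports = [34, 34, 56, 56, 86, 86, 124, 124]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def partial_correlation(xs, ys, zs):
    """Correlation of xs and ys after removing the linear effect of zs."""
    r_xy = pearson(xs, ys)
    r_xz = pearson(xs, zs)
    r_yz = pearson(ys, zs)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz**2) * (1 - r_yz**2))


raw = pearson(ice_cream_sales, crime_reports)
controlled = partial_correlation(ice_cream_sales, crime_reports, temperature)

# The raw correlation is strong; once temperature is held fixed it all but vanishes.
print(f"Ice cream vs. crime: {raw:.2f}")
print(f"Ice cream vs. crime, controlling for temperature: {controlled:.2f}")
```

Real data will never separate this cleanly, but a large drop between the two numbers is a signal to look for a third variable before writing the headline.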
6. Make it visual; make it interactive
Visualizations can bring data to life and can even help in the interpretation of data. Sometimes, you can see trends in data that aren’t apparent when you only look at numbers or statistical tests. By injecting interactive qualities into your visualization, you’re able to invite your audience to ask questions about the data while they examine it.
7. Work in teams
There are a lot of skills involved in complex data projects — from data-mining to data interpretation, storytelling and visualization. Few people have all of these skills. Build on others’ strengths to do something special with numbers and charts.
8. Don’t forget to interview
There’s no reason there should be “stories” and “data stories.” Sources can direct you to interesting data, help you understand the data, suggest questions that you can ask the data, detect faulty assumptions in your analysis and provide color and insight into the topic. The best stories often borrow from both data analysis and traditional journalism.