We’re hard at work editing chapters for our Year 1 data literacy book. While we’re rolling around ideas, here are some ideas from Geoffrey James’ “9 Ways to Spot Bogus Data” in Inc., subtitled “Decision-making is hard enough without basing it on data that’s just plain bad.”
If you don’t know what some of these questions are asking, stay tuned … we’ve got you covered. Soon, anyway.
Good decisions should be “data-driven,” but that’s impossible when the data isn’t valid. I’ve worked around market research and survey data for most of my career and, based on my experience, I’ve come up with touchstones to tell whether a set of business data is worth using as input to decision-making.
To cull out bogus (and therefore useless) data from valid (and therefore potentially useful) data, ask the following nine questions. If the answer to any question is “yes,” then the data is bogus:
Will the source of the data make money on it?
Is the raw data unavailable?
Does it warp normal definitions?
Were respondents not selected at random?
Did the survey use leading questions?
Do the results calculate an average?
Were respondents self-selected?
Does it assume causality?
Does it lack independent confirmation?
Let us know which of these you’d like to see unpacked in a future blog post!
Data is not the same as “truth,” and no matter how large a mountain of evidence you have, the same numbers can be used to support many different conclusions. In this world, perhaps the best approach is to recognize that instead of “fake” and “true” news we have a hundred shades of gray in between.
Numerical estimates, such as ballpark figures or “guesstimations,” abound in school, work, and our lives. For example, you can roughly calculate the impact of shopping with a reusable grocery bag, instead of using plastic bags, for a year. But how can anyone know that? How do we make sense of “guesstimations”? Are they even grounded in good mathematical principles?
Our team member Connie Williams shared a video of a talk by Dr. Lawrence Weinstein, a professor at Old Dominion University. In his lecture, “Guesstimating the Environment,” he points out that “guesstimations” are inherently imprecise. He covers the use of “guesstimations” in topics ranging from ethanol to windmills, working through each issue by calculating estimates. While “guesstimations” are imprecise, they do provide a way to understand the scope of a problem.
Watching this lecture, or a portion of it, could spark a discussion about “guesstimations” in the news and academic resources with your students. Some questions to discuss include:
Where do “guesstimations” appear?
What purposes do “guesstimations” serve?
What are the limitations of “guesstimations”?
What are appropriate uses and applications of “guesstimations”?
Dr. Weinstein also asks a key question about a “guesstimation”:
Is this a lot or a little?
It can be hard to know if a “guesstimation” is big or small. Consequently, Dr. Weinstein emphasizes the need to compare the numbers to something else. A comparison is a great way to make sense of numbers, whether they are estimates, actual counts, probabilities, or statistics. When creating or evaluating “guesstimations,” a helpful rule of thumb is to find something to compare them with that puts them in context. In the grocery bag example, he explains how to compare a person’s annual use of plastic bags to the gasoline burned by driving her car. It turns out that the number of plastic bags an individual uses is insignificant compared to how much gas her car burns. The lecture contains many more examples like this — have a look!
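A comparison like this can be sketched as a few lines of back-of-envelope arithmetic, which students could reproduce by hand or in code. Every number below is an illustrative round-figure assumption of ours, not a value from Dr. Weinstein’s lecture — the point is the method of comparing two quantities, not the exact figures:

```python
# Back-of-envelope "guesstimation": mass of plastic bags used in a year
# vs. mass of gasoline burned in a year. All numbers are illustrative
# assumptions, not data from the lecture.

bags_per_week = 5           # assumed shopping bags per week
grams_per_bag = 8           # assumed mass of one plastic bag, in grams
plastic_kg = bags_per_week * 52 * grams_per_bag / 1000  # kg of plastic/year

miles_per_year = 10_000     # assumed annual driving distance
miles_per_gallon = 25       # assumed fuel economy
kg_per_gallon = 2.8         # approximate mass of one gallon of gasoline
gas_kg = miles_per_year / miles_per_gallon * kg_per_gallon  # kg of gas/year

ratio = gas_kg / plastic_kg
print(f"Plastic bags: ~{plastic_kg:.0f} kg/year")
print(f"Gasoline:     ~{gas_kg:.0f} kg/year")
print(f"The car burns roughly {ratio:.0f}x more mass in gasoline")
```

With these assumed inputs, the gasoline outweighs the plastic by a factor of several hundred — which is exactly the kind of “is this a lot or a little?” context a comparison provides.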
Image: “Bags Plastic Shopping Household Colorful Sunny” by BRRT, on Pixabay. CC0 Public Domain.
For example, Johnson and Gluck shed light on self-reported data:
How many times did you eat junk food last week?
How much TV did you watch last month?
How fast were you really driving?
When you ask people for information about themselves, you run the risk of getting flawed data. People aren’t always honest. We have all sorts of biases. Our memories are far from perfect. With self-reported data, you’re assuming that “8” on a scale of 1 to 10 is the same for all people (it’s not). And you’re counting on people to have an objective understanding of their behavior (they don’t). (p. 20-1)
Johnson and Gluck acknowledge that “[s]elf-reported data isn’t always bad…. It’s just one more thing to watch out for, if you’re going to be a smart consumer of data.” This salient point is easy to keep in mind when looking at sources with students, reading the newspaper, browsing the web, listening to the radio on the way home from work, etc.
Everydata isn’t about the math; it’s about understanding the data and numbers that you encounter. Take a look at it for more practical tips like that one!
Often, when students make pie charts, they end up with several very narrow slices. One thing we can encourage our students to do is think about what happens to their readers’ comprehension when the slices get so narrow that they’re almost impossible to see.
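One way to make this concrete is to have students compute the angle each slice would actually occupy: a slice’s angle is just its share of 360 degrees. The category counts below are made-up illustrative data, not from any classroom:

```python
# How narrow do pie slices get? Each slice's angle is its share of 360 degrees.
# These category counts are hypothetical, chosen only to illustrate the point.
counts = {"A": 50, "B": 30, "C": 12, "D": 4, "E": 2, "F": 1, "G": 1}
total = sum(counts.values())

for label, n in counts.items():
    share = n / total
    print(f"{label}: {share:.1%} of the data -> {share * 360:.1f} degrees")
```

A 1% category gets only 3.6 degrees of arc — thinner than the tick marks on many printed charts, which is why readers simply can’t compare those slices by eye.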
[F]ake news exists because as a society we have failed to teach our citizens data and information literacy … [R]equiring programming and data science courses in school would certainly create more technically-literate citizens, but this is not the same as data literacy and the kind of critical and devil’s advocate thinking it requires. Technology is also not a panacea here, as there is no simple magic algorithm that can eliminate false and misleading news. Instead, to truly solve the issue of “fake news” we must blend technological assistance with teaching our citizens to be data literate consumers of the world around them.
We focused our project on how students “read” and “write” with data because we know how important it is to develop this kind of critical thinking around data.
As you approach the New Year, what will you do to help your students better understand the data in the world around them?