If you work with data, chances are you will run into the “nightmare scenario” at least a few times. Things go awry: values are missing, your sample is biased, there are inexplicable outliers, or the sample wasn’t as random as you thought. Some issues you can solve; others are less clear. But before you tear your hair out (or before you tear all of your hair out), check out The Quartz guide to bad data. Hosted on GitHub, The Quartz guide lists possible problems with data and how to solve them, so that researchers have an idea of what their next steps can be when their data doesn’t work as planned.
Translated into six languages and released under a Creative Commons 4.0 license, The Quartz guide divides problems into four categories: issues that your source should solve, issues that you should solve, issues a third-party expert should help you solve, and issues a programmer should help you solve. From there, the guide lists specific issues and explains how they can or cannot be solved.
One of the greatest things about The Quartz guide is its language. Rather than pontificating and making an already frustrating problem more confusing, the guide lays out your options in plain terms. While you may not get everything you need to fix your specific problem, chances are you will at least figure out how to start moving forward after the setback.
The Quartz guide does not mince words. For example, in the “Data were entered by humans” entry, it shows a sample of messy data entry and then says, “Even with the best tools available, data this messy can’t be saved. They are effectively meaningless… Beware human-entered data.” Even if it’s probably not what a researcher wants to hear, sometimes the cold, hard truth can lead someone to the next step in their research.
So if you’ve hit a block with your data, check out The Quartz guide. It may be just the thing that helps you move forward! And if you have questions about working with data, feel free to contact the Scholarly Commons or Research Data Service!