This tool exposes how easy it is to manipulate scientific data
Spoiler alert: It’s stupidly easy.
In the world of research science and academia, you’ll often hear this cynical mantra: “publish or perish.”
The idea is that researchers who frequently publish their work in journals often do better in their careers. Those who publish less are more prone to stagnation, or possibly even losing their jobs altogether.
Such pressures to publish may be leading some researchers to manipulate their data, intentionally or not. Though it may seem difficult to finesse data by accident, it’s actually stunningly easy. In a blog post on FiveThirtyEight titled, “Science Isn’t Broken,” science journalist Christie Aschwanden lets readers see for themselves how easy it is. The article—a fascinating read itself—contains a tool that allows you to work with real-world data to see if you can find a connection between the overall economic health of the nation and whether Republicans or Democrats currently hold the most power. You can choose to include or exclude certain variables. Watch as your results—and their publishability—change with your choices.
Your results are publishable if, after your manipulations, you achieve a “p-value” of less than 0.05.
A p-value, if you didn’t take statistics or need a refresher, is a probability used to judge how surprising a result is. In statistics you are testing two hypotheses: the proposed hypothesis (e.g., “coffee causes cancer”) and the null hypothesis (there is no relationship between coffee and cancer).
The null hypothesis is a very important concept in statistics and is the basis for calculating the p-value. Basically, the p-value tells you the probability that you would get a data set at least as extreme as the one you observed (such as a large number of people who consumed a lot of coffee and later got cancer) if the null hypothesis were true. That means the smaller the p-value, the less compatible your data are with the idea that there is no effect, and the stronger the evidence that your results are real. It is not, however, the probability that the null hypothesis itself is true. (For the record, there is no known link between coffee and cancer; it’s just an example.)
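To make that definition concrete, here is a minimal sketch of a p-value calculation. The numbers are a toy example, not data from Aschwanden's article: suppose the null hypothesis says an outcome happens half the time, and you observe it 60 times out of 100. The function name `one_sided_p_value` and its parameters are made up for illustration.

```python
from math import comb


def one_sided_p_value(successes: int, trials: int, null_rate: float = 0.5) -> float:
    """Probability of seeing at least `successes` hits in `trials` tries,
    assuming the null hypothesis (true hit rate == null_rate) holds."""
    return sum(
        comb(trials, k) * null_rate**k * (1 - null_rate) ** (trials - k)
        for k in range(successes, trials + 1)
    )


if __name__ == "__main__":
    # Toy numbers (an assumption, not from the article): 60 hits in 100 tries.
    p = one_sided_p_value(60, 100)
    print(f"p-value: {p:.4f}")
    print("below the 0.05 cutoff" if p < 0.05 else "not below the 0.05 cutoff")
```

The function simply adds up the probability of every outcome at least as extreme as the one observed, which is what "the probability of the data if the null hypothesis were true" means in practice.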
The problem, as Aschwanden demonstrates with her tool, is that it’s astoundingly easy to get a low p-value.
Using the tool, you can create 1,800 different combinations of variables to yield your result. Of those, 1,078 yield a “publishable” result (a p-value of less than 0.05, an arbitrary standard for “statistical significance” accepted by many scientific journals).
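Why does trying many combinations make a low p-value so easy to find? The sketch below is a simplified illustration, not a reproduction of the FiveThirtyEight tool. It assumes each "analysis" is an independent test on pure random noise (the tool's analyses share real data, so they are not independent), and the function names, sample size, and number of analyses are all hypothetical.

```python
import math
import random


def z_test_p_value(sample):
    """Two-sided p-value for 'the true mean is 0', assuming the data
    are normally distributed with a known standard deviation of 1."""
    n = len(sample)
    z = (sum(sample) / n) * math.sqrt(n)
    return math.erfc(abs(z) / math.sqrt(2))


def chance_of_publishable_result(analyses=20, sample_size=30, studies=1000, seed=0):
    """Fraction of simulated studies in which at least one of `analyses`
    tests on pure noise comes out with p < 0.05."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(studies):
        p_values = [
            z_test_p_value([rng.gauss(0, 1) for _ in range(sample_size)])
            for _ in range(analyses)
        ]
        if min(p_values) < 0.05:
            hits += 1
    return hits / studies


if __name__ == "__main__":
    print("one analysis:", chance_of_publishable_result(analyses=1))
    print("twenty analyses:", chance_of_publishable_result(analyses=20))
```

With a single analysis, only about 1 in 20 noise-only studies should clear the 0.05 bar. With 20 independent tries, the theoretical chance that at least one clears it is 1 - 0.95^20, or roughly 64 percent, even though there is no effect to find.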
Aschwanden is certainly not saying anything that isn’t already known to the scientific community—p-values are notoriously easy to hack.
The problem, Aschwanden says, is that despite the bar being laughably low, many journals will readily accept a p-value of 0.05 or lower as reason enough to publish an article. There are more issues Aschwanden discusses at length, but one biggie she didn’t quite touch on is the propensity for journals to publish only positive results.
In undergrad science, you’re taught that any result, even a negative one, is a good result. However, in the real world of science, not all results are created equal. Journals have a strong bias toward publishing only “positive” results, or papers that boast low p-values indicative of a potential effect. Showing that there’s no relationship between two variables is also incredibly important, but less interesting.
As a result, some are calling for a boycott of the p-value. One journal has even refused to accept papers whose results rely on p-values alone. There are more powerful statistics out there to assess data. The problem is that many researchers may not understand how to use all the tools in their statistical toolbox. P-values are great because they’re easy to calculate and everyone, supposedly, knows what they mean.
Ultimately, Aschwanden’s point is that science is very difficult to do and that even scientists can fall prey to bias and misconduct, just like any other human. All of us, including the scientists themselves, must always be aware of that.
Image via Intel Free Press/Wikimedia (CC BY 2.0)
Cynthia McKelvey covered health and science for the Daily Dot until 2017. She earned a graduate degree in science communication from the University of California, Santa Cruz in 2014. Her work has appeared in Gizmodo, Scientific American Mind, and Mic.com.