Have you ever found yourself in an argument with someone who claims they have “evidence” for something and will not shut up about it? Do you feel uncomfortable quarrelling with said “evidence”, even though you know the other person is wrong?
Here is a simple guide to spotting what is good evidence and what is not.
Is it even research?
This shouldn’t even need to be said, but I have seen a lot of people who believe a hypothesis to be evidence. It is not. This is particularly true in the case of evolutionary psychology: much of it is hypothesising without research. In other words, the authors publish a paper about what they think might be the case without actually testing whether it is the case.
It is perfectly possible to publish hypotheses without tests to stimulate discussion and debate. There is a whole journal dedicated to this! Medical Hypotheses has published some gems, including a particularly offensive paper about how “mongoloid” is an adequate name for people with Down’s Syndrome because like people from the Far East, people with Down’s Syndrome sit cross-legged and like eating food with MSG in it. Seriously. That was actually published.
Check whether the “evidence” contains data collection and statistical tests. If it does not, it is likely to be wild speculation, not evidence.
What sort of research is it?
This graphic is called the “pyramid of evidence”. It is a good way of ranking the sorts of evidence used in medicine, although it can be applied elsewhere. At the bottom is “background information”, upon which hypotheses are formed and, as seen above, sometimes published to stimulate debate. Moving up through the pyramid, we see better types of evidence: case studies, cohort studies, randomised controlled trials, and then, right at the top, systematic reviews. A systematic review is the “gold standard” of evidence. It gathers the data from all of the tests of a theory, drug, medical intervention, etc., and, usually through a meta-analysis, pools it into one data set, producing an “effect size” which estimates how effective or “correct” the object in question is.
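To make the “pooling” idea concrete, here is a minimal sketch of one common way it can be done: a fixed-effect, inverse-variance weighted average. The function name `pooled_effect` and the study numbers are invented for illustration, and real systematic reviews often use more elaborate models (random effects, for example).

```python
import math

def pooled_effect(studies):
    """Fixed-effect, inverse-variance pooled estimate.

    `studies` is a list of (effect_size, standard_error) pairs.
    Returns (pooled_effect, pooled_standard_error).
    """
    # Each study is weighted by 1 / SE^2, so more precise studies count for more.
    weights = [1.0 / se ** 2 for _, se in studies]
    total_weight = sum(weights)
    pooled = sum(w * effect for w, (effect, _) in zip(weights, studies)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled_se

# Made-up numbers purely for illustration: (effect size, standard error).
hypothetical_studies = [(0.30, 0.10), (0.10, 0.20), (0.25, 0.05)]
effect, se = pooled_effect(hypothetical_studies)

# Approximate 95% confidence interval around the pooled estimate.
low, high = effect - 1.96 * se, effect + 1.96 * se
print(f"pooled effect = {effect:.3f}, 95% CI = ({low:.3f}, {high:.3f})")
```

Note that the pooled standard error is smaller than that of any single study, which is a large part of why a review of many studies beats one study on its own.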
The distinction between different types of research is important. Cohort studies are usually correlational: while correlation can hint at causation, correlational studies can only point us in the right direction. To properly establish causation, we need to manipulate some variables. Say, for example, we want to test whether exposure to feminist thought leads to lower levels of sexism. This can be tested by exposing one group of people to feminist thought, while keeping a control group of people who are not exposed to it. Before and after exposure, one would measure levels of sexism in both groups. This study has actually been done, and it found that sexism decreased following exposure to feminism.
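The logic of that design can be sketched in a few lines: work out how much each group changed, then subtract the control group's change from the exposed group's change. The scores below are invented for illustration and are not taken from the study mentioned above. The function names are hypothetical too, and a real analysis would also run a statistical test on the difference.

```python
from statistics import mean

def mean_change(before, after):
    """Average within-person change (after minus before)."""
    return mean([a - b for b, a in zip(before, after)])

def treatment_effect(exposed_before, exposed_after, control_before, control_after):
    """Change in the exposed group minus change in the control group."""
    return (mean_change(exposed_before, exposed_after)
            - mean_change(control_before, control_after))

# Invented sexism scores (higher = more sexist), not data from any real study.
exposed_before = [20, 24, 18, 22]
exposed_after = [15, 20, 14, 19]
control_before = [21, 23, 19, 21]
control_after = [20, 23, 18, 21]

effect = treatment_effect(exposed_before, exposed_after, control_before, control_after)
print(f"change attributable to exposure: {effect}")
```

Subtracting the control group's change matters because scores can drift for reasons that have nothing to do with the exposure, such as people simply getting used to the questionnaire.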
If the “evidence” being provided is one correlational study, then it might not be very good evidence. Ask if there are any systematic reviews available, or at the very least an experimental study.
Quality of the evidence
On the evidence pyramid, there is a second dimension: quality of the research. Quality is made up of a number of important attributes, and it is important to check whether the evidence is good quality or not.
One crucial indicator is the sample. To get good results, the experiment needs to be conducted on a large group of people. The sample should, ideally, comprise different people from different walks of life. Unfortunately, a lot of psychological research is conducted on psychology students, which throws a lot of it off-kilter, as students are younger and richer than most of the rest of us, and a lot wiser to how psychological tests work. Look and see who was in the study. It is a useful way of understanding how well the results apply to everyone else.
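As a rough illustration of why size matters, the standard error of an average shrinks with the square root of the sample size. This is a minimal sketch, with a hypothetical function name and an invented standard deviation of 10:

```python
import math

def standard_error(sample_sd, n):
    """Standard error of a sample mean: sd / sqrt(n)."""
    if n <= 0:
        raise ValueError("n must be positive")
    return sample_sd / math.sqrt(n)

# Quadrupling the sample size halves the uncertainty around the average.
for n in (25, 100, 400):
    print(n, standard_error(10.0, n))
```

A bigger sample only buys precision, though. If everyone in it is a psychology student, a thousand of them will give a very precise answer about psychology students.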
Another aspect of quality is the state of the comparison group. If there is no comparison group whatsoever, be very cautious: the evidence is probably of terrible quality. I have seen many people try to draw conclusions about the differences between men and women based on studies of only men, or only women. The comparison group, if present, needs to be, of course, comparable. If a study is testing the differences between men and women, and the women in the comparison group are less educated, for example, then the results could be down to education rather than gender.
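The education example can be made concrete with a toy data set. In the sketch below, every record is invented, and `mean_by` is a hypothetical helper. The men in the sample mostly have degrees and the women mostly do not, so a gap shows up between the genders overall even though there is none once people with the same education are compared.

```python
from collections import defaultdict
from statistics import mean

# Invented records: (gender, education, test score). Not real data.
records = [
    ("man", "degree", 70), ("man", "degree", 72), ("man", "degree", 68),
    ("man", "no degree", 50),
    ("woman", "degree", 70),
    ("woman", "no degree", 50), ("woman", "no degree", 52), ("woman", "no degree", 48),
]

def mean_by(records, key):
    """Average score for each group, where `key` picks the group for a record."""
    groups = defaultdict(list)
    for record in records:
        groups[key(record)].append(record[2])
    return {group: mean(scores) for group, scores in groups.items()}

# Comparing genders directly mixes in the education difference.
overall = mean_by(records, key=lambda r: r[0])
raw_gap = overall["man"] - overall["woman"]

# Comparing like with like: same education level, different gender.
stratified = mean_by(records, key=lambda r: (r[0], r[1]))

print("raw gap between men and women:", raw_gap)
for level in ("degree", "no degree"):
    gap = stratified[("man", level)] - stratified[("woman", level)]
    print(f"gap among people with {level}:", gap)
```

This is why it is worth checking whether a paper reports how the groups compare on things like age, education and income, and whether the analysis adjusts for them.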
For the sake of brevity, I point you towards this excellent (freely available) paper which teaches readers to critically evaluate the quality of a paper. Knowledge of this is power.
Popular science books are not evidence
Anyone can write and publish a book, particularly in the age of self-publishing. Even books from “big names” such as Steven Pinker are not good evidence, as books are not subject to peer review. Peer review is a process which is used in the academic community for checking whether a paper is valid: before anyone publishes the paper, it will be read through by several other experts in the same field of research. Often, the reviewers will want to see some of your data to verify your findings. They also, more often than not, send the paper back to you and tell you that perhaps you might want to reinterpret your findings or clarify certain bits of the research, or that you’ve made a massive honking error. They also ask you to draw attention to the limitations of your research, so readers can be aware of any of the possible pitfalls outlined above. It’s a lengthy process, but it means that journals aren’t publishing any old crap.
For books, this is not the case. Often, the text is read by an editor with no experience in the field of research. If the writer fucks up somewhere, it won’t get caught and will be published anyway.
One example of this is the book The Spirit Level. There are a few holes in the evidence presented in the book which are dealt with in the reply book The Spirit Level Delusion. The author of Delusion rightly criticises problems which appear in The Spirit Level but, unfortunately, is tilting at windmills: most of the peer-reviewed evidence upon which The Spirit Level is based stands up pretty well. It is only some of the bits that didn’t get peer-reviewed and were thrown into the book anyway which can be picked apart. Essentially, The Spirit Level stands up, but due to the sloppiness of the book publication process, it left itself with some open goals in the form of downright shoddy analysis, leaving many (wrongly) thinking the entire theory disproved.
If the only evidence linked is books, be wary. Demand to see peer-reviewed evidence instead. These days, a lot of it is available for free, and even if a paper is not, you can usually see the abstract.
I hope this guide will be helpful for would-be troll-slayers. Use your knowledge. Use it wisely. Happy hunting!