# Statistics: A Good P-value is Not Enough

From the Bitesize Bio channel

Like many scientists, I don’t consider myself a statistics expert. But I am determined to do things right in my science, and that includes statistics.

In my experience, a lot of scientists who are “scared” of statistics fall into the trap of ignoring the existence of anything beyond a t-test. But using the right method to analyse your data is essential to having confidence in your results, and there are a lot more methods out there than the t-test.

So rather than asking Excel to do a quick t-test for any type of data, I take out my statistics book and read until I’m confident that I have found the right statistical method to use.  If you are scared of stats, I hope that this article can convince you to do the same.

Many tests are based on the assumption that the data follows a normal distribution. However, this is often not true for biological data: for instance, you cannot have a negative concentration of a certain protein in your blood, so such measurements are bounded at zero and often skewed. Likewise, very small sample sizes (e.g. n<10) require special treatment. My point here is not to explain to you what you need to do in these cases, but to make you aware that choices need to be made. As an example, let’s take a look at how P-values can be used and misused.

## Abusing the P-value

Choosing the right technique is not all there is to it; the way you present the outcome is equally important. I often see people cite P-values in articles without mentioning the effect size found. A P-value in itself says nothing about biological meaning.

As an example, let’s consider two correlations (associations between two continuous variables). A correlation coefficient ‘r’ describes the degree of ‘straight line’ association between the values of two variables, and it can take any value between -1 and 1 (see approximate examples in the figure). The closer r is to -1 or 1, the more the points in a scatter diagram lie on a straight line. An r close to 0 means there is no straight-line pattern in the scatter diagram; that is, the two variables are not linearly correlated, although they may still be related in some non-linear way.
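To make the definition concrete, here is a minimal Python sketch that computes r by hand. The function name `pearson_r` and the three small data sets are made up for illustration; they are not measurements from any study.

```python
import math


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient r for two paired samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squares and cross-products around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)


# Made-up illustration data (hypothetical, not real measurements).
dose = [1.0, 2.0, 3.0, 4.0, 5.0]
rising = [2.1, 3.9, 6.2, 8.1, 9.8]    # climbs with dose: r close to +1
falling = [9.8, 8.1, 6.2, 3.9, 2.1]   # drops with dose: r close to -1
u_shaped = [4.0, 1.0, 0.0, 1.0, 4.0]  # clear pattern, but not a straight line

for label, ys in [("rising", rising), ("falling", falling), ("u_shaped", u_shaped)]:
    print(label, round(pearson_r(dose, ys), 3))
```

The U-shaped series is there as a reminder that r only measures straight-line association: its r is zero even though the two variables are obviously related.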

If we only give the significance level of a correlation, we have no idea about the strength of the association, and thus its relevance. If we compare r=0.4 and p<0.001 with r=0.82 and p=0.06, people might get more excited about the former, due to its high statistical significance.

However, all the former says is that there is strong evidence for a correlation that is not zero (p<0.001: if there were truly no correlation, you would see a coefficient at least this large less than 0.1% of the time), and the estimated coefficient of 0.4 is in fact not very impressive. On the other hand, p=0.06 is generally considered non-significant, as the level of statistical significance is often arbitrarily set at 0.05. But in this example, a correlation of 0.82 is quite strong, so something seems to be going on here, and it may be more biologically relevant than a statistically significant correlation of 0.4. To give you a feel for what these numbers mean: R² (r squared) represents the fraction of the variance of Y that is explained by its linear relationship with X. For r=0.4 that is only 16%, for r=0.82 it is about 67%, and r=0.75 gives about 56%, so roughly from r=0.75 and above we’re really talking business.
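The interplay between r, sample size and P-value can be sketched in a few lines of Python. This is only an approximation under stated assumptions: it uses the Fisher z-transformation (a normal approximation) rather than an exact t-based test, and the sample sizes of 100 and 5 are hypothetical, chosen to mimic the two situations above. The names `approx_p_value` and `variance_explained` are illustrative.

```python
import math
from statistics import NormalDist


def approx_p_value(r, n):
    """Approximate two-sided P-value for the null hypothesis of zero correlation.

    Uses the Fisher z-transformation, so the result is rough for very
    small n; an exact t-based test would give slightly different numbers.
    """
    if not -1 < r < 1:
        raise ValueError("r must lie strictly between -1 and 1")
    if n < 4:
        raise ValueError("the approximation needs at least 4 pairs")
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2 * (1 - NormalDist().cdf(abs(z)))


def variance_explained(r):
    """Fraction of the variance of Y explained by a linear fit on X (R squared)."""
    return r ** 2


# Hypothetical sample sizes; the article does not state the real ones.
for r, n in [(0.4, 100), (0.82, 5)]:
    p = approx_p_value(r, n)
    print(f"r={r}, n={n}: P is about {p:.4f}, R^2={variance_explained(r):.2f}")
```

With these assumed sample sizes, the weak correlation comes out far below 0.001 while the strong one stays above 0.05, even though the strong one explains roughly four times as much of the variance.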

With a larger sample size, the correlation of 0.82 might easily have reached statistical significance. By the way, you could have known this beforehand had you performed, as one should, a power analysis before starting data acquisition: it tells you the sample size required to detect a biologically significant effect.
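As a rough illustration of what such a power analysis looks like for a correlation, the sketch below estimates the number of paired observations needed to detect a given r. It relies on the Fisher z approximation, a two-sided test, and the conventional but arbitrary defaults of a 0.05 significance level and 80% power; the function name `required_sample_size` is made up for this example, and dedicated power-analysis software may give slightly different answers.

```python
import math
from statistics import NormalDist


def required_sample_size(r, alpha=0.05, power=0.80):
    """Approximate number of pairs needed to detect a true correlation r.

    Assumes a two-sided test of zero correlation and the Fisher
    z-transformation; treat the result as a planning estimate only.
    """
    if not 0 < abs(r) < 1:
        raise ValueError("r must be non-zero and strictly between -1 and 1")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = NormalDist().inv_cdf(power)          # quantile for the desired power
    n = ((z_alpha + z_power) / math.atanh(abs(r))) ** 2 + 3
    return math.ceil(n)


for r in (0.82, 0.4):
    print(f"r={r}: about {required_sample_size(r)} pairs needed")
```

Under these assumptions, a correlation as strong as 0.82 needs only about 9 pairs, whereas a correlation of 0.4 needs about 47. The weaker the effect you want to detect, the more data you have to plan for.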

This was just one example to make my point about how important it is to correctly present your data. I hope the take-home message is clear: a P-value by itself is never informative!

So when reading an article, do look at the data in graphs and tables, and not just at the P-value and the author’s conclusions; you might find that the author had an optimistic interpretation of their results.

If you have never understood anything of statistics and you don’t want to think about it at all, that’s ok (you’re not the only one!). Ask someone who is not afraid of statistics, ideally before you embark on your project, to ensure that you are collecting the right type of data to answer your questions.

## Source:

Altman DG. 1999. Practical statistics for medical research. Chapman & Hall/CRC.


# Judith R. Brouwer

As a PhD student in Rotterdam, The Netherlands, and as a postdoc in Paris, France, she worked on trinucleotide repeat disorders. First she focussed on the pathogenesis but gradually she moved to studying epigenetic consequences of these unstable...


Dear Judith,
Nice piece! I completely agree with you. 1: There is a lot of statistics abuse out there. And 2: don’t be afraid of stats, it is not scary at all!

When you only present the p-value of your correlation, you indeed miss out on the strength of the effect. But more importantly, you miss out on the direction as well. A positive correlation indicates that when one parameter increases, the other one increases as well. With a negative correlation, when one increases the other one decreases. This has major implications for the interpretation of your findings. Yet another reason to always start with describing your findings before even mentioning the stats!

Good luck,
Alma Tostmann
Epidemiologist


Hi Alma,
That’s a very good point, thank you for pointing it out.
And thanks, too, for assuring us that there’s no need to be afraid!
