# Statistics for Biologists: Chi Square Test and its use in Biology

Statistics is one of the most hated subjects by biologists around the globe. In spite of its daily dose of abuse, knowledge of statistics can be a life-saver.

The chi square distribution, and the test built on it, is one of the most important and widely used tools in inferential statistics for biology and life science students. The test is based on counts (proportions) of the categories present in the experimental conditions.

The Chi square test is popular because of its many strengths, including:

- Easy to compute
- Can also be used for data collected on the nominal (categorical) scale
- Can be used to study the difference among the various variables in consideration
- Makes no assumption that the raw data follow a particular distribution such as the normal (the p value does rely on a large-sample approximation, however)
- Can be used for large data sets

Because of its popularity, I thought I’d review it for you here.

## How is the chi square test used?

The chi square test can be used in two ways:

### 1. Goodness of Fit Test

Goodness of fit test is used when you have a widely accepted theory and want to check whether your observed values are in sync with the theory or not.

- The goodness of fit test is normally used in genetics where the genotypic and phenotypic ratios have already been established for a given test and population.

- You can also use this test when the expected outcome has already been established. For example: you want to evaluate an experiment that you set up in your field based on the test cross given by Mendel, and the results do not look like the accepted ratio. In this case you can check the p value of the goodness of fit test to see whether the observed values are consistent with the theory [a similar example is worked through later]. If the p value is < 0.05, the observed values deviate significantly from the expected ratio; if p ≥ 0.05, the data are consistent with the theory. (Note that the p value says nothing about whether the experiment itself was a success; it only supports an inference about the population your sample represents.)

- In cases related to the Hardy-Weinberg principle.

### 2. Test for Independence of Attributes

Independence of attributes, or the χ^{2} test of association of attributes, is used to understand whether two attributes are connected to one another. It tests whether the proportions of one variable differ across the levels of the other variable.

- Comparison of parameters/attributes among control and test populations
- Evaluation of correlation of disease symptoms with the disease in case of clinical data

## Steps for the proper calculation and understanding of the chi square test

The following steps are generalized steps that could be used for both Goodness of Fit Test as well as for Test of Independence of Attributes.

There are 3 simple steps for every chi square test:

### 1. Develop your hypotheses

It may seem obvious, but to perform any statistical test, you must first define what you are testing. This is based on the hypothesis of your research. For the chi square test, you need a null hypothesis and an alternative hypothesis.

*Formulate the null hypothesis*

The null hypothesis (H0), also known as the hypothesis of NO difference, states that there is no difference in the results before or after the test is performed. For example, let’s say you want to understand the effect of sunlight on plant growth. In this case, your null hypothesis will state that sunlight has no effect on plant growth.

*Formulate the alternative hypothesis*

The alternative hypothesis (H1) is always opposite to that of the null hypothesis. In our plant study, the alternative hypothesis would be that the amount of sunlight affects plant growth rate.

### 2. Do your calculations

Once you know what you are testing, you can apply the calculations. There are many different software packages that you can use, but the test statistic itself is simple:

χ^{2} = Σ [(O − E)^{2} / E], where O is an observed count, E is the corresponding expected count, and the sum runs over all categories (cells).

*Determine Degree of freedom (df)*

df is the number of parameters of the system that may vary independently without violating any constraint imposed on it. Degree of freedom can be easily determined using a matrix system. If you are working with a matrix of 2 (rows) x 3 (columns) then the degree of freedom is:

df = [(2-1)x(3-1)]= 2

Note: Every chi square calculation can be represented as a matrix of counts, as shown in the examples below.
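To make the formula and the df rule concrete, here is a minimal Python sketch. The counts are made up for illustration (100 seedlings expected to split 50:50), and the 2 x 3 matrix matches the df example above:

```python
# Hypothetical counts: 100 seedlings expected to split 50:50, observed 45:55.
observed = [45, 55]
expected = [50, 50]

# chi^2 = sum over cells of (observed - expected)^2 / expected
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(chi_sq)  # 1.0

# df for an r x c matrix of counts: (r - 1) * (c - 1)
rows, cols = 2, 3
df = (rows - 1) * (cols - 1)
print(df)      # 2, as in the example above
```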

### 3. Find p

Use the chi square tables (an example is attached to this post) to determine the *p* value. Check the value in the row corresponding to the degree of freedom calculated in step 2.
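If no table is at hand, the p value can also be computed directly. The sketch below implements the chi square upper-tail (survival) function in plain Python via the standard recurrence for the tail probability (`scipy.stats.chi2.sf` gives the same numbers); the values plugged in are the usual table entries for α = 0.05:

```python
import math

def chi2_sf(x, df):
    """Survival function P(X >= x) for a chi-square variable with df degrees
    of freedom, built from the recurrence
    Q(x; k+2) = Q(x; k) + (x/2)**(k/2) * exp(-x/2) / gamma(k/2 + 1)."""
    if df % 2:            # odd df: start from Q(x; 1) = erfc(sqrt(x/2))
        q, k = math.erfc(math.sqrt(x / 2)), 1
    else:                 # even df: start from Q(x; 2) = exp(-x/2)
        q, k = math.exp(-x / 2), 2
    while k < df:
        q += (x / 2) ** (k / 2) * math.exp(-x / 2) / math.gamma(k / 2 + 1)
        k += 2
    return q

print(round(chi2_sf(7.815, 3), 3))   # ~0.05: the df = 3 table entry
print(round(chi2_sf(3.841, 1), 3))   # ~0.05: the df = 1 table entry
```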

## Now, let’s have a look at performing these tests with a few examples

### Example of Goodness of Fit Test

Assume that you have crossed pure-breeding plants of genotypes A/A, B/B and a/a, b/b and obtained the di-hybrid A/a, B/b. You then test crossed this to a/a, b/b and counted the four classes of offspring in the resulting F1 generation (observed-counts table not reproduced here).

*Step 1: Develop your hypotheses*

H0 = the resulting F1 generation is in accordance with the established theory (1:1:1:1).

H1 = the resulting F1 Generation is not in accordance with the established theory.

*Step 2: Do your calculations*

df = (r − 1) × (c − 1) = (4 − 1) × (2 − 1) = 3; in this case r = 4 is the number of genotype classes in the study (i.e. A/B, a/b, A/b and a/B) and c = 2 is the number of conditions in which the genotypes are studied (viz. observed values and expected values).

*Step 3: Find p*

Find the critical χ^{2} value from the chi square table at df = 3 and α = 0.05:

χ^{2}_{critical} = 7.81

**Result**: You can retain (fail to reject) H0 because the results are in accordance with the established theory: the calculated χ^{2} = 5.2 is smaller than the critical value of 7.81 at α = 0.05. Here α is the **significance level**, not a confidence interval; α = 0.05 corresponds to a 95% confidence level. If you need stronger evidence (or your counts are small), a stricter cut-off such as α = 0.01 (99% confidence level) is sometimes used.
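As a sketch of the whole calculation: the post does not reproduce the observed F1 counts, so the numbers below are hypothetical counts for 80 offspring, chosen so that they give the same χ^{2} = 5.2 as the worked example:

```python
# Goodness-of-fit sketch with hypothetical counts for the four
# test-cross progeny classes (these are NOT the post's data).
observed = [28, 14, 18, 20]
total = sum(observed)
expected = [total / 4] * 4          # 1:1:1:1 ratio -> 20 per class

chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
critical = 7.815                    # chi-square table, df = 3, alpha = 0.05

print(round(chi_sq, 1))             # 5.2
print(chi_sq < critical)            # True -> fail to reject H0
```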

### Example for Independence of Attributes

Assume that in a scenario you have two groups of patients: one diseased and the other non-diseased. 37/54 lucky (or rather unlucky!!) diseased and 13/66 non-diseased individuals were chosen for the administration of drug 1.

*Step 1: Develop your hypotheses*

H0 = Drug 1 does not improve the disease condition

H1 = Drug 1 improves the disease condition

*Step 2: Do your calculations*

**Observed matrix table** [not reproduced here]

**Expected matrix table** [not reproduced here; each expected count is (row total × column total) / grand total]

df = (r − 1) × (c − 1) = (2 − 1) × (2 − 1) = 1; in this case r = 2 (the conditions under observation, i.e. diseased and non-diseased) and c = 2 (the treated and untreated groups).

*Step 3: Find p*

Find the critical χ^{2} value from the chi square table at df = 1 and α = 0.05:

χ^{2}_{critical} = 3.841

**Result**: You can retain (fail to reject) H0: the calculated χ^{2} = 2.89 is smaller than the critical value of 3.841 at α = 0.05, so the data give no evidence that drug 1 improves the disease condition.
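The same mechanics in code, on a hypothetical 2 x 2 table (these counts are stand-ins, not the post's data): rows are treated/untreated patients, columns are improved/not improved, and each expected count comes from (row total × column total) / grand total:

```python
# Test-of-independence sketch for a 2x2 table with hypothetical counts.
observed = [[20, 30],    # treated:   improved / not improved
            [15, 35]]    # untreated: improved / not improved

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand = sum(row_totals)

# Expected count for each cell: (row total * column total) / grand total
expected = [[r * c / grand for c in col_totals] for r in row_totals]

chi_sq = sum((observed[i][j] - expected[i][j]) ** 2 / expected[i][j]
             for i in range(2) for j in range(2))

df = (2 - 1) * (2 - 1)              # = 1
critical = 3.841                    # chi-square table, df = 1, alpha = 0.05

print(round(chi_sq, 2))             # ~1.1
print(chi_sq < critical)            # True -> fail to reject H0
```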

## When can you not use chi square test?

Although the chi square test is powerful, it cannot be used in all situations. In particular, it is not valid:

- When the sampling is biased. For example, if you deliberately choose larval stages of insects for study over pupae and adult stages even though all are present at the collection site, the sampling is biased and accurate entomological deductions cannot be made using chi square statistics.
- When expected frequencies are very small. A common rule of thumb is that no expected cell count should fall below 5; some authors also caution against total sample sizes below about 50. For small samples, Fisher’s exact test can be used instead.
- When the variables are not independent, i.e. where the presence/absence of variable B always depends on the presence/absence of variable A.
- When the data are anything other than frequency (count) data. For example, if you are counting how many patients show resistance to a particular drug versus how many show susceptibility, then a chi square test is appropriate. If the data are in any other format (percentages, means, measurements), the chi square test is not appropriate.
- When the strength of a relationship is required. The chi square test only addresses whether two variables are independent; it cannot measure how strong an association is.
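For the small-sample case mentioned above, Fisher’s exact test can be computed directly. This is a pure-Python sketch of the two-sided test for a 2 x 2 table with hypothetical counts (`scipy.stats.fisher_exact` returns the same p value):

```python
from math import comb

def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].
    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed one."""
    row1, col1, n = a + b, a + c, a + b + c + d
    lo, hi = max(0, col1 - (c + d)), min(row1, col1)
    weights = {x: comb(row1, x) * comb(n - row1, col1 - x)
               for x in range(lo, hi + 1)}
    total = sum(weights.values())           # = comb(n, col1)
    return sum(w for w in weights.values() if w <= weights[a]) / total

# Hypothetical small table: 8/10 responders on the drug vs 1/6 on placebo.
p = fisher_exact_2x2(8, 2, 1, 5)
print(round(p, 3))                          # ~0.035
```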

Hopefully this has helped you understand what a Chi square test is and when and how to use it.
