
Bias in Research Studies

Bias refers to a systematic error in a research or statistical study. The first large category of bias we will look at is selection bias. Selection bias occurs when study subjects are not selected appropriately, or when they are not all retained in a cohort study. The following are subtypes of selection bias.

Sampling bias, also called ascertainment bias, occurs when the study population differs from the target population because subjects were not selected at random. Remember, we cannot put the entire population into a study, so the sample we select should be as representative as possible. To do this, we can use randomization when selecting subjects.
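As a rough illustration of why random selection helps, the sketch below compares a convenience sample with a simple random sample. The population, its ages, and the helper name simple_random_sample are all hypothetical and chosen only to make the contrast visible.

```python
import random


def simple_random_sample(population, n, seed=0):
    """Draw n subjects without replacement, each equally likely to be chosen."""
    rng = random.Random(seed)  # fixed seed only so the sketch is reproducible
    return rng.sample(population, n)


# Hypothetical target population of 1,000 ages: the first 200 entries are
# older clinic attendees, the remaining 800 are younger community members.
population = [70] * 200 + [30] * 800
true_mean = sum(population) / len(population)

# Convenience sample: the first 100 people encountered (all clinic attendees).
convenience = population[:100]
# Random sample: every member of the population has the same chance of selection.
randomized = simple_random_sample(population, 100)

convenience_mean = sum(convenience) / len(convenience)
randomized_mean = sum(randomized) / len(randomized)

print("target population mean age:", true_mean)
print("convenience sample mean age:", convenience_mean)
print("random sample mean age:", randomized_mean)
```

In this made-up population the convenience sample contains only clinic attendees, so its mean age is far from the target mean, while the random sample lands much closer.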

Non-response bias occurs when there is a high rate of non-responders, and those who don’t respond differ from those who do in a significant manner.
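The size of non-response bias can be worked out with simple arithmetic. The sketch below uses invented survey counts (they do not come from any real study) and a hypothetical helper, nonresponse_bias, to compare the prevalence seen among responders with the prevalence in everyone who was invited.

```python
def nonresponse_bias(resp_cases, resp_total, nonresp_cases, nonresp_total):
    """Return (observed prevalence, true prevalence, bias) for a survey.

    The observed prevalence uses responders only; the true prevalence
    uses everyone who was invited, responders and non-responders alike.
    """
    observed = resp_cases / resp_total
    true_prev = (resp_cases + nonresp_cases) / (resp_total + nonresp_total)
    return observed, true_prev, observed - true_prev


# Hypothetical survey: 1,000 people invited, 400 respond.
# 40 of the 400 responders have the condition (10%),
# 180 of the 600 non-responders have it (30%).
observed, true_prev, bias = nonresponse_bias(40, 400, 180, 600)

print(f"observed prevalence: {observed:.2f}")
print(f"true prevalence:     {true_prev:.2f}")
print(f"bias:                {bias:.2f}")
```

With these numbers the survey underestimates prevalence. The bias equals the non-response fraction multiplied by the difference between responders and non-responders, so it shrinks if either the non-response rate or that difference is small.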

Berkson bias occurs when a research study draws its subjects from hospitalized patients, because results from this particular population may not apply to the target population. Often, this is because those in hospital are less healthy than the general population.

The healthy worker effect is the opposite of Berkson bias, as it occurs when those enrolled in the study are generally healthier than the target population. This is often seen in cohort studies that look at occupational disease. The problem occurs because the exposed group consists of workers, who are generally healthier than the non-exposed group, which is selected from the general population. This is because those who are very ill or disabled are often excluded from the workforce. This effect was first observed in 1885 by William Ogle, who found that workers in vigorous occupations had lower mortality than those in easier occupations and those who were unemployed.

Prevalence or Neyman bias occurs when the exposure being studied occurred so long in the past that some affected individuals have died or recovered, and are therefore missed during the assessment portion of the study.

Attrition bias occurs when there is significant loss of study subjects during follow-up, and those who were lost to follow-up differ from those who remained in a significant manner.


The second large category of bias we will look at is observational bias, which occurs when there are problems with classification or measurement of the outcome, exposure, or other variable. The following are subtypes of observational bias.

Recall bias occurs when those with the disease or outcome being studied are more likely to recall and report an exposure than those who are healthy (i.e., the control subjects). This bias is often seen in retrospective studies. To reduce recall bias, the study should be done as close as possible to the time of the exposure.

Observer bias, also called observer-expectancy bias, occurs when researchers differ in how they interpret the study, or when they hold an expectation before the study begins that affects how they observe or interpret the data. For example, a researcher who believes that an exposure or treatment will have positive effects is more likely to observe a positive effect than a negative one. To reduce observer bias, a study can be double-blinded, so that neither the observers nor the subjects are aware of which group is receiving the treatment or exposure. To do this, a placebo can be given to the non-exposed (control) group.

Procedure bias occurs when study subjects in the exposed (treatment) and the non-exposed (control) groups are not treated the same. For example, those with disease may be given better care. As with observer bias, procedure bias can be reduced by double-blinding the study and using a placebo.

Measurement bias occurs when the data is measured inaccurately. To reduce this, the data collection methods should be standardized.

Reporting bias occurs when fear of stigmatization leads subjects to under-report or over-report exposures.

The Hawthorne effect occurs when study subjects who are aware they are being observed behave differently. This was first described in the 1920s at the Hawthorne plant of the Western Electric Company, where researchers observed an increase in the productivity of workers who were aware they were being observed.

Surveillance or detection bias occurs when those with the exposure are monitored more closely than those without it because of the characteristics of the outcome or disease. This in turn increases the probability of identifying the disease or outcome in the exposed group compared to the non-exposed group. To reduce detection bias, data collection should be standardized, and both subjects and observers should be blinded to the allocation of study subjects into groups.

Lead-time bias occurs when an outcome or disease is detected earlier in its course, giving the impression that survival is longer when only the time of diagnosis has moved. To reduce this, the severity of the disease when it was first detected needs to be accounted for.
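A small worked example can make lead-time bias concrete. The ages below are invented for a single hypothetical patient, and subtracting the lead time is only one simple way to illustrate the correction, not a full method for accounting for disease severity at detection.

```python
def survival_from_diagnosis(age_at_diagnosis, age_at_death):
    """Years survived after diagnosis, the quantity a naive comparison reports."""
    return age_at_death - age_at_diagnosis


# Hypothetical patient who dies at age 70 whether or not screening is used.
AGE_AT_DEATH = 70
AGE_DIAGNOSED_BY_SYMPTOMS = 67   # diagnosis without screening
AGE_DIAGNOSED_BY_SCREENING = 63  # earlier diagnosis through screening

survival_symptoms = survival_from_diagnosis(AGE_DIAGNOSED_BY_SYMPTOMS, AGE_AT_DEATH)
survival_screening = survival_from_diagnosis(AGE_DIAGNOSED_BY_SCREENING, AGE_AT_DEATH)

# Lead time: how much earlier screening moved the diagnosis.
lead_time = AGE_DIAGNOSED_BY_SYMPTOMS - AGE_DIAGNOSED_BY_SCREENING

# Removing the lead time shows that survival did not actually change.
adjusted_survival_screening = survival_screening - lead_time

print("survival after symptomatic diagnosis:", survival_symptoms)
print("survival after screening diagnosis:", survival_screening)
print("lead time:", lead_time)
print("screening survival minus lead time:", adjusted_survival_screening)
```

Here the apparent survival rises from 3 to 7 years even though the patient dies at the same age; the 4 extra years are entirely lead time.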

Protopathic bias occurs when a treatment is given early in the course of a disease, usually before it is diagnosed. When the disease later becomes more severe and is diagnosed, it appears as if the treatment caused the disease.

Confounding bias occurs when there is a confounding variable, which is a factor that is related to both the exposure and the outcome and distorts their apparent relationship, but is not in the causal pathway from exposure to outcome.

For example, daycare attendance is a confounding factor when studying how well the Haemophilus influenzae type b vaccine protects young children against H. influenzae type b infection. This is because daycare attendance is associated with an increased risk of developing H. influenzae type b infection. Also, daycare attendance is higher in vaccinated children, typically because most daycares require vaccination. In this case, the vaccine might be seen as less effective than it is. If, on the other hand, the majority of children in daycare were unvaccinated, and daycare attendance led to more infections on its own, the bias would be in the opposite direction, suggesting that the vaccine is more effective than it is.

To reduce confounding bias, the study subjects can be stratified based on the confounding factor or multivariable analysis can be used. As with selection bias, randomization is probably the most effective way to reduce confounding bias, especially when we can’t identify the confounding factor(s).
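The effect of stratifying on a confounder can be sketched with made-up counts for the daycare example. All numbers below are hypothetical and constructed so that the vaccine halves the risk of infection within each stratum. The pooled estimate uses the Mantel-Haenszel risk ratio, one common way to combine strata; it is shown here as an illustration, not as the method any particular study used.

```python
def risk_ratio(exp_cases, exp_total, unexp_cases, unexp_total):
    """Risk in the exposed (vaccinated) divided by risk in the unexposed."""
    return (exp_cases / exp_total) / (unexp_cases / unexp_total)


def mantel_haenszel_rr(strata):
    """Pooled risk ratio across strata using Mantel-Haenszel weights."""
    numerator = 0.0
    denominator = 0.0
    for exp_cases, exp_total, unexp_cases, unexp_total in strata:
        stratum_size = exp_total + unexp_total
        numerator += exp_cases * unexp_total / stratum_size
        denominator += unexp_cases * exp_total / stratum_size
    return numerator / denominator


# Hypothetical counts per stratum:
# (vaccinated cases, vaccinated total, unvaccinated cases, unvaccinated total)
strata = {
    "daycare": (32, 800, 16, 200),     # high baseline risk, mostly vaccinated
    "no daycare": (2, 200, 16, 800),   # low baseline risk, mostly unvaccinated
}

# Crude analysis: ignore daycare attendance and pool the raw counts.
crude_counts = [sum(column) for column in zip(*strata.values())]
crude_rr = risk_ratio(*crude_counts)

# Stratified analysis: one risk ratio per stratum, then a pooled estimate.
stratum_rr = {name: risk_ratio(*counts) for name, counts in strata.items()}
pooled_rr = mantel_haenszel_rr(strata.values())

print(f"crude risk ratio: {crude_rr:.2f}")
for name, rr in stratum_rr.items():
    print(f"risk ratio within '{name}': {rr:.2f}")
print(f"Mantel-Haenszel pooled risk ratio: {pooled_rr:.2f}")
```

In these invented data the crude risk ratio is slightly above 1, so the vaccine looks ineffective, while the risk ratio within each stratum and the pooled estimate are both about 0.5. The difference arises only because vaccinated children are concentrated in the higher-risk daycare stratum.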

Susceptibility bias occurs when the risk of the outcome occurring is different between the exposed and non-exposed groups, but this difference is not due to the exposure, rather it is due to a confounding variable.

