Mendelian Randomization
What is Mendelian Randomization
- = an instrumental variable (IV) method from causal inference that uses genetic variants as instruments to assess the causal relationship between a risk factor (exposure) and an outcome
- It is often used in biomedical and social science research
- It can be univariable (one exposure) or multivariable (several exposures estimated jointly)
How does it work
Intuition (why "Mendelian")
Because genes are randomly assigned at conception (following Mendel's laws), they are generally independent of confounders.
-> If the genetic variants associated with the exposure are also associated with the outcome, this provides evidence for a causal effect.
The logic is similar to a “natural randomized controlled trial”.
Step by step
- Identify instrumental variables (IVs)
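  - select genetic variants (usually SNPs) that are strongly associated with the exposure.
  - mathematically, for each SNP $G_j$ and exposure $X$: $X = \alpha_0 + \alpha_j G_j + \varepsilon_X$, where $\alpha_j \neq 0$ is the effect of $G_j$ on $X$ and $\varepsilon_X$ is the error term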
- Check independence from confounders
  - ensure that each SNP $G_j$ is not associated with the confounders $U$ of the exposure-outcome relationship: $\operatorname{Cov}(G_j, U) = 0$
- Estimate the effect of exposure on outcome
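  - use the genetic instrument to estimate the causal effect $\beta$ of the exposure $X$ on the outcome $Y$: $Y = \beta_0 + \beta X + \varepsilon_Y$
  - since $X$ is partially determined by $G_j$, the instrumental variable (IV) estimate of $\beta$ can be computed as the Wald ratio: $\hat{\beta}_{IV} = \frac{\operatorname{Cov}(G_j, Y)}{\operatorname{Cov}(G_j, X)}$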
- Interpret causal effect
  - if the estimate $\hat{\beta}_{IV}$ is significantly different from zero, it suggests a causal effect of the exposure on the outcome.
  - this estimate is less likely to be biased by confounding or reverse causation, because alleles are allocated randomly at conception.
- if
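The steps above can be sketched on simulated data. This is only an illustration under assumed conditions, not a recipe for real GWAS data: a single SNP `g`, one unobserved confounder `u`, linear effects, and hypothetical parameter values (`alpha`, `beta`, the allele frequency 0.3) chosen for the example. It compares the naive regression of outcome on exposure with the ratio-of-covariances IV estimate and reports a first-stage F-statistic as a relevance check.

```python
import numpy as np

def simulate(n=200_000, beta=0.5, alpha=0.3, seed=0):
    """Simulate one SNP g, exposure x, and outcome y (all values hypothetical)."""
    rng = np.random.default_rng(seed)
    g = rng.binomial(2, 0.3, size=n).astype(float)  # allele count 0/1/2
    u = rng.normal(size=n)                          # confounder, drawn independently of g
    x = alpha * g + u + rng.normal(size=n)          # exposure depends on g and u
    y = beta * x + u + rng.normal(size=n)           # outcome depends on x and u, not on g directly
    return g, x, y

def slope(a, b):
    """OLS slope from regressing b on a (with intercept)."""
    return np.cov(a, b)[0, 1] / np.var(a, ddof=1)

def wald_ratio(g, x, y):
    """IV estimate: SNP-outcome slope divided by SNP-exposure slope."""
    return slope(g, y) / slope(g, x)

def first_stage_f(g, x):
    """F-statistic of the single-instrument regression of x on g."""
    n = len(g)
    r2 = np.corrcoef(g, x)[0, 1] ** 2
    return r2 * (n - 2) / (1 - r2)

g, x, y = simulate()
beta_ols = slope(x, y)          # confounded by u
beta_iv = wald_ratio(g, x, y)   # uses only the variation in x explained by g
f_stat = first_stage_f(g, x)

print(f"naive OLS estimate: {beta_ols:.3f}")
print(f"IV (Wald ratio):    {beta_iv:.3f}")
print(f"first-stage F:      {f_stat:.1f}")
```

In this setup the naive estimate absorbs the confounder's contribution, while the IV estimate should land near the simulated `beta`. A common rule of thumb treats a first-stage F below about 10 as a sign of a weak instrument.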
Key Assumptions
- Relevance: genetic variants are associated with the exposure
- Independence: genetic variants are independent of confounders
- Exclusion restriction: genetic variants affect the outcome only through the exposure
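A short derivation shows how the three assumptions combine. It assumes a linear model with a constant causal effect, which the notes do not state explicitly. The symbols are introduced here for illustration: $G$ is the genetic variant, $X$ the exposure, $Y$ the outcome, $\beta$ the causal effect, and $\varepsilon$ collects every other cause of $Y$ (confounders and noise).

$$
\begin{aligned}
Y &= \beta X + \varepsilon \\
\operatorname{Cov}(G, Y) &= \beta \operatorname{Cov}(G, X) + \operatorname{Cov}(G, \varepsilon) \\
\beta &= \frac{\operatorname{Cov}(G, Y)}{\operatorname{Cov}(G, X)} \quad \text{if } \operatorname{Cov}(G, \varepsilon) = 0 \text{ and } \operatorname{Cov}(G, X) \neq 0
\end{aligned}
$$

Independence and the exclusion restriction together give $\operatorname{Cov}(G, \varepsilon) = 0$, and relevance gives $\operatorname{Cov}(G, X) \neq 0$, so the ratio is well defined and equals $\beta$.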
Advantages
- reduces confounding compared to conventional observational analyses
- can help distinguish correlation from causation
- useful for exposures that cannot be randomized experimentally
Limitations
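- weak instruments (variants only weakly associated with the exposure) can bias results: toward the confounded observational estimate in one-sample analyses and toward the null in two-sample analyses
- pleiotropy (genetic variants affecting multiple traits) can violate the exclusion restriction when a variant affects the outcome through a pathway that bypasses the exposure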
- requires large sample sizes for sufficient statistical power
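The pleiotropy limitation can be illustrated with a small simulation. This is a sketch under assumed conditions: a linear model, a single SNP, and hypothetical values for `alpha`, `beta`, and `direct_effect`. When the SNP also affects the outcome directly, the ratio estimate converges to `beta + direct_effect / alpha` rather than `beta`.

```python
import numpy as np

def wald_ratio(g, x, y):
    """Ratio of the SNP-outcome covariance to the SNP-exposure covariance."""
    return np.cov(g, y)[0, 1] / np.cov(g, x)[0, 1]

def simulate_estimate(direct_effect, n=200_000, beta=0.5, alpha=0.3, seed=1):
    """Return the Wald ratio when the SNP has a given direct effect on the outcome."""
    rng = np.random.default_rng(seed)
    g = rng.binomial(2, 0.3, size=n).astype(float)  # allele count 0/1/2
    u = rng.normal(size=n)                          # confounder, independent of g
    x = alpha * g + u + rng.normal(size=n)
    # A nonzero direct_effect opens a SNP -> outcome path that bypasses the exposure,
    # which violates the exclusion restriction.
    y = beta * x + direct_effect * g + u + rng.normal(size=n)
    return wald_ratio(g, x, y)

est_valid = simulate_estimate(direct_effect=0.0)
est_pleiotropic = simulate_estimate(direct_effect=0.15)

print(f"valid instrument:       {est_valid:.3f}")
print(f"pleiotropic instrument: {est_pleiotropic:.3f}")
```

Because the bias term is `direct_effect / alpha`, even a small direct effect is amplified when the instrument is weak (small `alpha`), so the weak-instrument and pleiotropy limitations compound each other.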