class: left, middle, inverse, title-slide

.title[
# Confounding
]
.author[
### Mabel Carabali
]
.institute[
### EBOH, McGill University
]
.date[
### 01-08-2022 Updated: (2024-09-25)
]

---
class: middle

# Yes, I measured it, but... is it CAUSAL?

.footnote[ **NEW** and refurbished material from previous instructors and other resources including material from [Epidemiology by design by Daniel Westreich](https://academic.oup.com/book/32358)]

---
class: middle

## .red[WHAT??]

<img src="images/confusedog.png" width="50%" style="display: block; margin: auto;" />

---
class: middle

## Fundamental Problem

<img src="images/L5_MG9.jpeg" width="70%" />
<br>
[Maldonado & Greenland 2002](https://doi.org/10.1093/intjepid/31.2.422)

---
class: middle

## Fundamental Problem

The causal effect: `$$Y^{x=1} - Y^{x=0}$$`

- One potential outcome is **observed** and the other potential outcome is **counterfactual**.

If we’ve observed `\(Y^{x=1}\)`, then by definition we **have not observed** `\(Y^{x=0}\)`.

**The fundamental problem of causal inference is a problem of missing data.**

> _Related: we can’t formally, with absolute certainty, attribute cause in any individual case except for necessary causes._

- One can’t die of AIDS without having HIV – by the definition of AIDS.
- However, we can sometimes attribute cause with high probability.

<span style="color:red"> _You can imagine how it is when an epidemiologist testifies in court about liability due to an exposure._</span>

---
class: middle

## Four Causal Types

In every population, there exist four types of people, based on their potential outcomes under exposure and non-exposure.
- <span style="color:purple"> Doomed </span>
- <span style="color:blue"> Immune </span>
- <span style="color:red"> Harmed </span>
- <span style="color:green"> Preventive </span>

--

**_Thus, the exposure only has an impact on the outcome in those people who are harmed and those who are protected._**

---
class: middle

### Four Causal Types

| Type | E=1 | E=0 |
|:-------------------------|:-------:|:------:|
|<span style="color:purple"> Doomed </span>| 1 | 1|
|<span style="color:red"> Harmed `\(^1\)` </span> | 1 |0 |
|<span style="color:green"> Preventive </span> |0 | 1|
|<span style="color:blue"> Immune </span> |0 | 0|

`\(^1\)` Also called "Causative"

---
class: middle

### Four Causal Types

Imagine we know that 20% of the population are doomed, 30% are immune, 30% are harmed and 20% are protected by the exposure.

What is the causal effect of the exposure on this population (say in 100 people)?

+ Under E=1, ___ people become diseased
+ Under E=0, ___ people become diseased

- RD:
- RR:

---
class: middle

### Four Causal Types

+ The proportion that get the disease when `\(E=1: p_{doomed} + p_{causative}\)`
+ The proportion that get the disease when `\(E=0: p_{doomed} + p_{preventive}\)`

`$$RR_{causal} = \left(\frac{p_{doomed} + p_{causative}} {p_{doomed} + p_{preventive}}\right)$$`

--

- This relationship demonstrates that the size of a causal risk ratio not only **tends to vary** with the proportion of individuals in the target population whose outcome is altered by exposure **(i.e. `\(p_{causative}\)` and `\(p_{preventive}\)`)**,
- but also tends to vary with the proportion of individuals in the target population for whom disease is **inevitable** by the end of the etiologic time period **(i.e. `\(p_{doomed}\)`)**.
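
A minimal sketch of the point above, in Python. The helper name `causal_measures` and both sets of proportions are hypothetical, chosen only for illustration: the proportions harmed (0.3) and protected (0.2) are held fixed while the proportion doomed changes.

```python
def causal_measures(p_doomed, p_causative, p_preventive):
    """Causal risks, RR and RD from causal-type proportions (hypothetical helper).

    The immune proportion is whatever remains, so it is not passed in.
    """
    if not 0 <= p_doomed + p_causative + p_preventive <= 1.0 + 1e-12:
        raise ValueError("proportions must be non-negative and sum to at most 1")
    risk_exposed = p_doomed + p_causative      # doomed + causative get the disease under E=1
    risk_unexposed = p_doomed + p_preventive   # doomed + preventive get the disease under E=0
    return {
        "risk_exposed": risk_exposed,
        "risk_unexposed": risk_unexposed,
        "rr": risk_exposed / risk_unexposed,   # assumes a non-zero risk under E=0
        "rd": risk_exposed - risk_unexposed,
    }

# Same proportions harmed and protected, different proportion doomed (illustrative values).
few_doomed = causal_measures(p_doomed=0.1, p_causative=0.3, p_preventive=0.2)
many_doomed = causal_measures(p_doomed=0.4, p_causative=0.3, p_preventive=0.2)

print(round(few_doomed["rr"], 3), round(many_doomed["rr"], 3))
```

With these made-up numbers the causal RR shrinks from about 1.33 to about 1.17 as the doomed proportion grows, even though exactly the same share of people have their outcome altered by exposure.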
---
class: middle

### Four Causal Types - A note on confounding

`$$RR_{causal} = \left(\frac{p_{doomed} + p_{causative}} {p_{doomed} + p_{preventive}}\right)$$`

In the real world, the denominator **is not observable**, and so a substitute of some kind is used in its place.

- If this observed substitute population has a risk that **is not equal to the risk in the exposed had they been unexposed**, then the <span style="color:purple"> RR estimate is biased </span>
- and we refer to this bias as **“confounding”**.

---
class: middle

### Four Causal Types

`$$RD_{causal} = (p_{doomed} + p_{causative}) - (p_{doomed} + p_{preventive}) = (p_{causative}-p_{preventive})$$`

- The causal **RD does not** depend on the proportion of people that are doomed.
- It will only vary with the proportion of individuals in the target population whose outcome is actually altered by exposure.

---

### Four Causal Types

- It follows that any factor that affects `\(p_{causative}\)` or `\(p_{preventive}\)` can modify the size of a ratio or difference effect measure,
- and the factor can modify the size of a ratio effect measure even if it affects only `\(p_{doomed}\)`.
- Not surprisingly, an effect measure could vary from one population to another or from one time period to another
- Unless we expect other causal factors to have similar distributions across the populations or periods.
- **Which IRL WE DO NOT!**

---

### Four Causal Types

| Type |Outcome E=1|Outcome E=0 | Prop. in Exposed | Prop. in Unexposed |
|:----------------------------------------------------|:---------:|:----------:|:----------------:|:------------------:|
|**Type 1 <span style="color:purple"> Doomed </span>**| 1 | 1| p1 |q1|
|**Type 2 <span style="color:red"> Causative </span>** | 1 |0 | p2 | q2|
|**Type 3 <span style="color:green"> Preventive </span>**|0 | 1| p3| q3|
|**Type 4 <span style="color:blue"> Immune </span>** |0 | 0|p4 |q4|

<br>

`\(RR_{causal}\)` = risk in the exposed / risk in the exposed had they been unexposed = `\((p1+p2) / (p1+p3)\)`

- As we can’t observe the counterfactual, we substitute in for `\(p1+p3\)` using the unexposed group: `\(RR_{association}= (p1+p2) / (q1+q3)\)`
- **No confounding** when `\((p1+p3)=(q1+q3)\)`

---
class: middle

### Four Causal Types

Imagine we know that:

- 20% of the population are doomed,
- 30% are immune,
- 30% are harmed and
- 20% are protected by the exposure.

What is the causal effect of the exposure on this population (say in 100 people)?

+ Under `\(E=1\)`, (0.2 (doomed) + 0.3 (causative) = 0.5) `\(\to\)` 50 people become diseased
+ Under `\(E=0\)`, (0.2 (doomed) + 0.2 (preventive) = 0.4) `\(\to\)` 40 people become diseased

- RR: 0.5/0.4 = 1.25 `\(\to\)` 25% more
- RD: 0.5 - 0.4 = 0.1 `\(\to\)` 10 more
- <span style="color:green"> RD: 0.3 (causative) - 0.2 (preventive) = 0.1 `\(\to\)` 10 more </span>

---

### Four Causal Types

- We **never know the causal types** of individuals!
- We **do not even know the distribution** of those causal types in the population!
- There may be individual causal effects but no average causal effects
- Consider a drug exposure; equal numbers of people might be harmed and helped by the drug. With more information (genotyping?) we could figure out finer categories of who is helped, and only give the drug to those people. **But those are STILL average causal effects.**
- If there are no individual causal effects, however, there cannot be an average causal effect.
- While a bunch of -1s and +1s might average to 0, the average of a whole bunch of 0s is definitely 0.

[Epidemiology by design by Daniel Westreich](https://academic.oup.com/book/32358)

---

## Notes on potential Outcomes

- Dichotomous exposures are a simplification
- If you have more than two levels of exposure, you have more than two potential outcomes
- One potential outcome for each level of the exposure

`E.g., if exposure is # cigarettes smoked/day, and that number ranges from 0 to 60, then you have 61 potential outcomes.`

- We still only observe (at best) one potential outcome.
- Deterministic potential outcomes are a simplification.
- We can consider potential outcomes to be stochastic (non-deterministic; random), though we will not.

---
class: middle

## What are causal effects?

**Formally**, there is an **average causal effect** (ACE) of treatment A on outcome Y in the population of interest if:

`\(Pr[Y^{a=1} = 1] \neq Pr[Y^{a=0}= 1]\)`

- i.e. the null hypothesis is `\(Pr[Y^{a=1} = 1] = Pr[Y^{a=0}= 1]\)`, no average causal effect.

Note that this is different from the absence of an individual causal effect, because **individuals IRL have different values for `\(Y^{a=1}\)` and `\(Y^{a=0}\)`**

---
class: middle

## What are causal effects?

- Because risk is an average of outcomes, we can rewrite the definition of a non-null average causal effect in the population as `\(E[Y^{a=1}] \neq E[Y^{a=0}]\)` so that the definition also applies to non-dichotomous outcomes.
- The average causal effect `\(E[Y^{a=1}] - E[Y^{a=0}]\)` is always necessarily equal to the average `\(E[Y^{a=1} - Y^{a=0}]\)` of the individual causal effects `\(Y^{a=1} - Y^{a=0}\)`,

--

- because a difference of averages is always equal to the average of the differences.

When there is no causal effect for **any** individual in the population, i.e., `\(Y^{a=1} = Y^{a=0}\)` for all individuals, then the “sharp” causal null hypothesis is true.
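
The identity above, and the gap between the sharp null and the average null, can be checked on a toy example. This is only a sketch: the six pairs of potential outcomes are hypothetical, and in real data we never see both values for the same person.

```python
from statistics import mean

# Hypothetical potential outcomes for six individuals (never jointly observable in practice).
y_treated = [1, 0, 1, 0, 1, 0]     # Y^{a=1}
y_untreated = [0, 1, 0, 1, 0, 1]   # Y^{a=0}

# Individual causal effects: +1 (harmed) or -1 (helped) for every person here.
individual_effects = [y1 - y0 for y1, y0 in zip(y_treated, y_untreated)]

difference_of_averages = mean(y_treated) - mean(y_untreated)
average_of_differences = mean(individual_effects)

# Sharp null: no effect for any individual. Average null: effects cancel out on average.
sharp_null_holds = all(effect == 0 for effect in individual_effects)
average_null_holds = average_of_differences == 0

print(difference_of_averages, average_of_differences)
print(sharp_null_holds, average_null_holds)
```

Both averages come out as zero, so the average causal null holds, while the sharp null does not: every individual in this made-up population has a non-zero effect.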
**<span style="color:purple"> Average causal effects can be identified from data, even if individual causal effects cannot. </span>**

---
class: middle

## How do we estimate causal effects?

- .red[At the individual level, we generally do not].
- At the population level, we find substitute populations.
- If a population P1 is exposed to smoking, we find a substitute population P2
- which (_we believe_) represents the experiences of population P1 if,
- counter to fact, P1 had not been exposed to smoking.

[Maldonado & Greenland 2002](https://doi.org/10.1093/intjepid/31.2.422)

**Validity of inference depends on the appropriateness of the substitute population.**

---
class: middle

### From Hernán & Robins Figure 1.1 as a causal contrast:

<img src="images/L6_HRfig1.png" width="60%" />
<br>
[What if? Hernán M & Robins J](https://www.hsph.harvard.edu/miguel-hernan/causal-inference-book/)

---
class: middle

## Counterfactuals

[BBT- Counterfactuals- The game or "The Zazzy substitution"](https://www.youtube.com/watch?v=0lpY0Kt4bn8)

---
class: middle

### Oversimplified counterfactuals

A paper came out several years ago estimating population attributable fractions for social epidemiology exposures.

- For example, the population attributable fraction for the outcome of death, and the exposure of living in high-poverty neighborhoods, for the US in (say) 2010.
- What contrast is the population attributable fraction estimating?

Think about the comparison:

- a US in 2010 with no high-poverty neighborhoods, but everything else remains constant.
- Is that plausible?

<br>
[Epidemiology by design by Daniel Westreich](https://academic.oup.com/book/32358)

---

### Identification assumptions

**[What if? Hernán M & Robins J](https://www.hsph.harvard.edu/miguel-hernan/causal-inference-book/)**

> _We say that an average causal effect is (non-parametrically) identifiable when the distribution of the observed data is compatible with a single value of the effect measure.
Conversely, we say that an average causal effect is non-identifiable when the distribution of the observed data is compatible with several values of the effect measure._

<br>

**What is required for causal inference?**

- Tricky question. [See Greenland EJE 2017](https://doi.org/10.1007/s10654-017-0230-6)

---
class: middle

## What is required for causal inference?

One sufficient (though not necessary!) set of conditions `\(^1\)` for causal effect estimation is:

+ **<span style="color:blue"> Consistency </span>** (well-defined interventions; treatment variation irrelevance)
+ **<span style="color:blue"> Exchangeability </span>**, or
- Conditional exchangeability with positivity
+ **<span style="color:blue"> Positivity </span>**

- We assume temporality in all cases: that causes precede effects.
- We expect absence of Systematic Error!

`\(^1\)` .small[You will get more of this in 705 and 710]

---
class: middle

### Consistency

<img src="images/L6_consistency1.png" width="90%" />
<br>

---
class: middle

### Consistency

[What if? Hernán M & Robins J](https://www.hsph.harvard.edu/miguel-hernan/causal-inference-book/),

> _Consistency simply means that the outcome for every treated individual equals his outcome if he had received treatment, and the outcome for every untreated individual equals his outcome if he had remained untreated._

> - This is an axiom rather than an assumption.
> - The more critical assumption is that to the extent that there are multiple versions of treatment, they do not matter.

It is also the assumption that we do in fact observe one of the potential outcomes for each individual.

---
class: middle

### Consistency

- Among people who **were exposed**, their outcome was not different than it would have been if they had been **assigned** the exposure
- the same for unexposed.
- Recall potential outcomes: observed vs substitute population observed outcomes!
So Westreich calls this assumption _**“treatment variation irrelevance”**_

- because it relates to whether there is **meaningful** variation in the **observed exposure or treatment**

---
class: middle

### Treatment variation irrelevance? (a.k.a., Consistency)

Example: say we’re interested in the effect of **hypertension on mortality**. So we compare, _**observationally**_, a group of individuals with hypertension to a group of individuals without.

But what is the implied causal mechanism? There are many possible ways to change a hypertensive person to a non-hypertensive person:

- Medication
- More exercise
- Meditation / stress management
- Less salt in the diet
- Quit smoking
- Combinations of all the above.

Some of these have quite different implications for overall health than others.

---
class: middle

### Treatment variation irrelevance? (a.k.a., Consistency)

Often, the route you take to changing a factor has a strong implication for the effects of changing that factor

- So you have to assume that exposure contrasts **vary only in ways which are functionally irrelevant to the effect of that cause**.
- E.g., daily aspirin. Let’s assume we don’t care whether you take it in the morning or at night, and we don’t care how much water you drink when you take it.
- Does this seem like something we could assume in the hypertension case?
- What if (unknowingly) the individuals:
- exercise more; start meditating; stop or reduce salt consumption and quit smoking

**But we measure the effect of Aspirin - Alone** 🤡

---

### Interference & Consistency

“Consistency” means that if `\(A_i = a\)`, then `\(Y_i^{a} = Y_i^{A_i} = Y_i\)`

> An implicit assumption in the definition of counterfactual outcome is that an individual’s counterfactual outcome under treatment value A=a does not depend on other individuals’ treatment values.

--

**Interference:**

- The counterfactual OUTCOME depending on another person’s treatment.
- Interference between individuals is common in studies of infectious or transmissible exposures (viruses, information, etc).
- The counterfactual `\(Y_i^a\)` for an individual `\(i\)` is not well defined because an individual’s outcome depends also on other individuals’ treatment values.
- We can only make inferences conditional on this additional exposure.

The assumption of no interference is included in the “stable-unit-treatment-value assumption (SUTVA)” described by Rubin.

.small[See: Tchetgen Tchetgen EJ, VanderWeele TJ. On causal inference in the presence of interference. Stat Methods Med Res. 2012 Feb;21(1):55-75. PMID: 21068053]

---
class: middle

### Exchangeability

.pull-left[
<img src="images/L6exchange.png" width="40%" />
]

--

.pull-right[
<img src="images/L6exchange1.png" width="40%" />
]

<br>

--

<img src="images/L6independence.jpeg" width="50%" />

---
class: middle

### Exchangeability

- The condition that the potential outcomes under **assigned treatment are independent** of the actual treatment received.
- Possibly, conditional on a set of confounders.
- Roughly, the baseline risk (pre-exposure/treatment) for the outcome among the exposed and unexposed is independent of the exposure they got.
- E.g.: In a randomized trial in which codes got switched and everyone randomized to exposure got placebo and vice versa: we would expect the same result, because exchangeability is expected to hold.

**Formally:** Statistical independence between the potential outcomes and the exposure or treatment **actually received** by the study participants.

`\(Y^x \coprod X\)` or `\(Y^{x} \perp \!\!\! \perp {X}\)` or `\(Pr(Y^x | X=1) = Pr(Y^x |X=0)\)`

---
class: middle

### Exchangeability

Generally speaking, this is the assumption that there is .red[no confounding, and no selection bias].

- Or more likely, _**no unmeasured/uncontrolled confounding or selection bias**_.

<br>

--

<br>

- I know!
You started this lecture to talk about **<span style="color:red">confounding </span>** and we have not done so yet.
- We will get there ...

**But! Confounding is the main thing that people are worried about when they tell you that correlation is not causation.**

**<span style="color:purple"> But there IS SO MUCH MORE THAN THAT!</span>**

---

### Positivity

.pull-left[
<img src="images/L6positivitytrap.jpg" width="80%" />
]

--

.pull-right[
<img src="images/L6positivity1.png" width="90%" />
]

<br>

---

### Positivity

**Formally:** `\(Pr(X=x)> 0\)`; with covariates `\(L\)`, `\(Pr(X=x \mid L=l) > 0\)` for every covariate level `\(l\)` that occurs in the population

- There must be observed individuals at all levels of exposure (or treatment) for all levels of all covariates (confounders) in the study.
- E.g., aspirin: Say we’re studying the effect of daily aspirin on risk of heart attack.
- We would want to account for age, because older people are more likely to take daily aspirin;
- Older people have more heart attacks. (Draw the DAG.)
- But IF in our study, everyone over 40 takes daily aspirin;
- No one under 40 takes daily aspirin.
- We can’t separate the effects of aspirin and age. **No positivity.**

---
class: middle

### Positivity

An extreme version of this is “exposure opportunity”

- in a study in which the exposure is prostatectomy:
- we should not be studying people without a prostate

This condition can be loosened (carefully) with model interpolation, and (still more carefully) with model extrapolation. ⚠️ **not always advisable** ⚠️

---
class: middle

### Positivity

**Note:**

- Positivity only matters with regard to variables on which you need exchangeability.
- So if you have exchangeability conditional on certain variables, then you also need positivity on those variables.

Aspirin and Age Example:

> we were explicit that we lacked exchangeability because of age (“because older people are more likely to take daily aspirin; older people have more heart attacks”), so we had to account for it in the model.

- Because of this, we also needed positivity on age.
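
A rough way to look for positivity problems in a dataset is to tabulate exposure within each covariate stratum. The sketch below is an assumption-laden illustration: the function name `positivity_violations`, the age labels and the record counts are all hypothetical, and it only checks what was observed in the sample.

```python
from collections import defaultdict

def positivity_violations(records, exposure_levels=(0, 1)):
    """Map each covariate stratum to the exposure levels never observed in it.

    `records` is an iterable of (stratum, exposure) pairs; strata in which
    every exposure level was observed are left out of the result.
    """
    observed = defaultdict(set)
    for stratum, exposure in records:
        observed[stratum].add(exposure)
    return {
        stratum: sorted(set(exposure_levels) - levels)
        for stratum, levels in observed.items()
        if set(exposure_levels) - levels
    }

# Hypothetical study: everyone over 40 takes daily aspirin (1), no one under 40 does (0).
aspirin_study = [("over_40", 1)] * 50 + [("under_40", 0)] * 50
# Hypothetical contrast: both age groups contain users and non-users.
mixed_study = [("over_40", 1), ("over_40", 0), ("under_40", 1), ("under_40", 0)]

print(positivity_violations(aspirin_study))
print(positivity_violations(mixed_study))
```

In the first made-up study each age group is missing one exposure level, so age and aspirin cannot be separated; in the second, no stratum is flagged.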
So positivity is itself contingent on exchangeability, called “unconditional exchangeability”

- .blue[Westreich calls it “conditional exchangeability with positivity”]

---
class: middle

### Other conditions

- In reality, other conditions have to be met in order to assert causality.
- In real data, most of the time, you have to model;
- <span style="color:purple"> So you have to assume the statistical model is correct.</span>

**We generally have to assume that there are no dependent happenings (no interference):**

- That is, that my exposure does not affect your outcome
- When is this violated? When is this violated and good for public health?

**No measurement error is also a necessary condition for causal effect estimation**

---

### A case:

Suppose we have a database with information collected among 800 alcohol drinkers and abstainers over 10 years.

<span style="color:purple"> Is this risk difference for death at 10 years a causal effect of alcohol? </span>

<img src="images/L6alcohol.png" width="40%" style="display: block; margin: auto;" />

[Canadian Institute for Health Information. Alcohol Harm in Canada [Product release]](https://www.cihi.ca/en/alcohol-harm-in-canada)

---

### A case:

**Consistency:** alcohol consumption is a big category.

- Do you want to _"combine"_ people who drink one beer a day with those who drink a six-pack a day?
- Only beers, or do wine and cocktails and whiskey count?

And does “abstainer” mean the same thing to everyone?

> .small[If you only drink a glass of wine a week – when you’re out at the bar – is that “non-drinking”? (Consider: how do you imagine moving drinkers into the abstainers category?)]

--

**Exchangeability:** are **<span style="color:purple"> other </span>** causes of death overrepresented among drinkers?

`E.g., do drinkers exercise less, or have worse LDL cholesterol?
Smoking?`

`If so, then how much of the observed risk difference is due to drinking, versus the other factors?`

--

**Positivity:** following on exchangeability, can you ever separate out baseline risks?

What else?

---
class: middle

## Finally! ... Confounding

---
class: middle

### Internal validity versus external validity:

External Validity also involves successful generalization outside of the sample.

Three main threats to internal validity:

- Confounding, Selection bias, and Information bias
- Confounding means “confusion” of effects. A formal definition of confounding is still actively debated.

`VanderWeele TJ, Shpitser I. On the definition of a confounder. Ann Stat. 2013 Feb;41(1):196-220. PMID: 25544784`

- Confounding is an imbalance, and can occur by chance (in a randomized trial) or by the influence of a common cause of exposure and outcome.

`Greenland S. Randomization, statistics, and causal inference. Epidemiology. 1990;1(6):421-9.`

---
class: middle

# What is a confounder?

<img src="L7_EPIB704_Confounding_files/figure-html/unnamed-chunk-11-1.png" width="50%" style="display: block; margin: auto;" />

---
class: middle

# What is a confounder?

.pull-left[
<img src="images/confdef.png" width="100%" />
]

--

.pull-right[
<img src="images/confdef1.png" width="100%" />
]

---
class: middle

#### General conditions that allow a variable to be a confounder:

<span style="color:purple"> Modern Epidemiology 3 p. 133: </span>

- **1) Independently predictive of the disease (i.e. within strata of exposure)**

When assessing whether the putative confounder is predictive of outcome, it should be done within strata of exposure, to guard against the possibility that it predicts outcome only because it is correlated with exposure.

--

Suppose this is the DAG: `\(C\)` → `\(E\)` → `\(Y\)`

- In this case, C is not a confounder because it does not have an independent effect on Y.
- But there will be an observed association between C and Y, by virtue of their common association with E.
- But it is not an independent association. **That’s why we should assess this criterion within levels of exposure.**
- Stratified by E, the association between C and Y is null if there is no direct effect (as shown in the DAG).

---
class: middle

#### General conditions that allow a variable to be a confounder:

<span style="color:purple"> Modern Epidemiology 3 p. 133: </span>

- **2) Associated with the exposure (often in the non-diseased)**

The association between putative confounder and exposure can be assessed in the entire study population, but traditionally has been assessed in the controls (or non-diseased) as a stand-in for the population.

`\(A\)` ← `\(L\)` → `\(Y\)`

--

.small[
- If the study is a cohort, then presumably the assessment at baseline is made among a population without any disease.
- If the study is a case-control study then the controls are meant to represent the source population that gave rise to the cases.
- If the outcome is common, however, do NOT stratify on the outcome when making this assessment, as this would introduce collider stratification bias (since the disease node is a collider, being affected by both confounder and exposure).
]

---

#### General conditions that allow a variable to be a confounder:

<span style="color:purple"> Modern Epidemiology 3 p. 133: </span>

- **3) Not on the causal pathway between exposure and outcome.**

This third criterion was overthrown by Robins in 1986-1987: `Robins J. The control of confounding by intermediate variables. Stat Med. 1989;8(6):679-701`. But it remains true for traditional confounder control methods in some textbooks.

- This can’t be known from the data.
- There is no test that will verify this condition, and so it must be based on subject matter knowledge.

`Hernán MA, et al.
Causal knowledge as a prerequisite for confounding evaluation: an application to birth defects epidemiology.` `Am J Epidemiol. 2002;155:176-84.`

- .small[We’ll start to learn to go beyond this _“But is not an intermediate in the causal pathway between exposure and outcome”_ in a few slides and if you take EPIB 710]

---
class: middle

### Conditions that allow a variable to be a confounder:

📣 **.red[Modern Epidemiology 4th, page 268]** 💡

> <span style="color:darkblue"> _The developments in causal inference over the past decades, summarized in Chapter 3, have made clear that this definition [ ...the traditional criteria described from ME3... ] of a “confounder” is inadequate. It is inadequate because there can be a pre-exposure variable associated with the exposure and the outcome, the control of which introduces, rather than eliminates, bias_ </span>

---
class: middle

## Confounding

**<span style="color:blue"> More formally, drawing on potential outcomes/counterfactual thinking: </span>**

- Confounding is an **inadequacy of the substitute population** to “stand in” effectively for the experience of the target population under the specified exposure condition.
- Confounding is present if **the substitute imperfectly represents** what the target would have been like under the counterfactual condition.
- An association measure is confounded (or biased due to confounding) for a causal contrast if it does not equal that causal contrast because of an imperfect substitution.
- A confounder is a variable that at least partly explains why confounding is present.

---
class: middle

## Confounding and the substitute population

The equalities that must be met to control confounding are:

- E0/F0 = A0/B0 in scenario 1 (target observed under exposure 1) and
- C1/D1 = A1/B1 in scenario 2 (target observed under exposure 0).
+ If the entire target is not observed in one or the other exposure states,
- Then both substitutions are necessary (except in the unlikely case of incidental balancing out of the two substitution biases).

---
class: middle

## Confounding and “study design”?

- Because it is the goal of all etiologic designs to estimate causal contrasts, different study designs are just different ways of choosing a target that corresponds to the study question, and choosing substitutes and sampling subjects from target and substitutes into the study.
- Advantages and disadvantages of different designs are just intended to balance trade-offs among bias, variance, study costs and study time.
- Some studies, by design, are more prone to confounding `\(^1\)`
- but **<span style="color:purple"> all </span>** are potentially susceptible to confounding if the substitute **_misrepresents the target population_**

`\(^1\)` More on this in EPIB 703.

---
class: middle

## Confounding

<span style="color:purple"> Recall:</span> No confounding when (p1+p3) = (q1+q3) if the target is the exposed. If the target is the unexposed, there is no confounding when (p1+p2) = (q1+q2).

- This is a different criterion for confounding than the one used previously.
- Thus, confounding may be present for one target population but absent for another target population.
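
The dependence on the target population can be worked through numerically. The sketch below uses the causal-type proportions from the example on the next slides; the variable names are illustrative only, and in practice none of these proportions are knowable.

```python
# Causal-type proportions from the worked example that follows (not observable in practice).
p = {"doomed": 0.3, "causative": 0.1, "preventive": 0.2, "immune": 0.4}  # exposed group
q = {"doomed": 0.2, "causative": 0.3, "preventive": 0.2, "immune": 0.3}  # unexposed group

# What the data show: risk in the exposed minus risk in the unexposed.
rd_association = (p["doomed"] + p["causative"]) - (q["doomed"] + q["preventive"])

# Target = exposed: the unexposed stand in for the exposed had they been unexposed.
rd_causal_exposed = p["causative"] - p["preventive"]
no_confounding_exposed = abs(
    (p["doomed"] + p["preventive"]) - (q["doomed"] + q["preventive"])
) < 1e-12

# Target = unexposed: the exposed stand in for the unexposed had they been exposed.
rd_causal_unexposed = q["causative"] - q["preventive"]
no_confounding_unexposed = abs(
    (p["doomed"] + p["causative"]) - (q["doomed"] + q["causative"])
) < 1e-12

print(round(rd_association, 2), round(rd_causal_exposed, 2), round(rd_causal_unexposed, 2))
print(no_confounding_exposed, no_confounding_unexposed)
```

The associational RD is the same number whichever target is chosen, but it is compared against a different causal RD in each case, which is why the bias can differ in size and direction between targets.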
---
class: middle

**Example using the causal types**

|**E=1** | | |**E=0** | | |
|:-------|-------|-----|---------|------|--------|
|Type 1: |Pdoomed (p1)|= 0.3|Type 1:| Pdoomed (q1)|= 0.2|
|**Type 2:** |Pcausal (p2)| <span style="color:red"> = 0.1 </span>| **Type 2:**|Pcausal (q2) |<span style="color:green"> = 0.3 </span> |
|**Type 3:** |Ppreventive (p3) |<span style="color:red"> = 0.2 </span> |**Type 3:** | Ppreventive (q3)|<span style="color:green"> = 0.2 </span> |
|Type 4: |Pimmune (p4) |= 0.4| Type 4:| Pimmune (q4)| = 0.3 |

**Target: E=1**

- RDcausal = `\((p_2 - p_3)\)` = <span style="color:red"> 0.1-0.2 </span> = **.blue[-0.1]**
- RDassociation = `\((p_1 + p_2) - (q_1 + q_3)\)` = (0.3 + <span style="color:red"> 0.1 </span>) - (<span style="color:green"> 0.2 </span> + 0.2) = **.blue[0]**

<span style="color:blue"> _Exposure appears null but is actually preventive_ </span>

---
class: middle

|**E=1** | | |**E=0** | | |
|:-------|-------|-----|---------|------|--------|
|Type 1: |Pdoomed (p1)|= 0.3|Type 1:| Pdoomed (q1)|= 0.2|
|**Type 2:** |Pcausal (p2)| <span style="color:red"> = 0.1 </span>| **Type 2:**|Pcausal (q2) |<span style="color:green"> = 0.3 </span> |
|**Type 3:** |Ppreventive (p3) |<span style="color:red"> = 0.2 </span> |**Type 3:** | Ppreventive (q3)|<span style="color:green"> = 0.2 </span> |
|Type 4: |Pimmune (p4) |= 0.4| Type 4:| Pimmune (q4)| = 0.3 |

**Target: E=0**

- RDcausal = `\((q_2 - q_3)\)` = <span style="color:green"> 0.3-0.2 </span> = **.blue[0.1]**
- RDassociation = `\((p_1 + p_2) - (q_1 + q_3)\)` = (0.3 + <span style="color:red"> 0.1 </span>) - (<span style="color:green"> 0.2 </span> + 0.2) = **.blue[0]**

<span style="color:blue"> _Exposure appears null but is actually harmful_ </span>

**Direction of confounding depends on the target population of interest**

---
class: middle

### QUESTIONS?

## COMMENTS?

# RECOMMENDATIONS?
---
class: middle

### Some extra slides

---

**A Classic Example to Read:** [GREENLAND and ROBINS Int J Epidemiology 1986; 15 (3): 413-419](https://doi.org/10.1093/ije/15.3.413)

- Suppose (unknown to us) exposure has no effect, i.e., there are no Type 2 or 3 individuals, and that we measure a risk factor having the following joint distribution with exposure and type:

<img src="images/L6_greenlandRobins.png" width="100%" />

**Note that only the last 2 lines of the table would be observable in a real study.**

- From these last two lines (the observable data) we can see that the factor is associated with exposure (100 of 200 exposed have the factor, as opposed to 100 of 500 unexposed), and is predictive of outcome among the unexposed (among the unexposed, 0.45 of those without the factor will get the disease, as opposed to 0.70 of those with the factor).

---

**A Classic Example to Read:** [GREENLAND and ROBINS Int J Epidemiology 1986; 15 (3): 413-419](https://doi.org/10.1093/ije/15.3.413)

+ If we fail to control the factor, we will observe disease in an equal proportion (1/2) of the exposed and unexposed, correctly indicating no exposure effect,
+ while if we control the factor, exposure will incorrectly appear to be preventive of disease in both strata.
+ More precisely, p3=0 and thus p1+p3 = p1 = 0.50 = q1 = q1+q3 so that the <span style="color:blue"> crude estimate is unconfounded </span>,
+ while 0.70 * (100) + 0.45 * (100) = 115 `\(\neq\)` 0.50 * (200), <span style="color:purple"> so that the adjusted estimate is confounded.</span>

**<span style="color:blue"> Mind-boggling! </span>**
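
The arithmetic in this example can be retraced step by step. This sketch uses only the numbers quoted in the text above (the full table is an image and is not reproduced here); the variable names are illustrative, and the "adjusted" comparison is taken to be standardization of the unexposed stratum risks to the exposed group's factor distribution, as in the 115 vs 100 comparison on the slide.

```python
# Numbers quoted in the text above (the full table is only available as an image).
exposed_with_factor, exposed_without_factor = 100, 100        # 200 exposed
unexposed_with_factor, unexposed_without_factor = 100, 400    # 500 unexposed
risk_unexposed_with_factor = 0.70
risk_unexposed_without_factor = 0.45
risk_exposed_crude = 0.50   # exposure has no effect, so this is also the counterfactual risk

# Crude risk in the unexposed: weight the stratum risks by the unexposed group's own sizes.
cases_unexposed = (risk_unexposed_with_factor * unexposed_with_factor
                   + risk_unexposed_without_factor * unexposed_without_factor)
risk_unexposed_crude = cases_unexposed / (unexposed_with_factor + unexposed_without_factor)

# "Adjusted" expectation: apply the unexposed stratum risks to the exposed group's factor distribution.
expected_cases_adjusted = (risk_unexposed_with_factor * exposed_with_factor
                           + risk_unexposed_without_factor * exposed_without_factor)
actual_cases_exposed = risk_exposed_crude * (exposed_with_factor + exposed_without_factor)

crude_rr = risk_exposed_crude / risk_unexposed_crude
adjusted_rr = actual_cases_exposed / expected_cases_adjusted  # standardized to the exposed

print(round(crude_rr, 2), round(adjusted_rr, 2))
```

The crude ratio comes out at 1 (the correct answer here, since exposure has no effect), while the adjusted comparison sets 100 actual cases against 115 expected cases and so makes the exposure look preventive.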