class: left, middle, inverse, title-slide

.title[
# Poisson Regression - Basic concepts
]
.author[
### Mabel Carabali
]
.institute[
### EBOH, McGill University
]
.date[
### updated: 2024-11-27
]

---
class: middle

<img src="images/poisson_dist.jpg" width="90%" style="display: block; margin: auto;" />

---
class: middle

**Expected competencies:**

- Knowledge about linear, logistic and binomial regressions.
- Knowledge about count data and the Poisson distribution.

--

### Objectives

- Revise the Poisson distribution and model assumptions
- Provide tools to estimate Risk Ratios, Risk Differences and Rate Ratios using Poisson regression

---
class:middle

### Poisson distribution

The Poisson distribution gives the probability mass function (discrete) of all possible numbers of new cases, from 0 to `\(\infty\)`. <br>

If `\(Y_i\)` = the number of new cases and `\(\lambda\)` = the expected number of cases in a given time period, and

`$$Y_i \sim Poisson (\lambda)$$`

--

then the probability that `\(Y=k\)` is:

`$$P(Y=k) = \dfrac{\lambda^{k}e^{-\lambda}}{k!}$$`

--

for `\(k = 0, 1, 2, \dots\)` and `\(\lambda > 0\)`, where `\(\lambda\)` is both the mean and the variance of `\(Y\)`.

---
class:middle

### Poisson distribution

This can also be presented as:

`$$y_i \sim Poisson(e^{X_i\beta})$$`

where the linear predictor `\(X_i \beta\)` is the logarithm of the expected value of measurement `\(y_i\)`, or `\(\ln(\lambda)\)`.

- In count-data regressions, each unit `\(i\)` corresponds to a setting (typically a spatial location or a time interval) in which `\(y_i\)` events are observed.

--

<br>

- Under the Poisson model, `\(sd(y_i) = \sqrt{E(y_i)}\)`; thus if the model accurately describes the data, we also have a sense of how much variation we would expect around the fitted curve.

**.red[For a Poisson random variable, the variance = mean =]** **E(Y) = `\(\lambda\)`**

---
class:middle

### Poisson distribution

<img src="L21_EPIB704_Poisson_Regression_files/figure-html/unnamed-chunk-2-1.png" width="60%" style="display: block; margin: auto;" />

--

**Checking distribution**

.pull-left[

```r
mean(dat$art)
```

```
## [1] 1.692896
```
]

.pull-right[

```r
sd(dat$art); var(dat$art)
```

```
## [1] 1.926069
```

```
## [1] 3.709742
```
]

---
class:middle

### Poisson considerations

- As `\(\lambda\)` increases, the Poisson distribution approaches a normal distribution.
- When `\(n\)` is large and `\(p\)` is very small, the Poisson distribution approximates the binomial distribution.

--

.pull-left[

**Example:** Consider a binomial distribution with probability = 0.01. What is the probability of 6 occurrences in 1200 tries?

```r
# binomial
dbinom(6,1200,0.01)
```

```
## [1] 0.02516172
```

```r
# poisson
dpois(6,1200*0.01) # lambda = n*p
```

```
## [1] 0.02548128
```
]

--

.pull-right[

**To approximate the true (binomial) results:**

To approximate the mean

- `\(\mu = E(X) = \lambda = np\)`

To approximate the standard deviation

- `\(\sigma = \sqrt{\lambda} = \sqrt{np}\)`
]

---
class:middle

### Poisson considerations

<img src="L21_EPIB704_Poisson_Regression_files/figure-html/unnamed-chunk-6-1.png" width="80%" style="display: block; margin: auto;" />

---
class:middle

## Poisson Model Assumptions

**Poisson Response:** The response variable is a count per unit of time or space.

**Independence:** The observations must be independent of one another.

**Mean = Variance:** Conditional means equal the conditional variances.

**Linearity:** The log of the mean rate, `\(\log(\lambda)\)`, must be a linear function of `\(x\)`.

- the mean values of `\(Y\)` at each level of `\(X\)`, `\(\lambda_{Y|X}\)`, fall on a curve, not a line, although the logs of the means should follow a line.
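---
class:middle

### Mean = Variance: a quick check in `R`

A minimal sketch with simulated data (the `\(\lambda\)` value is illustrative, not from the course data set), checking the mean = variance property numerically and the probability mass function against `dpois()`.

```r
set.seed(704)
y <- rpois(10000, lambda = 1.7)     # simulated counts, illustrative lambda

mean(y); var(y)                     # both close to lambda = 1.7

# P(Y = 2) from the pmf lambda^k * exp(-lambda) / k!, then via dpois()
1.7^2 * exp(-1.7) / factorial(2)
dpois(2, lambda = 1.7)
```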
---
class:middle

## Poisson Model

- A generalized linear model with a log link and a Poisson distribution
- Used to model count variables
- A natural model for rates in data sets with person-time (e.g., mortality).

--

.blue[
- Under-predicts zero counts and over-predicts the first few counts
- Assumes the same rate of outcome among the individual observations, but incorporating “observed heterogeneity” relaxes this assumption.
]

---
class:middle

### Illustration using the articles and PhD years after graduation

.pull-left[
| Characteristic | N = 915 `\(^1\)` |
|:---------------------------|:-------------------:|
| art, Mean; Median (Q1, Q3) | 1.69; 1.00 (0.00, 2.00) |
| fem: Men | 494 / 915 (54%) |
| fem: Women | 421 / 915 (46%) |
| mar: Married | 606 / 915 (66%) |
| mar: Single | 309 / 915 (34%) |
| phd, Mean; Median (Q1, Q3) | 3.10; 3.15 (2.26, 3.92) |

.small[`\(^1\)` n / N (%)]
] .pull-right[
| Characteristic | N = 915 `\(^1\)` |
|:-------------------------------|:-------------------:|
| profage, Mean; Median (Q1, Q3) | 33.2; 33.0 (31.0, 36.0) |
| kid5: 0 | 599 / 915 (65%) |
| kid5: 1 | 195 / 915 (21%) |
| kid5: 2 | 105 / 915 (11%) |
| kid5: 3 | 16 / 915 (1.7%) |
| ment, Mean; Median (Q1, Q3) | 9; 6 (3, 12) |

.small[`\(^1\)` n / N (%)]
] --- class:middle ### Illustration using the articles and PhD years after graduation ```r pois0 <- glm(art ~ 1, family= "poisson", data = dat) ``` -- ** Exponentiated coefficient** ``` ## exp(Est.) 2.5% 97.5% z val. p ## (Intercept) 1.69 1.61 1.78 20.72 0 ``` -- **Summary** ``` ## ## Call: ## glm(formula = art ~ 1, family = "poisson", data = dat) ## ## Coefficients: ## Estimate Std. Error z value Pr(>|z|) ## (Intercept) 0.52644 0.02541 20.72 <2e-16 *** ## --- ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 ## ## (Dispersion parameter for poisson family taken to be 1) ## ## Null deviance: 1817.4 on 914 degrees of freedom ## Residual deviance: 1817.4 on 914 degrees of freedom ## AIC: 3487.1 ## ## Number of Fisher Scoring iterations: 5 ``` --- class:middle **Illustration using the articles and PhD years after graduation** <table class="table table-striped table-hover table-condensed table-responsive" style="width: auto !important; margin-left: auto; margin-right: auto;"> <tbody> <tr> <td style="text-align:left;font-weight: bold;"> Observations </td> <td style="text-align:right;"> 915 </td> </tr> <tr> <td style="text-align:left;font-weight: bold;"> Dependent variable </td> <td style="text-align:right;"> art </td> </tr> <tr> <td style="text-align:left;font-weight: bold;"> Type </td> <td style="text-align:right;"> Generalized linear model </td> </tr> <tr> <td style="text-align:left;font-weight: bold;"> Family </td> <td style="text-align:right;"> poisson </td> </tr> <tr> <td style="text-align:left;font-weight: bold;"> Link </td> <td style="text-align:right;"> log </td> </tr> </tbody> </table> <table class="table table-striped table-hover table-condensed table-responsive" style="width: auto !important; margin-left: auto; margin-right: auto;"> <tbody> <tr> <td style="text-align:left;font-weight: bold;"> 𝛘²(0) </td> <td style="text-align:right;"> -0.00 </td> </tr> <tr> <td style="text-align:left;font-weight: bold;"> Pseudo-R² (Cragg-Uhler) </td> <td style="text-align:right;"> 0.00 </td> </tr> <tr> <td style="text-align:left;font-weight: bold;"> Pseudo-R² (McFadden) </td> <td style="text-align:right;"> 0.00 </td> </tr> <tr> <td style="text-align:left;font-weight: bold;"> AIC </td> <td style="text-align:right;"> 3487.15 </td> </tr> <tr> <td style="text-align:left;font-weight: bold;"> BIC </td> <td style="text-align:right;"> 3491.97 </td> </tr> </tbody> </table> <table class="table table-striped table-hover table-condensed table-responsive" style="width: auto !important; margin-left: auto; margin-right: auto;border-bottom: 0;"> <thead> <tr> <th style="text-align:left;"> </th> <th style="text-align:right;"> exp(Est.) </th> <th style="text-align:right;"> 2.5% </th> <th style="text-align:right;"> 97.5% </th> <th style="text-align:right;"> z val. 
</th> <th style="text-align:right;"> p </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;font-weight: bold;"> (Intercept) </td> <td style="text-align:right;"> 1.69 </td> <td style="text-align:right;"> 1.61 </td> <td style="text-align:right;"> 1.78 </td> <td style="text-align:right;"> 20.72 </td> <td style="text-align:right;"> 0.00 </td> </tr> </tbody> <tfoot><tr><td style="padding: 0; " colspan="100%"> <sup></sup> Standard errors: MLE</td></tr></tfoot> </table> --- class:middle ### Illustration using the articles and PhD years after graduation <img src="images/artpredict.gif" width="60%" style="display: block; margin: auto;" /> --- class:middle ### Illustration using the articles and PhD years after graduation If the follow-up time is not specified, the software assumes that person time is one for everyone, and Poisson regression actually produces the RISK ratio: -- .pull-left[ ```r pois1 <- glm(art ~ fem + mar, * poisson(link="log"), data = dat) round(jtools::summ(pois1, confint= T, exp = T)$"coeftable", 2) ; #summary(pois1) ``` ``` ## exp(Est.) 2.5% 97.5% z val. p ## (Intercept) 1.89 1.77 2.03 18.19 0.00 ## femWomen 0.79 0.71 0.87 -4.49 0.00 ## marSingle 0.97 0.87 1.09 -0.50 0.62 ``` ] -- .pull-right[ <img src="images/poisPRM.gif" width="120%" style="display: block; margin: auto;" /> ] --- class:middle ### Illustration using the articles and PhD years after graduation To get the RATE ratio instead of RISK ratio, we need to tell the software where to find the follow-up time: ```r *dat$logprofage <- log(dat$profage) #summary(dat$profage) *pois2a <- glm(art ~ fem + mar + offset(logprofage), poisson(link="log"), data = dat) ``` -- ``` ## exp(Est.) 2.5% 97.5% z val. p ## (Intercept) 0.06 0.05 0.06 -81.49 0.0 ## femWomen 0.79 0.72 0.88 -4.30 0.0 ## marSingle 0.97 0.87 1.08 -0.52 0.6 ``` **.purple[The exponentiated coefficient give then the Rate Ratio.]** --- class:middle ### Illustration using the articles and PhD years after graduation <img src="images/artpred_T.gif" width="60%" style="display: block; margin: auto;" /> -- **comparing "goodness-of-fit" using AIC (Akaike Information Criteria)** ``` ## Empty Risk Rate ## [1,] 3487.147 3467.868 3470.165 ``` --- class:middle ### Illustration using the articles and PhD years after graduation ```r *pois2b <- glm(art ~ fem + mar + offset(profage), poisson(link="log"), data = dat) round(jtools::summ(pois2b, confint= T, exp = T)$"coeftable", 2) ``` ``` ## exp(Est.) 2.5% 97.5% z val. p ## (Intercept) 0.00 0.00 0.00 -997.79 0.00 ## femWomen 1.04 0.94 1.15 0.76 0.45 ## marSingle 0.71 0.64 0.79 -6.15 0.00 ``` .red[Using the variable age without the log-transformation changes the estimates and the likelihood is affected as evidenced by the AIC] **comparing "goodness-of-fit" using AIC (Akaike Information Criteria)** ``` ## Empty Risk Rate Rate_nolog ## [1,] 3487.147 3467.868 3470.165 14060.01 ``` --- class:middle ### Illustration using the articles and PhD years after graduation <img src="images/artpred_T1.png" width="60%" style="display: block; margin: auto;" /> - Under predicts zero counts and over predicts the first set of counts - Assumes same rate of outcome among the individual observations but incorporating “observed heterogeneity” relax this assumption. --- class:middle ### Poisson Models for Risk Ratios and Risk Differences Recall our example: **Wells in Bangladesh** .pull-left[
| Characteristic | N = 3,020 `\(^1\)` |
|:---------------------------|:-------------------:|
| switch | 1,737 (58%) |
| arsenic, Median (Q1, Q3) | 1.30 (0.82, 2.20) |
| arsenic, Min, Max | 0.51, 9.65 |
| dist100, Median (Q1, Q3) | 0.37 (0.21, 0.64) |
| dist100, Min, Max | 0.00, 3.40 |
| assoc | 1,277 (42%) |

.small[`\(^1\)` n (%)]
] -- .pull-right[ <img src="L21_EPIB704_Poisson_Regression_files/figure-html/unnamed-chunk-25-1.png" width="150%" height="150%" style="display: block; margin: auto;" /> ] --- class:middle ### Recall our example: **Wells in Bangladesh** **.black[Model "switching" to a safer well (outcome) as a function of whether the owner belongs to a community association and As. level.]** <br> |Model | Frequentist Estimates| |:---------------------------|:-------------------:| |Linear Model `\(^1\)` | RD= -0.036; RR= N/A | |Logit Model (Predicted probs.) `\(^2\)`| RD= -0.033; RR= 0.939| |AME (prediction `StdGLM`) `\(^2\)` | RD= -0.031; RR= 0.948 | |GLM: Log-Binomial `\(^2\)` |RD= N/A; RR= 0.928| <br> `\(^1\)` Single predictor; `\(^2\)` Two predictors --- class:middle ### Poisson Model and Risk Ratios –our example- **Wells in Bangladesh** ```r mod.glm.p.rr <- glm(switch ~ assoc + arsenic, data = wells, * family=poisson(link="log")) ``` -- ``` #> #> Call: #> glm(formula = switch ~ assoc + arsenic, family = poisson(link = "log"), #> data = wells) #> #> Coefficients: #> Estimate Std. Error z value Pr(>|z|) #> (Intercept) -0.7489 0.0473 -15.83 < 2e-16 *** #> assoc -0.0559 0.0488 -1.14 0.25 #> arsenic 0.1258 0.0192 6.55 5.6e-11 *** #> --- #> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 #> #> (Dispersion parameter for poisson family taken to be 1) #> #> Null deviance: 1921.5 on 3019 degrees of freedom #> Residual deviance: 1880.2 on 3017 degrees of freedom #> AIC: 5360 #> #> Number of Fisher Scoring iterations: 5 ``` --- class:middle Poisson Model and RRs –our example- **Wells in Bangladesh** <table class="table table-striped table-hover table-condensed table-responsive" style="width: auto !important; margin-left: auto; margin-right: auto;"> <tbody> <tr> <td style="text-align:left;font-weight: bold;"> Observations </td> <td style="text-align:right;"> 3020 </td> </tr> <tr> <td style="text-align:left;font-weight: bold;"> Dependent variable </td> <td style="text-align:right;"> switch </td> </tr> <tr> <td style="text-align:left;font-weight: bold;"> Type </td> <td style="text-align:right;"> Generalized linear model </td> </tr> <tr> <td style="text-align:left;font-weight: bold;"> Family </td> <td style="text-align:right;"> poisson </td> </tr> <tr> <td style="text-align:left;font-weight: bold;"> Link </td> <td style="text-align:right;"> log </td> </tr> </tbody> </table> <table class="table table-striped table-hover table-condensed table-responsive" style="width: auto !important; margin-left: auto; margin-right: auto;"> <tbody> <tr> <td style="text-align:left;font-weight: bold;"> 𝛘²(2) </td> <td style="text-align:right;"> 41.25 </td> </tr> <tr> <td style="text-align:left;font-weight: bold;"> Pseudo-R² (Cragg-Uhler) </td> <td style="text-align:right;"> 0.02 </td> </tr> <tr> <td style="text-align:left;font-weight: bold;"> Pseudo-R² (McFadden) </td> <td style="text-align:right;"> 0.01 </td> </tr> <tr> <td style="text-align:left;font-weight: bold;"> AIC </td> <td style="text-align:right;"> 5360.21 </td> </tr> <tr> <td style="text-align:left;font-weight: bold;"> BIC </td> <td style="text-align:right;"> 5378.25 </td> </tr> </tbody> </table> <table class="table table-striped table-hover table-condensed table-responsive" style="width: auto !important; margin-left: auto; margin-right: auto;border-bottom: 0;"> <thead> <tr> <th style="text-align:left;"> </th> <th style="text-align:right;"> exp(Est.) 
</th> <th style="text-align:right;"> 2.5% </th> <th style="text-align:right;"> 97.5% </th> <th style="text-align:right;"> z val. </th> <th style="text-align:right;"> p </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;font-weight: bold;"> (Intercept) </td> <td style="text-align:right;"> 0.47 </td> <td style="text-align:right;"> 0.43 </td> <td style="text-align:right;"> 0.52 </td> <td style="text-align:right;"> -15.83 </td> <td style="text-align:right;"> 0.00 </td> </tr> <tr> <td style="text-align:left;font-weight: bold;"> assoc </td> <td style="text-align:right;"> 0.95 </td> <td style="text-align:right;"> 0.86 </td> <td style="text-align:right;"> 1.04 </td> <td style="text-align:right;"> -1.14 </td> <td style="text-align:right;"> 0.25 </td> </tr> <tr> <td style="text-align:left;font-weight: bold;"> arsenic </td> <td style="text-align:right;"> 1.13 </td> <td style="text-align:right;"> 1.09 </td> <td style="text-align:right;"> 1.18 </td> <td style="text-align:right;"> 6.55 </td> <td style="text-align:right;"> 0.00 </td> </tr> </tbody> <tfoot><tr><td style="padding: 0; " colspan="100%"> <sup></sup> Standard errors: MLE</td></tr></tfoot> </table> --- class:middle ### Poisson Model and Risk Ratios, our example- Wells in Bangladesh ``` #> exp(Est.) 2.5% 97.5% z val. p #> (Intercept) 0.473 0.431 0.519 -15.83 1.82e-56 #> assoc 0.946 0.859 1.041 -1.14 2.52e-01 #> arsenic 1.134 1.092 1.178 6.55 5.61e-11 ``` -- <br> - Poisson models provides RISK ratios with biased confidence intervals. - The model assumes a Poisson distribution when in fact the outcomes are binary. - To overcome issues with the variance, we could use a variance estimator that is robust to the misspecification of the model. --- class:middle ### Poisson Model and RRs (Robust Variance) –our example- **Wells in Bangladesh** ```r jtools::summ(mod.glm.p.rr, confint=T, exp=T, * robust= "HC0")$"coeftable" ``` ``` #> exp(Est.) 2.5% 97.5% z val. p #> (Intercept) 0.473 0.445 0.502 -24.48 2.63e-132 #> assoc 0.946 0.889 1.006 -1.77 7.74e-02 #> arsenic 1.134 1.110 1.159 11.50 1.33e-30 ``` -- **Quasi-Poisson Risk Ratio** ```r mod.glm.qp.rr <- glm(switch ~ assoc + arsenic, data = wells, * family= quasipoisson(link="log")) jtools::summ(mod.glm.qp.rr, confint=T, exp=T)$"coeftable" ``` ``` #> exp(Est.) 2.5% 97.5% t val. p #> (Intercept) 0.473 0.445 0.502 -24.31 2.19e-119 #> assoc 0.946 0.888 1.006 -1.76 7.90e-02 #> arsenic 1.134 1.107 1.162 10.06 1.87e-23 ``` .small[*Note: Pseudo-R2 for quasibinomial/quasipoisson families is calculated by refitting the fitted and null models as binomial/poisson.] --- class:middle ### Poisson Model and Risk Differences –our example- **Wells in Bangladesh** ```r mod.glm.p.rd <- glm(switch ~ assoc + arsenic, data = wells, * family=poisson(link="identity")) ``` -- ``` #> #> Call: #> glm(formula = switch ~ assoc + arsenic, family = poisson(link = "identity"), #> data = wells) #> #> Coefficients: #> Estimate Std. Error z value Pr(>|z|) #> (Intercept) 0.4388 0.0278 15.80 < 2e-16 *** #> assoc -0.0268 0.0275 -0.97 0.33 #> arsenic 0.0891 0.0138 6.47 9.8e-11 *** #> --- #> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 #> #> (Dispersion parameter for poisson family taken to be 1) #> #> Null deviance: 1921.5 on 3019 degrees of freedom #> Residual deviance: 1876.2 on 3017 degrees of freedom #> AIC: 5356 #> #> Number of Fisher Scoring iterations: 5 ``` --- class:middle ### Poisson Model and Risk Differences ```r jtools::summ(mod.glm.p.rd, confint=T)$"coeftable" ``` ``` #> Est. 2.5% 97.5% z val. 
p #> (Intercept) 0.4388 0.3844 0.4932 15.800 3.09e-56 #> assoc -0.0268 -0.0806 0.0271 -0.974 3.30e-01 #> arsenic 0.0891 0.0621 0.1161 6.470 9.83e-11 ``` -- **Quasi-Poisson Risk Difference (Robust Variance)** ```r mod.glm.qp.rd1 <- glm(switch ~ assoc + arsenic, data = wells, * family= quasipoisson(link="identity")) # jtools::summ(mod.glm.qp.rd1, confint=T)$"coeftable" ``` ``` #> Est. 2.5% 97.5% t val. p #> (Intercept) 0.4388 0.4033 0.47429 24.23 1.19e-118 #> assoc -0.0268 -0.0619 0.00837 -1.49 1.36e-01 #> arsenic 0.0891 0.0715 0.10674 9.92 7.54e-23 ``` --- class:middle **Why and When do we use Poisson for Risk Ratios and Risk Differences?** **.purple[Poisson regression]** - Used for rare outcomes - When the follow-up has different lengths -- **.black[Logistic regression:]** Used with same length of follow-up time -- **.blue[Poisson approximation to Binomial]** - Cohort studies where all patients have equal follow-up times. - Poisson regression can be used similarly as logistic regression, with a time-at-risk value specified as 1 for each subject. - If the model adequately fits the data, this provides a correct estimate of the AdjRR <br> .small[Chen et al.: Comparison of robustness to outliers between robust poisson models and log-binomial models when estimating relative risks for common binary outcomes: a simulation study. BMC Medical Research Methodology 2014 14:82. ] --- class:middle ## Comparing Results **.black[Model "switching" to a safer well (outcome) as a function of whether the owner belongs to a community association and As. level.]** <br> |Model | Frequentist Estimates| |:---------------------------|:-------------------:| |Linear Model `\(^1\)` | RD= -0.036; RR= N/A | |Logit Model (Predicted probs.) `\(^2\)`| RD= -0.033; RR= 0.939| |AME (prediction `StdGLM`) `\(^2\)` | RD= -0.031; RR= 0.948 | |GLM: Log-Binomial `\(^2\)` |RD= N/A; RR= 0.928| | .red[Poisson Model] `\(^2\)` | .red[RD = -0.027; RR= 0.946]| `\(^1\)` Single predictor; `\(^2\)` Two predictors --- class:middle **Why and When do we use Poisson for Risk Ratios and Risk Differences?** - For common outcomes, Poisson regression is likely to compute a confidence interval(s) that is conservative, suggesting less precision than is true. - Poisson regression produces wider confidence intervals (compared with a log-binomial model and stratified analysis) because Poisson errors are overestimates of binomial errors when the outcome is common - Poisson errors approximately equal binomial errors when the outcome is rare <br> .small[ Guangyong Zou. A Modified Poisson Regression Approach to Prospective Studies with Binary Data, AJE, Vol 159: 7 (2004).702–706 (https://doi.org/10.1093/aje/kwh090). Chen et al. Comparison of robustness to outliers between robust poisson models and log-binomial models when estimating relative risks for common binary outcomes: a simulation study. BMC Med Res Method 2014 14:82. ] --- class:middle ### Robust Poisson models - The Poisson regression uses a logarithm as the natural link function under the generalized linear model framework. - The robust Poisson regression model uses the classical sandwich estimator under the generalized estimation equation (GEE) framework to correct the inflated variance (also known as over-dispersion) in the standard Poisson regression. - The technique is known as modified Poisson regression or pseudo-likelihood estimation. 
- This correction can be achieved by using the `robust` option from `jtools` in `R` or the `quasipoisson` family in `glm`; in SAS, by using the REPEATED statement in Proc GENMOD; in Stata, by using the ROBUST option of the Poisson procedure.
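---
class:middle

### Robust Poisson in `R`: a sketch

The Wells slides obtained robust standard errors through `jtools::summ(..., robust = "HC0")`. Below is a minimal sketch of the same sandwich correction using the `sandwich` and `lmtest` packages directly, assuming the `mod.glm.p.rr` object fitted on the earlier slides is still in the workspace.

```r
library(sandwich)   # sandwich (robust) variance estimators
library(lmtest)     # coeftest() for Wald tests with a supplied vcov

vc <- vcovHC(mod.glm.p.rr, type = "HC0")   # HC0 sandwich variance
coeftest(mod.glm.p.rr, vcov = vc)          # log-scale estimates, robust SEs

# Risk ratios with robust 95% Wald intervals, computed by hand
b  <- coef(mod.glm.p.rr)
se <- sqrt(diag(vc))
round(exp(cbind(RR = b, lower = b - 1.96 * se, upper = b + 1.96 * se)), 3)
```

The point estimates are unchanged; only the standard errors, and hence the confidence intervals, are corrected.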
---
class:middle

### Robust Poisson models vs. Log-Binomial

- In the presence of linear confounders, the two models yield comparable relative biases.

- With non-linear confounders, the robust Poisson model outperforms the log-binomial model, with larger differences for rare outcomes.

- The robust Poisson models are more robust to outliers than the log-binomial models when estimating relative risks or risk ratios for common binary outcomes.

.small[Chen et al. Comparison of robustness to outliers between robust poisson models and log-binomial models when estimating relative risks for common binary outcomes: a simulation study. BMC Medical Research Methodology 2014 14:82.]

---
class:middle

### Log-Binomial vs Robust Poisson?

- Of the two methods, the log-binomial method is generally preferred because the MLEs from log-binomial models are more efficient than the pseudo-likelihood estimators used by the robust Poisson models.

- Spiegelman and Hertzmark recommend using the log-binomial models over the robust Poisson models when convergence is not an issue.

- The gain in efficiency appears to benefit log-binomial models only in small samples.

- Because of this efficiency concern with the robust Poisson models in small samples, the log-binomial model may still be the preferred choice when the sample size is small.

---
class:middle

### Other Count Models

- Negative Binomial (Overdispersed Poisson Models)

- Zero-Inflated Models

- Zero-Truncated Models

.pull-right[
### .red[… Out of the scope for this session]
]

---
class: middle

### QUESTIONS?

## COMMENTS?

# RECOMMENDATIONS?

---
class:middle

## Distributions

<img src="images/distributions.png" width="80%" style="display: block; margin: auto;" />

Vittinghoff E., Glidden D.V., Shiboski S.C., McCulloch C.E. (2012) Generalized Linear Models. In: Regression Methods in Biostatistics. Statistics for Biology and Health. Springer, Boston, MA

---
class:middle

### Log-Binomial vs Robust Poisson?

- Software uses the iteratively weighted least squares (IWLS) approach, or variations of it, to find MLEs for generalized linear models.

- For log-binomial models, the weights used by the IWLS approach contain the term `\(1/(1-p)\)`, where `\(p = \exp(X^T\beta)\)` ranges from 0 to 1.

- The MLE of a log-binomial model is likely to be too sensitive to outliers because a very large `\(p\)` has a large influence on the weights.

- Both the MLE and the pseudo-likelihood estimators deteriorate in the presence of outliers.

- The level of deterioration differs when the relationship between the confounder and the outcome is not in a simple form.

---
class:middle

### Maximum likelihood estimate of lambda

The log-likelihood, whose maximizer is the MLE, is

`$$\ell(\lambda; y_1, \dots, y_n) = \sum_{i = 1}^n \log f(y_i; \lambda) = \sum_{i = 1}^n \log \frac{\lambda^{y_i}e^{-\lambda}}{y_i!}$$`

Take the first derivative (notation: `\(\prime = \frac{d(\cdot)}{d\lambda}\)`)

`\(\ell^\prime(\lambda; y_1, \dots, y_n) = \sum_{i = 1}^n \left\{ \log \frac{\lambda^{y_i}e^{-\lambda}}{y_i!} \right\}^\prime\)`

`\(= \sum_{i = 1}^n \left\{ -\lambda + y_i \cdot \log(\lambda) - \log(y_i!) \right\}^\prime\)`

`\(= \sum_{i = 1}^n \left\{ -1 + y_i \cdot \frac{1}{\lambda} \right\}\)`

`\(= -n + \frac{1}{\lambda} \sum_{i = 1}^n y_i\)`

---
class:middle

### Maximum likelihood estimate of lambda

The first-order condition for maximizing the log-likelihood sets its derivative to zero

`$$\ell^\prime(\lambda; y_1, \dots, y_n) = 0$$`

`$$-n + \frac{1}{\lambda} \sum_{i = 1}^n y_i = 0$$`

`$$n \cdot \lambda = \sum_{i = 1}^n y_i$$`

`$$\lambda = \frac{1}{n} \sum_{i = 1}^n y_i = \bar y$$`

Thus, the maximum likelihood estimator is simply the empirical mean, `\(\hat \lambda = \bar y\)`.

---
class:middle

## Poisson distribution (PD)

If instead we are given the average rate, *r*, at which events occur, then `\(\lambda = rt\)` and

`$$P(k \text{ events in interval } t) = \dfrac{(rt)^{k}e^{-rt}}{k!}$$`

---
class:middle

## Poisson distribution (PD)

The PD is a limiting case of the binomial, with the number of trials `\(n \to \infty\)` and P(success in any trial) `\(\to 0\)`.

If `\(p\)` is small, binomial P(k successes) `\(\approx\)` Poisson P(k), with `\(\lambda = np\)`.

<br>

**Example:** Consider a binomial distribution with probability = 0.01. What is the probability of 8 occurrences in 1000 tries?

```r
# binomial
dbinom(8,1000,0.01)
```

```
#> [1] 0.113
```

```r
# poisson
dpois(8,1000*.01) # lambda = n*p
```

```
#> [1] 0.113
```

---
class: middle

### Offset vs Exposure

In most applications of count-data regression there is a baseline or exposure, some value such as the number of person-years over which the counts occurred.

We can model `\(y_i\)` as the number of cases in a process with rate `\(\theta_i\)` and exposure `\(\mu_i\)`

`$$y_i \sim \text{negative binomial}(\mu_i \theta_i, \phi)$$`

where, as before, `\(\theta_i = e^{X_i \beta}\)`; this includes Poisson regression as the special case `\(\phi \to \infty\)`.

The logarithm of the exposure, `\(\log(\mu_i)\)`, is called the offset in GLM terminology. The regression coefficients `\(\beta\)` now reflect the associations between the predictors and `\(\theta_i\)` (for example, the rate of deaths).

Putting the logarithm of the exposure into the model as an offset is equivalent to including it as a regression predictor, but with its coefficient fixed to the value 1. Another option is to include it as a predictor and let its coefficient be estimated from the data.

---
class: middle

## AIC = Akaike information criterion

`\(AIC = -2\,(\text{log-likelihood}) + k\, n_{par}\)`, where `\(k = 2\)`; equivalently, `\(AIC = 2k - 2\ln(\hat{L})\)`, where in this second form

`\(k\)` = number of estimated parameters in the model

`\(\hat {L}\)` = maximum value of the likelihood function for the model

Given a collection of models for the data, AIC estimates the quality of each model relative to each of the other models. Hence, AIC provides a means for model selection.

---
class: middle

## BIC = Bayesian Information Criterion

`\(BIC = -2\,(\text{log-likelihood}) + k\, n_{par}\)`, where `\(k = \log(n)\)` and `\(n\)` is the number of observations.
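---
class:middle

### Appendix: checking the MLE, AIC and BIC in `R`

A minimal sketch with simulated counts (the data and `\(\lambda\)` value are illustrative, not the course data set): the numerically maximized Poisson log-likelihood, the sample mean, and the exponentiated intercept of an intercept-only Poisson GLM all give the same `\(\hat\lambda\)`, and AIC/BIC can be reproduced from the log-likelihood.

```r
set.seed(704)
y <- rpois(500, lambda = 2.3)            # simulated counts, illustrative lambda

# Numerical maximisation of the Poisson log-likelihood
loglik <- function(lambda) sum(dpois(y, lambda, log = TRUE))
optimize(loglik, interval = c(0.01, 10), maximum = TRUE)$maximum
mean(y)                                   # same value: lambda-hat = y-bar

fit <- glm(y ~ 1, family = poisson)
exp(coef(fit))                            # again y-bar (same identity seen with pois0)

# AIC and BIC reproduced from the log-likelihood
c(AIC = -2 * as.numeric(logLik(fit)) + 2 * length(coef(fit)),
  BIC = -2 * as.numeric(logLik(fit)) + log(length(y)) * length(coef(fit)))
AIC(fit); BIC(fit)                        # match the built-in extractors
```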