satRday Presentation: Algorithmic Financial Planning

Anisa Mathura

7 March 2020

R Packages used for Regression modelling

  • vtreat

  • mgcv

  • standardize

The Problem and the Data

  • PROBLEM: Create a statistical regression model that relates acute hospital profit per day to operational influences

  • DATA: Operational, Revenue, Cost and Efficiency metrics for 43 acute hospitals over 3 years, i.e. 43 × 3 = 129 data rows

  • INTENTION: A parsimonious model that still accommodates the diversity of experience across the hospitals

To Explain NOT Predict

  • Isolate the metrics that are statistically significant drivers of profit per day

  • Ascertain the influence of the Hospital Manager vs Head Office on these metrics

  • Use the results and appropriate presentation to convince financial executives of the “many-model approach” to business strategy!

The Process

  • Analyse the potential explanatory variables against the dependent variable

  • Fit a linear regression model to identify the statistically significant variables

  • Evaluate a Generalised Additive Model with the same variables

  • Decide on how to present results

Plots of Explanatory variables
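
The first step of the process, looking at each explanatory variable against the dependent variable, could be sketched roughly as below. The hospital data is not available here, so `HP_Data` is simulated: only the column names are borrowed from the model output later in the deck, while the value ranges and coefficients are invented for illustration.

```r
# Simulated stand-in for the hospital data (values are invented, NOT the real data).
set.seed(1)
n <- 129  # 43 hospitals x 3 years
HP_Data <- data.frame(
  Occupancy   = runif(n, 0.4, 0.9),
  Med_PPD.Rt  = runif(n, 0.5, 1.5),
  Net_Rev.PPD = runif(n, 5000, 15000)
)
HP_Data$OpEB_PPD <- 150 + 2100 * HP_Data$Occupancy -
  1300 * HP_Data$Med_PPD.Rt + 0.15 * HP_Data$Net_Rev.PPD +
  rnorm(n, sd = 300)

explanatory <- c("Occupancy", "Med_PPD.Rt", "Net_Rev.PPD")

# Pearson correlation of each explanatory variable with the response.
cors <- sapply(explanatory, function(v) cor(HP_Data[[v]], HP_Data$OpEB_PPD))
print(round(cors, 2))

# One scatter plot per variable, with a lowess curve to hint at non-linearity.
op <- par(mfrow = c(1, 3))
for (v in explanatory) {
  plot(HP_Data[[v]], HP_Data$OpEB_PPD, xlab = v, ylab = "OpEB_PPD")
  lines(lowess(HP_Data[[v]], HP_Data$OpEB_PPD), col = "red")
}
par(op)
```

A clearly curved lowess line in one of the panels is an early hint that a smooth term may describe that variable better than a straight line.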

Summary of a Linear Model using 3-way Cross-Validation

## 
## Call:
## lm(formula = fmla.lin, data = HP_Data[split$train, ])
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1356.82  -176.03     4.02   173.02   700.42 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  1.510e+02  4.251e+02   0.355    0.723    
## Occupancy    2.146e+03  2.733e+02   7.852 1.37e-11 ***
## Med_PPD.Rt  -1.339e+03  2.780e+02  -4.815 6.63e-06 ***
## Net_Rev.PPD  1.494e-01  2.474e-02   6.039 4.34e-08 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 314.2 on 82 degrees of freedom
## Multiple R-squared:  0.8376, Adjusted R-squared:  0.8317 
## F-statistic:   141 on 3 and 82 DF,  p-value: < 2.2e-16
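
The call above fits on `HP_Data[split$train, ]`, and 82 residual degrees of freedom plus 4 coefficients gives 86 training rows, i.e. two thirds of the 129. A minimal sketch of how such a 3-way split might be run follows. It assumes base R for the fold assignment (a split plan from the vtreat package would serve the same purpose), uses simulated data with invented values, and guesses at the contents of `fmla.lin` from the coefficient names.

```r
# Simulated stand-in for the hospital data (values are invented, NOT the real data).
set.seed(2)
n <- 129
HP_Data <- data.frame(
  Occupancy   = runif(n, 0.4, 0.9),
  Med_PPD.Rt  = runif(n, 0.5, 1.5),
  Net_Rev.PPD = runif(n, 5000, 15000)
)
HP_Data$OpEB_PPD <- 150 + 2100 * HP_Data$Occupancy -
  1300 * HP_Data$Med_PPD.Rt + 0.15 * HP_Data$Net_Rev.PPD +
  rnorm(n, sd = 300)

# Assumed formula, inferred from the coefficient table.
fmla.lin <- OpEB_PPD ~ Occupancy + Med_PPD.Rt + Net_Rev.PPD

# Randomly assign each row to one of 3 folds of equal size (43 rows each).
k <- 3
fold <- sample(rep(seq_len(k), length.out = n))

HP_Data$pred.cv <- NA_real_
for (i in seq_len(k)) {
  # Train on two folds, apply the model to the held-out fold.
  split <- list(train = which(fold != i), app = which(fold == i))
  model <- lm(fmla.lin, data = HP_Data[split$train, ])
  HP_Data$pred.cv[split$app] <- predict(model, newdata = HP_Data[split$app, ])
}

# Out-of-sample RMSE across all three held-out folds.
rmse.cv <- sqrt(mean((HP_Data$OpEB_PPD - HP_Data$pred.cv)^2))
print(rmse.cv)
```

Every row receives exactly one out-of-sample prediction, so `rmse.cv` can be compared with the in-sample residual standard error to judge how far the fit generalises across hospitals.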

Residuals vs Fitted Values

Q-Q Plot to check Normality

Scale-Location

Residuals vs Leverage
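
The four diagnostic plots named on these slides correspond to the standard `plot()` method for `lm` objects in base R. The sketch below shows one way to draw them on a single page; the data is simulated with invented values and the 86-row training sample is an assumption mirroring the summary output.

```r
# Simulated stand-in for the hospital data (values are invented, NOT the real data).
set.seed(3)
n <- 129
HP_Data <- data.frame(
  Occupancy   = runif(n, 0.4, 0.9),
  Med_PPD.Rt  = runif(n, 0.5, 1.5),
  Net_Rev.PPD = runif(n, 5000, 15000)
)
HP_Data$OpEB_PPD <- 150 + 2100 * HP_Data$Occupancy -
  1300 * HP_Data$Med_PPD.Rt + 0.15 * HP_Data$Net_Rev.PPD +
  rnorm(n, sd = 300)

# Fit on a random 86-row training sample (assumed size).
train <- sample(n, 86)
fit <- lm(OpEB_PPD ~ Occupancy + Med_PPD.Rt + Net_Rev.PPD,
          data = HP_Data[train, ])

# which = 1: Residuals vs Fitted, 2: Normal Q-Q,
#         3: Scale-Location,      5: Residuals vs Leverage
op <- par(mfrow = c(2, 2))
plot(fit, which = c(1, 2, 3, 5))
par(op)
```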

Summary of a Generalised Additive Model using 3-way Cross-Validation

## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## OpEB_PPD ~ s(Occupancy) + s(Med_PPD.Rt) + s(Net_Rev.PPD)
## 
## Parametric coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  1846.75      31.42   58.77   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##                  edf Ref.df      F  p-value    
## s(Occupancy)   2.330  2.889 18.956 4.19e-08 ***
## s(Med_PPD.Rt)  8.736  8.970  2.826  0.00504 ** 
## s(Net_Rev.PPD) 1.000  1.000 16.084  0.00014 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.856   Deviance explained = 87.6%
## GCV = 1.0014e+05  Scale est. = 84924     n = 86
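
In the summary above, an edf close to 1 (as for `s(Net_Rev.PPD)`) means the smooth is effectively a straight line, while an edf near 9 (as for `s(Med_PPD.Rt)`) indicates a strongly non-linear fitted curve. A minimal sketch of fitting the same formula with mgcv follows. The data is simulated, with a deliberately curved `Med_PPD.Rt` effect invented for illustration, so the numbers will not match the slide.

```r
library(mgcv)

# Simulated stand-in for the hospital data (values are invented, NOT the real data).
set.seed(4)
n <- 129
HP_Data <- data.frame(
  Occupancy   = runif(n, 0.4, 0.9),
  Med_PPD.Rt  = runif(n, 0.5, 1.5),
  Net_Rev.PPD = runif(n, 5000, 15000)
)
# Invented non-linear effect for Med_PPD.Rt so the smooth has something to find.
HP_Data$OpEB_PPD <- 150 + 2100 * HP_Data$Occupancy -
  1300 * HP_Data$Med_PPD.Rt - 400 * sin(4 * HP_Data$Med_PPD.Rt) +
  0.15 * HP_Data$Net_Rev.PPD + rnorm(n, sd = 300)

# Same formula as in the summary output; 86-row training sample is assumed.
train <- sample(n, 86)
fmla.gam <- OpEB_PPD ~ s(Occupancy) + s(Med_PPD.Rt) + s(Net_Rev.PPD)
gam_fit <- gam(fmla.gam, family = gaussian, data = HP_Data[train, ])

# Effective degrees of freedom per smooth term.
edf <- summary(gam_fit)$edf
print(round(edf, 2))

# Partial-effect plot of each smooth on one page.
plot(gam_fit, pages = 1)
```

The partial-effect plots are often easier to show to a non-statistical audience than the coefficient table, because each panel reads as "profit per day versus this metric, holding the others fixed".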

Next Steps

Modelling…

  • Drop outliers, in particular loss-making hospitals, and re-evaluate the models

  • Experiment with interaction terms for the GAM
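
One possible way to experiment with interaction terms in mgcv is a tensor product interaction, `ti()`, added alongside the main-effect smooths. The sketch below is only illustrative: the choice of `Occupancy` and `Med_PPD.Rt` as the interacting pair is an assumption, and the data is simulated with invented values.

```r
library(mgcv)

# Simulated stand-in for the hospital data (values are invented, NOT the real data).
set.seed(5)
n <- 129
HP_Data <- data.frame(
  Occupancy   = runif(n, 0.4, 0.9),
  Med_PPD.Rt  = runif(n, 0.5, 1.5),
  Net_Rev.PPD = runif(n, 5000, 15000)
)
HP_Data$OpEB_PPD <- 150 + 2100 * HP_Data$Occupancy -
  1300 * HP_Data$Med_PPD.Rt + 0.15 * HP_Data$Net_Rev.PPD +
  rnorm(n, sd = 300)

# Main effects only.
gam_main <- gam(OpEB_PPD ~ s(Occupancy) + s(Med_PPD.Rt) + s(Net_Rev.PPD),
                data = HP_Data)

# Main effects plus a pure interaction surface between two of the variables.
gam_int <- gam(OpEB_PPD ~ s(Occupancy) + s(Med_PPD.Rt) + s(Net_Rev.PPD) +
                 ti(Occupancy, Med_PPD.Rt),
               data = HP_Data)

# Compare the two fits; a lower AIC favours that model.
cmp <- AIC(gam_main, gam_int)
print(cmp)

# Approximate significance of each smooth, including the ti() term.
s_table <- summary(gam_int)$s.table
print(round(s_table, 3))
```

Using `ti()` rather than `te()` keeps the main effects in their own `s()` terms, so the interaction can be judged separately from them.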

Presentation of results…

  • Present the results in a non-offensive manner

  • Decide whether the model is intended to

    • target profit
    • measure performance or
    • strategically manage hospitals

END