Last updated: 2019-04-10

Checks: 6 passed, 0 failed

Knit directory: wflow-divvy/analysis/

This reproducible R Markdown analysis was created with workflowr (version 1.2.0.9000). The Report tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.


Great! Since the R Markdown file has been committed to the Git repository, you know the exact version of the code that produced these results.

Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproducibility it’s best to always run the code in an empty environment.

The command set.seed(1) was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.
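
As a small illustration of why the seed matters, the sketch below draws the same random sample twice, resetting the seed before each draw. The variable names are made up for this example and are not part of the analysis.

```r
# Draw a random sample after fixing the seed.
set.seed(1)
first <- sample(1:100, 5)

# Reset the seed to the same value and draw again.
set.seed(1)
second <- sample(1:100, 5)

# The two draws are the same because the generator started from the same state.
identical(first, second)
```

Without the second call to set.seed, the second draw would continue from wherever the random number generator left off, so it would not be expected to match the first.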

Great job! Recording the operating system, R version, and package versions is critical for reproducibility.

Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.

Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility. The version displayed above was the version of the Git repository at the time these results were generated.

Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish or wflow_git_commit). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:


Ignored files:
    Ignored:    .DS_Store
    Ignored:    analysis/.DS_Store
    Ignored:    data/Divvy_Stations_2016_Q1Q2.csv
    Ignored:    data/Divvy_Stations_2016_Q3.csv
    Ignored:    data/Divvy_Stations_2016_Q4.csv
    Ignored:    data/Divvy_Trips_2016_04.csv
    Ignored:    data/Divvy_Trips_2016_05.csv
    Ignored:    data/Divvy_Trips_2016_06.csv
    Ignored:    data/Divvy_Trips_2016_Q1.csv
    Ignored:    data/Divvy_Trips_2016_Q3.csv
    Ignored:    data/Divvy_Trips_2016_Q4.csv
    Ignored:    data/README.txt
    Ignored:    data/data.tar.gz
    Ignored:    docs/.DS_Store

Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.
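
One way to follow the advice above is sketched here, using workflowr's wflow_status, wflow_git_commit and wflow_publish. The file paths are taken from this report (the helper script sourced below and the station map analysis), the commit messages are placeholders, and the check for a _workflowr.yml file is only an assumed way of detecting that the code is being run from the project root. Both calls use dry_run = TRUE so that nothing is committed or rebuilt.

```r
# Files named in this report, given relative to the project root
# (the knit directory is analysis/, so "../code" resolves to "code").
rmd_file    <- "analysis/station-map.Rmd"
helper_file <- "code/functions.R"

# Only attempt the calls when workflowr is installed and the working
# directory looks like a workflowr project (assumption: _workflowr.yml exists).
if (requireNamespace("workflowr", quietly = TRUE) &&
    file.exists("_workflowr.yml")) {
  # Summarise which R Markdown files are unpublished or out of date.
  print(workflowr::wflow_status())

  # Preview committing the helper script; dry_run = TRUE changes nothing.
  workflowr::wflow_git_commit(helper_file,
                              message = "Update helper functions",
                              dry_run = TRUE)

  # Preview rebuilding and committing the analysis.
  workflowr::wflow_publish(rmd_file,
                           message = "Rebuild station map",
                           dry_run = TRUE)
}
```

Dropping dry_run = TRUE would perform the commit and the rebuild for real.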


These are the previous versions of the R Markdown and HTML files. If you’ve configured a remote Git repository (see ?wflow_git_remote), click on the hyperlinks in the table below to view them.

File  Version  Author            Date        Message
html  df35db6  Peter Carbonetto  2018-08-24  Build site.
Rmd   a860ded  Peter Carbonetto  2018-08-24  wflow_publish("station-map.Rmd")
html  54fcf4e  Peter Carbonetto  2018-04-14  Re-built station-map, time-of-day-trends and seasonal-trends webpages
Rmd   de31b24  Peter Carbonetto  2018-04-14  wflow_publish(c("station-map.Rmd", "seasonal-trends.Rmd",
Rmd   db2ffe0  Peter Carbonetto  2018-04-14  wflow_publish("station-map.Rmd")
html  51163d7  Peter Carbonetto  2018-03-12  Ran wflow_publish("*.Rmd") with version v0.11.0 of workflowr.
html  ab9176e  Peter Carbonetto  2018-03-09  Added code_hiding to the analysis R Markdown files.
html  e48700b  Peter Carbonetto  2018-01-30  Ran wflow_publish("station-map.Rmd") for demo with Simon.
html  b32e833  Peter Carbonetto  2018-01-18  Re-built all webpages using workflowr v0.1.0.
html  0401587  Peter Carbonetto  2017-11-16  Updated license.html, setup.html, station-map.html and
Rmd   9463eb6  Peter Carbonetto  2017-11-16  wflow_publish(c("setup.Rmd", "license.Rmd", "time-of-day-trends.Rmd",
Rmd   6b9ddf1  Peter Carbonetto  2017-08-02  Added header with between-section spacing adjustment, and removed <br> tags from R Markdown files.
html  727b8d9  Peter Carbonetto  2017-07-13  Re-built all the analysis files; wflow_publish(Sys.glob("*.Rmd")).
Rmd   6d02ffc  Peter Carbonetto  2017-07-13  Made a dozen or so small adjustments to the .Rmd files.
html  bf818d8  Peter Carbonetto  2017-07-07  Ran wflow_publish(c("index.Rmd", "setup.Rmd", "station-map.Rmd",
Rmd   e4ba033  Peter Carbonetto  2017-07-07  Removed use of word ‘notebook’.
html  597355d  Peter Carbonetto  2017-07-07  Ran wflow_publish(c(index.Rmd,first-glance.Rmd,station-map.Rmd,time-of-day-trends.Rmd)).
Rmd   f7da4f6  Peter Carbonetto  2017-07-07  Fixed a broken link, and made a bunch of small revisions to the notebooks.
html  f62f674  Peter Carbonetto  2017-07-05  Re-built all the files without cached chunks.
Rmd   96f2db4  Peter Carbonetto  2017-07-05  wflow_publish(c("index.Rmd", "first-glance.Rmd", "station-map.Rmd"))
html  08c0318  Peter Carbonetto  2017-07-05  Build site.
Rmd   8113086  Peter Carbonetto  2017-07-05  I have a first draft of the station map notebook.
Rmd   67b8d2b  Peter Carbonetto  2017-07-04  A variety of improvements to the data analysis notebooks.
Rmd   5c4fd93  Peter Carbonetto  2017-06-29  wflow_publish("first-look.Rmd")

In this analysis, I will use the Divvy trip and station data to generate a map of Chicago.

I begin by loading a few packages, as well as some additional functions I wrote for this project.

library(data.table)
# Warning: package 'data.table' was built under R version 3.4.4
library(ggplot2)
# Warning: package 'ggplot2' was built under R version 3.4.4
source("../code/functions.R")

Read the data

As before, I use the function read.divvy.data to read the trip and station data from the CSV files.

divvy <- read.divvy.data()
# Reading station data from ../data/Divvy_Stations_2016_Q4.csv.
# Reading trip data from ../data/Divvy_Trips_2016_Q1.csv.
# Reading trip data from ../data/Divvy_Trips_2016_04.csv.
# Reading trip data from ../data/Divvy_Trips_2016_05.csv.
# Reading trip data from ../data/Divvy_Trips_2016_06.csv.
# Reading trip data from ../data/Divvy_Trips_2016_Q3.csv.
# Reading trip data from ../data/Divvy_Trips_2016_Q4.csv.
# Preparing Divvy data for analysis in R.
# Converting dates and times.

Get total number of departures by station

I use the trip data to count the total number of departures from each station, and add these counts to the station table as a new “departures” column.

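# Count the trips departing from each station, and append the counts
# to the station table as a new column.
divvy$stations <-
  cbind(divvy$stations,
    data.frame(departures = as.vector(table(divvy$trips$from_station_id))))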
head(divvy$stations)
#                           name latitude longitude dpcapacity online_date
# 456        2112 W Peterson Ave 41.99118 -87.68359         15   5/12/2015
# 101              63rd St Beach 41.78102 -87.57612         23   4/20/2015
# 109          900 W Harrison St 41.87468 -87.65002         19    8/6/2013
# 21  Aberdeen St & Jackson Blvd 41.87773 -87.65479         15   6/21/2013
# 80     Aberdeen St & Monroe St 41.88042 -87.65560         19   6/26/2013
# 346   Ada St & Washington Blvd 41.88283 -87.66121         15  10/10/2013
#     departures
# 456        500
# 101       1068
# 109       4813
# 21        9425
# 80       10577
# 346       8480
summary(divvy$stations$departures)
#    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
#       1     557    3058    6188    9029   90042
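
The cbind call pairs the counts with the stations by row position, so it relies on table returning the counts in the same order as the rows of the station table. I have not checked here how read.divvy.data guarantees this; presumably it does. As a more defensive variant, the sketch below looks up each station's count by its id instead. It uses small made-up data frames (stations and trips, with an id column that I am assuming for illustration) rather than the real Divvy data.

```r
# Toy stand-ins for the station and trip tables (hypothetical values).
# Station 999 has no departures in the toy trip data.
stations <- data.frame(id   = c(456, 101, 109, 999),
                       name = c("A", "B", "C", "D"))
trips <- data.frame(from_station_id = c(101, 109, 109, 456, 109, 101))

# Count the departures for each station id that appears in the trips.
counts <- table(trips$from_station_id)

# Look up each station's count by id rather than by row position.
idx <- match(as.character(stations$id), names(counts))
stations$departures <- as.vector(counts)[idx]

# Stations that never appear in the trip data get NA; treat them as zero.
stations$departures[is.na(stations$departures)] <- 0
stations
```

With this approach, a station that has no trips, or a station table sorted differently from the ids, does not shift the counts onto the wrong rows.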

Create a Divvy stations map

A plot of the Divvy stations by geographic location (latitude and longitude) traces the outlines of the City of Chicago and the Lake Michigan shore. Scaling the size of each circle by the number of departures from that station (more precisely, by its square root) also makes the location of downtown apparent.

The University of Chicago Divvy station is highlighted in red.

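# Flag the University of Chicago station so that it can be drawn in a
# different colour.
divvy$stations <-
  transform(divvy$stations,
            at.uchicago = (name == "University Ave & 57th St"))

# Plot the stations by location. Circle size is mapped to the square
# root of the number of departures; FALSE is drawn in dark blue and
# TRUE (the University of Chicago station) in red.
ggplot(divvy$stations,aes(x    = longitude,
                          y    = latitude,
                          fill = at.uchicago,
                          size = sqrt(departures))) +
  geom_point(shape = 21,color = "white") +
  scale_fill_manual(values = c("darkblue","red")) +
  theme_minimal() +
  theme(panel.grid.major = element_blank(),
        panel.grid.minor = element_blank())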
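
The plot above maps circle size to sqrt(departures). If the aim were instead to make the area of each circle proportional to the number of departures, one option would be to map size to departures directly and add ggplot2's scale_size_area, which maps a value of zero to a circle of zero size. The sketch below shows this on a few made-up stations; the data frame and the max_size setting are illustrative choices, not part of the original analysis.

```r
library(ggplot2)

# Toy stand-in for the station table (hypothetical coordinates and counts).
stations <- data.frame(longitude   = c(-87.68, -87.58, -87.65, -87.60),
                       latitude    = c(41.99, 41.78, 41.87, 41.79),
                       departures  = c(500, 1068, 4813, 9425),
                       at.uchicago = c(FALSE, FALSE, FALSE, TRUE))

# Map size to the raw counts, and let scale_size_area make the circle
# area proportional to the count.
p <- ggplot(stations, aes(x    = longitude,
                          y    = latitude,
                          fill = at.uchicago,
                          size = departures)) +
  geom_point(shape = 21, color = "white") +
  scale_size_area(max_size = 8) +
  scale_fill_manual(values = c("darkblue", "red")) +
  theme_minimal()

# Call print(p) to draw the plot.
```

Whether this is preferable is a matter of taste: with counts ranging from 1 to about 90,000, area-proportional circles make the smallest stations very hard to see, which may be one reason to compress the range with a square root.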

Past versions of this figure:

Version  Author            Date
df35db6  Peter Carbonetto  2018-08-24
b32e833  Peter Carbonetto  2018-01-18
f62f674  Peter Carbonetto  2017-07-05

This is the version of R and the packages that were used to generate these results.

sessionInfo()
# R version 3.4.3 (2017-11-30)
# Platform: x86_64-apple-darwin15.6.0 (64-bit)
# Running under: macOS High Sierra 10.13.6
# 
# Matrix products: default
# BLAS: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRblas.0.dylib
# LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib
# 
# locale:
# [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
# 
# attached base packages:
# [1] stats     graphics  grDevices utils     datasets  methods   base     
# 
# other attached packages:
# [1] ggplot2_3.1.0     data.table_1.11.4
# 
# loaded via a namespace (and not attached):
#  [1] Rcpp_1.0.0           knitr_1.20           whisker_0.3-2       
#  [4] magrittr_1.5         workflowr_1.2.0.9000 tidyselect_0.2.5    
#  [7] munsell_0.4.3        colorspace_1.4-0     R6_2.2.2            
# [10] rlang_0.3.1          dplyr_0.8.0.1        stringr_1.3.1       
# [13] plyr_1.8.4           tools_3.4.3          grid_3.4.3          
# [16] gtable_0.2.0         withr_2.1.2          git2r_0.23.3        
# [19] htmltools_0.3.6      assertthat_0.2.0     yaml_2.2.0          
# [22] lazyeval_0.2.1       rprojroot_1.3-2      digest_0.6.17       
# [25] tibble_2.1.1         crayon_1.3.4         purrr_0.2.5         
# [28] fs_1.2.6             glue_1.3.0           evaluate_0.11       
# [31] rmarkdown_1.10       labeling_0.3         stringi_1.2.4       
# [34] pillar_1.3.1         compiler_3.4.3       scales_0.5.0        
# [37] backports_1.1.2      pkgconfig_2.0.2