Last updated: 2019-07-31
Knit directory: wflow-divvy/analysis/
This reproducible R Markdown analysis was created with workflowr (version 1.4.0). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.
Great! Since the R Markdown file has been committed to the Git repository, you know the exact version of the code that produced these results.
Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproducibility, it’s best to always run the code in an empty environment.
The command set.seed(1) was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.
Great job! Recording the operating system, R version, and package versions is critical for reproducibility.
Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.
Great job! Using relative paths to the files within your workflowr project makes it easier to run your code on other machines.
Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility. The version displayed above was the version of the Git repository at the time these results were generated.
Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish or wflow_git_commit). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:
Ignored files:
Ignored: .DS_Store
Ignored: analysis/.DS_Store
Ignored: data/Divvy_Stations_2016_Q1Q2.csv
Ignored: data/Divvy_Stations_2016_Q3.csv
Ignored: data/Divvy_Stations_2016_Q4.csv
Ignored: data/Divvy_Trips_2016_04.csv
Ignored: data/Divvy_Trips_2016_05.csv
Ignored: data/Divvy_Trips_2016_06.csv
Ignored: data/Divvy_Trips_2016_Q1.csv
Ignored: data/Divvy_Trips_2016_Q3.csv
Ignored: data/Divvy_Trips_2016_Q4.csv
Ignored: data/README.txt
Ignored: data/data.tar.gz
Ignored: docs/.DS_Store
Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.
These are the previous versions of the R Markdown and HTML files. If you’ve configured a remote Git repository (see ?wflow_git_remote), click on the hyperlinks in the table below to view them.
File | Version | Author | Date | Message |
---|---|---|---|---|
html | 5357a3b | Peter Carbonetto | 2019-04-10 | Build site. |
Rmd | 61c85b2 | Peter Carbonetto | 2019-04-10 | wflow_publish(c(“seasonal-trends.Rmd”, “station-map.Rmd”, |
html | f4e627f | Peter Carbonetto | 2019-04-10 | Re-built first-glance analysis using workflowr 1.2.0.9000. |
html | 66feda4 | Peter Carbonetto | 2018-05-07 | Adjusted _site,yml slightly. |
html | 39bbd3a | Peter Carbonetto | 2018-04-14 | Re-built first-glance webpage with workflowr v0.11.0.9000. |
Rmd | ea5fb72 | Peter Carbonetto | 2018-04-14 | wflow_publish(“first-glance.Rmd”) |
html | 51163d7 | Peter Carbonetto | 2018-03-12 | Ran wflow_publish(“*.Rmd“) with version v0.11.0 of workflowr. |
html | ab9176e | Peter Carbonetto | 2018-03-09 | Added code_hiding to the analysis R Markdown files. |
html | b32e833 | Peter Carbonetto | 2018-01-18 | Re-built all webpages using workflowr v0.1.0. |
html | 7d0b902 | Peter Carbonetto | 2017-11-16 | Re-built first-glance.html with workflowr v0.8.0. |
Rmd | 1470002 | Peter Carbonetto | 2017-08-02 | Testing wflow_status() bug. |
Rmd | 6b9ddf1 | Peter Carbonetto | 2017-08-02 | Added header with between-section spacing adjustment, and removed <br> tags from R Markdown files. |
html | 727b8d9 | Peter Carbonetto | 2017-07-13 | Re-built all the analysis files; wflow_publish(Sys.glob(“*.Rmd“)). |
Rmd | 6d02ffc | Peter Carbonetto | 2017-07-13 | Made a dozen or so small adjustments to the .Rmd files. |
Rmd | b739bf9 | Peter Carbonetto | 2017-07-12 | Revised text in first-glance.Rmd. |
html | b739bf9 | Peter Carbonetto | 2017-07-12 | Revised text in first-glance.Rmd. |
html | 597355d | Peter Carbonetto | 2017-07-07 | Ran wflow_publish(c(index.Rmd,first-glance.Rmd,station-map.Rmd,time-of-day-trends.Rmd)). |
Rmd | f7da4f6 | Peter Carbonetto | 2017-07-07 | Fixed a broken link, and made a bunch of small revisions to the notebooks. |
html | f62f674 | Peter Carbonetto | 2017-07-05 | Re-built all the files without cached chunks. |
Rmd | 96f2db4 | Peter Carbonetto | 2017-07-05 | wflow_publish(c(“index.Rmd”, “first-glance.Rmd”, “station-map.Rmd”)) |
html | 5a4a3bd | Peter Carbonetto | 2017-07-05 | Another small adjustment to first-glance.Rmd. |
Rmd | 7d1aefc | Peter Carbonetto | 2017-07-05 | wflow_publish(“first-glance.Rmd”) |
html | c8f1418 | Peter Carbonetto | 2017-07-05 | Build site. |
Rmd | 4bb29bd | Peter Carbonetto | 2017-07-05 | Formatting adjustments to first-glance.Rmd. |
Rmd | 09bb3c4 | Peter Carbonetto | 2017-07-05 | A few adjustments to first-glance.Rmd. |
html | db8f335 | Peter Carbonetto | 2017-07-05 | Updated first-look.html. |
Rmd | 5e53297 | Peter Carbonetto | 2017-07-05 | Filled out first-glance.Rmd. |
html | d132d28 | Peter Carbonetto | 2017-07-05 | Re-built first-glance.html. |
Rmd | bbd4aa2 | Peter Carbonetto | 2017-07-05 | Added steps to extract dates and times from character strings in CSV files. |
html | bbd4aa2 | Peter Carbonetto | 2017-07-05 | Added steps to extract dates and times from character strings in CSV files. |
Here, I take a brief look at the data provided by Divvy.
I begin by loading a few packages, as well as some additional functions I wrote for this project.
library(data.table)
source("../code/functions.R")
I wrote a function, read.divvy.data, that reads in the trip and station data from the Divvy CSV files. This function uses fread from the data.table package to quickly read in the data (it is much faster than read.table). This function also prepares the data, including the departure dates and times, so that they are easier to work with.
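As a rough illustration of the kind of preparation described above, here is a minimal sketch. It is not the project's read.divvy.data, whose source lives in code/functions.R and is not reproduced here. The helper name prepare_trips_sketch, the inline sample rows, and the raw timestamp format ("%m/%d/%Y %H:%M") are assumptions made only for this example.

```r
library(data.table)

# Hypothetical helper (NOT the project's read.divvy.data): parse the
# departure timestamps and derive week, weekday and hour columns.
# ASSUMPTION: raw timestamps look like "3/31/2016 23:53".
prepare_trips_sketch <- function (trips) {
  tm <- as.POSIXct(trips$starttime, format = "%m/%d/%Y %H:%M", tz = "UTC")
  trips[, starttime  := tm]                            # parsed date-time
  trips[, start.week := as.integer(format(tm, "%U"))]  # week of the year
  trips[, start.day  := weekdays(tm)]                  # name depends on locale
  trips[, start.hour := as.integer(format(tm, "%H"))]  # hour of day, 0-23
  trips[]
}

# A tiny stand-in for one of the trip CSV files, read with fread.
csv <- c("trip_id,starttime,from_station_name",
         "9080551,3/31/2016 23:53,Ravenswood Ave & Lawrence Ave",
         "9080550,3/31/2016 23:46,Damen Ave & Chicago Ave")
trips <- prepare_trips_sketch(fread(text = csv))
print(trips)
```

Note that := modifies the data table in place, so the sketch adds the new columns without copying the trip data, which matters for a table with millions of rows.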
divvy <- read.divvy.data()
# Reading station data from ../data/Divvy_Stations_2016_Q4.csv.
# Reading trip data from ../data/Divvy_Trips_2016_Q1.csv.
# Reading trip data from ../data/Divvy_Trips_2016_04.csv.
# Reading trip data from ../data/Divvy_Trips_2016_05.csv.
# Reading trip data from ../data/Divvy_Trips_2016_06.csv.
# Reading trip data from ../data/Divvy_Trips_2016_Q3.csv.
# Reading trip data from ../data/Divvy_Trips_2016_Q4.csv.
# Preparing Divvy data for analysis in R.
# Converting dates and times.
We have data on 581 Divvy stations across the city.
nrow(divvy$stations)
# [1] 581
print(head(divvy$stations),row.names = FALSE)
#                       name latitude longitude dpcapacity online_date
#        2112 W Peterson Ave 41.99118 -87.68359         15   5/12/2015
#              63rd St Beach 41.78102 -87.57612         23   4/20/2015
#          900 W Harrison St 41.87468 -87.65002         19    8/6/2013
# Aberdeen St & Jackson Blvd 41.87773 -87.65479         15   6/21/2013
#    Aberdeen St & Monroe St 41.88042 -87.65560         19   6/26/2013
#   Ada St & Washington Blvd 41.88283 -87.66121         15  10/10/2013
We also have information about the more than 3.5 million trips taken on Divvy bikes in 2016.
nrow(divvy$trips)
# [1] 3595383
print(head(divvy$trips),row.names = FALSE)
# trip_id           starttime bikeid tripduration from_station_id
# 9080551 2016-03-31 23:53:00    155          841             344
# 9080550 2016-03-31 23:46:00   4831          649             128
# 9080549 2016-03-31 23:42:00   4232          210             350
# 9080548 2016-03-31 23:37:00   3464         1045             303
# 9080547 2016-03-31 23:33:00   1750          202             334
# 9080546 2016-03-31 23:31:00   4302          638              67
#             from_station_name to_station_id               to_station_name
# Ravenswood Ave & Lawrence Ave           458      Broadway & Thorndale Ave
#       Damen Ave & Chicago Ave           213        Leavitt St & North Ave
#     Ashland Ave & Chicago Ave           210     Ashland Ave & Division St
#       Broadway & Cornelia Ave           458      Broadway & Thorndale Ave
#   Lake Shore Dr & Belmont Ave           329 Lake Shore Dr & Diversey Pkwy
# Sheffield Ave & Fullerton Ave           304       Broadway & Waveland Ave
#   usertype gender birthyear start.week start.day start.hour
# Subscriber   Male      1986         13  Thursday         23
# Subscriber   Male      1980         13  Thursday         23
# Subscriber   Male      1979         13  Thursday         23
# Subscriber   Male      1980         13  Thursday         23
# Subscriber   Male      1969         13  Thursday         23
# Subscriber   Male      1991         13  Thursday         23
Out of all the Divvy stations in Chicago, the one on Navy Pier (near the corner of Streeter and Grand) had by far the most departures in 2016: 90,042, compared with 51,090 at the next-busiest station.
departures <- table(divvy$trips$from_station_name)
as.matrix(head(sort(departures,decreasing = TRUE)))
#                               [,1]
# Streeter Dr & Grand Ave      90042
# Lake Shore Dr & Monroe St    51090
# Theater on the Lake          47927
# Clinton St & Washington Blvd 47125
# Lake Shore Dr & North Blvd   45754
# Clinton St & Madison St      41744
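The same kind of tally can also be written in data.table's own syntax. The sketch below uses a small made-up table in place of divvy$trips; the station names "A St", "B St" and "C St" are placeholders, and it assumes the trips are stored as a data.table, which is plausible given that fread returns one but is not confirmed here.

```r
library(data.table)

# Toy data standing in for divvy$trips (made-up rows, for illustration only).
toy <- data.table(from_station_name = c("A St", "B St", "A St",
                                        "C St", "A St", "B St"))

# Count the rows for each departure station (.N), then sort the
# stations from most to fewest departures.
counts <- toy[, .N, by = from_station_name][order(-N)]
print(counts)
```

Compared with table(), this keeps the result as a data.table with the station name as an ordinary column, which can be convenient for later joins against the station data.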
I would also like to take a closer look at the trip data for the main Divvy station on the University of Chicago campus. Divvy bikes were rented at that location almost 8,000 times in 2016.
sum(divvy$trips$from_station_name == "University Ave & 57th St",na.rm = TRUE)
# [1] 7944
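The na.rm = TRUE argument in the count above guards against missing station names. The short sketch below shows why on a made-up vector; "Clark St" is a placeholder name, and whether the real trip data contain any missing names is not shown here.

```r
# Toy vector of station names with one missing value (illustrative only).
x <- c("University Ave & 57th St", NA, "Clark St", "University Ave & 57th St")

# Comparing NA against a string yields NA, so a plain sum() returns NA.
n_plain <- sum(x == "University Ave & 57th St")

# With na.rm = TRUE, the NA comparisons are dropped before summing.
n_clean <- sum(x == "University Ave & 57th St", na.rm = TRUE)

print(c(plain = n_plain, clean = n_clean))
```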
These are the versions of R and the packages that were used to generate these results.
sessionInfo()
# R version 3.4.3 (2017-11-30)
# Platform: x86_64-apple-darwin15.6.0 (64-bit)
# Running under: macOS High Sierra 10.13.6
#
# Matrix products: default
# BLAS: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRblas.0.dylib
# LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib
#
# locale:
# [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
#
# attached base packages:
# [1] stats graphics grDevices utils datasets methods base
#
# other attached packages:
# [1] data.table_1.11.4
#
# loaded via a namespace (and not attached):
# [1] workflowr_1.4.0 Rcpp_1.0.1 digest_0.6.18
# [4] rprojroot_1.3-2 backports_1.1.2 git2r_0.25.2.9008
# [7] magrittr_1.5 evaluate_0.13 stringi_1.4.3
# [10] fs_1.2.7 whisker_0.3-2 rmarkdown_1.13
# [13] tools_3.4.3 stringr_1.4.0 glue_1.3.1
# [16] xfun_0.7 yaml_2.2.0 compiler_3.4.3
# [19] htmltools_0.3.6 knitr_1.23