Last updated: 2017-07-05

Code version: bbd4aa2

I begin by loading the data.table package, as well as some additional functions I wrote for this project, into the R environment.

library(data.table)
source("../code/functions.R")


Reading the data

I wrote a function, read.divvy.data, that reads the trip and station data from the CSV files downloaded from the Divvy website. It uses fread from the data.table package, which is much faster than read.table, and it prepares the data, notably the dates and times, so that they are easier to work with.
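The actual read.divvy.data function is defined in ../code/functions.R and is not shown here. The following is only a minimal sketch of the same idea: read a CSV file with fread, then convert the start times so that the week, day and hour are easy to extract. The toy file contents, the "%m/%d/%Y %H:%M" date format and the "America/Chicago" time zone are assumptions made for illustration, not details taken from the real function.

```r
library(data.table)

# Write a tiny, made-up trips file so the example is self-contained.
csv.file <- tempfile(fileext = ".csv")
writeLines(c("trip_id,starttime,tripduration,from_station_id",
             "1,3/31/2016 23:53,841,344",
             "2,4/1/2016 08:05,300,35"),
           csv.file)

# fread detects the delimiter and the column types automatically.
trips <- fread(csv.file)

# Convert the start times from character strings to date-time objects.
# The format string and time zone are assumptions about the raw files.
trips[, starttime := as.POSIXct(starttime, format = "%m/%d/%Y %H:%M",
                                tz = "America/Chicago")]

# Derive columns similar to start.week, start.day and start.hour.
trips[, start.week := as.integer(format(starttime, "%U"))]
trips[, start.day  := weekdays(starttime)]
trips[, start.hour := as.integer(format(starttime, "%H"))]

print(trips)
```

Doing the conversion once, at read time, means later summaries (trips per hour, trips per weekday) can group on plain columns instead of re-parsing the date strings each time.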

divvy <- read.divvy.data()
# Reading station data from ../data/Divvy_Stations_2016_Q4.csv.
# Reading trip data from ../data/Divvy_Trips_2016_Q1.csv.
# Reading trip data from ../data/Divvy_Trips_2016_04.csv.
# Reading trip data from ../data/Divvy_Trips_2016_05.csv.
# Reading trip data from ../data/Divvy_Trips_2016_06.csv.
# Reading trip data from ../data/Divvy_Trips_2016_Q3.csv.
# Reading trip data from ../data/Divvy_Trips_2016_Q4.csv.
# Preparing Divvy data for analysis in R.
# Converting dates and times.


A first glance at the Divvy data

nrow(divvy$stations)
# [1] 581
ncol(divvy$stations)
# [1] 5
names(divvy$stations)
# [1] "name"        "latitude"    "longitude"   "dpcapacity"  "online_date"
nrow(divvy$trips)
# [1] 3595383
ncol(divvy$trips)
# [1] 14
names(divvy$trips)
#  [1] "trip_id"           "starttime"         "bikeid"           
#  [4] "tripduration"      "from_station_id"   "from_station_name"
#  [7] "to_station_id"     "to_station_name"   "usertype"         
# [10] "gender"            "birthyear"         "start.week"       
# [13] "start.day"         "start.hour"

  • Number of stations: 581
  • Number of trips in 2016: 3,595,383

Which station(s) had the most activity?
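One possible way to approach this question is to count, for each station, the trips that start there and the trips that end there, and then rank stations by the combined total. The sketch below runs on a small made-up table that borrows the from_station_name and to_station_name columns of divvy$trips; the station names "A", "B" and "C" are placeholders, and defining "activity" as departures plus arrivals is an assumption.

```r
library(data.table)

# Toy stand-in for divvy$trips, with made-up station names.
trips <- data.table(
  from_station_name = c("A", "A", "B", "C", "A"),
  to_station_name   = c("B", "C", "A", "A", "B"))

# Count departures and arrivals separately for each station.
departures <- trips[, .(departures = .N), by = .(station = from_station_name)]
arrivals   <- trips[, .(arrivals = .N), by = .(station = to_station_name)]

# Combine the two counts, treating a missing count as zero.
activity <- merge(departures, arrivals, by = "station", all = TRUE)
activity[is.na(departures), departures := 0L]
activity[is.na(arrivals), arrivals := 0L]
activity[, total := departures + arrivals]

# Order the stations from most to least active.
activity <- activity[order(-total)]
print(activity)
```

To try this on the real data, replace the toy table with divvy$trips; head(activity) would then list the busiest stations.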

Session information

sessionInfo()
# R version 3.3.2 (2016-10-31)
# Platform: x86_64-apple-darwin13.4.0 (64-bit)
# Running under: macOS Sierra 10.12.5
# 
# locale:
# [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
# 
# attached base packages:
# [1] stats     graphics  grDevices utils     datasets  methods   base     
# 
# other attached packages:
# [1] data.table_1.10.4
# 
# loaded via a namespace (and not attached):
#  [1] backports_1.0.5 magrittr_1.5    rprojroot_1.2   tools_3.3.2    
#  [5] htmltools_0.3.6 yaml_2.1.14     Rcpp_0.12.11    stringi_1.1.2  
#  [9] rmarkdown_1.6   knitr_1.16      git2r_0.18.0    stringr_1.2.0  
# [13] digest_0.6.12   evaluate_0.10.1

This R Markdown site was created with workflowr