Last updated: 2022-10-03

Checks: 7 0

Knit directory: Rduinoiot-analysis/

This reproducible R Markdown analysis was created with workflowr (version 1.7.0). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.


Great! Since the R Markdown file has been committed to the Git repository, you know the exact version of the code that produced these results.

Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproduciblity it’s best to always run the code in an empty environment.

The command set.seed(20220923) was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.

Great job! Recording the operating system, R version, and package versions is critical for reproducibility.

Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.

Great job! Using relative paths to the files within your workflowr project makes it easier to run your code on other machines.

Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility.

The results in this page were generated with repository version eab2122. See the Past versions tab to see a history of the changes made to the R Markdown and HTML files.

Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish or wflow_git_commit). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:


Ignored files:
    Ignored:    .Rhistory
    Ignored:    .Rproj.user/

Unstaged changes:
    Modified:   .github/workflows/schedule-commit.yaml

Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.


These are the previous versions of the repository in which changes were made to the R Markdown (analysis/index.Rmd) and HTML (docs/index.html) files. If you’ve configured a remote Git repository (see ?wflow_git_remote), click on the hyperlinks in the table below to view the files as they were in that past version.

File Version Author Date Message
Rmd eab2122 FlavioLeccese92 2022-10-03 wflow_publish(“analysis/index.Rmd”)
Rmd cbc289a FlavioLeccese92 2022-10-03
html cbc289a FlavioLeccese92 2022-10-03
html 1bf0eba FlavioLeccese92 2022-10-02 Build site.
Rmd d00239e FlavioLeccese92 2022-10-02 wflow_publish(“analysis/index.Rmd”)
html c0ec671 FlavioLeccese92 2022-09-30 Build site.
Rmd 98f9bfc FlavioLeccese92 2022-09-30 environment-setup
html 3e4ed7a FlavioLeccese92 2022-09-30 Build site.
Rmd a3cf2de FlavioLeccese92 2022-09-30 wflow_publish(c(“analysis/index.Rmd”, “dashboard/weather-report.Rmd”))
html e397546 FlavioLeccese92 2022-09-30 Build site.
Rmd e86011b FlavioLeccese92 2022-09-30 index schema
html e86011b FlavioLeccese92 2022-09-30 index schema
html f8456f2 FlavioLeccese92 2022-09-27 push
html a729819 FlavioLeccese92 2022-09-25 rearrange site
html 2f3dcbb FlavioLeccese92 2022-09-25 test
html 50dcd8c FlavioLeccese92 2022-09-24 index.html update
html 71b5efd FlavioLeccese92 2022-09-24 Build site.
Rmd e44daf0 FlavioLeccese92 2022-09-24 Publish the initial files for myproject
Rmd e1dee4f FlavioLeccese92 2022-09-24 commit new index
html e1dee4f FlavioLeccese92 2022-09-24 commit new index
Rmd 9a79767 FlavioLeccese92 2022-09-24 merge
Rmd 555d36a FlavioLeccese92 2022-09-24 the first commit
html 555d36a FlavioLeccese92 2022-09-24 the first commit

Motivation

I have always been fascinated by the potential of open-source tools interaction, among which R (long live R!) and lately Arduino.

For those who don’t know, Arduino is an open-source hardware and software company which designs and produces microcontroller kits for the deployment of digital services, both at a professional, hobby and educational level. Furthermore, the community is very active and smart.

Here you can find many projects, including Home Automation, Robotics and even more.

To me as a data scientist with a statistical background, electronics is a black box, for this reason I chose to a 99% plug-and-play solution: Arduino Oplà IoT Kit.

If you want to know more, here’s a video introducing you the kit:

The kit comes with 4 integrated sensor measuring Humidity, Pressure, Temperature and Light.

Starting to collect data from these sensor is very simple: you just need to deploy an appropriate sketch to the mother board, which can be done through the dedicated IDE or a guided procedure.

Once everything is set up, you will have data flowing from sensors to the cloud and visibile via a dashboard hosted on Arduino Cloud, free for 12 months with the kit purchase. Here you can access mine to have an idea.

Data are stored into a cloud database and retrival of data is possible throught an API which can be queried, guess what…

with R!

Goal

The goal of the project is to deploy a dashboard on github which shows Arduino sensor data and it is updated every 15 minutes. Documentation (and this very document you are reading) must automatically versionised at every commit and every version easily accessible. Every software used must be for free.

Project setup

The R ecosystem is constantly growing, adding new amazing productive tools such as Workflowr and Flexdashboard, which perfectly exploit Github actions in order to automatize their scope.

Furthermore, in order to make it easier to access Arduino Iot Cloud API , I developed an R package through pkgdown. The package is called Rduinoiot and can be found on CRAN.

Workflowr

Workflowr is an R package that supports you in creating an organized, reproducible and shareable project on Github or Gitlab.

Practically speaking, when opening RStudio to start a new project, if you have Worflowr installed, you will see a new option. By chosing it, you are creating automatically a git-versioned project with organized subdirectories.

To tell Worflowr which git account to use, run the function:

wflow_git_config(user.name = "Full Name", user.email = "email@domain")

Then you can create rmarkdown analyses and make them accessible by a customizable website hosted for free on Github or Gitlab.

The two type of files relevant for a workflowr site are *.Rmds and _site.yml.

Analysis files *.Rmd

Any type of rmarkdown file can be added to your site. Only restriction, of course, is that your files cannot be shiny markdowns since they need a server to process live user interactions and in our setup we do not have it, so avoid runtime: shiny. rmarkdown files are static html and usually not optimal for reporting analysis of data which require frequent updates. Additionally, for my porpuse an ordinary markdown would not have satisfied my graphical obsession.

For these two reasons, I decided to go with flexdashboard (to obtain a catchy but static report) + github actions (to update the static HTML with new data every 15 minutes). But we will talk about it later in this document.

The most important part of the rmarkdown is the header, which will be automatically generated by workflowr and is customizable:

---
title: "Using R to visualize Arduino-iot weather sensor data"
site: workflowr::wflow_site
output:
  workflowr::wflow_html:
    toc: true
    toc_float: yes
    theme: flatly
    highlight: textmate
    css: style.css
editor_options:
  chunk_output_type: console
---

Analysis files *.Rmd need to be stored under analysis/ folder.

Once you are done with your analysis, you can build your site locally:

wflow_build()

If not specified, workflowr will add to the local site each of your analysis in the analysis/ folder.

Then, if you are ready to put it online, simply choose which analysis you want to publish and run the following:

wflow_publish(c("analysis/index.Rmd", "analysis/license.Rmd"))

This creates a commit and at the next push git will trigger the action to build the site.

The last step to have our site online is to go on our Git (in this case, Github) project, go to Settings > Pages and under Source, select Deploy from branch. Then select main and as a folder /docs and save.

Structure site file _site.yml

The design of the site which will host analysis is defined by the _site.yml.

The most important things you can customize are the theme, the navigation bar, which analysis has to be shown and where (and any HTML file too) and the footer. Here you can find more information.

While I am writing this document, the _site.yml of this site is the following:

name: "Rduinoiot-analysis"
output_dir: ../docs
navbar:
  title: "Rduinoiot-analysis"
  left:
  - text: Home
    href: index.html
  - text: Weather report
    href: weather-report.html
  - text: License
    href: license.html
output:
  workflowr::wflow_html:
    toc: true
    toc_float: yes
    theme: flatly
    highlight: textmate
    css: style.css
editor_options:
  chunk_output_type: console

A style.css file is stored under subdirectory /docs for additional graphical customization.

Flexdashboard

Flexdashboard is an R package which allows to easily develop dashboards as you were writing a simple rmarkdown document.

I confess that I always had prefered Shiny apps for their interactivity, but they require a server and the graphical needs of my app exceeded the RAM constraint of shinyapps.io free plan.

However, even if not real time and without user interaction, refreshing the flexdashboard with new data driven by free github actions did the trick. Furthermore, having analysis together with documentation in a workflowr site seemed pretty cool.

As usual, creating a flexdashboard is straight-forward: if you have the package installed, in Rstudio choose new R Markdown… > From template > Flex dashboard. Again, the header is very important and you will note that, when comparing with the workflowr earlied showed, you will see:

---
...
output: 
  flexdashboard::flex_dashboard:
...
---

You can customize this dashboard seemengly to the common rmarkdown documents. If you want to have an idea, give a look at the file which generates the Weather report analysis from my Arduino’s sensors. The file is stored under subdirectory /dashboard.

Github actions

The last piece we need to get everything working is to automatize our scripts, in order to have the dashboard updated every 15 minutes with fresh new data from Arduino. In order to do that, we use Github actions!

We would want to automatize two jobs:

1. refresh-data.R

As you expect, this R script is in charge of getting new data from Arduino. It reads the old data and starting from the most recent data timestamp loop over sensors until current time. Then it overwrites the old dataset with a new .rds under the subdirectory /data.

Packages used are: Rduinoiot, tibble, dplyr and lubridate.

2. deploy-dashboard.R

This other R script makes sure that a new flexdashboard is created and moved under the subdirectory docs where workflowr site is built. In doing that we are mapping the dashboard right inside the documentation site!

Package used is rmarkdown.

library(rmarkdown)
render("dashboard/weather-report.Rmd")
file.rename("dashboard/weather-report.html", "docs/weather-report.html")

Both files are stored under the subdirectory /jobs.

Future development

Embed a flexdashboard document inside a workflowr analysis. In doing that we would have a versioned dashboard which can be useful in some situation.

3. Scheduler

OK. But how do we automatize R scripts? Once again, we have to define a proper .yml file.

First of all I decided to place all the R scripts I want to schedule under the subdirectory .github/workflows, in case I needed to trigger more actions in the future.

Set up API secrets

An essential and preliminary step is to set up ARDUINO_API_CLIENT_ID and ARDUINO_API_CLIENT_SECRET, so that Rduinoiot package will be able to retrieve the data through the API. Since we want them to be secret, we cannot send them together with the refresh-data.R script.

Github allow us to define environments where the jobs will run and inside these environments we can define the so called secrets. For doing that, go to Settings > Environments > New Environment, chose a name for your environment and click Configure environment. Scrolling down the configuration page you will see:

Then you need to choose a name for your secret and fill in the value. Once you have done, going to Settings > Environments you will see your new environment with 2 secrets.

schedule-commit.yaml

We will now see separately each part of the .yml file but here you can find the full script.

  1. We give a name to the action and we tell github actions when the script needs to run. In this case we scheduled every 15 minutes but also we want it to run every time a new push occurs on main branch, so we will make sure to have a potential quick failure after changes.
name: rduinoiot-jobs
on:
  schedule:
    - cron: "*/15 * * * *"
  push:
    branches:
    - main
  1. We define our jobs. In our case, a single job named refresh-data (with 2 R scripts). We tell Github to run the job on a macos system and on the environment called rduinoiot-jobs-environment on Github where we stored our secrets. Furthermore, under steps we use actions/checkout@master (documented here), r-lib/actions/setup-r@master which installs and sets up R (here) and r-lib/actions/setup-pandoc@v2 which installs pandoc, needed to knit rmarkdown (here).
jobs:
  refresh-data:
# The type of runner that the job will run on
    runs-on: macos-latest
    environment: rduinoiot-jobs-environment
# Load repo and install R
    steps:
    - uses: actions/checkout@master
    - uses: r-lib/actions/setup-r@master
    - uses: r-lib/actions/setup-pandoc@v2
  1. Now we need to import ARDUINO_API_CLIENT_ID and ARDUINO_API_CLIENT_SECRET secrets from rduinoiot-jobs-environment into .Renviron where R will be able to retrieve them, allowing to use Rduinoiot’s functions. To check if everything is going well, we print the content of .Renviron.
# Load API key into Renviron
    - name: Load API key
      env:
        ARDUINO_API_CLIENT_ID: '${{ secrets.ARDUINO_API_CLIENT_ID }}'
        ARDUINO_API_CLIENT_SECRET: '${{ secrets.ARDUINO_API_CLIENT_SECRET }}'
      run: |
        touch .Renviron
        echo ARDUINO_API_CLIENT_ID="$ARDUINO_API_CLIENT_ID" >> .Renviron
        echo ARDUINO_API_CLIENT_SECRET="$ARDUINO_API_CLIENT_SECRET" >> .Renviron
        echo "cat .Renviron"
        cat .Renviron
      shell: bash
  1. Almost there. We install packages required by the two R scripts and run them!
# Install R packages
    - name: Install packages
      run: |
        install.packages(c("Rduinoiot", "dplyr", "tibble","lubridate"), repos = "https://cloud.r-project.org")
        install.packages(c("rmarkdown", "flexdashboard", "echarts4r", "htmltools"), repos = "https://cloud.r-project.org")
      shell: Rscript {0}
# Run R refresh-data
    - name: Run refresh-data
      run: Rscript jobs/refresh-data.R
# Run R deploy-dashboard
    - name: Run deploy-dashboard
      run: Rscript jobs/deploy-dashboard.R
  1. Finally we commit all the new files created with this process.
# Add new files in data folder, commit along with other modified files, push
    - name: Commit files
      run: |
        git config --local user.name actions-user
        git config --local user.email "actions@github.com"
        git add --force data/*
        git commit -am "GH ACTION Headlines $(date)"
        git push origin main
      env:
        REPO_KEY: ${{secrets.GITHUB_TOKEN}}
        username: github-actions
…and we R done !

Conclusions

We learned how to exploit the combined power of Workflowr, Flexdashboard and Github actions in order to deploy on Github a live dashboard for free, showing Arduino Oplà Iot kit’s sensor values. We used Rduinoiot to easily retrieve data. The dashboard is available Rduinoiot or clicking on the tab of this site Weather report.

Future development

Having real time weather data may be not that useful by itself. For this reason I am planning to dig into the prediction mode! The idea is to exploit weather stations placed nearby my house by Arpae, the Regional agency for prevention, environment and energy of Emilia-Romagna (Italy). Since weather stations provide real time data but, disgracefully, not in a real API mode, I need to derive some workaround. In fact, Arpae allows user to get data through a web service which, when queried data of interest, sends the data to an email.

The idea is to automatize this procedure by sending the request and reading the data from email’s attachment. The R will be scheduled as an additional job called by Guthub action and I will be able to attempt weather predictions 🚀


sessionInfo()
R version 4.1.0 (2021-05-18)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19044)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252 
[2] LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] workflowr_1.7.0

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.9       compiler_4.1.0   pillar_1.8.1     bslib_0.4.0     
 [5] later_1.3.0      git2r_0.30.1     jquerylib_0.1.4  tools_4.1.0     
 [9] getPass_0.2-2    digest_0.6.29    jsonlite_1.8.0   evaluate_0.15   
[13] tibble_3.1.8     lifecycle_1.0.2  pkgconfig_2.0.3  rlang_1.0.5     
[17] cli_3.4.0        rstudioapi_0.13  yaml_2.3.5       xfun_0.30       
[21] fastmap_1.1.0    httr_1.4.4       stringr_1.4.0    knitr_1.37      
[25] fs_1.5.2         vctrs_0.4.1      sass_0.4.2       rprojroot_2.0.2 
[29] glue_1.6.2       R6_2.5.1         processx_3.5.2   fansi_1.0.3     
[33] rmarkdown_2.13   callr_3.7.0      magrittr_2.0.3   whisker_0.4     
[37] ps_1.6.0         promises_1.2.0.1 htmltools_0.5.3  httpuv_1.6.6    
[41] utf8_1.2.2       stringi_1.7.6    cachem_1.0.6