1 Setup

# Install packages 
if (!require("pacman")) install.packages("pacman")

## Loading required package: pacman

pacman::p_load(tidyverse, # tidyverse pkgs including purrr
               tictoc, # performance test 
               furrr) # parallel processing

2 Objectives

Learn how to use slowly() and the future_*() functions from furrr (for example, future_map()) to make an automated process run slower or faster.

3 How to Make Automation Slower

Suppose you are scraping 50 pages from a website and you don’t want to overload the server. How can you do that?

4 For loop
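
A minimal sketch of the for-loop approach, using only base R. The `scrape_page()` function is a hypothetical stand-in for a real request, and the delay value is an assumption chosen to keep the example quick; a real scraper would likely pause longer.

```r
# Hypothetical stand-in for a real scraping function: it only reports
# which page it would fetch and returns the page number.
scrape_page <- function(page) {
  message("Scraping page ", page)
  page
}

pages <- 1:3   # use 1:50 for the 50-page scenario
delay <- 0.1   # seconds to wait between requests (assumed value)

# Pre-allocate the list that will hold one result per page
results <- vector("list", length(pages))

for (i in seq_along(pages)) {
  results[[i]] <- scrape_page(pages[i])
  Sys.sleep(delay)  # pause so the server is not flooded with requests
}
```

Sys.sleep() simply suspends execution, so the pause has to be written into every loop that calls the function.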

5 Map

walk() works the same way as map() but is called for its side effects: it returns its input invisibly instead of storing the function’s output.
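
The difference can be sketched as follows. `scrape_page()` is again a hypothetical placeholder, not a function from any package.

```r
library(purrr)

# Hypothetical stand-in for a real scraping function.
scrape_page <- function(page) {
  message("Scraping page ", page)
  page
}

# map() collects the return values in a list
out_map <- map(1:3, scrape_page)

# walk() calls the function for its side effect (here, the message)
# and invisibly returns its input rather than the results
out_walk <- walk(1:3, scrape_page)
```
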
If you’re web scraping, one problem with this approach is that it sends requests far faster than a human visitor would.
If you want to make the function run slowly …

slowly() takes a function and modifies it to wait a given amount of time between each call. - purrr package vignette

If a function is a verb, then a helper function is an adverb (modifying the behavior of the verb).
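
As a rough illustration of the adverb idea, the sketch below wraps a hypothetical `scrape_page()` function with slowly(). The pause length passed to rate_delay() is an assumed value, kept short so the example runs quickly.

```r
library(purrr)

# Hypothetical stand-in for a real scraping function.
scrape_page <- function(page) {
  message("Scraping page ", page)
  page
}

# slowly() returns a modified copy of scrape_page() that waits between calls.
# rate_delay(pause = 0.1) is an assumed setting: 0.1 seconds between calls.
slow_scrape <- slowly(scrape_page, rate = rate_delay(pause = 0.1))

# Time three calls to see the effect of the added pauses
start <- Sys.time()
walk(1:3, slow_scrape)
elapsed <- as.numeric(difftime(Sys.time(), start, units = "secs"))
elapsed
```

The original function is left untouched, so the same `scrape_page()` can still be used at full speed elsewhere.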

6 How to Make Automation Faster

In other situations, you want your function to run faster. This is common when you collect and analyze data at a large scale. You can solve this problem with parallel processing. For more on parallel processing in R, read this review.

Parallel processing setup
- Step 1: Determine the maximum number of workers (availableCores())
- Step 2: Determine the parallel processing mode (plan())

# Setup 
n_cores <- availableCores() - 1
n_cores # This number depends on your computer spec.

## system 
##      7

# `multiprocess` is deprecated in recent versions of future. `multisession`
# (background R sessions) works on every platform, including inside RStudio,
# where forked processing (`multicore`) is disabled because it is
# considered unstable. See ?future::supportsMulticore for details.
plan(multisession,
     workers = n_cores) # the maximum number of workers

# Sequential with purrr::map(); an earlier run took 4.931 sec
tic()
mean100 <- map(1:1000000, mean)
toc()

## 4.869 sec elapsed

# Parallel with furrr::future_map(); an earlier run took 2.536 sec
tic()
mean100 <- future_map(1:1000000, mean)
toc()

## 3.487 sec elapsed

Make Automation Slower or Faster

Jae Yeon Kim

2020-09-18