Last updated: 2020-10-18

Inline csv file

read_csv("a,b,c\n1,2,3\n4,5,6")
# A tibble: 2 x 3
      a     b     c
  <dbl> <dbl> <dbl>
1     1     2     3
2     4     5     6

Skip some lines

  • metadata
  • commented lines that you don’t want to read
read_csv("The first line of metadata
  The second line of metadata
  x,y,z
  1,2,3", skip = 2)
# A tibble: 1 x 3
      x     y     z
  <dbl> <dbl> <dbl>
1     1     2     3
read_csv("# A comment I want to skip
  x,y,z
  1,2,3", comment = "#")
# A tibble: 1 x 3
      x     y     z
  <dbl> <dbl> <dbl>
1     1     2     3

No column names in data

read_csv("1,2,3\n4,5,6", # \n adds a new line
         col_names = FALSE) # columns are labelled sequentially from X1 to Xn
# A tibble: 2 x 3
     X1    X2    X3
  <dbl> <dbl> <dbl>
1     1     2     3
2     4     5     6
read_csv("1,2,3\n4,5,6",
         col_names = c("x", "y", "z")) # columns are named as provided here
# A tibble: 2 x 3
      x     y     z
  <dbl> <dbl> <dbl>
1     1     2     3
2     4     5     6

NA values

read_csv("a,b,c,d\nnull,1,2,.",
         na = c(".", "null"))
# A tibble: 1 x 4
  a         b     c d    
  <lgl> <dbl> <dbl> <lgl>
1 NA        1     2 NA   
# here we specify that the . and null
# must be considered to be missing values


  1. What function would you use to read a file where fields were separated with "|"?

    Use read_delim() and set the delim argument.


    # from the ?read_delim help page
    read_delim("a|b\n1.0|2.0", delim = "|")
    # A tibble: 1 x 2
          a     b
      <dbl> <dbl>
    1     1     2
  2. Apart from file, skip, and comment, what other arguments do read_csv() and read_tsv() have in common?

    All arguments are common across the two functions.

    • col_names
    • col_types
    • locale
    • na
    • quoted_na
    • quote
    • trim_ws
    • n_max
    • guess_max
    • progress
    • skip_empty_rows
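
    As a rough illustration that these shared arguments behave the same way in both functions, the sketch below passes identical na, n_max and col_types values to read_csv() and read_tsv(). The inline data is made up for the example.

    ```r
    library(readr)

    # Made-up inline data; only the delimiter differs between the two strings.
    csv <- read_csv("x,y\n1,.\n2,3\n", na = ".", n_max = 1, col_types = "dd")
    tsv <- read_tsv("x\ty\n1\t.\n2\t3\n", na = ".", n_max = 1, col_types = "dd")

    # Both keep only the first data row and turn "." into NA.
    csv
    tsv
    ```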
  3. What are the most important arguments to read_fwf()?

    • file to read
    • col_positions, as created by fwf_empty(), fwf_widths() or fwf_positions(), which tell the function where each column starts and ends.
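
    A minimal sketch of these two arguments working together, assuming a made-up fixed-width layout (5 characters for a name, 3 for an age, 2 for a state code). The data and column names are hypothetical and chosen only for illustration.

    ```r
    library(readr)

    # Hypothetical fixed-width records: name (5 chars), age (3 chars), state (2 chars).
    # The string contains newlines, so readr treats it as literal data, not a path.
    fwf_text <- "Alice 34CA\nBob   27NY\n"

    people <- read_fwf(
      fwf_text,
      # fwf_widths() turns field widths into start and end positions.
      col_positions = fwf_widths(c(5, 3, 2), col_names = c("name", "age", "state"))
    )
    people
    ```

    fwf_positions() could be used instead when the start and end positions are known directly, and fwf_empty() when the columns are separated by blank space and the widths should be guessed.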
  4. Sometimes strings in a CSV file contain commas. To prevent them from causing problems they need to be surrounded by a quoting character, like " or '. By default, read_csv() assumes that the quoting character will be ". What argument to read_csv() do you need to specify to read the following text into a data frame?

    "x,y\n1,'a,b'"

    Specify the quote argument.

    read_csv("x,y\n1,'a,b'", quote = "'")
    # A tibble: 1 x 2
          x y    
      <dbl> <chr>
    1     1 a,b  
  5. Identify what is wrong with each of the following inline CSV files. What happens when you run the code?

    read_csv("a,b\n1,2,3\n4,5,6")
    Only 2 column names are given, but each row has 3 values.

    read_csv("a,b,c\n1,2\n1,2,3,4")
    3 column names are given, but the first row has too few values and the second has too many.

    read_csv("a,b\n\"1")
    2 column names are given but only one value, and the closing " is missing.

    read_csv("a,b\n1,2\na,b")
    Nothing is syntactically wrong, but the second row repeats the column headings, so both columns are read as character.

    read_csv("a;b\n1;3")
    The fields are separated by ;, so read_csv2(), which reads ; as the delimiter, should have been used.

    They all run, but most have warnings, and some are not imported as expected.

    read_csv("a,b\n1,2,3\n4,5,6") # only 2 column names, but 3 values per row
    Warning: 2 parsing failures.
    row col  expected    actual         file
      1  -- 2 columns 3 columns literal data
      2  -- 2 columns 3 columns literal data
    # A tibble: 2 x 2
          a     b
      <dbl> <dbl>
    1     1     2
    2     4     5
    read_csv("a,b,c\n1,2\n1,2,3,4")
    Warning: 2 parsing failures.
    row col  expected    actual         file
      1  -- 3 columns 2 columns literal data
      2  -- 3 columns 4 columns literal data
    # A tibble: 2 x 3
          a     b     c
      <dbl> <dbl> <dbl>
    1     1     2    NA
    2     1     2     3
    read_csv("a,b\n\"1")
    Warning: 2 parsing failures.
    row col                     expected    actual         file
      1  a  closing quote at end of file           literal data
      1  -- 2 columns                    1 columns literal data
    # A tibble: 1 x 2
          a b    
      <dbl> <chr>
    1     1 <NA> 
    read_csv("a,b\n1,2\na,b")
    # A tibble: 2 x 2
      a     b    
      <chr> <chr>
    1 1     2    
    2 a     b    
    read_csv("a;b\n1;3")
    # A tibble: 1 x 1
      `a;b`
      <chr>
    1 1;3  
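
    For comparison, a short sketch of the semicolon-separated case read with read_csv2(), which splits fields on ; and should give two numeric columns instead of one character column. The variable name semi is only illustrative.

    ```r
    library(readr)

    # read_csv2() uses ";" as the field delimiter and "," as the decimal mark.
    semi <- read_csv2("a;b\n1;3")
    semi
    ```

    read_delim() with delim = ";" is an alternative when the decimal mark should stay as a period.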

