Import

In this part of the book, you’ll learn how to import a wider range of data into R, as well as how to get it into a form useful for analysis. Sometimes this is just a matter of calling a function from the appropriate data import package. But in more complex cases it might require both tidying and transformation in order to get to the tidy rectangle that you’d prefer to work with.

Our data science model with import highlighted in blue.
Figure 1: Data import is the beginning of the data science process; without data you can’t do data science!

In this part of the book you’ll learn how to access data stored in the following ways:

There are two important tidyverse packages that we don’t discuss here: haven and xml2. If you’re working with data from SPSS, Stata, and SAS files, check out the haven package, https://haven.tidyverse.org. If you’re working with XML data, check out the xml2 package, https://xml2.r-lib.org. Otherwise, you’ll need to do some research to figure which package you’ll need to use; google is your friend here 😃.