We will cover three main ways to import data:
- Built-in Datasets
- Local Datasets (downloaded on your computer)
- Online Datasets
Below are examples of each of these methods.
Built-in Datasets
The `iris` dataset is commonly used and built into R; no additional downloads are necessary.
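As a minimal sketch, a built-in dataset can be used directly by its name, with no import step. The `data()` call here is optional for `iris` and is included only to make the loading explicit.

```r
# iris ships with R (in the datasets package), so no download is needed
data(iris)      # optional: explicitly load the built-in dataset

dim(iris)       # number of rows and columns
names(iris)     # column names
```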
Reading Data from Local CSV File
Option 2: Copy File Path
The readr package is the first package that we will learn about. It allows us to read in rectangular data such as CSV files.
This is the typical approach when your data is stored on your computer: copy the path of the file and read the data in from that path.
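The sketch below shows one way this could look, assuming the readr package is installed. The file name `my_data.csv` and its contents are hypothetical; the file is written to a temporary folder only so that the example runs anywhere. In practice, `file_path` would hold the path you copied from your own computer.

```r
library(readr)

# Hypothetical example file: create a small CSV in a temporary folder so the
# sketch is self-contained. Normally the file already exists on your computer.
file_path <- file.path(tempdir(), "my_data.csv")
writeLines(c("name,score", "Ana,90", "Ben,85"), file_path)

# Read the CSV from its file path (swap in your own copied path here)
my_data <- read_csv(file_path, show_col_types = FALSE)
my_data
```

On Windows, a copied path uses backslashes, which R treats as escape characters inside a string; change them to forward slashes or double them before running the code.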
Option 3: URL Method
Often you get your data from an online source like GitHub or Kaggle. While you can download the file and then read it locally from your computer, as we just did above, you can also read the data in directly from a URL.
One benefit is that no copy of the file needs to be stored on your computer. Another is that you can share your analysis with the data being read in from a URL, and others will be able to run the script without needing their own copy of the data file.
Below is an example.
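A minimal sketch of the URL method, assuming readr is installed. The URL is a placeholder and `read_from_url()` is a hypothetical helper written for this illustration; neither comes from the original lesson. Replace the URL with the address of a real CSV file, such as the raw view of a file on GitHub.

```r
library(readr)

# Placeholder URL (an assumption for illustration only)
data_url <- "https://example.com/path/to/data.csv"

# Hypothetical helper: read a CSV straight from a web address.
# read_csv() accepts a URL in place of a local file path.
read_from_url <- function(url) {
  read_csv(url, show_col_types = FALSE)
}

# Needs an internet connection and a real CSV file at the URL:
# online_data <- read_from_url(data_url)
```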
How to Look at the Data?
- `view()`: opens the data in a new tab, in a spreadsheet style like Excel
- `head()`: returns the top six rows of a dataset by default, or the "head" of the data
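Both commands can be tried on a built-in dataset, as in the sketch below. It uses base R's `View()` (capital V) so that no extra package is needed; the lowercase `view()` is provided by the tibble package, which is loaded with the tidyverse. The viewer call is wrapped in `interactive()` because it only makes sense in a session such as RStudio.

```r
# Open the data in a spreadsheet-style viewer (interactive sessions only)
if (interactive()) {
  View(iris)
}

# First six rows, which is the default
first_rows <- head(iris)
first_rows

# head() also takes an optional n argument to change how many rows are returned
head(iris, n = 3)
```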
Practices
First, let’s make a folder on your computer to use for bootcamp. I suggest putting it in your downloads folder or your documents folder and calling it “R Bootcamp”.
Next, click the “Session” button at the top of RStudio, then “Set Working Directory”, then “Choose Directory” and use your computer’s file directory to navigate to the folder you made for bootcamp.
This sets your “working directory”, the folder where R looks for files by default, so you can refer to files in it by name instead of by their full path.
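The same steps can also be done in code with `setwd()` and checked with `getwd()`. The sketch below is an illustration under assumptions: it creates a hypothetical "R Bootcamp" folder inside a temporary location so that it runs anywhere, whereas in practice you would pass the path of the folder you made above.

```r
# Remember the current working directory so it can be restored afterwards
old_dir <- getwd()

# Hypothetical folder for this sketch, created in a temporary location;
# in practice this would be your own "R Bootcamp" folder
bootcamp_dir <- file.path(tempdir(), "R Bootcamp")
dir.create(bootcamp_dir, showWarnings = FALSE)

# Code equivalent of Session > Set Working Directory > Choose Directory
setwd(bootcamp_dir)
getwd()    # confirm where R now looks for files

# Files in the working directory can be referenced by name alone
writeLines(c("x,y", "1,2"), "example.csv")
file_found <- file.exists("example.csv")

# Restore the original working directory
setwd(old_dir)
```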
Practice 1: 📃 Insurance 📑
Instructions:
- Click the link to download the insurance dataset.
- Use the two commands learned above, `view()` and `head()`, to look at the data.
Starter Code:
age | sex | bmi | children | smoker | region | charges |
---|---|---|---|---|---|---|
19 | female | 27.900 | 0 | yes | southwest | 16884.924 |
18 | male | 33.770 | 1 | no | southeast | 1725.552 |
28 | male | 33.000 | 3 | no | southeast | 4449.462 |
33 | male | 22.705 | 0 | no | northwest | 21984.471 |
32 | male | 28.880 | 0 | no | northwest | 3866.855 |
31 | female | 25.740 | 0 | no | southeast | 3756.622 |
Answer:
Practice 2: 🚗 Honda Cars 🚙
Instructions:
- Open this website in a new tab and copy its URL, or right-click the link and copy the URL.
- Use the two commands learned above, `view()` and `head()`, to look at the data.
Model | Trim | Year | Mileage | Price |
---|---|---|---|---|
Civic | LX | 2013 | 52911 | 9900 |
Civic | LX | 2017 | 27966 | 15495 |
Civic | LX | 2016 | 21639 | 14995 |
Civic | LX | 2017 | 21893 | 15595 |
Civic | LX | 2015 | 39868 | 11498 |
Civic | LX | 2015 | 26416 | 14000 |
Starter Code:
Answer:
Practice 3: 🦷 ToothGrowth 😁
Instructions:
- Look at the `ToothGrowth` dataset (this dataset is built into R)
len | supp | dose |
---|---|---|
4.2 | VC | 0.5 |
11.5 | VC | 0.5 |
7.3 | VC | 0.5 |
5.8 | VC | 0.5 |
6.4 | VC | 0.5 |
10.0 | VC | 0.5 |
Answer:
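One possible answer, written as a sketch since the original answer code is not shown here. Because `ToothGrowth` is built into R, it can be used by name without importing anything; base R's `View()` stands in for `view()` so that no package is required.

```r
# ToothGrowth is built into R, so it can be used directly by name
if (interactive()) {
  View(ToothGrowth)    # spreadsheet-style view (interactive sessions only)
}

head(ToothGrowth)      # first six rows
```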
Practice 4: 📚 Student Performance Dataset 🧠
Instructions:
- Click the link to download the student performance dataset.
- Use the two commands learned above, `view()` and `head()`, to look at the data.
Starter Code:
gender | race/ethnicity | parental level of education | lunch | test preparation course | math score | reading score | writing score |
---|---|---|---|---|---|---|---|
female | group B | bachelor’s degree | standard | none | 72 | 72 | 74 |
female | group C | some college | standard | completed | 69 | 90 | 88 |
female | group B | master’s degree | standard | none | 90 | 95 | 93 |
male | group A | associate’s degree | free/reduced | none | 47 | 57 | 44 |
male | group C | some college | standard | none | 76 | 78 | 75 |
female | group B | associate’s degree | standard | none | 71 | 83 | 78 |
Answer:
Practice 5: 🏡 Rent Prices 🏘️
Instructions:
- Open this website and copy the URL.
- Use the two commands learned above, `view()` and `head()`, to look at the data.
Answer:
Practice 6: 🌺 Iris 🌻
Instructions:
- Look at the `iris` dataset (this dataset is built into R)
Sepal.Length | Sepal.Width | Petal.Length | Petal.Width | Species |
---|---|---|---|---|
5.1 | 3.5 | 1.4 | 0.2 | setosa |
4.9 | 3.0 | 1.4 | 0.2 | setosa |
4.7 | 3.2 | 1.3 | 0.2 | setosa |
4.6 | 3.1 | 1.5 | 0.2 | setosa |
5.0 | 3.6 | 1.4 | 0.2 | setosa |
5.4 | 3.9 | 1.7 | 0.4 | setosa |
Answer: