We've come to the last chapter in Part III: Transform of our textbook.
Chapter 19, Joins, is a significant chapter, and joins are a relatively
abstract topic. This is the last major topic that we will cover, and it will take
time and practice to learn to use joins effectively.
When you are sifting through data, considering questions and looking for
connections, you cannot expect to find all of the data you need in one data frame.
You will likely need to combine two or more data frames to produce an
analysis, and that process is called joining.
Data frames are joined with keys. A primary key is a variable,
or set of variables, in a data frame that uniquely identifies each observation. A
foreign key is a variable, or set of variables, in a data frame that
corresponds to a primary key in another data frame.
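One way to check whether a variable really is a primary key is to count the rows for each key value and look for any count above one. The sketch below uses two small made-up data frames, students and grades, which are illustration only and not part of the textbook or the homework. It assumes dplyr is installed and R 4.1 or later for the |> pipe.

```r
library(dplyr)

# Hypothetical toy data: `students` should be keyed by `student_id`;
# `grades` refers back to it through the foreign key `student_id`.
students <- tibble(
  student_id = c(1, 2, 3),
  name       = c("Ana", "Ben", "Caro")
)
grades <- tibble(
  student_id = c(1, 1, 2, 4),
  assignment = c("hw1", "hw2", "hw1", "hw1"),
  score      = c(90, 85, 70, 60)
)

# A primary key identifies each row exactly once, so counting by the key
# and keeping counts above one should return zero rows.
duplicate_keys <- students |>
  count(student_id) |>
  filter(n > 1)

# In `grades`, student_id alone repeats (student 1 has two rows),
# so it is not a primary key there ...
grades_student_dupes <- grades |>
  count(student_id) |>
  filter(n > 1)

# ... but the pair (student_id, assignment) does identify each row.
grades_compound_dupes <- grades |>
  count(student_id, assignment) |>
  filter(n > 1)
```

Here duplicate_keys and grades_compound_dupes come back empty, while grades_student_dupes has one row, which shows why a primary key sometimes needs more than one variable.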
dplyr provides six join functions, which can be divided into two categories:
Mutating joins combine variables from two data frames by matching
observations on their keys: left_join(), inner_join(),
right_join(), and full_join().
Filtering joins use the keys to keep or exclude observations in one data frame
based on whether they have a match in the other: semi_join() and anti_join().
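To make the two categories concrete, here is a small sketch that runs the four mutating joins and dplyr's two filtering joins, semi_join() and anti_join(), on made-up data. The students and grades data frames are hypothetical and exist only for illustration. The row counts in the comments follow from this toy data.

```r
library(dplyr)

# Hypothetical toy data: student 3 has no grades, and the grade for
# student 4 has no matching row in `students`.
students <- tibble(
  student_id = c(1, 2, 3),
  name       = c("Ana", "Ben", "Caro")
)
grades <- tibble(
  student_id = c(1, 1, 2, 4),
  score      = c(90, 85, 70, 60)
)

# Mutating joins add the columns of `grades` to `students`.
left  <- left_join(students, grades, by = "student_id")   # keeps every student: 4 rows
inner <- inner_join(students, grades, by = "student_id")  # matched rows only: 3 rows
right <- right_join(students, grades, by = "student_id")  # keeps every grade: 4 rows
full  <- full_join(students, grades, by = "student_id")   # keeps everything: 5 rows

# Filtering joins never add columns; they only keep or drop rows of `students`.
with_grades    <- semi_join(students, grades, by = "student_id")  # students 1 and 2
without_grades <- anti_join(students, grades, by = "student_id")  # student 3
```

Notice that left has more rows than students because student 1 matches two rows in grades, while with_grades keeps student 1 only once and has the same columns as students.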
Assessment deadlines will be 11:59pm each Saturday.
All assessments are submitted to the Homework Folder inside your assigned
Google Drive folder.
There are no make-ups for missed assessments. Contact me before a deadline
if you have an issue meeting the deadline and we will find a mutually
agreeable solution.
Homework
Homework 12 (due Saturday, April 5)
The instructions for your homework are contained in the R script file homework_12.R.