21.3 Join tables
We have been working (mostly) with two objects so far:
- geneexp that contains gene expression information
- gtf that contains gene annotation information
How can we merge all the information into a single data frame?
dplyr provides an easy way to join two data frames based on one or more columns containing common identifiers, so that the relevant information can be merged together.
Relevant functions are the following:
- left_join: keeps all observations from the first table (x)
- right_join: keeps all observations from the second table (y)
- inner_join: keeps the intersection of observations (only those present in both tables)
- full_join: keeps the union of observations (this is dplyr's name for an outer join)
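To make the difference between the four joins concrete, here is a minimal sketch on two tiny made-up tables. The objects expr_toy and annot_toy, their column names, and their values are hypothetical stand-ins invented for illustration; they are not the course's geneexp and gtf objects.

```r
library(dplyr)

# Hypothetical toy tables sharing a "gene" identifier column
expr_toy <- data.frame(
  gene = c("geneA", "geneB", "geneC", "geneD"),
  expression = c(10.2, 0.5, 33.1, 7.8)
)
annot_toy <- data.frame(
  gene = c("geneB", "geneC", "geneE"),
  chr = c("chr1", "chr2", "chrX")
)

# Number of rows kept by each join
n_left  <- nrow(left_join(expr_toy, annot_toy, by = "gene"))   # all 4 rows of expr_toy
n_right <- nrow(right_join(expr_toy, annot_toy, by = "gene"))  # all 3 rows of annot_toy
n_inner <- nrow(inner_join(expr_toy, annot_toy, by = "gene"))  # geneB and geneC only
n_full  <- nrow(full_join(expr_toy, annot_toy, by = "gene"))   # geneA to geneE

c(left = n_left, right = n_right, inner = n_inner, full = n_full)
```

Rows without a match in the other table are kept by left_join, right_join and full_join, with NA filled in for the missing columns; inner_join drops them.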
Let’s try all four of them on geneexp and gtf and check how many genes (rows) are left in each case (with nrow):
## [1] 420
## [1] 50
## [1] 49
## [1] 421
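As a quick sanity check, and assuming that the four numbers above are printed in the order left, right, inner, full and that each identifier appears only once in each table, the union should equal the sum of the two tables minus their overlap. The variable names below are introduced only for this check.

```r
# Row counts copied from the output above (order assumed: left, right, inner, full)
n_left  <- 420
n_right <- 50
n_inner <- 49
n_full  <- 421

# Union = first table + second table - intersection
counts_consistent <- (n_left + n_right - n_inner) == n_full
counts_consistent
```

If identifiers were duplicated in either table, the joins could return extra rows and this relation would no longer be guaranteed to hold.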