13.6 Exercise 8: Regular expressions
Create the script “exercise8.R” and save it to the “Rcourse/Module2” directory: you will save all the commands of exercise 8 in that script.
Remember you can comment the code using #.
1- Play with grep
- Create the following data frame:
df2 <- data.frame(age=c(32, 45, 12, 67, 40, 27),
                  citizenship=c("England", "India", "Spain", "Brasil", "Tunisia", "Poland"),
                  row.names=paste(rep(c("Patient", "Doctor"), c(4, 2)), 1:6, sep=""),
                  stringsAsFactors=FALSE)
Using grep on the row names, create a smaller data frame df3 that contains only the Patient rows and NOT the Doctor rows.
correction
## [1] "Patient1" "Patient2" "Patient3" "Patient4" "Doctor5" "Doctor6"
## [1] 1 2 3 4
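One possible solution, consistent with the output shown above, is sketched below. The variable name patient_idx and the anchored pattern "^Patient" are illustrative choices, not necessarily those of the original correction; an unanchored "Patient" would select the same rows here.

```r
# Rebuild the data frame from the exercise
df2 <- data.frame(age = c(32, 45, 12, 67, 40, 27),
                  citizenship = c("England", "India", "Spain", "Brasil", "Tunisia", "Poland"),
                  row.names = paste(rep(c("Patient", "Doctor"), c(4, 2)), 1:6, sep = ""),
                  stringsAsFactors = FALSE)

# The Patient/Doctor label is stored in the row names
rownames(df2)

# grep returns the indices of the row names that start with "Patient"
# (patient_idx is a hypothetical name chosen for this sketch)
patient_idx <- grep("^Patient", rownames(df2))
patient_idx

# Keep only those rows; all columns are retained
df3 <- df2[patient_idx, ]
df3
```

Because grep returns integer positions by default, the result can be used directly as a row index. Calling grep with value=TRUE would instead return the matching row names, which also work for subsetting a data frame.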
2- Play with gsub
Build this vector of file names:
vector1 <- c("L2_sample1_GTAGCG.fastq.gz", "L1_sample2_ATTGCC.fastq.gz",
             "L1_sample3_TGTTAC.fastq.gz", "L4_sample4_ATGGTA.fastq.gz")
Use gsub and an appropriate regular expression to strip everything except the sample name from each element of vector1, so that only “sample1”, “sample2”, “sample3” and “sample4” remain.
correction
## [1] "sample1" "sample2" "sample3" "sample4"
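A minimal sketch of one way to obtain this result is shown below. It assumes that every file name follows the layout seen in vector1 (a lane prefix such as "L2_", then the sample name, then an A/C/G/T barcode and ".fastq.gz"). The names samples and samples_alt are hypothetical, and the original correction may use a different pattern.

```r
vector1 <- c("L2_sample1_GTAGCG.fastq.gz", "L1_sample2_ATTGCC.fastq.gz",
             "L1_sample3_TGTTAC.fastq.gz", "L4_sample4_ATGGTA.fastq.gz")

# Two alternatives separated by "|":
#   ^L[0-9]+_                 the lane prefix at the start of the string
#   _[ACGT]+\\.fastq\\.gz$    the barcode and file extension at the end
# Every match is replaced with the empty string.
samples <- gsub("^L[0-9]+_|_[ACGT]+\\.fastq\\.gz$", "", vector1)
samples

# Alternative: capture the sample name and replace the whole string with it
samples_alt <- gsub("^.*_(sample[0-9]+)_.*$", "\\1", vector1)
samples_alt
```

The first form removes what is not wanted, while the second keeps what is wanted through a capture group and the back-reference "\\1". The dots are escaped as "\\." so that they match a literal period instead of any character.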