13.6 Exercise 8: Regular expressions
Create the script “exercise8.R” and save it to the “Rcourse/Module2” directory: you will save all the commands of exercise 8 in that script.
Remember you can comment the code using #.
1- Play with grep
- Create the following data frame
df2 <- data.frame(age=c(32, 45, 12, 67, 40, 27),
                  citizenship=c("England", "India", "Spain", "Brasil", "Tunisia", "Poland"),
                  row.names=paste(rep(c("Patient", "Doctor"), c(4, 2)), 1:6, sep=""),
                  stringsAsFactors=FALSE)
- Using grep, create the smaller data frame df3 that contains only the patients' information, NOT the doctors'.
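One possible approach, given as a sketch rather than the official correction: the row names of df2 carry the Patient/Doctor label, so grep can be run on rownames(df2) and the resulting indices used to subset the rows. The anchor "^" is an optional extra that restricts the match to the start of the name.

```r
# Rebuild df2 so the example runs on its own
df2 <- data.frame(age=c(32, 45, 12, 67, 40, 27),
                  citizenship=c("England", "India", "Spain", "Brasil", "Tunisia", "Poland"),
                  row.names=paste(rep(c("Patient", "Doctor"), c(4, 2)), 1:6, sep=""),
                  stringsAsFactors=FALSE)

# grep returns the indices of the row names that start with "Patient"
patient_rows <- grep("^Patient", rownames(df2))

# Keep only those rows (all columns)
df3 <- df2[patient_rows, ]
df3
```

Here patient_rows is a helper name chosen for illustration; the subsetting can also be written in one line as df2[grep("^Patient", rownames(df2)), ].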
- Use grep and one regular expression to retrieve from df2 patients/doctors coming from either Brasil or Spain. Brainstorm a bit!
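A hedged sketch of two patterns that would work here (other answers are possible, and this is not necessarily the intended correction). The first uses the alternation operator "|". The second relies on an observation about this particular data set: "Brasil" and "Spain" are the only citizenships ending in "il" or "in". The variable names are chosen for illustration.

```r
# Rebuild df2 so the example runs on its own
df2 <- data.frame(age=c(32, 45, 12, 67, 40, 27),
                  citizenship=c("England", "India", "Spain", "Brasil", "Tunisia", "Poland"),
                  row.names=paste(rep(c("Patient", "Doctor"), c(4, 2)), 1:6, sep=""),
                  stringsAsFactors=FALSE)

# Option 1: alternation, matches either full word
brasil_spain <- df2[grep("Brasil|Spain", df2$citizenship), ]

# Option 2: "i" followed by "l" or "n" at the end of the string
brasil_spain2 <- df2[grep("i[ln]$", df2$citizenship), ]

brasil_spain
```

Both options select the same two rows for this data frame; the second one would need revisiting if other countries ending in "il" or "in" were added.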
- Use grep and one regular expression to retrieve from df2 patients/doctors coming from either Brasil or England.
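As with the previous question, a minimal sketch with two candidate patterns, not the official correction. The alternation "Brasil|England" is the most direct; the character class "^[BE]" works only because, in this data set, no other citizenship starts with "B" or "E". The variable names are chosen for illustration.

```r
# Rebuild df2 so the example runs on its own
df2 <- data.frame(age=c(32, 45, 12, 67, 40, 27),
                  citizenship=c("England", "India", "Spain", "Brasil", "Tunisia", "Poland"),
                  row.names=paste(rep(c("Patient", "Doctor"), c(4, 2)), 1:6, sep=""),
                  stringsAsFactors=FALSE)

# Option 1: alternation, matches either full word
brasil_england <- df2[grep("Brasil|England", df2$citizenship), ]

# Option 2: strings that start with "B" or "E"
brasil_england2 <- df2[grep("^[BE]", df2$citizenship), ]

brasil_england
```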
2- Play with gsub
Build this vector of file names:
vector1 <- c("L2_sample1_GTAGCG.fastq.gz", "L1_sample2_ATTGCC.fastq.gz",
             "L1_sample3_TGTTAC.fastq.gz", "L4_sample4_ATGGTA.fastq.gz")
Use gsub and an appropriate regular expression on vector1 to retrieve only “sample1”, “sample2”, “sample3” and “sample4”.
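One way to approach this, shown as a sketch under the assumption that every file name follows the layout lane_sample_barcode.fastq.gz: either capture the sample part with parentheses and keep it via the backreference "\\1", or delete the unwanted prefix and suffix. The names samples and samples2 are chosen for illustration.

```r
vector1 <- c("L2_sample1_GTAGCG.fastq.gz", "L1_sample2_ATTGCC.fastq.gz",
             "L1_sample3_TGTTAC.fastq.gz", "L4_sample4_ATGGTA.fastq.gz")

# Option 1: capture "sample" followed by digits, replace the whole string by the captured group
samples <- gsub(".*_(sample[0-9]+)_.*", "\\1", vector1)

# Option 2: remove the lane prefix ("L2_") and the barcode/extension suffix ("_GTAGCG.fastq.gz")
samples2 <- gsub("^L[0-9]+_|_[ACGT]+\\.fastq\\.gz$", "", vector1)

samples
```

The dots in ".fastq.gz" are escaped as "\\." in the second pattern because an unescaped "." matches any character.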