11.2 Exercise 6: scatter plot

  1. Import DataViz_source_files-main/files/GSE150029_rnaseq_log2.csv into an object called rnaseq.
correction
library(tidyverse)  # provides read_csv(), filter(), %>% and ggplot()
rnaseq <- read_csv("DataViz_source_files-main/files/GSE150029_rnaseq_log2.csv")


  2. Create a scatter plot that represents sample CTRL on the x axis and sample EZH on the y axis.
correction
ggplot(data=rnaseq, mapping=aes(x=CTRL, y=EZH)) +
  geom_point()


3. Color the points according to the gene_biotype

correction
ggplot(data=rnaseq, mapping=aes(x=CTRL, y=EZH, color=gene_biotype)) +
  geom_point()


4. Not very readable! Filter and plot only the data corresponding to either lincRNA OR miRNA.

correction
ggplot(data=filter(rnaseq, gene_biotype=="lincRNA" | gene_biotype=="miRNA"), mapping=aes(x=CTRL, y=EZH, color=gene_biotype)) +
  geom_point()

# using the pipe:
filter(rnaseq, gene_biotype=="lincRNA" | gene_biotype=="miRNA") %>% ggplot(mapping=aes(x=CTRL, y=EZH, color=gene_biotype)) +
  geom_point()


5. Now select only those lincRNAs and miRNAs that are expressed in CTRL at least 1.5 times more than in EZH.

correction
filter(rnaseq, (gene_biotype=="lincRNA" | gene_biotype=="miRNA") & CTRL > 1.5*EZH) %>% ggplot(mapping=aes(x=CTRL, y=EZH, color=gene_biotype)) +
  geom_point()
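
The parentheses around the biotype test in this filter matter: in R, & is evaluated before |, so leaving them out changes which rows are kept. The sketch below illustrates this on a small made-up data frame (toy and its values are hypothetical, not taken from the GSE150029 file).

```r
library(dplyr)

# Hypothetical toy data, for illustration only
toy <- data.frame(
  gene_biotype = c("lincRNA", "lincRNA", "miRNA", "miRNA", "protein_coding"),
  CTRL = c(6, 2, 9, 1, 8),
  EZH  = c(2, 4, 3, 5, 1)
)

# With parentheses: keep lincRNA or miRNA rows, then require CTRL > 1.5*EZH for all of them
with_parens <- filter(toy, (gene_biotype == "lincRNA" | gene_biotype == "miRNA") & CTRL > 1.5 * EZH)

# Without parentheses: read as lincRNA | (miRNA & CTRL > 1.5*EZH),
# so every lincRNA row is kept whatever its expression values
without_parens <- filter(toy, gene_biotype == "lincRNA" | gene_biotype == "miRNA" & CTRL > 1.5 * EZH)

nrow(with_parens)     # 2
nrow(without_parens)  # 3
```

The extra row in without_parens is the lincRNA with CTRL = 2 and EZH = 4, which does not satisfy the expression condition.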

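A side remark on the threshold, under an assumption: the file name suggests that the values are log2-transformed. If that is the case, a ratio of 1.5 on the original scale corresponds to a difference of log2(1.5) between the two columns, since 2^CTRL / 2^EZH = 2^(CTRL - EZH). The sketch below shows this alternative filter on made-up values (toy_log2 and the gene names are hypothetical). It is not the correction used in this exercise.

```r
library(dplyr)

# Hypothetical log2 expression values, for illustration only
toy_log2 <- data.frame(
  gene_name = c("geneA", "geneB", "geneC"),
  CTRL = c(4, 3, 2),
  EZH  = c(3, 2.8, 2.5)
)

# Fold change on the original (non-log) scale
toy_log2 <- mutate(toy_log2, fold_change = 2^(CTRL - EZH))

# Keep genes at least 1.5 times higher in CTRL on the original scale
higher_in_ctrl <- filter(toy_log2, CTRL - EZH >= log2(1.5))
higher_in_ctrl
```

Only geneA, with a fold change of 2, passes this filter.
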
  6. Add a title to the plot, and make it bold (see the theme() section of the course).
correction
# the parentheses are needed: & is evaluated before |
filter(rnaseq, (gene_biotype=="lincRNA" | gene_biotype=="miRNA") & CTRL > 1.5*EZH) %>% ggplot(mapping=aes(x=CTRL, y=EZH, color=gene_biotype)) +
  geom_point() +
  ggtitle("lincRNA and miRNA") +
  theme(plot.title = element_text(face = "bold"))


NOTE: If you want to label only one (or a few) points, you can do it the following way:

First, filter the data frame:

SNHG8 <- filter(rnaseq, gene_name=="SNHG8")

Then, pass it to geom_text() through the data argument:

filter(rnaseq, (gene_biotype=="lincRNA" | gene_biotype=="miRNA") & CTRL > 1.5*EZH) %>% ggplot(mapping=aes(x=CTRL, y=EZH, color=gene_biotype)) +
  geom_point() +
  ggtitle("lincRNA and miRNA") +
  theme(plot.title = element_text(face = "bold")) +
  geom_text(data=SNHG8, label="SNHG8", show.legend = FALSE)
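
To label a few points rather than a single one, one possible variant is to filter several genes at once and map the label to the gene_name column inside aes(), instead of giving a fixed string. This is a minimal sketch on a made-up data frame: toy, to_label and the gene names geneA to geneD are hypothetical.

```r
library(dplyr)
library(ggplot2)

# Hypothetical toy data, for illustration only
toy <- data.frame(
  gene_name    = c("geneA", "geneB", "geneC", "geneD"),
  gene_biotype = c("lincRNA", "miRNA", "lincRNA", "miRNA"),
  CTRL = c(6, 9, 4, 7),
  EZH  = c(2, 3, 1, 2)
)

# Keep only the rows that should receive a label
to_label <- filter(toy, gene_name %in% c("geneA", "geneD"))

# All points are drawn; only the rows in to_label get a text label
p <- ggplot(data = toy, mapping = aes(x = CTRL, y = EZH, color = gene_biotype)) +
  geom_point() +
  geom_text(data = to_label, mapping = aes(label = gene_name),
            vjust = -0.8, show.legend = FALSE)
p
```

Here vjust only shifts the labels slightly above the points so that they do not overlap them.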