16.1 filter()

filter() keeps the rows that satisfy a condition and drops all the others.

Starting from the geneexp object, we can keep only the down-regulated genes using the filter() function from {dplyr}.

If you need to import the data again, run the following command:

geneexp <- read_csv(file="GSE277039/DEG_counts_sample.csv")

Then keep only the rows where DE is "DOWN":

filter(geneexp, DE=="DOWN")

== is a comparison operator that tests for equality. Here, filter() returns the rows of geneexp whose value in the DE column is exactly equal to "DOWN".
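
To see what filter() does with such a comparison, here is a minimal sketch on a small made-up table. The object toy, its gene names and its DE values are hypothetical stand-ins for geneexp and are not taken from the real data.

```r
library(dplyr)

# Hypothetical toy table standing in for geneexp (all values are made up)
toy <- tibble(
  gene = c("geneA", "geneB", "geneC", "geneD"),
  DE   = c("UP", "DOWN", "NO", "DOWN")
)

# The comparison is evaluated for every row and gives one TRUE/FALSE per row
toy$DE == "DOWN"

# filter() keeps only the rows for which the comparison is TRUE
toy_down <- filter(toy, DE == "DOWN")
toy_down
```

In this toy table only geneB and geneD have "DOWN" in DE, so these are the two rows that remain.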

Comparison and logical operators:

Operator    Description
<           less than
<=          less than or equal to
>           greater than
>=          greater than or equal to
==          exactly equal to
!=          not equal to
!x          not x
x | y       x OR y
x & y       x AND y
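
The operators in this table can be tried directly on plain vectors before using them inside filter(). The short sketch below uses two made-up vectors, padj and de, which are hypothetical and only serve as an illustration.

```r
# Made-up values, for illustration only
padj <- c(0.001, 0.02, 0.5)
de   <- c("UP", "DOWN", "UP")

padj < 0.01                   # TRUE only for the first value
padj >= 0.02                  # TRUE for the second and third values
de != "UP"                    # TRUE only where de is not "UP"
!(de == "UP")                 # same result as de != "UP"
de == "UP" & padj < 0.01      # both conditions must be TRUE
de == "DOWN" | padj < 0.01    # at least one condition must be TRUE
```

Each line returns one TRUE or FALSE per element, which is exactly what filter() uses to decide which rows to keep.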

We can combine several conditions.

For example, we may want to extract only the rows that have either UP or DOWN in the DE column:

filter(geneexp, DE=="DOWN" | DE=="UP")

Here, we introduce another operator, |, which means OR: a row is kept if its value in the DE column is either DOWN or UP.

A good practice is to assign the filtered output to a new object, for example:

geneexp_filt <- filter(geneexp, DE=="DOWN" | DE=="UP")
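
As a minimal sketch of why this is useful, the example below filters a made-up table into a new object and then compares it with the original. The objects toy and toy_filt and their values are hypothetical and are not part of the real data.

```r
library(dplyr)

# Hypothetical toy table standing in for geneexp (all values are made up)
toy <- tibble(
  gene = c("geneA", "geneB", "geneC", "geneD"),
  DE   = c("UP", "DOWN", "NO", "DOWN")
)

# Store the filtered rows in a new object; toy itself is left untouched
toy_filt <- filter(toy, DE == "DOWN" | DE == "UP")

nrow(toy)            # number of rows before filtering
nrow(toy_filt)       # number of rows after filtering
count(toy_filt, DE)  # how many rows remain for each DE value
```

Comparing the row counts before and after is a quick way to check that the filter did what was intended.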

You can also filter on several columns at once. For example, select the genes that:

  • Are up-regulated (column DE)
  • Have an adjusted p-value < 0.01 (column padj)

Try it yourself before looking at the solution:

filter(geneexp, DE=="UP" & padj < 0.01)
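
In {dplyr}, conditions passed to filter() as separate arguments are combined with AND, so the same selection can also be written with a comma instead of &. The sketch below checks this on a made-up table. The object toy and its values are hypothetical.

```r
library(dplyr)

# Hypothetical toy table standing in for geneexp (all values are made up)
toy <- tibble(
  gene = c("geneA", "geneB", "geneC", "geneD"),
  DE   = c("UP", "DOWN", "UP", "UP"),
  padj = c(0.001, 0.0005, 0.2, 0.03)
)

# Both conditions joined explicitly with &
with_and <- filter(toy, DE == "UP" & padj < 0.01)

# Same conditions given as separate arguments, which filter() combines with AND
with_comma <- filter(toy, DE == "UP", padj < 0.01)

# Both versions select the same genes
identical(with_and$gene, with_comma$gene)
```

Which form to use is mostly a matter of style. The explicit & makes the logic visible, while the comma form can be easier to read when there are many conditions.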