3.2 Functions & packages
3.2.1 Functions
A function in R is a piece of code that takes an input (user data, parameters), performs a calculation, and returns an output.
For example, the mean() function takes a vector (a series of numbers) as input, calculates their average, and returns it.
Functions can take arguments/parameters. In the example above, the main argument to mean() would be a series of numbers given by the user.
In R code, you can recognize functions by the parentheses (“round brackets”) that follow their name.
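As a minimal sketch of the points above, the following calls mean() on a small vector. The object names (values, with_missing) and the numbers are made up for illustration; na.rm is an additional argument of mean() that tells it to ignore missing values.

```r
# A small numeric vector used as input (values chosen for illustration)
values <- c(2, 4, 6, 8)

# Call mean() with the vector as its main argument
average <- mean(values)
print(average)

# An extra argument changes the behaviour of the function:
# na.rm = TRUE removes missing values (NA) before averaging
with_missing <- c(2, 4, NA, 8)
average_with_na <- mean(with_missing)               # NA, because one value is missing
average_no_na <- mean(with_missing, na.rm = TRUE)   # average of 2, 4 and 8
print(average_no_na)
```

Note the parentheses after mean: they are what marks it as a function call, and the arguments go inside them.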
3.2.2 Packages
3.2.2.1 What are packages?
A package in R stores, in standardized format, a set of functions, data and documentation.
They are developed and shared by the community, and vary in size and complexity.
Once installed, packages are stored on your computer in a directory called a library.
Packages are usually found in public repositories such as:
- CRAN (the Comprehensive R Archive Network, a general repository for any type of data analysis).
- Bioconductor (initially specialized in high-throughput data analysis / bioinformatics).
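The sketch below shows where the library lives on your machine and how packages are typically installed from the two repositories. The package names ggplot2 and Biostrings are only examples picked for illustration, and the run_install switch is a hypothetical helper so that nothing is downloaded unless you ask for it.

```r
# Folder(s) where installed packages live on this machine (the "library")
library_paths <- .libPaths()
print(library_paths)

# Names of the packages currently installed in those folders
installed <- rownames(installed.packages())
print(head(installed))

# Installing needs an internet connection, so it is switched off by default.
# Set run_install to TRUE to actually install the example packages.
run_install <- FALSE
if (run_install) {
  install.packages("ggplot2")          # install a package from CRAN
  install.packages("BiocManager")      # helper package (from CRAN) for Bioconductor
  BiocManager::install("Biostrings")   # install a package from Bioconductor
}
```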
Anyone can create a package and store it locally; creating packages is a great way to share code.
The mean() function seen earlier is part of the {base} package, which is available by default.
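One way to check this, shown here as a small sketch with made-up values, is to call the function with the package::function notation and to ask R which package the function comes from.

```r
# Example values, chosen for illustration
values <- c(1, 2, 3)

# mean() can be called directly because {base} is always loaded
direct_call <- mean(values)

# The package::function notation states explicitly where the function lives
explicit_call <- base::mean(values)

# Both calls run the same function and give the same result
print(identical(direct_call, explicit_call))

# Ask R for the name of the package that defines mean()
home_package <- environmentName(environment(mean))
print(home_package)
```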
3.2.2.2 The “tidyverse”
The tidyverse is an opinionated collection of R packages designed for data science. All packages share an underlying design philosophy, grammar, and data structures.
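As a hedged sketch, the code below loads the tidyverse and lists the packages it contains. It assumes the tidyverse has already been installed from CRAN; the tidyverse_available variable is a hypothetical helper that makes the code skip loading when the package is missing.

```r
# Check whether the tidyverse is installed, without loading it yet
tidyverse_available <- requireNamespace("tidyverse", quietly = TRUE)

if (tidyverse_available) {
  # Loading the tidyverse attaches its core packages in one step
  library(tidyverse)

  # List all packages that belong to the collection
  print(tidyverse::tidyverse_packages())
} else {
  message("The tidyverse is not installed; run install.packages(\"tidyverse\") first.")
}
```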
Why do we use the tidyverse packages in this course?
- Easier to understand / more intuitive vocabulary: better for beginners.
- More “modern” style of coding.
- Uniform in style and logic across data manipulation and visualization.
In this course, we will use, in particular and in this order: