Hey,

This tiny tutorial is to help you get started with creating packages in R.

R packages are a great way to share and reuse code across projects, with colleagues, and with the rest of the R community. They organize code much better than functions scattered across scripts, and they facilitate reproducibility, usability and collaboration, which are pivotal in data analysis.

In this tutorial I’ll show you how to create a basic package, structure code and functions, write comprehensive documentation and automated tests, and name and license your package. You can find a lot more info in the R Packages book, which is available for free online.

Step 1: Create a project

Projects are nice because they allow you to work with relative file paths (no need to specify working directories or absolute paths).

In RStudio, go to New project > New Directory > R Package (using devtools). Choose the location and package name, and select Create git repository to enable version control. You can also tick the renv option to manage the project’s package versions independently of your main R library. Then just hit Create Project.

A good package name

A few key considerations when naming an R package:

  • it should only contain ASCII letters, numbers, and periods (_ is not a valid character in CRAN package names)
  • it needs to have at least two characters
  • it should start with a letter and should not end with a dot
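
The rules above can be sketched as a small base R check. The helper name is_valid_pkg_name is hypothetical (it is not part of any package), and the regular expression only covers the three rules listed here:

```r
# Hypothetical helper: checks a candidate name against the rules listed above.
is_valid_pkg_name <- function(name) {
  stopifnot(is.character(name), length(name) == 1)
  # starts with a letter, contains only ASCII letters, digits and periods,
  # has at least two characters, and does not end with a period
  grepl("^[A-Za-z][A-Za-z0-9.]*[A-Za-z0-9]$", name)
}

is_valid_pkg_name("package.name") # TRUE
is_valid_pkg_name("my_package")   # FALSE: contains an underscore
is_valid_pkg_name("x")            # FALSE: fewer than two characters
is_valid_pkg_name("pkg.")         # FALSE: ends with a period
```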

It is a good idea to first check that the name you chose for your package is not already used elsewhere. The available package checks whether the name is already taken (e.g. on CRAN, Bioconductor, or GitHub) and whether it has some unintended meaning (e.g. on Urban Dictionary).

library(available)
available("package.name")

Folder structure

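Hitting Create Project will open a new session and create the basic folder structure of an R package. This includes:

  • R, the folder that holds the .R files with your functions
  • man, the folder that holds the documentation (.Rd) files

All R packages also include at least two supporting files called:

  • DESCRIPTION, which stores the package metadata (name, version, dependencies, license)
  • NAMESPACE, which lists the functions the package exports and imports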

Step 2: Install and load the package

In RStudio, you should now see a new tab called Build in the environment pane. Click on Build > Install to install the package. It will take a couple of seconds before you see the familiar library(package.name) pop up in the console.

The package is now loaded.

Note: the environment is still empty. None of the functions you define in the package are visible in the environment.

Try this out:

hello() # runs mock function 
?hello # opens documentation
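
The note about the empty environment applies to every package, not just yours. As a small illustration, here is the same behaviour with sd() from the stats package that ships with R (chosen only because it is always available):

```r
# `sd` is not an object in the global environment...
"sd" %in% ls(globalenv())

# ...but it can still be called, because R finds it in an attached package
sd(c(1, 2, 3)) # 1

# The function lives in the namespace of the package that defines it
environmentName(environment(sd)) # "stats"
```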

OK, LET’S GET STARTED :)

# Load devtools
library(devtools)

devtools contains a variety of helpful functions for package development. This also loads the usethis package, which provides an additional set of very convenient functions for R projects in general.

P.S. you can also create and install the package directly from the command line using:

# usethis::create_package("package.name")
# devtools::install()

This will create the necessary folders/files described above.

1. Dependencies: using other packages

If you need to use functions from other packages (for example, foreign to read .sav files), you don’t load them using library(). Rather, you declare them as dependencies:

usethis::use_package("foreign")
# Note: you need to have the `foreign` package installed for this to work

# Or, to avoid version issues...
usethis::use_package("foreign", min_version = "0.8.1")

By default, use_package() will add an “Imports” specification to the DESCRIPTION file.
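
To see what that specification looks like, here is a minimal sketch that writes a stripped-down DESCRIPTION to a temporary file and reads it back with base R. The file contents are illustrative (a real DESCRIPTION has more fields), and the version number is the one used in the example above:

```r
# Write an illustrative, stripped-down DESCRIPTION to a temporary file
desc_path <- tempfile("DESCRIPTION")
writeLines(c(
  "Package: package.name",
  "Version: 0.0.0.9000",
  "Imports:",
  "    foreign (>= 0.8.1)"
), desc_path)

# DESCRIPTION uses the Debian control format, which read.dcf() understands
desc <- read.dcf(desc_path)
desc[, "Imports"]
```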

2. Creating functions

You can add a new function to a package in a couple of ways:

# 1. create a .R script in R folder and write the function in there directly
usethis::use_r("function_name")

# 2. dump a function you already defined like 
dump("function_name", file="R/function_name.R")

During the development phase, it can be helpful to load all the package files into the current R session, so you can test and debug the new function(s) (prior to installation).

devtools::load_all(".")

Or click on Build > More > Load All (shortcut: Ctrl+Shift+L, or Cmd+Shift+L on macOS).

Good practices

  • Each function (or component) of the package should have its own .R file. You may store a variety of smaller functions that you use throughout in a utils.R file. It’s also best practice to include an R file named after the package (for example, to hold package-level documentation) to help users understand the package.
  • The name of the function should describe what it does. Make it self-documenting if you can.
  • Each function should do one thing and do it well. Make sure that functions handle incorrect or unexpected input gracefully.
  • If you are using external functions inside your functions, specify which package they come from with ::.
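
A minimal sketch of a function that follows these practices. The function name read_spss_data and its error messages are hypothetical; foreign::read.spss() is the real function from the foreign package mentioned earlier:

```r
# Hypothetical example: the name says what it does, it does one thing,
# it validates its input, and the external function is called with ::
read_spss_data <- function(path) {
  if (!is.character(path) || length(path) != 1) {
    stop("`path` must be a single file path.", call. = FALSE)
  }
  if (!file.exists(path)) {
    stop("File not found: ", path, call. = FALSE)
  }
  foreign::read.spss(path, to.data.frame = TRUE)
}

# Unexpected input gives a clear error instead of an obscure failure
tryCatch(
  read_spss_data("missing.sav"),
  error = function(e) message("Error: ", conditionMessage(e))
)
```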

3. Writing documentation with roxygen2

Documenting takes time but is also a fundamental element of a good R package. Function help files can be generated using the roxygen2 package.

In RStudio, place your cursor inside the function you want to document and go to Code > Insert Roxygen Skeleton (shortcut: Ctrl+Alt+Shift+R, or Cmd+Option+Shift+R on macOS). In the function’s .R file (above the function definition) you should now see the roxygen2 header, every line of which starts with #'.

Help files follow a specific structure that is set by CRAN, including:

  • Title, a short phrase summarizing the function purpose.
  • Description, provides information about the function’s goals.
  • Usage (automatically generated from the function definition).
  • Arguments, enumerating all the parameters the function accepts and clarifying the purpose and application of each. The @param tag followed by all parameter names is included automatically. It is helpful to start with the type of the argument (e.g., numerical, string, logical…).
  • Returns, for details about the function’s output (@returns). Also start with the output type here.
  • See also, optional additional references (@seealso), more useful for more specialized analyses that need citations.
  • Examples, showcase practical usage, helping users see how the function can be applied to solve specific problems.

Add the @export statement before the examples to make the function accessible to external users.

When functions take similar arguments, they can be included together in one help file.
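
Put together, a filled-in header might look like the sketch below. The function cm_to_inches is a made-up example, not something the package template provides:

```r
#' Convert centimeters to inches
#'
#' Converts one or more lengths from centimeters to inches.
#'
#' @param cm Numeric vector. Lengths in centimeters.
#'
#' @returns Numeric vector of the same length as `cm`, with lengths in inches.
#' @export
#'
#' @examples
#' cm_to_inches(2.54)
#' cm_to_inches(c(10, 20, 30))
cm_to_inches <- function(cm) {
  stopifnot(is.numeric(cm))
  cm / 2.54
}
```

The first line becomes the title and the following paragraph becomes the description. The usage section is generated from the function definition.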

After you have filled in the roxygen header you can add documentation to the package:

# check that the package can be loaded, update the NAMESPACE file, and generate the .Rd file
roxygen2::roxygenize() 

# or
devtools::document(roclets = c('rd', 'collate', 'namespace'))

# or, if you are tired of typing 
# In RStudio: Build > More > Document (shortcut: Ctrl+Shift+D, or Cmd+Shift+D on macOS)

Now the man folder will include a new .Rd documentation file. Load the function again and check its documentation with ?function_name.

Add R Markdown templates

R packages can also include R Markdown templates, stored in the inst directory. These help provide a structure for working with the package.

# Start an R Markdown template
usethis::use_rmarkdown_template("Tutorial")

use_rmarkdown_template() creates a file for you to customize and the directory structure inst/rmarkdown/templates/.

4. Adding data to the package

While data isn’t a core component of an R package, you may want to add data objects (for example, data frames) to make it easier to test and demonstrate your functions. Data is often necessary for running examples and for ensuring the package works as intended.

You can do that by:

# Store in an R object
set.seed(1) # make the random toy values reproducible
toy_data <- data.frame(
  id = 1:5,
  f0101 = sample(1:50, 5, replace = TRUE),
  f0102 = sample(0:1, 5, replace = TRUE)
)
# Add object to the package
usethis::use_data(toy_data)

use_data() makes a data directory if there isn’t one, and saves the R data object with the .rda extension. It’s best practice to keep the raw source data (e.g. .csv files), together with the script that turns it into the data object, in the data-raw directory.
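
To make the .rda step less of a black box, here is a rough base R sketch of the same idea. It is not the actual implementation of use_data(); it writes to a temporary directory instead of your package, and the values in toy_data are made up:

```r
# Illustrative data object with made-up values
toy_data <- data.frame(
  id = 1:5,
  f0101 = c(12L, 7L, 33L, 48L, 21L),
  f0102 = c(0L, 1L, 1L, 0L, 1L)
)

# Create a data directory (here inside tempdir()) if there isn't one
data_dir <- file.path(tempdir(), "data")
dir.create(data_dir, showWarnings = FALSE)

# Save the object under its own name with the .rda extension
rda_path <- file.path(data_dir, "toy_data.rda")
save(toy_data, file = rda_path)

# Loading the file restores an object called toy_data
restored <- new.env()
load(rda_path, envir = restored)
all.equal(restored$toy_data, toy_data)
```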

5. (Almost done) check your package

A very handy function to test whether the package is working is:

devtools::check()

This performs a comprehensive series of tests (including syntax checks, package dependencies, documentation quality, coding standards…) to ensure correctness and adherence to package development standards.

If you get 0 errors ✔ | 0 warnings ✔ | 0 notes ✔, congrats, your package meets the requirements and standards set by the R community and you are ready to share it with the world.

6. Make it extra nice

Write unit tests

Dependable R packages also need unit tests. Unit tests are automated tests that check individual functions or pieces of code to ensure they produce the expected output under various conditions. You can run your package’s tests with devtools::test().
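
A minimal sketch of what a unit test can look like, assuming the testthat package (which this tutorial does not otherwise cover). In a real package you would typically set things up with usethis::use_testthat() and usethis::use_test("cm_to_inches"), and the test would live in tests/testthat/. The function cm_to_inches is a made-up example:

```r
# Hypothetical function under test (would live in R/cm_to_inches.R)
cm_to_inches <- function(cm) {
  stopifnot(is.numeric(cm))
  cm / 2.54
}

# Would live in tests/testthat/test-cm_to_inches.R
# (guarded so the sketch also runs where testthat is not installed)
if (requireNamespace("testthat", quietly = TRUE)) {
  testthat::test_that("cm_to_inches converts values and rejects bad input", {
    testthat::expect_equal(cm_to_inches(2.54), 1)
    testthat::expect_equal(cm_to_inches(c(0, 5.08)), c(0, 2))
    testthat::expect_error(cm_to_inches("ten"))
  })
}
```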

Vignettes

Moreover, it is best practice to include vignettes, which are long-form tutorials demonstrating how to use the package. All of this is included in the package in a standardized structure, providing users with a reliable and well-organized way to access and use the package’s content.

Let’s look at the different types of dependencies in R packages. Declaring them ensures that all the necessary functions are available.

Suggests

Besides Imports (i.e., packages that are required for your functions to work properly and are automatically loaded when the package is loaded) there are other types of dependencies you can add to the DESCRIPTION file. Suggests are packages that are not required for the package to function, but that provide additional functionality or are used in examples. Using Suggests is a courtesy to users: they don’t have to download difficult-to-install packages they may never need.

usethis::use_package("tibble", type = "Suggests")
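
Because a suggested package may not be installed on the user’s machine, code that uses it should check first. A minimal sketch of that pattern, with a hypothetical function name as_nice_table:

```r
# Hypothetical function: uses tibble if it is available, and falls back
# to a plain data frame if it is not
as_nice_table <- function(df) {
  stopifnot(is.data.frame(df))
  if (requireNamespace("tibble", quietly = TRUE)) {
    tibble::as_tibble(df)
  } else {
    df
  }
}

as_nice_table(data.frame(x = 1:3, y = c("a", "b", "c")))
```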

Licenses

When you are ready to share your package, you also need to think about how others can use the code. Two commonly used licenses are:

  • the Creative Commons Zero (CC0) License: permits free use, modification, and distribution, without the need for copyright notices, attribution, or licenses. Essentially, you allow anyone to use the package without any mention of you.
  • the MIT License: permits free use, modification, and distribution, but mandates that the copyright notice and license be included in all copies or substantial portions of the software. This is commonly used for software like R packages.

# Set the license in the DESCRIPTION file
usethis::use_mit_license()
# usethis::use_cc0_license()

DONE! Now save it and share it!

If you already set up a git repository with the project, you can simply select the changes you want to commit in the Git tab and follow the instructions. Otherwise there are a few recommended workflows you could follow (see them here for example).

Let’s give it a README.md as well:

usethis::use_readme_rmd()

Then render it like so:

devtools::build_readme()