Assignment B-1: Making a function

Total Points: 100

This assignment covers making a function in R, documenting it, and testing it.

Setup

  1. Go to canvas to get your invitation to create a GitHub repository for this project. You can find this in the description of Assignment B-1. Name your repo as you wish, so long as it’s informative.

  2. Make a new Rmd file containing all of your code. Be sure to use some dialogue between code chunks.

Tidy Submission (15 points)

Follow these steps to submit your work. Be sure to familiarize yourself with the rubric for a tidy submission below, before doing these steps.

  1. Make a README file for your repository. It should be a brief document letting a visitor know what’s in this repository (at a high level) and some key things they should know about how to use the files in the repository.
  2. Tag a release in your GitHub repository corresponding to your submission before the deadline.
  3. Grab the URL corresponding to your tagged release, and submit that to canvas. Make sure the TAs and Lucy can see your repository! Either it should be public or private with TAs and Lucy added as collaborators.

Rubric:

  • The above steps were followed.
  • Your work must be reproducible from beginning to end, error-free.
  • Code should adopt a consistent and easy-to-read style – ideally, the tidyverse style, although we’re certainly not looking for strict adherence.
  • You use proper English, spelling, and grammar, and write concisely. If there’s any uncertainty in determining a grade here, the UBC MDS writing rubric will be referred to.
  • If there’s any further uncertainty in determining a grade for this tidy submission portion, the UBC MDS mechanics rubric will be referred to.

Exercise 1: Make a Function (25 points)

In this exercise, you’ll be making a function and fortifying it. The function need not be complicated. The function need not be “serious”, but shouldn’t be nonsense.

Function Ideas

  • Did you repeat any code for a data analysis in STAT 545A? If so, consider making a function for this action.
  • Consider bundling a specific group_by() %>% summarise() workflow.
  • Write a wrapper around an existing function.
    • For example, perhaps accepting a narrower range of inputs (like not allowing logical vectors), or providing a different output.
    • A specific example: the rqdist() function is a wrapper around quantreg::rq(), narrowing its functionality.
    • It’s usually better to narrow a function’s focus than to broaden, so that a function doesn’t end up doing too much.
  • Make a special plot that you’d want to repeat when exploring your data.

Guidelines

  • Your function should not rely on anything from your working environment.
  • Your function should not rely on “magic numbers” – pre-selected numbers (or options) that appear inside the function that can’t be accessed by a user of the function.
    • For example, maybe quantile(x, type = 1, ...) appears in your function. The choice of 1 is arbitrary – unless you’re making a function like quantile_type_1().
  • Your function should be reasonably flexible on its inputs. Here is an example of a function that is not reasonably flexible: a function called count_missing_values_by_group() that only works when the data frame has columns that are all of type fct.
  • The output is consistent – for example, always gives a list. An example of inconsistent output: sapply(1:3, seq_len) gives a list, and sapply(1:3, sqrt) gives an (atomic) vector.
  • Your function includes appropriate arguments. (Do you handle NA’s appropriately? Are you using the ellipsis properly? etc.)

Exercise 2: Document your Function (20 points)

In the same code chunk where you made your function, document the function using roxygen2 tags. Be sure to include:

  1. Title.
  2. Function description: In 1-2 brief sentences, describe what the function does.
  3. Document each argument with the @param tag, making sure to justify why you named the parameter as you did.
    • (Justification for naming is not often needed, but we want to hear your reasoning.)
  4. What the function returns, using the @return tag.

Exercise 3: Include examples (15 points)

Demonstrate the usage of your function with a few examples. Use one or more new code chunks, describing what you’re doing.

Note: If you want to deliberately show an error, you can use error = TRUE in your code chunk option.

Exercise 4: Test the Function (25 points)

Running examples is a good way of checking by-eye whether your function is working as expected. But, having a formal “yes or no” check is useful when you move on to other parts of your analysis.

Write formal tests for your function. You should use at least three non-redundant uses of an expect_() function from the testthat package, and they should be contained in a test_that() function (or more than one). They should all pass.

Example of non-redundant inputs:

  • Vector with no NA’s
  • Vector that has NA’s
  • Vector of a different type (if relevant)
  • Vector of length 0, like numeric(0).

Example of redundant inputs:

  • Providing a different number (unless one of these numbers have some significance, like an extreme point – just tell us if that’s the case)