Assignment B-4

Total Points: 100

For this assignment, you will pick one of three options described below.

  • Option A: Strings and/or Functional Programming
  • Option B: R Package
  • Option C: Shiny App

Setup

If you choose Option A, make a new repository in our homework Organization for this assignment. The link to do so can be found on Canvas in the description of Assignment B-4.

If you choose Option B or C, just continue your work on your R package / shiny app repository. (You tagged a release corresponding to the original assignment submission, right? If so, you can proceed to modify your repository’s contents without having to worry about your previous assignment submission. If you didn’t, please tag a release before working on this assignment, and update your previous assignment submission on Canvas.)

Tidy Submission (25 points)

Follow these steps to submit your work. Be sure to familiarize yourself with the rubric for a tidy submission below, before doing these steps.

  • (5 pts) Your repository has a README file orienting a newcomer to your project; if one already exists, it’s updated as needed.
  • (10 pts) RRR: Code runs without errors and it is appropriately annotated, legible, and easy to follow.
  • (5 pts) You use proper English, spelling, and grammar throughout your repository.
  • (5 pts) Tag a release in your assignment repository.
  • Submit a link to your tagged release in your assignment repository to Canvas.
    • It is very important that you submit a link to your work on Canvas. If you don’t, we won’t be able to find your work, because each option involves working in a different repository. The link you provide should point to a tagged release in your assignment repository, corresponding to the point in your repository’s life that we should be grading. This repository should be viewable by the TAs and Lucy: either it should be public, or it should be private with the TAs and Lucy added as collaborators.

Option A – Strings and functional programming in R

Complete 2 of the following 3 exercises using concepts and tools covered in class (e.g. stringr, purrr, regex, tidyverse, etc.).

Setup: Make a new repository if you’re going with this option, by clicking the link provided to you on Canvas. Name the repository as you wish.

Exercise 1 (37.5 points)

Take a Jane Austen book contained in the janeaustenr package, or another book from some other source, such as one of the many freely available books from Project Gutenberg (be sure to indicate where you got the book from). Make a plot of the most common words in the book, removing “stop words” of your choosing (words like “the”, “a”, etc.) or stopwords from a pre-defined source, like the stopwords package or tidytext::stop_words.

If you use any resources for helping you remove stopwords, or some other resource besides the janeaustenr R package for accessing your book, please indicate the source. We aren’t requiring any formal citation styles, just make sure you name the source and link to it.
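
As a rough starting point, here is a minimal sketch of the counting step. It is not a required approach. The two-line `book_lines` text and the `my_stop_words` vector are made-up stand-ins (assumptions for illustration only); in your submission you would use a real book and a stop word list of your choosing.

```r
library(dplyr)
library(stringr)
library(ggplot2)

# Toy text standing in for a book (an assumption for illustration only).
book_lines <- c(
  "The cat sat on the mat.",
  "The dog chased the cat, and the cat ran."
)

# Hypothetical hand-picked stop words; a real analysis would use a longer list.
my_stop_words <- c("the", "on", "and", "a")

# Lowercase the text, pull out runs of letters/apostrophes, and flatten to one vector.
words <- book_lines %>%
  str_to_lower() %>%
  str_extract_all("[a-z']+") %>%
  unlist()

# Drop the stop words and count what is left, most frequent first.
word_counts <- tibble(word = words) %>%
  filter(!word %in% my_stop_words) %>%
  count(word, sort = TRUE)

# Bar chart of the five most common words (built here, not printed).
top_words_plot <- word_counts %>%
  head(5) %>%
  ggplot(aes(x = n, y = reorder(word, n))) +
  geom_col() +
  labs(x = "Count", y = "Word")
```

Swapping `book_lines` for a character vector of lines from a real book (for example `janeaustenr::prideprejudice`) should leave the rest of the pipeline unchanged.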

Exercise 2 (37.5 points)

Make a function that converts words to your own version of Pig Latin.

The specific input and output that you decide upon is up to you. Don’t forget to implement good function-making hygiene: we’ll be looking for (unrendered) roxygen2-style documentation (being sure to describe your Pig Latin conversion), examples of applying the function, 3 non-redundant tests, appropriate use of arguments, and an appropriate amount of checking for proper input.

Your Pig Latin should incorporate two components:

Rearrangement component

The default Pig Latin rearrangement rule, as per Wikipedia, moves beginning letters to the end:

  1. For words that begin with consonant sounds, all letters before the initial vowel are placed at the end of the word sequence.
  2. When words begin with consonant clusters (multiple consonants that form one sound), the whole sound is added to the end.
  3. For words beginning with vowel sounds, one removes the initial vowel(s) along with the first consonant or consonant cluster.

Modify this somehow. Maybe you move letters from the end to the beginning, or you change the rules altogether, keeping a similar level of complexity.

Addition component

The default Pig Latin addition rule is to add “ay” to the end of the word, after rearranging the letters of the word. You should choose some other addition rule.
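
For reference, here is a sketch of the default rules described above, so you can see what you are modifying. It is a letter-based approximation (it looks at written vowels, not sounds), and the function name `pig_latin`, its `suffix` argument, and the tests are all illustrative assumptions, not a template you must follow. Your own function needs different rearrangement and addition rules.

```r
library(testthat)

#' Convert words to default Pig Latin (letter-based approximation)
#'
#' Moves any leading vowels plus the first consonant cluster to the end of
#' each word, then appends a suffix. Vowels are the letters a, e, i, o, u.
#'
#' @param words A character vector of single words.
#' @param suffix A single string appended after rearranging. Defaults to "ay".
#' @return A character vector the same length as `words`.
#' @examples
#' pig_latin(c("pig", "string", "apple"))
pig_latin <- function(words, suffix = "ay") {
  if (!is.character(words)) {
    stop("`words` must be a character vector, not ", class(words)[1], ".")
  }
  if (!is.character(suffix) || length(suffix) != 1) {
    stop("`suffix` must be a single string.")
  }
  # Group 1: optional leading vowels followed by the first consonant cluster.
  # Group 2: everything after that.
  pattern <- "^([aeiouAEIOU]*[^aeiouAEIOU]*)(.*)$"
  lead <- sub(pattern, "\\1", words)
  rest <- sub(pattern, "\\2", words)
  paste0(rest, lead, suffix)
}

test_that("leading consonant clusters move to the end", {
  expect_equal(pig_latin(c("pig", "string")), c("igpay", "ingstray"))
})

test_that("vowel-initial words move the vowel and first cluster", {
  expect_equal(pig_latin("apple"), "eapplay")
})

test_that("non-character input is rejected", {
  expect_error(pig_latin(42))
})
```

Note that the three tests check different behaviours (consonant start, vowel start, bad input), which is what “non-redundant” is getting at.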

Exercise 3 (37.5 points)

For this exercise, you’ll be evaluating a model that’s fit separately for each group in some dataset. You should fit these models with some question in mind.

Examples (do not use these examples):

  • Maybe your model is a linear model (using lm()) for each country’s life expectancy over time in the gapminder::gapminder dataset, where you are interested in each country’s overall trend in life expectancy.
  • Maybe your model is a distribution of body mass (using the distplyr package) for each penguin species in the palmerpenguins::penguins dataset, so that you can produce parametric prediction intervals (such as from a Normal distribution instead of using the quantile() function) for each species.

Your tasks are as follows.

  1. Make a column of model objects. Do this using the appropriate mapping function from the purrr package. Note: it’s possible you’ll have to make use of nesting, here.
  2. Evaluate the model in a way that interests you. But, you should evaluate something other than a single number for each group. Hint: you’ll need to use a purrr mapping function again.
  3. Print out this intermediate tibble for inspection (perhaps others as well, if it makes sense to do so).
  4. Unnest the resulting calculations, and print your final tibble to screen. Make sure your tibble makes sense: column names are appropriate, and you’ve gotten rid of columns that no longer make sense.
  5. Produce a plot communicating something about the result.
  6. Walk a reader through this analysis by providing an explanation as to what’s going on (in terms of a data analysis question and results, not necessarily in terms of what’s happening in the code).
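
The shape of tasks 1 to 4 can be sketched as follows. This is only an outline of the nest-then-map pattern under assumed choices: the built-in `mtcars` data, a linear model of `mpg` on `wt` within each `cyl` group, and per-car fitted values and residuals as the evaluation. Your own dataset, model, and question should differ.

```r
library(dplyr)
library(tidyr)
library(purrr)

# Task 1: one row per group, with a list-column of data and a list-column of models.
models <- mtcars %>%
  nest(data = -cyl) %>%
  mutate(fit = map(data, ~ lm(mpg ~ wt, data = .x)))

# Task 2: evaluate each model as a small tibble (more than a single number per group).
evaluated <- models %>%
  mutate(diagnostics = map(fit, ~ tibble(
    fitted   = unname(fitted(.x)),
    residual = unname(residuals(.x))
  )))

# Task 3: inspect the intermediate tibble with its list-columns.
print(evaluated)

# Task 4: keep only the columns that still make sense, then unnest.
final <- evaluated %>%
  select(cyl, diagnostics) %>%
  unnest(diagnostics)

print(final)
```

The `final` tibble has one row per car, which is a convenient shape to hand to ggplot2 for task 5.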

Grading is as follows:

  • 5 points: the “story” behind your data analysis.
  • 5 points: the plot.
  • 15 points: the calculations.

Option B – R package

For this option, you will be updating your package to a new version, or creating an entirely new R package from scratch if you prefer.

This time, we will be evaluating the quality and overall design of your R package. Whereas the previous R package assignment focused on component additions evaluated by functionality (such as components of an R package being present and functional), this time we will evaluate how well your product forms a complete whole, satisfying a clearly stated purpose with an effectively designed interface (“interface” being how a user interacts with the R package). See the rubric below for details.

In short, you will need to:

  • Make more functions and tests, and update relevant documentation.
  • Make a vignette.
  • Make a package website (use pkgdown, and activate it with GitHub Pages if your repository is public).
    • Activating GitHub Pages should be as easy as clicking “Settings” -> “Pages” -> then clicking on the main branch, perhaps also selecting the docs folder.

Rubric

Keep in mind that you must address any feedback provided by the teaching team.

Basic structure and functionality (15 points)

Following the previous R package assignment instructions and rubric may be helpful for completing this.

  • Do you have all essential components for your R package?
    • All functions have tests
    • All functions and data have documentation
    • If data was included, did you include how the data was sourced?
    • DESCRIPTION has correct information

Code Mastery (25 points)

  • Is your code written efficiently?
    • Did you use tidyverse when appropriate?
    • Is your code easy to understand? Did you write code comments when appropriate?
  • Are your tests good quality?

Package design (30 points)

  • Does your package have a clear and consistent purpose?
    • Functions and data make sense to be packaged together
    • e.g. ggplot2 is a package all about plotting
  • Do package functions and data have user-friendly design?
    • Function arguments make sense
    • Data is organized appropriately and can be easily accessed
    • Documentation is helpful and complete
  • High-level documentation is effective
    • Vignette is present
    • pkgdown website is present, at least locally.
    • Purpose of package is clearly described
    • Functions and datasets are demonstrated
    • Examples should be easy for users to run
    • If developing a data package, more weight will be put on providing a good-quality vignette that showcases the data

Package development (5 points)

  • Did you use GitHub Issues, branches, and pull requests in an appropriate manner?
  • Did you address all errors and warnings from running check()? “Notes” are allowed.
  • Did you tag releases and update version numbers appropriately?

Option C – Shiny App

For this option, you will implement additional features and improvements in the shiny app that you worked on in Assignment 3b, or create an entirely new shiny app from scratch. This time, we will be evaluating the quality and overall design of your shiny app.

Before you start

If you are building onto your assignment 3b shiny app, ensure that your assignment 3b shiny app is deployed and viewable on shinyapps.io. We will use this as a reference to mark your improved version. Complete this assignment by editing a copied version of your app. When finished, deploy this on shinyapps.io as well. (If you lack space on shinyapps.io, let us know; we don’t want this to limit you.)

If you are starting a new app from scratch, state so in the assignment 3b README.

Requirements

If you are building onto assignment 3-B, after addressing feedback, you should have three working features implemented. If you are starting from scratch, you’ll need to produce the equivalent of Assignment 3-B (i.e. have three working features) before moving on.

You might find that to satisfy some of these requirements, new features will have to be implemented.

App design

This time we will be evaluating the overall app design. Your app should have a clearly defined purpose and function accordingly. The way the user interacts with the app is another important design decision. It should be easy for the user to both understand how the app is supposed to be used and also for the user to use the app.

Code mastery

Ensure that your code is of high quality. This means that a minimal amount of code and dependencies are used (code efficiency). We will also look for appropriate usage of shiny-specific concepts, such as reactive expressions.
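
To illustrate what appropriate use of a reactive expression can look like, here is a small sketch. The app itself is a made-up example (the `mtcars` data, the `min_mpg` slider, and the output names are all assumptions); the point is only that the filtering is written once in `reactive()` and reused by both outputs, instead of being repeated inside each render function.

```r
library(shiny)

ui <- fluidPage(
  sliderInput("min_mpg", "Minimum mpg", min = 10, max = 33, value = 20),
  plotOutput("scatter"),
  tableOutput("table")
)

server <- function(input, output, session) {
  # Filter once; shiny caches the result until input$min_mpg changes.
  filtered <- reactive({
    mtcars[mtcars$mpg >= input$min_mpg, ]
  })

  # Both outputs call filtered() rather than repeating the filter.
  output$scatter <- renderPlot({
    plot(filtered()$wt, filtered()$mpg, xlab = "Weight", ylab = "mpg")
  })
  output$table <- renderTable(head(filtered()))
}

# Build the app object; it only starts when run, e.g. with runApp(app).
app <- shinyApp(ui, server)
```

This also keeps dependencies down: the sketch loads only shiny and relies on base R for the filtering and the plot.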

Rubric

3 features are implemented (25 points)

  • If continuing from assignment 3b, you should have fixed any issues with the original 3 features, and you should have 3 new working features
  • If starting from scratch, you have at least 3 working features

User friendly design (25 points)

  • Is it easy for users to understand how to engage with the app?
  • Were chosen widgets appropriate for each type of input?
  • Is the writing clear?
  • Does the layout make sense?
  • The purpose of the app is clearly indicated
  • The functioning of the app adequately achieves the purpose
  • There’s enough documentation to know how to use the app. Ideally, one doesn’t have to read all that much to use the app at a superficial level.

Code quality (25 points)

  • Code is efficient
  • Reactive expressions used appropriately
  • tidyverse used when appropriate
  • Code comments are appropriately used
  • Only essential R packages are loaded