14  Return Values and Scope

14.1 Returning early

Sometimes, you don’t need to run through the whole body of a function to get the answer. In that case you can return early from that function using return().

To check if x is divisible by n, you can use is_divisible_by(x, n) from assertive.

Alternatively, use the modulo operator, %%. x %% n gives the remainder when dividing x by n, so x %% n == 0 determines whether x is divisible by n. Try 1:10 %% 3 == 0 in the console.
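
As a quick check of the modulo approach, the following sketch uses only base R. The variable names are illustrative and not part of the exercise.

```r
# Remainders when dividing 1 to 10 by 3
remainders <- 1:10 %% 3

# TRUE wherever the remainder is zero, i.e. for 3, 6, and 9
divisible_by_3 <- remainders == 0

# Pick out the divisible values
(1:10)[divisible_by_3]
```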

To solve this exercise, you need to know that a leap year is every 400th year (like the year 2000) or every 4th year that isn’t a century (like 1904 but not 1900 or 1905).

assertive is loaded.

14.1.1 Instructions 100 XP

Complete the definition of is_leap_year(), checking for the cases of year being divisible by 400, then 100, then 4, returning early from the function in each case.

ex_16.R
is_leap_year <- function(year) {
  # If year is div. by 400 return TRUE
  if (is_divisible_by(year, 400)) {
    return(TRUE)
  }
  # If year is div. by 100 return FALSE
  if (is_divisible_by(year, 100)) {
    return(FALSE)
  }
  # If year is div. by 4 return TRUE
  if (is_divisible_by(year, 4)) {
    return(TRUE)
  }

  # Otherwise return FALSE
  FALSE
}
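
For comparison, here is a minimal sketch of the same early-return logic without assertive, using the modulo operator instead. The name is_leap_year_base() is hypothetical, and the function assumes year is a single number.

```r
is_leap_year_base <- function(year) {
  # Every 400th year is a leap year
  if (year %% 400 == 0) {
    return(TRUE)
  }
  # Other centuries are not leap years
  if (year %% 100 == 0) {
    return(FALSE)
  }
  # Remaining years divisible by 4 are leap years
  if (year %% 4 == 0) {
    return(TRUE)
  }
  # Everything else is not a leap year
  FALSE
}

# The examples from the text: 2000 and 1904 are leap years, 1900 and 1905 are not
is_leap_year_base(2000)
is_leap_year_base(1900)
```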

14.2 Returning invisibly

When the main purpose of a function is to generate output, like drawing a plot or printing something in the console, you may not want a return value to be printed as well. In that case, the value should be invisibly returned.
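
A small sketch of what invisible() changes, using only base R. The function announce() is a made-up example, not part of the exercise.

```r
# Hypothetical helper: print a message, then return x without auto-printing
announce <- function(x) {
  cat("The value is", x, "\n")
  invisible(x)
}

# Called at the console, this prints the message but not the value
announce(42)

# The value is still returned and can be captured
result <- announce(42)
result

# withVisible() reports whether a value would be auto-printed
visibility <- withVisible(announce(42))$visible
```

Here visibility is FALSE, while result still holds 42.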

The base R plot function returns NULL, since its main purpose is to draw a plot. This isn’t helpful if you want to use it in piped code: instead it should invisibly return the plot data to be piped on to the next step.

Recall that plot() has a formula interface: instead of giving it vectors for x and y, you can specify a formula describing which columns of a data frame go on the x and y axes, and a data argument for the data frame. Note that just like lm(), the arguments are the wrong way round because the detail argument, formula, comes before the data argument.

plot(y ~ x, data = data)

Instructions 100 XP

  • Using the cars dataset and the formula interface to plot(), draw a scatter plot of dist versus speed.

  • Give pipeable_plot() data and formula arguments (in that order) and make it draw the plot, then invisibly return data.

  • Draw the scatter plot of dist vs. speed again by calling pipeable_plot().

ex_17.R
# Using cars, draw a scatter plot of dist vs. speed
plt_dist_vs_speed <- plot(dist ~ speed, data = cars)

# Oh no! The plot object is NULL
plt_dist_vs_speed

# Define a pipeable plot fn with data and formula args
pipeable_plot <- function(data, formula) {
  # Call plot() with the formula interface
  plot(formula, data)
  # Invisibly return the input dataset
  invisible(data)
}

# Draw the scatter plot of dist vs. speed again
library(magrittr)  # provides the %>% pipe
plt_dist_vs_speed <- cars %>%
  pipeable_plot(dist ~ speed)

14.3 Returning many things

Functions can only return one value. If you want to return multiple things, then you can store them all in a list.

If users want to have the list items as separate variables, they can assign each list element to its own variable using zeallot’s multi-assignment operator, %<-%.

glance(), tidy(), and augment() each take the model object as their only argument.
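
To illustrate returning several values in one list, here is a minimal base R sketch. The function summarize_range() and its element names are hypothetical. The zeallot line is left as a comment because it assumes that package is installed.

```r
# Hypothetical helper: return the smallest and largest value together
summarize_range <- function(x) {
  list(
    smallest = min(x),
    largest = max(x)
  )
}

rng <- summarize_range(c(4, 8, 15, 16, 23, 42))

# Pull the elements out one at a time
lo <- rng$smallest
hi <- rng$largest

# With zeallot loaded, the same unpacking is a single line:
# c(lo, hi) %<-% summarize_range(c(4, 8, 15, 16, 23, 42))
```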

The Poisson regression model of Snake River visits is available as model. broom and zeallot are loaded.

Instructions 100 XP

  • Examine the structure of model.

  • Use broom functions on model to create a list containing the model-, coefficient-, and observation-level parts of model.

  • Wrap the code into a function, groom_model(), that accepts model as its only argument.

  • Call groom_model() on model, multi-assigning the result to three variables at once: mdl, cff, and obs.

ex_18.R
# Look at the structure of model (it's a mess!)
str(model)

# Use broom tools to get a list of 3 data frames
list(
  # Get model-level values
  model = glance(model),
  # Get coefficient-level values
  coefficients = tidy(model),
  # Get observation-level values
  observations = augment(model)
)

# Wrap this code into a function, groom_model
groom_model <- function(model){
  list(
    model = glance(model),
    coefficients = tidy(model),
    observations = augment(model)
  )
}

groom_model(model)

# Call groom_model on model, assigning to 3 variables
c(mdl, cff, obs) %<-% groom_model(model)
#c(var1, var2, var3) %<-% fn(args)

# See these individual variables
mdl; cff; obs

14.4 Returning metadata

Sometimes you want to return multiple things from a function, but you want the result to have a particular class (for example, a data frame or a numeric vector), so returning a list isn’t appropriate. This is common when you have a result plus metadata about the result. (Metadata is “data about the data”. For example, it could be the file a dataset was loaded from, or the username of the person who created the variable, or the number of iterations for an algorithm to converge.)

In that case, you can store the metadata in attributes. Recall the syntax for assigning attributes is as follows.

attr(object, "attribute_name") <- attribute_value
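
A small sketch of storing and reading back metadata with attr(). The vector, the attribute name "source_file", and its value are made up for illustration.

```r
# A plain numeric vector as the main result
heights <- c(1.62, 1.75, 1.80)

# Attach metadata without changing the class of the result
attr(heights, "source_file") <- "heights.csv"  # hypothetical file name

# The metadata can be read back later
attr(heights, "source_file")

# The object is still a numeric vector
is.numeric(heights)
```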

Instructions 100 XP

  • Update pipeable_plot() so the result has an attribute named “formula” with the value of formula.

  • plt_dist_vs_speed, which you previously created, is shown. Examine its updated structure.

ex_19.R
pipeable_plot <- function(data, formula) {
  plot(formula, data)
  # Add a "formula" attribute to data
  attr(data, "formula") <- formula
  invisible(data)
}

# From previous exercise
plt_dist_vs_speed <- cars %>% 
  pipeable_plot(dist ~ speed)

# Examine the structure of the result
str(plt_dist_vs_speed)

14.5 Creating and exploring environments

Environments are used to store other variables. Mostly, you can think of them as lists, but there’s an important extra property that is relevant to writing functions. Every environment has a parent environment (except the empty environment, at the root of the environment tree). This determines which variables R knows about at different places in your code.
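
The parent relationship can be seen in a short base R sketch. The environment names and the stored value are hypothetical.

```r
# Create an environment with an explicit parent
outer_env <- new.env()
inner_env <- new.env(parent = outer_env)

# Variables are stored by name, much like list elements
inner_env$answer <- 42
ls(inner_env)

# The parent of inner_env is the environment it was given
identical(parent.env(inner_env), outer_env)

# The empty environment sits at the root of the tree
environmentName(emptyenv())
```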

Facts about the Republic of South Africa are contained in capitals, national_parks, and population.

Instructions 100 XP

  • Create rsa_lst, a named list from capitals, national_parks, and population. Use those values as the names.
  • List the structure of each element of rsa_lst using ls.str().
  • Convert the list to an environment, rsa_env, using list2env().
  • List the structure of each element of rsa_env.
  • Find the parent environment of rsa_env and print its name.

ex_20.R
# Add capitals, national_parks, & population to a named list
rsa_lst <- list(
  capitals = capitals,
  national_parks = national_parks,
  population = population
)

# List the structure of each element of rsa_lst
ls.str(rsa_lst)

# Convert the list to an environment
rsa_env <- list2env(rsa_lst)

# List the structure of each variable
ls.str(rsa_env)

# Find the parent environment of rsa_env
parent <- parent.env(rsa_env)

# Print its name
environmentName(parent)

14.6 Do variables exist?

If R cannot find a variable in the current environment, it will look in the parent environment, then the grandparent environment, and so on until it finds it.
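
The lookup rule can be tried out with two small environments. This is a sketch in base R; the environment names and the stored value are made up.

```r
# A parent environment that holds the variable, and a child that does not
parent_env <- new.env()
child_env <- new.env(parent = parent_env)
assign("population", 100, envir = parent_env)  # made-up value

# With inheritance (the default), R also searches the parent
found_with_inheritance <- exists("population", envir = child_env)

# Without inheritance, only child_env itself is searched
found_without_inheritance <- exists(
  "population",
  envir = child_env,
  inherits = FALSE
)
```

Here found_with_inheritance is TRUE and found_without_inheritance is FALSE.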

rsa_env has been modified so it includes capitals and national_parks, but not population.

Instructions 100 XP

  • Check if population exists in rsa_env, using default inheritance rules.
  • Check if population exists in rsa_env, ignoring inheritance.

ex_21.R
# Compare the contents of the global environment and rsa_env
ls.str(globalenv())
ls.str(rsa_env)

# Does population exist in rsa_env?
exists("population", envir = rsa_env)

# Does population exist in rsa_env, ignoring inheritance?
exists("population", envir = rsa_env, inherits = FALSE)

14.7 Variable precedence 1

Consider this code, run in a fresh R session.

x_plus_y <- function(x) {
  y <- 3
  x + y
}
y <- 7

If you now call x_plus_y(5), what is the result?

14.8 Answer the question 50XP

Possible Answers

An error is thrown.
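
One way to check the answer is to run the code. The sketch below repeats the definitions so it stands alone; the variable result is added for illustration.

```r
x_plus_y <- function(x) {
  # This y is local to the function and is found first
  y <- 3
  x + y
}

# This y lives in the calling environment
y <- 7

# Inside the function, x is 5 and y is the local 3
result <- x_plus_y(5)
result

# The outer y is unchanged by the call
y
```

The local y takes precedence, so result is 8 and the outer y is still 7.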

14.9 Variable precedence 2

Consider this (slightly different) code, run in a fresh R session.

x_plus_y <- function(x) {
  x <- 6
  y <- 3
  x + y
}
y <- 7

If you now call x_plus_y(5), what is the result?

14.10 Answer the question 50XP

Possible Answers

  1. An error is thrown.
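
Again, running the code is one way to check. The sketch repeats the definitions so it stands alone; the variable result is added for illustration.

```r
x_plus_y <- function(x) {
  # Assigning to x inside the function overrides the argument value
  x <- 6
  y <- 3
  x + y
}

y <- 7

# The argument 5 is replaced by 6 before the sum is computed
result <- x_plus_y(5)
result
```

Values assigned inside the function take precedence over both the argument passed in and the outer y, so result is 9.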