13 All About Arguments
13.1 NULL defaults
The cut() function used by cut_by_quantile() can automatically provide sensible labels for each category. The code to generate these labels is pretty complicated, so rather than appearing in the function signature directly, its labels argument defaults to NULL, and the calculation details are shown on the ?cut help page.
Instructions 100 XP
Update the definition of cut_by_quantile() so that the labels argument defaults to NULL. Remove the labels argument from the call to cut_by_quantile().
ex_08.R
# Set the default for labels to NULL
cut_by_quantile <- function(x, n = 5, na.rm = FALSE, labels=NULL,
interval_type) {
probs <- seq(0, 1, length.out = n + 1)
qtiles <- quantile(x, probs, na.rm = na.rm, names = FALSE)
right <- switch(interval_type, "(lo, hi]" = TRUE, "[lo, hi)" = FALSE)
cut(x, qtiles, labels = labels, right = right, include.lowest = TRUE)
}
# Remove the labels argument from the call
cut_by_quantile(
n_visits,
interval_type = "(lo, hi]"
)13.2 Categorical defaults
When cutting up a numeric vector, you need to worry about what happens if a value lands exactly on a boundary. You can either put this value into a category of the lower interval or the higher interval. That is, you can choose your intervals to include values at the top boundary but not the bottom (in mathematical terminology, “open on the left, closed on the right”, or (lo, hi]). Or you can choose the opposite (“closed on the left, open on the right”, or [lo, hi)). cut_by_quantile() should allow these two choices.
The pattern for categorical defaults is:
function(cat_arg = c("choice1", "choice2")) {
cat_arg \<- match.arg(cat_arg)
}Free hint: In the console, type head(rank) to see the start of rank()’s definition, and look at the ties.method argument.
Instructions 100 XP
Update the signature of cut_by_quantile() so that the interval_type argument can be “(lo, hi ]” or ” [lo, hi)“. Note the space after each comma. Update the body of cut_by_quantile() to match the interval_type argument. Remove the interval_type argument from the call to cut_by_quantile().
ex_09.R
# Set the categories for interval_type to "(lo, hi]" and "[lo, hi)"
cut_by_quantile <- function(x, n = 5, na.rm = FALSE, labels = NULL,
interval_type= c("(lo, hi]", "[lo, hi)")) {
# Match the interval_type argument
interval_type <- match.arg(interval_type)
probs <- seq(0, 1, length.out = n + 1)
qtiles <- quantile(x, probs, na.rm = na.rm, names = FALSE)
right <- switch(interval_type, "(lo, hi]" = TRUE, "[lo, hi)" = FALSE)
cut(x, qtiles, labels = labels, right = right, include.lowest = TRUE)
}
# Remove the interval_type argument from the call
cut_by_quantile(n_visits)13.3 Harmonic mean
The harmonic mean is the reciprocal of the arithmetic mean of the reciprocal of the data. That is the harmonic mean is often used to average ratio data. You’ll be using it on the price/earnings ratio of stocks in the Standard and Poor’s 500 index, provided as std_and_poor500. Price/earnings ratio is a measure of how expensive a stock is.
The dplyr package is loaded.
Instructions 100 XP
Look at std_and_poor500 (you’ll need this later). Write a function, get_reciprocal, to get the reciprocal of an input
x. Its only argument should bex, and it should return one overx.Write a function,
calc_harmonic_mean(), that calculates the harmonic mean of its only input,x.Using
std_and_poor500, group by sector, and summarize to calculate the harmonic mean of the price/earning ratios in the pe_ratio column.
ex_10.R
# Look at the Standard and Poor 500 data
glimpse(std_and_poor500)
# Write a function to calculate the reciprocal
get_reciprocal <- function(x){
1/x
}
# From previous step
get_reciprocal <- function(x) {
1 / x
}
# Write a function to calculate the harmonic mean
calc_harmonic_mean <- function(x) {
x %>%
get_reciprocal() %>%
mean() %>%
get_reciprocal()
}
# From previous steps
get_reciprocal <- function(x) {
1 / x
}
calc_harmonic_mean <- function(x) {
x %>%
get_reciprocal() %>%
mean() %>%
get_reciprocal()
}
std_and_poor500 %>%
# Group by sector
group_by(sector) %>%
# Summarize, calculating harmonic mean of P/E ratio
summarize(hmean_pe_ratio = calc_harmonic_mean(pe_ratio))13.4 Dealing with missing values
In the last exercise, many sectors had an NA value for the harmonic mean. It would be useful for your function to be able to remove missing values before calculating.
Rather than writing your own code for this, you can outsource this functionality to mean().
The dplyr package is loaded.
Instructions 100 XP
Modify the signature and body of
calc_harmonic_mean()so it has anna.rmargument, defaulting to false, that gets passed tomean().Using
std_and_poor500, group by sector, and summarize to calculate the harmonic mean of the price/earning ratios in thepe_ratiocolumn, removing missing values.
ex_11.R
# Add an na.rm arg with a default, and pass it to mean()
calc_harmonic_mean <- function(x,na.rm = FALSE) {
x %>%
get_reciprocal() %>%
mean(na.rm =na.rm) %>%
get_reciprocal()
}
# From previous step
calc_harmonic_mean <- function(x, na.rm = FALSE) {
x %>%
get_reciprocal() %>%
mean(na.rm = na.rm) %>%
get_reciprocal()
}
std_and_poor500 %>%
# Group by sector
group_by(sector) %>%
# Summarize, calculating harmonic mean of P/E ratio
summarize(hmean_pe_ratio = calc_harmonic_mean(pe_ratio,
na.rm = TRUE))13.5 Passing arguments with ...
Rather than explicitly giving calc_harmonic_mean() and na.rm argument, you can use ... to simply “pass other arguments” to mean().
The dplyr package is loaded.
Instructions 100 XP
Replace the na.rm argument with … in the signature and body of
calc_harmonic_mean().Using
std_and_poor500, group by sector, and summarize to calculate the harmonic mean of the price/earning ratios in thepe_ratiocolumn, removing missing values.
ex_12.R
# Swap na.rm arg for ... in signature and body
calc_harmonic_mean <- function(x, ...) {
x %>%
get_reciprocal() %>%
mean(...) %>%
get_reciprocal()
}
calc_harmonic_mean(x = c(1, NA, 3, NA, 5))
calc_harmonic_mean <- function(x, ...) {
x %>%
get_reciprocal() %>%
mean(...) %>%
get_reciprocal()
}
std_and_poor500 %>%
# Group by sector
group_by(sector) %>%
# Summarize, calculating harmonic mean of P/E ratio
summarize(hmean_pe_ratio = calc_harmonic_mean(pe_ratio,
na.rm = TRUE))13.6 Throwing errors with bad arguments
If a user provides a bad input to a function, the best course of action is to throw an error letting them know. The two rules are
Throw the error message as soon as you realize there is a problem (typically at the start of the function). Make the error message easily understandable. You can use the assert\_\*() functions from assertive to check inputs and throw errors when they fail.
Instructions 100 XP
- Add a line to the body of
calc_harmonic_mean()to assert thatxis numeric. - Look at what happens when you pass a character argument to
calc_harmonic_mean().
ex_13.R
alc_harmonic_mean <- function(x, na.rm = FALSE) {
# Assert that x is numeric
assert_is_numeric(x)
x %>%
get_reciprocal() %>%
mean(na.rm = na.rm) %>%
get_reciprocal()
}
# See what happens when you pass it strings
calc_harmonic_mean(std_and_poor500$sector)13.7 Custom error logic
Sometimes the assert_*() functions in assertive don’t give the most informative error message. For example, the assertions that check if a number is in a numeric range will tell the user that a value is out of range, but the won’t say why that’s a problem. In that case, you can use the is \_\*() functions in conjunction with messages, warnings, or errors to define custom feedback.
The harmonic mean only makes sense when x has all positive values. (Try calculating the harmonic mean of one and minus one to see why.) Make sure your users know this!
13.7.1 Instructions 100 XP
- If any values of
xare non-positive (ignoring NAs) then throw an error. - Look at what happens when you pass a character argument to
calc_harmonic_mean().
ex_14.R
calc_harmonic_mean <- function(x, na.rm = FALSE) {
assert_is_numeric(x)
# Check if any values of x are non-positive
if(any(is_non_positive(x), na.rm = TRUE)) {
# Throw an error
stop("x contains non-positive values, so the harmonic mean makes no sense.")
}
x %>%
get_reciprocal() %>%
mean(na.rm = na.rm) %>%
get_reciprocal()
}
# See what happens when you pass it negative numbers
calc_harmonic_mean(std_and_poor500$pe_ratio - 20)13.8 Fixing function arguments
The harmonic mean function is almost complete. However, you still need to provide some checks on the na.rm argument. This time, rather than throwing errors when the input is in an incorrect form, you are going to try to fix it.
na.rm should be a logical vector with one element (that is, TRUE, or FALSE).
The assertive package is loaded for you.
Instructions 100 XP
- Update calc_harmonic_mean() to fix the na.rm argument by using
use_first()to select the firstna.rmelement, andcoerce_to()to change it to logical.
ex_15.R
# Update the function definition to fix the na.rm argument
calc_harmonic_mean <- function(x, na.rm = FALSE) {
assert_is_numeric(x)
if(any(is_non_positive(x), na.rm = TRUE)) {
stop("x contains non-positive values, so the harmonic mean makes no sense.")
}
# Use the first value of na.rm, and coerce to logical
na.rm <- coerce_to(use_first(na.rm), target_class = "logical")
x %>%
get_reciprocal() %>%
mean(na.rm = na.rm) %>%
get_reciprocal()
}
# See what happens when you pass it malformed na.rm
calc_harmonic_mean(std_and_poor500$pe_ratio, na.rm = 1:5)