24  Importing data from the web (Part 2)

24.1 From JSON to R

In the simplest setting, fromJSON() can convert character strings that represent JSON data into a nicely structured R list. Give it a try!

Instructions 100 XP

  • Load the jsonlite package. It’s already installed on DataCamp’s servers.
  • wine_json represents a JSON. Use fromJSON() to convert it to a list, named wine.
  • Display the structure of wine.
ex_017.R
# Load the jsonlite package
library(jsonlite)

# wine_json is a JSON
wine_json <- '
    {
        "name":"Chateau Migraine",
        "year":1997,
        "alcohol_pct":12.4,
        "color":"red",
        "awarded":false
    }
'

# Convert wine_json into a list: wine
wine <- fromJSON(wine_json)

# Print structure of wine
str(wine)
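
To see how the JSON types in this exercise land in R, the following minimal sketch reuses the wine_json content and inspects the class of each element. It assumes only that jsonlite is installed; nothing else is introduced beyond the names used above.

```r
library(jsonlite)

wine_json <- '{"name":"Chateau Migraine","year":1997,"alcohol_pct":12.4,"color":"red","awarded":false}'
wine <- fromJSON(wine_json)

# JSON strings become character, JSON numbers become integer or double,
# and true/false become logical
sapply(wine, class)

# Individual elements can be reached with $ or [[ ]]
wine$name
wine[["awarded"]]
```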

24.2 Quandl API

As Filip showed in the video, fromJSON() also works if you pass a URL as a character string or the path to a local file that contains JSON data. Let’s try this out on the Quandl API, where you can fetch all sorts of financial and economic data.

Instructions 100 XP

  • quandl_url represents a URL. Use fromJSON() directly on this URL and store the result in quandl_data.
  • Display the structure of quandl_data.
ex_018.R
# jsonlite is preloaded

# Definition of quandl_url
quandl_url <- "https://www.quandl.com/api/v3/datasets/WIKI/FB/data.json?auth_token=i83asDsiWUUyfoypkgMz"

# Import Quandl data: quandl_data
quandl_data <- fromJSON(quandl_url)

# Print structure of quandl_data
str(quandl_data)
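
The local-file case mentioned above can be tried without a network connection. The sketch below writes a small, made-up JSON document to a temporary file and reads it back; the file content and the names json_path and local_data are illustrative assumptions, not part of the exercise.

```r
library(jsonlite)

# Write a small JSON document to a temporary file (a stand-in for a real download)
json_path <- tempfile(fileext = ".json")
writeLines('{"dataset": "demo", "values": [1, 2, 3]}', json_path)

# fromJSON() accepts the file path just as it accepts a URL or a JSON string
local_data <- fromJSON(json_path)
str(local_data)

# Clean up the temporary file
unlink(json_path)
```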

24.3 OMDb API

In the video, you saw how easy it is to interact with an API once you know how to formulate requests. You also saw how to fetch all information on Rain Man from OMDb. Simply perform a GET() call, and next ask for the contents with the content() function. This content() function, which is part of the httr package, uses jsonlite behind the scenes to import the JSON data into R.

However, by now you also know that jsonlite can handle URLs itself. Simply passing the request URL to fromJSON() will get your data into R. In this exercise, you will be using this technique to compare the release year of two movies in the Open Movie Database.

Instructions 100 XP

  • Two URLs are included in the sample code, as well as a fromJSON() call to build sw4. Add a similar call to build sw3.
  • Print out the element named Title of both sw4 and sw3. You can use the $ operator. What movies are we dealing with here?
  • Write an expression that evaluates to TRUE if sw4 was released later than sw3. This information is stored in the Year element of the named lists.
ex_019.R
# The package jsonlite is already loaded

# Definition of the URLs
url_sw4 <- "http://www.omdbapi.com/?apikey=72bc447a&i=tt0076759&r=json"
url_sw3 <- "http://www.omdbapi.com/?apikey=72bc447a&i=tt0121766&r=json"

# Import two URLs with fromJSON(): sw4 and sw3
sw3 <- fromJSON(url_sw3)
sw4 <- fromJSON(url_sw4)


# Print out the Title element of both lists
print(sw3$Title)
print(sw4$Title)

# Is the release year of sw4 later than sw3?
sw4$Year > sw3$Year
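
One thing to watch when comparing years: if the API delivers Year as a quoted JSON string, the > operator compares text rather than numbers. The sketch below uses hand-written stand-ins for the two responses (the titles and the names sw4_demo and sw3_demo are hypothetical, not real API output) and converts to numeric before comparing.

```r
library(jsonlite)

# Hand-written stand-ins for the two API responses (not real API output)
sw4_demo <- fromJSON('{"Title": "Film A", "Year": "1977"}')
sw3_demo <- fromJSON('{"Title": "Film B", "Year": "2005"}')

# A quoted JSON value arrives in R as a character string
class(sw4_demo$Year)

# Converting first makes the comparison numeric rather than alphabetical
later <- as.numeric(sw4_demo$Year) > as.numeric(sw3_demo$Year)
later
```

For four-digit years the text comparison happens to give the same answer, so the conversion is a safeguard rather than a requirement here.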

24.4 JSON practice (1)

JSON is built on two structures: objects and arrays. To help you experiment with these, two JSON strings are included in the sample code. It’s up to you to change them appropriately and then call jsonlite’s fromJSON() function on them each time.

Instructions 100 XP

  • Change the assignment of json1 such that the R vector after conversion contains the numbers 1 up to 6, in ascending order. Next, call fromJSON() on json1.
  • Adapt the code for json2 such that it’s converted to a named list with two elements: a, containing the numbers 1, 2 and 3 and b, containing the numbers 4, 5 and 6. Next, call fromJSON() on json2.
ex_020.R
# jsonlite is already loaded
# Challenge 1
json1 <- '[1, 2, 3, 4, 5, 6]'
fromJSON(json1)
# Challenge 2
json2 <- '{
    "a": [1, 2, 3],
    "b": [4, 5, 6]
}'
fromJSON(json2)
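
To make the difference between the two structures more visible, this sketch checks what fromJSON() returns for an array and for an object, and shows the simplifyVector argument, which keeps an array as a list instead of collapsing it into a vector. The names vec, lst and obj are illustrative.

```r
library(jsonlite)

json1 <- '[1, 2, 3, 4, 5, 6]'
json2 <- '{"a": [1, 2, 3], "b": [4, 5, 6]}'

# Default: an array of primitives is simplified to an atomic vector
vec <- fromJSON(json1)
is.atomic(vec)

# With simplifyVector = FALSE the array is kept as a list of single values
lst <- fromJSON(json1, simplifyVector = FALSE)
is.list(lst)

# A JSON object becomes a named list; its names are the JSON keys
obj <- fromJSON(json2)
names(obj)
```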

24.5 JSON practice (2)

We prepared two more JSON strings in the sample code. Can you change them and call jsonlite’s fromJSON() function on them, similar to the previous exercise?

Instructions 100 XP

  • Remove characters from json1 to build a 2 by 2 matrix containing only 1, 2, 3 and 4. Call fromJSON() on json1.

  • Add characters to json2 such that the data frame into which the JSON is converted contains an additional observation in the last row. For this observation, a equals 5 and b equals 6. Call fromJSON() one last time, on json2.

ex_021.R
# jsonlite is already loaded
# Challenge 1
json1 <- '[[1, 2], [3, 4]]'
fromJSON(json1)

# Challenge 2
json2 <-
   '[{"a": 1, "b": 2}, {"a": 3, "b": 4}, {"a": 5, "b": 6}
]'
fromJSON(json2)
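
The simplification rules behind these two challenges can be checked directly. The sketch below is a minimal illustration, assuming jsonlite's default simplification settings; the names m, df and df_na are hypothetical.

```r
library(jsonlite)

# An array of equal-length arrays is simplified to a matrix, one inner array per row
m <- fromJSON('[[1, 2], [3, 4]]')
dim(m)

# An array of objects is simplified to a data frame, one object per row
df <- fromJSON('[{"a": 1, "b": 2}, {"a": 3, "b": 4}, {"a": 5, "b": 6}]')
nrow(df)

# A key that is missing from one object shows up as NA in that row
df_na <- fromJSON('[{"a": 1, "b": 2}, {"a": 3}]')
df_na$b
```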

24.7 toJSON()

Apart from converting JSON to R with fromJSON(), you can also use toJSON() to convert R data to a JSON format. In its most basic use, you simply pass this function an R object to convert to a JSON. The result is an R object of the class json, which is basically a character string representing that JSON.

For this exercise, you will be working with a .csv file containing information on the amount of desalinated water that is produced around the world. As you’ll see, it contains a lot of missing values. This data can be found at the URL that is specified in the sample code.

Instructions 100 XP

  • Use a function of the utils package to import the .csv file directly from the URL specified in url_csv. Save the resulting data frame as water. Make sure that strings are not imported as factors.
  • Convert the data frame water to a JSON. Call the resulting object water_json.
  • Print out water_json.
ex_021.R
# jsonlite is already loaded

# URL pointing to the .csv file
url_csv <- "http://s3.amazonaws.com/assets.datacamp.com/production/course_1478/datasets/water.csv"

# Import the .csv file located at url_csv
water <- read.csv(url_csv, stringsAsFactors = FALSE)

# Convert the data file according to the requirements
water_json <- toJSON(water)

# Print out water_json
print(water_json)
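
Since the water data contains many missing values, it may help to see how a missing value survives a round trip through JSON. The sketch below uses a tiny made-up data frame (demo is not the real water data) and assumes jsonlite's default settings for toJSON() and fromJSON().

```r
library(jsonlite)

# A tiny made-up data frame with one missing value (not the real water data)
demo <- data.frame(country = c("A", "B"),
                   amount = c(1.5, NA),
                   stringsAsFactors = FALSE)

# toJSON() returns an object of class "json"
demo_json <- toJSON(demo)
class(demo_json)
demo_json

# Converting back gives a data frame again, with NA where the value was missing
demo_back <- fromJSON(demo_json)
demo_back
```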

24.8 Minify and prettify

JSONs can come in different formats. Take these two JSONs, which are in fact exactly the same: the first one is in a minified format, the second one is in a pretty format with indentation, whitespace and new lines:

# Mini
{"a":1,"b":2,"c":{"x":5,"y":6}}

# Pretty
{
  "a": 1,
  "b": 2,
  "c": {
    "x": 5,
    "y": 6
  }
}

Unless you’re a computer, you surely prefer the second version. However, the standard form that toJSON() returns is the minified version, as it is more concise. You can change this behavior by setting the pretty argument inside toJSON() to TRUE. If you already have a JSON string, you can use prettify() or minify() to make the JSON pretty or as concise as possible.

Instructions 100 XP

  • Convert the mtcars dataset, which is available in R by default, to a pretty JSON. Call the resulting JSON pretty_json.
  • Print out pretty_json. Can you understand the output easily?
  • Convert pretty_json to a minimal version using minify(). Store this version under a new variable, mini_json.
  • Print out mini_json. Which version do you prefer, the pretty one or the minified one?
ex_022.R
# jsonlite is already loaded

# Convert mtcars to a pretty JSON: pretty_json
pretty_json <- toJSON(mtcars, pretty = TRUE)

# Print pretty_json
print(pretty_json)

# Minify pretty_json: mini_json
mini_json <- minify(pretty_json)

# Print mini_json
print(mini_json)
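
prettify() and minify() also work on a plain JSON string, without going through toJSON() first. The sketch below applies them to the small example JSON shown earlier in this section and checks that both forms carry the same data; the names mini, pretty and mini_again are illustrative.

```r
library(jsonlite)

mini <- '{"a":1,"b":2,"c":{"x":5,"y":6}}'

# prettify() adds indentation and line breaks; minify() strips them again
pretty <- prettify(mini)
mini_again <- minify(pretty)

pretty
mini_again

# Both forms parse to the same R object
same_data <- identical(fromJSON(pretty), fromJSON(mini_again))
same_data
```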