24 Importing data from the web (Part 2)
24.1 From JSON to R
In the simplest setting, fromJSON() can convert character strings that represent JSON data into a nicely structured R list. Give it a try!
Instructions 100 XP
- Load the jsonlite package. It’s already installed on DataCamp’s servers.
- wine_json represents a JSON. Use fromJSON() to convert it to a list, named wine.
- Display the structure of wine
ex_017.R
# Load the jsonlite package
library(jsonlite)
# wine_json is a JSON
wine_json <- '
{
"name":"Chateau Migraine",
"year":1997,
"alcohol_pct":12.4,
"color":"red",
"awarded":false
}
'
# Convert wine_json into a list: wine
wine <- fromJSON(wine_json)
# Print structure of wine
str(wine)
24.2 Quandl API
As Filip showed in the video, fromJSON() also works if you pass a URL as a character string or the path to a local file that contains JSON data. Let’s try this out on the Quandl API, where you can fetch all sorts of financial and economic data.
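The local-file case mentioned above can be tried without any network access. The following is a minimal sketch, assuming jsonlite is installed; the temporary file and its contents are made up for illustration and are not part of the exercise.

```r
library(jsonlite)

# Write a small, made-up JSON document to a temporary file (illustrative only)
json_path <- tempfile(fileext = ".json")
writeLines('{"symbol": "ABC", "prices": [1.5, 2.5, 3.5]}', json_path)

# fromJSON() accepts the file path just as it accepts a JSON string or a URL
local_data <- fromJSON(json_path)
str(local_data)

# Clean up the temporary file
unlink(json_path)
```

The call itself is the same as in the exercise; only the argument changes from a URL to a path.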
Instructions 100 XP
- quandl_url represents a URL. Use fromJSON() directly on this URL and store the result in quandl_data.
- Display the structure of quandl_data.
ex_018.R
# jsonlite is preloaded
# Definition of quandl_url
quandl_url <- "https://www.quandl.com/api/v3/datasets/WIKI/FB/data.json?auth_token=i83asDsiWUUyfoypkgMz"
# Import Quandl data: quandl_data
quandl_data <- fromJSON(quandl_url)
# Print structure of quandl_data
str(quandl_data)
24.3 OMDb API
In the video, you saw how easy it is to interact with an API once you know how to formulate requests. You also saw how to fetch all information on Rain Man from OMDb. Simply perform a GET() call, and next ask for the contents with the content() function. This content() function, which is part of the httr package, uses jsonlite behind the scenes to import the JSON data into R.
However, by now you also know that jsonlite can handle URLs itself. Simply passing the request URL to fromJSON() will get your data into R. In this exercise, you will be using this technique to compare the release year of two movies in the Open Movie Database.
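One detail worth checking before comparing elements of two imported lists is their type. If an API delivers years as JSON strings (an assumption here; run str() on the actual response to find out), then > compares them as text. The sketch below uses two hand-written stand-in strings, not real API output, to show an explicit conversion before the comparison.

```r
library(jsonlite)

# Hand-written stand-ins for two responses (illustrative, not real API output);
# the Year values are deliberately stored as JSON strings
movie_a_json <- '{"Title": "Movie A", "Year": "1977"}'
movie_b_json <- '{"Title": "Movie B", "Year": "2005"}'

movie_a <- fromJSON(movie_a_json)
movie_b <- fromJSON(movie_b_json)

# A JSON string stays a character value in R
class(movie_a$Year)

# Convert explicitly so the comparison is numeric rather than alphabetical
as.integer(movie_a$Year) > as.integer(movie_b$Year)
```

For four-digit years both comparisons agree, so the conversion mainly guards against values of different lengths.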
Instructions 100 XP
- Two URLs are included in the sample code, as well as a fromJSON() call to build sw4. Add a similar call to build sw3.
- Print out the element named Title of both sw4 and sw3. You can use the $ operator. What movies are we dealing with here?
- Write an expression that evaluates to TRUE if sw4 was released later than sw3. This information is stored in the Year element of the named lists.
ex_019.R
# The package jsonlite is already loaded
# Definition of the URLs
url_sw4 <- "http://www.omdbapi.com/?apikey=72bc447a&i=tt0076759&r=json"
url_sw3 <- "http://www.omdbapi.com/?apikey=72bc447a&i=tt0121766&r=json"
# Import two URLs with fromJSON(): sw4 and sw3
sw3 <- fromJSON(url_sw3)
sw4 <- fromJSON(url_sw4)
# Print out the Title element of both lists
print(sw3$Title)
print(sw4$Title)
# Is the release year of sw4 later than sw3?
sw4$Year > sw3$Year
24.4 JSON practice (1)
JSON is built on two structures: objects and arrays. To help you experiment with these, two JSON strings are included in the sample code. It’s up to you to change them appropriately and then call jsonlite’s fromJSON() function on them each time.
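To see how the two structures map to R, here is a minimal sketch, assuming jsonlite is installed. The JSON strings are made up for illustration and differ from the ones in the exercise.

```r
library(jsonlite)

# A JSON array of numbers is simplified to an atomic vector
arr <- fromJSON('[10, 20, 30]')

# A JSON object becomes a named list; its values can themselves be arrays
obj <- fromJSON('{"id": 7, "tags": ["x", "y"]}')

str(arr)
str(obj)
```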
Instructions 100 XP
- Change the assignment of json1 such that the R vector after conversion contains the numbers 1 up to 6, in ascending order. Next, call fromJSON() on json1.
- Adapt the code for json2 such that it’s converted to a named list with two elements: a, containing the numbers 1, 2 and 3, and b, containing the numbers 4, 5 and 6. Next, call fromJSON() on json2.
ex_020.R
# jsonlite is already loaded
# Challenge 1
json1 <- '[1, 2, 3, 4, 5, 6]'
fromJSON(json1)
# Challenge 2
json2 <- '{
"a": [1, 2, 3],
"b": [4, 5, 6]
}'
fromJSON(json2)
24.5 JSON practice (2)
We prepared two more JSON strings in the sample code. Can you change them and call jsonlite’s fromJSON() function on them, similar to the previous exercise?
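Nested arrays and arrays of objects are simplified further by fromJSON(). The sketch below, assuming jsonlite is installed and using made-up strings, shows both cases and what the same input looks like with simplification switched off through the simplifyVector argument.

```r
library(jsonlite)

# An array of equal-length arrays is simplified to a matrix (one row per inner array)
m <- fromJSON('[[10, 20, 30], [40, 50, 60]]')
dim(m)

# An array of objects with the same keys is simplified to a data frame
df <- fromJSON('[{"x": 1, "y": "p"}, {"x": 2, "y": "q"}]')
str(df)

# Turning simplification off keeps the raw nested-list structure
nested <- fromJSON('[[10, 20, 30], [40, 50, 60]]', simplifyVector = FALSE)
str(nested)
```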
Instructions 100 XP
- Remove characters from json1 to build a 2 by 2 matrix containing only 1, 2, 3 and 4. Call fromJSON() on json1.
- Add characters to json2 such that the data frame into which the JSON is converted contains an additional observation in the last row. For this observation, a equals 5 and b equals 6. Call fromJSON() one last time, on json2.
ex_021.R
# jsonlite is already loaded
# Challenge 1
json1 <- '[[1, 2], [3, 4]]'
fromJSON(json1)
# Challenge 2
json2 <-
'[{"a": 1, "b": 2}, {"a": 3, "b": 4}, {"a": 5, "b": 6}
]'
fromJSON(json2)
24.7 toJSON()
Apart from converting JSON to R with fromJSON(), you can also use toJSON() to convert R data to a JSON format. In its most basic use, you simply pass this function an R object to convert to a JSON. The result is an R object of the class json, which is basically a character string representing that JSON.
For this exercise, you will be working with a .csv file containing information on the amount of desalinated water that is produced around the world. As you’ll see, it contains a lot of missing values. This data can be found on the URL that is specified in the sample code.
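Before converting the full dataset, it can help to look at a tiny example. The sketch below assumes jsonlite is installed and uses a made-up two-row data frame, not the water data. It checks the class of the result and shows the na argument of toJSON(), which controls how missing values are written.

```r
library(jsonlite)

# A tiny, made-up data frame with one missing value (illustrative only)
df <- data.frame(country = c("A", "B"), volume = c(1.5, NA),
                 stringsAsFactors = FALSE)

# Default conversion; the result carries the class "json"
json_default <- toJSON(df)
class(json_default)

# With na = "null", missing values are written as JSON null
json_null <- toJSON(df, na = "null")

json_default
json_null
```

Comparing the two printed strings shows how the missing value is represented in each case.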
Instructions 100 XP
- Use a function of the utils package to import the .csv file directly from the URL specified in url_csv. Save the resulting data frame as water. Make sure that strings are not imported as factors.
- Convert the data frame water to a JSON. Call the resulting object water_json.
- Print out water_json.
ex_021.R
# jsonlite is already loaded
# URL pointing to the .csv file
url_csv <- "http://s3.amazonaws.com/assets.datacamp.com/production/course_1478/datasets/water.csv"
# Import the .csv file located at url_csv
water <- read.csv(url_csv, stringsAsFactors = FALSE)
# Convert the data file according to the requirements
water_json <- toJSON(water)
# Print out water_json
print(water_json)
24.8 Minify and prettify
JSONs can come in different formats. Take these two JSONs, that are in fact exactly the same: the first one is in a minified format, the second one is in a pretty format with indentation, whitespace and new lines:
# Mini
{"a":1,"b":2,"c":{"x":5,"y":6}}
# Pretty
{
"a": 1,
"b": 2,
"c": {
"x": 5,
"y": 6
}
}
Unless you’re a computer, you surely prefer the second version. However, toJSON() returns the minified version by default, as it is more concise. You can change this behavior by setting the pretty argument inside toJSON() to TRUE. If you already have a JSON string, you can use prettify() or minify() to make the JSON pretty or as concise as possible.
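The round trip between the two formats can be sketched on the small JSON shown above, assuming jsonlite is installed. The variable names mini and pretty are chosen here for illustration.

```r
library(jsonlite)

mini <- '{"a":1,"b":2,"c":{"x":5,"y":6}}'

# prettify() adds indentation and line breaks to an existing JSON string
pretty <- prettify(mini)
pretty

# minify() strips the whitespace again
minify(pretty)

# Both strings parse to the same R object
identical(fromJSON(mini), fromJSON(pretty))
```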
Instructions 100 XP
- Convert the mtcars dataset, which is available in R by default, to a pretty JSON. Call the resulting JSON pretty_json.
- Print out pretty_json. Can you understand the output easily?
- Convert pretty_json to a minimal version using minify(). Store this version under a new variable, mini_json.
- Print out mini_json. Which version do you prefer, the pretty one or the minified one?
ex_022.R
# jsonlite is already loaded
# Convert mtcars to a pretty JSON: pretty_json
pretty_json <- toJSON(mtcars, pretty = TRUE)
# Print pretty_json
print(pretty_json)
# Minify pretty_json: mini_json
mini_json <- minify(pretty_json)
# Print mini_json
print(mini_json)