Extracting Data from OpenStreetMap

Introduction

This tutorial covers obtaining free OpenStreetMap (OSM) data. OSM is a free and open map of the world created largely through the voluntary contributions of millions of people around the world, much like Wikipedia. Since the data is free and open, there are far fewer restrictions on obtaining and using it. The main condition of using OSM data is proper attribution to OSM contributors.

Comparison between OSM and Google

Data availability is reasonably good in both Google and OSM in most parts of the world. However, Google's coverage tends to be better in places with more commercial interest, and OSM's in places with more humanitarian interest. You can use Map Compare to compare OSM and Google in particular locations.

OSM data is free to download. Overpass can be used free of cost to download small amounts of data. For large datasets, use Geofabrik. Google requires users to pay based on the volume of data served beyond a limited daily quota. Google's policies change frequently, so expect that code relying on its services will break from time to time.

Downloading data

OSM serves two APIs: the Main API for editing OSM, and the Overpass API for providing OSM data. We will use the Overpass API to gather data in this tutorial, just like in the tutorial about OSM and QGIS.

Data can be queried for download using a combination of search criteria like location and type of object. It helps to understand how OSM data is structured. OSM data is stored as geospatial objects (points, lines or polygons), each carrying a list of attributes tagged as key-value pairs. For example, for an architect’s office, the key is “office” and the value is “architect.” For the name of the office, the key is “name” and the value is “ABC Design Studio.” An extensive list of key-value pairs is available on the OSM Wiki Map features page.
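
The tag structure can be pictured with a small base-R sketch. The objects, their tags, and the helper has_tag() are made up for illustration; this is not how OSM or any R package stores data internally.

```r
# Each OSM object carries a set of tags; here each object is a named
# character vector. All objects below are hypothetical examples.
osm_objects <- list(
  c(office = "architect", name = "ABC Design Studio"),
  c(amenity = "restaurant", name = "Example Diner"),
  c(amenity = "cafe")
)

# Hypothetical helper: TRUE when an object has `key` set to `value`.
has_tag <- function(tags, key, value) {
  key %in% names(tags) && tags[[key]] == value
}

# Keep only the objects tagged amenity=restaurant.
is_restaurant <- vapply(osm_objects, has_tag, logical(1),
                        key = "amenity", value = "restaurant")
restaurants <- osm_objects[is_restaurant]
```

A query by key and value is, conceptually, this kind of filter applied on the server to every object inside the requested area.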

Obtaining point locations of restaurants in Durham from OSM

Restaurants are tagged under amenities. Amenities, according to the OSM Wiki, are facilities used by visitors and residents. Here, the key is “amenity” and the value is “restaurant.” Do not forget to look for related amenities such as “pub”, “food_court”, “cafe”, “fast_food”, etc. Other amenities include “university”, “music_school”, “kindergarten” and the like in education; “bus_station”, “fuel”, “parking” and others in transportation; and many more.

library(osmdata)
library(sf)
library(tidyverse)
library(leaflet)
library(widgetframe)

data_from_osm_df <- opq (getbb ("Durham, North Carolina")) %>% #gets bounding box
  add_osm_feature(key = "amenity", value = "restaurant") %>% #searches for restaurants
  osmdata_sf() #download OSM data as sf

#select name and geometry from point data for restaurants
cafe_osm <- data_from_osm_df$osm_points %>% #select point data from downloaded OSM data
  select(name, geometry) #for now just selecting the name and geometry to plot

#create a plot in leaflet
m1 <-
leaflet() %>%
  addProviderTiles("CartoDB.Positron") %>%
  addCircles(data = cafe_osm)

frameWidget(m1)
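
To pick up the related amenities mentioned earlier, several values can be passed for the same key. The sketch below only builds the query and prints the Overpass text, so nothing is downloaded. The bounding box is a rough hand-typed assumption for Durham rather than the output of getbb(), and the object names are illustrative.

```r
library(osmdata)

# Approximate bounding box for Durham, NC as c(xmin, ymin, xmax, ymax).
# These coordinates are an assumption, not the result of getbb().
durham_bbox <- c(-79.01, 35.87, -78.76, 36.14)

# Tag values as written in OSM (note the underscores).
food_values <- c("restaurant", "cafe", "pub", "fast_food", "food_court")

# Build the query; no request is sent at this point.
base_query <- opq(bbox = durham_bbox)
food_query <- add_osm_feature(base_query, key = "amenity", value = food_values)

# Inspect the Overpass QL text that would be sent to the server.
food_query_text <- opq_string(food_query)
cat(food_query_text)

# Uncomment to download the data (requires network access):
# food_osm <- osmdata_sf(food_query)
```

Printing the query first is a cheap way to check the tags before sending a request to a shared public server.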

It is helpful to learn about the distinctions between different tags (key-value pairs). For example, the key “landuse” is used to describe the purpose for which an area is being used. Examples of values for the key “landuse” are “commercial”, “retail”, “vineyard”, “cemetery”, “religious”, etc. Landuse tags are more generic than amenities and are only used for area objects while amenities can also be used for point objects. In case of any confusion, refer to Map features.

Various amenities, land-use, roads (e.g. key=“highway”, value = “primary”, “service”, “footway”), natural land features (key=“natural”, value = “grassland”), settlements (key = “place”, value = “suburb”), power (key=“power”, value=“line”, “pole”, “transformer”), etc. may be useful in planning applications.
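
The same pattern applies to other keys. The sketch below wraps the query construction in a hypothetical helper, build_query(), and notes where the matching geometries would usually be found after download. The bounding box is again an approximate assumption, and the download lines are left commented out because they need network access.

```r
library(osmdata)

# Approximate bounding box for Durham, NC (assumed, hand-typed).
durham_bbox <- c(-79.01, 35.87, -78.76, 36.14)

# Hypothetical helper wrapping opq() and add_osm_feature().
build_query <- function(key, value, bbox = durham_bbox) {
  add_osm_feature(opq(bbox = bbox), key = key, value = value)
}

road_query    <- build_query("highway", c("primary", "service", "footway"))
landuse_query <- build_query("landuse", "retail")

road_query_text    <- opq_string(road_query)
landuse_query_text <- opq_string(landuse_query)

# After downloading, roads are typically found in $osm_lines and
# land-use areas in $osm_polygons:
# roads        <- osmdata_sf(road_query)$osm_lines
# retail_areas <- osmdata_sf(landuse_query)$osm_polygons
```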

Comparing with Google Places

It is useful to compare the output of OSM to Google Places. I am going to use the googleway package for this analysis.

To make this portion of the code work, you will need an API key from Google. Instructions to get and set an API key are located here.
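
One way to keep the key out of the script is to read it from an environment variable. This is a minimal sketch: the variable name GOOGLE_API_KEY and the helper get_google_key() are assumptions for illustration, not something the googleway package requires.

```r
# Hypothetical helper: read the API key from an environment variable and
# fail early with a clear message if it has not been set.
get_google_key <- function(var = "GOOGLE_API_KEY") {
  key <- Sys.getenv(var, unset = "")
  if (!nzchar(key)) {
    stop("Set the ", var, " environment variable before querying Google.")
  }
  key
}

# Usage (assumes the variable is defined, for example in ~/.Renviron):
# YOUR_API_KEY <- get_google_key()
```

Keeping the key in an environment variable means the script can be shared or published without exposing it.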

Note the use of loops to get the next page.

library(googleway)

str <- "restaurants in Durham, NC"  # Construct a search string

res <- google_places(search_string = str, key = YOUR_API_KEY)  #Query google servers. Do not forget to set your Google API key 

str(res)
## List of 4
##  $ html_attributions: list()
##  $ next_page_token  : chr "AfLeUgMGT-mwxcivbE0RqYwAt129UKfk0t72RjKr4882rjVi4LsM5k4ApcVC0veEtDM2AaDtIXm_AwTDvhyNuR7dMA80Cjxs9j18FFjUQOLfgBz"| __truncated__
##  $ results          :'data.frame':   20 obs. of  16 variables:
##   ..$ business_status      : chr [1:20] "OPERATIONAL" "OPERATIONAL" "OPERATIONAL" "OPERATIONAL" ...
##   ..$ formatted_address    : chr [1:20] "737 9th St #210, Durham, NC 27705, United States" "8128 Renaissance Pkwy #114, Durham, NC 27713, United States" "315 E Chapel Hill St, Durham, NC 27701, United States" "1200 W Chapel Hill St, Durham, NC 27701, United States" ...
##   ..$ geometry             :'data.frame':    20 obs. of  2 variables:
##   .. ..$ location:'data.frame':  20 obs. of  2 variables:
##   .. .. ..$ lat: num [1:20] 36 35.9 36 36 36 ...
##   .. .. ..$ lng: num [1:20] -78.9 -79 -78.9 -78.9 -78.9 ...
##   .. ..$ viewport:'data.frame':  20 obs. of  2 variables:
##   .. .. ..$ northeast:'data.frame':  20 obs. of  2 variables:
##   .. .. .. ..$ lat: num [1:20] 36 35.9 36 36 36 ...
##   .. .. .. ..$ lng: num [1:20] -78.9 -79 -78.9 -78.9 -78.9 ...
##   .. .. ..$ southwest:'data.frame':  20 obs. of  2 variables:
##   .. .. .. ..$ lat: num [1:20] 36 35.9 36 36 36 ...
##   .. .. .. ..$ lng: num [1:20] -78.9 -79 -78.9 -78.9 -78.9 ...
##   ..$ icon                 : chr [1:20] "https://maps.gstatic.com/mapfiles/place_api/icons/v1/png_71/restaurant-71.png" "https://maps.gstatic.com/mapfiles/place_api/icons/v1/png_71/restaurant-71.png" "https://maps.gstatic.com/mapfiles/place_api/icons/v1/png_71/restaurant-71.png" "https://maps.gstatic.com/mapfiles/place_api/icons/v1/png_71/restaurant-71.png" ...
##   ..$ icon_background_color: chr [1:20] "#FF9E67" "#FF9E67" "#FF9E67" "#FF9E67" ...
##   ..$ icon_mask_base_uri   : chr [1:20] "https://maps.gstatic.com/mapfiles/place_api/icons/v2/restaurant_pinlet" "https://maps.gstatic.com/mapfiles/place_api/icons/v2/restaurant_pinlet" "https://maps.gstatic.com/mapfiles/place_api/icons/v2/restaurant_pinlet" "https://maps.gstatic.com/mapfiles/place_api/icons/v2/restaurant_pinlet" ...
##   ..$ name                 : chr [1:20] "Juju Durham" "Harvest 18 Restaurant" "The Restaurant at The Durham" "GRUB Durham" ...
##   ..$ opening_hours        :'data.frame':    20 obs. of  1 variable:
##   .. ..$ open_now: logi [1:20] FALSE FALSE FALSE TRUE FALSE FALSE ...
##   ..$ photos               :List of 20
##   .. ..$ :'data.frame':  1 obs. of  4 variables:
##   .. .. ..$ height           : int 1365
##   .. .. ..$ html_attributions:List of 1
##   .. .. .. ..$ : chr "<a href=\"https://maps.google.com/maps/contrib/101715461096256363950\">Juju Asian Tapas + Bar</a>"
##   .. .. ..$ photo_reference  : chr "AfLeUgNgzxuFSzGqNULuskoX1J1eSYN1MoW3YEQ9c7PkXcg4qWXatHFQ4Rb5cwIQr7l-ga-4lijhr69kK6U63LVpoR7ba-XZ9zzumHPUSoZkE4V"| __truncated__
##   .. .. ..$ width            : int 2048
##   .. ..$ :'data.frame':  1 obs. of  4 variables:
##   .. .. ..$ height           : int 2340
##   .. .. ..$ html_attributions:List of 1
##   .. .. .. ..$ : chr "<a href=\"https://maps.google.com/maps/contrib/114770559454901883104\">Mike Little</a>"
##   .. .. ..$ photo_reference  : chr "AfLeUgPPIA99W25syCwvvcQYxFUrut0bFaPBZl5jprnhTIRyGdcr0QDfzDOikz4Hp1Cjhg7k7E4iYQCGFWk2-uZgbns_csK-P3xcrxnZebT2q70"| __truncated__
##   .. .. ..$ width            : int 4160
##   .. ..$ :'data.frame':  1 obs. of  4 variables:
##   .. .. ..$ height           : int 3024
##   .. .. ..$ html_attributions:List of 1
##   .. .. .. ..$ : chr "<a href=\"https://maps.google.com/maps/contrib/103711833847435042805\">Erik Newby</a>"
##   .. .. ..$ photo_reference  : chr "AfLeUgMcxBgjOJDODXoQaFqHrMF-yY6eEKkRn-LhM076R41KFhumuBtiv8zjuw6yXHBfaAL_APZqjefYigLmDkpicqiayu7HvtmYc7Be8rFFYY3"| __truncated__
##   .. .. ..$ width            : int 4032
##   .. ..$ :'data.frame':  1 obs. of  4 variables:
##   .. .. ..$ height           : int 3024
##   .. .. ..$ html_attributions:List of 1
##   .. .. .. ..$ : chr "<a href=\"https://maps.google.com/maps/contrib/114870547795315215397\">James Goerke</a>"
##   .. .. ..$ photo_reference  : chr "AfLeUgPMD9yIH36j5iqlrH_muls2Z3d76MRXuGriZKTcRB45kzgYghXWPiiOqgiss4XovnhvEzrA1SsrkGaL2AmkuYgu5d3wlmKaXTY3p5ZF2pF"| __truncated__
##   .. .. ..$ width            : int 4032
##   .. ..$ :'data.frame':  1 obs. of  4 variables:
##   .. .. ..$ height           : int 3072
##   .. .. ..$ html_attributions:List of 1
##   .. .. .. ..$ : chr "<a href=\"https://maps.google.com/maps/contrib/106155315629813478210\">John Vinueza</a>"
##   .. .. ..$ photo_reference  : chr "AfLeUgPxyiChtj0yxR6NMCSAyHyP1C0v-aw7ROgHBa4XqABXYj2euzeR9vsXSNckahZHOuk42N6U47k5-8NDCzRTqF_14y_jR1Xr8zCnfHDskpe"| __truncated__
##   .. .. ..$ width            : int 4080
##   .. ..$ :'data.frame':  1 obs. of  4 variables:
##   .. .. ..$ height           : int 1280
##   .. .. ..$ html_attributions:List of 1
##   .. .. .. ..$ : chr "<a href=\"https://maps.google.com/maps/contrib/108889804668316321834\">Local 22 Kitchen And Bar</a>"
##   .. .. ..$ photo_reference  : chr "AfLeUgPZqAgMiH1Cj_5Ml-pqlh2Bpa7eRf7hWYaC-sjNAW-qyMHYP7KBHOvmHEcOuHALI8EM5zteRzzzbD_2HuVK4rbkhJuPdPEjxhS1aJlw55w"| __truncated__
##   .. .. ..$ width            : int 1920
##   .. ..$ :'data.frame':  1 obs. of  4 variables:
##   .. .. ..$ height           : int 2268
##   .. .. ..$ html_attributions:List of 1
##   .. .. .. ..$ : chr "<a href=\"https://maps.google.com/maps/contrib/116309041934196405059\">Vishwanath Math</a>"
##   .. .. ..$ photo_reference  : chr "AfLeUgNDxdslnjs0TSAfW8xZ2G6yKznysWRwifnJ_M1O-fFl1wO-dxca1PJE50xJkk5Z9oZt42GWr5zmEuSdC218K6_TWf9RkPjQSj6YCzH6tsm"| __truncated__
##   .. .. ..$ width            : int 4032
##   .. ..$ :'data.frame':  1 obs. of  4 variables:
##   .. .. ..$ height           : int 3024
##   .. .. ..$ html_attributions:List of 1
##   .. .. .. ..$ : chr "<a href=\"https://maps.google.com/maps/contrib/108963259167865212631\">BRIAN MASSENGILL</a>"
##   .. .. ..$ photo_reference  : chr "AfLeUgMRLSDmOeN5XmRRLtenMPy7mdfSlyxTX0wgvoI0cshqMcaYS0M3_kNYu5dL_5TIAb7zvhVmgnX27K-u2NdncbLJG5flZFm181NU0a3ZI8p"| __truncated__
##   .. .. ..$ width            : int 4032
##   .. ..$ :'data.frame':  1 obs. of  4 variables:
##   .. .. ..$ height           : int 3024
##   .. .. ..$ html_attributions:List of 1
##   .. .. .. ..$ : chr "<a href=\"https://maps.google.com/maps/contrib/117366860799003538591\">A Google User</a>"
##   .. .. ..$ photo_reference  : chr "AfLeUgP_hRNFFXZGdI75PiKIxqXFXTlY8__rykRZ2aKHSlWuA_DchjC8tU63cdRXxX5ips1oiVWQCriruFHmM-E4tPxiqAPkbrQDce_fTmnqzIe"| __truncated__
##   .. .. ..$ width            : int 4032
##   .. ..$ :'data.frame':  1 obs. of  4 variables:
##   .. .. ..$ height           : int 3024
##   .. .. ..$ html_attributions:List of 1
##   .. .. .. ..$ : chr "<a href=\"https://maps.google.com/maps/contrib/109300032866188423940\">Zach Brown</a>"
##   .. .. ..$ photo_reference  : chr "AfLeUgOICZG0j3Pyh1V3uJ5kUgcFhu46mfHDQMIf-L5r4Ytv-SAm1qGlBt447IByIr57j6szD56-G-nOrs5ZDEsOjHoFdDRfZF6eRcJXtF5-XKk"| __truncated__
##   .. .. ..$ width            : int 4032
##   .. ..$ :'data.frame':  1 obs. of  4 variables:
##   .. .. ..$ height           : int 1067
##   .. .. ..$ html_attributions:List of 1
##   .. .. .. ..$ : chr "<a href=\"https://maps.google.com/maps/contrib/118359313798311826267\">A Google User</a>"
##   .. .. ..$ photo_reference  : chr "AfLeUgPASPTIaPmGSxahYbOEgFAaqXrAQHPgGzIX7KIkDKv-SfOwbY4bfoRTtQtWoNnnSLiVZa8Qr3Yy31h3Sx3OYBmjpP-HVfHJ24ZPcZS3WYI"| __truncated__
##   .. .. ..$ width            : int 1600
##   .. ..$ :'data.frame':  1 obs. of  4 variables:
##   .. .. ..$ height           : int 3072
##   .. .. ..$ html_attributions:List of 1
##   .. .. .. ..$ : chr "<a href=\"https://maps.google.com/maps/contrib/114917363150643087487\">David Foresman</a>"
##   .. .. ..$ photo_reference  : chr "AfLeUgOX4w-TgQ7Gg5xc8iDY4haqOyxq1zvU5b9o2NifriF-Op-xL6HJC9exho2T8vHFG7gUnEWwxOjxfDS5NRqlOlX-rY9VbYsKdbfqojsZL2e"| __truncated__
##   .. .. ..$ width            : int 4080
##   .. ..$ :'data.frame':  1 obs. of  4 variables:
##   .. .. ..$ height           : int 3024
##   .. .. ..$ html_attributions:List of 1
##   .. .. .. ..$ : chr "<a href=\"https://maps.google.com/maps/contrib/102066483967244766103\">Marv Baker</a>"
##   .. .. ..$ photo_reference  : chr "AfLeUgMEL1DvyfjMEr3FDGmvbs2tG1txo65kjR2e6GSxoTLzHwpJfdB0wzi0kO_5MDOdc2QAF7y7CGppjXZZoshDMiUv0tK3lg8pmaGDKNUcKRG"| __truncated__
##   .. .. ..$ width            : int 4032
##   .. ..$ :'data.frame':  1 obs. of  4 variables:
##   .. .. ..$ height           : int 3024
##   .. .. ..$ html_attributions:List of 1
##   .. .. .. ..$ : chr "<a href=\"https://maps.google.com/maps/contrib/115591303687864814152\">warada shafi</a>"
##   .. .. ..$ photo_reference  : chr "AfLeUgPEm3Ny2re6YXG2yFmGZ9-VZtedKU6Pdja1s4WeeTeVYhooI5jOBgqwHbSwXA5HzHNSZ-3jpdt316T1UV3AP0WQItKaRCznhzFSW07u4Nz"| __truncated__
##   .. .. ..$ width            : int 4032
##   .. ..$ :'data.frame':  1 obs. of  4 variables:
##   .. .. ..$ height           : int 3072
##   .. .. ..$ html_attributions:List of 1
##   .. .. .. ..$ : chr "<a href=\"https://maps.google.com/maps/contrib/117939330467348488790\">protonoid</a>"
##   .. .. ..$ photo_reference  : chr "AfLeUgM-f9F0veS1XgpyjuG-sk9lHguYnn-JzHApG7mHDoUKLIGv-rPaExIxvP9ngcWfdRhpcF1f4CbgqAWhDON_eojLkkMfrn__jsM3Iqm3xgU"| __truncated__
##   .. .. ..$ width            : int 4080
##   .. ..$ :'data.frame':  1 obs. of  4 variables:
##   .. .. ..$ height           : int 1512
##   .. .. ..$ html_attributions:List of 1
##   .. .. .. ..$ : chr "<a href=\"https://maps.google.com/maps/contrib/116117583444734059617\">Mothers &amp; Sons Trattoria</a>"
##   .. .. ..$ photo_reference  : chr "AfLeUgNGmqVkzlzONPHWuqhkBpc9GQRuedQTY7w8lRBBcNrhkK6wjySsTUe4trkk3KPboWvwUpRGdV0KYc3biGRmuUU5TSIz5X9TREJr0Scail5"| __truncated__
##   .. .. ..$ width            : int 2016
##   .. ..$ :'data.frame':  1 obs. of  4 variables:
##   .. .. ..$ height           : int 2721
##   .. .. ..$ html_attributions:List of 1
##   .. .. .. ..$ : chr "<a href=\"https://maps.google.com/maps/contrib/108832116470362120552\">Kimberly Slentz-Kesler</a>"
##   .. .. ..$ photo_reference  : chr "AfLeUgPi22ZdFw8D5q7HgP-evVd5fvUuh_ry12PsACZ5J_7-qLHNEiLeUCuHQpAsM-Vqz1vtSjdpHnOOX0L9bxG9iMZROZGtD6LQK2Q9Sdd6-lE"| __truncated__
##   .. .. ..$ width            : int 4032
##   .. ..$ :'data.frame':  1 obs. of  4 variables:
##   .. .. ..$ height           : int 4032
##   .. .. ..$ html_attributions:List of 1
##   .. .. .. ..$ : chr "<a href=\"https://maps.google.com/maps/contrib/100123088912788695638\">Chris Schwarz</a>"
##   .. .. ..$ photo_reference  : chr "AfLeUgNQN_h2cGNl6t3S6GBSgmI7d2CBOBHD8kRDRhBs_iJ8eSNQtf9J1nhGzr33VJhXDMCowe-xhy4-INZqzAQAnBkGiTAo31nQqmb0nnSJeph"| __truncated__
##   .. .. ..$ width            : int 3024
##   .. ..$ :'data.frame':  1 obs. of  4 variables:
##   .. .. ..$ height           : int 3264
##   .. .. ..$ html_attributions:List of 1
##   .. .. .. ..$ : chr "<a href=\"https://maps.google.com/maps/contrib/115121000442582864884\">Rachael Lord</a>"
##   .. .. ..$ photo_reference  : chr "AfLeUgP6NoILAb1ve9q3phRmvXuygx6Lno87WupJQq63RxjGl2QvzKWygM-IiyM3_ve-KCx6Bt6RaRSLCcn_iJNa5HvCbPNoUjQaHwpFlIkJytG"| __truncated__
##   .. .. ..$ width            : int 2448
##   .. ..$ :'data.frame':  1 obs. of  4 variables:
##   .. .. ..$ height           : int 3024
##   .. .. ..$ html_attributions:List of 1
##   .. .. .. ..$ : chr "<a href=\"https://maps.google.com/maps/contrib/114464732617145277475\">Mauro Jeronimo Mendoza</a>"
##   .. .. ..$ photo_reference  : chr "AfLeUgMGjK4hHrxBdRwAT6gKHVWZ7wpuFh0TV2hJ80R0JJMBx3ZmkkM6fOHze0Ry7f1tjgxkKzuFufnxrSO9UqgOT0VPmpu6FLAprlIKn4H8fCT"| __truncated__
##   .. .. ..$ width            : int 4032
##   ..$ place_id             : chr [1:20] "ChIJ608CRgfkrIkRzaPpHby3GS4" "ChIJj6foxurorIkRhlH7GB_PMo0" "ChIJNR4KG3LkrIkR2G7VfaFtyWw" "ChIJtQrhphrkrIkRwN9ZgZlJdKU" ...
##   ..$ plus_code            :'data.frame':    20 obs. of  2 variables:
##   .. ..$ compound_code: chr [1:20] "235H+W2 Durham, North Carolina" "W23W+8Q Durham, North Carolina" "X3WX+VJ Durham, North Carolina" "X3WJ+QW Durham, North Carolina" ...
##   .. ..$ global_code  : chr [1:20] "8783235H+W2" "8773W23W+8Q" "8773X3WX+VJ" "8773X3WJ+QW" ...
##   ..$ price_level          : int [1:20] 3 2 2 2 NA 2 2 3 2 2 ...
##   ..$ rating               : num [1:20] 4.5 4.3 4.3 4.4 4.7 4.5 4.4 4.5 4.4 4.6 ...
##   ..$ reference            : chr [1:20] "ChIJ608CRgfkrIkRzaPpHby3GS4" "ChIJj6foxurorIkRhlH7GB_PMo0" "ChIJNR4KG3LkrIkR2G7VfaFtyWw" "ChIJtQrhphrkrIkRwN9ZgZlJdKU" ...
##   ..$ types                :List of 20
##   .. ..$ : chr [1:4] "restaurant" "food" "point_of_interest" "establishment"
##   .. ..$ : chr [1:4] "restaurant" "food" "point_of_interest" "establishment"
##   .. ..$ : chr [1:4] "restaurant" "food" "point_of_interest" "establishment"
##   .. ..$ : chr [1:5] "bar" "restaurant" "food" "point_of_interest" ...
##   .. ..$ : chr [1:4] "restaurant" "food" "point_of_interest" "establishment"
##   .. ..$ : chr [1:5] "restaurant" "bar" "food" "point_of_interest" ...
##   .. ..$ : chr [1:6] "restaurant" "cafe" "food" "point_of_interest" ...
##   .. ..$ : chr [1:4] "restaurant" "food" "point_of_interest" "establishment"
##   .. ..$ : chr [1:4] "restaurant" "food" "point_of_interest" "establishment"
##   .. ..$ : chr [1:4] "restaurant" "food" "point_of_interest" "establishment"
##   .. ..$ : chr [1:5] "restaurant" "bar" "food" "point_of_interest" ...
##   .. ..$ : chr [1:5] "bar" "restaurant" "food" "point_of_interest" ...
##   .. ..$ : chr [1:5] "bar" "restaurant" "food" "point_of_interest" ...
##   .. ..$ : chr [1:4] "restaurant" "food" "point_of_interest" "establishment"
##   .. ..$ : chr [1:4] "restaurant" "food" "point_of_interest" "establishment"
##   .. ..$ : chr [1:4] "restaurant" "food" "point_of_interest" "establishment"
##   .. ..$ : chr [1:4] "restaurant" "food" "point_of_interest" "establishment"
##   .. ..$ : chr [1:4] "restaurant" "food" "point_of_interest" "establishment"
##   .. ..$ : chr [1:4] "restaurant" "food" "point_of_interest" "establishment"
##   .. ..$ : chr [1:4] "restaurant" "food" "point_of_interest" "establishment"
##   ..$ user_ratings_total   : int [1:20] 941 1053 116 1748 179 797 1095 505 607 240 ...
##  $ status           : chr "OK"
# Notice that the result has a status code and a next-page token. We are going to use them to extract the restaurants.

# Initialise some values.
nextpage_yes_no <- !is.null(res$next_page_token)
token <- res$next_page_token
i <- 1
cafe_google_list <- NULL

# extract only when status is OK.

if(res$status == "OK") {
    cafe_google_list[[i]] <- 
                        cbind ("id" = res$results$place_id,  # the response has place_id (see str(res) above), not id
                               "name" = res$results$name,
                               "address" = res$results$formatted_address,
                               "longitude" = res$results$geometry$location$lng,  # Notice that we are going multiple levels down in the data frame.  You should really examine the structure of the res and results to understand what is going on here.
                               "latitude" = res$results$geometry$location$lat,
                               "plus_code" = res$results$plus_code$compound_code,
                               "price_level" = res$results$price_level,
                               "rating" = res$results$rating
                               ) %>% as_tibble()

}


# The loop begins.

while(nextpage_yes_no == TRUE){ # Keep going as long as the previous response included a next-page token.
  i <- i+1 #increment i.
  Sys.sleep(5) # Pause before each request. A newly issued page token takes a moment to become valid, and the delay also avoids overwhelming the server.
  res_next <- google_places(search_string = str,
                          page_token = token,
                          key = YOUR_API_KEY)

  if(res_next$status == "OK") {
    cafe_google_list[[i]] <- 
                        cbind ("id" = res_next$results$place_id,  # the response has place_id, not id
                               "name" = res_next$results$name,
                               "address" = res_next$results$formatted_address,
                               "longitude" = res_next$results$geometry$location$lng,
                               "latitude" = res_next$results$geometry$location$lat,
                               "plus_code" = res_next$results$plus_code$compound_code,
                               "price_level" = res_next$results$price_level,
                               "rating" = res_next$results$rating
                               ) %>% as_tibble()

  }

  token <- res_next$next_page_token # notice the update of the token 
  nextpage_yes_no <- !is.null(res_next$next_page_token) # Notice the update of nextpage_yes_no. If you don't do it, you can potentially run the loop forever (or at least till the server shuts you down.)
  rm(res_next)  # clean up the temporary objects. Good practice.
}  # The loop concludes


# Convert the list to a sf object to visualise
cafe_google <- plyr::compact(cafe_google_list) %>% bind_rows 
cafe_google <- st_as_sf(cafe_google,  coords = c("longitude", "latitude"), crs=4326)

m1 <-
leaflet() %>%
  addProviderTiles("CartoDB.Positron") %>%
  addCircles(data = cafe_google)

frameWidget(m1)
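
The extraction step appears twice in the code above, once for the first page and once inside the loop. It could be folded into one function, sketched here under some assumptions: extract_places() is a hypothetical helper, and the mock response is a plain list with made-up values that mimics the fields printed by str(res). Using data.frame() instead of cbind() also keeps the coordinates and ratings numeric, since cbind() on mixed vectors produces a character matrix.

```r
# Hypothetical helper: flatten one Places response into a data frame.
# Returns NULL for any page whose status is not "OK".
extract_places <- function(res) {
  if (!identical(res$status, "OK")) {
    return(NULL)
  }
  r <- res$results
  data.frame(
    place_id  = r$place_id,
    name      = r$name,
    address   = r$formatted_address,
    longitude = r$geometry$location$lng,
    latitude  = r$geometry$location$lat,
    rating    = r$rating,
    stringsAsFactors = FALSE
  )
}

# Mock response with made-up values; `$` access works the same way on this
# list as on the nested data frame returned by the real service.
mock_res <- list(
  status = "OK",
  results = list(
    place_id = c("id_1", "id_2"),
    name = c("Example Diner", "Sample Bistro"),
    formatted_address = c("1 Main St, Durham, NC", "2 Main St, Durham, NC"),
    geometry = list(location = list(lat = c(36.0, 35.9), lng = c(-78.9, -79.0))),
    rating = c(4.5, 4.3)
  )
)

places <- extract_places(mock_res)
failed <- extract_places(list(status = "INVALID_REQUEST"))
```

With such a helper, both the first page and every later page can be processed by the same call, and the resulting list of data frames can be combined with bind_rows() as before.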

Note that OSM reports 894 entries, while Google reports 20 restaurants.
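
Raw counts are only a starting point. A rough way to look closer is to compare the names in the two sources, as in the sketch below. The name vectors are made up for illustration; in practice they would come from cafe_osm$name and cafe_google$name. Exact matching after lower-casing is a deliberately crude assumption and will miss spelling variants.

```r
# Hypothetical name vectors standing in for cafe_osm$name and cafe_google$name.
osm_names    <- c("Juju", "GRUB Durham", NA, "Example Diner")
google_names <- c("Juju Durham", "GRUB Durham", "Harvest 18 Restaurant")

# Normalise names so that case and stray spaces do not block a match.
normalise <- function(x) tolower(trimws(x))

# How many OSM points carry a name at all?
n_named_osm <- sum(!is.na(osm_names))

# Names present in both sources, and names found only in Google.
named_osm   <- normalise(osm_names[!is.na(osm_names)])
in_both     <- intersect(normalise(google_names), named_osm)
only_google <- setdiff(normalise(google_names), named_osm)
```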


Exercise

  • Why is there a difference between Google and OSM? (Hint: Read the documentation)
  • If there is a marked difference between OSM and Google, what implication does this have for any analysis that you might do using their data?
  • Notice that repeated application of the same query produces different results in the same system. Why? What implication does this have for reproducibility?

Conclusion

The provenance of the data and the continual update of data on servers have serious implications for reproducibility. On the other hand, the updates allow for timeliness of analysis. It is important to recognise these limitations and potential.
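
One modest step towards reproducibility is to store each download together with its source and retrieval date. The sketch below is one possible approach; save_snapshot() and its file-naming scheme are hypothetical, and the example object is a made-up stand-in for downloaded data.

```r
# Hypothetical helper: save an object with its source and retrieval date,
# and return the path of the file that was written.
save_snapshot <- function(x, source, dir = tempdir()) {
  retrieved <- Sys.Date()
  file_name <- paste0(source, "_", format(retrieved, "%Y%m%d"), ".rds")
  path <- file.path(dir, file_name)
  saveRDS(list(data = x, source = source, retrieved = retrieved), path)
  path
}

# Example with a made-up data frame standing in for downloaded data.
snapshot_path <- save_snapshot(data.frame(name = "Example Diner"), source = "osm")
snapshot <- readRDS(snapshot_path)
```

An analysis that reads from a dated snapshot can be rerun later on the same data, while a fresh download remains available when timeliness matters more.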

Acknowledgements

Much of this post was written by Kshitiz Khanal.
