In this tutorial we will look at examples of working with Google BigQuery from R using the bigrquery package.

You will need:
- The bigrquery package installed (see the snippet right after this list)
- A service account key (instructions for creating one can be found here)
- The BigQuery API enabled in the Google Cloud Console
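
Installing the package is a one-liner from CRAN (assuming a standard CRAN setup):

install.packages("bigrquery")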

bigrquery

From the bigrquery package description:

The bigrquery package makes it easy to work with data stored in Google BigQuery by allowing you to query BigQuery tables and retrieve metadata about your projects, datasets, tables, and jobs. The bigrquery package provides three levels of abstraction on top of BigQuery:

  • The low-level API provides thin wrappers over the underlying REST API. All the low-level functions start with bq_, and mostly have the form bq_noun_verb(). This level of abstraction is most appropriate if you’re familiar with the REST API and you want to do something not supported in the higher-level APIs (a short sketch follows this list).

  • The DBI interface wraps the low-level API and makes working with BigQuery like working with any other database system. This is the most convenient layer if you want to execute SQL queries in BigQuery or upload smaller amounts (i.e. <100 MB) of data.

  • The dplyr interface lets you treat BigQuery tables as if they are in-memory data frames. This is the most convenient layer if you don’t want to write SQL, but instead want dbplyr to write it for you.
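
Since the DBI and dplyr layers are covered in the examples below, here is a minimal sketch of the low-level API. It assumes bigrquery >= 1.0, where the bq_ functions live; the project, dataset and table names are placeholders:

library(bigrquery)

# Authenticate with the service account key
bq_auth(path = "key.json")

# List the tables in a dataset
ds <- bq_dataset("projectId", "datasetName")
bq_dataset_tables(ds)

# Inspect a table's schema and download it into a data frame
tb <- bq_table("projectId", "datasetName", "tableName")
bq_table_fields(tb)
bq_table_download(tb)  # pulls everything into memory, so fine for small tables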

Examples

  1. Load from BigQuery:
library(DBI)
library(bigrquery)
library(dplyr)

# Authenticate with the service account key
bigrquery::set_service_token("key.json")

# Open a DBI connection to the dataset
con <- DBI::dbConnect(bigrquery::dbi_driver(),
                      dataset = "datasetName",
                      project = "projectId",
                      billing = "projectId")

# Pull the whole table into an in-memory data frame
dplyr::collect(dplyr::tbl(con, "tableName"))

Since collect() pulls the whole table into R's memory, this approach only suits small tables. For convenience you can wrap it in a helper function:

loadBQ <- function(projectId, datasetName, tableName) {
  bigrquery::set_service_token("key.json")

  con <- DBI::dbConnect(bigrquery::dbi_driver(),
                        dataset = datasetName,
                        project = projectId,
                        billing = projectId)

  # Return the table as an in-memory data frame
  dplyr::collect(dplyr::tbl(con, tableName))
}
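
A hypothetical call; all three arguments are placeholders for your own identifiers:

df <- loadBQ("projectId", "datasetName", "tableName")
head(df)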
  2. Upload to BigQuery:
library(bigrquery)

# Authenticate with the service account key
bigrquery::set_service_token("key.json")

# Run a load job that uploads an in-memory data frame to the table
bigrquery::insert_upload_job(
  project = "projectId",
  dataset = "datasetName",
  table = "tableName",
  values = valueName # your data frame
)

And a small helper function for this:

uploadBQ <- function(projectId, datasetName, tableName, valueName) {
  bigrquery::set_service_token("key.json")

  bigrquery::insert_upload_job(
    project = projectId,
    dataset = datasetName,
    table = tableName,
    values = valueName
  )
}
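
A hypothetical usage with a small in-memory data frame (the project, dataset and table names are placeholders):

df <- data.frame(id = 1:3, value = c("a", "b", "c"))
uploadBQ("projectId", "datasetName", "tableName", df)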
  3. Delete rows from a table in BigQuery:
library(bigrquery)

deleteQuery <- 'DELETE FROM google_analytics_sample.ga_sessions_20170801 WHERE visitNumber < 1'

bigrquery::set_service_token("key.json")

# DML statements such as DELETE are only supported in standard SQL,
# hence use_legacy_sql = FALSE
bigrquery::query_exec(query = deleteQuery, project = "projectId", use_legacy_sql = FALSE)
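
Following the same pattern as above, a small helper for running such statements could look like this (a sketch, not from the original examples):

deleteBQ <- function(projectId, sqlQuery) {
  bigrquery::set_service_token("key.json")

  # use_legacy_sql = FALSE because DML requires standard SQL
  bigrquery::query_exec(query = sqlQuery, project = projectId, use_legacy_sql = FALSE)
}

deleteBQ("projectId", deleteQuery)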