---
title: "Google authentication types for R"
author: "Mark Edmondson"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Google authentication types for R}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

# Other more modern libraries

`googleAuthR` was one of my first R packages and has enjoyed 7+ years of being my key workhorse for Google authentication, but in the meantime more modern packages have been released that may be more suited to your needs - consider [`gargle`](https://gargle.r-lib.org/) (which `googleAuthR` now depends upon heavily for authentication, rather than its own home-spun functions) and [`firebase`](https://firebase.john-coene.com/) as alternatives.

The functionality of `googleAuthR` beyond authentication, such as batching and package creation, will eventually also be superseded by smaller packages.

However, `googleAuthR` is still in active use and will be supported in a maintenance mode.

## Version 2.0

Version 2.0 removed the `googleAuthR` Shiny modules in favour of `gar_shiny_*`. If you need those legacy functions, depend on `googleAuthR` version 1.4.1 or earlier.

# Quick user based authentication

Once set up, you should go through the Google login flow in your browser when you run this command:

```r
library(googleAuthR)
# starts auth process with defaults
gar_auth()
#> The googleAuthR package is requesting access to your Google account. Select a
#> pre-authorised account or enter '0' to obtain a new token. Press Esc/Ctrl + C to abort.

#> 1: mark@work.com
#> 2: home@home.com
```

The authentication cache token is kept at a global level as per the `gargle` library documentation - [see there for more details](https://gargle.r-lib.org/).
You can also specify your email to avoid the interactive menu:

```r
gar_auth(email = "your@email.com")
```

These functions are usually wrapped in package specific functions when used in other packages, such as `googleAnalyticsR::ga_auth()`.

# Client options

Most libraries will set the appropriate options for you. Otherwise you will need to supply them from the Google Cloud console, in its `APIs & services > Credentials` section ( `https://console.cloud.google.com/apis/credentials` ).

You will need as a minimum:

* A client ID and secret generated via `Create Credentials > OAuth client ID > Other` - these are set in `options(googleAuthR.client_id)` and `options(googleAuthR.client_secret)`, or, if you download the client ID JSON, via `gar_set_client()`
* An API scope for the API you want to authenticate with, found in the API's documentation or via the `googleAuthR` RStudio addin.
* A user authentication file, either generated interactively via `gar_auth()` or supplied as a service account JSON file, created via `Create credentials > Service account key`.

If you are creating your own library, you can choose to supply some or all of the above to the end-user. As an end-user you may need to set some of the above yourself (most usually your own user authentication).

# Multiple authentication tokens

## googleAuthR > 1.0.0

Authentication cache tokens are kept at a global level on your computer. The first time you authenticate with a new client_id, scope or email you will go through the authentication process in the browser; the next time the token will be cached and authentication will be a lot quicker.

```r
# switching between auth scopes
# first time new scope manual auth, then auto if supplied email
gar_auth(email = "your@email.com",
         scopes = "https://www.googleapis.com/auth/drive")
# ... query Google Drive functions ...

gar_auth(email = "your@email.com",
         scopes = "https://www.googleapis.com/auth/bigquery")
# ... query BigQuery functions ...
```

# Setting the client via Google Cloud client JSON

To avoid keeping track of which client_id/secret to use, Google offers a client ID JSON file you can download from the Google Cloud console here - `https://console.cloud.google.com/apis/credentials`. Make sure the client ID type is `Desktop` for desktop applications.

You can use this to set the client details before your first authentication. For example:

```r
library(googleAuthR)
library(googleAnalyticsR)
library(searchConsoleR)

# set the scopes required
scopes <- c("https://www.googleapis.com/auth/analytics",
            "https://www.googleapis.com/auth/webmasters")

# set the client
gar_set_client("client-id.json", scopes = scopes)

# authenticate and go through the OAuth2 flow first time
gar_auth()

# can run Google Analytics API calls:
ga_account_list()

# and run Search Console API calls:
list_websites()
```

You can also place the file location of your client ID JSON in the `GAR_CLIENT_JSON` environment variable, where it will be looked for by default:

```r
# .Renviron
GAR_CLIENT_JSON="~/path/to/clientjson.json"
```

Then you just need to supply the scopes:

```r
gar_set_client(scopes = "https://www.googleapis.com/auth/webmasters")
```

# Authentication with no browser

Refer to [this gargle article](https://gargle.r-lib.org/articles/non-interactive-auth.html) on how to authenticate in a non-interactive manner.

# Authentication with a JSON file via Service Accounts

You can also authenticate single users via a server side JSON file rather than going through the online OAuth2 flow. The end user could supply this JSON file, or you can upload your own JSON file to your applications. This is generally more secure if you know it's only one user on the service, such as for Cloud services.

This involves downloading a secret JSON key with the authentication details.
More details are available from Google here: [Using OAuth 2.0 for Server to Server Applications](https://developers.google.com/identity/protocols/oauth2/service-account)

To use, go to your Project in the Google Developer Console and select the JSON key type. Save the JSON file to your computer and supply the file location to the function `gar_auth_service()`.

## Roles

Roles all start with `roles/*` e.g. `roles/editor` - a list of [predefined roles is here](https://cloud.google.com/iam/docs/understanding-roles#predefined_roles) or you can see [roles within your GCP console here](https://console.cloud.google.com/iam-admin/roles/details/roles).

## Creating service account and a key

### From R

The `gar_service_create()` and related functions let you create service accounts from a user OAuth2 login. The user requires the permission `iam.serviceAccounts.create` for the project. Most often the user is an Owner/Editor.

The workflow for authenticating with a new key from R is:

* Creating a new service account ID
* Giving that service account any roles it needs to operate
* Downloading a one-time only JSON key file that authenticates that account ID
* Using that key file in `gar_auth_service()` or otherwise with the correct scopes to use the API

See this related [Google help article on creating service accounts](https://cloud.google.com/iam/docs/creating-managing-service-accounts#iam-service-accounts-create-rest).

The above workflow is encapsulated within `gar_service_provision()`, which will run through the steps for you if you supply it with your GCP project's client ID (another JSON file that identifies your project).
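A minimal sketch of that workflow is shown below. It assumes `gar_service_provision()` accepts `accountId`, `roles` and `json` arguments (check `?gar_service_provision` for your installed version); `check_roles()` and `provision_account()` are hypothetical helper names used only for illustration, as are the account ID and role in the commented-out call.

```r
# Hypothetical helper: confirm every role uses the `roles/` prefix
# described in the Roles section before anything is sent to Google.
check_roles <- function(roles) {
  stopifnot(is.character(roles), length(roles) > 0)
  bad <- roles[!startsWith(roles, "roles/")]
  if (length(bad) > 0) {
    stop("Roles must start with 'roles/': ", paste(bad, collapse = ", "))
  }
  invisible(roles)
}

# Hypothetical wrapper: validate the roles, then hand over to googleAuthR.
# Assumption: gar_service_provision(accountId, roles, json) is the signature.
provision_account <- function(account_id, roles,
                              client_json = Sys.getenv("GAR_CLIENT_JSON")) {
  check_roles(roles)
  googleAuthR::gar_service_provision(account_id,
                                     roles = roles,
                                     json = client_json)
}

# Not run here: needs an interactive OAuth2 login by a user who is allowed
# to create service accounts in the project.
# provision_account("my-service-account", roles = "roles/bigquery.dataEditor")
```

Validating the role names locally first means a typo such as `"editor"` fails immediately instead of part-way through the provisioning steps.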
### WebUI

Navigate to the JSON file from the Google Developer Console via: `Credentials > New credentials > Service account Key > Select service account > Key type = JSON`

If you are using the JSON file, you must ensure:

* The service email has access to the resource you are trying to fetch (for example a Google Analytics View)
* You have set the scopes to the correct API
* The Google Project has the API turned on

An example using a service account JSON file for authentication is shown below:

```r
library(googleAuthR)
options(googleAuthR.scopes.selected = "https://www.googleapis.com/auth/urlshortener")
service_token <- gar_auth_service(json_file = "~/location/of/the/json/secret.json")

analytics_url <- function(shortUrl,
                          timespan = c("allTime", "month", "week", "day", "twoHours")){

  timespan <- match.arg(timespan)

  f <- gar_api_generator("https://www.googleapis.com/urlshortener/v1/url",
                         "GET",
                         pars_args = list(shortUrl = "shortUrl",
                                          projection = "FULL"),
                         data_parse_function = function(x) {
                           a <- x$analytics
                           return(a[timespan][[1]])
                         })

  f(pars_arguments = list(shortUrl = shortUrl))
}

analytics_url("https://goo.gl/2FcFVQbk")
```

Another example is from the `searchConsoleR` library - in this case we avoid `scr_auth()` and instead authenticate via the JSON file, whose service email has been added to the Search Console web property as a user.

```r
library(googleAuthR)
library(searchConsoleR)
options(googleAuthR.scopes.selected = "https://www.googleapis.com/auth/webmasters")

gar_auth_service("auth.json")

list_websites()
```

# Authentication within Shiny

If you want to create a Shiny app just using your data, refer to the [non-interactive authentication article on gargle](https://gargle.r-lib.org/articles/non-interactive-auth.html).

If you want to make a multi-user Shiny app, where users log in to their own Google account and the app works with their data, `googleAuthR` provides the functions below to help make the Google login process as easy as possible.
## Types of Shiny Authentication

There are now these types of logins available, which suit different needs:

* `gar_shiny_*` functions. These create a login UI before the main Shiny UI loads. Authentication occurs, and then the main UI loads with that unique user's authentication. You can then use `httr` based Google authentication functions normally, as you would offline.
* `googleSignIn` module - this is for when you just want to have a login, but do not need to make API calls. It is a lightweight JavaScript based sign-in solution.

## Shiny Modules

`googleAuthR` uses [Shiny Modules](https://shiny.rstudio.com/articles/modules.html). This means less code and the ability to have multiple login buttons on the same app.

To use modules, you need to use the functions ending with `_UI` in your ui.R, then call the id you set there server side with the `callModule(moduleName, "id")` syntax. See the examples below.

## Shiny Authentication Examples

Remember that client IDs and secrets will need to be created for the examples below. You need to pick a client ID for *web applications*, not *"Other"* as is used for offline `googleAuthR` functions.

### URL redirects

On some platforms the URL you are authenticating from will not match the Docker container the script is running in (e.g. shinyapps.io or a Kubernetes cluster) - in that case you can manually set it via `options(googleAuthR.redirect = "http://your-shiny-url.com")`. In other circumstances the Shiny app should be able to detect this itself.

### `gar_shiny_*` functions example

This uses the most modern `gar_shiny_*` family of functions to create authentication. The app lists the files you have stored in Google Drive.
```r
library(shiny)
library(googleAuthR)
gar_set_client(scopes = "https://www.googleapis.com/auth/drive")

fileSearch <- function(query) {
  gar_api_generator("https://www.googleapis.com/drive/v3/files/",
                    "GET",
                    pars_args = list(q = query),
                    data_parse_function = function(x) x$files)()
}

## ui.R
ui <- fluidPage(title = "googleAuthR Shiny Demo",
                textInput("query",
                          label = "Google Drive query",
                          value = "mimeType != 'application/vnd.google-apps.folder'"),
                tableOutput("gdrive")
)

## server.R
server <- function(input, output, session){

  # create a non-reactive access_token as we should never get past this if not authenticated
  gar_shiny_auth(session)

  output$gdrive <- renderTable({
    req(input$query)

    # no need for with_shiny()
    fileSearch(input$query)

  })
}

shinyApp(gar_shiny_ui(ui, login_ui = gar_shiny_login_ui), server)
```

### `googleSignIn` module example

This module is suitable if you don't need to authenticate APIs in your app and would just like a login. You can then reach the user email, id, name or avatar to decide which content you want to show with further logic within your Shiny app.

You only need to set the `client_id` for this login, as no secrets are being created.

```r
library(shiny)
library(googleAuthR)

options(googleAuthR.webapp.client_id = "1080525199262-qecndq7frddi66vr35brgckc1md5rgcl.apps.googleusercontent.com")

ui <- fluidPage(

  titlePanel("Sample Google Sign-In"),

  sidebarLayout(
    sidebarPanel(
      googleSignInUI("demo")
    ),

    mainPanel(
      with(tags, dl(dt("Name"), dd(textOutput("g_name")),
                    dt("Email"), dd(textOutput("g_email")),
                    dt("Image"), dd(uiOutput("g_image")) ))
    )
  )
)

server <- function(input, output, session) {

  sign_ins <- shiny::callModule(googleSignIn, "demo")

  output$g_name <- renderText({ sign_ins()$name })
  output$g_email <- renderText({ sign_ins()$email })
  output$g_image <- renderUI({ img(src = sign_ins()$image) })

}

# Run the application
shinyApp(ui = ui, server = server)
```

# Auto-authentication

Auto-authentication can be performed upon a package load.
This requires the setup of environment variables, either in your `.Renviron` file or via `Sys.setenv()`, to point to a previously created authentication file. This file can be either a `.httr-oauth` file created via `gar_auth()` or a Google service account JSON downloaded from the Google API console.

This file will then be used for authentication via `gar_auto_auth()`. You can call this function yourself in scripts or R sessions, but its main intention is to be called in the `.onAttach` function via `gar_attach_auto_auth()`, so that you will authenticate right after you load the library via `library(yourlibrary)`.

An example from `googleCloudStorageR` is shown below:

```r
.onAttach <- function(libname, pkgname){

  googleAuthR::gar_attach_auto_auth("https://www.googleapis.com/auth/devstorage.full_control",
                                    environment_var = "GCS_AUTH_FILE")
}
```

..which calls an environment variable set in `~/.Renviron`:

```
GCS_AUTH_FILE="/Users/mark/auth/my_auth_file.json"
```

# Authentication on Google Cloud

Use `googleAuthR::gar_gce_auth()` to authenticate reusing the service keys of the Google Compute Engine instance (or other compute service).

# Authentication on Kubernetes via workload identity

Workload identity is a way of federating the authentication of a service key to Kubernetes without needing to download a service key. It's the "right" way to do authentication on K8s and other places if possible, since it avoids downloading keys, which is a potential security risk.

1. Following the [docs](https://cloud.google.com/kubernetes-engine/docs/how-to/workload-identity), create a service account as normal and give it the permissions and scopes it needs (say, to upload to BigQuery), as you would before, e.g. `my-service-key@my-project.iam.gserviceaccount.com` with `https://www.googleapis.com/auth/bigquery` scopes.
2. Instead of downloading a JSON key, federate that permission by adding a policy binding to another service account within Kubernetes.
3. Create the service account within Kubernetes, ideally within a new namespace:

   ```sh
   # create namespace
   kubectl create namespace my-namespace

   # Create Kubernetes service account
   kubectl create serviceaccount --namespace my-namespace bq-service-account
   ```

4. Bind that Kubernetes service account to the service account outside of Kubernetes that you created in step 1, and assign it an annotation:

   ```sh
   # Create IAM policy binding between k8s SA and GSA
   gcloud iam service-accounts add-iam-policy-binding my-service-key@my-project.iam.gserviceaccount.com \
     --role roles/iam.workloadIdentityUser \
     --member "serviceAccount:my-project.svc.id.goog[my-namespace/bq-service-account]"

   # Annotate k8s SA
   kubectl annotate serviceaccount bq-service-account \
     --namespace my-namespace \
     iam.gke.io/gcp-service-account=my-service-key@my-project.iam.gserviceaccount.com
   ```

   This key will now be available to add to pods within the cluster. For Airflow, you can pass them in using the `GKEPodOperator(...., namespace='my-namespace', service_account_name='bq-service-account')`.

5. When calling `gargle::credentials_gce()` within R, you first need to make sure it's using the right internal Kubernetes endpoint (`options(gargle.gce.use_ip = TRUE)`) and then call the service email that is not "default". `gargle:::list_service_accounts()` was helpful in debugging (maybe export this?)

   ```r
   # code within the Docker container
   library(bigQueryR)

   options(gargle.gce.use_ip = TRUE)
   googleAuthR::gar_gce_auth("my-service-key@my-project.iam.gserviceaccount.com")

   # ... do authenticated stuff ...
   ```

# Revoking Authentication

For local use, call `gar_deauth()` to de-authenticate a session. To avoid cache tokens being reused, delete them from the gargle cache folder, usually `~/.R/gargle/gargle-oauth/`.

For service level accounts, delete the JSON file.

For a Shiny app, a cookie is left by Google that means a faster login the next time a user uses the app, without the Authorization screen they see the first time through.
To force the Authorization screen every time, set the parameter `revoke = TRUE` within the `googleAuth` function.
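For the local case, de-authenticating and clearing cached tokens could be sketched as below. This is an illustration under stated assumptions, not part of `googleAuthR`: `find_cached_tokens()` and `revoke_local()` are hypothetical helpers, the default cache folder is the one quoted in this section (it may differ on your system), and token file names are assumed to end in `_<email>`, which is how `gargle` names them at the time of writing.

```r
# Hypothetical helper: list cached token files belonging to one email.
# Assumption: gargle stores tokens as "<hash>_<email>" in the cache folder.
find_cached_tokens <- function(email,
                               cache_dir = "~/.R/gargle/gargle-oauth") {
  files <- list.files(path.expand(cache_dir), full.names = TRUE)
  files[endsWith(basename(files), paste0("_", email))]
}

# Hypothetical helper: drop the session token, then delete the cached files
# so they cannot be picked up again by a later gar_auth() call.
revoke_local <- function(email,
                         cache_dir = "~/.R/gargle/gargle-oauth") {
  googleAuthR::gar_deauth()
  tokens <- find_cached_tokens(email, cache_dir)
  removed <- file.remove(tokens)
  invisible(tokens[removed])
}

# Not run here: deletes files from your real token cache.
# revoke_local("your@email.com")
```

Calling `find_cached_tokens()` on its own first is a cheap way to confirm which files would be deleted. `gargle::gargle_oauth_sitrep()` reports the cache location and its contents if the default path above does not match your setup.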