---
title: "Using web-hosted boards"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Using web-hosted boards}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  eval = rlang::is_installed("webfakes")
)
```

The pins package supports back-and-forth collaboration for publishing and consuming using, for example, `board_s3()`.
The goal of this vignette is to show how to publish a board of pins to a website, bringing your pins to a wider audience.

How does this work?

- For consumers, `board_url()` offers **read-only** collaboration.
- For publishers, a helper function is provided, `write_board_manifest()`. A manifest contains a list of pins and versions, enabling a `board_url()` to read like a `board_s3()` or `board_folder()`.

## Publishing

The steps for publishing a board that can be read by consumers using `board_url()` are:

- create a board,
- write pins to the board,
- write a manifest file using `write_board_manifest()`, and
- publish your website.

The first and last steps will be specific to how you deploy your board on the web; we discuss options in the [Publishing platforms] section.
Regardless of platform, you'll write the pins and the manifest the same way.

For this first demonstration, we'll start by creating a board, and finish by showing how the board works after being served.

```{r pin-mtcars}
library(pins)
board <- board_temp(versioned = TRUE)
```

We're using a temporary board for this demonstration, but in practice, you might use `board_folder()` in a project folder or GitHub repo, or perhaps `board_s3()`.

Let's make the `mtcars` dataset available as a JSON file:

```{r}
board %>% pin_write(mtcars, type = "json")
```

Let's make a new version of this data by adding a column: `lper100km`, consumption in liters per 100 km.
This could make our data friendlier to folks outside the United States.

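
The `lper100km` column relies on a fixed conversion factor between miles per US gallon and liters per 100 km. As a quick check, here is a minimal sketch that derives the factor from the exact definitions of the international mile (1.609344 km) and the US liquid gallon (3.785411784 L); the variable names are ours and are not part of pins.

```{r}
# Exact unit definitions (international mile, US liquid gallon)
km_per_mile <- 1.609344
liters_per_gallon <- 3.785411784

# L/100 km = (100 * liters per gallon / km per mile) / mpg,
# so the factor that gets divided by mpg is:
mpg_to_lper100km <- 100 * liters_per_gallon / km_per_mile
mpg_to_lper100km
```

Rounded to three decimal places, this is the constant 235.215 used when we add the column.
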
```{r}
#| echo: false
# we need a short delay here to keep the versions in the right order
Sys.sleep(2)
```

```{r metric}
mtcars_metric <- mtcars
mtcars_metric$lper100km <- 235.215 / mtcars$mpg
board %>% pin_write(mtcars_metric, name = "mtcars", type = "json")
```

Let's check our board to ensure we have one pin named `"mtcars"`, with two versions:

```{r board-pin-list}
board %>% pin_list()
board %>% pin_versions("mtcars")
```

Because a `board_url()` is consumed over the web, it doesn't have access to a file system the way, for example, a `board_folder()` has; we can work around this by creating a manifest file.
When a `board_url()` is set up by a consumer for reading, the pins package uses this file to discover the pins and their versions.
The manifest file is the key to `board_url()`'s ability to discover pins as if it were a file-system-based board.

After writing pins but *before* publishing, call `write_board_manifest()`:

```{r}
board %>% write_board_manifest()
```

The maintenance of this manifest file is not automated; it is your responsibility as the board publisher to keep the manifest up to date.

Let's confirm that there is a file called `_pins.yaml`:

```{r}
withr::with_dir(board$path, fs::dir_ls())
```

We can inspect its contents to see each pin in the board, and each version of each pin:

```yaml
`r if (rlang::is_installed("webfakes")) paste(readLines(fs::path(board$path, "_pins.yaml")), collapse = "\n")`
```

At this point, we would publish the folder containing the board as a part of a web site.
Let's pretend that we have served the folder from our fake website, `https://not.real.website.co/pins/`.

```{r board-serve}
#| echo: false
board_server <- webfakes::new_app()
board_server$use(webfakes::mw_static(root = board$path))
board_process <- webfakes::new_app_process(board_server)
web_board <- board_url(board_process$url())
```

## Consuming

With an up-to-date manifest file, a `board_url()` can behave as a read-only version of a `board_folder()`.

Let's create a `board_url()` using our fake URL:

```{r}
#| eval: false
web_board <- board_url("https://not.real.website.co/pins/")
```

The `board_url()` function reads the manifest file to discover the pins and versions:

```{r}
#| message: false
web_board %>% pin_list()
versions <- web_board %>% pin_versions("mtcars")
versions
```

We can read the most-recent version of the `"mtcars"` pin:

```{r}
#| message: false
web_board %>% pin_read("mtcars") %>% head()
```

We can also read the first version:

```{r}
#| message: false
web_board %>% pin_read("mtcars", version = versions$version[[1]]) %>% head()
```

## Publishing platforms

The goal of this section is to illustrate ways to publish a board as a part of a website.

### pkgdown

Pins offers another way for package developers to share data associated with an R package.
Publishing a package dataset as a pin can extend your data's "audience" to those who have not installed the package.

Using [pkgdown](https://pkgdown.r-lib.org/articles/customise.html#additional-html-and-files), any files you save in the directory `pkgdown/assets/` will be copied to the website's root directory when `pkgdown::build_site()` is run.

The [_R Packages_ book](https://r-pkgs.org/data.html#sec-data-data-raw) suggests using a folder called `data-raw` for working with datasets; this can be adapted to use pins.
You would start with `usethis::use_data_raw()`.
In a file in your `/data-raw` directory, wrangle and clean your datasets in the same way as if you were going to use `usethis::use_data()`.

To offer such datasets on a web-based board instead of as a built-in package dataset, in your `/data-raw` file you would:

- Create a board: `board_folder(here::here("pkgdown/assets/pins-board"))` (you might use a different name than `"pins-board"`).
- Write your datasets to the board using `pin_write()`.
- At the end, call `write_board_manifest()`.

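
These steps might look like the following in a `data-raw` script. This is a minimal sketch, not evaluated here: the file name `data-raw/pins-board.R`, the `my_data` data frame, and the pin name `"my-data"` are hypothetical placeholders, and we assume the here package is installed.

```{r}
#| eval: false
# data-raw/pins-board.R (hypothetical file name)
library(pins)

# Placeholder for your own wrangling and cleaning code
my_data <- data.frame(id = 1:3, value = c(2.5, 3.1, 4.8))

# Create (or reuse) a board inside the directory that pkgdown copies to the site
board <- board_folder(here::here("pkgdown/assets/pins-board"))

# Write the dataset as a pin; the name and type are illustrative choices
board %>% pin_write(my_data, name = "my-data", type = "json")

# Last step: refresh the manifest so that board_url() can discover the pins
board %>% write_board_manifest()
```

Because the manifest is not updated automatically, keeping `write_board_manifest()` as the final call of the script means it is refreshed every time the script is re-run.
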
Now, when you build your pkgdown site and serve it (perhaps via GitHub Pages at a URL like `https://user-name.github.io/repo-name/`), your datasets are available as pins.

The _R Packages_ book offers this observation on CRAN and package data:

> Generally, package data should be smaller than a megabyte - if it’s larger you’ll need to argue for an exemption.

Publishing a board on your pkgdown site provides a way to offer datasets too large for CRAN or extended versions of your data.
A consumer can read your pins by setting up a board like:

```{r}
#| eval: false
board <- board_url("https://user-name.github.io/repo-name/pins-board/")
```

### S3

S3 buckets can be made available to different users using [permissions](https://docs.aws.amazon.com/config/latest/developerguide/s3-bucket-policy.html); buckets can even be made publicly accessible.
Publishing data as a pin in an S3 bucket can allow your collaborators to read without dealing with the authentication required by `board_s3()`.

To offer datasets as a pin on S3 via `board_url()` you would:

- Create a board: `board_s3("your-existing-bucket")` (set the bucket's permissions to give appropriate people access).
- Write your datasets to the board using `pin_write()`.
- At the end, call `write_board_manifest()`.

S3 buckets typically have a URL like `https://your-existing-bucket.s3.us-west-2.amazonaws.com/`.
A person who has access to your bucket can read your pins by setting up a board like:

```{r}
#| eval: false
board <- board_url("https://your-existing-bucket.s3.us-west-2.amazonaws.com/")
```
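
On the publishing side, the three steps listed for S3 could be wrapped in a small helper. This is a sketch only: `publish_pin()` is a hypothetical function of ours, not part of pins, and the commented-out call assumes credentials that can write to the bucket, with the region taken from the example URL.

```{r}
#| eval: false
library(pins)

# Hypothetical helper (not part of pins): write one dataset as a pin,
# then refresh the manifest so that board_url() consumers can discover it.
publish_pin <- function(board, x, name, type = "json") {
  pin_write(board, x, name = name, type = type)
  write_board_manifest(board)
  invisible(board)
}

# Usage (not run): needs AWS credentials with write access to the bucket
# board <- board_s3("your-existing-bucket", region = "us-west-2")
# publish_pin(board, mtcars, name = "mtcars")
```

Calling `write_board_manifest()` after every write is one way to keep the manifest up to date; if you write many pins at once, you may prefer to write them all first and refresh the manifest a single time at the end.
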