--- title: "Managing custom formats" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Managing custom formats} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>", eval = rlang::is_installed("arrow") ) ``` The pins package provides a robust set of functions to read and write standard types of files using standard tools, e.g. CSV files using `read.csv()` and `write.csv()`. However, from time to time, you may wish read or write in other ways. You may want to read and write: - CSV files using readr or vroom - Arrow files without using compression - Whole directories that are archived/zipped You can create a customized approach using either `pin_upload()` and `pin_download()`. The goal of this vignette is to show how you can incorporate this customization into your workflow. To see a different approach for when you want to write and read with consistent metadata, see `vignette("customize-pins-metadata")`. We'll begin with an example where we write and read uncompressed Arrow files, starting by creating a temporary board: ```{r setup} library(pins) board <- board_temp() ``` ## Upload a single file Two points to keep in mind: - `pin_upload()` takes a vector of `paths` to local files. - `pin_download()` returns a vector of `paths` to local files. If you are writing a one-off file, you can do everything directly: ```{r} pin_name <- "mtcars-arrow" # file name will be `mtcars-arrow.arrow` path <- fs::path_temp(fs::path_ext_set(pin_name, "arrow")) arrow::write_feather(mtcars, path, compression = "uncompressed") pin_upload(board, paths = path, name = pin_name) ``` Reading from the downloaded pin is straightforward; `pin_download()` returns a local path that can be piped to `arrow::read_feather()`: ```{r} mtcars_download <- pin_download(board, pin_name) %>% arrow::read_feather() head(mtcars_download) ``` ## Function to manage uploading If you want to write more than one custom file of a certain type, or using a certain tool, you might consider writing a helper function: ```{r} pin_upload_arrow <- function(board, x, name, ...) { # path deleted when `pin_upload_arrow()` exits path <- fs::path_temp(fs::path_ext_set(name, "arrow")) withr::defer(fs::file_delete(path)) # custom writer arrow::write_feather(x, path, compression = "uncompressed") pin_upload(board, paths = path, name = name, ...) } ``` This helper function is designed to work like `pin_write()`: ```{r} pin_upload_arrow(board, x = mtcars, name = "mtcars-arrow2") ``` As before, you can pipe the result of `pin_download()` to your reader function: ```{r} pin_download(board, name = "mtcars-arrow2") %>% arrow::read_feather() %>% head() ``` ## Another example: upload a zipped directory archive as a pin If you want to use this same approach to [archive](https://archive.r-lib.org/) and pin a whole directory, you can write a helper function like: ```{r} pin_upload_archive <- function(board, dir, name, ...) { path <- fs::path_temp(fs::path_ext_set(name, "tar.gz")) withr::defer(fs::file_delete(path)) archive::archive_write_dir(path, dir) pin_upload(board = board, paths = path, name = name, ...) } ``` You can download the compressed archive via `pin_download(board, name)` and then pipe that path straight to `archive::archive_extract()` to extract your archive in a new directory.