Title: Pipeline Tools Inspired by 'GNU Make'
Version: 0.2.2
Description: A suite of tools for transforming an existing workflow into a self-documenting pipeline with very minimal upfront costs. Segments of the pipeline are specified in much the same way a 'Make' rule is, by declaring an executable recipe (which might be an R script), along with the corresponding targets and dependencies. When the entire pipeline is run through, only those recipes that need to be executed will be. Meanwhile, execution metadata is captured behind the scenes for later inspection.
License: GPL (≥ 3)
URL: https://kinto-b.github.io/makepipe/, https://github.com/kinto-b/makepipe
BugReports: https://github.com/kinto-b/makepipe/issues
Imports: cli, nomnoml, R6, utils, roxygen2
Suggests: knitr, covr, testthat (≥ 3.0.0), withr, rmarkdown, webshot2, visNetwork
Config/testthat/edition: 3
Encoding: UTF-8
RoxygenNote: 7.2.3
VignetteBuilder: knitr
NeedsCompilation: no
Packaged: 2025-01-05 16:50:57 UTC; kinto
Author: Kinto Behr [aut, cre, cph]
Maintainer: Kinto Behr <kinto.behr@gmail.com>
Repository: CRAN
Date/Publication: 2025-01-07 10:30:02 UTC

Pipeline

Description

A Pipeline object is automatically constructed as calls to make_*() are made. It stores the relationships between targets, dependencies, and sources.

Public fields

segments

A list of Segment objects

Methods

Public methods


Method add_source_segment()

Add an edge to edges

Add any nodes in private$edges that are missing from private$nodes into private$nodes

Reconstruct Pipeline edges from Segment edges. Called primarily to update outofdateness

Add a pipeline segment corresponding to a make_with_source() call

Usage
Pipeline$add_source_segment(
  source,
  targets,
  dependencies,
  packages,
  envir,
  force
)
Arguments
source

The path to an R script which makes the targets

targets

A character vector of paths to files

dependencies

A character vector of paths to files which the targets depend on

packages

A character vector of names of packages which targets depend on

envir

The environment in which to execute the source or recipe.

force

A logical determining whether or not execution of the source or recipe will be forced (i.e. happen whether or not the targets are out-of-date)

new_edge

A data.frame constructed with new_edge()

Returns

The SegmentSource added to the Pipeline


Method add_recipe_segment()

Add a pipeline segment corresponding to a make_with_recipe() call

Usage
Pipeline$add_recipe_segment(
  recipe,
  targets,
  dependencies,
  packages,
  envir,
  force
)
Arguments
recipe

A language object which, when evaluated, makes the targets

targets

A character vector of paths to files

dependencies

A character vector of paths to files which the targets depend on

packages

A character vector of names of packages which targets depend on

envir

The environment in which to execute the source or recipe.

force

A logical determining whether or not execution of the source or recipe will be forced (i.e. happen whether or not the targets are out-of-date)

Returns

The SegmentRecipe added to the Pipeline


Method build()

Build all targets

Usage
Pipeline$build(quiet = getOption("makepipe.quiet"))
Arguments
quiet

A logical determining whether or not messages are signaled

Returns

self
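
The default for quiet is read from the makepipe.quiet option, so messages can be switched off once per session instead of on every call. A minimal sketch using only base R's options(); whether the package sets a value for this option on load is not stated here, so the previous setting is saved and restored rather than assumed.

```r
# Silence pipeline messages by setting the option that the `quiet`
# argument reads its default from. options() returns the old value.
old <- options(makepipe.quiet = TRUE)

# Any call whose signature has quiet = getOption("makepipe.quiet")
# would now pick up TRUE as its default.
quiet_default <- getOption("makepipe.quiet")

# Restore whatever was set before.
options(old)
```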


Method clean()

Clean all targets

Usage
Pipeline$clean()
Returns

self


Method touch()

Touch all targets, updating file modification time to current time. This is useful when you know your targets are all up-to-date but makepipe doesn't (e.g. after a negligible change was made to your source code).

Usage
Pipeline$touch()
Returns

self
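
To illustrate what touching a file means, the sketch below updates a modification time with base R's Sys.setFileTime(). The helper touch_files() is hypothetical and is not the package's implementation; it only shows the effect on a single temporary file.

```r
# Hypothetical helper: set the modification time of each existing file to now.
touch_files <- function(paths) {
  paths <- paths[file.exists(paths)]
  Sys.setFileTime(paths, Sys.time())
  invisible(paths)
}

target <- tempfile(fileext = ".Rds")
saveRDS(1:3, target)

# Pretend the target was built an hour ago.
Sys.setFileTime(target, Sys.time() - 3600)
before <- file.mtime(target)

touch_files(target)
after <- file.mtime(target)
```

After the call, the target's timestamp is later than before, so a timestamp-based check would no longer consider it older than a dependency modified in the meantime.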


Method annotate()

Apply annotations to Pipeline

Usage
Pipeline$annotate(labels = NULL, notes = NULL)
Arguments
labels

A named character vector mapping nodes in the Pipeline onto labels to display beside them.

notes

A named character vector mapping nodes in the Pipeline onto notes to display beside the labels (nomnoml) or as tooltips (visNetwork).


Method refresh()

Refresh the Pipeline to re-check which targets are out-of-date

Usage
Pipeline$refresh()

Method nomnoml()

Display the pipeline with nomnoml

Usage
Pipeline$nomnoml(
  direction = c("down", "right"),
  arrow_size = 1,
  edge_style = c("hard", "rounded"),
  bend_size = 0.3,
  font = "Courier",
  font_size = 12,
  line_width = 3,
  padding = 16,
  spacing = 40,
  leading = 1.25,
  stroke = "#33322E",
  fill_arrows = FALSE,
  gutter = 5,
  edge_margin = 0
)
Arguments
direction

The direction the flowchart should go in

arrow_size

The arrowhead size

edge_style

The arrow edge style

bend_size

The degree of rounding in the arrows (requires edge_style = "rounded")

font

The name of a font to use

font_size

The font size

line_width

The line width for arrows and box outlines

padding

The amount of padding within boxes

spacing

The amount of spacing between boxes

leading

The amount of spacing between lines of text

stroke

The color of arrows, text, and box outlines

fill_arrows

Whether arrow heads are full triangles (TRUE) or angled (FALSE)

gutter

The amount of space to leave around the flowchart

edge_margin

The amount of space to leave between boxes and arrows

Returns

self


Method visnetwork()

Display the pipeline with visNetwork

Usage
Pipeline$visnetwork(...)
Arguments
...

Arguments (other than nodes and edges) to pass to visNetwork::visNetwork()

Returns

self


Method text_summary()

Display a text summary of the pipeline

Usage
Pipeline$text_summary()
Returns

self


Method print()

Display the Pipeline

Usage
Pipeline$print(...)
Arguments
...

Arguments (other than nodes and edges) to pass to visNetwork::visNetwork()

Returns

self


Method save_visnetwork()

Save pipeline visNetwork

Usage
Pipeline$save_visnetwork(file, selfcontained = TRUE, background = "white", ...)
Arguments
file

File to save HTML into

selfcontained

Whether to save the HTML as a single self-contained file (with external resources base64 encoded) or a file with external resources placed in an adjacent directory.

background

Text string giving the html background color of the widget. Defaults to white.

...

Arguments (other than nodes and edges) to pass to visNetwork::visNetwork()

Returns

self


Method save_nomnoml()

Save pipeline nomnoml

Usage
Pipeline$save_nomnoml(file, width = NULL, height = NULL, ...)
Arguments
file

File to save the png into

width

Image width

height

Image height

...

Arguments to pass to self$nomnoml()

Returns

self


Method save_text_summary()

Save a text summary of the pipeline

Usage
Pipeline$save_text_summary(file)
Arguments
file

File to save text summary into

Returns

self


Method clone()

The objects of this class are cloneable with this method.

Usage
Pipeline$clone(deep = FALSE)
Arguments
deep

Whether to make a deep clone.

See Also

Other pipeline: pipeline-accessors, pipeline-vis


Segment

Description

A Segment object is automatically constructed and attached to the Pipeline when a call to make_*() is made. It stores the relationships between targets, dependencies, and sources.

Public fields

targets

A character vector of paths to files

dependencies

A character vector of paths to files which the targets depend on

packages

A character vector of names of packages which targets depend on

force

A logical determining whether or not execution of the source or recipe will be forced (i.e. happen whether or not the targets are out-of-date)

envir

The environment in which to execute the instructions.

result

An object, whatever is returned by executing the instructions

executed

A logical, whether or not the instructions were executed

execution_time

A difftime, the time taken to execute the instructions

label

A short label for the segment

note

A description of what the segment does

Active bindings

edges

Get edges connecting the dependencies, instructions, and targets

nodes

Get nodes corresponding to dependencies, instructions, and targets

text_summary

A plain text summary of the Segment
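
The edges and nodes bindings can be pictured as a small graph in which dependencies point to the instructions and the instructions point to the targets. The sketch below builds such a table in base R; the function segment_edges() and the column names from and to are hypothetical and are not taken from the package.

```r
# Hypothetical edge table for one segment:
# dependencies -> instructions -> targets.
segment_edges <- function(instructions, targets, dependencies) {
  rbind(
    data.frame(from = dependencies, to = instructions, stringsAsFactors = FALSE),
    data.frame(from = instructions, to = targets, stringsAsFactors = FALSE)
  )
}

edges <- segment_edges(
  instructions = "merge_data.R",
  targets = "data/merged_data.Rds",
  dependencies = c("data/raw_data.Rds", "data/raw_pop.Rds")
)

# Nodes are every distinct endpoint that appears in an edge.
nodes <- unique(c(edges$from, edges$to))
```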

Methods

Public methods


Method new()

Initialise a new Segment

Usage
Segment$new(
  id,
  targets,
  dependencies,
  packages,
  envir,
  force,
  executed,
  result,
  execution_time
)
Arguments
id

An integer that uniquely identifies the segment

targets

A character vector of paths to files

dependencies

A character vector of paths to files which the targets depend on

packages

A character vector of names of packages which targets depend on

envir

The environment in which to execute the instructions.

force

A logical determining whether or not execution of the source or recipe will be forced (i.e. happen whether or not the targets are out-of-date)

executed

A logical, whether or not the instructions were executed

result

An object, whatever is returned by executing the instructions

execution_time

A difftime, the time taken to execute the instructions


Method print()

Printing method

Usage
Segment$print()

Method update_result()

Update the Segment with new execution information

Usage
Segment$update_result(executed, execution_time, result)
Arguments
executed

A logical, whether or not the instructions were executed

execution_time

A difftime, the time taken to execute the instructions

result

An object, whatever is returned by executing the instructions


Method annotate()

Apply annotations to Segment

Usage
Segment$annotate(label = NULL, note = NULL)
Arguments
label

A short label for the segment

note

A description of what the segment does


Method clone()

The objects of this class are cloneable with this method.

Usage
Segment$clone(deep = FALSE)
Arguments
deep

Whether to make a deep clone.

See Also

Other segment: SegmentRecipe, SegmentSource


SegmentRecipe

Description

A Segment object is automatically constructed and attached to the Pipeline when a call to make_*() is made. It stores the relationships between targets, dependencies, and sources.

Super class

makepipe::Segment -> SegmentRecipe

Public fields

recipe

A chunk of R code which makes the targets

Methods

Public methods

Inherited methods

Method new()

Initialise a new Segment

Usage
SegmentRecipe$new(
  id,
  recipe,
  targets,
  dependencies,
  packages,
  envir,
  force,
  executed,
  result,
  execution_time
)
Arguments
id

An integer that uniquely identifies the segment

recipe

A chunk of R code which makes the targets

targets

A character vector of paths to files

dependencies

A character vector of paths to files which the targets depend on

packages

A character vector of names of packages which targets depend on

envir

The environment in which to execute the instructions.

force

A logical determining whether or not execution of the source or recipe will be forced (i.e. happen whether or not the targets are out-of-date)

executed

A logical, whether or not the instructions were executed

result

An object, whatever is returned by executing the instructions

execution_time

A difftime, the time taken to execute the instructions


Method update_result()

Update the Segment with new execution information

Usage
SegmentRecipe$update_result(executed, execution_time, result)
Arguments
executed

A logical, whether or not the instructions were executed

execution_time

A difftime, the time taken to execute the instructions

result

An object, whatever is returned by executing the instructions


Method execute()

Execute the Segment

Usage
SegmentRecipe$execute(envir = NULL, quiet = getOption("makepipe.quiet"), ...)
Arguments
envir

The environment in which to execute the instructions.

quiet

A logical determining whether or not messages are signaled

...

Additional parameters to pass to base::eval()


Method clone()

The objects of this class are cloneable with this method.

Usage
SegmentRecipe$clone(deep = FALSE)
Arguments
deep

Whether to make a deep clone.

See Also

Other segment: SegmentSource, Segment


SegmentSource

Description

A Segment object is automatically constructed and attached to the Pipeline when a call to make_*() is made. It stores the relationships between targets, dependencies, and sources.

Super class

makepipe::Segment -> SegmentSource

Public fields

source

The path to an R script which makes the targets

Methods

Public methods

Inherited methods

Method new()

Initialise a new Segment

Usage
SegmentSource$new(
  id,
  source,
  targets,
  dependencies,
  packages,
  envir,
  force,
  executed,
  result,
  execution_time
)
Arguments
id

An integer that uniquely identifies the segment

source

The path to an R script which makes the targets

targets

A character vector of paths to files

dependencies

A character vector of paths to files which the targets depend on

packages

A character vector of names of packages which targets depend on

envir

The environment in which to execute the instructions.

force

A logical determining whether or not execution of the source or recipe will be forced (i.e. happen whether or not the targets are out-of-date)

executed

A logical, whether or not the instructions were executed

result

An object, whatever is returned by executing the instructions

execution_time

A difftime, the time taken to execute the instructions


Method update_result()

Update the Segment with new execution information

Usage
SegmentSource$update_result(executed, execution_time, result)
Arguments
executed

A logical, whether or not the instructions were executed

execution_time

A difftime, the time taken to execute the instructions

result

An object, whatever is returned by executing the instructions


Method execute()

Execute the Segment

Usage
SegmentSource$execute(envir = NULL, quiet = getOption("makepipe.quiet"), ...)
Arguments
envir

The environment in which to execute the instructions.

quiet

A logical determining whether or not messages are signaled

...

Additional parameters to pass to base::source()
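
Since the extra arguments are forwarded to base::source(), it may help to see how sourcing into a chosen environment behaves. This is a base R sketch, not the method's actual body: it writes a throwaway script and sources it into a fresh environment via the local argument of source().

```r
# A throwaway script standing in for a pipeline source file.
script <- tempfile(fileext = ".R")
writeLines(c("x <- 1:10", "total <- sum(x)"), script)

# A fresh environment whose parent is the current one.
env <- new.env(parent = environment())
source(script, local = env)

# Objects created by the script live in `env`, not in the caller.
total_in_env <- get("total", envir = env)
leaked <- exists("total", envir = environment(), inherits = FALSE)
```

Passing environment() or globalenv() as the environment instead would make the script's objects visible to the caller.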


Method clone()

The objects of this class are cloneable with this method.

Usage
SegmentSource$clone(deep = FALSE)
Arguments
deep

Whether to make a deep clone.

See Also

Other segment: SegmentRecipe, Segment


Parameters for make-like functions

Description

Parameters for make-like functions

Arguments

source

The path to an R script which makes the targets

recipe

A chunk of R code which makes the targets

targets

A character vector of paths to files

dependencies

A character vector of paths to files which the targets depend on

packages

A character vector of names of packages which targets depend on

envir

The environment in which to execute the source or recipe. By default, execution will take place in a fresh environment whose parent is the calling environment.

quiet

A logical determining whether or not messages are signaled

force

A logical determining whether or not execution of the source or recipe will be forced (i.e. happen whether or not the targets are out-of-date)

label

A short label for the source or recipe, displayed in pipeline visualisations. If NULL, the basename(source) or 'Recipe' will be used.

build

A logical determining whether or not the pipeline/segment will be built immediately or simply returned to the user
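
How envir and force interact with the up-to-date check can be sketched with a small make-like function in base R. Everything here is an assumption made for illustration: run_if_needed() is hypothetical, it takes the recipe as a quoted expression, and it only compares file modification times.

```r
# Hypothetical make-like helper: evaluate `recipe` only when needed.
run_if_needed <- function(recipe, targets, dependencies = NULL, force = FALSE,
                          envir = new.env(parent = parent.frame())) {
  missing_target <- any(!file.exists(targets))
  stale <- FALSE
  if (!missing_target && length(dependencies) > 0) {
    # Stale if the oldest target predates the newest dependency.
    stale <- min(file.mtime(targets)) < max(file.mtime(dependencies))
  }
  if (!(force || missing_target || stale)) {
    return(list(executed = FALSE, result = NULL, execution_time = NULL))
  }
  start <- Sys.time()
  result <- eval(recipe, envir = envir)
  list(executed = TRUE, result = result, execution_time = Sys.time() - start)
}

out <- tempfile(fileext = ".Rds")
recipe <- quote(saveRDS(mtcars, out))

first <- run_if_needed(recipe, targets = out)                # target missing: runs
second <- run_if_needed(recipe, targets = out)               # up to date: skipped
third <- run_if_needed(recipe, targets = out, force = TRUE)  # forced: runs
```

Because the default environment is created fresh with the caller as its parent, the recipe can read out from the calling scope without leaving new objects behind in it.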


Register objects to be returned from make_with_source

Description

A source script whose main product is one or more targets often generates other objects as by-products. It is sometimes useful to have access to these objects, typically to check that the targets were produced as expected.

Usage

make_register(value, name, quiet = FALSE)

Arguments

value

A value to be registered in a source script and returned as part of the Segment

name

A variable name, given as a character string. No coercion is done, and the first element of a character vector of length greater than one will be used, with a warning.

quiet

A logical determining whether or not warnings are signaled when make_register() is called outside of a 'makepipe' pipeline

Value

value invisibly

Examples


## Not run: 
  # Imagine this is part of your source script:
  x <- readRDS("input.Rds")
  x <- do_stuff(x)
  chk <- do_check(x)
  make_register(chk, "x_check")
  saveRDS(x, "output.Rds")

  # You will have access to `chk`, under its registered name `x_check`, in
  # your pipeline script:
  step_one <- make_with_source(
    "source.R",
    "output.Rds",
    "input.Rds"
  )
  step_one$result$x_check

## End(Not run)
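
The idea of registering a value under a name can be sketched with base R's assign(), which has the same name semantics described for the name argument. The registry environment and the register() function below are hypothetical stand-ins, not the package's internals.

```r
# Hypothetical registry: an environment that collects by-products by name.
registry <- new.env()

register <- function(value, name) {
  assign(name, value, envir = registry)
  invisible(value)
}

# "Source script" part: compute something and register a check alongside it.
x <- c(3, 1, 2)
chk <- register(all(is.finite(x)), "x_check")

# "Pipeline script" part: look the check up by its registered name.
result <- as.list(registry)
check_passed <- result$x_check
```

The value is stored under the string passed as name, so that string, rather than the name of the local variable, is what is used to look it up later.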

Create a pipeline using roxygen tags

Description

Instead of maintaining a separate pipeline script containing calls to make_with_source(), you can add roxygen-like headers to the .R files in your pipeline containing the @makepipe tag along with @targets, @dependencies, and so on. These tags will be parsed by make_with_dir() and used to construct a pipeline. You can call a specific part of the pipeline that has been documented in this way using make_with_roxy().

Usage

make_with_dir(
  dir = ".",
  recursive = FALSE,
  build = TRUE,
  envir = new.env(parent = parent.frame()),
  quiet = getOption("makepipe.quiet")
)

make_with_roxy(
  source,
  envir = new.env(parent = parent.frame()),
  quiet = getOption("makepipe.quiet"),
  build = TRUE
)

Arguments

dir

A character vector of full path names; the default corresponds to the working directory

recursive

A logical determining whether or not to recurse into subdirectories

build

A logical determining whether or not the pipeline/segment will be built immediately or simply returned to the user

envir

The environment in which to execute the source or recipe. By default, execution will take place in a fresh environment whose parent is the calling environment.

quiet

A logical determining whether or not messages are signaled

source

The path to an R script which makes the targets

Details

Other than @makepipe, which is used to tell whether a given script should be included in the pipeline, the tags recognised mirror the arguments to make_with_source(), for example @targets and @dependencies.

See the getting started vignette for more information.

Value

A Pipeline object

See Also

Other make: make_with_recipe(), make_with_source()

Examples

## Not run: 
# Create a pipeline from scripts in the working dir without executing it
p <- make_with_dir(build = FALSE)
p$build() # Then execute it yourself

## End(Not run)
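
To make the header idea concrete, the sketch below writes a script with a roxygen-like header and extracts its tags with base R. The exact value syntax the package accepts is not specified here; the comma-separated quoted strings and the parser read_tags() are assumptions for illustration only, not the package's parser.

```r
# A script with a roxygen-like header (value syntax is assumed).
script <- tempfile(fileext = ".R")
writeLines(c(
  "#' @makepipe",
  "#' @targets \"data/merged_data.Rds\"",
  "#' @dependencies \"data/raw_data.Rds\", \"data/raw_pop.Rds\"",
  "merged <- 1"
), script)

# Hypothetical parser: collect `#' @tag value` lines into a named list.
read_tags <- function(path) {
  lines <- readLines(path)
  header <- sub("^#'\\s*", "", lines[grepl("^#'\\s*@", lines)])
  tags <- sub("^@(\\S+).*$", "\\1", header)
  values <- trimws(sub("^@\\S+", "", header))
  lapply(setNames(values, tags), function(v) {
    if (v == "") character(0) else trimws(gsub("\"", "", strsplit(v, ",")[[1]]))
  })
}

tags <- read_tags(script)

# A script belongs to the pipeline only if it carries the @makepipe tag.
in_pipeline <- "makepipe" %in% names(tags)
```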

Make targets out of dependencies using a recipe

Description

Make targets out of dependencies using a recipe

Usage

make_with_recipe(
  recipe,
  targets,
  dependencies = NULL,
  packages = NULL,
  envir = new.env(parent = parent.frame()),
  quiet = getOption("makepipe.quiet"),
  force = FALSE,
  label = NULL,
  note = NULL,
  build = TRUE,
  ...
)

Arguments

recipe

A chunk of R code which makes the targets

targets

A character vector of paths to files

dependencies

A character vector of paths to files which the targets depend on

packages

A character vector of names of packages which targets depend on

envir

The environment in which to execute the source or recipe. By default, execution will take place in a fresh environment whose parent is the calling environment.

quiet

A logical determining whether or not messages are signaled

force

A logical determining whether or not execution of the source or recipe will be forced (i.e. happen whether or not the targets are out-of-date)

label

A short label for the source or recipe, displayed in pipeline visualisations. If NULL, the basename(source) or 'Recipe' will be used.

note

A description of what the recipe does, displayed in pipeline visualisations. If NULL, the recipe code is used.

build

A logical determining whether or not the pipeline/segment will be built immediately or simply returned to the user

...

Additional parameters to pass to base::eval()

Value

A Segment object containing execution metadata.

See Also

Other make: make_with_dir(), make_with_source()

Examples

## Not run: 
# Merge files in fresh environment if raw data has been updated since last
# merged
make_with_recipe(
  recipe = {
    dat <- readRDS("data/raw_data.Rds")
    pop <- readRDS("data/raw_pop.Rds")
    merged_dat <- merge(dat, pop, by = "id")
    saveRDS(merged_dat, "data/merged_data.Rds")
  },
  targets = "data/merged_data.Rds",
  dependencies = c("data/raw_data.Rds", "data/raw_pop.Rds")
)

# Merge files in current environment if raw data has been updated since last
# merged. (If recipe executed, all objects bound in recipe will be available
# in current env).
make_with_recipe(
  recipe = {
    dat <- readRDS("data/raw_data.Rds")
    pop <- readRDS("data/raw_pop.Rds")
    merged_dat <- merge(dat, pop, by = "id")
    saveRDS(merged_dat, "data/merged_data.Rds")
  },
  targets = "data/merged_data.Rds",
  dependencies = c("data/raw_data.Rds", "data/raw_pop.Rds"),
  envir = environment()
)

# Merge files in global environment if raw data has been updated since last
# merged. (If recipe executed, all objects bound in recipe will be available
# in global env).
make_with_recipe(
  recipe = {
    dat <- readRDS("data/raw_data.Rds")
    pop <- readRDS("data/raw_pop.Rds")
    merged_dat <- merge(dat, pop, by = "id")
    saveRDS(merged_dat, "data/merged_data.Rds")
  },
  targets = "data/merged_data.Rds",
  dependencies = c("data/raw_data.Rds", "data/raw_pop.Rds"),
  envir = globalenv()
)

## End(Not run)

Make targets out of dependencies using a source file

Description

Make targets out of dependencies using a source file

Usage

make_with_source(
  source,
  targets,
  dependencies = NULL,
  packages = NULL,
  envir = new.env(parent = parent.frame()),
  quiet = getOption("makepipe.quiet"),
  force = FALSE,
  label = NULL,
  note = NULL,
  build = TRUE,
  ...
)

Arguments

source

The path to an R script which makes the targets

targets

A character vector of paths to files

dependencies

A character vector of paths to files which the targets depend on

packages

A character vector of names of packages which targets depend on

envir

The environment in which to execute the source or recipe. By default, execution will take place in a fresh environment whose parent is the calling environment.

quiet

A logical determining whether or not messages are signaled

force

A logical determining whether or not execution of the source or recipe will be forced (i.e. happen whether or not the targets are out-of-date)

label

A short label for the source or recipe, displayed in pipeline visualisations. If NULL, the basename(source) or 'Recipe' will be used.

note

A description of what the source does, displayed in pipeline visualisations

build

A logical determining whether or not the pipeline/segment will be built immediately or simply returned to the user

...

Additional parameters to pass to base::source()

Value

A Segment object containing execution metadata.

See Also

Other make: make_with_dir(), make_with_recipe()

Examples

## Not run: 
# Merge files in fresh environment if raw data has been updated since last
# merged
make_with_source(
  source = "merge_data.R",
  targets = "data/merged_data.Rds",
  dependencies = c("data/raw_data.Rds", "data/raw_pop.Rds")
)


# Merge files in current environment if raw data has been updated since last
# merged. (If source executed, all objects bound in source will be available
# in current env).
make_with_source(
  source = "merge_data.R",
  targets = "data/merged_data.Rds",
  dependencies = c("data/raw_data.Rds", "data/raw_pop.Rds"),
  envir = environment()
)


# Merge files in global environment if raw data has been updated since last
# merged. (If source executed, all objects bound in source will be available
# in global env).
make_with_source(
  source = "merge_data.R",
  targets = "data/merged_data.Rds",
  dependencies = c("data/raw_data.Rds", "data/raw_pop.Rds"),
  envir = globalenv()
)

## End(Not run)


Check if targets are out-of-date vis-a-vis their dependencies

Description

Check if targets are out-of-date vis-a-vis their dependencies

Usage

out_of_date(targets, dependencies, packages = NULL)

Arguments

targets

A character vector of paths to files

dependencies

A character vector of paths to files which the targets depend on

packages

A character vector of names of packages which targets depend on

Value

TRUE if any of targets are older than any of dependencies or if any of targets do not exist; FALSE otherwise

Examples

## Not run: 
out_of_date("data/processed_data.Rds", "data/raw_data.Rds")

## End(Not run)
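
The rule stated under Value can be sketched in base R with file modification times. The function is_stale() is a hypothetical illustration, not the package's implementation: it ignores the packages argument and assumes all dependencies exist.

```r
# Hypothetical re-statement of the rule: TRUE if any target is missing or
# any target is older than any dependency.
is_stale <- function(targets, dependencies) {
  if (any(!file.exists(targets))) return(TRUE)
  min(file.mtime(targets)) < max(file.mtime(dependencies))
}

raw <- tempfile(fileext = ".Rds")
processed <- tempfile(fileext = ".Rds")
saveRDS(1:3, raw)

# The target does not exist yet.
missing_case <- is_stale(processed, raw)

# The target exists and the dependency is older than it.
saveRDS(1:3, processed)
Sys.setFileTime(raw, Sys.time() - 60)
fresh_case <- is_stale(processed, raw)

# The dependency has been modified after the target was written.
Sys.setFileTime(raw, Sys.time() + 60)
stale_case <- is_stale(processed, raw)
```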

Access and interface with Pipeline.

Description

get_pipeline(), set_pipeline() and reset_pipeline() access and modify the current active pipeline, while is_pipeline() does not affect the active pipeline.

Usage

is_pipeline(pipeline)

set_pipeline(pipeline)

get_pipeline()

reset_pipeline()

Arguments

pipeline

A pipeline. See Pipeline for more details.

See Also

Other pipeline: Pipeline, pipeline-vis

Examples

## Not run: 
# Build up a pipeline from scratch and save it out
reset_pipeline()
# A series of `make_with_*()` blocks go here...
saveRDS(get_pipeline(), "data/my_pipeline.Rds")

# ... Later on we can read in and set the pipeline
p <- readRDS("data/my_pipeline.Rds")
set_pipeline(p)

## End(Not run)

Visualise the Pipeline.

Description

Produce a flowchart visualisation of the pipeline. Out-of-date targets will be coloured red, up-to-date targets will be coloured green, and everything else will be blue.

Usage

show_pipeline(
  pipeline = get_pipeline(),
  as = c("nomnoml", "visnetwork", "text"),
  labels = NULL,
  notes = NULL,
  ...
)

save_pipeline(
  file,
  pipeline = get_pipeline(),
  as = c("nomnoml", "visnetwork", "text"),
  labels = NULL,
  notes = NULL,
  ...
)

Arguments

pipeline

A pipeline. See Pipeline for more details.

as

A string determining whether to use nomnoml, visNetwork, or a text summary

labels

A named character vector mapping nodes in the pipeline onto labels to display beside them.

notes

A named character vector mapping nodes in the Pipeline onto notes to display beside the labels (nomnoml) or as tooltips (visNetwork).

...

Arguments passed on to Pipeline$nomnoml() or Pipeline$visnetwork()

file

File to save png (nomnoml), html (visnetwork), or text summary (text) into

Details

Labels and notes must be supplied as named character vectors whose names correspond to the file paths of nodes (i.e. targets, dependencies, or source scripts).

See Also

Other pipeline: Pipeline, pipeline-accessors

Examples

## Not run: 
# Run pipeline
make_with_source(
  source = "recode.R",
  targets = "data/1 data.R",
  dependencies = "data/0 raw_data.R"
)
make_with_source(
  source = "merge.R",
  targets = "data/2 data.R",
  dependencies = c("data/1 data.R", "data/0 raw_pop.R")
)

# Visualise pipeline with custom notes
show_pipeline(notes = c(
  "data/0 raw_data.R" = "Raw survey data",
  "data/0 raw_pop.R" = "Raw population data",
  "data/1 data.R" = "Survey data with recodes applied",
  "data/2 data.R" = "Survey data with demographic variables merged in"
))

## End(Not run)
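
Because labels and notes are matched to nodes by name, it can help to see how a named character vector behaves when indexed by node paths. This base R sketch only demonstrates the lookup; how the package treats nodes without an annotation is not stated here.

```r
# Node paths as they might appear in a pipeline.
nodes <- c("data/0 raw_data.R", "data/0 raw_pop.R", "data/1 data.R", "data/2 data.R")

# Notes for only some of the nodes, keyed by file path.
notes <- c(
  "data/0 raw_data.R" = "Raw survey data",
  "data/2 data.R" = "Survey data with demographic variables merged in"
)

# Look each node up by name; nodes without a note come back as NA.
node_notes <- unname(notes[nodes])
annotated <- data.frame(node = nodes, note = node_notes, stringsAsFactors = FALSE)
```

A name that does not match a node's file path exactly will not be picked up by this kind of lookup, so it is worth checking paths character for character.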