Title: | Pipeline Tools Inspired by 'GNU Make' |
Version: | 0.2.2 |
Description: | A suite of tools for transforming an existing workflow into a self-documenting pipeline with very minimal upfront costs. Segments of the pipeline are specified in much the same way a 'Make' rule is, by declaring an executable recipe (which might be an R script), along with the corresponding targets and dependencies. When the entire pipeline is run through, only those recipes that need to be executed will be. Meanwhile, execution metadata is captured behind the scenes for later inspection. |
License: | GPL (≥ 3) |
URL: | https://kinto-b.github.io/makepipe/, https://github.com/kinto-b/makepipe |
BugReports: | https://github.com/kinto-b/makepipe/issues |
Imports: | cli, nomnoml, R6, utils, roxygen2 |
Suggests: | knitr, covr, testthat (≥ 3.0.0), withr, rmarkdown, webshot2, visNetwork |
Config/testthat/edition: | 3 |
Encoding: | UTF-8 |
RoxygenNote: | 7.2.3 |
VignetteBuilder: | knitr |
NeedsCompilation: | no |
Packaged: | 2025-01-05 16:50:57 UTC; kinto |
Author: | Kinto Behr [aut, cre, cph] |
Maintainer: | Kinto Behr <kinto.behr@gmail.com> |
Repository: | CRAN |
Date/Publication: | 2025-01-07 10:30:02 UTC |
Pipeline visualisations
Description
A Pipeline object is automatically constructed as calls to make_*() are made. It stores the relationships between targets, dependencies, and sources.
Public fields
segments
A list of Segment objects
Methods
Public methods
Method add_source_segment()
Add an edge to edges
Add any nodes in private$edges that are missing from private$nodes into private$nodes
Reconstruct Pipeline edges from Segment edges. Called primarily to update outofdateness
Add a pipeline segment corresponding to a make_with_source() call
Usage
Pipeline$add_source_segment( source, targets, dependencies, packages, envir, force )
Arguments
source
The path to an R script which makes the targets
targets
A character vector of paths to files
dependencies
A character vector of paths to files which the targets depend on
packages
A character vector of names of packages which targets depend on
envir
The environment in which to execute the source or recipe.
force
A logical determining whether or not execution of the source or recipe will be forced (i.e. happen whether or not the targets are out-of-date)
new_edge
A data.frame constructed with new_edge()
Returns
The SegmentSource
added to the Pipeline
Method add_recipe_segment()
Add a pipeline segment corresponding to a make_with_recipe() call
Usage
Pipeline$add_recipe_segment( recipe, targets, dependencies, packages, envir, force )
Arguments
recipe
A language object which, when evaluated, makes the targets
targets
A character vector of paths to files
dependencies
A character vector of paths to files which the targets depend on
packages
A character vector of names of packages which targets depend on
envir
The environment in which to execute the source or recipe.
force
A logical determining whether or not execution of the source or recipe will be forced (i.e. happen whether or not the targets are out-of-date)
Returns
The SegmentRecipe
added to the Pipeline
Method build()
Build all targets
Usage
Pipeline$build(quiet = getOption("makepipe.quiet"))
Arguments
quiet
A logical determining whether or not messages are signaled
Returns
self
Method clean()
Clean all targets
Usage
Pipeline$clean()
Returns
self
Method touch()
Touch all targets, updating file modification time to current time. This is useful when you know your targets are all up-to-date but makepipe doesn't (e.g. after a negligible change was made to your source code).
Usage
Pipeline$touch()
Returns
self
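To make the idea of "touching" concrete, here is a minimal base-R sketch of what resetting modification times amounts to. It is not makepipe's implementation; the helper touch_files() and the demo file are hypothetical.

```r
# Hypothetical helper: set the modification time of every existing file in
# `paths` to the current time, skipping paths that do not exist.
touch_files <- function(paths) {
  existing <- paths[file.exists(paths)]
  if (length(existing) > 0) {
    Sys.setFileTime(existing, Sys.time())
  }
  invisible(existing)
}

# Demo: backdate a file by an hour, then touch it.
demo_file <- tempfile(fileext = ".Rds")
saveRDS(1:3, demo_file)
Sys.setFileTime(demo_file, Sys.time() - 3600)
old_mtime <- file.mtime(demo_file)

touch_files(demo_file)
new_mtime <- file.mtime(demo_file)
```

After touching, a timestamp-based comparison against older dependencies would treat the file as up-to-date.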
Method annotate()
Apply annotations to Pipeline
Usage
Pipeline$annotate(labels = NULL, notes = NULL)
Arguments
labels
A named character vector mapping nodes in the Pipeline onto labels to display beside them.
notes
A named character vector mapping nodes in the Pipeline onto notes to display beside the labels (nomnoml) or as tooltips (visNetwork).
Method refresh()
Refresh Pipeline to check outofdateness
Usage
Pipeline$refresh()
Method nomnoml()
Display the pipeline with nomnoml
Usage
Pipeline$nomnoml( direction = c("down", "right"), arrow_size = 1, edge_style = c("hard", "rounded"), bend_size = 0.3, font = "Courier", font_size = 12, line_width = 3, padding = 16, spacing = 40, leading = 1.25, stroke = "#33322E", fill_arrows = FALSE, gutter = 5, edge_margin = 0 )
Arguments
direction
The direction the flowchart should go in
arrow_size
The arrowhead size
edge_style
The arrow edge style
bend_size
The degree of rounding in the arrows (requires edge_style=rounded)
font
The name of a font to use
font_size
The font size
line_width
The line width for arrows and box outlines
padding
The amount of padding within boxes
spacing
The amount of spacing between boxes
leading
The amount of spacing between lines of text
stroke
The color of arrows, text, and box outlines
fill_arrows
Whether arrow heads are full triangles (TRUE) or angled (FALSE)
gutter
The amount of space to leave around the flowchart
edge_margin
The amount of space to leave between boxes and arrows
Returns
self
Method visnetwork()
Display the pipeline with visNetwork
Usage
Pipeline$visnetwork(...)
Arguments
...
Arguments (other than nodes and edges) to pass to visNetwork::visNetwork()
Returns
self
Method text_summary()
Display a text summary of the pipeline
Usage
Pipeline$text_summary()
Returns
self
Method print()
Display
Usage
Pipeline$print(...)
Arguments
...
Arguments (other than nodes and edges) to pass to visNetwork::visNetwork()
Returns
self
Method save_visnetwork()
Save pipeline visNetwork
Usage
Pipeline$save_visnetwork(file, selfcontained = TRUE, background = "white", ...)
Arguments
file
File to save HTML into
selfcontained
Whether to save the HTML as a single self-contained file (with external resources base64 encoded) or a file with external resources placed in an adjacent directory.
background
Text string giving the html background color of the widget. Defaults to white.
...
Arguments (other than nodes and edges) to pass to visNetwork::visNetwork()
Returns
self
Method save_nomnoml()
Save pipeline nomnoml
Usage
Pipeline$save_nomnoml(file, width = NULL, height = NULL, ...)
Arguments
file
File to save the png into
width
Image width
height
Image height
...
Arguments to pass to self$nomnoml()
Returns
self
Method save_text_summary()
Save a text summary of the pipeline
Usage
Pipeline$save_text_summary(file)
Arguments
file
File to save text summary into
Returns
self
Method clone()
The objects of this class are cloneable with this method.
Usage
Pipeline$clone(deep = FALSE)
Arguments
deep
Whether to make a deep clone.
See Also
Other pipeline: pipeline-accessors, pipeline-vis
Segment
Description
A Segment object is automatically constructed and attached to the Pipeline when a call to make_*() is made. It stores the relationships between targets, dependencies, and sources.
Public fields
targets
A character vector of paths to files
dependencies
A character vector of paths to files which the targets depend on
packages
A character vector of names of packages which targets depend on
force
A logical determining whether or not execution of the source or recipe will be forced (i.e. happen whether or not the targets are out-of-date)
envir
The environment in which to execute the instructions.
result
An object, whatever is returned by executing the instructions
executed
A logical, whether or not the instructions were executed
execution_time
A difftime, the time taken to execute the instructions
label
A short label for the segment
note
A description of what the segment does
Active bindings
edges
Get edges connecting the dependencies, instructions, and targets
nodes
Get nodes corresponding to dependencies, instructions, and targets
text_summary
A plain text summary of the Segment
Methods
Public methods
Method new()
Initialise a new Segment
Usage
Segment$new( id, targets, dependencies, packages, envir, force, executed, result, execution_time )
Arguments
id
An integer that uniquely identifies the segment
targets
A character vector of paths to files
dependencies
A character vector of paths to files which the targets depend on
packages
A character vector of names of packages which targets depend on
envir
The environment in which to execute the instructions.
force
A logical determining whether or not execution of the source or recipe will be forced (i.e. happen whether or not the targets are out-of-date)
executed
A logical, whether or not the instructions were executed
result
An object, whatever is returned by executing the instructions
execution_time
A difftime, the time taken to execute the instructions
Method print()
Printing method
Usage
Segment$print()
Method update_result()
Update the Segment with new execution information
Usage
Segment$update_result(executed, execution_time, result)
Arguments
executed
A logical, whether or not the instructions were executed
execution_time
A difftime, the time taken to execute the instructions
result
An object, whatever is returned by executing the instructions
Method annotate()
Apply annotations to Segment
Usage
Segment$annotate(label = NULL, note = NULL)
Arguments
label
A short label for the segment
note
A description of what the segment does
Method clone()
The objects of this class are cloneable with this method.
Usage
Segment$clone(deep = FALSE)
Arguments
deep
Whether to make a deep clone.
See Also
Other segment: SegmentRecipe, SegmentSource
Segment
Description
A Segment object is automatically constructed and attached to the Pipeline when a call to make_*() is made. It stores the relationships between targets, dependencies, and sources.
Super class
makepipe::Segment -> SegmentRecipe
Public fields
recipe
A chunk of R code which makes the targets
Methods
Public methods
Inherited methods
Method new()
Initialise a new Segment
Usage
SegmentRecipe$new( id, recipe, targets, dependencies, packages, envir, force, executed, result, execution_time )
Arguments
id
An integer that uniquely identifies the segment
recipe
A chunk of R code which makes the targets
targets
A character vector of paths to files
dependencies
A character vector of paths to files which the targets depend on
packages
A character vector of names of packages which targets depend on
envir
The environment in which to execute the instructions.
force
A logical determining whether or not execution of the source or recipe will be forced (i.e. happen whether or not the targets are out-of-date)
executed
A logical, whether or not the instructions were executed
result
An object, whatever is returned by executing the instructions
execution_time
A difftime, the time taken to execute the instructions
Method update_result()
Update the Segment with new execution information
Usage
SegmentRecipe$update_result(executed, execution_time, result)
Arguments
executed
A logical, whether or not the instructions were executed
execution_time
A difftime, the time taken to execute the instructions
result
An object, whatever is returned by executing the instructions
Method execute()
Execute the Segment
Usage
SegmentRecipe$execute(envir = NULL, quiet = getOption("makepipe.quiet"), ...)
Arguments
envir
The environment in which to execute the instructions.
quiet
A logical determining whether or not messages are signaled
...
Additional parameters to pass to base::eval()
Method clone()
The objects of this class are cloneable with this method.
Usage
SegmentRecipe$clone(deep = FALSE)
Arguments
deep
Whether to make a deep clone.
See Also
Other segment: SegmentSource, Segment
Segment
Description
A Segment object is automatically constructed and attached to the Pipeline when a call to make_*() is made. It stores the relationships between targets, dependencies, and sources.
Super class
makepipe::Segment -> SegmentSource
Public fields
source
The path to an R script which makes the targets
Methods
Public methods
Inherited methods
Method new()
Initialise a new Segment
Usage
SegmentSource$new( id, source, targets, dependencies, packages, envir, force, executed, result, execution_time )
Arguments
id
An integer that uniquely identifies the segment
source
The path to an R script which makes the targets
targets
A character vector of paths to files
dependencies
A character vector of paths to files which the targets depend on
packages
A character vector of names of packages which targets depend on
envir
The environment in which to execute the instructions.
force
A logical determining whether or not execution of the source or recipe will be forced (i.e. happen whether or not the targets are out-of-date)
executed
A logical, whether or not the instructions were executed
result
An object, whatever is returned by executing the instructions
execution_time
A difftime, the time taken to execute the instructions
Method update_result()
Update the Segment with new execution information
Usage
SegmentSource$update_result(executed, execution_time, result)
Arguments
executed
A logical, whether or not the instructions were executed
execution_time
A difftime, the time taken to execute the instructions
result
An object, whatever is returned by executing the instructions
Method execute()
Execute the Segment
Usage
SegmentSource$execute(envir = NULL, quiet = getOption("makepipe.quiet"), ...)
Arguments
envir
The environment in which to execute the instructions.
quiet
A logical determining whether or not messages are signaled
...
Additional parameters to pass to base::source()
Method clone()
The objects of this class are cloneable with this method.
Usage
SegmentSource$clone(deep = FALSE)
Arguments
deep
Whether to make a deep clone.
See Also
Other segment: SegmentRecipe, Segment
Parameters for make-like functions
Description
Parameters for make-like functions
Arguments
source |
The path to an R script which makes the targets |
recipe |
A chunk of R code which makes the targets |
targets |
A character vector of paths to files |
dependencies |
A character vector of paths to files which the targets depend on |
packages |
A character vector of names of packages which targets depend on |
envir |
The environment in which to execute the source or recipe |
quiet |
A logical determining whether or not messages are signaled |
force |
A logical determining whether or not execution of the source or recipe will be forced (i.e. happen whether or not the targets are out-of-date) |
label |
A short label for the |
build |
A logical determining whether or not the pipeline/segment will be built immediately or simply returned to the user |
Register objects to be returned from make_with_source
Description
It is sometimes useful to have access to certain objects which are generated as side-products in a source script which yields as a main-product one or more targets. Typically these objects are used for checking that the targets were produced as expected.
Usage
make_register(value, name, quiet = FALSE)
Arguments
value |
A value to be registered in a source script and returned as part
of the |
name |
A variable name, given as a character string. No coercion is done, and the first element of a character vector of length greater than one will be used, with a warning. |
quiet |
A logical determining whether or not warnings are signaled when
|
Value
value, invisibly
Examples
## Not run:
# Imagine this is part of your source script:
x <- readRDS("input.Rds")
x <- do_stuff(x)
chk <- do_check(x)
make_register(chk, "x_check")
saveRDS(x, "output.Rds")
# You will have access to `chk`, under its registered name `x_check`, in your
# pipeline script:
step_one <- make_with_source(
  "source.R",
  "output.Rds",
  "input.Rds"
)
step_one$result$x_check
## End(Not run)
Create a pipeline using roxygen tags
Description
Instead of maintaining a separate pipeline script containing calls to make_with_source(), you can add roxygen-like headers to the .R files in your pipeline containing the @makepipe tag along with @targets, @dependencies, and so on. These tags will be parsed by make_with_dir() and used to construct a pipeline. You can call a specific part of the pipeline that has been documented in this way using make_with_roxy().
Usage
make_with_dir(
  dir = ".",
  recursive = FALSE,
  build = TRUE,
  envir = new.env(parent = parent.frame()),
  quiet = getOption("makepipe.quiet")
)
make_with_roxy(
  source,
  envir = new.env(parent = parent.frame()),
  quiet = getOption("makepipe.quiet"),
  build = TRUE
)
Arguments
dir |
A character vector of full path names; the default corresponds to the working directory |
recursive |
A logical determining whether or not to recurse into subdirectories |
build |
A logical determining whether or not the pipeline/segment will be built immediately or simply returned to the user |
envir |
The environment in which to execute the source or recipe |
quiet |
A logical determining whether or not messages are signaled |
source |
The path to an R script which makes the targets |
Details
Other than @makepipe, which is used to tell whether a given script should be included in the pipeline, the tags recognised mirror the arguments to make_with_source(). In particular,
- @targets and @dependencies are for declaring outputs and inputs respectively; the expected format is a comma-separated list of strings like @targets "out1.Rds", "out2.Rds", but R code like @targets file.path(DIR, "out.Rds") (evaluated in envir) works too
- @packages is for declaring the packages that the targets depend on; the expected format is @packages pkg1 pkg2 etc
- @force is for declaring whether or not execution should be forced; the expected format is a logical like TRUE or FALSE
See the getting started vignette for more information.
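A minimal sketch of what a tagged script might look like, based only on the tag formats described above. The file names, the use of tempdir(), the NULL that closes the header, and the script body are illustrative assumptions rather than anything prescribed by the package.

```r
#' @makepipe
#' @targets file.path(tempdir(), "squares.Rds")
#' @dependencies file.path(tempdir(), "numbers.Rds")
#' @packages stats utils
#' @force FALSE
NULL

# Hypothetical script body: read the dependency, write the target.
input <- file.path(tempdir(), "numbers.Rds")
output <- file.path(tempdir(), "squares.Rds")

# Demo only: create the input so this sketch can run on its own.
if (!file.exists(input)) saveRDS(c(1, 2, 3), input)

numbers <- readRDS(input)
saveRDS(numbers^2, output)
```

Saved as a .R file in the pipeline directory, a script like this is the kind of file make_with_dir() is described as picking up, and it could be run on its own through make_with_roxy().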
Value
A Pipeline object
See Also
Other make: make_with_recipe(), make_with_source()
Examples
## Not run:
# Create a pipeline from scripts in the working dir without executing it
p <- make_with_dir(build = FALSE)
p$build() # Then execute it yourself
## End(Not run)
Make targets out of dependencies using a recipe
Description
Make targets out of dependencies using a recipe
Usage
make_with_recipe(
  recipe,
  targets,
  dependencies = NULL,
  packages = NULL,
  envir = new.env(parent = parent.frame()),
  quiet = getOption("makepipe.quiet"),
  force = FALSE,
  label = NULL,
  note = NULL,
  build = TRUE,
  ...
)
Arguments
recipe |
A chunk of R code which makes the targets |
targets |
A character vector of paths to files |
dependencies |
A character vector of paths to files which the targets depend on |
packages |
A character vector of names of packages which targets depend on |
envir |
The environment in which to execute the source or recipe |
quiet |
A logical determining whether or not messages are signaled |
force |
A logical determining whether or not execution of the source or recipe will be forced (i.e. happen whether or not the targets are out-of-date) |
label |
A short label for the |
note |
A description of what the |
build |
A logical determining whether or not the pipeline/segment will be built immediately or simply returned to the user |
... |
Additional parameters to pass to base::eval() |
Value
A Segment object containing execution metadata.
See Also
Other make: make_with_dir(), make_with_source()
Examples
## Not run:
# Merge files in fresh environment if raw data has been updated since last
# merged
make_with_recipe(
  recipe = {
    dat <- readRDS("data/raw_data.Rds")
    pop <- readRDS("data/raw_pop.Rds")
    merged_dat <- merge(dat, pop, by = "id")
    saveRDS(merged_dat, "data/merged_data.Rds")
  },
  targets = "data/merged_data.Rds",
  dependencies = c("data/raw_data.Rds", "data/raw_pop.Rds")
)
# Merge files in current environment if raw data has been updated since last
# merged. (If recipe executed, all objects bound in recipe will be available
# in current env).
make_with_recipe(
  recipe = {
    dat <- readRDS("data/raw_data.Rds")
    pop <- readRDS("data/raw_pop.Rds")
    merged_dat <- merge(dat, pop, by = "id")
    saveRDS(merged_dat, "data/merged_data.Rds")
  },
  targets = "data/merged_data.Rds",
  dependencies = c("data/raw_data.Rds", "data/raw_pop.Rds"),
  envir = environment()
)
# Merge files in global environment if raw data has been updated since last
# merged. (If recipe executed, all objects bound in recipe will be available
# in global env).
make_with_recipe(
  recipe = {
    dat <- readRDS("data/raw_data.Rds")
    pop <- readRDS("data/raw_pop.Rds")
    merged_dat <- merge(dat, pop, by = "id")
    saveRDS(merged_dat, "data/merged_data.Rds")
  },
  targets = "data/merged_data.Rds",
  dependencies = c("data/raw_data.Rds", "data/raw_pop.Rds"),
  envir = globalenv()
)
## End(Not run)
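The following is a small self-contained sketch, assuming makepipe is installed, of how the execution metadata on the returned Segment might be inspected. The temporary directory, file names and toy data are hypothetical. The raw file is backdated only so that the timestamp comparison is unambiguous.

```r
library(makepipe)

# Hypothetical file locations inside a throwaway directory
work_dir <- tempfile("makepipe_demo_")
dir.create(work_dir)
raw_path <- file.path(work_dir, "raw_data.Rds")
out_path <- file.path(work_dir, "doubled_data.Rds")

saveRDS(data.frame(id = 1:3, value = c(10, 20, 30)), raw_path)
Sys.setFileTime(raw_path, Sys.time() - 60) # backdate the dependency

# The target does not exist yet, so it should count as out-of-date
stale_before <- out_of_date(out_path, raw_path)

seg <- make_with_recipe(
  recipe = {
    dat <- readRDS(raw_path)
    dat$value <- dat$value * 2
    saveRDS(dat, out_path)
  },
  targets = out_path,
  dependencies = raw_path,
  quiet = TRUE
)

# Execution metadata stored in the Segment's documented fields
seg$executed
seg$execution_time

# The target now exists and is newer than its dependency
stale_after <- out_of_date(out_path, raw_path)
```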
Make targets out of dependencies using a source file
Description
Make targets out of dependencies using a source file
Usage
make_with_source(
  source,
  targets,
  dependencies = NULL,
  packages = NULL,
  envir = new.env(parent = parent.frame()),
  quiet = getOption("makepipe.quiet"),
  force = FALSE,
  label = NULL,
  note = NULL,
  build = TRUE,
  ...
)
Arguments
source |
The path to an R script which makes the targets |
targets |
A character vector of paths to files |
dependencies |
A character vector of paths to files which the targets depend on |
packages |
A character vector of names of packages which targets depend on |
envir |
The environment in which to execute the source or recipe |
quiet |
A logical determining whether or not messages are signaled |
force |
A logical determining whether or not execution of the source or recipe will be forced (i.e. happen whether or not the targets are out-of-date) |
label |
A short label for the |
note |
A description of what the |
build |
A logical determining whether or not the pipeline/segment will be built immediately or simply returned to the user |
... |
Additional parameters to pass to base::source() |
Value
A Segment object containing execution metadata.
See Also
Other make: make_with_dir(), make_with_recipe()
Examples
## Not run:
# Merge files in fresh environment if raw data has been updated since last
# merged
make_with_source(
  source = "merge_data.R",
  targets = "data/merged_data.Rds",
  dependencies = c("data/raw_data.Rds", "data/raw_pop.Rds")
)
# Merge files in current environment if raw data has been updated since last
# merged. (If source executed, all objects bound in source will be available
# in current env).
make_with_source(
  source = "merge_data.R",
  targets = "data/merged_data.Rds",
  dependencies = c("data/raw_data.Rds", "data/raw_pop.Rds"),
  envir = environment()
)
# Merge files in global environment if raw data has been updated since last
# merged. (If source executed, all objects bound in source will be available
# in global env).
make_with_source(
  source = "merge_data.R",
  targets = "data/merged_data.Rds",
  dependencies = c("data/raw_data.Rds", "data/raw_pop.Rds"),
  envir = globalenv()
)
## End(Not run)
Check if targets are out-of-date vis-a-vis their dependencies
Description
Check if targets are out-of-date vis-a-vis their dependencies
Usage
out_of_date(targets, dependencies, packages = NULL)
Arguments
targets |
A character vector of paths to files |
dependencies |
A character vector of paths to files which the targets depend on |
packages |
A character vector of names of packages which targets depend on |
Value
TRUE if any of targets are older than any of dependencies or if any of targets do not exist; FALSE otherwise
Examples
## Not run:
out_of_date("data/processed_data.Rds", "data/raw_data.Rds")
## End(Not run)
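The rule stated under Value can be sketched in base R as follows. This is an illustrative approximation, not the package's implementation: the helper is_stale() is hypothetical, it ignores the packages argument, and it simply skips dependencies that do not exist.

```r
# Hypothetical helper mirroring the documented rule: TRUE if any target is
# missing, or if the oldest target is older than the newest dependency.
is_stale <- function(targets, dependencies) {
  if (!all(file.exists(targets))) {
    return(TRUE)
  }
  deps <- dependencies[file.exists(dependencies)]
  if (length(deps) == 0) {
    return(FALSE)
  }
  min(file.mtime(targets)) < max(file.mtime(deps))
}

# Demo with throwaway files and explicitly set timestamps
demo_dir <- tempfile("stale_demo_")
dir.create(demo_dir)
dep <- file.path(demo_dir, "raw_data.Rds")
tgt <- file.path(demo_dir, "processed_data.Rds")

saveRDS(1:3, dep)
missing_target <- is_stale(tgt, dep) # target not created yet

saveRDS(4:6, tgt)
now <- Sys.time()
Sys.setFileTime(dep, now - 60)
Sys.setFileTime(tgt, now)
fresh_target <- is_stale(tgt, dep) # target newer than dependency

Sys.setFileTime(dep, now + 60)
updated_dependency <- is_stale(tgt, dep) # dependency changed afterwards
```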
Access and interface with Pipeline.
Description
get_pipeline(), set_pipeline() and reset_pipeline() access and modify the current active pipeline, while all other helper functions do not affect the active pipeline.
Usage
is_pipeline(pipeline)
set_pipeline(pipeline)
get_pipeline()
reset_pipeline()
Arguments
pipeline |
A pipeline. See Pipeline for more details. |
See Also
Other pipeline: Pipeline, pipeline-vis
Examples
## Not run:
# Build up a pipeline from scratch and save it out
reset_pipeline()
# A series of `make_with_*()` blocks go here...
saveRDS(get_pipeline(), "data/my_pipeline.Rds")
# ... Later on we can read in and set the pipeline
p <- readRDS("data/my_pipeline.Rds")
set_pipeline(p)
## End(Not run)
Visualise the Pipeline.
Description
Produce a flowchart visualisation of the pipeline. Out-of-date targets will be coloured red, up-to-date targets will be coloured green, and everything else will be blue.
Usage
show_pipeline(
  pipeline = get_pipeline(),
  as = c("nomnoml", "visnetwork", "text"),
  labels = NULL,
  notes = NULL,
  ...
)
save_pipeline(
  file,
  pipeline = get_pipeline(),
  as = c("nomnoml", "visnetwork", "text"),
  labels = NULL,
  notes = NULL,
  ...
)
Arguments
pipeline |
A pipeline. See Pipeline for more details. |
as |
A string determining whether to use |
labels |
A named character vector mapping nodes in the Pipeline onto labels to display beside them. |
notes |
A named character vector mapping nodes in the Pipeline onto notes to display beside the labels (nomnoml) or as tooltips (visNetwork). |
... |
Arguments passed onto |
file |
File to save png (nomnoml) or html (visnetwork) into |
Details
Labels and notes must be supplied as named character vectors where the names correspond to the filepaths of nodes (i.e. targets, dependencies, or source scripts).
See Also
Other pipeline: Pipeline, pipeline-accessors
Examples
## Not run:
# Run pipeline
make_with_source(
  source = "recode.R",
  targets = "data/1 data.R",
  dependencies = "data/0 raw_data.R"
)
make_with_source(
  source = "merge.R",
  targets = "data/2 data.R",
  dependencies = c("data/1 data.R", "data/0 raw_pop.R")
)
# Visualise pipeline with custom notes
show_pipeline(notes = c(
  "data/0 raw_data.R" = "Raw survey data",
  "data/0 raw_pop.R" = "Raw population data",
  "data/1 data.R" = "Survey data with recodes applied",
  "data/2 data.R" = "Survey data with demographic variables merged in"
))
## End(Not run)