The hardware and bandwidth for this mirror is donated by METANET, the Webhosting and Full Service-Cloud Provider.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]metanet.ch.
Let’s start where we left off in the Get started with pipeflow vignette, that is, we have the following pipeline
pip
# <pipeflow_pip> my-pip (4 steps)
# -------------------------------
# step depends out state
# 1: data <data.frame[10x6]> done
# 2: data_prep data <data.frame[10x7]> done
# 3: model_fit data_prep <lm[13]> done
# 4: model_plot model_fit,data_prep <ggplot2::ggplot> donewith the following set data
Let’s say we want to insert a new step after the
data_prep step that standardizes the y-variable.
pip |> pip_add(
"standardize",
function(
data = ~data_prep,
yVar = "Ozone"
) {
data[, yVar] <- scale(data[, yVar])
data
},
after = "data_prep"
)pip
# <pipeflow_pip> my-pip (5 steps)
# -------------------------------
# step depends out state
# 1: data <data.frame[10x6]> done
# 2: data_prep data <data.frame[10x7]> done
# 3: standardize data_prep [NULL] new
# 4: model_fit data_prep <lm[13]> done
# 5: model_plot model_fit,data_prep <ggplot2::ggplot> doneThe standardize step is now part of the pipeline, but so
far it is not used by any other step.
Let’s revisit the function definition of the model_fit
step
pip[["model_fit", "fun"]]
# function (data = ~data_prep, xVar = "Solar.R")
# {
# lm(paste("Ozone ~", xVar), data = data)
# }To use the standardized data, we need to change the data dependency
such that it refers to the standardize step. Also instead
of a fixed y-variable in the model, let’s pass it as a parameter.
pip |> pip_replace(
"model_fit",
function(
data = ~standardize, # <- changed data reference
xVar = "Temp.Celsius",
yVar = "Ozone" # <- new y-variable
) {
lm(paste(yVar, "~", xVar), data = data)
}
)The model_plot step needs to be updated in a similar
way.
pip |> pip_replace(
"model_plot",
function(
model = ~model_fit,
data = ~standardize, # <- changed data reference
xVar = "Temp.Celsius",
yVar = "Ozone", # <- new y-variable
title = "Linear model fit"
) {
coeffs <- coefficients(model)
ggplot(data) +
geom_point(aes(.data[[xVar]], .data[[yVar]])) +
geom_abline(intercept = coeffs[1], slope = coeffs[2]) +
labs(title = title)
}
)The updated pipeline now looks as follows.
pip
# <pipeflow_pip> my-pip (5 steps)
# -------------------------------
# step depends out state
# 1: data <data.frame[10x6]> done
# 2: data_prep data <data.frame[10x7]> done
# 3: standardize data_prep [NULL] new
# 4: model_fit standardize [NULL] new
# 5: model_plot model_fit,standardize [NULL] newWe see that the model_fit and model_plot
steps now use (i.e., depend on) the standardized data. Let’s re-run the
pipeline and inspect the output.
pip_set_params(pip, params = list(xVar = "Solar.R", yVar = "Wind"))
pip_run(pip)
# info [2026-06-14 20:27:06.039 UTC]: Start run of pipeflow_pip 'my-pip'
# info [2026-06-14 20:27:06.039 UTC]: Step 1/5 data - skipping done step
# info [2026-06-14 20:27:06.039 UTC]: Step 2/5 data_prep - skipping done step
# info [2026-06-14 20:27:06.039 UTC]: Step 3/5 standardize
# info [2026-06-14 20:27:06.042 UTC]: Step 4/5 model_fit
# info [2026-06-14 20:27:06.044 UTC]: Step 5/5 model_plot
# info [2026-06-14 20:27:06.053 UTC]: Finished run of pipeflow_pip 'my-pip'Let’s see the pipeline again.
pip
# <pipeflow_pip> my-pip (5 steps)
# -------------------------------
# step depends out state
# 1: data <data.frame[10x6]> done
# 2: data_prep data <data.frame[10x7]> done
# 3: standardize data_prep <data.frame[10x7]> done
# 4: model_fit standardize <lm[13]> done
# 5: model_plot model_fit,standardize <ggplot2::ggplot> doneWhen you are trying to remove a step, {pipeflow} by default checks if the step is used by any other step, and raises an error if removing the step would violate the integrity of the pipeline.
try(pip_remove(pip, "standardize"))
# Error in pip_remove(pip, "standardize") :
# cannot remove step 'standardize' because the following steps depend on it: 'model_fit', 'model_plot'To enforce removing a step together with all its downstream
dependencies, you can use the recursive argument.
pip_remove(pip, "standardize", recursive = TRUE)
# Removing step 'standardize' and its downstream dependencies: 'model_fit', 'model_plot'pip
# <pipeflow_pip> my-pip (2 steps)
# -------------------------------
# step depends out state
# 1: data <data.frame[10x6]> done
# 2: data_prep data <data.frame[10x7]> doneNaturally, the last step never has any downstream dependencies, so it can be removed without any issues.
pip
# <pipeflow_pip> my-pip (1 step)
# ------------------------------
# step depends out state
# 1: data <data.frame[10x6]> doneReplacing steps in a pipeline as shown in this vignette will allow to re-use existing pipelines and adapt them programmatically to new requirements. Another way of re-using pipelines is to combine them, which is shown in the Combining pipelines vignette.
These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.