Jsonnet Pipeline Specifications
Jsonnet pipeline specifications let you parameterize your pipeline specification files, adding a dynamic component to their creation and maintenance. They use the open-source Jsonnet data templating language to wrap a baseline JSON pipeline specification in a function, so that parameters can be injected into the spec at creation time.
You can use Jsonnet pipeline specs to both create and update pipelines.
Benefits #
- Parameterization: Pass parameters to your pipeline specification files, making them dynamic and reusable.
- Code Reuse: Reuse the baseline of a given pipeline spec while experimenting with various values of given fields.
- Modularity: Create a library of pipeline specifications that can be instantiated with different parameters.
- Readability: Write more concise and readable pipeline specifications.
- Flexibility: Create multiple pipelines at once from a single file.
- Ease of Maintenance: Maintain a single file for multiple pipelines, reducing the number of files you need to manage.
Use Cases #
Pass different values to the same pipeline specification file for any of the following fields:
- Input repositories
- Image tags
- Pipeline names
- Pipeline descriptions
- Transform commands
- Transform images
- Input globs
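As a hypothetical sketch of the last few use cases, a single function can parameterize the image tag, transform command, and input glob at once. The `tag`, `cmd`, and `glob` parameter names and the `acme/detector` image below are illustrative, not from this guide:

```jsonnet
// Hypothetical sketch: one spec, three injected values.
// tag  : image tag to run (e.g. "v1.2.0")
// cmd  : entrypoint command for the transform
// glob : glob pattern controlling how input is split into datums
function(tag, cmd, glob)
{
  // Dots are not valid in pipeline names, so derive a safe suffix from the tag.
  pipeline: { name: "detector-" + std.strReplace(tag, ".", "-") },
  description: "Detection pipeline pinned to image tag " + tag,
  input: {
    pfs: {
      repo: "images",  // illustrative input repo
      glob: glob,
    }
  },
  transform: {
    cmd: [ cmd ],
    image: "acme/detector:" + tag,  // illustrative image name
  }
}
```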
Before You Start #
Jsonnet pipeline specifications are an experimental feature. All jsonnet pipeline specs have a .jsonnet extension. Read Jsonnet’s complete standard library documentation to learn about the variable types, string manipulation and mathematical functions, and assertions available to you.
At minimum, your function should always take a parameter that modifies pipeline.name, because HPE Machine Learning Data Management pipeline names must be unique. By adding a prefix or a suffix to a generic name, you can quickly generate several pipelines from the same jsonnet pipeline specification file.
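For instance, a minimal sketch of such a name-modifying parameter (the `suffix` name, the `data` repo, and the transform below are illustrative):

```jsonnet
// Hypothetical minimal spec: each value of "suffix" yields a distinct
// pipeline name, so the same file can be instantiated many times.
// std.toString lets numeric arguments be concatenated safely.
function(suffix)
{
  pipeline: { name: "my-pipeline-" + std.toString(suffix) },
  description: "Instance " + std.toString(suffix) + " of a shared spec",
  input: { pfs: { repo: "data", glob: "/*" } },        // illustrative repo
  transform: { cmd: ["/bin/app"], image: "busybox" },  // illustrative transform
}
```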
CLI #
Creating pipelines from jsonnet pipeline specs in the CLI requires a function with named arguments that represent the parameters you want to pass to the pipeline spec.
```jsonnet
// comments are arbitrary but recommended to describe the function
// arg1: description of arg1
// arg2: description of arg2
function(arg1, arg2, ... )
{
  ...
}
```

```shell
pachctl create pipeline --jsonnet jsonnet/example.jsonnet --arg arg1=foo --arg arg2=bar
```
Examples #
Parameterizing Pipeline Name & Input Repo #
The following example enables you to:
- Pass in a value for the `name` attribute as the `suffix` parameter to create a unique pipeline name.
- Pass in a value for the `repo` attribute as the `src` parameter to specify the input repository.
```jsonnet
# edges.jsonnet
////
// Template arguments:
//
// suffix : An arbitrary suffix appended to the name of this pipeline, for
//          disambiguation when multiple instances are created.
// src    : the repo from which this pipeline will read the images to which
//          it applies edge detection.
////
function(suffix, src)
{
  pipeline: { name: "edges-"+suffix },
  description: "OpenCV edge detection on "+src,
  input: {
    pfs: {
      name: "images",
      glob: "/*",
      repo: src,
    }
  },
  transform: {
    cmd: [ "python3", "/edges.py" ],
    image: "pachyderm/opencv:0.0.1"
  }
}
```

```shell
pachctl create pipeline --jsonnet jsonnet/edges.jsonnet --arg suffix=1 --arg src=images
```
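Since a benefit of jsonnet is creating multiple pipelines at once from a single file, a jsonnet evaluation can also emit several specs via a comprehension. This is a hypothetical sketch: the comma-separated `globs` argument and the array output are illustrative, so verify that your pachctl version accepts a list of specs from a jsonnet file:

```jsonnet
// Hypothetical sketch: one edges pipeline per glob pattern in a
// comma-separated "globs" argument, e.g. --arg globs="/*,/"
function(src, globs)
local gs = std.split(globs, ",");
[
  {
    pipeline: { name: "edges-" + std.toString(i) },
    description: "OpenCV edge detection on " + src + " with glob " + gs[i],
    input: {
      pfs: { name: "images", repo: src, glob: gs[i] }
    },
    transform: {
      cmd: [ "python3", "/edges.py" ],
      image: "pachyderm/opencv:0.0.1"
    }
  }
  for i in std.range(0, std.length(gs) - 1)
]
```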
Console #
Creating pipelines from jsonnet pipeline specs in Console requires adhering to the following template structure, where the required metadata is written in YAML format inside a comment block:
```jsonnet
/*
title: Required title of the pipeline
description: "Optional description of the pipeline"
args: # Required array that tells Console which fields to present to the user.
  - name: arg1
    description: description of arg1
    type: string
  - name: arg2
    description: description of arg2
    type: string
    default: Optional default value to display upon creating an instance of this pipeline
*/
function(arg1, arg2, ... )
```
Examples #
Parameterizing Pipeline Name & Input Repo #
- Create an `edges.jsonnet` file like the following:

```jsonnet
/*
title: Image edges
description: "Simple example pipeline."
args:
  - name: suffix
    description: Pipeline name suffix
    type: string
    default: 1
  - name: src
    description: Input repo to pipeline.
    type: string
    default: test
*/
function(suffix, src)
{
  pipeline: { name: "edges-"+suffix },
  description: "OpenCV edge detection on "+src,
  input: {
    pfs: {
      name: "images",
      glob: "/*",
      repo: src,
    }
  },
  transform: {
    cmd: [ "python3", "/edges.py" ],
    image: "pachyderm/opencv:0.0.1"
  }
}
```
- Save the jsonnet pipeline spec file at an accessible location.
- Authenticate to HPE Machine Learning Data Management or access Console via Localhost.
- Scroll through the project list to find a project you want to view.
- Select View Project.
- Select Create > Pipeline from template from the sidebar.
- Provide a valid path to the pipeline spec file.
- Select Continue.
- Fill out any populated fields from the pipeline spec file and verify that the default values are correct.
- Select Create Pipeline.