Snapshots
This article explains how to create and manage snapshots of your HPE Machine Learning Data Management cluster. Create a snapshot before you upgrade your cluster so that you can recover the cluster state if needed.
Understanding Snapshots #
An HPE Machine Learning Data Management snapshot is a complete backup of your cluster state, consisting of:
- Postgres Snapshot: A pg_dump of the database containing cluster state information
- ChunkSet: A collection of live data chunks from object storage at the time of backup
You can use these snapshots to restore your cluster to a previous state during Helm upgrades. The following command, for example, restores the snapshot with ID 42:
helm upgrade pachd pachyderm/pachyderm \
--set restoreSnapshot.enabled=true \
--set restoreSnapshot.snapshot_id=42 \
--reuse-values
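The restore command can be parameterized so that the snapshot ID is not hard-coded. The following is a minimal sketch, not an official tool: build_restore_command is a hypothetical helper name, and the release name, chart, and --set keys are copied unchanged from the example above. It only prints the command, so you can review it before running it.

```shell
# Hypothetical helper: print the Helm restore command for a given snapshot ID.
# The release name, chart, and --set keys are taken from the example above.
build_restore_command() {
  local snapshot_id="$1"
  # Accept only IDs that consist entirely of digits.
  case "$snapshot_id" in
    ''|*[!0-9]*)
      echo "error: snapshot ID must be numeric, got '$snapshot_id'" >&2
      return 1
      ;;
  esac
  printf 'helm upgrade pachd pachyderm/pachyderm --set restoreSnapshot.enabled=true --set restoreSnapshot.snapshot_id=%s --reuse-values\n' "$snapshot_id"
}

# Usage: print the command for snapshot 42, review it, then run it yourself.
#   build_restore_command 42
```

Printing instead of executing keeps the helper safe to try on a workstation that has no access to the cluster.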
How to Manage Snapshots #
Create a Snapshot #
- Run pachctl create snapshot.
- Obtain the ID of the snapshot by running pachctl list snapshot.
ID CHUNKSET CREATED
2  2        3 seconds ago
1  1        58 seconds ago
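When scripting the two steps above, the new snapshot ID can be read from the listing rather than copied by hand. This sketch assumes the output has the layout shown in the sample (one header row, then rows whose first column is a numeric ID) and that the newest snapshot has the highest ID, as it does in the sample. highest_snapshot_id is a hypothetical helper name.

```shell
# Hypothetical helper: read `pachctl list snapshot` output on stdin and print
# the highest value found in the ID column. Exits non-zero if no ID is found.
highest_snapshot_id() {
  awk '
    # Skip the header row and consider only rows with a numeric first column.
    NR > 1 && $1 ~ /^[0-9]+$/ {
      if (!seen || $1 + 0 > max) { max = $1 + 0; seen = 1 }
    }
    END { if (seen) print max; else exit 1 }
  '
}

# Usage (requires a configured pachctl):
#   pachctl create snapshot
#   snapshot_id="$(pachctl list snapshot | highest_snapshot_id)"
```

If the output format of your pachctl version differs from the sample, adjust the column handling before relying on the result.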
List Available Snapshots #
If you create snapshots routinely, you can list all available snapshots by running pachctl list snapshot.
pachctl list snapshot
Inspect a Snapshot #
You can inspect a snapshot to verify its contents by running pachctl inspect snapshot <SNAPSHOT_ID>.
pachctl inspect snapshot <SNAPSHOT_ID>
ID: 1
Chunkset: 1
Created: 2 minutes ago
Version: v2.12.0
Fileset: fa492645343dff8e0ed7f6685c8b6228.d724eb2b1bf365148b8f4ffa746a50cfc2446f7ca5fa713f4ef8a757b0a792fe
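To check a single field in a script, for example the Version a snapshot was taken with, the inspect output can be filtered. This is a sketch under the assumption that the output consists of "Key: value" lines as in the sample above. snapshot_field is a hypothetical helper name.

```shell
# Hypothetical helper: read `pachctl inspect snapshot` output on stdin and
# print the value of the field named by the first argument.
# Exits non-zero if the field is not present.
snapshot_field() {
  awk -v key="$1" '
    # Match lines that start with "<key>:" and strip the key and separator.
    index($0, key ":") == 1 {
      sub(/^[^:]*:[ \t]*/, "")
      print
      found = 1
      exit
    }
    END { exit !found }
  '
}

# Usage (requires a configured pachctl):
#   pachctl inspect snapshot <SNAPSHOT_ID> | snapshot_field Version
```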
Delete a Snapshot #
Typically, you should delete snapshots after successfully upgrading your cluster.
pachctl delete snapshot <SNAPSHOT_ID>
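If several snapshots have accumulated, the cleanup can be prepared as a reviewable list of delete commands. The sketch below assumes the pachctl list snapshot layout shown earlier in this article (a header row followed by rows that start with a numeric ID). print_delete_commands is a hypothetical helper name, and it prints commands without executing them.

```shell
# Hypothetical helper: read `pachctl list snapshot` output on stdin and print
# one delete command per listed snapshot. Nothing is deleted by this function.
print_delete_commands() {
  awk 'NR > 1 && $1 ~ /^[0-9]+$/ { print "pachctl delete snapshot " $1 }'
}

# Usage (requires a configured pachctl):
#   pachctl list snapshot | print_delete_commands        # review the list
#   pachctl list snapshot | print_delete_commands | sh   # then execute it
```

Reviewing the printed commands first makes it harder to delete a snapshot you still need for a rollback.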