# Manage Resource Templates
HPE Machine Learning Inferencing Software includes default resource templates that users can select when adding or editing a packaged model. You can manage these resource templates and create new ones using the MLIS UI, CLI, or API.
## Before You Start
- You must have the Admin or Maintainer user role to manage resource templates.
- Ensure your cluster has sufficient resources and appropriate GPU types to support the resource templates you create.
- Active deployments using an updated resource template will apply the new settings during the next canary rollout or upon re-deployment.
## Default Resource Templates
The following table shows the default resource templates available in MLIS.
| Name | Description | Request CPU | Request Memory | Request GPU | Limit CPU | Limit Memory | Limit GPU |
|---|---|---|---|---|---|---|---|
| cpu-tiny | 1 cpu, 10Gi memory, no gpu per replica | 1 | 10Gi | | 1 | 10Gi | |
| cpu-small | 4 cpu, 20Gi memory, no gpu per replica | 4 | 20Gi | | 6 | 40Gi | |
| cpu-large | 8 cpu, 40Gi memory, no gpu per replica | 8 | 40Gi | | 10 | 60Gi | |
| gpu-tiny | 1 cpu, 10Gi memory, 1 gpu per replica | 1 | 10Gi | 1 | 1 | 10Gi | 1 |
| gpu-small | 2 cpu, 20Gi memory, 2 gpu per replica | 2 | 20Gi | 2 | 6 | 40Gi | 2 |
| gpu-large | 8 cpu, 40Gi memory, 4 gpu per replica | 8 | 40Gi | 4 | 10 | 60Gi | 4 |
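Each template pairs a request (the guaranteed allocation per replica) with a limit (the ceiling a replica may burst to under load). As a sketch of how one row maps onto the `resources` object used by the templates API elsewhere on this page, the gpu-small template could be expressed as follows; the validation helper is illustrative, not part of MLIS:

```python
# The gpu-small row expressed as a "resources" object (structure mirrors the
# API request body shown in the "Via the API" section of this page).
gpu_small = {
    "requests": {"cpu": "2", "memory": "20Gi", "gpu": "2"},  # guaranteed per replica
    "limits": {"cpu": "6", "memory": "40Gi", "gpu": "2"},    # ceiling for traffic spikes
}

# A request that exceeds its limit is invalid, so a minimal sanity check
# compares each requested quantity against the corresponding limit.
def requests_within_limits(resources):
    def as_number(value):
        # Strip a trailing "Gi" unit so memory values compare numerically.
        return float(value[:-2]) if value.endswith("Gi") else float(value)
    return all(
        as_number(resources["requests"][key]) <= as_number(resources["limits"][key])
        for key in resources["requests"]
    )
```

A check like this catches a template whose limits were accidentally set below its requests before it is submitted.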
## How to Add Resource Templates

### Via the UI
- In the MLIS UI, navigate to Settings > Resource templates.
- Select Add new resource template.
- Enter the name, description, and resource requirements for the template.

  | Field | Description |
  |---|---|
  | Template name | A unique identifier for the resource template |
  | Description | A brief explanation of the template's purpose or characteristics |
  | CPU request | The minimum CPU resources required for the packaged model to operate |
  | CPU limit | The maximum CPU resources allowed for handling traffic spikes |
  | Memory request | The minimum memory resources required for the packaged model to operate |
  | Memory limit | The maximum memory resources allowed for handling traffic spikes |
  | GPU request | The minimum GPU resources required for the packaged model to operate |
  | GPU limit | The maximum GPU resources allowed for handling traffic spikes |
  | GPU type | The specific GPU model required for the packaged model (e.g., NVIDIA A100). Specifying a GPU type requires heterogeneous GPU support to be enabled. |

- Select Create template.
The new template is now available from the Resource Template dropdown on the Resources tab when adding or editing a packaged model. To update this template, select the ellipsis icon next to the template name and choose Edit.
### Via the CLI
- Sign in via the CLI.

  ```bash
  aioli user login admin
  ```
- Add a new resource template with the following command:

  ```bash
  aioli templates resource create <TEMPLATE_NAME> \
    --description <TEMPLATE_DESCRIPTION> \
    --gpu-type <GPU_TYPE> \
    --limits-cpu <LIMITED_CPU> \
    --limits-memory <LIMITED_MEMORY> \
    --requests-cpu <REQUESTED_CPU> \
    --requests-memory <REQUESTED_MEMORY>
  ```
### Via the API
- Sign in to MLIS.

  ```bash
  curl -X 'POST' \
    '<YOUR_EXT_CLUSTER_IP>/api/v1/login' \
    -H 'accept: application/json' \
    -H 'Content-Type: application/json' \
    -d '{
      "username": "<YOUR_USERNAME>",
      "password": "<YOUR_PASSWORD>"
    }'
  ```
- Obtain the Bearer token from the response.
- Use the following cURL command to add a new resource template.

  ```bash
  curl -X 'POST' \
    'https://<YOUR_EXT_CLUSTER_IP>/api/v1/templates/resources' \
    -H 'Accept: application/json' \
    -H 'Authorization: Bearer <YOUR_ACCESS_TOKEN>' \
    -H 'Content-Type: application/json' \
    -d '{
      "description": "A resource template",
      "name": "my-template",
      "resources": {
        "gpuType": "NVIDIA_A100",
        "limits": {
          "cpu": "0.5",
          "gpu": "5",
          "memory": "2.5Gi"
        },
        "requests": {
          "cpu": "0.5",
          "gpu": "5",
          "memory": "2.5Gi"
        }
      }
    }'
  ```
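The same login-then-create flow can be sketched in Python using only the standard library. The host, credentials, and template values below are placeholders; the endpoint paths and payload shape follow the cURL examples above, while the name of the token field in the login response is an assumption and may differ in your deployment:

```python
import json
import urllib.request

BASE_URL = "https://<YOUR_EXT_CLUSTER_IP>"  # placeholder: replace with your cluster address

def build_payload(name, description, requests_res, limits_res, gpu_type=None):
    """Assemble the JSON body for POST /api/v1/templates/resources."""
    resources = {"requests": requests_res, "limits": limits_res}
    if gpu_type is not None:
        resources["gpuType"] = gpu_type
    return {"name": name, "description": description, "resources": resources}

def post_json(url, body, token=None):
    """POST a JSON body, optionally with a Bearer token, and return the parsed response."""
    headers = {"Accept": "application/json", "Content-Type": "application/json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    req = urllib.request.Request(
        url, data=json.dumps(body).encode(), headers=headers, method="POST"
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

payload = build_payload(
    "my-template",
    "A resource template",
    requests_res={"cpu": "0.5", "gpu": "5", "memory": "2.5Gi"},
    limits_res={"cpu": "0.5", "gpu": "5", "memory": "2.5Gi"},
    gpu_type="NVIDIA_A100",
)

# Against a live cluster (not executed here; "token" is an assumed field name):
# token = post_json(f"{BASE_URL}/api/v1/login",
#                   {"username": "<YOUR_USERNAME>", "password": "<YOUR_PASSWORD>"})["token"]
# post_json(f"{BASE_URL}/api/v1/templates/resources", payload, token)
```

Building the payload separately from the request makes it easy to validate or log the template definition before it is submitted.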