# Manage Resource Templates
HPE Machine Learning Inferencing Software includes default resource templates that users can select when adding or editing a packaged model. You can manage these resource templates and create new ones using the MLIS UI, CLI, or API.
## Before You Start
- You must have the Admin or Maintainer user role to manage resource templates.
- Ensure your cluster has sufficient resources and appropriate GPU types to support the resource templates you create.
- Active deployments using an updated resource template will apply the new settings during the next canary rollout or upon re-deployment.
## Default Resource Templates
The following table shows the default resource templates available in MLIS.
| Name | Description | Request CPU | Request Memory | Request GPU | Limit CPU | Limit Memory | Limit GPU |
|---|---|---|---|---|---|---|---|
| cpu-tiny | 1 cpu, 10Gi memory, no gpu per replica | 1 | 10Gi | | 1 | 10Gi | |
| cpu-small | 4 cpu, 20Gi memory, no gpu per replica | 4 | 20Gi | | 6 | 40Gi | |
| cpu-large | 8 cpu, 40Gi memory, no gpu per replica | 8 | 40Gi | | 10 | 60Gi | |
| gpu-tiny | 1 cpu, 10Gi memory, 1 gpu per replica | 1 | 10Gi | 1 | 1 | 10Gi | 1 |
| gpu-small | 2 cpu, 20Gi memory, 2 gpu per replica | 2 | 20Gi | 2 | 6 | 40Gi | 2 |
| gpu-large | 8 cpu, 40Gi memory, 4 gpu per replica | 8 | 40Gi | 4 | 10 | 60Gi | 4 |
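Each template pairs a request (the guaranteed allocation per replica) with a limit (the ceiling a replica may burst to under load). As a sketch of how one row maps onto the `resources` object used by the templates API elsewhere on this page, the gpu-small template could be expressed as follows; the validation helper is illustrative, not part of MLIS:

```python
# The gpu-small row expressed as a "resources" object (structure mirrors the
# API request body shown in the "Via the API" section of this page).
gpu_small = {
    "requests": {"cpu": "2", "memory": "20Gi", "gpu": "2"},  # guaranteed per replica
    "limits": {"cpu": "6", "memory": "40Gi", "gpu": "2"},    # ceiling for traffic spikes
}

# A request that exceeds its limit is invalid, so a minimal sanity check
# compares each requested quantity against the corresponding limit.
def requests_within_limits(resources):
    def as_number(value):
        # Strip a trailing "Gi" unit so memory values compare numerically.
        return float(value[:-2]) if value.endswith("Gi") else float(value)
    return all(
        as_number(resources["requests"][key]) <= as_number(resources["limits"][key])
        for key in resources["requests"]
    )
```

A check like this catches a template whose limits were accidentally set below its requests before it is submitted.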
## How to Add Resource Templates

### Via the UI
- In the MLIS UI, navigate to Settings > Resource templates.
- Select Add new resource template.
- Enter the name, description, and resource requirements for the template.

  | Field | Description |
  |---|---|
  | Template name | A unique identifier for the resource template |
  | Description | A brief explanation of the template's purpose or characteristics |
  | CPU request | The minimum CPU resources required for the packaged model to operate |
  | CPU limit | The maximum CPU resources allowed for handling traffic spikes |
  | Memory request | The minimum memory resources required for the packaged model to operate |
  | Memory limit | The maximum memory resources allowed for handling traffic spikes |
  | GPU request | The minimum GPU resources required for the packaged model to operate |
  | GPU limit | The maximum GPU resources allowed for handling traffic spikes |
  | GPU type | The specific GPU model required for the packaged model (e.g., NVIDIA A100). Specifying a GPU type requires heterogeneous GPU support to be enabled. |

- Select Create template.
The new template is now available from the Resource Template dropdown on the Resources tab when adding or editing a packaged model. To update this template, select the ellipsis icon next to the template name and choose Edit.
### Via the CLI
- Sign in via the CLI.

  ```bash
  aioli user login admin
  ```
- Add a new resource template with the following command:

  ```bash
  aioli templates resource create <TEMPLATE_NAME> \
    --description <TEMPLATE_DESCRIPTION> \
    --gpu-type <GPU_TYPE> \
    --limits-cpu <LIMITED_CPU> \
    --limits-memory <LIMITED_MEMORY> \
    --requests-cpu <REQUESTED_CPU> \
    --requests-memory <REQUESTED_MEMORY>
  ```
### Via the API
- Sign in to MLIS.

  ```bash
  curl -X 'POST' \
    '<YOUR_EXT_CLUSTER_IP>/api/v1/login' \
    -H 'accept: application/json' \
    -H 'Content-Type: application/json' \
    -d '{
      "username": "<YOUR_USERNAME>",
      "password": "<YOUR_PASSWORD>"
    }'
  ```
- Obtain the Bearer token from the response.
- Use the following cURL command to add a new resource template.

  ```bash
  curl -X 'POST' \
    'https://<YOUR_EXT_CLUSTER_IP>/api/v1/templates/resources' \
    -H 'Accept: application/json' \
    -H 'Authorization: Bearer <YOUR_ACCESS_TOKEN>' \
    -H 'Content-Type: application/json' \
    -d '{
      "description": "A resource template",
      "name": "my-template",
      "resources": {
        "gpuType": "NVIDIA_A100",
        "limits": {
          "cpu": "0.5",
          "gpu": "5",
          "memory": "2.5Gi"
        },
        "requests": {
          "cpu": "0.5",
          "gpu": "5",
          "memory": "2.5Gi"
        }
      }
    }'
  ```
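The same login-then-create flow can be sketched in Python using only the standard library. The host, credentials, and template values below are placeholders; the endpoint paths and payload shape follow the cURL examples above, while the name of the token field in the login response is an assumption and may differ in your deployment:

```python
import json
import urllib.request

BASE_URL = "https://<YOUR_EXT_CLUSTER_IP>"  # placeholder: replace with your cluster address

def build_payload(name, description, requests_res, limits_res, gpu_type=None):
    """Assemble the JSON body for POST /api/v1/templates/resources."""
    resources = {"requests": requests_res, "limits": limits_res}
    if gpu_type is not None:
        resources["gpuType"] = gpu_type
    return {"name": name, "description": description, "resources": resources}

def post_json(url, body, token=None):
    """POST a JSON body, optionally with a Bearer token, and return the parsed response."""
    headers = {"Accept": "application/json", "Content-Type": "application/json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    req = urllib.request.Request(
        url, data=json.dumps(body).encode(), headers=headers, method="POST"
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

payload = build_payload(
    "my-template",
    "A resource template",
    requests_res={"cpu": "0.5", "gpu": "5", "memory": "2.5Gi"},
    limits_res={"cpu": "0.5", "gpu": "5", "memory": "2.5Gi"},
    gpu_type="NVIDIA_A100",
)

# Against a live cluster (not executed here; "token" is an assumed field name):
# token = post_json(f"{BASE_URL}/api/v1/login",
#                   {"username": "<YOUR_USERNAME>", "password": "<YOUR_PASSWORD>"})["token"]
# post_json(f"{BASE_URL}/api/v1/templates/resources", payload, token)
```

Building the payload separately from the request makes it easy to validate or log the template definition before it is submitted.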