From Registry (UI)

Basic Details

  1. Sign in to HPE Machine Learning Inferencing Software.
  2. Navigate to Packaged Models.
  3. Select Add new model.
  4. Input details for the following:
    • Name: The name of the model within HPE Machine Learning Inferencing Software.
    • Description: A brief description of the model.
  5. Select Next.

Storage Details

Tip: NVIDIA NIMs

MLIS shows the NIMs currently available to the organization specified in the registry. Review the support matrix for a specific NVIDIA NIM for LLMs to correctly configure the resources the model needs.

When the organization is not specified in your registry configuration, MLIS displays all available NIMs, including those not designed for LLMs. Non-LLM NIMs (such as ProteinMPNN or TTS FastPitch) are incompatible with the LLM configuration interface and may fail to launch using MLIS’s default settings. To run these specialized NIMs, you may need to use the custom model format and provide specific environment variables or arguments as detailed in each NIM’s documentation.
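For illustration, the sketch below shows the shape of configuration a non-LLM NIM may need when registered with the Custom model format. The image path, variable names, and values are placeholders rather than MLIS defaults; each NIM's own documentation lists the exact environment variables and arguments it requires.

```python
# Illustrative only: a non-LLM NIM registered as a "Custom" model.
# Every value below is a placeholder; consult the specific NIM's
# documentation for the environment variables and arguments it expects.
custom_nim_model = {
    "model_format": "custom",
    "image": "nvcr.io/nim/<org>/<nim-name>:<tag>",  # choose the image from the NGC catalog
    "environment": {
        # NIM containers generally authenticate to NGC with an API key.
        "NGC_API_KEY": "<your-ngc-api-key>",
    },
    "arguments": [
        # Any container arguments the NIM's documentation calls for.
    ],
}
```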

  1. Input details for the following:
    • Registry: The registry where the model is stored.
    • Model Format: The format used to package the model. Options are OpenLLM, Bento archive, NIM, and Custom.
    • Image: The container image that serves the model; provide the image name plus a release tag. For NIM, see the NGC catalog for the available images.
    • URL/Path: The location of the model object in the registry. Example values for several prefixes appear after these steps.

      | Prefix | URI Syntax | Description |
      | --- | --- | --- |
      | openllm:// | openllm://<model-ref> | An OpenLLM model name from huggingface.co, dynamically loaded and executed with a vLLM backend. |
      | s3:// | s3://<bucket-name>/<path-to-model> | A model directory dynamically downloaded from an associated S3 registry bucket. Supported for the bento-archive, openllm, and custom model formats. |
      | pvc:// | See the From PVC setup guides | A PVC model path that can be used for pre-downloaded NIM and Custom models. |
      | pfs:// | pfs://<project>/<repo>@<commit>[:<path>][?containerPath=<path>] | A PFS model path that can be used for models stored in HPE Machine Learning Data Management repositories. |
      | ngc:// | Not supported | |

      Custom models can use any of the supported registry URI syntaxes, depending on where the model is stored.
  2. Optionally, enable the local caching toggle to cache the model on first use. This speeds up startup times for subsequent deployments. You must have the Admin user role to enable local caching.
  3. Select Next.
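As a reference for the URL/Path field, here are illustrative values for three of the prefixes in the table above. The bucket, project, repo, and model names are hypothetical; substitute the locations from your own registry.

```python
# Hypothetical URL/Path values matching the prefix table above.
example_model_paths = {
    "openllm": "openllm://facebook/opt-125m",             # a model ref resolved from huggingface.co
    "s3":      "s3://my-models-bucket/llama-3-8b/",       # a directory in an associated S3 bucket
    "pfs":     "pfs://my-project/my-repo@master:/model",  # an HPE MLDM <project>/<repo>@<commit> with a path
}
```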

Resource Templates

  1. Choose a Resource Template or define custom resources; a sketch of custom values follows these steps.

    | Name | Description | Request CPU | Request Memory | Request GPU | Limit CPU | Limit Memory | Limit GPU |
    | --- | --- | --- | --- | --- | --- | --- | --- |
    | cpu-tiny | 1 cpu, 10Gi memory, no gpu per replica | 1 | 10Gi | | 1 | 10Gi | |
    | cpu-small | 4 cpu, 20Gi memory, no gpu per replica | 4 | 20Gi | | 6 | 40Gi | |
    | cpu-large | 8 cpu, 40Gi memory, no gpu per replica | 8 | 40Gi | | 10 | 60Gi | |
    | gpu-tiny | 1 cpu, 10Gi, 1 gpu per replica | 1 | 10Gi | 1 | 1 | 10Gi | 1 |
    | gpu-small | 2 cpu, 20Gi, 2 gpu per replica | 2 | 20Gi | 2 | 6 | 40Gi | 2 |
    | gpu-large | 8 cpu, 40Gi, 4 gpu per replica | 8 | 40Gi | 4 | 10 | 60Gi | 4 |

    Note: GPU Type
  2. Select Next.
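If none of the built-in templates fit, the sketch below shows how custom values mirror the table's columns. It illustrates the request/limit pairing only and is not a literal MLIS API payload; the numbers are hypothetical.

```python
# Hypothetical custom resources for a mid-size GPU deployment.
# Requests are what the scheduler reserves per replica; limits cap what
# the replica may consume. As in the built-in templates, each limit is
# set at or above its corresponding request.
custom_resources = {
    "requests": {"cpu": "4", "memory": "30Gi", "gpu": "1"},
    "limits":   {"cpu": "8", "memory": "48Gi", "gpu": "1"},
}
```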

Environment Variables & Arguments

Environment variables and arguments are advanced configuration options that you can set for your packaged model. These inputs will vary based on your model’s requirements. For more information, see the Advanced Configuration reference article.
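As a hedged illustration, an OpenLLM-format model served on the vLLM backend might take inputs like the following. HF_TOKEN is the standard Hugging Face variable for gated models, and --max-model-len is a common vLLM argument; whether your model needs either depends entirely on the model itself, so treat the Advanced Configuration reference as authoritative.

```python
# Illustrative inputs for an OpenLLM-format model on the vLLM backend.
# Only add what your model actually requires.
environment_variables = {
    "HF_TOKEN": "<your-huggingface-token>",  # needed only for gated huggingface.co models
}
arguments = [
    "--max-model-len=4096",  # a common vLLM flag capping the context length
]
```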

  1. Provide any needed Environment Variables.
  2. Provide any needed Arguments.
  3. Select Create model.