Welcome to the 1.3.0 release of HPE Machine Learning Inferencing Software (MLIS).
Highlights #
This release includes the following features/changes:
Hugging Face/vLLM Runtime #
You can now deploy vLLM-compatible models downloaded directly from huggingface.co.
- New registry of type `HuggingFace` enables access to vLLM-compatible models on huggingface.co.
- UI browser to select from vLLM-compatible Hugging Face models.
- New model format `vllm`, which allows deployment of vLLM-compatible models. By default, the models are executed with `vllm/vllm-openai:v0.6.2`. If no GPUs are provided, a CPU-only amd64 variant of this runtime is provided by MLIS.
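Models deployed with the `vllm` format are served through vLLM's OpenAI-compatible API. The sketch below builds (but does not send) a chat-completions request body; the endpoint URL, token, and model id are placeholders, not values from this release — substitute the ones shown for your own deployment:

```python
import json

# Hypothetical endpoint and token -- replace with your deployment's values.
ENDPOINT = "https://my-mlis-deployment.example.com/v1/chat/completions"
TOKEN = "YOUR_API_TOKEN"

headers = {
    "Authorization": f"Bearer {TOKEN}",
    "Content-Type": "application/json",
}

# The vLLM OpenAI-compatible server accepts standard chat-completions
# payloads; the model id below is an example Hugging Face id, not one
# guaranteed to exist in your registry.
payload = {
    "model": "meta-llama/Llama-3.1-8B-Instruct",
    "messages": [{"role": "user", "content": "Hello!"}],
    "max_tokens": 64,
}

body = json.dumps(payload)
print(body)  # POST this body to ENDPOINT with the headers above
```

Any OpenAI-compatible client (for example, the official `openai` Python package with a custom `base_url`) can be pointed at the same endpoint.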
HPE AI Essentials Updates #
When deployed as part of HPE AI Essentials, note the following changes in MLIS behavior:
- All deployment endpoints require an API token generated by HPE AI Essentials (Gen AI -> Model Endpoints page).
- The `API tokens` capability of MLIS is disabled.
- When accessing NVIDIA NIMs, only the models provided by HPE AI Essentials are available. No NGC access key is required.
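In practice, every request to a deployment endpoint must carry the HPE AI Essentials token as a bearer header. A minimal sketch of attaching it, assuming a hypothetical endpoint URL and token (take the real values from the Gen AI -> Model Endpoints page); the request is built but not sent:

```python
import urllib.request

# Hypothetical values -- use the endpoint and token shown on the
# Gen AI -> Model Endpoints page of your HPE AI Essentials installation.
ENDPOINT = "https://aie.example.com/model-endpoints/my-model/v1/completions"
TOKEN = "YOUR_AI_ESSENTIALS_TOKEN"

# Build an authenticated request; deployment endpoints reject calls
# that lack this Authorization header.
req = urllib.request.Request(
    ENDPOINT,
    data=b'{"prompt": "Hello"}',
    headers={
        "Authorization": f"Bearer {TOKEN}",
        "Content-Type": "application/json",
    },
    method="POST",
)
print(req.get_header("Authorization"))
```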
Bug Fixes #
- The first model can now be deployed successfully using the `Roll-out` button. Previously, the first roll-out was blocked by the error “Model ’’ does not exist”.