Environment Variables
CLI #
The following environment variables can be set to configure the CLI:
Variable | Description |
---|---|
`AIOLI_USER` | The username used to authenticate the user. |
`AIOLI_PASS` | The password used to authenticate the user. |
`AIOLI_USER_TOKEN` | The token used to authenticate the user. |
`AIOLI_CONTROLLER` | The protocol, IP, and port of the controller (e.g., `http://mycontrollerhostname:80`). |
`AIOLI_CONTROLLER_CERT_FILE` | The path to the certificate file used by the controller, which is required for setting up external authentication. See our Obtaining a CA Signed Certificate (GKE) and GitHub Authentication guides for example usage. |
`AIOLI_CONTROLLER_CERT_NAME` | The name of the controller’s certificate file. |
`AIOLI_DEBUG_CONFIG_PATH` | The path to the debug configuration file. |
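For example, you might point the CLI at your controller and authenticate with a token by exporting the variables in your shell before running CLI commands (the controller address and token below are placeholders, not real values):

```shell
# Placeholder values -- substitute your own controller address and token.
export AIOLI_CONTROLLER="http://mycontrollerhostname:80"
export AIOLI_USER_TOKEN="example-token"
```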
Deployment & Packaged Model #
The following environment variables can be set while adding a packaged model or creating a deployment to modify the default settings:
Variable | Description |
---|---|
`AIOLI_LOGGER_PORT` | The port that the logger service listens on; default is `49160`. |
`AIOLI_PROGRESS_DEADLINE` | The deadline for downloading the model; default is `1500s`. |
`AIOLI_READINESS_FAILURE_THRESHOLD` | The number of readiness probe failures before the deployment is considered unhealthy; default is `100`. |
`AIOLI_COMMAND_OVERRIDE` | A custom command that overrides the default deployment command for a predefined runtime (e.g., NIM containers). Useful if you want to switch to a `nim_llm` runtime for running vLLM models. |
Environment variables set on a deployment will override the values set on its packaged model.
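This precedence can be sketched as follows (the deadline values are hypothetical, used only to illustrate that the deployment's setting wins when both define the same variable):

```shell
# Hypothetical values: the packaged model sets a deadline, the deployment overrides it.
model_deadline="1500s"        # AIOLI_PROGRESS_DEADLINE set on the packaged model
deployment_deadline="3000s"   # AIOLI_PROGRESS_DEADLINE set on the deployment
# Effective value: prefer the deployment's setting when it is present.
effective="${deployment_deadline:-$model_deadline}"
echo "$effective"   # 3000s
```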
Command Override Argument Options #
MLIS executes a default command for your container runtime based on the type of packaged model you have selected. However, you can modify this command using the `AIOLI_COMMAND_OVERRIDE` environment variable. Any arguments from the packaged model are then appended to the end of this command, followed by any arguments from the deployment (`AIOLI_COMMAND_OVERRIDE = [CLI_COMMAND] [MODEL_ARGS] [DEPLOYMENT_ARGS]`).
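The composition is plain string concatenation, which can be sketched as follows (the command and argument values below are illustrative, not defaults):

```shell
# Illustrative values only.
cli_command="openllm start --port 8080 /mnt/models"   # from AIOLI_COMMAND_OVERRIDE
model_args="--gpu-memory-utilization 0.9"             # arguments from the packaged model
deployment_args="--max-total-tokens 4096"             # arguments from the deployment
# The model's arguments come first, then the deployment's.
full_command="$cli_command $model_args $deployment_args"
echo "$full_command"
```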
The following table shows the default command for each packaged model’s framework type:
FRAMEWORK | COMMAND | DESCRIPTION |
---|---|---|
OpenLLM | `openllm start --port {{.containerPort}} {{.modelDir}}` | You can add any options from OpenLLM version 0.4.44 to your command (see `openllm start -h`). |
Bento Archive | `bentoml serve ...` | You can add any options from BentoML version 1.1.11 to your command (see `bentoml serve -h`). |
Custom | none | For custom models, the default entrypoint for the container is executed. |
NVIDIA NIM | none | For NIM models, the default entrypoint for the container is executed. You must use environment variables; NIM containers do not honor CLI arguments. |
You can also use the following variables to modify the command’s arguments:
Named Argument | Description |
---|---|
`{{.numGpus}}` | The number of GPUs the model is requesting. |
`{{.modelName}}` | The MLIS model name being deployed. |
`{{.modelDir}}` | The directory into which the model will be downloaded; typically `/mnt/models`. This applies to NIM, OpenLLM, and S3 models. |
`{{.containerPort}}` | The HTTP port that the container must listen on for inference requests and readiness checks. |
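Before the command runs, each named argument is replaced with its value for the deployment. The substitution can be sketched with `sed` (the model name, directory, and port below are example values, not what MLIS would necessarily supply):

```shell
# Example command template using the named arguments.
template='nim_llm --model_name {{.modelName}} --model_path {{.modelDir}} --port {{.containerPort}}'
# Substitute example values (illustrative only) the way MLIS does at deploy time.
resolved=$(printf '%s' "$template" \
  | sed -e 's#{{.modelName}}#llama-2-7b#' \
        -e 's#{{.modelDir}}#/mnt/models#' \
        -e 's#{{.containerPort}}#8080#')
echo "$resolved"
```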
Examples #
AIOLI_COMMAND_OVERRIDE="nim_llm --model_name {{.modelName}} --model_path {{.modelDir}} --port {{.containerPort}} --health_port {{.healthPort}} --num_gpus {{.numGpus}}"
AIOLI_COMMAND_OVERRIDE="openllm start {{.modelName}} --port {{.containerPort}} --gpu-memory-utilization 0.9 --max-total-tokens 4096"
AIOLI_COMMAND_OVERRIDE="bentoml serve {{.modelDir}}/bentofile.yaml --production --port {{.containerPort}} --host 0.0.0.0"