No Streamed Responses
Scenario
Responses that are configured to be streamed to the caller are not streamed. Instead, they are sent after the entire response is generated.
Triage
A known KServe defect prevents streamed responses when logging is enabled. The agent sidecar added for logging disrupts streaming, so the caller receives the response only after it has been fully generated.
A bug ticket has been filed with the KServe team to address this issue.
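To check whether a deployment shows this symptom, one option is to time the arrival of response chunks on the client side. The sketch below is a heuristic only, not part of KServe or any official tooling. The function names `chunk_arrival_times` and `looks_buffered` and the 0.5-second threshold are hypothetical choices made for illustration. It assumes your HTTP client can expose the response body as an iterator of chunks.

```python
import time
from typing import Callable, Iterable, List


def chunk_arrival_times(
    chunks: Iterable[bytes],
    clock: Callable[[], float] = time.monotonic,
) -> List[float]:
    """Return the seconds elapsed from the start of iteration to each chunk's arrival."""
    start = clock()
    # The clock is read right after each chunk is pulled from the iterator.
    return [clock() - start for _ in chunks]


def looks_buffered(arrivals: List[float], min_spread: float = 0.5) -> bool:
    """Heuristic: True when every chunk arrived within `min_spread` seconds of the first.

    `min_spread` is an illustrative threshold; tune it to your model's generation speed.
    """
    if len(arrivals) < 2:
        # A single chunk cannot demonstrate streaming.
        return True
    return (arrivals[-1] - arrivals[0]) < min_spread
```

Pass the function whatever chunk iterator your client provides for the streaming endpoint. If a long generation yields many chunks that all arrive together at the end, the response was most likely buffered before delivery.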
Resolution
Upgrade to KServe 0.14 or greater to resolve this issue.
Historical Workaround
If you cannot upgrade KServe, use the following workaround. It disables KServe request/response logging so that streaming works as expected:
Define AIOLI_DISABLE_LOGGER=1 in your packaged model or deployment environment variables.
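How the variable is attached depends on the tooling you use to define the packaged model or deployment, which this page does not specify. As a rough illustration, and assuming the setting ultimately reaches the container as a Kubernetes-style `env` entry, the sketch below builds that entry. The helper `to_env_entries` is hypothetical. The point it illustrates is that the value must be the string "1", not the integer 1.

```python
import json
from typing import Dict, List


def to_env_entries(variables: Dict[str, object]) -> List[Dict[str, str]]:
    """Convert a mapping into Kubernetes-style env entries with string values."""
    return [{"name": name, "value": str(value)} for name, value in variables.items()]


# Hypothetical usage: build the entry for the workaround variable.
entries = to_env_entries({"AIOLI_DISABLE_LOGGER": 1})
print(json.dumps(entries, indent=2))
```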