Currently the OpenAI API implementation (0.3.8) does not return usage information (how many tokens the prompt consumed, how many were generated) when streaming is enabled.

This has been an issue for OpenAI as well for some time. They have, however, fixed it (https://community.openai.com/t/usage-stats-now-available-when-using-streaming-with-the-chat-completions-api-or-completions-api/738156) by adding another option to the streaming interface, resulting in a usage block being returned on the final chunk before the DONE event.
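For reference, here is a minimal sketch of what that option looks like from the client side, assuming the official OpenAI Python client pointed at LM Studio's default local endpoint (the base URL, API key, and model name below are placeholders):

```python
from openai import OpenAI

# Standard OpenAI client pointed at a local LM Studio server
# (base URL, key, and model name are assumptions; adjust to your setup).
client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

stream = client.chat.completions.create(
    model="local-model",
    messages=[{"role": "user", "content": "Hello"}],
    stream=True,
    # OpenAI's fix: request a final chunk that carries the usage block.
    stream_options={"include_usage": True},
)

for chunk in stream:
    if chunk.choices:
        print(chunk.choices[0].delta.content or "", end="")
    # With include_usage, the last chunk has empty choices and usage set.
    if chunk.usage is not None:
        print(f"\nprompt_tokens={chunk.usage.prompt_tokens}, "
              f"completion_tokens={chunk.usage.completion_tokens}")
```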
Without this, I am forced either to disable streaming, which is not optimal for slow models on my machine, or to never reliably know when I am approaching the context limit.
(To get the configured context limit, I cannot use the OpenAI API but have to resort to LM Studio's own REST API at /api/v0/models/. Not optimal, but still a way to get the total context size. Also, FYI: LM Studio's own chat completions endpoint /api/v0/chat/completions (not the OpenAI one) also does not include token stats during streaming.)
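For completeness, a small sketch of that workaround, assuming the LM Studio REST API is reachable on the default port and that each model entry carries a max_context_length field (the field name is my assumption based on LM Studio's REST API docs; adjust if yours differs):

```python
import json
import urllib.request

# Query LM Studio's own REST API (not the OpenAI-compatible one) for the
# configured context size of each model it knows about.
with urllib.request.urlopen("http://localhost:1234/api/v0/models/") as resp:
    models = json.load(resp)["data"]

for model in models:
    # max_context_length is assumed here; verify against your LM Studio version.
    print(model["id"], model.get("max_context_length"))
```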