Streaming: delay in GenerateAsync, or incorrect use? #475
Hi, I use the code below to perform a streamed generation. The model is OpenAI, with the streaming setting enabled on it as well. The code works, but I notice a 3-10 second delay before the first partial response is processed, and the delay is proportional to the size of the LLM output. My project is in Blazor WASM, so I was able to confirm this: in the browser debug tools I can see the response from OpenAI arriving right away, and its size increases progressively, so OpenAI is streaming the response correctly. But it's only once the response is completely done that I can start iterating over the partial responses. Could anyone confirm whether I made a mistake in my code, or whether this is an issue with the GenerateAsync code? The readme shows how to do simple generation queries, but I haven't seen an example with streaming.
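For reference, the call looks roughly like this. This is a sketch only: the provider, model, settings, and event names below are approximations of the LangChain .NET API rather than verbatim code, and `UseStreaming` and `PartialResponseGenerated` in particular are assumed names.

```csharp
using LangChain.Providers;
using LangChain.Providers.OpenAI;

// Sketch of the streaming setup; type and member names are approximate.
var provider = new OpenAiProvider(apiKey: "sk-...");
var model = new OpenAiChatModel(provider, id: "gpt-4o-mini");

// Streaming is enabled on the settings passed to GenerateAsync
// (UseStreaming is an assumed name for the relevant flag).
var settings = new OpenAiChatSettings
{
    UseStreaming = true,
};

// Partial responses are observed as they arrive (event name assumed);
// in the app they are appended to the UI rather than the console.
model.PartialResponseGenerated += (_, delta) => Console.Write(delta);

var response = await model.GenerateAsync("Tell me a short story.", settings);
Console.WriteLine();
Console.WriteLine(response);
```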
Replies: 3 comments · 3 replies
The code looks correct.
Everything seems correct on the latest version in main. Also, please use the latest prerelease version to make sure we are testing the same thing.
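For anyone following along, the prerelease build can be pulled with `dotnet add package LangChain --prerelease` (the `LangChain` meta-package id on NuGet is assumed here).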
I looked into this again, but I kept hitting a wall.
OpenAI streaming requests do work without delay from a console app and from an ASP.NET server, so it's not a bug in LangChain.
I tried making changes to my HttpClient setup to make sure Blazor's SetBrowserResponseStreamingEnabled is called (see the sketch below), but that didn't improve the situation.
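The change looked roughly like this; the request headers and body are omitted, but SetBrowserResponseStreamingEnabled and ResponseHeadersRead are the documented pieces:

```csharp
using Microsoft.AspNetCore.Components.WebAssembly.Http;

// httpClient is the HttpClient injected into the Blazor component.
var request = new HttpRequestMessage(
    HttpMethod.Post, "https://api.openai.com/v1/chat/completions");

// Opt in to fetch response streaming; without this, the Blazor WASM
// HttpClient buffers the whole response body before returning it.
request.SetBrowserResponseStreamingEnabled(true);

// ResponseHeadersRead makes SendAsync return as soon as the headers
// arrive instead of waiting for the full body.
var response = await httpClient.SendAsync(
    request, HttpCompletionOption.ResponseHeadersRead);
await using var stream = await response.Content.ReadAsStreamAsync();
```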
In the end, Grok 3 told me that OpenAI doesn't really support direct browser requests with streaming; if that's true, then I have been wasting my time trying to make it work! I just wanted to be able to start my Blazor WASM app and make direct LLM requests without going through a server, purely for development efficiency.
If that's incorrect or someone's been able to make it work, please let me know, I am very interested.