From the course: A Start to Using Generative AI in .NET

Streaming vs. non-streaming

- [Narrator] Let's compare our chat application on the left with the official ChatGPT interface on the right. I will use GPT-4 in both cases and write the same prompt in both: write a poem about apples. Well, that's a big difference. Even though the generation speed was pretty much the same, our app still felt a lot slower. Why? Because our app waited until the entire poem was ready before showing it, while ChatGPT showed the text as it was being written, word by word. This is called streaming. ChatGPT displays the response in chunks, allowing us to start reading the answer much earlier. The current generation speed of GPT-4 is slightly faster than a human can read. This means that if we stream the response, the user can essentially follow along and doesn't perceive the AI as slow. GPT-3.5 is even faster. It writes quickly enough that you can only keep up by skimming the text, using the headers as a guide. The AI community fully expects the generation speed to increase dramatically in the near future.…
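
The difference the narrator describes can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the course's code: it calls no real model or API, `generate_tokens` is a hypothetical stand-in that yields one word per "tick" of generation time, and the poem line is made up. Python is used here to keep the sketch free of library dependencies; in C#, the rough counterpart of the generator would be an `IAsyncEnumerable<string>` consumed with `await foreach`.

```python
from typing import Iterator, List, Tuple


def generate_tokens(text: str) -> Iterator[str]:
    """Hypothetical stand-in for a model: yields the response one word at a time."""
    for word in text.split():
        yield word + " "


def show_non_streaming(tokens: Iterator[str]) -> List[Tuple[int, str]]:
    """Buffer the whole response and display it once, at the final tick."""
    buffer = []
    tick = 0
    for token in tokens:
        tick += 1  # assume one tick of generation time per token
        buffer.append(token)
    return [(tick, "".join(buffer))]


def show_streaming(tokens: Iterator[str]) -> List[Tuple[int, str]]:
    """Display each token at the tick on which it is generated."""
    return [(tick, token) for tick, token in enumerate(tokens, start=1)]


POEM = "Red apples fall in autumn light"  # made-up sample response

buffered = show_non_streaming(generate_tokens(POEM))
streamed = show_streaming(generate_tokens(POEM))

# Both approaches finish at the same tick and show the same text;
# only the moment the first words become visible differs.
print("non-streaming: first text visible at tick", buffered[0][0])
print("streaming:     first text visible at tick", streamed[0][0])
```

Total generation time is identical in both cases. The streaming version simply makes the first chunk visible as soon as it exists, which is why the same model can feel much faster.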
