This is a terrific explanation - thank you. When it comes to "non-parametric memory", is this typically prefetched, so the LLM sees it as a local cache of recent information, or is it a real-time lookup based on the user's query?
Thanks! I'm glad you found this useful. I think the answer depends. In most scenarios I'd expect a real-time lookup, since you can't predict the exact wording of a user's query ahead of time, right? But it seems like this is an active area of research. I just came across this post that describes a "semantic" cache to speed up LLMs: https://portkey.ai/blog/reducing-llm-costs-and-latency-semantic-cache/
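To make the idea concrete, here's a rough sketch of what a semantic cache might look like. This is my own illustration rather than Portkey's actual implementation: it assumes a sentence-transformers embedding model and a simple cosine-similarity threshold for deciding whether a new query is "close enough" to a previously answered one to reuse the cached response.

```python
import numpy as np
from sentence_transformers import SentenceTransformer


class SemanticCache:
    """Cache LLM responses keyed by query *meaning*, not exact text."""

    def __init__(self, threshold: float = 0.9):
        # Small, commonly used embedding model; any encoder would do.
        self.model = SentenceTransformer("all-MiniLM-L6-v2")
        self.threshold = threshold
        self.embeddings: list[np.ndarray] = []  # unit-normalized query vectors
        self.responses: list[str] = []

    def _embed(self, text: str) -> np.ndarray:
        vec = self.model.encode(text)
        # Normalize so a dot product equals cosine similarity.
        return vec / np.linalg.norm(vec)

    def get(self, query: str) -> str | None:
        """Return a cached response if a semantically similar query was seen."""
        if not self.embeddings:
            return None
        q = self._embed(query)
        sims = np.stack(self.embeddings) @ q
        best = int(np.argmax(sims))
        return self.responses[best] if sims[best] >= self.threshold else None

    def put(self, query: str, response: str) -> None:
        self.embeddings.append(self._embed(query))
        self.responses.append(response)


cache = SemanticCache(threshold=0.85)
cache.put("What is non-parametric memory?", "Knowledge fetched at query time from an external store...")
print(cache.get("Explain non-parametric memory"))  # likely a cache hit
print(cache.get("How do I bake bread?"))  # miss -> None, so fall through to a live retrieval + LLM call
```

The point is that even though you can't predict the exact words, queries with similar meaning land near each other in embedding space, so a semantic cache can short-circuit the full retrieval-plus-generation pipeline for repeated questions.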