This might be the first time ChatGPT (+ @jxnlco) helped us come up with a better retrieval algorithm for RAG:
1️⃣ Create a hierarchy/graph of “parent chunks” -> smaller chunks. Also link adjacent chunks together.
2️⃣ During query-time, first retrieve smaller chunks with embedding similarity.
3️⃣ Merge leaves: If the retrieved chunks make up a major portion of a larger (parent) chunk, return the parent chunk instead.
Result 💡: Dynamically retrieve less fragmented, larger contiguous blobs of context *only when you need them*. This helps the LLM synthesize better results without always cramming in as much context as you can.
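A minimal sketch of the "merge leaves" step in plain Python, to make the idea concrete. This is not the @llama_index implementation: the `Chunk` class, the `merge_leaves` function, and the 0.5 `merge_ratio` threshold are all hypothetical names and values chosen for illustration, and the embedding-similarity retrieval and adjacent-chunk links are left out (the retrieved ids are simply passed in).

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Chunk:
    id: str
    text: str
    parent_id: Optional[str] = None  # None means this chunk has no parent


def merge_leaves(retrieved_ids, chunks, merge_ratio=0.5):
    """Swap retrieved chunks for their parent when they cover more than
    `merge_ratio` of that parent's children. Repeats up the hierarchy."""
    # Map each parent id to the ids of its direct children.
    children = defaultdict(list)
    for chunk in chunks.values():
        if chunk.parent_id is not None:
            children[chunk.parent_id].append(chunk.id)

    current = list(dict.fromkeys(retrieved_ids))  # dedupe, keep order
    while True:
        # Group the currently selected chunks by their parent.
        hits = defaultdict(list)
        for cid in current:
            pid = chunks[cid].parent_id
            if pid is not None:
                hits[pid].append(cid)

        # A parent qualifies when enough of its children were selected.
        to_merge = {
            pid
            for pid, got in hits.items()
            if len(got) / len(children[pid]) > merge_ratio
        }
        if not to_merge:
            return current

        # Replace qualifying children with their parent, once per parent.
        merged = []
        for cid in current:
            pid = chunks[cid].parent_id
            new_id = pid if pid in to_merge else cid
            if new_id not in merged:
                merged.append(new_id)
        current = merged


# Toy hierarchy: two parent chunks, four leaf chunks each.
chunks = {
    c.id: c
    for c in [
        Chunk("p", "parent section 1"),
        Chunk("q", "parent section 2"),
        Chunk("a", "leaf a", "p"),
        Chunk("b", "leaf b", "p"),
        Chunk("c", "leaf c", "p"),
        Chunk("d", "leaf d", "p"),
        Chunk("e", "leaf e", "q"),
        Chunk("f", "leaf f", "q"),
        Chunk("g", "leaf g", "q"),
        Chunk("h", "leaf h", "q"),
    ]
}

# Three of p's four leaves were retrieved, so p replaces them;
# only one of q's leaves was retrieved, so it stays a leaf.
print(merge_leaves(["a", "b", "c", "e"], chunks))  # ['p', 'e']
```

In this sketch the threshold is the main knob: a higher `merge_ratio` keeps results small and targeted, a lower one pulls in parent context more eagerly.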
We’ve implemented these ideas in @llama_index. We created a HierarchicalNodeParser to parse unstructured text into a node hierarchy, and an AutoMergingRetriever to “merge in parent chunks” during query-time.
Full guide here: gpt-index.readthedocs.io/en/…
Again, full credit for this idea goes to @jxnlco - not only a Python wizard, but also a ChatGPT Code Interpreter whisperer 🪄