2 Comments
Rob Hanna

Great talk, Lance! If we are working with structured content in a content management system where we can set up a publishing pipeline, we can chunk the content intentionally before the RAG parser processes it. We can also add the necessary context from metadata to each chunk, further enhancing the microcontent for better retrieval.
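
As a rough sketch of what that pipeline step might look like, assuming each topic in the CMS already carries a title, a body, and a metadata dictionary: the `Topic` class, the metadata keys, and the bracketed header format below are all hypothetical, not any particular CMS or RAG tool's API.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    """Hypothetical stand-in for one structured topic exported from a CMS."""
    title: str
    body: str
    metadata: dict[str, str] = field(default_factory=dict)


def topic_to_chunk(topic: Topic) -> str:
    """Render one topic as one chunk, with its metadata as a context header."""
    # Sort keys so the same metadata always produces the same header.
    context = "; ".join(
        f"{key}: {value}" for key, value in sorted(topic.metadata.items())
    )
    header = f"[{context}]\n" if context else ""
    return f"{header}{topic.title}\n{topic.body}"


# Placeholder content and assumed metadata keys, for illustration only.
topics = [
    Topic(
        title="Back up your photos to the cloud",
        body="(steps go here)",
        metadata={"product": "iPhone", "content_type": "task"},
    )
]

# One chunk per topic: the chunk boundaries come from the content model,
# not from a parser splitting the page after the fact.
chunks = [topic_to_chunk(topic) for topic in topics]
```

The point of chunking at this stage is that each chunk arrives at the retriever already carrying its own context.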

If we apply your points on descriptive titling to the first phase of a multi-phase retrieval process, we can improve that process further when the title conveys not just a description but also the reader's intent for the chunk. This will require us to design content standards for how we title different types of information, and we can use the pipeline to enrich each chunk's title from the metadata at publish time.

For example, say we have a series of steps describing how to back up pictures from your phone to the cloud:

[TERSE NON-DESCRIPTIVE] Backing up photos

[DESCRIPTIVE] Back up your photos to the cloud

[DESCRIPTIVE ENRICHED] Back up your photos to the cloud {USING THE IPHONE SERIES X, IPHONE SERIES 11, IPHONE SERIES 12, IPHONE SERIES 13, AND IPHONE SERIES 14}
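
The enriched title above could be generated at publish time rather than written by hand. Here is a minimal sketch, assuming the applicable device models are available as a list in the topic's metadata; the `enrich_title` function and its parameters are hypothetical names for illustration.

```python
def enrich_title(title: str, models: list[str]) -> str:
    """Append the applicable device models (assumed metadata) to a title."""
    if not models:
        # Nothing to add, so keep the descriptive title as authored.
        return title
    names = [model.upper() for model in models]
    if len(names) == 1:
        joined = names[0]
    elif len(names) == 2:
        joined = " AND ".join(names)
    else:
        # Serial list with a final "AND", matching the example above.
        joined = ", ".join(names[:-1]) + ", AND " + names[-1]
    return f"{title} {{USING THE {joined}}}"


enriched = enrich_title(
    "Back up your photos to the cloud",
    [
        "iPhone Series X",
        "iPhone Series 11",
        "iPhone Series 12",
        "iPhone Series 13",
        "iPhone Series 14",
    ],
)
print(enriched)
```

Keeping the authored title short and letting the pipeline add the device list means writers only maintain the metadata, and the retrieval-facing title stays in sync with it.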

Lance Cummings

Thanks for those examples. Yeah, I think metadata is rhetorical data. I'm definitely looking forward to exploring this more!
