Building an LLM app has evolved significantly since 2022.
If you're working on an AI project, you should familiarize yourself with the anatomy of a modern LLM application. Here's a quick overview:
• LLM - the core reasoning engine; usually accessed through an API from @OpenAI, @AnthropicAI, @GoogleAI, or open source alternatives like @MistralAI.
• Prompt template - the boilerplate instructions to your model, shared between requests. This is generally versioned and managed like code, using formats like the .prompt file™️.
• Data sources - provide relevant context to the model, a pattern often referred to as retrieval augmented generation (RAG). Examples include traditional relational databases, graph databases, and vector databases like @pinecone or @trychroma.
• Memory - like a data source, but one that builds up a history of previous interactions with the model for reuse.
• Tools - provide access to actions like API calls and code execution, letting the model interact with external systems where appropriate.
• Agent control flow - some form of looping logic that lets the model make multiple generations to solve a task before hitting a stopping criterion.
• Guardrails - checks run on the model's output before it is returned to the user. These can be simple logic, for example looking for certain keywords, or another model, and they often trigger a fallback to human-in-the-loop workflows.
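To make the anatomy concrete, here is a minimal Python sketch of how the pieces above might fit together. It is illustrative only: `stub_model` is a hypothetical, scripted stand-in for a real hosted model API, the `TOOL`/`FINAL` reply convention is an assumption made up for this example, and the word-overlap `retrieve` function is a placeholder for a real database or vector search. All names (`run_agent`, `DOCS`, `TOOLS`, `BLOCKED`) are invented for illustration.

```python
from typing import Callable

# Prompt template: boilerplate shared between requests.
PROMPT_TEMPLATE = (
    "You are a helpful assistant.\n"
    "Context:\n{context}\n"
    "History:\n{history}\n"
    "Tool results:\n{scratchpad}\n"
    "Question: {question}\n"
)

# Data source: a tiny in-memory list standing in for a real database.
DOCS = [
    "Orders ship within 2 business days.",
    "Refunds are processed within 14 days of a return.",
]

def retrieve(question: str, docs: list[str], k: int = 1) -> list[str]:
    """Rank docs by naive word overlap with the question (not a vector search)."""
    q_words = set(question.lower().split())
    ranked = sorted(docs, key=lambda d: len(q_words & set(d.lower().split())), reverse=True)
    return ranked[:k]

# Tools: named actions the model may request (hypothetical example tool).
TOOLS: dict[str, Callable[[str], str]] = {
    "add": lambda arg: str(sum(int(x) for x in arg.split("+"))),
}

def stub_model(prompt: str) -> str:
    """Hypothetical stand-in for an LLM API call; returns scripted replies."""
    question = prompt.rsplit("Question: ", 1)[1].strip()
    if "2+3" in question and "add(2+3) = 5" not in prompt:
        return "TOOL add 2+3"  # ask the control loop to run a tool
    if "2+3" in question:
        return "FINAL The answer is 5."
    if "refund" in question.lower():
        return "FINAL Refunds are processed within 14 days of a return."
    if "password" in question.lower():
        return "FINAL The password is hunter2."
    return "FINAL I don't know."

# Guardrail: a simple keyword check on the output.
BLOCKED = ("password", "ssn")

def guardrail(text: str) -> bool:
    """Return True if the text contains none of the blocked keywords."""
    return not any(word in text.lower() for word in BLOCKED)

def run_agent(question: str, memory: list[tuple[str, str]],
              model: Callable[[str], str] = stub_model, max_steps: int = 3) -> str:
    context = "\n".join(retrieve(question, DOCS))
    history = "\n".join(f"user: {q}\nassistant: {a}" for q, a in memory)
    scratchpad: list[str] = []
    answer = "No answer within the step limit."
    # Agent control flow: loop until a final answer or the step limit.
    for _ in range(max_steps):
        prompt = PROMPT_TEMPLATE.format(
            context=context, history=history,
            scratchpad="\n".join(scratchpad), question=question,
        )
        reply = model(prompt)
        if reply.startswith("TOOL "):
            _, name, arg = reply.split(" ", 2)
            scratchpad.append(f"{name}({arg}) = {TOOLS[name](arg)}")
            continue
        answer = reply.removeprefix("FINAL ")
        break
    # Guardrail failure falls back to a human-in-the-loop placeholder.
    if not guardrail(answer):
        answer = "Escalated to a human reviewer."
    memory.append((question, answer))  # Memory: record the interaction.
    return answer

if __name__ == "__main__":
    memory: list[tuple[str, str]] = []
    print(run_agent("What is 2+3?", memory))
```

In a real app, each stub would be swapped for the corresponding production piece (a model client, a retriever, a tool registry, a moderation check), but the overall shape of the loop tends to stay similar.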
Together, these components represent a large design space to navigate. Each one requires careful configuration; it's no longer just prompt engineering.