In our experience, fine-tuning inputs to coax the right answers out of language models requires some level of prompt versioning; otherwise, it becomes extremely difficult to keep track of changes past a couple of versions.
So the next question becomes: What kind of versioning system or tool is best?
A LangChain prompt template is a class containing elements you typically need for a Large Language Model (LLM) prompt. At a minimum, these are:
A natural language string that will serve as the prompt: This can be a simple text string or, for prompts with dynamic content, a template string (f-string style, for example) containing placeholders that represent variables.
Formatting instructions (optional) that specify how dynamic content should appear in the prompt, for example, whether it should be italicized or capitalized.
Input parameters (optional) that you pass into the prompt class to provide instructions or context for generating prompts. These parameters influence the content, structure, or formatting of the prompt. Often, though, they are simply the values for the placeholders in the string, which resolve to produce the final prompt that goes to the LLM through an API call.
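The three elements above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not LangChain’s actual class: `SimplePromptTemplate`, its fields, and the `formatters` mapping are hypothetical names, used only to show how a template string, optional formatting instructions, and input parameters might combine into the final prompt string.

```python
from dataclasses import dataclass, field
from string import Formatter
from typing import Callable, Dict, List


@dataclass
class SimplePromptTemplate:
    """Hypothetical stand-in for a prompt template class (not LangChain's API)."""

    # The natural language string, with {placeholders} for dynamic content.
    template: str
    # Optional formatting instructions: placeholder name -> function applied to its value.
    formatters: Dict[str, Callable[[str], str]] = field(default_factory=dict)

    @property
    def input_variables(self) -> List[str]:
        # Collect the placeholder names that appear in the template string.
        return sorted({name for _, name, _, _ in Formatter().parse(self.template) if name})

    def format(self, **params: str) -> str:
        missing = [name for name in self.input_variables if name not in params]
        if missing:
            raise KeyError(f"Missing input parameters: {missing}")
        # Apply any formatting instruction before substituting the value.
        values = {key: self.formatters.get(key, str)(value) for key, value in params.items()}
        return self.template.format(**values)


prompt = SimplePromptTemplate(
    template="Recommend a {genre} book for {audience}.",
    formatters={"genre": str.upper},  # formatting instruction: capitalize the genre
)
final_prompt = prompt.format(genre="fantasy", audience="teenagers")
print(final_prompt)
```

Here the input parameters `genre` and `audience` are just values for the placeholders, and the resolved `final_prompt` is the string that would be sent to the model.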
Anyone can develop LLM applications using just the OpenAI SDK. We used to do that ourselves, since we didn’t find the helper functions available at the time particularly helpful. But prompt engineering tools that simplify LLM interactions and enhance productivity are emerging as key players.
We’ve tried several libraries and built our own, and in our experience a good prompt engineering tool for developing robust, production-grade LLM applications should offer six capabilities:
We’ve seen many discussions around LLM software development allude to a workflow where prompts live apart from LLM calls and are managed by multiple stakeholders, including non-engineers. In fact, many popular LLM development frameworks and libraries are built in a way that requires prompts to be managed separately from their calls.
We think this is an unnecessarily cumbersome approach that’s not scalable for complex, production-grade LLM software development.
Here’s why: for anyone developing production-grade LLM apps, prompts that include code will necessarily be a part of your engineering workflow. Therefore, separating prompts from the rest of your codebase, especially from their API calls, means you’re splitting that workflow into different, independent parts.
Separating concerns and assigning different roles to manage each may seem to bring certain efficiencies, for example, easing collaboration between tech and non-tech roles. But it introduces fundamental complexity that can disrupt the engineering process. For instance, making a change in one place, like adding a new key-value pair to an input for an LLM call, means manually hunting down every other place that change affects. Even then, you will likely not catch all the errors.
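To make the point concrete, here is a minimal sketch in plain Python of a prompt that lives next to its call. The names `recommend_book` and `fake_llm` are hypothetical, and `fake_llm` stands in for a real API call. Because the prompt’s inputs are the function’s parameters, adding a new input changes the signature, so a call site that was not updated fails immediately instead of silently sending an incomplete prompt.

```python
from typing import Callable


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM API call; it just echoes the prompt."""
    return f"[model response to: {prompt}]"


def recommend_book(genre: str, audience: str, llm: Callable[[str], str] = fake_llm) -> str:
    # The prompt is defined right next to the call that uses it, so its
    # inputs are exactly the parameters of this function.
    prompt = f"Recommend a {genre} book for {audience}."
    return llm(prompt)


print(recommend_book(genre="fantasy", audience="teenagers"))

# A caller that was not updated after `audience` was added fails loudly.
try:
    recommend_book(genre="fantasy")
except TypeError as error:
    print(f"Caught at the call site: {error}")
```

A static type checker would flag the second call before the code ever runs, which is the kind of safety net that is hard to get when prompts are stored and edited apart from their calls.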
LangChain is a popular LLM orchestration framework because:
It’s a good way to learn concepts and get hands-on experience with natural language processing tasks and building LLM applications.
Its system of chaining modules together in different ways lets you build complex use cases. LangChain modules offer different functionalities, such as interfacing with LLMs or retrieving data to supply to them.
Its framework is broad and expanding: it offers hundreds of integrations, as well as LangChain Expression Language (LCEL) and other tools for managing aspects like debugging, streaming, and output parsing.
It has a large and active following on Twitter and Discord, and especially on GitHub. In fact, according to their blog, over 2,000 developers contributed to their repo in their first year of existence.
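The chaining idea mentioned above can be illustrated without LangChain itself. The sketch below is plain Python, not LangChain’s API or LCEL: `chain`, `build_prompt`, `fake_llm`, and `parse_output` are hypothetical names, and `fake_llm` returns a canned string rather than calling a model. It shows only the general pattern of composing a prompt step, a model step, and an output parsing step into one callable.

```python
from functools import reduce
from typing import Any, Callable


def chain(*steps: Callable[[Any], Any]) -> Callable[[Any], Any]:
    """Compose steps left to right: the output of one is the input of the next."""
    return lambda value: reduce(lambda acc, step: step(acc), steps, value)


def build_prompt(inputs: dict) -> str:
    # Prompt step: resolve the placeholders into the final prompt string.
    return "List three {genre} books, separated by commas.".format(**inputs)


def fake_llm(prompt: str) -> str:
    # Hypothetical model step that returns a canned response.
    return "Dune, Neuromancer, Foundation"


def parse_output(text: str) -> list:
    # Output parsing step: turn the raw model text into a Python list.
    return [item.strip() for item in text.split(",")]


recommend = chain(build_prompt, fake_llm, parse_output)
print(recommend({"genre": "science fiction"}))
```

Swapping any one step, for example a different parser, leaves the rest of the pipeline untouched, which is the appeal of building applications from chained modules.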