Mirascope V1 Release: Lessons Learned from Building an LLM Package

Published on
Aug 19, 2024

After months of development, countless iterations, and invaluable community feedback, we're excited to announce the release of Mirascope V1. This journey has been filled with challenges, insights, and breakthroughs. In this post, I'll share my experience building an LLM package, the design issues we encountered in V0, and how we've addressed them in V1.

The Journey: Original Idea

We didn't actually intend to build a developer tool. We were working on a separate product and wanted to integrate LLMs into that product's workflow. Unfortunately, everything was so new at the time that none of the existing developer tools fit our needs, so we started building our own internal tooling (which, we've since learned, is the path many others have taken as well).

So we were building Mirascope out of necessity, and we felt that it would be best built out in the open as an open-source project considering how many others seemed to be feeling the same way.

Our goal with Mirascope was (and still is) to create a flexible, powerful tool for working with LLM APIs. We wanted to build a single-layer abstraction on top of these APIs to improve their ergonomics without removing transparency or control. We wanted to build something provider-agnostic given how frequently a new "king" would pop up and be crowned the best LLM. We wanted to build something simple and elegant that felt like the natural progression from building with the base provider SDKs.

But most of all we wanted to build something Python-specific – something that properly took advantage of the qualities of Python that make it so loved.

We spent nearly 8 months building, testing, and refining our package in an open beta. Along the way, we learned a lot about LLM interactions, Python best practices, and the needs of the developer community.

Design Challenges in V0

The initial version of Mirascope was developed as an internal tool, which meant that the problems we were solving were naturally specific to our use-case.

This led to several design challenges, which became particularly evident in more complex applications:

  1. Stateful Class-Based Approach: Initially, we used a class-based system for LLM calls. While this seemed intuitive, it introduced unnecessary state and complexity. We realized that we had added state to the wrong parts of our abstractions.
  2. Performance Overhead: Creating class instances for each LLM call resulted in performance overhead, especially for applications making frequent calls.
  3. Limited Flexibility: The class-based approach made it difficult to implement dynamic configurations and integrate seamlessly with other Python libraries.
  4. Verbose Code: Users had to write more boilerplate code to set up and execute LLM calls.
  5. Integration Challenges: Integrating with other Python libraries, especially those using decorators, was not as smooth as we wanted.

Community Feedback: Shaping Mirascope

Our community has been incredible, providing consistent feedback that has influenced our design and ultimately helped shape V1. For example:

  • Separation of State and Arguments: Many users wanted to be able to more clearly distinguish between the state managed across multiple calls and the arguments passed in for each individual call. For example, chat history vs. the user's current query.
  • Dynamic Configuration: We received multiple requests for dynamically configuring calls at runtime based on the arguments of the call. One user in particular suggested we use the original decorated function's return value for configuration, which was pivotal in how we designed and implemented dynamic configuration.
  • Provider Flexibility: Consistent requests for easier switching between LLM providers reinforced our commitment to provider-agnostic design. This doesn't mean you shouldn't engineer prompts for a specific provider, but the ability to switch should always be present and easy.

It's worth going through our solutions to each of these problems to properly highlight why we made the changes we did in our V1 release. Of course, there were many other points of feedback we addressed, as well as additional features we've included in the release.

Take a look at our migration guide and Learn documentation for a deeper dive that covers everything in detail.

Separation of State and Arguments

In V0, state and arguments were mixed together as fields of the class, making it unclear what should persist across calls and what should be specific to each individual call.

from mirascope.openai import OpenAICall


class Librarian(OpenAICall):
    prompt_template = """
    SYSTEM: You are a librarian. You specialize in the {genre} genre
    USER: {query}
    """

    genre: str
    query: str


librarian = Librarian(
    genre="fantasy",
    query="Recommend a book",
)
response = librarian.call()
print(response.content)

librarian.query = "Recommend a book for beginners"
response = librarian.call()
print(response.content)

It's not clear to the end user of the `Librarian` class that the `query` field is actually an argument of the call that should be updated for each call. The alternative would be to create an entirely new instance of `Librarian` for every call, but this would make both `genre` and `query` feel like arguments.

There was no clear way to distinguish between `genre` being state and `query` being an argument.

We tried our best to accept arguments as additional keyword arguments to `call` and the other call methods, but this was unfortunately not possible without losing proper type hints due to limitations of the class-based approach in Python.
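
To make that limitation concrete, here's a purely illustrative sketch (hypothetical code, not the actual V0 implementation) of why a shared `call` method loses type hints for per-prompt arguments:

from typing import Any


class BaseCall:
    """Hypothetical stand-in for a V0-style base class (not real Mirascope code)."""

    prompt_template: str

    def call(self, **kwargs: Any) -> str:
        # `call` is defined once for every subclass, so a subclass-specific
        # argument like `query` can only flow through untyped **kwargs;
        # a type checker can't verify that it's required or that it's a str.
        raise NotImplementedError


class Librarian(BaseCall):
    prompt_template = "Recommend a book for this request: {query}"


# `query` must be passed as Librarian().call(query="..."), and nothing in the
# type system ties that keyword argument to the template above.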

With V1, the separation of state and arguments was not only easy to implement but also felt extremely natural to write as an end-user of the interface:

from mirascope.core import openai, prompt_template
from pydantic import BaseModel


class Librarian(BaseModel):
    genre: str

    @openai.call("gpt-4o-mini")
    @prompt_template(
        """
        SYSTEM: You are a librarian. You specialize in the {self.genre} genre
        USER: {query}
        """
    )
    def call(self, query: str): ...


fantasy_librarian = Librarian(genre="fantasy")
response = fantasy_librarian.call("Recommend a book")
print(response.content)

response = fantasy_librarian.call("Recommend a book for beginners")
print(response.content)

It's now evident that `genre` is state of the `Librarian` class, which the `call` method uses on every call, while `query` is clearly an argument that must be provided for each individual call.

Dynamic Configuration

Now that we had separated state and arguments, we could fully enable dynamic configuration based on the arguments of each call. The biggest change here is the ability to dynamically generate tools, which was not previously feasible (even after a long back-and-forth with a talented engineer deeply experienced with Python typing, who has since joined the team!).

from mirascope.core import BaseToolKit, openai, prompt_template, toolkit_tool


class BookToolkit(BaseToolKit):
    genre: str

    @toolkit_tool
    def format_book(self, title: str, author: str) -> str:
        """Format a {self.genre} book recommendation."""
        return f"{title} by {author} ({self.genre})"


@openai.call("gpt-4o-mini")
@prompt_template("Recommend a {genre} book")
def recommend_book(genre: str) -> openai.OpenAIDynamicConfig:
    toolkit = BookToolkit(genre=genre)
    return {"tools": toolkit.create_tools()}


response = recommend_book("mystery")
if response.tool:
    print(response.tool.call())
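
Tools aren't the only thing that can be configured at runtime; the same returned dynamic config can carry other settings computed from the call's arguments. As a rough sketch of the same idea (here assuming the dynamic config also accepts a `call_params` entry), you could tune call parameters based on the call's arguments:

from mirascope.core import openai, prompt_template


@openai.call("gpt-4o-mini")
@prompt_template("Recommend a {genre} book")
def recommend_book(genre: str) -> openai.OpenAIDynamicConfig:
    # assumption: the dynamic config dict also accepts a `call_params` entry
    return {"call_params": {"temperature": 0.4 if genre == "nonfiction" else 0.9}}


response = recommend_book("fantasy")
print(response.content)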

Provider Flexibility

V0 made it challenging to switch between different LLM providers without significant code changes.

from mirascope.openai import OpenAICall


class BookRecommender(OpenAICall):
    prompt_template = "Recommend a {genre} book."

    genre: str


recommender = BookRecommender(genre="fantasy")
response = recommender.call()
print(response.content)

In this example, switching providers would require rewriting the entire class (e.g., changing `OpenAICall` to `AnthropicCall`), with no alternative available.

The decorator approach in V1's design makes switching between providers simple. In fact, it's possible to easily run multiple providers on the same prompt with the same configuration:

from mirascope.core import anthropic, openai, prompt_template


@prompt_template("Recommend a {genre} book")
def recommend_book(genre: str): ...


openai_response = openai.call("gpt-4o-mini")(recommend_book)("fantasy")
print(openai_response.content)

anthropic_response = anthropic.call("claude-3-5-sonnet-20240620")(recommend_book)(
    "fantasy"
)
print(anthropic_response.content)

You can also use the `BasePrompt` class for an interface more similar to V0 and run it against any supported provider's decorator:

from mirascope.core import BasePrompt, anthropic, openai, prompt_template


@prompt_template("Recommend a {genre} book")
class BookRecommendationPrompt(BasePrompt):
    genre: str


prompt = BookRecommendationPrompt(genre="fantasy")

openai_response = prompt.run(openai.call("gpt-4o-mini"))
print(openai_response.content)

anthropic_response = prompt.run(anthropic.call("claude-3-5-sonnet-20240620"))
print(anthropic_response.content)

Key Learnings

Throughout this process, we've gained valuable insights:

1. Embrace Statelessness: LLM API calls are inherently stateless. Designing our package to reflect this leads to cleaner, more efficient code. When we do want to include state (such as for agents), the how and why are extremely clear (see the sketch after this list).

2. Prioritize Developer Experience: Simplifying the API and reducing boilerplate significantly improves the developer experience. Building with Mirascope has been the most fun I've had building anything in a long time.

3. Performance Matters: Even small overheads can add up in large-scale applications. Optimizing for performance from the ground up is crucial. This is something that we will strive to continuously optimize.

4. Flexibility is Key: The AI landscape is rapidly evolving. Building flexibility into the core design allows for easier adaptation to new developments. What happens when a provider releases a new feature? There's no reason for users to wait on us to gain access to these features or otherwise be forced to rip everything out and return to the base SDK.

5. Community Feedback is Invaluable: Many of our improvements came from user feedback. Actively engaging with the community leads to a better product.
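
For example, here's a minimal sketch of the first point: a stateful `Librarian` agent that keeps the chat history as state while each underlying call stays stateless (this sketch leans on V1's `MESSAGES` template keyword and the response's message params; see our Learn documentation for a full walkthrough):

from mirascope.core import openai, prompt_template
from pydantic import BaseModel


class Librarian(BaseModel):
    history: list[openai.OpenAIMessageParam] = []

    @openai.call("gpt-4o-mini")
    @prompt_template(
        """
        SYSTEM: You are a librarian.
        MESSAGES: {self.history}
        USER: {query}
        """
    )
    def _call(self, query: str): ...

    def chat(self, query: str) -> str:
        response = self._call(query)
        # the conversation history is the state; each call itself is stateless
        self.history += [response.user_message_param, response.message_param]
        return response.content


librarian = Librarian()
print(librarian.chat("Recommend a fantasy book"))
print(librarian.chat("Now one that's better for beginners"))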

Looking Forward

Mirascope V1 is a significant step forward, but our journey doesn't end here. We're committed to continuous improvement and innovation. Some areas we're particularly excited about exploring in future releases include:

  • More Programming Languages: I started with Python because that's what felt most natural as an MLE / AI research engineer, but I think developers who work in other languages deserve tooling that follows the same principles with which we've built Mirascope. I expect the next language we support will be JavaScript/TypeScript.
  • Provider Support: New providers (and new features) are popping up all the time. For example, OpenAI recently released native structured output support, and Anthropic just released prompt caching. We want to push ourselves to support such features as soon as they're released.
  • Version Control: Versioning a text prompt may be easy, but the reality is that prompts are often far more than just a single text snippet – they involve code. We love git, but it falls short in this instance. We're excited about discovering new ways to support versioning on the fly, enabling better hot-swapping of entire flows and easier evaluation and comparison of different implementation versions.
  • Postgres and pgvector: Similar to improving the ergonomics of LLM APIs, we think there is a single-layer abstraction on top of Postgres and pgvector that's waiting to be built. We envision something like a CRUD interface for RAG apps where the underlying data can live where all of your data already lives.

Let us know if there's anything specific you'd like to see included in future releases!

Building Mirascope has been an incredible learning experience. We're deeply grateful to our community for their support, feedback, and patience throughout this process. Mirascope V1 is as much your achievement as it is ours.

We invite you to try out Mirascope V1, share your experiences, and join us in shaping the future of LLM development in Python. Together, we can continue to push the boundaries of what's possible with AI.

Ready to get started? Here's how:

1. Check out our Quick Start Guide to set up Mirascope in minutes.

2. Explore our Usage Documentation for in-depth guides and examples.

3. Join our Slack Community to discuss questions, stay up to date on announcements, or even show off what you've built.

Let's build the future of AI together!