Mirascope v2 Alpha
Welcome to the Mirascope v2 alpha!
We've taken everything we learned from Mirascope v0 and v1 and have rewritten Mirascope from the ground up. Mirascope v2 focuses on providing type-safe, consistent abstractions that unify across all major LLM providers. Our goal is to fully abstract over provider-specific differences, providing a flexible interface that is portable across LLM providers.
Mirascope provides an llm.call decorator for a smooth and Pythonic approach to writing functions that call LLMs. However, we've implemented this on top of an equally powerful llm.Model class, which is available for those who would prefer not to use decorators.
We're proud to share this alpha with you, and look forward to your feedback!
Installation
To install the Mirascope v2 alpha, use the following command:
uv add "mirascope[all]==2.0.0-alpha.2"The Call Decorator
The most convenient way to use Mirascope v2 is via the llm.call decorator:
from mirascope import llm

@llm.call(provider="openai", model_id="gpt-5")
def recommend_book(genre: str):
    return f"Please recommend a book in {genre}."

def main():
    response: llm.Response = recommend_book("fantasy")
    print(response.pretty())

main()
The call decorator decorates a "prompt function", which returns the content that's provided to the LLM. In many cases that content is a string that is transformed into a user message. The decorator itself requires provider and model_id as arguments, and may accept additional parameters, like tools. It returns an llm.Call, which may be called to actually invoke the chosen LLM with the provided content.
The prompt function may take arguments (genre: str in the above example), which are passed to the call when you invoke it.
It's also easy to switch to async calls (by decorating an async prompt function), or to stream the LLM response (by calling .stream() on the Call object), as sketched below.
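Here is a hedged sketch of both variants. The exact streaming-consumption API isn't covered on this page, so treat the stream handling below as an assumption:

import asyncio

from mirascope import llm

# Async variant: decorating an async prompt function yields an async call.
@llm.call(provider="openai", model_id="gpt-5")
async def recommend_book_async(genre: str):
    return f"Please recommend a book in {genre}."

# Sync variant, used below to demonstrate .stream().
@llm.call(provider="openai", model_id="gpt-5")
def recommend_book(genre: str):
    return f"Please recommend a book in {genre}."

async def main():
    # Awaiting the async call returns an llm.AsyncResponse.
    async_response = await recommend_book_async("fantasy")
    print(async_response.pretty())

    # Calling .stream() on the Call object returns a streaming response.
    # Exactly how to consume the stream chunk-by-chunk isn't shown on this
    # page, so we only pretty-print it here; whether .pretty() is available
    # before the stream is fully consumed is an assumption.
    stream_response = recommend_book.stream("fantasy")
    print(stream_response.pretty())

asyncio.run(main())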
The Model Class
All of Mirascope's LLM functionality is implemented in terms of the llm.Model class, which provides a consistent interface for calling any supported large language model.
The llm.call decorator acts as a wrapper around models, but using the call decorator is optional. You can retrieve a model via llm.use_model, and then call it directly.
from mirascope import llm

def recommend_book(genre: str) -> llm.Response:
    model: llm.Model = llm.use_model(provider="openai", model_id="gpt-5")
    message = llm.messages.user(f"Please recommend a book in {genre}.")
    return model.call(messages=[message])

def main():
    response: llm.Response = recommend_book("fantasy")
    print(response.pretty())

main()
It's also possible to directly instantiate an llm.Model, rather than calling llm.use_model. The advantage of use_model is that it supports overriding model choice or model parameters at call time, via with llm.model(...):. Use llm.Model if you want to hard-code the model choice and make it impossible to override.
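As a sketch of the direct-instantiation alternative (the exact llm.Model constructor arguments are an assumption, mirrored from llm.use_model):

from mirascope import llm

def recommend_book(genre: str) -> llm.Response:
    # Constructing llm.Model directly (rather than via llm.use_model) means
    # `with llm.model(...)` overrides won't apply. The constructor arguments
    # shown here mirror llm.use_model and are an assumption.
    model = llm.Model(provider="openai", model_id="gpt-5")
    message = llm.messages.user(f"Please recommend a book in {genre}.")
    return model.call(messages=[message])

print(recommend_book("fantasy").pretty())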
LLM Responses
Regardless of how you call the LLM, you will get back an llm.Response or a variant (like llm.AsyncResponse or llm.StreamResponse).
All responses inherit from llm.RootResponse, and contain the full message history of the interaction with the LLM (including the most recent assistant message), and convenient property accessors for the content of the LLM's response. For example, response.texts contains all of the llm.Text pieces of the model's response. (In simple cases, the response content will consist of a single text piece.)
In our examples, we print response.pretty(), a convenience method that turns all of the response content into an organized, human-readable string. In production code, you might reference response.content or response.texts instead.
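For instance, a minimal sketch of working with the content accessors directly (how an individual llm.Text piece stringifies, and the exact shape of response.content, are assumptions here):

from mirascope import llm

@llm.call(provider="openai", model_id="gpt-5")
def recommend_book(genre: str):
    return f"Please recommend a book in {genre}."

response = recommend_book("fantasy")

# response.texts holds the llm.Text pieces of the assistant message;
# in simple cases there is exactly one.
for text in response.texts:
    print(text)  # assumes llm.Text renders sensibly when printed

# response.content is the other accessor mentioned above; its exact
# shape isn't covered on this page, so we just print it as-is.
print(response.content)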
Call Parameters
Mirascope has a common set of parameters that may be used to configure the LLM; for example, max_tokens to limit the token usage, or temperature to adjust the variability of the responses.
from mirascope import llm

@llm.call(
    provider="openai",
    model_id="gpt-5",
    temperature=1,
)
def recommend_book(genre: str):
    return f"Please recommend a book in {genre}."

def main():
    response: llm.Response = recommend_book("fantasy")
    print(response.pretty())

main()
Overriding Models or Params
If you wish, you can override your choice of provider, model, and params at call time using the llm.model context manager. (However, it is not possible to change the call's tools or response format.)
from mirascope import llm

@llm.call(provider="openai", model_id="gpt-5")
def recommend_book(genre: str):
    return f"Please recommend a book in {genre}."

def main():
    with llm.model(provider="anthropic", model_id="claude-sonnet-4-0", temperature=1):
        response: llm.Response = recommend_book("fantasy")
        print(response.pretty())

main()
You can use the same pattern to override the model being used when calling response.resume (regardless of whether the response was generated via the decorator or a direct model call). You can also use with llm.model to change model params such as thinking, temperature, etc.
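For example, a sketch of overriding the model when resuming a response, using only the pieces shown elsewhere on this page:

from mirascope import llm

@llm.call(provider="openai", model_id="gpt-5")
def recommend_book(genre: str):
    return f"Please recommend a book in {genre}."

def main():
    response = recommend_book("fantasy")
    # Resume the conversation on a different provider/model by wrapping
    # the resume call in the llm.model context manager.
    with llm.model(provider="anthropic", model_id="claude-sonnet-4-0"):
        continuation = response.resume("Please explain your choice.")
    print(continuation.pretty())

main()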
Resuming Responses
The llm.Response class makes it easy to continue a conversation using all the prior messages as context: just call response.resume with additional user content. Here's a simple example:
from mirascope import llm

@llm.call(provider="openai", model_id="gpt-5")
def recommend_book(genre: str):
    return f"Please recommend a book in {genre}."

def main():
    response: llm.Response = recommend_book("fantasy")
    print(response.pretty())
    continuation: llm.Response = response.resume("Please explain your choice.")
    print(continuation.pretty())

main()
response.resume's behavior and return type depend on the response it's being called on. So AsyncResponse.resume is async and returns an AsyncResponse, StreamResponse.resume returns a new StreamResponse, etc.
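A hedged sketch of the async case (assuming an async prompt function as described earlier):

import asyncio

from mirascope import llm

@llm.call(provider="openai", model_id="gpt-5")
async def recommend_book(genre: str):
    return f"Please recommend a book in {genre}."

async def main():
    response = await recommend_book("fantasy")  # llm.AsyncResponse
    # AsyncResponse.resume is itself async and returns a new AsyncResponse.
    continuation = await response.resume("Please explain your choice.")
    print(continuation.pretty())

asyncio.run(main())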
Tools and Tool Calling
We've made tool calling a breeze. Use the llm.tool decorator to convert a Python function into an llm.Tool, and provide it via the tools argument to the call decorator (or the model call). Then, if response.tool_calls is present, you can call response.execute_tools() to call those tools (generating a sequence of llm.ToolOutputs), and response.resume(...) to use those outputs and create a new response. Here's an example:
from mirascope import llm

@llm.tool()
def available_library_books() -> list[str]:
    return [
        "Mistborn by Brandon Sanderson",
        "The Name of the Wind by Patrick Rothfuss",
        "Too Like the Lightning by Ada Palmer",
        "Wild Seed by Octavia Butler",
    ]

@llm.call(
    provider="openai",
    model_id="gpt-5",
    tools=[available_library_books],
)
def librarian(query: str):
    return query

def main():
    response: llm.Response = librarian(
        "Please recommend a mind-bending book that's available in the library."
    )
    while response.tool_calls:
        tool_outputs = response.execute_tools()
        response = response.resume(tool_outputs)
    print(response.pretty())

main()
A note on async tool calling: if the decorated tool function is async, then an llm.AsyncTool is created. If any tool is async, then all must be, and the call must be, too. In that case, you'd use await response.execute_tools().
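Here is a sketch of that async case, under the constraints described above (all tools async, call async):

import asyncio

from mirascope import llm

# An async tool function produces an llm.AsyncTool.
@llm.tool()
async def available_library_books() -> list[str]:
    return ["Mistborn by Brandon Sanderson", "Wild Seed by Octavia Butler"]

# Because the tool is async, the prompt function must be async as well.
@llm.call(provider="openai", model_id="gpt-5", tools=[available_library_books])
async def librarian(query: str):
    return query

async def main():
    response = await librarian("Please recommend a book from the library.")
    while response.tool_calls:
        # Executing async tools requires await.
        tool_outputs = await response.execute_tools()
        # resume on an AsyncResponse is also awaited (per the earlier note).
        response = await response.resume(tool_outputs)
    print(response.pretty())

asyncio.run(main())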
Response Formatting
If you'd like the LLM to return structured output, simply define a format class that inherits from Pydantic's BaseModel, and pass it as the format= argument to the decorator (or model call). Then call response.parse() afterwards:
from pydantic import BaseModel

from mirascope import llm

class Book(BaseModel):
    title: str
    author: str

@llm.call(provider="openai", model_id="gpt-5", format=Book)
def recommend_book(genre: str):
    return f"Please recommend a book in {genre}."

def main():
    response: llm.Response[Book] = recommend_book("fantasy")
    book: Book = response.parse()
    print(f"{book.title} by {book.author}")

main()
For type safety, llm.Response is generic on the type of format that the response may be parsed into. Thus, in the example above, we get an llm.Response[Book], indicating that the response can be parsed to return a Book. When a format is not provided, we set the format type to None, so llm.Response is effectively an alias for llm.Response[None].
Context Prompts & Context Tools
Sometimes, your prompt or tools may have dependencies that you'd like to inject at call time, not at function definition time. Mirascope's Context system is designed for just this use case. To use context, define your prompts and tool definitions so that their first arg is named ctx, and has type llm.Context (or a subclass of llm.Context).
When using a context call, you will generally need to pass an llm.Context object of the right type as the first argument whenever interacting with the LLM or with tools. Thus, ContextCall.call, ContextResponse.resume, and ContextResponse.execute_tools all take ctx: llm.Context as their first argument.
Here's an example, in which both the prompt and the tool depend on a Library context object.
from dataclasses import dataclass

from mirascope import llm

@dataclass
class Library:
    available_books: list[str]
    detailed_book_info: dict[str, str]

@llm.tool()
def get_book_info(ctx: llm.Context[Library], book: str) -> str:
    return ctx.deps.detailed_book_info.get(book, "Book not found")

@llm.call(provider="openai", model_id="gpt-5", tools=[get_book_info])
def librarian(ctx: llm.Context[Library], query: str):
    book_list = "\n".join(ctx.deps.available_books)
    return [
        llm.messages.system(
            f"You are a librarian, with access to these books: {book_list}"
        ),
        query,
    ]

def main():
    # Minimal example data so the Library dependency is populated.
    library = Library(
        available_books=["Too Like the Lightning by Ada Palmer"],
        detailed_book_info={
            "Too Like the Lightning by Ada Palmer": "A mind-bending science fiction novel."
        },
    )
    query = "Please recommend a mind-bending book from the library."
    ctx = llm.Context(deps=library)
    response: llm.ContextResponse[Library] = librarian(ctx, query)
    while response.tool_calls:
        tool_outputs = response.execute_tools(ctx)
        response = response.resume(ctx, tool_outputs)
    print(response.pretty())

main()
When using context, the prompt and all provided context tools must agree on the type of the dependency being stored in context. (If multiple tools want different dependency objects, you should combine these into a wrapper dependency so they can still take the same object.) The context system is type-safe, so your typechecker will warn you if anything goes wrong.
It is okay to mix context tools and non-context tools in a single call, so long as the prompt is a context prompt taking a context object that matches what the tools expect.
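For instance, here is a brief sketch of mixing the two kinds of tools in one call. The get_current_year tool is a made-up plain tool for illustration, and the Library dependency mirrors the example above:

from dataclasses import dataclass, field

from mirascope import llm

@dataclass
class Library:
    detailed_book_info: dict[str, str] = field(default_factory=dict)

# A plain (non-context) tool: it takes no ctx argument.
@llm.tool()
def get_current_year() -> int:
    return 2025

# A context tool that matches the prompt's Library dependency.
@llm.tool()
def get_book_info(ctx: llm.Context[Library], book: str) -> str:
    return ctx.deps.detailed_book_info.get(book, "Book not found")

# The prompt is a context prompt, so both kinds of tools can be provided together.
@llm.call(
    provider="openai",
    model_id="gpt-5",
    tools=[get_book_info, get_current_year],
)
def librarian(ctx: llm.Context[Library], query: str):
    return query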
For type safety, llm.ContextResponse is generic over two type parameters: llm.ContextResponse[DepsT, FormatT]. The first is the type of dependency that must be injected via llm.Context; the second is the format type (if specified).
Learning More
If you'd like to learn more about Mirascope v2, consider the following resources:
- We have some additional examples that you may peruse.
- We have extensive end-to-end snapshot testing, which consists of real, runnable Mirascope code and snapshots that serialize the expected output. For example, here are end-to-end tests for cross-provider thinking support and here are the corresponding snapshots.
- The API reference documents all of the public functionality in Mirascope.
- You can hop on our Discord and ask us questions directly!
We welcome your feedback, questions, and bug reports.