responses
Attribute AsyncChunkIterator
Type: TypeAlias
Asynchronous iterator yielding chunks with raw data.
Class AsyncContextResponse
The response generated by an LLM from an async context call.
Bases: BaseResponse[AsyncContextToolkit[DepsT], FormattableT], Generic[DepsT, FormattableT]
Function execute_tools
Execute and return all of the tool calls in the response.
Parameters
| Name | Type | Description |
|---|---|---|
| self | Any | - |
| ctx | Context[DepsT] | A `Context` with the required deps type. |
Returns
| Type | Description |
|---|---|
| Sequence[ToolOutput] | A sequence containing a `ToolOutput` for every tool call in the order they appeared. |
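As a minimal sketch (the call that produced `response` and the `Context` setup are assumed, not part of this reference):

```python
# Assumed setup: `response` is an AsyncContextResponse[MyDeps, None] and
# `ctx` is a Context[MyDeps]; both were created elsewhere.
if response.tool_calls:
    outputs = await response.execute_tools(ctx)
    # One ToolOutput per tool call, in the order the calls appeared.
    for output in outputs:
        print(output)
```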
Function resume
Generate a new AsyncContextResponse using this response's messages with additional user content.
Uses this response's tools and format type. Also uses this response's provider, model, client, and params, unless the model context manager is being used to provide a new LLM as an override.
Parameters
| Name | Type | Description |
|---|---|---|
| self | Any | - |
| ctx | Context[DepsT] | A Context with the required deps type. |
| content | UserContent | The new user message content to append to the message history. |
Returns
| Type | Description |
|---|---|
| AsyncContextResponse[DepsT] | AsyncContextResponse[DepsT, FormattableT] | A new `AsyncContextResponse` instance generated from the extended message history. |
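A common pattern is a tool loop that alternates `execute_tools` and `resume`. This is a hedged sketch; in particular, it assumes the returned `ToolOutput` sequence is acceptable as `UserContent`, which you should verify for your version:

```python
# Assumption: a Sequence[ToolOutput] may be passed as `content` to resume().
while response.tool_calls:
    outputs = await response.execute_tools(ctx)
    response = await response.resume(ctx, outputs)
print(response.pretty())
```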
Class AsyncContextStreamResponse
An AsyncContextStreamResponse wraps response content from the LLM with a streaming interface.
This class supports iteration to process chunks as they arrive from the model.
Content can be streamed in one of four ways:
- Via `.streams()`, which provides an iterator of streams, where each stream contains chunks of streamed data. The chunks contain deltas (the new content in that particular chunk), and the stream itself accumulates the collected state of all the chunks processed thus far.
- Via `.chunk_stream()`, which allows iterating over Mirascope's provider-agnostic chunk representation.
- Via `.pretty_stream()`, a helper method which provides all response content as `str` deltas. Iterating through `pretty_stream` will yield text content and optionally placeholder representations for other content types, but it will still consume the full stream.
- Via `.structured_stream()`, a helper method which provides partial structured outputs from a response (useful when `FormatT` is set). Iterating through `structured_stream` will only yield structured partials, but it will still consume the full stream.
As chunks are consumed, they are collected in-memory on the AsyncContextStreamResponse, and they
become available in .content, .messages, .tool_calls, etc. All of the stream
iterators can be restarted after the stream has been consumed, in which case they
will yield chunks from memory in the original sequence that came from the LLM. If
the stream is only partially consumed, a fresh iterator will first iterate through
in-memory content, and then will continue consuming fresh chunks from the LLM.
Text chunks, specifically, are included in the response content as soon as they become
available, via an llm.Text part that updates as more deltas come in. This means that
resuming a partially-streamed response will include as much text as the model generated.
Other chunks, like Thinking or ToolCall, are only added to the response content once
the corresponding part has fully streamed. This avoids issues like adding incomplete
tool calls, or thinking blocks missing signatures, to the response.
Fully iterating through any of the iterators consumes the whole LLM stream. You can pause stream execution midway by breaking out of the iterator, and you can safely resume execution from the same iterator if desired.
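As a sketch of the `.streams()` style of consumption, assuming `stream_response` is an `AsyncContextStreamResponse` obtained elsewhere (the call that produced it is not shown); the `content_type` discriminators are documented with the stream classes below:

```python
async for stream in stream_response.streams():
    if stream.content_type == "text":
        # Iterating a text stream is assumed to yield str deltas
        # (per BaseAsyncStream[Text, str]).
        async for delta in stream:
            print(delta, end="", flush=True)
        print()
    elif stream.content_type == "tool_call":
        tool_call = await stream.collect()  # the complete ToolCall
```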
Bases: BaseAsyncStreamResponse[AsyncContextToolkit, FormattableT], Generic[DepsT, FormattableT]
Function execute_tools
Execute and return all of the tool calls in the response.
Parameters
| Name | Type | Description |
|---|---|---|
| self | Any | - |
| ctx | Context[DepsT] | A `Context` with the required deps type. |
Returns
| Type | Description |
|---|---|
| Sequence[ToolOutput] | A sequence containing a `ToolOutput` for every tool call in the order they appeared. |
Function resume
Generate a new AsyncContextStreamResponse using this response's messages with additional user content.
Uses this response's tools and format type. Also uses this response's provider, model, client, and params, unless the model context manager is being used to provide a new LLM as an override.
Parameters
| Name | Type | Description |
|---|---|---|
| self | Any | - |
| ctx | Context[DepsT] | A Context with the required deps type. |
| content | UserContent | The new user message content to append to the message history. |
Returns
| Type | Description |
|---|---|
| AsyncContextStreamResponse[DepsT] | AsyncContextStreamResponse[DepsT, FormattableT] | A new `AsyncContextStreamResponse` instance generated from the extended message history. |
Class AsyncResponse
The response generated by an LLM in async mode.
Bases: BaseResponse[AsyncToolkit, FormattableT]
Function execute_tools
Execute and return all of the tool calls in the response.
Parameters
| Name | Type | Description |
|---|---|---|
| self | Any | - |
Returns
| Type | Description |
|---|---|
| Sequence[ToolOutput] | A sequence containing a `ToolOutput` for every tool call in the order they appeared. |
Function resume
Generate a new AsyncResponse using this response's messages with additional user content.
Uses this response's tools and format type. Also uses this response's provider, model, client, and params, unless the model context manager is being used to provide a new LLM as an override.
Parameters
| Name | Type | Description |
|---|---|---|
| self | Any | - |
| content | UserContent | The new user message content to append to the message history. |
Returns
| Type | Description |
|---|---|
| AsyncResponse | AsyncResponse[FormattableT] | A new `AsyncResponse` instance generated from the extended message history. |
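For example, resuming an async response with a follow-up question (the function that produced `response` is assumed; `resume` is assumed to be awaitable on async responses):

```python
# `response` is an AsyncResponse from some earlier async call.
followup = await response.resume("Can you summarize that in one sentence?")
print(followup.pretty())
```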
Attribute AsyncStream
Type: TypeAlias
An asynchronous assistant content stream.
Class AsyncStreamResponse
An AsyncStreamResponse wraps response content from the LLM with a streaming interface.
This class supports iteration to process chunks as they arrive from the model.
Content can be streamed in one of four ways:
- Via `.streams()`, which provides an iterator of streams, where each stream contains chunks of streamed data. The chunks contain deltas (the new content in that particular chunk), and the stream itself accumulates the collected state of all the chunks processed thus far.
- Via `.chunk_stream()`, which allows iterating over Mirascope's provider-agnostic chunk representation.
- Via `.pretty_stream()`, a helper method which provides all response content as `str` deltas. Iterating through `pretty_stream` will yield text content and optionally placeholder representations for other content types, but it will still consume the full stream.
- Via `.structured_stream()`, a helper method which provides partial structured outputs from a response (useful when `FormatT` is set). Iterating through `structured_stream` will only yield structured partials, but it will still consume the full stream.
As chunks are consumed, they are collected in-memory on the AsyncStreamResponse, and they
become available in .content, .messages, .tool_calls, etc. All of the stream
iterators can be restarted after the stream has been consumed, in which case they
will yield chunks from memory in the original sequence that came from the LLM. If
the stream is only partially consumed, a fresh iterator will first iterate through
in-memory content, and then will continue consuming fresh chunks from the LLM.
Text chunks, specifically, are included in the response content as soon as they become
available, via an llm.Text part that updates as more deltas come in. This means that
resuming a partially-streamed response will include as much text as the model generated.
Other chunks, like Thinking or ToolCall, are only added to the response content once
the corresponding part has fully streamed. This avoids issues like adding incomplete
tool calls, or thinking blocks missing signatures, to the response.
Fully iterating through any of the iterators consumes the whole LLM stream. You can pause stream execution midway by breaking out of the iterator, and you can safely resume execution from the same iterator if desired.
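A sketch of a live transcript via `.pretty_stream()`; only the method itself comes from this reference, the rest is assumed setup:

```python
# `stream_response` is an AsyncStreamResponse obtained elsewhere.
async for delta in stream_response.pretty_stream():
    print(delta, end="", flush=True)

# The stream is now fully consumed, so accumulated state is in memory:
print(stream_response.texts)
```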
Bases: BaseAsyncStreamResponse[AsyncToolkit, FormattableT]
Function execute_tools
Execute and return all of the tool calls in the response.
Parameters
| Name | Type | Description |
|---|---|---|
| self | Any | - |
Returns
| Type | Description |
|---|---|
| Sequence[ToolOutput] | A sequence containing a `ToolOutput` for every tool call in the order they appeared. |
Function resume
Generate a new AsyncStreamResponse using this response's messages with additional user content.
Uses this response's tools and format type. Also uses this response's provider, model, client, and params, unless the model context manager is being used to provide a new LLM as an override.
Parameters
| Name | Type | Description |
|---|---|---|
| self | Any | - |
| content | UserContent | The new user message content to append to the message history. |
Returns
| Type | Description |
|---|---|
| AsyncStreamResponse | AsyncStreamResponse[FormattableT] | A new `AsyncStreamResponse` instance generated from the extended message history. |
Class AsyncTextStream
Asynchronous text stream implementation.
Bases: BaseAsyncStream[Text, str]
Attributes
| Name | Type | Description |
|---|---|---|
| type | Literal['async_text_stream'] | - |
| content_type | Literal['text'] | The type of content stored in this stream. |
| partial_text | str | The accumulated text content as chunks are received. |
Function collect
Asynchronously collect all chunks and return the final Text content.
Parameters
| Name | Type | Description |
|---|---|---|
| self | Any | - |
Returns
| Type | Description |
|---|---|
| Text | The complete text content after consuming all chunks. |
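A sketch showing `partial_text` accumulating mid-stream versus `collect()` for the final `Text`; `render` is a hypothetical UI hook and the surrounding setup is assumed:

```python
async for stream in stream_response.streams():
    if stream.content_type == "text":
        async for _delta in stream:
            render(stream.partial_text)  # hypothetical: redraw accumulated text
        # With the stream consumed, collect() is assumed to return the
        # complete Text from the chunks held in memory.
        final_text = await stream.collect()
```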
Class AsyncThoughtStream
Asynchronous thought stream implementation.
Bases: BaseAsyncStream[Thought, str]
Attributes
| Name | Type | Description |
|---|---|---|
| type | Literal['async_thought_stream'] | - |
| content_type | Literal['thought'] | The type of content stored in this stream. |
| partial_thought | str | The accumulated thought content as chunks are received. |
Function collect
Asynchronously collect all chunks and return the final Thought content.
Parameters
| Name | Type | Description |
|---|---|---|
| self | Any | - |
Returns
| Type | Description |
|---|---|
| Thought | The complete thought content after consuming all chunks. |
Class AsyncToolCallStream
Asynchronous tool call stream implementation.
Bases: BaseAsyncStream[ToolCall, str]
Attributes
| Name | Type | Description |
|---|---|---|
| type | Literal['async_tool_call_stream'] | - |
| content_type | Literal['tool_call'] | The type of content stored in this stream. |
| tool_id | str | A unique identifier for this tool call. |
| tool_name | str | The name of the tool being called. |
| partial_args | str | The accumulated tool arguments as chunks are received. |
Function collect
Asynchronously collect all chunks and return the final ToolCall content.
Parameters
| Name | Type | Description |
|---|---|---|
| self | Any | - |
Returns
| Type | Description |
|---|---|
| ToolCall | The complete tool call after consuming all chunks. |
Attribute ChunkIterator
Type: TypeAlias
Synchronous iterator yielding chunks with raw data.
Class ContextResponse
The response generated by an LLM from a context call.
Bases: BaseResponse[ContextToolkit[DepsT], FormattableT], Generic[DepsT, FormattableT]
Function execute_tools
Execute and return all of the tool calls in the response.
Parameters
| Name | Type | Description |
|---|---|---|
| self | Any | - |
| ctx | Context[DepsT] | A `Context` with the required deps type. |
Returns
| Type | Description |
|---|---|
| Sequence[ToolOutput] | A sequence containing a `ToolOutput` for every tool call. |
Function resume
Generate a new ContextResponse using this response's messages with additional user content.
Uses this response's tools and format type. Also uses this response's provider, model, client, and params, unless the model context manager is being used to provide a new LLM as an override.
Parameters
| Name | Type | Description |
|---|---|---|
| self | Any | - |
| ctx | Context[DepsT] | A `Context` with the required deps type. |
| content | UserContent | The new user message content to append to the message history. |
Returns
| Type | Description |
|---|---|
| ContextResponse[DepsT] | ContextResponse[DepsT, FormattableT] | A new `ContextResponse` instance generated from the extended message history. |
Class ContextStreamResponse
A ContextStreamResponse wraps response content from the LLM with a streaming interface.
This class supports iteration to process chunks as they arrive from the model.
Content can be streamed in one of four ways:
- Via `.streams()`, which provides an iterator of streams, where each stream contains chunks of streamed data. The chunks contain deltas (the new content in that particular chunk), and the stream itself accumulates the collected state of all the chunks processed thus far.
- Via `.chunk_stream()`, which allows iterating over Mirascope's provider-agnostic chunk representation.
- Via `.pretty_stream()`, a helper method which provides all response content as `str` deltas. Iterating through `pretty_stream` will yield text content and optionally placeholder representations for other content types, but it will still consume the full stream.
- Via `.structured_stream()`, a helper method which provides partial structured outputs from a response (useful when `FormatT` is set). Iterating through `structured_stream` will only yield structured partials, but it will still consume the full stream.
As chunks are consumed, they are collected in-memory on the ContextStreamResponse, and they
become available in .content, .messages, .tool_calls, etc. All of the stream
iterators can be restarted after the stream has been consumed, in which case they
will yield chunks from memory in the original sequence that came from the LLM. If
the stream is only partially consumed, a fresh iterator will first iterate through
in-memory content, and then will continue consuming fresh chunks from the LLM.
Text chunks, specifically, are included in the response content as soon as they become
available, via an llm.Text part that updates as more deltas come in. This means that
resuming a partially-streamed response will include as much text as the model generated.
Other chunks, like Thinking or ToolCall, are only added to the response content once
the corresponding part has fully streamed. This avoids issues like adding incomplete
tool calls, or thinking blocks missing signatures, to the response.
Fully iterating through any of the iterators consumes the whole LLM stream. You can pause stream execution midway by breaking out of the iterator, and you can safely resume execution from the same iterator if desired.
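A sketch of `.structured_stream()` for partial structured output. The `Book` model and the call that produced `stream_response` are assumptions; the format type was presumably set when the call was made:

```python
from pydantic import BaseModel

class Book(BaseModel):
    title: str
    author: str

# `stream_response` is a ContextStreamResponse[MyDeps, Book] (setup assumed).
for partial_book in stream_response.structured_stream():
    print(partial_book)  # partials with fields filling in as chunks arrive

# parse() is documented on the base response; once the stream is fully
# consumed it is assumed to return the complete Book.
book = stream_response.parse()
```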
Bases: BaseSyncStreamResponse[ContextToolkit, FormattableT], Generic[DepsT, FormattableT]
Function execute_tools
Execute and return all of the tool calls in the response.
Parameters
| Name | Type | Description |
|---|---|---|
| self | Any | - |
| ctx | Context[DepsT] | A `Context` with the required deps type. |
Returns
| Type | Description |
|---|---|
| Sequence[ToolOutput] | A sequence containing a `ToolOutput` for every tool call. |
Function resume
Generate a new ContextStreamResponse using this response's messages with additional user content.
Uses this response's tools and format type. Also uses this response's provider, model, client, and params, unless the model context manager is being used to provide a new LLM as an override.
Parameters
| Name | Type | Description |
|---|---|---|
| self | Any | - |
| ctx | Context[DepsT] | A Context with the required deps type. |
| content | UserContent | The new user message content to append to the message history. |
Returns
| Type | Description |
|---|---|
| ContextStreamResponse[DepsT] | ContextStreamResponse[DepsT, FormattableT] | A new `ContextStreamResponse` instance generated from the extended message history. |
Class FinishReason
The reason why the LLM finished generating a response.
FinishReason is only set when the response did not have a normal finish (e.g. it
ran out of tokens). When a response finishes generating normally, no finish reason
is set.
Attributes
| Name | Type | Description |
|---|---|---|
| MAX_TOKENS | 'max_tokens' | - |
| REFUSAL | 'refusal' | - |
Class FinishReasonChunk
Represents the finish reason for a completed stream.
Attributes
| Name | Type | Description |
|---|---|---|
| type | Literal['finish_reason_chunk'] | - |
| finish_reason | FinishReason | The reason the stream finished. |
Class RawMessageChunk
A chunk containing provider-specific raw message content that will be added to the AssistantMessage.
This chunk contains a provider-specific representation of a piece of content that
will be added to the AssistantMessage reconstructed by the containing stream.
This content should be a Jsonable Python object for serialization purposes.
The intention is that this content may be passed as-is back to the provider when the
generated AssistantMessage is being reused in conversation.
Attributes
| Name | Type | Description |
|---|---|---|
| type | Literal['raw_message_chunk'] | - |
| raw_message | Jsonable | The provider-specific raw content. Should be a Jsonable object. |
Class RawStreamEventChunk
A chunk containing a raw stream event from the underlying provider.
Will be accumulated on StreamResponse.raw for debugging purposes.
Attributes
| Name | Type | Description |
|---|---|---|
| type | Literal['raw_stream_event_chunk'] | - |
| raw_stream_event | Any | The raw stream event from the underlying provider. |
Class Response
The response generated by an LLM.
Bases: BaseResponse[Toolkit, FormattableT]
Function execute_tools
Execute and return all of the tool calls in the response.
Parameters
| Name | Type | Description |
|---|---|---|
| self | Any | - |
Returns
| Type | Description |
|---|---|
| Sequence[ToolOutput] | A sequence containing a `ToolOutput` for every tool call in the order they appeared. |
Function resume
Generate a new Response using this response's messages with additional user content.
Uses this response's tools and format type. Also uses this response's provider, model, client, and params, unless the model context manager is being used to provide a new LLM as an override.
Parameters
| Name | Type | Description |
|---|---|---|
| self | Any | - |
| content | UserContent | The new user message content to append to the message history. |
Returns
| Type | Description |
|---|---|
| Response | Response[FormattableT] | A new `Response` instance generated from the extended message history. |
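For example, a synchronous follow-up turn (`answer` and its originating call are assumed):

```python
# `answer` is a Response from an earlier call (setup not shown).
followup = answer.resume("And what are the main counterarguments?")
print(followup.pretty())
```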
Class RootResponse
Base class for LLM responses.
Bases: Generic[ToolkitT, FormattableT], ABC
Attributes
| Name | Type | Description |
|---|---|---|
| raw | Any | The raw response from the LLM. |
| provider | Provider | The provider that generated this response. |
| model_id | ModelId | The model id that generated this response. |
| params | Params | The params that were used to generate this response (or None). |
| toolkit | ToolkitT | The toolkit containing the tools used when generating this response. |
| messages | list[Message] | The message history, including the most recent assistant message. |
| content | Sequence[AssistantContentPart] | The content generated by the LLM. |
| texts | Sequence[Text] | The text content in the generated response, if any. |
| tool_calls | Sequence[ToolCall] | The tools the LLM wants called on its behalf, if any. |
| thoughts | Sequence[Thought] | The readable thoughts from the model's thinking process, if any. The thoughts may be direct output from the model thinking process, or may be a generated summary. (This depends on the provider; newer models tend to summarize.) |
| finish_reason | FinishReason | None | The reason why the LLM finished generating a response, if set. `finish_reason` is only set if the response did not finish generating normally, e.g. `FinishReason.MAX_TOKENS` if the model ran out of tokens before completing. When the response generates normally, `response.finish_reason` will be `None`. |
| format | Format[FormattableT] | None | The `Format` describing the structured response format, if available. |
| model | Model | A `Model` with parameters matching this response. |
Function parse
Parse the response according to the response format's parser.
Parameters
| Name | Type | Description |
|---|---|---|
| self | Any | - |
| partial=False | bool | - |
Returns
| Type | Description |
|---|---|
| FormattableT | Partial[FormattableT] | None | The formatted response object of type FormatT. |
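A sketch of `parse`, including `partial=True` for incomplete output; the formatted call and the `Book` type are assumptions:

```python
# `response` is a Response[Book] from a call formatted with Book (assumed).
book = response.parse()                  # complete Book, or None if unavailable
partial = response.parse(partial=True)   # Partial[Book]: fields may be missing
```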
Function pretty
Return a string representation of all response content.
The response content will be represented in a way that emphasizes clarity and readability, but may not include all metadata (like thinking signatures or tool call IDs), and thus cannot be used to reconstruct the response. For example:
Thinking: The user is asking a math problem. I should use the calculator tool.
Tool Call (calculator) {'operation': 'mult', 'a': 1337, 'b': 4242}
I am going to use the calculator and answer your question for you!
Parameters
| Name | Type | Description |
|---|---|---|
| self | Any | - |
Returns
| Type | Description |
|---|---|
| str | - |
Attribute Stream
Type: TypeAlias
A synchronous assistant content stream.
Class StreamResponse
A StreamResponse wraps response content from the LLM with a streaming interface.
This class supports iteration to process chunks as they arrive from the model.
Content can be streamed in one of four ways:
- Via `.streams()`, which provides an iterator of streams, where each stream contains chunks of streamed data. The chunks contain deltas (the new content in that particular chunk), and the stream itself accumulates the collected state of all the chunks processed thus far.
- Via `.chunk_stream()`, which allows iterating over Mirascope's provider-agnostic chunk representation.
- Via `.pretty_stream()`, a helper method which provides all response content as `str` deltas. Iterating through `pretty_stream` will yield text content and optionally placeholder representations for other content types, but it will still consume the full stream.
- Via `.structured_stream()`, a helper method which provides partial structured outputs from a response (useful when `FormatT` is set). Iterating through `structured_stream` will only yield structured partials, but it will still consume the full stream.
As chunks are consumed, they are collected in-memory on the StreamResponse, and they
become available in .content, .messages, .tool_calls, etc. All of the stream
iterators can be restarted after the stream has been consumed, in which case they
will yield chunks from memory in the original sequence that came from the LLM. If
the stream is only partially consumed, a fresh iterator will first iterate through
in-memory content, and then will continue consuming fresh chunks from the LLM.
Text chunks, specifically, are included in the response content as soon as they become
available, via an llm.Text part that updates as more deltas come in. This means that
resuming a partially-streamed response will include as much text as the model generated.
Other chunks, like Thinking or ToolCall, are only added to the response content once
the corresponding part has fully streamed. This avoids issues like adding incomplete
tool calls, or thinking blocks missing signatures, to the response.
Fully iterating through any of the iterators consumes the whole LLM stream. You can pause stream execution midway by breaking out of the iterator, and you can safely resume execution from the same iterator if desired.
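A sketch of the pause-and-resume behavior described above, using `.chunk_stream()` on a sync `StreamResponse` (setup assumed):

```python
seen = 0
for chunk in stream_response.chunk_stream():
    seen += 1
    if seen >= 10:
        break  # pause: the LLM stream is only partially consumed

# A fresh iterator first replays the in-memory chunks, then continues
# consuming new chunks from the LLM where the first iterator left off.
for chunk in stream_response.chunk_stream():
    pass
```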
Bases: BaseSyncStreamResponse[Toolkit, FormattableT]
Function execute_tools
Execute and return all of the tool calls in the response.
Parameters
| Name | Type | Description |
|---|---|---|
| self | Any | - |
Returns
| Type | Description |
|---|---|
| Sequence[ToolOutput] | A sequence containing a `ToolOutput` for every tool call in the order they appeared. |
Function resume
Generate a new StreamResponse using this response's messages with additional user content.
Uses this response's tools and format type. Also uses this response's provider, model, client, and params, unless the model context manager is being used to provide a new LLM as an override.
Parameters
| Name | Type | Description |
|---|---|---|
| self | Any | - |
| content | UserContent | The new user message content to append to the message history. |
Returns
| Type | Description |
|---|---|
| StreamResponse | StreamResponse[FormattableT] | A new `StreamResponse` instance generated from the extended message history. |
Attribute StreamResponseChunk
Type: TypeAlias
Class TextStream
Synchronous text stream implementation.
Bases: BaseStream[Text, str]
Attributes
| Name | Type | Description |
|---|---|---|
| type | Literal['text_stream'] | - |
| content_type | Literal['text'] | The type of content stored in this stream. |
| partial_text | str | The accumulated text content as chunks are received. |
Function collect
Collect all chunks and return the final Text content.
Parameters
| Name | Type | Description |
|---|---|---|
| self | Any | - |
Returns
| Type | Description |
|---|---|
| Text | The complete text content after consuming all chunks. |
Class ThoughtStream
Synchronous thought stream implementation.
Bases: BaseStream[Thought, str]
Attributes
| Name | Type | Description |
|---|---|---|
| type | Literal['thought_stream'] | - |
| content_type | Literal['thought'] | The type of content stored in this stream. |
| partial_thought | str | The accumulated thought content as chunks are received. |
Function collect
Collect all chunks and return the final Thought content.
Parameters
| Name | Type | Description |
|---|---|---|
| self | Any | - |
Returns
| Type | Description |
|---|---|
| Thought | The complete thought content after consuming all chunks. |
Class ToolCallStream
Synchronous tool call stream implementation.
Bases: BaseStream[ToolCall, str]
Attributes
| Name | Type | Description |
|---|---|---|
| type | Literal['tool_call_stream'] | - |
| content_type | Literal['tool_call'] | The type of content stored in this stream. |
| tool_id | str | A unique identifier for this tool call. |
| tool_name | str | The name of the tool being called. |
| partial_args | str | The accumulated tool arguments as chunks are received. |
Function collect
Collect all chunks and return the final ToolCall content.
Parameters
| Name | Type | Description |
|---|---|---|
| self | Any | - |
Returns
| Type | Description |
|---|---|
| ToolCall | The complete tool call after consuming all chunks. |
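Finally, a sketch of watching a sync tool call stream build up its arguments before collecting the complete `ToolCall`; the surrounding setup is assumed:

```python
for stream in stream_response.streams():
    if stream.content_type == "tool_call":
        print(f"tool: {stream.tool_name} (id={stream.tool_id})")
        for _delta in stream:
            # partial_args accumulates the JSON argument string as it streams.
            print(stream.partial_args)
        tool_call = stream.collect()  # the complete ToolCall
```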