Versioning

If you haven't already, we recommend reading Traces first.

When working with non-determinism (cough cough LLMs), it's extremely important that we take a snapshot of the exact version of the code that was used to produce an output. This reproducibility is necessary for proper evaluation.

Automatic Versioning

Setting versioning="automatic" will tell Lilypad to automatically version the decorated function every time it's executed:

from google.genai import Client
import lilypad

lilypad.configure(auto_llm=True)
client = Client()
    
@lilypad.trace(versioning="automatic")
def answer_question(question: str) -> str | None:
    response = client.models.generate_content(
        model="gemini-2.0-flash-001",
        contents=f"Answer this question: {question}",
    )
    return response.text
    
response = answer_question("What is the capital of France?")  # automatically versioned
print(response)
# > The capital of France is Paris.

The version is the 'runnable closure' snapshot

Versioned Function

Version Override

Since a function's version is a snapshot, we can re-run the exact code for a given version.

First, generate type-safe stubs for all available versions:

lilypad sync

This ensures that the .version method return type-safe function signatures matching the expected arguments for that version, which you can then run in a sandbox:

response = answer_question.version(1)("What is the capital of France?")

The .version method will have proper type hints when the function is imported but not if used in the same file. This is due to the way that .pyi files work in Python.

A/B Testing

You can use the .version method to run different versions of the same function easily. This makes A/B testing quite easy — just run each version for a subset of users and evaluate which one is better.

Remote Deployment

Environments are currently under active development during the beta period.

If remote deployment is important to you, please reach out to us on Slack or via email.