The most surprising thing about validating LLM output is that the "guardrails" are often just as complex and prone to error as the LLM itself.
Let’s see this in action. Imagine we want to extract structured data from customer feedback. We’ll use a simple example with a Python script and the guardrails library.
from guardrails import Guard
from guardrails.classes.llm.llm_output import LLMOutput
from pydantic import BaseModel, Field
import json
# Define the output schema using Pydantic
class Feedback(BaseModel):
sentiment: str = Field(description="The overall sentiment of the feedback (positive, negative, neutral)")
topics: list[str] = Field(description="Key topics mentioned in the feedback")
action_item: str | None = Field(None, description="If an action is requested, what is it?")
# Define the guardrails for the LLM
# We'll use a simple output validator that ensures the output conforms to our Pydantic schema.
# The 'on_fail' argument specifies what to do if validation fails. Here, we'll ask the LLM to re-try.
guard = Guard(
validators=[
LLMOutput(
model="gpt-3.5-turbo", # Or any other LLM you prefer
output_schema=Feedback,
on_fail="reask" # Re-ask the LLM with instructions to fix the output
)
]
)
# Example customer feedback
customer_feedback = """
I'm really happy with the new feature! It's made my workflow so much smoother.
However, I noticed a small bug where the save button sometimes doesn't respond.
Please fix this.
"""
# Prompt the LLM with guardrails
# The 'prompt' argument is the actual instruction to the LLM.
# The 'output_schema' is passed implicitly by the validator.
# The 'num_reasks' limits how many times the LLM can re-attempt if it fails validation.
response = guard(
prompt=f"Extract the sentiment, topics, and any action items from the following customer feedback:\n\n{customer_feedback}",
num_reasks=3 # Allow up to 3 re-attempts
)
# The 'response' object contains the validated output.
# If 'response.output' is None, it means validation failed after all re-asks.
if response.output:
print("Validated Output:")
print(json.dumps(response.output.model_dump(), indent=2))
else:
print("Failed to validate LLM output after multiple re-asks.")
print("Raw LLM output history:")
for i, history_item in enumerate(response.history):
print(f"--- Reask {i+1} ---")
print(f"Prompt: {history_item.prompt}")
print(f"LLM Output: {history_item.llm_output}")
print(f"Validation Result: {history_item.validation_result}")
This script defines a Feedback structure using Pydantic, outlining the expected keys and their types. The guardrails library then wraps an LLM call, instructing it to produce output matching this Feedback schema. If the LLM’s output doesn’t conform (e.g., missing a key, wrong data type), the on_fail="reask" directive tells guardrails to send the original prompt back to the LLM, along with the error, and ask it to correct its output.
The core problem guardrails solves is the inherent unreliability of LLM output when strict structure or factual accuracy is required. LLMs are probabilistic; they predict the next most likely token. This means they can hallucinate, misinterpret instructions, or produce malformed output. guardrails acts as an external validation layer, ensuring the LLM’s raw output is post-processed into a usable, predictable format. It’s not just about getting an answer, but about getting an answer that conforms to predefined rules.
Internally, guardrails constructs a meta-prompt for each re-ask. This meta-prompt includes the original user prompt, the LLM’s previous erroneous output, and specific instructions from the validator (like the Pydantic schema definition and the error that occurred) guiding the LLM on how to fix its mistakes. The num_reasks parameter is crucial for preventing infinite loops in cases where the LLM might consistently fail. The output_schema parameter, when used with a Pydantic model, automatically generates the necessary JSON schema and validation logic.
Here’s a look at the response.history if the LLM fails initially:
Failed to validate LLM output after multiple re-asks.
Raw LLM output history:
--- Reask 1 ---
Prompt: Extract the sentiment, topics, and any action items from the following customer feedback:
I'm really happy with the new feature! It's made my workflow so much smoother.
However, I noticed a small bug where the save button sometimes doesn't respond.
Please fix this.
LLM Output: {"sentiment": "positive", "topics": ["new feature", "workflow"], "action_item": "fix the save button bug"}
Validation Result: ValidationResult(status=<ValidationStatus.FAIL: 'fail'>, error_message='Field "action_item" is of type str, but got NoneType.', validator_name='output_schema')
Notice how the error_message clearly states the problem: action_item was expected to be a string but was NoneType. The guardrails library will then use this information to instruct the LLM on the next re-ask.
A subtle but critical aspect of using guardrails is understanding how the on_fail parameter interacts with different validator types. For example, if you’re using a ContentValidator that checks for profanity, on_fail="filter" might remove the offending content, while on_fail="exception" would raise an error. The reask strategy is powerful because it leverages the LLM’s own capabilities to correct its output, rather than relying solely on external parsing or filtering. This is particularly effective for complex data extraction or generation tasks where the LLM has a good chance of understanding and rectifying its own errors with a bit of guidance.
The next step after successfully validating LLM output is often to integrate these structured outputs into downstream applications or to implement more sophisticated chaining of LLM calls, each with its own set of guardrails.