The New OpenAI Agents Platform
While everyone keeps repeating that 2025 is the "Year of the Agent", OpenAI has been heads-down building toward it. In the first two months of the year they released Operator and Deep Research, and today they are bringing many of those capabilities to the API with their new Agents Platform.

by Pierre Placide

Key Components of the OpenAI Agents Platform
1
Responses API
A more flexible foundation for developers building agentic applications, serving as a superset of the Chat Completions API and the suggested starting point for anyone working with OpenAI models.
2
Web Search Tool
Brings ChatGPT Search capabilities to the API, with fine-tuned models that provide real-time information access with inline citations pointing to exact locations in source documents.
3
Computer Use Tool
The technology behind Operator (CUA) now available to developers, allowing models to interact with computer interfaces by analyzing screenshots and generating appropriate actions.
4
Agents SDK
An evolution of Swarm, providing a framework for multi-agent workflows with integrated observability tools, handoffs between agents, and guardrails for safety.
Responses API: The New Foundation
Superset of Chat Completions
The Responses API supports everything Chat Completions supports at launch, and over time will support everything the Assistants API supports, making it a good starting point for anyone new to OpenAI.
Stateful by Default
Unlike Chat Completions, the Responses API is stateful by default, storing conversation state for 30 days at no additional cost. This can be disabled by passing "store=false" for a completely stateless experience; a minimal sketch follows this list.
Built for Agentic Applications
Designed for longer horizon tasks that require multiple turns to accomplish, supporting the new agentic workflows that developers and companies want to build.
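As a minimal sketch of what this looks like in practice (parameter names follow OpenAI's launch documentation; treat the details as illustrative), previous_response_id chains turns without resending history, and store=False opts out of storage:

import openai

client = openai.OpenAI()

# The first turn is stored server-side by default (store=true)
first = client.responses.create(
    model="gpt-4o",
    input="Suggest three names for a coffee shop.",
)

# Continue the conversation without resending the history
second = client.responses.create(
    model="gpt-4o",
    previous_response_id=first.id,
    input="Make the second one more playful.",
)
print(second.output_text)

# Or opt out of storage for a completely stateless call
stateless = client.responses.create(
    model="gpt-4o",
    input="Hello!",
    store=False,
)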
Assistants API Transition
1
Current Status
The Assistants API has a target sunset date in the first half of 2026, with OpenAI planning to make it easy for Assistants API users to migrate to the Responses API without any loss of functionality or data.
2
Migration Path
OpenAI will add assistant-like objects and thread-like objects that work with the Responses API, along with the Code Interpreter tool, which is not launching today but is coming soon.
3
Transition Period
Users will have a full year to migrate, with OpenAI providing support to help through any issues they face during the transition to the more flexible Responses API primitive.
Responses API vs. Chat Completions
Chat Completions
The most widely adopted OpenAI API, optimized for a pre-multi-modality world and single-turn interactions. It takes a prompt in and returns a response out. OpenAI will continue to support it for years with new models and features.
Responses API
Designed for multi-turn interactions and agentic workflows, supporting built-in tools and stateful conversations. It combines the best of what OpenAI learned from the Assistants API with a simplified integration path.
When to Use Each
New users should start with Responses API for access to more capabilities. Existing users should consider migrating if they want to use built-in tools or need more advanced features, while simple use cases can continue with Chat Completions.
Debugging and Observability Benefits
Visual Dashboard
The stateful nature of the Responses API enables visual observability of everything happening in your application, making debugging much simpler: you can see exactly what happened and where your prompt might have gone wrong.
Tool Configuration
The dashboard helps identify if tools were misconfigured or not called correctly, providing insights into how your application is interacting with the OpenAI platform.
Metadata Support
Responses API includes metadata support, allowing developers to attach additional information to their API calls for better organization and tracking of conversations.
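A small sketch of attaching metadata to a call (the key names here are invented for illustration):

import openai

client = openai.OpenAI()

# Tag the call so it can be organized and tracked later
response = client.responses.create(
    model="gpt-4o",
    input="Summarize today's support tickets.",
    metadata={"team": "support", "session": "2025-03-11-001"},
)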
Web Search Tool: Real-Time Information Access
Fine-Tuned Models
The Web Search API exposes two 4o fine-tunes: gpt-4o-search-preview and gpt-4o-mini-search-preview, the same models that power ChatGPT Search, priced at $30/1000 queries and $25/1000 queries respectively.
Inline Citations
The killer feature is inline citations: you not only get a link to a page but also a deep link to exactly where your query was answered in the result page, providing precise attribution.
Performance Improvements
On the SimpleQA benchmark, GPT-4o scores about 38% accuracy while gpt-4o-search-preview reaches 90%, demonstrating the significant performance boost from specialized fine-tuning for search tasks.
Web Search Implementation Options
1
As a Tool in Responses API
Web search is available as a built-in tool in the Responses API. Developers can simply enable the web search tool in their configuration and it's ready to go, integrating seamlessly with other tools.
2
As a Model in Chat Completions
For developers using the Chat Completions API, which doesn't support built-in tools, OpenAI provides direct access to the fine-tuned model that ChatGPT Search uses, called gpt-4o-search-preview.
3
Combined with Other Features
When used as a tool, web search combines with other features like function calling and structured output, enabling developers to structure data from the web in real-time in the JSON schema needed for their application.
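A sketch of that combination (the structured output shape follows the Responses API launch documentation; the schema itself is invented for illustration):

import openai

client = openai.OpenAI()

# Web search plus structured output: live data in your own JSON schema
response = client.responses.create(
    model="gpt-4o",
    tools=[{"type": "web_search_preview"}],
    input="List three recently announced electric vehicles with their prices.",
    text={
        "format": {
            "type": "json_schema",
            "name": "ev_listings",
            "strict": True,
            "schema": {
                "type": "object",
                "properties": {
                    "vehicles": {
                        "type": "array",
                        "items": {
                            "type": "object",
                            "properties": {
                                "model": {"type": "string"},
                                "price_usd": {"type": "number"},
                            },
                            "required": ["model", "price_usd"],
                            "additionalProperties": False,
                        },
                    },
                },
                "required": ["vehicles"],
                "additionalProperties": False,
            },
        }
    },
)
print(response.output_text)  # JSON matching the schema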
Web Search: Behind the Scenes
1
Information Gathering
OpenAI's search research team gathers information from multiple data sources used for search, focusing on retrieving relevant and up-to-date content.
2
Content Selection
The system picks the right information from the gathered sources, prioritizing relevance and accuracy to the user's query.
3
Accurate Citation
The model is trained to cite sources accurately, providing proper attribution to the original content creators and enabling verification.
4
Fine-Tuning Techniques
The team uses synthetic data techniques and model distillation to make the 4o fine-tunes highly effective at remaining factual and answering questions based on retrieved information.
Publisher Controls for Web Search
Opt-In/Opt-Out Options
Publishers can control whether their websites appear in the web search results, with OpenAI providing documentation on how websites and publishers can manage what shows up in the web search tool.
Citation Benefits
For publishers who participate, the inline citation feature provides direct attribution and drives traffic to the specific content that answers users' questions.
Getting Your Site Included
Content creators like Latent Space can follow OpenAI's documentation to ensure their content is properly indexed and available through the web search capabilities.
Web Search in the Online LLM Landscape
1
Perplexity
One of the first to offer an API connected to search, establishing the pattern of LLMs with real-time web access.
2
Gemini
Followed with their search grounding API, continuing the trend of connecting language models to current information.
3
OpenAI
Now joining with their Web Search tool, which adds the innovation of deep links to exact paragraphs matching the query, raising the standard for citation precision.
Knowledge Cutoff Considerations
1
Always Live Information
With web search integration, there's effectively no knowledge cutoff as the model can always access current information, fundamentally changing how we think about model knowledge limitations.
2
Model Knowledge vs. Search
There's a distinction between what the model has internalized through training and what it's retrieving through search, with different use cases benefiting from each approach.
3
Use Case Dependent
For applications like those built by Hebbia for credit firms or law firms, combining public information from the internet with live sources and citations is crucial, while other applications may prefer to rely on the model's internal knowledge.
Future Search Enhancements
Search Depth Parameters
While not currently available, OpenAI is considering parameters to control how deep and wide the search goes, similar to the hyper-parameters in Deep Research implementations.
Similarity Cutoffs
Rather than just offering top K results (like top 10 or top 20), future implementations might benefit from similarity score cutoffs that adapt to the number of relevant documents available.
Context Budgeting
A potential approach could involve setting a context budget, allowing the system to go as deep as possible while selecting the best content within a specified token limit to manage costs.
Creative Uses of Files and Web Search Together
User Preferences
Storing user preferences or memories in the vector store, then using the file search tool to retrieve these preferences when needed for personalized recommendations.
Contextual Web Search
Using retrieved preferences to guide web searches for information or products that match the user's interests, creating a personalized discovery experience.
Unified Experience
Combining these capabilities in a single Responses API call, where the system configures both tools and executes the entire workflow seamlessly, presenting relevant results to the user.
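A sketch of that unified call (the vector store ID is a placeholder for wherever you keep user preferences):

import openai

client = openai.OpenAI()

# One call, two tools: retrieve stored preferences, then search the web
response = client.responses.create(
    model="gpt-4o",
    tools=[
        {"type": "file_search", "vector_store_ids": ["vs_user_prefs"]},  # placeholder ID
        {"type": "web_search_preview"},
    ],
    input="Based on my saved preferences, recommend three new sci-fi novels.",
)
print(response.output_text)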
The Power of Combined Knowledge Sources
1
Model's Internal Knowledge
Base understanding from training data
2
Real-time Web Information
Current data from across the internet
3
Private Documents
Company data and confidential information
4
Personalized Answers
Precise responses tailored to specific needs
When neural networks combine their internal knowledge with real-time access to the internet and private company documents through file search, they can provide compelling and precise answers for virtually any use case. This combination of knowledge sources creates a powerful foundation for building specialized applications.
File Search Tool Enhancements
1
Metadata Filtering
A highly requested feature that's now available, allowing developers to filter search results based on metadata attributes, which becomes critical as a vector store grows beyond 5,000-10,000 records (see the sketch after this list).
2
Expanded File Types
Support for more file formats, making it easier to ingest and search through diverse document collections without having to handle different parsing requirements.
3
Query Optimization
Improved search algorithms that better understand user queries and match them to the most relevant content in the file store.
4
Custom Re-ranking
Capabilities to adjust how search results are prioritized, allowing for more relevant results based on specific application needs.
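As a sketch of metadata filtering, a filter object can restrict file search to files whose attributes match; the filter shape follows the launch documentation, and the vector store ID and attribute keys here are placeholders:

import openai

client = openai.OpenAI()

# Restrict file search to files tagged with a matching attribute
response = client.responses.create(
    model="gpt-4o",
    tools=[{
        "type": "file_search",
        "vector_store_ids": ["vs_policies"],  # placeholder vector store ID
        "filters": {"type": "eq", "key": "department", "value": "finance"},
    }],
    input="What is the expense approval threshold?",
)
print(response.output_text)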
File Search as a Managed RAG Service
DIY RAG Approach
Building a custom RAG system from scratch gives more control over chunking strategies, retrieval methods, and every aspect of the implementation, but requires more engineering effort and expertise.
OpenAI's Managed Solution
The File Search tool provides an out-of-the-box solution with some customization knobs, handling parsing, chunking, embedding, and making content searchable without requiring deep RAG expertise.
Recommendation
Start with OpenAI's solution to see if it meets your needs, as they'll be adding more customization options over time. If you need complete control over every aspect, consider hand-rolling a solution using other specialized tools.
Real-World File Search Applications
Corporate Knowledge Bases
Companies like Navan use file search to make all their FAQs and travel policies searchable, creating assistants that are naturally aware of internal policies without having to build custom RAG systems.
Legal Research
Law firms can upload case documents and precedents, enabling quick retrieval of relevant legal information without specialized legal search expertise.
Technical Documentation
Engineering teams can make their codebase documentation and technical specs searchable, creating assistants that can answer questions about internal systems and practices.
File Search Pricing
$2.50
Per 1,000 Queries
Cost for searching through your uploaded files using the File Search tool.
$0.10
Per GB Per Day
Storage cost for keeping your files available for search in the OpenAI system.
1 GB
Free Storage
The first gigabyte of storage is provided at no cost, making it accessible for smaller applications.
The pricing structure for File Search is designed to be straightforward and predictable, with costs scaling based on usage. The per-query pricing model allows developers to estimate costs based on expected application traffic, while the storage pricing is based on the volume of data being made searchable.
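A quick back-of-the-envelope calculation for a hypothetical workload makes the model concrete:

# Hypothetical month: 20,000 file search queries, 6 GB of stored files
queries = 20_000
storage_gb = 6

query_cost = queries / 1_000 * 2.50           # $2.50 per 1,000 queries
storage_cost = (storage_gb - 1) * 0.10 * 30   # $0.10/GB/day after the free GB

print(f"Queries: ${query_cost:.2f}, storage: ${storage_cost:.2f}")
# Queries: $50.00, storage: $15.00 -- roughly $65 for the month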
Computer Use Tool: Bringing Operator to the API
Computer-Using-Agent (CUA)
The model that powers Operator is now available in the API as computer-use-preview, bringing the ability to interact with computer interfaces to developers.
State-of-the-Art Performance
The model achieves 38.1% success on OSWorld for full computer use tasks, 58.1% on WebArena, and 87% on WebVoyager for web-based interactions, setting new benchmarks for computer interaction.
Model and Tool Duality
Computer-use-preview functions as both a model and a tool, allowing developers to specify the environment in which the agent will operate and the tasks it should perform.
How Computer Use Works
Screenshot Input
The system takes screenshots of the computer interface as input, analyzing the visual elements to understand the current state of the application or website.
Action Generation
Based on the screenshot analysis, the model determines what action to take next, with outputs almost always being tool calls that specify the next interaction.
Execution
The system performs the specified action, such as clicking, scrolling, or typing, and then captures the updated screen state for the next iteration.
Task Completion
This process repeats until the entire task is completed, with the agent reporting back on the actions taken and the final outcome.
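Schematically, the loop is just perceive, act, repeat. Here is a minimal skeleton with stubs standing in for the real model call and environment I/O (a concrete API example appears in the implementation section later):

# Minimal skeleton of the perceive-act loop; the three stubs stand in
# for real screen capture, action execution, and the model call
def capture_screenshot():
    return b""  # stub: return the current screen as image bytes

def execute(action):
    print("executing:", action)  # stub: perform the click/scroll/typing

def model_step(screenshot):
    return None  # stub: call the model; None means the task is finished

action = model_step(capture_screenshot())
while action is not None:
    execute(action)
    action = model_step(capture_screenshot())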
Computer Use Applications
Task Automation
Automating repetitive tasks that require navigating through user interfaces, such as data entry, form filling, or information retrieval from web applications.
User Assistance
Creating agents that can help users navigate complex software or websites by demonstrating the steps or performing actions on their behalf.
UI Testing
Automatically testing user interfaces by having the agent interact with the application and verify that it behaves as expected across different scenarios.
Computer Use Pricing and Availability
Input Token Cost
Usage is priced at $3 per million input tokens, which includes the screenshots and instructions provided to the model.
Output Token Cost
Output tokens are priced at $12 per million, covering the actions and responses generated by the model during task execution; a worked example follows below.
Availability
Currently only available to users in tiers 3-5 of OpenAI's access program, with plans to expand availability as the technology matures.
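A worked example under assumed token counts (the session size is invented for illustration):

# Hypothetical session: ~400k input tokens of screenshots and
# instructions, ~20k output tokens of actions
input_tokens, output_tokens = 400_000, 20_000

cost = input_tokens / 1e6 * 3 + output_tokens / 1e6 * 12
print(f"${cost:.2f}")  # $1.44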
Computer Use: Early Days of a New Paradigm
The computer use models are exciting. The cool thing about computer use is that we're just so, so early. It's like the GPT-2 of computer use or maybe GPT-1 of computer use right now.
Nikunj Handa from OpenAI emphasizes that computer-using agents are at the very beginning of their development journey. Just as early language models like GPT-1 and GPT-2 were primitive compared to today's capabilities, the current computer use models represent the first steps in what will likely be a rapidly evolving technology.
This early stage presents both limitations and opportunities, as developers can start building with these capabilities while anticipating significant improvements in the coming years.
The Future of Model Specialization
1
Vision Capabilities
Initially introduced as a separate vision preview model based on GPT-4, vision capabilities were eventually integrated into the main GPT-4o model as the technology matured.
2
Current Specialized Models
Models like search-preview and computer-use-preview represent specialized capabilities that are currently offered as separate fine-tunes of the base models.
3
Future Integration
As these specialized capabilities mature and OpenAI learns more about developer use cases, they plan to merge them into the main model line, reducing the need for developers to work with multiple model variants.
Agents SDK: Evolution from Swarm
1
Swarm Origins
OpenAI released Swarm as an experimental SDK for multi-agent orchestration, intended as an educational tool. It gained unexpected popularity, with developers embracing its approach to agent coordination.
2
Community Adoption
The viral reception to Swarm demonstrated strong demand for tools to manage complex agent interactions, prompting OpenAI to develop a more robust, officially supported framework.
3
Agents SDK Launch
Building on Swarm's success, the new Agents SDK provides a more comprehensive framework with added features like typing support, guardrails, and integrated tracing in the OpenAI dashboard.
Agents SDK: Core Components
1
Agents
Easily configurable LLMs with clear instructions and built-in tools, designed to perform specific functions within a larger workflow.
2
Handoffs
Mechanisms to intelligently transfer control between agents, allowing for specialization and division of labor in complex tasks.
3
Guardrails
Configurable safety checks for input and output validation, ensuring that agent interactions remain within appropriate boundaries.
4
Tracing & Observability
Tools to visualize agent execution traces, making it easier to debug and optimize performance by understanding exactly how agents are interacting.
Agents SDK: Technical Improvements
Type Support
First-class Python type support, making it easier to build type-safe applications and catch errors during development rather than at runtime.
Guardrail Patterns
Implementation of the guardrail pattern where validation happens in parallel with execution, potentially blocking actions that violate safety or policy constraints.
API Flexibility
Support for any API provider that follows the Chat Completions API format, not limited to OpenAI's offerings, enabling multi-provider agent orchestration (see the sketch after this list).
Multiple Tracing Providers
While the default is the OpenAI dashboard, the SDK supports multiple tracing providers, with partnerships to be announced for integration with existing observability platforms.
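A sketch of pointing an agent at a third-party, ChatCompletions-compatible endpoint; the base URL, API key, and model name below are placeholders:

from openai import AsyncOpenAI
from agents import Agent, OpenAIChatCompletionsModel, Runner

# Any ChatCompletions-compatible endpoint can back an agent
external_client = AsyncOpenAI(
    base_url="https://example-provider.com/v1",  # placeholder endpoint
    api_key="YOUR_PROVIDER_KEY",                 # placeholder key
)

agent = Agent(
    name="Assistant",
    instructions="You are a helpful assistant.",
    model=OpenAIChatCompletionsModel(
        model="provider-model-name",             # placeholder model
        openai_client=external_client,
    ),
)

result = Runner.run_sync(agent, "Hello!")
print(result.final_output)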
Common Agentic Patterns
1
Workflows
Sequential processing of tasks through multiple specialized agents
2
Handoffs
Transferring control between agents based on task requirements
3
Agents-as-Tools
Using agents as callable tools within other agents' workflows
4
LLM-as-a-Judge
Using models to evaluate outputs and make decisions
5
Parallelization
Running multiple agents simultaneously for efficiency
OpenAI is explicitly designing for these common patterns in multi-agent systems, providing examples and templates in the Agents SDK to help developers implement them effectively. These patterns represent proven approaches to solving complex problems through agent collaboration.
Tracing and Observability
Execution Traces
The Agents SDK provides detailed visualization of how agents execute tasks, showing each step in the process and making it easier to understand what's happening behind the scenes.
Handoff Visualization
When control transfers between agents, the tracing UI shows exactly what information was passed and how the receiving agent interpreted and acted on it.
Tool Call Inspection
Developers can see exactly which tools were called, with what parameters, and what responses were received, providing crucial insights for debugging complex agent behaviors.
Benefits of Multi-Agent Architecture
Logic Separation
Instead of having one agent handle everything, multi-agent systems allow for separation of concerns, making code more maintainable and easier to reason about.
Agent Specialization
Each agent can be optimized for specific tasks, with instructions and tools tailored to their particular role in the overall workflow.
Improved Monitoring
With the new tracing UI, developers can see exactly what happened at each step of a multi-agent workflow, making it much easier to troubleshoot issues.
Triage Pattern Example
1
User Intent Analysis
A triage agent analyzes the user's request to determine what type of task is being requested and which specialized agent would be best suited to handle it.
2
Agent Selection
Based on the analysis, the triage agent selects the appropriate specialized agent from the available pool, considering factors like task domain and required capabilities.
3
Handoff Execution
The triage agent performs a handoff to the selected agent, passing along relevant context and instructions for completing the specific task.
4
Task Completion
The specialized agent completes the task using its specific tools and knowledge, potentially handing back to the triage agent or directly to the user upon completion.
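A minimal version of this pattern in the Agents SDK (the agent names and instructions are invented for illustration):

from agents import Agent, Runner

# Two specialists plus a router that hands off to the right one
booking_agent = Agent(
    name="Booking agent",
    instructions="Help the user book or change travel reservations.",
)
refund_agent = Agent(
    name="Refund agent",
    instructions="Help the user process refunds for past purchases.",
)

triage_agent = Agent(
    name="Triage agent",
    instructions="Determine what the user needs and hand off to the right specialist.",
    handoffs=[booking_agent, refund_agent],
)

result = Runner.run_sync(triage_agent, "I was double-charged for my hotel last week.")
print(result.final_output)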
Future Integration with Fine-Tuning
Trace Collection
As agents operate, their execution traces are collected and stored, providing a rich dataset of agent behaviors and outcomes that can be used for improvement.
Evaluation Generation
The traces can be used to generate evaluations (evals) that measure agent performance on specific tasks, identifying areas where improvements are needed.
Grader Development
Specialized grader models can be created to assess agent outputs and provide feedback on quality and correctness, creating a foundation for reinforcement learning.
Reinforcement Fine-Tuning
With evaluations and graders in place, reinforcement learning techniques can be applied to fine-tune agents for improved performance on their specific tasks.
The Year of the Agent
We said 2025 was the year of agents. So there you have it, like a lot of new tools to build these agents for developers.
OpenAI is clearly positioning 2025 as the "Year of the Agent," and their recent releases support this vision. In just the first two months of the year, they've released Operator and Deep Research, and now they're bringing these capabilities to developers through the API.
The new Agents Platform represents a significant step toward democratizing agent development, making it easier for developers to create sophisticated AI systems that can perform complex tasks through multi-step reasoning and tool use.
OpenAI's Agent Development Timeline
1
Operator Release
Early 2025 saw the launch of Operator, a computer-using agent capable of interacting with user interfaces to complete tasks on behalf of users.
2
Deep Research
OpenAI released Deep Research, arguably the most successful agent archetype so far, capable of conducting thorough research on complex topics.
3
Agents Platform
The latest release brings agent capabilities to developers through the API, with tools for web search, computer use, file search, and multi-agent orchestration.
Responses API: Technical Implementation
import openai

client = openai.OpenAI()

# Create a response with web search enabled
response = client.responses.create(
    model="gpt-4o",
    tools=[{"type": "web_search_preview"}],
    input="What are the latest developments in quantum computing?",
)

# Print the model's answer; citations to web sources are included
# as annotations on the output
print(response.output_text)
This example sketches how to create a basic response using the new Responses API with the web search tool enabled (the snippets in these implementation sections follow the shapes in OpenAI's launch documentation; check the API reference for current details). Instead of the Chat Completions messages array, the API accepts a simple input, and built-in tools extend the model's capabilities.
The stateful nature of the API means that conversation history is automatically maintained, making it easy to build multi-turn interactions without having to manage state yourself.
Web Search Tool: Technical Implementation
import openai

client = openai.OpenAI()

# Option 1: web search as a built-in tool in the Responses API
response = client.responses.create(
    model="gpt-4o",
    tools=[{"type": "web_search_preview"}],
    input="What were the key announcements at the last Apple event?",
)

# Option 2: the dedicated search model in Chat Completions
chat_completion = client.chat.completions.create(
    model="gpt-4o-search-preview",
    messages=[
        {
            "role": "user",
            "content": "What were the key announcements at the last Apple event?",
        }
    ],
)
This example demonstrates the two ways to access web search capabilities: as a tool in the Responses API or by using the dedicated search model in the Chat Completions API. Both approaches provide access to real-time information, but the tool-based approach in Responses offers more flexibility for combining with other capabilities.
File Search Tool: Technical Implementation
import openai

client = openai.OpenAI()

# Upload a file and add it to a vector store
# (client.beta.vector_stores in older SDK versions)
file = client.files.create(
    file=open("company_policies.pdf", "rb"),
    purpose="assistants",
)
vector_store = client.vector_stores.create(name="company-policies")
client.vector_stores.files.create(
    vector_store_id=vector_store.id,
    file_id=file.id,
)

# Reference the vector store in the file search tool
response = client.responses.create(
    model="gpt-4o",
    tools=[{
        "type": "file_search",
        "vector_store_ids": [vector_store.id],
    }],
    input="What is our company's travel reimbursement policy?",
)

# Print the response with information from the file
print(response.output_text)
This example shows how to use the File Search tool to make documents searchable and accessible to the model. After uploading a file and adding it to a vector store, you reference the store's ID in the file_search tool configuration, allowing the model to retrieve and use information from the document when responding to queries.
Computer Use Tool: Technical Implementation
import openai

client = openai.OpenAI()

# Ask the computer-use model to operate in a browser environment
response = client.responses.create(
    model="computer-use-preview",
    tools=[{
        "type": "computer_use_preview",
        "display_width": 1024,
        "display_height": 768,
        "environment": "browser",
    }],
    input="Find the contact information on example.com",
    truncation="auto",  # required for this model
)

# The output contains computer_call items describing actions to take
# (click, scroll, type, ...). Your code executes each action, captures
# a fresh screenshot, and returns it as a computer_call_output on the
# next request, repeating until no further actions are requested.
for item in response.output:
    if item.type == "computer_call":
        print(item.action)
This example sketches a single round trip with the Computer Use tool. Unlike the other tools, computer use is a loop: the model analyzes the latest screenshot and returns actions such as clicking on specific elements or typing text; your code performs each action in the target environment, captures the updated screen, and sends it back until the task is complete.
Agents SDK: Basic Implementation
from agents import Agent, Runner, WebSearchTool

# Define a summarization agent
summary_agent = Agent(
    name="Summarizer",
    instructions="You are a summarization agent. Create concise summaries of research findings.",
)

# Define a research agent that can hand off to the summarizer
research_agent = Agent(
    name="Researcher",
    instructions="You are a research agent. Find information on the given topic, then hand off for summarization.",
    tools=[WebSearchTool()],
    handoffs=[summary_agent],
)

# Execute the workflow, starting with the research agent
result = Runner.run_sync(research_agent, "What are the latest advancements in fusion energy?")
print(result.final_output)
This example shows a basic implementation of the Agents SDK (installed with pip install openai-agents and imported as agents), wiring up two specialized agents: one for research and one for summarization. The run starts with the research agent, which gathers information using the web search tool and then hands off to the summarization agent to produce a concise summary of the findings.
Agents SDK: Guardrails Implementation
from pydantic import BaseModel
from agents import Agent, GuardrailFunctionOutput, Runner, input_guardrail

# Structured verdict returned by the guardrail's checker agent
class SafetyVerdict(BaseModel):
    is_appropriate: bool
    reasoning: str

safety_checker = Agent(
    name="Safety checker",
    instructions="Judge whether the request is respectful and appropriate for all audiences.",
    output_type=SafetyVerdict,
)

# The guardrail runs alongside the main agent and can trip a wire
@input_guardrail
async def content_safety(ctx, agent, user_input):
    result = await Runner.run(safety_checker, user_input, context=ctx.context)
    return GuardrailFunctionOutput(
        output_info=result.final_output,
        tripwire_triggered=not result.final_output.is_appropriate,
    )

content_agent = Agent(
    name="Content creator",
    instructions="Create engaging content on the given topic.",
    input_guardrails=[content_safety],
)

# The run is halted if the guardrail trips
result = Runner.run_sync(content_agent, "Write a short article about controversial political topics")
print(result.final_output)
This example demonstrates how to implement an input guardrail in the Agents SDK. A lightweight checker agent evaluates the request against the safety criteria in parallel with the main run; if the criteria are violated, the guardrail's tripwire fires and the run is halted before content is returned, keeping outputs within safety and policy requirements.
Agents SDK: Tracing Implementation
from agents import Agent, Runner, trace

# Define agents
planner_agent = Agent(
    name="Planner",
    instructions="Create a plan for completing the task.",
)
executor_agent = Agent(
    name="Executor",
    instructions="Execute the plan you are given.",
)

# Tracing is on by default; trace() groups multiple runs
# under a single named workflow in the OpenAI dashboard
with trace("Team-building event workflow"):
    plan = Runner.run_sync(planner_agent, "Organize a virtual team-building event")
    result = Runner.run_sync(executor_agent, plan.final_output)

print(result.final_output)
This example shows tracing in the Agents SDK. Tracing to the OpenAI dashboard is enabled by default, and the trace() context manager groups multiple runs into a single named workflow. The trace records agent interactions, tool calls, and handoffs, which is crucial for understanding complex multi-agent workflows and identifying areas for improvement.
Agents SDK: Parallel Execution
import asyncio
from agents import Agent, Runner, WebSearchTool

# Define multiple research agents for different sources
news_agent = Agent(
    name="News researcher",
    instructions="Research recent news articles on the topic.",
    tools=[WebSearchTool()],
)
academic_agent = Agent(
    name="Academic researcher",
    instructions="Research academic papers on the topic.",
    tools=[WebSearchTool()],
)
social_agent = Agent(
    name="Social researcher",
    instructions="Research social media discussions on the topic.",
    tools=[WebSearchTool()],
)

async def main():
    topic = "Impact of artificial intelligence on job markets"
    agents = [news_agent, academic_agent, social_agent]

    # Run all agents concurrently
    results = await asyncio.gather(*(Runner.run(agent, topic) for agent in agents))

    for agent, result in zip(agents, results):
        print(f"Findings from {agent.name}:")
        print(result.final_output)

asyncio.run(main())
This example demonstrates parallel execution in the Agents SDK, where multiple agents work simultaneously on different aspects of the same task. This approach can significantly reduce the time required for complex tasks by distributing the workload across specialized agents, each focusing on a specific data source or perspective.
Agents SDK: LLM-as-a-Judge Pattern
from agents import Agent, Runner

# Content creation agents with different styles
formal_agent = Agent(
    name="Formal writer",
    instructions="Write in a formal, professional style.",
)
casual_agent = Agent(
    name="Casual writer",
    instructions="Write in a casual, conversational style.",
)

# A judge agent that picks the better draft for the audience
judge_agent = Agent(
    name="Style judge",
    instructions=(
        "You are given a task and two drafts. Considering tone, formality, "
        "and the intended audience, return the draft that fits best."
    ),
)

task = "Write an email about our new product launch. The audience is corporate executives."
draft_a = Runner.run_sync(formal_agent, task)
draft_b = Runner.run_sync(casual_agent, task)

# The judge selects the formal draft for this audience
result = Runner.run_sync(
    judge_agent,
    f"Task: {task}\n\nDraft A:\n{draft_a.final_output}\n\nDraft B:\n{draft_b.final_output}",
)
print(result.final_output)
This example illustrates the LLM-as-a-Judge pattern, where a specialized agent evaluates outputs from multiple other agents and selects the best one based on specified criteria. This pattern is useful for situations where different approaches might be valid, and the optimal choice depends on context or user preferences.
Agents SDK: Agents-as-Tools Pattern
from agents import Agent, Runner, WebSearchTool

# Define specialized agents
translation_agent = Agent(
    name="Translator",
    instructions="Translate text between languages accurately.",
)
summarization_agent = Agent(
    name="Summarizer",
    instructions="Create concise summaries of longer texts.",
)

# Expose them as tools for a coordinator agent
coordinator_agent = Agent(
    name="Content processor",
    instructions="Process content according to the user's instructions.",
    tools=[
        WebSearchTool(),
        translation_agent.as_tool(
            tool_name="translate",
            tool_description="Translate text to a specified language",
        ),
        summarization_agent.as_tool(
            tool_name="summarize",
            tool_description="Create a summary of the provided text",
        ),
    ],
)

result = Runner.run_sync(
    coordinator_agent,
    "Find a news article about climate change, summarize it, and translate the summary to Spanish.",
)
print(result.final_output)
This example demonstrates the Agents-as-Tools pattern, where specialized agents are exposed as tools that can be invoked by other agents. This creates a hierarchical structure where a coordinator agent can delegate specific tasks to specialized agents, combining their capabilities to solve complex problems.
Real-World Applications of the Agents Platform
Research Assistants
Companies can build research assistants that combine web search, file search, and specialized agents to conduct comprehensive research on complex topics, with proper citation and source tracking.
Customer Service
Multi-agent systems can handle customer inquiries by searching knowledge bases, accessing web information, and even operating internal tools through the computer use capability to resolve issues efficiently.
Data Analysis
Organizations can create systems that gather data from various sources, process it through specialized analysis agents, and generate insights with proper attribution and explanation of methodologies.
Case Study: Hebbia
Company Profile
Hebbia is a company that provides AI solutions for credit firms and law firms, helping them process and analyze large volumes of information efficiently.
Implementation
They've used the web search tool to combine public information from the internet with live sources and citations, creating a comprehensive information retrieval system.
Value Proposition
For their clients, having access to both public information and internal documents with proper citation is crucial for making informed decisions and ensuring compliance with regulations.
Hebbia's implementation demonstrates how the OpenAI Agents Platform can be used to create specialized solutions for industries with complex information needs, combining different knowledge sources while maintaining traceability and attribution.
Case Study: Navan
1
Company Profile
Navan provides solutions that help organizations manage and access their internal knowledge more effectively, improving employee productivity and decision-making.
2
Implementation
They've used the file search tool to make company FAQs and travel policies searchable, creating assistants that are naturally aware of internal policies without requiring custom RAG system development.
3
Value Proposition
Their solution allows employees to quickly find and understand company policies without having to search through multiple documents or systems, saving time and ensuring consistent policy application.
Challenges in Agent Orchestration
Growing Complexity
As agent systems become more sophisticated, the complexity of orchestrating their interactions grows exponentially, making it difficult to reason about system behavior without proper tools.
Observability Gaps
Without proper tracing and monitoring, it can be nearly impossible to understand why an agent made a particular decision or where a workflow went wrong.
Safety Concerns
As agents gain more capabilities, ensuring they operate within appropriate boundaries becomes increasingly important, requiring robust guardrail mechanisms.
How the Agents SDK Addresses Challenges
1
Structured Workflows
The SDK provides clear patterns and structures for defining agent interactions, making it easier to reason about complex systems and maintain them over time.
2
Integrated Tracing
Built-in tracing capabilities provide visibility into every step of agent execution, making it possible to identify and fix issues quickly.
3
Configurable Guardrails
The guardrail system allows developers to define clear boundaries for agent behavior, ensuring that outputs meet safety and policy requirements.
Learning from Swarm's Success
Swarm just came to life out of learning from customers directly that orchestrating agents in production was pretty hard. Simple ideas could quickly turn very complex. Like what are those guardrails? What are those handoffs, et cetera? So that came out of learning from customers. And it was initially shipped as a low-key experiment, I'd say. But we were kind of taken by surprise at how much momentum there was around this concept.
Romain from OpenAI explains that Swarm was created in response to customer feedback about the challenges of orchestrating agents in production environments. What started as a low-key experiment gained unexpected momentum, revealing a strong demand for better tools to manage complex agent interactions.
The Evolution to Agents SDK
1
Customer Feedback
OpenAI learned from customers that orchestrating agents in production was challenging, with simple ideas quickly becoming complex when implemented.
2
Swarm Experiment
The team released Swarm as a low-key experiment to address these challenges, focusing on educational value and experimental approaches to agent orchestration.
3
Unexpected Adoption
The community's enthusiastic adoption of Swarm surprised OpenAI, demonstrating the strong need for better agent orchestration tools.
4
Agents SDK Development
Based on this feedback and adoption, OpenAI decided to embrace agent orchestration as a core primitive of their platform, leading to the development of the more robust Agents SDK.
Separating Agent Logic
Monolithic Approach
Traditional approaches often use a single agent with many tool calls, which can be hard to monitor and reason about as complexity increases.
Separated Logic
The Agents SDK encourages separating logic into specialized agents, making the system more modular and easier to understand and maintain.
Triage Pattern
A common pattern involves using a triage agent that analyzes user intent and routes requests to appropriate specialized agents, creating a clear separation of responsibilities.
The ability to separate logic across multiple specialized agents is one of the key benefits of the Agents SDK. This approach makes complex systems more manageable by breaking them down into smaller, more focused components that can be developed, tested, and maintained independently.
Future of Agent Development
1
Single-Agent Tools
Basic tool use capabilities
2
Multi-Agent Orchestration
Complex workflows with handoffs
3
Specialized Agent Ecosystems
Libraries of purpose-built agents
4
Autonomous Systems
Self-improving agent networks
The evolution of agent development is moving from simple single-agent tools toward increasingly sophisticated multi-agent systems. The OpenAI Agents Platform represents a significant step in this progression, providing the foundation for building complex agent ecosystems that can tackle increasingly challenging tasks through collaboration and specialization.
Integration with the Broader AI Ecosystem
API Compatibility
The Agents SDK supports any API provider that follows the ChatCompletions API format, enabling integration with a diverse ecosystem of AI models and services.
Observability Partnerships
OpenAI plans to announce partnerships with existing observability platforms, allowing the Agents SDK to integrate with established monitoring and debugging tools.
Community Extensions
The open-source nature of the SDK encourages community contributions and extensions, potentially creating a rich ecosystem of agent patterns and tools.
Resources for Getting Started
OpenAI has provided comprehensive documentation and resources to help developers get started with the new Agents Platform. The documentation includes detailed guides, API references, and examples for each component, while the GitHub repository for the Agents SDK contains additional examples and patterns for common agent workflows.
Key Takeaways
1
Unified Agent Platform
OpenAI has created a comprehensive platform for building agent-based applications, bringing together tools for web search, computer use, file search, and multi-agent orchestration.
2
API Evolution
The new Responses API represents an evolution of OpenAI's API strategy, combining the best aspects of chat completions and assistants into a more flexible foundation for agentic applications.
3
Multi-Agent Paradigm
The Agents SDK establishes multi-agent workflows as a core pattern for AI development, providing tools to manage the complexity of agent interactions through handoffs, guardrails, and observability.
4
Year of the Agent
These releases reinforce OpenAI's vision of 2025 as the "Year of the Agent," with a clear focus on enabling developers to build increasingly sophisticated agent-based applications.
Next Steps for Developers
Explore the Documentation
Start by reviewing the comprehensive documentation for each component of the Agents Platform, understanding the capabilities and integration options available.
Experiment with Simple Use Cases
Begin with straightforward implementations of each tool, such as basic web search or file search, to understand how they work and what they can do.
Implement Multi-Agent Patterns
Explore the common agentic patterns provided in the Agents SDK examples, experimenting with workflows, handoffs, and guardrails in your applications.
Leverage Tracing for Optimization
Use the tracing capabilities to understand how your agents are performing and identify opportunities for improvement in your agent workflows.
Community Resources
OpenAI DevDay
OpenAI's developer events provide opportunities to learn directly from the team and connect with other developers building with the platform.
GitHub Community
The open-source Agents SDK repository on GitHub serves as a hub for community contributions, discussions, and sharing of best practices.
Developer Forum
OpenAI's developer forum provides a space for asking questions, sharing experiences, and connecting with other developers working on similar challenges.
Conclusion: The Future of AI Agents
1
Increased Capabilities
Agent capabilities will continue to expand, with improvements in reasoning, planning, and execution across diverse domains
2
Deeper Integration
Agents will become more deeply integrated with existing systems and workflows, enhancing productivity across industries
3
Specialized Ecosystems
Domain-specific agent ecosystems will emerge, with specialized agents for fields like healthcare, finance, and education
4
Collaborative Intelligence
Multi-agent systems will enable new forms of human-AI collaboration, tackling increasingly complex problems together
OpenAI's Agents Platform represents a significant step toward a future where AI agents can handle increasingly complex tasks through collaboration, specialization, and tool use. As these capabilities continue to evolve, we can expect to see transformative applications across industries, changing how we work and interact with technology.