This guide helps new contributors get started with the OpenAI Agents Python repository. It covers repo structure, how to test your work, available utilities, and guidelines for commits and PRs.
Location: AGENTS.md at the repository root.
Run $code-change-verification before marking work complete when changes affect runtime code, tests, or build/test behavior.
Run it when you change:
src/agents/(library code) or shared utilities.tests/or add or modify snapshot tests.examples/.- Build or test configuration such as
pyproject.toml,Makefile,mkdocs.yml,docs/scripts/, or CI workflows.
You can skip $code-change-verification for docs-only or repo-meta changes (for example, docs/, .agents/, README.md, AGENTS.md, .github/), unless a user explicitly asks to run the full verification stack.
When working on OpenAI API or OpenAI platform integrations in this repo (Responses API, tools, streaming, Realtime API, auth, models, rate limits, MCP, Agents SDK or ChatGPT Apps SDK), use $openai-knowledge to pull authoritative docs via the OpenAI Developer Docs MCP server (and guide setup if it is not configured).
Before changing runtime code, exported APIs, external configuration, persisted schemas, wire protocols, or other user-facing behavior, use $implementation-strategy to decide the compatibility boundary and implementation shape. Judge breaking changes against the latest release tag, not unreleased branch-local churn. Interfaces introduced or changed after the latest release tag may be rewritten without compatibility shims unless they define durable external state or the user explicitly asks for a migration path.
Call out compatibility risk early in your plan only when the change affects behavior shipped in the latest release tag or durable external state, and confirm the approach before implementing changes that could impact users.
Use an ExecPlan when work is multi-step, spans several files, involves new features or refactors, or is likely to take more than about an hour. Start with the template and rules in PLANS.md, keep milestones and living sections (Progress, Surprises & Discoveries, Decision Log, Outcomes & Retrospective) up to date as you execute, and rewrite the plan if scope shifts. Call out compatibility risk only when the plan changes behavior shipped in the latest release tag or durable external state. Do not treat branch-local interface churn or unreleased post-tag changes on main as breaking by default; prefer direct replacement over compatibility layers in those cases. If you intentionally skip an ExecPlan for a complex task, note why in your response so reviewers understand the choice.
Treat the parameter and dataclass field order of exported runtime APIs as a compatibility contract.
- For public constructors (for example
RunConfig,FunctionTool,AgentHookContext), preserve existing positional argument meaning. Do not insert new constructor parameters or dataclass fields in the middle of existing public order. - When adding a new optional public field/parameter, append it to the end whenever possible and keep old fields in the same order.
- If reordering is unavoidable, add an explicit compatibility layer and regression tests that exercise the old positional call pattern.
- Prefer keyword arguments at call sites to reduce accidental breakage, but do not rely on this to justify breaking positional compatibility for public APIs.
The OpenAI Agents Python repository provides the Python Agents SDK, examples, and documentation built with MkDocs. Use uv run python ... for Python commands to ensure a consistent environment.
src/agents/: Core library implementation.tests/: Test suite; seetests/README.mdfor snapshot guidance.examples/: Sample projects showing SDK usage.docs/: MkDocs documentation source; do not edit translated docs underdocs/ja,docs/ko, ordocs/zh(they are generated).docs/scripts/: Documentation utilities, including translation and reference generation.mkdocs.yml: Documentation site configuration.Makefile: Common developer commands.pyproject.toml,uv.lock: Python dependencies and tool configuration..github/PULL_REQUEST_TEMPLATE/pull_request_template.md: Pull request template to use when opening PRs.site/: Built documentation output.
src/agents/run.pyis the runtime entrypoint (Runner,AgentRunner). Keep it focused on orchestration and public flow control. Put new runtime logic undersrc/agents/run_internal/and import it intorun.py.- When
run.pygrows, refactor helpers intorun_internal/modules (for examplerun_loop.py,turn_resolution.py,tool_execution.py,session_persistence.py) and leave only wiring and composition inrun.py. - Keep streaming and non-streaming paths behaviorally aligned. Changes to
run_internal/run_loop.py(run_single_turn,run_single_turn_streamed,get_new_response,start_streaming) should be mirrored, and any new streaming item types must be reflected insrc/agents/stream_events.py. - Input guardrails run only on the first turn and only for the starting agent. Resuming an interruption from
RunStatemust not increment the turn counter; only actual model calls advance turns. - Server-managed conversation (
conversation_id,previous_response_id,auto_previous_response_id) usesOpenAIServerConversationTrackerinrun_internal/oai_conversation.py. Only deltas should be sent. Ifcall_model_input_filteris used, it must returnModelInputDatawith a list input and the tracker must be updated with the filtered input (mark_input_as_sent). Session persistence is disabled when server-managed conversation is active. - Adding new tool/output/approval item types requires coordinated updates across:
src/agents/items.py(RunItem types and conversions)src/agents/run_internal/run_steps.py(ProcessedResponse and tool run structs)src/agents/run_internal/turn_resolution.py(model output processing, run item extraction)src/agents/run_internal/tool_execution.pyandsrc/agents/run_internal/tool_planning.pysrc/agents/run_internal/items.py(normalization, dedupe, approval filtering)src/agents/stream_events.py(stream event names)src/agents/run_state.py(RunState serialization/deserialization)src/agents/run_internal/session_persistence.py(session save/rewind)
- If the serialized RunState shape changes, bump
CURRENT_SCHEMA_VERSIONinsrc/agents/run_state.pyand update serialization/deserialization accordingly.
- Python 3.10+.
uvinstalled for dependency management (uv sync) anduv runfor Python commands.makeavailable to run repository tasks.
- Sync with
mainand create a feature branch:git checkout -b feat/<short-description>
- If dependencies changed or you are setting up the repo, run
make sync. - Implement changes and add or update tests alongside code updates.
- Highlight compatibility or API risks in your plan before implementing changes that alter the latest released behavior or durable external state.
- Build docs when you touch documentation:
make build-docs
- When
$code-change-verificationapplies, run it to execute the full verification stack before marking work complete. - Commit with concise, imperative messages; keep commits small and focused, then open a pull request.
- When reporting code changes as complete (after substantial code work), invoke
$pr-draft-summaryto generate the required PR summary block with change summary, PR title, and draft description.
Before submitting changes, ensure relevant checks pass and extend tests when you touch code.
When $code-change-verification applies, run it to execute the required verification stack from the repository root. Rerun the full stack after applying fixes.
- Run the full test suite:
make tests
- Run a focused test:
uv run pytest -s -k <pattern>
- Type checking:
make mypy
Some tests rely on inline snapshots; see tests/README.md for details. Re-run make tests after updating snapshots.
- Fix snapshots:
make snapshots-fix
- Create new snapshots:
make snapshots-create
- Generate coverage (fails if coverage drops below threshold):
make coverage
- Formatting and linting use
ruff; runmake format(applies fixes) andmake lint(checks only). - Type hints must pass
make mypy. - Write comments as full sentences ending with a period.
- Imports are managed by Ruff and should stay sorted.
When $code-change-verification applies, run the full sequence in order (or use the skill scripts):
make format
make lint
make mypy
make tests- Install or refresh development dependencies:
make sync
- Run tests against the oldest supported version (Python 3.10) in an isolated environment:
UV_PROJECT_ENVIRONMENT=.venv_310 uv sync --python 3.10 --all-extras --all-packages --group dev UV_PROJECT_ENVIRONMENT=.venv_310 uv run --python 3.10 -m pytest
- Documentation workflows:
make build-docs # build docs after editing docs make serve-docs # preview docs locally make build-full-docs # run translations and build
- Snapshot helpers:
make snapshots-fix make snapshots-create
- Use
examples/to see common SDK usage patterns. - Review
Makefilefor common commands and useuv runfor Python invocations. - Explore
docs/anddocs/scripts/to understand the documentation pipeline. - Consult
tests/README.mdfor test and snapshot workflows. - Check
mkdocs.ymlto understand how docs are organized.
- Use the template at
.github/PULL_REQUEST_TEMPLATE/pull_request_template.md; include a summary, test plan, and issue number if applicable. - Add tests for new behavior when feasible and update documentation for user-facing changes.
- Run
make format,make lint,make mypy, andmake testsbefore marking work ready. - Commit messages should be concise and written in the imperative mood. Small, focused commits are preferred.
- ✅ Checks pass (
make format,make lint,make mypy,make tests). - ✅ Tests cover new behavior and edge cases.
- ✅ Code is readable, maintainable, and consistent with existing style.
- ✅ Public APIs and user-facing behavior changes are documented.
- ✅ Examples are updated if behavior changes.
- ✅ History is clean with a clear PR description.