Commit e0b243d

Add two more agent skills (#2304)
1 parent c06a8bb commit e0b243d

4 files changed

Lines changed: 152 additions & 0 deletions

Lines changed: 53 additions & 0 deletions
@@ -0,0 +1,53 @@
---
name: final-release-review
description: Perform a release-readiness review by locating the previous release tag from remote tags and auditing the diff (e.g., v1.2.3...main) for breaking changes, regressions, improvement opportunities, and risks before releasing openai-agents-python.
---

# Final Release Review

## Purpose

Use this skill when validating main for release. It guides you to fetch remote tags, pick the previous release tag, and thoroughly inspect the `BASE_TAG...TARGET` diff for breaking changes, introduced bugs/regressions, improvement opportunities, and release risks.

## Quick start

1. Ensure you are at the repository root: `pwd` should print `path-to-workspace/openai-agents-python`.
2. Sync tags and pick base (default `v*`):
   ```bash
   BASE_TAG="$(.codex/skills/final-release-review/scripts/find_latest_release_tag.sh origin 'v*')"
   ```
3. Choose target (default `main`, ensure fresh): `git fetch origin main --prune`, then `TARGET="main"`.
4. Snapshot scope:
   ```bash
   git diff --stat "${BASE_TAG}"..."${TARGET}"
   git diff --dirstat=files,0 "${BASE_TAG}"..."${TARGET}"
   git log --oneline --reverse "${BASE_TAG}".."${TARGET}"
   git diff --name-status "${BASE_TAG}"..."${TARGET}"
   ```
5. Deep review using `references/review-checklist.md` to spot breaking changes, regressions, and improvement opportunities.
6. Capture findings and make the release gate call: ship, or block with conditions; propose focused tests for risky areas.

## Workflow

- **Prepare**
  - Run the quick-start tag command to ensure you use the latest remote tag. If the tag pattern differs, override the pattern argument (e.g., `'*.*.*'`).
  - If the user specifies a base tag, prefer it but still fetch remote tags first.
  - Keep the working tree clean to avoid diff noise.
- **Map the diff**
  - Use `--stat`, `--dirstat`, and `--name-status` outputs to spot hot directories and file types.
  - For suspicious files, prefer `git diff --word-diff BASE...TARGET -- <path>`.
  - Note any deleted or newly added tests, config (for example `pyproject.toml`, `uv.lock`, `mkdocs.yml`), migrations, or scripts.
- **Analyze risk**
  - Walk through the categories in `references/review-checklist.md` (breaking changes, regression clues, improvement opportunities).
  - When you suspect a risk, cite the specific file/commit and explain the behavioral impact.
  - Suggest minimal, high-signal validation commands (targeted tests or linters) instead of generic reruns when time is tight.
- **Form a recommendation**
  - State BASE_TAG and TARGET explicitly.
  - Provide a concise diff summary (key directories/files and counts).
  - List: breaking-change candidates, probable regressions/bugs, improvement opportunities, missing release notes/migrations.
  - Recommend ship/block and the exact checks needed to unblock if blocking.

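The "Map the diff" step can be partly automated. The following is a minimal sketch, not part of this skill, that groups `git diff --name-status` output by top-level directory and status letter. The function name `summarize_name_status` and the sample input are hypothetical; the only assumption about git is its documented tab-separated output, where renames and copies carry a score (for example `R100`) followed by the old and new paths.

```python
from collections import Counter
from pathlib import PurePosixPath


def summarize_name_status(output: str) -> dict[str, Counter]:
    """Count status letters (A, M, D, R, ...) per top-level directory."""
    summary: dict[str, Counter] = {}
    for line in output.splitlines():
        if not line.strip():
            continue
        fields = line.split("\t")
        status = fields[0][0]  # 'R100' -> 'R'
        path = fields[-1]  # for renames/copies the last field is the new path
        parts = PurePosixPath(path).parts
        top = parts[0] if len(parts) > 1 else "."  # '.' means repository root
        summary.setdefault(top, Counter())[status] += 1
    return summary


# Hypothetical sample; in practice pass the captured command output.
sample = (
    "M\tsrc/agents/run.py\n"
    "D\ttests/test_old.py\n"
    "A\ttests/test_new.py\n"
    "R100\tdocs/a.md\tdocs/b.md\n"
    "M\tpyproject.toml\n"
)

for top, counts in summarize_name_status(sample).items():
    print(top, dict(counts))
```

Deleted entries under `tests/` and modified root-level config files are the kind of lines this step asks you to note.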
## Resources

- `scripts/find_latest_release_tag.sh`: Fetches remote tags and returns the newest tag matching a pattern (default `v*`).
- `references/review-checklist.md`: Detailed signals and commands for spotting breaking changes, regressions, and release polish gaps.
Lines changed: 40 additions & 0 deletions
@@ -0,0 +1,40 @@
# Release Diff Review Checklist

## Quick commands

- Sync tags: `git fetch origin --tags --prune`.
- Identify latest release tag (default pattern `v*`): `git tag -l 'v*' --sort=-v:refname | head -n1` or use `.codex/skills/final-release-review/scripts/find_latest_release_tag.sh`.
- Generate overview: `git diff --stat BASE...TARGET`, `git diff --dirstat=files,0 BASE...TARGET`, `git log --oneline --reverse BASE..TARGET`.
- Inspect risky files quickly: `git diff --name-status BASE...TARGET`, `git diff --word-diff BASE...TARGET -- <path>`.

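To see why a version-aware sort matters when picking the latest tag, here is a small sketch of the idea behind `--sort=-v:refname`. It is an approximation, not git's implementation: the helper `latest_release_tag` is hypothetical, it only understands purely numeric tags such as `v0.10.2`, and it skips anything else (including pre-release suffixes, which git orders by its own rules).

```python
import re
from typing import Optional

# Assumption: release tags look like 'v1.2.3' (optional 'v', dot-separated integers).
_VERSION = re.compile(r"^v?(\d+(?:\.\d+)*)$")


def latest_release_tag(tags: list[str]) -> Optional[str]:
    """Return the tag with the highest numeric version, or None if none match."""
    candidates = []
    for tag in tags:
        match = _VERSION.match(tag)
        if match:
            version = tuple(int(part) for part in match.group(1).split("."))
            candidates.append((version, tag))
    return max(candidates)[1] if candidates else None


tags = ["v0.9.0", "v0.10.2", "v0.10.10", "nightly"]
print(latest_release_tag(tags))
print(sorted(tags)[-1])  # plain string sorting picks a different, wrong tag
```

Comparing integer tuples places `v0.10.10` above `v0.9.0`, which a lexicographic sort would not.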
## Breaking change signals

- Public API surface: removed/renamed modules, classes, functions, or re-exports; changed parameters/return types, default values changed, new required options, stricter validation.
- Protocol/schema: request/response fields added/removed/renamed, enum changes, JSON shape changes, ID formats, pagination defaults.
- Config/CLI/env: renamed flags, default behavior flips, removed fallbacks, environment variable changes, logging levels tightened.
- Dependencies/platform: Python version requirement changes, dependency major bumps, `pyproject.toml`/`uv.lock` changes, removed or renamed extras.
- Persistence/data: migration scripts missing, data model changes, stored file formats, cache keys altered without invalidation.
- Docs/examples drift: examples still reflect old behavior or lack migration note.

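One way to check the "public API surface" signal is to compare the top-level names a module defines at the base tag and at the target. The sketch below uses the standard `ast` module; the helpers `public_names` and `removed_public_names` are hypothetical and deliberately narrow: they look only at top-level functions and classes, treat a leading underscore as private, and ignore re-exports, `__all__`, and signature changes. The two source strings could come from `git show BASE_TAG:<path>` and `git show TARGET:<path>`.

```python
import ast


def public_names(source: str) -> set[str]:
    """Collect top-level function and class names that do not start with '_'."""
    names = set()
    for node in ast.parse(source).body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            if not node.name.startswith("_"):
                names.add(node.name)
    return names


def removed_public_names(old_source: str, new_source: str) -> set[str]:
    """Names present in the old module but missing from the new one."""
    return public_names(old_source) - public_names(new_source)


# Hypothetical before/after versions of one module.
old = "def run(): ...\nclass Agent: ...\ndef _helper(): ...\n"
new = "def run_sync(): ...\nclass Agent: ...\n"
print(removed_public_names(old, new))
```

A non-empty result is only a candidate: a rename with a compatibility alias elsewhere would still show up here, so confirm against the actual exports before calling it breaking.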
## Regression risk clues

- Large refactors with light test deltas or deleted tests; new `skip`/`todo` markers.
- Concurrency/timing: new async flows, asyncio event-loop changes, retries, timeouts, debounce/caching changes, race-prone patterns.
- Error handling: catch blocks removed, swallowed errors, broader catch-all added without logging, stricter throws without caller updates.
- Stateful components: mutable shared state, global singletons, lifecycle changes (init/teardown), resource cleanup removal.
- Third-party changes: swapped core libraries, feature flags toggled, observability removed or gated.

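The "new `skip`/`todo` markers" clue can be spotted mechanically by scanning only the added lines of a unified diff. This is a rough sketch under stated assumptions: `added_skip_markers` is a hypothetical helper, the regular expression covers a few common pytest spellings plus a bare `TODO`, and the input is plain `git diff` output with the default `a/` and `b/` path prefixes.

```python
import re

# Assumption: these spellings are the markers worth flagging; extend as needed.
SKIP_PATTERN = re.compile(r"pytest\.mark\.(skip|skipif|xfail)|pytest\.skip\(|\bTODO\b")


def added_skip_markers(diff_text: str) -> list[tuple[str, str]]:
    """Return (file, added line) pairs where an added line contains a marker."""
    current_file = ""
    hits = []
    for line in diff_text.splitlines():
        if line.startswith("+++ "):
            # File header for the new side of the diff, e.g. '+++ b/tests/test_run.py'.
            current_file = line[4:].removeprefix("b/")
            continue
        if line.startswith("+") and SKIP_PATTERN.search(line):
            hits.append((current_file, line[1:].strip()))
    return hits


# Hypothetical diff fragment for illustration.
sample_diff = """\
diff --git a/tests/test_run.py b/tests/test_run.py
--- a/tests/test_run.py
+++ b/tests/test_run.py
@@ -1,3 +1,4 @@
 import pytest
+@pytest.mark.skip(reason="flaky")
 def test_run():
-    assert run() == 1
+    assert run() == 2  # TODO tighten
"""

for path, text in added_skip_markers(sample_diff):
    print(f"{path}: {text}")
```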
## Improvement opportunities

- Missing coverage for new code paths; add focused tests.
- Performance: obvious N+1 loops, repeated I/O without caching, excessive serialization.
- Developer ergonomics: unclear naming, missing inline docs for public APIs, missing examples for new features.
- Release hygiene: add migration/upgrade note when behavior changes; ensure changelog/notes capture user-facing shifts.

## Evidence to capture in the review output

- BASE tag and TARGET ref used for the diff; confirm tags fetched.
- High-level diff stats and key directories touched.
- Concrete files/commits that indicate breaking changes or risk, with brief rationale.
- Tests or commands suggested to validate suspected risks.
- Explicit release gate call (ship/block) with conditions to unblock.
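
The evidence list above maps naturally onto a small record type. The sketch below is illustrative only: the `ReleaseReview` class, its field names, and the rule "any unblock condition means block" are assumptions made for this example, not something the checklist prescribes.

```python
from dataclasses import dataclass, field


@dataclass
class ReleaseReview:
    """Hypothetical container for the evidence a review should report."""

    base_tag: str
    target: str
    tags_fetched: bool
    diff_summary: str
    risks: list[str] = field(default_factory=list)
    suggested_checks: list[str] = field(default_factory=list)
    unblock_conditions: list[str] = field(default_factory=list)

    @property
    def gate(self) -> str:
        # Assumed rule: outstanding conditions block the release.
        return "block" if self.unblock_conditions else "ship"

    def render(self) -> str:
        lines = [
            f"Diff: {self.base_tag}...{self.target} "
            f"(tags fetched: {'yes' if self.tags_fetched else 'no'})",
            f"Summary: {self.diff_summary}",
            f"Release gate: {self.gate}",
        ]
        lines += [f"Risk: {risk}" for risk in self.risks]
        lines += [f"Check: {check}" for check in self.suggested_checks]
        lines += [f"Unblock when: {cond}" for cond in self.unblock_conditions]
        return "\n".join(lines)


# Hypothetical values for illustration.
review = ReleaseReview(
    base_tag="v1.2.3",
    target="main",
    tags_fetched=True,
    diff_summary="12 files changed, mostly under src/agents/",
    risks=["src/agents/run.py: default timeout changed"],
    unblock_conditions=["add a migration note for the timeout default"],
)
print(review.render())
```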
Lines changed: 17 additions & 0 deletions
@@ -0,0 +1,17 @@
#!/usr/bin/env bash
set -euo pipefail

# Positional arguments with defaults: remote name and tag glob pattern.
remote="${1:-origin}"
pattern="${2:-v*}"

# Sync tags from the remote to ensure the latest release tag is available locally.
git fetch "$remote" --tags --prune --quiet

# Version-sort the matching tags in descending order and keep the first (newest) one.
latest_tag=$(git tag -l "$pattern" --sort=-v:refname | head -n1)

if [[ -z "$latest_tag" ]]; then
  echo "No tags found matching pattern '$pattern' after fetching from $remote." >&2
  exit 1
fi

echo "$latest_tag"
Lines changed: 42 additions & 0 deletions
@@ -0,0 +1,42 @@
---
name: test-coverage-improver
description: 'Improve test coverage in the OpenAI Agents Python repository: run `make coverage`, inspect coverage artifacts, identify low-coverage files, propose high-impact tests, and confirm with the user before writing tests.'
---

# Test Coverage Improver

## Overview

Use this skill whenever coverage needs assessment or improvement (coverage regressions, failing thresholds, or user requests for stronger tests). It runs the coverage suite, analyzes results, highlights the biggest gaps, and prepares test additions while confirming with the user before changing code.

## Quick Start

1. From the repo root run `make coverage` to regenerate `.coverage` data and `coverage.xml`.
2. Collect artifacts: `.coverage` and `coverage.xml`, plus the console output from `coverage report -m` for drill-downs.
3. Summarize coverage: total percentages, lowest files, and uncovered lines/paths.
4. Draft test ideas per file: scenario, behavior under test, expected outcome, and likely coverage gain.
5. Ask the user for approval to implement the proposed tests; pause until they agree.
6. After approval, write the tests in `tests/`, rerun `make coverage`, and then run `$code-change-verification` before marking work complete.

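Step 3 (summarizing the lowest files) can be sketched against `coverage.xml`. This example assumes the Cobertura-style layout that coverage.py's XML report uses, where each measured file is a `<class>` element with `filename` and `line-rate` attributes; the helper name `lowest_coverage_files` and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET


def lowest_coverage_files(xml_text: str, limit: int = 5) -> list[tuple[str, float]]:
    """Return up to `limit` (filename, line coverage %) pairs, lowest first."""
    root = ET.fromstring(xml_text)
    rates = [
        (cls.get("filename", ""), round(float(cls.get("line-rate", "0")) * 100, 1))
        for cls in root.iter("class")
    ]
    return sorted(rates, key=lambda item: item[1])[:limit]


# Hypothetical, heavily trimmed report; a real file has <lines> entries too.
SAMPLE_XML = """\
<coverage line-rate="0.65">
  <packages>
    <package name="agents">
      <classes>
        <class filename="src/agents/run.py" line-rate="0.9"><lines/></class>
        <class filename="src/agents/tool.py" line-rate="0.4"><lines/></class>
      </classes>
    </package>
  </packages>
</coverage>
"""

for filename, percent in lowest_coverage_files(SAMPLE_XML):
    print(f"{percent:5.1f}%  {filename}")
```

For a real run, read the file first, for example `lowest_coverage_files(open("coverage.xml").read())`.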
## Workflow Details

- **Run coverage**: Execute `make coverage` at repo root. Avoid watch flags and keep prior coverage artifacts only if comparing trends.
- **Parse summaries efficiently**:
  - Prefer the console output from `coverage report -m` for file-level totals; fall back to `coverage.xml` for tooling or spreadsheets.
  - Use `uv run coverage html` to generate `htmlcov/index.html` if you need an interactive drill-down.
- **Prioritize targets**:
  - Public APIs or shared utilities in `src/agents/` before examples or docs.
  - Files with low statement coverage or newly added code at 0%.
  - Recent bug fixes or risky code paths (error handling, retries, timeouts, concurrency).
- **Design impactful tests**:
  - Hit uncovered paths: error cases, boundary inputs, optional flags, and cancellation/timeouts.
  - Cover combinational logic rather than trivial happy paths.
  - Place tests under `tests/` and avoid flaky async timing.
- **Coordinate with the user**: Present a numbered, concise list of proposed test additions and expected coverage gains. Ask explicitly before editing code or fixtures.
- **After implementation**: Rerun coverage, report the updated summary, and note any remaining low-coverage areas.

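The first two "Prioritize targets" rules can be expressed as a sort key. This is a minimal sketch, assuming you already have (path, percent) pairs from the coverage summary; the function `prioritize` and the sample data are hypothetical, and the third rule (recent bug fixes and risky code paths) still needs human judgment.

```python
def prioritize(
    files: list[tuple[str, float]], core_prefix: str = "src/agents/"
) -> list[tuple[str, float]]:
    """Order files so core-library paths come first, then by lowest coverage."""
    return sorted(
        files,
        # False sorts before True, so paths under core_prefix lead the list.
        key=lambda item: (not item[0].startswith(core_prefix), item[1]),
    )


# Hypothetical coverage figures for illustration.
files = [
    ("examples/demo.py", 0.0),
    ("src/agents/run.py", 62.5),
    ("src/agents/tool.py", 0.0),
]

for path, percent in prioritize(files):
    print(f"{percent:5.1f}%  {path}")
```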
## Notes

- Keep any added comments or code in English.
- Do not create `scripts/`, `references/`, or `assets/` unless needed later.
- If coverage artifacts are missing or stale, rerun `make coverage` instead of guessing.