Update Python version to 3.12 and refresh PR template #1648
kayametehan wants to merge 1 commit into openai:main
Conversation
- Bump python-version from 3.9 to 3.12 in run_tests.yaml and test_eval.yaml workflows (closes openai#1606)
- Remove stale GPT-4 private-access language from PR template; replace with current contribution guidelines (closes openai#1608)
Pull request overview
Updates CI to run on a supported Python version and refreshes contributor guidance in the PR template to remove obsolete model-specific requirements.
Changes:
- Bump GitHub Actions workflows from Python 3.9 to Python 3.12.
- Refresh .github/PULL_REQUEST_TEMPLATE.md by removing outdated GPT-4-specific merge requirements and adding updated guidance/links.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| .github/workflows/test_eval.yaml | Updates workflow Python runtime to 3.12 for eval validation. |
| .github/workflows/run_tests.yaml | Updates unit test workflow Python runtime to 3.12. |
| .github/PULL_REQUEST_TEMPLATE.md | Updates PR template instructions to reflect current eval submission expectations. |
  **PLEASE READ THIS**:
- In order for a PR to be merged, it must fail on GPT-4. We are aware that right now, users do not have access, so you will not be able to tell if the eval fails or not. Please run your eval with GPT-3.5-Turbo, but keep in mind as we run the eval, if GPT-4 gets higher than 90% on the eval, we will likely reject it since GPT-4 is already capable of completing the task.
+ We are currently only accepting evals that use **model-graded eval classes** (no custom code). Your eval should include a minimum of **15 high-quality samples**.
The PR template now says we are only accepting evals that use model-graded eval classes. This conflicts with the repo docs, which explicitly say contributors can follow an existing eval template to build a basic or model-graded eval (and the only stated restriction is “no custom code”). Please align the template wording with README guidance so contributors aren’t incorrectly discouraged from submitting basic-template evals.
Suggested change:
- We are currently only accepting evals that use **model-graded eval classes** (no custom code). Your eval should include a minimum of **15 high-quality samples**.
+ We are currently only accepting evals that use **existing eval classes** such as **Basic** or **model-graded eval classes** (no custom code). Your eval should include a minimum of **15 high-quality samples**.
Fixes #1606 and #1608.
Changes
Python version bump (fixes #1606)
- `run_tests.yaml`: `python-version: 3.9` → `"3.12"`
- `test_eval.yaml`: `python-version: 3.9` → `"3.12"`

Python 3.9 reached end-of-life in October 2025; 3.12 is a current, actively supported release. The resulting workflow step is sketched below.
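A minimal sketch of what the updated `setup-python` step looks like after this change. The job name, checkout step, and test commands are illustrative assumptions; only the `python-version` value comes from this PR:

```yaml
# Sketch of the updated step in run_tests.yaml / test_eval.yaml.
# Job and step layout here are assumptions; this PR only changes
# python-version from 3.9 to "3.12".
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"   # was: 3.9 (EOL October 2025)
      - name: Run tests
        run: |
          pip install -e .
          pytest
```

Quoting the version (`"3.12"`) also sidesteps YAML's float parsing, where an unquoted `3.10` would be read as `3.1`.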
PR template refresh (fixes #1608)