On a 2.0 terminal benchmark, OpenAI’s model scores about 10% higher, which points to stronger results on long, complex coding tasks.
Claude Opus 4.6 and ChatGPT 5.3 Codex launch with a 1-million-token context window and 25% faster runs, respectively, so you can match tasks to each model’s strengths.
Codex will remain free, and Deep Research just got better ...