OpenAI o3 Model Announced - New Frontiers in Reasoning-Focused AI

2025.12.22

What is OpenAI o3

In December 2024, OpenAI announced the o3 model on the final day of its “12 Days of OpenAI” event. As the successor to o1, o3 is a significant step forward in reasoning capability, with its most notable scores coming on the ARC-AGI benchmark.

Reference: OpenAI - o3 Announcement

Remarkable Benchmark Results

ARC-AGI (Abstract Reasoning)

| Model | Score |
|---|---|
| GPT-4o | 5% |
| o1 | 32% |
| o3 (low compute) | 75.7% |
| o3 (high compute) | 87.5% |
| Human average | 85% |

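In its high-compute configuration, o3 became the first AI model to exceed the human average on this benchmark; the low-compute configuration remains below it.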
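To make the comparison concrete, the gap between each score and the 85% human average can be computed directly from the figures listed above. This is a minimal sketch: the dictionary and function names are illustrative, and the numbers are simply those quoted in this article.

```python
# Scores (percent) copied from the ARC-AGI figures quoted in this article.
ARC_AGI_SCORES = {
    "GPT-4o": 5.0,
    "o1": 32.0,
    "o3 (low compute)": 75.7,
    "o3 (high compute)": 87.5,
}
HUMAN_AVERAGE = 85.0


def margin_over_human(model: str) -> float:
    """Return the model's score minus the human average, in percentage points."""
    return round(ARC_AGI_SCORES[model] - HUMAN_AVERAGE, 1)


def models_above_human() -> list[str]:
    """List the configurations whose score exceeds the human average."""
    return [name for name, score in ARC_AGI_SCORES.items() if score > HUMAN_AVERAGE]
```

With these figures, only "o3 (high compute)" clears the human average (by 2.5 points), while "o3 (low compute)" falls 9.3 points short.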

Other Benchmarks

Math (AIME 2024): 96.7%
Coding (Codeforces): 2727 Elo (99.95 percentile)
Science (GPQA Diamond): 87.7%

Reference: ARC Prize - o3 Results

o3 Technical Features

1. Compute Scaling

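A key feature of o3 is the ability to adjust how much compute is spent at inference time. In the API this is exposed through the reasoning_effort parameter, which trades speed and cost against precision.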

from openai import OpenAI

client = OpenAI()

# Low compute mode (fast, low cost)
response_fast = client.chat.completions.create(
    model="o3-mini",
    reasoning_effort="low",
    messages=[{"role": "user", "content": "Simple question"}]
)

# High compute mode (high precision, high cost)
response_precise = client.chat.completions.create(
    model="o3",
    reasoning_effort="high",
    messages=[{"role": "user", "content": "Complex mathematical proof"}]
)
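The two calls above differ only in the model name and the reasoning_effort value, so the choice can be factored into a small helper. The sketch below is one possible way to do that; build_request and its hard flag are hypothetical names introduced here for illustration, and the rule "hard tasks go to o3 with high effort" is an assumption, not guidance from OpenAI.

```python
from typing import Any


def build_request(prompt: str, hard: bool) -> dict[str, Any]:
    """Build keyword arguments for client.chat.completions.create.

    Assumption for illustration: hard tasks use o3 with high effort,
    everything else uses o3-mini with low effort.
    """
    return {
        "model": "o3" if hard else "o3-mini",
        "reasoning_effort": "high" if hard else "low",
        "messages": [{"role": "user", "content": prompt}],
    }


# Usage (requires an API key, so it is left as a comment here):
# response = client.chat.completions.create(**build_request("Simple question", hard=False))
```

Keeping the decision in one function makes it easy to adjust the cost and precision policy later without touching every call site.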

2. o3-mini

o3-mini is a more efficient version of o3 that outperforms o1 on many tasks.

| Comparison | o1-mini | o3-mini |
|---|---|---|
| AIME 2024 | 70% | 84% |
| Speed | Baseline | ~2x faster |
| Cost | Baseline | ~40% reduction |

Reference: OpenAI API Documentation

Safety Measures

Deliberative Alignment

o3 introduces a new safety mechanism called “deliberative alignment.” Before answering, the model reasons through the following steps:

1. Analyze user intent
2. Evaluate potential risks
3. Verify alignment with safety policies
4. Generate appropriate response

Safety Test Results

  • Harmful content generation resistance: 99.2%
  • Jailbreak resistance: 98.5%
  • Misinformation prevention: 97.8%

How to Use

API Usage

from openai import OpenAI

client = OpenAI()

# Complex reasoning with o3
response = client.chat.completions.create(
    model="o3",
    messages=[
        {
            "role": "user",
            "content": """
            Please solve the following puzzle:
            There is a 3x3 grid where each cell contains a number from 1-9.
            Make each row and column sum to 15.
            """
        }
    ]
)

print(response.choices[0].message.content)
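Because the model's answer comes back as free text, it is worth checking any proposed grid independently. The following is a minimal sketch of such a check; satisfies_puzzle is a hypothetical helper, and extracting the grid from the model's reply is left out. Note that the prompt as written does not require the nine numbers to be distinct, so neither does this check.

```python
def satisfies_puzzle(grid: list[list[int]], target: int = 15) -> bool:
    """Check a 3x3 grid: every cell in 1-9, every row and column sums to target."""
    if len(grid) != 3 or any(len(row) != 3 for row in grid):
        return False
    if any(not 1 <= cell <= 9 for row in grid for cell in row):
        return False
    rows_ok = all(sum(row) == target for row in grid)
    cols_ok = all(sum(col) == target for col in zip(*grid))
    return rows_ok and cols_ok


# One known solution (the classic Lo Shu magic square).
candidate = [
    [2, 7, 6],
    [9, 5, 1],
    [4, 3, 8],
]
print(satisfies_puzzle(candidate))  # prints True
```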

ChatGPT Usage

ChatGPT Plus/Pro users can use o3 through ChatGPT.

Setup:
1. Log in to ChatGPT
2. Select o3 in model selection
3. Enable "Reasoning mode"

Reference: ChatGPT - OpenAI

o3 vs Competitors

| Capability | o3 | Gemini 2.0 | Claude Opus 4.5 |
|---|---|---|---|
| Math reasoning | Excellent | Good | Good |
| Coding | Excellent | Good | Excellent |
| Abstract reasoning | Excellent | Good | Good |
| Speed | Fair | Excellent | Good |
| Cost | Fair | Good | Good |

Pricing Structure (Expected)

| Model | Input (per 1M tokens) | Output (per 1M tokens) |
|---|---|---|
| o3 | $60 | $240 |
| o3-mini | $15 | $60 |
| o1 | $15 | $60 |

Note: Official pricing will be announced at general release
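To get a rough sense of what a single request might cost, the expected prices above can be plugged into a simple per-token calculation. This is only a sketch: the prices are the unconfirmed figures from this article, and EXPECTED_PRICES and estimate_cost are illustrative names rather than part of any SDK.

```python
# Expected USD prices per 1M tokens (input, output), as listed in this article.
# These are not official figures and may change at general release.
EXPECTED_PRICES = {
    "o3": (60.0, 240.0),
    "o3-mini": (15.0, 60.0),
    "o1": (15.0, 60.0),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request from its token counts."""
    input_price, output_price = EXPECTED_PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Example: 10,000 input tokens and 2,000 output tokens.
print(f"o3:      ${estimate_cost('o3', 10_000, 2_000):.2f}")       # prints o3:      $1.08
print(f"o3-mini: ${estimate_cost('o3-mini', 10_000, 2_000):.2f}")  # prints o3-mini: $0.27
```

In practice the token counts would come from the usage field of an API response (for example response.usage.prompt_tokens and response.usage.completion_tokens).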

Summary

OpenAI o3 has achieved a new milestone in reasoning capabilities.

  • ARC-AGI 87.5%: Abstract reasoning exceeding human average
  • Codeforces 2727 Elo: World-class coding ability
  • Compute scaling: Adjustable tradeoff between precision and cost
  • Enhanced safety: Introduction of Deliberative Alignment

General availability is scheduled for late January 2025.
