OpenAI Models Reference
Complete capabilities matrix for currently documented OpenAI API models (non-deprecated)
Last updated: 2025-11-19 | Source: OpenAI Platform Documentation
Vision Out (Gen): Model can GENERATE images/videos
General-Purpose GPT + Reasoning Models
Main LLMs for chat completions, responses, and agent tools. These are the "brain" models you'd typically use.
| Model ID | Type | Vision In (OCR) | Vision Out (Gen) | Audio In | Audio Out | Chat/Text | Tools | Reasoning |
|---|---|---|---|---|---|---|---|---|
gpt-5.1 | Flagship GPT | ✅ | ❌ | ❌ | ❌ | ✅ | ✅ | ✅ Configurable |
gpt-5 | GPT Family | ✅ | ❌ | ❌ | ❌ | ✅ | ✅ | ✅ Oriented |
gpt-5-pro | Reasoning GPT | ✅ | ❌ | ❌ | ❌ | ✅ | ✅ | ✅ Advanced |
gpt-5-mini | Smaller GPT-5 | ✅ | ❌ | ❌ | ❌ | ✅ | ✅ | ✅ Lighter |
gpt-5.1-codex | Coding-Optimized | ✅ | ❌ | ❌ | ❌ | ✅ | ✅ Strong | ✅ Coding |
gpt-5.1-codex-mini | Small Codex | ✅ | ❌ | ❌ | ❌ | ✅ | ✅ | ✅ Lighter |
gpt-4.1 | High-Intelligence | ✅ | ❌ | ❌ | ❌ | ✅ | ✅ Excellent | ❌ |
gpt-4.1-mini | Small, Fast | ✅ | ❌ | ❌ | ❌ | ✅ | ✅ Optimized | ❌ |
gpt-4.1-nano | Tiny GPT-4.1 | ✅ | ❌ | ❌ | ❌ | ✅ | ✅ | ❌ |
gpt-4o | Omni GPT | ✅ | ❌ | ✅ Native | ✅ Native | ✅ | ✅ | ❌ |
gpt-4o-mini | Smaller Omni | ✅ | ❌ | ✅ Variants | ✅ TTS | ✅ | ✅ | ❌ |
gpt-4 | Legacy GPT-4 | ✅ | ❌ | ❌ | ❌ | ✅ | ✅ | ❌ |
gpt-4-turbo-preview | Legacy Turbo | ✅ | ❌ | ❌ | ❌ | ✅ | ✅ | ❌ |
o3 | Ⓡ Reasoning | ✅ Strong | ❌ | ❌ | ❌ | ✅ | ✅ | ✅ Slow/Strong |
o3-mini | Ⓡ Fast | ✅ | ❌ | ❌ | ❌ | ✅ | ✅ | ✅ Cheaper |
o4-mini | Ⓡ Reasoning | ✅ | ❌ | ❌ | ❌ | ✅ | ✅ | ✅ |
o3-deep-research | Ⓡ Web Research | ✅ | ❌ | ❌ | ❌ | ✅ Reports | ✅ Web Search | ✅ Multi-Step |
o4-mini-deep-research | Ⓡ Fast Research | ✅ | ❌ | ❌ | ❌ | ✅ | ✅ | ✅ |
gpt-oss | Small OSS-style | ❌ Text Only | ❌ | ❌ | ❌ | ✅ | ✅ Standard | ❌ |
davinci-002 | Legacy Base | ❌ | ❌ | ❌ | ❌ | ✅ | ✅ Basic | ❌ |
Embedding & Moderation Models
Models for generating vector embeddings and content moderation.
| Model ID | Type | Vision In (OCR) | Embedding | Description |
|---|---|---|---|---|
text-embedding-3-large | Embeddings | ❌ | ✅ | High-quality, multilingual |
text-embedding-3-small | Embeddings | ❌ | ✅ | Cheap, fast |
text-embedding-ada-002 | Legacy | ❌ | ✅ | Older, backward-compatible |
omni-moderation-latest | Moderation | ✅ Image | ❌ | Text + image moderation |
text-moderation-latest | Moderation | ❌ | ❌ | Text-only moderation |
Audio & Speech Models
Models for speech-to-text, text-to-speech, and audio-capable chat via the Audio API.
| Model ID | Type | Vision In (OCR) | Audio In | Audio Out | Chat | Tools |
|---|---|---|---|---|---|---|
gpt-audio-mini | General Audio LLM | ❌ | ✅ | ✅ | ✅ | ✅ |
gpt-4o-audio-preview | 4o Audio | ✅ | ✅ | ✅ | ✅ | ✅ |
gpt-4o-mini-audio-preview | 4o Mini Audio | ✅ | ✅ | ✅ | ✅ | ✅ |
gpt-4o-mini-tts | TTS | ❌ | ❌ | ✅ 11+ Voices | ❌ | ❌ |
gpt-4o-transcribe | STT | ❌ | ✅ | ❌ | Outputs text | ❌ |
gpt-4o-mini-transcribe | Small STT | ❌ | ✅ | ❌ | Outputs text | ❌ |
gpt-4o-transcribe-diarize | STT + Diarization | ❌ | ✅ | ❌ | Text + speakers | ❌ |
whisper-1 | Classic STT | ❌ | ✅ | ❌ | Text only | ❌ |
Image & Vision-Focused Models
Models for image generation and manipulation via the Images API. These models GENERATE images (Vision Out), unlike chat models that READ images (Vision In).
| Model ID | Type | Vision In (OCR) | Vision Out (Gen) | Description |
|---|---|---|---|---|
gpt-image-1 | Image Gen LMM | ✅ Text + Image | ✅ Generate | Multimodal LLM via Images API |
dall-e-3 | Image Gen | ❌ | ✅ Generate | Older image generation |
dall-e-2 | Legacy Image Gen | ❌ | ✅ Generate | Legacy image generation |
gpt-image-1 is a multimodal LLM that can both READ images (Vision In) and GENERATE images (Vision Out). You call it via the Images API / image-tool, not as a general chat model.
Realtime Models
Streaming text + audio models via WebRTC/WebSocket with API key authentication.
| Model ID | Type | Vision In (OCR) | Audio In | Audio Out | Chat | Tools |
|---|---|---|---|---|---|---|
gpt-realtime-2025-08-28 | Realtime GPT | Check docs | ✅ | ✅ | ✅ Streaming | ✅ Realtime API |
API Access
All models listed above are accessible via server-side code using an OpenAI API key.
API Endpoints
chatorresponses- GPT + reasoning + audioembeddings- Vector embeddingsaudio- STT/TTSimages- Image generationmoderations- Content moderationrealtime- WebRTC/WebSocket streaming
Additional Info
- Models listed are non-deprecated as of 2025-11-19
- Many have snapshot IDs and aliases
- Some eval-only graders not listed for brevity
- Check live docs for updates
Recommended Models for George AI
💬 Chat Assistants
gpt-4o-mini- Fast, cheapgpt-4.1- High intelligencegpt-5.1- Flagship with reasoning
🔍 Embeddings
text-embedding-3-small- Fast, cheaptext-embedding-3-large- High quality
👁️ Vision In (OCR)
Reading/analyzing images
gpt-4o- Native multimodalgpt-4.1- Best qualitygpt-5.1- Latest
🎨 Vision Out (Gen)
Generating images
gpt-image-1- Latest (LMM)dall-e-3- High quality
✨ Structured Data Extraction
Recommended: gpt-4o-mini (fast/cheap), gpt-4.1-mini (optimized for tools), o3-mini (reasoning)
💡 Tip: All OpenAI GPT models support native function calling. Use reasoning models (o-series) for complex extraction tasks requiring multi-step logic.