OpenAI Models Reference

Complete capabilities matrix for currently documented OpenAI API models (non-deprecated)

Last updated: 2025-11-19 | Source: OpenAI Platform Documentation

Legend
= Supported = Not supported = Reasoning model family
Vision In (OCR): Model can READ/analyze images (for OCR, image understanding)
Vision Out (Gen): Model can GENERATE images/videos

General-Purpose GPT + Reasoning Models

Main LLMs for chat completions, responses, and agent tools. These are the "brain" models you'd typically use.

Model ID Type Vision In
(OCR)
Vision Out
(Gen)
Audio In Audio Out Chat/Text Tools Reasoning
gpt-5.1 Flagship GPT ✅ Configurable
gpt-5 GPT Family ✅ Oriented
gpt-5-pro Reasoning GPT ✅ Advanced
gpt-5-mini Smaller GPT-5 ✅ Lighter
gpt-5.1-codex Coding-Optimized ✅ Strong ✅ Coding
gpt-5.1-codex-mini Small Codex ✅ Lighter
gpt-4.1 High-Intelligence ✅ Excellent
gpt-4.1-mini Small, Fast ✅ Optimized
gpt-4.1-nano Tiny GPT-4.1
gpt-4o Omni GPT ✅ Native ✅ Native
gpt-4o-mini Smaller Omni ✅ Variants ✅ TTS
gpt-4 Legacy GPT-4
gpt-4-turbo-preview Legacy Turbo
o3 Ⓡ Reasoning ✅ Strong ✅ Slow/Strong
o3-mini Ⓡ Fast ✅ Cheaper
o4-mini Ⓡ Reasoning
o3-deep-research Ⓡ Web Research ✅ Reports ✅ Web Search ✅ Multi-Step
o4-mini-deep-research Ⓡ Fast Research
gpt-oss Small OSS-style ❌ Text Only ✅ Standard
davinci-002 Legacy Base ✅ Basic
All GPT and reasoning models support structured JSON output and function/tool calling via standard APIs. Many have multiple snapshots (dated model IDs) and aliases.

Embedding & Moderation Models

Models for generating vector embeddings and content moderation.

Model ID Type Vision In
(OCR)
Embedding Description
text-embedding-3-large Embeddings High-quality, multilingual
text-embedding-3-small Embeddings Cheap, fast
text-embedding-ada-002 Legacy Older, backward-compatible
omni-moderation-latest Moderation ✅ Image Text + image moderation
text-moderation-latest Moderation Text-only moderation

Audio & Speech Models

Models for speech-to-text, text-to-speech, and audio-capable chat via the Audio API.

Model ID Type Vision In
(OCR)
Audio In Audio Out Chat Tools
gpt-audio-mini General Audio LLM
gpt-4o-audio-preview 4o Audio
gpt-4o-mini-audio-preview 4o Mini Audio
gpt-4o-mini-tts TTS ✅ 11+ Voices
gpt-4o-transcribe STT Outputs text
gpt-4o-mini-transcribe Small STT Outputs text
gpt-4o-transcribe-diarize STT + Diarization Text + speakers
whisper-1 Classic STT Text only

Image & Vision-Focused Models

Models for image generation and manipulation via the Images API. These models GENERATE images (Vision Out), unlike chat models that READ images (Vision In).

Model ID Type Vision In
(OCR)
Vision Out
(Gen)
Description
gpt-image-1 Image Gen LMM ✅ Text + Image ✅ Generate Multimodal LLM via Images API
dall-e-3 Image Gen ✅ Generate Older image generation
dall-e-2 Legacy Image Gen ✅ Generate Legacy image generation
gpt-image-1 is a multimodal LLM that can both READ images (Vision In) and GENERATE images (Vision Out). You call it via the Images API / image-tool, not as a general chat model.

Realtime Models

Streaming text + audio models via WebRTC/WebSocket with API key authentication.

Model ID Type Vision In
(OCR)
Audio In Audio Out Chat Tools
gpt-realtime-2025-08-28 Realtime GPT Check docs ✅ Streaming ✅ Realtime API

API Access

All models listed above are accessible via server-side code using an OpenAI API key.

API Endpoints

  • chat or responses - GPT + reasoning + audio
  • embeddings - Vector embeddings
  • audio - STT/TTS
  • images - Image generation
  • moderations - Content moderation
  • realtime - WebRTC/WebSocket streaming

Additional Info

  • Models listed are non-deprecated as of 2025-11-19
  • Many have snapshot IDs and aliases
  • Some eval-only graders not listed for brevity
  • Check live docs for updates

Recommended Models for George AI

💬 Chat Assistants

  • gpt-4o-mini - Fast, cheap
  • gpt-4.1 - High intelligence
  • gpt-5.1 - Flagship with reasoning

🔍 Embeddings

  • text-embedding-3-small - Fast, cheap
  • text-embedding-3-large - High quality

👁️ Vision In (OCR)

Reading/analyzing images

  • gpt-4o - Native multimodal
  • gpt-4.1 - Best quality
  • gpt-5.1 - Latest

🎨 Vision Out (Gen)

Generating images

  • gpt-image-1 - Latest (LMM)
  • dall-e-3 - High quality

✨ Structured Data Extraction

Recommended: gpt-4o-mini (fast/cheap), gpt-4.1-mini (optimized for tools), o3-mini (reasoning)

💡 Tip: All OpenAI GPT models support native function calling. Use reasoning models (o-series) for complex extraction tasks requiring multi-step logic.

See Also
Looking for self-hosted options? Check out the Open Source Models Reference for 30+ OSS models including Llama, Gemma, Qwen, DeepSeek, and more.
Back to AI Models Guide Documentation Home
George-Cloud