OpenAI GPT-5.x Models

Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. SPDX-License-Identifier: MIT-0

The GenAIIDP accelerator supports OpenAI’s frontier models GPT-5.4 (openai.gpt-5.4) and GPT-5.5 (openai.gpt-5.5) on Amazon Bedrock.

Unlike every other model in the accelerator, these models are not served by the Bedrock Converse / InvokeModel APIs. They are available only on the bedrock-mantle endpoint via the OpenAI Responses API. The accelerator hides this difference behind the existing idp_common Bedrock client: when a model ID starting with openai.gpt-5 is selected, BedrockClient.invoke_model transparently routes the request to a SigV4-signed HTTP call against bedrock-mantle (see idp_common/bedrock/openai_responses.py). It returns the same response/metering shape every service already expects, so no per-service code changes are required.
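The prefix-based routing described above can be pictured with a minimal sketch. Only the openai.gpt-5 prefix and the bedrock-mantle endpoint name come from this page; the function names are hypothetical and are not the actual idp_common implementation.

```python
# Illustrative only: the prefix comes from the text above, while the helper
# names below are hypothetical and not part of idp_common.
OPENAI_RESPONSES_PREFIX = "openai.gpt-5"


def uses_responses_api(model_id: str) -> bool:
    """Return True when the model ID must be routed to bedrock-mantle."""
    return model_id.startswith(OPENAI_RESPONSES_PREFIX)


def choose_backend(model_id: str) -> str:
    """Name the endpoint family a request for this model would use."""
    if uses_responses_api(model_id):
        return "bedrock-mantle"   # OpenAI Responses API, SigV4-signed HTTP
    return "bedrock-runtime"      # regular Converse / InvokeModel path


if __name__ == "__main__":
    for mid in ("openai.gpt-5.4", "openai.gpt-5.5", "us.amazon.nova-pro-v1:0"):
        print(mid, "->", choose_backend(mid))
```

Keeping the decision in a single predicate is what lets callers stay unaware of which endpoint serves a given model.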

TL;DR — GPT-5.4/5.5 work for OCR, classification, extraction, assessment, summarization, evaluation, and Chat-with-Document. They do not work for agentic extraction, Discovery, or Policy Discovery, and are available in US regions only. See the support matrix below.

| | GPT-5.4 | GPT-5.5 |
|---|---|---|
| Model ID | openai.gpt-5.4 | openai.gpt-5.5 |
| Context window | 272K tokens | 272K tokens |
| Max output tokens (capped by accelerator) | 128,000 | 128,000 |
| Endpoint | bedrock-mantle (OpenAI Responses API) | bedrock-mantle (OpenAI Responses API) |
| In-Region availability | us-east-2, us-west-2, us-gov-west-1 | us-east-2 |
| Geo / Global cross-region inference | Not available | Not available |
| Service tier | Standard only | Standard only |

There are no eu.* or global.* variants and no :1m context suffix — the model IDs carry no region prefix.

| Capability | Supported? | Notes |
|---|---|---|
| OCR (Bedrock backend) | Yes | Image + text input |
| Classification | Yes | Page-level and holistic |
| Extraction (standard) | Yes | Text + page images |
| Assessment | Yes | Including granular assessment |
| Summarization | Yes | |
| Evaluation (LLM method) | Yes | |
| Chat-with-Document | Yes | Streaming: token deltas stream to the UI via the Responses SSE stream |
| Text input | Yes | |
| Image input | Yes | Page images are sent as image content |
| Reasoning effort control | Yes | New reasoning_effort config field (see below) |
| Guardrails | Yes | Applied via the standard headers on the mantle endpoint |

| Capability | Supported? | Why / what happens |
|---|---|---|
| Agentic extraction (extraction.agentic.enabled: true) | No | The agentic path uses the Strands framework over the Converse API, which GPT-5.x doesn’t support. This combination is a hard error in idp-cli config-validate and raises at runtime. |
| Discovery (classes / without- & with-ground-truth / auto-split) | No | Discovery ingests whole PDFs as Converse document blocks, which the Responses API cannot accept (text + image only). Rejected by config-validate and guarded at runtime. |
| Policy / Rule Discovery | No | Same PDF-document-block limitation; agentic rule discovery also uses Strands. Rejected by config-validate and guarded at runtime. |
| PDF document input blocks | No | The Responses API accepts text and images only. Pipelines that need whole-PDF ingestion should use a Claude or Nova model. |
| Prompt caching (<<CACHEPOINT>>) | No | Not supported by these models; <<CACHEPOINT>> markers are stripped automatically. |
| Service tiers (:priority / :flex) | No | Standard tier only. |
| temperature / top_p / top_k | No | These are reasoning models; sampling parameters are ignored. Use reasoning_effort instead. |
| EU / global cross-region inference | No | US (and us-gov) in-region only; hidden in EU-region deployments. |
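The guards listed above can be sketched as a small pre-flight check. This is a hedged illustration, not the logic of idp-cli config-validate: only the extraction.agentic.enabled key and the <<CACHEPOINT>> marker come from this page, while the function names and the "discovery" config key are assumptions made for the example.

```python
from typing import Any, Dict, List, Optional

CACHEPOINT = "<<CACHEPOINT>>"


def is_gpt5(model_id: Optional[str]) -> bool:
    """True for the GPT-5.x model IDs served via bedrock-mantle."""
    return bool(model_id) and model_id.startswith("openai.gpt-5")


def strip_cachepoints(prompt: str) -> str:
    """Remove cache markers, which these models do not support."""
    return prompt.replace(CACHEPOINT, "")


def find_unsupported(config: Dict[str, Any]) -> List[str]:
    """Collect human-readable errors for unsupported GPT-5.x combinations."""
    errors: List[str] = []

    extraction = config.get("extraction") or {}
    agentic = (extraction.get("agentic") or {}).get("enabled", False)
    if is_gpt5(extraction.get("model")) and agentic:
        errors.append("extraction: agentic extraction requires a Converse model")

    # Assumption: a hypothetical 'discovery' section holding its own model ID.
    discovery = config.get("discovery") or {}
    if is_gpt5(discovery.get("model")):
        errors.append("discovery: whole-PDF ingestion is not supported by GPT-5.x")

    return errors


if __name__ == "__main__":
    cfg = {"extraction": {"model": "openai.gpt-5.4", "agentic": {"enabled": True}}}
    print(find_unsupported(cfg))
    print(strip_cachepoints("system text<<CACHEPOINT>>user text"))
```

Returning a list of messages instead of raising on the first problem lets a validator report every unsupported combination in one pass.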

GPT-5.x are reasoning models: they do not take temperature / top_p / top_k (the accelerator ignores those settings for these models) and are tuned with reasoning effort instead. Each model-selectable service (OCR, classification, extraction, assessment, summarization, evaluation, and Chat-with-Document) exposes a reasoning_effort config field. Allowed values are minimal, low, medium, and high (default: medium). The field is OpenAI-only and is ignored for Anthropic, Nova, and other model families.

extraction:
  model: "openai.gpt-5.4"
  reasoning_effort: "high"  # minimal | low | medium | high
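A minimal sketch of how such a field might be validated before a request is built. The allowed values and the medium default are from this page; the function name and the choice to raise ValueError on an unknown value are assumptions, and the accelerator may handle invalid values differently.

```python
from typing import Optional

ALLOWED_EFFORTS = ("minimal", "low", "medium", "high")
DEFAULT_EFFORT = "medium"


def resolve_reasoning_effort(model_id: str, configured: Optional[str]) -> Optional[str]:
    """Return the effort to send, or None when the model ignores the field."""
    if not model_id.startswith("openai.gpt-5"):
        return None  # other model families ignore reasoning_effort

    effort = (configured or DEFAULT_EFFORT).strip().lower()
    if effort not in ALLOWED_EFFORTS:
        # Assumption: fail fast on a typo instead of silently using the default.
        raise ValueError(
            f"reasoning_effort must be one of {ALLOWED_EFFORTS}, got {configured!r}"
        )
    return effort


if __name__ == "__main__":
    print(resolve_reasoning_effort("openai.gpt-5.4", "high"))
    print(resolve_reasoning_effort("openai.gpt-5.4", None))
```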

GPT-5.5 is available in us-east-2 only; GPT-5.4 in us-east-2, us-west-2, and us-gov-west-1. There is no EU availability and no geo/global cross-region inference.

If the IDP stack is deployed in a region where the selected model is not available, the accelerator routes the bedrock-mantle request to a known-available region (logging a warning about cross-region data movement). To pin the region explicitly, set BEDROCK_MANTLE_REGION. EU-region deployments hide these models from the configuration picklists entirely (they are not callable there). See EU Region Model Support.
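The region selection described above could look roughly like the following. The availability lists and the BEDROCK_MANTLE_REGION override are taken from this page; the function name is hypothetical, and picking the first listed region as the fallback is an assumption, since the page does not say which known-available region the accelerator chooses.

```python
import logging
import os
from typing import Mapping, Optional

logger = logging.getLogger(__name__)

# In-Region availability as listed in the table on this page.
MODEL_REGIONS = {
    "openai.gpt-5.4": ("us-east-2", "us-west-2", "us-gov-west-1"),
    "openai.gpt-5.5": ("us-east-2",),
}


def resolve_mantle_region(
    model_id: str,
    stack_region: str,
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Pick the bedrock-mantle region for a GPT-5.x call."""
    env = os.environ if env is None else env

    pinned = env.get("BEDROCK_MANTLE_REGION")
    if pinned:
        return pinned  # an explicit pin always wins

    regions = MODEL_REGIONS[model_id]
    if stack_region in regions:
        return stack_region

    fallback = regions[0]  # assumption: first listed region is the fallback
    logger.warning(
        "%s is not available in %s; routing to %s (cross-region data movement)",
        model_id, stack_region, fallback,
    )
    return fallback


if __name__ == "__main__":
    print(resolve_mantle_region("openai.gpt-5.5", "us-west-2", env={}))
```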

Lambda execution roles that perform generation are granted the bedrock-mantle:CreateInference action (plus GetProject / ListProjects / ListTagsForResources) — equivalent to the AWS-managed AmazonBedrockMantleInferenceAccess policy. When routing Bedrock through a cross-account hub role, that role must also grant these bedrock-mantle actions — see Cross-Account Bedrock.
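For a cross-account hub role, the extra permissions might be expressed as an IAM statement like the one built below. This is a sketch under stated assumptions: the action names follow the list above with the bedrock-mantle: prefix applied to each, and the wildcard resource is an assumption that should be tightened to match your own policy standards.

```python
import json
from typing import Any, Dict, List

# Action names from the paragraph above; the shared prefix is assumed.
MANTLE_ACTIONS: List[str] = [
    "bedrock-mantle:CreateInference",
    "bedrock-mantle:GetProject",
    "bedrock-mantle:ListProjects",
    "bedrock-mantle:ListTagsForResources",
]


def mantle_policy_document(resource: str = "*") -> Dict[str, Any]:
    """Build an IAM policy document granting the bedrock-mantle actions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": list(MANTLE_ACTIONS),
                "Resource": resource,  # assumption: scope down where possible
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(mantle_policy_document(), indent=2))
```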

| Variable | Purpose | Default |
|---|---|---|
| BEDROCK_MANTLE_REGION | Pin the bedrock-mantle region for all GPT-5.x calls | Derived from the stack region with a per-model fallback |
| BEDROCK_MANTLE_SIGNING_NAME | SigV4 signing service name | bedrock-mantle |
| BEDROCK_MANTLE_REASONING_EFFORT | Global fallback reasoning effort when a service config omits reasoning_effort | medium |
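As an illustration of the defaults in this table, the variables could be read into one settings object as sketched here. The variable names and defaults are from the table; the MantleSettings class and load_mantle_settings function are hypothetical names, and a region of None simply means "derive it from the stack region".

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class MantleSettings:
    region: Optional[str]   # None means: derive from the stack region
    signing_name: str
    reasoning_effort: str


def load_mantle_settings(env: Optional[Mapping[str, str]] = None) -> MantleSettings:
    """Read the bedrock-mantle environment variables with their defaults."""
    env = os.environ if env is None else env
    return MantleSettings(
        region=env.get("BEDROCK_MANTLE_REGION") or None,
        signing_name=env.get("BEDROCK_MANTLE_SIGNING_NAME", "bedrock-mantle"),
        reasoning_effort=env.get("BEDROCK_MANTLE_REASONING_EFFORT", "medium"),
    )


if __name__ == "__main__":
    print(load_mantle_settings({}))
    print(load_mantle_settings({"BEDROCK_MANTLE_REGION": "us-east-2"}))
```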

Pricing for bedrock/openai.gpt-5.4 and bedrock/openai.gpt-5.5 is defined in config_library/pricing.yaml and matches OpenAI first-party rates on Bedrock (per 1M tokens):

| Model | Input | Cached input | Output |
|---|---|---|---|
| GPT-5.4 | $2.75 | $0.275 | $16.50 |
| GPT-5.5 | $5.50 | $0.55 | $33.00 |

Confirm against the Amazon Bedrock pricing page if rates change.
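To make the per-1M-token rates concrete, here is a small cost estimate using the figures in the table. The estimate_cost function is a hypothetical helper, not the accelerator's metering code, and it assumes that input_tokens excludes any tokens billed at the cached rate.

```python
# Rates in USD per 1M tokens, copied from the table above.
PRICING_PER_MILLION = {
    "bedrock/openai.gpt-5.4": {"input": 2.75, "cached_input": 0.275, "output": 16.50},
    "bedrock/openai.gpt-5.5": {"input": 5.50, "cached_input": 0.55, "output": 33.00},
}


def estimate_cost(
    model_key: str,
    input_tokens: int,
    output_tokens: int,
    cached_input_tokens: int = 0,
) -> float:
    """Estimate the USD cost of one request from its token counts."""
    rates = PRICING_PER_MILLION[model_key]
    total = (
        input_tokens * rates["input"]
        + cached_input_tokens * rates["cached_input"]
        + output_tokens * rates["output"]
    )
    return total / 1_000_000


if __name__ == "__main__":
    # 200,000 input tokens and 10,000 output tokens on GPT-5.4:
    # 200,000 * 2.75 / 1e6 + 10,000 * 16.50 / 1e6 = 0.55 + 0.165
    print(f"${estimate_cost('bedrock/openai.gpt-5.4', 200_000, 10_000):.3f}")
```

The same request on GPT-5.5 costs twice as much, since every rate in its row is double the GPT-5.4 rate.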

Use GPT-5.4/5.5 for OCR, classification, extraction, assessment, summarization, evaluation, or chat where their reasoning quality helps and inputs are text or page images. For workloads that require whole-PDF ingestion (Discovery, Policy Discovery) or agentic extraction, choose a Claude or Nova model, which accept PDF document blocks natively and support the Converse/Strands paths.