As Codex Moves From Code Suggestions to Code Execution, OpenAI’s Security Model Gets Much More Granular


OpenAI’s Codex security model is not a simple switch between “sandboxed” and “unsandboxed.” It is a layered system built around restricted execution modes, approval gates, and telemetry, aimed at the harder enterprise problem: not generating code, but running AI-generated actions safely inside real development environments.

Three sandbox modes, not one security state

Codex exposes different operating modes with materially different risk profiles. In read-only mode, the agent can inspect but not modify files; in workspace-write mode, which is the default, it can edit within a local working directory; in danger-full-access mode, it can operate without those restrictions and is meant for highly trusted cases.

That matters because the controls are tied to what the agent is actually allowed to touch, not to a generic “AI on” setting. Enterprises deciding whether Codex can refactor a codebase, run commands, or interact with anything beyond a confined workspace are really choosing among distinct execution boundaries with different failure modes.
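To make the distinction between the three modes concrete, here is a minimal sketch of how a write check could differ per mode. The mode names come from the description above; the `SandboxMode` enum and the `may_write` helper are hypothetical illustrations, not Codex's actual implementation.

```python
from enum import Enum
from pathlib import Path


class SandboxMode(Enum):
    # Mode names mirror the article; the enum itself is illustrative.
    READ_ONLY = "read-only"
    WORKSPACE_WRITE = "workspace-write"
    DANGER_FULL_ACCESS = "danger-full-access"


def may_write(mode: SandboxMode, target: str, workspace: str) -> bool:
    """Return True if a write to `target` would be allowed under `mode`."""
    if mode is SandboxMode.READ_ONLY:
        return False
    if mode is SandboxMode.DANGER_FULL_ACCESS:
        return True
    # workspace-write: resolve both paths so ".." cannot escape the workspace.
    root = Path(workspace).resolve()
    path = Path(target).resolve()
    return path == root or root in path.parents
```

Under these assumptions, `may_write(SandboxMode.WORKSPACE_WRITE, "/tmp/ws/../etc/passwd", "/tmp/ws")` is rejected because the resolved path falls outside the workspace, while the same call in danger-full-access mode is allowed.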

Network access is treated as a separate risk channel

Filesystem permissions are only one part of the deployment problem. Codex keeps network access turned off by default, and OpenAI separates outbound connectivity from local execution because external calls create a different class of exposure, including data exfiltration and prompt injection risk.

One concrete example is web search. Rather than assuming live internet access, Codex uses cached search results by default, which reduces the chance that an agent will ingest manipulated content in real time; live access requires explicit configuration instead of being quietly available in the background.
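The separation between outbound connectivity and web search can be pictured as two independent settings, each with a conservative default. The sketch below is an assumption-laden illustration: the `NetworkPolicy` class, its field names, and `effective_exposure` are hypothetical and do not correspond to real Codex configuration keys.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NetworkPolicy:
    # Defaults follow the article: network off, search served from cache.
    # Field names are hypothetical, not real Codex settings.
    network_enabled: bool = False
    live_web_search: bool = False


def effective_exposure(policy: NetworkPolicy) -> dict:
    """Summarize which external channels the policy opens."""
    return {
        "outbound_requests": "allowed" if policy.network_enabled else "blocked",
        "web_search": "live" if policy.live_web_search else "cached",
    }
```

With no arguments, `NetworkPolicy()` yields blocked outbound requests and cached search; either channel opens only when its flag is set explicitly.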

| Control area | Default or baseline | Higher-risk option | Why the distinction matters |
| --- | --- | --- | --- |
| Filesystem access | Read-only or workspace-write | Danger-full-access | Limits whether the agent can inspect, edit locally, or operate without meaningful file constraints |
| Network access | Off by default | Explicitly enabled live connectivity | Changes exposure to external services, data leakage, and prompt injection paths |
| Web search | Cached results | Live web queries | Cached search lowers the chance of real-time hostile content steering the agent |
| Action approval | Human review for sensitive actions | Auto-review for low-risk approvals | Balances speed against oversight as agents take on more execution steps |

Approval workflows are where autonomy is actually negotiated

OpenAI’s approval model is designed around actions that cross a boundary, such as leaving the sandbox or using networked tools. In those cases, Codex does not simply proceed; it requires user approval before execution, making human review part of the runtime control path rather than an after-the-fact audit step.

There is also an Auto-review option for low-risk actions, which is important because full manual review does not scale well once coding agents become embedded in daily development workflows. The practical question for enterprise teams is not whether to allow automation, but which categories of actions can be pre-cleared without turning approval into a rubber stamp or, in the other direction, slowing the tool until developers route around it.
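The routing logic described in this section can be sketched as a small decision function. Everything here is an assumption for illustration: the `Action` fields, the "low"/"high" risk labels, and the `route` function are hypothetical, and how Codex actually classifies risk is not specified in this article.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    PROCEED = "proceed"          # stays inside the sandbox, no gate needed
    AUTO_REVIEW = "auto-review"  # pre-cleared category of low-risk action
    ASK_HUMAN = "ask-human"      # user must approve before execution


@dataclass(frozen=True)
class Action:
    command: str
    leaves_sandbox: bool = False
    uses_network: bool = False
    risk: str = "high"  # hypothetical label: "low" or "high"


def route(action: Action, auto_review_enabled: bool) -> Decision:
    """Decide how an action is handled before it runs."""
    crosses_boundary = action.leaves_sandbox or action.uses_network
    if not crosses_boundary:
        return Decision.PROCEED
    if auto_review_enabled and action.risk == "low":
        return Decision.AUTO_REVIEW
    return Decision.ASK_HUMAN
```

The policy question raised above maps onto two inputs: which actions get labeled low risk, and whether `auto_review_enabled` is turned on at all. Defaulting `risk` to "high" means an unlabeled boundary-crossing action always falls through to human review.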

Telemetry turns Codex from a coding tool into a governable system

OpenAI pairs execution controls with agent-native logging that records prompts, approvals, commands, and network events. Those records are designed to integrate with OpenTelemetry and enterprise compliance systems, so security teams can treat AI-agent activity as something observable inside existing monitoring and audit pipelines rather than as an opaque side channel.

This is one of the more material changes for real-world deployments. Once logs can be exported into standard enterprise tooling and reviewed by AI-powered triage agents, organizations gain a way to distinguish an unusual but legitimate coding workflow from suspicious behavior, which matters for incident response, policy enforcement, and post-event forensics.
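As a rough picture of what such records might look like once exported, the sketch below emits one JSON line per event using only the Python standard library. The four event kinds come from the description above; the field names, the `audit_record` helper, and the JSON-lines format are assumptions, not OpenAI's actual log schema or an OpenTelemetry API.

```python
import json
from datetime import datetime, timezone

# Event kinds named in the article; the schema around them is hypothetical.
EVENT_KINDS = {"prompt", "approval", "command", "network"}


def audit_record(kind: str, detail: dict, session_id: str) -> str:
    """Serialize one agent event as a single JSON line."""
    if kind not in EVENT_KINDS:
        raise ValueError(f"unknown event kind: {kind!r}")
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "session_id": session_id,
        "kind": kind,
        "detail": detail,
    }
    return json.dumps(record, sort_keys=True)
```

A line produced by `audit_record("command", {"argv": ["pytest", "-q"]}, "session-1")` can be appended to a file or forwarded to whatever log pipeline a team already runs; rejecting unknown kinds keeps the stream predictable for downstream triage.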

Where the framework still depends on enterprise judgment

OpenAI extends the same pattern into Codex Security, which scans connected GitHub repositories, validates vulnerabilities inside sandboxed environments, and then returns ranked evidence with suggested fixes. The extra validation step is meant to cut down false positives, but it does not remove the need for teams to define scope, threat assumptions, and acceptable automation levels for their own repositories and pipelines.

That is also where the next checkpoint sits. ChatGPT-login authentication, API keys stored in OS keyrings, and workspace-scoped controls give administrators traceability, but governance still comes down to local decisions about who can use danger-full-access, when live network access is justified, and how much approval can be delegated to Auto-review before the organization loses meaningful human oversight.
