OpenAI Codex, ChatGPT Plus & Pro, and API Credits: The Complete 2025 Guide

Word count: ~5,300 words · Last updated: 19 June 2025 (Europe/Brussels)

This long‑form article stitches together every key insight from our recent conversation—and then some. Whether you are a hobbyist tinkering with the Codex CLI after hours, a startup founder considering a Pro subscription for your product team, or a finance manager tasked with keeping the API bill under control, the next few sections aim to leave no significant question unanswered.

Table of Contents

  1. Why this guide exists
  2. A taxonomy of OpenAI offerings in 2025
  3. Hands‑on: Using the Codex CLI locally
  4. Chat‑style workflows vs. Codex CLI
  5. Counting tokens and predicting API cost
  6. Seven practical cost‑optimisation tactics
  7. Frequently asked questions
  8. Real‑world case studies
  9. Conclusion & next steps

1. Why this guide exists

Every few months the product surface around ChatGPT, Codex, and the underlying API expands. New users frequently ask variations of the same questions:

  • “I just topped up 200 dollars in the dashboard; does that make me a Pro user?”
  • “If I subscribe to Plus, do I still pay per token when I automate test generation with the Codex CLI?”
  • “Codex inside ChatGPT looks magical — but is it actually cheaper than calling the API directly?”

Misunderstandings like these are costly, sometimes literally. Teams see unexpected invoices when they assume message caps apply to automated API workloads, while others leave performance on the table by refusing to touch anything beyond the free tier. This guide sets out to clarify the product boundaries, demystify pricing, and present hands‑on workflows so you can make informed, budget‑aware decisions.

2. A taxonomy of OpenAI offerings in 2025 (Free – Plus – Pro – API)

Let us begin by disentangling four entities that share the OpenAI badge yet differ dramatically in purpose and billing. Each subsection below concludes with a short “Codex impact” box. If your primary concern is code generation and refactoring, feel free to skim only those call‑outs.

2.1 ChatGPT Free

The free tier grants a rotating selection of foundation models (as of mid‑2025: GPT‑3.5 turbo and occasional GPT‑4o mini windows). Daily message caps are modest, but casual exploration, one‑off brainstorming, and small code snippets are well within reach.

Codex impact: No direct access to the dedicated Codex agent. You can still ask coding questions, but anything beyond 50–100 lines quickly runs into context or rate limits.

2.2 ChatGPT Plus (€20/$20 per month)

Plus unlocks GPT‑4o standard with extended context windows and increases the rolling message quota. As of 3 June 2025, OpenAI has started gradually rolling out the Codex agent to Plus subscribers, albeit with lower daily task limits than Pro. Plus does not include any metered API usage; your Plus fee pays for browser‑based consumption only.

Codex impact: Good for occasional cloud editing sessions. Not suitable for heavy CI/CD integration because agent tasks reset every 24 hours.

2.3 ChatGPT Pro (€180/$200 per month)

Pro delivers the “everything in Plus” experience and then some: unlimited GPT‑4o usage, immediate access to experimental models like o1‑pro, and the highest Codex task quotas currently available to consumers. When you connect the Codex CLI with the same account, a one‑time $50 promo credit appears in the API ledger.

Codex impact: The surest path to the newest agent features and the least restrictive cloud quotas. Ideal for developers who offload complex multi‑file edits to the hosted workspace rather than running everything locally.

2.4 Pay‑as‑you‑go API credit

Regardless of your ChatGPT tier, the moment you invoke openai.chat.completions or openai.completions you activate a separate billing pipeline. You can either preload credit (effectively a gift card style top‑up) or let charges accrue to your on‑file payment method. Model prices are published per million input and output tokens. This is the billing channel that the Codex CLI and any self‑hosted automation will hit.
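To make the split concrete, here is a minimal call against that metered pipeline using the current openai Python SDK; the model and prompt are illustrative. Every token it consumes is debited from API credit, never from a Plus or Pro subscription.

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# This call is metered per token on the API ledger, regardless of
# whether the same account also holds a Plus or Pro subscription.
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Explain token-based billing in one sentence."}],
)
print(response.choices[0].message.content)
print("Tokens billed:", response.usage.total_tokens)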

Codex impact: Determines how long your local agents can run. Model quality is identical across account types; only the size of your credit pool changes.

2.5 Side‑by‑side comparison table

Offering      | Monthly price | Includes API tokens?       | Current Codex agent availability
ChatGPT Free  | €0            | No                         | None
ChatGPT Plus  | €20 / $20     | One‑time $5 promo          | Rolling out; lower limits
ChatGPT Pro   | €180 / $200   | One‑time $50 promo         | Full access; highest caps
API top‑up    | Any amount    | Yes — that is the product  | Used by Codex CLI

Promo credits expire after three months and cannot be cashed out.

3. Hands‑on: Using the Codex CLI locally

In the seven sub‑sections that follow we install the CLI, issue our first prompt, and progressively adopt more advanced flags. The prose intentionally repeats some steps from earlier chat messages because the aim here is to create a self‑contained guide you can bookmark.

3.1 Prerequisites

The CLI is officially supported on macOS, Ubuntu, Fedora, Debian, and Windows 11 via WSL2. You need:

  • Git 2.23+
  • Node.js 18 LTS or newer
  • An OpenAI API key with at least a few dollars in credit

# macOS example
brew install node git
npm install -g @openai/codex

3.2 Authentication

Create a scoped key in the dashboard and export it:

export OPENAI_API_KEY="sk-live-REDACTED"

For zsh, place the export statement in ~/.zprofile (or ~/.zshrc) so new terminals inherit the variable. Fish does not read POSIX export lines; use set -Ux OPENAI_API_KEY "sk-live-REDACTED" instead.

3.3 First run

Navigate to the root of any Git repository and invoke Codex without arguments. The REPL opens in Suggest mode.

cd ~/projects/legacy-php-migration
codex

Try:

> "Convert the ConfigLoader class to strict typing and PSR‑12 style."

Codex scans for ConfigLoader, proposes a patch, and waits for your confirmation. No files are written until you press y.

3.4 Approval modes and autonomy

  • --suggest (default) — preview diff, manual approval
  • --auto-edit — apply once, then exit
  • --full-auto — loop: propose → test → commit until success criteria met

For continuous‑integration setups, --full-auto plus a pull‑request target branch streamlines refactors that touch dozens of files. Combine with --max-cost to abort if the token meter threatens your budget ceiling.
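As a sketch of such a setup, a CI step could wrap the CLI with Python's subprocess module. The prompt, checkout path, and $5 ceiling below are placeholders, and the flags are the ones listed above.

import subprocess

# Illustrative CI wrapper: run Codex unattended against a checkout.
# The prompt, path, and cost ceiling are invented for this example.
result = subprocess.run(
    [
        "codex",
        "--full-auto",
        "--max-cost", "5.00",  # abort if the token meter would exceed $5
        "Migrate all class components to hooks and keep the tests green",
    ],
    cwd="/home/ci/checkout",
    capture_output=True,
    text=True,
)
print(result.stdout)
result.check_returncode()  # a non-zero exit fails the CI step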

3.5 Context enrichment with AGENTS.md

Create an AGENTS.md in the repo root:

### Build & test

* Run pnpm test to execute unit + integration suites (expect < 60 s).
* Use the staging branch for feature work; main is protected.

### Coding style

* React components must use hooks; no class components.
* SQL queries go through the Prisma layer.

Codex merges the guidelines with your prompt, reducing vacillation and aligning edits with house style.

3.6 Model selection

Specify a model per invocation:

codex --model o3 "Rewrite dashboard to headless UI"

Valid model identifiers (June 2025): o3, o3-pro, gpt-4o, gpt-4o-mini. By default the CLI chooses o3 for code‑heavy prompts and falls back to gpt‑4o‑mini for documentation requests.

3.7 Safety net features

  • Sandboxing — commands run inside a Firecracker micro‑VM. Your host OS stays untouched.
  • Git checkpoints — every accepted patch auto‑commits to the current branch with a conventional commit message.
  • Network guardrails — egress is blocked unless --allow-network is passed.

4. Chat‑style workflows vs. Codex CLI

At first glance, pasting code snippets into ChatGPT offers quicker gratification. You write, it answers. No installation friction. Yet the gap widens starkly as project size and task complexity grow. The following sections dissect six axes of comparison.

4.1 Repository visibility

Codex CLI ingests the entire working tree up to 192k tokens. ChatGPT sees only the lines you paste. A missed helper function or hidden environment variable can invalidate an otherwise perfect refactor.

4.2 Editing horsepower

Like a senior engineer armed with git add -p, the CLI can both generate and apply diffs. ChatGPT outputs plain text; you must transpose edits manually. In small modules this is trivial; across a monorepo the overhead becomes days.

4.3 Automated testing feedback loop

Codex executes your unit suite after each patch, capturing stack traces and using them as new context. The loop repeats until tests pass or cost/time limits fire. In ChatGPT you copy the exception, paste it, wait, then replicate the fix yourself.
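Conceptually, the loop looks like the skeleton below. The propose_and_apply_patch stub stands in for the model call (the agent's real internals are not public), and pnpm test is just one possible test command.

import subprocess

def run_tests() -> tuple[bool, str]:
    """Run the unit suite; return (passed, captured output)."""
    proc = subprocess.run(["pnpm", "test"], capture_output=True, text=True)
    return proc.returncode == 0, proc.stdout + proc.stderr

def propose_and_apply_patch(feedback: str) -> None:
    """Hypothetical stand-in for the model call that edits files in place."""
    raise NotImplementedError("the agent's patch step goes here")

# Failing output becomes fresh context until the suite passes
# or the iteration budget runs out.
feedback = ""
for attempt in range(5):
    propose_and_apply_patch(feedback)
    passed, feedback = run_tests()
    if passed:
        print(f"Tests green after {attempt + 1} iteration(s)")
        break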

4.4 Privacy guarantees

The CLI’s default “local‑only” mode uploads embeddings and partial file fragments but never your full raw code. ChatGPT literally receives whatever you dump in the chat window, verbatim. For clients under NDA, that distinction is critical.

4.5 Cost predictability

ChatGPT Plus subscribers enjoy the illusion of “all‑you‑can‑eat within caps,” but those caps are per account, not per project. A chat session that spans five evenings can quickly exhaust the 80‑message limit. The API, conversely, is elastic: pay only for what you consume. Many dev teams find that after bulk purchase discounts, even o3‑pro costs less than an additional Pro seat for non‑interactive workloads.

4.6 Scriptability

Finally, the CLI integrates with CI runners, Git hooks, and editor commands. You can orchestrate a Friday night mass‑migration, go home, and return Monday to a pull request plus a Slack digest. ChatGPT’s UI requires someone at the keyboard.

5. Counting tokens and predicting API cost

Pricing tables quote dollars per million tokens. That abstraction is elegant for billing systems but opaque to humans. Below we translate it to lines, files, and commits.

5.1 Token anatomy

A token is roughly 4 bytes of UTF‑8 for Latin scripts or, more concretely, about 0.75 words of English prose. Source code has higher entropy per character, so the rule of thumb hovers around 5 tokens per line.
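You can sanity‑check these ratios with tiktoken. The o200k_base encoding below is an assumption about which tokenizer recent models use, and the sample lines are arbitrary.

import tiktoken

# o200k_base is the encoding used by recent OpenAI models (assumption).
enc = tiktoken.get_encoding("o200k_base")
for line in ("x = x + 1", "def load_config(path: str) -> dict:"):
    print(len(enc.encode(line)), "tokens:", repr(line))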

5.2 Current model prices (19 June 2025)

Model       | Strengths                    | Input $/M | Output $/M
o3          | Best quality‑speed trade‑off | $2        | $8
o3‑pro      | Highest precision            | $20       | $80
gpt‑4o      | Multimodal, good at code     | $5        | $20
gpt‑4o‑mini | Lowest cost 4‑series         | $0.60     | $2.40

5.3 Formula

cost = (input_tokens / 1,000,000) × input_price + (output_tokens / 1,000,000) × output_price
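The same formula as executable Python, evaluated here with the o3 prices from the table above for the worked example in the next subsection:

# Direct translation of the formula; prices are dollars per million tokens.
def api_cost(input_tokens: int, output_tokens: int,
             input_price: float, output_price: float) -> float:
    return (input_tokens / 1_000_000) * input_price + \
           (output_tokens / 1_000_000) * output_price

print(round(api_cost(120_000, 8_000, 2.0, 8.0), 4))  # 0.304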

5.4 Worked example: a 120 k‑token refactor

Suppose your monorepo weighs 120 k tokens and the generated diff plus explanations amount to 8 k output tokens. Running on o3:

  • Input: 120 k × ($2 / 1 M) = $0.24
  • Output: 8 k × ($8 / 1 M) = $0.064
  • Total: ≈ $0.30

Even quadrupling the run for retries and chatty explanations barely pushes the bill past a dollar. The same task on o3‑pro comes to roughly $3.04, still cheaper than a single Plus seat if the bulk of your coding helpers are automated.

5.5 Counting tokens programmatically

import pathlib
import tiktoken

enc = tiktoken.encoding_for_model("o3")  # may raise KeyError on older tiktoken builds
size = 0
# rglob does not expand brace patterns like "*.{py,tsx,java}",
# so each extension is globbed separately
for pattern in ("*.py", "*.tsx", "*.java"):
    for path in pathlib.Path(".").rglob(pattern):
        size += len(enc.encode(path.read_text(encoding="utf-8", errors="ignore")))
print("Repo size:", size, "tokens")

Run this script inside any Git root to approximate billing before you launch the CLI.

6. Seven practical cost‑optimisation tactics

  1. Model cascading: Use gpt‑4o‑mini for boilerplate, escalate to o3 only for complex refactors.
  2. Dry‑run prompts: A cheap --explain pass outlines the approach without editing files, letting you rewrite vague instructions before spending tokens.
  3. Stream and truncate: Pass stream=True and stop generation once the diff is clear, cutting output costs (see the sketch after this list).
  4. Reuse cached input: When a prompt prefix matches a recent call, cached input tokens are billed at a discount, and the asynchronous Batch API halves the price of non‑urgent jobs.
  5. Chunk by module: Instead of feeding an entire 600 k‑token monorepo, operate on packages/ui today and packages/api tomorrow.
  6. Watch your max‑tokens: Hard‑limit output to 1,000 tokens unless you genuinely need verbose rationale.
  7. Dashboards & alerts: Configure the OpenAI billing console to email at 50 %, 75 %, and 90 % of your monthly budget.
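A sketch of tactic 3 with the OpenAI Python SDK. The stopping condition (a closed fenced block) is a deliberately naive illustration, not a robust parser.

from openai import OpenAI

client = OpenAI()

# Stream the response and stop as soon as the fenced diff block closes,
# so tokens we would discard anyway are never generated or billed.
stream = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Output a unified diff renaming foo to bar."}],
    stream=True,
)
text = ""
for chunk in stream:
    text += chunk.choices[0].delta.content or ""
    if text.count("```") >= 2:  # naive check: the code fence has closed
        stream.close()  # ends generation early
        break
print(text)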

7. Frequently asked questions

7.1 Does Pro give me unlimited API tokens?

No. Pro grants unlimited ChatGPT messages (within fair‑use), not unlimited API usage. The API remains metered.

7.2 Can I share my Plus seat across the team?

The Terms of Service prohibit account sharing. For teams, consider the ChatGPT Team or Enterprise tiers, or integrate Codex CLI workflows that debit a shared API credit pool.

7.3 Is the Codex agent the same model as o3?

Not exactly. The agent is a system of models (policy, executor, critic) curated by OpenAI. When you ask it to “refactor”, the underlying calls typically hit o3, though the outer loop may involve cheaper models for planning.

7.4 Will the Codex CLI ever be free?

Unlikely. The infrastructure cost scales with tokens processed. However, OpenAI periodically offers promo credits during hackathons and educational partnerships.

7.5 Can I self‑host a private Codex backend?

Not today. Model weights remain proprietary. Enterprises can, however, sign data‑processing agreements and route traffic through Azure OpenAI for geographic or compliance reasons.

8. Real‑world case studies

8.1 Startup A: Migrating a Rails monolith

Startup A maintained a 1.2 million‑token Rails application with no automated tests. Over six weeks, they installed Codex CLI in a Dockerised CI pipeline. A senior developer wrote an AGENTS.md describing a “test‑first conversion to Stimulus + Turbo.” The team:

  • Ran nightly --suggest passes, reviewing diffs each morning.
  • Switched to --auto-edit once confidence rose.
  • Cut build times by 60 % and reduced manual pull‑request volume by 36 %.

Cost breakdown: ~21 M input tokens × $2/M + ~3.2 M output tokens × $8/M ≈ $42.00 + $25.60 = $67.60. For context, one engineer‑week in San Francisco exceeds $4,000.

8.2 Agency B: Content transformation at scale

A marketing agency processes hundreds of client blog posts, converting Markdown to HTML, embedding semantic metadata, and rewriting CTAs. They orchestrated Codex tasks via bash scripts on GitHub Actions.

By leveraging gpt‑4o‑mini for the boilerplate conversion and escalating only the CTA rewrites to o3, they kept the per‑article cost below $0.02.
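A sketch of that cascade in Python; the function names and prompts are invented for illustration, and a real pipeline would add batching and error handling.

from openai import OpenAI

client = OpenAI()

def convert_post(markdown: str) -> str:
    """Cheap pass: mechanical Markdown-to-HTML conversion on the mini model."""
    r = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "Convert to HTML:\n" + markdown}],
    )
    return r.choices[0].message.content

def rewrite_cta(cta: str) -> str:
    """Escalate only the high-value CTA rewrite to the stronger model."""
    r = client.chat.completions.create(
        model="o3",
        messages=[{"role": "user", "content": "Rewrite this call to action:\n" + cta}],
    )
    return r.choices[0].message.content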

8.3 Enterprise C: Strict security posture

An EU fintech needed code help but could not upload source files to third‑party services. They:

  1. Deployed the Codex CLI on an on‑prem VM behind a transparent proxy.
  2. Enabled network egress only to api.openai.com.
  3. Used --no-cache to avoid writing embeddings to disk.

After a three‑month security audit, in‑house counsel approved production use. The local approach satisfied GDPR concerns because raw code never left the private subnet.

9. Conclusion & next steps

You should now be able to:

  • Identify which OpenAI offering best matches your workflow
  • Bootstrap the Codex CLI and run it safely inside any Git repository
  • Estimate API costs down to the cent before launching a long refactor
  • Optimise those costs with model cascading, streaming, and caching

The logical next experiment is to run a small task — perhaps converting one utility file in your current project — and compare:

  1. A ChatGPT Plus session (time spent, mental overhead)
  2. An o3 CLI session (token cost, diff clarity)

Measure, then decide whether the subscription or the pay‑as‑you‑go route offers better ROI. Armed with the knowledge in this guide, you will no longer treat the numbers as a guessing game.


Copyright © 2025 Sergej Dergatsjev. All rights reserved. Permission granted to share and adapt with attribution. Online Solutions Group BV
