What Is a Token? (The 30-Second Version)
A token is a chunk of text that an AI model reads and produces. Not a word. Not a character. A chunk. On average, one token equals roughly 4 characters, or about three-quarters of a word.
"Listing" is 1 token. "Real estate agent" is 3 tokens. "3-bedroom ranch with updated kitchen" is about 7 tokens. You can see exactly how any text splits into tokens using OpenAI's free tokenizer tool.
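If you just want a ballpark without opening a tokenizer, the 4-characters-per-token rule is easy to apply in code. A rough sketch (for exact counts, use a real tokenizer such as OpenAI's tiktoken library):

```python
# Ballpark token estimate using the ~4-characters-per-token rule of thumb.
def estimate_tokens(text: str) -> int:
    """Approximate token count: roughly one token per 4 characters."""
    return max(1, round(len(text) / 4))

print(estimate_tokens("3-bedroom ranch with updated kitchen"))  # → 9 (actual is ~7)
```

The rule overshoots slightly on short phrases, but for estimating whether a document is 5,000 tokens or 50,000, it's plenty accurate.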
Why does this matter? Because tokens are how AI models measure everything. Your prompt has a token count. The AI's response has a token count. The model's memory -- how much it can "hold" in a single conversation -- is measured in tokens. And if you're using the API, your bill is calculated in tokens.
Think of tokens as the square footage of an AI conversation. Your prompt is the rooms you're furnishing. The AI's response is the rooms it builds for you. The context window is the total floor plan. Everything fits inside that footprint -- or it doesn't.
Tokens in Practice: Real Estate Examples
Let's make this concrete with tasks you actually do.
Listing description: A typical prompt (property details + instructions + your Context Card) runs about 400-600 input tokens. The AI's response (a 200-word listing description) is about 250-350 output tokens. Total: roughly 700-950 tokens. At GPT-4o API rates, that's about $0.004-$0.005. Half a penny.
Buyer follow-up email: A prompt with lead details and your email style is about 300-500 input tokens. The drafted email is 200-400 output tokens. Total: 500-900 tokens. Cost: under $0.01.
Neighborhood market analysis: If you paste in MLS data, recent sales, and neighborhood stats, you might send 3,000-5,000 input tokens. The AI's analysis might be 1,500-2,500 output tokens. Total: 4,500-7,500 tokens. Cost: about $0.03-$0.05.
Inspection report summary: A full inspection report pasted into Claude could be 15,000-25,000 input tokens. The summary might be 1,000-2,000 output tokens. Total: 16,000-27,000 tokens. Cost: about $0.06-$0.10. That's a dime to summarize a 30-page report.
The pattern is clear: individual tasks cost almost nothing in tokens. Even if you process 100 listing descriptions per month via the API, you're looking at roughly 50 cents to a dollar total. Tokens only become a meaningful cost factor at high volume or with very long documents.
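The arithmetic behind these estimates is simple enough to sketch. A minimal example, assuming the approximate GPT-4o API rates quoted in the pricing table later in this piece (~$2.50 per million input tokens, ~$10 per million output tokens):

```python
# Per-task cost math, assuming approximate GPT-4o API rates:
# ~$2.50 per 1M input tokens, ~$10 per 1M output tokens.
GPT4O_INPUT_PER_M = 2.50
GPT4O_OUTPUT_PER_M = 10.00

def task_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one API call at the assumed GPT-4o rates."""
    return (input_tokens * GPT4O_INPUT_PER_M
            + output_tokens * GPT4O_OUTPUT_PER_M) / 1_000_000

# One listing description: ~600 input + ~350 output tokens
print(f"${task_cost(600, 350):.4f}")        # → $0.0050
# 100 listing descriptions per month
print(f"${task_cost(600, 350) * 100:.2f}")  # → $0.50
```

Swap in the rates for whatever model you're using; the structure of the calculation stays the same.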
Context Windows: Your AI's Memory Limit
The context window is the total number of tokens an AI model can process in a single conversation -- your prompts, the AI's responses, and everything in between. When you hit the limit, the AI starts "forgetting" the earliest parts of your conversation.
Here's where the major models stand:
- GPT-4o: 128,000 tokens (~96,000 words)
- Claude (Anthropic): 200,000 tokens (~150,000 words)
- Gemini 1.5 Pro: 1,000,000 tokens (~750,000 words)
For real estate, context window size matters most when you're working with long documents. A 50-page purchase agreement is roughly 15,000-20,000 tokens. A full CMA packet might be 30,000 tokens. An inspection report runs 15,000-25,000 tokens. All of these fit comfortably inside any modern model's context window.
Where agents run into trouble: long, multi-turn conversations. If you've been going back and forth with ChatGPT for 30 messages about a complex deal, and the responses start feeling disconnected from what you discussed earlier, you've likely filled the context window. The solution isn't a bigger model -- it's a new conversation with a fresh Context Card that includes the key details from your previous chat.
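You can apply the same rough token math to check whether a document will fit before you paste it. A sketch, using the window sizes listed above; the 4,000-token budget reserved for the model's reply is an illustrative choice:

```python
# Quick pre-paste check: will this document fit the model's context window?
# Uses the ~4 chars/token approximation and the window sizes listed above;
# the 4,000-token reply budget is an illustrative assumption.
CONTEXT_WINDOWS = {
    "gpt-4o": 128_000,
    "claude": 200_000,
    "gemini-1.5-pro": 1_000_000,
}

def fits(document_chars: int, model: str, reply_budget: int = 4_000) -> bool:
    """True if the estimated document tokens plus a reply budget fit the window."""
    estimated_tokens = document_chars // 4
    return estimated_tokens + reply_budget <= CONTEXT_WINDOWS[model]

# A ~30-page inspection report: ~80,000 characters ≈ 20,000 tokens
print(fits(80_000, "gpt-4o"))  # → True
```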
Token Costs: Subscription vs API Pricing
| Option | Price | Token Limit | Best For |
|---|---|---|---|
| ChatGPT Plus | $20/month flat | Generous message limits (rate-throttled at peak) | Individual agents, daily use |
| Claude Pro | $20/month flat | Generous message limits (rate-throttled at peak) | Long documents, detailed instructions |
| OpenAI API (GPT-4o) | ~$2.50/1M input, ~$10/1M output | 128K per request | Automations, bulk processing |
| Anthropic API (Claude Sonnet 4) | ~$3/1M input, ~$15/1M output | 200K per request | Complex workflows, large documents |
| OpenAI API (GPT-4o-mini) | ~$0.15/1M input, ~$0.60/1M output | 128K per request | High-volume, simple tasks |
Subscription plans charge flat monthly rates. API pricing is per-token, pay-as-you-go. Prices as of February 2026.
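To compare what a given workload would cost across the API options, you can turn the table's per-token rates into a quick calculation. A sketch, using the approximate rates above and a hypothetical workload of 500K input and 200K output tokens per month:

```python
# The table's per-token rates as data, to compare a monthly workload across
# models. Rates are the approximate February-2026 figures from the table; the
# 500K input / 200K output workload is a hypothetical example.
RATES = {  # model: (USD per 1M input tokens, USD per 1M output tokens)
    "gpt-4o":        (2.50, 10.00),
    "claude-sonnet": (3.00, 15.00),
    "gpt-4o-mini":   (0.15, 0.60),
}

def monthly_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of a month's token usage at the assumed rates."""
    in_rate, out_rate = RATES[model]
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

for model in RATES:
    print(f"{model}: ${monthly_cost(model, 500_000, 200_000):.2f}")
# → gpt-4o: $3.25, claude-sonnet: $4.50, gpt-4o-mini: $0.20
```

Even the most expensive option here is a few dollars a month at that volume, which is the point of the section that follows.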
When Tokens Matter (and When They Don't)
Here's the honest breakdown: if you're using ChatGPT Plus or Claude Pro at $20/month, tokens don't affect your bill at all. You've paid the flat fee. Use it as much as you want. The only limit is rate throttling during peak times, and most agents never notice it.
ChatGPT Plus reportedly has over 10 million subscribers paying that flat $20. The vast majority never think about tokens. And that's fine. If you're using the app for conversation-based work -- writing, research, analysis, brainstorming -- tokens are invisible to you. Stay on the subscription. Don't overthink it.
Tokens start mattering in three scenarios:
1. API integrations. If you're building automations or your brokerage is running AI through code, you're paying per token. Understanding token costs helps you estimate monthly expenses and choose the right model. GPT-4o-mini at $0.15 per million input tokens is 16x cheaper than GPT-4o -- and for simple tasks like drafting short emails, the output quality is nearly identical.
2. Very long documents. Pasting a 100-page document into the AI uses a lot of tokens from your context window. If you're working with large files regularly, a model with a bigger context window (Claude's 200K vs GPT-4o's 128K) gives you more room. This isn't about cost -- it's about capacity.
3. Context window overflow. When your conversation gets so long that the AI starts losing track of what you said earlier, that's a token problem. The fix: start fresh with a new conversation and include the essential context upfront in your Context Card.
The OODA Loop for Token Management
If you're using the API or managing AI costs for a team, the OODA Loop gives you a framework for optimizing token spend.
Observe: Track your actual token usage. OpenAI's dashboard shows usage by day and model. Anthropic provides the same. Look at how many tokens each workflow consumes. You might find that one automation accounts for 80% of your bill.
Orient: Map token usage to business value. That lead-response automation using 500K tokens per month at $6 might be generating $4,000 in additional commissions. That's a 66,000% return. But that research tool pulling 2M tokens per month at $25 for data nobody reads? That's waste.
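That return figure is just basic ROI arithmetic applied to token spend, and it's worth replicating when you map your own workflows:

```python
# The return-on-spend arithmetic from the example above.
def roi_percent(revenue: float, cost: float) -> float:
    """Percentage return: net gain divided by cost."""
    return (revenue - cost) / cost * 100

# $4,000 in commissions on $6 of token spend
print(f"{roi_percent(4_000, 6):,.0f}%")  # → 66,567% (the ~66,000% figure above)
```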
Decide: Choose the right model for each task. You don't need GPT-4o for every job. Simple tasks (email subject lines, quick summaries, data extraction) work fine on GPT-4o-mini at a fraction of the cost. Reserve the powerful models for complex analysis, long-document processing, and tasks where quality directly impacts revenue.
Act: Implement model routing. Send simple requests to the cheap model, complex requests to the powerful model. Trim unnecessary context from prompts -- don't paste your entire Context Card when you only need one section. Set usage alerts so you're not surprised by a spike. Review monthly and adjust.
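Model routing can be as simple as a lookup plus a size threshold. A minimal sketch of the idea; the task labels and the 2,000-token cutoff are illustrative assumptions, not fixed rules:

```python
# A minimal model-routing sketch for the "Act" step: cheap model for short,
# simple tasks; the stronger model for everything else. The task labels and
# the 2,000-token threshold are illustrative assumptions.
CHEAP_MODEL = "gpt-4o-mini"
STRONG_MODEL = "gpt-4o"
SIMPLE_TASKS = {"email_subject", "quick_summary", "data_extraction"}

def route(task_type: str, estimated_input_tokens: int) -> str:
    """Pick a model for a request based on task type and size."""
    if task_type in SIMPLE_TASKS and estimated_input_tokens < 2_000:
        return CHEAP_MODEL
    return STRONG_MODEL

print(route("email_subject", 300))      # → gpt-4o-mini
print(route("market_analysis", 4_500))  # → gpt-4o
```

In a real automation you'd plug the returned model name into your API call; the routing logic itself stays this simple.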
For individual agents on the $20 subscription: you can skip all of this. Tokens are abstracted away. Focus on getting better at prompting instead -- that's where the real value lives.