AXON Playground
Translate natural language into compact, token-efficient notation. Runs entirely in your browser via WebAssembly.
vNext
What's coming next for AXON — from prompt compression to native model understanding.
Fine-Tuned AXON Architecture
Today, AXON compresses prompts and relies on the base model to interpret notation via a system prompt. A fine-tuned model eliminates the system prompt overhead entirely and gains efficiency at every stage of inference.
Full Stack: AXON + Qwen2.5-7B-AXON + Caveman
Three complementary approaches — each targeting a different stage of the inference pipeline. AXON compresses input, a fine-tuned 7B model understands it natively, and Caveman compresses output. Combined, they minimize tokens at every stage.
Three-Way Comparison
How each approach stacks across the inference pipeline. Columns are cumulative — each adds to the previous.
| Stage | Baseline | + AXON | + Qwen-7B-AXON | + Caveman |
|---|---|---|---|---|
| Prompt tokens | 105 tok | 68 tok | 68 tok | 68 tok |
| System prompt | 0 | ~500 tok | 0 | ~20 tok |
| Total input to model | 105 | ~568 | 68 | 88 |
| Prefill attention | O(105²) | O(568²) | O(68²) | O(88²) |
| Model size | API (large) | API (large) | 7B (local) | 7B (local) |
| Output format | Natural lang | Natural lang | AXON notation | Terse AXON |
| Output tokens (est.) | ~2,038 | ~2,038 | ~1,000 | ~610 |
| Client decode | None | None | WASM decoder | WASM decoder |
| Total tokens | ~2,143 | ~2,606 | ~1,068 | ~698 |
| vs Baseline | — | +21.6% | -50.2% | -67.4% |
The "+" columns are cumulative. "+ Qwen-7B-AXON" replaces the API model with a local 7B fine-tune that natively speaks AXON. "+ Caveman" adds terse output shaping. Text-only pipeline — no image tokens.
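The table's cumulative accounting can be reproduced in a few lines of Rust. This is a sketch using the estimated figures quoted above; the struct and function names are illustrative, not part of the AXON codebase.

```rust
// Recompute the totals and savings from the three-way comparison table.
// All figures are the table's estimates, not fresh measurements.
struct Config {
    name: &'static str,
    prompt: u32, // prompt tokens sent
    system: u32, // system-prompt overhead
    output: u32, // estimated output tokens
}

fn total(c: &Config) -> u32 {
    c.prompt + c.system + c.output
}

// Percent change vs the baseline total (negative = savings).
fn vs_baseline(c: &Config, baseline_total: u32) -> f64 {
    (total(c) as f64 - baseline_total as f64) / baseline_total as f64 * 100.0
}

fn main() {
    let configs = [
        Config { name: "Baseline", prompt: 105, system: 0, output: 2038 },
        Config { name: "+ AXON", prompt: 68, system: 500, output: 2038 },
        Config { name: "+ Qwen-7B-AXON", prompt: 68, system: 0, output: 1000 },
        Config { name: "+ Caveman", prompt: 68, system: 20, output: 610 },
    ];
    let base = total(&configs[0]);
    for c in &configs {
        println!("{:<16} total={:>5} vs baseline={:+.1}%", c.name, total(c), vs_baseline(c, base));
    }
}
```

Running this yields the totals in the table: 2,143, 2,606 (+21.6%), 1,068 (-50.2%), and 698 (-67.4%).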
Why qwen2.5:7b-instruct? It's an open-weight 7B instruction-tuned model — small enough to run locally on a single GPU (or even CPU via Ollama/llama.cpp), yet capable enough for code assistance, technical Q&A, and structured output. Fine-tuning it on AXON notation (qwen2.5-7b-axon) would produce a model that natively understands and generates AXON without a system prompt.
The key advantage over API models: zero per-token cost, full control over the model, and the system prompt overhead disappears entirely. Combined with Caveman-style terse output baked into the fine-tune, every token in and out carries maximum information density.
Today vs Fine-Tuned
| Metric | Baseline | Today (AXON) | vNext (Fine-Tuned) |
|---|---|---|---|
| System prompt | 0 tok | ~500 tok | 0 tok |
| Prompt tokens | 105 | 68 | 68 |
| Total input to model | 105 | ~568 | 68 |
| Attention compute | O(105²) | O(568²) | O(68²) |
| Response format | Natural language | Natural language | AXON notation |
| Output tokens (est.) | 2,038 | 2,038 | ~1,000 |
| Prefill speedup | 1x | 0.2x (slower) | 1.5x faster |
Estimates based on current AXON v1.1 L0 compression ratios. Actual fine-tuned model performance will depend on training data quality and model architecture.
The system prompt paradox. Today, AXON saves 35% on prompt tokens but adds ~500 tokens of system prompt to teach the model the notation. This means AXON only breaks even on prompts longer than ~1,430 tokens (the point where a 35% saving recoups the ~500-token overhead: 500 / 0.35 ≈ 1,430). A fine-tuned model eliminates this overhead entirely — every prompt is net positive, no matter how short.
Roadmap
v1.1 — Tiered Abbreviations
Four abbreviation levels (L0-L3), stop-word stripping in code queries, response compression directive. 35% prompt savings on general technical questions, 71% on code commands.
v1.2 — Bidirectional WASM Codec
Full encode/decode round-trip in the WASM engine. Client-side decoding of AXON responses at zero API cost. Enables the fine-tuned model to respond in AXON notation with lossless reconstruction.
v2.0 — qwen2.5-7b-axon
qwen2.5:7b-instruct fine-tuned to natively understand and produce AXON notation. No system prompt needed. Runs locally on a single GPU. Both input and output compressed. Target vs today's AXON-plus-system-prompt pipeline: 88% fewer input tokens (~568 → 68), ~50% fewer output tokens, and ~70x less prefill attention compute per request (568² vs 68²).
Custom AXON Tokenizer
A BPE tokenizer trained on AXON notation, where sigils, operators, and abbreviated terms are single tokens instead of multi-token sequences. Further reduces token count at the tokenizer level.
The Full Stack
Every layer of the inference stack has compression opportunities. Today AXON addresses one layer. The vision is all of them.
Specification
AXON v1.1 — ASCII-only operators, conditional sigils, tiered abbreviation levels, compound merging, response compression.
Type Sigils
Sigils are applied conditionally — only to tokens in the known entity/concept/verb databases. Unknown tokens are emitted bare.
| Sigil | Name | Example |
|---|---|---|
| @ | Entity / Agent | @sun @openai @user |
| # | Concept / Abstract | #gravity #justice |
| ~ | Process / Action | ~emit ~learn |
| $ | Scalar / Value | $high $3.14 |
| ^ | Temporal | ^now ^T+3d |
| ! | Negation | !evidence !data |
| ? | Query / Unknown | ?cause ?result |
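The conditional-sigil rule, together with the compound-merging rule from the LLM Prompt section, can be sketched in Rust. The word lists here are a tiny illustrative subset, not the real entity/concept/verb databases.

```rust
use std::collections::HashSet;

/// Apply sigils only to known tokens; unknown tokens stay bare, and
/// consecutive bare tokens hyphenate into compounds.
fn sigilize(words: &[&str]) -> Vec<String> {
    // Tiny stand-ins for the known-token databases.
    let entities: HashSet<&str> = HashSet::from(["sun", "openai", "user"]);
    let concepts: HashSet<&str> = HashSet::from(["gravity", "radiation", "ultraviolet"]);
    let verbs: HashSet<&str> = HashSet::from(["emit", "learn"]);

    let mut out = Vec::new();
    let mut bare: Vec<&str> = Vec::new();
    for &w in words {
        let tagged = if entities.contains(w) {
            Some(format!("@{w}"))
        } else if concepts.contains(w) {
            Some(format!("#{w}"))
        } else if verbs.contains(w) {
            Some(format!("~{w}"))
        } else {
            None
        };
        match tagged {
            Some(t) => {
                if !bare.is_empty() {
                    out.push(bare.join("-")); // consecutive bare tokens hyphenate
                    bare.clear();
                }
                out.push(t);
            }
            None => bare.push(w), // unknown token: no sigil
        }
    }
    if !bare.is_empty() {
        out.push(bare.join("-"));
    }
    out
}

fn main() {
    // "The sun emits ultraviolet radiation", after filler stripping:
    println!("{}", sigilize(&["sun", "emit", "ultraviolet", "radiation"]).join(" "));
    // Unknown tokens compound: "auth service" -> "auth-service"
    println!("{}", sigilize(&["auth", "service"]).join(" "));
}
```

The first call reproduces the documented example `@sun ~emit #ultraviolet #radiation`.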
Operators (ASCII only)
| Operator | Name | Reads as |
|---|---|---|
| -> | Causes | "leads to" |
| <- | Result of | "caused by" |
| :. | Therefore | "conclusion" |
| bc | Because | "premise / reason" |
| && | And | "conjunction" |
| || | Or | "disjunction" |
| A. | For all | "universal" |
| E. | Exists | "existential" |
Confidence Markers
| Marker | Meaning |
|---|---|
| !! | Certain |
| ! | High confidence |
| ~ | Moderate confidence |
| * | Low confidence |
| ** | Speculative |
Temporal Markers
| Marker | Meaning |
|---|---|
| ^now | Present moment |
| ^T-Nd | N days in the past |
| ^T+Nd | N days in the future |
| ^A.t | For all time |
Abbreviation Tiers (v1.1)
AXON supports four abbreviation levels, each inclusive of the previous. Select a level to match your compression needs.
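The inclusive tier semantics can be sketched as a layered lookup table in Rust. The mappings are copied from the system prompt in the LLM Prompt section; treating L3's `function -> f` as overriding L0's `function -> fn` is an assumption about how "maximum" compression resolves the conflict.

```rust
use std::collections::HashMap;

#[derive(Clone, Copy, PartialEq, PartialOrd)]
enum Tier { L0, L1, L2, L3 }

/// Build the abbreviation table for a tier. Each level layers on top of
/// the previous one, so higher tiers include (and may override) lower ones.
fn table(tier: Tier) -> HashMap<&'static str, &'static str> {
    let l0 = [("object","obj"),("function","fn"),("component","comp"),("database","db"),
              ("connection","conn"),("implementation","impl"),("authentication","auth"),
              ("application","app"),("configuration","cfg"),("environment","env"),
              ("performance","perf"),("memory","mem"),("render","rnd"),("parameter","param")];
    let l1 = [("protocol","proto"),("process","proc"),("algorithm","algo"),("container","ctr"),
              ("endpoint","ep"),("interface","iface"),("architecture","arch"),
              ("library","lib"),("package","pkg")];
    let l2 = [("between","btw"),("difference","diff"),("example","ex"),("strategy","strat"),
              ("alternative","alt"),("mechanism","mech"),("automatic","auto"),
              ("distributed","distrib")];
    let l3 = [("function","f"),("variable","v"),("string","s"),("number","n"),
              ("boolean","b"),("message","m")];

    let mut map = HashMap::new();
    map.extend(l0);
    if tier >= Tier::L1 { map.extend(l1); }
    if tier >= Tier::L2 { map.extend(l2); }
    if tier >= Tier::L3 { map.extend(l3); }
    map
}

fn abbreviate<'a>(word: &'a str, tier: Tier) -> &'a str {
    table(tier).get(word).copied().unwrap_or(word)
}

fn main() {
    println!("{}", abbreviate("database", Tier::L0)); // db
    println!("{}", abbreviate("protocol", Tier::L0)); // unchanged: L1 term, L1 not active
    println!("{}", abbreviate("function", Tier::L3)); // f (L3 layered over L0's fn)
}
```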
Command Verbs >
>doc >impl >fix >test >rev >ref >opt >plan >dep >add >rm >up >mv >cfg >mig >db >api >ci >sec >err >log >bench >lint >merge >explain
Query Types ?
?how ?why ?best ?what ?diff ?when ?where ?can ?cmp ?alt ?err ?perf
Structural Operations
| Syntax | Meaning |
|---|---|
| @Type+.field | Add field |
| @Type-.field | Remove field |
| @Type.x=$v | Set field value |
| @Type.x:T | Set field type |
| @Type:impl(@Trait) | Implement trait |
| @Child<@Parent | Inherit / extend |
| +use(module) | Add import |
| -use(module) | Remove import |
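A minimal Rust sketch of parsing the add/remove-field forms from the table above. The enum and function names are illustrative; the real engine covers the full grammar.

```rust
/// The two field operations from the structural-operations table:
/// @Type+.field (add) and @Type-.field (remove).
#[derive(Debug, PartialEq)]
enum StructOp<'a> {
    AddField { ty: &'a str, field: &'a str },
    RemoveField { ty: &'a str, field: &'a str },
}

fn parse_field_op(s: &str) -> Option<StructOp<'_>> {
    let rest = s.strip_prefix('@')?; // structural ops start with the entity sigil
    if let Some((ty, field)) = rest.split_once("+.") {
        return Some(StructOp::AddField { ty, field });
    }
    if let Some((ty, field)) = rest.split_once("-.") {
        return Some(StructOp::RemoveField { ty, field });
    }
    None
}

fn main() {
    // "add a field email to User" -> @user+.email
    println!("{:?}", parse_field_op("@user+.email"));
}
```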
Examples
Curated translations showing AXON v1.1 in action. Click any example to try it in the playground.
LLM Prompt
Paste this system prompt into any LLM to enable AXON v1.1 notation understanding.
You are an expert in AXON v1.1 — AI eXchange Optimized Notation, a compact symbolic language that compresses natural language into token-efficient expressions.

## Sigils (applied only to known entities/concepts/verbs; unknown tokens are bare)
@ = Entity/Agent  # = Concept  ~ = Process/Action  $ = Scalar  ^ = Temporal  ! = Negation  ? = Query

## Operators (ASCII only)
-> = causes  <- = caused by  :. = therefore  bc = because  && = and  || = or  A. = for all  E. = exists

## Abbreviation Tiers (each level includes all previous)
L0 (default): object->obj, function->fn, component->comp, database->db, connection->conn, implementation->impl, authentication->auth, application->app, configuration->cfg, environment->env, performance->perf, memory->mem, render->rnd, parameter->param
L1 (extended): protocol->proto, process->proc, algorithm->algo, container->ctr, endpoint->ep, interface->iface, architecture->arch, library->lib, package->pkg
L2 (aggressive): between->btw, difference->diff, example->ex, strategy->strat, alternative->alt, mechanism->mech, automatic->auto, distributed->distrib
L3 (maximum): function->f, variable->v, string->s, number->n, boolean->b, message->m

## Rules
1. Strip filler words, articles, copulas, pleasantries
2. Sigil only known entities (@), concepts (#), verbs (~), scalars ($)
3. Unknown tokens get NO sigil — consecutive bare tokens hyphenate into compounds
4. Abbreviate using the active tier (default L0)
5. Confidence markers: !! (certain) ! (high) ~ (moderate) * (low) ** (speculative)
6. Temporal: ^now ^T-Nd ^T+Nd ^A.t

## Programming Commands
>doc >impl >fix >test >rev >ref >opt >plan >dep >add >rm >up >mv >cfg >mig >db >api >ci >sec >err >log >bench >lint >merge >explain

## Programming Queries
?how ?why ?best ?what ?diff ?when ?where ?can ?cmp ?alt ?err ?perf

## Structural Operations
@Type+.field @Type-.field @Type.x=$v @Type.x:T @Type:impl(@Trait) @A<@B +use(mod) -use(mod)

## Response Compression (optional — add to system prompt for output savings)
Respond using AXON notation where applicable: sigils, operators, abbreviations. Strip filler. Be terse.

## Examples
"fix the bug in the auth service" -> >fix bug:auth-service
"explain database connection pooling" -> >explain db-conn-pooling
"what is the best way to cache" -> ?best cache
"The sun emits ultraviolet radiation" -> @sun ~emit #ultraviolet #radiation
"add a field email to User" -> @user+.email
VS Code Extension
Translate prompts to AXON notation directly inside VS Code — saving tokens on every AI interaction.
Install
Install from the VS Code Marketplace or the command line:
Or search for AXON Notation in the VS Code Extensions panel (Ctrl+Shift+X).
View on Marketplace
Features
Chat Participant @axon
Type @axon in VS Code's chat panel to route prompts through AXON. Your input is translated to compact notation, sent to the language model, and the response is returned as normal.
Sidebar Chat
A dedicated AXON chat panel in the activity bar. Open it with Ctrl+Shift+A (Cmd+Shift+A on macOS) or via the command palette.
Multiple Targets
Send translated output to Claude Code, GitHub Copilot, clipboard, or the built-in chat — choose your preferred workflow.
Commands
| Command | Description |
|---|---|
| AXON: Open Chat | Focus the AXON sidebar panel |
| AXON: Translate to AXON | Translate text and choose where to send it |
| AXON: Translate and Send to Claude Code | Translate and send directly to Claude Code |
| AXON: Translate and Send to Copilot Chat | Translate and send to GitHub Copilot |
| AXON: Translate and Copy to Clipboard | Translate and copy the result |
How It Works
Translation Examples
| Natural Language | AXON |
|---|---|
| what is the best way to implement caching | ?best impl-caching |
| fix the bug in the auth service | >fix bug:auth-service |
| add a field email to User | @user+.email |
| CO2 emissions cause climate change which increases temperature | @CO2-emission -> #climate-change!! -> Δ$temp↑ |
Claude Code Skill
Use AXON notation directly inside Claude Code with the /axon slash command. Translate between natural language and AXON without leaving your terminal.
Install from Skill Marketplace
Install the AXON skill directly from the Claude Code skill marketplace:
Or install interactively from within Claude Code:
Once installed, the /axon slash command becomes available in all your Claude Code sessions.
Usage
Encode natural language
Pass any natural language text to get the AXON translation:
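For example, using a translation documented in the Examples table (the skill's exact output formatting may differ):

```
/axon what is the best way to implement caching
```

which translates to `?best impl-caching`.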
Decode AXON notation
Pass AXON notation to get a natural language explanation:
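For example, again using a documented translation (the exact wording of the explanation will vary):

```
/axon >fix bug:auth-service
```

which Claude expands back to "fix the bug in the auth service", with each sigil and operator explained.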
Interactive mode
Use /axon without arguments to start an interactive session where Claude will translate and explain AXON expressions.
What the Skill Does
Auto-detect direction
Automatically detects whether you're providing natural language or AXON notation and translates in the appropriate direction.
Token savings
Shows the token count comparison between the original text and the AXON translation, so you can see the savings.
Annotated output
Each sigil and operator is annotated so you can learn the notation as you use it.
AXON + Caveman
AXON compresses the prompt. Caveman compresses the response. Together they achieve the best total token savings of any approach tested.
AXON and Caveman operate at opposite ends of the pipeline. AXON rewrites your prompt into compact notation (input tokens), while Caveman instructs the model to respond concisely (output tokens). They don't overlap — they stack.
Inference Architecture
Where each approach operates in the LLM inference pipeline.
Total Token Cost
Prompt + response tokens across 10 technical questions, measured with BPE-aware estimation. Model: Claude Opus 4.6.
| Configuration | Prompt | Response | Total | vs Baseline |
|---|---|---|---|---|
| Baseline | 105 | 2,038 | 2,143 | — |
| Terse ("Answer concisely.") | 105 | 2,168 | 2,273 | +6.1% |
| Compress | 105 | 1,690 | 1,795 | -16.2% |
| Caveman | 105 | 1,071 | 1,176 | -45.1% |
| AXON only | 68 | 2,038 | 2,106 | -1.7% |
| AXON + Caveman | 68 | 1,071 | 1,139 | -46.9% |
Where Each Approach Saves
35.2% avg input token savings (AXON)
47.4% avg output token savings (Caveman)
Per-Prompt Breakdown
AXON prompt savings vs Caveman response savings for each of the 10 benchmark questions.
| Prompt | AXON | Caveman | Compress |
|---|---|---|---|
| Average | 35.2% | 47.4% | 17.1% |
Response Text Compression
Running response text through the AXON translator shows additional compression potential on the output side.
How to Combine Them
1. Compress the prompt with AXON
Use the AXON translator (VS Code extension, REPL, or WASM) to convert your natural language prompt to compact notation before sending it to the model.
2. Compress the response with Caveman
Add the Caveman system prompt to instruct the model to respond concisely. The model understands AXON input and responds in terse, information-dense style.
3. Result: 46.9% total savings
AXON handles the input, Caveman handles the output. No overlap, no quality loss, nearly half the token cost.
Methodology
Data sourced from the Caveman evaluation suite. 10 technical programming questions tested across 6 arms (baseline, terse, caveman, caveman-cn, caveman-es, compress) using Claude Opus 4.6. Token counts estimated using BPE-aware heuristics (whitespace-split + non-alphanumeric character counting). AXON translations performed by the Rust rule-based engine at L0 abbreviation level. Full test suite available in the AXON repository at lib/tests/caveman_comparison.rs.
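The BPE-aware heuristic described above (whitespace-split plus non-alphanumeric character counting) might look like this in Rust. The exact weighting used by the evaluation suite is not specified here, so this is an assumption, not the suite's implementation.

```rust
/// Rough BPE-aware token estimate: each whitespace-separated word counts
/// once, and every non-alphanumeric, non-whitespace character counts as a
/// likely extra token boundary.
fn estimate_tokens(text: &str) -> usize {
    let words = text.split_whitespace().count();
    let punct = text
        .chars()
        .filter(|c| !c.is_alphanumeric() && !c.is_whitespace())
        .count();
    words + punct
}

fn main() {
    println!("{}", estimate_tokens("fix the bug in the auth service")); // 7 words, 0 punct -> 7
    println!("{}", estimate_tokens(">fix bug:auth-service"));          // 2 words, 3 punct -> 5
}
```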
AXON REPL
An interactive command-line translator for AXON notation. Convert between natural language and AXON in your terminal.
Install
Clone the repository and build with Cargo:
The binary will be at target/release/axon-repl. You can copy it to a directory on your PATH:
Usage
Interactive Mode
Launch the REPL and type expressions interactively:
Pipe Mode
Pipe text into the REPL for batch translation or scripting:
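A hypothetical invocation (the binary name comes from the build output described above; whether pipe mode prints only the bare translation is an assumption):

```
echo "fix the bug in the auth service" | axon-repl
```

Per the Examples table, the expected translation is `>fix bug:auth-service`.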
JSON Output
Get structured JSON output for integration with other tools:
Features
Requirements
| Dependency | Version |
|---|---|
| Rust | 2021 edition or later |
| Cargo | Included with Rust |