AXON v1.1

AXON Playground

Translate natural language into compact, token-efficient notation. Runs entirely in your browser via WebAssembly.


700+ test cases
~71% avg token savings
0 network requests

vNext

What's coming next for AXON — from prompt compression to native model understanding.

Fine-Tuned AXON Architecture

Today, AXON compresses prompts and relies on the base model to interpret notation via a system prompt. A fine-tuned model eliminates the system prompt overhead entirely and gains efficiency at every stage of inference.

Today (system prompt): user prompt (natural language, 105 tok) → AXON (-35%, 68 tok) + system prompt (~500 tok) → concatenated input of ~568 tok → base LLM interprets AXON via the system prompt → response in natural language. Overhead: the ~500-token system prompt is parsed on every request; attention cost is O(n²); context is 568 tok plus the response.

vNext (fine-tuned on AXON): user prompt (natural language, 105 tok) → AXON (-35%, 68 tok), system prompt 0 tok → qwen2.5:7b-axon (AXON is native, no prompt overhead) → AXON response in compact notation (fewer output tokens) → AXON decoder (client-side, 0 API cost) → response in natural language.

Efficiency gains with fine-tuning:
1. Zero system prompt overhead. Eliminates ~500 tokens per request; the model natively understands AXON.
2. Smaller attention window. 68 input tokens vs 568. Quadratic attention savings: O(68²) vs O(568²) ≈ 70x less compute.
3. AXON-native output. The model responds in AXON notation; fewer output tokens are generated, decoded client-side at zero cost.
4. Faster time-to-first-token. The prefill phase processes 68 tokens instead of 568, roughly 8x faster TTFT on input-bound workloads.

Fine-tuned, in total: ~88% fewer input tokens, ~50% fewer output tokens, ~70x less attention compute.
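The arithmetic behind those compute and TTFT claims, as a quick sanity check (figures taken from the pipeline above):

fn main() {
    let (today, vnext) = (568.0_f64, 68.0_f64);
    // Attention compute scales with the square of the input length.
    println!("attention ratio: {:.0}x", (today / vnext).powi(2)); // ~70x
    // The TTFT claim treats prefill as roughly linear in input tokens.
    println!("prefill ratio:   {:.1}x", today / vnext);           // ~8.4x
}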

Full Stack: AXON + Qwen2.5-7B-AXON + Caveman

Three complementary approaches — each targeting a different stage of the inference pipeline. AXON compresses input, a fine-tuned 7B model understands it natively, and Caveman compresses output. Combined, they minimize tokens at every stage.

Combined Inference Pipeline — Maximum Efficiency AXON input compression QWEN2.5-7B-AXON fine-tuned instruct model CAVEMAN output compression User Prompt natural language text 105 tok AXON v1.1 Translator sigils + L0 abbrev + stop-word strip runs client-side via WASM -35% input tok 68 tok sys prompt 0 tok 68 tok in qwen2.5:7b-instruct fine-tuned on AXON notation Embeddings Transformer AXON 7B Caveman Output terse AXON response baked in + prompted -70% output tokens ~610 tok AXON Decoder client-side WASM, 0 cost Final Response natural language Combined Efficiency Gains AXON — Input Side Prompt compression: 105 → 68 tokens (-35%). Runs client-side, zero latency added. No system prompt needed with fine-tuned model: saves ~500 tok per request. qwen2.5:7b-instruct — Model Side 7B params: fast inference, deployable on a single GPU or CPU via Ollama/vLLM. Native AXON I/O: no notation interpretation overhead. Prefill: O(68²) vs O(605²) = ~79x less compute. Generates AXON notation: denser information per token, fewer tokens to generate. Caveman — Output Side Terse response style baked into fine-tune + reinforced by prompt. Stacks with AXON notation. AXON density + Caveman brevity = ~70% fewer output tokens. Client-side decode at zero cost. Full stack: 68 tok in, 0 sys prompt, ~610 tok out, 7B model, single GPU vs baseline: -35% input, -70% output, ~79x less prefill compute

Three-Way Comparison

How each approach stacks across the inference pipeline. Columns are cumulative — each adds to the previous.

Stage                  Baseline      + AXON        + Qwen-7B-AXON   + Caveman
Prompt tokens          105 tok       68 tok        68 tok           68 tok
System prompt          0             ~500 tok      0                ~20 tok
Total input to model   105           ~568          68               88
Prefill attention      O(105²)       O(568²)       O(68²)           O(88²)
Model size             API (large)   API (large)   7B (local)       7B (local)
Output format          Natural lang  Natural lang  AXON notation    Terse AXON
Output tokens (est.)   ~2,038        ~2,038        ~1,000           ~610
Client decode          None          None          WASM decoder     WASM decoder
Total tokens           ~2,143        ~2,606        ~1,068           ~698
vs Baseline                          +21.6%        -50.2%           -67.4%

+ columns are cumulative. "+ Qwen-7B-AXON" replaces the API model with a local 7B fine-tune that natively speaks AXON. "+ Caveman" adds terse output shaping. Text-only pipeline — no image tokens.


Why qwen2.5:7b-instruct? It's an open-weight 7B instruction-tuned model — small enough to run locally on a single GPU (or even CPU via Ollama/llama.cpp), yet capable enough for code assistance, technical Q&A, and structured output. Fine-tuning it on AXON notation (qwen2.5-7b-axon) would produce a model that natively understands and generates AXON without a system prompt.

The key advantage over API models: zero per-token cost, full control over the model, and the system prompt overhead disappears entirely. Combined with Caveman-style terse output baked into the fine-tune, every token in and out carries maximum information density.

Today vs Fine-Tuned

Metric                 Baseline           Today (AXON)       vNext (Fine-Tuned)
System prompt          0 tok              ~500 tok           0 tok
Prompt tokens          105                68                 68
Total input to model   105                ~568               68
Attention compute      O(105²)            O(568²)            O(68²)
Response format        Natural language   Natural language   AXON notation
Output tokens (est.)   2,038              2,038              ~1,000
Prefill speedup        1x                 0.2x (slower)      1.5x faster

Estimates based on current AXON v1.1 L0 compression ratios. Actual fine-tuned model performance will depend on training data quality and model architecture.

The system prompt paradox. Today, AXON saves 35% on prompt tokens but adds ~500 tokens of system prompt to teach the model the notation. At that rate, AXON only breaks even on prompts longer than ~1,430 tokens (or ~700 at the 71% savings rate seen on code commands). A fine-tuned model eliminates this overhead entirely: every prompt is net positive, no matter how short.
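The break-even arithmetic, using this page's own numbers:

// Break-even prompt length: the compression must save more tokens than the
// static system prompt costs. tokens_saved = rate * prompt_len >= overhead.
fn break_even(rate: f64, overhead: f64) -> f64 {
    overhead / rate
}

fn main() {
    println!("{:.0} tok", break_even(0.35, 500.0)); // ~1429: general prompts
    println!("{:.0} tok", break_even(0.71, 500.0)); // ~704: code commands
}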

Roadmap

Shipped

v1.1 — Tiered Abbreviations

Four abbreviation levels (L0-L3), stop-word stripping in code queries, response compression directive. 35% prompt savings on general technical questions, 71% on code commands.

Next

v1.2 — Bidirectional WASM Codec

Full encode/decode round-trip in the WASM engine. Client-side decoding of AXON responses at zero API cost. Enables the fine-tuned model to respond in AXON notation with lossless reconstruction.

Planned

v2.0 — qwen2.5-7b-axon

qwen2.5:7b-instruct fine-tuned to natively understand and produce AXON notation. No system prompt needed. Runs locally on a single GPU. Both input and output compressed. Target: 88% fewer input tokens, 50% fewer output tokens, 79x less attention compute per request.

Planned

Custom AXON Tokenizer

A BPE tokenizer trained on AXON notation, where sigils, operators, and abbreviated terms are single tokens instead of multi-token sequences. Further reduces token count at the tokenizer level.
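A toy illustration of the idea (not a real BPE trainer): reserve dedicated vocabulary entries so each sigil, operator, and abbreviation costs exactly one token. The entries and IDs below are invented for illustration.

use std::collections::HashMap;

fn main() {
    // Hypothetical single-token vocabulary entries.
    let vocab: HashMap<&str, u32> =
        [("?best", 1), ("->", 2), ("db", 3), ("conn", 4), ("pooling", 5)]
            .into_iter()
            .collect();
    let ids: Vec<u32> = "db -> conn"
        .split_whitespace()
        .filter_map(|t| vocab.get(t).copied())
        .collect();
    // Three tokens for a span that a general-purpose BPE vocabulary
    // would split into more pieces.
    println!("{ids:?}"); // [3, 2, 4]
}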

The Full Stack

Every layer of the inference stack has compression opportunities. Today AXON addresses one layer. The vision is all of them.

Layer      Approach    Description        Status
PROMPT     AXON        Rule-based NLP     shipped
TOKENIZER  AXON BPE    Custom vocabulary  planned
MODEL      Fine-Tune   Native AXON I/O    planned
RESPONSE   Decoder     Client-side WASM   next

Specification

AXON v1.1 — ASCII-only operators, conditional sigils, tiered abbreviation levels, compound merging, response compression.

Type Sigils

Sigils are applied conditionally — only to tokens in the known entity/concept/verb databases. Unknown tokens are emitted bare.

Sigil  Name                Example
@      Entity / Agent      @sun @openai @user
#      Concept / Abstract  #gravity #justice
~      Process / Action    ~emit ~learn
$      Scalar / Value      $high $3.14
^      Temporal            ^now ^T+3d
!      Negation            !evidence !data
?      Query / Unknown     ?cause ?result
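A minimal sketch of that conditional application, assuming small illustrative word lists (the real engine uses its full entity/concept/verb databases):

use std::collections::HashSet;

fn sigil(token: &str, entities: &HashSet<&str>, concepts: &HashSet<&str>, verbs: &HashSet<&str>) -> String {
    if entities.contains(token) {
        format!("@{token}") // known entity/agent
    } else if concepts.contains(token) {
        format!("#{token}") // known concept
    } else if verbs.contains(token) {
        format!("~{token}") // known process/action
    } else {
        token.to_string() // unknown tokens are emitted bare
    }
}

fn main() {
    let entities: HashSet<&str> = HashSet::from(["sun"]);
    let concepts: HashSet<&str> = HashSet::from(["radiation"]);
    let verbs: HashSet<&str> = HashSet::from(["emit"]);
    for t in ["sun", "emit", "radiation", "frobnicate"] {
        print!("{} ", sigil(t, &entities, &concepts, &verbs));
    }
    // prints: @sun ~emit #radiation frobnicate
}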

Operators (ASCII only)

Operator  Name       Reads as
->        Causes     "leads to"
<-        Result of  "caused by"
:.        Therefore  "conclusion"
bc        Because    "premise / reason"
&&        And        "conjunction"
||        Or         "disjunction"
A.        For all    "universal"
E.        Exists     "existential"

Confidence Markers

!! Certain
! High
~ Moderate
* Low
** Speculative
? Unknown

Temporal Markers

^now current moment
^T-7d one week ago
^T+30d next month
^A.t always / all time

Abbreviation Tiers (v1.1)

AXON supports four abbreviation levels, each inclusive of the previous. Select a level to match your compression needs.
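A sketch of the tier-inclusive lookup, using a few pairs from the tables on this page (level is 0..=3; the full maps live in the translator):

fn abbreviate<'a>(word: &'a str, level: usize) -> &'a str {
    const TIERS: [&[(&str, &str)]; 4] = [
        &[("function", "fn"), ("database", "db"), ("memory", "mem")], // L0
        &[("protocol", "proto"), ("library", "lib")],                 // L1
        &[("between", "btw"), ("difference", "diff")],                // L2
        &[("variable", "v"), ("string", "s")],                        // L3
    ];
    // Each level includes all previous tiers: selecting L2 searches L0..=L2.
    for tier in &TIERS[..=level] {
        if let Some(&(_, short)) = tier.iter().find(|&&(long, _)| long == word) {
            return short;
        }
    }
    word
}

fn main() {
    assert_eq!(abbreviate("protocol", 0), "protocol"); // L1 entry, inactive at L0
    assert_eq!(abbreviate("protocol", 1), "proto");
}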

Command Verbs >

>doc >impl >fix >test >rev >ref >opt >plan >dep >add >rm >up >mv >cfg >mig >db >api >ci >sec >err >log >bench >lint >merge >explain

Query Types ?

?how ?why ?best ?what ?diff ?when ?where ?can ?cmp ?alt ?err ?perf
Structural Operations

Syntax               Meaning
@Type+.field         Add field
@Type-.field         Remove field
@Type.x=$v           Set field value
@Type.x:T            Set field type
@Type:impl(@Trait)   Implement trait
@Child<@Parent       Inherit / extend
+use(module)         Add import
-use(module)         Remove import
Examples

Curated translations showing AXON v1.1 in action. Click any example to try it in the playground.

LLM Prompt

Paste this system prompt into any LLM to enable AXON v1.1 notation understanding.

System Prompt
You are an expert in AXON v1.1 — AI eXchange Optimized Notation, a compact symbolic language that compresses natural language into token-efficient expressions.

## Sigils (applied only to known entities/concepts/verbs; unknown tokens are bare)
@ = Entity/Agent  # = Concept  ~ = Process/Action  $ = Scalar  ^ = Temporal  ! = Negation  ? = Query

## Operators (ASCII only)
-> = causes  <- = caused by  :. = therefore  bc = because
&& = and  || = or  A. = for all  E. = exists

## Abbreviation Tiers (each level includes all previous)
L0 (default): object->obj, function->fn, component->comp, database->db, connection->conn,
  implementation->impl, authentication->auth, application->app, configuration->cfg,
  environment->env, performance->perf, memory->mem, render->rnd, parameter->param
L1 (extended): protocol->proto, process->proc, algorithm->algo, container->ctr,
  endpoint->ep, interface->iface, architecture->arch, library->lib, package->pkg
L2 (aggressive): between->btw, difference->diff, example->ex, strategy->strat,
  alternative->alt, mechanism->mech, automatic->auto, distributed->distrib
L3 (maximum): function->f, variable->v, string->s, number->n, boolean->b, message->m

## Rules
1. Strip filler words, articles, copulas, pleasantries
2. Sigil only known entities (@), concepts (#), verbs (~), scalars ($)
3. Unknown tokens get NO sigil — consecutive bare tokens hyphenate into compounds
4. Abbreviate using the active tier (default L0)
5. Confidence markers: !! (certain) ! (high) ~ (moderate) * (low) ** (speculative)
6. Temporal: ^now ^T-Nd ^T+Nd ^A.t

## Programming Commands
>doc >impl >fix >test >rev >ref >opt >plan >dep >add >rm >up >mv >cfg >mig
>db >api >ci >sec >err >log >bench >lint >merge >explain

## Programming Queries
?how ?why ?best ?what ?diff ?when ?where ?can ?cmp ?alt ?err ?perf

## Structural Operations
@Type+.field  @Type-.field  @Type.x=$v  @Type.x:T  @Type:impl(@Trait)  @A<@B  +use(mod)  -use(mod)

## Response Compression (optional — add to system prompt for output savings)
Respond using AXON notation where applicable: sigils, operators, abbreviations. Strip filler. Be terse.

## Examples
"fix the bug in the auth service" -> >fix bug:auth-service
"explain database connection pooling" -> >explain db-conn-pooling
"what is the best way to cache" -> ?best cache
"The sun emits ultraviolet radiation" -> @sun ~emit #ultraviolet #radiation
"add a field email to User" -> @user+.email

VS Code Extension

Translate prompts to AXON notation directly inside VS Code — saving tokens on every AI interaction.

Install

Install from the VS Code Marketplace or the command line:

code --install-extension colwill.axon-notation

Or search for AXON Notation in the VS Code Extensions panel (Ctrl+Shift+X).

View on Marketplace

Features

Chat Participant @axon

Type @axon in VS Code's chat panel to route prompts through AXON. Your input is translated to compact notation, sent to the language model, and the response is returned as normal.

Sidebar Chat

A dedicated AXON chat panel in the activity bar. Open it with Ctrl+Shift+A (Cmd+Shift+A on macOS) or via the command palette.

Multiple Targets

Send translated output to Claude Code, GitHub Copilot, clipboard, or the built-in chat — choose your preferred workflow.

Commands

Command                                    Description
AXON: Open Chat                            Focus the AXON sidebar panel
AXON: Translate to AXON                    Translate text and choose where to send it
AXON: Translate and Send to Claude Code    Translate and send directly to Claude Code
AXON: Translate and Send to Copilot Chat   Translate and send to GitHub Copilot
AXON: Translate and Copy to Clipboard      Translate and copy the result

How It Works

1. Type a natural language prompt
2. WASM engine translates to AXON
3. Compressed prompt sent to AI
4. Same response, fewer tokens
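For step 2, a minimal wasm-bindgen sketch of how a Rust translate entry point can be exposed to the extension's JavaScript host. The function name, signature, and body here are assumptions for illustration, not the extension's documented API:

use wasm_bindgen::prelude::*;

// Hypothetical export: the real engine's interface may differ.
#[wasm_bindgen]
pub fn translate(input: &str) -> String {
    // Placeholder body; the actual rule-based pipeline (stop-word stripping,
    // sigils, abbreviations, compound merging) runs here.
    input.trim().to_string()
}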

Translation Examples

Natural Language                                                AXON
what is the best way to implement caching                      ?best impl-caching
fix the bug in the auth service                                >fix bug:auth-service
add a field email to User                                      @user+.email
CO2 emissions cause climate change which increases temperature  @CO2-emission → #climate-change!! → Δ$temp↑
VS Code 1.93+ minimum version
0 API keys: runs locally via WASM

Claude Code Skill

Use AXON notation directly inside Claude Code with the /axon slash command. Translate between natural language and AXON without leaving your terminal.

Install from Skill Marketplace

Install the AXON skill directly from the Claude Code skill marketplace:

claude install-skill https://github.com/colwill/axon

Or install interactively from within Claude Code:

/install-skill https://github.com/colwill/axon

Once installed, the /axon slash command becomes available in all your Claude Code sessions.

Usage

Encode natural language

Pass any natural language text to get the AXON translation:

$ /axon fix the bug in the authentication service
→ >fix bug:auth-service

Decode AXON notation

Pass AXON notation to get a natural language explanation:

$ /axon @sun ~emit* #uv-radiation
→ The sun probably emits ultraviolet radiation.

Interactive mode

Use /axon without arguments to start an interactive session where Claude will translate and explain AXON expressions.

What the Skill Does

Auto-detect direction

Automatically detects whether you're providing natural language or AXON notation and translates in the appropriate direction.
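One plausible way to implement such detection (a sketch; the skill's actual heuristic is not documented here) is to look for sigiled tokens and AXON operators:

fn looks_like_axon(input: &str) -> bool {
    let words: Vec<&str> = input.split_whitespace().collect();
    // Count tokens led by a sigil or command/query marker.
    let sigiled = words
        .iter()
        .filter(|w| w.starts_with(|c: char| "@#~$^?>".contains(c)))
        .count();
    let has_op = ["->", "<-", ":.", "&&", "||"].iter().any(|op| input.contains(*op));
    // Treat input as AXON if operators appear or most tokens carry sigils.
    has_op || sigiled * 2 > words.len()
}

fn main() {
    assert!(looks_like_axon(">fix bug:auth-service"));
    assert!(!looks_like_axon("fix the bug in the auth service"));
}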

Token savings

Shows the token count comparison between the original text and the AXON translation, so you can see the savings.

Annotated output

Each sigil and operator is annotated so you can learn the notation as you use it.

Claude Code: CLI or IDE extension
0 config: works immediately after install

AXON + Caveman

AXON compresses the prompt. Caveman compresses the response. Together they achieve the best total token savings of any approach tested.


AXON and Caveman operate on different sides of the pipe. AXON rewrites your prompt into compact notation (input tokens), while Caveman instructs the model to respond concisely (output tokens). They don't overlap — they stack.

Inference Architecture

Where each approach operates in the LLM inference pipeline.

User prompt (natural language, 105 tok) → AXON translator (sigils + abbreviations; -35% input, 68 tok) → tokenizer (BPE encode) → LLM inference (attention + generation), with the Caveman system prompt ("be terse, no fluff") shaping the output → detokenizer (BPE decode) → concise response (-47% output, 2,038 → 1,071 tok). AXON's domain is prompt compression; Caveman's domain is response compression. Combined: -46.9% total cost.

Total Token Cost

Prompt + response tokens across 10 technical questions, measured with BPE-aware estimation. Model: Claude Opus 4.6.

Configuration                 Prompt  Response  Total   vs Baseline
Baseline                      105     2,038     2,143
Terse ("Answer concisely.")   105     2,168     2,273   +6.1%
Compress                      105     1,690     1,795   -16.2%
Caveman                       105     1,071     1,176   -45.1%
AXON only                     68      2,038     2,106   -1.7%
AXON + Caveman                68      1,071     1,139   -46.9%

Where Each Approach Saves

AXON (prompt side): 35.2% avg input token savings.

Example: "why does my React component re-render..." → ?why react-comp-re-rnd-parent-updates (13 → 8 tokens, 38.5% saved).

Caveman (response side): 47.4% avg output token savings.

Baseline: 2,038 response tokens. Caveman: 1,071 response tokens. Same quality, half the tokens.

Per-Prompt Breakdown

AXON prompt savings vs Caveman response savings for each of the 10 benchmark questions.

Prompt    AXON    Caveman  Compress
Average   35.2%   47.4%    17.1%

Response Text Compression

Running response text through the AXON translator shows additional compression potential on the output side.

Baseline responses: 41.6% (2,038 → 1,191 tokens)
Caveman responses: 34.5% (1,071 → 702 tokens)
Compress responses: 40.2% (1,690 → 1,011 tokens)

How to Combine Them

1. Compress the prompt with AXON

Use the AXON translator (VS Code extension, REPL, or WASM) to convert your natural language prompt to compact notation before sending it to the model.

Input:
explain database connection pooling
→ >explain db-conn-pooling

2. Compress the response with Caveman

Add the Caveman system prompt to instruct the model to respond concisely. The model understands AXON input and responds in terse, information-dense style.

System prompt:
respond like caveman. short sentence. no fluff.

3. Result: 46.9% total savings

AXON handles the input, Caveman handles the output. No overlap, no quality loss, nearly half the token cost.
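As a sketch of steps 1 and 2 wired together (illustrative only; the message format and endpoint details depend on your provider), assuming the prompt has already been run through the AXON translator:

use serde_json::json;

// Hypothetical request body combining an AXON-compressed user prompt with
// the Caveman system prompt from step 2.
fn build_request(axon_prompt: &str) -> serde_json::Value {
    json!({
        "messages": [
            { "role": "system", "content": "respond like caveman. short sentence. no fluff." },
            { "role": "user", "content": axon_prompt } // e.g. ">explain db-conn-pooling"
        ]
    })
}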

Methodology

Data sourced from the Caveman evaluation suite. 10 technical programming questions tested across 6 arms (baseline, terse, caveman, caveman-cn, caveman-es, compress) using Claude Opus 4.6. Token counts estimated using BPE-aware heuristics (whitespace-split + non-alphanumeric character counting). AXON translations performed by the Rust rule-based engine at L0 abbreviation level. Full test suite available in the AXON repository at lib/tests/caveman_comparison.rs.
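A minimal reconstruction of that estimation heuristic (illustrative; the authoritative version is in lib/tests/caveman_comparison.rs):

// Whitespace-split words plus non-alphanumeric characters, as described above.
fn estimate_tokens(text: &str) -> usize {
    let words = text.split_whitespace().count();
    let punct = text
        .chars()
        .filter(|c| !c.is_alphanumeric() && !c.is_whitespace())
        .count();
    words + punct
}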

AXON REPL

An interactive command-line translator for AXON notation. Convert between natural language and AXON in your terminal.

Install

Clone the repository and build with Cargo:

git clone https://github.com/colwill/axon.git
cd axon && cargo build --release -p axon-repl

The binary will be at target/release/axon-repl. You can copy it to a directory on your PATH:

cp target/release/axon-repl ~/.local/bin/

Usage

Interactive Mode

Launch the REPL and type expressions interactively:

$ axon-repl
axon> fix the bug in the auth service
→ >fix bug:auth-service
axon> what is the best way to cache data
→ ?best cache-data

Pipe Mode

Pipe text into the REPL for batch translation or scripting:

echo "add a field email to User" | axon-repl

JSON Output

Get structured JSON output for integration with other tools:

echo "fix the login bug" | axon-repl --json

Features

Bidirectional
Translate text to AXON and AXON back to text
Offline
Uses a local lookup table — no network needed
Pipeable
Works with stdin/stdout for scripting and agents

Requirements

Dependency  Version
Rust        2021 edition or later
Cargo       Included with Rust