Torad Etch · training the training assistant

Tell Etch what to train. It does the rest.

Describe the specialist you need. Etch builds the data pipeline, trains it on your GPU, and you keep the weights.

Start a model How the training works →

one prompt in · cpt · sft · rl · weights out

etch · sessionlocal gpu

train a Gemma 4 specialist that knows everything about the EDC festival

Plan: scrape the sources, build the corpus, then continued pre-training on your GPU. Training starts on your approval.

› pipeline: scrape → curate → tokenize → train → eval

✓ step 100/3160 · loss 3.72 · 13.5 GB peak on a 16 GB card

At a glance

One prompt in. A trained model out.

1 promptstarts the whole pipeline

16 GBof GPU trains it, one consumer card

0.9998same answers at 4-bit

1 filethe weights, yours to keep

Real run: Gemma 4 E2B · 13.5 GB peak on an RTX 5080 · VALIDATED · Methods

How it works

Three steps from a prompt to your model.

describe
Describe the model

One prompt: what it should know, what it should do. Point it at your data.
approve
Etch builds the pipeline

Data prep, training stages, and evals, proposed as a plan. It trains after you approve.
keep
You keep the weights

A small 4-bit model on your disk. Run it offline with Torad Edge.

The pipeline

Three training stages. One pipeline.

Continued pre-training installs your domain. Fine-tuning teaches the task. Reinforcement makes it right, not just plausible.

diamond · your dataA B C · the three stageschip · the 4-bit model, live

Real run: 3,160 steps · ~4 s/step · checkpoints every 100 · evals gate every stage

Quality

Local means 4-bit. Ours keeps the answers.

We score how alike the 4-bit and full-size answers are. The ladder: 1.00 is identical.

Against the full-size original

cosine similarity · 1.00 = identical VALIDATED

0.85 · drifts

Round-to-nearest. The model drifts into nonsense within a sentence.

0.91 · duller

Rotation + codebook, the best static methods. Answers still drift.

0.9998 · the same

TQ4, ours. The top answer matched on 5 of 5 test prompts.

Qwen3-1.7B, TQ4 vs bf16 · holds across families: Gemma 4 E2B scores 0.9997 · All benchmarks

What you get

You bring the data. You keep the model.

You bring

Your documents: The material the model must learn.
Task examples: What it should do, in your format.
Behavior spec: How it should act, and its voice.
Eval set: Optional: how you judge it.

You keep

The weights: A trained model on a disk you control. No metering, no account.
Offline runtime VALIDATED: Runs on consumer hardware with Edge. Nothing leaves the machine.
The quality VALIDATED: A third of the memory, with the same answers: 0.9998.
Trainability ASSERTED: Most shrunk models freeze for good. Our trainable 4-bit path exists; the published, re-runnable measurement is still pending.

Where it trains

Your GPU, or a rented one.

Your machine

available now

Trains on your own 16 GB GPU
Your data never leaves your machines
Real run: 13.5 GB peak, ~4 s/step

Start a model

A rented GPU in development

cloud training

Etch rents the server and runs the same pipeline
For models bigger than your card
You still keep the weights

Get notified →

Pricing coming soon. We email when pricing is set and when cloud training opens.

Roadmap

Measured today. In progress next.

shipped
The pipeline, end to end
Scrape to eval on a single GPU. The 4-bit recipe is locked and bit-exact.
shipped
Eval gates
Era-tagged probes grade every stage before the next one runs.
in development
Conductor, the desktop app
The same assistant, driven from a desktop UI instead of the terminal.
in development
Cloud training
Etch rents the GPU and streams the run back to you.

In-progress claims stay off the benchmark pages until measured · Details

FAQ

Questions, answered.

What do I give it?

A prompt describing the model, plus your documents and task examples. An eval set helps but is optional.

Does my data leave my machine?

Not when you train locally: the pipeline runs on your GPU and nothing leaves. Cloud training on a rented GPU is in development.

Which base models do you train?

Small open models. The benchmarks on this page were measured on Qwen3-1.7B, and the method holds on Gemma 4 E2B.

Will the 4-bit model be worse than the original?

No. We score the shrunk model against the full-size original at 0.9998 similarity, and the top answer matched on 5 of 5 test prompts.

Do I need my own GPU?

A 16 GB card covers the models on this page. Renting a cloud GPU from inside Etch is in development.

Start

Start a model.

Tell us the domain and we tell you what the specialist can do. Whatever we train, you own.