Torad Etch / training
Your data, in the model.
Etch trains a small specialist on your documents and tasks, then hands you the 4-bit weights: the same answers as the full-size original, yours to run anywhere. Not a database it queries.
continued pre-training · fine-tuning · reinforcement
What you bring, what you get
You give us your domain. We give you a specialist.
Four inputs go in. One artifact comes out, yours on a disk you control.
What you bring
- Your domain: documents and corpus it must know cold
- Task examples: what it should do, in your format
- Behavior spec: how it should act, its voice
- Eval set: optional, how you judge it
What you keep
- A specialist that is yours: the weights, on a disk you control. No metering, no account, no per-token bill.
- Runs offline (VALIDATED): on consumer hardware. Nothing leaves the machine.
- Small, no loss in quality (VALIDATED): a quarter of the memory, and it gives the same answers as the full-size original. We score it at 0.9998 out of 1.00 against the full-size original, where 1.00 means identical. See the proof below.
- Still improvable (VALIDATED): most shrunk models freeze for good. Yours keeps learning.
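As a rough check on the "quarter of the memory" figure, here is a minimal sketch of the arithmetic. It assumes the full-size original stores 16-bit weights, which this page does not state. The `group_size` and `scale_bits` parameters are hypothetical, included only to show how per-group scaling metadata would push the ratio slightly above a quarter. None of this describes Etch's actual format.

```python
from typing import Optional


def memory_ratio(
    weight_bits: int = 4,
    baseline_bits: int = 16,          # assumption: 16-bit full-size weights
    group_size: Optional[int] = None,  # hypothetical: weights sharing one scale
    scale_bits: int = 16,              # hypothetical: size of each shared scale
) -> float:
    """Return quantized weight memory as a fraction of the baseline."""
    bits_per_weight = float(weight_bits)
    if group_size is not None:
        # Spread the cost of one scale value across the weights in its group.
        bits_per_weight += scale_bits / group_size
    return bits_per_weight / baseline_bits


print(memory_ratio())               # 0.25
print(memory_ratio(group_size=64))  # 0.265625
```

With no metadata, 4 bits against 16 bits is exactly one quarter. Any real format carries some overhead, so a measured figure will sit a little higher.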
The pipeline
One pipeline, not three products bolted together.
Your domain runs through three stages (continued pre-training, fine-tuning, then reinforcement) and comes out as a 4-bit model you own.
The 4-bit weights you train are the 4-bit weights you serve. Same kernel, same artifact, end to end. VALIDATED
- The training engine. Reinforcement learning on one consumer graphics card, built on reward. How it trains →
- TQ4: trainable 4-bit weights. Most shrunk models freeze; yours keeps learning after delivery. The format →
- One kernel: training and serving in the same program, so what you train is exactly what runs. The kernel →
The proof
Shrinking a model usually makes it worse. Ours gives the same answers.
Running on your machine means shrinking the model to 4-bit, which normally makes the answers drift. We score how alike the shrunk and full-size answers come back, on a scale from 0 to 1.00, where 1.00 means identical. The score measures whether the answers point in the same direction, not just whether the words match.
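A score that asks whether two answers "point in the same direction" reads like cosine similarity between vectors. The sketch below assumes that interpretation; this page does not name the metric or say which vectors are compared. The function name and the sample vectors are invented for illustration and are not measured data. Plain cosine similarity can also go negative, so a 0 to 1.00 scale would hold only for vectors that never point in opposing directions.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical output vectors for the same prompt (illustrative values only).
full_size = [0.12, -0.48, 0.91, 0.33]
shrunk = [0.11, -0.47, 0.92, 0.33]

score = cosine_similarity(full_size, shrunk)
print(f"similarity: {score:.4f}")
```

Direction-based scoring ignores overall magnitude: scaling one vector by a constant leaves the score unchanged, so only a change in the shape of the answer lowers it.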
How alike is the shrunk model to the original?
VALIDATED
Two ways to run it
Managed, or self-serve. You own the weights either way.
The same pipeline, the same artifact. The only question is whose machine it runs on.
Managed
Pricing coming soon. You bring the domain. We run the pipeline and hand back the specialist. Available now. Kandi is proof.
Self-serve
Pricing coming soon. Run the same pipeline yourself and keep everything in-house. The weights never leave your machines.
Pricing lands soon. We write when the numbers are set and when self-serve opens. No noise.
What we will not claim yet
The line between what we proved and what we believe.
Green.
The 31% memory, the quality score, the trainable 4-bit substrate, the same result across model families. You can run each one yourself.
Amber.
"7x fewer steps", "first 2B to chain function calls", QGRE R6 completeness, on-device vs cloud. We keep these off the page above and name them in the open. See them named →
Start a model.
Tell us the domain and we tell you what the specialist can do. Whatever we train, you own.