// fine-tuning runtime — alpha

ShadowLM

Any open model.
Any harness.
Any method.

llama · qwen · mistral · gemma · phi · deepseek · olmo · mlx
Open source · maintained byKhush Patel@Lyzr
◢ MODEL.CORE
0x7F3A · BAYER 8×8
░░░░░ inference layer
↓ scroll
/LoRA/QLoRA/DoRA/Full-FT/CPT/DPO/GRPO/MoRE/BitFit/Soft-Prompt/P-Tuning/Adapter/LoRA/QLoRA/DoRA/Full-FT/CPT/DPO/GRPO/MoRE/BitFit/Soft-Prompt/P-Tuning/Adapter/LoRA/QLoRA/DoRA/Full-FT/CPT/DPO/GRPO/MoRE/BitFit/Soft-Prompt/P-Tuning/Adapter
// §01 — primitives

One runtime. Every variable.

Built for the labs and the operators who refuse to pick three out of three.

01 / models

Any open model

Load weights from HuggingFace or local checkpoints. Llama, Qwen, Mistral, DeepSeek, Gemma, OLMo, Phi — same five lines. 4-bit, 8-bit, or full precision.

> slm.load('Qwen/Qwen2.5-0.5B-Instruct')
02 / harness

Any harness

slm.capture() serves an OpenAI-compatible proxy. Point any agent harness at it — unchanged — and reconstruct multi-turn trajectories for RL.

> with slm.capture(model) as proxy: ...
03 / method

Any method

Twelve training methods behind one argument. SFT, LoRA, QLoRA, DoRA, full, CPT, DPO, GRPO, MoRE, BitFit, soft prompts, adapters. Switch by changing a string.

> model.finetune(ds, method='grpo')
// §02 — methods

Twelve methods.
One config.

ShadowLM normalizes every training objective into a single, declarative spec. Swap DPO for ORPO by changing a key. Stack a LoRA over a 4-bit quantized base in three lines. The substrate handles the rest — gradient routing, optimizer state, distributed sharding, eval hooks.

01
LoRA
low-rank adaptation
02
QLoRA
4-bit quantized lora
03
DoRA
weight-decomposed lora
04
Full
update every weight
05
CPT
continued pretraining
06
DPO
direct preference opt.
07
GRPO
group relative policy
08
MoRE
mixture of retrieval experts
09
BitFit
train only bias terms
10
Prompt
soft virtual prompts
11
P-Tuning
encoded prompt embeds
12
Adapter
houlsby bottlenecks
// §03 — interface

Three lines.
Production run.

shadowlm@0.2.0
$ pip install 'shadowlm[all]'
$ shadowlm finetune data.jsonl \
    --model Qwen/Qwen2.5-0.5B-Instruct \
    --method lora --max-steps 60

  [shadow] enabled: gradient checkpointing
  [mlx:gpu] finetuning Qwen2.5-0.5B-Instruct-4bit · lora
           · 8 examples · 40 iters · r=16 on 24 layers · lr 2e-4
  [████████████████████████] step 40/40  loss 0.0718  11.7 st/s
  loss  ▇▆█▇▆▇▇█▅▅▄▅▃▃▁▂▂▂▁▁▁▁  4.2120 → 0.0718
  ♥ succeeded · 40 steps · 3.5s
12
training methods
3
backends · mlx · torch · remote
0
deps in the core
MIT
license