API Reference
save_delta_from_paths
dt.save_delta_from_paths(
out_path,
finetuned_dir,
base_dir,
strategy="sparse",
prefetch=2,
**kwargs,
) -> str
Streaming delta save. Peak RAM is O(prefetch tensors), not O(two full models).
Args:
| Parameter | Type | Description |
|---|---|---|
out_path |
str \| Path |
Output .wdelta file path |
finetuned_dir |
str \| Path |
Folder containing fine-tuned safetensors shards |
base_dir |
str \| Path |
Folder containing base safetensors shards |
strategy |
str |
"sparse", "quantized", or "int4" |
prefetch |
int |
Number of tensor pairs to prefetch (default 2) |
**kwargs |
Strategy-specific options (see below) |
Strategy kwargs:
| Strategy | kwarg | default | description |
|---|---|---|---|
sparse |
sparsity |
0.9 |
Fraction of weights to zero out |
int4 |
outlier_fraction |
0.01 |
Fraction of weights stored as float16 outliers |
Returns: SHA-256 hex hash of the base model.
load_delta_from_paths
dt.load_delta_from_paths(
path,
base_dir,
verify=True,
) -> Dict[str, np.ndarray]
Reconstruct a fine-tuned model from a .wdelta file and a base model directory. Loads each base shard once — O(n_shards) file opens rather than O(n_tensors × n_shards).
Args:
| Parameter | Type | Description |
|---|---|---|
path |
str \| Path |
Path to the .wdelta file |
base_dir |
str \| Path |
Folder containing base safetensors shards |
verify |
bool |
SHA-256 verify base before reconstructing (default True) |
Returns: Reconstructed state dict as Dict[str, np.ndarray].
inspect
dt.inspect(path) -> dict
Return metadata from a .wdelta file without loading the base model.
Args:
| Parameter | Type | Description |
|---|---|---|
path |
str \| Path |
Path to the .wdelta file |
Returns:
{
"path": "checkpoint.wdelta",
"size_mb": 294.2,
"parent_hash": "e1810a...",
"strategy": "int4",
"n_tensors": 290,
"tensors": {
"model.embed_tokens.weight": {"shape": [151936, 896], "dtype": "float32"},
...
}
}
save_delta
dt.save_delta(
path,
finetuned,
base,
strategy="sparse",
**kwargs,
) -> str
Compute and save the delta between finetuned and base. Loads both models fully into RAM — for models larger than ~3B use save_delta_from_paths instead.
Args:
| Parameter | Type | Description |
|---|---|---|
path |
str \| Path |
Output .wdelta file path |
finetuned |
Dict[str, np.ndarray \| Tensor] |
Fine-tuned state dict |
base |
Dict[str, np.ndarray \| Tensor] |
Base state dict |
strategy |
str |
"sparse", "quantized", or "int4" |
Returns: SHA-256 hex hash of the base model.
load_delta
dt.load_delta(
path,
base,
verify=True,
) -> Dict[str, np.ndarray]
Reconstruct a fine-tuned model from a .wdelta file and a base state dict. Requires the full base loaded in RAM — for large models use load_delta_from_paths instead.
Args:
| Parameter | Type | Description |
|---|---|---|
path |
str \| Path |
Path to the .wdelta file |
base |
Dict[str, np.ndarray \| Tensor] |
Base state dict |
verify |
bool |
SHA-256 verify base before reconstructing (default True) |
Returns: Reconstructed state dict as Dict[str, np.ndarray].