Cocktail-Fork-MRX (MLX)

An Apple MLX port of MERL's MRX (Multi-Resolution CrossNet), which separates a soundtrack mixture into three stems: music, speech, and sound effects (sfx). It runs natively on Apple Silicon and needs no PyTorch at inference time.

  • Upstream: merlresearch/cocktail-fork-separation — The Cocktail Fork Problem: Three-Stem Audio Separation for Real-World Soundtracks (ICASSP 2022).
  • Checkpoint: default_ (SNR-loss trained — the upstream default inference weights).
  • Variants: -paper (SI-SNR, ICASSP reproduction) · -adapted-loudness · -adapted-eq (cinematic-tuned for real movie stems).
  • Collection: Cocktail-Fork MRX (MLX).
  • License: MIT.
  • Parity: agrees with the PyTorch reference to within float32 rounding (full-pipeline maximum absolute difference ≈ 9e-8; per-stem SI-SDR 107–139 dB vs torch).
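
The two parity figures above can be reproduced with a few lines of arithmetic. The following is a minimal sketch of the standard definitions (largest sample-wise absolute difference, and scale-invariant SDR in dB), not the port's actual test harness. The helper names `max_abs_diff` and `si_sdr_db` are hypothetical, and the signals are toy stand-ins rather than real model output.

```python
import math

def max_abs_diff(ref, est):
    """Largest absolute sample-wise difference between two equal-length signals."""
    return max(abs(r - e) for r, e in zip(ref, est))

def si_sdr_db(ref, est):
    """Scale-invariant SDR (dB) of `est` measured against the reference `ref`."""
    dot = sum(r * e for r, e in zip(ref, est))
    ref_energy = sum(r * r for r in ref)
    scale = dot / ref_energy                  # least-squares gain mapping ref onto est
    target = [scale * r for r in ref]         # part of est explained by ref
    noise_energy = sum((e - t) ** 2 for e, t in zip(est, target))
    target_energy = sum(t * t for t in target)
    if noise_energy == 0.0:
        return math.inf                       # identical up to a gain factor
    return 10.0 * math.log10(target_energy / noise_energy)

# Toy signals standing in for one stem from each backend (assumption, not model output).
ref = [math.sin(0.01 * n) for n in range(4410)]
est = [r + 1e-6 * math.cos(0.37 * n) for n, r in enumerate(ref)]

print(f"max_abs = {max_abs_diff(ref, est):.2e}")
print(f"SI-SDR  = {si_sdr_db(ref, est):.1f} dB")
```

To compare real stems, flatten each array to one dimension first (for example with `np.asarray(x).ravel()`) and pass the PyTorch output as `ref` and the MLX output as `est`.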

Usage

pip install cocktail-fork-mlx   # or: pip install git+https://github.com/xocialize/cocktail-fork-mlx
cocktail-fork-mlx --audio-path soundtrack.wav --out-dir ./out
# -> out/music.wav  out/speech.wav  out/sfx.wav

import mlx.core as mx
import numpy as np
import soundfile as sf
from cocktail_fork_mlx.separate import separate_soundtrack
from cocktail_fork_mlx.weights import from_pretrained

# soundfile returns (samples, channels); the input should be sampled at 44.1 kHz.
audio, fs = sf.read("soundtrack.wav", always_2d=True)
model = from_pretrained("mlx-community/Cocktail-Fork-MRX")
# Transpose to (channels, samples) and cast to float32 before separating.
stems = separate_soundtrack(mx.array(audio.T.astype("float32")), model)
for name, x in stems.items():
    # Transpose back to (samples, channels) for writing.
    sf.write(f"{name}.wav", np.array(x).T, 44100)

Model

  • 44.1 kHz, any channel count. ~30.6M params, fp32 (122 MB).
  • Multi-resolution STFT (windows 1024/2048/8192, hop 256) → per-resolution magnitude encoders → 3 parallel bidirectional CrossNet LSTMs → per-source/per-resolution mask decoders → masked iSTFT summed across resolutions.
  • CPU is the faster device for this LSTM-bound model (default in the CLI).
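
To make the numbers in this section concrete, here is a small sketch of the arithmetic behind them: the spectrogram shape at each of the three STFT resolutions, and the fp32 weight size. It assumes a one-sided, centre-padded STFT (the `torch.stft` default convention), which the text above does not state. The function names are hypothetical and are not part of `cocktail_fork_mlx`.

```python
WINDOWS = (1024, 2048, 8192)   # STFT window lengths from the model description
HOP = 256                      # hop size shared by all three resolutions
SAMPLE_RATE = 44_100           # model sample rate in Hz

def stft_shape(num_samples, window, hop=HOP):
    """Return (frequency bins, frames) for a one-sided, centre-padded STFT."""
    bins = window // 2 + 1             # one-sided spectrum of a real signal
    frames = 1 + num_samples // hop    # assumes centre padding
    return bins, frames

def resolution_summary(seconds):
    """Map each window length to its (bins, frames) for a clip of this duration."""
    num_samples = int(seconds * SAMPLE_RATE)
    return {w: stft_shape(num_samples, w) for w in WINDOWS}

def fp32_megabytes(num_params):
    """Weight size in MB (10^6 bytes) at 4 bytes per float32 parameter."""
    return num_params * 4 / 1e6

summary = resolution_summary(10.0)
for window, (bins, frames) in summary.items():
    hz_per_bin = SAMPLE_RATE / window
    print(f"window {window}: {bins} bins x {frames} frames, {hz_per_bin:.1f} Hz per bin")
print(f"weights: {fp32_megabytes(30.6e6):.1f} MB")
```

Under these assumptions a 10-second clip yields 513, 1025, and 4097 frequency bins at the three resolutions, with 1723 frames at each. Because the hop is shared, every resolution produces the same number of frames, so the per-resolution features line up frame by frame. Longer windows trade time resolution for finer frequency resolution.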

Ported by MVS Collective (xocialize). Released under the MIT license; the original model and weights are © MERL.
