MT-LNN: Microtubule-Inspired Liquid Neural Network Adapter
MT-LNN is a biologically inspired neural architecture that replaces the Feed-Forward Networks (FFNs) of a standard Transformer with a Microtubule Dynamic Layer (MT-DL). The MT-DL consists of 13 parallel Closed-form Liquid Time-Constant (CfLTC) channels with multi-scale resonance and quantum-like lateral coupling.
This repository hosts the MT-Adapter weights trained on TinyLlama-1.1B-Chat-v1.0. Loading this residual adapter adds biologically inspired continuous-time dynamics to a standard causal LLM. In our Needle-in-a-Haystack (long-context retrieval) evaluation, the adapted model keeps exact-match accuracy at 1.000 for contexts up to 4K tokens, at a modest cost in throughput.
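To make the idea of parallel closed-form liquid channels more concrete, here is a minimal sketch in plain Python. It assumes the closed-form continuous-time (CfC) gating form, `out = sigmoid(-f * t) * g + (1 - sigmoid(-f * t)) * h`, with one hypothetical decay rate per channel. The actual MT-DL formulation, including its resonance terms and lateral coupling, is defined in the repository and papers and may differ. The names `cfc_gate` and `run_channels` and the rate schedule are illustrative assumptions.

```python
import math

NUM_CHANNELS = 13  # channel count stated in the description above


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def cfc_gate(f: float, g: float, h: float, t: float) -> float:
    """Closed-form blend of two candidate states g and h at elapsed time t.

    f acts as a decay rate: at t = 0 the candidates are mixed equally,
    and for f > 0 the output moves toward h as t grows.
    """
    gate = sigmoid(-f * t)
    return gate * g + (1.0 - gate) * h


def run_channels(g: float, h: float, t: float, base_rate: float = 0.1) -> list[float]:
    """Evaluate 13 independent channels whose decay rates double per channel.

    The doubling schedule is a hypothetical stand-in for "multi-scale"
    behaviour, not the trained parameters of the adapter.
    """
    rates = [base_rate * (2.0 ** k) for k in range(NUM_CHANNELS)]
    return [cfc_gate(f, g, h, t) for f in rates]


if __name__ == "__main__":
    for k, y in enumerate(run_channels(g=1.0, h=-1.0, t=1.0)):
        print(f"channel {k:2d}: {y:+.4f}")
```

Because each channel depends only on its own rate and the shared inputs, all channels can be evaluated independently, which is what makes this kind of layer straightforward to parallelize.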
How to Use
The MT-LNN adapter depends on the custom adapter wiring code in our official GitHub repository.
1. Clone the repository and install the dependencies
git clone https://github.com/everest-an/O1.git
cd O1
pip install -r requirements.txt
2. Loading the Adapter for Inference
You can load the MT-LNN biological adapter on top of the base Llama model and start generating text:
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, AutoConfig
from mt_lnn.llama_adapter import (
    attach_adapters_from_checkpoint,
    load_adapter_state,
    maybe_apply_lora_for_checkpoint,
)
from huggingface_hub import hf_hub_download
device = "cuda" if torch.cuda.is_available() else "cpu"
model_id = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"
# 1. Download the adapter weights from Hugging Face
adapter_path = hf_hub_download(repo_id="EverestAn/MT-LNN", filename="llama_mt_adapter_000500.pt")
# 2. Load Base Model
tokenizer = AutoTokenizer.from_pretrained(model_id)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
# (Optional) Apply linear RoPE scaling for 4K+ long context
config = AutoConfig.from_pretrained(model_id)
if getattr(config, "rope_theta", None) is None:
    config.rope_theta = 10000.0
config.rope_scaling = {"type": "linear", "rope_type": "linear", "factor": 4.0}
model = AutoModelForCausalLM.from_pretrained(model_id, config=config, torch_dtype=torch.bfloat16)
# 3. Inject the Microtubule (MT) Adapter
checkpoint = torch.load(adapter_path, map_location="cpu")
attach_adapters_from_checkpoint(model, checkpoint)
model = maybe_apply_lora_for_checkpoint(model, checkpoint)
load_adapter_state(model, adapter_path, strict=False)
model.to(device).eval()
# 4. Generate
inputs = tokenizer("What is the biological function of computational microtubules?", return_tensors="pt").to(device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
Evaluation (Needle-in-a-Haystack)
We evaluated MT-LNN as a residual adapter on TinyLlama-1.1B, fine-tuned for 500 steps, using the Needle-in-a-Haystack task.
| Variant | Context (tokens) | Depth | Exact match | Contains match | Throughput (tok/s) |
|---|---|---|---|---|---|
| Base | 1024-2048 | All | 1.000 | 1.000 | ~800 |
| MT-Adapter | 1024-2048 | All | 1.000 | 1.000 | ~670 (-13%) |
| Base | 4096 (RoPE) | All | 1.000 | 1.000 | ~580 |
| MT-Adapter | 4096 (RoPE) | All | 1.000 | 1.000 | ~545 |
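For readers unfamiliar with the task, the sketch below shows one way a Needle-in-a-Haystack sample could be built and scored with the exact-match and contains metrics reported in the table. It is an assumption about the general procedure, not the evaluation script behind these numbers: `build_haystack`, `score`, the filler sentence, and the needle text are all hypothetical, and context length is counted in whitespace-separated words rather than tokenizer tokens.

```python
def build_haystack(needle: str, filler: str, n_words: int, depth: float) -> str:
    """Return n_words of repeated filler with the needle inserted at a
    relative depth in [0, 1] (0 = start of the context, 1 = end)."""
    if not 0.0 <= depth <= 1.0:
        raise ValueError("depth must be between 0 and 1")
    filler_words = filler.split()
    repeats = n_words // len(filler_words) + 1
    words = (filler_words * repeats)[:n_words]
    cut = int(round(depth * len(words)))
    return " ".join(words[:cut] + needle.split() + words[cut:])


def score(prediction: str, answer: str) -> dict[str, float]:
    """Exact: the stripped prediction equals the answer.
    Contains: the answer appears anywhere in the prediction."""
    pred = prediction.strip().lower()
    gold = answer.strip().lower()
    return {"exact": float(pred == gold), "contains": float(gold in pred)}


if __name__ == "__main__":
    needle = "The secret passcode is 7421."
    filler = "The sky was clear and the grass was green."
    context = build_haystack(needle, filler, n_words=200, depth=0.5)
    prompt = context + "\nWhat is the secret passcode?"
    print(len(prompt.split()), "words in prompt")
    print(score("7421", "7421"))
    print(score("The passcode is 7421.", "7421"))
```

Sweeping `depth` over several values and averaging the scores per context length would give one row of a table like the one above, under the stated assumptions.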
Using RoPE scaling, we extended the 2048-token window to 4096 tokens. Per the table, the MT-Adapter lowers throughput from ~800 to ~670 tok/s at 1024-2048 tokens and from ~580 to ~545 tok/s (about 6%) at 4096 tokens. The liquid dynamics are fully parallelized, and retrieval accuracy is unchanged: exact-match and contains scores stay at 1.000 in every configuration.
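Linear RoPE scaling (often called position interpolation) divides each position index by the scaling factor before the rotary angles are computed, so positions beyond the original window are mapped back into the range seen during pretraining. The sketch below illustrates this in plain Python. `rope_angles` is a hypothetical helper, and the head dimension of 64 is an assumed value for illustration rather than something read from the model config.

```python
import math


def rope_angles(position: int, head_dim: int, theta: float = 10000.0,
                factor: float = 1.0) -> list[float]:
    """Rotary angles for one position.

    Each of the head_dim // 2 frequency pairs i gets the angle
    (position / factor) * theta ** (-2 * i / head_dim).
    """
    scaled_pos = position / factor
    return [scaled_pos * theta ** (-2.0 * i / head_dim) for i in range(head_dim // 2)]


if __name__ == "__main__":
    head_dim = 64   # assumed for illustration
    factor = 4.0    # matches the rope_scaling factor in the usage snippet
    unscaled = rope_angles(4095, head_dim)
    scaled = rope_angles(4095, head_dim, factor=factor)
    # With factor 4.0, position 4095 is rotated like unscaled position 1023.75,
    # which lies inside the original 2048-token window.
    print(f"unscaled first angle: {unscaled[0]:.2f}")
    print(f"scaled first angle:   {scaled[0]:.2f}")
    print("equivalent unscaled position:", 4095 / factor)
```

Note that a factor of 4.0 nominally stretches a 2048-token window to 8192 positions; the results reported here only cover contexts up to 4096 tokens.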
Paper
Please refer to the attached detailed papers for architecture formulation, Anesthesia Validation Protocol, and mathematical derivations: