How to use JunlongTong/StreamingLLM with Transformers:
    # Load model directly
    from transformers import AutoModel

    model = AutoModel.from_pretrained("JunlongTong/StreamingLLM", dtype="auto")
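
As a quick sanity check after loading, it can help to confirm which model class was instantiated and how many parameters it has. The following is a minimal sketch, not part of the original instructions: `load_model` and `count_parameters` are hypothetical helper names introduced here for illustration, and the sketch assumes the checkpoint loads through `AutoModel` exactly as shown above. On older Transformers releases the `dtype` argument may need to be spelled `torch_dtype` instead.

```python
from typing import Any

# Repository name taken from the snippet above.
REPO_ID = "JunlongTong/StreamingLLM"


def load_model(repo_id: str = REPO_ID, dtype: str = "auto") -> Any:
    """Hypothetical helper: load the checkpoint the same way as the snippet above."""
    # Imported lazily so that defining this helper does not require transformers.
    from transformers import AutoModel

    return AutoModel.from_pretrained(repo_id, dtype=dtype)


def count_parameters(model: Any) -> int:
    """Hypothetical helper: total number of elements across all model parameters."""
    return sum(p.numel() for p in model.parameters())


# Usage (downloads the weights on the first call):
# model = load_model()
# print(type(model).__name__, count_parameters(model))
```

The import is kept inside `load_model` so the helpers can be defined, and `count_parameters` reused, without triggering a download.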