pt-sk/research_papers_short
How to use pt-sk/mamba_ml_abstract with Transformers:
# Load model directly
from transformers import AutoModel
model = AutoModel.from_pretrained("pt-sk/mamba_ml_abstract", dtype="auto")

This model uses the Mamba architecture and was trained on a dataset of research abstracts.
Import the scripts from the code folder
from model import Mamba, ModelArgs
Loading Model
mamba_model = Mamba.from_pretrained("pt-sk/mamba").to("cuda")
Loading Tokenizer
from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained('pt-sk/mamba')
The mamba_reserach file contains the state dicts of both the optimizer and the model.
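A checkpoint that bundles both state dicts is usually a single mapping, so resuming training means pulling the two entries apart. The sketch below is one way to do that, not the author's code: the key names `model_state_dict` and `optimizer_state_dict` are hypothetical, since this card does not describe the file's layout, and `load_checkpoint` assumes the file was written with `torch.save`.

```python
from typing import Any, Mapping, Tuple

# Hypothetical key names: the checkpoint layout is not documented here,
# so inspect the loaded mapping and adjust these if they differ.
MODEL_KEY = "model_state_dict"
OPTIMIZER_KEY = "optimizer_state_dict"


def split_checkpoint(
    checkpoint: Mapping[str, Any],
    model_key: str = MODEL_KEY,
    optimizer_key: str = OPTIMIZER_KEY,
) -> Tuple[Any, Any]:
    """Return (model_state, optimizer_state) from a checkpoint mapping."""
    missing = [key for key in (model_key, optimizer_key) if key not in checkpoint]
    if missing:
        # List the keys that are present so the caller can see the real layout.
        raise KeyError(
            f"checkpoint is missing {missing}; available keys: {sorted(checkpoint)}"
        )
    return checkpoint[model_key], checkpoint[optimizer_key]


def load_checkpoint(path: str, map_location: str = "cpu") -> Mapping[str, Any]:
    """Load a checkpoint file, assuming it was saved with torch.save."""
    import torch  # imported lazily so split_checkpoint works without PyTorch

    return torch.load(path, map_location=map_location)
```

With the file downloaded locally, `model_state, optimizer_state = split_checkpoint(load_checkpoint(path))` followed by `mamba_model.load_state_dict(model_state)` would restore the weights, provided the keys match. Loading onto the CPU first and moving the model to the GPU afterwards avoids failures on machines without CUDA.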