LAMP HydrAMP

LAMP (latent antimicrobial peptide modelling) — HydrAMP is an encoder/decoder for short amino-acid sequences: the encoder maps token IDs to a 64-D latent Gaussian (mean, log_std); the decoder maps latent vectors plus a 2-D condition vector to per-position amino-acid logits. This Hub repo ships remote Python code; load with trust_remote_code=True.
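As a quick orientation, the tensor shapes flow as follows — an illustrative sketch drawn from the description above, not code shipped with the repo:

# input_ids                [batch, seq_len]    token IDs from the AA tokenizer
# encoder(input_ids)    -> mean, log_std       each [batch, 64]
# decoder(z, condition) -> logits              [batch, seq_len, vocab_size]
#   with z [batch, 64] and condition [batch, 2]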

When you publish results or reuse the HydrAMP architecture, cite the original Nature Communications paper (Szymczak et al., 2023); the Citation section at the bottom of this README provides the BibTeX entry and links.

Model repo: pszmk/hydramp

Tokenize peptide strings with the companion HydrAMP AA tokenizer Hub repo (typically pszmk/hydramp-aa-tokenizer, published alongside this model).

Requirements

  • transformers with AutoModel and remote-code loading
  • torch
  • A separate HydrAMP AA tokenizer model repo (AutoTokenizer, trust_remote_code=True)

Load the model

from transformers import AutoModel

model = AutoModel.from_pretrained(
    "pszmk/hydramp",
    revision="main",
    trust_remote_code=True,
)
model.eval()

Tokenize, encode, reconstruct

Calling forward encodes input_ids and decodes from the latent mean (a deterministic reconstruction).

import torch
from transformers import AutoModel, AutoTokenizer

model_id = "pszmk/hydramp"
tokenizer_id = "pszmk/hydramp-aa-tokenizer"

tokenizer = AutoTokenizer.from_pretrained(
    tokenizer_id,
    revision="main",
    trust_remote_code=True,
)
model = AutoModel.from_pretrained(
    model_id,
    revision="main",
    trust_remote_code=True,
)
model.eval()

batch = tokenizer(
    ["ACDEFGHIKLMNPQRSTVWY"],
    padding="max_length",
    truncation=True,
    max_length=model.config.sequence_length,
    return_tensors="pt",
)

with torch.no_grad():
    out = model(batch["input_ids"])

# out.mean and out.log_std have shape [batch, 64]; out.logits has shape [batch, seq_len, vocab_size]
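
Because the encoder returns the parameters of a latent Gaussian, you can also draw a stochastic latent with the reparameterization trick instead of reusing the mean — a minimal sketch, assuming out.mean and out.log_std are [batch, 64] tensors as above:

# Sample z ~ N(mean, exp(log_std)^2) via the reparameterization trick.
eps = torch.randn_like(out.mean)                     # standard normal noise
z_sampled = out.mean + torch.exp(out.log_std) * eps  # stochastic latent, [batch, 64]

A z_sampled drawn this way can be decoded exactly like the mean in the next section.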

Decode from a custom latent z

z has shape [batch, 64]. condition defaults to the model's default_condition buffer when omitted.

z = out.mean  # reuse the latent mean from the reconstruction above
with torch.no_grad():
    logits = model.forward_latent_positions(z, return_logits=True).logits
    greedy_ids = model.decode_to_token_ids(z)  # or logits.argmax(dim=-1)
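
To steer generation, you can pass an explicit 2-D condition instead of relying on the default_condition buffer, and then map token IDs back to peptide strings. A hedged sketch — the condition keyword name and the tokenizer's batch_decode support are assumptions based on the description above and on the standard PreTrainedTokenizer interface, so check the remote code if yours differ:

# The keyword name `condition` is an assumption inferred from the
# default_condition behaviour described above.
condition = torch.zeros((z.shape[0], 2))  # explicit 2-D condition per sequence
with torch.no_grad():
    cond_logits = model.forward_latent_positions(
        z, condition=condition, return_logits=True
    ).logits

# Map greedy token IDs back to peptide strings; assumes the remote tokenizer
# implements the standard batch_decode interface.
peptides = tokenizer.batch_decode(greedy_ids, skip_special_tokens=True)
print(peptides)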

Citation

The HydrAMP architecture and original model were introduced by Szymczak et al. in Nature Communications (2023). When you refer to HydrAMP or build on this work, please cite:

@article{szymczak_discovering_2023,
  title = {Discovering highly potent antimicrobial peptides with deep generative model {HydrAMP}},
  volume = {14},
  issn = {2041-1723},
  url = {https://www.nature.com/articles/s41467-023-36994-z},
  doi = {10.1038/s41467-023-36994-z},
  abstract = {Antimicrobial peptides emerge as compounds that can alleviate the global health hazard of antimicrobial resistance, prompting a need for novel computational approaches to peptide generation. Here, we propose HydrAMP, a conditional variational autoencoder that learns lower-dimensional, continuous representation of peptides and captures their antimicrobial properties. The model disentangles the learnt representation of a peptide from its antimicrobial conditions and leverages parameter-controlled creativity. HydrAMP is the first model that is directly optimized for diverse tasks, including unconstrained and analogue generation and outperforms other approaches in these tasks. An additional preselection procedure based on ranking of generated peptides and molecular dynamics simulations increases experimental validation rate. Wet-lab experiments on five bacterial strains confirm high activity of nine peptides generated as analogues of clinically relevant prototypes, as well as six analogues of an inactive peptide. HydrAMP enables generation of diverse and potent peptides, making a step towards resolving the antimicrobial resistance crisis.},
  language = {en},
  number = {1},
  journal = {Nature Communications},
  author = {Szymczak, Paulina and Możejko, Marcin and Grzegorzek, Tomasz and Jurczak, Radosław and Bauer, Marta and Neubauer, Damian and Sikora, Karol and Michalski, Michał and Sroka, Jacek and Setny, Piotr and Kamysz, Wojciech and Szczurek, Ewa},
  month = mar,
  year = {2023},
  keywords = {Computational models, Machine learning, Protein design},
  pages = {1453},
}

Export provenance

The weights were bundled from the local directory /home/pszmk/Latent-Anti-Microbial-Peptides-LAMP/src/hydramp/weights into this Hugging Face layout.
