allenai/mslr2022
How to use NotXia/longformer-bio-ext-summ with Transformers:
# Use a pipeline as a high-level helper
# Warning: Pipeline type "summarization" is no longer supported in transformers v5.
# You must load the model directly (see below) or downgrade to v4.x with:
# pip install "transformers<5.0.0"
from transformers import pipeline
pipe = pipeline("summarization", model="NotXia/longformer-bio-ext-summ", trust_remote_code=True)
# Load model directly
from transformers import AutoModel
model = AutoModel.from_pretrained("NotXia/longformer-bio-ext-summ", trust_remote_code=True, dtype="auto")

Work done for my Bachelor's thesis.
Longformer fine-tuned on MS^2 for extractive summarization.
The model architecture is similar to BERTSum.
Training code is available at biomed-ext-summ.
from transformers import AutoTokenizer, pipeline

summarizer = pipeline("summarization",
    model = "NotXia/longformer-bio-ext-summ",
    tokenizer = AutoTokenizer.from_pretrained("NotXia/longformer-bio-ext-summ"),
    trust_remote_code = True,
    device = 0  # first GPU; use -1 to run on CPU
)

# The input is a document already split into sentences
sentences = ["sent1.", "sent2.", "sent3?"]
summarizer({"sentences": sentences}, strategy="count", strategy_args=2)
>>> (['sent1.', 'sent2.'], [0, 1])
Strategies to summarize the document:
- length: summary with a maximum length (strategy_args is the maximum length).
- count: summary with the given number of sentences (strategy_args is the number of sentences).
- ratio: summary proportional to the length of the document (strategy_args is the ratio, in [0, 1]).
- threshold: summary containing only the sentences whose score is higher than a given value (strategy_args is the minimum score).
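
To make the four strategies concrete, here is a minimal sketch of how sentences could be selected once each one has a score. It is not the model's actual code. The helper name `select_sentences` is hypothetical, the scores are passed in by hand rather than produced by the model, and several details are assumptions: `ratio` is applied to the number of sentences, `length` is counted in whitespace-separated words, and `length` fills the budget greedily from the highest score down. The selected sentences are returned in document order together with their indices, matching the shape of the output shown in the usage example.

```python
from typing import List, Tuple, Union


def select_sentences(
    sentences: List[str],
    scores: List[float],
    strategy: str,
    strategy_args: Union[int, float],
) -> Tuple[List[str], List[int]]:
    """Hypothetical helper: pick sentences by score, return them in document order."""
    if len(sentences) != len(scores):
        raise ValueError("sentences and scores must have the same length")

    # Indices ranked from highest to lowest score (ties keep document order).
    ranked = sorted(range(len(sentences)), key=lambda i: -scores[i])

    if strategy == "count":
        # Keep the top-N sentences.
        chosen = ranked[: int(strategy_args)]
    elif strategy == "ratio":
        if not 0 <= strategy_args <= 1:
            raise ValueError("ratio must be in [0, 1]")
        # Assumption: the ratio applies to the sentence count, rounded down.
        chosen = ranked[: int(len(sentences) * strategy_args)]
    elif strategy == "threshold":
        # Keep every sentence scoring strictly above the minimum score.
        chosen = [i for i in ranked if scores[i] > strategy_args]
    elif strategy == "length":
        # Assumption: length is measured in whitespace-separated words, and
        # sentences that do not fit in the remaining budget are skipped.
        chosen, used = [], 0
        for i in ranked:
            n_words = len(sentences[i].split())
            if used + n_words <= strategy_args:
                chosen.append(i)
                used += n_words
    else:
        raise ValueError(f"unknown strategy: {strategy}")

    chosen = sorted(chosen)  # restore document order
    return [sentences[i] for i in chosen], chosen
```

For example, with scores `[0.9, 0.2, 0.7, 0.4]` over four sentences, `strategy="count"` with `strategy_args=2` keeps the sentences at indices 0 and 2, while `strategy="threshold"` with `strategy_args=0.3` keeps indices 0, 2 and 3.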