Use XLM-R to build LM on X-languages (X>=5) from scratch
How can I train the HF version of XLM-R on our own languages, from scratch? The documentation is not very clear about it. I have mainly been working with this (https://github.com/facebookresearch/XLM), but it is too complex for my purpose.
Hey @bonadossue,
In general, I would not recommend training XLM-R from scratch: it has already been pretrained on a wide range of languages (about 100), so you should be able to just fine-tune it on your preferred language. If you really want to run a full pretraining, though, I'd recommend the following example: https://github.com/huggingface/transformers/blob/main/examples/pytorch/language-modeling/run_mlm.py
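To make the two options concrete, here is a minimal sketch of how the run_mlm.py invocation might differ between fine-tuning and training from scratch. It only assembles the command line and does not run anything. The helper name build_run_mlm_command, the file name fon_bambara.txt, the output directory, and the batch size and sequence length values are all placeholders chosen for illustration. It also assumes that reusing the released XLM-R tokenizer is acceptable for your languages, since run_mlm.py does not train a tokenizer itself.

```python
import shlex
import sys


def build_run_mlm_command(train_file, output_dir, from_scratch=False,
                          checkpoint="FacebookAI/xlm-roberta-large"):
    """Assemble an argument list for the run_mlm.py example script.

    Hypothetical helper for illustration only; it does not launch training.
    """
    cmd = [sys.executable, "run_mlm.py"]
    if from_scratch:
        # Without --model_name_or_path the script initialises weights randomly.
        # --config_name reuses only the architecture of the checkpoint.
        cmd += ["--config_name", checkpoint]
        # Assumption: the released tokenizer is reused as-is.
        cmd += ["--tokenizer_name", checkpoint]
    else:
        # Fine-tuning: load the pretrained weights, config and tokenizer.
        cmd += ["--model_name_or_path", checkpoint]
    cmd += [
        "--train_file", train_file,       # plain text, one sentence per line
        "--line_by_line",
        "--max_seq_length", "256",        # placeholder value
        "--per_device_train_batch_size", "8",  # placeholder value
        "--do_train",
        "--output_dir", output_dir,
    ]
    return cmd


if __name__ == "__main__":
    scratch = build_run_mlm_command("fon_bambara.txt", "xlmr-scratch",
                                    from_scratch=True)
    print(shlex.join(scratch))
```

Calling the helper with from_scratch=False gives the fine-tuning variant, which is usually the cheaper thing to try first. If the released vocabulary covers your languages poorly, you would probably need to train your own tokenizer separately and pass its path to --tokenizer_name instead.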
Which languages are you mostly interested in?
Many languages, such as Fon, Ghomala, Bambara, etc.
OK, did you try just fine-tuning XLM-R on those languages?