```python
import dataclasses as dc

import numpy as np


@dc.dataclass
class CtrlArguments:
    """Options that control how the training data is loaded and windowed."""

    train_data: str = dc.field(
        default="data/training_cunique_with_distractors.json",
        metadata={"help": "A CSV list of training data files"}
    )
    formulation: str = dc.field(
        default="areg_ltr",
        metadata={"help": "Type of problem definition: autoregressive (areg) or u-PMLM (upmlm) or mixed (if predict_questions is set)"}
    )
    context_strategy: str = dc.field(
        default="take_first",
        metadata={"help": "How to deal with contexts greater than a specified length"}
    )
    tokenizer_file: str = dc.field(
        default="tokenizer.json",
        metadata={"help": "A JSON file (in the format provided by HuggingFace's tokenizers library) with a trained tokenizer"}
    )
    sequence_length: int = dc.field(
        default=256,
        metadata={"help": "The max sequence length"}
    )
    force_prepend_control: bool = dc.field(
        default=False,
        metadata={"help": "If the control code should be prepended for all sliding windows. Otherwise, it is only prepended at the start of the sequence"}
    )


class GradientPrinter:
    """Callable that prints a gradient tensor together with its norm and mean."""

    def __init__(self, name):
        # Label shown in the printed header, e.g. a parameter name.
        self.name = name

    def __call__(self, grad):
        # Detach and move to the CPU so the gradient can be converted to NumPy.
        np_grad = grad.detach().cpu().numpy()
        print("======== GRAD FOR {} ========".format(self.name))
        print("\tGRAD {}".format(grad))
        print("\tGRAD NORM {}".format(np.linalg.norm(np_grad)))
        print("\tGRAD MEAN {}".format(np.mean(np_grad)))
        print()
```