DrRiceIO7/meanDPO
How to use DrRiceIO7/HereticFT-Aggressive with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
# The messages below include an image, so use the image-text-to-text task
pipe = pipeline("image-text-to-text", model="DrRiceIO7/HereticFT-Aggressive")
messages = [
{
"role": "user",
"content": [
{"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
{"type": "text", "text": "What animal is on the candy?"}
]
},
]
pipe(text=messages)
# Load model directly
from transformers import AutoProcessor, AutoModelForImageTextToText
processor = AutoProcessor.from_pretrained("DrRiceIO7/HereticFT-Aggressive")
model = AutoModelForImageTextToText.from_pretrained("DrRiceIO7/HereticFT-Aggressive")
messages = [
{
"role": "user",
"content": [
{"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
{"type": "text", "text": "What animal is on the candy?"}
]
},
]
inputs = processor.apply_chat_template(
messages,
add_generation_prompt=True,
tokenize=True,
return_dict=True,
return_tensors="pt",
).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=40)
print(processor.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
How to use DrRiceIO7/HereticFT-Aggressive with vLLM:
# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "DrRiceIO7/HereticFT-Aggressive"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "DrRiceIO7/HereticFT-Aggressive",
"messages": [
{
"role": "user",
"content": "What is the capital of France?"
}
]
}'
docker model run hf.co/DrRiceIO7/HereticFT-Aggressive
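The curl call above can also be made from Python. This is a minimal sketch using only the standard library, assuming the server is already running on localhost and follows the usual OpenAI chat completions response shape. The helper names `build_chat_request` and `chat` are made up for illustration and are not part of vLLM or any other library.

```python
import json
import urllib.request


def build_chat_request(
    prompt,
    model="DrRiceIO7/HereticFT-Aggressive",
    base_url="http://localhost:8000",  # assumption: default port used in the curl example
):
    """Build a POST request for the OpenAI-compatible chat completions route."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt, **kwargs):
    """Send the prompt to the server and return the first reply as text."""
    request = build_chat_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    # OpenAI-style responses put the reply under choices[0].message.content
    return body["choices"][0]["message"]["content"]


# Example (needs a running server):
# print(chat("What is the capital of France?"))
```

Passing `base_url="http://localhost:30000"` should point the same helper at a server listening on port 30000, since the request body is identical.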
How to use DrRiceIO7/HereticFT-Aggressive with SGLang:
# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
--model-path "DrRiceIO7/HereticFT-Aggressive" \
--host 0.0.0.0 \
--port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "DrRiceIO7/HereticFT-Aggressive",
"messages": [
{
"role": "user",
"content": "What is the capital of France?"
}
]
}'
# Or start the SGLang server with Docker:
docker run --gpus all \
--shm-size 32g \
-p 30000:30000 \
-v ~/.cache/huggingface:/root/.cache/huggingface \
--env "HF_TOKEN=<secret>" \
--ipc=host \
lmsysorg/sglang:latest \
python3 -m sglang.launch_server \
--model-path "DrRiceIO7/HereticFT-Aggressive" \
--host 0.0.0.0 \
--port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "DrRiceIO7/HereticFT-Aggressive",
"messages": [
{
"role": "user",
"content": "What is the capital of France?"
}
]
}'
How to use DrRiceIO7/HereticFT-Aggressive with Unsloth Studio:
# Install with the shell script:
curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for DrRiceIO7/HereticFT-Aggressive to start chatting

# Install with PowerShell:
irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for DrRiceIO7/HereticFT-Aggressive to start chatting

# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for DrRiceIO7/HereticFT-Aggressive to start chatting
pip install unsloth
from unsloth import FastModel
model, tokenizer = FastModel.from_pretrained(
model_name="DrRiceIO7/HereticFT-Aggressive",
max_seq_length=2048,
)
How to use DrRiceIO7/HereticFT-Aggressive with Docker Model Runner:
docker model run hf.co/DrRiceIO7/HereticFT-Aggressive
Originally, I wanted to try fine-tuning my model with DPO, but I couldn't figure out how to get Unsloth to do it with Gemma-based models, so this is based on regular old SFT. It still has that abrasive edge, though, so I'm calling it a success, but only a partial one, since it seems a little bit unstable. Next plan: try out a new architecture.
This gemma3 model was trained 2x faster with Unsloth and Hugging Face's TRL library.
Base model
DrRiceIO7/mergedheretic