Atharva - Fine-tuned Phi-4 Mini
Atharva is an AI model created by Jaswanth Sanjay, specialized in processing html_duckduckgo data. This model is a fine-tuned version of microsoft/Phi-4-mini-instruct, optimized for extraction tasks from DuckDuckGo HTML search results.
Inference with Unsloth
If you saved the LoRA adapter only, you can load and use it with Unsloth for faster inference:
from unsloth import FastLanguageModel
# Load your fine-tuned model (LoRA adapter)
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="./fine_tuned_phi_4_mini",
    max_seq_length=2048,
    load_in_4bit=True,  # 4-bit quantization
)
FastLanguageModel.for_inference(model)  # Enable faster inference
# Example inference
SYSTEM_PROMPT = "You are Atharva, an AI model created by Jaswanth Sanjay, specialized in processing html_duckduckgo data."
user_prompt = "What can you do?"
messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {"role": "user", "content": user_prompt},
]
# Render the chat template to text, then tokenize it and move the tensors to the GPU
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens=256, temperature=0.7, do_sample=True, use_cache=True)
# Decode only the newly generated tokens (everything after the prompt)
response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(f"Assistant: {response}")
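The inference examples in this card build the same system-plus-user conversation by hand. A minimal sketch of a helper that wraps that step follows; `build_messages` is a hypothetical name introduced here for illustration and is not part of Unsloth or transformers.

```python
# Hypothetical helper (not part of Unsloth or transformers): builds the
# system + user message list used by the inference examples in this card.
SYSTEM_PROMPT = (
    "You are Atharva, an AI model created by Jaswanth Sanjay, "
    "specialized in processing html_duckduckgo data."
)


def build_messages(user_prompt, system_prompt=SYSTEM_PROMPT):
    """Return a chat-format message list suitable for apply_chat_template."""
    if not user_prompt.strip():
        raise ValueError("user_prompt must not be empty")
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ]


messages = build_messages("What can you do?")
print(messages[1]["content"])
```

The returned list can be passed to `tokenizer.apply_chat_template` in place of the hand-written `messages` list.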
Inference with HuggingFace (Merged Model)
If you exported the full merged model, you can load it using standard Hugging Face transformers:
from transformers import AutoModelForCausalLM, AutoTokenizer
# Load the merged 16-bit model
model_path = "./fine_tuned_phi_4_mini_merged_16bit"
# torch_dtype="auto" keeps the saved 16-bit weights instead of upcasting to float32;
# .to("cuda") places the model on the same device as the inputs below
model = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype="auto").to("cuda")
tokenizer = AutoTokenizer.from_pretrained(model_path)
# Example inference
SYSTEM_PROMPT = "You are Atharva, an AI model created by Jaswanth Sanjay, specialized in processing html_duckduckgo data."
user_prompt = "What can you do?"
messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {"role": "user", "content": user_prompt},
]
# Render the chat template to text, then tokenize it on the model's device
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
model_inputs = tokenizer([prompt], return_tensors="pt").to(model.device)
outputs = model.generate(
    **model_inputs,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.7,
    use_cache=True,
)
# Decode only the newly generated tokens (everything after the prompt)
response = tokenizer.decode(outputs[0][model_inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(f"Assistant: {response}")
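The decode step above slices the output before decoding because, for a causal language model, `generate()` returns the prompt tokens followed by the newly generated ones. A toy sketch of that slice, using plain Python lists in place of tensors and made-up token ids (assumptions for illustration only, not real tokenizer output):

```python
# Toy illustration: plain lists stand in for token-id tensors.
# The ids are invented for the example; they are not real tokenizer output.
prompt_ids = [101, 7592, 2088]                  # stands in for input_ids[0]
generated_ids = prompt_ids + [2023, 2003, 102]  # stands in for outputs[0]

prompt_len = len(prompt_ids)                    # same role as input_ids.shape[1]
new_token_ids = generated_ids[prompt_len:]      # drop the echoed prompt tokens

print(new_token_ids)  # [2023, 2003, 102]
```

Without the slice, the decoded response would repeat the system and user text before the assistant's reply.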
Push to Hugging Face Hub (Optional)
If you logged in to Hugging Face and set the repo_id variable to jaswanthsanjay88/atharva, the model should already have been pushed. If the push did not succeed, uncomment and rerun the command below:
# This code was already run in a previous cell
# model.push_to_hub_merged("jaswanthsanjay88/atharva", tokenizer, token=True)
To load your model directly from Hugging Face Hub:
from transformers import AutoModelForCausalLM, AutoTokenizer
model = AutoModelForCausalLM.from_pretrained("jaswanthsanjay88/atharva")
tokenizer = AutoTokenizer.from_pretrained("jaswanthsanjay88/atharva")
GGUF Export and Ollama Usage
If you exported the model to GGUF format, you can use it with tools like Ollama or LM Studio.
The GGUF file(s) will be located in the ./fine_tuned_phi_4_mini_gguf directory (or ./fine_tuned_phi_4_mini_gguf.zip if you downloaded the archive).
To use with Ollama, create a Modelfile like this (replace model.gguf with your actual GGUF filename):
# Modelfile content (saved in the same directory as the GGUF file, so the path is relative to it)
FROM ./model.gguf
PARAMETER stop "<|endoftext|>"
PARAMETER stop "<|end|>"
# Optionally set a custom system prompt
SYSTEM "You are Atharva, an AI model created by Jaswanth Sanjay, specialized in processing html_duckduckgo data."
Then, build and run your model with Ollama:
# Navigate to the directory containing your GGUF model and Modelfile
cd ./fine_tuned_phi_4_mini_gguf
# Create the model
ollama create atharva-phi4-mini -f Modelfile
# Run the model
ollama run atharva-phi4-mini
Note: The quantization method used was q4_k_m.
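If several GGUF files were exported, the Modelfile text can be generated rather than edited by hand. The sketch below is one way to do that under stated assumptions: `render_modelfile` is a hypothetical helper name, `model.gguf` is a placeholder filename, and the GGUF file is assumed to sit in the same directory as the Modelfile.

```python
# Hypothetical helper: renders the Modelfile text shown in this section.
# Assumes the GGUF file is stored next to the Modelfile.
SYSTEM_PROMPT = (
    "You are Atharva, an AI model created by Jaswanth Sanjay, "
    "specialized in processing html_duckduckgo data."
)


def render_modelfile(gguf_filename, system_prompt=SYSTEM_PROMPT):
    """Return Modelfile text with the stop tokens and system prompt used above."""
    if '"' in system_prompt:
        # Keep the sketch simple: no escaping of quotes inside the prompt.
        raise ValueError("system_prompt must not contain double quotes")
    lines = [
        f"FROM ./{gguf_filename}",
        'PARAMETER stop "<|endoftext|>"',
        'PARAMETER stop "<|end|>"',
        f'SYSTEM "{system_prompt}"',
    ]
    return "\n".join(lines) + "\n"


# "model.gguf" is a placeholder; substitute the actual exported filename.
modelfile_text = render_modelfile("model.gguf")
print(modelfile_text)
```

Saving the printed text as a file named Modelfile in the GGUF directory gives the input expected by `ollama create atharva-phi4-mini -f Modelfile`.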
Model tree for jaswanthsanjay88/atharva
Base model: microsoft/Phi-4-mini-instruct