Instructions for using jaswanthsanjay88/atharva with libraries and local apps.

Libraries

How to use jaswanthsanjay88/atharva with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="jaswanthsanjay88/atharva", trust_remote_code=True)
messages = [
    {"role": "user", "content": "Who are you?"},
]
print(pipe(messages))

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("jaswanthsanjay88/atharva", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("jaswanthsanjay88/atharva", trust_remote_code=True)
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use jaswanthsanjay88/atharva with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "jaswanthsanjay88/atharva"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "jaswanthsanjay88/atharva",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
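
The curl call above can also be issued from Python. The following is a minimal sketch using only the standard library, assuming the vLLM server started above is listening on localhost:8000. The helper names `build_chat_request` and `send_chat_request` are illustrative and are not part of vLLM.

```python
import json
import urllib.request

# Assumed defaults taken from the `vllm serve` and curl commands above.
BASE_URL = "http://localhost:8000"
MODEL_ID = "jaswanthsanjay88/atharva"


def build_chat_request(prompt, model=MODEL_ID, base_url=BASE_URL):
    """Build a POST request for the OpenAI-compatible chat completions route."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat_request(request, timeout=60):
    """Send the request and return the decoded JSON response as a dict."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, `send_chat_request(build_chat_request("What is the capital of France?"))` returns the parsed response. Passing a different `base_url` targets a server on another host or port.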

Use Docker

docker model run hf.co/jaswanthsanjay88/atharva

SGLang

How to use jaswanthsanjay88/atharva with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "jaswanthsanjay88/atharva" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "jaswanthsanjay88/atharva",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "jaswanthsanjay88/atharva" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "jaswanthsanjay88/atharva",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
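
The curl requests above return JSON. Assuming the server answers with a non-streaming response in the usual OpenAI chat completion shape, the reply text sits under `choices[0].message.content`. The sketch below shows one way to pull it out; `extract_reply` is a hypothetical helper, and `sample_response` is hand-written for illustration rather than real server output.

```python
def extract_reply(response):
    """Return the assistant text from an OpenAI-style chat completion dict."""
    choices = response.get("choices") or []
    if not choices:
        raise ValueError("response contains no choices")
    return choices[0]["message"]["content"]


# Hand-written sample in the OpenAI chat completion shape (not real output).
sample_response = {
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "Paris."},
            "finish_reason": "stop",
        }
    ]
}

print(extract_reply(sample_response))  # prints: Paris.
```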

Unsloth Studio

How to use jaswanthsanjay88/atharva with Unsloth Studio:

Install Unsloth Studio (macOS, Linux, WSL)

curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for jaswanthsanjay88/atharva to start chatting

Install Unsloth Studio (Windows)

irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for jaswanthsanjay88/atharva to start chatting

Using HuggingFace Spaces for Unsloth

# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for jaswanthsanjay88/atharva to start chatting

Load model with FastModel

# Install Unsloth first (run in a shell): pip install unsloth
from unsloth import FastModel
model, tokenizer = FastModel.from_pretrained(
    model_name="jaswanthsanjay88/atharva",
    max_seq_length=2048,
)

Docker Model Runner
How to use jaswanthsanjay88/atharva with Docker Model Runner:
```
docker model run hf.co/jaswanthsanjay88/atharva
```

Atharva - Fine-tuned Phi-4 Mini

Atharva is an AI model created by Jaswanth Sanjay, specialized in processing html_duckduckgo data. It is a fine-tuned version of microsoft/Phi-4-mini-instruct, optimized for extracting information from DuckDuckGo HTML search results.

Inference with Unsloth

If you saved the LoRA adapter only, you can load and use it with Unsloth for faster inference:

from unsloth import FastLanguageModel

# Load your fine-tuned model (LoRA adapter)
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="./fine_tuned_phi_4_mini",
    max_seq_length=2048,
    load_in_4bit=True, # 4-bit quantization
)
FastLanguageModel.for_inference(model) # Enable faster inference

# Example inference
SYSTEM_PROMPT = "You are Atharva, an AI model created by Jaswanth Sanjay, specialized in processing html_duckduckgo data."
user_prompt = "What can you do?"

messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {"role": "user", "content": user_prompt}
]

inputs = tokenizer(tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True), return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens=256, temperature=0.7, do_sample=True, use_cache=True)
response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(f"Assistant: {response}")
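
The slice in the decode step above exists because `generate` on a causal language model returns the prompt tokens followed by the newly generated ones. A small standalone sketch with plain lists shows the idea; the token ids are made up for illustration and do not come from the Atharva tokenizer.

```python
# Hypothetical token ids standing in for a tokenized prompt.
prompt_ids = [200, 17, 42, 201]

# generate() output for a causal LM: the prompt followed by the new tokens.
generated_ids = prompt_ids + [88, 91, 202]

# Same idea as outputs[0][inputs["input_ids"].shape[1]:] in the snippet above:
# skip as many tokens as the prompt contained and keep the rest.
prompt_length = len(prompt_ids)
new_token_ids = generated_ids[prompt_length:]

print(new_token_ids)  # [88, 91, 202]
```

Decoding only `new_token_ids` keeps the system and user messages out of the printed response.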

Inference with HuggingFace (Merged Model)

If you exported the full merged model, you can load it using standard Hugging Face transformers:

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the merged 16-bit model and move it to the GPU, where the inputs below are placed
model_path = "./fine_tuned_phi_4_mini_merged_16bit"
model = AutoModelForCausalLM.from_pretrained(model_path).to("cuda")
tokenizer = AutoTokenizer.from_pretrained(model_path)

# Example inference
SYSTEM_PROMPT = "You are Atharva, an AI model created by Jaswanth Sanjay, specialized in processing html_duckduckgo data."
user_prompt = "What can you do?"

messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {"role": "user", "content": user_prompt}
]

inputs = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
model_inputs = tokenizer([inputs], return_tensors="pt").to("cuda")

outputs = model.generate(
    **model_inputs,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.7,
    use_cache=True,
)
response = tokenizer.decode(outputs[0][model_inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(f"Assistant: {response}")
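
The generation call above samples with `do_sample=True` and `temperature=0.7`. As a rough illustration of what the temperature does (a standalone sketch, not the code path that transformers executes), logits are divided by the temperature before the softmax, so values below 1.0 concentrate probability on the highest-scoring tokens. The logits here are invented for the example.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Convert logits to probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for three candidate tokens.
logits = [2.0, 1.0, 0.0]
print(softmax_with_temperature(logits, temperature=1.0))
print(softmax_with_temperature(logits, temperature=0.7))
```

At 0.7 the top token receives a larger share than at 1.0, which makes sampled output less varied.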

Push to Hugging Face Hub (Optional)

If you were logged in to Hugging Face and set the repo_id variable to jaswanthsanjay88/atharva, the model should already have been pushed. If not, uncomment and rerun the push command:

# This code was already run in a previous cell
# model.push_to_hub_merged("jaswanthsanjay88/atharva", tokenizer, token=True)

To load your model directly from Hugging Face Hub:

from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("jaswanthsanjay88/atharva")
tokenizer = AutoTokenizer.from_pretrained("jaswanthsanjay88/atharva")

GGUF Export and Ollama Usage

If you exported the model to GGUF format, you can use it with tools like Ollama or LM Studio.

The GGUF file(s) will be located in the ./fine_tuned_phi_4_mini_gguf directory (or ./fine_tuned_phi_4_mini_gguf.zip if you downloaded the archive).

To use with Ollama, create a Modelfile like this (replace model.gguf with your actual GGUF filename):

# Modelfile content
FROM ./model.gguf
PARAMETER stop "<|endoftext|>"
PARAMETER stop "<|end|>"

# Optionally set a custom system prompt
SYSTEM "You are Atharva, an AI model created by Jaswanth Sanjay, specialized in processing html_duckduckgo data."
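
Because the GGUF filename depends on the export, the Modelfile can also be generated rather than edited by hand. The sketch below is one possible approach, assuming the Modelfile lives in the same directory as the GGUF file; `find_gguf`, `render_modelfile`, and `write_modelfile` are hypothetical helpers, not part of Ollama or Unsloth.

```python
from pathlib import Path

SYSTEM_PROMPT = (
    "You are Atharva, an AI model created by Jaswanth Sanjay, "
    "specialized in processing html_duckduckgo data."
)


def find_gguf(directory):
    """Return the first .gguf file in the directory, sorted by name."""
    matches = sorted(Path(directory).glob("*.gguf"))
    if not matches:
        raise FileNotFoundError(f"no .gguf file found in {directory}")
    return matches[0]


def render_modelfile(gguf_name, system_prompt=SYSTEM_PROMPT):
    """Build Modelfile text with FROM pointing at a file next to the Modelfile."""
    return "\n".join(
        [
            f"FROM ./{gguf_name}",
            'PARAMETER stop "<|endoftext|>"',
            'PARAMETER stop "<|end|>"',
            f'SYSTEM "{system_prompt}"',
            "",
        ]
    )


def write_modelfile(directory):
    """Write a Modelfile next to the GGUF file and return its path."""
    gguf_path = find_gguf(directory)
    modelfile_path = Path(directory) / "Modelfile"
    modelfile_path.write_text(render_modelfile(gguf_path.name), encoding="utf-8")
    return modelfile_path
```

Calling `write_modelfile("./fine_tuned_phi_4_mini_gguf")` would write the Modelfile into the export directory. If that directory holds several GGUF files, the first one by name is used.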

Then, build and run your model with Ollama:

# Navigate to the directory containing your GGUF model and Modelfile
cd ./fine_tuned_phi_4_mini_gguf

# Create the model
ollama create atharva-phi4-mini -f Modelfile

# Run the model
ollama run atharva-phi4-mini

Note: The quantization method used was q4_k_m.

Safetensors

Model size: 4B params
Tensor type: BF16

Model tree for jaswanthsanjay88/atharva

Base model: microsoft/Phi-4-mini-instruct
Fine-tuned: this model