Instructions to use microsoft/Phi-3.5-mini-instruct with libraries and local apps.

Libraries

How to use microsoft/Phi-3.5-mini-instruct with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="microsoft/Phi-3.5-mini-instruct", trust_remote_code=True)
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
# Phi-3.5-mini-instruct is a text-only causal language model, so load it with AutoModelForCausalLM
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("microsoft/Phi-3.5-mini-instruct", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("microsoft/Phi-3.5-mini-instruct", trust_remote_code=True)
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use microsoft/Phi-3.5-mini-instruct with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "microsoft/Phi-3.5-mini-instruct"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "microsoft/Phi-3.5-mini-instruct",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
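
The same request can also be sent from Python. The following is a minimal sketch using only the standard library, assuming the server started above is listening on http://localhost:8000. The helper names `build_chat_payload` and `chat` are hypothetical and not part of vLLM; the response is read using the standard OpenAI chat-completions layout (`choices[0].message.content`).

```python
import json
import urllib.request


def build_chat_payload(model, user_message):
    """Build the JSON body for a /v1/chat/completions request (hypothetical helper)."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }


def chat(base_url, model, user_message, timeout=60):
    """POST a single user message and return the assistant's reply text.

    Assumes an OpenAI-compatible server is reachable at base_url.
    """
    body = json.dumps(build_chat_payload(model, user_message)).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        reply = json.load(response)
    return reply["choices"][0]["message"]["content"]


# Example call (requires the server to be running):
# print(chat("http://localhost:8000", "microsoft/Phi-3.5-mini-instruct",
#            "What is the capital of France?"))
```

Pointing `base_url` at a different OpenAI-compatible endpoint, such as an SGLang server on port 30000, should work the same way.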

Use Docker

docker model run hf.co/microsoft/Phi-3.5-mini-instruct

SGLang

How to use microsoft/Phi-3.5-mini-instruct with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "microsoft/Phi-3.5-mini-instruct" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "microsoft/Phi-3.5-mini-instruct",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "microsoft/Phi-3.5-mini-instruct" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "microsoft/Phi-3.5-mini-instruct",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use microsoft/Phi-3.5-mini-instruct with Docker Model Runner:
```
docker model run hf.co/microsoft/Phi-3.5-mini-instruct
```

Why not include MedQA in your benchmarks?

by Hugman2345 - opened Aug 20, 2024

Discussion

Hugman2345 • Aug 20, 2024 • edited Aug 20, 2024

It's one of the good reasoning benchmarks, built on USMLE questions. It was included in the benchmarks for Phi-3 and its June update, so it would make sense to include it in the Phi-3.5 benchmarks, no?

Thanks for the model and all your work too!

nguyenbh • Microsoft org • Aug 20, 2024 • edited Aug 20, 2024

Thank you for your interest in the Phi-3.5 models! We did benchmark MedQA 🩺 but we will let the community run this benchmark themselves (hint: we think the Phi-3.5 MoE and Mini are very competitive 🌞)
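
For anyone who wants to try this, here is a minimal sketch of how a multiple-choice benchmark like this could be scored. It is not the setup Microsoft used, which the thread does not describe. The item layout (`question`, `options`, `answer`), the prompt wording, and the helper names are all assumptions for illustration, and `generate` stands for any function that maps a prompt string to the model's reply text.

```python
import re

LETTERS = "ABCDE"

# Matches phrases such as "answer is B", "Answer: C", or "answer (D)".
_ANSWER_RE = re.compile(r"[Aa]nswer\s*(?:is|:)?\s*\(?([A-E])\)?(?![A-Za-z])")
# Fallback: a reply that simply starts with a letter, such as "B" or "C) ...".
_LEADING_RE = re.compile(r"^\s*\(?([A-E])\)?(?![A-Za-z])")


def format_prompt(question, options):
    """Render one item as a lettered multiple-choice prompt (assumed wording)."""
    lines = [question, ""]
    for letter, text in zip(LETTERS, options):
        lines.append(f"{letter}. {text}")
    lines.append("")
    lines.append("Reply with the letter of the single best answer.")
    return "\n".join(lines)


def extract_choice(reply):
    """Return the chosen letter from a model reply, or None if none is found."""
    match = _ANSWER_RE.search(reply) or _LEADING_RE.search(reply)
    return match.group(1) if match else None


def accuracy(predictions, golds):
    """Fraction of predictions equal to the gold letter (None counts as wrong)."""
    if not golds:
        raise ValueError("no items to score")
    correct = sum(1 for p, g in zip(predictions, golds) if p == g)
    return correct / len(golds)


def evaluate(items, generate):
    """Score items shaped like {"question", "options", "answer"} with generate()."""
    predictions = [
        extract_choice(generate(format_prompt(item["question"], item["options"])))
        for item in items
    ]
    return accuracy(predictions, [item["answer"] for item in items])
```

Here `generate` could wrap the Transformers pipeline or a call to a local server. Results from a sketch like this depend heavily on the prompt format and answer parsing, so they are not directly comparable to officially reported numbers.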

Hugman2345 • Aug 21, 2024 • edited Aug 21, 2024

It's great and competes with much bigger models on USMLE/medical questions, information, and reasoning. In this area, Phi-3.5 is better than the 7B, 8B, and 9B competitors, and its bigger context size is a plus. Sadly, it feels like it doesn't beat Phi-3-small-8k and Phi-3-medium-4k in this particular area. This is just a first impression and needs to be confirmed by others. It is definitely so much better than other tiny models that it's not even remotely close.

Thanks for Phi-3.5. I don't know how such a small model is even close to the level of big models.

nguyenbh • Microsoft org • Aug 22, 2024 • edited Aug 25, 2024

@Hugman2345 Thank you for your effort in independently benchmarking the Phi-3.5 models on MedQA. It is great to see that the models perform within our expectations.
