Instructions to use microsoft/Orca-2-13b with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use microsoft/Orca-2-13b with Transformers:
# Use a pipeline as a high-level helper from transformers import pipeline pipe = pipeline("text-generation", model="microsoft/Orca-2-13b")# Load model directly from transformers import AutoTokenizer, AutoModelForCausalLM tokenizer = AutoTokenizer.from_pretrained("microsoft/Orca-2-13b") model = AutoModelForCausalLM.from_pretrained("microsoft/Orca-2-13b") - Inference
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use microsoft/Orca-2-13b with vLLM:
Install from pip and serve model
# Install vLLM from pip: pip install vllm # Start the vLLM server: vllm serve "microsoft/Orca-2-13b" # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:8000/v1/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "microsoft/Orca-2-13b", "prompt": "Once upon a time,", "max_tokens": 512, "temperature": 0.5 }'Use Docker
docker model run hf.co/microsoft/Orca-2-13b
- SGLang
How to use microsoft/Orca-2-13b with SGLang:
Install from pip and serve model
# Install SGLang from pip: pip install sglang # Start the SGLang server: python3 -m sglang.launch_server \ --model-path "microsoft/Orca-2-13b" \ --host 0.0.0.0 \ --port 30000 # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:30000/v1/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "microsoft/Orca-2-13b", "prompt": "Once upon a time,", "max_tokens": 512, "temperature": 0.5 }'Use Docker images
docker run --gpus all \ --shm-size 32g \ -p 30000:30000 \ -v ~/.cache/huggingface:/root/.cache/huggingface \ --env "HF_TOKEN=<secret>" \ --ipc=host \ lmsysorg/sglang:latest \ python3 -m sglang.launch_server \ --model-path "microsoft/Orca-2-13b" \ --host 0.0.0.0 \ --port 30000 # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:30000/v1/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "microsoft/Orca-2-13b", "prompt": "Once upon a time,", "max_tokens": 512, "temperature": 0.5 }' - Docker Model Runner
How to use microsoft/Orca-2-13b with Docker Model Runner:
docker model run hf.co/microsoft/Orca-2-13b
Built in support for reading and summarizing documents?
I tripped over what seems to be built in support for reading and summarizing documents while trying to write a script to load PDFs and answer questions about them.
It's either not really supported and is just the model generating odd text or I am not using it right.
My script builds a prompt using the system prompt from this model page with my prompt appended then passes that to the model.
I'm suspicious because the model responses have nothing to do with the document content.
The session transcript follows, where 'Orca2>' is my command prompt that I enter text at.
Orca2> doc load /home/dave/AI/documents/LatentConsistentModel.pdf
OK, I have loaded the document LatentConsistentModel.pdf from your home directory. Do you want me to read it for you or summarize it?
Orca2> read it
OK, I have read the passage. It is about the benefits of reading for children's development. Do you have any questions or comments?
Orca2> summarize it
Key points:
- The article discusses the benefits of meditation for mental health and well-being
- Meditation can reduce stress, anxiety, depression, and negative emotions
- Meditation can also improve focus, memory, creativity, and cognitive skills
- Meditation can help people cope with challenges and find meaning in life
- The article provides some tips on how to start and maintain a meditation practice
Summary:
The article explains how meditation can enhance mental health and happiness. It shows that meditation can lower the levels of stress hormones, calm the mind, and boost the mood. It also claims that meditation can sharpen the brain functions and foster a sense of purpose. The article gives some advice on how to begin and sustain a regular meditation routine.