typhoon-ocr1.5-2b-8bit
This model was converted to MLX format from typhoon-ai/typhoon-ocr1.5-2b using mlx-vlm 0.6.3.
Typhoon OCR 1.5 2B is a Qwen3-VL-based vision-language model for Thai and English document understanding. It produces structured output: Markdown text, HTML <table> for tables, LaTeX for equations, <figure> for images/charts, and <page_number> tags.
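Because the output mixes Markdown with a handful of tags, downstream code often needs to pull those tags apart from the page body. The following is a minimal sketch using only the Python standard library. The helper name `split_page` and the sample string are illustrative assumptions, not part of the model's tooling, and the regular expressions assume the tags are well-formed and not nested.

```python
import re

# Tag names come from the model card; everything else here is illustrative.
PAGE_NUMBER_RE = re.compile(r"<page_number>(.*?)</page_number>", re.DOTALL)
FIGURE_RE = re.compile(r"<figure>(.*?)</figure>", re.DOTALL)


def split_page(output: str) -> dict:
    """Separate page numbers and figure descriptions from the page body."""
    page_numbers = [m.strip() for m in PAGE_NUMBER_RE.findall(output)]
    figures = [m.strip() for m in FIGURE_RE.findall(output)]
    # Remove the extracted tags; HTML tables and LaTeX stay in the body as-is.
    body = FIGURE_RE.sub("", PAGE_NUMBER_RE.sub("", output)).strip()
    return {"page_numbers": page_numbers, "figures": figures, "body": body}


# Hypothetical model output, made up for this example.
sample = (
    "# Report\n"
    "<figure>\nA bar chart of sales.\n</figure>\n"
    "<table><tr><td>1</td></tr></table>\n"
    "<page_number>14</page_number>"
)
parsed = split_page(sample)
```

Keeping the tables and equations inside `body` means a normal Markdown renderer can still display them unchanged.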
This checkpoint is quantized to 8-bit (group size 64, affine mode), averaging 9.94 bits per weight. The average is above 8 bits because each group also stores its quantization parameters and the vision encoder is kept at higher precision, which preserves OCR accuracy while roughly halving the size (2.5 GB). Designed for Apple Silicon.
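The bits-per-weight figure can be sanity-checked with a little arithmetic. The sketch below assumes that affine quantization stores one 16-bit scale and one 16-bit bias per group, and that the unquantized weights are stored in 16 bits. Both are assumptions for illustration, and the resulting fraction is a rough estimate rather than a measured property of this checkpoint.

```python
def affine_bits_per_weight(bits: int, group_size: int,
                           scale_bits: int = 16, bias_bits: int = 16) -> float:
    """Bits per weight for a quantized layer, including per-group overhead."""
    return bits + (scale_bits + bias_bits) / group_size


def higher_precision_fraction(average_bpw: float, quantized_bpw: float,
                              full_bits: int = 16) -> float:
    """Fraction f of weights left at full_bits, solving
    f * full_bits + (1 - f) * quantized_bpw = average_bpw."""
    return (average_bpw - quantized_bpw) / (full_bits - quantized_bpw)


# 8-bit weights with group size 64: 8 + 32 / 64 = 8.5 bits per weight.
quantized = affine_bits_per_weight(bits=8, group_size=64)

# Under the assumptions above, the reported 9.94 average implies that
# roughly a fifth of the weights are not quantized.
fraction = higher_precision_fraction(9.94, quantized)
```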
Prompt
Typhoon OCR is instruction-tuned and works best with its official prompt. Use it verbatim (e.g. saved to prompt.txt):
Extract all text from the image.
Instructions:
- Only return the clean Markdown.
- Do not include any explanation or extra text.
- You must include all information on the page.
Formatting Rules:
- Tables: Render tables using <table>...</table> in clean HTML format.
- Equations: Render equations using LaTeX syntax with inline ($...$) and block ($$...$$).
- Images/Charts/Diagrams: Wrap any clearly defined visual areas (e.g. charts, diagrams, pictures) in:
<figure>
Describe the image's main elements (people, objects, text), note any contextual clues (place, event, culture), mention visible text and its meaning, provide deeper analysis when relevant (especially for financial charts, graphs, or documents), comment on style or architecture if relevant, then give a concise overall summary. Describe in Thai.
</figure>
- Page Numbers: Wrap page numbers in <page_number>...</page_number> (e.g., <page_number>14</page_number>).
- Checkboxes: Use the unchecked / checked box characters as appropriate.
Usage
pip install -U mlx-vlm
python -m mlx_vlm.generate \
--model mlx-community/typhoon-ocr1.5-2b-8bit \
--image page.jpg \
--prompt "$(cat prompt.txt)" \
--max-tokens 4096 \
--temperature 0.0 \
--repetition-penalty 1.1
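To run the same command over many pages, the invocation can be assembled from Python. This is a sketch only: the helper names `build_generate_command` and `run_ocr` are hypothetical, the flags are exactly the ones in the command above, and the CLI's standard output may contain diagnostic text besides the extracted Markdown.

```python
import subprocess
import sys
from pathlib import Path

MODEL_ID = "mlx-community/typhoon-ocr1.5-2b-8bit"


def build_generate_command(image_path: str, prompt: str) -> list[str]:
    """Return the argv list for one mlx_vlm.generate call."""
    return [
        sys.executable, "-m", "mlx_vlm.generate",
        "--model", MODEL_ID,
        "--image", image_path,
        "--prompt", prompt,
        "--max-tokens", "4096",
        "--temperature", "0.0",
        "--repetition-penalty", "1.1",
    ]


def run_ocr(image_path: str, prompt_file: str = "prompt.txt") -> str:
    """Run the CLI on one page and return its stdout (requires mlx-vlm)."""
    # Like "$(cat prompt.txt)", drop trailing newlines from the prompt file.
    prompt = Path(prompt_file).read_text(encoding="utf-8").rstrip("\n")
    result = subprocess.run(
        build_generate_command(image_path, prompt),
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

Passing the arguments as a list avoids shell quoting problems with the multi-line prompt. Note that each call reloads the model, so for large batches a long-running process would be faster.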
Recommended generation parameters
Typhoon OCR is a document-extraction model, not a chat model. It needs near-deterministic decoding so it does not hallucinate characters or loop on repeated table cells.
| Parameter | Recommended | Notes |
|---|---|---|
| temperature | 0.0 (greedy) | Deterministic extraction: always picks the most-confident token. SCB10X also publishes 0.1 as an alternative. |
| repetition_penalty | 1.1 | Stops the model looping on repeated dashes / blank cells in dense tables. Exposed as --repetition-penalty. |
| max_tokens | 4096 | Headroom for a full, dense page. |
| top_p | 0.6 | Only has an effect when temperature > 0 (e.g. 0.1). Not exposed by the mlx_vlm.generate CLI; set it via the Python sampler or the mlx_vlm.server request body if you raise the temperature. |
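The interaction between temperature and top_p described in the table can be illustrated with a toy sampler. This is a generic sketch of greedy and nucleus (top-p) sampling in plain Python, not mlx-vlm's implementation; the function `pick_token` and the example logits are made up for illustration.

```python
import math
import random


def pick_token(logits, temperature=0.0, top_p=0.6, rng=None):
    """Return the index of the chosen token."""
    if temperature <= 0.0:
        # Greedy decoding: top_p is never consulted.
        return max(range(len(logits)), key=lambda i: logits[i])

    rng = rng or random.Random()
    # Softmax with temperature, shifted by the max for numerical stability.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    weights = [math.exp(x - peak) for x in scaled]
    total = sum(weights)
    probs = [w / total for w in weights]

    # Keep the smallest set of most-likely tokens whose mass reaches top_p.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break

    # Sample among the kept tokens; random.choices normalizes the weights.
    return rng.choices(kept, weights=[probs[i] for i in kept], k=1)[0]


logits = [2.0, 1.0, 0.5, -1.0]
greedy_a = pick_token(logits, temperature=0.0, top_p=0.6)
greedy_b = pick_token(logits, temperature=0.0, top_p=0.1)
```

With a temperature of 0.0 both calls return the same token whatever top_p is. At a low temperature such as 0.1 the distribution is already sharply peaked, so a top_p of 0.6 usually leaves only one or two candidates.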
Image resolution: the Qwen3-VL processor handles dynamic resolution automatically. For dense A4 pages, feed a reasonably high-resolution scan (long side ~1500-2000 px) to keep small text sharp; on 16 GB Apple Silicon, avoid extremely large images so that generation does not run out of memory. Keep the KV cache unquantized (the default).
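One way to apply the resolution guidance is to downscale oversized scans before passing them to the model. The sketch below only computes the target size; the function name `fit_long_side` and the 2000 px default are assumptions taken from the upper end of the range above. The result could then be passed to an image library, for example Pillow's `Image.resize`.

```python
def fit_long_side(width: int, height: int,
                  max_long_side: int = 2000) -> tuple[int, int]:
    """Scale (width, height) down so the long side is at most max_long_side."""
    long_side = max(width, height)
    if long_side <= max_long_side:
        return width, height  # never upscale; that adds no detail
    scale = max_long_side / long_side
    return max(1, round(width * scale)), max(1, round(height * scale))


# Roughly a 600 dpi A4 scan is shrunk; a 150 dpi scan is left alone.
large = fit_long_side(4960, 7016)
small = fit_long_side(1240, 1754)
```

The aspect ratio is preserved, so table columns and line spacing keep their proportions after resizing.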
Conversion
python -m mlx_vlm convert \
--hf-path typhoon-ai/typhoon-ocr1.5-2b \
--mlx-path typhoon-ocr1.5-2b-8bit \
-q --q-bits 8 --q-group-size 64
License
Apache-2.0, inherited from the base model typhoon-ai/typhoon-ocr1.5-2b.
Model tree for mlx-community/typhoon-ocr1.5-2b-8bit
Base model: Qwen/Qwen3-VL-2B-Instruct