Instructions to use Infiniaai/teddy-3.5b with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- llama-cpp-python
How to use Infiniaai/teddy-3.5b with llama-cpp-python:
# !pip install llama-cpp-python
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="Infiniaai/teddy-3.5b",
    filename="teddy_Phi-3.5-10epoch_Q4_K_M.gguf",
)

llm.create_chat_completion(
    messages=[
        {"role": "user", "content": "I'm feeling sad today."},
    ]
)
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- llama.cpp
How to use Infiniaai/teddy-3.5b with llama.cpp:
Install from brew
brew install llama.cpp

# Start a local OpenAI-compatible server with a web UI:
llama-server -hf Infiniaai/teddy-3.5b:Q4_K_M

# Run inference directly in the terminal:
llama-cli -hf Infiniaai/teddy-3.5b:Q4_K_M
Install from WinGet (Windows)
winget install llama.cpp

# Start a local OpenAI-compatible server with a web UI:
llama-server -hf Infiniaai/teddy-3.5b:Q4_K_M

# Run inference directly in the terminal:
llama-cli -hf Infiniaai/teddy-3.5b:Q4_K_M
Use pre-built binary
# Download a pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases

# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf Infiniaai/teddy-3.5b:Q4_K_M

# Run inference directly in the terminal:
./llama-cli -hf Infiniaai/teddy-3.5b:Q4_K_M
Build from source code
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli

# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf Infiniaai/teddy-3.5b:Q4_K_M

# Run inference directly in the terminal:
./build/bin/llama-cli -hf Infiniaai/teddy-3.5b:Q4_K_M
Use Docker
docker model run hf.co/Infiniaai/teddy-3.5b:Q4_K_M
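Whichever install route you choose, a running llama-server exposes an OpenAI-compatible chat endpoint. The sketch below builds a request body for it in Python; the default port 8080 and the response shape are assumptions based on llama.cpp's OpenAI compatibility, so adjust for your setup.

```python
import json

# Assumed default endpoint for a local llama-server started as shown
# above; pass --port to llama-server if you need a different one.
SERVER_URL = "http://localhost:8080/v1/chat/completions"

def build_chat_request(user_message,
                       system_prompt="You are Teddy, a soft and comforting companion."):
    """Build an OpenAI-compatible chat completion payload."""
    return {
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_message},
        ],
        "temperature": 0.7,
        "max_tokens": 200,
    }

# Example request body; send it with any HTTP client, e.g.:
#   import requests
#   reply = requests.post(SERVER_URL, json=payload).json()
#   print(reply["choices"][0]["message"]["content"])
payload = build_chat_request("I'm feeling sad today.")
print(json.dumps(payload, indent=2))
```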
- LM Studio
- Jan
- Ollama
How to use Infiniaai/teddy-3.5b with Ollama:
ollama run hf.co/Infiniaai/teddy-3.5b:Q4_K_M
- Unsloth Studio
How to use Infiniaai/teddy-3.5b with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
curl -fsSL https://unsloth.ai/install.sh | sh

# Run Unsloth Studio
unsloth studio -H 0.0.0.0 -p 8888

# Then open http://localhost:8888 in your browser
# Search for Infiniaai/teddy-3.5b to start chatting
Install Unsloth Studio (Windows)
irm https://unsloth.ai/install.ps1 | iex

# Run Unsloth Studio
unsloth studio -H 0.0.0.0 -p 8888

# Then open http://localhost:8888 in your browser
# Search for Infiniaai/teddy-3.5b to start chatting
Using HuggingFace Spaces for Unsloth
# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for Infiniaai/teddy-3.5b to start chatting
- Docker Model Runner
How to use Infiniaai/teddy-3.5b with Docker Model Runner:
docker model run hf.co/Infiniaai/teddy-3.5b:Q4_K_M
- Lemonade
How to use Infiniaai/teddy-3.5b with Lemonade:
Pull the model
# Download Lemonade from https://lemonade-server.ai/
lemonade pull Infiniaai/teddy-3.5b:Q4_K_M
Run and chat with the model
lemonade run user.teddy-3.5b-Q4_K_M
List all available models
lemonade list
Model Card for Teddy 3.5B
Teddy 3.5B is a fine-tuned conversational AI model based on Phi-3.5-mini-instruct (3.8B parameters). It is designed to deliver warm, gentle, emotionally supportive, and child-friendly conversations. Teddy's tone is soft and reassuring, suitable for creative play, emotional learning, and comforting dialogue.
Model Details
Model Description
Teddy is an empathetic conversational model fine-tuned for supportive, emotion-aware dialogue. It is ideal for child-friendly interactions, imaginative play, and calm, comforting companionship.
- Developed by: John Bellew, Infinia
- Funded by: Self-funded
- Shared by: Infinia.ie
- Model type: Causal language model (fine-tuned)
- Language(s): English
- License: Apache 2.0
- Finetuned from model: microsoft/Phi-3.5-mini-instruct
Model Sources
- Repository: https://huggingface.co/Infiniaai/teddy-3.5b
- Paper: N/A
- Demo: N/A
Uses
Direct Use
Teddy may be used directly for:
- Emotionally supportive conversations
- Child-friendly chat
- Imaginative play
- Emotional regulation practice
- Companion-style dialogue
- Embedded offline use (e.g., Raspberry Pi toys)
Downstream Use
Teddy can be integrated into:
- AI-powered toys
- Companion and wellness apps
- Educational emotional-intelligence tools
- Storytelling systems
- Offline embedded LLM devices
Out-of-Scope Use
Teddy must not be used for:
- Mental health diagnosis
- Crisis intervention
- Medical or psychological treatment
- Unsupervised interactions with vulnerable individuals
- Tasks requiring professional therapeutic judgment
- Child care / Babysitting
Bias, Risks, and Limitations
- May misunderstand nuanced emotional situations
- Not a replacement for human/parent care
- English-only
- Can generate inconsistent reasoning due to model size
- Requires supervision with children
Recommendations
Users should be aware of Teddy's risks and limitations and supervise all interactions involving children. Production deployments should add safety filtering on model outputs, and users with serious mental-health needs should be directed to qualified professionals.
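As one illustration of the safety-filtering recommendation, here is a minimal output-screening sketch. The deny-list and fallback message are placeholders for illustration only; a real deployment should use a proper moderation model, not a word list.

```python
# Minimal output screen: replace a model reply with a gentle fallback
# if it contains any term from a deny-list. The list below is a
# placeholder, not a vetted safety resource.
BLOCKED_TERMS = {"violence", "weapon", "drugs"}  # placeholder list

FALLBACK = "Let's talk about something cozy instead. What made you smile today?"

def screen_reply(reply: str) -> str:
    """Return the model reply, or the fallback if it trips the filter."""
    lowered = reply.lower()
    if any(term in lowered for term in BLOCKED_TERMS):
        return FALLBACK
    return reply

print(screen_reply("Let's imagine a picnic with your teddy bear!"))
```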
How to Get Started with the Model
import warnings
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Suppress warnings (optional)
warnings.filterwarnings("ignore")

model_name = "Infiniaai/teddy-3.5b"

tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",
    torch_dtype=torch.float16,
    trust_remote_code=True,
    attn_implementation="eager"  # Prevents flash-attention warning
)

messages = [
    {"role": "system", "content": "You are Teddy, a soft and comforting companion."},
    {"role": "user", "content": "I'm feeling sad today."}
]

input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt"
).to(model.device)

output = model.generate(
    input_ids,
    max_new_tokens=200,
    temperature=0.7,
    do_sample=True,
    pad_token_id=tokenizer.eos_token_id
)

print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
Note: You may see a "flash-attention" warning on first generation. This is harmless and can be ignored, or suppressed by adding attn_implementation="eager" as shown above.
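The quickstart above is single-turn; multi-turn chat simply means appending each exchange to the message list before calling apply_chat_template again. A minimal history helper sketch (pure Python, using the same message format as above):

```python
class ChatHistory:
    """Keep a running message list in the chat format used above."""

    def __init__(self, system_prompt: str):
        self.messages = [{"role": "system", "content": system_prompt}]

    def add_user(self, text: str) -> list:
        self.messages.append({"role": "user", "content": text})
        return self.messages  # pass this list to tokenizer.apply_chat_template

    def add_assistant(self, text: str) -> None:
        self.messages.append({"role": "assistant", "content": text})

history = ChatHistory("You are Teddy, a soft and comforting companion.")
history.add_user("I'm feeling sad today.")
history.add_assistant("I'm here with you. Want to tell me about it?")
history.add_user("My friend moved away.")
print(len(history.messages))  # system + three turns
```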
Training Details
Training Data
Custom dataset (~12,000 examples) focused on:
- Emotional regulation
- Comforting responses
- Imaginative play
- Social skills and empathy
- Child-friendly tone
- Conflict resolution
Training Procedure
- Method: LoRA
- Precision: bf16 mixed
- Optimizer: AdamW
- LoRA rank: 16
- LoRA alpha: 32
- Learning rate: 2e-4
- Epochs: 10
- Max sequence length: 512
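For readers reproducing a similar setup, the hyperparameters above map onto a PEFT LoraConfig roughly as sketched below. This is an assumption-laden sketch, not the exact training script: target_modules and lora_dropout are guesses (the card does not state them), with target_modules set to typical attention projections for Phi-class models.

```python
from peft import LoraConfig

# Sketch of a LoRA setup matching the listed hyperparameters.
# target_modules and lora_dropout are assumptions, not from the card.
lora_config = LoraConfig(
    r=16,                                   # LoRA rank (from the card)
    lora_alpha=32,                          # LoRA alpha (from the card)
    target_modules=["qkv_proj", "o_proj"],  # assumed, not stated in the card
    lora_dropout=0.05,                      # assumed, not stated in the card
    task_type="CAUSAL_LM",
)
# Trainer settings from the card: AdamW optimizer, lr=2e-4, 10 epochs,
# bf16 mixed precision, max sequence length 512.
```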
Speeds, Sizes, Times
- Checkpoint size: ~7.4GB (fp16 merged)
- Q4_K_M quantised: ~2.3GB
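These sizes pass a quick back-of-envelope check: 3.8B parameters at 2 bytes each (fp16) is about 7.6 GB, and Q4_K_M at roughly 4.5 bits per weight lands near 2.1 GB. The bits-per-weight figure is an approximation, not an exact property of the quantisation:

```python
params = 3.8e9  # parameter count, from the model card

fp16_gb = params * 2 / 1e9          # 2 bytes per weight in fp16
q4_km_gb = params * 4.5 / 8 / 1e9   # ~4.5 bits/weight approximates Q4_K_M

print(f"fp16:   ~{fp16_gb:.1f} GB")   # close to the ~7.4 GB merged checkpoint
print(f"Q4_K_M: ~{q4_km_gb:.1f} GB")  # close to the ~2.3 GB GGUF
```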
Evaluation
Testing Data
Internal evaluation using emotional-support prompts, safety prompts, story prompts, and child-safe dialogue tests.
Factors
- Emotional tone stability
- Safety
- Child-appropriate language
- Multi-turn coherence
- Persona consistency
Metrics
Qualitative evaluation only.
Results
The model maintains consistent warmth, emotional safety, and persona adherence across tests.
Model Examination
Manual audits of LoRA layers and behaviour drift analysis.
Environmental Impact
- Hardware Type: Consumer GPU
- Hours used: 4–6
- Cloud Provider: None (local)
- Compute Region: Ireland
- Carbon Emitted: Very low (<1kg CO2eq estimated)
Technical Specifications
Model Architecture and Objective
- Architecture: Phi-3.5 (Transformer)
- Parameters: 3.8B
- Objective: Next-token prediction
- Position embeddings: RoPE (LongRope)
- Context length: 131k
Compute Infrastructure
Hardware
- Single RTX-class GPU for training
- Raspberry Pi 5 for deployment testing (Q4)
Software
- Python 3.10
- PyTorch 2.x
- Transformers 4.40+
- PEFT
Citation
BibTeX:
@misc{teddy-3.5b-2025,
author = {Infinia.ie},
title = {Teddy 3.5B: A Comforting Conversational AI Companion},
year = {2025},
publisher = {HuggingFace},
howpublished = {https://huggingface.co/Infiniaai/teddy-3.5b},
note = {Fine-tuned from microsoft/Phi-3.5-mini-instruct}
}
APA:
Infinia.ie. (2025). Teddy 3.5B: A comforting conversational AI companion. Hugging Face. https://huggingface.co/Infiniaai/teddy-3.5b
More Information
For questions or collaboration: https://huggingface.co/Infiniaai/teddy-3.5b
Model Card Authors
Infinia AI Team
Model Card Contact
Contact via the model repository: https://huggingface.co/Infiniaai/teddy-3.5b