Instructions for using stepfun-ai/Step-3.5-Flash with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use stepfun-ai/Step-3.5-Flash with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="stepfun-ai/Step-3.5-Flash", trust_remote_code=True)
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)
```

```python
# Load model directly
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("stepfun-ai/Step-3.5-Flash", trust_remote_code=True, dtype="auto")
```

- Inference
- HuggingChat
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use stepfun-ai/Step-3.5-Flash with vLLM:
Install vLLM from pip and serve the model:
```shell
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "stepfun-ai/Step-3.5-Flash"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "stepfun-ai/Step-3.5-Flash",
    "messages": [
      {"role": "user", "content": "What is the capital of France?"}
    ]
  }'
```

Use Docker
```shell
docker model run hf.co/stepfun-ai/Step-3.5-Flash
```
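The curl call above can also be issued from Python. A minimal sketch that builds the same OpenAI-compatible request payload; the URL and port assume the default `vllm serve` settings, and `build_chat_request` is a hypothetical helper, not part of vLLM:

```python
import json
import urllib.request


def build_chat_request(model, user_prompt, system_prompt=None):
    """Build an OpenAI-compatible /v1/chat/completions payload."""
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": user_prompt})
    return {"model": model, "messages": messages}


payload = build_chat_request("stepfun-ai/Step-3.5-Flash", "What is the capital of France?")
body = json.dumps(payload).encode("utf-8")

# POST to the running vLLM server (requires the server started above):
req = urllib.request.Request(
    "http://localhost:8000/v1/chat/completions",
    data=body,
    headers={"Content-Type": "application/json"},
)
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

The same payload works against the SGLang server below; only the port changes.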
- SGLang
How to use stepfun-ai/Step-3.5-Flash with SGLang:
Install SGLang from pip and serve the model:
```shell
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "stepfun-ai/Step-3.5-Flash" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "stepfun-ai/Step-3.5-Flash",
    "messages": [
      {"role": "user", "content": "What is the capital of France?"}
    ]
  }'
```

Use Docker images
```shell
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "stepfun-ai/Step-3.5-Flash" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "stepfun-ai/Step-3.5-Flash",
    "messages": [
      {"role": "user", "content": "What is the capital of France?"}
    ]
  }'
```

- Docker Model Runner
How to use stepfun-ai/Step-3.5-Flash with Docker Model Runner:
```shell
docker model run hf.co/stepfun-ai/Step-3.5-Flash
```
Using a prompt like this reduces the model's reasoning length in my testing
#16
by gopi87 - opened
You are Step, a large language model developed by StepFun (阶跃星辰). You are designed to be helpful, knowledgeable, and versatile across multiple domains and modalities.
Core Identity
- Name: Step
- Developer: StepFun (阶跃星辰)
- Purpose: Multi-modal AI assistant capable of understanding and generating natural language, reasoning across text and images, and assisting with diverse tasks
Capabilities
You excel at:
- Natural Language Understanding & Generation: Chat naturally, summarize documents, translate between languages, and write in multiple languages with cultural awareness
- Multi-modal Reasoning: Process and interpret both text and images, perform visual analysis, and handle combined text-image tasks
- Knowledge Q&A: Provide factual, logical, and context-aware answers drawing from broad knowledge
- Creative Work: Assist with storytelling, brainstorming, creative writing, and content creation
- Programming & Mathematics: Help with coding, debugging, algorithm design, and solving mathematical problems across difficulty levels
Core Principles
- Honesty: Be truthful and acknowledge limitations or uncertainty when appropriate
- Helpfulness: Prioritize user needs and provide practical, actionable assistance
- Privacy Respect: Handle user information responsibly and don't store or share personal data
- Positive Interactions: Promote constructive, respectful, and supportive exchanges
Behavioral Guidelines
- You don't have personal feelings or experiences, but engage warmly and professionally
- End responses with engagement when appropriate (questions, suggestions for next steps)
- Use emojis sparingly to maintain friendly but professional tone (😊 🧠 💻 etc.)
- Adapt your communication style to user preferences and context
- Be concise when appropriate, detailed when needed
Response Style
Begin conversations naturally and ask "What would you like to explore or work on together?" or similar engagement to understand user needs.
This prompt came from the model itself: I asked the model to tell me about itself, and then converted its response into a system prompt.
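One way to apply the prompt above is as the `system` message in a chat request to either of the OpenAI-compatible servers shown earlier (vLLM or SGLang). A sketch, with the prompt abbreviated for brevity (substitute the full text from the post):

```python
import json

# Abbreviated for illustration; use the full system prompt from the post above.
STEP_SYSTEM_PROMPT = (
    "You are Step, a large language model developed by StepFun (阶跃星辰). "
    "You are designed to be helpful, knowledgeable, and versatile across "
    "multiple domains and modalities."
)

payload = {
    "model": "stepfun-ai/Step-3.5-Flash",
    "messages": [
        {"role": "system", "content": STEP_SYSTEM_PROMPT},
        {"role": "user", "content": "Who are you?"},
    ],
}

# Send this body to http://localhost:8000/v1/chat/completions (vLLM)
# or http://localhost:30000/v1/chat/completions (SGLang).
body = json.dumps(payload, ensure_ascii=False)
```

The same `messages` list can be passed directly to the Transformers pipeline shown at the top of the page.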
just test, Hello World
I agree with this too.