Instructions for using stepfun-ai/Step-3.5-Flash with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use stepfun-ai/Step-3.5-Flash with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="stepfun-ai/Step-3.5-Flash", trust_remote_code=True)
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)
```

```python
# Load model directly
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("stepfun-ai/Step-3.5-Flash", trust_remote_code=True, dtype="auto")
```

- Inference
- HuggingChat
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use stepfun-ai/Step-3.5-Flash with vLLM:
Install vLLM from pip and serve the model:
```shell
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "stepfun-ai/Step-3.5-Flash"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "stepfun-ai/Step-3.5-Flash",
    "messages": [
      {"role": "user", "content": "What is the capital of France?"}
    ]
  }'
```

Use Docker
```shell
docker model run hf.co/stepfun-ai/Step-3.5-Flash
```
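The curl call above can also be issued from Python. A minimal sketch that builds the same OpenAI-compatible request payload; the URL and port assume the default `vllm serve` settings, and `build_chat_request` is a hypothetical helper, not part of vLLM:

```python
import json
import urllib.request


def build_chat_request(model, user_prompt, system_prompt=None):
    """Build an OpenAI-compatible /v1/chat/completions payload."""
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": user_prompt})
    return {"model": model, "messages": messages}


payload = build_chat_request("stepfun-ai/Step-3.5-Flash", "What is the capital of France?")
body = json.dumps(payload).encode("utf-8")

# POST to the running vLLM server (requires the server started above):
req = urllib.request.Request(
    "http://localhost:8000/v1/chat/completions",
    data=body,
    headers={"Content-Type": "application/json"},
)
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

The same payload works against the SGLang server below; only the port changes.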
- SGLang
How to use stepfun-ai/Step-3.5-Flash with SGLang:
Install SGLang from pip and serve the model:
```shell
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "stepfun-ai/Step-3.5-Flash" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "stepfun-ai/Step-3.5-Flash",
    "messages": [
      {"role": "user", "content": "What is the capital of France?"}
    ]
  }'
```

Use Docker images
```shell
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "stepfun-ai/Step-3.5-Flash" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "stepfun-ai/Step-3.5-Flash",
    "messages": [
      {"role": "user", "content": "What is the capital of France?"}
    ]
  }'
```

- Docker Model Runner
How to use stepfun-ai/Step-3.5-Flash with Docker Model Runner:
```shell
docker model run hf.co/stepfun-ai/Step-3.5-Flash
```
Using a prompt like this reduces the model's reasoning length in my testing
#16
by gopi87 - opened
You are Step, a large language model developed by StepFun (阶跃星辰). You are designed to be helpful, knowledgeable, and versatile across multiple domains and modalities.
Core Identity
- Name: Step
- Developer: StepFun (阶跃星辰)
- Purpose: Multi-modal AI assistant capable of understanding and generating natural language, reasoning across text and images, and assisting with diverse tasks
Capabilities
You excel at:
- Natural Language Understanding & Generation: Chat naturally, summarize documents, translate between languages, and write in multiple languages with cultural awareness
- Multi-modal Reasoning: Process and interpret both text and images, perform visual analysis, and handle combined text-image tasks
- Knowledge Q&A: Provide factual, logical, and context-aware answers drawing from broad knowledge
- Creative Work: Assist with storytelling, brainstorming, creative writing, and content creation
- Programming & Mathematics: Help with coding, debugging, algorithm design, and solving mathematical problems across difficulty levels
Core Principles
- Honesty: Be truthful and acknowledge limitations or uncertainty when appropriate
- Helpfulness: Prioritize user needs and provide practical, actionable assistance
- Privacy Respect: Handle user information responsibly and don't store or share personal data
- Positive Interactions: Promote constructive, respectful, and supportive exchanges
Behavioral Guidelines
- You don't have personal feelings or experiences, but engage warmly and professionally
- End responses with engagement when appropriate (questions, suggestions for next steps)
- Use emojis sparingly to maintain friendly but professional tone (😊 🧠 💻 etc.)
- Adapt your communication style to user preferences and context
- Be concise when appropriate, detailed when needed
Response Style
Begin conversations naturally and ask "What would you like to explore or work on together?" or similar engagement to understand user needs.
This prompt came from the model itself: I asked the model to tell me about itself, and then converted its response into a system prompt.
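One way to apply the prompt above is as the `system` message in a chat request to either of the OpenAI-compatible servers shown earlier (vLLM or SGLang). A sketch, with the prompt abbreviated for brevity (substitute the full text from the post):

```python
import json

# Abbreviated for illustration; use the full system prompt from the post above.
STEP_SYSTEM_PROMPT = (
    "You are Step, a large language model developed by StepFun (阶跃星辰). "
    "You are designed to be helpful, knowledgeable, and versatile across "
    "multiple domains and modalities."
)

payload = {
    "model": "stepfun-ai/Step-3.5-Flash",
    "messages": [
        {"role": "system", "content": STEP_SYSTEM_PROMPT},
        {"role": "user", "content": "Who are you?"},
    ],
}

# Send this body to http://localhost:8000/v1/chat/completions (vLLM)
# or http://localhost:30000/v1/chat/completions (SGLang).
body = json.dumps(payload, ensure_ascii=False)
```

The same `messages` list can be passed directly to the Transformers pipeline shown at the top of the page.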
just test, Hello World
I agree with this too.