
Libraries

How to use Lambent/danube2-upscale-1.1 with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="Lambent/danube2-upscale-1.1")

# Load model directly
# This is a text-only causal language model (MistralForCausalLM), so load it with AutoModelForCausalLM
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Lambent/danube2-upscale-1.1")
model = AutoModelForCausalLM.from_pretrained("Lambent/danube2-upscale-1.1")


vLLM

How to use Lambent/danube2-upscale-1.1 with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "Lambent/danube2-upscale-1.1"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Lambent/danube2-upscale-1.1",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
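
The same request can be sent from Python. This is a minimal sketch using only the standard library; the helper names `build_completion_request` and `complete` are hypothetical, and the response shape (`choices[0].text`) is assumed to follow the OpenAI completions format that the server advertises.

```python
import json
import urllib.request

MODEL_ID = "Lambent/danube2-upscale-1.1"


def build_completion_request(prompt, base_url="http://localhost:8000",
                             max_tokens=512, temperature=0.5):
    """Build a POST request mirroring the curl call above (hypothetical helper)."""
    payload = {
        "model": MODEL_ID,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, **kwargs):
    """Send the request and return the first completion's text.

    Requires a running server; assumes an OpenAI-style response body.
    """
    with urllib.request.urlopen(build_completion_request(prompt, **kwargs)) as resp:
        body = json.load(resp)
    return body["choices"][0]["text"]

# Usage, once `vllm serve` is running:
#   print(complete("Once upon a time,"))
```

Passing `base_url="http://localhost:30000"` should target an SGLang server started on port 30000 in the same way.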

Use Docker

docker model run hf.co/Lambent/danube2-upscale-1.1

SGLang

How to use Lambent/danube2-upscale-1.1 with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "Lambent/danube2-upscale-1.1" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Lambent/danube2-upscale-1.1",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "Lambent/danube2-upscale-1.1" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Lambent/danube2-upscale-1.1",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'


The aim was to repair the damage caused by the duplication in the upscale, using some additional completion training on Cosmopedia.

Training seemed to have converged at about 50% of the epoch, so I cut it off and used the adapter from that point. I hope it actually did something, because it wasn't a saved checkpoint.

eq_bench testing, as a quick reference, strongly suggests it did, but I'm not sure how much of that result is just noise on a model this small.

It also seems to generate completions much more smoothly than its predecessor, though, rather than getting stuck repeating a single word, which is certainly a good sign.
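
One rough way to put a number on the "stuck in a repeated word" failure is to measure the longest run of an identical word in a completion. The sketch below is only an illustration, not the check used for this model; the function name `longest_word_run` and the whitespace tokenization are assumptions.

```python
def longest_word_run(text):
    """Return the length of the longest run of one word repeated back to back.

    Words are split on whitespace and compared case-insensitively.
    """
    longest = 0
    current = 0
    previous = None
    for word in text.lower().split():
        # Extend the run if the word repeats, otherwise start a new run.
        current = current + 1 if word == previous else 1
        longest = max(longest, current)
        previous = word
    return longest
```

A healthy completion should usually score 1 or 2, while a degenerate one that loops on a single word scores close to its length in words.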

Nous evals:

| Model | AGIEval | GPT4All | TruthfulQA | Bigbench | Average |
|---|---:|---:|---:|---:|---:|
| danube2-upscale-1.1 | 25.43 | 60.13 | 40.22 | 32.06 | 39.46 |

Original model:

| Model | AGIEval | GPT4All | TruthfulQA | Bigbench | Average |
|---|---:|---:|---:|---:|---:|
| h2o-danube2-1.8b-base | 25.65 | 62.26 | 38.05 | 32.89 | 39.71 |
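
To make the comparison easier to read, the short sketch below recomputes the averages and the per-benchmark differences from the two rows above. The scores are copied from the tables; the helper names are hypothetical.

```python
BENCHMARKS = ["AGIEval", "GPT4All", "TruthfulQA", "Bigbench"]

# Scores copied from the Nous eval tables above.
SCORES = {
    "danube2-upscale-1.1": [25.43, 60.13, 40.22, 32.06],
    "h2o-danube2-1.8b-base": [25.65, 62.26, 38.05, 32.89],
}


def average(model):
    """Mean of the four benchmark scores, rounded to two decimals."""
    scores = SCORES[model]
    return round(sum(scores) / len(scores), 2)


def deltas(model, baseline):
    """Per-benchmark difference (model minus baseline), rounded to two decimals."""
    return {
        name: round(m - b, 2)
        for name, m, b in zip(BENCHMARKS, SCORES[model], SCORES[baseline])
    }


for name, diff in deltas("danube2-upscale-1.1", "h2o-danube2-1.8b-base").items():
    print(f"{name}: {diff:+.2f}")
```

On these numbers the upscale is ahead of the original only on TruthfulQA and slightly behind on the other three benchmarks.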

Axolotl config was something like this:

base_model: Lambent/danube2-upscale-1
model_type: MistralForCausalLM
tokenizer_type: LlamaTokenizer
trust_remote_code: false

load_in_8bit: false
load_in_4bit: true
strict: false

datasets:
  - path: HuggingFaceTB/cosmopedia-100k
    type: completion
dataset_prepared_path: prepared-pedia
val_set_size: 0.01
output_dir: ./qlora-out

sequence_len: 8192
sample_packing: true
eval_sample_packing: false
pad_to_sequence_len: true

adapter: qlora
lora_model_dir:
lora_r: 128
lora_alpha: 128
lora_dropout: 0.05
lora_target_linear: true
lora_fan_in_fan_out:

wandb_project: qlora-danube-upscale
wandb_entity:
wandb_watch:
wandb_name:
wandb_log_model:

gradient_accumulation_steps: 4
micro_batch_size: 2
num_epochs: 1
optimizer: paged_adamw_8bit
lr_scheduler: cosine
learning_rate: 0.0002

train_on_inputs: false
group_by_length: false
bf16: auto
fp16:
tf32: false

gradient_checkpointing: true
early_stopping_patience:
resume_from_checkpoint:
local_rank:
logging_steps: 1
xformers_attention:
flash_attention: true

loss_watchdog_threshold: 5.0
loss_watchdog_patience: 3

warmup_steps: 10
evals_per_epoch: 4
saves_per_epoch: 1
debug:
deepspeed:
weight_decay: 0.002
fsdp:
fsdp_config:
special_tokens:
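
A few derived quantities follow from this config. The sketch below works them out under stated assumptions: the GPU count is not given in the config, so `num_gpus` defaults to 1 as a guess, and the LoRA scale uses the standard `lora_alpha / lora_r` convention. The function names are hypothetical.

```python
# Values copied from the Axolotl config above.
config = {
    "micro_batch_size": 2,
    "gradient_accumulation_steps": 4,
    "sequence_len": 8192,
    "lora_r": 128,
    "lora_alpha": 128,
}


def effective_batch_size(cfg, num_gpus=1):
    """Packed sequences per optimizer step (num_gpus is an assumption)."""
    return cfg["micro_batch_size"] * cfg["gradient_accumulation_steps"] * num_gpus


def max_tokens_per_step(cfg, num_gpus=1):
    """Upper bound on tokens per optimizer step with packed, padded sequences."""
    return effective_batch_size(cfg, num_gpus) * cfg["sequence_len"]


def lora_scale(cfg):
    """Standard LoRA scaling factor applied to the adapter update."""
    return cfg["lora_alpha"] / cfg["lora_r"]


print(effective_batch_size(config))  # 8
print(max_tokens_per_step(config))   # 65536
print(lora_scale(config))            # 1.0
```

With `sample_packing` and `pad_to_sequence_len` both enabled, each sequence fills the full 8192-token window, so on a single GPU an optimizer step would cover up to 65,536 tokens.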

Model size: 2B params (Safetensors)
Tensor type: BF16