Why does this model use left-padding by default?

#42 opened over 1 year ago by smbslt3

Can't get repositories permissions for google/gemma-2-9b (can only see google/gemma-2-2b)

#40 opened over 1 year ago by Connor-Watts

What's the quantization format of 4bit / 8bit?

#39 opened over 1 year ago by WatermelonEast

Update README.md

#38 opened over 1 year ago by tylee123

Does Gemma 2 9B Support All Listed Languages on the Gemini 1.5 Page?

#33 opened almost 2 years ago by i18n-site

dtype: float32 in base model vs. dtype: bfloat16 in the instruction fine-tuned model

#32 opened almost 2 years ago by tanliboy

Update tokenizer_config.json

#31 opened almost 2 years ago by reach-vb

Issues with FSDP and DeepSpeed During Distributed Training for Gemma

👍 2

#30 opened almost 2 years ago by anandhperumal

AttributeError: module 'torch._dynamo' has no attribute 'mark_static_address'

🚀➕ 14

#29 opened almost 2 years ago by AsirAsir

CUDA usage is low

#28 opened almost 2 years ago by Max545

Fine-tuning Hyperparameters

#27 opened almost 2 years ago by tanliboy

Error

#25 opened almost 2 years ago by ImpactInsights

RuntimeError: Index put requires the source and destination dtypes match, got BFloat16 for the destination and Float for the source.

➕ 4

#24 opened almost 2 years ago by saireddy

Gemma 2's Flash attention 2 implementation is strange...

#23 opened almost 2 years ago by GPT007

Request: DOI

#21 opened almost 2 years ago by Benjitable

Inference error

#20 opened almost 2 years ago by gsasikiran

16 or 32 bit?

#19 opened almost 2 years ago by ChrisGoringe

model.generate is throwing AttributeError: 'HybridCache' object has no attribute 'float'

#18 opened almost 2 years ago by saireddy

ValueError: Transformers does not recognize this architecture.

#15 opened almost 2 years ago by mike202303

Model repeating information and "spitting out" random characters

#14 opened almost 2 years ago by brazilianslib

Update README.md

❤️ 1

#13 opened almost 2 years ago by Criztov

TypeError: arange() received an invalid combination of arguments

#12 opened almost 2 years ago by darrenbudiman

Aaa

#9 opened almost 2 years ago by mohamedzairi

Gemma2FlashAttention2 missing sliding_window variable

🚀 7

#8 opened almost 2 years ago by emozilla