A Bang Up Job

by nightvision04 - opened Mar 28, 2024

Mar 28, 2024

We needed another big win for the open source community. Thanks for taking a big risk for everyone.

You added several innovations here. Would you consider adding the 1.58 bit architecture in the future? I'm curious to know if it was considered.

ricofix

Mar 28, 2024

I haven't seen ternary bits applied to an SSM yet, let alone a hybrid. Would be interesting to see if it's compatible.

nonetrix

Mar 28, 2024

edited Mar 28, 2024

Imagine the efficiency with MoE + Mamba + 1.58 bit 😳

Maybe make a higher-parameter version too. I imagine a 1.58 bit version could have the same memory footprint and speed as a 50B version while having a lot more parameters, if not double. Then I guess the question would be how we could shrink that further somehow, like quantization already lets us do with fp16 models.
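
For a rough sense of the numbers behind that guess, here is a back-of-the-envelope sketch. It assumes every weight can be stored at the ideal ternary cost of log2(3) ≈ 1.58 bits and ignores packing overhead, activations, and any layers that would have to stay in higher precision, so treat it as an upper bound rather than a measurement. The function names are made up for illustration, and it says nothing about speed.

```python
import math

# Ideal bits needed to store one ternary weight in {-1, 0, +1}: log2(3) ~= 1.58.
TERNARY_BITS = math.log2(3)
FP16_BITS = 16


def weight_memory_gb(num_params: float, bits_per_weight: float) -> float:
    """Ideal weight storage in gigabytes (10**9 bytes), ignoring packing overhead."""
    return num_params * bits_per_weight / 8 / 1e9


def params_for_same_footprint(num_params: float, src_bits: float, dst_bits: float) -> float:
    """Parameter count at dst_bits that occupies the same memory as num_params at src_bits."""
    return num_params * src_bits / dst_bits


if __name__ == "__main__":
    n = 50e9  # the 50B figure mentioned in the comment above
    print(f"fp16 weights:    {weight_memory_gb(n, FP16_BITS):.1f} GB")
    print(f"ternary weights: {weight_memory_gb(n, TERNARY_BITS):.1f} GB")
    same = params_for_same_footprint(n, FP16_BITS, TERNARY_BITS)
    print(f"ternary params in the fp16 footprint: {same / 1e9:.0f}B")
```

Under these idealized assumptions the ratio is simply 16 / log2(3), so roughly ten times as many ternary weights fit in the space of the fp16 ones. Whether a hybrid SSM model trains well at that precision is a separate, open question.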
