# Sky610TX

## Model Details
- Architecture: GPT-2 Style (Custom Ascendant Config)
- Parameters: ~389 Million
- Training tokens: 1.3 Billion
- Context Window: 1024 Tokens
- Training iterations: 50k
I will likely go back and run SFT on this model to turn it into a chatbot, if possible, and see the results.
## How to Use
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Download the model and tokenizer from the Hugging Face Hub
model = AutoModelForCausalLM.from_pretrained("8BitStudio/Sky610TX")
tokenizer = AutoTokenizer.from_pretrained("8BitStudio/Sky610TX")

# Example chat-style prompt (note: this is a base model, not yet instruction-tuned)
input_text = "User: Hello\nAssistant:"
inputs = tokenizer(input_text, return_tensors="pt")

# Generate up to 50 new tokens and print the full decoded sequence
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0]))
```
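Building on the snippet above, here is a small sketch of how you might wrap the model in a chat-style helper. The `build_prompt` and `chat` function names are my own (hypothetical, not part of the model or library), and the sampling settings (`temperature`, `top_p`) are illustrative defaults, not values the model was tuned for. The truncation guard assumes the 1024-token context window listed in the model details.

```python
MODEL_ID = "8BitStudio/Sky610TX"
MAX_CONTEXT = 1024  # context window from the model details above


def build_prompt(user_message: str) -> str:
    """Format a message in the User/Assistant style used in the example above."""
    return f"User: {user_message}\nAssistant:"


def chat(user_message: str, max_new_tokens: int = 50) -> str:
    """Generate a reply (hypothetical helper; downloads the model weights on first use)."""
    # Imported lazily so build_prompt() works even without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    # Truncate the prompt so prompt + new tokens fit in the 1024-token window.
    inputs = tokenizer(
        build_prompt(user_message),
        return_tensors="pt",
        truncation=True,
        max_length=MAX_CONTEXT - max_new_tokens,
    )

    # Sampled decoding; these knobs are illustrative, not tuned for this model.
    outputs = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=True,
        temperature=0.8,
        top_p=0.95,
    )

    # Decode only the newly generated tokens, skipping the echoed prompt.
    new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


print(build_prompt("Hello"))
```

Slicing off the prompt tokens before decoding keeps the reply clean; decoding `outputs[0]` directly (as in the basic example) echoes the prompt back.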