artistics/arka-1-1b
artistics/arka-1-1b is a multilingual text-generation model for English and Malayalam content in the domain of classical arts and cultural heritage. The model is fine-tuned on a Kerala classical arts corpus, with supporting language resources and evaluation files included in this repository.
Model Details
- Model name: artistics/arka-1-1b
- Languages: English, Malayalam
- Domain: Classical Arts, Cultural Heritage
- Base: Fine-tuned on Kerala classical arts corpus
- License: Apache 2.0
- Task: Text generation
Repository Status
This repository includes the complete Hugging Face model repository structure. Placeholder scaffold files are provided for model.safetensors, tokenizer.model, and fine-tuning/adapter_model.safetensors; replace them with the trained model, tokenizer, and adapter artifacts before publishing or using the model for inference.
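One way to confirm the placeholders have been replaced is to check that each listed artifact exists and is not suspiciously small. The sketch below is only a heuristic: the helper name and the 1024-byte threshold are assumptions made for illustration, not part of this repository.

```python
from pathlib import Path

# Files described above as placeholder scaffolds.
PLACEHOLDER_CANDIDATES = [
    "model.safetensors",
    "tokenizer.model",
    "fine-tuning/adapter_model.safetensors",
]

# Hypothetical threshold: trained artifacts are assumed to be far larger
# than this. Adjust it to match your own files.
MIN_BYTES = 1024


def find_unreplaced_artifacts(repo_dir, min_bytes=MIN_BYTES):
    """Return the relative paths that are missing or smaller than min_bytes."""
    root = Path(repo_dir)
    suspicious = []
    for rel in PLACEHOLDER_CANDIDATES:
        path = root / rel
        if not path.is_file() or path.stat().st_size < min_bytes:
            suspicious.append(rel)
    return suspicious


if __name__ == "__main__":
    for rel in find_unreplaced_artifacts("."):
        print(f"still a placeholder or missing: {rel}")
```

An empty result only means the files pass the size check; it does not verify that they contain valid weights or a valid tokenizer.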
Intended Use
This model is intended for applications involving Kerala classical arts, cultural heritage documentation, educational content generation, multilingual question answering support, and domain-specific writing assistance in English and Malayalam.
Example use cases include:
- Generating explanatory text about Kerala classical art forms
- Assisting cultural heritage documentation workflows
- Supporting bilingual English-Malayalam educational content
- Answering domain-focused questions using fine-tuned cultural context
Limitations
The model is specialized for classical arts and cultural heritage content, especially related to Kerala. It may be less reliable outside this domain and should not be treated as an authoritative source without human review. Generated content may contain factual errors, omissions, or culturally sensitive inaccuracies.
For educational, archival, or public-facing use, outputs should be reviewed by domain experts.
Training Data
The model was fine-tuned on a Kerala classical arts corpus. The repository includes language-specific supporting resources:
- languages/english/corpus.txt
- languages/malayalam/corpus.txt
- languages/english/vocab.json
- languages/malayalam/vocab.json
Fine-Tuning
Fine-tuning artifacts are provided under fine-tuning/:
- fine-tuning/adapter_config.json
- fine-tuning/adapter_model.safetensors
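Before loading an adapter, it can help to inspect its configuration. The following is a minimal sketch that reads adapter_config.json with the standard library. The key names follow the layout commonly written by the PEFT library, which is an assumption here; this repository's file may use different fields, in which case the corresponding values come back as None.

```python
import json
from pathlib import Path


def summarize_adapter_config(path="fine-tuning/adapter_config.json"):
    """Return a few commonly present adapter settings, or None where absent."""
    config = json.loads(Path(path).read_text(encoding="utf-8"))
    # Assumed key names (typical of PEFT adapter configs); not confirmed
    # for this repository.
    keys = ("base_model_name_or_path", "peft_type", "task_type")
    return {key: config.get(key) for key in keys}


if __name__ == "__main__":
    for key, value in summarize_adapter_config().items():
        print(f"{key}: {value}")
```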
Evaluation
Evaluation files are included under eval/:
- eval/english-qa.jsonl
- eval/malayalam-qa.jsonl
These files can be used to assess model behavior on English and Malayalam domain-specific question answering examples.
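A rough evaluation loop over these files might look like the sketch below. The field names "question" and "answer" are hypothetical, since the JSONL schema is not documented here, and generate_fn stands in for whatever function wraps the model's generation call. The score is a simple containment check (does the reference answer appear in the output), which is a crude proxy and not a substitute for expert review.

```python
import json


def load_jsonl(path):
    """Read one JSON object per non-empty line."""
    with open(path, encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


def normalize(text):
    """Lowercase and collapse whitespace before comparing."""
    return " ".join(text.lower().split())


def containment_score(examples, generate_fn,
                      question_key="question", answer_key="answer"):
    """Fraction of examples whose reference answer appears in the output."""
    if not examples:
        return 0.0
    hits = 0
    for example in examples:
        # generate_fn is a caller-supplied function: prompt text in, text out.
        output = generate_fn(example[question_key])
        if normalize(example[answer_key]) in normalize(output):
            hits += 1
    return hits / len(examples)
```

For example, `containment_score(load_jsonl("eval/english-qa.jsonl"), my_generate)` would score the English set, where `my_generate` is a function you define around the model.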
Usage
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "artistics/arka-1-1b"

# Load the tokenizer and model from the Hub.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "Explain the cultural significance of Kathakali in Kerala."
inputs = tokenizer(prompt, return_tensors="pt")

# temperature and top_p only take effect when sampling is enabled.
outputs = model.generate(
    **inputs,
    max_new_tokens=200,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
Repository Structure
.
|-- README.md
|-- LICENSE
|-- .gitattributes
|-- config.json
|-- generation_config.json
|-- tokenizer.json
|-- tokenizer.model
|-- tokenizer_config.json
|-- special_tokens_map.json
|-- model.safetensors
|-- languages/
|   |-- english/
|   |   |-- vocab.json
|   |   `-- corpus.txt
|   `-- malayalam/
|       |-- vocab.json
|       `-- corpus.txt
|-- fine-tuning/
|   |-- adapter_config.json
|   `-- adapter_model.safetensors
`-- eval/
    |-- english-qa.jsonl
    `-- malayalam-qa.jsonl
License
This model is released under the Apache License 2.0. See the LICENSE file for details.
Tags
classical-arts kerala malayalam english multilingual cultural-heritage fine-tuned