artistics/arka-1-1b

artistics/arka-1-1b is a multilingual text-generation model for English and Malayalam content in the domain of classical arts and cultural heritage. The model is fine-tuned on a Kerala classical arts corpus, with supporting language resources and evaluation files included in this repository.

Model Details

  • Model name: artistics/arka-1-1b
  • Languages: English, Malayalam
  • Domain: Classical Arts, Cultural Heritage
  • Fine-tuning data: Kerala classical arts corpus
  • License: Apache 2.0
  • Task: Text generation

Repository Status

This repository follows the standard Hugging Face model repository layout. The files model.safetensors, tokenizer.model, and fine-tuning/adapter_model.safetensors are placeholder scaffolds. Replace them with the trained model, tokenizer, and adapter artifacts before publishing the model or running inference.
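A quick way to confirm that the scaffold files have been replaced is to check their sizes. The following is a minimal sketch, not part of the repository: the function name find_placeholders and the 1 KiB cutoff are illustrative assumptions, and a size check only flags obviously empty stubs rather than validating the artifacts.

```python
from pathlib import Path

# Artifact paths named in this README.
ARTIFACTS = [
    "model.safetensors",
    "tokenizer.model",
    "fine-tuning/adapter_model.safetensors",
]

# Hypothetical cutoff: anything smaller is treated as a scaffold stub.
MIN_BYTES = 1024


def find_placeholders(repo_dir, min_bytes=MIN_BYTES):
    """Return artifact paths that are missing or smaller than min_bytes."""
    root = Path(repo_dir)
    suspect = []
    for rel in ARTIFACTS:
        path = root / rel
        if not path.is_file() or path.stat().st_size < min_bytes:
            suspect.append(rel)
    return suspect


# Example call from the repository root:
# print(find_placeholders("."))
```

An empty list suggests all three files have been replaced with something larger than a stub. It does not confirm that they load correctly.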

Intended Use

This model is intended for applications involving Kerala classical arts, cultural heritage documentation, educational content generation, multilingual question answering support, and domain-specific writing assistance in English and Malayalam.

Example use cases include:

  • Generating explanatory text about Kerala classical art forms
  • Assisting cultural heritage documentation workflows
  • Supporting bilingual English-Malayalam educational content
  • Answering domain-focused questions, drawing on the cultural context learned during fine-tuning

Limitations

The model is specialized for classical arts and cultural heritage content, especially related to Kerala. It may be less reliable outside this domain and should not be treated as an authoritative source without human review. Generated content may contain factual errors, omissions, or culturally sensitive inaccuracies.

For educational, archival, or public-facing use, outputs should be reviewed by domain experts.

Training Data

The model was fine-tuned on a Kerala classical arts corpus. The repository includes language-specific supporting resources:

  • languages/english/corpus.txt
  • languages/malayalam/corpus.txt
  • languages/english/vocab.json
  • languages/malayalam/vocab.json
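As a sanity check on the two corpus files, one option is to measure how much of each file is written in Malayalam script. The sketch below is an illustration only: the helper name malayalam_share is hypothetical, and it relies on the Malayalam Unicode block (U+0D00 to U+0D7F) rather than on anything specific to this repository.

```python
def malayalam_share(text):
    """Fraction of alphabetic characters that fall in the Malayalam block.

    Combining vowel signs and the virama are not counted, because
    str.isalpha() returns False for them. Returns 0.0 if the text
    contains no alphabetic characters.
    """
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return 0.0
    malayalam = sum(1 for ch in letters if "\u0d00" <= ch <= "\u0d7f")
    return malayalam / len(letters)


# Example call (path taken from the list above):
# with open("languages/malayalam/corpus.txt", encoding="utf-8") as handle:
#     print(malayalam_share(handle.read()))
```

A value near 1.0 would be expected for the Malayalam corpus and near 0.0 for the English one. Mixed-language passages will fall in between.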

Fine-Tuning

Fine-tuning artifacts are provided under fine-tuning/:

  • fine-tuning/adapter_config.json
  • fine-tuning/adapter_model.safetensors
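This README does not state the adapter format. Assuming adapter_config.json follows the PEFT convention, the sketch below reads a few fields that such configs commonly contain. The function name summarize_adapter_config is hypothetical, and any key that is absent from the file is reported as None.

```python
import json
from pathlib import Path

# Keys commonly present in a PEFT LoRA adapter config (assumed, not confirmed
# for this repository).
KEYS = ("peft_type", "base_model_name_or_path", "r", "lora_alpha", "target_modules")


def summarize_adapter_config(path):
    """Load a JSON adapter config and return selected fields (None if absent)."""
    config = json.loads(Path(path).read_text(encoding="utf-8"))
    return {key: config.get(key) for key in KEYS}


# Example call from the repository root:
# for key, value in summarize_adapter_config("fine-tuning/adapter_config.json").items():
#     print(f"{key}: {value}")
```

If base_model_name_or_path is present, it indicates which base checkpoint the adapter was trained against.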

Evaluation

Evaluation files are included under eval/:

  • eval/english-qa.jsonl
  • eval/malayalam-qa.jsonl

These files can be used to assess model behavior on English and Malayalam domain-specific question answering examples.
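One possible way to use these files is sketched below. The record schema is not documented here, so the field names "question" and "answer" are assumptions, as are the helper names. Exact match is a deliberately crude metric for generated text and is shown only as a starting point.

```python
import json


def load_jsonl(path):
    """Read one JSON object per non-empty line."""
    records = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records


def normalize(text):
    """Lowercase and collapse whitespace before comparing strings."""
    return " ".join(text.lower().split())


def exact_match_rate(records, predict, question_key="question", answer_key="answer"):
    """Share of records where predict(question) matches the reference answer.

    The default key names are hypothetical; adjust them to the actual schema.
    """
    if not records:
        return 0.0
    hits = sum(
        normalize(predict(rec[question_key])) == normalize(rec[answer_key])
        for rec in records
    )
    return hits / len(records)


# Example call, where predict is any function mapping a question to a string:
# records = load_jsonl("eval/english-qa.jsonl")
# print(exact_match_rate(records, predict))
```

Here predict can wrap the generation call from the Usage section. For free-form answers, human review or a softer similarity measure is likely to be more informative than exact match.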

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "artistics/arka-1-1b"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "Explain the cultural significance of Kathakali in Kerala."
inputs = tokenizer(prompt, return_tensors="pt")

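# temperature and top_p only take effect when sampling is enabled,
# so do_sample=True is set explicitly here.
outputs = model.generate(
    **inputs,
    max_new_tokens=200,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
)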

print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Repository Structure

.
|-- README.md
|-- LICENSE
|-- .gitattributes
|-- config.json
|-- generation_config.json
|-- tokenizer.json
|-- tokenizer.model
|-- tokenizer_config.json
|-- special_tokens_map.json
|-- model.safetensors
|-- languages/
|   |-- english/
|   |   |-- vocab.json
|   |   `-- corpus.txt
|   `-- malayalam/
|       |-- vocab.json
|       `-- corpus.txt
|-- fine-tuning/
|   |-- adapter_config.json
|   `-- adapter_model.safetensors
`-- eval/
    |-- english-qa.jsonl
    `-- malayalam-qa.jsonl

License

This model is released under the Apache License 2.0. See the LICENSE file for details.

Tags

classical-arts kerala malayalam english multilingual cultural-heritage fine-tuned
