---
license: apache-2.0
datasets:
- HuggingFaceFW/fineweb-edu
- openbmb/Ultra-FineWeb
- HuggingFaceTB/finemath
- HuggingFaceTB/smollm-corpus
- openbmb/UltraData-Math
language:
- en
library_name: transformers
tags:
- causal-lm
- decoder-only
- grouped-query-attention
- rope
- swiglu
- custom-tokenizer
- curriculum-learning
- xsa
pipeline_tag: text-generation
---
# Atom 3.4m

Atom is a 3.4M parameter causal language model developed by **Universal Computing Research**. It was pretrained from scratch as a compact research model for studying language-model architecture, data curricula, and small-model benchmarking.
## Model details

- Architecture: causal decoder-only language model
- Parameters: 3,412,800
- Layers: 7
- Hidden size: 192
- Attention: 3 query heads and 1 key-value head (grouped-query attention)
- Head dimension: 64
- Feed-forward size: 480
- Context length: 512 tokens
- Positional encoding: rotary position embeddings (RoPE)
- RoPE theta: 5000.0
- Normalization: RMSNorm
- Activation: gated SiLU feed-forward network
- Vocabulary size: 4,096 tokens
- Tokenizer: custom byte-level BPE, exposed as `GPT2TokenizerFast`
- Training tokens: approximately 5 billion
- License: Apache-2.0

The model uses tied input and output embeddings. Its custom attention implementation combines grouped-query attention with XSA.
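
The parameter count can be cross-checked against the hyperparameters listed above. The sketch below is a back-of-the-envelope tally, not the model's actual code: it assumes bias-free linear projections, a three-matrix gated feed-forward block, two RMSNorm weight vectors per layer plus one final norm, and no extra parameters from the custom attention. Under those assumptions the total matches the stated 3,412,800.

```python
# Hyperparameters copied from the model details list.
VOCAB_SIZE = 4096
HIDDEN = 192
LAYERS = 7
Q_HEADS = 3
KV_HEADS = 1
HEAD_DIM = 64
FFN = 480


def count_parameters():
    """Tally parameters under the assumptions stated in the lead-in."""
    # Tied embeddings: one matrix serves as both input and output embedding.
    embedding = VOCAB_SIZE * HIDDEN

    # Assumption: Q, K, V and output projections carry no bias terms.
    q_proj = HIDDEN * Q_HEADS * HEAD_DIM
    kv_proj = 2 * HIDDEN * KV_HEADS * HEAD_DIM  # one shared K and one shared V
    o_proj = Q_HEADS * HEAD_DIM * HIDDEN
    attention = q_proj + kv_proj + o_proj

    # Assumption: gated SiLU block with gate, up and down matrices, no biases.
    feed_forward = 3 * HIDDEN * FFN

    # Assumption: two RMSNorm weight vectors per layer, plus one final norm.
    norms_per_layer = 2 * HIDDEN
    final_norm = HIDDEN

    per_layer = attention + feed_forward + norms_per_layer
    return embedding + LAYERS * per_layer + final_norm


print(count_parameters())  # 3412800
```

With a single key-value head, the K and V projections together cost 24,576 parameters per layer, compared with 73,728 if each of the three query heads had its own key-value head.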

## Tokenizer

Atom uses a custom byte-level BPE tokenizer trained specifically for this pretraining corpus. The tokenizer has a vocabulary of 4,096 tokens and includes dedicated padding, beginning-of-sequence, end-of-sequence, unknown, and end-of-text tokens.

## Training data and curriculum

Atom was trained on a curriculum combining general web text, educational material, synthetic textbook-style content, and mathematical data. The mixture changed gradually during training: general web data was emphasized earlier, while educational, synthetic, and mathematical material received more weight later.

Approximate proportions over the complete training run were:

| Dataset | Subset / split used | Approximate proportion |
|---|---|---:|
| [HuggingFaceFW/fineweb-edu](https://huggingface.co/datasets/HuggingFaceFW/fineweb-edu) | All available `CC-MAIN-*` configurations under `data/`, `train` split | 39% |
| [openbmb/Ultra-FineWeb](https://huggingface.co/datasets/openbmb/Ultra-FineWeb) | English v1.4 (`ultrafineweb_en_v1_4`; `en` split) | 31% |
| [HuggingFaceTB/finemath](https://huggingface.co/datasets/HuggingFaceTB/finemath) | `finemath-3plus`, `train` split | 12% |
| [HuggingFaceTB/smollm-corpus](https://huggingface.co/datasets/HuggingFaceTB/smollm-corpus) | `cosmopedia-v2`, `train` split | 12% |
| [openbmb/UltraData-Math](https://huggingface.co/datasets/openbmb/UltraData-Math) | `UltraData-Math-L2-preview`, `train` split | 6% |

These percentages describe the approximate aggregate sampling mixture rather than exact document counts. Refer to the individual dataset cards for their source information, licenses, and usage conditions.
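
To illustrate how a gradually changing mixture can still average out to the aggregate proportions in the table, here is a minimal sketch of a linear curriculum. The card does not publish the actual schedule: the `START` weights are invented for illustration, treating Ultra-FineWeb as the general web source is an assumption, and the linear ramp is only one possible shape. Only the `AGGREGATE` values come from the table.

```python
# Aggregate sampling proportions from the table above.
AGGREGATE = {
    "HuggingFaceFW/fineweb-edu": 0.39,
    "openbmb/Ultra-FineWeb": 0.31,
    "HuggingFaceTB/finemath": 0.12,
    "HuggingFaceTB/smollm-corpus": 0.12,
    "openbmb/UltraData-Math": 0.06,
}

# HYPOTHETICAL weights at the start of training (not from the card).
START = {
    "HuggingFaceFW/fineweb-edu": 0.34,
    "openbmb/Ultra-FineWeb": 0.50,
    "HuggingFaceTB/finemath": 0.06,
    "HuggingFaceTB/smollm-corpus": 0.07,
    "openbmb/UltraData-Math": 0.03,
}

# End weights chosen so that a linear ramp averages to AGGREGATE.
END = {name: 2 * AGGREGATE[name] - START[name] for name in AGGREGATE}


def mixture_at(progress):
    """Sampling weights at a training progress value in [0, 1]."""
    return {
        name: (1 - progress) * START[name] + progress * END[name]
        for name in AGGREGATE
    }


def aggregate_mixture(steps=1000):
    """Average the per-step weights over the whole run (midpoint sampling)."""
    totals = dict.fromkeys(AGGREGATE, 0.0)
    for i in range(steps):
        for name, weight in mixture_at((i + 0.5) / steps).items():
            totals[name] += weight / steps
    return totals


for name, share in aggregate_mixture().items():
    print(f"{name}: {share:.2f}")
```

In this hypothetical schedule the Ultra-FineWeb weight falls from 0.50 to 0.12 while the mathematical and synthetic sources roughly triple, yet the run-level averages reproduce the table.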

## Intended use

This is a small base language model intended for research and benchmarking. It may be useful for experiments involving compact architectures, pretraining curricula, tokenization, evaluation pipelines, and resource-constrained inference.

Atom is a base model and has not been instruction-tuned or aligned for assistant-style interaction.
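
Because Atom is a base model, it is best prompted with text to continue rather than with instructions. The following is a minimal sketch using the Transformers `pipeline` helper, assuming the repository ID `UniversalComputingResearch/Atom3.4m` from the hosting page and that the model's custom code works with the standard text-generation pipeline. The helper names `clamp_new_tokens` and `complete` and the sampling settings are illustrative choices, not part of the model's API.

```python
MODEL_ID = "UniversalComputingResearch/Atom3.4m"  # repository ID on the Hub
CONTEXT_LENGTH = 512  # context length from the model details


def clamp_new_tokens(prompt_tokens, requested):
    """Limit generation so prompt plus output fits the context window."""
    remaining = CONTEXT_LENGTH - prompt_tokens
    if remaining <= 0:
        raise ValueError("prompt already fills the 512-token context window")
    return min(requested, remaining)


def complete(prompt, max_new_tokens=64):
    """Continue `prompt` with sampled text (downloads the model on first use)."""
    # Imported lazily so the helper above can be used without transformers.
    from transformers import pipeline

    # trust_remote_code is needed because the repository ships custom model code.
    pipe = pipeline("text-generation", model=MODEL_ID, trust_remote_code=True)
    prompt_tokens = len(pipe.tokenizer(prompt)["input_ids"])
    result = pipe(
        prompt,
        max_new_tokens=clamp_new_tokens(prompt_tokens, max_new_tokens),
        do_sample=True,
        temperature=0.5,  # illustrative sampling setting
    )
    return result[0]["generated_text"]
```

Calling `complete("Once upon a time,")` should return the prompt followed by a sampled continuation; at this model size the output is unlikely to stay coherent for long.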

## Evaluation

Atom was evaluated with EleutherAI's `lm-evaluation-harness` and ArithMark-2.0.

### lm-evaluation-harness

| Task | Metric | Score |
|---|---|---:|
| ARC-Easy | `acc_norm` | 33.08% |
| ARC-Challenge | `acc_norm` | 21.76% |
| HellaSwag | `acc_norm` | 27.65% |
| PIQA | `acc_norm` | 55.71% |

### ArithMark-2.0

| Benchmark | Metric | Score |
|---|---|---:|
| ArithMark-2.0 | `acc` | 27.36% |
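
**Average score: 34.54%** (apparently the mean of the four `lm-evaluation-harness` scores; ArithMark-2.0 does not appear to be included)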
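
The card does not say which scores the average covers, so the short check below recomputes both candidate means from the rounded values in the tables. It is a sanity check rather than the authors' scoring script, and the small gap between 34.55 and the reported 34.54 may simply come from averaging unrounded scores.

```python
from statistics import mean

# Scores (in percent) copied from the evaluation tables.
LM_EVAL = {
    "ARC-Easy": 33.08,
    "ARC-Challenge": 21.76,
    "HellaSwag": 27.65,
    "PIQA": 55.71,
}
ARITHMARK = 27.36

# Mean over the four lm-evaluation-harness tasks only.
lm_eval_mean = mean(LM_EVAL.values())

# Mean over all five reported scores.
overall_mean = mean([*LM_EVAL.values(), ARITHMARK])

print(f"lm-evaluation-harness mean: {lm_eval_mean:.2f}%")  # 34.55%
print(f"mean including ArithMark-2.0: {overall_mean:.2f}%")  # 33.11%
```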

## Limitations

Atom is a very small model and should not be expected to produce reliable factual, safety-critical, or instruction-following outputs. Its short context window and limited capacity constrain coherence, knowledge recall, reasoning, and long-form generation.

The model may reproduce errors, biases, or undesirable patterns present in its training data. It has not undergone dedicated safety training and should not be used for high-stakes decisions.