Instructions to use deep-conrad/conrad_nit with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use deep-conrad/conrad_nit with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="deep-conrad/conrad_nit")
```
```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForMultimodalLM

tokenizer = AutoTokenizer.from_pretrained("deep-conrad/conrad_nit")
model = AutoModelForMultimodalLM.from_pretrained("deep-conrad/conrad_nit")
```
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use deep-conrad/conrad_nit with vLLM:
Install from pip and serve model
```bash
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "deep-conrad/conrad_nit"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "deep-conrad/conrad_nit",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```
Use Docker
```bash
docker model run hf.co/deep-conrad/conrad_nit
```
- SGLang
How to use deep-conrad/conrad_nit with SGLang:
Install from pip and serve model
```bash
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "deep-conrad/conrad_nit" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "deep-conrad/conrad_nit",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```
Use Docker images
```bash
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "deep-conrad/conrad_nit" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "deep-conrad/conrad_nit",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```
- Docker Model Runner
How to use deep-conrad/conrad_nit with Docker Model Runner:
```bash
docker model run hf.co/deep-conrad/conrad_nit
```
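The curl calls above can also be issued from Python. The following is a minimal sketch using only the standard library, assuming a vLLM or SGLang server is already running locally and returns the usual OpenAI-style completions response. The helper names `build_completion_request` and `complete` are illustrative and are not part of this repository.

```python
import json
import urllib.request

MODEL_ID = "deep-conrad/conrad_nit"


def build_completion_request(prompt, base_url="http://localhost:8000",
                             max_tokens=512, temperature=0.5):
    """Build a POST request for the OpenAI-compatible /v1/completions route."""
    payload = {
        "model": MODEL_ID,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, **kwargs):
    """Send the request and return the first generated text.

    Requires a running server; the response shape (choices[0].text) is the
    standard OpenAI completions format.
    """
    request = build_completion_request(prompt, **kwargs)
    with urllib.request.urlopen(request, timeout=60) as response:
        body = json.load(response)
    return body["choices"][0]["text"]
```

With vLLM, `complete("Once upon a time,")` uses the default port 8000. For the SGLang server shown above, pass `base_url="http://localhost:30000"`.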
Conrad_NIT
Conrad_NIT is a hybrid training project from Deep Conrad.
This Space is the user-facing shell for Conrad. It does not rely on the bundled toy checkpoint for normal replies. Instead, it proxies chat requests to the configured production endpoint when CONRAD_ENDPOINT_URL is set.
The project uses a two-stage pipeline:
- Stage 1 continues training a pretrained GPT-style backbone on domain data.
- Stage 2 applies LoRA fine-tuning on a LLaMA base model for the assistant release.
This repository is the training and release workspace for that pipeline. It is not just a wrapper around an existing model.
What ships here
- `train_stage1.py` for the Stage 1 GPT baseline
- `train_llama_stage2.py` for LoRA fine-tuning on LLaMA
- `train_lora.py` as a compatibility entry point for the LoRA path
- `train_pipeline.py` to run both stages in sequence
- `build_datasets.py` to generate topic-sharded synthetic datasets
- `data/` with example JSONL training sets
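To illustrate what "topic-sharded" JSONL data can look like, here is a minimal sketch that writes one shard per topic plus an aggregate file. It is not the actual `build_datasets.py`. The record fields (`topic`, `prompt`, `response`) and the aggregate file name `all.jsonl` are assumptions made for the example, since the real schema is not described here.

```python
import json
from collections import defaultdict
from pathlib import Path


def write_topic_shards(records, output_dir):
    """Write one JSONL shard per topic plus an aggregate file; return the paths.

    `records` is a list of dicts with a hypothetical "topic" key.
    """
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)

    # Group records by topic, preserving their original order within a topic.
    by_topic = defaultdict(list)
    for record in records:
        by_topic[record["topic"]].append(record)

    paths = []
    for topic, rows in sorted(by_topic.items()):
        shard = output_dir / f"{topic}.jsonl"
        shard.write_text("".join(json.dumps(r) + "\n" for r in rows), encoding="utf-8")
        paths.append(shard)

    # Aggregate file containing every record (hypothetical file name).
    aggregate = output_dir / "all.jsonl"
    aggregate.write_text("".join(json.dumps(r) + "\n" for r in records), encoding="utf-8")
    paths.append(aggregate)
    return paths


def read_jsonl(path):
    """Read a JSONL file into a list of dicts, skipping blank lines."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

Keeping one JSON object per line means shards can be concatenated or streamed without parsing the whole file at once.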
Model Direction
The project is intended to produce two artifacts:
- a meaningful internal baseline from Stage 1
- a higher-quality assistant checkpoint or adapter from Stage 2
The Stage 2 model is the release artifact for normal inference.
Base Model
Stage 2 currently targets:
meta-llama/Llama-3.1-8B-Instruct
The earlier LoRA script in this repo also supports the same model family.
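As a rough illustration of why LoRA is practical on a model of this size, the arithmetic below compares the trainable parameters of a LoRA adapter with those of the full weight matrix it adapts. A LoRA adapter for a `d_out x d_in` weight adds two low-rank factors with `rank * (d_out + d_in)` parameters in total. The hidden size of 4096 matches Llama-3.1-8B; the rank of 16 is a hypothetical value chosen for the example, not one taken from this repo's training scripts.

```python
def lora_param_count(d_out, d_in, rank):
    """Trainable parameters in a LoRA adapter: B is d_out x rank, A is rank x d_in."""
    return rank * (d_out + d_in)


def full_param_count(d_out, d_in):
    """Parameters in the full weight matrix that the adapter modifies."""
    return d_out * d_in


HIDDEN_SIZE = 4096  # hidden size of Llama-3.1-8B
RANK = 16           # hypothetical LoRA rank, for illustration only

adapter = lora_param_count(HIDDEN_SIZE, HIDDEN_SIZE, RANK)
full = full_param_count(HIDDEN_SIZE, HIDDEN_SIZE)
print(adapter, full, adapter / full)  # 131072 16777216 0.0078125
```

Under these assumptions, the adapter for one square projection trains under 1% of the parameters of the matrix it adapts.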
Training Flow
Recommended flow:
- Run `python build_datasets.py --output_dir data/generated --include_docs` to generate topic-specific shards and aggregate JSONL files.
- Train Stage 1 with `data/generated/stage1_sft.jsonl` if you want the GPT baseline.
- Run `python train_stage1.py` to build the small baseline.
- Train Stage 2 with `data/generated/stage2_conrad_sft.jsonl` or let `python train_pipeline.py --include_docs` do the build step automatically.
- Merge the Stage 2 adapter with `python merge_stage2_lora.py`.
- Sync the merged checkpoint into the model repo with `python sync_checkpoint.py`.
- Publish the merged checkpoint.
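The flow above can be scripted. The sketch below is not the repo's `train_pipeline.py`; it only chains the commands named in the steps and stops at the first failure. Only the `build_datasets.py` flags come from this document. How the dataset path is passed to the training scripts is not shown here, so those steps are invoked without arguments, which is an assumption. The final publish step is left manual.

```python
import subprocess
import sys

# Script invocations in the order given by the recommended flow.
STEPS = [
    ["build_datasets.py", "--output_dir", "data/generated", "--include_docs"],
    ["train_stage1.py"],
    ["train_llama_stage2.py"],  # assumption: run without extra arguments
    ["merge_stage2_lora.py"],
    ["sync_checkpoint.py"],
]


def run_flow(steps=STEPS, dry_run=True):
    """Print each command in order and, unless dry_run, execute it.

    subprocess.run(check=True) raises CalledProcessError on a non-zero exit,
    so a failed stage stops the flow before later stages run.
    """
    commands = [[sys.executable, *step] for step in steps]
    for command in commands:
        print("$", " ".join(command))
        if not dry_run:
            subprocess.run(command, check=True)
    return commands
```

Calling `run_flow()` only prints the commands. Pass `dry_run=False` from the repository root to execute them.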
Intended Use
This project is designed for:
- conversational assistants
- documentation assistants
- support routing
- enterprise workflows
- knowledge assistants
- internal tooling
- structured response generation
Notes
- Stage 1 and Stage 2 are separate training jobs.
- Stage 1 is not a wrapper around LLaMA.
- Stage 2 is the higher-quality assistant tuning step.
- The repo includes example datasets only; real training data is still required.
- Set `CONRAD_ENDPOINT_URL` and `HF_TOKEN` in the Space secrets or environment to enable production chat.
- If the endpoint is unavailable, the Space will show a clear fallback instead of generating from the raw checkpoint.
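The proxy-or-fallback behaviour described in the last two notes could look roughly like the sketch below. This is an illustration, not the Space's actual code. The fallback text, the `reply` function, and the `send` callable (which stands in for whatever HTTP client and request format the production endpoint expects) are all assumptions.

```python
import os

# Hypothetical fallback text; the Space's real wording is not documented here.
FALLBACK_MESSAGE = "The production endpoint is unavailable. Please try again later."


def reply(message, send, env=None):
    """Proxy a chat message to the configured endpoint, or return a fallback.

    `send(endpoint_url, message, token)` is a caller-supplied function that
    performs the actual request and returns the reply text.
    """
    env = os.environ if env is None else env
    endpoint_url = env.get("CONRAD_ENDPOINT_URL")
    if not endpoint_url:
        # No endpoint configured: never fall through to the raw checkpoint.
        return FALLBACK_MESSAGE
    try:
        return send(endpoint_url, message, env.get("HF_TOKEN"))
    except OSError:
        # Covers connection failures, including urllib.error.URLError.
        return FALLBACK_MESSAGE
```

Injecting `send` keeps the fallback logic testable without a live endpoint.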