AI & ML interests

At EnDevSols, we focus on applied AI engineering, bridging the gap between experimental models and robust production systems. Our core interests lie in architecting hallucination-resistant Retrieval-Augmented Generation (RAG) pipelines, orchestrating autonomous multi-agent workflows, and fine-tuning specialized Small Language Models (SLMs) for secure, cloud-avoidant enterprise environments. We actively develop open-source infrastructure to optimize LLM training, advanced document parsing, and agent observability.

EnDevSols

Welcome to the EnDevSols Hugging Face organization. We are an AI engineering team specializing in production-grade machine learning architecture, focusing heavily on Retrieval-Augmented Generation (RAG) pipelines, Autonomous Agents, and deploying specialized Small Language Models (SLMs) for enterprise environments.

We bridge the gap between experimental models and scalable, "Cloud-Avoidant" production systems.

🛠️ Open Source Tooling

We actively maintain tools designed to optimize LLM workflows, data ingestion, and model observability. You can find these repositories in our Spaces and model cards:

  • Long-Trainer: Framework for streamlining extensive model training and efficient fine-tuning pipelines.
  • LongTracer: Advanced observability tool for tracing execution, debugging, and monitoring multi-step AI agent workflows.
  • LongParser: High-fidelity document parsing engine optimized for seamless, chunked data ingestion into enterprise RAG systems.
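To make the ingestion idea concrete, here is a minimal illustrative sketch of overlapping-chunk splitting, the kind of preprocessing a parser feeds into a RAG index. This is not the actual LongParser API; the function name and parameters are hypothetical.

```python
# Illustrative sketch of chunked ingestion for RAG (hypothetical helper,
# not the LongParser API). Overlap preserves context across chunk borders.

def chunk_text(text: str, chunk_size: int = 400, overlap: int = 50) -> list[str]:
    """Split text into overlapping character chunks ready for embedding."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap  # advance less than chunk_size to overlap
    chunks = []
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
    return chunks
```

In practice the chunk boundaries would track document structure (headings, tables) rather than raw character counts, but the overlap pattern is the same.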

🧠 Core AI Capabilities

Our focus is applied AI and inference optimization rather than purely theoretical research:

  • RAG & Knowledge Retrieval: Architecting robust, hallucination-resistant pipelines for proprietary enterprise data.
  • Agentic Workflows: Multi-agent orchestration for automating complex, reasoning-dependent business tasks.
  • Domain-Specific SLMs: Fine-tuning and deploying specialized models (such as leveraging MedGemma for clinical assistants like Vivus AI) where privacy and latency are paramount.
  • Applied Computer Vision & NLP: Implementing edge-ready AI, from receipt OCR and voice-to-transaction parsing (as seen in SmartWalt) to real-time text analysis.
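The retrieval step at the heart of these RAG pipelines can be sketched in a few lines: score every indexed chunk against the query embedding and keep the top-k. This is a toy cosine-similarity version over precomputed vectors; a production system would use a real embedding model and a vector store.

```python
# Toy top-k retrieval over precomputed embeddings (illustrative only;
# production pipelines use an embedding model plus a vector database).
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query_vec: list[float], index: dict[str, list[float]], k: int = 2) -> list[str]:
    """Return the ids of the k chunks most similar to the query vector."""
    ranked = sorted(index.items(), key=lambda kv: cosine(query_vec, kv[1]), reverse=True)
    return [doc_id for doc_id, _ in ranked[:k]]
```

The retrieved chunks are then placed in the model's context window, which is what grounds generation and makes the pipeline hallucination-resistant.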

⚙️ Velocity Architecture & Inference

We prioritize a "Velocity Architecture" approach: engineering systems that optimize for iteration speed, low-latency inference, and production reliability.

  • Serving: FastAPI-driven model endpoints.
  • Compute: Optimized inference on AWS infrastructure (including EC2 Graviton and Amazon Bedrock).
  • Orchestration: Containerized local-to-cloud deployments utilizing robust vector stores and NoSQL/SQL databases (MongoDB, PostgreSQL).

Connect with us: If you are looking to integrate highly optimized AI into your production environment, reach out to explore our models, datasets, and custom deployment services.
