QtMeshEditor — Mesh Part Segmentation

A point-cloud part-segmentation network (PointNet++-style) that predicts a semantic body-part label (head, torso, left/right arm, left/right leg) per point, exported to ONNX for in-app inference via ONNX Runtime.

Built for QtMeshEditor (issue #410) — a free, open-source 3D mesh & animation editor — and its companion QtMesh Cloud asset service. The app downloads this model on first use and runs it locally (offline) to power the Edit-Mode "Select by Part (AI)" action, the qtmesh segment CLI, and the segment_mesh MCP tool. Labels also feed the auto-rigger as placement priors.

Model

Input: a sampled point cloud as a float32 tensor of shape [1, N, 3], normalised to a centred unit box (+Y up, character facing +Z, left limbs at +X).
Output: per-point class logits over 7 labels (unknown, head, torso, left_arm, right_arm, left_leg, right_leg); argmax → label, scattered back to mesh vertices/faces by nearest sampled point.
Architecture (v2): shared per-point MLP + two kNN local-aggregation blocks (in-graph cdist+topk, ONNX-exportable) + a global max-pooled feature; ~0.78 MB. Trained at the app's inference sample size (4096 points).
When the model is unavailable, QtMeshEditor falls back to a deterministic geometric segmenter (connected-component islands + an up-axis/lateral spatial heuristic, refined by skeleton-bone proximity when the mesh is rigged).
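
The pre- and post-processing around the network can be sketched in NumPy as below. This is a minimal illustration, not the app's code: the helper names are hypothetical, and the exact normalisation convention (bounding-box centre, uniform scale by the longest side) is an assumption, since the card only says "centred unit box". The logits are taken as a plain [N, 7] array, however they were obtained from ONNX Runtime.

```python
import numpy as np

LABELS = ["unknown", "head", "torso", "left_arm", "right_arm", "left_leg", "right_leg"]


def normalise_to_unit_box(points):
    """Centre on the bounding-box centre and scale the longest side to 1.

    Assumption: the card does not spell out the convention, so this is one
    plausible reading of "centred unit box".
    """
    points = np.asarray(points, dtype=np.float32)
    lo, hi = points.min(axis=0), points.max(axis=0)
    centre = (lo + hi) / 2.0
    extent = float((hi - lo).max())
    scale = extent if extent > 0 else 1.0  # guard against a degenerate cloud
    return (points - centre) / scale


def scatter_labels(vertices, sampled_points, logits):
    """Give every mesh vertex the argmax label of its nearest sampled point."""
    vertices = np.asarray(vertices, dtype=np.float32)
    sampled_points = np.asarray(sampled_points, dtype=np.float32)
    point_labels = np.argmax(logits, axis=-1)  # shape (N,)
    # Brute-force squared distances, shape (V, N).
    diff = vertices[:, None, :] - sampled_points[None, :, :]
    nearest = (diff ** 2).sum(axis=-1).argmin(axis=1)
    return point_labels[nearest]
```

The brute-force distance matrix is fine for a 4096-point sample on small meshes; for dense meshes a spatial index such as a KD-tree would avoid the V × N memory cost.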

Training data & license

Trained from scratch on two permissive sources:

Synthetic surface-sampled bodies (CC0, ours): three parametric body plans — humanoid (normal / chibi / lanky proportion regimes, ears/muzzle head bumps, posed arms, feet pointing +Z), quadruped (all four legs labelled by side, tail → torso), and biped-with-tail (dino) — with exact by-construction labels and randomised per-part point density (real characters put 30–50% of their vertices in the head/face).
CC0 rigged characters (Quaternius packs — humanoids, farm animals, dinosaurs, monsters) mined for exact rig-derived labels (bone weights → bone name → part) via qtmesh segment --dump-training-data. Each mined cloud is canonicalised into the training frame from its own labels and its arm/leg sides are reassigned geometrically. A provenance ledger of every pack is kept with the training corpus.
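
The rig-derived labelling (bone weights → bone name → part) can be illustrated roughly as follows. This is a hedged sketch, not the output of qtmesh segment --dump-training-data: the keyword table and function names are hypothetical, real packs use their own bone naming, and assigning sides by the sign of X is only an assumed stand-in for the geometric side reassignment mentioned above (left limbs at +X in the training frame).

```python
import numpy as np

LABELS = ["unknown", "head", "torso", "left_arm", "right_arm", "left_leg", "right_leg"]
LABEL_ID = {name: i for i, name in enumerate(LABELS)}

# Hypothetical keyword table; real rigs need a pack-specific mapping.
KEYWORDS = {
    "head": ("head", "neck"),
    "arm": ("shoulder", "arm", "hand"),
    "leg": ("leg", "thigh", "foot"),
    "torso": ("spine", "hips", "chest", "pelvis"),
}


def bone_to_part(bone_name):
    """Map a bone name to a side-less part by substring match."""
    name = bone_name.lower()
    for part, keys in KEYWORDS.items():
        if any(key in name for key in keys):
            return part
    return "unknown"


def labels_from_rig(vertices, weights, bone_names):
    """Label each vertex by its dominant bone, then pick the side from X.

    vertices: (V, 3) positions already in the canonical frame.
    weights:  (V, B) skinning weights, one column per bone.
    """
    vertices = np.asarray(vertices, dtype=np.float32)
    dominant = np.asarray(weights).argmax(axis=1)
    labels = np.empty(len(dominant), dtype=np.int64)
    for i, bone in enumerate(dominant):
        part = bone_to_part(bone_names[bone])
        if part in ("arm", "leg"):
            # Assumption: side comes from geometry (+X is left), not the bone name.
            side = "left" if vertices[i, 0] >= 0 else "right"
            part = f"{side}_{part}"
        labels[i] = LABEL_ID[part]
    return labels
```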

Weights released under CC-BY-4.0; please credit QtMeshEditor.

Evaluation (v2)

Measured per-vertex against exact rig-derived ground truth (bone-weight labels), replicating the app's inference path (4096-point sample, nearest-point scatter):

eval set	v1.1.0	v2.0.0
out-of-distribution rigged test characters (held out of training entirely)	31.5%	94.7%
held-out CC0 rigs (suit, dress, sheep, velociraptor)	—	97.0%
held-out synthetic bodies	—	95.6%

Per-part recall on the out-of-distribution set (v2): head 0.99, torso 0.80, left arm 0.99, right arm 1.00, left leg 0.87, right leg 0.96. Torso is lowest by construction: the ground truth comes from skinning weights, whose shoulder/hip boundaries don't coincide exactly with any geometric boundary.
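
For readers who want to compute comparable numbers on their own meshes, per-part recall can be calculated as in the sketch below. The function names are hypothetical, and treating the overall score as plain per-vertex accuracy is an assumption; the card does not name the metric behind the percentages in the table.

```python
import numpy as np

LABELS = ["unknown", "head", "torso", "left_arm", "right_arm", "left_leg", "right_leg"]


def per_part_recall(truth, pred):
    """Recall per label: the fraction of ground-truth vertices of that part
    that were predicted as that part. Labels absent from truth are skipped."""
    truth = np.asarray(truth)
    pred = np.asarray(pred)
    recall = {}
    for class_id, name in enumerate(LABELS):
        mask = truth == class_id
        if mask.any():
            recall[name] = float((pred[mask] == class_id).mean())
    return recall


def per_vertex_accuracy(truth, pred):
    """Fraction of vertices whose predicted label equals the ground truth."""
    return float((np.asarray(truth) == np.asarray(pred)).mean())
```

Both functions take per-vertex label ids, so the predictions should already have been scattered from the sampled points back to the mesh vertices.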

Reproducing

To reproduce the model, use scripts/export-meshseg-onnx.py in the QtMeshEditor repo (a one-time, offline step; the script is not shipped with the app). The full v1 failure analysis, data pipeline, and multi-category roadmap live in the repo's docs/MESH_SEGMENTATION_STRATEGY.md.

Versions

v2.0.0 (current) — full data-pipeline rebuild fixing v1's real-mesh failures: surface sampling instead of volumetric primitive fills, connected bodies, chibi/cartoon proportions (v1 collapsed big-headed characters to all-torso), randomised per-part vertex density, correct left/right handedness (v1's training data was mirrored vs the rig convention), quadruped + biped-tail body plans, mined CC0 rigs with label-driven canonicalisation, a deeper two-block kNN network, and fine-tuning at the app's 4096-point inference size (v1 trained at 1024). 31.5% → 94.7% on rig-truth eval. ~0.78 MB.
v1.1.0 — retrained with T/A-pose arms, upright + yaw-only augmentation, head-protrusion samples, wider proportions, and a PointNet++-style local kNN feature. Clean on synthetic humanoids but poor transfer to real meshes (volumetric sampling, flipped handedness). ~0.58 MB.
v1.0.0 — initial flat-PointNet model on synthetic primitive humanoids (arms-down only; over-assigned lateral protrusions to "arm").
