Deformable Linear Objects/Cosserat Rods/Sim-to-Real

DeformX

A Versatile Co-Simulation Framework for Deformable Linear Objects

Yi Yang1,3,4,*, Xiang Fei1,*, Lehong Wang1,*, Chenhao Li2, Zilin Dai5, Henry Kou1, Lu Li1, Howie Choset1
1The Robotics Institute, Carnegie Mellon University  ·  2Department of Mechanical Engineering, CMU  ·  3School of Ocean and Civil Engineering, SJTU  ·  4Zhiyuan College, SJTU  ·  5Harvard University
* Equal contribution
DeformX framework overview
Cosserat rod physics × Isaac Sim — dynamics, self-collision, and mesh-skinned rendering
32,000
Synthetic Images
WireSeg-32k · depth + instance masks
+10.2% mAP@75
Segmentation Gain
SAM3 + LoRA, real held-out test set
6.6 cm
Real-Robot Error
UR5e rope-swing, vs. 15.1 cm baseline
Abstract

Visually realistic and physically faithful DLO simulation

Deformable linear objects (DLOs) such as wires, cables, and ropes are common in robotic manipulation tasks, yet simulating them with both visual realism and physical accuracy remains challenging. Existing visual methods rely on procedural geometric primitives that lack physically grounded deformation, while physics-based approaches often approximate DLOs as rigid-link chains or generic soft bodies — failing to capture the bending, twisting, and shear mechanics of slender elastic structures.

We introduce DeformX, a co-simulation framework that integrates a dedicated Cosserat rod physics engine with NVIDIA Isaac Sim, enabling DLO simulations that are both physically faithful and visually realistic. The Cosserat rod engine simulates the dynamics and self-collisions of DLOs, as well as their contact interactions with arbitrary free-form meshes. For high-fidelity visualization, we employ mesh skinning to map discrete rod deformations onto imported CAD models. To our knowledge, DeformX is the first framework to unify realistic visualization, principled physics, and robot-learning compatibility.

We demonstrate its versatility across synthetic data generation and policy learning, and validate fidelity against real-world experiments. Fine-tuning SAM3 on DeformX-generated data yields a 10.2% relative mAP@75 improvement in real-image wire segmentation (0.157 to 0.173), and a rope-swinging policy trained entirely in DeformX achieves a mean target-hitting error of 6.6 cm on a UR5e manipulator in the real world.

Method

A co-simulation framework that divides labor cleanly

The Cosserat rod engine governs all DLO dynamics, self-collisions, and rod–mesh contact; Isaac Sim handles rigid bodies, robots, control, and photorealistic rendering. A multi-rate scheme keeps the two tightly and stably coupled across very different time scales.

System overview comparing DeformX with existing vision and RL simulators
Compared with existing simulators for vision (left) and RL (right), DeformX jointly achieves visual realism, physical accuracy, DLO CAD support, and robot-learning support by combining Isaac Sim and a dedicated Cosserat rod engine.

Multi-rate co-simulation

The rod engine replicates Isaac Sim's semi-implicit Euler integration to substep DLO dynamics at ~10⁻⁵ s within each ~10⁻² s Isaac step, then returns integrated impulses and wrenches for stable bidirectional coupling.
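The substepping idea can be sketched on a toy system. The snippet below is a minimal illustration, not the DeformX implementation: a single point mass on a damped spring stands in for the rod, the anchor stands in for an Isaac Sim body held fixed during one outer step, and every name and parameter value (`substep_rod`, `anchor`, the mass and stiffness) is an assumption. It uses the step sizes quoted above and accumulates the reaction impulse that would be handed back to the outer simulator.

```python
def substep_rod(x, v, anchor, mass, stiffness, damping,
                dt_outer=1e-2, dt_inner=1e-5):
    """Advance a 1-D point mass tied to `anchor` over one outer step.

    Returns the new position, the new velocity, and the impulse the
    spring-damper exerted on the anchor, integrated over all substeps.
    """
    n_sub = round(dt_outer / dt_inner)  # about 1000 substeps per outer step
    impulse_on_anchor = 0.0
    for _ in range(n_sub):
        # Force on the mass; the anchor is held fixed within the outer step.
        force = -stiffness * (x - anchor) - damping * v
        # Semi-implicit Euler: update velocity first, then position
        # using the already-updated velocity.
        v += (force / mass) * dt_inner
        x += v * dt_inner
        # The anchor feels the equal and opposite force.
        impulse_on_anchor += -force * dt_inner
    return x, v, impulse_on_anchor


# Hypothetical values: 10 g mass, stiff spring, released 1 cm from the anchor.
x1, v1, impulse = substep_rod(x=0.01, v=0.0, anchor=0.0,
                              mass=0.01, stiffness=1e4, damping=0.0)
```

Because the reaction impulse is summed over every substep, the momentum gained by the mass equals the impulse returned for the anchor with opposite sign, which keeps the exchange consistent in both directions.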

Free-form mesh contact

Building on PyElastica's penalty contact, we add closest-point queries against arbitrary meshes — accelerated by a BVH and AABB broad-phase pruning, with a repulsion margin that prevents deep penetration under large time steps.
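The two ingredients named above, broad-phase pruning and a penalty force with a repulsion margin, can be sketched as follows. This is a hedged illustration under stated assumptions: the linear penalty law, the function names, and the parameter values are not taken from DeformX or PyElastica, and the closest-point query itself (the BVH traversal) is not shown, so `closest_point` is assumed to be supplied by it.

```python
import math


def aabb_overlaps(lo_a, hi_a, lo_b, hi_b, margin=0.0):
    """Broad phase: do two axis-aligned boxes overlap once box A is
    inflated by `margin` on every side?"""
    return all(
        lo_a[i] - margin <= hi_b[i] and lo_b[i] <= hi_a[i] + margin
        for i in range(3)
    )


def penalty_force(node, radius, closest_point, stiffness, margin):
    """Narrow phase: linear penalty force on a rod node of given radius.

    `closest_point` is the nearest point on the mesh surface. The force
    switches on when the surface gap drops below `margin`, so repulsion
    starts before actual penetration. Assumes the node centre lies
    outside the mesh (no signed-distance handling).
    """
    d = [node[i] - closest_point[i] for i in range(3)]
    dist = math.sqrt(sum(c * c for c in d))
    gap = dist - radius          # distance between rod surface and mesh
    overlap = margin - gap       # positive once inside the repulsion band
    if overlap <= 0.0 or dist == 0.0:
        return [0.0, 0.0, 0.0]
    # Push along the unit vector from the mesh surface toward the node.
    return [stiffness * overlap * c / dist for c in d]


# Hypothetical numbers: a 1 cm radius rod node hovering 2 mm above a surface.
force = penalty_force((0.0, 0.0, 0.012), 0.01, (0.0, 0.0, 0.0),
                      stiffness=1000.0, margin=0.005)
```

Running the cheap box test first means the more expensive closest-point query only has to run for triangles whose boxes are near the rod segment.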

Mesh-skinned visualization

Discrete Cosserat rod deformations drive a skinned tubular mesh at every Isaac step, so high-resolution CAD assets deform in full consistency with the underlying physics, yielding CAD-quality, reusable DLO visuals.
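A minimal sketch of the skinning step follows. The page does not state the weighting scheme, so this example assumes the simplest variant: each mesh vertex is rigidly bound to a single rod segment. A smoother result would blend the transforms of neighbouring segments. The function names and frame layout (three orthonormal directors per segment, given as world-space vectors) are assumptions for illustration.

```python
def bind(vertex, seg_pos, seg_frame):
    """Express a rest-pose mesh vertex in a segment's local frame.

    `seg_frame` is a list of three orthonormal directors (world-space
    vectors); the local coordinates are the projections onto them.
    """
    rel = [vertex[i] - seg_pos[i] for i in range(3)]
    return [sum(d[i] * rel[i] for i in range(3)) for d in seg_frame]


def skin(local, seg_pos, seg_frame):
    """Map stored local coordinates back to world space using the
    segment's current position and directors."""
    return [
        seg_pos[i] + sum(local[k] * seg_frame[k][i] for k in range(3))
        for i in range(3)
    ]


# Rest pose: segment at z = 0.5 with an identity frame.
rest_frame = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
local = bind((0.01, 0.0, 0.5), (0.0, 0.0, 0.5), rest_frame)

# Deformed pose: segment moved to x = 1 and rotated 90 degrees about z.
moved_frame = [(0.0, 1.0, 0.0), (-1.0, 0.0, 0.0), (0.0, 0.0, 1.0)]
world = skin(local, (1.0, 0.0, 0.5), moved_frame)
```

Binding happens once in the rest pose; only `skin` needs to run per rendered frame, which is what lets a detailed mesh follow a much coarser rod discretization.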

Cosserat rod modeling: continuous rod, discrete rod, and skinned mesh
Cosserat rod modeling. (a) Continuous centerline and material frame; (b) discrete vertices and segment frames; (c) mesh skinned to the discrete rod.

We model a DLO as a slender elastic rod under Cosserat rod theory, capturing all deformation modes of a 1-D continuum — stretching, shearing, bending, and twisting — in a unified formulation.

Material behavior is set by physically meaningful parameters such as Young's modulus and shear modulus, linking simulation directly to real material properties and enabling principled calibration instead of heuristic joint-stiffness tuning.
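To make the link between material parameters and rod stiffness concrete, the sketch below computes the four stiffnesses of a rod with a solid circular cross-section from its radius, Young's modulus E, and shear modulus G, using the standard section properties A = πr², I = πr⁴/4, and J = πr⁴/2. The function name, the example values, and the `shear_correction` factor (left at 1.0 here) are assumptions; the page does not say which cross-section or correction factor DeformX uses.

```python
import math


def rod_stiffness(radius, youngs_modulus, shear_modulus, shear_correction=1.0):
    """Stiffnesses of a solid circular rod from physical material parameters."""
    area = math.pi * radius ** 2                 # cross-section area A
    second_moment = math.pi * radius ** 4 / 4    # bending second moment I
    polar_moment = 2.0 * second_moment           # polar moment J for a circle
    return {
        "stretch": youngs_modulus * area,                    # E A
        "shear": shear_correction * shear_modulus * area,    # k G A
        "bend": youngs_modulus * second_moment,              # E I
        "twist": shear_modulus * polar_moment,               # G J
    }


# Hypothetical soft rope: 5 mm radius, E = 1 MPa, incompressible (nu = 0.5),
# so G = E / (2 * (1 + nu)) = E / 3.
E = 1.0e6
stiffness = rod_stiffness(radius=0.005, youngs_modulus=E, shear_modulus=E / 3.0)
```

Because bending and twisting stiffness scale with the fourth power of the radius, a measured diameter and two moduli fix all four values at once, with no per-joint gains to tune.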

The engine ships as a Python module embedded in Isaac Sim's scripting environment, supporting both interactive UI workflows and headless execution.

Physical Validation

Sim matches reality — knots, twist, and free-form contact

We validate physical fidelity with two real-world experiments whose parameters come from factory material specifications, not manual tuning.

Trefoil knot under twist and a rope wrapping the Stanford bunny, sim vs real
Sim-to-real comparison. (a) A trefoil knot under continuous twist undergoes the same qualitative shape transitions in simulation and reality. (b) A flexible rope wrapping a rigid bunny exhibits multi-point contact, sliding, and self-contact — conforming to the geometry with stable, interpenetration-free behavior.
Knot & twist (video). Live capture of the trefoil knot tightening as twist accumulates — the simulated Cosserat rod reproduces the same buckling and shape transitions.
Application · Data Generation

WireSeg-32k — a wire instance segmentation dataset

Using the framework, we generate 32,000 rendered images across 300+ independent simulation runs, with per-wire instance masks, per-pixel depth, and easy / medium / hard difficulty tiers across three scenario categories: wire-on-plane, flying wires, and data center.

WireSeg-32k data generation pipeline
Generation pipeline. Randomized scenes are built from asset libraries; wires are simulated with the Cosserat rod engine and visualized as skinned meshes. RGB images, segmentation masks, and depth maps are rendered with domain randomization, then split by difficulty.
WireSeg-32k generation (video). Synthetic wire-segmentation scenes rendered across the three scene families and difficulty tiers.
Representative WireSeg-32k images across settings and difficulty tiers
Representative images across the three settings and Easy / Medium / Hard tiers, with Hard ground-truth instance masks.

Table 1. WireSeg-32k versus existing DLO datasets.

Dataset               Physics   Instance   CAD   Grounded   Images
HANDLOOM                                                    30k
FASTDLO                                                     32k
Fresnillo et al.                                            25k
Zanella et al.                                              28.5k
ISCUTE                                                      28k
WireSeg-32k (Ours)                                          32k

“Physics”: physics-based deformation during generation. “Instance”: per-object masks. “CAD”: DLOs as free-form meshes. “Grounded”: rendered in realistic image-based backgrounds.

Table 2. Fine-tuning SAM3 on WireSeg-32k — F1@75 and COCO-style mAP@75 (higher is better) across difficulty tiers, the full synthetic set, and a held-out real test set.

Model               Hard (Syn)         Medium (Syn)       Easy (Syn)         Total (Syn)        Total (Real)
                    F1@75    mAP@75    F1@75    mAP@75    F1@75    mAP@75    F1@75    mAP@75    F1@75    mAP@75
SAM3 (Base)         0.179    0.066     0.446    0.310     0.803    0.735     0.409    0.290     0.296    0.157
SAM3 + LoRA         0.225    0.102     0.512    0.404     0.850    0.816     0.465    0.365     0.314    0.173
Relative Δ vs. Base +25.7%   +54.5%    +14.8%   +30.3%    +5.9%    +11.0%    +13.7%   +25.7%    +6.1%    +10.2%
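The percentages in the last row are relative gains over the base model, not absolute differences in score. A short sketch of that arithmetic, applied to the real-test-set columns (the function name is an assumption):

```python
def relative_gain(base, tuned):
    """Relative improvement of `tuned` over `base`, in percent."""
    return 100.0 * (tuned - base) / base


# Total (Real) columns of Table 2.
map_gain = relative_gain(0.157, 0.173)   # mAP@75
f1_gain = relative_gain(0.296, 0.314)    # F1@75
print(f"mAP@75: +{map_gain:.1f}%  F1@75: +{f1_gain:.1f}%")
# prints "mAP@75: +10.2%  F1@75: +6.1%"
```

Since the table entries are rounded to three decimals, percentages recomputed from them can differ from the printed ones in the last digit.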

Off-the-shelf SAM3 runs in text-prompt mode with the prompt "cable". LoRA adapters (rank 16, α=32) are inserted into the attention projections; 5 epochs, lr 1×10⁻⁵, batch size 16. A complementary set of 300 in-the-wild real images with manual annotations is used for the real-world evaluation.
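For readers unfamiliar with LoRA, the sketch below shows what a rank-16, α=32 adapter does to one projection. It follows the standard LoRA formulation, in which the frozen weight W is augmented by a low-rank product scaled by α/r, here 32/16 = 2. It is a pure-Python toy on tiny matrices, not the training code used for SAM3, and the names are assumptions.

```python
def matvec(matrix, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in matrix]


def lora_forward(W, A, B, x, rank=16, alpha=32):
    """y = W x + (alpha / rank) * B (A x).

    W: frozen weight, shape (d_out, d_in)
    A: trainable down-projection, shape (rank, d_in)
    B: trainable up-projection, shape (d_out, rank)
    """
    scale = alpha / rank
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    return [b + scale * u for b, u in zip(base, update)]


# Toy 2-D projection with a rank-16 adapter.
W = [[1.0, 0.0], [0.0, 1.0]]
A = [[1.0, 0.0] for _ in range(16)]
B_zero = [[0.0] * 16, [0.0] * 16]          # standard initialisation: B = 0
B_trained = [[1.0 / 16] * 16, [0.0] * 16]  # hypothetical values after training

x = [3.0, 4.0]
y_init = lora_forward(W, A, B_zero, x)      # identical to the base model
y_tuned = lora_forward(W, A, B_trained, x)  # base output plus scaled update
```

With B initialised to zero the adapted model starts out identical to the base model, so fine-tuning departs from the off-the-shelf behaviour gradually while W itself stays frozen.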

Open Release

Download WireSeg-32k

Per-wire instance masks and per-pixel depth, generated entirely in DeformX.

Rendered images: 32,000
Labels: Instance masks + depth
Sim runs: 300+ independent
Real test set: 300 annotated images

Scene families

Wire-on-plane · Flying wires · Data center

Difficulty tiers

Easy · Medium · Hard

Difficulty is assigned per image from wire count, occlusion, and clutter — independent of the scene family.

Application · Robot Learning

Dynamic rope swinging that transfers to a real UR5e

Dynamic “whipping” amplifies modeling error: small inaccuracies in bending, torsion, and contact cause large tip-trajectory deviations. We use a planar hit-target rope-swinging benchmark to stress-test sim-to-real transfer, training PPO with everything fixed except the DLO backend.

Robot-driven rope calibration and trajectory error over time
Parameter calibration. Rope parameters are fit by replaying a motion-captured sinusoidal end-effector trajectory, minimizing per-marker error so both simulators start from a fair, calibrated baseline.
Calibration (video). Replaying the mocap trajectory in DeformX — the simulated rope tracks the real motion that the calibration is fit against.
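The calibration objective described above can be sketched as a per-marker error plus a search over candidate parameters. The page does not name the optimizer, so the exhaustive search here is an assumption, as are the function names; `simulate` is a hypothetical callable that replays the recorded end-effector trajectory for one parameter value and returns marker positions per frame.

```python
import math


def mean_marker_error(sim_frames, real_frames):
    """Average Euclidean distance between simulated and motion-captured
    markers, taken over all frames and all markers."""
    total, count = 0.0, 0
    for sim_markers, real_markers in zip(sim_frames, real_frames):
        for s, r in zip(sim_markers, real_markers):
            total += math.dist(s, r)
            count += 1
    return total / count


def calibrate(candidates, simulate, real_frames):
    """Return the candidate parameter whose replay best matches the mocap data."""
    return min(candidates,
               key=lambda p: mean_marker_error(simulate(p), real_frames))


# Toy stand-in for a simulator: one marker whose x position scales with k.
def toy_simulate(k):
    return [[(k * t, 0.0)] for t in range(3)]


real = toy_simulate(2.0)                       # pretend mocap data
best = calibrate([1.0, 2.0, 3.0], toy_simulate, real)
```

Applying the same objective and the same recorded trajectory to both simulators is what makes the later comparison fair: each backend gets its best-fitting parameters before any policy is trained.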
Goal-conditioned rope swinging in sim and on the real UR5e
Goal-conditioned rope swinging. The learned policy reaches the star-marked goal in simulation (left) and on the real UR5e (right). Colored curves show the DLO configuration over time.
Dynamic rope swinging (video). Cosserat rod dynamics under fast whipping motion — the regime where small modeling errors blow up tip trajectories.

Table 3. Hit-target results — minimum tip-to-goal distance dmin (cm). Real-world results are mean ± std over n = 10 executions per goal.

Target Point     Method            Sim dmin    Real (n=10) dmin
(0, 200, 230)    Baseline          4.9         15.1 ± 6.1
                 DeformX (Ours)    4.2         6.6 ± 4.7
(0, 200, 150)    Baseline          4.4         25.9 ± 8.9
                 DeformX (Ours)    1.4         7.3 ± 1.2
(0, 170, 50)     Baseline          4.3         30.4 ± 14.3
                 DeformX (Ours)    3.3         5.8 ± 3.2

Both simulators achieve low in-sim error, but only DeformX transfers reliably to the real robot — evidence that physically accurate bending, torsion, and contact are critical for sim-to-real transfer in dynamic rope swinging.
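The metric in Table 3 can be sketched as follows: dmin is the smallest distance between the rope tip and the goal over one execution, and the real-world column summarizes it over repeated runs. The function names and sample numbers are assumptions, and the use of the sample standard deviation is also an assumption, since the page does not say which estimator was used.

```python
import math
import statistics


def min_tip_distance(tip_trajectory, goal):
    """Smallest tip-to-goal distance over one execution (same units as input)."""
    return min(math.dist(p, goal) for p in tip_trajectory)


def summarize(d_min_values):
    """Mean and sample standard deviation over repeated executions."""
    return statistics.mean(d_min_values), statistics.stdev(d_min_values)


# Toy trajectory in cm: the tip passes 3 cm from the goal at its closest.
trajectory = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
d_min = min_tip_distance(trajectory, goal=(3.0, 0.0))

# Hypothetical dmin values from three executions toward the same goal.
mean_cm, std_cm = summarize([5.0, 7.0, 9.0])
```

Taking the minimum over the whole swing scores a run by its closest approach, so a policy is not penalized for where the tip travels before or after passing the target.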

Real-world hit-target (video). A goal-conditioned policy trained entirely in DeformX swings the rope to strike the target on a real UR5e — zero-shot sim-to-real, no real-world fine-tuning.
Citation

BibTeX

@misc{yang2026deformx,
  title  = {DeformX: A Versatile Co-Simulation Framework for
            Deformable Linear Objects},
  author = {Yang, Yi and Fei, Xiang and Wang, Lehong and Li, Chenhao
            and Dai, Zilin and Kou, Henry and Li, Lu and Choset, Howie},
  note   = {Preprint. Under review},
  year   = {2026}
}