IROS 2026 Accepted · Deformable Linear Objects · Cosserat Rods · Sim-to-Real

DeformX

A Versatile Co-Simulation Framework for Deformable Linear Objects

Yi Yang¹,³,⁴,*, Xiang Fei¹,*, Lehong Wang¹,*, Chenhao Li², Zilin Dai⁵, Henry Kou¹, Lu Li¹, Howie Choset¹

¹The Robotics Institute, Carnegie Mellon University · ²Department of Mechanical Engineering, CMU · ³School of Ocean and Civil Engineering, SJTU · ⁴Zhiyuan College, SJTU · ⁵Harvard University
* Equal contribution

Paper · Dataset · Code (released upon publication)

36,000

Synthetic Images

WireSeg-36k · depth + instance masks

+10.2% mAP@75

Segmentation Gain

SAM3 + LoRA, real held-out test set

6.6 cm

Real-Robot Error

UR5e rope-swing — vs. 15.1 cm baseline

Real-World Highlight

Hit-apple dynamic striking demonstration on a real UR5e

Qualitative dynamic-striking demo: a UR5e swings a rope to knock an apple from a person’s head, extending the hit-target pipeline beyond the quantitative benchmark.

Hit-apple dynamic striking demonstration. Real-world trials of a UR5e whipping a rope to strike the apple, showing the qualitative dynamic-manipulation behavior reported in the paper’s Fig. 11.

Abstract

Visually realistic and physically faithful DLO simulation

DeformX framework overview — sim/real fidelity, Isaac Sim × Cosserat rod co-simulation, and robot learning

Deformable linear objects (DLOs) such as wires, cables, and ropes are common in robotic manipulation tasks, yet simulating them with both visual realism and physical accuracy remains challenging. Existing visual methods rely on procedural geometric primitives that lack physically grounded deformation, while physics-based approaches often approximate DLOs as rigid-link chains or generic soft bodies — failing to capture the bending, twisting, and shear mechanics of slender elastic structures.

We introduce DeformX, a co-simulation framework that integrates a dedicated Cosserat rod physics engine with NVIDIA Isaac Sim, enabling DLO simulations that are both physically faithful and visually realistic. The Cosserat rod engine simulates DLO dynamics and self-collisions, as well as contact with arbitrary free-form meshes. For high-fidelity visualization, we employ mesh skinning to map discrete rod deformations onto imported CAD models. To our knowledge, DeformX is the first framework to unify realistic visualization, principled physics, and robot-learning compatibility.

We demonstrate its versatility across synthetic data generation and policy learning, and validate fidelity against real-world experiments. Fine-tuning SAM3 on DeformX-generated data yields a 10.2% relative mAP@75 improvement (0.157 → 0.173) in real-image wire segmentation, and a rope-swinging policy trained entirely in DeformX achieves a mean target-hitting error of 6.6 cm on a UR5e manipulator in the real world.

Method

A co-simulation framework that divides labor cleanly

The Cosserat rod engine governs all DLO dynamics, self-collisions, and rod–mesh contact; Isaac Sim handles rigid bodies, robots, control, and photorealistic rendering. A multi-rate scheme keeps the two tightly and stably coupled across very different time scales.

System overview comparing DeformX with existing vision and RL simulators

Multi-rate co-simulation

The rod engine replicates Isaac Sim's semi-implicit Euler integration to substep DLO dynamics at ~10⁻⁵ s within each ~10⁻² s Isaac step, then returns integrated impulses and wrenches for stable bidirectional coupling.
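The substepping idea can be illustrated with a toy model. The sketch below is not the DeformX engine: it replaces the rod with a single spring-mass node, and the function name `cosim_step` and all of its parameters are hypothetical. It only shows the pattern described above, in which many semi-implicit Euler substeps run inside one outer step and the reaction impulse is accumulated so it can be handed back to the outer simulator.

```python
def cosim_step(x, v, mass, stiffness, anchor, dt_outer=1e-2, dt_inner=1e-5):
    """Advance a 1-D spring-mass node (a stand-in for the rod) by one outer step.

    Returns the new state, the impulse applied to the anchor integrated over
    all substeps, and the number of substeps taken.
    """
    n_sub = int(round(dt_outer / dt_inner))      # about 1000 substeps per outer step
    impulse_on_anchor = 0.0
    for _ in range(n_sub):
        force = -stiffness * (x - anchor)        # spring force acting on the node
        v = v + dt_inner * force / mass          # semi-implicit Euler: velocity first
        x = x + dt_inner * v                     # then position, using the new velocity
        impulse_on_anchor += -force * dt_inner   # reaction on the anchor, accumulated
    return x, v, impulse_on_anchor, n_sub


# Hypothetical values, chosen only to exercise the function.
x1, v1, impulse, n_sub = cosim_step(x=0.01, v=0.0, mass=0.05, stiffness=200.0, anchor=0.0)
```

Returning an integrated impulse rather than the last instantaneous force means the outer simulator receives the net momentum exchanged over the whole outer step, which is one plausible reading of why the coupling stays stable across the two time scales.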

Free-form mesh contact

Building on PyElastica's penalty contact, we add closest-point queries against arbitrary meshes — accelerated by a BVH and AABB broad-phase pruning, with a repulsion margin that prevents deep penetration under large time steps.
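A minimal sketch of the contact query follows, under stated assumptions. It uses a flat per-triangle AABB test in place of a real BVH, the standard closest-point-on-triangle routine from Ericson's Real-Time Collision Detection, and a linear penalty force. The names `contact_force`, `k`, and `margin` are hypothetical and are not taken from DeformX or PyElastica.

```python
import numpy as np

def closest_point_on_triangle(p, a, b, c):
    """Closest point to p on triangle abc, by Voronoi-region tests."""
    ab, ac, ap = b - a, c - a, p - a
    d1, d2 = ab @ ap, ac @ ap
    if d1 <= 0 and d2 <= 0:
        return a                                   # vertex region A
    bp = p - b
    d3, d4 = ab @ bp, ac @ bp
    if d3 >= 0 and d4 <= d3:
        return b                                   # vertex region B
    vc = d1 * d4 - d3 * d2
    if vc <= 0 and d1 >= 0 and d3 <= 0:
        return a + (d1 / (d1 - d3)) * ab           # edge AB
    cp = p - c
    d5, d6 = ab @ cp, ac @ cp
    if d6 >= 0 and d5 <= d6:
        return c                                   # vertex region C
    vb = d5 * d2 - d1 * d6
    if vb <= 0 and d2 >= 0 and d6 <= 0:
        return a + (d2 / (d2 - d6)) * ac           # edge AC
    va = d3 * d6 - d5 * d4
    if va <= 0 and (d4 - d3) >= 0 and (d5 - d6) >= 0:
        w = (d4 - d3) / ((d4 - d3) + (d5 - d6))
        return b + w * (c - b)                     # edge BC
    denom = 1.0 / (va + vb + vc)
    return a + (vb * denom) * ab + (vc * denom) * ac   # interior of the face

def contact_force(p, radius, tris, k=1e4, margin=0.002):
    """Penalty force on a rod node at p from a triangle mesh of shape (T, 3, 3)."""
    reach = radius + margin                        # repulsion starts before touching
    lo = tris.min(axis=1) - reach                  # inflated per-triangle AABBs
    hi = tris.max(axis=1) + reach
    candidates = np.nonzero(np.all((p >= lo) & (p <= hi), axis=1))[0]  # broad phase
    force = np.zeros(3)
    for i in candidates:                           # narrow phase on survivors only
        d = p - closest_point_on_triangle(p, *tris[i])
        dist = np.linalg.norm(d)
        if 1e-12 < dist < reach:
            force += k * (reach - dist) * d / dist
    return force, candidates
```

Summing over every candidate triangle can double-count contact near shared edges, so a production implementation would likely keep only the nearest feature. The sketch keeps the sum for brevity.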

Mesh-skinned visualization

Discrete Cosserat rod deformations drive a skinned tubular mesh at every Isaac step, so high-resolution CAD assets deform consistently with the underlying physics and serve as reusable, CAD-quality DLO visuals.
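To make the skinning step concrete, here is a minimal sketch that assumes the simplest possible binding: each mesh vertex follows its single nearest rod node rigidly. This is a special case of linear blend skinning and not necessarily the weighting DeformX uses. The functions `bind` and `skin` are hypothetical, and each frame is assumed to be a 3×3 rotation matrix whose columns are the node's director axes in world coordinates.

```python
import numpy as np

def bind(vertices, rest_pos, rest_frames):
    """Attach each vertex to its nearest rod node and store its local offset."""
    dists = np.linalg.norm(vertices[:, None] - rest_pos[None], axis=2)   # (V, N)
    idx = np.argmin(dists, axis=1)                                       # nearest node
    # local = R_rest^T (v - x_rest): the offset expressed in the node's own frame
    local = np.einsum("nij,nj->ni",
                      rest_frames[idx].transpose(0, 2, 1),
                      vertices - rest_pos[idx])
    return idx, local

def skin(idx, local, pos, frames):
    """Re-pose the vertices from the current node positions and frames."""
    return pos[idx] + np.einsum("nij,nj->ni", frames[idx], local)


# Two-node rod along x, then rotated 90 degrees about z (hypothetical data).
rest_pos = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
rest_frames = np.stack([np.eye(3), np.eye(3)])
vertices = np.array([[0.0, 0.1, 0.0], [1.0, 0.0, 0.1]])
idx, local = bind(vertices, rest_pos, rest_frames)

rot_z = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
posed = skin(idx, local, np.array([[0.0, 0.0, 0.0], [0.0, 1.0, 0.0]]),
             np.stack([rot_z, rot_z]))
```

Binding happens once per asset, while `skin` is cheap enough to run at every render step. Blending between neighboring nodes would smooth the mesh across segment boundaries.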

Cosserat rod modeling: continuous rod, discrete rod, and skinned mesh

We model a DLO as a slender elastic rod under Cosserat rod theory, capturing all deformation modes of a 1-D continuum — stretching, shearing, bending, and twisting — in a unified formulation.

Material behavior is set by physically meaningful parameters such as Young's modulus and shear modulus, linking simulation directly to real material properties and enabling principled calibration instead of heuristic joint-stiffness tuning.
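As an illustration of how material constants map to rod stiffnesses, the sketch below computes the four standard rigidities of a homogeneous, isotropic rod with a solid circular cross-section. The function name and the `shear_correction` parameter are assumptions made for this example; the relation G = E / (2(1 + ν)) and the cross-section formulas are textbook results rather than DeformX-specific choices.

```python
import math

def rod_stiffness(youngs_modulus, poisson_ratio, radius, shear_correction=1.0):
    """Rigidities of a solid circular rod from physical material parameters."""
    area = math.pi * radius**2
    second_moment = math.pi * radius**4 / 4.0     # I, bending about a diameter
    polar_moment = 2.0 * second_moment            # J = pi r^4 / 2, twisting
    shear_modulus = youngs_modulus / (2.0 * (1.0 + poisson_ratio))  # isotropic G
    return {
        "stretch": youngs_modulus * area,                   # EA    [N]
        "shear": shear_correction * shear_modulus * area,   # a*G*A [N]
        "bend": youngs_modulus * second_moment,             # EI    [N m^2]
        "twist": shear_modulus * polar_moment,              # GJ    [N m^2]
    }


# Hypothetical rubber-like rope: E = 1 MPa, nu = 0.5, radius = 5 mm.
k = rod_stiffness(youngs_modulus=1e6, poisson_ratio=0.5, radius=0.005)
```

Because bending and twisting rigidity scale with the fourth power of the radius, a measured diameter together with a datasheet modulus pins down most of the rod's behavior, which is what makes calibration from material specifications plausible.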

The engine ships as a Python module embedded in Isaac Sim's scripting environment, supporting both interactive UI workflows and headless execution.

Physical Validation

Sim matches reality — knots, twist, and free-form contact

We validate physical fidelity with two real-world experiments whose parameters come from factory material specifications, not manual tuning.

Trefoil knot under twist and a rope wrapping the Stanford bunny, sim vs real

Knot & twist (video). Live capture of the trefoil knot tightening as twist accumulates — the simulated Cosserat rod reproduces the same buckling and shape transitions.

Application · Data Generation

WireSeg-36k — a wire instance segmentation dataset

Using the framework, we generate 36,000 rendered images across 720 independent simulation runs, with per-wire instance masks, per-pixel depth, and easy / medium / hard difficulty tiers across three scenario categories: wire-on-plane, flying wires, and data center.

WireSeg-36k generation (video). Synthetic wire-segmentation scenes rendered across the three scene families and difficulty tiers.

Representative WireSeg-36k images across settings and difficulty tiers

Table 1. WireSeg-36k versus existing DLO datasets.

Dataset	Physics	Instance	CAD	Grounded	Images
HANDLOOM	✗	✗	✗	✗	30k
FASTDLO	✗	✓	✗	✗	32k
Fresnillo et al.	✓	✗	✗	✓	25k
Zanella et al.	✗	✗	✗	✗	28.5k
ISCUTE	✗	✓	✓	✓	28k
WireSeg-36k (Ours)	✓	✓	✓	✓	36k

“Physics”: physics-based deformation during generation. “Instance”: per-object masks. “CAD”: DLOs as free-form meshes. “Grounded”: rendered in realistic image-based backgrounds.

Table 2. Fine-tuning SAM3 on WireSeg-36k: F1@75, COCO-style mAP@75, and mean per-image Jaccard J (higher is better) across difficulty tiers, the full synthetic set, and a held-out real test set. The Δ row reports the relative change of SAM3 + LoRA over the base model.

Model	Hard (Syn)			Medium (Syn)			Easy (Syn)			Total (Syn)			Total (Real)
Model	F1@75	mAP@75	J	F1@75	mAP@75	J	F1@75	mAP@75	J	F1@75	mAP@75	J	F1@75	mAP@75	J
SAM3 (Base)	0.169	0.052	0.467	0.454	0.291	0.676	0.818	0.783	0.848	0.456	0.328	0.679	0.296	0.157	0.486
SAM3 + LoRA	0.191	0.068	0.518	0.517	0.374	0.738	0.855	0.837	0.879	0.501	0.389	0.728	0.314	0.173	0.496
Δ (LoRA − Base)	+13.4%	+31.1%	+10.9%	+13.9%	+28.8%	+9.2%	+4.5%	+6.8%	+3.7%	+9.8%	+18.4%	+7.2%	+6.1%	+10.2%	+2.1%
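The Δ row is consistent with a relative change, (LoRA − Base) / Base. The short check below recomputes it for the real test set from the rounded table entries; the helper name `relative_gain` is introduced here for illustration only.

```python
def relative_gain(base, tuned):
    """Relative change of the tuned score over the base score, in percent."""
    return 100.0 * (tuned - base) / base

# (base, LoRA) pairs copied from the "Total (Real)" columns of Table 2.
real = {"F1@75": (0.296, 0.314), "mAP@75": (0.157, 0.173), "J": (0.486, 0.496)}
gains = {name: round(relative_gain(b, t), 1) for name, (b, t) in real.items()}
# gains == {"F1@75": 6.1, "mAP@75": 10.2, "J": 2.1}
```

For some synthetic columns, recomputing from the rounded entries gives slightly different percentages, presumably because the published Δ values were derived from unrounded scores.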

Off-the-shelf SAM3 runs in text-prompt mode with the prompt "cable". LoRA adapters are inserted into the attention projections (q, k, v; rank 16, α=32, dropout 0.05); 5 epochs, lr 1×10⁻⁵, effective batch size 16. A complementary set of 300 in-the-wild real images with manual annotations is used for the real-world evaluation.
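For readers unfamiliar with LoRA, the sketch below shows what an adapter with rank 16 and α = 32 does to a single projection: the frozen weight W is augmented by a low-rank update scaled by α / r. This is a generic NumPy illustration, not the SAM3 fine-tuning code. The class name `LoRALinear` is hypothetical, and dropout and the training loop are omitted.

```python
import numpy as np

class LoRALinear:
    """Frozen linear layer plus a trainable low-rank update (alpha / r) * B A."""

    def __init__(self, weight, rank=16, alpha=32.0, seed=0):
        rng = np.random.default_rng(seed)
        out_dim, in_dim = weight.shape
        self.weight = weight                                 # frozen pretrained weight
        self.A = rng.normal(0.0, 0.01, size=(rank, in_dim))  # trainable, small random init
        self.B = np.zeros((out_dim, rank))                   # trainable, zero init
        self.scale = alpha / rank                            # 32 / 16 = 2.0 by default

    def __call__(self, x):
        base = x @ self.weight.T
        update = (x @ self.A.T) @ self.B.T                   # rank-r correction
        return base + self.scale * update

    def trainable_parameters(self):
        return self.A.size + self.B.size


# Hypothetical 256-dimensional projection, standing in for a q, k, or v matrix.
layer = LoRALinear(np.eye(256))
x = np.ones((1, 256))
y = layer(x)   # equals the frozen output until B is trained away from zero
```

With these settings the adapter adds 16 × (256 + 256) = 8,192 trainable parameters against 65,536 frozen ones in this toy layer, which is why LoRA fine-tuning is cheap relative to full fine-tuning.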

Open Release

Download WireSeg-36k

Per-wire instance masks and per-pixel depth, generated entirely in DeformX.

Rendered images: 36,000

Labels: Instance masks + depth

Sim runs: 720 independent

Real test set: 300 annotated images

Download Dataset · Dataset Card

Scene families

Wire-on-plane · Flying wires · Data center

Difficulty tiers

Easy · Medium · Hard

Difficulty is assigned per image using the off-the-shelf SAM3 baseline Jaccard score: Hard (J < 0.6), Medium (0.6 ≤ J < 0.8), and Easy (J ≥ 0.8).
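The tiering rule can be written down directly. In the sketch below the thresholds come from the text above, while the function names and the convention of returning J = 1 when both masks are empty are assumptions made for illustration.

```python
import numpy as np

def jaccard(pred_mask, gt_mask):
    """Intersection over union of two binary masks."""
    pred = np.asarray(pred_mask, dtype=bool)
    gt = np.asarray(gt_mask, dtype=bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        return 1.0                       # assumption: two empty masks agree perfectly
    return float(np.logical_and(pred, gt).sum() / union)

def difficulty_tier(j):
    """Map a baseline Jaccard score to the dataset's difficulty tier."""
    if j < 0.6:
        return "Hard"
    if j < 0.8:
        return "Medium"
    return "Easy"


score = jaccard([[1, 1], [0, 0]], [[1, 0], [0, 0]])   # 1 shared pixel of 2 -> 0.5
tier = difficulty_tier(score)                          # "Hard"
```

Defining difficulty by baseline performance keeps the tiers tied to what the off-the-shelf model actually finds hard, at the cost of making the split specific to that baseline.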

Application · Robot Learning

Dynamic rope swinging that transfers to a real UR5e

Dynamic “whipping” amplifies modeling error: small inaccuracies in bending, torsion, and contact cause large tip-trajectory deviations. We use a planar hit-target rope-swinging benchmark to stress-test sim-to-real transfer, training PPO with everything fixed except the DLO backend.

Robot-driven rope calibration and trajectory error over time

Calibration (video). Replaying the mocap trajectory in DeformX — the simulated rope tracks the real motion that the calibration is fit against.

Goal-conditioned rope swinging in sim and on the real UR5e

Dynamic rope swinging (video). Cosserat rod dynamics under fast whipping motion — the regime where small modeling errors blow up tip trajectories.

Table 3. Hit-target results — minimum tip-to-goal distance d_min (cm). Real-world results are mean ± std over n = 10 executions per goal.

Target Point	Method	Sim d_min	Real (n=10) d_min
(0, 200, 230)	Baseline	4.9	15.1 ± 6.1
(0, 200, 230)	DeformX (Ours)	4.2	6.6 ± 4.7
(0, 200, 150)	Baseline	4.4	25.9 ± 8.9
(0, 200, 150)	DeformX (Ours)	1.4	7.3 ± 1.2
(0, 170, 50)	Baseline	4.3	30.4 ± 14.3
(0, 170, 50)	DeformX (Ours)	3.3	5.8 ± 3.2

Both simulators achieve low in-sim error, but only DeformX transfers reliably to the real robot — evidence that physically accurate bending, torsion, and contact are critical for sim-to-real transfer in dynamic rope swinging.
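The d_min metric in Table 3 is the smallest distance between the rope tip and the goal over an execution. A minimal sketch is given below, assuming the tip is available as a sampled trajectory of 3-D positions in the same units as the goal. The function name and the sample data are hypothetical.

```python
import numpy as np

def d_min(tip_positions, goal):
    """Minimum tip-to-goal distance over a sampled trajectory of shape (T, 3)."""
    tip = np.asarray(tip_positions, dtype=float)
    distances = np.linalg.norm(tip - np.asarray(goal, dtype=float), axis=1)
    return float(distances.min())


# Hypothetical three-sample trajectory, for illustration only.
trajectory = [(0.0, 0.0, 0.0), (0.0, 3.0, 4.0), (0.0, 10.0, 0.0)]
closest = d_min(trajectory, goal=(0.0, 0.0, 4.0))   # 3.0, reached at the second sample
```

Because the minimum is taken over discrete samples, a fast-moving tip can pass closer to the goal between frames than the reported value suggests, so the capture rate sets a floor on the metric's resolution.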

Citation

BibTeX

@inproceedings{yang2026deformx,
  title        = {DeformX: A Versatile Co-Simulation Framework for
                  Deformable Linear Objects},
  author       = {Yang, Yi and Fei, Xiang and Wang, Lehong and Li, Chenhao
                  and Dai, Zilin and Kou, Henry and Li, Lu and Choset, Howie},
  booktitle    = {2026 IEEE/RSJ International Conference on Intelligent
                  Robots and Systems (IROS)},
  year         = {2026},
  organization = {IEEE}
}