Route A — MRI-to-iUS Registration & Pseudo-label Generation
Pipeline: MRI tumor annotation → rigid registration to iUS space → pseudo-label generation → nnU-Net training
Registration was iterated three times (V1 → V2 → V3), with each version addressing specific failure modes identified in the prior run. The final dataset includes rescue passes for initially non-converging cases.
§1 Registration Iteration Summary
Table 1. Per-version registration statistics
| Version | n | Patients | Converged | Rate | Trans. median | Trans. IQR | Trans. range | Rot. median | MI impr. median | Time median | Key change |
|---|---|---|---|---|---|---|---|---|---|---|---|
| V1 | 204 | All (incl. repeat surgery) | 14 | 6.9% | 14.8 mm | 11.1–18.5 | 0.5–28.2 | 5.5° | 0.0445 | 19.9 s | Baseline: 1k iter, no mask |
| V2 | 114 | First-surgery only | 64 | 56.1% | 7.0 mm | 4.5–12.4 | 0.5–38.9 | 7.3° | 0.0231 | 27.2 s | +brain mask, 3k iter, 5 mm dil. |
| V3 | 114 | First-surgery only | 84 | 73.7% | 6.2 mm | 4.1–8.8 | 0.4–32.3 | 6.2° | 0.0229 | 18.6 s | 2 mm dil. (A/B tested vs V2) |
| Final | 114 | After QC exclusion | 84 | 73.7% | — | — | — | — | — | — | 30 timepoints excluded (metric + visual QC) |
V1 → V2 changes: (1) restricted to first-surgery patients (n reduced from 204 to 114); (2) added brain mask to focus MI computation; (3) increased max iterations from 1,000 to 3,000; (4) added 5 mm mask dilation.
V2 → V3 change: Reduced mask dilation from 5 mm to 2 mm. All other parameters held constant.
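The effect of the dilation radius can be pictured with a minimal sketch of physical-unit mask dilation. The source does not state how the dilation was implemented, so the function name `dilate_mask_mm`, the 2D toy grid, and the 1 mm isotropic spacing are all assumptions made for illustration; the real pipeline presumably operates on 3D brain masks.

```python
def dilate_mask_mm(mask, spacing_mm, radius_mm):
    """Dilate a 2D binary mask (list of rows) by a physical radius in mm.

    A pixel is set in the output if any foreground pixel lies within
    radius_mm of it, measured using spacing_mm = (row spacing, col spacing).
    Hypothetical helper, not the pipeline's actual implementation.
    """
    rows, cols = len(mask), len(mask[0])
    # Largest index offset on each axis that can still fall inside the radius.
    reach_r = int(radius_mm // spacing_mm[0])
    reach_c = int(radius_mm // spacing_mm[1])
    # Offsets of the structuring element, tested in physical (mm) units.
    offsets = [
        (dr, dc)
        for dr in range(-reach_r, reach_r + 1)
        for dc in range(-reach_c, reach_c + 1)
        if (dr * spacing_mm[0]) ** 2 + (dc * spacing_mm[1]) ** 2 <= radius_mm ** 2
    ]
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if mask[r][c]:
                for dr, dc in offsets:
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        out[rr][cc] = 1
    return out


# Toy example: a single foreground pixel on a 15x15 grid, 1 mm spacing.
mask = [[0] * 15 for _ in range(15)]
mask[7][7] = 1
area_5mm = sum(map(sum, dilate_mask_mm(mask, (1.0, 1.0), 5.0)))
area_2mm = sum(map(sum, dilate_mask_mm(mask, (1.0, 1.0), 2.0)))
print(area_2mm, area_5mm)
```

On this toy grid the 5 mm radius covers roughly six times as many pixels as the 2 mm radius, which illustrates how much extra non-brain signal a wider margin can admit into the MI computation.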
Fig. 1. Convergence rate progression
Fig. 1a (left): Optimizer convergence rate across registration versions. The V1 baseline (all patients, 1k iterations, no mask) converged in only 6.9% of cases. Adding a brain mask and raising the iteration limit (V2) increased this to 56.1%. Reducing dilation from 5 mm to 2 mm (V3) reached 77.2%. After QC review, 30 timepoints were excluded (metric failures plus clearly wrong pseudo-labels), leaving 84/114 valid (73.7%); Table 1 lists this post-QC figure in both the V3 and Final rows. Fig. 1b (right): Translation magnitude distribution (Q1 / median / Q3) per version. Lower values indicate tighter MRI-iUS alignment. Compared with V1, V3 shows both a lower median (6.2 mm vs 14.8 mm) and a tighter IQR (4.7 mm span vs 7.4 mm).
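For readers unfamiliar with the translation and rotation magnitudes plotted here, the sketch below shows one common way to derive them from a 4x4 rigid transform. This is an assumption about the metric definition, not a description of the registration script: the function name `rigid_magnitudes` is hypothetical, and the translation norm depends on the chosen center of rotation, which the source does not specify.

```python
import math


def rigid_magnitudes(matrix):
    """Return (translation in mm, rotation angle in degrees) of a 4x4 rigid matrix."""
    # The translation vector is the last column of the top three rows.
    tx, ty, tz = matrix[0][3], matrix[1][3], matrix[2][3]
    translation_mm = math.sqrt(tx * tx + ty * ty + tz * tz)
    # For a 3D rotation matrix R, trace(R) = 1 + 2*cos(theta).
    trace = matrix[0][0] + matrix[1][1] + matrix[2][2]
    cos_theta = max(-1.0, min(1.0, (trace - 1.0) / 2.0))  # clamp against round-off
    return translation_mm, math.degrees(math.acos(cos_theta))


# Toy transform: 6.2 degree rotation about z plus a (3, 4, 0) mm shift.
a = math.radians(6.2)
T = [
    [math.cos(a), -math.sin(a), 0.0, 3.0],
    [math.sin(a), math.cos(a), 0.0, 4.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]
trans_mm, rot_deg = rigid_magnitudes(T)
print(round(trans_mm, 3), round(rot_deg, 3))
```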
§2 Mask Dilation Comparison: V2 (5 mm) vs V3 (2 mm)
V2 and V3 used the same 114 timepoints (62 first-surgery patients), the same optimizer, and the same iteration limit. The only parameter that changed was the mask dilation radius, so this is a controlled single-variable comparison.
Table 2. Paired comparison (same 114 timepoints)
| Metric | V2 (5 mm dilation) | V3 (2 mm dilation) | Δ |
|---|---|---|---|
| Convergence rate | 56.1% (64/114) | 73.7% (84/114) | +17.6 pp |
| Translation median | 7.0 mm | 6.2 mm | −0.8 mm |
| Translation IQR | 4.5–12.4 mm | 4.1–8.8 mm | Tighter |
| MI improvement median | 0.0231 | 0.0229 | ≈0 (n.s.) |
| Median runtime | 27.2 s | 18.6 s | −8.6 s |
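Note on p-value: A paired Wilcoxon signed-rank test on MI improvement yielded p ≈ 0.007 (reported from registration script logs; see route_a/logs/). This value should be independently verified. It also sits uneasily with the "n.s." label in Table 2: a paired test can reach significance even when the two medians are nearly identical, so the table entry and the logged p-value should be reconciled.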
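One way to re-check the logged p-value is to rerun the paired test on the per-timepoint MI improvements. The sketch below is a stdlib-only approximation of the two-sided Wilcoxon signed-rank test (normal approximation, no tie or continuity correction), so it will not match an exact implementation digit for digit. The function name and the input values are made up for illustration; they are not the study's data.

```python
import math


def wilcoxon_signed_rank(x, y):
    """Two-sided paired Wilcoxon signed-rank test, normal approximation.

    Returns (W, p), where W is the smaller of the positive and negative
    rank sums. Zero differences are dropped and tied absolute differences
    receive average ranks. The p-value is approximate.
    """
    diffs = [b - a for a, b in zip(x, y) if b != a]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over the run of equal absolute differences.
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    w = min(w_plus, w_minus)
    mean = n * (n + 1) / 4.0
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24.0)
    z = (w - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided standard normal tail
    return w, p


# Illustrative values only (MI improvement scaled by 1e4), NOT study data.
v2_mi = [231, 250, 180, 300, 220, 270, 190, 240, 210, 260]
v3_mi = [235, 248, 189, 310, 226, 283, 191, 255, 217, 272]
w, p = wilcoxon_signed_rank(v2_mi, v3_mi)
print(w, p)
```

With the real data, the two inputs would be the V2 and V3 MI improvements for the same timepoints, in the same order. Whether non-converged runs were included in the logged test is not stated in the source and would need to be checked.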
§3 Optimizer Configuration Comparison (A/B Test)
A separate experiment tested 7 optimizer configurations on 10 patients to evaluate whether optimizer choice alone could improve convergence. None of the configurations converged on any of the 10 patients (Table 3).
Table 3. Optimizer A/B test results (ab_test_20260317_075447.csv)
| Config | Optimizer | Max iter | Mask | n | Converged |
|---|---|---|---|---|---|
| baseline | RSGD | 1,000 | No | 10 | 0 (0%) |
| rsgd_3k | RSGD | 3,000 | No | 10 | 0 (0%) |
| rsgd_5k | RSGD | 5,000 | No | 10 | 0 (0%) |
| lbfgs2 | L-BFGS-B | — | No | 10 | 0 (0%) |
| cgls | Conj. Grad. | — | No | 10 | 0 (0%) |
| baseline_mask | RSGD | 1,000 | Yes | 10 | 0 (0%) |
| rsgd_3k_mask | RSGD | 3,000 | Yes | 10 | 0 (0%) |
§4 Pseudo-label Generation & Training Data
Table 4. Data generation pipeline summary
| Stage | Input | Output | Count |
|---|---|---|---|
| Patient filtering | 114 ReMIND patients | First-surgery subset | 62 patients |
| Registration V3 | 62 patients (114 timepoints) | Converged registrations | 84 / 114 converged (73.7%) |
| QC exclusion | 114 timepoints | Quality-filtered registrations | 84 valid (30 excluded: 8 metric + 22 visual QC) |
| Pseudo-label transfer | 84 valid registrations | MRI labels in iUS space | 84 with non-zero labels |
| 2D slicing | 84 pseudo-labeled volumes | Training-ready 2D slices | 44,502 slices |
| nnU-Net formatting | 44,502 slices | Dataset001_iUS | 43 patients, preprocessed |
Patient count: Our first-surgery filter identified 62 patients vs 55 in Faanes 2025. The 7-patient difference is likely due to additional exclusions for missing MRI annotations (n=1, ReMIND-101) and iUS quality issues (n=5–6).
Interpolation method: Pseudo-labels use linear interpolation with 0.5 threshold binarization (instead of nearest-neighbor) to reduce staircase artifacts. This reduced average connected components from 5.2 to 2.8 per volume.
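The difference between the two resampling choices can be seen in a minimal 2D sketch. It assumes a threshold rule of "soft value >= 0.5" (the source gives only the 0.5 value, not the inequality) and uses hypothetical helper names; the actual pipeline resamples 3D volumes through the registration transform.

```python
def bilinear(mask, r, c):
    """Bilinearly interpolate a 2D array (list of rows) at fractional (r, c).

    Assumes non-negative coordinates inside the array bounds.
    """
    r0, c0 = int(r), int(c)
    r1 = min(r0 + 1, len(mask) - 1)
    c1 = min(c0 + 1, len(mask[0]) - 1)
    fr, fc = r - r0, c - c0
    top = (1 - fc) * mask[r0][c0] + fc * mask[r0][c1]
    bottom = (1 - fc) * mask[r1][c0] + fc * mask[r1][c1]
    return (1 - fr) * top + fr * bottom


def nearest(mask, r, c):
    """Nearest-neighbour lookup at fractional (r, c)."""
    return mask[int(r + 0.5)][int(c + 0.5)]


# Toy label patch with one background corner.
mask = [[1, 1],
        [1, 0]]

soft = bilinear(mask, 0.6, 0.6)          # weighted vote of the four neighbours
label_linear = 1 if soft >= 0.5 else 0   # binarize at the 0.5 threshold
label_nearest = nearest(mask, 0.6, 0.6)  # copies the single closest voxel
print(soft, label_linear, label_nearest)
```

At this sample point the nearest voxel is background, but three of the four neighbours are foreground, so the thresholded linear result keeps the label. That neighbour-weighted vote is what smooths jagged boundaries when a label is resampled onto a rotated grid.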
§5 Quality Control Galleries
Fig. 2. Registration QC thumbnails (V3 final — 114 images)
Each thumbnail shows the registered MRI (color overlay) aligned to iUS (grayscale background) after rigid registration.
Fig. 2: Registration QC from V3 final run. 114 images covering 61 patients (most have both pre-dura and post-dura timepoints). V1 and V2 QC images were not retained; iterative improvement is quantified in Table 1.
Fig. 3. Pseudo-label overlay triptychs (36 images across 6 patients)
Three-panel views showing (left) original iUS, (center) binary pseudo-label mask, (right) overlay. Each patient has 6 representative slices (2 per anatomical plane).
Fig. 3: Pseudo-label QC showing spatial alignment quality. Slice numbers correspond to the 2D slice index in the nnU-Net training dataset. Red overlay = transferred MRI tumor label.