Synthetic & Open Data for iUS Tumor Segmentation

This repository documents the data, intermediate results, and quality control outputs for three technical routes to train intraoperative ultrasound (iUS) brain tumor segmentation models without manual iUS annotations. Route A transfers MRI tumor labels to iUS space via rigid registration. Route B synthesises realistic iUS from MRI using MMHVAE (Dorent et al., 2025). Route A+B combines both approaches via two-stage training.

All data derive from the ReMIND dataset (Juvekar et al., 2023): 114 patients, 41 GB of DICOM, publicly available from TCIA. Of these patients, 62 are first-surgery cases suitable for registration, and 110 have sufficient anatomy for virtual sweep simulation. The entire pipeline, from raw DICOM to training-ready data, was run between 2026-03-16 and 2026-03-18 on a single RTX 2080 Ti machine.

Table 1. Technical Route Comparison

The three routes differ in how training labels are obtained and whether the model sees real or synthetic iUS. Route A is the simplest baseline (direct label transfer with registration noise); Route B eliminates registration error by synthesising iUS from MRI; Route A+B leverages both real and synthetic data.

	Route A (Baseline)	Route B (Core Method)	Route A+B (Innovation)
Pipeline	MRI → Rigid Registration → Pseudo-label → nnU-Net	MRI → Virtual Sweep → MMHVAE → nnU-Net	A pre-train → B fine-tune
Label quality	Noisy (3–5 mm registration error)	Precise (zero registration error)	Complementary
Training images	Real iUS	Synthetic iUS	Both
Data scale	44,502 slices / 43 patients	1,090 sweeps / 110 cases	Combined
DSC reference	0.58–0.62 (Faanes et al. 2025)	0.74 (Dorent et al. 2025)	Target: 0.84 (expert inter-rater agreement)
Status	Data ready; training blocked (GPU)	Sweeps done; MMHVAE blocked (weights)	Pending both routes
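
Route A's label transfer rests on a rigid transform between MRI and iUS space. The sketch below assumes the transform is available as a 4x4 homogeneous matrix in millimetres; the repository does not state how it is stored. It shows how a label coordinate would be mapped, and one way a value comparable to the `translation_magnitude_mm` column could be derived. The function names and the example matrix are hypothetical.

```python
import math


def apply_rigid(matrix, point):
    """Map a 3D point (mm) through a 4x4 homogeneous rigid transform."""
    x, y, z = point
    return tuple(
        row[0] * x + row[1] * y + row[2] * z + row[3]
        for row in matrix[:3]
    )


def translation_magnitude_mm(matrix):
    """Euclidean norm of the translation column, in mm (assumed definition)."""
    return math.sqrt(sum(row[3] ** 2 for row in matrix[:3]))


# Hypothetical transform: 90 degree rotation about z plus a (3, 4, 0) mm shift.
T = [
    [0.0, -1.0, 0.0, 3.0],
    [1.0, 0.0, 0.0, 4.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]

moved = apply_rigid(T, (10.0, 0.0, 5.0))
shift = translation_magnitude_mm(T)
```

With this matrix, the point (10, 0, 5) mm maps to (3, 14, 5) mm and the translation magnitude is 5 mm. A real pipeline would resample the whole label volume rather than map single points.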

Detailed Pages:
Route A: registration iteration data (V1→V3), convergence analysis, optimizer comparison, pseudo-label summary
Route B: virtual sweep statistics, MMHVAE input QC gallery (549 images), best-effort sweep QC (40 images)
Progress: pipeline status, blockers, DSC roadmap

Table 2. Data File Index

All quantitative results are stored as flat CSV or per-case JSON files. This index lists the primary data files referenced throughout the dashboard; each is linked to the relevant detail page where its contents are analysed.

File	Rows	Key Columns	Route
`registration_results_v1.csv`	204	patient_id, translation_magnitude_mm, mi_improvement, stop_condition, status	A
`registration_results_v2.csv`	114	v1 columns + convergence, mask_coverage	A
`registration_results_v3.csv`	114	v1 columns + convergence, mask_coverage	A
`ab_test_20260317_075447.csv`	70	config (7 optimizers × 10 patients), converged, stop_class	A
`sweep_metadata.json` × 110	~10 sweeps each	tumor_diameter_mm, C2_dist_to_tumor, tumor_pixels, saved_slices	B
`conversion_*.csv`	~1,090	case_id, modalities, volume_shape, seg_nonzero	B
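
Because the results are flat CSV files, they can be summarised with the standard library alone. The sketch below computes a per-optimizer convergence rate from rows shaped like those in `ab_test_20260317_075447.csv`, using the `config` and `converged` columns listed above. How `converged` is encoded is not documented here, so the truthy strings are an assumption, and the config names in the sample are hypothetical.

```python
import csv
import io
from collections import defaultdict

# Assumed encodings of a positive `converged` value; adjust to the real file.
TRUTHY = {"1", "true", "yes"}


def convergence_rate_by_config(rows):
    """Return {config: fraction of rows with a truthy `converged` value}."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for row in rows:
        config = row["config"]
        totals[config] += 1
        if row["converged"].strip().lower() in TRUTHY:
            hits[config] += 1
    return {config: hits[config] / totals[config] for config in totals}


# Hypothetical sample standing in for the real CSV contents.
sample = io.StringIO(
    "config,converged\n"
    "optimizer_a,True\n"
    "optimizer_a,False\n"
    "optimizer_b,True\n"
)
rates = convergence_rate_by_config(csv.DictReader(sample))
```

For the real file, pass `csv.DictReader(open(path, newline=""))` in place of the in-memory sample. With 7 configs and 10 patients each, every rate would be a multiple of 0.1.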

Table 3. Literature DSC References

Published Dice Similarity Coefficient (DSC) values for iUS tumor segmentation methods relevant to this project.

Method	Dataset	DSC	Source
nnU-Net + noisy pseudo-labels (registration)	ReMIND (55 patients)	0.58–0.62	Faanes et al. 2025
MMHVAE synthesis → SegResNet	RESECT-SEG	0.74	Dorent et al. 2025 (TPAMI)
Supervised baseline (w/ manual iUS labels)	RESECT-SEG	0.73	Dorent et al. 2025 (TPAMI)
Expert inter-rater agreement	ReMIND	0.84	Juvekar & Dorent et al. 2024
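
For reference, the DSC between a predicted mask A and a reference mask B is 2|A ∩ B| / (|A| + |B|). The following is a minimal sketch over flattened binary masks in pure Python. The function name is hypothetical, and returning 1.0 when both masks are empty is a convention assumed here, not something the cited papers are known to share.

```python
def dice_coefficient(pred, truth):
    """Dice Similarity Coefficient between two flattened binary masks."""
    pred = [bool(v) for v in pred]
    truth = [bool(v) for v in truth]
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of elements")

    intersection = sum(p and t for p, t in zip(pred, truth))
    total = sum(pred) + sum(truth)
    if total == 0:
        # Both masks empty: treated as perfect agreement (assumed convention).
        return 1.0
    return 2.0 * intersection / total


# Toy masks: 3 foreground voxels each, 2 of them shared.
pred_mask = [1, 1, 1, 0, 0, 0]
truth_mask = [0, 1, 1, 1, 0, 0]
score = dice_coefficient(pred_mask, truth_mask)
```

Here the masks share 2 of their 3 foreground voxels, so the score is 2·2 / (3 + 3) ≈ 0.67. On real volumes, an array library would normally replace the list comprehensions.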