Synthetic & Open Data for iUS Tumor Segmentation
MSc Project Data Repository — Last updated 2026-03-23
This repository documents the data, intermediate results, and quality control outputs for three technical routes to train intraoperative ultrasound (iUS) brain tumor segmentation models without manual iUS annotations. Route A transfers MRI tumor labels to iUS space via rigid registration. Route B synthesises realistic iUS from MRI using MMHVAE (Dorent et al., 2025). Route A+B combines both approaches via two-stage training.
All data derives from the ReMIND dataset (Juvekar et al., 2023) — 114 patients, 41 GB DICOM, publicly available from TCIA. Of these, 62 are first-surgery patients suitable for registration; 110 have sufficient anatomy for virtual sweep simulation. The entire pipeline from raw DICOM to training-ready data was executed over 2026-03-16 to 2026-03-18 on a single RTX 2080 Ti machine.
Key Figures at a Glance
Table 1. Technical Route Comparison
The three routes differ in how training labels are obtained and whether the model sees real or synthetic iUS. Route A is the simplest baseline (direct label transfer with registration noise); Route B eliminates registration error by synthesising iUS from MRI; Route A+B leverages both real and synthetic data.
| | Route A (Baseline) | Route B (Core Method) | Route A+B (Innovation) |
|---|---|---|---|
| Pipeline | MRI → Rigid Reg → Pseudo-label → nnU-Net | MRI → Virtual Sweep → MMHVAE → nnU-Net | A pre-train → B fine-tune |
| Label quality | Noisy (3–5 mm registration error) | Precise (zero registration error) | Complementary |
| Training images | Real iUS | Synthetic iUS | Both |
| Data scale | 44,502 slices / 43 patients | 1,090 sweeps / 110 cases | Combined |
| DSC reference | 0.58–0.62 (Faanes 2025) | 0.74 (Dorent 2025) | Target: 0.84 |
| Status | Data ready; training blocked (GPU) | Sweeps done; MMHVAE blocked (weights) | Pending both routes |
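The DSC row in Table 1 refers to the Dice similarity coefficient, 2|P ∩ T| / (|P| + |T|) for a predicted mask P and a reference mask T. The following is a minimal pure-Python sketch on flattened binary masks. The function name `dice_coefficient` and the convention of returning 1.0 when both masks are empty are assumptions made for illustration, not the evaluation code used by this project or by the cited papers.

```python
def dice_coefficient(pred, target):
    """Dice similarity coefficient for two flattened binary masks.

    `pred` and `target` are equal-length sequences of 0/1 (or bool) values.
    """
    pred = [bool(v) for v in pred]
    target = [bool(v) for v in target]
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of voxels")

    overlap = sum(p and t for p, t in zip(pred, target))  # |P ∩ T|
    total = sum(pred) + sum(target)                        # |P| + |T|
    if total == 0:
        # Assumed convention: two empty masks count as perfect agreement.
        return 1.0
    return 2.0 * overlap / total


if __name__ == "__main__":
    pred = [0, 1, 1, 0, 1, 0]
    target = [0, 1, 0, 0, 1, 1]
    # Two voxels overlap; each mask has three foreground voxels.
    print(dice_coefficient(pred, target))
```

In practice the masks would be 3D label volumes flattened to one dimension; an array library would be the usual choice at that scale.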
Table 2. Data File Index
All quantitative results are stored as flat CSV or per-case JSON files. This index lists the primary data files referenced throughout the dashboard; each is linked to the relevant detail page where its contents are analysed.
| File | Rows | Key Columns | Route |
|---|---|---|---|
| registration_results_v1.csv | 204 | patient_id, translation_magnitude_mm, mi_improvement, stop_condition, status | A |
| registration_results_v2.csv | 114 | + convergence, mask_coverage | A |
| registration_results_v3.csv | 114 | + convergence, mask_coverage | A |
| ab_test_20260317_075447.csv | 70 | config (7 optimizers × 10 patients), converged, stop_class | A |
| sweep_metadata.json × 110 | ~10 sweeps each | tumor_diameter_mm, C2_dist_to_tumor, tumor_pixels, saved_slices | B |
| conversion_*.csv | ~1,090 | case_id, modalities, volume_shape, seg_nonzero | B |
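The row counts in Table 2 can be checked against the files on disk. The sketch below uses only the Python standard library and assumes that the CSV files sit together in one directory (the `data_dir` argument is hypothetical) and that each file has exactly one header line. Only the files with exact counts in the index are included; the approximate entries are left out.

```python
import csv
from pathlib import Path

# Expected data-row counts copied from Table 2 (exact entries only).
EXPECTED_ROWS = {
    "registration_results_v1.csv": 204,
    "registration_results_v2.csv": 114,
    "registration_results_v3.csv": 114,
    "ab_test_20260317_075447.csv": 70,
}


def count_data_rows(path):
    """Count CSV rows after the header line (assumes a single header row)."""
    with open(path, newline="") as f:
        reader = csv.reader(f)
        next(reader, None)  # skip the header if present
        return sum(1 for _ in reader)


def check_index(data_dir, expected=EXPECTED_ROWS):
    """Return {filename: (expected, actual)} for missing or mismatched files.

    `actual` is None when the file does not exist in `data_dir`.
    """
    problems = {}
    for name, want in expected.items():
        path = Path(data_dir) / name
        got = count_data_rows(path) if path.exists() else None
        if got != want:
            problems[name] = (want, got)
    return problems


if __name__ == "__main__":
    # "." is a placeholder; point this at the directory holding the CSVs.
    for name, (want, got) in check_index(".").items():
        print(f"{name}: expected {want} rows, found {got}")
```

An empty result from `check_index` means every listed file is present with the indexed number of data rows.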