CURDNet: Ultrasound Report Generation with CLIP Assistance
Published in arXiv, 2025
Ultrasound Report Generation (URG) aims to automatically produce diagnostic reports from ultrasound images, significantly reducing the workload of sonographers. However, URG remains underexplored due to the scarcity of high-quality datasets and the inherent diversity of ultrasound data. Unlike other radiology modalities such as X-rays, ultrasound imaging spans multiple organs and varies significantly with operator technique, leading to highly diverse visual appearances and reporting styles. This diversity complicates the design of generalizable report generation models, and overlooking it can negatively impact model performance. In this work, we propose a Contrastive Ultrasound Report-generation with Diversity-aware learning Network, termed CURDNet, which explicitly accounts for the diverse characteristics of ultrasound data. Specifically:
- We introduce EchoDice, a diversity-aware sampler that assembles training batches with high intra-batch variation to mimic cross-organ learning behavior and improve generalization.
- We jointly train ReportMatcher, a contrastive module that distinguishes matched from mismatched report-image pairs via self-supervision, and ReportGenerator, which produces textual reports from ultrasound images.
- We propose ReportJudger, a large-language-model-based scoring module that evaluates the relevance of retrieved reports.
Experiments on a representative URG benchmark demonstrate that CURDNet outperforms existing methods from both ultrasound-specific and general radiology domains. We hope CURDNet serves as a strong and extensible baseline for future ultrasound report generation research.
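The two training-side ideas above, diversity-aware batching and contrastive report-image matching, can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract does not specify how EchoDice builds batches or which loss ReportMatcher uses. The code assumes that each sample carries an organ label, that batches are built by round-robin draws across organs, and that matching uses a CLIP-style symmetric cross-entropy over a similarity matrix. The names `diverse_batches`, `symmetric_contrastive_loss`, and `temperature` are hypothetical.

```python
import math
import random
from collections import defaultdict


def diverse_batches(labels, batch_size, seed=0):
    """Return batches of sample indices with mixed labels.

    Assumption (not from the paper): diversity is approximated by
    drawing one sample per label group in round-robin order.
    """
    rng = random.Random(seed)
    groups = defaultdict(list)
    for idx, label in enumerate(labels):
        groups[label].append(idx)
    for members in groups.values():
        rng.shuffle(members)

    order = []
    pools = list(groups.values())
    while pools:
        rng.shuffle(pools)  # vary which group leads each round
        for members in pools:
            order.append(members.pop())
        pools = [m for m in pools if m]  # drop exhausted groups
    return [order[i:i + batch_size] for i in range(0, len(order), batch_size)]


def symmetric_contrastive_loss(sim, temperature=0.07):
    """CLIP-style loss; sim[i][j] scores image i against report j.

    Matched pairs are assumed to sit on the diagonal. The loss averages
    image-to-report and report-to-image cross-entropy terms.
    """
    n = len(sim)

    def cross_entropy(rows):
        total = 0.0
        for i, row in enumerate(rows):
            logits = [s / temperature for s in row]
            peak = max(logits)  # subtract the max for numerical stability
            log_z = peak + math.log(sum(math.exp(v - peak) for v in logits))
            total += log_z - logits[i]
        return total / n

    columns = [list(col) for col in zip(*sim)]
    return 0.5 * (cross_entropy(sim) + cross_entropy(columns))


if __name__ == "__main__":
    organs = ["liver"] * 4 + ["thyroid"] * 4 + ["breast"] * 4
    for batch in diverse_batches(organs, batch_size=3):
        print([organs[i] for i in batch])
    print(symmetric_contrastive_loss([[1.0, 0.0], [0.0, 1.0]]))
```

With equally sized organ groups and a batch size equal to the number of organs, every batch in this sketch contains one sample per organ. The loss is small when matched pairs score higher than mismatched ones and grows when they do not.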
Recommended citation: Z Chen. (2025). "CURDNet." arXiv 1. 1(1).
