Abstract: Computational pathology has revolutionized cancer diagnosis and research through the analysis of digitized whole slide images (WSIs). However, the giga-pixel size of these images presents profound technical challenges, creating two intertwined bottlenecks: computational inefficiency and label inefficiency. The immense data scale makes standard end-to-end (E2E) training of deep neural networks infeasible due to prohibitive GPU memory requirements, while the reliance on expert pathologists for annotations makes obtaining high-quality labeled data a tedious and expensive process. This proposal confronts these dual challenges by developing a series of novel model architectures, training paradigms, and self-supervised learning methods designed to create a more efficient and effective framework for WSI analysis.
To improve computational efficiency, this proposal first introduces a locally supervised learning paradigm that enables E2E training on entire WSIs by partitioning a network into gradient-isolated modules, circumventing the memory bottleneck of backpropagation. Second, it presents Prompt-MIL, a parameter-efficient fine-tuning framework that reduces the number of trainable parameters, memory consumption, and training time by fine-tuning only a few prompts to guide large pre-trained models. Third, this work advances efficient architectures for WSIs by developing novel State-Space Models (SSMs). It proposes 2DMamba, the first intrinsic Mamba architecture that preserves the crucial 2D spatial structure of images, overcoming the spatial discrepancy inherent in 1D models. Fourth, to address the inefficiency of multi-directional scans in Mamba models, including 2DMamba, it presents Locally Bi-directional Mamba (LBMamba), which introduces a novel, hardware-aware local backward scan that integrates the bi-directional scan into a single forward pass, significantly improving the throughput-performance trade-off. Lastly, it proposes Warp-level Bi-directional Mamba (WLBMamba), an extension of LBMamba that lifts the bi-directional scan from the thread level to the warp level, further improving the throughput-performance trade-off.
To improve label efficiency, this proposal introduces a Precise Location-based Matching strategy for self-supervised dense contrastive learning. By allowing a local patch in one augmented view to match multiple overlapping patches in another, the strategy creates more accurate correspondences, leading to superior feature representations for dense prediction tasks such as segmentation and detection.
In summary, this proposal presents a holistic investigation into the efficiency bottlenecks in computational pathology. Through these combined contributions in model architecture, training paradigms, and self-supervised learning, this work establishes a more scalable, efficient, and powerful computational framework for analyzing giga-pixel pathology images.
Speaker: Jingwei Zhang
Location: Old Computer Science Room 2114
Zoom:
https://stonybrook.zoom.us/j/95187903649?pwd=tV0CNxLu1QKqw7hGmcE1h0rJ2C6n1b.1
Meeting ID: 951 8790 3649 | Passcode: 488916