Zoom Link: https://stonybrook.zoom.us/j/98533029054?pwd=5FXO6lWGTJssCADEYkYbA7sjaacPRX.1

Meeting ID: 985 3302 9054

Passcode: 436997

Abstract:

Semantic segmentation, the task of assigning a semantic label to each pixel in an image, is a fundamental problem in the field of Computer Vision, with crucial applications in domains like autonomous driving, drone imagery and medical image analysis. Despite advancements in deep learning architectures, state-of-the-art models still heavily depend on large-scale pixel-level annotations, which are costly and time-consuming to acquire. To address this issue, Semi-Supervised Segmentation (SSS) has emerged as a promising solution, leveraging a small set of labeled images alongside a larger corpus of unlabeled data to reduce the annotation burden. In this proposal, I aim to investigate the challenges of SSS and propose approaches to address them. Existing SSS methods rely on a teacher-student framework to generate pseudo-labels for unlabeled images, which are then used for model training. However, this approach presents two major challenges: pixel-level consistency fails to effectively capture contextual information, and pseudo-labels are noisy, especially in the early stages of training. To handle noisy pseudo-labels, existing methods rely on confidence-based thresholding to identify reliable pseudo-labels. However, during early training phases, when the model is poorly calibrated, this approach can select high-confidence but noisy pseudo-labels. To address this, I propose a novel approach that reduces reliance on model confidence when selecting reliable pseudo-labels. My method employs an ensemble of a segmentation model and an object detection model to select more reliable pseudo-labels; these reliable labels are then used to weight pseudo-labels via rank statistics, reducing the influence of noisy labels in training.
Next, to address both the challenge of capturing contextual information and that of noisy pseudo-labels, I introduce a novel Multi-scale Patch-based Multi-label Classifier (MPMC), which incorporates patch-level contextual information and reduces the impact of noisy pixel pseudo-labels by using the predictions of the patch-level multi-label classifier to detect noisy labels, enhancing overall segmentation performance. While my work so far has focused on effectively utilizing unlabeled data to improve segmentation performance, as part of my future work, I will explore the use of textual information, such as category descriptions, for segmentation tasks. When labeled data is limited, it is more challenging to align visual features with textual features from large language models (LLMs).
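
For readers unfamiliar with the confidence-based thresholding baseline mentioned in the abstract, here is a minimal sketch of it, not the proposed method. The function names, the IGNORE marker, and the 0.9 threshold are illustrative assumptions rather than details from the proposal.

```python
import math

IGNORE = -1  # hypothetical marker: pixel is excluded from the unsupervised loss


def softmax(logits):
    """Numerically stable softmax over one pixel's class logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def pseudo_labels(teacher_logits, threshold=0.9):
    """Keep the teacher's argmax class only where its confidence is high.

    teacher_logits: one list of class logits per pixel.
    Returns one class index per pixel, or IGNORE below the threshold.
    """
    labels = []
    for logits in teacher_logits:
        probs = softmax(logits)
        confidence = max(probs)
        labels.append(probs.index(confidence) if confidence >= threshold else IGNORE)
    return labels


# Three pixels, three classes: confident, ambiguous, confident.
pixels = [[4.0, 0.0, 0.0], [0.2, 0.1, 0.0], [0.0, 0.0, 5.0]]
print(pseudo_labels(pixels))
```

Because the filter trusts the model's own confidence, a poorly calibrated early-stage teacher can pass wrong labels through it, which is the weakness the abstract describes.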

Abstract: Many foundation models for digital pathology have been released recently. Benchmarking available methods then becomes paramount to get a clearer view of the research landscape. For this reason, we introduce THUNDER, a tile-level benchmark for digital pathology foundation models, allowing for efficient comparison of many models on diverse datasets with a series of downstream tasks, studying their feature spaces and assessing the robustness and uncertainty of predictions informed by their embeddings. Such foundation models are often used as feature extractors and combined with Multiple Instance Learning (MIL) aggregators for downstream tasks. Such aggregation must be efficient and reliable. We will focus on two specific examples of this: (i) HistAug, a fast and efficient generative model for controllable augmentations in the latent space of foundation models to perform data augmentation for MIL, and (ii) CAR-MIL, a method based on counterfactual attention regularisation to improve the reliability of attention maps of MIL methods.
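
As background on the MIL aggregation step mentioned in the abstract, here is a minimal sketch of generic attention-based pooling over tile embeddings. It is neither HistAug nor CAR-MIL; the parameter names V and w and the toy values are assumptions chosen for illustration.

```python
import math


def attention_pool(tiles, V, w):
    """Aggregate tile embeddings into one slide embedding with attention.

    tiles: list of embedding vectors (one per tile).
    V: hidden-by-dim projection matrix, w: hidden-sized scoring vector
    (both hypothetical learned parameters).
    Returns (slide_embedding, attention_weights).
    """
    scores = []
    for h in tiles:
        hidden = [math.tanh(sum(v * x for v, x in zip(row, h))) for row in V]
        scores.append(sum(wk * zk for wk, zk in zip(w, hidden)))
    # Softmax over tiles so the attention weights sum to one.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    attn = [e / total for e in exps]
    dim = len(tiles[0])
    slide = [sum(a * h[d] for a, h in zip(attn, tiles)) for d in range(dim)]
    return slide, attn


tiles = [[1.0, 0.0], [0.0, 1.0]]  # two toy tile embeddings
V = [[1.0, 0.0]]                  # scores depend on the first feature only
w = [2.0]
slide, attn = attention_pool(tiles, V, w)
print(attn, slide)
```

The attention weights double as a per-tile relevance map, which is why their reliability matters for interpretation.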

Short-bio: Pierre Marza is a Postdoctoral Researcher at CentraleSupelec in the Biomathematics team of the MICS lab, studying Computer Vision and Deep Learning for Medical Imaging, with a focus on Digital Pathology. Prior to this, he was a PhD student at INSA Lyon, in the LIRIS and CITI labs, advised by Christian Wolf, and co-advised by Laetitia Matignon and Olivier Simonin. He studied Visual Navigation, Embodied AI, and Spatial Reasoning, more specifically how to learn to represent 3D space, generalize to new environments and master diverse tasks from light supervision.

Location: NCS 220

Zoom: https://stonybrook.zoom.us/j/94798224254?pwd=CFraer25qnpORbJ14aAVHRwaSJOjJM.1

Submit an abstract celebrating research, new discoveries and achievements in medicine and science!

We encourage faculty, nurse practitioners, post-doctoral fellows, fellows, residents, medical students, graduate students and undergraduate students to submit an abstract. Original research, case reports and case series are welcome.

Abstract submission deadline: FEBRUARY 7, 2025

For more details, visit here.

Climate Uncertainty, Decision Making, and AI for Earth System Predictability

Dr. Nathan Urban, Brookhaven National Laboratory

Bio: Nathan Urban is the group leader of the Optimal Experimental Design & Uncertainty Quantification group in the Applied Mathematics Department at Brookhaven National Laboratory's Computing & Data Sciences directorate (CDS). He holds a Ph.D. in theoretical condensed matter physics from Penn State, and has previously held research positions at Los Alamos National Laboratory, Princeton, and Penn State. His research interests include Bayesian inference and spatiotemporal statistics, probabilistic prediction and forecasting, multi-model / model-form / model structural uncertainty quantification, reduced order modeling, scientific machine learning and hybrid physical-data driven modeling, in-situ/streaming data analysis at scale, information fusion, decision making under uncertainty and optimal experimental design, and integrated multiscale computational frameworks for decision support.

Location: IACS Seminar Room

Lunch will be provided

Imagine machines that can see beyond human limitations--drones locating hidden survivors, cameras predicting structural failures, or medical devices detecting tumors beneath the skin. Traditional vision systems are constrained by the boundaries of human perception, missing vast information present in light interactions. This talk explores the development of advanced vision systems that capture underutilized dimensions of light, model intricate light-scene interactions, and extract hidden 3D information--around corners, beneath surfaces, and at high speeds. By jointly developing novel imaging hardware, efficient rendering models, and physics-based learning algorithms, we aim to transcend conventional vision capabilities--unlocking critical applications in autonomous navigation, structural monitoring, and non-invasive medical imaging.

Speaker Bio:

Akshat Dave is a Postdoctoral Associate at MIT Media Lab in the Camera Culture group working with Prof. Ramesh Raskar. He received his Ph.D. from Rice University ECE Department in 2023 where he was advised by Prof. Ashok Veeraraghavan. His research lies at the intersection of applied optics, computer graphics, and computer vision, and focuses on developing vision systems that go beyond human perception. His work has been recognized by Rice University's Best Thesis Award, OSA Best Paper Prize, and fellowships from Texas Instruments and Qualcomm.

Abstract: Pretraining vision encoders with self-supervision (SSL) leads to stronger representations that excel across diverse downstream tasks. One of the key factors enabling self-supervision is extracting multiple views of the same scene to formulate either: 1) View-invariant pretraining (DINO, SimCLR, iBOT), where the objective is predicting the same representation for different views of the scene; or 2) Cross-view pretraining (cross-view Masked Autoencoders), where the objective is predicting missing parts of one view using other views. For extracting multiple views, view-invariant methods rely on a combination of handcrafted augmentations (random cropping, color jittering, Gaussian blur, etc.) of the same image, whereas cross-view pretraining methods rely on image cropping or video frames. In this work, we present methods to effectively incorporate synthetic views from diffusion models into SSL training.
For view-invariant pretraining, we introduce Gen-SIS, a method that leverages the ability of diffusion models to generate interpolated images through interpolation in conditioning space. We introduce a disentanglement pretext task: disentangling two source images from an interpolated synthetic image. This disentanglement task, in addition to vanilla single-source generative augmentation for view extraction, improves visual pretraining of various view-invariant methods (DINO, SimCLR, iBOT).
For cross-view pretraining, we introduce CDG-MAE, a novel cross-view masked autoencoder (MAE) based method that uses diverse synthetic views generated from static images via an image-conditioned diffusion model to learn dense correspondences. We present a quantitative method to evaluate the local and global consistency of the generated views to choose the right diffusion model for cross-view pretraining. These generated views exhibit substantial changes in pose and perspective, providing a rich training signal that overcomes the limitations of video (expensive) and crop-based (less variation) methods. CDG-MAE substantially narrows the gap to video-based MAE methods on video label propagation tasks while maintaining the data advantages of image-only MAEs.
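
To make the view-invariant objective from the abstract concrete, here is a minimal sketch of a loss that is zero when two views of a scene receive the same representation up to scale. It is a simplified stand-in: DINO, SimCLR and iBOT each use their own, more elaborate objectives, and the function names here are assumptions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def invariance_loss(z_view1, z_view2):
    """Zero when both views map to the same direction, larger otherwise."""
    return 1.0 - cosine_similarity(z_view1, z_view2)


# Embeddings of two views (e.g. two crops, or a real and a synthetic view).
print(invariance_loss([1.0, 0.0], [2.0, 0.0]))  # same direction
print(invariance_loss([1.0, 0.0], [0.0, 1.0]))  # orthogonal directions
```

In this framing, a synthetic view from a diffusion model simply takes the place of one of the handcrafted augmentations when the pair of embeddings is computed.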

Speaker: Varun Belagali

Location: NCS 120
Zoom: https://stonybrook.zoom.us/j/93647452432?pwd=hZaX7LXCAD8KPHWYE1Afw2sDI3owpv.1
AI for Conservation: AI and Humans Combating Extinction Together by Daniel I. Rubenstein of Princeton University

ABSTRACT: The state of our planet is not good. We have lost more than 60% of the world's wildlife. Stopping the decline remains a challenge, especially since acquiring appropriate knowledge is expensive, time-consuming and risky. Visual observations following the fates of a few individuals were the currency of the realm. But GPS technology and now machine learning provide a non-invasive, scalable alternative. Photographs, taken by field scientists, tourists, automated cameras and incidental photographers, are the most abundant source of data on wildlife today. Wildbook, a tech-for-conservation project coordinated by the non-profit Wild Me, is an autonomous computational system that starts from massive collections of images and, by detecting various species of animals and identifying individuals, combined with sophisticated data management, turns them into high-resolution information databases, enabling scientific inquiry, conservation and citizen science.

BIO: Dan Rubenstein is the Class of 1877 Professor of Zoology. He is currently Director of Princeton's Environmental Studies Program and is former Chair of Princeton University's Department of Ecology and Evolutionary Biology and Director of Princeton's Program in African Studies. He is a behavioral ecologist who studies how environmental variation and individual differences shape social behavior, social structure, sex roles and the dynamics of populations. He has special interests in all species of wild horses, zebras and asses, and has done field work on them throughout the world identifying rules governing decision-making, the emergence of complex behavioral patterns and how these understandings influence their management and conservation. In Kenya he also works with pastoral communities to develop and assess impacts of various grazing strategies on rangeland quality, wildlife use and livelihoods. He has also developed a scout program for gathering data on Grevy's zebras and created curricular modules for local schools to raise awareness about the plight of this endangered species. He engages people as 'Citizen Scientists' and has recently extended his work to measuring the effects of environmental change, including issues pertaining to the global commons and changes wrought by management and by global warming, on behavior.
Title: Cyberinfrastructure for forward prediction and inversion estimation with uncertainty quantification

Seminar Speaker: Dr. Mengyang Gu, Assistant Professor, Department of Statistics and Applied Probability, University of California, Santa Barbara

Abstract: In this talk, we introduce four useful tools for forward prediction and inversion estimation. The first tool is the parallel partial Gaussian process surrogate model for emulating expensive computer simulations with massive coordinates. The tool is implemented in the RobustGaSP package, available in R, MATLAB, and Python, for predicting both scalar- and vector-valued outputs with uncertainty assessment. The second tool is implemented in the RobustCalibration package, which handles Bayesian data inversion or model calibration with one or multiple types of experimental observations. A unique feature of the package is the inclusion of fast surrogate models of both scalar- and vector-valued computer simulations that bypass the expensive simulation in one line of code. The third tool is implemented in the AIUQ package, available in both R and MATLAB. In this approach, we show that differential dynamic microscopy, a scattering-based analysis tool that extracts dynamical information from microscopy videos, is equivalent to fitting the temporal auto-covariance in Fourier space, based on a latent factor model we construct. We develop a more efficient estimator and reduce the computational cost to pseudolinear order with respect to the number of observations without approximation, by utilizing the generalized Schur algorithm for the Toeplitz covariance. For the last tool, we developed a new method called the inverse Kalman filter, which enables fast matrix-vector multiplication between a covariance matrix from a dynamic linear model and any real-valued vector at linear computational cost. These new approaches support a wide range of applications, including emulating expensive simulations at molecular-, meso- and macro-scales, active learning with error control, nonparametric estimation of particle interaction functions, and data inversion from microscopy and velocity fields.
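
As background for the dynamic linear model setting mentioned in the abstract, here is a minimal sketch of the standard forward Kalman filter for a scalar model. It is not the inverse Kalman filter from the talk, which the abstract does not describe in detail. The model form (AR(1) state plus observation noise) and all parameter names are assumptions chosen for illustration. The loop touches each observation once, which shows why such models admit linear-cost computation.

```python
import math


def kalman_filter(ys, rho, q, r, m0=0.0, p0=1.0):
    """Forward Kalman filter for a scalar dynamic linear model.

    Assumed model (illustrative):
        state:       x_t = rho * x_{t-1} + w_t,  w_t ~ N(0, q)
        observation: y_t = x_t + v_t,            v_t ~ N(0, r)
    Returns the filtered state means and the log-likelihood of ys.
    Cost is linear in len(ys): one constant-time update per observation.
    """
    m, p = m0, p0
    means = []
    loglik = 0.0
    for y in ys:
        # Predict the next state from the previous filtered estimate.
        m_pred = rho * m
        p_pred = rho * rho * p + q
        # Update with the new observation.
        s = p_pred + r          # predictive variance of y
        gain = p_pred / s       # Kalman gain
        resid = y - m_pred
        m = m_pred + gain * resid
        p = (1.0 - gain) * p_pred
        loglik += -0.5 * (math.log(2.0 * math.pi * s) + resid * resid / s)
        means.append(m)
    return means, loglik


means, loglik = kalman_filter([1.0, 0.8, 1.1], rho=0.9, q=0.1, r=0.5)
print(means, loglik)
```

The same likelihood could be computed from the dense covariance matrix of the observations, but that would cost cubic time in the number of observations instead of linear.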

Join Zoom Meeting: https://bnl.zoomgov.com/j/1606285496?pwd=2yJYSG6lx8gMPiibzgAIBQtKHIjuHV.1
Meeting ID: 160 628 5496
Passcode: 472506