Fall 2025, Mondays 2 to 3:20 pm, NCS 220 and Zoom link to be announced soon.

The seminar will be jointly taught by Prof. Dimitris Samaras (samaras@cs.stonybrook.edu).

The overall purpose of this seminar is to bring together people with interests in Computer Vision theory and techniques and to examine current research issues. This course is appropriate for people who have already taken a graduate Computer Vision course or who already have research experience in Computer Vision.

To enroll in this course, you must either: (1) be in the Ph.D. program or (2) receive permission from the instructors.

Each seminar will consist of multiple short talks (around 15 minutes) by multiple students. Students can register for 1 credit for CSE656. Registered students must attend and present a minimum of 2 talks. Registered students must attend in person. Up to 3 absences will be excused. Everyone else is welcome to attend.
Abstract: Many foundation models for digital pathology have been released recently. Benchmarking available methods then becomes paramount to get a clearer view of the research landscape. For this reason, we introduce THUNDER, a tile-level benchmark for digital pathology foundation models, allowing for efficient comparison of many models on diverse datasets with a series of downstream tasks, studying their feature spaces and assessing the robustness and uncertainty of predictions informed by their embeddings. Such foundation models are often used as feature extractors and combined with Multiple Instance Learning (MIL) aggregators for downstream tasks. Such aggregation must be efficient and reliable. We will focus on two specific examples of this: (i) HistAug, a fast and efficient generative model for controllable augmentations in the latent space of foundation models to perform data augmentation for MIL, and (ii) CAR-MIL, a method based on counterfactual attention regularisation to improve the reliability of attention maps of MIL methods.

Short-bio: Pierre Marza is a Postdoctoral Researcher at CentraleSupelec in the Biomathematics team of the MICS lab, studying Computer Vision and Deep Learning for Medical Imaging, with a focus on Digital Pathology. Prior to this, he was a PhD student at INSA Lyon, in the LIRIS and CITI labs, advised by Christian Wolf and co-advised by Laetitia Matignon and Olivier Simonin. He studied Visual Navigation, Embodied AI, and Spatial Reasoning, and more specifically how to learn to represent 3D space, generalize to new environments, and master diverse tasks from light supervision.

Location: NCS 220

Zoom: https://stonybrook.zoom.us/j/94798224254?pwd=CFraer25qnpORbJ14aAVHRwaSJOjJM.1

Abstract: Pretraining vision encoders with self-supervision (SSL) leads to stronger representations that excel across diverse downstream tasks. One of the key factors enabling self-supervision is extracting multiple views of the same scene to formulate either: 1) View-invariant pretraining (DINO, SimCLR, iBOT), where the objective is predicting the same representation for different views of the scene; or 2) Cross-view pretraining (cross-view Masked Autoencoders), where the objective is predicting missing parts of one view using other views. For extracting multiple views, view-invariant methods rely on a combination of handcrafted augmentations (random cropping, color jittering, Gaussian blur, etc.) of the same image, whereas cross-view pretraining methods rely on image cropping or video frames. In this work, we present methods to effectively incorporate synthetic views from diffusion models into SSL training.
For view-invariant pretraining, we introduce Gen-SIS, a method that leverages the ability of diffusion models to generate interpolated images through interpolation in conditioning space. We introduce a disentanglement pretext task: disentangling two source images from an interpolated synthetic image. This disentanglement task, in addition to vanilla single-source generative augmentation for view extraction, improves visual pretraining of various view-invariant methods (DINO, SimCLR, iBOT).
For cross-view pretraining, we introduce CDG-MAE, a novel cross-view masked autoencoder (MAE) based method that uses diverse synthetic views generated from static images via an image-conditioned diffusion model to learn dense correspondences. We present a quantitative method to evaluate the local and global consistency of the generated views to choose the right diffusion model for cross-view pretraining. These generated views exhibit substantial changes in pose and perspective, providing a rich training signal that overcomes the limitations of video (expensive) and crop-based (less variation) methods. CDG-MAE substantially narrows the gap to video-based MAE methods on video label propagation tasks while maintaining the data advantages of image-only MAEs.

Speaker: Varun Belagali

Location: NCS 120
Zoom: https://stonybrook.zoom.us/j/93647452432?pwd=hZaX7LXCAD8KPHWYE1Afw2sDI3owpv.1
Join the Center of Excellence in Wireless and Information Technology (CEWIT) and their co-host IEEE-USA for a livestream panel discussion on Generative Artificial Intelligence (Gen AI). In this engaging livestream, we will dive into the technologies that continue to transform what is possible and explore the dynamic intersection of innovation, creativity, ethics, and Gen AI.

CEWIT is joined by Stony Brook University experts who will provide their insights and perspectives on this rapidly changing technology.

Meet the Panel

Laura Lindenfeld, PhD

Executive Director
Alan Alda Center for Communicating Science®
Dean
School of Communication & Journalism

Margaret Schedel, PhD
Associate Professor
Composition and Computer Music
Co-Founder
Lyrai

Steven Skiena, PhD

Interim Director
AI Innovation Institute
Distinguished Professor
Computer Science

Vivian Zhang
CTO/School Director
NYC Data Science Academy
Chief Data Officer
GoDental.ai


Register here.

Join us to share your thoughts about teaching, learning, and AI!

The landscape of higher education is rapidly evolving with the integration of Artificial Intelligence (AI). Through the Institute on AI, Pedagogy, and the Curriculum with AAC&U, we are exploring ways that we can better address AI in teaching and learning. We want to hear your experiences, your concerns, and your ideas.

This is an open discussion for all faculty and staff to share their perspectives on the opportunities and challenges AI presents in our academic environment.

We'll be exploring critical questions like:

  • In the age of AI, what are the opportunities you see for enriching the classroom and curriculum? How can it enhance student learning or your professional practice?

  • What are the most significant challenges and concerns that AI raises for you regarding academics, student integrity, or your workload?

  • What resources (tools, training, technical support, policy guidance, etc.) do you need to feel confident and successful in the age of AI?

Dates/Times:

  • Tuesday, 2/3 at 2pm

  • Friday, 2/6 at 9:30am

Please register in advance for the Zoom link.

Can't Make It? Share Your Feedback!

We understand schedules are tight. If you cannot attend the live discussion, you can still share your thoughts! Join our AI Zoom Room to share your thoughts via video recording or email rose.tirotta-esposito@stonybrook.edu with your comments and ideas.

Videos will not be shared publicly and comments will only be shared in aggregate.

Your input is vital. From pedagogy to assessment, your insights will be critical. We look forward to a thoughtful and productive conversation!

  • Dr. Rose Tirotta-Esposito (Assistant Provost; Director of CELT)

  • Dr. Elizabeth Hewitt (Associate Professor in the Department of Technology and Society (DTS) in the College of Engineering and Applied Sciences)

  • Chris Kretz (Associate Librarian and Head of Academic Engagement at SBU Libraries)

  • Prof. Rajiv Lajmi (Assistant Professor in the School of Health Professions and Chair of Applied Health Informatics)

  • Dr. Matthew Salzano (Assistant Professor in the Department of Communication in the School of Communication and Journalism)

Research challenges in using computer vision in robotics systems

Abstract: The past decade has seen a remarkable increase in the performance of computer vision techniques, driven in part by the introduction of effective deep learning techniques. Much of this progress is in the form of rapidly increasing performance on standard, curated datasets. However, translating these results into operational vision systems for robotics applications remains a formidable challenge. This talk will explore some of the fundamental questions at the boundary between computer vision and robotics that need to be addressed. These include introspection/self-awareness of performance, anytime algorithms for computer vision, multi-hypothesis generation, and rapid learning and adaptation. The discussion will be illustrated by examples from autonomous air and ground robots.
18th Annual Engineering Ball
Flowerfield, St. James, NY
Thursday, April 2nd, 7:00 to 10:00 pm
Pick up your tickets in 231 Engineering (Monday - Friday, 10:00 am to 4 pm)
Presenting Partner: L3Harris
Join Stony Brook University's Center for Excellence in Learning and Teaching (CELT) for a boot camp on how to use AI to enhance your teaching and courses. This event will demonstrate how ChatGPT, Microsoft Copilot, NotebookLM, and other generative AI platforms can support you in crafting learning objectives, writing exam questions, composing rubrics, and designing course content such as lesson plans, in-class activities, instructional videos, and more.

https://stonybrook.zoom.us/j/92511854285?pwd=QRTHfULqHMWxJYoVyt3piOhNxWLfvs.1