Abstract: Generative visual models like Stable Diffusion and Sora generate photorealistic images and videos that are nearly indistinguishable from real ones to a naive observer. However, their grasp of the physical world remains an open question: Do they understand 3D geometry, light, and object interactions, or are they mere pixel parrots of their training data? Through systematic probing, I will demonstrate that these models surprisingly learn fundamental scene properties--intrinsic images such as surface normals, depth, albedo, and shading (à la Barrow & Tenenbaum, 1978)--without explicit supervision, which enables applications like image relighting. But I will also show that this knowledge is insufficient. Careful analysis reveals unexpected failures: inconsistent shadows, multiple vanishing points, and scenes that defy basic physics. All these findings suggest these models excel at local texture synthesis but struggle with global reasoning: a crucial gap between imitation and true understanding. I will then conclude by outlining a path toward generative world models that emulate global and counterfactual reasoning, causality, and physics.

Bio: Anand Bhattad is a Research Assistant Professor at the Toyota Technological Institute at Chicago. He earned his PhD from the University of Illinois Urbana-Champaign in 2024 under the mentorship of David Forsyth. His research interests lie at the intersection of computer vision and computer graphics, with a current focus on understanding the knowledge encoded in generative models. Anand has received Outstanding Reviewer honors at ICCV 2023 and CVPR 2021, and his CVPR 2022 paper was nominated for a Best Paper Award. He actively contributes to the research community by leading workshops at CVPR and ECCV, including Scholars and Big Models: How Can Academics Adapt? (CVPR 2023), CV 20/20: A Retrospective Vision (CVPR 2024), Knowledge in Generative Models (ECCV 2024), and How to Stand Out in the Crowd? (CVPR 2025). For more details, visit https://anandbhattad.github.io/


Abstract: Modern decision-making increasingly relies on complex data, imperfect models, and limited domain expertise--yet decisions must still be made with confidence and accountability. This talk presents a research perspective on visual analytics as a bridge between data, models, and human judgment. Through three case studies spanning public-health risk analysis, multivariate scientific visualization, and causal model auditing with large language models, I will show how interactive visualization can reveal structure in high-dimensional data, support reasoning under uncertainty, and help humans critically assess both statistical and AI-generated explanations. Together, these examples illustrate how visual analytics enables users not only to explore data, but to form, challenge, and refine beliefs that underpin scientific and societal decisions.

Bio: Klaus Mueller received his Ph.D. in Computer Science from The Ohio State University in 1998. He is a Professor in the Department of Computer Science at Stony Brook University and a Senior Scientist at the Computational Science Initiative at Brookhaven National Laboratory. He currently serves as the Acting Chair of the Department of Technology and Society at Stony Brook. From 2012 to 2015, he was the Founding Chair of the Computer Science Department at SUNY Korea, where he also served as Vice President for Academic Affairs and Finance for two years.
His research interests span visual analytics, explainable AI, machine learning and data science, human-centered responsible AI, fairness, belief modeling and personalized communication, virtual and augmented reality, and computational and medical imaging. Dr. Mueller received the U.S. National Science Foundation Early Career Award in 2001, the SUNY Chancellor's Award for Excellence in Scholarship and Creative Activity in 2011, and the Meritorious Service Certificate and Golden Core Award of the IEEE Computer Society in 2016. In 2018, he was inducted into the U.S. National Academy of Inventors.
To date, he has authored more than 300 peer-reviewed journal and conference papers, which have been cited over 15,000 times. He is a frequent speaker at international conferences, has organized or participated in 18 tutorials, chaired the IEEE Visualization Conference in 2009, served as elected Chair of the IEEE Technical Committee on Visualization and Computer Graphics (VGTC) from 2012 to 2015, and was Editor-in-Chief of IEEE Transactions on Visualization and Computer Graphics from 2019 to 2022. He is a Fellow of the IEEE.

Location: NCS 120

The Office for Research and Innovation at Stony Brook University invites you to attend the inaugural Wolf Den, an evening designed to bring together members of the regional innovation and entrepreneurial ecosystem.

Meet investors, researchers, startup founders, and business leaders to exchange ideas, foster collaboration, and strengthen connections that drive technology development and economic growth across Long Island.

Agenda

4:30 - 5:00 PM | Grab some cheer & mingle
5:00 - 5:40 PM | Welcome remarks and AI Panel
5:40 - 6:00 PM | Featured lightning pitches
6:00 - 7:00 PM | Food, drinks and great conversations!

Attendees will have the opportunity to learn more about Stony Brook's entrepreneurship ecosystem, hear company pitches from emerging startups, and engage in meaningful networking with innovators, investors and community partners.

Refreshments will be served. Registration is required.

In partnership with Accelerate Long Island.

https://www.stonybrook.edu/commcms/innovation/_events/wolfden.php

Zoom Link: https://bit.ly/spdhack2021

ΣΦΔ Hack Spring 2021 is ΣΦΔ's first annual machine learning hackathon. It aims to introduce Stony Brook students to the rich and challenging field of machine learning and to help them develop the skills necessary to build sophisticated machine learning models on their own.
 
More info here: https://github.com/giorgianb/spdhackspring2021/blob/main/README.md
Abstract: Visual generation is a fundamental problem in computer vision and graphics, with applications ranging from 3D capture to content creation and image/video synthesis. Despite rapid progress in neural rendering and generative models, efficiency remains a key obstacle in practice: high-quality 3D reconstruction often depends on dense multi-view supervision; scalable 3D synthesis faces heavy optimization, training, and rendering costs; and modern image/video generators incur substantial computation as token grids grow with spatial resolution and temporal length.
This thesis targets efficient visual world modeling by improving sample efficiency in 3D reconstruction, representation efficiency in 3D generation, and computational efficiency in image/video synthesis. First, we improve sample efficiency for neural implicit surface reconstruction under sparse views by integrating multi-view stereo probability volumes as a geometric regularizer, enabling high-quality reconstruction from as few as three input images. Next, we introduce an explicit 3D representation for 3D generation, built from multi-view depth and RGB predictions with 3D Gaussian features, which enables the use of 2D generative priors while enforcing multi-view consistency via epipolar attention. We then address the computational bottleneck of image and video synthesis with importance-based token merging, using importance signals available during generation to preserve critical information while merging redundant tokens. Finally, we propose efficient mixed-resolution diffusion transformers via cross-resolution phase-aligned attention, aiming to improve attention stability under mixed token grids and support high-fidelity mixed-resolution generation.

Speaker: Haoyu Wu

Location: NCS 120
Abstract: Large language models are prone to memorizing some of their training data. Memorized (and possibly sensitive) samples can then be extracted at generation time by adversarial or benign users. There is hope that model alignment---a standard training process that tunes a model to harmlessly follow user instructions---would mitigate the risk of extraction. However, we develop two novel attacks that undo a language model's alignment and recover thousands of training examples from popular proprietary aligned models such as OpenAI's ChatGPT. Our work highlights the limitations of existing safeguards to prevent training data leakage in production language models.

Speaker: Pegah Alipoormolabashi

Location: CS2311

You are cordially invited to attend the biweekly Brookhaven AI Mixer (BAM). BAM includes three short talks on AI research happening at BNL, followed by an open mixer over coffee and snacks for everyone to network and discuss all things AI. The first half hour will consist of presentations that will be available via Zoom, and the second half hour will be for in-person networking only.

Join us every other Tuesday at noon in CDSD's Training Room (building 725, 2nd floor) to learn about interesting AI methods and applications, engage with potential collaborators, prepare for pending FASST funding calls, and build a community of AI for Science at BNL.

Tuesday, November 26, 2024, 12:00 pm -- CDS, Bldg. 725, Training Room

Speakers

Hanfei Yan, NSLS-II

David Park, CDS, AI Dept

Xihaier Luo, CDS, AI Dept

Join Zoom Meeting

https://bnl.zoomgov.com/j/1601052863?pwd=eIX9qZKPGNtQ11uwbK8JP5hIdIxA3V.1

Meeting ID: 160 105 2863

Passcode: 442980


The overall purpose of this seminar is to bring together people with interests in Computer Vision theory and techniques and to examine current research issues. This course is appropriate for people who have already taken a graduate Computer Vision course or already have research experience in Computer Vision. To enroll in this course, you must either (1) be in the PhD program or (2) receive permission from the instructors.

Each seminar will consist of multiple short talks (around 10 minutes each) by different speakers. Students can register for 1 credit for CSE 656. Registered students must attend and present a minimum of 2 or 3 talks. Everyone else is welcome to attend. Fill in the form at https://forms.gle/pCVXovgfMfQwGqG38 to subscribe to our mailing list for further announcements.

You are cordially invited to attend the biweekly Brookhaven AI Mixer (BAM). BAM includes one short talk on AI research happening at BNL, followed by an open mixer over coffee and snacks for everyone to network and discuss all things AI. The first half hour will consist of presentations that will be available via Zoom, and the second half hour will be for in-person networking only.

Abstract: Two-dimensional (2D) materials such as graphene, hBN, and TMDs offer atomically sharp interfaces and unprecedented tunability when vertically assembled into van der Waals heterostructures. These stacks have enabled discoveries ranging from moiré superconductivity and correlated insulators to quantum emitters and next-generation nanoelectronic devices. Yet constructing high-quality heterostructures remains largely artisanal: researchers manually identify exfoliated flakes, align a polymer stamp by eye, and finely adjust temperature and contact geometry through tacit skill. This manual workflow is difficult to reproduce, scales poorly, and prevents systematic exploration of the enormous combinatorial space of materials, twist angles, and interfacial conditions. AutoLab is an autonomous platform that translates this tacit human expertise into programmable, feedback-driven control. Instead of pressing flakes with predefined trajectories, AutoLab uses machine vision to detect polymer-wafer contact, dynamically regulates contact evolution through closed-loop actuation and temperature control, and captures high-quality flakes with the cleanliness and precision of expert manual fabrication. The system integrates perception, decision making, and motion planning into a single robotic framework, enabling reproducible stacking, wafer-level coverage, and accelerated discovery. Beyond 2D materials, AutoLab illustrates a broader paradigm for AI-native scientific automation: codifying human experimental reasoning into algorithms that interrogate data in real time, adaptively adjust instrumentation, and generate scalable, high-fidelity datasets. Such platforms could generalize to diverse research domains--quantum device fabrication, optical alignment, surface science, autonomous microscopy, and other workflows where expert intuition currently limits throughput and reproducibility. 
By bridging artisanal manipulation and robotic autonomy, AutoLab points toward a future where scientific discovery is accelerated by machines that not only execute instructions, but learn, respond, and collaborate with human scientists.

Biography: Dr. Yutao Li is a research associate in the Department of Condensed Matter Physics and Materials Science at Brookhaven National Laboratory. He has 8 years of experience in fabricating 2D material samples and investigating their electronic transport, optical, and mechanical properties.

Join us every other Tuesday at noon in CDSD's Training Room (building 725, 2nd floor) to learn about interesting AI methods and applications, engage with potential collaborators, prepare for pending FASST funding calls, and build a community of AI for Science at BNL.

Location: CDS, Bldg. 725, Training Room

Join ZoomGov Meeting: https://bnl.zoomgov.com/j/1604383624?pwd=ffQ5cUPNxTI7nzClKQO6cnsNbhF9Vf.1

Meeting ID: 160 438 3624
Passcode: 558449