CSE 600 Seminar Series | Fall 2025


Abstract: Virtual worlds are prevalent in applications ranging from entertainment, healthcare, and retail to workforce training. Demand for virtual content is growing exponentially, and the market for such content is valued at over $200 billion, accelerating the need for advanced computational solutions. In this talk, I will focus on a key challenge in virtual content creation: simulating autonomous agents.
I begin by overviewing this problem domain, through the lens of a physics-based dynamics simulation, which enables the simulation of thousands of agents at interactive rates with GPU programming, achieving a level of performance previously unattainable.
Next, I'll present our recent results in Deep Reinforcement Learning for multi-agent navigation, which enable refined, reward-based strategies to control agent movement. We demonstrate how these techniques can simulate realistic crowds, with broad applications to pedestrians, robots, and swarms. Lastly, I conclude my talk by discussing our lab's work at large and the wide range of research opportunities in this emerging area.

Speaker: Tomer Weiss has been a professor at the New Jersey Institute of Technology since 2020. He received the best student, presentation, and best paper awards at various ACM SIGGRAPH conferences for his work on simulating multi-agent crowds. He was also a finalist in both the ACM SIGGRAPH Thesis Fast Forward and the ACM SIGGRAPH Asia Doctoral Symposium in 2018. He received his PhD in computer science from UCLA in 2018. His research interests include multi-agent dynamics, scene understanding, and interactive visual computing.




https://stonybrook.zoom.us/j/91775729097?pwd=Qlc5Nks0NmlyKzJwMjR0S0hrdVZ3QT09

Meeting ID: 917 7572 9097
Passcode: 555459


Abstract: As the saying goes, there are many ways to skin a cat.
While we don't want to go around skinning cats, the world of
optimization is rich with different problems, problem formulations,
and methods and approaches, each with different guarantees and
computational benefits. In this talk we will take a tour down the
problem of structured sparsity in sensing to see how one simple
problem can inspire a wide range of analysis and tools. First, I will
present the optimality conditions for a generalized structured sparse
problem, which can be geometrically visualized as alignment of vectors
and matrices. Then I will introduce three approximation methods for
the problem of phase retrieval, which are a twist on stochastic
gradient and coordinate descent methods. These methods leverage
fundamental numerical linear algebra concepts to give fast approximate
solutions to large-scale problems, which then after postprocessing can
produce more reliable sensing results.

Bio: Yifan Sun received her PhD in Electrical Engineering from the
University of California Los Angeles in 2015, with research focusing
on convex optimization and semidefinite programming. She was then at
Technicolor Research and Innovation, focusing on machine learning and
data science applications. More recently, she completed two postdocs,
at the University of British Columbia in Vancouver, Canada and
L'Institut National de Recherche en Informatique et Automatique
(INRIA) in Paris, France.

Abstract: In today's digital era, language functions not only as a medium of information transmission but also as a mechanism of persuasion, framing, and control. The proliferation of online platforms has amplified this dual role: while enabling unprecedented access to knowledge, it has also exacerbated challenges such as misinformation, rhetorical manipulation, and cultural or linguistic disparities in information access. As a result, pragmatic language understanding and information integrity have emerged as central concerns for both computational linguistics and society at large. This research follows how claims are produced, reframed, and contested online through three interconnected threads.

First, it models pragmatic deflection in discourse by investigating whataboutism, a rhetorical device that deflects criticism by redirecting discourse, and introduces novel datasets from Twitter (now X) and YouTube. This work underscores how subtle pragmatic maneuvers can erode discourse integrity without relying on outright falsehoods.

Second, it advances retrieval and alignment for information integrity in health and news communication. These systems trace claims and narratives across genres (e.g., social posts and news reports) and languages (Chinese and English), linking social posts with journalistic reporting and aligning Chinese news with English biomedical evidence. By accounting for cultural context, the systems link assertions to reliable evidence and organize them for systematic comparison. This work surfaces the risks of missing sources, unverifiable claims, and framing disparities in global health discourse, and demonstrates computational solutions that enhance both the credibility and accessibility of information.

Third, the methodological centerpiece is Class Distillation (ClaD), a geometry-aware training paradigm for distilling a small, well-defined target class from a large, heterogeneous background. ClaD couples a distribution-aware contrastive loss (instantiated here in a Mahalanobis form when its assumptions fit the data) with an interpretable decision algorithm tuned for class separation. Evaluated on sarcasm, metaphor, and sexism detection, ClaD delivers strong efficiency and robustness, matching or surpassing larger models while using fewer computational resources, making these pipelines practical by learning reliably from small, sharply defined classes.

In sum, this research presents an integrated account of language understanding in the digital age. It exposes how integrity falters through pragmatic deflection, cross-genre drift, and cross-lingual misalignment; translates these insights into systems for evidence retrieval, alignment, and verification; and delivers methods that leverage pragmatic language use.

Speaker: Chenlu Wang

Location: (Old) Computer Science Building, Room 2311

You are cordially invited to attend the biweekly Brookhaven AI Mixer (BAM). BAM includes one short talk on AI research happening at BNL, followed by an open mixer over coffee and snacks for everyone to network and discuss all things AI. The first half hour will consist of presentations that will be available via Zoom, and the second half hour will be for in-person networking only.

Abstract: This presentation will begin by outlining key challenges facing the modern power grid and summarizing our group's research efforts to address them. It will then discuss how AI and machine learning are reshaping grid modernization. The talk will focus primarily on a range of AI/ML applications we have developed in recent years to enhance grid operation, planning, control, and security.

Biography: Meng Yue currently leads the Grid Modernization and Security Group in the Interdisciplinary Science Department at Brookhaven National Laboratory (BNL). He received his Ph.D. in electrical engineering from Michigan State University. His major research interests include power system modeling, simulation, and control, and applications of AI/ML, quantum machine learning, and quantum computing in the operation, planning, and security of the future grid.

Join us every other Tuesday at noon in CDSD's Training Room (building 725, 2nd floor) to learn about interesting AI methods and applications, engage with potential collaborators, prepare for pending FASST funding calls, and build a community of AI for Science at BNL.

Location: CDS, Bldg. 725, Training Room

Join ZoomGov Meeting: https://bnl.zoomgov.com/j/1604383624?pwd=ffQ5cUPNxTI7nzClKQO6cnsNbhF9Vf.1

Meeting ID: 160 438 3624
Passcode: 558449

Join Stony Brook University's Center for Excellence in Learning and Teaching (CELT) for a bootcamp on how to use AI to enhance your teaching and courses. This event will demonstrate how ChatGPT, Microsoft Copilot, and other generative AI platforms can support you in crafting learning objectives, writing exam questions, composing rubrics, and designing course content such as lesson plans, in-class activities, instructional videos, and more.

Register here.

Abstract: The remarkable success of large foundational models, such as LLMs and diffusion models, is built on their learning over vast amounts of static data from the Internet. However, human learning and problem-solving are fundamentally interactive processes--humans learn by engaging with their environment, tools, search engines, and feedback loops, iteratively refining their understanding and decisions. This gap between the interactivity of human learning and the static nature of model training raises a critical question: how can we imbue foundational models with the capacity for meaningful interaction?

In this talk, I will explore methods to enhance foundational models by incorporating interaction with the external environment. I will discuss strategies such as leveraging external tools, compilers, and function calls to provide dynamic feedback that enhances foundation models. By drawing inspiration from humans' interactive learning processes, I demonstrate how interaction-driven learning can lead to models that are not only more accurate but also more adaptable to real-world applications.

This work bridges the gap between static training paradigms and the dynamic, iterative nature of human intelligence, paving the way for a new generation of interactive AI systems.

Bio: Wenhu Chen has been an assistant professor in the Computer Science Department at the University of Waterloo and at the Vector Institute since 2022. He obtained the Canada CIFAR AI Chair Award in 2022 and the CIFAR Catalyst Award in 2024. He has worked for Google DeepMind as a part-time research scientist since 2021. Before that, he obtained his PhD from the University of California, Santa Barbara under the supervision of William Wang and Xifeng Yan. His research interests lie in natural language processing, deep learning, and multimodal learning. He aims to design models to handle complex reasoning scenarios such as math problem-solving and structured knowledge grounding. He is also interested in building more powerful multimodal models to bridge different modalities. He received the Area Chair Award at AACL 2023, the Best Paper Honorable Mention at WACV 2021, the Best Paper Finalist at CVPR 2024, and the UCSB CS Outstanding Dissertation Award in 2021.

Abstract: The advent of ChatGPT has redrawn the boundary of pedagogical discourse, where the dyadic configuration of teacher-student has, for many, become triadic -- one that includes AI as a relevant third party, not to be missed or dismissed. Within applied linguistics, AI-focused research has predominantly targeted the teaching and learning of writing (Fang & Han, 2025). The work on AI and speaking, on the other hand, has largely involved perception studies documenting its positive impact on learners' willingness to communicate (Goh & Aryadoust, 2025). In this talk, I explore the role of AI in the teaching and learning of speaking, and in particular, the development of interactional competence. Based on a corpus of learner-AI interactions, I demonstrate the ways in which ChatGPT excels and fails at acting as a useful conversation partner, with a view towards furthering our ongoing deliberation on the affordances and constraints of AI in language education.

Speaker: Hansun Zhang Waring (Teachers College, Columbia University)

Hansun Zhang Waring is Professor of Linguistics and Education at Columbia University and founder of The Language and Social Interaction Working Group (LANSI). As an applied linguist and a conversation analyst, Hansun is interested in all things interaction -- (second language) pedagogical interaction, communication with the public, parent-child interaction, and human-AI interaction (HAI). Her work has appeared in leading journals in applied linguistics and discourse analysis as well as numerous book volumes, some of which she (co-)authored or co-edited. She is on the editorial boards of Chinese Language and Discourse (CLD), Classroom Discourse (CD), and International Review of Applied Linguistics (IRAL).

Location: Wang Center, Lecture Hall #1

If you need special accommodation, please contact chikako.nakamura@stonybrook.edu.

Over the past decade, researchers in neuroscience, psychology and artificial intelligence have come together to build advanced computer models that mimic how our brain processes what we see. These models are designed to closely copy the brain's visual system, all the way to a key area called the inferior temporal cortex, which plays an important role in recognizing objects.

Because these computer models can be fully observed, scientists can use them to make detailed predictions about how the brain works -- something older, more theoretical models could not do.

Dr. James DiCarlo's work explores whether these computer digital twin models of the brain could help guide safe, non-invasive ways to influence brain activity. In his talk, he explains how such a model could be used to design specific patterns of light. When this carefully designed light is added to what the eye naturally sees, it can precisely influence activity in groups of neurons in the inferior temporal cortex.

Since neural activity in this visual brain area may be connected to emotional states like anxiety, this research could eventually open the door to non-invasive approaches that may benefit mental well-being in the future.

Speaker: James J. DiCarlo, MD, PhD, Peter de Florez Professor, MIT Brain and Cognitive Sciences, and Director, MIT Siegel Family Quest for Intelligence

Location: Staller Center Main Stage

The event will be livestreamed at stonybrook.edu/live

CSE 600 Seminar Series | Fall 2025


Abstract: Vision-language models that see and describe the world are now part of our daily lives, from internet search and accessibility tools to content generation and automatic moderation. However, as these models grow and become more widely used, their limitations have also become increasingly visible. In particular, it has been shown that these models are unable to reliably perform complex tasks that require abstraction and compositional reasoning. For example, they struggle to decompose an image or text into entities, attributes, and relations, and then reason over new combinations of these elements. As a result, we see generated content full of hallucinations, privacy leaks in images, and different types of biases in the model outputs.

In this talk, I will outline a research agenda that aims to build trustworthy vision-language models in the age of generative AI. I will begin with compositional reasoning: how natural language inference can be used to decompose complex instructions and captions into atomic, verifiable statements, improving both evaluation and model behavior on tasks that require multi-step reasoning. I will then discuss how synthetic data and simulated environments can be used to train more reliable models, and how they can also stress-test models beyond standard benchmarks, revealing when models drop attributes, break object relations, or fail under distribution shifts. I will also share recent work on using hallucination correction as a signal to improve video-language alignment, and on privacy-preserving image understanding for blind and low-vision users. I will conclude with possible ways we can systematically probe, debug, and repair these models, turning synthetic perception into something we can trust in real-world deployments.



Speaker: Paola Cascante-Bonilla is a tenure-track Assistant Professor in the Department of Computer Science at Stony Brook University (SUNY). Before that, she was a Postdoctoral Associate at the University of Maryland Institute for Advanced Computer Studies (UMIACS), developing methods and metrics related to trustworthy machine learning. She received her Ph.D. in Computer Science at Rice University in 2024, working on Computer Vision, Natural Language Processing, and Machine Learning.

Her research focuses on developing systems that enable compositional reasoning and common-sense inference through vision and language, while tackling issues such as cultural biases, data distribution, explainability, and trustworthy AI. Additionally, Cascante-Bonilla creates simulated environments for embodied agents to learn in a safe, controlled setting, aiming to facilitate effective collaboration and problem-solving for complex tasks by leveraging the implicit knowledge of large-scale pre-trained deep learning models.

Cascante-Bonilla is the recipient of the Ken Kennedy Institute SLB Graduate Fellowship (2022/23). She was also selected as a Future Faculty Fellow by Rice's George R. Brown School of Engineering (2023) and as a Rising Star in EECS (2023).

Location: NCS 120

Abstract: How do humans learn the sound patterns of their language? Despite a variety of methods and advances in phonotactic learning, there is still a paucity of computational research, methods and data for languages with tones. In this talk, I will explore this question specifically in light of tone languages, where pitch plays a crucial role in distinguishing words' meaning. I provide an implementation of the Bottom-Up Factor Inference Algorithm over Autosegmental Representations (BUFIA-AR), which learns the rules governing possible tone patterns. Using a dataset of Hausa, a West African tone language, the algorithm successfully identifies patterns that are not permitted in the language. These results (i) confirm long-standing linguistic generalizations, (ii) make more specific predictions about exceptional cases, and (iii) reveal previously unnoticed patterns. The results show how mathematical models of sound structure can be brought into dialogue with both linguistic theory and computational learning, highlighting the broader potential of formal approaches to capture human linguistic knowledge.

Bio: Han Li is a fifth-year Ph.D. student in the Department of Linguistics, specializing in computational linguistics under the supervision of Professor Jeff Heinz. Her research focuses on how sound patterns in language can be formally represented and computationally learned, bridging theoretical linguistics and computer science.

Location: Institute for Advanced Computational Science, Seminar Room

Zoom Meeting: https://stonybrook.zoom.us/j/94043459206?pwd=3ra47h8HghOFRfobRBjZaDMyTwialr.1
Meeting ID: 940 4345 9206
Passcode: 332717