You are cordially invited to attend the biweekly Brookhaven AI Mixer (BAM). BAM includes one short talk on AI research happening at BNL, followed by an open mixer over coffee and snacks for everyone to network and discuss all things AI. The first half hour will consist of presentations that will be available via Zoom, and the second half hour will be for in-person networking only.

Join us every other Tuesday at noon in CDSD's Training Room (building 725, 2nd floor) to learn about interesting AI methods and applications, engage with potential collaborators, prepare for pending FASST funding calls, and build a community of AI for Science at BNL.

Learning Generalizable Program and Architecture Representations for Performance Modeling

Abstract: Performance modeling is an essential tool in many areas of computer science and engineering. However, existing performance modeling approaches have limitations, such as high computational cost, narrow flexibility, or restricted accuracy/generality. To address these limitations, this talk introduces PerfVec, a novel deep learning-based performance modeling framework that learns high-dimensional and independent/orthogonal program and microarchitecture representations. Once learned, a program representation can be used to predict its performance on any microarchitecture, and likewise, a microarchitecture representation can be applied in the performance prediction of any program. Additionally, PerfVec yields a foundation model that captures the performance essence of instructions, which can be directly used by developers in numerous performance modeling-related tasks without incurring its training cost. The evaluation demonstrates that PerfVec is more general and efficient than previous approaches. This talk will also introduce how PerfVec's design principles can benefit broader research areas.

Biography: Lingda Li is a computer scientist at Brookhaven National Laboratory. He is generally interested in computer architecture and programming model research, with a focus on simulation/modeling, memory systems, and machine learning. Before joining BNL, he was a postdoc in the Department of Computer Science at Rutgers University, where he carried out GPGPU research. He obtained a PhD in computer architecture from the Microprocessor Research and Development Center at Peking University.

Location: CDS, Bldg. 725, Training Room

Join ZoomGov Meeting: https://bnl.zoomgov.com/j/1605837856?pwd=kYqJs4bVBt4E0cMCWR6GXH3wxzOoiw.1

Meeting ID: 160 583 7856
Passcode: 161580

Abstract: As we enter the AI era, domain scientists face a critical question: What can we do to harness AI effectively for scientific discovery? AI has demonstrated remarkable capabilities, from accelerating simulations to uncovering hidden patterns in complex datasets. While these advancements offer unprecedented opportunities, they also raise concerns: AI models often function as black boxes, making it difficult to connect their outputs to established scientific principles. This lack of interpretability can undermine trust and limit adoption, particularly in fields like meteorology where physical understanding is critical.
In this talk, I will explore how interpretable AI can bridge this gap, highlighting its potential to generate explicit, physically meaningful equations rather than opaque neural networks. Through four case studies from my lab, I will showcase how interpretable AI can enhance scientific understanding:
  1. Satellite Precipitation Retrieval: Using AI-based approaches to interpret precipitation retrieval algorithms from AMSU data, we identified critical microwave channels (89 and 150 GHz) that directly link to physical processes in the atmosphere.
  2. Quantitative Precipitation Estimation (QPE): By applying symbolic regression models to polarimetric radar data, we derived mathematical expressions that outperform traditional Z-R relationships and existing QPE algorithms, offering new insights into rainfall microphysics.
  3. Tornado Probability Prediction: Leveraging reinforcement learning-based symbolic deep learning models, we developed interpretable equations that outperform the traditional Significant Tornado Parameter (STP) index, providing a clearer understanding of the relationships between key atmospheric variables and tornado risk.
  4. Domain-Aware Symbolic Regression for Scientific Equations: In our latest work, we introduced a symbolic regression framework that incorporates domain-specific symbol priors extracted from thousands of scientific publications. By encoding common mathematical structures (such as the prevalence of trigonometric functions in physics or logarithmic forms in biology) into a tree-structured reinforcement learning model, we improved both the accuracy and interpretability of discovered equations. This approach accelerates convergence, enforces physical plausibility, and reveals new governing relationships in climate and geophysical data.
Through these examples, I hope to spark discussion on the evolving role of domain scientists in the AI era and inspire new ways to integrate AI with physical understanding in atmospheric research.
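
For readers unfamiliar with the traditional Z-R relationships mentioned in case study 2, the following is a minimal sketch of the classic power-law form Z = a * R^b, inverted to estimate rain rate from radar reflectivity. The coefficients a = 200 and b = 1.6 (the widely used Marshall-Palmer values) and the function name are assumptions chosen for illustration; they are not taken from the talk, and the symbolic-regression expressions it describes are not shown here.

```python
import math


def rain_rate_from_dbz(dbz, a=200.0, b=1.6):
    """Estimate rain rate R (mm/h) from reflectivity in dBZ via Z = a * R**b.

    a and b default to the Marshall-Palmer values (an assumption for this sketch).
    """
    z_linear = 10 ** (dbz / 10)  # convert dBZ to linear reflectivity (mm^6 m^-3)
    return (z_linear / a) ** (1 / b)


# A 40 dBZ echo corresponds to moderate-to-heavy rain under these coefficients.
rate_at_40_dbz = rain_rate_from_dbz(40.0)
print(f"{rate_at_40_dbz:.2f} mm/h")
```

A fixed power law like this uses reflectivity alone, which is one reason expressions derived from polarimetric radar variables can do better.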

IACS Seminar Speaker: Yixin Wen, University of Florida

Location: IACS Seminar Room or Zoom

Join Zoom Meeting: https://stonybrook.zoom.us/j/97596399106?pwd=0PBvElFLqov3biO6OlQxSWLWudkIuH.1
Meeting ID: 975 9639 9106
Passcode: 096213

The Program in Writing and Rhetoric
Invites you to
A Rhetorical/Deliberative Framework for AI Language Model Alignment
featuring
Prof. Zoltan Majdik
North Dakota State University
In this talk, Prof. Majdik proposes a framework for aligning LLMs with values grounded in the norms of rhetorical culture and deliberative democracy. Alongside long-standing AI alignment value targets like safety and transparency, this AI alignment framework assesses to what extent a language model exhibits human and humane values that foster communicative engagement, and it codifies approaches to tuning existing models to better align with such values.

Location: Humanities 1008
How do you get the most out of generative AI? Stop by the library Galleria outside of the Central Reading Room to learn more! Librarians Chris Kretz and Ahmad Pratama, along with David Ecker of DoIT, will be demonstrating tools and tips for writing prompts that make the most of what AI can do. They will be hosting Explore AI demos this Monday through Wednesday (March 3rd-5th) from 12:30 to 1:30. Whether you're new to AI or a current user, they'd love to talk to you about it.

Location: Melville Library Galleria
Abstract: Autonomous systems, whether on Earth or in space, rely on 3D perception to understand and interact with the world around them. Yet traditional techniques for 3D understanding often depend on human-designed features, fixed sensors, and conventional imaging modalities. This constrained approach can limit every stage of perception, from sensing to interpretation to decision-making.
In this talk, we'll explore an alternative paradigm for imaging: physically based neural representations for 3D scenes and 3D sensing systems. We will discuss how recent advances in large-scale learned representations can be used to jointly optimize both 3D scene models and the design of sensing systems for 3D capture, with the goal of enabling task-specific perception systems.
Unlike modern AI models trained on internet-scale datasets, these specialized 3D representations typically operate in data-sparse regimes and therefore require a different kind of prior. We'll examine how grounding these learned representations in the physics of light transport can improve our understanding of scene structure and inform imaging system design even with limited data. By connecting physical insights with learned representations, we'll highlight new possibilities for robust, efficient, and adaptive perception in challenging environments.

Speaker: Nikhil Behari is a graduate student in the Camera Culture group at the MIT Media Lab, advised by Professor Ramesh Raskar. His research interests include computational imaging, 3D scene understanding, and multi-agent decision-making under uncertainty, with a focus on automating imaging system design for 3D perception in human and planetary health. His research is supported by the NASA Space Technology Graduate Research Fellowship. He received his bachelor's in Computer Science and Statistics from Harvard University in 2022.
The Art Department is hosting a guest artist exhibition, featuring the work of Young Maeng. The Opening Reception will be held on October 10th at 5 PM. Additionally, Young Maeng will be giving a talk on 'AI and Painting' on Oct 9 at 4:30 PM at the Future Histories Studio. Exhibition Location: Gallery Unbound, 3rd Floor, Staller Center, Stony Brook University
Time: May 5, 2022, Thursday, 02:00 PM Eastern Time (US and Canada)
Place: New Computer Science (NCS) Room 220, and Zoom

Zoom link: https://stonybrook.zoom.us/j/95948672934?pwd=d3ZDcUJkK3VweFBDVWhIVDhtaFU2Zz09
Meeting ID: 959 4867 2934
Passcode: 082036

Title: Generative Adversarial Learning using Optimal Transport

Abstract: 

Generative Adversarial Learning (GAL) aims to learn a target distribution in an adversarial manner. A Generative Adversarial Network (GAN) is a concrete implementation of GAL using a discriminator and a generator that play a min-max game. GANs have been used in many machine learning and computer vision applications. However, GANs are known to be hard to train, mainly because a min-max saddle point optimization problem needs to be solved in GAL. In this thesis, I investigate several methods to improve generative adversarial learning using Optimal Transport (OT). 

Previous Wasserstein GANs (WGANs) do not compute the correct Wasserstein distance to train the discriminator. To address this problem, I propose WGAN-TS, which uses the L1 transport cost and computes the correct Wasserstein distance to train the discriminator. To ensure the local convergence of WGANs, I propose WGAN-QC, which adopts the quadratic transport cost. I prove that WGAN-QC not only computes the correct Wasserstein distance but also converges to a local equilibrium point. To compute the Wasserstein distance over the whole dataset, I propose to use Semi-Discrete Optimal Transport (SDOT) to match noise points and the real images during GAN training. To measure the quality of an SDOT map, I use the Maximum Relative Error (MRE) and the L1 distance between the target distribution and the transported distribution obtained by an OT map. I propose statistical methods to estimate the MRE and the L1 distance. I propose an efficient Epoch Gradient Descent algorithm for SDOT (SDOT-EGD). To deal with the 2D special case of GAL, I propose to use OT to learn 2D distributions. In particular, I adopt OT to match persistence diagrams in training a topology-aware GAN and learn density maps in the crowd counting task. Finally, I use OT and the topological maps of the crowd to improve the crowd counting performance and propose a topology-based metric to measure the quality of the crowd density maps.
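
For intuition about the Wasserstein distance under an L1 transport cost, the sketch below covers only the simplest case: two one-dimensional empirical distributions with the same number of samples, where the optimal transport plan pairs the sorted samples. This is an illustration under those assumptions, not WGAN-TS, WGAN-QC, or SDOT; the function name and sample values are hypothetical.

```python
def wasserstein1_1d(xs, ys):
    """Wasserstein-1 distance between two equal-size 1D samples.

    In one dimension the optimal plan matches the i-th smallest point of xs
    to the i-th smallest point of ys, so no optimization is needed.
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal sample counts")
    pairs = zip(sorted(xs), sorted(ys))
    return sum(abs(a - b) for a, b in pairs) / len(xs)


real = [0.0, 1.0, 3.0]        # stand-in for samples from the target distribution
generated = [5.0, 2.0, 1.0]   # stand-in for generator outputs
distance = wasserstein1_1d(real, generated)
print(distance)
```

For images, no such sorting shortcut exists, which is why computing the distance correctly during GAN training is a research problem in its own right.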
Abstract: Datalog is a powerful language for expressing recursive computations through rules: Horn clauses in first-order logic. Although effective at expressing queries over existential properties, Datalog and many of its popular implementations struggle with queries that involve more complex aggregates, requiring users to apply verbose, non-composable, and/or inefficient workarounds. Recent work on lattice-based datalogs addresses many of these concerns for aggregates that can be encoded as lattices (e.g., min or max), but more general aggregates like count remain problematic. In this talk, I will argue that this is not a fundamental limitation of Datalog, but rather a consequence of its model of truth: both Datalog semantics and evaluation rules make heavy use of the fact that insertion is both monotone and idempotent. Once a fact is known to be true, it cannot be retracted, nor can further discoveries of the same fact alter its truth. Monotonicity is critical for forward progress under Datalog's "open world" model, as it allows us to safely assert the truth of a body. Meanwhile, idempotence makes it easier to reason about evaluation, as we need only guarantee that each head atom will be derived at least once. Unfortunately, more general aggregates like sum() are neither idempotent nor monotone. I will introduce Hedgelog, a strict generalization of Datalog that uses general monoids as a basis for truth. I will show that this generalization remains compatible with Datalog's open world model, how it enables cleaner and more composable Datalog programs, and how the underlying monoid relations open the door to interesting data-structure-level optimizations.
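
The idempotence point can be illustrated with a small sketch. It is written in Python rather than Datalog or Hedgelog syntax, and the `aggregate` helper and sample values are hypothetical. It folds a list of derived values with a monoid and compares the result when one fact is derived once versus twice, as at-least-once evaluation permits.

```python
from functools import reduce


def aggregate(op, identity, derivations):
    """Fold derived values with a monoid given as (op, identity)."""
    return reduce(op, derivations, identity)


derived_once = [3, 5]
derived_twice = [3, 3, 5]  # the fact with value 3 was rediscovered

# min is idempotent: rediscovering a fact does not change the aggregate.
min_once = aggregate(min, float("inf"), derived_once)
min_twice = aggregate(min, float("inf"), derived_twice)

# sum is not idempotent: the duplicate derivation is counted again.
sum_once = aggregate(lambda a, b: a + b, 0, derived_once)
sum_twice = aggregate(lambda a, b: a + b, 0, derived_twice)

print(min_once == min_twice)  # True
print(sum_once == sum_twice)  # False
```

This is why an evaluation strategy that only guarantees each fact is derived at least once is safe for min or max but not for sum or count.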

Bio: Oliver Kennedy is an associate professor at the University at Buffalo. He earned his PhD from Cornell University in 2011 and now leads the Online Data Interactions (ODIn) lab, which operates at the intersection of databases and programming languages. Oliver is the recipient of an NSF CAREER award, an IEEE Region 1 Technological Innovation Award, UB's Exceptional Scholar Award, and several UB SEAS teaching awards. Oliver is also one of the founding board members of Breadcrumb Analytics. Several of Oliver's papers have been invited to Best of compilations from SIGMOD and VLDB. The ODIn lab is currently exploring (i) how we can leverage database techniques like incremental view maintenance to make compilers faster, (ii) how to make it easier for data scientists to track how sources of uncertainty, ambiguity, and/or bias affect analyses, and (iii) how to streamline the interfaces, both human and software, between different tools for data science, like Python, SQL, and spreadsheets.

Location: NCS 120
Title: AI-Driven Target Selection Methods for Touch and Gaze Input

Abstract: Accurately selecting targets is an essential aspect of Human-Computer Interaction. Erroneous selections can cause tedious undo and redo actions. Additionally, some selection errors are non-reversible and can lead to undesirable consequences. However, high-accuracy target selection remains a challenge on touchscreen devices due to the small target size and imprecise touch inputs, and in gaze interaction because of gaze tracking noise and the lack of an easy-to-use selection action. We first propose ReLM, a Reinforcement Learning-based Method for touchscreen target selection. ReLM can automatically show suggestions and require a second touch if the input is ambiguous, and can directly select a target candidate when the input is certain. Our empirical evaluation shows that, compared to Shift, an existing suggestion-based method, ReLM reduces the error rate from 6.92% to 1.63% and the selection time from 2.23 s to 1.59 s. Compared to BayesianCommand, a direct selection-based method, ReLM reduces the error rate from 3.64% to 0.89%, while increasing the selection time by only 200 ms. Secondly, we investigate how to improve target selection performance for gaze interaction. We propose BayesGaze, an eye-gaze-based target selection method. It accumulates, over successive gaze points, a per-target selection signal computed with Bayes' theorem, and uses a threshold mechanism to determine the target selection. Our investigation shows that BayesGaze improves target selection accuracy and speed over a dwell-based selection method and the Center of Gravity Mapping method.
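
The accumulate-then-threshold idea behind the gaze method can be sketched as follows. This is a simplified illustration, not the authors' BayesGaze implementation: the uniform prior, the isotropic Gaussian likelihood, the threshold value, and all function names are assumptions made for this example.

```python
import math


def posterior(gaze, targets, sigma=1.0):
    """P(target | gaze point) via Bayes' theorem.

    Assumes a uniform prior over targets and an isotropic Gaussian
    likelihood centered on each target (both assumptions of this sketch).
    """
    likelihoods = [
        math.exp(-((gaze[0] - x) ** 2 + (gaze[1] - y) ** 2) / (2 * sigma ** 2))
        for x, y in targets
    ]
    total = sum(likelihoods)
    return [value / total for value in likelihoods]


def select_target(gaze_points, targets, threshold=2.5, sigma=1.0):
    """Accumulate per-target posteriors; return the first target index whose
    accumulated score reaches the threshold, or None if none does."""
    scores = [0.0] * len(targets)
    for gaze in gaze_points:
        for i, p in enumerate(posterior(gaze, targets, sigma)):
            scores[i] += p
        best = max(range(len(targets)), key=scores.__getitem__)
        if scores[best] >= threshold:
            return best
    return None


targets = [(0.0, 0.0), (4.0, 0.0)]                          # target centers
noisy_gaze = [(3.6, 0.3), (4.2, -0.2), (3.9, 0.1), (4.1, 0.0)]
selected = select_target(noisy_gaze, targets)
print(selected)
```

Because evidence is summed across samples, a single noisy gaze point cannot trigger a selection on its own, while sustained attention on one target crosses the threshold quickly.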

All are welcome. Here is the Zoom meeting link:
https://stonybrook.zoom.us/j/93130953411?pwd=Rm5IRlVPQ3M0cHJsTXpCVFljUlFGUT09
Meeting ID: 931 3095 3411
Passcode: 999413