You are cordially invited to attend the biweekly Brookhaven AI Mixer (BAM). BAM includes three short talks on AI research happening at BNL, followed by an open mixer over coffee and snacks for everyone to network and discuss all things AI. The first half hour will consist of presentations that will be available via Zoom, and the second half hour will be for in-person-only networking.

Join us every other Tuesday at noon in CDSD's Training Room (building 725, 2nd floor) to learn about interesting AI methods and applications, engage with potential collaborators, prepare for pending FASST funding calls, and build a community of AI for Science at BNL.

Tuesday, January 7, 2025, 12:00 pm -- CDS, Bldg. 725, Training Room

Join ZoomGov Meeting: https://bnl.zoomgov.com/j/1615289117?pwd=Hqkbj9itxWrFnkhZ8rQXHPInO2gxdF.1

Meeting ID: 161 528 9117
Passcode: 991382

You are cordially invited to attend the biweekly Brookhaven AI Mixer (BAM). BAM includes one short talk on AI research happening at BNL, followed by an open mixer over coffee and snacks for everyone to network and discuss all things AI. The first half hour will consist of presentations that will be available via Zoom, and the second half hour will be for in-person-only networking.

At our Oct 7 Mixer, BNL's newly minted interim director, John Hill, will be present to give opening remarks and kick off a new year of impactful scientific AI collaborations.

Abstract: Weather extremes and strong seasonal-to-subseasonal variability pose growing challenges to urban populations, infrastructure, and energy systems. Yet, most cities remain data deserts: routine weather observations are sparse, with stations concentrated at airports rather than within the urban core. This lack of coverage limits our ability to monitor and predict fine-scale urban weather patterns precisely where they matter most. We present a new AI-driven framework for optimal sensor placement and urban weather monitoring. Unlike traditional approaches, our method leverages physics-based simulations together with Bayesian experimental design principles, but does so using a computationally efficient variational inference strategy that makes large-scale optimization tractable. This allows us to guide sensor networks in a way that minimizes information loss while capturing spatiotemporal variability at city scales. Applied to Phoenix, Arizona, our framework outperforms random sensor placement strategies, especially when only a limited number of sensors can be deployed. Importantly, the same AI models that guide sensor placement also function as a real-time nowcasting tool, providing urban weather information over the entire domain, beyond sensor locations. Together, these capabilities offer a scalable pathway to reduce urban data deserts, enhance monitoring of weather extremes, and improve resilience planning for energy, transportation, and public health systems.

Biography: Dr. Katia Lamer is an atmospheric scientist and the Director of the Center for Multiscale Applied Sensing at Brookhaven National Laboratory. Originally from Canada, she earned her B.S. and M.S. in Atmospheric and Oceanic Sciences from McGill University and a Ph.D. in Meteorology from Penn State University. Her research focuses on atmospheric boundary layer processes and remote sensing technologies, with a strong emphasis on data science. At Brookhaven, she is known for her work with the CMAS mobile observatories and facilities, which connect fundamental atmospheric science to real-world applications, improving weather prediction, environmental monitoring, and urban climate resilience. Her work has been featured in public outlets such as New Scientist and Wired. Dr. Lamer also serves as an invited member of the World Meteorological Organization's Data Assimilation and Observing Systems Working Group and the American Meteorological Society's Boundary Layer and Turbulence Committee.

Location: CDS, Bldg. 725, Training Room

Join ZoomGov Meeting: https://bnl.zoomgov.com/j/1604383624?pwd=ffQ5cUPNxTI7nzClKQO6cnsNbhF9Vf.1

Meeting ID: 160 438 3624 | Passcode: 558449

Abstract: Anxiety disorders are characterized by persistent and excessive fear and worry that interfere with daily functioning, which distinguishes them from the adaptive anxiety that helps individuals respond to challenges. Despite affecting millions worldwide and imposing a significant public health burden, anxiety disorders remain underdiagnosed relative to their actual prevalence because of limited understanding and stigmatization. Leveraging machine learning (ML) and natural language processing (NLP) approaches can help bridge this gap by enabling scalable and accessible mental health assessments, offering a data-driven understanding of anxiety from individual and societal perspectives, and shedding light on societal stigmas toward mental health conditions. At the same time, advancing ML and NLP techniques for anxiety research presents unique technical challenges, such as effectively modeling linguistic markers of anxiety and ensuring interpretability in mental health predictions.

This dissertation investigates anxiety from both individual and societal perspectives using artificial intelligence. First, we explore individual manifestations of anxiety through three methodological advancements: (1) integrating contextual and discourse-level embeddings to improve language-based anxiety prediction using Facebook posts and self-reported surveys; (2) enhancing cognitive dissonance detection in a Twitter dataset with transfer learning and active learning; and (3) developing longitudinal representation learning approaches that achieve both predictive utility and interpretability for adolescent psychopathology. Second, we extend our analysis to the societal dimension of anxiety by identifying and categorizing social norms expressed in Reddit and Twitter posts and examining their associations with anxiety. By combining data-driven methods with psychological insights, this work studies anxiety from various angles, capturing both individual experiences and societal influences, and offers a step toward a more comprehensive understanding of its causes and manifestations.

Speaker: Swanie Juhng

https://stonybrook.zoom.us/j/98905245099?pwd=M7rI7aNfNio281qyebEUdNPBcSiK7Y.1

Abstract: Designing custom proteins could revolutionize medicine and materials, but it remains an immense scientific challenge. Our work uses large-scale AI foundation models to generate novel proteins tailored to bind specific small molecules. Each AI-generated design is passed through a rigorous, multi-stage validation pipeline to ensure it is biophysically realistic. A key innovation is fine-tuning our model with data from molecular dynamics (MD) simulations, exposing it to the conformational dynamics and energetics of protein-ligand binding. This physics-aware training results in novel protein designs with enhanced stability and more effective binding capabilities.

Bio: Xin Dai is an Assistant Computational Scientist in the Artificial Intelligence Department of the CDS. His work centers on AI for Science with a strong focus on computational biology. He earned his PhD in Physics from Tsinghua University.

Location: CDS, Bldg. 725, Training Room

Join Zoom Meeting: https://bnl.zoomgov.com/j/1604383624?pwd=ffQ5cUPNxTI7nzClKQO6cnsNbhF9Vf.1

Meeting ID: 160 438 3624
Passcode: 558449

Abstract:

Recent advances in deep learning have significantly enhanced the capabilities of Natural Language Processing (NLP) and Vision-Language Models (VLMs). However, these advancements come with increased vulnerabilities, notably through backdoor attacks that pose severe security threats. This thesis addresses two critical dimensions of Trustworthy AI and Efficient Multimodal Representation Learning: (1) security through analyzing, detecting, and designing backdoor attacks in NLP and VLMs, and (2) efficiency through advanced multimodal representation methods tailored for clinical and medical imaging applications.

In the first dimension, we explore the internal mechanisms exploited by backdoor attacks, identifying the distinctive phenomenon of attention focus drifting in compromised transformer models, where trigger tokens consistently hijack attention. Leveraging these insights, we propose robust detection frameworks, including the attention-based Trojan detector (AttenTD) and a task-agnostic logit-based detection method (TABDet), achieving effective identification of backdoored NLP models across diverse tasks. We further introduce novel backdoor attack methodologies: the Trojan Attention Loss (TAL), enhancing attack efficiency and stealth through direct attention manipulation, and BadCLM, demonstrating critical vulnerabilities in clinical decision-support systems by effectively compromising clinical language models.

Extending our security exploration to multimodal settings, we investigate backdoor attacks on Vision-Language Models (VLMs), particularly in complex image-to-text generation tasks, proposing innovative techniques (TrojVLM, VLOOD) capable of embedding backdoors without direct access to original training data, thus showcasing practical risks in real-world scenarios.

In the second dimension, we address efficiency and interpretability challenges in clinical and pathology applications. We introduce TCP-LLaVA, the first multimodal large language model (MLLM) designed explicitly for Whole Slide Image (WSI) Visual Question Answering (VQA). Utilizing a novel token compression mechanism inspired by transformer-based models, TCP-LLaVA substantially reduces computational resource consumption while maintaining superior VQA performance across multiple tumor subtypes. Additionally, we present a multimodal transformer model integrating structured Electronic Health Records (EHR) with clinical notes, demonstrating enhanced predictive accuracy and interpretability for in-hospital mortality prediction through integrated gradient-based interpretability methods.

Together, these contributions present a comprehensive approach to ensuring AI models are not only secure against malicious manipulation but also efficient and interpretable for critical clinical applications, underscoring the essential need for trustworthy and effective AI systems.

Speaker: Weimin Lyu

Zoom: https://stonybrook.zoom.us/j/2392326575?pwd=SVQ2VkFXTnZZYmJUMXgvTXBuZWM3UT09

Meeting ID: 239 232 6575
Passcode: 436192

The International Conference on Machine Learning (ICML) is the premier gathering of professionals dedicated to the advancement of the branch of artificial intelligence known as machine learning.

ICML is globally renowned for presenting and publishing cutting-edge research on all aspects of machine learning used in closely related areas like artificial intelligence, statistics and data science, as well as important application areas such as machine vision, computational biology, speech recognition, and robotics.

ICML is one of the fastest growing artificial intelligence conferences in the world. Participants at ICML span a wide range of backgrounds, from academic and industrial researchers, to entrepreneurs and engineers, to graduate students and postdocs.


For more information and registration, visit the official website.

Abstract:

Large language models (LLMs) can often suggest colloquial plans given verbal descriptions of tasks, yet they are unable to reliably provide executable and verifiable plans given formally specified environments. In this talk, I will discuss a strand of efforts to have LLMs generate accurate and explainable plans in textual simulations. Instead of directly generating the plan or actions, LLMs are prompted to generate Planning Domain Definition Language (PDDL) that specifies the environment (domain file) and the task (problem file), which can then be deterministically solved with an off-the-shelf planner. In a 3-phase study, my collaborators and I first observed that it is possible but very challenging for LLMs to generate long-form code such as PDDL domain and problem files given textual specifications. Next, we devised methodologies for LLMs to iteratively generate and refine problem files while exploring a partially-observed, simulated, textual environment. Third, we showed that domain files are even more difficult to generate correctly, even on well-established planning tasks such as BlocksWorld. To close, I will discuss ongoing efforts to improve this structured-generation ability and promising frontiers to explore.
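As a rough illustration of the "formal specification plus deterministic solver" idea in the abstract, the sketch below encodes a tiny BlocksWorld task as sets of facts and solves it with breadth-first search. It is not the speaker's method and does not parse real PDDL: the action names, the fact tuples, and the helpers `ground_actions` and `plan` are hypothetical stand-ins for a domain file, a problem file, and an off-the-shelf planner.

```python
from collections import deque
from itertools import permutations


def ground_actions(blocks):
    """Toy 'domain': (name, preconditions, add list, delete list) per ground action."""
    actions = []
    for b, x in permutations(blocks, 2):
        # Take b off x and put it on the table.
        actions.append((
            f"move-to-table {b} {x}",
            {("on", b, x), ("clear", b)},
            {("ontable", b), ("clear", x)},
            {("on", b, x)},
        ))
        # Take b from the table and stack it on x.
        actions.append((
            f"move-from-table {b} {x}",
            {("ontable", b), ("clear", b), ("clear", x)},
            {("on", b, x)},
            {("ontable", b), ("clear", x)},
        ))
    for b, x, y in permutations(blocks, 3):
        # Move b from the top of x to the top of y.
        actions.append((
            f"move {b} {x} {y}",
            {("on", b, x), ("clear", b), ("clear", y)},
            {("on", b, y), ("clear", x)},
            {("on", b, x), ("clear", y)},
        ))
    return actions


def plan(init, goal, actions):
    """Breadth-first search over states; returns a shortest plan or None."""
    start = frozenset(init)
    frontier = deque([(start, [])])
    seen = {start}
    while frontier:
        state, steps = frontier.popleft()
        if goal <= state:  # every goal fact holds in this state
            return steps
        for name, pre, add, delete in actions:
            if pre <= state:  # action is applicable
                nxt = frozenset((state - delete) | add)
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append((nxt, steps + [name]))
    return None


# Toy 'problem': C sits on A; A and B are on the table. Goal: A on B on C.
blocks = ["A", "B", "C"]
init = {("on", "C", "A"), ("ontable", "A"), ("ontable", "B"),
        ("clear", "C"), ("clear", "B")}
goal = {("on", "A", "B"), ("on", "B", "C")}

steps = plan(init, goal, ground_actions(blocks))
print(steps)
```

Because the search is exhaustive and deterministic, any error in the generated specification (a missing precondition, a wrong initial fact) shows up as an invalid or missing plan, which is what makes this style of output checkable.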

Bio:
Li Harry Zhang is an assistant professor at Drexel University, focusing on Natural Language Processing (NLP) and artificial intelligence (AI). He obtained his PhD from the University of Pennsylvania, advised by Prof. Chris Callison-Burch. Before that, he obtained his Bachelor's degree at the University of Michigan, mentored by Prof. Rada Mihalcea and Prof. Dragomir Radev. His current research uses large language models (LLMs) to reason and plan via symbolic and structured representations. He has published more than 20 peer-reviewed papers in NLP and AI conferences, such as ACL, EMNLP, and AACL, that have been cited more than 1,000 times. He also consistently serves as Area Chair, Session Chair, and reviewer in those venues. As a musician, producer, and content creator with over 50,000 subscribers, he is also passionate about research on AI music and creativity.