Abstract: Large Language Models (LLMs) have revolutionized how people interact with knowledge, offering unprecedented opportunities to accelerate the pace of scientific discovery. In this talk, I will discuss my research on the synergy between LLMs and scientific knowledge--specifically how these models extract, induce, and verify knowledge to automate the research lifecycle. First, I will cover our work on improving knowledge extraction from vast scientific literature, focusing on enabling models to comprehend long documents in a cost-efficient and comprehensive manner. I will describe a novel paradigm for representing document-level structured information as question-answer pairs and how we address the challenges of long-context understanding by leveraging global context through retrieval-augmented modeling. Next, I will present our pioneering work on using LLMs for new scientific hypothesis generation. We introduce a framework employing reinforcement learning with fine-grained reward modeling and adaptive controllers. This approach balances novelty, feasibility, and effectiveness to generate inspiring and actionable research hypotheses. Finally, I will discuss work on the first LLM Scientist for machine learning research. I will demonstrate how LLMs can move beyond hypothesis generation to participate in the execution and validation of scientific hypotheses, ensuring that the discovered knowledge is not only innovative but also grounded and verified.

Bio: Xinya Du is a tenure-track assistant professor in the Computer Science Department at UT Dallas. He earned a Ph.D. degree from Cornell University and was a Postdoctoral Research Associate at the University of Illinois (UIUC). He has also worked at Microsoft Research, Google Research, and the Allen Institute for AI. His research is on large language models, deep learning, and their applications in science. His work has been published in leading NLP and ML conferences (ACL, ICLR, NeurIPS). His research has received multiple recognitions, including a Best Paper Award at AAAI AI for Research and a Best Poster Award at the ICML AI for Science workshop. His work was included in the list of Most Influential ACL Papers and has been covered by major media like New Scientist. He was named a Spotlight Rising Star in Data Science by the University of Chicago and is the recipient of several prestigious awards, including the Amazon Research Award, Cisco Research Award, Open Philanthropy Award, and the NSF CAREER Award.

Location: NCS 120

Imagine machines that can see beyond human limitations--drones locating hidden survivors, cameras predicting structural failures, or medical devices detecting tumors beneath the skin. Traditional vision systems are constrained by the boundaries of human perception, missing vast information present in light interactions. This talk explores the development of advanced vision systems that capture underutilized dimensions of light, model intricate light-scene interactions, and extract hidden 3D information--around corners, beneath surfaces, and at high speeds. By jointly developing novel imaging hardware, efficient rendering models, and physics-based learning algorithms, we aim to transcend conventional vision capabilities--unlocking critical applications in autonomous navigation, structural monitoring, and non-invasive medical imaging.

Speaker Bio:


Akshat Dave is a Postdoctoral Associate at the MIT Media Lab in the Camera Culture group, working with Prof. Ramesh Raskar. He received his Ph.D. from the ECE Department at Rice University in 2023, where he was advised by Prof. Ashok Veeraraghavan. His research lies at the intersection of applied optics, computer graphics, and computer vision, and focuses on developing vision systems that go beyond human perception. His work has been recognized by Rice University's Best Thesis Award, the OSA Best Paper Prize, and fellowships from Texas Instruments and Qualcomm.
TITLE: Sampling Using Langevin Diffusions Beyond the Worst-Case by Andrej Risteski of CMU


ABSTRACT: Many tasks involving generative models require sampling from distributions parametrized as p(x) = e^{-f(x)}/Z, where Z is the normalizing constant, for some function f whose values and gradients we can query. This mode of access to f arises naturally, for instance when sampling from posteriors in latent-variable models. Classical results show that a natural random walk, Langevin diffusion, mixes rapidly when f is convex. Unfortunately, even in simple examples, such applications entail working with functions f that are nonconvex.

We exhibit instances of practical relevance in which Langevin diffusion (combined with other tools) provably mixes rapidly: distributions p that are multimodal, as well as distributions p that have a natural manifold structure on their level sets.
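
For readers unfamiliar with the setup, Langevin diffusion follows the stochastic differential equation dX_t = -grad f(X_t) dt + sqrt(2) dB_t, whose stationary distribution is p(x) = e^{-f(x)}/Z. The sketch below is a minimal illustration, not the speaker's method: it uses the simplest time discretization (the unadjusted Langevin algorithm) on the convex potential f(x) = x^2/2, so the target is a standard Gaussian. The function names, step size, and choice of f are all assumptions made for this example.

```python
import math
import random


def grad_f(x):
    """Gradient of the illustrative potential f(x) = x**2 / 2 (assumed for this example)."""
    return x


def langevin_samples(grad, x0, step, n_steps, seed=0):
    """Unadjusted Langevin algorithm: x <- x - step * grad(x) + sqrt(2 * step) * noise."""
    rng = random.Random(seed)
    x = x0
    samples = []
    for _ in range(n_steps):
        noise = rng.gauss(0.0, 1.0)  # standard normal increment
        x = x - step * grad(x) + math.sqrt(2.0 * step) * noise
        samples.append(x)
    return samples


if __name__ == "__main__":
    draws = langevin_samples(grad_f, x0=0.0, step=0.1, n_steps=20000)
    mean = sum(draws) / len(draws)
    var = sum((d - mean) ** 2 for d in draws) / len(draws)
    print(f"empirical mean = {mean:.3f}, empirical variance = {var:.3f}")
```

Only the gradient of f is queried, which matches the access model described in the abstract. The discretization adds a small bias that shrinks with the step size, and for a multimodal f the same loop can stay in one mode for a very long time, which is the difficulty the talk addresses.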


Abstract:
Large language models (LLMs) have transformed the way humans write code, bringing unprecedented automation to software development. In this talk, I will first provide an overview of my research on enhancing LLMs' code intelligence, optimizing each step of the development pipeline towards more complex software engineering tasks. I will then delve into my key contributions, focusing on how to equip LLMs with a deeper, more comprehensive understanding of software programs. Finally, I will discuss the future of AI-driven software engineering, envisioning a new era of automation that is more reliable, intelligent, and cost-efficient.

Bio:
Yangruibo (Robin) Ding is a Ph.D. candidate in the Department of Computer Science at Columbia University. His research is at the intersection of Software Engineering and Machine Learning, focusing on developing large language models (LLMs) for code. He trains LLMs to generate, analyze, and refine software programs and constructs benchmarks to systematically evaluate LLMs in solving software engineering tasks. He also studies how to improve LLMs' reasoning capability to tackle complex programming tasks, such as debugging and patching. His interdisciplinary research has been published in top-tier conferences of software engineering, programming languages, natural language processing, and machine learning. He won an ACM SIGSOFT Distinguished Paper Award and an IEEE TSE Best Paper Runner-up, and received an IBM Ph.D. Fellowship.
Location:
NCS 120
Making sense of Twitter @ Bloomberg presented by Daniel Preotiuc-Pietro

ABSTRACT: The Bloomberg Terminal has provided ways for investors and journalists to sift through and understand the immense volume of tweets and discover financially-relevant content ever since the SEC approved the use of Twitter for company disclosures back in 2013.

In the first part of the talk, I will showcase how tweets impact financial markets and how Bloomberg is using Natural Language Processing methods to identify financially relevant tweets that move the markets. Our processing pipeline feeds directly to clients and journalists in the newsroom, and powers several news analytics products offered by the company, including trending companies and consumer sentiment for publicly traded equities.

However, understanding user pragmatic intent in individual tweets would allow us to gain deeper insights and enable new applications. I will present several recent research studies focused on understanding intent, including identifying complaints and the roles that vulgarity plays in social media, and show how these can help improve applications such as sentiment analysis and hate speech detection.

BIO: Daniel Preotiuc-Pietro is a Senior Research Engineer and Team Lead at Bloomberg LP, where he works on analyzing and building models for real-world, large-scale social media and news mining and information extraction. His research interests are focused on understanding the social and temporal aspects of text, especially from social media, with applications in domains such as Social Psychology, Law, Political Science and Journalism. Several of his research studies were featured in the popular press, including the Washington Post, BBC, New Scientist, Scientific American and FiveThirtyEight. He is a co-organizer of the Natural Legal Language Processing workshop series. Prior to joining Bloomberg LP, Daniel was a postdoctoral researcher at the University of Pennsylvania with the interdisciplinary World Well Being Project and obtained his PhD in Natural Language Processing and Machine Learning at the University of Sheffield, UK.
What AI tools are available to help with the scholarly research process? Are they helpful? What do they do, and is it worth the time and energy to try them out? Join librarian Christine Fena to explore and compare established and emerging AI research tools such as Elicit, Scite, Consensus, and Undermind. The online workshop will provide a starting point for understanding what these tools are, the basics of how they work, and how AI research assistants might bring changes to your search process in the future. All are welcome!



Register here via Zoom.
Abstract: Language offers a uniquely powerful lens for understanding the mind: one that can access latent psychological realities often missed by traditional measurement tools. However, as language models expand their ability to capture semantics through context length, expansion into deeper levels of semantics is less explored, especially with respect to understanding cognitive patterns of authors. This dissertation proposes that we can uncover deeper cognitive and affective patterns that reflect more accurate underlying mental states by analyzing language at higher levels of discourse semantics and by modeling latent states.


First, the dissertation focuses on uncovering cognitive styles or thinking patterns manifesting in language. We demonstrate that modeling language at deeper semantic levels, such as discourse relations, can unveil latent psychological states and traits, including cognitive styles that influence both mental health and behavior. Introducing a novel blend of transfer and active learning, we efficiently curated a new set of linguistic data on cognitive styles like dissonance. This approach allows for more precise measurement when dealing with rare classes and low-resource tasks. As a second contribution, effective validation methods are introduced for language-based assessments of the underlying cognitive styles. Controlled behavioral experiments and online studies show that cognitive styles detected through linguistic signals reliably predict real-world behaviors such as decision-making and engagement with extremist communities, both at the individual and community levels, sometimes months in advance.

The research further moves beyond traditional measurement tools like questionnaires and expert judgments, which rely on Classical Test Theory, by establishing that language-based assessments more closely approximate true psychological states. The mechanisms by which these assessments outperform standard tools are explained, highlighting their predictive power for behaviors linked to underlying traits. Finally, a more sophisticated approach is explored by modeling psychological outcomes with Item Response Theory (IRT), an improvement over Classical Test Theory. Adaptive language-based assessments are introduced, showing that targeted, adaptive testing based on latent IRT scores can efficiently and accurately capture multiple psychological dimensions.

Taken together, these contributions argue for a shift towards language-based psychological assessments. By integrating deeper discourse-level semantics with measurement theory, this dissertation charts a path towards truer scores of mental states: ones that are more precise, and reflective of the complexity of human cognition and emotions.
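
As background on the Item Response Theory and adaptive testing mentioned in the abstract, the sketch below shows the textbook two-parameter logistic (2PL) model together with maximum-information item selection. It is an illustration under stated assumptions, not the dissertation's implementation: the abstract does not say which IRT model or selection rule is used, and the item bank, parameter values, and function names here are hypothetical.

```python
import math


def p_endorse(theta, a, b):
    """2PL item response function: probability of endorsing an item.

    theta is the latent trait, a the item discrimination, b the item difficulty.
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta, a, b):
    """Fisher information of a 2PL item at trait level theta."""
    p = p_endorse(theta, a, b)
    return a * a * p * (1.0 - p)


def pick_next_item(theta, items, asked):
    """Adaptive step: choose the unasked item that is most informative at theta."""
    candidates = [i for i in range(len(items)) if i not in asked]
    return max(candidates, key=lambda i: item_information(theta, *items[i]))


# Hypothetical item bank of (discrimination a, difficulty b) pairs.
ITEMS = [(1.0, -2.0), (1.5, 0.0), (1.0, 2.0)]

if __name__ == "__main__":
    first = pick_next_item(0.0, ITEMS, asked=set())
    print("first item to administer at theta = 0:", first)
```

An adaptive assessment repeats this step, re-estimating theta after each response, so respondents see only the items that are informative near their current estimated trait level.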

Speaker: Vasudha Varadarajan

https://stonybrook.zoom.us/j/99180374682?pwd=w2zZTkQsfunrBZhHgEweR54NjKabZ2.1&jst=2

You are cordially invited to attend the biweekly Brookhaven AI Mixer (BAM). BAM includes one short talk on AI research happening at BNL, followed by an open mixer over coffee and snacks for everyone to network and discuss all things AI. The first half hour will consist of presentations that will be available via Zoom, and the second half hour will be for in-person networking only.

Join us every other Tuesday at noon in CDSD's Training Room (building 725, 2nd floor) to learn about interesting AI methods and applications, engage with potential collaborators, prepare for pending FASST funding calls, and build a community of AI for Science at BNL.

Learning Generalizable Program and Architecture Representations for Performance Modeling

Abstract: Performance modeling is an essential tool in many areas of computer science and engineering. However, existing performance modeling approaches have limitations, such as high computational cost, narrow flexibility, or restricted accuracy/generality. To address these limitations, this talk introduces PerfVec, a novel deep learning-based performance modeling framework that learns high-dimensional and independent/orthogonal program and microarchitecture representations. Once learned, a program representation can be used to predict its performance on any microarchitecture, and likewise, a microarchitecture representation can be applied in the performance prediction of any program. Additionally, PerfVec yields a foundation model that captures the performance essence of instructions, which can be directly used by developers in numerous performance modeling-related tasks without incurring its training cost. The evaluation demonstrates that PerfVec is more general and efficient than previous approaches. This talk will also introduce how PerfVec's design principles can benefit broader research areas.
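
To make the idea of independent program and microarchitecture representations concrete, the sketch below combines one vector per program with one vector per microarchitecture to score every pairing. The abstract does not say how PerfVec combines the two representations; the inner product used here, the two-dimensional vectors, and all names are assumptions chosen purely for illustration, not PerfVec's actual model or API.

```python
def predict_performance(program_vec, uarch_vec):
    """Combine two independently learned vectors into one predicted score.

    The inner product is an assumed combination rule for this sketch.
    """
    if len(program_vec) != len(uarch_vec):
        raise ValueError("representations must have the same dimension")
    return sum(p * u for p, u in zip(program_vec, uarch_vec))


# Hypothetical learned representations (made-up numbers).
PROGRAMS = {"prog_a": [1.0, 2.0], "prog_b": [0.5, 0.5]}
UARCHS = {"core_x": [2.0, 1.0], "core_y": [1.0, 3.0]}


def predict_all(programs, uarchs):
    """Score every (program, microarchitecture) pair without retraining anything."""
    return {
        (p_name, u_name): predict_performance(p_vec, u_vec)
        for p_name, p_vec in programs.items()
        for u_name, u_vec in uarchs.items()
    }


if __name__ == "__main__":
    for pair, score in predict_all(PROGRAMS, UARCHS).items():
        print(pair, score)
```

The point of the separation is reuse: with N programs and M microarchitectures, N + M representations cover all N x M pairings, and a new microarchitecture needs only one new vector to be scored against every existing program.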

Biography: Lingda Li is a computer scientist at Brookhaven National Laboratory. He is generally interested in computer architecture and programming model research, with a focus on simulation/modeling, memory systems, and machine learning. Before joining BNL, he worked as a postdoc in the Department of Computer Science at Rutgers University, carrying out GPGPU research. He obtained a PhD in computer architecture from the Microprocessor Research and Development Center at Peking University.

Location: CDS, Bldg. 725, Training Room

Join ZoomGov Meeting: https://bnl.zoomgov.com/j/1605837856?pwd=kYqJs4bVBt4E0cMCWR6GXH3wxzOoiw.1

Meeting ID: 160 583 7856
Passcode: 161580

Research challenges in using computer vision in robotics systems

Abstract: The past decade has seen a remarkable increase in the level of performance of computer vision techniques, including through the introduction of effective deep learning techniques. Much of this progress is in the form of rapidly increasing performance on standard, curated datasets. However, translating these results into operational vision systems for robotics applications remains a formidable challenge. This talk will explore some of the fundamental questions at the boundary between computer vision and robotics that need to be addressed. These include introspection/self-awareness of performance, anytime algorithms for computer vision, multi-hypothesis generation, and rapid learning and adaptation. The discussion will be illustrated by examples from autonomous air and ground robots.