Abstract: Large Language Models (LLMs) have revolutionized how people interact with knowledge, offering unprecedented opportunities to accelerate the pace of scientific discovery. In this talk, I will discuss my research on the synergy between LLMs and scientific knowledge--specifically how these models extract, induce, and verify knowledge to automate the research lifecycle. First, I will cover our work on improving knowledge extraction from vast scientific literature, focusing on enabling models to comprehend long documents in a cost-efficient and comprehensive manner. I will describe a novel paradigm for representing document-level structured information as question-answer pairs and how we address the challenges of long-context understanding by leveraging global context through retrieval-augmented modeling. Next, I will present our pioneering work on using LLMs for new scientific hypothesis generation. We introduce a framework employing reinforcement learning with fine-grained reward modeling and adaptive controllers. This approach balances novelty, feasibility, and effectiveness to generate inspiring and actionable research hypotheses. Finally, I will discuss work on the first LLM Scientist for machine learning research. I will demonstrate how LLMs can move beyond hypothesis generation to participate in the execution and validation of scientific hypotheses, ensuring that the discovered knowledge is not only innovative but also grounded and verified.

Bio: Xinya Du is a tenure-track assistant professor in the Computer Science Department at UT Dallas. He earned a Ph.D. degree from Cornell University and was a Postdoctoral Research Associate at the University of Illinois (UIUC). He has also worked at Microsoft Research, Google Research, and the Allen Institute for AI. His research is on large language models, deep learning, and their applications in science. His work has been published in leading NLP and ML conferences (ACL, ICLR, NeurIPS). His research has received multiple recognitions, including a Best Paper Award at AAAI AI for Research and a Best Poster Award at the ICML AI for Science workshop. His work was included in the list of Most Influential ACL Papers and has been covered by major media like New Scientist. He was named a Spotlight Rising Star in Data Science by the University of Chicago and is the recipient of several prestigious awards, including the Amazon Research Award, Cisco Research Award, Open Philanthropy Award, and the NSF CAREER Award.

Location: NCS 120

As generative AI (GenAI) continues to reshape the educational landscape, educators must critically examine its implications for course design. How can we adapt our courses to ensure meaningful learning in a post-GenAI world? How can we harness its potential while mitigating risks to student learning? This seminar explores the evolving role of GenAI in higher education, emphasizing learner-centered teaching practices--such as backward design, transparency, and active learning--as essential strategies for navigating both the opportunities and challenges posed by GenAI. We will examine how GenAI disrupts traditional models of teaching and assessment, highlighting course design choices that intentionally promote deep learning and critical thinking in this new era.

Speaker Bio: Dr. Lourdes Alemán is an Associate Director at MIT's Teaching and Learning Lab (TLL). She earned her Ph.D. in Biology from MIT, studying RNA interference (RNAi) with Professor Phil Sharp. She later completed a postdoc in curriculum innovation with Professor Graham Walker's HHMI MIT Education Group. As a postdoc and research scientist, she helped develop software tools for teaching experimental design and data analysis, including collaborations with the MIT-Haiti Initiative. Before joining TLL, she worked at MIT's Open Learning, supporting MIT faculty in blended and online education. At TLL, Lourdes trains graduate students and postdocs in college-level teaching, advises faculty on classroom innovation, and previously designed and taught a hands-on biology module on novel antibiotic discovery for first-year students. She has served on university committees focused on mentoring and advising. Drawing from her experiences as a Cuban immigrant student, she developed MIT's first curriculum on growth mindset and co-founded Flipping Failure, a campus-wide initiative for students to share their stories of academic challenges and the strategies they have used to overcome them.

Abstract: Capturing the spatio-temporal (4D) dynamics of humans has been a long-standing research problem in computer vision and graphics. Synthesizing photorealistic human avatars has broad applications, ranging from immersive telepresence in AR/VR and the movie industry to enriching the education and healthcare systems. Earlier approaches relied on hand-engineered models that use a small amount of data from one or more subjects. With the advent of neural networks, training on large datasets enhanced the output visual quality. Currently, the combination of neural networks with graphics techniques has achieved natural-looking human animation. However, most approaches are identity-specific, trained only on a single identity, and use only one modality.

In this dissertation, we address the problem of learning neural representations of humans in a holistic way. Given that the video data in the real world include multiple modalities (e.g., audio and video) and multiple identities, we develop multi-modal and multi-identity representations. First, we propose to reconstruct the 4D face geometry of humans by leveraging both audio and video information. In this way, the network produces accurate lip shapes and is robust to cases when either modality is insufficient. Next, we introduce a NeRF-based representation for audio-driven human face animation that achieves high-quality lip synchronization for cinematic content. Since humans communicate with their full body, combining body pose, hand gestures, and facial expressions, we extend the network to capture full-body human motion for multiple identities simultaneously. In order to better disentangle identity and non-identity specific information, we subsequently study non-linear interactions between latent factors of variation, and propose a specific multiplicative module. In this way, we learn a multi-identity NeRF that robustly animates human faces under novel expressions and achieves a significant decrease in the total training time. Similarly, we propose a multi-identity Gaussian splatting representation for human bodies, by constructing a high-order tensor. Assuming a low-rank structure, we learn a tensor decomposition that leads to a significant decrease in the total number of learnable parameters, as well as to a robust animation under novel poses. Last but not least, we propose to jointly synthesize audio and visual outputs from just text input. Given the recent rise of large language models, coupling text with natural-looking avatars can enhance the overall interaction between a human and an AI system.

Location: NCS 220 or Zoom

The Art Department is hosting a guest artist exhibition featuring the work of Young Maeng. The opening reception will be held on October 10th at 5 PM. Additionally, Young Maeng will give a talk on 'AI and Painting' on October 9th at 4:30 PM at the Future Histories Studio. Exhibition Location: Gallery Unbound, 3rd Floor, Staller Center, Stony Brook University
AI/ML Working Group Seminar

Time/Date: 12:00 PM ET, Tuesday, March 1st, 2022

Seminar Speaker: Yen-Chi (Sam) Chen, CSI, Brookhaven National Laboratory

Title: When reinforcement learning meets quantum computing

Abstract: Recently, reinforcement learning (RL) has demonstrated
superhuman performance in various applications, such as mastering the
game of Go. Meanwhile, the development of quantum computing hardware
has shed light on building practical quantum applications to tackle
previously unsolved problems. What will happen if we combine these two
fascinating techniques? In this talk, I will present the recent
progress in quantum RL as well as the use of classical RL to help with
certain tasks in quantum computing.



Host: Meifeng Lin, Computational Science Initiative

_______________________________________________

Nicole Medaglia is inviting you to a scheduled ZoomGov meeting.

Join ZoomGov Meeting
https://bnl.zoomgov.com/j/1619877909?pwd=T041dGl4SURUK0Mwbmp0b1QvVjVtZz09

Meeting ID: 161 987 7909
Passcode: 338057
One tap mobile
+16692545252,,1619877909#,,,,*338057# US (San Jose)
+16468287666,,1619877909#,,,,*338057# US (New York)

Dial by your location
        +1 669 254 5252 US (San Jose)
        +1 646 828 7666 US (New York)
        +1 669 216 1590 US (San Jose)
        +1 551 285 1373 US
Meeting ID: 161 987 7909
Passcode: 338057
Find your local number: https://bnl.zoomgov.com/u/abMDS0zjuq

Join by SIP
1619877909@sip.zoomgov.com

Join by H.323
161.199.138.10 (US West)
161.199.136.10 (US East)
Meeting ID: 161 987 7909
Passcode: 338057
Over the past decade, Artificial Intelligence (AI) has made stunning advances, from mastering language to solving the structure of proteins. These breakthroughs arise from more than forty years of work in neural networks, where ideas from neuroscience have inspired solutions in AI. In this lecture, Anthony Zador, MD, PhD, will explore how reverse engineering the brain's computations has driven progress in both fields, and how this back-and-forth between neuroscience and AI is set to grow even stronger -- with brain-inspired designs driving new AI advances while AI tools transform our understanding of how the brain works.

Speaker:
Dr. Zador works at the intersection of neuroscience and artificial intelligence. He is the Alle Davis Harris Professor of Biology at Cold Spring Harbor Laboratory, where he served as Chair of Neuroscience. He was named one of Foreign Policy's 100 Leading Global Thinkers and is a recipient of the Brain Research Foundation Fellowship, the Gill Symposium Transformative Investigator Award, and the Allen Distinguished Investigator Award.

Watch online at stonybrook.edu/live
TITLE: Towards a Theory of Encoder/Decoder Architectures by Andrej Risteski of CMU

ABSTRACT: A common choice of architecture in representation learning (i.e., learning a good embedding of the data) is an encoder/decoder architecture, which tries to map a part of the input into a good latent representation (via an encoder), and predict the remaining part of the input (via a decoder). Two common examples are universal machine translation, where one tries to learn to translate between any pair of languages in a set via a common latent language, given paired up corpora for only some of the pairs; and contextual encoders, where one tries to predict a part of an image given the rest of the image.
 
We will give a framework for analyzing the sample complexity of such architectures -- i.e., how many pairs of languages do we need to have paired up corpora for? How many image prediction tasks do we have to solve to get a good representation?