Are you tired of drowning in a sea of resumes and losing top talent in the hiring whirlwind? Transform your hiring process through a different lens and learn about AI in the Workplace and the Applicant Tracking System (ATS). Whether you're a recent graduate seeking your first job or an undergraduate student looking to delve into more career-oriented opportunities, this workshop by the SBU Career Center is designed to equip you with the knowledge and strategies needed to succeed.
Register here: https://stonybrook.joinhandshake.com/stu/events/1568133?
Bio: Wenhan Gao is a fourth-year Ph.D. student in Applied Mathematics under the supervision of Professor Yi Liu. He was also a Staff Research Scientist Intern at VISA Research, where he worked on large language models (LLMs) and multi-agent systems for commerce. Wenhan's research focuses on AI for Science (AI4Sci), with a particular emphasis on generative AI. His work looks deep into the fundamental mechanisms of AI models when applied to scientific tasks, and he strives to incorporate established scientific priors, such as symmetry, into model design. He has published papers as a first or corresponding author in leading AI and computational venues, including ICLR, ICML, NeurIPS, TMLR, ACL, and the Journal of Computational Physics. In addition to his research, Wenhan has served as a reviewer and oral session chair for top AI conferences and as a lecturer for both undergraduate and graduate courses at Stony Brook University.
Location: IACS Seminar Room or Zoom
This seminar will take place in person and online.
Join Zoom Meeting: https://stonybrook.zoom.us/j/
Meeting ID: 916 7009 3552
Passcode: 434045
You are cordially invited to attend the biweekly Brookhaven AI Mixer (BAM). BAM includes three short talks on AI research happening at BNL, followed by an open mixer over coffee and snacks for everyone to network and discuss all things AI. The first half hour will consist of presentations that will be available via Zoom, and the second half hour will be for in-person networking only.
Join us every other Tuesday at noon in CDSD's Training Room (building 725, 2nd floor) to learn about interesting AI methods and applications, engage with potential collaborators, prepare for pending FASST funding calls, and build a community of AI for Science at BNL.
Speakers
Kriti Chopra, Computing & Data Sciences (CDS)
Thomas Flynn, Computing & Data Sciences (CDS)
Wenjie Liao, Chemistry Division
Tuesday, January 7, 2025, 12:00 pm -- CDS, Bldg. 725, Training Room
Join ZoomGov Meeting: https://bnl.zoomgov.com/j/1615289117?pwd=Hqkbj9itxWrFnkhZ8rQXHPInO2gxdF.1
Meeting ID: 161 528 9117
Passcode: 991382
IACS Student Seminar Speaker: Xiangyan Yang, Dept. of Applied Math & Statistics
Location: IACS Seminar Room or Zoom
Join Zoom Meeting: https://stonybrook.zoom.us/j/91650247483?pwd=fvAGEwadplJh7jFC5RWcdvZ5NWPJth.1
Meeting ID: 916 5024 7483
Passcode: 631055
In this dissertation, we address the problem of learning neural representations of humans in a holistic way. Given that the video data in the real world include multiple modalities (e.g., audio and video) and multiple identities, we develop multi-modal and multi-identity representations. First, we propose to reconstruct the 4D face geometry of humans by leveraging both audio and video information. In this way, the network produces accurate lip shapes and is robust to cases when either modality is insufficient. Next, we introduce a NeRF-based representation for audio-driven human face animation that achieves high-quality lip synchronization for cinematic content. Since humans communicate with their full body, combining body pose, hand gestures, and facial expressions, we extend the network to capture full-body human motion for multiple identities simultaneously. In order to better disentangle identity and non-identity specific information, we subsequently study non-linear interactions between latent factors of variation, and propose a specific multiplicative module. In this way, we learn a multi-identity NeRF that robustly animates human faces under novel expressions and achieves a significant decrease in the total training time. Similarly, we propose a multi-identity Gaussian splatting representation for human bodies, by constructing a high-order tensor. Assuming a low-rank structure, we learn a tensor decomposition that leads to a significant decrease in the total number of learnable parameters, as well as to a robust animation under novel poses. Last but not least, we propose to jointly synthesize audio and visual outputs from just text input. Given the recent rise of large language models, coupling text with natural-looking avatars can enhance the overall interaction between a human and an AI system.
Location: NCS 220 or Zoom
Passcode: 045476
Abstract: Implicit functions have long been a fundamental representation for both 2D and 3D objects in computer graphics, playing a significant role in the field's early development. With the rise of 3D deep learning and the rapid advancement of neural rendering techniques, implicit representations of 3D shapes have regained significant attention in recent years. In this talk, I will present several recent research projects focusing on implicit function-based 3D reconstruction and neural rendering. Furthermore, I will discuss potential future developments in this dynamic and rapidly evolving field.
Biography: Ying He is an Associate Professor at the College of Computing and Data Science, Nanyang Technological University, where he also serves as the Director of the Centre for Augmented and Virtual Reality. His research interests lie in geometric computation and analysis, with applications spanning computer graphics, 3D vision, computer-aided design, multimedia, and wireless sensor networks. Dr. He is an active member of the technical program committees for major conferences on geometric modeling and has served on the editorial boards of IEEE Transactions on Visualization and Computer Graphics, Computer Graphics Forum, and Computational Visual Media. He has also taken on key leadership roles as General/Program Co-Chair for several conferences, including Shape Modeling International (SMI) 2022, Solid and Physical Modeling (SPM) 2022 & 2023, Geometric Modeling and Processing (GMP) 2014 & 2021, and Computational Visual Media (CVM) 2020. For more information, please visit https://personal.ntu.
Location: NCS 115
Abstract: Vision-language models that see and describe the world are now part of our daily lives, from internet search and accessibility tools to content generation and automatic moderation. However, as these models grow and become more widely used, their limitations have also become increasingly visible. In particular, it has been shown that these models are unable to reliably perform complex tasks that require abstraction and compositional reasoning. For example, they struggle to decompose an image or text into entities, attributes, and relations, and then reason over new combinations of these elements. As a result, we see generated content full of hallucinations, privacy leaks in images, and different types of biases in the model outputs. In this talk, I will outline a research agenda that aims to build trustworthy vision-language models in the age of generative AI. I will begin with compositional reasoning: how natural language inference can be used to decompose complex instructions and captions into atomic, verifiable statements, improving both evaluation and model behavior on tasks that require multi-step reasoning. I will then discuss how synthetic data and simulated environments can be used to train more reliable models, and how they can also stress-test models beyond standard benchmarks, revealing when models drop attributes, break object relations, or fail under distribution shifts. I will also share recent work on using hallucination correction as a signal to improve video-language alignment, and on privacy-preserving image understanding for blind and low-vision users. I will conclude with possible ways we can systematically probe, debug, and repair these models, turning synthetic perception into something we can trust in real-world deployments.
Speaker: Paola Cascante-Bonilla is a tenure-track Assistant Professor in the Department of Computer Science at Stony Brook University (SUNY). Before that, she was a Postdoctoral Associate at the University of Maryland Institute for Advanced Computer Studies (UMIACS), developing methods and metrics related to trustworthy machine learning. She received her Ph.D. in Computer Science at Rice University in 2024, working on Computer Vision, Natural Language Processing, and Machine Learning. Her research focuses on developing systems that enable compositional reasoning and common-sense inference through vision and language, while tackling issues such as cultural biases, data distribution, explainability, and trustworthy AI. Additionally, Cascante-Bonilla creates simulated environments for embodied agents to learn in a safe, controlled setting, aiming to facilitate effective collaboration and problem-solving for complex tasks by leveraging the implicit knowledge of large-scale pre-trained deep learning models.
Cascante-Bonilla is the recipient of the Ken Kennedy Institute SLB Graduate Fellowship (2022/23). She was also selected as a Future Faculty Fellow by Rice's George R. Brown School of Engineering (2023) and as a Rising Star in EECS (2023).
Location: NCS 120