Face Editing with Machine Learning presented by Zhixin Shu

ABSTRACT: The face is the most informative feature of humans and has been a long-standing research topic in Computer Vision and Graphics. Images of faces are also ubiquitous in photography and social media, and people have devoted significant resources to capturing and editing face images. Face editing can be broadly viewed as the encoding, manipulation, and decoding of some representation of face images. The challenge is to manipulate an image in a controllable way and to generate results that are both desirable and as realistic as possible. This thesis explores different Machine Learning-based face-editing approaches. I discuss the role of machine learning in achieving desirable edits by learning both the physical aspects and the statistical manifold of human faces. In my work on eye editing, I discuss the importance of understanding multiple physical elements of a face image, such as shape, illumination, pose, etc. In a deep-learning-based approach, I introduce image formation domain knowledge to the construction and training of a neural network. This network provides transparent access to the disentangled representations of the aforementioned physical properties. With this network, we can achieve various face editing tasks in the form of representation manipulation. After that, I introduce Deforming Autoencoders, a network that learns to disentangle shape and appearance in an unsupervised manner. This disentanglement is beneficial for the learning of some other factors of variation, such as illumination and facial expression. In an extension of Deforming Autoencoders, we incorporate non-rigid structure-from-motion to learn a 3D morphable model for faces that only requires an image set for training. Finally, I describe an image-to-image network for 3D face reconstruction, which also utilizes structure-from-motion in deep learning. With real face images in training, this network not only reconstructs 3D faces more accurately than prior art but also has better generalization ability in real-life testing cases.

You are cordially invited to attend the biweekly Brookhaven AI Mixer (BAM). BAM includes three short talks on AI research happening at BNL, followed by an open mixer over coffee and snacks for everyone to network and discuss all things AI. The first half hour will consist of presentations that will be available via Zoom, and the second half hour will be for in-person networking only.

Join us every other Tuesday at noon in CDSD's Training Room (building 725, 2nd floor) to learn about interesting AI methods and applications, engage with potential collaborators, prepare for pending FASST funding calls, and build a community of AI for Science at BNL.

Tuesday, January 7, 2025, 12:00 pm -- CDS, Bldg. 725, Training Room

Speakers

Maria Zawadowicz, EBNN--ML for Atmospheric Aerosol Research

Mohammad Atif, CDS--An Extensible Digital Twin Framework

Guang Zhao, CDS--Pareto Prompt Optimization

Join ZoomGov Meeting: https://bnl.zoomgov.com/j/1615289117?pwd=Hqkbj9itxWrFnkhZ8rQXHPInO2gxdF.1

Meeting ID: 161 528 9117
Passcode: 991382

The New York Academy of Sciences Presents AI for Materials: From Discovery to Production - A Virtual Symposium

Event Description: This interdisciplinary symposium covers the application of artificial intelligence (AI) throughout the entire life cycle of new materials -- from materials simulations and synthesis to translating research into high-volume industrial production.

Event Link & Registration: nyas.org/AI4Materials2020




Time: 04/28 Wed 3pm-4pm

Remote Access
Join Zoom Meeting https://stonybrook.zoom.us/j/95617197636?pwd=KytzZ2pVRG9SZGpKZUtpNXJISjNjZz09 
Meeting ID: 956 1719 7636 Passcode: 924293

Title: Brain imaging genetics for Alzheimer's disease: integrated analysis and machine learning

Li Shen, Ph.D.
Professor of Informatics
Department of Biostatistics, Epidemiology and Informatics 
Perelman School of Medicine
University of Pennsylvania

Bio: Li Shen, Ph.D., is a Professor of Informatics in the Department of Biostatistics, Epidemiology and Informatics at the Perelman School of Medicine at the University of Pennsylvania. He is an elected fellow of the American Institute for Medical and Biological Engineering (AIMBE). He obtained his Ph.D. degree in Computer Science from Dartmouth College. His lab focuses on developing computational and informatics methods for integrative analysis of multimodal imaging data, high throughput omics data, cognitive and other biomarker data, electronic health record (EHR) data, and rich biological knowledge such as pathways and networks, with applications to complex disorders. His research interests include medical image computing, biomedical informatics, machine learning, network science, imaging genomics, Alzheimer's disease, and big data science in biomedicine. He has authored over 280 peer-reviewed articles (h-index 57) in these fields. Dr. Shen's work has been continuously supported by the NIH and NSF, and he is presently the PI of multiple NIH and NSF grants on developing computational methods for various biomedical applications including brain imaging genomics, genetics of Alzheimer's disease, genetics of the human connectome, mining drug effects from EHR data, and big data mining in brain science. He is co-leading the NIA Alzheimer's Disease Sequencing Project AI4AD Consortium and oversees the imaging genomics aspect of this landmark project. Dr. Shen served as the Executive Director of the Medical Image Computing and Computer Assisted Intervention (MICCAI) Society Board of Directors during 2016-2019. He has chaired and co-chaired various professional meetings in medical image computing and bioinformatics. He is an Associate Editor of BioData Mining and Frontiers in Radiology (Section of AI in Radiology), and serves on the Editorial Board of Medical Image Analysis and Brain Imaging and Behavior.

Abstract: Brain imaging genetics is an emerging data science field, where integrated analysis of brain imaging and genetics data, often combined with other biomarker, clinical and environmental data, is performed to gain new insights into the genetic, molecular and phenotypic characteristics of the brain as well as their impact on normal and disordered brain function and behavior. Many methodological advances in brain imaging genetics are attributed to large-scale landmark biobank projects such as the Alzheimer's Disease Sequencing Project, the Alzheimer's Disease Neuroimaging Initiative, and the UK Biobank. Using the study of Alzheimer's disease as an example, we will discuss fundamental concepts, state-of-the-art statistical and machine learning methods, and innovative applications in this rapidly evolving field. We show that the wide availability of brain imaging genetics data from various large-scale biobanks, coupled with advances in biomedical statistics, informatics and computing, provides enormous opportunities to contribute significantly to biomedical discoveries in brain science and to impact the development of new diagnostic, therapeutic and preventative approaches for complex brain disorders such as Alzheimer's disease.

More details:
https://bmi.stonybrookmedicine.edu/sites/default/files/shen_li_04_28_flyer.pdf

Abstract: Pre-trained diffusion and flow matching models have made visual generation remarkably powerful, enabling high-fidelity synthesis of images and videos from natural language prompts. However, their behavior is still largely dictated by the pre-training data distribution and likelihood objective, which do not directly encode downstream desiderata such as fine-grained semantic alignment, controllability, or realism. This gap motivates post-training: starting from a base generator and further optimizing it with additional supervision signals derived from human or reward model preferences. This work presents post-training for visual generative models through two complementary case studies. First, Hummingbird addresses the problem of fine-grained contextual alignment in image-text-to-image generation. We introduce a multimodal context evaluator that scores the consistency between rich contextual descriptions and generated images, capturing fine-grained alignment beyond global CLIP similarity. By directly backpropagating these differentiable rewards through the diffusion sampler, Hummingbird substantially improves semantic faithfulness while preserving high visual quality.
Second, PISCES tackles post-training for text-to-video generation, where alignment is inherently semantic-spatio-temporal. We show that naive VLM-based rewards suffer from distributional mismatch and token-level misalignment, leading to reward hacking and suboptimal optimization. PISCES introduces a bi-objective, Optimal Transport (OT)-aligned reward module: distributional OT using Neural Optimal Transport to align text and video embedding distributions, and discrete, partial OT over a spatio-temporal cost matrix to capture semantic alignment at the token level. These rewards are integrated into both direct backpropagation and GRPO-style optimization to post-train state-of-the-art text-to-video generators. Together, Hummingbird and PISCES provide a unified view of how carefully designed visual reward models, coupled with OT-based representation alignment, can reliably improve the downstream behavior of pre-trained image and video generators.

Speaker: Minh Quan Le

Location: NCS 220

Zoom: https://stonybrook.zoom.us/j/94798224254?pwd=CFraer25qnpORbJ14aAVHRwaSJOjJM.1
Abstract: Large language models are prone to memorizing some of their training data. Memorized (and possibly sensitive) samples can then be extracted at generation time by adversarial or benign users. There is hope that model alignment---a standard training process that tunes a model to harmlessly follow user instructions---would mitigate the risk of extraction. However, we develop two novel attacks that undo a language model's alignment and recover thousands of training examples from popular proprietary aligned models such as OpenAI's ChatGPT. Our work highlights the limitations of existing safeguards to prevent training data leakage in production language models.

Speaker: Pegah Alipoormolabashi

Location: CS2311
Abstract: Large Language Models (LLMs) have transitioned from standalone prediction interfaces into integrated systems that incorporate content protection, external knowledge retrieval, and multi-step reasoning. While these functional layers expand model capabilities, they also introduce complex, inter-component dependencies that create novel and systemic security risks. This research provides a systematic deconstruction of the structural vulnerabilities emerging across these functional layers.

In this proposal, we evaluate the security boundaries of LLM systems through three pivotal dimensions:
The Content Layer: We present Watermark under Fire, revealing the inherent fragility of content-based tracing mechanisms under adaptive perturbations and highlighting the limitations of surface-level safety measures.
The Retrieval Layer: We introduce GraphRAG under Fire to examine the security of topology-aware knowledge integration. We reveal how graph-based indexing can be exploited as a structural lever for high-success poisoning attacks.
The Reasoning Layer: We detail AutoRAN, the first framework demonstrating the hijacking of internal safety reasoning in Large Reasoning Models (LRMs). This work proves that the transparency of the reasoning process itself creates a critical and exploitable attack surface.

Collectively, these studies demonstrate a systemic failure of add-on safety mechanisms in securing the broader LLM ecosystem. By identifying recurring patterns of exploitation across different system layers, this research provides the necessary foundation for transitioning from reactive patching to a more unified and architecturally grounded approach to AI trustworthiness.

Speaker: Jiacheng Liang

Zoom: https://stonybrook.zoom.us/j/6669990420?pwd=dkY0eEw5YXpPSWo3RUE4OE1oVW90UT09&omn=97367037382
Meeting ID: 666 999 0420
Passcode: 075299
CSE 600 Seminar Series | Fall 2025


Abstract: Large reasoning models have demonstrated capabilities to solve competition-level math problems, answer deep research questions, and address complex coding needs. Much of this progress has been enabled by scaling of data: pre-training data to learn vast knowledge, fine-tuning data to learn natural language reasoning, and RL environments to refine that reasoning. In this talk, I will describe the current LLM reasoning paradigm, its boundaries, and the future of LLM reasoning beyond scaling. First, I will describe the state of reasoning models and where I think scaling can lead to some additional (though perhaps limited) successes. I will then shift to discussing more fundamental issues with models that scale will not resolve in the next few years. I will touch on four current limitations: outdated knowledge, generator-validator gaps, limited creativity, and poor compositional generalization. In all cases, fundamental limitations of LLMs or of supervised learning in general make these problems challenging, inviting future study and novel solutions beyond scaling.

Bio: Greg Durrett is an associate professor in the Department of Computer Science and the Center for Data Science at New York University. His research is broadly in the areas of natural language processing and machine learning. Currently, his group's focus is on reasoning about knowledge in text, verifying correctness of generation methods, and studying how to make progress on problems that defy LLM scaling. He is a 2023 Sloan Research Fellow and a recipient of a 2022 NSF CAREER award. He has served in numerous roles for ACL conferences, recently as a member of the NAACL Board since 2024 and as Senior Area Chair for ACL 2025 and EMNLP 2025. He received his BS in Computer Science and Mathematics from MIT and his PhD in Computer Science from UC Berkeley, where he was advised by Dan Klein.
AI Seminar: Computational Pathology: Deep Learning, Classification and Predicting the Future - Joel Saltz

Abstract: Pathologists have been looking at tissue through microscopes since the 1800s. During each pathologist's career, he or she views slides containing roughly 1,000,000,000,000 cells. Deep learning methods are rapidly being developed to assimilate the huge amount of information locked inside tissue images and to use this information to predict outcomes and responses to treatments.

Stony Brook is a leader in this type of multi-disciplinary work. I will provide an overview of Stony Brook's computational pathology efforts and articulate how these have the potential to create biomedical advances as well as to drive the development of new computer science.


Bio: Dr. Joel Saltz is a leader in research on advanced information technologies for large scale data science and biomedical/scientific research. He has developed innovative pathology informatics methods, including: the first published whole slide virtual microscope system; pioneering pathology computer-aided diagnosis techniques; and methods for decomposing pathology images into features and linking those features to cancer omics, response to treatment and outcome. He has broken new ground in big data through development of the filter-stream based DataCutter system, the map-reduce style Active Data Repository and the inspector-executor runtime compiler framework. He has also been an active contributor in clinical informatics, having developed predictive models for hospital readmissions, point of care laboratory testing quality assurance systems, decision support systems for electrophoresis interpretation and graphical user interfaces to support clinical data warehouse queries. Dr. Saltz has been a pioneer in establishing the field of biomedical informatics; he founded and built two highly successful departments of biomedical informatics, one at Ohio State University and one at Emory University. In 2013, he came to Stony Brook as Vice President for Clinical Informatics and Founding Department Chair of Biomedical Informatics - to create a living laboratory for biomedical informatics and to create a third unique biomedical informatics department dually housed in the School of Medicine and the College of Engineering. Dr. Saltz is trained both as a computer scientist and as a physician through the MSTP program at Duke University. He has deep experience in computer science, having served on the computer science faculties at Yale University and the University of Maryland. He completed his residency in clinical pathology at Johns Hopkins University and he is a practicing, board-certified clinical pathologist.
Hyperscale Verification in Microsoft Azure talk by Nikolaj Bjorner

Abstract: Cloud providers are increasingly embracing network verification for managing complex datacenter network infrastructure. Microsoft's Azure cloud infrastructure integrates the SecGuru tool, which leverages the Z3 Satisfiability Modulo Theories solver, for checking network access control lists. It also integrates a verifier, built on both custom verification algorithms and Z3, that checks the correctness of forwarding tables in Azure datacenters. These tools ensure that the network is configured to preserve the desired intent across hundreds of thousands of network devices. We describe our experiences building and running SecGuru for network verification in Azure.

Finally, we mention recent advances in Z3, including a distributed version of Z3 that scales with Azure's elastic cloud. It integrates recent advances in lookahead and distributed SAT solving for Z3's engines for SMT. Another recent advance is the integration of DNNs to learn variable branching strategies for high-performance SAT solvers, including MiniSAT, Glucose and Z3's SAT solver.

Bio: Nikolaj Bjorner is a Principal Researcher at Microsoft Research, Redmond, working in the area of Automated Theorem Proving and Software Engineering. His current main line of work is around the state-of-the-art theorem prover Z3, which is used as a foundation of several software engineering tools. Z3 received the 2015 ACM SIGPLAN Software System award, the award for most influential tool paper in the first 20 years of TACAS in 2014, and the test of time award at ETAPS 2018. Together with Leonardo de Moura, he received the CADE 2019 Herbrand award for contributions to SMT and applications. Previously, he developed the Distributed File System Replication (DFSR) and Remote Differential Compression (RDC) protocols, part of Windows Server since 2005. Before that, he worked on distributed file sharing systems at a startup and on program synthesis and transformation systems at the Kestrel Institute. He received his Master's and PhD degrees in computer science from Stanford University.