How to Do Spectral Learning at Scale for Science and Engineering

Abstract: Spectral decompositions such as singular value decompositions (SVDs) and eigenvalue decompositions (EVDs) are central tools across a vast swath of scientific computing and machine learning, with abundant engineering applications. Yet many modern methods for learning such decompositions in high dimensions struggle with instability, bias, and poor scalability, even when approximation power is not the limiting factor. I argue that these difficulties are not intrinsic to spectral problems, but instead arise from a shared reliance on Rayleigh-quotient-based constrained optimization, which forces explicit orthogonality handling through penalties, normalization, or whitening.
To address these challenges, I present a reformulation based on unconstrained variational objectives that implicitly encode spectral structure, eliminating the need for orthogonalization and ad-hoc regularization. This perspective leads to a conceptually simpler and scalable parametric framework for learning ordered spectral representations via nested optimization. The resulting framework is well matched to diverse settings in science and engineering. As examples, I demonstrate its effectiveness on eigenvalue problems for linear PDEs such as the Schrödinger equation, spectral (Koopman) analysis of nonlinear dynamical systems such as molecular dynamics, and structured representation learning with deep neural nets. Collectively, these examples illustrate how abandoning Rayleigh-quotient-based formulations resolves long-standing optimization pathologies across domains.
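
For readers unfamiliar with the contrast drawn above, the following is a small illustrative sketch. It is not the speaker's objective, which the abstract does not state. It uses a textbook case, assuming a symmetric positive semidefinite matrix A with eigenpairs (\lambda_i, u_i) sorted in decreasing order and a target rank k; the symbols A, V, and k are introduced here for illustration only.

```latex
% Constrained Rayleigh-quotient formulation: orthogonality is enforced explicitly.
\max_{V \in \mathbb{R}^{n \times k},\; V^\top V = I_k} \operatorname{tr}\!\left(V^\top A V\right)

% Unconstrained low-rank approximation objective: no orthogonality constraint on V.
\min_{V \in \mathbb{R}^{n \times k}} \left\| A - V V^\top \right\|_F^2
  = \|A\|_F^2 - 2\operatorname{tr}\!\left(V^\top A V\right)
    + \operatorname{tr}\!\left((V^\top V)^2\right)
```

By the Eckart-Young theorem, any global minimizer of the second problem satisfies V V^\top = \sum_{i \le k} \lambda_i u_i u_i^\top, so the top-k eigenspace is recovered without normalization or whitening steps. The quartic term plays the role that the explicit constraint plays in the first problem.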

Bio: Jongha (Jon) Ryu is a postdoctoral associate at MIT EECS. He received his Ph.D. in Electrical and Computer Engineering from UC San Diego. His research develops statistical and mathematical foundations for scientific machine learning, with a focus on scalable spectral methods, efficient generative modeling, and reliable uncertainty quantification for scientific and engineering systems.

Location: NCS 120

Join us to share your thoughts about teaching, learning, and AI!

The landscape of higher education is rapidly evolving with the integration of Artificial Intelligence (AI). Through the Institute on AI, Pedagogy, and the Curriculum with AAC&U, we are exploring ways that we can better address AI in teaching and learning. We want to hear your experiences, your concerns, and your ideas.

This is an open discussion for all faculty and staff to share their perspectives on the opportunities and challenges AI presents in our academic environment.

We'll be exploring critical questions like:

  • In the age of AI, what are the opportunities you see for enriching the classroom and curriculum? How can it enhance student learning or your professional practice?

  • What are the most significant challenges and concerns that AI raises for you regarding academics, student integrity, or your workload?

  • What resources (tools, training, technical support, policy guidance, etc.) do you need to feel confident and successful in the age of AI?

Dates/Times:

  • Tuesday, 2/3 at 2pm

  • Friday, 2/6 at 9:30am

Please register in advance for the Zoom link.

Can't Make It? Share Your Feedback!

We understand schedules are tight. If you cannot attend the live discussion, you can still share your thoughts! Join our AI Zoom Room to share your thoughts via video recording or email rose.tirotta-esposito@stonybrook.edu with your comments and ideas.

Videos will not be shared publicly and comments will only be shared in aggregate.

Your input is vital. From pedagogy to assessment, your insights will be critical. We look forward to a thoughtful and productive conversation!

  • Dr. Rose Tirotta-Esposito (Assistant Provost; Director of CELT)

  • Dr. Elizabeth Hewitt (Associate Professor in the Department of Technology and Society (DTS) in the College of Engineering and Applied Sciences)

  • Chris Kretz (Associate Librarian and Head of Academic Engagement at SBU Libraries)

  • Prof. Rajiv Lajmi (Assistant Professor in the School of Health Professions and Chair of Applied Health Informatics)

  • Dr. Matthew Salzano (Assistant Professor in the Department of Communication in the School of Communication and Journalism)

The Hudson River Estuary (HRE) and New York Bight (NYB) are closely connected, with the HRE serving as a crucial area where many NYB marine species spawn and grow. Understanding how these biotic and abiotic environments interact, especially under rapid climate change, is key to better managing fisheries and conserving ecosystems. To better understand the HRE-NYB ecosystem, we develop a comprehensive ecosystem model that links physical and biological processes. Using data from long-term monitoring programs, we analyze ecological patterns and identify key factors regulating the ecosystem. We use this information to develop a model that mimics the food web, from tiny plankton to large predators. This model can help us better understand how environmental changes, such as rising temperatures, and human activities, such as fishing, affect marine life and the ecosystem over time. The insights from this model can support smarter fisheries management and efforts to conserve marine ecosystems in the HRE-NYB region.

IACS Student Seminar Speaker: Xiangyan Yang, Dept. of Applied Math & Statistics

Location: IACS Seminar Room or Zoom

Join Zoom Meeting: https://stonybrook.zoom.us/j/91650247483?pwd=fvAGEwadplJh7jFC5RWcdvZ5NWPJth.1
Meeting ID: 916 5024 7483
Passcode: 631055

Abstract: The remarkable success of large foundation models, such as LLMs and diffusion models, is built on learning from vast amounts of static data from the Internet. However, human learning and problem-solving are fundamentally interactive processes--humans learn by engaging with their environment, tools, search engines, and feedback loops, iteratively refining their understanding and decisions. This gap between the interactivity of human learning and the static nature of model training raises a critical question: how can we imbue foundation models with the capacity for meaningful interaction?

In this talk, I will explore methods to enhance foundation models by incorporating interaction with the external environment. I will discuss strategies such as leveraging external tools, compilers, and function calls to provide dynamic feedback to these models. Drawing inspiration from humans' interactive learning processes, I demonstrate how interaction-driven learning can lead to models that are not only more accurate but also more adaptable to real-world applications.

This work bridges the gap between static training paradigms and the dynamic, iterative nature of human intelligence, paving the way for a new generation of interactive AI systems.

Bio: Wenhu Chen has been an assistant professor in the Computer Science Department at the University of Waterloo and at the Vector Institute since 2022. He obtained the Canada CIFAR AI Chair Award in 2022 and the CIFAR Catalyst Award in 2024. He has worked for Google DeepMind as a part-time research scientist since 2021. Before that, he obtained his PhD from the University of California, Santa Barbara under the supervision of William Wang and Xifeng Yan. His research interests lie in natural language processing, deep learning, and multimodal learning. He aims to design models that handle complex reasoning scenarios such as math problem-solving and structured knowledge grounding. He is also interested in building more powerful multimodal models to bridge different modalities. He received the Area Chair Award at AACL 2023, the Best Paper Honorable Mention at WACV 2021, the Best Paper Finalist at CVPR 2024, and the UCSB CS Outstanding Dissertation Award in 2021.
University Libraries Present: Qualitative data can be challenging to analyze and interpret effectively. In this workshop, SBU Libraries' Data Literacies Lead, Ahmad Pratama, will show you how to extract meaningful insights from textual data, including understanding sentiment trends. Learn to explore qualitative data with Python using word clouds, basic natural language processing (NLP) techniques, and lexicon-based sentiment analysis with VADER.
https://stonybrook.zoom.us/meeting/register/k0r6mPYCRayk2AOGmyd0qw#/registration
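
As a rough illustration of what "lexicon-based sentiment analysis" means, the sketch below scores text by summing per-word values from a small dictionary and flipping the sign after a negation word. It is not VADER and not the workshop's material: the lexicon values, the negation list, and the function names `score` and `label` are all invented here for illustration. VADER uses a much larger, human-rated lexicon and additional rules for punctuation, capitalization, and intensifiers.

```python
import re

# Hypothetical toy lexicon (values invented for this sketch, not VADER's).
LEXICON = {
    "good": 2.0, "great": 3.0, "love": 3.0,
    "bad": -2.0, "terrible": -3.0, "hate": -3.0,
}
# Hypothetical negation words that flip the sign of the next lexicon word.
NEGATIONS = {"not", "never", "no"}


def score(text):
    """Sum lexicon values over the tokens, flipping the sign after a negation."""
    tokens = re.findall(r"[a-z']+", text.lower())
    total = 0.0
    for i, token in enumerate(tokens):
        value = LEXICON.get(token, 0.0)
        if value and i > 0 and tokens[i - 1] in NEGATIONS:
            value = -value
        total += value
    return total


def label(text):
    """Map the summed score to a coarse sentiment label."""
    s = score(text)
    if s > 0:
        return "positive"
    if s < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for sentence in ["I love this great workshop", "This was not good"]:
        print(sentence, "->", label(sentence))
```

Applying `label` to each response in a dataset and counting the labels over time gives a simple view of sentiment trends.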

Learn how to prompt AI to help clean datasets and write formulas in Google Sheets.

When you have a messy dataset, it can take a lot of time to clean it up before you can start analyzing. Can AI help? In this workshop, we'll collect live data and then use Gemini AI (the standalone tool) to help clean up the data. Then, we'll use it to help do some analysis. Because we'll be working live in Gemini with freshly collected data, we don't know exactly what will happen, but that's the reality of data and data cleaning!

In this session, you will

  1. Craft effective AI prompts to generate Google Sheets formulas for data analysis and manipulation
  2. Utilize Gemini to develop regular expression formulas to extract, reformat, and clean text-based data
  3. Develop formulas for numerical analysis using Gemini AI
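
To give a feel for the extract, reformat, and clean steps listed above, here is a minimal Python sketch of the same three regular-expression tasks. It is only an analogy to the Google Sheets functions REGEXEXTRACT and REGEXREPLACE, not the formulas used in the session; the function names, patterns, and sample values are hypothetical. Sheets and Python use different regex engines, so a pattern may need adjusting when moved between them.

```python
import re


def extract_email(cell):
    """Extract the first email-like substring (similar in spirit to REGEXEXTRACT)."""
    match = re.search(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+", cell)
    return match.group(0) if match else ""


def clean_whitespace(cell):
    """Collapse whitespace runs and trim the ends (similar in spirit to REGEXREPLACE)."""
    return re.sub(r"\s+", " ", cell).strip()


def reformat_date(cell):
    """Rewrite M/D/YYYY as YYYY-MM-DD; return the cell unchanged if it does not match."""
    match = re.fullmatch(r"\s*(\d{1,2})/(\d{1,2})/(\d{4})\s*", cell)
    if not match:
        return cell
    month, day, year = match.groups()
    return f"{year}-{int(month):02d}-{int(day):02d}"


if __name__ == "__main__":
    print(extract_email("Contact: jane.doe@example.edu (office)"))
    print(clean_whitespace("  Stony   Brook \t NY "))
    print(reformat_date("2/3/2026"))
```

Returning the original cell when a pattern does not match keeps unexpected entries visible for manual review instead of silently blanking them.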

https://stonybrookuniversity.co1.qualtrics.com/jfe/form/SV_dht1o3rNzlZhHka?source=event+manager&session=0815250900sheets

You are cordially invited to attend the biweekly Brookhaven AI Mixer (BAM). BAM includes three short talks on AI research happening at BNL, followed by an open mixer over coffee and snacks for everyone to network and discuss all things AI. The first half hour will consist of presentations that will be available via Zoom, and the second half hour will be for in-person networking only.

Join us every other Tuesday at noon in CDSD's Training Room (building 725, 2nd floor) to learn about interesting AI methods and applications, engage with potential collaborators, prepare for pending FASST funding calls, and build a community of AI for Science at BNL.

Tuesday, November 12, 2024, 12:00 pm -- CDS, Bldg. 725, Training Room

Speakers

Carlos Soto, CDS

Yi Huang, CDS

Kevin Yager, CFN

Abstract: Many foundation models for digital pathology have been released recently, so benchmarking the available methods is essential for a clearer view of the research landscape. For this reason, we introduce THUNDER, a tile-level benchmark for digital pathology foundation models. It allows efficient comparison of many models on diverse datasets across a series of downstream tasks, studying their feature spaces and assessing the robustness and uncertainty of predictions informed by their embeddings. Such foundation models are often used as feature extractors and combined with Multiple Instance Learning (MIL) aggregators at downstream time. Such aggregation must be efficient and reliable. We will focus on two specific examples of this: (i) HistAug, a fast and efficient generative model for controllable augmentations in the latent space of foundation models to perform data augmentation for MIL, and (ii) CAR-MIL, a method based on counterfactual attention regularisation to improve the reliability of attention maps of MIL methods.

Short-bio: Pierre Marza is a Postdoctoral Researcher at CentraleSupelec in the Biomathematics team of the MICS lab, studying Computer Vision and Deep Learning for Medical Imaging, with a focus on Digital Pathology. Prior to this, he was a PhD student at INSA Lyon, in the LIRIS and CITI labs, advised by Christian Wolf, and co-advised by Laetita Matignon and Olivier Simonin. He studied Visual Navigation, Embodied AI, Spatial Reasoning, more specifically how to learn to represent 3D space, generalize to new environments and master diverse tasks from light supervision.

Location: NCS 220

Zoom: https://stonybrook.zoom.us/j/94798224254?pwd=CFraer25qnpORbJ14aAVHRwaSJOjJM.1

Imagine machines that can see beyond human limitations--drones locating hidden survivors, cameras predicting structural failures, or medical devices detecting tumors beneath the skin. Traditional vision systems are constrained by the boundaries of human perception, missing vast information present in light interactions. This talk explores the development of advanced vision systems that capture underutilized dimensions of light, model intricate light-scene interactions, and extract hidden 3D information--around corners, beneath surfaces, and at high speeds. By jointly developing novel imaging hardware, efficient rendering models, and physics-based learning algorithms, we aim to transcend conventional vision capabilities--unlocking critical applications in autonomous navigation, structural monitoring, and non-invasive medical imaging.

Speaker Bio:


Akshat Dave is a Postdoctoral Associate at MIT Media Lab in the Camera Culture group working with Prof. Ramesh Raskar. He received his Ph.D. from the Rice University ECE Department in 2023, where he was advised by Prof. Ashok Veeraraghavan. His research lies at the intersection of applied optics, computer graphics, and computer vision, and focuses on developing vision systems that go beyond human perception. His work has been recognized by Rice University's Best Thesis Award, the OSA Best Paper Prize, and fellowships from Texas Instruments and Qualcomm.
As generative AI tools become increasingly prevalent in education, their impact on collegiate writing raises important questions about creativity, academic integrity, and effective teaching practices. This panel brings together faculty and students to share perspectives on the opportunities and challenges that AI presents in an academic setting. Through an open dialogue, participants will engage in meaningful conversations, allowing for a deeper understanding of each other's viewpoints and fostering collaboration. Students and faculty will explore diverse ways AI can be used in teaching and learning and seek solutions to utilize AI writing tools ethically. This exchange aims to build a community of trust and shared knowledge, ensuring that AI's role in education is both innovative and responsible.

Register here: https://stonybrook.zoom.us/meeting/register/tJAqdOitpjIpHtDGAsGBfEb3ah0YIzhIJolN