The North East AI Agents Day Organizing Committee invites you to '2026 AI Agents Day.'

The goal of this workshop is to offer a comprehensive overview of AI agents, bring ML, Systems, and HCI research communities together to share progress, discuss common problems and evaluation setups, and identify opportunities for collaboration. We aim to bring together attendees from diverse disciplines to foster interdisciplinary collaboration and discuss open research questions.

Location: Jane Street Offices, New York

Register here.
Abstract: As we enter the AI era, domain scientists face a critical question: What can we do to harness AI effectively for scientific discovery? AI has demonstrated remarkable capabilities, from accelerating simulations to uncovering hidden patterns in complex datasets. While these advancements offer unprecedented opportunities, they also raise concerns--AI models often function as black boxes, making it difficult to connect their outputs to established scientific principles. This lack of interpretability can undermine trust and limit adoption, particularly in fields like meteorology where physical understanding is critical.
In this talk, I will explore how interpretable AI can bridge this gap, highlighting its potential to generate explicit, physically meaningful equations rather than opaque neural networks. Through four case studies from my lab, I will showcase how interpretable AI can enhance scientific understanding:
  1. Satellite Precipitation Retrieval: Using AI-based approaches to interpret precipitation retrieval algorithms from AMSU data, we identified critical microwave channels (89 and 150 GHz) that directly link to physical processes in the atmosphere.
  2. Quantitative Precipitation Estimation (QPE): By applying symbolic regression models to polarimetric radar data, we derived mathematical expressions that outperform traditional Z-R relationships and existing QPE algorithms, offering new insights into rainfall microphysics.
  3. Tornado Probability Prediction: Leveraging reinforcement learning-based symbolic deep learning models, we developed interpretable equations that outperform the traditional Significant Tornado Parameter (STP) index, providing a clearer understanding of the relationships between key atmospheric variables and tornado risk.
  4. Domain-Aware Symbolic Regression for Scientific Equations: In our latest work, we introduced a symbolic regression framework that incorporates domain-specific symbol priors extracted from thousands of scientific publications. By encoding common mathematical structures--such as the prevalence of trigonometric functions in physics or logarithmic forms in biology--into a tree-structured reinforcement learning model, we improved both the accuracy and interpretability of discovered equations. This approach accelerates convergence, enforces physical plausibility, and reveals new governing relationships in climate and geophysical data.
Through these examples, I hope to spark discussion on the evolving role of domain scientists in the AI era and inspire new ways to integrate AI with physical understanding in atmospheric research.

IACS Seminar Speaker: Yixin Wen, University of Florida

Location: IACS Seminar Room or Zoom

Join Zoom Meeting: https://stonybrook.zoom.us/j/97596399106?pwd=0PBvElFLqov3biO6OlQxSWLWudkIuH.1
Meeting ID: 975 9639 9106
Passcode: 096213

Abstract: Traditional questionnaires remain the primary method for assessing psychological outcomes and beliefs, capturing individuals' and populations' inner states. This dissertation presents an alternative computational method that overcomes key limitations in current mental health monitoring, particularly in spatiotemporal resolution, responses to major events, and automatic belief identification. By analyzing ∼1 billion Tweets from 2 million geo-located users, we created a big data pipeline for estimating depression and anxiety at the county-week level. These Language-Based Mental Health Assessments (LBMHA) demonstrated higher reliability and validity than traditional survey measures. Our approach effectively captured mental health trends and highlighted significant increases in mental illness following major events. Using the LBMHA pipeline, we conducted quasi-experiments, research designs that simulate randomized controlled trials, to generate explanations for mental health changes due to COVID-19 incidence/death. Utilizing these time-series analyses, we conducted discontinuity forecasting for community-specific anxiety shifts using statistical learning via ensemble and contextual models. To likewise investigate individual internal states, we created a novel task and annotated dataset for self-belief language identification. Our fine-tuned language model for self-belief classification, despite its relatively small scale, outperformed GPT-4o. The self-belief topics identified by our model successfully predicted depression, anxiety, and stress, offering insights into the relationship between self-conceptualization and mental health. The adoption of scalable language-based assessments with modern distributed computation presents a promising avenue for advancing community and individual mental health research.

Speaker: Siddharth Mangalik

https://stonybrook.zoom.us/j/91251321639?pwd=faggV5jZ7ByFDCFmnLXD3HiYxjQ1Eb.1&jst=2

Time: Jan 26, 2021 03:00 PM Eastern Time (US and Canada)

All are welcome!

Zoom Meeting:
https://stonybrook.zoom.us/j/93818552212?pwd=ajZkT2x4a2tiaDJUL1h3VFhLZEgwQT09

Meeting ID: 938 1855 2212
Passcode: 802722

Title: Data-Driven Document Unwarping

Abstract: Capturing document images is a common way to digitize and record physical documents due to the ubiquity of mobile cameras. To make text recognition easier, it is often desirable to digitally flatten a document image when the physical document sheet is folded or curved. However, unwarping a document from a single image in natural scenes is very challenging due to the complexity of document sheet deformation, document texture, and environmental conditions. Previous model-driven approaches struggle with inefficiency and limited generalizability. In this thesis, I investigate several data-driven approaches to tackle the document unwarping problem.

Data acquisition is the central challenge in data-driven methods. I first design an efficient data synthesis pipeline based on 2D image warping and train DocUNet, the pioneering data-driven document unwarping model, on the synthetic data. A benchmark dataset is also created to facilitate comprehensive evaluation and comparison. To improve the unwarping performance by training on more realistic data, I introduce the Doc3D dataset and DewarpNet. Supervised by 3D shape ground truth in Doc3D, DewarpNet is significantly better than DocUNet. DocUNet and DewarpNet depend on the synthetic data for the ground truth deformation annotation. To exploit the real-world images, I propose PaperEdge, a weakly supervised model trained with in-the-wild document images with easy-to-obtain boundary information. PaperEdge surpasses DewarpNet by utilizing both the synthetic data and weakly annotated real data in the Document In the Wild (DIW) dataset. Finally, I propose to incorporate the 3D physical constraints in training DewarpNet and PaperEdge. The constraints regulate the possible deformations on document papers. I also propose to augment the Doc3D and DIW datasets by introducing an online document segmentation model and better hardware.

Visual Analytics and Machine Learning for Biomedical Imaging Diagnosis

 

Arie Kaufman

 

We present an integrated approach using visual analytics and machine learning (ML) to diagnose abnormalities in 3D radiological imaging and biological microscopes. The primary example will involve 3D virtual pancreatography (VP), a novel visualization-ML procedure and application for non-invasive diagnosis and classification of pancreatic lesions, the precursors of pancreatic cancer. Currently, non-invasive screening of patients is performed through visual inspection of 2D axis-aligned CT images, though the relevant features are often not clearly visible nor automatically detected. VP is an end-to-end visual diagnosis system that includes an ML-based automatic segmentation of the pancreatic gland and the lesions, a semi-automatic approach to extract the primary pancreatic duct, an ML-based automatic classification of lesions into four prominent types, and specialized 3D and 2D exploratory visualizations of the pancreas, lesions and surrounding anatomy. We combine volume rendering with pancreas- and lesion-centric visualizations and measurements for effective diagnosis. We designed VP through close collaboration and feedback from expert radiologists, and evaluated it on multiple real-world CT datasets with various pancreatic lesions and case studies examined by the expert radiologists. Other applications include virtual colonoscopy, COVID-19, pathology, brain neurites, etc.


Biography: Arie Kaufman is Distinguished Professor and former Chair of the Department of Computer Science at Stony Brook University, where he is also Director of the Center for Visual Computing (CVC), and Chief Scientist at the Center of Excellence in Wireless and Information Technology (CEWIT).

He received his PhD in Computer Science at Ben-Gurion University of the Negev in 1977. He is known for his work in visualization, graphics, virtual reality, user interfaces, multimedia, and their applications, especially in bio-medicine. He is especially well known for his work on the 3-dimensional virtual colonoscopy, a revolutionary low-risk technique for colon cancer screening, and for pioneering the use of Graphics Processing Units (GPUs) and GPU-clusters. In 2012, he presided over the development and opening of the Reality Deck, the largest virtual reality display in the world, at Stony Brook University.

Kaufman was the founding Editor in Chief of IEEE Transactions on Visualization and Computer Graphics (TVCG), co-founded the IEEE Visualization Conference and Volume Graphics series, and is currently the director of the IEEE Computer Society Technical Committee on Visualization and Graphics. He is an IEEE Fellow, an ACM Fellow, the winner of many awards, including the IEEE Visualization Career Award, and a member of the European Academy of Sciences.



Steven Skiena is inviting you to a scheduled Zoom meeting.

Topic: AI Seminar: Arie Kaufman
Time: Apr 21, 2021 10:00 AM Eastern Time (US and Canada)

Join Zoom Meeting
https://stonybrook.zoom.us/j/96017498640?pwd=SE0rdHB6ZVlCM2ZpY2RnRUxyVnR3Zz09

AI on Campus: Your Thoughts, Your Future

Join the Conversation: Share Your Thoughts about Learning, Academics, and AI

The world of college is changing fast, and Artificial Intelligence (AI) is at the center of it. We are part of the Institute on AI, Pedagogy, and the Curriculum with AAC&U, and we need to hear from the people AI affects most: you!

This is an open discussion for all students to share their honest experiences, their top concerns, and their best ideas about AI in our academic environment. We'll be diving into these key questions:

  • How can AI actually make learning better or easier? What opportunities do you see for using AI tools to enhance your assignments, research, or skills?

  • What are your biggest worries about AI? Is it about cheating, being graded fairly, or preparing for the job market? How is AI impacting your workload or stress levels?

  • What specific tools, workshops, or policies would help you use AI responsibly and successfully? (Think training, software, or clear rules.)

Dates/Times:

  • Wednesday, 2/4 at 2pm

  • Thursday, 2/5 at 12pm

Please register in advance for the Zoom link.

Can't Make It? Share Your Feedback!

Don't worry if you can't attend! You can still share your thoughts via video in our AI Zoom Room or via email: rose.tirotta-esposito@stonybrook.edu.

Videos will not be shared publicly and comments will only be shared in aggregate.

Your voice matters. Come tell us how AI is affecting your studies, your stress, and your success!

  • Dr. Rose Tirotta-Esposito (Assistant Provost; Director of CELT)

  • Dr. Elizabeth Hewitt (Associate Professor in the Department of Technology and Society (DTS) in the College of Engineering and Applied Sciences)

  • Chris Kretz (Associate Librarian and Head of Academic Engagement at SBU Libraries)

  • Prof. Rajiv Lajmi (Assistant Professor in the School of Health Professions and Chair of Applied Health Informatics)

  • Dr. Matthew Salzano (Assistant Professor in the Department of Communication in the School of Communication and Journalism)

How Language Makes us Smart (without Big Data) presented by Charles Yang

Abstract: Language provides the glue that combines simpler concepts into complex ones. To study how language guides conceptual development, we need precise accounts of how rules are learned from the child's linguistic experience, which is extremely limited in comparison to the amount of data available to current machine learning methods. In this talk, I discuss a mathematical model of inductive generalization, which enables language learning with a very small amount of data. Such a view of learning has strong implications for the cross-cultural/linguistic variation of development. As a case study, I show that Hong Kong children learning Cantonese, which has a relatively simpler formal counting system, develop understanding of symbolic numbers a full year ahead of English-learning children in the United States, which is precisely predictable from the learning model. The new conception of learning adds another wrinkle to the eternal question of how language and thought are related to each other.

Bio: Charles Yang studied at the MIT AI lab and now teaches linguistics, computer science and psychology and directs the Program in Cognitive Science at the University of Pennsylvania. He is the author of several books; The Price of Linguistic Productivity (MIT Press, 2016) won the Leonard Bloomfield Award from the Linguistic Society of America. His honors include a Guggenheim fellowship.



Matthew Salzano (Stony Brook), AI and DEIA: Getting at the Roots

Link to the talk (no pre-registration required this time): https://stonybrook.zoom.us/j/96209347479?pwd=Cs8fEfFdbXrGTC5cQgyHRb8Msh5vp8.1
Meeting ID: 962 0934 7479
Passcode: 272489

Abstract: Conversations about AI and DEIA (Diversity, Equity, Inclusion, and Access) often unwittingly assume that social problems can and should have technical fixes. Left unaddressed, scholars, advocates, and technologists inevitably miss important consequences in our proposed solutions, and focus on surface-level problems rather than addressing the root causes of inequity. Drawing from scholarship in communication, rhetoric, and critical digital studies, this talk explains how we are often trimming branches when we need to pull out roots -- and introduces new terms and questions that can help reorient our conversations about AI and DEIA.

Speaker Bio: Matthew Salzano, Ph.D., is a communication scholar researching new media technologies, user practices, and cultural trends that threaten to limit possibilities for diverse engagement in public argument, debate, and protest. His scholarship has appeared in journals like The Quarterly Journal of Speech, Critical Studies in Media Communication, and Women's Studies in Communication, and his research on DEIA, AI, and advocacy communications has been funded by the Waterhouse Family Institute at Villanova University. He is currently an Inclusion, Diversity, Equity, and Access fellow in Ethical AI at Stony Brook University's School of Communication and Journalism and Alan Alda Center for Communicating Science.

Spring 2026, Wednesdays 2 to 3:20 pm, NCS 220 and Zoom link to be announced soon.

The seminar will be jointly taught by Prof. Dimitris Samaras (samaras@cs.stonybrook.edu).

The overall purpose of this seminar is to bring together people with interests in Computer Vision theory and techniques and to examine current research issues. This course is appropriate for people who have already taken a graduate Computer Vision course or already have research experience in Computer Vision.

To enroll in this course, you must either: (1) be in the Ph.D. program or (2) receive permission from the instructors.

Each seminar will consist of multiple short talks (around 15 minutes) by multiple students. Students can register for 1 credit for CSE656. Registered students must attend and present a minimum of 2 talks. Registered students must attend in person. Up to 3 absences will be excused. Everyone else is welcome to attend.

Please note: Exceptionally, the first meeting on 1/28 will be in NCS 120.

Hidden Biases. Ethical Issues in NLP, and What to Do about Them presented by Dirk Hovy of Bocconi University

ABSTRACT: Through language, we fundamentally express who we are as humans. This property makes text a fantastic resource for research into the complexity of the human mind, from social sciences to humanities. However, it is exactly that property that also creates some ethical problems. Texts reflect the authors' biases, which get magnified by statistical models. This has unintended consequences for our analysis: If our data is not reflective of the population as a whole, if we do not pay attention to the biases contained, we can easily draw the wrong conclusions, and create disadvantages for our users.

In this talk, I will discuss several types of biases that affect NLP models, their sources, and potential countermeasures: (1) Bias stemming from data, i.e., selection bias (if our texts do not adequately reflect the population we want to study), label bias (if the labels we use are skewed) and semantic bias (the latent stereotypes encoded in embeddings); (2) Biases deriving from the models themselves, i.e., their tendency to amplify any imbalances that are present in the data; (3) Design bias, i.e., the biases arising from our (the researchers') decisions about which topics to analyze, which data sets to use, and what to do with them. For each bias, I will provide examples and discuss the possible ramifications for a wide range of applications, and various ways to address and counteract these biases, ranging from simple labeling considerations to new types of models.

BIO: Dirk Hovy is an associate professor of Computer Science in the department of marketing at Bocconi University. He received his PhD from the University of Southern California in Los Angeles, where he worked as a research assistant at the Information Sciences Institute.

He works in Natural Language Processing (NLP), a subfield of artificial intelligence. His research focuses on computational social science. His interests include integrating sociolinguistic knowledge into NLP models, using large-scale statistics to model the interaction between people's socio-demographic profile and their language use, and ethics for data science and algorithmic fairness.