Abstract:
People shift their visual attention to gather and prioritize information from their surroundings, helping them navigate complex environments. Understanding these attentional shifts involves decoding the features that guide where attention is directed (spatial areas of focus) and when attention shifts (timing). Decoding these processes can aid applications from interface design to medical diagnosis. However, prior models have not fully explored the underlying factors addressing these aspects. In this dissertation, we study the factors that guide visual attention across diverse image types, spanning natural images, graphic design documents, and whole slide images (WSIs) of cancer tissues, while also predicting visual attention based on these factors.
First, we propose a method to quantify object recognition uncertainty as a factor influencing spatio-temporal attention (where and when) in natural images. We found that it plays a larger role than bottom-up saliency in guiding visual attention. Second, we analyze graphic design documents such as webpages, comics, posters, and mobile UIs, which differ from natural images in that they are designed to convey specific messages or elicit a desired viewer response. We propose a unified and interpretable deep learning model that predicts both static and dynamic visual attention behavior (addressing where and when) by integrating document layout and content saliency as factors, enhancing attention prediction performance. Finally, in the domain of digital pathology, we investigate pathologists' attention during their examination of giga-pixel WSIs of prostate cancer, with the goal of aiding the development of computer-assisted pathology training and clinical decision support systems. Using a digital microscope interface, we collected the largest known dataset of pathologist attention, which allows us to study the factors that guide their spatial and temporal attention patterns (where and when) and develop predictive models. Our study explores key factors guiding their attention, including magnification, slide staining, the nature of the diagnostic task, and their expertise. Motivated by this analysis, we propose deep learning models to solve two tasks: 1) predicting pathologist attention via spatial (heatmaps) and spatio-temporal (scanpaths) models, and 2) inferring pathologist expertise level, both essential technical components towards developing an AI-assisted pathology training pipeline.

Speaker:
Souradeep Chakraborty

Location: New Computer Science Bldg., Room 220

Zoom Link: https://stonybrook.zoom.us/j/9755288447?pwd=TW95T2xqOUZjRnlqcnVFcUQvN0JMdz09
Meeting ID: 975 528 8447
Passcode: 338037

Title: AI-Driven Target Selection Methods for Touch and Gaze Input

Abstract: Accurately selecting targets is an essential aspect of Human-Computer Interaction. Erroneous selections can cause tedious undo and redo actions. Additionally, some selection errors are non-reversible and can lead to undesirable consequences. However, high-accuracy target selection remains a challenge on touchscreen devices because of small target sizes and imprecise touch input, and in gaze interaction because of gaze-tracking noise and the lack of an easy-to-use selection action. We first propose ReLM, a Reinforcement Learning-based Method for touchscreen target selection. ReLM can automatically show suggestions and require a second touch if the input is ambiguous, and can directly select a target candidate when the input is certain. Our empirical evaluation shows that ReLM reduces the error rate from 6.92% to 1.63%, and the selection time from 2.23s to 1.59s, compared with Shift, an existing suggestion-based method. Compared to BayesianCommand, a direct selection-based method, ReLM reduces the error rate from 3.64% to 0.89%, while increasing the selection time by only 200 ms. Second, we investigate how to improve target selection performance for gaze interaction. We propose BayesGaze, an eye-gaze-based target selection method. For each gaze point, it uses Bayes' theorem to compute the evidence for selecting each target, accumulates that evidence across gaze points, and uses a threshold mechanism to determine the target selection. Our investigation shows that BayesGaze improves target selection accuracy and speed over a dwell-based selection method and the Center of Gravity Mapping method.
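
As a rough illustration of the accumulate-and-threshold idea described in the abstract, the Python sketch below may help. It is not the authors' implementation: the isotropic Gaussian likelihood, the uniform prior, the parameters `sigma`, `dt`, and `threshold`, and all function names are assumptions made only for this example.

```python
import math


def gaussian_likelihood(gaze, center, sigma):
    """Unnormalized isotropic 2D Gaussian likelihood of a gaze point given a target center.

    Assumption: gaze noise is modeled as a Gaussian around the target center.
    """
    dx = gaze[0] - center[0]
    dy = gaze[1] - center[1]
    return math.exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma))


def select_target(gaze_points, targets, sigma=40.0, dt=1.0 / 60.0, threshold=0.5):
    """Return the first target whose accumulated posterior reaches `threshold`, else None.

    gaze_points: iterable of (x, y) gaze samples, in screen pixels.
    targets: dict mapping a target name to its (x, y) center.
    sigma, dt, threshold: hypothetical values (noise scale in pixels,
    sampling interval in seconds, accumulated evidence needed to select).
    """
    prior = 1.0 / len(targets)  # assumption: uniform prior over targets
    accumulated = {name: 0.0 for name in targets}

    for gaze in gaze_points:
        # Bayes' theorem: posterior is proportional to likelihood times prior.
        weighted = {
            name: gaussian_likelihood(gaze, center, sigma) * prior
            for name, center in targets.items()
        }
        total = sum(weighted.values())
        if total == 0.0:
            continue  # gaze is too far from every target to carry evidence

        # Accumulate each target's posterior, weighted by the sampling interval.
        for name, value in weighted.items():
            accumulated[name] += (value / total) * dt

        # Threshold mechanism: select once the strongest candidate has enough evidence.
        best = max(accumulated, key=accumulated.get)
        if accumulated[best] >= threshold:
            return best

    return None
```

With two hypothetical targets, `select_target([(105, 98)] * 60, {"A": (100, 100), "B": (300, 100)})` selects `"A"` after roughly half a second of samples, while a shorter burst of ten samples returns `None` because not enough evidence has accumulated.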

All are welcome. Here is the Zoom meeting link:
https://stonybrook.zoom.us/j/93130953411?pwd=Rm5IRlVPQ3M0cHJsTXpCVFljUlFGUT09
Meeting ID: 931 3095 3411
Passcode: 999413
The overall purpose of this seminar is to bring together people with interests in Computer Vision theory and techniques and to examine current research issues. This course is appropriate for people who have already taken a graduate Computer Vision course or already have research experience in Computer Vision. To enroll in this course, you must either (1) be in the PhD program or (2) receive permission from the instructors. Each seminar will consist of multiple short talks (around 15 minutes each) by multiple students. Students can register for 1 credit for CSE656. Registered students must attend and present a minimum of 2 talks. Everyone else is welcome to attend. Fill in https://forms.gle/q6UG9ygauLp2a8Po8 to subscribe to our mailing list for further announcements.

Abstract:

Many complex real-world problems are multi-step reasoning tasks. These range from analytic tasks such as answering questions to automation tasks where agents complete tasks on behalf of users. Evaluation, datasets, and models for such tasks can be unreliable for multiple reasons. (i) Datasets often have annotation artifacts and biases, allowing models to take reasoning shortcuts. Such shortcuts can allow models to make effective guesses -- or, in a sense, cheat -- to achieve high performance without any multi-step reasoning. This issue is further exacerbated for complex tasks because as the number of required reasoning steps increases, so do the avenues for bypassing those steps. (ii) Models trained on such datasets learn to solve the task by taking reasoning shortcuts instead of performing proper multi-step reasoning. As a result, these models are not robust (reliable) when evaluated in an out-of-distribution setting. (iii) Lastly, recent works have shown that language models can solve complex multi-step tasks by producing a step-by-step explanation without any training. However, these methods often hallucinate factually incorrect (i.e., unreliable) explanations when posed with knowledge-intensive tasks.

I address these challenges by carefully characterizing the requirements of robust multi-step reasoning and designing reliable evaluation datasets and training methods that necessitate thorough multi-step reasoning. In DiRe, I first formalize and introduce Disconnected Reasoning, i.e., reasoning that allows models to arrive at the correct answer by bypassing necessary reasoning steps, and use this formalization to measure how much multi-step reasoning a model does on a dataset. In MuSiQue, I built a multi-step reasoning dataset for QA from scratch that avoids cheatability via disconnected reasoning, providing a more reliable evaluation. In TeaBReaC, I developed a synthetically generated multi-step QA pretraining dataset designed to force models to avoid disconnected reasoning and learn reliable multi-step reasoning. In IRCoT, I address the reliability of model-generated multi-step reasoning chains by interleaving models' step-by-step reasoning with a step-by-step retrieval from an external corpus, resulting in more factually correct reasoning. Finally, in AppWorld, I built a multi-step reasoning dataset that requires highly interactive problem-solving in an environment carefully designed to ensure models need thorough reasoning to succeed.
Speaker: Harsh Trivedi

Location: NCS 220 or Zoom

https://stonybrook.zoom.us/j/99096379762?pwd=zYCJZQVxRuZd9BboscO4nlodCwsKBr.1



Joel H. Saltz, MD, PhD
SUNY Distinguished Professor, Cherith Professor and Founding Chair
Department of Biomedical Informatics
Stony Brook University

Apostolos K. Tassiopoulos, MD, FACS
Professor of Surgery and Vice Chair for Quality and Outcomes
Chief of the Division of Vascular and Endovascular Surgery
Director of the Stony Brook Vascular Center
Stony Brook Medicine

Title: Clinical applications of artificial intelligence to improve diagnosis and risk stratification for patients with aortic aneurysms

Time: Wednesday, Feb 17, 2021 3 pm - 4 pm

Join Zoom Meeting
https://stonybrook.zoom.us/j/95617197636?pwd=KytzZ2pVRG9SZGpKZUtpNXJISj...
Meeting ID: 956 1719 7636 Passcode: 924293

University Libraries Presents: The Library AI Club is a welcoming space for students, faculty, and staff to explore AI in a supportive, low-pressure environment. Meeting every two weeks, the club features discussions, collaborative projects, guest speakers, and hands-on experiments. Join us to learn, share ideas, and engage with AI responsibly and creatively. We'd love to see you at an upcoming meeting!

Location: Melville Library, Scholarly Communication Seminar Room

Abstract: Sea ice is crucial to Earth's climate, Arctic communities, and ecosystems, yet climate change is driving significant losses, threatening polar stability. Quantifying the long-term impacts of a declining sea ice cover requires tools that improve climate-timescale prediction and bring new understanding of climate interactions. In this talk, I discuss how meeting this challenge requires a multi-disciplinary approach. Climate models, while essential, suffer from systematic biases due to missing or inaccurate physics, leading to uncertainty in future projections. I show how data assimilation (DA) offers a statistical framework for integrating satellite observations with climate models to quantify systematic sea ice model errors. Using convolutional neural networks (CNNs), we can learn these errors based on the model's atmospheric, oceanic, and sea ice conditions, which I term a state-dependent representation of the error. This approach enables real-time corrections to subsequent model simulations, which systematically reduce global sea ice biases. I highlight key successes and challenges in developing this hybrid ML+climate modeling framework, including transfer learning to enhance online generalization of ML models, and new methods for integrating Python-based ML frameworks with Fortran climate model code. Finally, I introduce GPSat, a scalable Gaussian process-based tool for reconstructing complete sea ice fields from sparse satellite altimetry data. Together, the DA+ML framework and GPSat offer future opportunities for correcting targeted model physics errors, leading to more robust climate simulation.


IACS Seminar Speaker: William Gregory, Princeton University

Location: IACS Seminar Room
AI for Conservation: AI and Humans Combating Extinction Together by Daniel I. Rubenstein of Princeton University

ABSTRACT: The state of our planet is not good. We have lost more than 60% of the world's wildlife. Stopping the decline remains a challenge, especially since acquiring appropriate knowledge is expensive, time-consuming and risky. Visual observations following the fates of a few individuals were the currency of the realm. But GPS technology and now machine learning provide a non-invasive, scalable alternative. Photographs, taken by field scientists, tourists, automated cameras and incidental photographers, are the most abundant source of data on wildlife today. Wildbook, a tech-for-conservation project coordinated by the non-profit Wild Me, is an autonomous computational system that starts from massive collections of images and, by detecting various species of animals and identifying individuals, combined with sophisticated data management, turns them into high-resolution information databases, enabling scientific inquiry, conservation and citizen science.

BIO: Dan Rubenstein is the Class of 1877 Professor of Zoology. He is currently Director of Princeton's Environmental Studies Program and is former Chair of Princeton University's Department of Ecology and Evolutionary Biology and Director of Princeton's Program in African Studies. He is a behavioral ecologist who studies how environmental variation and individual differences shape social behavior, social structure, sex roles and the dynamics of populations. He has special interests in all species of wild horses, zebras and asses, and has done field work on them throughout the world identifying rules governing decision-making, the emergence of complex behavioral patterns and how these understandings influence their management and conservation. In Kenya he also works with pastoral communities to develop and assess impacts of various grazing strategies on rangeland quality, wildlife use and livelihoods. He has also developed a scout program for gathering data on Grevy's zebras and created curricular modules for local schools to raise awareness about the plight of this endangered species. He engages people as 'Citizen Scientists' and has recently extended his work to measuring the effects of environmental change, including issues pertaining to the global commons and changes wrought by management and by global warming, on behavior.

Abstract: Theory-internal work on opacity in phonology has been focused on the challenges these interactions present for one theory (rules, constraints) versus another. But there has also been interest in studying the formal, invariant properties of opaque and other process interactions (Chandlee et al. 2018; Bakovic and Blumenfeld 2024), though these works crucially differ in their underlying assumptions. In this talk I will recontextualize Chandlee et al. (2018)'s result that opaque maps are ISL in light of Bakovic and Blumenfeld (2024)'s recent formal typology of process interactions, and this recontextualization will provide an answer to an open question about the k-value of an interaction map. I will then discuss the implications of this collective formal understanding of opacity for a recent model of lexicon and phonological grammar learning (i.e., Hua and Jardine 2021, Chandlee and Jardine to appear).


Speaker: Prof. Jane Chandlee, Associate Professor in the Department of Linguistics at Haverford College

Location: IACS Seminar Room