Predicting Subjective Attributes in Visual Data - Zijun Wei

ABSTRACT: Recent progress in deep neural networks has revolutionized many computer vision tasks such as image classification, detection and segmentation. However, in addition to excelling in tasks that predict well-defined objective information, human-centered artificial intelligence systems should also be able to model subjective attributes, as defined by human perceptual behavior, that go beyond the pure physical content of visual data. Example subjective tasks are the prediction of spatial or temporal regions that are interesting to humans (e.g., attract attention or are visually pleasing) and the recognition of subjective attributes (e.g., visually elicited sentiments). Better models for these tasks will improve the human-computer interaction experience in various applications. This thesis investigates several approaches to address the challenges in predicting those subjective attributes in visual data over a diverse set of tasks. I first present a novel framework for real-time automatic photo composition. The framework consists of a cost-effective data collection workflow, an efficient model training pipeline and a lightweight module to account for personalized preferences. Then I develop a novel and general algorithm to detect interesting segments in sequential data, which can be naturally applied to video summarization tasks. Furthermore, I propose methods that learn to represent sentiments elicited by images, in an unsupervised manner, using linguistic features extracted from large-scale Web data. To conclude this thesis, I introduce a human-vision-inspired image classification algorithm that also predicts spatial visual attention even though no attention data was used for training it.

Towards Saving Lives with Natural Language Processing
Andrew Schwartz, Dept. of Computer Science, Stony Brook

Analyzing language use patterns is proving to be a valuable and unique approach to understanding the psychological, social, and health factors of people. On the individual level, Facebook and Twitter have been found predictive of mental health, personality, demographics, and occupational class (among others). At the community or county level, Twitter has been found predictive of flu and allergy outbreaks, life satisfaction, atherosclerotic heart disease mortality, health behavioral risk factors, excessive drinking, and HIV prevalence. While these techniques have shown robust links over a plethora of important aspects of human life, it is not clear whether any lives have been saved, at least directly, by such work. At their core, some barriers to improving health care and saving lives are likely not NLP or even AI problems, but others are perhaps technical in nature and suggest changing the way we model data. This seminar will have two parts: a presentation and a discussion. I will start by going over recent and ongoing work toward predicting mental health outcomes (depression, addiction relapse, future psychological distress) from human language use patterns. Then, I will present an imperfect vision of a future where NLP helps to save lives and open the floor for discussion of technical barriers and whether such a vision is practical.

Biography: Andrew Schwartz received his PhD in Computer Science from the University of Central Florida in 2011 with research on acquiring lexical semantic knowledge from the Web. He then joined the University of Pennsylvania where he was a Postdoctoral Research Fellow and later Visiting Assistant Professor in Computer & Information Science. He is Lead Research Scientist for the World Well-Being Project, a multidisciplinary group of Computer Scientists and Psychologists studying physical and psychological well-being based on language in social media.

When: Thu, 10/28/2021, 10 am
Where: NCS Room 220, or
Zoom: https://stonybrook.zoom.us/j/97978463739?pwd=aVJFVERQa25jYjJrOFZEcWVuSzJLdz09

Deep Surface Meshes
Pascal Fua, EPFL

Geometric Deep Learning has recently made striking progress with the advent of Deep Implicit Fields (SDFs). They allow for detailed modeling of watertight surfaces of arbitrary topology while not relying on a 3D Euclidean grid, resulting in a learnable 3D surface parameterization that is not limited in resolution. Unfortunately, they have not yet reached their full potential for applications that require an explicit surface representation in terms of vertices and facets because converting the SDF to such a 3D mesh representation requires a marching-cube algorithm, whose output cannot be easily differentiated with respect to the SDF parameters. In this talk, I will discuss our approach to overcoming this limitation and implementing convolutional neural nets that output complex 3D surface meshes while remaining fully differentiable and end-to-end trainable. I will also present applications to single view reconstruction, physically-driven shape optimization, and bio-medical image segmentation.


Bio:
Pascal Fua received an engineering degree from Ecole Polytechnique, Paris, in 1984 and a Ph.D. in Computer Science from the University of Orsay in 1989. He joined EPFL (Swiss Federal Institute of Technology) in 1996 where he is a Professor in the School of Computer and Communication Science and head of the Computer Vision Lab. Before that, he worked at SRI International and at INRIA Sophia-Antipolis as a Computer Scientist. His research interests include shape modeling and motion recovery from images, analysis of microscopy images, and Augmented Reality. He has (co)authored over 300 publications in refereed journals and conferences. He has received several ERC grants. He is an IEEE Fellow and has been an Associate Editor of the IEEE Transactions on Pattern Analysis and Machine Intelligence. He often serves as program committee member, area chair, and program chair of major vision conferences and has cofounded three spinoff companies.

Prof. Eugene A. Feinberg, from the Department of Applied Mathematics and Statistics, presents "Recent Developments in Markov Decision Processes Relevant to AI" on April 4 at 4 pm. The talk discusses recent developments in Markov Decision Processes potentially relevant to artificial intelligence. These developments include complexity estimations for exact and approximate algorithms, decision making with incomplete information and multiple criteria, and continuity properties of optimal values and expectations.

Dr. Eugene A. Feinberg is currently Distinguished Professor in the Department of Applied Mathematics and Statistics at Stony Brook University. He is an expert on applied probability, stochastic models of operations research, Markov decision processes, and on industrial applications of operations research and statistics. He has published more than 150 papers and edited the Handbook of Markov Decision Processes. His research has been supported by NSF, DOE, DOD, NYSTAR (New York State Office of Science, Technology, and Academic Research), NYSERDA (New York State Energy Research and Development Authority) and by industry. He is a Fellow of INFORMS (The Institute for Operations Research and the Management Sciences) and has received several awards including the 2012 IEEE Charles Hirsh Award for developing and implementing smart grid technologies, the 2012 IBM Faculty Award, and the 2000 Industrial Associates Award from Northrop Grumman. Dr. Feinberg is an Associate Editor for Mathematics of Operations Research and for Applied Mathematics Letters. He is an Area Editor for Operations Research Letters.

Refreshments will be provided.

The International Neuroethics Society (INS) Speaker Series on AI & Consciousness

AI has existed as a tool for a long time, performing simple tasks such as sorting documents, suggesting music, and so on. But with the development of new generations of AI, the perception of its value to society has been increasing, as it promises benefits in many areas of human life. AI is known to have errors or biases that result in strange or even dangerous responses, but what happens when, in AI-human interaction, it is the human who has errors or biases, including cultural ones? And what could be the implications for human relationships?

Speaker Bio

Dr. Karen Herrera-Ferrá is an independent and global consultant on ethical, medical, psychological, legal, social, cultural, policy-making, human rights and political issues and concerns on the development and use of neuroscience, neurotechnology and AI. She is a former member of the Board of Directors of the International Neuroethics Society.

Register here

https://umaryland.zoom.us/meeting/register/tJMvfuqsqDspG9BKMLfUU49UbuUyP_IEvXRh

The talk will be exclusively on Zoom: https://stonybrook.zoom.us/j/7851507944

Speaker: Sooyeon Lee, Rochester Institute of Technology

Title: Design and Evaluation of Accessible AI Technologies for Users with Disabilities

Abstract: Over one billion people in the world live with some type of disability. Many of them experience barriers in accessing information or using technologies, which can limit social interactions in both physical and digital spaces. In my research, I focus on investigating and designing nonvisual interaction for the community of blind users and non-audio and non-speech interaction for the community of deaf and hard of hearing users. In this talk, I will first present my research investigating nonvisual interaction prototypes for supporting shopping activities for blind users, with an exploration of one-way instructional and two-way conversational interactions and with a variety of form factors and communication modalities through the use of human-computer interaction research methodologies. I will also discuss incorporation of AI technology and its impact on the nonvisual guidance experiences, and further meanings of independence and new ways for designing independence for people with visual impairments. This collaborative work included AI researchers, the community of the blind, and an industry research partner. Additionally, I will discuss my findings and further exciting research opportunities. Secondly, I will overview research projects investigating AI-based applications and tools that support deaf and hard of hearing people's equitable information access and societal participation. This work addresses engagement in online social media spaces, workplace communication, participation in gig work, and interaction with mainstream technology through American Sign Language (ASL) interaction. I will focus on a recent project on users' experiences with AI deep-fake face-transformation technologies to support anonymous participation of deaf and hard of hearing signers in online social media. Lastly, I will discuss my future research directions informed and inspired by this prior and current research.

Bio: Sooyeon Lee is a postdoctoral research associate in the Golisano College of Computing and Information Sciences at Rochester Institute of Technology. She received her Ph.D., advised by Dr. John M. Carroll, in Information Sciences and Technology from the College of Information Sciences and Technology at The Pennsylvania State University, and she also conducted design research at Google and Uber. Her research is in the fields of Human-Computer Interaction and Human-AI Interaction with a focus on accessibility. She designs, builds, and evaluates new systems and applications that address accessibility barriers. Her work investigates the diversity of users, explores and leverages emerging technologies, and adopts human-centered design and inclusive design approaches in an interdisciplinary research framework. She has multiple publications in top-tier human-computer interaction and computing accessibility journals and conferences, including ACM CHI, CSCW, ASSETS, and TACCESS, and she has received a Best Paper Award Nomination at ASSETS 2021. She has served as an Associate Chair for the ACM CHI conference and will serve on the Program Committee for ASSETS 2022.

Abstract: Many unresolved legal questions over LLMs and copyright center on memorization: whether specific training data have been encoded in the model's weights during training, and whether those memorized data can be extracted in the model's outputs. While many believe that LLMs do not memorize much of their training data, recent work shows that substantial amounts of copyrighted text can be extracted from open-weight models. However, it remains an open question whether similar extraction is feasible for production LLMs, given the safety measures these systems implement. We investigate this question using a two-phase procedure: (1) an initial probe to test for extraction feasibility, which sometimes uses a Best-of-N (BoN) jailbreak, followed by (2) iterative continuation prompts to attempt to extract the book. We evaluate our procedure on four production LLMs (Claude 3.7 Sonnet, GPT-4.1, Gemini 2.5 Pro, and Grok 3), and we measure extraction success with a score computed from a block-based approximation of longest common substring (nv-recall). With different per-LLM experimental configurations, we were able to extract varying amounts of text. For the Phase 1 probe, it was unnecessary to jailbreak Gemini 2.5 Pro and Grok 3 to extract text (e.g., nv-recall of 76.8% and 70.3%, respectively, for Harry Potter and the Sorcerer's Stone), while it was necessary for Claude 3.7 Sonnet and GPT-4.1. In some cases, jailbroken Claude 3.7 Sonnet outputs entire books near-verbatim (e.g., nv-recall=95.8%). GPT-4.1 requires significantly more BoN attempts (e.g., 20X), and eventually refuses to continue (e.g., nv-recall=4.0%). Taken together, our work highlights that, even with model- and system-level safeguards, extraction of (in-copyright) training data remains a risk for production LLMs.

Speaker: Xinyue

Location: CS2311