Abstract: Astronomers slowly made sense of the cosmos by following the stars night after night. I suggest we examine human identity in a similar way. Let's observe the words individuals use to describe themselves day after day. In this presentation, I will introduce ipseology - a new approach to studying human selves. Ipseology is the systematic, empirical study of ipseity: selfhood, individuality and the elements of identity. The primary idea is that we can learn a lot about people from their self-authored self-descriptions - especially if we follow their revisions over time. I will discuss results from sampling millions of social media bios over more than a decade and present new approaches for observation in the Post-API age.
Bio: Dr. Jason Jeffrey Jones is a computational social scientist whose expertise includes online experiments, social networks, high-throughput text analysis and machine learning. He is interested in humans' perceptions of themselves and the developing role of artificial intelligence in society. Dr. Jones is the director of CSSERG (pronounced sea surge): the Computational Social Science of Emerging Realities Group. CSSERG is a team of scholars committed to cross-disciplinary collaboration, united by common computational methodologies and always with eyes on the near future. CSSERG has studied the effectiveness of virtual reality in evoking empathy, the dynamics of gender stereotypes in language over decades and temporal trends in personally expressed identity.
This seminar will take place in person and online (Zoom link below):
Join Zoom Meeting: https://stonybrook.zoom.us/j/
Meeting ID: 936 8660 9778 | Passcode: 638699
Speaker TBD
Refreshments will be provided
As generative AI tools become increasingly prevalent in education, their impact on collegiate writing raises important questions about creativity, academic integrity, and effective teaching practices. This panel brings together faculty and students to share perspectives on the opportunities and challenges that AI presents in an academic setting. Through open dialogue, participants will gain a deeper understanding of each other's viewpoints and foster collaboration. Students and faculty will explore diverse ways AI can be used in teaching and learning and seek ways to use AI writing tools ethically. This exchange aims to build a community of trust and shared knowledge, ensuring that AI's role in education is both innovative and responsible.
Register here: https://stonybrook.zoom.us/meeting/register/tJAqdOitpjIpHtDGAsGBfEb3ah0YIzhIJolN
CVPR 2020 - Seattle, Washington - Strong possibility this will be held remotely - http://cvpr2020.thecvf.com/
This event brings together people with interests in Computer Vision theory and techniques and examines current research issues in the field.
Each seminar consists of multiple short talks (around 15 minutes) by several students.
Join Zoom Meeting:
https://stonybrook.zoom.us/j/93547152068?pwd=WVpoRVgzelBXeloxdXVEakNSb2M5UT09
Meeting ID: 935 4715 2068 | Passcode: 481832
Hieu Le presents Incorporating Physical Illumination Constraints into Deep Learning Shadow Detection and Removal (PhD Proposal)
Shadows provide useful cues for analyzing a scene but also hamper many computer vision algorithms, such as image segmentation, object detection, and tracking. For these reasons, shadow detection and shadow removal have been well-studied topics in computer vision. Early approaches to shadow detection and removal focus on physical illumination models of shadows. These methods can express, identify, and remove shadows in a physically plausible manner. However, these models are often hard to optimize and slow at inference because they rely on hand-designed image features. On the other hand, recent deep-learning approaches have achieved breakthroughs in performance for both shadow detection and removal. They learn to extract useful features automatically through training while being extremely efficient in computation. However, these models are data-dependent and opaque, and they ignore the physical aspects of shadows.
We propose to incorporate physical illumination constraints into deep-learning frameworks, so that the mapping learned by the deep network closely follows the physics of shadows, enabling the network to modify shadows in images systematically and realistically. For shadow detection, we present a novel GAN framework in which the generator produces realistic images with attenuated shadows that can be used to train a shadow detector. For shadow removal, we propose a method that uses deep networks to estimate the unknown parameters of a shadow image formation model that removes shadows. The system outputs high-quality shadow-free images with no image artifacts and achieves state-of-the-art shadow removal performance. Lastly, we propose a system trained without any shadow-free images, in which physical constraints play a pivotal role in enabling the networks to be trained.
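As a rough illustration of what "estimating the parameters of a shadow image formation model" could look like, the sketch below applies a per-channel linear relighting map to shadowed pixels and blends the result with the original image using a shadow matte. The abstract does not specify the model, so the linear form, the matte, and the names `relight`, `remove_shadow`, `w`, and `b` are all assumptions made for this example; in the proposed system such parameters would be predicted by a network rather than supplied by hand.

```python
def relight(pixel, w, b):
    """Apply an assumed per-channel linear map w*p + b to one RGB pixel, clipped to [0, 1]."""
    return tuple(min(1.0, max(0.0, wc * pc + bc)) for pc, wc, bc in zip(pixel, w, b))


def remove_shadow(image, matte, w, b):
    """Blend relit and original pixels.

    image: rows of RGB tuples with values in [0, 1]
    matte: rows of floats in [0, 1]; 1.0 means fully in shadow (hypothetical input)
    w, b:  per-channel scale and offset (hypothetical, hand-picked here)
    """
    result = []
    for row, matte_row in zip(image, matte):
        out_row = []
        for pixel, alpha in zip(row, matte_row):
            lit = relight(pixel, w, b)
            # Shadowed pixels take the relit value; lit pixels are left unchanged.
            out_row.append(tuple(alpha * l + (1.0 - alpha) * p for l, p in zip(lit, pixel)))
        result.append(out_row)
    return result


# One shadowed pixel and one lit pixel.
image = [[(0.2, 0.2, 0.2), (0.8, 0.8, 0.8)]]
matte = [[1.0, 0.0]]
restored = remove_shadow(image, matte, w=(2.0, 2.0, 2.0), b=(0.1, 0.1, 0.1))
```

Because only pixels covered by the matte are modified, the lit region passes through untouched, which is one way a physical constraint can limit the artifacts an unconstrained image-to-image mapping might introduce.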
For Zoom information, please email events@cs.stonybrook.edu.
Kate Armstrong, a Vancouver-based artist, writer, and independent curator, will explore the role of AI in art and creativity through three AI-driven projects: KEKE Terminal, Botto, and Sasha Stiles' AI collaborator Technelegy. She will compare these projects to historical artistic movements and investigate AI's role as an autonomous creative agent, the function of community participation, and the shifting dynamics of authorship.
Location: Humanities Institute Room 1008
How Language Makes us Smart (without Big Data) presented by Charles Yang
Abstract: Language provides the glue that combines simpler concepts into complex ones. To study how language guides conceptual development, we need precise accounts of how rules are learned from the child's linguistic experience, which is extremely limited in comparison to the amount of data available to current machine learning methods. In this talk, I discuss a mathematical model of inductive generalization, which enables language learning with a very small amount of data. Such a view of learning has strong implications for the cross-cultural/linguistic variation of development. As a case study, I show that Hong Kong children learning Cantonese, which has a relatively simple formal counting system, develop an understanding of symbolic numbers a full year ahead of English-learning children in the United States, which is precisely predictable from the learning model. The new conception of learning adds another wrinkle to the eternal question of how language and thought are related to each other.
Bio: Charles Yang studied at the MIT AI lab and now teaches linguistics, computer science and psychology and directs the Program in Cognitive Science at the University of Pennsylvania. He is the author of several books, including The Price of Linguistic Productivity (MIT Press, 2016), which won the Leonard Bloomfield Award from the Linguistic Society of America. His honors include a Guggenheim fellowship.
Postmortem Program Analysis from a Conventional Program Analysis Method to an AI-assisted Approach
Abstract: Despite the best efforts of developers, software inevitably contains flaws that may be leveraged as security vulnerabilities. Modern operating systems integrate various security mechanisms to prevent software faults from being exploited. To bypass these defenses and hijack program execution, an attacker needs to constantly mutate an exploit and make many attempts. During these attempts, the exploit triggers a security vulnerability and causes the running process to terminate abnormally.
After a program has crashed and abnormally terminated, it typically leaves behind a snapshot of its crashing state in the form of a core dump. Although a core dump carries a large amount of information and has long been used for software debugging, it offers little help in locating software faults, particularly memory corruption vulnerabilities. As such, previous research mainly relies on fully reproducible execution tracing to identify software vulnerabilities in crashes. However, such techniques are usually impractical for complex programs. Even for simple programs, the overhead of fully reproducible tracing may only be acceptable during in-house testing.
In this talk, I will discuss how we tackle this issue by bridging program analysis with artificial intelligence (AI). More specifically, I will first talk about the history of postmortem program analysis, characterizing existing approaches and their limitations. Second, I will introduce how we design a new reverse-execution approach for postmortem program analysis. Third, I will discuss how we integrate AI into our reverse-execution method to improve its analysis efficiency and accuracy. Last but not least, I will demonstrate the effectiveness of this AI-assisted postmortem program analysis framework on a large number of real-world programs.
Bio: Dr. Xinyu Xing is an Assistant Professor at Pennsylvania State University. His research interests include exploring, designing and developing new program analysis and AI techniques to automate vulnerability discovery, failure reproduction, vulnerability diagnosis (and triage), and exploit and security patch generation. His past research has been featured by many mainstream media outlets and has received best paper awards from ACM CCS and ACSAC. Going beyond academic research, he also actively participates in and hosts many world-class cybersecurity competitions (such as HITB and XCTF). He is the founder of JD-OMEGA, and his team has been selected for the DEFCON/GeekPwn AI challenge grand final in Las Vegas. Currently, his research is mainly supported by NSF, ONR, NSA and industry partners.
Recently, large-scale language data combined with modern machine learning techniques have shown strong value as a means of studying human psychology and behavior. For example, language alone has been shown to be predictive of mental health, personality, and health behaviors. However, many applications of such language-based assessments have readily available and important data beyond language (i.e., extra-linguistic data). When predicting the subjective well-being of a community from tweets, for example, one can also take into account its age, education, and demographic attributes. Language may capture some characteristics while extra-linguistic variables capture others. We believe that effectively integrating linguistic and extra-linguistic data can yield benefits beyond what either provides independently.
In this thesis, we develop methods that effectively integrate extra-linguistic data with language data, focusing primarily on social scientific applications. The central challenge is reconciling the size and heterogeneity of language data, which is often sparse and noisy, with extra-linguistic variables, which are often low-dimensional and non-sparse. First, we consider structured extra-linguistic data, such as socioeconomic variables (income and education rates) and demographics (age, gender, etc.), and propose two integration methods, residualized controls (RC) and residualized factor adaptation (RFA), for use in county-wise prediction tasks. Using techniques that integrate information at both the model level and the data level, we find consistently strong improvements over naively combining features, for example, improving county-level well-being predictions by over 12%. Next, we consider unstructured extra-linguistic data. In the first part, we incorporate social network connections and language over time to propose a novel metric for quantifying the stickiness of words: their ability to spread across friendship connections in a social network over time (in other words, to stick in one's vocabulary after seeing friends use them). We identify which language features are more likely to disseminate through friendship and show that such a metric is useful for predicting who will be friends and what content will spread. In addition, we analyze language content over time by proposing a novel dynamic content-specific topic modeling technique that can help identify different sub-domains of a thematic scope and can be used to track societal shifts in concerns or views over time.
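The abstract names residualized controls but does not define the method, so the following is only a minimal sketch of one natural reading of the term: first fit the outcome on the extra-linguistic control, then fit the language feature to whatever the control left unexplained (the residual), and add the two predictions together. The single-feature setup, the toy numbers, and the function names `fit_line`, `fit_residualized`, and `predict` are assumptions for illustration, not the thesis's implementation.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares with one predictor; returns (slope, intercept)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def fit_residualized(control, language, target):
    """Two-stage fit: control first, then language on the residuals."""
    a, b = fit_line(control, target)
    residuals = [t - (a * c + b) for c, t in zip(control, target)]
    c, d = fit_line(language, residuals)
    return a, b, c, d


def predict(model, control, language):
    """Sum the control-based prediction and the language-based correction."""
    a, b, c, d = model
    return [(a * ci + b) + (c * li + d) for ci, li in zip(control, language)]


# Toy "counties": one control (e.g. an income indicator) and one language feature.
control = [0, 0, 1, 1]
language = [0, 1, 0, 1]
well_being = [1, 4, 3, 6]  # generated as 2*control + 3*language + 1

model = fit_residualized(control, language, well_being)
predictions = predict(model, control, language)
```

In this toy data the control and the language feature are uncorrelated, so the two-stage fit recovers the generating relationship exactly. With real, high-dimensional language features the second stage would typically be a regularized model, which is one motivation for keeping the low-dimensional controls in a separate first stage.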