Abstract: Self-supervised representation learning (SRL) has emerged as a pivotal advancement in machine learning, offering high-quality data representations without the need for labeled datasets. While SRL has demonstrated enhanced adversarial robustness compared to supervised learning, its resilience against other attack types, particularly backdoor attacks, remains an open question. Recent studies have revealed potential vulnerabilities in SRL, underscoring the necessity for a comprehensive security analysis. However, existing research often extrapolates attacks from supervised learning paradigms, neglecting the unique challenges and opportunities inherent to self-supervised mechanisms.

This thesis proposal aims to address three critical objectives in the realm of self-supervised learning: (1) exploring novel attack vectors, (2) implementing and evaluating practical attacks, and (3) developing robust countermeasures. We focus on two key SRL paradigms: Contrastive Learning and Diffusion Models. For Contrastive Learning, we synthesize existing security vulnerabilities and introduce innovative attack vectors, such as CTRL, to uncover distinctive risks. We conduct a comparative analysis of contrastive and supervised learning approaches in their defense against these threats, exploring potential safeguards and highlighting the limitations of current protective measures in self-supervised contexts. Regarding Diffusion Models, we demonstrate inherent vulnerabilities in their application to adversarial purification.

Our research aims to illuminate the unique challenges posed by emerging attack vectors in self-supervised learning, fostering technical advancements to address underlying security risks in real-world applications. By contributing to the development of more resilient and secure self-supervised representation learning systems, we seek to enhance their reliability and trustworthiness in practical scenarios. This comprehensive examination of SRL's security landscape will provide valuable insights for the broader machine-learning community and pave the way for more robust AI systems.

The Renaissance School of Medicine Department of Scientific Affairs and its Single Cell Genomics facility are excited to host a special seminar and discussion on AI and single-cell genomics analysis:

With the decreasing cost of sequencing, many biobanks and large research cohorts have moved to whole genome sequencing (WGS) and single-cell RNA-seq. However, making use of this deluge of data remains a challenge. I will discuss statistical and deep learning approaches that we are exploring to address the challenge of noncoding variant interpretation, including our work as part of the Alzheimer's disease sequencing project.

Speaker: David A. Knowles, PhD, Assistant Professor of Computer Science and Interdisciplinary Appointee in Systems Biology, Columbia University; Core Faculty Member, New York Genome Center

Join us in person: Health Science Tower Level 3, Lecture Hall 5
The overall purpose of this seminar is to bring together people with interests in Computer Vision theory and techniques and to examine current research issues. This course is appropriate for people who have already taken a graduate Computer Vision course or who already have research experience in Computer Vision. To enroll in this course, you must either (1) be in the PhD program or (2) receive permission from the instructors. Each seminar will consist of several short talks (around 15 minutes each) given by different students. Students can register for 1 credit for CSE656. Registered students must attend and present a minimum of 2 talks. Everyone else is welcome to attend. Fill in https://forms.gle/q6UG9ygauLp2a8Po8 to subscribe to our mailing list for further announcements.
Title: Deep Contextual Modeling for Natural Language Understanding, Generation, and Grounding

Zoom instructions: Join Zoom Meeting https://stonybrook.zoom.us/j/645050299?pwd=TVJVRkc3dlhxdDF5d00xWGlDQkovZz09
Meeting ID: 645 050 299
Password: 810247

One tap mobile:
+16468769923,,645050299#,,#,810247# US (New York)
+13126266799,,645050299#,,#,810247# US (Chicago)

Dial by your location:
+1 646 876 9923 US (New York)
+1 312 626 6799 US (Chicago)
+1 301 715 8592 US
+1 346 248 7799 US (Houston)
+1 408 638 0968 US (San Jose)
+1 669 900 6833 US (San Jose)
+1 253 215 8782 US
Find your local number: https://stonybrook.zoom.us/u/aemTiJMXu6

Abstract: Natural language is a fundamental form of information and communication. In both human-human and human-computer communication, people reason about the context of text and world state to understand language and produce language responses. In this talk, I present several deep neural network-based systems that first understand the meaning of language grounded in the various contexts where the language is used, and then generate effective language responses in different forms for information access and human-computer communication. First, I will introduce Speaker Interaction RNNs for addressee and response selection in multi-party conversations, based on explicit representations of the different discourse participants. Then, I will present a text summarization approach for generating email subject lines by optimizing quality scores in a reinforcement learning framework. Finally, I will show an editing-based multi-turn SQL query generation system, a step towards intelligent natural language interfaces to databases.

Bio: Rui Zhang is a final-year Ph.D. student at Yale University advised by Professor Dragomir Radev. His research interests lie in various natural language processing problems in understanding, generation, and grounding. He has been working on (1) End-to-End Neural Modeling for Entities, Sentences, Documents, and Multi-party Multi-turn Dialogues, (2) Text Summarization for Emails, News, and Scientific Articles, (3) Cross-lingual Information Retrieval for Low-Resource Languages, and (4) Context-Dependent Text-to-SQL Semantic Parsing in Human-Computer Interaction. Rui Zhang has published papers and served as a Program Committee member at top-tier NLP and AI conferences including ACL, NAACL, EMNLP, AAAI, and CoNLL. During his Ph.D., he has done research internships at IBM Thomas J. Watson Research Center, Grammarly Research, and Google AI. He was a graduate student at the University of Michigan and received bachelor's degrees from both the University of Michigan and Shanghai Jiao Tong University through the UM-SJTU Joint Institute.
Making sense of Twitter @ Bloomberg presented by Daniel Preotiuc-Pietro

ABSTRACT: The Bloomberg Terminal has provided ways for investors and journalists to sift through and understand the immense volume of tweets and discover financially-relevant content ever since the SEC approved the use of Twitter for company disclosures back in 2013.

In the first part of the talk, I will showcase how tweets impact financial markets and how Bloomberg is using Natural Language Processing methods to identify financially relevant tweets that move the markets. Our processing pipeline feeds directly to clients and to journalists in the newsroom, and it powers several news analytics products offered by the company, including trending companies and consumer sentiment for publicly traded equities.

However, understanding users' pragmatic intent in individual tweets would allow us to gain deeper insights and enable new applications. I will present several recent research studies focused on understanding intent, including identifying complaints and the roles that vulgarity plays in social media, and show how these can help improve applications such as sentiment analysis and hate speech detection.

BIO: Daniel Preotiuc-Pietro is a Senior Research Engineer and Team Lead at Bloomberg LP, where he works on analyzing and building models for real-world, large-scale social media and news mining and information extraction. His research interests are focused on understanding the social and temporal aspects of text, especially from social media, with applications in domains such as Social Psychology, Law, Political Science and Journalism. Several of his research studies have been featured in the popular press, including the Washington Post, BBC, New Scientist, Scientific American and FiveThirtyEight. He is a co-organizer of the Natural Legal Language Processing workshop series. Prior to joining Bloomberg LP, Daniel was a postdoctoral researcher at the University of Pennsylvania with the interdisciplinary World Well Being Project, and he obtained his PhD in Natural Language Processing and Machine Learning at the University of Sheffield, UK.
Join librarian Christine Fena for an interactive workshop that invites you to explore AI tools firsthand, not just as users, but as critical investigators. Through playful experimentation and collaborative discovery, you'll uncover inherent biases, probe algorithmic flaws, and gain a deeper understanding of AI's limitations and societal impacts.

Location: Melville Library, Central Reading Room, Lab B

https://library.stonybrook.edu/library-events/critiquing-ai/