Speaker: Petar M. Djurić

Refreshments will be provided

Deep Gaussian processes: Theory and applications
Petar M. Djurić
Department of Electrical and Computer Engineering
Stony Brook University

Abstract: Gaussian processes are an infinite-dimensional generalization of multivariate normal distributions. They provide a principled approach to learning with kernel machines and have found wide application in many fields. More recently, with the advance of deep learning, the concept of deep Gaussian processes has emerged. Deep Gaussian processes can be viewed as multilayer hierarchical organizations of Gaussian processes that are equivalent to infinitely wide multilayer neural networks. Deep Gaussian processes have improved capacity for prediction and classification over standard Gaussian processes, while models based on them continue to allow for full Bayesian treatment and for applications where the amount of available data is limited. Recent theoretical progress in deep Gaussian processes will be presented, along with several applications.
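For readers new to the topic, a standard (single-layer) Gaussian process regression can be sketched in a few lines. The snippet below is a minimal illustration using scikit-learn's GaussianProcessRegressor with an RBF kernel; it is not an implementation of the deep Gaussian processes discussed in the talk, which stack several such layers hierarchically.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

# Noiseless training data drawn from a smooth function
X_train = np.linspace(0.0, 5.0, 8).reshape(-1, 1)
y_train = np.sin(X_train).ravel()

# A single-layer GP prior with an RBF (squared-exponential) kernel;
# a deep GP would feed one GP's output into the next as its input.
gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0))
gp.fit(X_train, y_train)

# Posterior mean and standard deviation at a test input
X_test = np.array([[2.5]])
mean, std = gp.predict(X_test, return_std=True)
print(mean[0], std[0])
```

Because the GP conditions exactly on the noiseless observations, the posterior mean near the training inputs closely tracks the underlying sine function, with a small posterior standard deviation.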

Biosketch: Petar M. Djurić received the B.S. and M.S. degrees in electrical engineering from the University of Belgrade, Belgrade, Yugoslavia, and the Ph.D. degree in electrical engineering from the University of Rhode Island, Kingston, RI, USA. He is a SUNY Distinguished Professor and currently Chair of the Department of Electrical and Computer Engineering, Stony Brook University, Stony Brook, NY, USA. Djurić was a recipient of the IEEE Signal Processing Magazine Best Paper Award in 2007 and the EURASIP Technical Achievement Award in 2012. From 2008 to 2009, he was a Distinguished Lecturer of the IEEE Signal Processing Society. He was the Editor-in-Chief of the IEEE Transactions on Signal and Information Processing over Networks (2015-2018). Djurić is a Fellow of the IEEE and EURASIP.

West Campus - SAC- Student Activities Center - Ballrooms A & B
100 Nicolls Road
Stony Brook NY 11794

The Career Center invites Alumni Employers and Job Seekers to the IT/Computer Science Job and Internship Fair this spring.

Job Seekers:
A job fair is an opportunity for you to present yourself professionally in person to a potential employer, while showcasing your communication skills.
Get more information

Alumni Employers:
Held in both the fall and spring semesters, this event is ideal for employers looking to fill internship, co-op, part-time, and full-time opportunities in the field of information technology (e.g., Software Engineering, Network Administration, and Web Development).
Register here to recruit top SBU talent.

Title: Deep Contextual Modeling for Natural Language Understanding, Generation, and Grounding

Zoom instructions:

Join Zoom Meeting
https://stonybrook.zoom.us/j/645050299?pwd=TVJVRkc3dlhxdDF5d00xWGlDQkov…

Meeting ID: 645 050 299
Password: 810247
One tap mobile
+16468769923,,645050299#,,#,810247# US (New York)
+13126266799,,645050299#,,#,810247# US (Chicago)

Dial by your location
+1 646 876 9923 US (New York)
+1 312 626 6799 US (Chicago)
+1 301 715 8592 US
+1 346 248 7799 US (Houston)
+1 408 638 0968 US (San Jose)
+1 669 900 6833 US (San Jose)
+1 253 215 8782 US
Meeting ID: 645 050 299
Password: 810247
Find your local number: https://stonybrook.zoom.us/u/aemTiJMXu6

Abstract:
Natural language is a fundamental form of information and communication. In both human-human and human-computer communication, people reason about the context of the text and the world state to understand language and produce language responses. In this talk, I present several deep neural network-based systems that first understand the meaning of language grounded in the various contexts where the language is used, and then generate effective language responses in different forms for information access and human-computer communication. First, I will introduce Speaker Interaction RNNs for addressee and response selection in multi-party conversations, based on explicit representations of the different discourse participants. Then, I will present a text summarization approach for generating email subject lines by optimizing quality scores in a reinforcement learning framework. Finally, I will show an editing-based multi-turn SQL query generation system toward intelligent natural language interfaces to databases.

Bio: Rui Zhang is a final-year Ph.D. student at Yale University advised by Professor Dragomir Radev. His research interests lie in natural language processing problems in understanding, generation, and grounding. He has been working on (1) end-to-end neural modeling for entities, sentences, documents, and multi-party multi-turn dialogues; (2) text summarization for emails, news, and scientific articles; (3) cross-lingual information retrieval for low-resource languages; and (4) context-dependent text-to-SQL semantic parsing in human-computer interaction. Rui Zhang has published papers and served as a Program Committee member at top-tier NLP and AI conferences, including ACL, NAACL, EMNLP, AAAI, and CoNLL. During his Ph.D., he has done research internships at the IBM Thomas J. Watson Research Center, Grammarly Research, and Google AI. He was a graduate student at the University of Michigan and received his bachelor's degrees from both the University of Michigan and Shanghai Jiao Tong University through the UM-SJTU Joint Institute.

Time: May 5, 2022, Thursday, 02:00 PM Eastern Time (US and Canada)
Place: New Computer Science (NCS) Room 220, and Zoom

Zoom link: https://stonybrook.zoom.us/j/95948672934?pwd=d3ZDcUJkK3VweFBDVWhIVDhtaFU2Zz09
Meeting ID: 959 4867 2934
Passcode: 082036

Title:  Generative Adversarial Learning using Optimal Transport

Abstract: 

Generative Adversarial Learning (GAL) aims to learn a target distribution in an adversarial manner. A Generative Adversarial Network (GAN) is a concrete implementation of GAL using a discriminator and a generator that play a min-max game. GANs have been used in many machine learning and computer vision applications. However, GANs are known to be hard to train, mainly because a min-max saddle point optimization problem needs to be solved in GAL. In this thesis, I investigate several methods to improve generative adversarial learning using Optimal Transport (OT). 

Previous Wasserstein GANs (WGANs) do not compute the correct Wasserstein distance to train the discriminator. To address this problem, I propose WGAN-TS that uses the L1 transport cost and computes the correct Wasserstein distance to train the discriminator. To ensure the local convergence of WGANs, I propose WGAN-QC that adopts the quadratic transport cost. I prove that WGAN-QC not only computes the correct Wasserstein distance but also converges to a local equilibrium point.

To compute the Wasserstein distance over the whole dataset, I propose to use Semi-Discrete Optimal Transport (SDOT) to match noise points and the real images during GAN training. To measure the quality of an SDOT map, I use the Maximum Relative Error (MRE) and the L_1 distance between the target distribution and the transported distribution obtained by an OT map. I propose statistical methods to estimate the MRE and the L_1 distance. I propose an efficient Epoch Gradient Descent algorithm for SDOT (SDOT-EGD).

To deal with the 2D special case of GAL, I propose to use OT to learn 2D distributions. In particular, I adopt OT to match persistent diagrams in training a topology-aware GAN and learn density maps in the crowd counting task. Finally, I use OT and the topological maps of the crowd to improve the crowd counting performance and propose a topology-based metric to measure the quality of the crowd density maps.

Title: Sustainable NLP

Time: Friday 4/29, 2:40 PM

Location: NCS 120

Abstract:

Natural language processing (NLP) technology has supercharged many real-world applications, ranging from intelligent personal assistants (such as Alexa, Siri, and Google Assistant) to commercial search engines such as Google and Bing. But current NLP applications use extremely large neural models, making them (i) expensive to deploy on servers, since they require large amounts of compute resources and power, and (ii) impossible to run on mobile devices, which makes on-device, privacy-preserving applications impractical.

In the first part of the talk, I will describe systems optimizations we have developed that significantly reduce the compute and memory requirements of NLP models. These optimizations can be applied broadly and result in over a 10x reduction in latency when deployed on mobile devices. In the second part of the talk, I will describe our recent work on predicting the energy consumption of NLP models. Existing energy prediction approaches are not accurate, making it difficult for developers and practitioners to reason about their models in terms of power. We use a multi-level regression approach that produces highly accurate and interpretable energy predictions.
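To give a flavor of regression-based energy prediction, the hypothetical sketch below fits a simple linear model mapping synthetic model-level features to energy; the feature names, coefficients, and data are all invented for illustration and are not the talk's multi-level approach, which operates on real hardware measurements.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical synthetic data: model features (GFLOPs, parameter count
# in millions) vs. measured energy in joules, with small noise.
rng = np.random.default_rng(0)
flops = rng.uniform(1, 100, size=50)
params = rng.uniform(10, 500, size=50)
energy = 0.8 * flops + 0.02 * params + rng.normal(0, 0.5, size=50)

X = np.column_stack([flops, params])
reg = LinearRegression().fit(X, energy)

# The fitted coefficients are interpretable: approximate energy cost
# per GFLOP and per million parameters.
print(reg.coef_, reg.intercept_)
```

The appeal of such models, as the abstract notes, is interpretability: each coefficient directly attributes energy cost to a model characteristic, which a black-box predictor cannot do.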

Bio:
Aruna Balasubramanian is an Associate Professor at Stony Brook University. She received her Ph.D. from the University of Massachusetts Amherst, where her dissertation won the UMass outstanding dissertation award and was the SIGCOMM dissertation award runner-up. She works in the area of networked systems. Her current work consists of two threads: (1) significantly improving the Quality of Experience of Internet applications, and (2) improving the usability, accessibility, and privacy of mobile systems. She is the recipient of the SIGMobile Rockstar award, a Ubicomp best paper award, a Computing Innovation Fellowship, a VMware Early Career award, several Google research awards, an