Abstract:
Large language models (LLMs) have transformed the way humans write code, bringing unprecedented automation to software development. In this talk, I will first provide an overview of my research on enhancing LLMs' code intelligence, optimizing each step of the development pipeline towards more complex software engineering tasks. I will then delve into my key contributions, focusing on how to equip LLMs with a deeper, more comprehensive understanding of software programs. Finally, I will discuss the future of AI-driven software engineering, envisioning a new era of automation that is more reliable, intelligent, and cost-efficient.
Bio:
Yangruibo (Robin) Ding is a Ph.D. candidate in the Department of Computer Science at Columbia University. His research is at the intersection of Software Engineering and Machine Learning, focusing on developing large language models (LLMs) for code. He trains LLMs to generate, analyze, and refine software programs and constructs benchmarks to systematically evaluate LLMs in solving software engineering tasks. He also studies how to improve LLMs' reasoning capability to tackle complex programming tasks, such as debugging and patching. His interdisciplinary research has been published at top-tier conferences in software engineering, programming languages, natural language processing, and machine learning. He has won an ACM SIGSOFT Distinguished Paper Award and an IEEE TSE Best Paper Runner-up, and has received an IBM Ph.D. Fellowship.
Location:
NCS 120
ABSTRACT: A key to the success of deep learning is the increasing size of models that can achieve high accuracy. At the same time, it is difficult to train these complex models on large data sets. It is therefore crucial to accelerate training with distributed systems and architectures, where communication and heterogeneity are two key challenges. In this talk, I will present two heterogeneity-aware decentralized training protocols that avoid a communication bottleneck. Specifically, Hop supports an arbitrary iteration gap between workers through a novel queue-based synchronization mechanism, which tolerates heterogeneity using system techniques. Prague uses randomized communication to tolerate heterogeneity, with a new training algorithm based on partial reduce, an efficient communication primitive. If time permits, I will also present systematic tensor partitioning for training on heterogeneous accelerator arrays (e.g., GPU/TPU). We believe that our principled approaches are crucial for achieving high-performance and efficient distributed training.
BIO: Xuehai Qian is an assistant professor at the University of Southern California. His research interests include domain-specific systems and architectures, and performance tuning and resource management of cloud systems and parallel computer architectures. He received his PhD from the University of Illinois Urbana-Champaign and was a postdoc at UC Berkeley. He is the recipient of the W.J. Poppelbaum Memorial Award at UIUC, the NSF CRII and CAREER Awards, and the inaugural ACSIC (American Chinese Scholar In Computing) Rising Star Award.
Scaling the NY AI Innovation Ecosystem
The State University of New York at Stony Brook will bring together leading AI experts to promote a future where AI drives responsible progress. This two-day event will provide a significant opportunity to explore the future of AI, exchange ideas, and connect with those at the forefront of research and deployment. We invite faculty, staff, and students from all SUNY institutions and beyond, as well as industry AI practitioners and policymakers, to attend.
Recognized AI experts from academia, industry, and government will present on topics such as AI applications, innovative developments in research and technology, workforce development, and ethical and societal impacts.
A 90-minute poster session is included in the schedule. If you would like to submit an abstract for consideration, please see the Call for Abstracts. The poster session segment of the symposium will be held in honor of the Inauguration of Dr. Andrea Goldsmith, the State University of New York at Stony Brook's seventh President. Poster printing for all participants will be covered by the Inauguration Planning Committee. SUNY students presenting posters are also eligible for travel reimbursement.
We kindly ask faculty to encourage their students to attend and to submit their work for presentation.
For additional information and to register, visit the symposium website. Please direct any questions to suny-ai-symposium-sbu@
International Love Data Week is a global event dedicated to celebrating data in all its forms. This year, Stony Brook University is excited to celebrate Love Data Week with a series of 30-minute webinars designed to promote proficiency with data, showcase innovative data projects, and foster a community of data enthusiasts across campus. The series is hosted by the Division of Educational & Institutional Effectiveness and facilitated by the Office of Educational Effectiveness. We invite all SBU faculty, staff, and students to join in the festivities, learn from colleagues in our campus community, and fall in love with the power of data! Learn more here.
Abstract:
Deep learning models have achieved remarkable success across a wide range of computer vision tasks, including image classification and semantic segmentation. However, this success relies heavily on large amounts of annotated data, which are expensive to obtain. Moreover, model performance often degrades when there are distribution shifts between training and test data. Domain Adaptation overcomes these issues by transferring knowledge from a label-rich source domain to a related but different target domain. Despite its popularity, domain adaptation is still a challenging task, especially when the distribution shifts are severe and the target domain has few or no labeled data.
In this thesis, I develop four efficient domain adaptation approaches to improve model performance on the target domain. Firstly, inspired by the large-scale pretraining of Vision Transformers, I explore Transformer-based domain adaptation for stronger feature representation and design a safe training mechanism to avoid model collapse when the domain gap is large. Secondly, I observe that source models have low confidence on the target data. To address this, I focus on the penultimate activations of target data and propose an adversarial training strategy to enhance model prediction confidence. Thirdly, I study the use of weak supervision from prior knowledge about the target domain label distribution. I devise a novel Knowledge-guided Unsupervised Domain Adaptation paradigm and design a plug-in module to rectify pseudo labels. Lastly, I turn to the task of Active Domain Adaptation, where the labels of a small portion of target data can be queried. I propose a novel active selection criterion based on the local context and devise a progressive augmentation module to better utilize queried target data. The robustness of domain adaptation approaches, in addition to accuracy, is critical yet under-explored. To conclude the thesis, I empirically study set prediction in domain adaptation using conformal prediction and conformal training.
Location: New Computer Science Bldg., Room 120
Zoom Link: https://stonybrook.zoom.
Passcode: 466399