Abstract: DeepSeek-R1-Zero has shown that reinforcement learning (RL) at scale can directly enhance the reasoning capabilities of LLMs without supervised fine-tuning. In this work, we critically examine R1-Zero-like training by analyzing its two core components: base models and RL. We investigate a wide range of base models, including DeepSeek-V3-Base, to understand how pretraining characteristics influence RL performance. Our analysis reveals that DeepSeek-V3-Base already exhibits an "Aha moment", while Qwen2.5 base models demonstrate strong reasoning capabilities even without prompt templates, suggesting potential pretraining biases. Additionally, we identify an optimization bias in Group Relative Policy Optimization (GRPO), which artificially increases response length (especially for incorrect outputs) during training. To address this, we introduce Dr. GRPO, an unbiased optimization method that improves token efficiency while maintaining reasoning performance. Leveraging these insights, we present a minimalist R1-Zero recipe that achieves 43.3% accuracy on AIME 2024 with a 7B base model, establishing a new state of the art.
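
The length bias mentioned in the abstract can be illustrated with a small sketch. It assumes, as our reading of the Dr. GRPO work rather than anything stated in the abstract itself, that GRPO divides each response's token losses by that response's own length and scales advantages by the group's reward standard deviation, and that Dr. GRPO drops both terms. The function names and the toy rewards and lengths are hypothetical.

```python
from statistics import mean, pstdev

def grpo_token_weights(rewards, lengths, eps=1e-6):
    """Per-token loss weight under the assumed GRPO normalization:
    advantage = (r - mean) / std, then divided by the response length."""
    mu, sigma = mean(rewards), pstdev(rewards)
    return [(r - mu) / (sigma + eps) / n for r, n in zip(rewards, lengths)]

def dr_grpo_token_weights(rewards):
    """Per-token loss weight under the assumed Dr. GRPO form:
    advantage = r - mean, with no per-response length or std scaling."""
    mu = mean(rewards)
    return [r - mu for r in rewards]

# Toy group of four sampled responses (hypothetical numbers):
# responses 1 and 2 are both wrong, but response 2 is four times longer.
rewards = [1.0, 0.0, 0.0, 1.0]
lengths = [100, 100, 400, 100]

grpo = grpo_token_weights(rewards, lengths)
dr = dr_grpo_token_weights(rewards)

print("GRPO per-token weights:   ", grpo)
print("Dr. GRPO per-token weights:", dr)
```

Under these assumptions, the long incorrect response receives a per-token penalty four times smaller than the short incorrect one with GRPO, which is one way a policy could be nudged toward longer wrong answers. With the Dr. GRPO weights, both incorrect responses are penalized equally per token.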

Speaker: Md. Saqib Hasan

Location: CS2311
Topic: AI Seminar: Owen Rambow
Time: Mar 17, 2021 10:00 AM Eastern Time (US and Canada)
Join Zoom Meeting

https://stonybrook.zoom.us/j/93614644178?pwd=MzJtVDJYYmU5T1dtMzJiUFMxb0x4dz09
Meeting ID: 936 1464 4178
Passcode: 965936

Natural Language Understanding and Semantic Parsing

(Partly joint work with former colleagues at Elemental Cognition)

Semantic parsing refers to the task of determining the propositional content of language: who did what to whom.  It is part of the larger task of natural language understanding (NLU).  I will start out by discussing what full NLU means, and argue that we are still far away, as a field, from solving full NLU, or even from knowing how to evaluate it.

In the second part of the talk, I will situate semantic parsing in the context of several other NLU subtasks.  Typically, the target representation of semantic parsing uses an ontology (such as PropBank or FrameNet).  Semantic parsing includes the subtasks of word sense disambiguation, argument detection, and argument role labeling.  I will discuss choices among possible target ontologies.  I will justify why we created a new ontology, Hector, based on FrameNet and the lexical resource NOAD, and explain some of its characteristics.

In the third part of the talk, I will present experiments we performed using transformer models.  We obtain best results using a two-phase model, in which we first choose the frame, and then, given the frame, choose the arguments.  We encode the problem for both tasks using indices in the sentence.  While we develop the parser for our new ontology Hector, this approach also beats the state of the art for FrameNet and PropBank parsing.

Biography: I am a professor in the Department of Linguistics at Stony Brook University with a joint appointment in IACS.

Until recently, I was a research scientist at Elemental Cognition. Elemental Cognition is working on deep natural language understanding.

I got my PhD with Aravind Joshi at the University of Pennsylvania in 1994. I have worked at CoGenTex, and at AT&T Labs -- Research, and for many years I was a research scientist at Columbia University in the Center for Computational Learning Systems.

You are cordially invited to attend the biweekly Brookhaven AI Mixer (BAM). BAM includes one short talk on AI research happening at BNL, followed by an open mixer over coffee and snacks for everyone to network and discuss all things AI. The first half hour will consist of presentations that will be available via Zoom, and the second half hour will be for in-person networking only.

Join us every other Tuesday at noon in CDSD's Training Room (building 725, 2nd floor) to learn about interesting AI methods and applications, engage with potential collaborators, prepare for pending FASST funding calls, and build a community of AI for Science at BNL.

Machine Learning for Seismic Low Frequency Extrapolation

Abstract: The cycle skipping problem that plagues seismic inversion can be mitigated by utilizing low-frequency seismic data, which captures the kinematics of wave propagation, in conjunction with a reasonable initial velocity model. However, seismic sources and receivers are band-limited and cannot provide signals down to 0 Hz. To improve the solution of the seismic inverse problem, one can synthesize the missing low-frequency content by solving a regression problem using machine learning (ML). The recorded high-frequency (HF) seismic data is the input, and the ML models are trained to predict the missing low-frequency (LF) seismic data. Deep learning models utilizing convolutional neural networks (CNNs) and generative adversarial networks (GANs) demonstrate important capabilities for LF extrapolation. However, such models require powerful hardware and careful training. We explore the feasibility of using less costly ML models such as a random forest, Gaussian process surrogates, and gradient boosting as alternatives to computationally expensive deep learning models.

Biography: Sue Minkoff is Chair of Applied Mathematics at Brookhaven National Laboratory. From 2012 to 2024 she was a Professor of Mathematical Sciences and an Affiliated Professor in the Departments of Sustainable Earth Systems Sciences and Science and Mathematics Education at the University of Texas at Dallas. From 2000 to 2012 she served on the faculty in the Department of Mathematics and Statistics at the University of Maryland, Baltimore County. She received her doctorate in Computational and Applied Mathematics from Rice University. From 1995 to 1997 she was a National Science Foundation-Industrial postdoc, jointly with the University of Texas at Austin and British Petroleum, and from 1997 to 2000 she held the von Neumann Fellowship in the Mathematics Department at Sandia National Labs. In 2000 Minkoff was promoted to Senior Member of the Technical Staff in Sandia's Geophysics Department. Minkoff's research interests include scientific computing, inverse problems, uncertainty quantification and digital twins modeling, Earth science, and photonics.

Location: CDS, Bldg. 725, Training Room

Join ZoomGov Meeting: https://bnl.zoomgov.com/j/1606848158?pwd=miUtq7OkYL5SNkjbgVb19teZPNennd.1

Meeting ID: 160 684 8158
Passcode: 068399

Abstract: Astronomers slowly made sense of the cosmos by following the stars night after night. I suggest we examine human identity in a similar way. Let's observe the words individuals use to describe themselves day after day. In this presentation, I will introduce ipseology, a new approach to studying human selves. Ipseology is the systematic, empirical study of ipseity: selfhood, individuality and the elements of identity. The primary idea is that we can learn a lot about people from their self-authored self-descriptions, especially if we follow their revisions over time. I will discuss results from sampling millions of social media bios over more than a decade and present new approaches for observation in the Post-API age.

Bio: Dr. Jason Jeffrey Jones is a computational social scientist whose expertise includes online experiments, social networks, high-throughput text analysis and machine learning. He is interested in humans' perceptions of themselves and the developing role of artificial intelligence in society.

Dr. Jones is the director of CSSERG (pronounced sea surge): the Computational Social Science of Emerging Realities Group. CSSERG is a team of scholars committed to cross-disciplinary collaboration, united by common computational methodologies and always with eyes on the near future. CSSERG has studied the effectiveness of virtual reality in evoking empathy, the dynamics of gender stereotypes in language over decades and temporal trends in personally expressed identity.

This seminar will take place in person and online (Zoom link below):

Join Zoom Meeting
https://stonybrook.zoom.us/j/93686609778?pwd=KdHVyIbU3ymML6hTchXsm6JLYKLSru.1

Meeting ID: 936 8660 9778
Passcode: 638699

Topic: AI Seminar: Stanley Bak
Time: Monday Nov 1, 2021 12:00 PM Eastern Time (US and Canada)

Join Zoom Meeting
https://stonybrook.zoom.us/j/91227496273?pwd=M3EyUDlzK3Vzd2pDOGpDU1ZjN0k1UT09

Abstract: The field of formal verification has traditionally looked at proving properties about finite state machines or software programs. The surge in deep learning has been accompanied by a surge of progress in trying to apply mathematical and algorithmic techniques to prove things about the function being computed by a neural network.

This talk formalizes the neural network verification problem and describes technical methods for neural network verification based on reachability analysis. Improvements to analysis efficiency will be given, as well as research directions for further exploration. We also include an objective comparison, performed this past summer, that evaluated the best existing verification methods in terms of speed and network size. The competition was run on common hardware and involved twelve international teams (the tool authors) working on a common set of benchmarks.
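
To make the idea of reachability analysis concrete, here is a minimal sketch that pushes a box of inputs through a tiny ReLU network using interval arithmetic. This is the coarsest possible over-approximation and is not claimed to be the method presented in the talk; dedicated verification tools generally rely on tighter set representations. The network weights, the input box, and the property checked are all hypothetical.

```python
def affine_bounds(lo, hi, weights, bias):
    """Bounds of y = W x + b when each x[i] lies in [lo[i], hi[i]]."""
    out_lo, out_hi = [], []
    for row, b in zip(weights, bias):
        l = h = b
        for w, a, c in zip(row, lo, hi):
            # A positive weight maps the lower input bound to the lower
            # output bound; a negative weight swaps the two.
            if w >= 0:
                l += w * a
                h += w * c
            else:
                l += w * c
                h += w * a
        out_lo.append(l)
        out_hi.append(h)
    return out_lo, out_hi

def relu_bounds(lo, hi):
    """ReLU is monotone, so it can be applied to each bound directly."""
    return [max(0.0, v) for v in lo], [max(0.0, v) for v in hi]

def output_bounds(layers, lo, hi):
    """Over-approximate the set of outputs reachable from the input box."""
    for i, (weights, bias) in enumerate(layers):
        lo, hi = affine_bounds(lo, hi, weights, bias)
        if i < len(layers) - 1:  # no activation after the last layer
            lo, hi = relu_bounds(lo, hi)
    return lo, hi

# Hypothetical network: 2 inputs -> 2 hidden ReLU units -> 1 output.
layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, -0.25]),
    ([[1.0, 1.0]], [0.0]),
]
out_lo, out_hi = output_bounds(layers, [0.0, 0.0], [1.0, 1.0])

# Example property: the output never exceeds 2 on the unit square.
property_holds = out_hi[0] <= 2.0
print(out_lo, out_hi, property_holds)
```

Because the computed box contains every reachable output, a property that holds for the box holds for the network. The converse is not true: the box can be loose, so a failed check here would be inconclusive rather than a counterexample.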

Biography: Stanley Bak is an assistant professor in the Department of Computer Science at Stony Brook University investigating the verification of autonomy, cyber-physical systems, and neural networks. He strives to develop practical formal methods that are both scalable and useful, which demands developing new theory, programming efficient tools and building experimental systems.
Stanley Bak received a Bachelor's degree in Computer Science from Rensselaer Polytechnic Institute (RPI) in 2007 (summa cum laude), and a Master's degree in Computer Science from the University of Illinois at Urbana-Champaign (UIUC) in 2009. He completed his PhD in the Department of Computer Science at UIUC in 2013. He received the Founders Award of Excellence for his undergraduate research at RPI in 2004, the Debra and Ira Cohen Graduate Fellowship from UIUC twice, in 2008 and 2009, and was awarded the Science, Mathematics and Research for Transformation (SMART) Scholarship from 2009 to 2013. From 2013 to 2018, Stanley was a Research Computer Scientist at the US Air Force Research Lab (AFRL), both in the Information Directorate in Rome, NY, and in the Aerospace Systems Directorate in Dayton, OH. He helped run Safe Sky Analytics, a research consulting company investigating verification and autonomous systems, and taught at Georgetown University before joining Stony Brook University as an assistant professor in Fall 2020.

AI is everywhere -- and so are the privacy concerns that come with it. At their core, the most common forms of AI we use today are online digital services, and thus they inherit the usual privacy risks of any internet-based tool. However, AI also introduces a set of unique and evolving risks. We'll take a closer look at one of the newest developments in this area: indirect prompt injection -- a technique that can trick AI tools into revealing or extracting private information. You'll learn how this emerging form of AI manipulation works, why it matters, and how to protect yourself -- as well as how similar techniques are being used in academic contexts to manipulate systems and even mislead researchers.

Register for this Zoom workshop.



Place:  https://stonybrook.zoom.us/j/99167126152?pwd=TFpEYzM0aFhiOFJxSFJEb1JSS3YyQT09  

Time: 3 PM EST, Dec 16, 2020

Abstract: 

Shadows provide useful cues to analyze visual scenes but also hamper many computer vision algorithms such as image segmentation, object detection, or tracking. For those reasons, shadow detection and shadow removal have been well-studied in computer vision.

Early work on shadow detection and removal focused on physical illumination models of shadows. These methods can express, identify, and remove shadows in a physically plausible manner. However, these models are often hard to optimize and are slow during inference due to their reliance on hand-designed image features. Recently, deep-learning approaches have achieved breakthroughs in performance for both shadow detection and removal. They learn to extract useful features through training while being extremely efficient during inference. However, these models are data-dependent, opaque, and ignore the physical aspects of shadows. Thus they often lack generalization and produce inconsistent results.

We propose incorporating physical illumination constraints of shadows into deep-learning models. These constraints force the networks to more closely follow the physics of shadows, enabling them to systematically and realistically modify shadows in images. For shadow detection, we present a novel Generative Adversarial Network (GAN) based model where the generator learns to generate images with realistic attenuated shadows that can be used to train a shadow detector. For shadow removal, we propose a method that uses deep networks to estimate the unknown parameters of a shadow image formation model that removes shadows. The system outputs high-quality shadow-free images with few or no image artifacts and achieves state-of-the-art performance in shadow removal when trained in a fully supervised setting. Moreover, the system is easy to train and constrain since the shadow removal mapping is strictly defined by the simplified illumination model with interpretable parameters. Thus, it can be trained even with a much weaker form of supervision signal. In particular, we show that we can use two sets of patches, shadow and shadow-free, to train our shadow decomposition framework via an adversarial system. These patches are cropped from the shadow images themselves. Therefore, this is the first deep-learning method for shadow removal that can be trained without any shadow-free images, providing an alternative solution to the paired data dependency issue. The advantage of this training scheme is even more pronounced when tested on a novel domain such as video shadow removal, where the method can be fine-tuned on a testing video with only the shadow masks generated by a pre-trained shadow detector, which further improves shadow removal results.
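
As an illustration of what a simplified illumination model with interpretable parameters might look like, the sketch below assumes a per-channel linear relighting (relit = w * shadowed + b) blended with the input through a soft shadow matte. The abstract does not spell out the formation model, so this form, the matte convention (1 means fully shadowed), and all names and numbers are assumptions. In the approach described above, a network would estimate w, b, and the matte; here they are simply given.

```python
def relight(pixel, w, b):
    """Per-channel linear relighting, clipped to the [0, 1] intensity range."""
    return tuple(min(1.0, max(0.0, wc * v + bc)) for v, wc, bc in zip(pixel, w, b))

def remove_shadow(image, matte, w, b):
    """Blend relit and original pixels using the shadow matte.

    image: rows of RGB tuples with values in [0, 1]
    matte: rows of alpha values in [0, 1]; 1 = fully shadowed (assumed convention)
    w, b:  per-channel scale and offset of the assumed linear model
    """
    result = []
    for img_row, matte_row in zip(image, matte):
        out_row = []
        for pixel, alpha in zip(img_row, matte_row):
            relit = relight(pixel, w, b)
            out_row.append(tuple(alpha * r + (1.0 - alpha) * v
                                 for r, v in zip(relit, pixel)))
        result.append(out_row)
    return result

# Hypothetical 1x3 image: shadowed pixel, penumbra pixel, lit pixel.
image = [[(0.2, 0.2, 0.2), (0.2, 0.2, 0.2), (0.6, 0.6, 0.6)]]
matte = [[1.0, 0.5, 0.0]]
w = (2.0, 2.0, 2.0)
b = (0.1, 0.1, 0.1)

shadow_free = remove_shadow(image, matte, w, b)
print(shadow_free)
```

With only a handful of parameters per image, a mapping of this kind is heavily constrained, which is consistent with the abstract's point that such a model can be trained from weaker supervision than a free-form image-to-image network.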

The Pittsburgh Supercomputing Center is pleased to present a Machine Learning and Big Data workshop.

This workshop will focus on topics including big data analytics and machine learning with Spark, as well as deep learning.

This will be an IN PERSON event hosted by various satellite sites; there WILL NOT be a direct-to-desktop option for this event. SBU's Institute for Advanced Computational Science (IACS) is one of those satellite sites!

Location: IACS Conference Room #2

Interested applicants must first have an ACCESS ID. If you don't have the ID, please visit this page to create one: ACCESS USER REGISTRATION.


Once you have an ACCESS ID, please log in (see top right here), then register here.

Join us for an engaging panel discussion featuring researchers who participated in our inaugural AI JAM session on February 26th. Our panelists will share their firsthand experiences using large language models to tackle complex scientific problems, with a special focus on prompt engineering strategies, discussing both breakthroughs and challenges encountered during this collaborative initiative. Learn how these cutting-edge AI tools are being applied to real-world research questions and discover insights that could inform your own scientific endeavors. Attendees are encouraged to come prepared with questions about prompt engineering for the panel discussion.

Moderator: Adolfy Hoisie, Deputy Director, Computing and Data Sciences

Panelists:
Kevin Yager, Group Leader, AI-Accelerated Nanoscience, Center for Functional Nanomaterials
Lingda Li, Associate Computational Scientist, Systems, Architecture and Computing Technologies, Computing and Data Sciences
Liguo Wang, Director of Scientific Operations, Laboratory for BioMolecular Structure (LBMS), National Synchrotron Light Source II
Weiguo Yin, Physicist, Condensed Matter Theory, Condensed Matter Physics and Materials Science Department

Location: CDS, Bldg. 725, Training Room

Join ZoomGov Meeting: https://bnl.zoomgov.com/j/1606837837?pwd=Tc0mwQqLXpDfYOIaoaurmpLD2mMlzS.1

Meeting ID: 160 683 7837
Passcode: 822553