Stony Brook AI Researchers Showcase Breakthroughs at EMNLP 2024

From using the New York Times Connections word game as a reasoning benchmark to disentangling language from cognition, AI researchers from Stony Brook University present their out-of-the-box approaches to improving machines' understanding of human language at this year's EMNLP conference.

Stony Brook, NY, Oct 25, 2024 - Stony Brook AI professors Niranjan Balasubramanian, Tuhin Chakrabarty, Tengfei Ma, and Steven Skiena have been recognized for their work in Natural Language Processing (NLP) at the 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP), held in Miami, Florida.

A leading conference in the field of Natural Language Processing, EMNLP provides a platform for showcasing the latest advancements, sharing knowledge, discussing common challenges, and fostering collaboration in NLP and its empirical foundations. It also serves as a venue for publishing high-quality research papers, which undergo a rigorous peer-review process.

This year the conference highlighted the work of four SBU AI researchers, noting their contributions to the field:

  1. CaT-Bench: Benchmarking Language Model Understanding of Causal and Temporal Dependencies in Plans
    Authors: Yash Kumar Lal, Vanya Cohen, Nathanael Chambers, Niranjan Balasubramanian, Ray Mooney
    The research aims to help machines understand and execute tasks that come with step-by-step instructions, such as recipes or furniture assembly. It explores how those steps depend on one another and what a machine should do when an instruction seems unnecessary or out of place, showing that there is significant room for improvement in Large Language Models' ability to detect dependencies between steps.
     
  2. Connecting the Dots: Evaluating Abstract Reasoning Capabilities of LLMs Using the New York Times Connections Word Game
    Authors: Prisha Samdarshi, Mariam Mustafa, Anushka Kulkarni, Raven Rothkopf, Tuhin Chakrabarty, Smaranda Muresan
    Chakrabarty's work is an exemplary attempt to evaluate the abstract reasoning capabilities of AI systems, for which, the team discovered, the NYT Connections game serves as a good benchmark: AI systems solve only 18% of the games. While AI is better at categorizing words based on semantic relations, the study concludes, it struggles with other types of knowledge, such as encyclopedic knowledge, multiword expressions, or knowledge that combines both word form and meaning.
     
  3. FAC2E: Better Understanding Large Language Model Capabilities by Dissociating Language and Cognition
    Authors: Xiaoqiang Wang, Lingfei Wu, Tengfei Ma, Bang Liu
    This work posits that large language models are mainly evaluated on text understanding and generation tasks, an approach that fails to comprehensively differentiate the underlying language and cognitive skills of a model. It therefore introduces a new framework, FAC2E, which helps measure LLMs' capabilities by dissociating their language-related abilities from their cognition-related ones.
     
  4. The Shape of Word Embeddings: Quantifying Non-Isometry with Topological Data Analysis
    Authors: Ondřej Draganov, Steven Skiena
    The paper studies the geometric shape of word embeddings across 81 Indo-European languages, using topological data analysis to quantify how the embedding spaces differ and to reconstruct language phylogenetic trees from those differences.

Spanning core NLP research, automation, and applications in communication, SBU's contributions to artificial intelligence are novel and industry-leading.

“We’re proud to share our advancements with our colleagues in the NLP community,” said AI Institute Director Steven Skiena. “Showcasing this research also gives us space to collaborate and better understand how we can meet the challenges of human-machine interaction at such a granular level.”


Ankita Nagpal
Communications Assistant