On the Systemic Insecurity of Large Language Model Systems: Vulnerabilities in Content, Retrieval, and Reasoning

Event Description

Abstract: Large Language Models (LLMs) have transitioned from standalone prediction interfaces into integrated systems that incorporate content protection, external knowledge retrieval, and multi-step reasoning. While these functional layers expand model capabilities, they also introduce complex, inter-component dependencies that create novel and systemic security risks. This research provides a systematic deconstruction of the structural vulnerabilities emerging across these functional layers.

In this proposal, we evaluate the security boundaries of LLM systems along three dimensions:
The Content Layer: We present Watermark under Fire, revealing the inherent fragility of content-based tracing mechanisms under adaptive perturbations and highlighting the limitations of surface-level safety measures.
The Retrieval Layer: We introduce GraphRAG under Fire to examine the security of topology-aware knowledge integration. We reveal how graph-based indexing can be exploited as a structural lever for high-success poisoning attacks.
The Reasoning Layer: We detail AutoRAN, the first framework demonstrating the hijacking of internal safety reasoning in Large Reasoning Models (LRMs). This work shows that the transparency of the reasoning process itself creates a critical and exploitable attack surface.

Collectively, these studies demonstrate a systemic failure of add-on safety mechanisms in securing the broader LLM ecosystem. By identifying recurring patterns of exploitation across different system layers, this research provides the necessary foundation for transitioning from reactive patching to a more unified and architecturally grounded approach to AI trustworthiness.

Speaker: Jiacheng Liang

Zoom: https://stonybrook.zoom.us/j/6669990420?pwd=dkY0eEw5YXpPSWo3RUE4OE1oVW90UT09&omn=97367037382
Meeting ID: 666 999 0420
Passcode: 075299

Date Start

Mon, 03/09/2026 - 15:00

Date End

Mon, 03/09/2026 - 16:00

AI Innovation Institute
