ICLR 2025

Beyond Mere Token Analysis: A Hypergraph Metric Space Framework for Defending Against Socially Engineered LLM Attacks

Manohar Kaul, Aditya Saibewar, Sadbhavana Babar

Abstract

The Problem

Attacks have evolved from algorithmic jailbreaks to sophisticated social engineering that exploits LLMs' human-like communication.

Why It's Hard to Defend Against

• Attacks mirror natural human discourse
• Token-level defenses fail against multi-layered manipulation
• Attackers rapidly adapt with novel patterns

What Is a Hypergraph?

Informal Description

A hypergraph is a natural extension of a graph where edges (called hyperedges) can link multiple vertices together.
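
The informal description of a hypergraph can be made concrete with a small sketch. The Python below is illustrative only and is not taken from the framework itself: the vertex names, hyperedge names, and the helper `build_incidence` are all hypothetical. It stores each hyperedge as a set of vertices, so a single hyperedge may link any number of vertices, and an ordinary graph edge is just the two-vertex case.

```python
from collections import defaultdict


def build_incidence(hyperedges):
    """Map each vertex to the set of hyperedge names that contain it."""
    incidence = defaultdict(set)
    for name, vertices in hyperedges.items():
        for v in vertices:
            incidence[v].add(name)
    return dict(incidence)


# Hypothetical toy hypergraph; all names are illustrative assumptions.
hyperedges = {
    "e1": frozenset({"a", "b", "c"}),       # one hyperedge linking three vertices
    "e2": frozenset({"c", "d"}),            # an ordinary graph edge: two vertices
    "e3": frozenset({"a", "c", "d", "e"}),  # a hyperedge linking four vertices
}

incidence = build_incidence(hyperedges)

# Degree of a vertex = number of hyperedges it belongs to.
vertex_degree = {v: len(edges) for v, edges in incidence.items()}
```

Storing hyperedges as sets keeps membership tests cheap, and the vertex-to-hyperedge incidence map is one simple way to look up which multi-way relationships a given vertex takes part in.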