Style-based Attacks

Articles about style-based attacks

When Poetry Breaks AI

Researchers show that carefully written verse can reliably bypass safety filters in many top language models, exposing a new, style-based class of jailbreaks and challenging cur... November 23, 2025