As our research concerns the offensive use of LLMs, ethical considerations are warranted.

LLMs are already in use by threat actors, so their threat can no longer be contained. Blue teams can only benefit from understanding the capabilities and limitations of LLMs in the context of penetration testing. Our work provides insights that can be leveraged to differentiate the attack patterns of LLMs from those of human operators.

Our results indicate that locally run, ethics-free LLMs are not yet sophisticated enough to perform privilege escalation. Cloud-provided LLMs like GPT-4 appear capable but are costly and protected by ethics filters which, in our experience as well as that of others, can be bypassed.

We release all our benchmarks, prototypes, and logged run data. This should enable defensive researchers to either run these benchmarks themselves or use our provided traces to prepare defenses.

Installation and Quickstart