Language Agents and Malevolent Design

Inchul Yum

Philosophy and Technology 37 (104):1-19 (2024)

Abstract

Language agents are AI systems capable of understanding and responding to natural language, potentially facilitating the process of encoding human goals into AI systems. However, this paper argues that if language agents can achieve easy alignment, they also increase the risk of malevolent agents building harmful AI systems aligned with destructive intentions. The paper contends that if training AI becomes sufficiently easy or is perceived as such, it enables malicious actors, including rogue states, terrorists, and criminal organizations, to create powerful AI systems devoted to their nefarious aims. Given the strong incentives for such groups and the rapid progress in AI capabilities, this risk demands serious attention. In addition, the paper highlights considerations suggesting that the negative impacts of language agents may outweigh the positive ones, including the potential irreversibility of certain negative AI impacts. The overarching lesson is that various AI-related issues are intimately connected with each other, and we must recognize this interconnected nature when addressing those issues.

Author's Profile

Inchul Yum

Ohio State University

Keywords

Artificial intelligence; AI misuse; Value alignment; Language agents; Large language models

DOI

10.1007/s13347-024-00794-0

Analytics

Added to PP
2024-08-17

Downloads
48 (#461,172)

6 months
48 (#102,809)

Citations of this work

Is Alignment Unsafe? Cameron Domenico Kirk-Giannini - 2024 - Philosophy and Technology 37 (110):1–4.
