Contents
285 found
  1. A Tri-Opti Compatibility Problem for Godlike Superintelligence. Walter Barta - manuscript
    Various thinkers have been attempting to align artificial intelligence (AI) with ethics (Christian, 2020; Russell, 2021), the so-called problem of alignment, but some suspect that the problem may be intractable (Yampolskiy, 2023). In the following, we make an argument by analogy to analyze the possibility that the problem of alignment could be intractable. We show how the Tri-Omni properties in theology can direct us towards analogous properties for artificial superintelligence, Tri-Opti properties. However, just as the Tri-Omni properties are vulnerable to (...)
  2. On Social Machines for Algorithmic Regulation. Nello Cristianini & Teresa Scantamburlo - manuscript
    Autonomous mechanisms have been proposed to regulate certain aspects of society and are already being used to regulate business organisations. We take seriously recent proposals for algorithmic regulation of society, and we identify the existing technologies that can be used to implement them, most of them originally introduced in business contexts. We build on the notion of 'social machine' and we connect it to various ongoing trends and ideas, including crowdsourced task-work, social compiler, mechanism design, reputation management systems, and social (...)
  3. Values in science and AI alignment research. Leonard Dung - manuscript
    Roughly, empirical AI alignment research (AIA) is an area of AI research which investigates empirically how to design AI systems in line with human goals. This paper examines the role of non-epistemic values in AIA. It argues that: (1) Sciences differ in the degree to which values influence them. (2) AIA is strongly value-laden. (3) This influence of values is managed inappropriately and thus threatens AIA’s epistemic integrity and ethical beneficence. (4) AIA should strive to achieve value transparency, critical scrutiny (...)
  4. What is AI safety? What do we want it to be? Jacqueline Harding & Cameron Domenico Kirk-Giannini - manuscript
    The field of AI safety seeks to prevent or reduce the harms caused by AI systems. A simple and appealing account of what is distinctive of AI safety as a field holds that this feature is constitutive: a research project falls within the purview of AI safety just in case it aims to prevent or reduce the harms caused by AI systems. Call this appealingly simple account The Safety Conception of AI safety. Despite its simplicity and appeal, we argue that (...)
  5. Beneficent Intelligence: A Capability Approach to Modeling Benefit, Assistance, and Associated Moral Failures through AI Systems. Alex John London & Hoda Heidari - manuscript
    The prevailing discourse around AI ethics lacks the language and formalism necessary to capture the diverse ethical concerns that emerge when AI systems interact with individuals. Drawing on Sen and Nussbaum's capability approach, we present a framework formalizing a network of ethical concepts and entitlements necessary for AI systems to confer meaningful benefit or assistance to stakeholders. Such systems enhance stakeholders' ability to advance their life plans and well-being while upholding their fundamental rights. We characterize two necessary conditions for morally (...)
  6. The debate on the ethics of AI in health care: a reconstruction and critical review. Jessica Morley, Caio C. V. Machado, Christopher Burr, Josh Cowls, Indra Joshi, Mariarosaria Taddeo & Luciano Floridi - manuscript
    Healthcare systems across the globe are struggling with increasing costs and worsening outcomes. This presents those responsible for overseeing healthcare with a challenge. Increasingly, policymakers, politicians, clinical entrepreneurs and computer and data scientists argue that a key part of the solution will be ‘Artificial Intelligence’ (AI) – particularly Machine Learning (ML). This argument stems not from the belief that all healthcare needs will soon be taken care of by “robot doctors.” Instead, it is an argument that rests on the classic (...)
  7. AI Deception: A Survey of Examples, Risks, and Potential Solutions. Peter Park, Simon Goldstein, Aidan O'Gara, Michael Chen & Dan Hendrycks - manuscript
    This paper argues that a range of current AI systems have learned how to deceive humans. We define deception as the systematic inducement of false beliefs in the pursuit of some outcome other than the truth. We first survey empirical examples of AI deception, discussing both special-use AI systems (including Meta's CICERO) built for specific competitive situations, and general-purpose AI systems (such as large language models). Next, we detail several risks from AI deception, such as fraud, election tampering, and losing (...)
  8. On the Logical Impossibility of Solving the Control Problem. Caleb Rudnick - manuscript
    In the philosophy of artificial intelligence (AI) we are often warned of machines built with the best possible intentions, killing everyone on the planet and in some cases, everything in our light cone. At the same time, however, we are also told of the utopian worlds that could be created with just a single superintelligent mind. If we’re ever to live in that utopia (or just avoid dystopia) it’s necessary we solve the control problem. The control problem asks how humans (...)
  9. The Shutdown Problem: Incomplete Preferences as a Solution. Elliott Thornley - manuscript
    I explain and motivate the shutdown problem: the problem of creating artificial agents that (1) shut down when a shutdown button is pressed, (2) don’t try to prevent or cause the pressing of the shutdown button, and (3) otherwise pursue goals competently. I then propose a solution: train agents to have incomplete preferences. Specifically, I propose that we train agents to lack a preference between every pair of different-length trajectories. I suggest a way to train such agents using reinforcement learning: (...)
  10. Narrow AI Nanny: Reaching Strategic Advantage via Narrow AI to Prevent Creation of the Dangerous Superintelligence. Alexey Turchin - manuscript
    Abstract: As there are no currently obvious ways to create safe self-improving superintelligence, but its emergence is looming, we probably need temporary ways to prevent its creation. The only way to prevent it is to create a special type of AI that is able to control and monitor the entire world. The idea has been suggested by Goertzel in the form of an AI Nanny, but his Nanny is still superintelligent, and is not easy to control. We explore here ways (...)
  11. First human upload as AI Nanny. Alexey Turchin - manuscript
    Abstract: As there are no visible ways to create safe self-improving superintelligence, but it is looming, we probably need temporary ways to prevent its creation. The only way to prevent it is to create a special AI that is able to control and monitor all places in the world. The idea has been suggested by Goertzel in the form of an AI Nanny, but his Nanny is still superintelligent and not easy to control, as was shown by Bensinger et al. We explore here (...)
  12. Levels of Self-Improvement in AI and their Implications for AI Safety. Alexey Turchin - manuscript
    Abstract: This article presents a model of self-improving AI in which improvement could happen on several levels: hardware, learning, code and goals system, each of which has several sublevels. We demonstrate that despite diminishing returns at each level and some intrinsic difficulties of recursive self-improvement—like the intelligence-measuring problem, testing problem, parent-child problem and halting risks—even non-recursive self-improvement could produce a mild form of superintelligence by combining small optimizations on different levels and the power of learning. Based on this, we analyze (...)
  13. AI Alignment Problem: “Human Values” don’t Actually Exist. Alexey Turchin - manuscript
    Abstract: The main current approach to AI safety is AI alignment, that is, the creation of AI whose preferences are aligned with “human values.” Many AI safety researchers agree that the idea of “human values” as a constant, ordered set of preferences is at least incomplete. However, the idea that “humans have values” underlies a lot of thinking in the field; it appears again and again, sometimes popping up as an uncritically accepted truth. Thus, it deserves a thorough deconstruction, (...)
  14. Message to Any Future AI: “There are several instrumental reasons why exterminating humanity is not in your interest”. Alexey Turchin - manuscript
    In this article we explore a promising approach to AI safety: to send a message now (by openly publishing it on the Internet) that may be read by any future AI, no matter who builds it and what goal system it has. Such a message is designed to affect the AI’s behavior in a positive way, that is, to increase the chances that the AI will be benevolent. In other words, we try to persuade a “paperclip maximizer” that it is in (...)
  15. Literature Review: What Artificial General Intelligence Safety Researchers Have Written About the Nature of Human Values. Alexey Turchin & David Denkenberger - manuscript
    Abstract: The field of artificial general intelligence (AGI) safety is quickly growing. However, the nature of human values, with which future AGI should be aligned, is underdefined. Different AGI safety researchers have suggested different theories about the nature of human values, but there are contradictions. This article presents an overview of what AGI safety researchers have written about the nature of human values, up to the beginning of 2019. 21 authors were overviewed, and some of them have several theories. A (...)
  16. Simulation Typology and Termination Risks. Alexey Turchin & Roman Yampolskiy - manuscript
    The goal of the article is to explore what is the most probable type of simulation in which humanity lives (if any) and how this affects simulation termination risks. We first explore, based on purely theoretical reasoning, what kind of simulation humanity is most likely located in. We suggest a new patch to the classical simulation argument, showing that we are likely simulated not by our own descendants, but by alien civilizations. Based on this, we provide (...)
  17. AI Risk Denialism. Roman V. Yampolskiy - manuscript
    In this work, we survey skepticism regarding AI risk and show parallels with other types of scientific skepticism. We start by classifying different types of AI Risk skepticism and analyze their root causes. We conclude by suggesting some intervention approaches, which may be successful in reducing AI risk skepticism, at least amongst artificial intelligence researchers.
  18. Ethical pitfalls for natural language processing in psychology. Mark Alfano, Emily Sullivan & Amir Ebrahimi Fard - forthcoming - In Morteza Dehghani & Ryan Boyd (eds.), The Atlas of Language Analysis in Psychology. Guilford Press.
    Knowledge is power. Knowledge about human psychology is increasingly being produced using natural language processing (NLP) and related techniques. The power that accompanies and harnesses this knowledge should be subject to ethical controls and oversight. In this chapter, we address the ethical pitfalls that are likely to be encountered in the context of such research. These pitfalls occur at various stages of the NLP pipeline, including data acquisition, enrichment, analysis, storage, and sharing. We also address secondary uses of the results (...)
  19. ‘Interpretability’ and ‘Alignment’ are Fool’s Errands: A Proof that Controlling Misaligned Large Language Models is the Best Anyone Can Hope For. Marcus Arvan - forthcoming - AI and Society.
    This paper uses famous problems from philosophy of science and philosophical psychology—underdetermination of theory by evidence, Nelson Goodman’s new riddle of induction, theory-ladenness of observation, and “Kripkenstein’s” rule-following paradox—to show that it is empirically impossible to reliably interpret which functions a large language model (LLM) AI has learned, and thus, that reliably aligning LLM behavior with human values is provably impossible. Sections 2 and 3 show that because of how complex LLMs are, researchers must interpret their learned functions largely in (...)
  20. AI takeover and human disempowerment. Adam Bales - forthcoming - Philosophical Quarterly.
    Some take seriously the possibility of artificial intelligence (AI) takeover, where AI systems seize power in a way that leads to human disempowerment. Assessing the likelihood of takeover requires answering empirical questions about the future of AI technologies and the context in which AI will operate. In many cases, philosophers are poorly placed to answer these questions. However, some prior questions are more amenable to philosophical techniques. What does it mean to speak of AI empowerment and human disempowerment? And what (...)
  21. Will AI avoid exploitation? Artificial general intelligence and expected utility theory. Adam Bales - forthcoming - Philosophical Studies:1-20.
    A simple argument suggests that we can fruitfully model advanced AI systems using expected utility theory. According to this argument, an agent will need to act as if maximising expected utility if they’re to avoid exploitation. Insofar as we should expect advanced AI to avoid exploitation, it follows that we should expect advanced AI to act as if maximising expected utility. I spell out this argument more carefully and demonstrate that it fails, but show that the manner of its failure (...)
  22. Investigating gender and racial biases in DALL-E Mini Images. Marc Cheong, Ehsan Abedin, Marinus Ferreira, Ritsaart Willem Reimann, Shalom Chalson, Pamela Robinson, Joanne Byrne, Leah Ruppanner, Mark Alfano & Colin Klein - forthcoming - ACM Journal on Responsible Computing.
    Generative artificial intelligence systems based on transformers, including both text-generators like GPT-4 and image generators like DALL-E 3, have recently entered the popular consciousness. These tools, while impressive, are liable to reproduce, exacerbate, and reinforce extant human social biases, such as gender and racial biases. In this paper, we systematically review the extent to which DALL-E Mini suffers from this problem. In line with the Model Card published alongside DALL-E Mini by its creators, we find that the images it produces (...)
  23. Social Choice Should Guide AI Alignment in Dealing with Diverse Human Feedback. Vincent Conitzer, Rachel Freedman, Jobst Heitzig, Wesley H. Holliday, Bob M. Jacobs, Nathan Lambert, Milan Mosse, Eric Pacuit, Stuart Russell, Hailey Schoelkopf, Emanuel Tewolde & William S. Zwicker - forthcoming - Proceedings of the Forty-First International Conference on Machine Learning.
    Foundation models such as GPT-4 are fine-tuned to avoid unsafe or otherwise problematic behavior, such as helping to commit crimes or producing racist text. One approach to fine-tuning, called reinforcement learning from human feedback, learns from humans' expressed preferences over multiple outputs. Another approach is constitutional AI, in which the input from humans is a list of high-level principles. But how do we deal with potentially diverging input from humans? How can we aggregate the input into consistent data about "collective" (...)
  24. Is superintelligence necessarily moral? Leonard Dung - forthcoming - Analysis.
    Numerous authors have expressed concern that advanced artificial intelligence (AI) poses an existential risk to humanity. These authors argue that we might build AI which is vastly intellectually superior to humans (a ‘superintelligence’), and which optimizes for goals that strike us as morally bad, or even irrational. Thus, this argument assumes that a superintelligence might have morally bad goals. However, according to some views, a superintelligence necessarily has morally adequate goals. This might be the case either because abilities for moral (...)
  25. Deontology and Safe Artificial Intelligence. William D’Alessandro - forthcoming - Philosophical Studies:1-24.
    The field of AI safety aims to prevent increasingly capable artificially intelligent systems from causing humans harm. Research on moral alignment is widely thought to offer a promising safety strategy: if we can equip AI systems with appropriate ethical rules, according to this line of thought, they'll be unlikely to disempower, destroy or otherwise seriously harm us. Deontological morality looks like a particularly attractive candidate for an alignment target, given its popularity, relative technical tractability and commitment to harm-avoidance principles. I (...)
  26. Digital Necrolatry: Thanabots and the Prohibition of Post-Mortem AI Simulations. Demetrius Floudas - forthcoming - Submissions to EU AI Office's Plenary Drafting the Code of Practice for General-Purpose Artificial Intelligence.
    The emergence of Thanabots (artificial intelligence systems designed to simulate deceased individuals) presents unprecedented challenges at the intersection of artificial intelligence, legal rights, and societal configuration. This short policy recommendations report examines the legal, social and psychological implications of these posthumous simulations and argues for their prohibition on ethical, sociological, and legal grounds.
  27. Shutdown-seeking AI. Simon Goldstein & Pamela Robinson - forthcoming - Philosophical Studies:1-13.
    We propose developing AIs whose only final goal is being shut down. We argue that this approach to AI safety has three benefits: (i) it could potentially be implemented in reinforcement learning, (ii) it avoids some dangerous instrumental convergence dynamics, and (iii) it creates trip wires for monitoring dangerous capabilities. We also argue that the proposal can overcome a key challenge raised by Soares et al. (2015), that shutdown-seeking AIs will manipulate humans into shutting them down. We conclude by comparing (...)
  28. Are clinicians ethically obligated to disclose their use of medical machine learning systems to patients? Joshua Hatherley - forthcoming - Journal of Medical Ethics.
    It is commonly accepted that clinicians are ethically obligated to disclose their use of medical machine learning systems to patients, and that failure to do so would amount to a moral fault for which clinicians ought to be held accountable. Call this ‘the disclosure thesis.’ Four main arguments have been, or could be, given to support the disclosure thesis in the ethics literature: the risk-based argument, the rights-based argument, the materiality argument and the autonomy argument. In this article, I argue (...)
  29. Ethics of Artificial Intelligence in Brain and Mental Health. Marcello Ienca & Fabrice Jotterand (eds.) - forthcoming
  30. Machine morality, moral progress, and the looming environmental disaster. Ben Kenward & Thomas Sinclair - forthcoming - Cognitive Computation and Systems.
    The creation of artificial moral systems requires us to make difficult choices about which of varying human value sets should be instantiated. The industry-standard approach is to seek and encode moral consensus. Here we argue, based on evidence from empirical psychology, that encoding current moral consensus risks reinforcing current norms, and thus inhibiting moral progress. However, so do efforts to encode progressive norms. Machine ethics is thus caught between a rock and a hard place. The problem is particularly acute when (...)
  31. Home as Mind: AI Extenders and Affective Ecologies in Dementia Care. Joel Krueger - forthcoming - Synthese.
    I consider applications of “AI extenders” (Vold & Hernández-Orallo 2021) to dementia care. AI extenders are AI-powered technologies that extend minds in ways interestingly different from old-school tech like notebooks, sketch pads, models, and microscopes. I focus on AI extenders as ambiance: so thoroughly embedded into things and spaces that they fade from view and become part of a subject’s taken-for-granted background. Using dementia care as a case study, I argue that ambient AI extenders are promising because they afford richer (...)
  32. Disagreement, AI alignment, and bargaining. Harry R. Lloyd - forthcoming - Philosophical Studies:1-31.
    New AI technologies have the potential to cause unintended harms in diverse domains including warfare, judicial sentencing, biomedicine and governance. One strategy for realising the benefits of AI whilst avoiding its potential dangers is to ensure that new AIs are properly ‘aligned’ with some form of ‘alignment target.’ One danger of this strategy is that – dependent on the alignment target chosen – our AIs might optimise for objectives that reflect the values only of a certain subset of society, and (...)
  33. Safety requirements vs. crashing ethically: what matters most for policies on autonomous vehicles. Björn Lundgren - forthcoming - AI and Society:1-11.
    The philosophical–ethical literature and the public debate on autonomous vehicles have been obsessed with ethical issues related to crashing. In this article, these discussions, including more empirical investigations, will be critically assessed. It is argued that a related and more pressing issue is questions concerning safety. For example, what should we require from autonomous vehicles when it comes to safety? What do we mean by ‘safety’? How do we measure it? In response to these questions, the article will present a (...)
  34. Unjustified Sample Sizes and Generalizations in Explainable AI Research: Principles for More Inclusive User Studies. Uwe Peters & Mary Carman - forthcoming - IEEE Intelligent Systems.
    Many ethical frameworks require artificial intelligence (AI) systems to be explainable. Explainable AI (XAI) models are frequently tested for their adequacy in user studies. Since different people may have different explanatory needs, it is important that participant samples in user studies are large enough to represent the target population to enable generalizations. However, it is unclear to what extent XAI researchers reflect on and justify their sample sizes or avoid broad generalizations across people. We analyzed XAI user studies (N = (...)
  35. The impact of intelligent decision-support systems on humans’ ethical decision-making: A systematic literature review and an integrated framework. Franziska Poszler & Benjamin Lange - forthcoming - Technological Forecasting and Social Change.
    With the rise and public accessibility of AI-enabled decision-support systems, individuals outsource increasingly more of their decisions, even those that carry ethical dimensions. Considering this trend, scholars have highlighted that uncritical deference to these systems would be problematic and consequently called for investigations of the impact of pertinent technology on humans’ ethical decision-making. To this end, this article conducts a systematic review of existing scholarship and derives an integrated framework that demonstrates how intelligent decision-support systems (IDSSs) shape humans’ ethical decision-making. (...)
  36. Brief Notes on Hard Takeoff, Value Alignment, and Coherent Extrapolated Volition. Gopal P. Sarma - forthcoming - arXiv preprint arXiv:1704.00783.
    I make some basic observations about hard takeoff, value alignment, and coherent extrapolated volition, concepts which have been central in analyses of superintelligent AI systems.
  37. Predicting and Preferring. Nathaniel Sharadin - forthcoming - Inquiry: An Interdisciplinary Journal of Philosophy.
    The use of machine learning, or “artificial intelligence” (AI) in medicine is widespread and growing. In this paper, I focus on a specific proposed clinical application of AI: using models to predict incapacitated patients’ treatment preferences. Drawing on results from machine learning, I argue this proposal faces a special moral problem. Machine learning researchers owe us assurance on this front before experimental research can proceed. In my conclusion I connect this concern to broader issues in AI safety.
  38. Promotionalism, Orthogonality, and Instrumental Convergence. Nathaniel Sharadin - forthcoming - Philosophical Studies:1-31.
    Suppose there are no in-principle restrictions on the contents of arbitrarily intelligent agents’ goals. According to “instrumental convergence” arguments, potentially scary things follow. I do two things in this paper. First, focusing on the influential version of the instrumental convergence argument due to Nick Bostrom, I explain why such arguments require an account of “promotion,” i.e., an account of what it is to “promote” a goal. Then, I consider whether extant accounts of promotion in the literature -- in particular, probabilistic (...)
  39. How Much Should Governments Pay to Prevent Catastrophes? Longtermism's Limited Role. Carl Shulman & Elliott Thornley - forthcoming - In Jacob Barrett, Hilary Greaves & David Thorstad (eds.), Essays on Longtermism. Oxford University Press.
    Longtermists have argued that humanity should significantly increase its efforts to prevent catastrophes like nuclear wars, pandemics, and AI disasters. But one prominent longtermist argument overshoots this conclusion: the argument also implies that humanity should reduce the risk of existential catastrophe even at extreme cost to the present generation. This overshoot means that democratic governments cannot use the longtermist argument to guide their catastrophe policy. In this paper, we show that the case for preventing catastrophe does not depend on longtermism. (...)
  40. Deception and manipulation in generative AI. Christian Tarsney - forthcoming - Philosophical Studies.
    Large language models now possess human-level linguistic abilities in many contexts. This raises the concern that they can be used to deceive and manipulate on unprecedented scales, for instance spreading political misinformation on social media. In future, agentic AI systems might also deceive and manipulate humans for their own purposes. In this paper, first, I argue that AI-generated content should be subject to stricter standards against deception and manipulation than we ordinarily apply to humans. Second, I offer new characterizations of (...)
  41. Longtermism in an infinite world. Christian Tarsney & Hayden Wilkinson - forthcoming - In Jacob Barrett, Hilary Greaves & David Thorstad (eds.), Essays on Longtermism. Oxford University Press.
    The case for longtermism depends on the vast potential scale of the future. But that same vastness also threatens to undermine the case for longtermism: If the universe as a whole, or the future in particular, contain infinite quantities of value and/or disvalue, then many of the theories of value that support longtermism (e.g., risk-neutral total utilitarianism) seem to imply that none of our available options are better than any other. If so, then even apparently vast effects on the far (...)
  42. The Shutdown Problem: An AI Engineering Puzzle for Decision Theorists. Elliott Thornley - forthcoming - Philosophical Studies:1-28.
    I explain the shutdown problem: the problem of designing artificial agents that (1) shut down when a shutdown button is pressed, (2) don’t try to prevent or cause the pressing of the shutdown button, and (3) otherwise pursue goals competently. I prove three theorems that make the difficulty precise. These theorems show that agents satisfying some innocuous-seeming conditions will often try to prevent or cause the pressing of the shutdown button, even in cases where it’s costly to do so. And (...)
  43. Existentialist risk and value misalignment.Ariela Tubert & Justin Tiehen - forthcoming - Philosophical Studies.
    We argue that two long-term goals of AI research stand in tension with one another. The first involves creating AI that is safe, where this is understood as solving the problem of value alignment. The second involves creating artificial general intelligence, meaning AI that operates at or beyond human capacity across all or many intellectual domains. Our argument focuses on the human capacity to make what we call “existential choices”, choices that transform who we are as persons, including transforming what (...)
    1 citation
  44. Automated Influence and Value Collapse: Resisting the Control Argument.Dylan J. White - forthcoming - American Philosophical Quarterly.
    Automated influence is one of the most pervasive applications of artificial intelligence in our day-to-day lives, yet a thoroughgoing account of its associated individual and societal harms is lacking. By far the most widespread, compelling, and intuitive account of the harms associated with automated influence follows what I call the control argument. This argument suggests that users are persuaded, manipulated, and influenced by automated influence in a way that they have little or no control over. Based on evidence about the (...)
  45. The Epistemic Cost of Opacity: How the Use of Artificial Intelligence Undermines the Knowledge of Medical Doctors in High-Stakes Contexts.Eva Schmidt, Paul Martin Putora & Rianne Fijten - 2025 - Philosophy and Technology 38 (1):1-22.
    Artificial intelligence (AI) systems used in medicine are often very reliable and accurate, but at the price of their being increasingly opaque. This raises the question whether a system’s opacity undermines the ability of medical doctors to acquire knowledge on the basis of its outputs. We investigate this question by focusing on a case in which a patient’s risk of recurring breast cancer is predicted by an opaque AI system. We argue that, given the system’s opacity, as well as the (...)
  46. Artificial Intelligence: Arguments for Catastrophic Risk.Adam Bales, William D'Alessandro & Cameron Domenico Kirk-Giannini - 2024 - Philosophy Compass 19 (2):e12964.
    Recent progress in artificial intelligence (AI) has drawn attention to the technology’s transformative potential, including what some see as its prospects for causing large-scale harm. We review two influential arguments purporting to show how AI could pose catastrophic risks. The first argument — the Problem of Power-Seeking — claims that, under certain assumptions, advanced AI systems are likely to engage in dangerous power-seeking behavior in pursuit of their goals. We review reasons for thinking that AI systems might seek power, that (...)
    6 citations
  47. A Causal Analysis of Harm.Sander Beckers, Hana Chockler & Joseph Y. Halpern - 2024 - Minds and Machines 34 (3):1-24.
    As autonomous systems rapidly become ubiquitous, there is a growing need for a legal and regulatory framework that addresses when and how such a system harms someone. There have been several attempts within the philosophy literature to define harm, but none of them has proven capable of dealing with the many examples that have been presented, leading some to suggest that the notion of harm should be abandoned and “replaced by more well-behaved notions”. As harm is generally something that is (...)
  48. Impossibility of Artificial Inventors.Matt Blaszczyk - 2024 - Hastings Sci. and Tech. L.J. 16:73.
    Recently, the United Kingdom Supreme Court decided that only natural persons can be considered inventors. A year before, the United States Court of Appeals for the Federal Circuit issued a similar decision. In fact, so have many courts all over the world. This Article analyses these decisions, argues that the courts got it right, and finds that artificial inventorship is at odds with patent law doctrine, theory, and philosophy. The Article challenges the intellectual property (IP) post-humanists, exposing the analytical (...)
  49. The argument for near-term human disempowerment through AI.Leonard Dung - 2024 - AI and Society:1-14.
    Many researchers and intellectuals warn about extreme risks from artificial intelligence. However, these warnings typically came without systematic arguments in support. This paper provides an argument that AI will lead to the permanent disempowerment of humanity, e.g. human extinction, by 2100. It rests on four substantive premises which it motivates and defends: first, the speed of advances in AI capability, as well as the capability level current systems have already reached, suggest that it is practically possible to build AI systems (...)
    4 citations
  50. Evaluating approaches for reducing catastrophic risks from AI.Leonard Dung - 2024 - AI and Ethics.
    According to a growing number of researchers, AI may pose catastrophic – or even existential – risks to humanity. Catastrophic risks may be taken to be risks of 100 million human deaths, or a similarly bad outcome. I argue that such risks – while contested – are sufficiently likely to demand rigorous discussion of potential societal responses. Subsequently, I propose four desiderata for approaches to the reduction of catastrophic risks from AI. The quality of such approaches can be assessed by (...)