Results for 'AI Alignment'

973 found
  1. AI Alignment vs. AI Ethical Treatment: Ten Challenges. Adam Bradley & Bradford Saad - manuscript [1 citation]
    A morally acceptable course of AI development should avoid two dangers: creating unaligned AI systems that pose a threat to humanity and mistreating AI systems that merit moral consideration in their own right. This paper argues these two dangers interact and that if we create AI systems that merit moral consideration, simultaneously avoiding both of these dangers would be extremely challenging. While our argument is straightforward and supported by a wide range of pretheoretical moral judgments, it has far-reaching moral implications (...)
  2. Values in science and AI alignment research. Leonard Dung - manuscript
    Roughly, empirical AI alignment research (AIA) is an area of AI research which investigates empirically how to design AI systems in line with human goals. This paper examines the role of non-epistemic values in AIA. It argues that: (1) Sciences differ in the degree to which values influence them. (2) AIA is strongly value-laden. (3) This influence of values is managed inappropriately and thus threatens AIA’s epistemic integrity and ethical beneficence. (4) AIA should strive to achieve value transparency, critical (...)
  3. Disagreement, AI alignment, and bargaining. Harry R. Lloyd - forthcoming - Philosophical Studies:1-31.
    New AI technologies have the potential to cause unintended harms in diverse domains including warfare, judicial sentencing, biomedicine and governance. One strategy for realising the benefits of AI whilst avoiding its potential dangers is to ensure that new AIs are properly ‘aligned’ with some form of ‘alignment target.’ One danger of this strategy is that – dependent on the alignment target chosen – our AIs might optimise for objectives that reflect the values only of a certain subset of (...)
  4. Expanding AI and AI Alignment Discourse: An Opportunity for Greater Epistemic Inclusion. A. E. Williams - manuscript
    The AI and AI alignment communities have been instrumental in addressing existential risks, developing alignment methodologies, and promoting rationalist problem-solving approaches. However, as AI research ventures into increasingly uncertain domains, there is a risk of premature epistemic convergence, where prevailing methodologies influence not only the evaluation of ideas but also determine which ideas are considered within the discourse. This paper examines critical epistemic blind spots in AI alignment research, particularly the lack of predictive frameworks to differentiate problems (...)
  5. Philosophical Investigations into AI Alignment: A Wittgensteinian Framework. José Antonio Pérez-Escobar & Deniz Sarikaya - 2024 - Philosophy and Technology 37 (3):1-25. [3 citations]
    We argue that the later Wittgenstein’s philosophy of language and mathematics, substantially focused on rule-following, is relevant to understand and improve on the Artificial Intelligence (AI) alignment problem: his discussions on the categories that influence alignment between humans can inform about the categories that should be controlled to improve on the alignment problem when creating large data sets to be used by supervised and unsupervised learning algorithms, as well as when introducing hard coded guardrails for AI models. (...)
  6. AI, alignment, and the categorical imperative. Fritz McDonald - 2023 - AI and Ethics 3:337-344.
    Tae Wan Kim, John Hooker, and Thomas Donaldson make an attempt, in recent articles, to solve the alignment problem. As they define the alignment problem, it is the issue of how to give AI systems moral intelligence. They contend that one might program machines with a version of Kantian ethics cast in deontic modal logic. On their view, machines can be aligned with human values if such machines obey principles of universalization and autonomy, as well as a deontic (...)
  7. Social Choice Should Guide AI Alignment in Dealing with Diverse Human Feedback. Vincent Conitzer, Rachel Freedman, Jobst Heitzig, Wesley H. Holliday, Bob M. Jacobs, Nathan Lambert, Milan Mosse, Eric Pacuit, Stuart Russell, Hailey Schoelkopf, Emanuel Tewolde & William S. Zwicker - forthcoming - Proceedings of the Forty-First International Conference on Machine Learning.
    Foundation models such as GPT-4 are fine-tuned to avoid unsafe or otherwise problematic behavior, such as helping to commit crimes or producing racist text. One approach to fine-tuning, called reinforcement learning from human feedback, learns from humans' expressed preferences over multiple outputs. Another approach is constitutional AI, in which the input from humans is a list of high-level principles. But how do we deal with potentially diverging input from humans? How can we aggregate the input into consistent data about "collective" (...)
  8. Beyond Preferences in AI Alignment. Tan Zhi-Xuan, Micah Carroll, Matija Franklin & Hal Ashton - forthcoming - Philosophical Studies:1-51.
    The dominant practice of AI alignment assumes (1) that preferences are an adequate representation of human values, (2) that human rationality can be understood in terms of maximizing the satisfaction of preferences, and (3) that AI systems should be aligned with the preferences of one or more humans to ensure that they behave safely and in accordance with our values. Whether implicitly followed or explicitly endorsed, these commitments constitute what we term a preferentist approach to AI alignment. In this paper, (...)
  9. Kantian Fallibilist Ethics for AI alignment. Vadim Chaly - 2024 - Journal of Philosophical Investigations 18 (47):303-318.
    The problem of AI alignment has parallels in Kantian ethics and can benefit from its concepts and arguments. The Kantian framework allows us to better answer the question of what exactly AI is being aligned to, what are the problems of alignment of rational agents in general, and what are the prospects for achieving a state of alignment. Having described the state of discussions about alignment in AI, I will reformulate them in Kantian terms. Thus, the (...)
  10. Reflections on the AI alignment problem. Dan Bruiger - forthcoming - AI and Society:1-10.
    The Alignment Problem in artificial intelligence concerns how to ensure that artificial general intelligence (AGI) conforms to human goals and values and remains under human control. The concept of general intelligence, modelled on human and animal behavior, lacks coherence. The ideal of autonomy inherent in AGI conflicts with the ideal of external control. Truly autonomous agents are necessarily embodied, but embodiment implies more than physical instantiation or sensory input. It means being an autopoietic system (like a natural organism), with (...)
  11. Aesthetic Value and the AI Alignment Problem. Alice C. Helliwell - 2024 - Philosophy and Technology 37 (4):1-21.
    The threat from possible future superintelligent AI has given rise to discussion of the so-called “value alignment problem”. This is the problem of how to ensure artificially intelligent systems align with human values, and thus (hopefully) mitigate risks associated with them. Naturally, AI value alignment is often discussed in relation to morally relevant values, such as the value of human lives or human wellbeing. However, solutions to the value alignment problem target all human values, not only morally (...)
  12. Calibrating machine behavior: a challenge for AI alignment. Erez Firt - 2023 - Ethics and Information Technology 25 (3):1-8. [1 citation]
    When discussing AI alignment, we usually refer to the problem of teaching or training advanced autonomous AI systems to make decisions that are aligned with human values or preferences. Proponents of this approach believe it can be employed as means to stay in control over sophisticated intelligent systems, thus avoiding certain existential risks. We identify three general obstacles on the path to implementation of value alignment: a technological/technical obstacle, a normative obstacle, and a calibration problem. Presupposing, for the (...)
  13. Toleration and Justice in the Laozi: Engaging with Tao Jiang's Origins of Moral-Political Philosophy in Early China. Ai Yuan - 2023 - Philosophy East and West 73 (2):466-475.
    In lieu of an abstract, here is a brief excerpt of the content: This review article engages with Tao Jiang's ground-breaking monograph on the Origins of Moral-Political Philosophy in Early China, with particular focus on the articulation of toleration and justice in the Laozi (otherwise called the Daodejing). Jiang discusses a naturalistic turn and the re-alignment of values in the Laozi, resulting in a (...)
  14. Saliva Ontology: An ontology-based framework for a Salivaomics Knowledge Base. Jiye Ai, Barry Smith & David Wong - 2010 - BMC Bioinformatics 11 (1):302. [4 citations]
    The Salivaomics Knowledge Base (SKB) is designed to serve as a computational infrastructure that can permit global exploration and utilization of data and information relevant to salivaomics. SKB is created by aligning (1) the saliva biomarker discovery and validation resources at UCLA with (2) the ontology resources developed by the OBO (Open Biomedical Ontologies) Foundry, including a new Saliva Ontology (SALO). We define the Saliva Ontology (SALO; http://www.skb.ucla.edu/SALO/) as a consensus-based controlled vocabulary of terms and relations dedicated to the salivaomics (...)
  15. Discovering Our Blind Spots and Cognitive Biases in AI Research and Alignment. A. E. Williams - manuscript
    The challenge of AI alignment is not just a technological issue but fundamentally an epistemic one. AI safety research predominantly relies on empirical validation, often detecting failures only after they manifest. However, certain risks—such as deceptive alignment and goal misspecification—may not be empirically testable until it is too late, necessitating a shift toward leading-indicator logical reasoning. This paper explores how mainstream AI research systematically filters out deep epistemic insight, hindering progress in AI safety. We assess the rarity of (...)
  16. AI in the noosphere: an alignment of scientific and wisdom traditions. Stephen D. Edwards - 2021 - AI and Society 36 (1):397-399.
  17. A Note on “Philosophical Investigations into AI Alignment: A Wittgensteinean Framework” by J.A. Pérez-Escobar and D. Sarikaya. [REVIEW] Sorin Bangu - 2024 - Philosophy and Technology 37 (3):1-5.
  18. Comparative Analysis of Food Related Sustainable Development Goals in the North Asia Pacific Region. Charles V. Trappey, Amy J. C. Trappey, Hsin-Jung Lin & Ai-Che Chang - 2023 - Food Ethics 8 (2):1-24.
    Member States of the United Nations proposed Seventeen Sustainable Development Goals (SDGs) in 2015, emphasizing the well-being of people, planet, prosperity, peace, and partnership. Countries are expected to work diligently to achieve these goals by the year 2030. The paths chosen to achieve the SDGs depend on each country’s specific needs, challenges, and opportunities. This contribution conducts a bibliometric study of selected SDG research related to hunger and climate change among countries of the North Asia Pacific region. A review of (...)
  19. Democratizing value alignment: from authoritarian to democratic AI ethics. Linus Ta-Lun Huang, Gleb Papyshev & James K. Wong - 2024 - AI and Ethics.
    Value alignment is essential for ensuring that AI systems act in ways that are consistent with human values. Existing approaches, such as reinforcement learning with human feedback and constitutional AI, however, exhibit power asymmetries and lack transparency. These “authoritarian” approaches fail to adequately accommodate a broad array of human opinions, raising concerns about whose values are being prioritized. In response, we introduce the Dynamic Value Alignment approach, theoretically grounded in the principles of parallel constraint satisfaction, which models moral (...)
  20. Applying AI for social good: Aligning academic journal ratings with the United Nations Sustainable Development Goals (SDGs). David Steingard, Marcello Balduccini & Akanksha Sinha - 2023 - AI and Society 38 (2):613-629.
    This paper offers three contributions to the burgeoning movements of AI for Social Good (AI4SG) and AI and the United Nations Sustainable Development Goals (SDGs). First, we introduce the SDG-Intense Evaluation framework (SDGIE) that aims to situate variegated automated/AI models in a larger ecosystem of computational approaches to advance the SDGs. To foster knowledge collaboration for solving complex social and environmental problems encompassed by the SDGs, the SDGIE framework details a benchmark structure of data-algorithm-output to effectively standardize AI approaches to (...)
  21. Current cases of AI misalignment and their implications for future risks. Leonard Dung - 2023 - Synthese 202 (5):1-23. [7 citations]
    How can one build AI systems such that they pursue the goals their designers want them to pursue? This is the alignment problem. Numerous authors have raised concerns that, as research advances and systems become more powerful over time, misalignment might lead to catastrophic outcomes, perhaps even to the extinction or permanent disempowerment of humanity. In this paper, I analyze the severity of this risk based on current instances of misalignment. More specifically, I argue that contemporary large language models (...)
  22. Honor Ethics: The Challenge of Globalizing Value Alignment in AI. Stephen Tze-Inn Wu, Dan Demetriou & Rudwan Ali Husain - 2023 - 2023 ACM Conference on Fairness, Accountability, and Transparency (FAccT '23), June 12-15, 2023.
    Some researchers have recognized that privileged communities dominate the discourse on AI Ethics, and other voices need to be heard. As such, we identify the current ethics milieu as arising from WEIRD (Western, Educated, Industrialized, Rich, Democratic) contexts, and aim to expand the discussion to non-WEIRD global communities, who are also stakeholders in global sociotechnical systems. We argue that accounting for honor, along with its values and related concepts, would better approximate a global ethical perspective. This complex concept already underlies (...)
  23. An explanation space to align user studies with the technical development of Explainable AI. Garrick Cabour, Andrés Morales-Forero, Élise Ledoux & Samuel Bassetto - 2023 - AI and Society 38 (2):869-887. [1 citation]
    Providing meaningful and actionable explanations for end-users is a situated problem requiring the intersection of multiple disciplines to address social, operational, and technical challenges. However, the explainable artificial intelligence community has not commonly adopted or created tangible design tools that allow interdisciplinary work to develop reliable AI-powered solutions. This paper proposes a formative architecture that defines the explanation space from a user-inspired perspective. The architecture comprises five intertwined components to outline explanation requirements for a task: (1) the end-users’ mental models, (...)
  24. Minangkabaunese matrilineal: The correlation between the Qur’an and gender. Halimatussa’Diyah Halimatussa’Diyah, Kusnadi Kusnadi, Ai Y. Yuliyanti, Deddy Ilyas & Eko Zulfikar - 2024 - HTS Theological Studies 80 (1):7.
    Upon previous research, the matrilineal system seems to oppose Islamic teaching. However, the matrilineal system practiced by the Minangkabau society in West Sumatra, Indonesia has its uniqueness. Thus, this study aims to examine the correlation between the Qur’an and gender roles within the context of Minangkabau customs, specifically focusing on the matrilineal aspect. The present study employs qualitative methods for conducting library research through critical analysis. This study discovered that the matrilineal system practiced by the Minangkabau society aligns with Qur’anic (...)
  25. Explicability as an AI Principle: Technology and Ethics in Cooperation. Moto Kamiura - forthcoming - Proceedings of the 39th Annual Conference of the Japanese Society for Artificial Intelligence, 2025.
    This paper categorizes current approaches to AI ethics into four perspectives and briefly summarizes them: (1) Case studies and technical trend surveys, (2) AI governance, (3) Technologies for AI alignment, (4) Philosophy. In the second half, we focus on the fourth perspective, the philosophical approach, within the context of applied ethics. In particular, the explicability of AI may be an area in which scientists, engineers, and AI developers are expected to engage more actively relative to other ethical issues in (...)
  26. AI Survival Stories: a Taxonomic Analysis of AI Existential Risk. Herman Cappelen, Simon Goldstein & John Hawthorne - forthcoming - Philosophy of AI. [1 citation]
    Since the release of ChatGPT, there has been a lot of debate about whether AI systems pose an existential risk to humanity. This paper develops a general framework for thinking about the existential risk of AI systems. We analyze a two-premise argument that AI systems pose a threat to humanity. Premise one: AI systems will become extremely powerful. Premise two: if AI systems become extremely powerful, they will destroy humanity. We use these two premises to construct a taxonomy of ‘survival (...)
  27. Aligning artificial intelligence with moral intuitions: an intuitionist approach to the alignment problem. Dario Cecchini, Michael Pflanzer & Veljko Dubljevic - 2024 - AI and Ethics:1-11. [1 citation]
    As artificial intelligence (AI) continues to advance, one key challenge is ensuring that AI aligns with certain values. However, in the current diverse and democratic society, reaching a normative consensus is complex. This paper delves into the methodological aspect of how AI ethicists can effectively determine which values AI should uphold. After reviewing the most influential methodologies, we detail an intuitionist research agenda that offers guidelines for aligning AI applications with a limited set of reliable moral intuitions, each underlying a (...)
  28. Multi-Value Alignment for ML/AI Development Choices. Hetvi Jethwani & Anna C. F. Lewis - 2025 - American Philosophical Quarterly 62 (2):133-152.
    We outline a four-step process for ML/AI developers to align development choices with multiple values, by adapting a widely-utilized framework from bioethics: (1) identify the values that matter, (2) specify identified values, (3) find solution spaces that allow for maximal alignment with identified values, and (4) make hard choices if there are unresolvable trade-offs between the identified values. Key to this approach is identifying resolvable trade-offs between values (Step 3). We survey ML/AI methods that could be used to this (...)
  29. A Justifiable Investment in AI for Healthcare: Aligning Ambition with Reality. Kassandra Karpathakis, Jessica Morley & Luciano Floridi - 2024 - Minds and Machines 34 (4):1-40.
    Healthcare systems are grappling with critical challenges, including chronic diseases in aging populations, unprecedented health care staffing shortages and turnover, scarce resources, unprecedented demands and wait times, escalating healthcare expenditure, and declining health outcomes. As a result, policymakers and healthcare executives are investing in artificial intelligence (AI) solutions to increase operational efficiency, lower health care costs, and improve patient care. However, current level of investment in developing healthcare AI among members of the global digital health partnership does not seem to (...)
  30. Aligning artificial intelligence with human values: reflections from a phenomenological perspective. Shengnan Han, Eugene Kelly, Shahrokh Nikou & Eric-Oluf Svee - 2022 - AI and Society 37 (4):1383-1395. [5 citations]
    Artificial Intelligence (AI) must be directed at humane ends. The development of AI has produced great uncertainties of ensuring AI alignment with human values (AI value alignment) through AI operations from design to use. For the purposes of addressing this problem, we adopt the phenomenological theories of material values and technological mediation to be that beginning step. In this paper, we first discuss the AI value alignment from the relevant AI studies. Second, we briefly present what are (...)
  31. Is Alignment Unsafe? Cameron Domenico Kirk-Giannini - 2024 - Philosophy and Technology 37 (110):1–4.
    Inchul Yum (2024) argues that the widespread adoption of language agent architectures would likely increase the risk posed by AI by simplifying the process of aligning artificial systems with human values and thereby making it easier for malicious actors to use them to cause a variety of harms. Yum takes this to be an example of a broader phenomenon: progress on the alignment problem is likely to be net safety-negative because it makes artificial systems easier for malicious actors to (...)
  32. From Confucius to Coding and Avicenna to Algorithms: Cultivating Ethical AI Development through Cross-Cultural Ancient Wisdom. Ammar Younas & Yi Zeng - manuscript
    This paper explores the potential of integrating ancient educational principles from diverse eastern cultures into modern AI ethics curricula. It draws on the rich educational traditions of ancient China, India, Arabia, Persia, Japan, Tibet, Mongolia, and Korea, highlighting their emphasis on philosophy, ethics, holistic development, and critical thinking. By examining these historical educational systems, the paper establishes a correlation with modern AI ethics principles, advocating for the inclusion of these ancient teachings in current AI development and education. The proposed integration (...)
  33. Artificial Intelligence, Values, and Alignment. Iason Gabriel - 2020 - Minds and Machines 30 (3):411-437. [72 citations]
    This paper looks at philosophical questions that arise in the context of AI alignment. It defends three propositions. First, normative and technical aspects of the AI alignment problem are interrelated, creating space for productive engagement between people working in both domains. Second, it is important to be clear about the goal of alignment. There are significant differences between AI that aligns with instructions, intentions, revealed preferences, ideal preferences, interests and values. A principle-based approach to AI alignment, (...)
  34. ‘Interpretability’ and ‘Alignment’ are Fool’s Errands: A Proof that Controlling Misaligned Large Language Models is the Best Anyone Can Hope For. Marcus Arvan - forthcoming - AI and Society.
    This paper uses famous problems from philosophy of science and philosophical psychology—underdetermination of theory by evidence, Nelson Goodman’s new riddle of induction, theory-ladenness of observation, and “Kripkenstein’s” rule-following paradox—to show that it is empirically impossible to reliably interpret which functions a large language model (LLM) AI has learned, and thus, that reliably aligning LLM behavior with human values is provably impossible. Sections 2 and 3 show that because of how complex LLMs are, researchers must interpret their learned functions largely in (...)
  35. Knowledge-augmented face perception: Prospects for the Bayesian brain-framework to align AI and human vision. Martin Maier, Florian Blume, Pia Bideau, Olaf Hellwich & Rasha Abdel Rahman - 2022 - Consciousness and Cognition 101:103301. [2 citations]
  36. Value Alignment for Advanced Artificial Judicial Intelligence. Christoph Winter, Nicholas Hollman & David Manheim - 2023 - American Philosophical Quarterly 60 (2):187-203. [2 citations]
    This paper considers challenges resulting from the use of advanced artificial judicial intelligence (AAJI). We argue that these challenges should be considered through the lens of value alignment. Instead of discussing why specific goals and values, such as fairness and nondiscrimination, ought to be implemented, we consider the question of how AAJI can be aligned with goals and values more generally, in order to be reliably integrated into legal and judicial systems. This value alignment framing draws on AI (...)
  37. Security practices in AI development. Petr Spelda & Vit Stritecky - forthcoming - AI and Society.
    What makes safety claims about general purpose AI systems such as large language models trustworthy? We show that rather than the capabilities of security tools such as alignment and red teaming procedures, it is security practices based on these tools that contributed to reconfiguring the image of AI safety and made the claims acceptable. After showing what causes the gap between the capabilities of security tools and the desired safety guarantees, we critically investigate how AI security practices attempt to (...)
  38. The argument for near-term human disempowerment through AI. Leonard Dung - 2024 - AI and Society:1-14. [6 citations]
    Many researchers and intellectuals warn about extreme risks from artificial intelligence. However, these warnings typically came without systematic arguments in support. This paper provides an argument that AI will lead to the permanent disempowerment of humanity, e.g. human extinction, by 2100. It rests on four substantive premises which it motivates and defends: first, the speed of advances in AI capability, as well as the capability level current systems have already reached, suggest that it is practically possible to build AI systems (...)
  39. A comment on the pursuit to align AI: we do not need value-aligned AI, we need AI that is risk-averse. Rebecca Raper - forthcoming - AI and Society:1-3.
  40. AI Ethics by Design: Implementing Customizable Guardrails for Responsible AI Development. Kristina Sekrst, Jeremy McHugh & Jonathan Rodriguez Cefalu - manuscript
    This paper explores the development of an ethical guardrail framework for AI systems, emphasizing the importance of customizable guardrails that align with diverse user values and underlying ethics. We address the challenges of AI ethics by proposing a structure that integrates rules, policies, and AI assistants to ensure responsible AI behavior, while comparing the proposed framework to the existing state-of-the-art guardrails. By focusing on practical mechanisms for implementing ethical standards, we aim to enhance transparency, user autonomy, and continuous improvement in (...)
  41. AI metrics and policymaking: assumptions and challenges in the shaping of AI. Konstantinos Sioumalas-Christodoulou & Aristotle Tympas - forthcoming - AI and Society:1-16.
    This paper explores the interplay between AI metrics and policymaking by examining the conceptual and methodological frameworks of global AI metrics and their alignment with National Artificial Intelligence Strategies (NAIS). Through topic modeling and qualitative content analysis, key thematic areas in NAIS are identified. The findings suggest a misalignment between the technical and economic focus of global AI metrics and the broader societal and ethical priorities emphasized in NAIS. This highlights the need to recalibrate AI evaluation frameworks to include (...)
  42. Reflections on Putting AI Ethics into Practice: How Three AI Ethics Approaches Conceptualize Theory and Practice. Hannah Bleher & Matthias Braun - 2023 - Science and Engineering Ethics 29 (3):1-21. [5 citations]
    Critics currently argue that applied ethics approaches to artificial intelligence (AI) are too principles-oriented and entail a theory–practice gap. Several applied ethical approaches try to prevent such a gap by conceptually translating ethical theory into practice. In this article, we explore how the currently most prominent approaches of AI ethics translate ethics into practice. Therefore, we examine three approaches to applied AI ethics: the embedded ethics approach, the ethically aligned approach, and the Value Sensitive Design (VSD) approach. We analyze each (...)
  43. Human-aligned artificial intelligence is a multiobjective problem. Peter Vamplew, Richard Dazeley, Cameron Foale, Sally Firmin & Jane Mummery - 2018 - Ethics and Information Technology 20 (1):27-40. [11 citations]
    As the capabilities of artificial intelligence systems improve, it becomes important to constrain their actions to ensure their behaviour remains beneficial to humanity. A variety of ethical, legal and safety-based frameworks have been proposed as a basis for designing these constraints. Despite their variations, these frameworks share the common characteristic that decision-making must consider multiple potentially conflicting factors. We demonstrate that these alignment frameworks can be represented as utility functions, but that the widely used Maximum Expected Utility paradigm provides (...)
  44. Systematizing AI Governance through the Lens of Ken Wilber's Integral Theory. Ammar Younas & Yi Zeng - manuscript
    We apply Ken Wilber's Integral Theory to AI governance, demonstrating its ability to systematize diverse approaches in the current multifaceted AI governance landscape. By analyzing ethical considerations, technological standards, cultural narratives, and regulatory frameworks through Integral Theory's four quadrants, we offer a comprehensive perspective on governance needs. This approach aligns AI governance with human values, psychological well-being, cultural norms, and robust regulatory standards. Integral Theory’s emphasis on interconnected individual and collective experiences addresses the deeper aspects of AI-related issues. Additionally, we (...)
  45. Decolonial AI: Decolonial Theory as Sociotechnical Foresight in Artificial Intelligence. Shakir Mohamed, Marie-Therese Png & William Isaac - 2020 - Philosophy and Technology 33 (4):659-684. [42 citations]
    This paper explores the important role of critical science, and in particular of post-colonial and decolonial theories, in understanding and shaping the ongoing advances in artificial intelligence. Artificial intelligence is viewed as amongst the technological advances that will reshape modern societies and their relations. While the design and deployment of systems that continually adapt holds the promise of far-reaching positive change, they simultaneously pose significant risks, especially to already vulnerable peoples. Values and power are central to this discussion. Decolonial theories (...)
  46. “Democratizing AI” and the Concern of Algorithmic Injustice. Ting-an Lin - 2024 - Philosophy and Technology 37 (3):1-27.
    The call to make artificial intelligence (AI) more democratic, or to “democratize AI,” is sometimes framed as a promising response for mitigating algorithmic injustice or making AI more aligned with social justice. However, the notion of “democratizing AI” is elusive, as the phrase has been associated with multiple meanings and practices, and the extent to which it may help mitigate algorithmic injustice is still underexplored. In this paper, based on a socio-technical understanding of algorithmic injustice, I examine three notable notions (...)
  47. The linguistic dead zone of value-aligned agency, natural and artificial. Travis LaCroix - 2024 - Philosophical Studies:1-23.
    The value alignment problem for artificial intelligence (AI) asks how we can ensure that the “values”—i.e., objective functions—of artificial systems are aligned with the values of humanity. In this paper, I argue that linguistic communication is a necessary condition for robust value alignment. I discuss the consequences that the truth of this claim would have for research programmes that attempt to ensure value alignment for AI systems—or, more loftily, those programmes that seek to design robustly beneficial or (...)
  48. Ethical funding for trustworthy AI: proposals to address the responsibilities of funders to ensure that projects adhere to trustworthy AI practice. Marie Oldfield - 2021 - AI and Ethics 1 (1):1. [1 citation]
    AI systems that demonstrate significant bias or lower-than-claimed accuracy, resulting in individual and societal harms, continue to be reported. Such reports beg the question as to why such systems continue to be funded, developed and deployed despite the many published ethical AI principles. This paper focusses on the funding processes for AI research grants which we have identified as a gap in the current range of ethical AI solutions such as AI procurement guidelines, AI impact assessments and (...)
  49. ChatGPT: towards AI subjectivity. Kristian D’Amato - 2024 - AI and Society 39:1-15. [3 citations]
    Motivated by the question of responsible AI and value alignment, I seek to offer a uniquely Foucauldian reconstruction of the problem as the emergence of an ethical subject in a disciplinary setting. This reconstruction contrasts with the strictly human-oriented programme typical to current scholarship that often views technology in instrumental terms. With this in mind, I problematise the concept of a technological subjectivity through an exploration of various aspects of ChatGPT in light of Foucault’s work, arguing that current systems (...)
  50. Variable Value Alignment by Design; averting risks with robot religion. Jeffrey White - forthcoming - Embodied Intelligence 2023.
    One approach to alignment with human values in AI and robotics is to engineer artificial systems isomorphic with human beings. The idea is that robots so designed may autonomously align with human values through similar developmental processes, to realize project ideal conditions through iterative interaction with social and object environments just as humans do, such as are expressed in narratives and life stories. One persistent problem with human value orientation is that different human beings champion different values as (...)
Showing 1–50 of 973