Extending Environments to Measure Self-Reflection in Reinforcement Learning

Journal of Artificial General Intelligence 13 (1) (2022)

Abstract

We consider an extended notion of reinforcement learning in which the environment can simulate the agent and base its outputs on the agent's hypothetical behavior. Since good performance usually requires paying attention to whatever things the environment's outputs are based on, we argue that for an agent to achieve on-average good performance across many such extended environments, it is necessary for the agent to self-reflect. Thus weighted-average performance over the space of all suitably well-behaved extended environments could be considered a way of measuring how self-reflective an agent is. We give examples of extended environments and introduce a simple transformation which experimentally seems to increase some standard RL agents' performance in a certain type of extended environment.
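
As a rough illustration of what "the environment can simulate the agent" might look like in code, the sketch below has an environment reward the agent only when its action matches what it would have done had every reward so far been zero. This is a minimal sketch under stated assumptions, not the paper's implementation: the `Step` and `Agent` types, the function names, the fixed observation `0`, and the +1/-1 reward scheme are all hypothetical choices made for this example, and agents are assumed to be deterministic functions of their history.

```python
from typing import Callable, List, Tuple

# Hypothetical types for this sketch: one past step is (reward, observation, action).
Step = Tuple[float, int, int]
# An agent maps (past steps, latest reward, latest observation) to an action.
Agent = Callable[[List[Step], float, int], int]


def run_ignore_rewards(agent: Agent, n_steps: int) -> float:
    """Run an agent in a toy extended environment and return its total reward.

    At each step the environment simulates the agent on a copy of the
    history in which every reward is replaced by 0. The real action earns
    +1 if it equals that hypothetical action, and -1 otherwise.
    """
    history: List[Step] = []
    reward, obs = 0.0, 0  # initial percept (assumed values)
    total = 0.0
    for _ in range(n_steps):
        action = agent(history, reward, obs)
        # The environment's simulation of the agent on the zero-reward history.
        zeroed = [(0.0, o, a) for (_, o, a) in history]
        hypothetical = agent(zeroed, 0.0, obs)
        history.append((reward, obs, action))
        reward = 1.0 if action == hypothetical else -1.0
        total += reward
    return total


def constant_agent(history: List[Step], reward: float, obs: int) -> int:
    """Ignores everything it sees and always takes action 0."""
    return 0


def reward_chasing_agent(history: List[Step], reward: float, obs: int) -> int:
    """Takes action 1 right after a positive reward, otherwise action 0."""
    return 1 if reward > 0 else 0


if __name__ == "__main__":
    print(run_ignore_rewards(constant_agent, 10))
    print(run_ignore_rewards(reward_chasing_agent, 10))
```

In this toy setting the agent whose behavior does not depend on rewards is never penalized, while the agent that reacts to its latest reward is penalized on every step where that reaction changes its action. An agent can only do well here by accounting for how its own hypothetical behavior feeds into the environment's outputs.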

Analytics

Added to PP
2021-10-13
Author's Profile

Samuel Allen Alexander
Ohio State University (PhD)