Results for 'reinforcement learning, profit sharing, continuous state spaces'

963 found
  1. Extension of Rational Policy Making Algorithms to Continuous-Valued Input. 木村 元 & 宮崎 和光 - 2007 - Transactions of the Japanese Society for Artificial Intelligence 22 (3):332-341.
    Reinforcement learning is a kind of machine learning. Profit Sharing, the Rational Policy Making (RPM) algorithm, the Penalty Avoiding Rational Policy Making algorithm, and PS-r* are known to guarantee rationality in a typical class of Partially Observable Markov Decision Processes. However, they cannot handle continuous state spaces. In this paper, we present a way to adapt them to continuous state spaces. We give RPM a mechanism to treat continuous state (...)
  2. Profit Sharing Incorporating a Detection Method for Imperfect Perception. Masuda Shiro & Saito Ken - 2004 - Transactions of the Japanese Society for Artificial Intelligence 19:379-388.
    To apply reinforcement learning to difficult classes of problems such as real-environment learning, we need a method robust to the perceptual aliasing problem. Exploitation-oriented methods such as Profit Sharing can deal with perceptual aliasing to a certain extent. However, when the agent needs to select different actions for the same sensory input, learning efficiency worsens. To overcome this problem, several state-partition methods using history information of state-action pairs have been proposed. These methods try (...)
    1 citation
  3. A Profit Sharing Method That Does Not Cling to Experience. Ueno Atsushi & Uemura Wataru - 2006 - Transactions of the Japanese Society for Artificial Intelligence 21:81-93.
    Profit Sharing is one of the reinforcement learning methods. An agent, as a learner, selects an action according to its state-action values and receives a reward when it reaches a goal state. It then distributes the received reward over the state-action values. This paper discusses how to set the initial value of a state-action value. The distribution function f(x) is called the reinforcement function. In Profit Sharing, an agent learns a policy by distributing (...)
  4. A Study of Reinforcement Functions in the Profit Sharing Method. Tatsumi Shoji & Uemura Wataru - 2004 - Transactions of the Japanese Society for Artificial Intelligence 19:197-203.
    In this paper, we consider Profit Sharing, which is one of the reinforcement learning methods. An agent learns a candidate solution of a problem from the reward that it receives from the environment if and only if it reaches the destination state. The function that distributes the received reward to each action of the candidate solution is called the reinforcement function. With this learning system, the agent can reinforce the set of selected actions when it gets (...)
    1 citation
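The Profit Sharing entries above all turn on the reinforcement function f(x), which distributes the goal reward back over the episode's state-action pairs. A minimal tabular sketch, assuming a geometrically decaying f(x) = R * decay^x; the decay choice and state names are illustrative, not the papers' exact formulation:

```python
# Minimal Profit Sharing sketch (illustrative): on reaching the goal, the
# reward R is distributed backwards over the episode's state-action pairs by
# a decaying reinforcement function f(x) = R * decay**x, where x counts
# steps back from the goal.

def profit_sharing_update(q, episode, reward, decay=0.5):
    """q: dict mapping (state, action) -> value; episode ends at the goal."""
    for x, (state, action) in enumerate(reversed(episode)):
        q[(state, action)] = q.get((state, action), 0.0) + reward * decay ** x
    return q

q = profit_sharing_update({}, [("s0", "right"), ("s1", "up")], reward=1.0)
# The pair chosen just before the goal is credited most strongly.
```

Because whole successful episodes are reinforced without bootstrapping from successor values, a decreasing f of this general shape is what the papers analyze when asking which reinforcement functions keep the learned policy rational.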
  5. When, What, and How Much to Reward in Reinforcement Learning-Based Models of Cognition. Christian P. Janssen & Wayne D. Gray - 2012 - Cognitive Science 36 (2):333-358.
    Reinforcement learning approaches to cognitive modeling represent task acquisition as learning to choose the sequence of steps that accomplishes the task while maximizing a reward. However, an apparently unrecognized problem for modelers is choosing when, what, and how much to reward; that is, when (the moment: end of trial, subtask, or some other interval of task performance), what (the objective function: e.g., performance time or performance accuracy), and how much (the magnitude: with binary, categorical, or continuous values). In (...)
    6 citations
  6. A Reinforcement Learning Method Using a Temperature Distribution Based on Likelihood Information. 鈴木 健嗣 & 小堀 訓成 - 2005 - Transactions of the Japanese Society for Artificial Intelligence 20:297-305.
    In existing reinforcement learning, it is difficult and time-consuming to find appropriate meta-parameters such as the learning rate, eligibility traces, and the temperature for exploration; in particular, on complicated and large-scale problems, delayed rewards often occur and make the problem hard to solve. In this paper, we propose a novel method that introduces a temperature distribution for reinforcement learning. In addition to the acquisition of a policy based on profit sharing, the temperature is given to (...)
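The exploration temperature discussed in this entry is the knob in standard softmax (Boltzmann) action selection. A minimal sketch of that generic mechanism only; the paper's likelihood-based temperature distribution is not reproduced here, and the values and temperature are illustrative:

```python
import math
import random

def boltzmann_action(values, temperature):
    """Softmax (Boltzmann) selection: high temperature -> near-uniform
    exploration; low temperature -> nearly greedy exploitation."""
    m = max(values)  # subtract the max before exp() for numerical stability
    exps = [math.exp((v - m) / temperature) for v in values]
    r = random.random() * sum(exps)
    for action, e in enumerate(exps):
        r -= e
        if r <= 0:
            return action
    return len(values) - 1

random.seed(0)
# At a low temperature the higher-valued action dominates the draws:
picks = [boltzmann_action([1.0, 0.2], temperature=0.05) for _ in range(100)]
```

Raising the temperature toward infinity flattens the distribution toward uniform random action, which is why per-state temperature control of the kind the paper proposes can focus exploration where the agent is uncertain.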
  7. Introducing a Hierarchical Decision-Making Method into Reinforcement Learning Agents: The Pursuit Problem as an Example. 輿石 尚宏 & 片山 謙吾 - 2004 - Transactions of the Japanese Society for Artificial Intelligence 19:279-291.
    Reinforcement Learning (RL) is a promising technique for creating agents that can be applied to real-world problems. The most important features of RL are trial-and-error search and delayed reward. Thus, agents act randomly in the early stage of learning. However, such random actions are impractical for real-world problems. This paper presents a novel model of RL agents. A feature of our learning agent model is that it integrates the Analytic Hierarchy Process into the standard RL agent model, which consists of (...)
  8. Extending Profit Sharing to Environments with Imperfect Perception: Proposal and Evaluation of PS-r*. Kobayashi Shigenobu & Miyazaki Kazuteru - 2003 - Transactions of the Japanese Society for Artificial Intelligence 18:286-296.
    2 citations
  9. A lineage explanation of human normative guidance: the coadaptive model of instrumental rationality and shared intentionality. Ivan Gonzalez-Cabrera - 2022 - Synthese 200 (6):1-32.
    This paper aims to contribute to the existing literature on normative cognition by providing a lineage explanation of human social norm psychology. This approach builds upon theories of goal-directed behavioral control in the reinforcement learning and control literature, arguing that this form of control defines an important class of intentional normative mental states that are instrumental in nature. I defend the view that great ape capacities for instrumental reasoning and our capacity (or family of capacities) for shared intentionality coadapted (...)
    1 citation
  10. Multi-Agent Reinforcement Learning: Weighting and Partitioning. Ron Sun & Todd Peterson - unknown
    This paper addresses weighting and partitioning in complex reinforcement learning tasks, with the aim of facilitating learning. The paper presents some ideas regarding the weighting of multiple agents and extends them into partitioning an input/state space into multiple regions with differential weighting in these regions, to exploit the differential characteristics of regions and of agents, to reduce the learning complexity of agents (and their function approximators), and thus to facilitate learning overall. It analyzes, in (...)
     
    6 citations
  11. Deep Reinforcement Learning for UAV Intelligent Mission Planning. Longfei Yue, Rennong Yang, Ying Zhang, Lixin Yu & Zhuangzhuang Wang - 2022 - Complexity 2022:1-13.
    Rapid and precise air-operation mission planning is a key technology for the autonomous combat of unmanned aerial vehicles (UAVs). In this paper, an end-to-end UAV intelligent mission planning method based on deep reinforcement learning is proposed to address the shortcomings of traditional intelligent optimization algorithms, such as reliance on simple, static, low-dimensional scenarios and poor scalability. Specifically, the suppression of enemy air defenses mission planning is described as a sequential decision-making problem and formalized as a Markov decision process. (...)
  12. 601 Books on Space. Francisco Caruso - 2012 - Maluhy & Co.
    Space is one of the most fundamental concepts over which scientific knowledge has been constructed. But it is also true that concepts of space extend far beyond the scientific domain and permeate many other branches of human knowledge. Those are fascinating aspects that could per se justify the compilation of a long bibliography. Another one is the passion for books. My interest in some physical, historical and philosophical problems concerning the concept of space in Physics, and its properties, can be (...)
  13. Decolonization Projects. Cornelius Ewuoso - 2023 - Voices in Bioethics 9.
    ABSTRACT Decolonization is complex, vast, and the subject of an ongoing academic debate. While the many efforts to decolonize or dismantle the vestiges of colonialism that remain are laudable, they can also reinforce what they seek to end. For decolonization to be impactful, it must be done with epistemic and cultural humility, requiring decolonial scholars, project leaders, and well-meaning people to be more sensitive to those impacted by colonization and not regularly included in the discourse. (...)
  14. Teaching & Learning Guide for: Belief‐Desire Explanation. Nikolaj Nottelmann - 2012 - Philosophy Compass 7 (1):71-73.
    This guide accompanies the following article: Nikolaj Nottelmann, 'Belief‐Desire Explanation'. Philosophy Compass Vol/Iss : 1–10. doi: 10.1111/j.1747‐9991.2011.00446.x. Author's Introduction: "Belief‐desire explanation" is short‐hand for a type of action explanation that appeals to a set of the agent's mental states consisting of 1. her desire to ψ and 2. her belief that, were she to φ, she would promote her ψ‐ing. Here, to ψ could be to eat an ice cream, and to φ could be to walk to the ice cream vendor. Adherents (...)
  15. Gandhi's Hope: Learning from Other Religions as a Path to Peace (review). Christopher Chapple - 2006 - Buddhist-Christian Studies 26 (1):237-240.
    In lieu of an abstract, here is a brief excerpt of the content: Reviewed by Christopher Key Chapple. Gandhi's Hope: Learning from Other Religions as a Path to Peace. By Jay McDaniel. Maryknoll, NY: Orbis Books, 2005. 134 + viii pp. This book by prominent Protestant theologian Jay McDaniel suggests that Mahatma Gandhi challenged the modern world by publicly revealing that which he learned from other faith traditions and advocating this path as a way (...)
  16. Resilience Analysis of Urban Road Networks Based on Adaptive Signal Controls: Day-to-Day Traffic Dynamics with Deep Reinforcement Learning. Wen-Long Shang, Yanyan Chen, Xingang Li & Washington Y. Ochieng - 2020 - Complexity 2020:1-19.
    Improving the resilience of urban road networks suffering from various disruptions has been a central focus of urban emergency management. However, to date, effective methods that can mitigate the negative impacts of disruptions, such as road accidents and natural disasters, on urban road networks remain highly insufficient. This study proposes a novel adaptive signal control strategy based on a doubly dynamic learning framework, which consists of deep reinforcement learning and day-to-day traffic dynamic learning, to improve the (...)
    2 citations
  17. Automatic Partitioning for Multi-Agent Reinforcement Learning. Ron Sun - unknown
    This paper addresses automatic partitioning in complex reinforcement learning tasks with multiple agents, without a priori domain knowledge regarding task structures. Partitioning a state/input space into multiple regions helps to exploit the differential characteristics of regions and of agents, thus facilitating learning and reducing the complexity of agents, especially when function approximators are used. We develop a method for optimizing the partitioning of the space through experience without the use of a priori domain knowledge. (...)
     
  18. Online Optimal Control of Robotic Systems with Single Critic NN-Based Reinforcement Learning. Xiaoyi Long, Zheng He & Zhongyuan Wang - 2021 - Complexity 2021:1-7.
    This paper suggests an online solution for the optimal tracking control of robotic systems based on a single-critic neural network (NN)-based reinforcement learning method. To this end, we rewrite the robotic system model in state-space form, which facilitates the synthesis of optimal tracking control. To maintain the tracking response, a steady-state control is designed, and an adaptive optimal tracking control is then used to ensure that the tracking error achieves convergence in an (...)
  19. Solving a Joint Pricing and Inventory Control Problem for Perishables via Deep Reinforcement Learning. Rui Wang, Xianghua Gan, Qing Li & Xiao Yan - 2021 - Complexity 2021:1-17.
    We study a joint pricing and inventory control problem for perishables with positive lead time in a finite-horizon periodic-review system. Unlike most studies, which consider a continuous density function of demand, in our paper the customer demand depends on the price of the current period and arrives according to a homogeneous Poisson process. We consider both backlogging and lost-sales cases, and our goal is to find a simultaneous ordering and pricing policy that maximizes the expected discounted profit over the (...)
  20. How Statistical Learning Can Play Well with Universal Grammar. Lisa S. Pearl - 2021 - In Nicholas Allott, Terje Lohndal & Georges Rey (eds.), A Companion to Chomsky. Wiley. pp. 267–286.
    A key motivation for Universal Grammar (UG) is developmental: UG can help children acquire the linguistic knowledge that they do as quickly as they do from the data that's available to them. Some of the most fruitful recent work in language acquisition has combined ideas about different hypothesis space building blocks with domain‐general statistical learning. Statistical learning can then provide a way to help navigate the hypothesis space in order to converge on the correct hypothesis. Reinforcement learning is a (...)
  21. Numerical simulations of the Lewis signaling game: Learning strategies, pooling equilibria, and the evolution of grammar. Jeffrey A. Barrett - unknown
    David Lewis (1969) introduced sender-receiver games as a way of investigating how meaningful language might evolve from initially random signals. In this report I investigate the conditions under which Lewis signaling games evolve to perfect signaling systems under various learning dynamics. While the 2-state/2-term Lewis signaling game with basic urn learning always approaches a signaling system, I will show that with more than two states suboptimal pooling equilibria can evolve. Inhomogeneous state distributions increase the likelihood of pooling (...)
     
    30 citations
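The basic urn learning Barrett refers to can be sketched for the 2-state/2-term game: sender and receiver each keep urns of weights and add a "ball" to whatever they just did whenever communication succeeds. The round count, seed, and unit reinforcement below are illustrative assumptions, not the report's exact settings:

```python
import random

random.seed(1)
N = 2  # number of states = signals = acts (the 2-state/2-term game)

sender = [[1.0] * N for _ in range(N)]    # sender[state][signal] urn weights
receiver = [[1.0] * N for _ in range(N)]  # receiver[signal][act] urn weights

def draw(weights):
    """Draw an index with probability proportional to its weight."""
    r = random.random() * sum(weights)
    for i, w in enumerate(weights):
        r -= w
        if r <= 0:
            return i
    return len(weights) - 1

for _ in range(20000):
    state = random.randrange(N)
    signal = draw(sender[state])
    act = draw(receiver[signal])
    if act == state:  # success: reinforce the choices that produced it
        sender[state][signal] += 1.0
        receiver[signal][act] += 1.0

# Measure coordination after learning:
trials = 1000
correct = 0
for _ in range(trials):
    s = random.randrange(N)
    correct += draw(receiver[draw(sender[s])]) == s
```

In the 2-state case this rich-get-richer dynamic drives play toward one of the two signaling systems, which is the baseline against which the report's pooling-equilibrium results for more states stand out.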
  22. Locating Values in the Space of Possibilities. Sara Aronowitz - forthcoming - Philosophy of Science.
    Where do values live in thought? A straightforward answer is that we (or our brains) make decisions using explicit value representations which are our values. Recent work applying reinforcement learning to decision-making and planning suggests that more specifically, we may represent both the instrumental expected value of actions as well as the intrinsic reward of outcomes. In this paper, I argue that identifying value with either of these representations is incomplete. For agents such as humans and other animals, there (...)
  23. Reinforcement Learning by a GA Using Importance Sampling. Kimura Hajime & Tsuchiya Chikao - 2005 - Transactions of the Japanese Society for Artificial Intelligence 20:1-10.
    Reinforcement Learning (RL) handles policy search problems: searching for a mapping from state space to action space. However, RL is based on gradient methods and as such cannot deal with problems with multimodal landscapes. In contrast, although the Genetic Algorithm (GA) is promising for dealing with them, it seems unsuitable for policy search problems from the viewpoint of evaluation cost. Minimal Generation Gap (MGG), used as a generation-alternation model in GA, generates many offspring from two or (...)
  24. On Love and Poetry—Or, Where Philosophers Fear to Tread. Jeremy Fernando - 2011 - Continent 1 (1):27-32.
    continent. 1.1 (2011): 27-32. "My"—what does this word designate? Not what belongs to me, but what I belong to, what contains my whole being, which is mine insofar as I belong to it. Søren Kierkegaard. The Seducer's Diary. I can't sleep till I devour you / And I'll love you, if you let me… Marilyn Manson "Devour" The role of poetry in the relationalities between people has a long history—from epic poetry recounting tales of yore; to emotive lyric poetry; to (...)
     
  25. Q-Learning with Dynamic Generation of the Search Space by a GA. Matsuno Fumitoshi & Ito Kazuyuki - 2001 - Transactions of the Japanese Society for Artificial Intelligence 16:510-520.
    Reinforcement learning has recently received much attention as a learning method for complicated systems such as robot systems. It needs no prior knowledge and has a high capability for reactive and adaptive behavior. However, an increase in the dimensionality of the action-state space makes learning difficult to accomplish. Existing reinforcement learning algorithms are effective only for simple tasks with relatively small action-state spaces. In this paper, we propose a new reinforcement learning algorithm: "Q-learning with (...)
  26. Acquisition of Walking Motion by a Multi-Legged Robot Using QDSEGA. Matsuno Fumitoshi & Ito Kazuyuki - 2002 - Transactions of the Japanese Society for Artificial Intelligence 17:363-372.
    Reinforcement learning is very effective for robot learning because it needs no a priori knowledge and has a high capability for reactive and adaptive behavior. In our previous work, we proposed a new reinforcement learning algorithm, "Q-learning with Dynamic Structuring of Exploration Space Based on Genetic Algorithm (QDSEGA)". It is designed for complicated systems with large action-state spaces, such as a robot with many redundant degrees of freedom. We applied it to a 50-link manipulator, and effective behavior was acquired. (...)
  27. Supervised, Unsupervised and Reinforcement Learning-Face Recognition Using Null Space-Based Local Discriminant Embedding. Yanmin Niu & Xuchu Wang - 2006 - In O. Stock & M. Schaerf (eds.), Lecture Notes In Computer Science. Springer Verlag. pp. 4114--245.
     
  28. Surface Strategies And Constructive Line-Preferential Planes, Contour, Phenomenal Body In The Work Of Bacon, Chalayan, Kawakubo. Dagmar Reinhardt - 2005 - Colloquy 9:49-70.
    The paper investigates Maurice Merleau-Ponty's discussion of body and space and Gilles Deleuze's reading of Francis Bacon's work, in order to derive a renegotiated interrelation between habitual body, phenomenal space, preferential plane and constructive line. The resulting system is applied as a filter to understand the sartorial fashion of Rei Kawakubo and Hussein Chalayan and their potential as a spatial prosthesis: the operative third skin. If the evolutionary nature of culture demands a constant change, how does the surface of (...)
  29. Reinforcement learning with raw image pixels as state input. D. Ernst, R. Marée & L. Wehenkel - 2006 - In O. Stock & M. Schaerf (eds.), Lecture Notes In Computer Science. Springer Verlag. pp. 4153.
  30. Algorithmic sovereignty: Machine learning, ground truth, and the state of exception. Matthew Martin - forthcoming - Philosophy and Social Criticism.
    This article examines the interplay between contemporary algorithmic security technology and the political theory of the state of exception. I argue that the exception, as both a political and a technological concept, provides a crucial way to understand the power operating through machine learning technologies used in the security apparatuses of the modern state. I highlight how algorithmic security technology, through its inherent technical properties, carries exceptions throughout its political and technological architecture. This leads me to engage with (...)
  31. The Juggling Act. Samantha René Merriwether - 2013 - Narrative Inquiry in Bioethics 3 (3):205-207.
    In lieu of an abstract, here is a brief excerpt of the content: The Juggling Act, by Samantha René Merriwether. Depressed. Anxious. Insomniac. Learning Disabled. Physically impaired. Sufferer of Post–Traumatic Stress Disorder. Would you choose any of these labels? How about taking two or three? Sound manageable? Probably not. But why? All across our society are plastered expectations of perfection, normalcy and "acceptable" images. I am 27 years old and, despite the years of education I have received, the communication skills I have gained in English and American (...)
  32. The New Mizrahi Narrative in Israel. Arie Kizel - 2014 - Resling.
    The trend to centralization of the Mizrahi narrative has become an integral part of the nationalistic, ethnic, religious, and ideological-political dimensions of the emerging, complex Israeli identity. This trend includes several forms of opposition: strong opposition to "melting pot" policies and their ideological leaders; opposition to the view that ethnicity is a dimension of the tension and schisms that threaten Israeli society; and, direct repulsion of attempts to silence and to dismiss Mizrahim and so marginalize them hegemonically. The Mizrahi Democratic (...)
    1 citation
  33. The Opinion of Teachers of Religious Culture and Ethics Course About Subject-Based Classroom Application. Şefika Mutlu - 2019 - Cumhuriyet İlahiyat Dergisi 23 (3):1209-1234.
    This study aims to determine in depth the opinions of teachers of the Religious Culture and Ethics Course (DKAB) about subject-based classroom application. The research was carried out with qualitative research methods using a case study design. To determine the study group, criterion sampling was used in the first stage and maximum diversity sampling in the next. The sample of this research consists of 8 DKAB teachers working in Ankara province. A semi-structured (...)
  34. The Encyclopedia of Philosophy of Religion. Stewart Goetz & Charles Taliaferro (eds.) - 2021 - Hoboken, NJ: Wiley-Blackwell.
    Why an encyclopedia of the philosophy of religion? Because human beings have been and continue to be religious. Indeed, if one thinks in terms of what it is to be human, what is the essence of a human being, one can reasonably hold that it includes the property of trying to make sense of things and events, and religion, in terms of both belief and practice, is a way of doing this. A religious response to this attempt at sense-making in (...)
  35. A Continuous Act. Nico Jenkins - 2012 - Continent 2 (4):248-250.
    In this issue we include contributions from the individuals presiding at the panel All in a Jurnal's Work: A BABEL Wayzgoose, convened at the second Biennial Meeting of the BABEL Working Group. Sadly, the contributions of Daniel Remein, chief rogue at the Organism for Poetic Research as well as editor at Whiskey & Fox , were not able to appear in this version of the proceedings. From the program : 2ND BIENNUAL MEETING OF THE BABEL WORKING GROUP CONFERENCE “CRUISING IN (...)
     
  36. An Efficient Exploration Method for MDP Environments Using the k-Certainty Exploration Method and Dynamic Programming. Kawada Seiichi & Tateyama Takeshi - 2001 - Transactions of the Japanese Society for Artificial Intelligence 16:11-19.
    One of the most common problems in reinforcement learning systems (e.g., Q-learning) is reducing the number of trials needed to converge to an optimal policy. The k-certainty exploration method was proposed as one solution; Miyazaki reported that it can determine an optimal policy faster than Q-learning in Markov decision processes (MDPs), and it is a very efficient learning method. Here we propose an improvement that makes it more efficient still. In the k-certainty exploration method, in case there (...)
  37. An Approach to Subjective Computing: a Robot that Learns from Interaction with Humans. Patrick Grüneberg & Kenji Suzuki - 2014 - IEEE Transactions on Autonomous Mental Development 6 (1):5-18.
    We present an approach to subjective computing for the design of future robots that exhibit more adaptive and flexible behavior in terms of subjective intelligence. Instead of encapsulating subjectivity into higher-order states, we show by means of a relational approach how subjective intelligence can be implemented in terms of the reciprocity of autonomous self-referentiality and direct world-coupling. Subjectivity concerns the relational arrangement of an agent's cognitive space. This theoretical concept is narrowed down to the problem of coaching a (...) learning agent by means of binary feedback. Algorithms are presented that implement subjective computing. The relational characteristic of subjectivity is further confirmed by a questionnaire on human perception of the robot's behavior. The results imply that subjective intelligence cannot be externally observed. In sum, we conclude that subjective intelligence in relational terms is fully tractable and therefore implementable in artificial agents.
     
    1 citation
  38. Message to Buddhists for the Feast of Vesakh 2007. Paul Poupard & Pier Luigi Celata - 2007 - Buddhist-Christian Studies 27 (1):131-132.
    In lieu of an abstract, here is a brief excerpt of the content: Message to Buddhists for the Feast of Vesakh 2007: Christians and Buddhists: Educating Communities to Live in Harmony and Peace. Paul Cardinal Poupard, President, and Archbishop Pier Luigi Celata, Secretary. Dear Buddhist Friends, 1. On the occasion of the festival of Vesakh, I am writing to Buddhist communities in different parts of the world to convey my own good wishes, as well as those of the Pontifical Council for Interreligious Dialogue. 2. We, Catholics (...)
  39. Integrating reinforcement learning, bidding and genetic algorithms. Ron Sun - unknown
    This paper presents a GA-based multi-agent reinforcement learning bidding approach (GMARLB) for performing multi-agent reinforcement learning. GMARLB integrates reinforcement learning, bidding and genetic algorithms. The general idea of our multi-agent systems is as follows: There are a number of individual agents in a team; each agent of the team has two modules: Q module and CQ module. Each agent can select actions to be performed at each step, which are done by the Q module. (...)
     
  40. Home Thoughts from Abroad: Derrida, Austin, and the Oxford Connection. Christopher Norris - 1986 - Philosophy and Literature 10 (1):1-25.
    In lieu of an abstract, here is a brief excerpt of the content: Christopher Norris, "Home Thoughts from Abroad: Derrida, Austin, and the Oxford Connection". There is no philosophical school or tradition that does not carry along with it a background narrative linking up present and past concerns. Most often this selective prehistory entails not only an approving account of ideas that fit in with the current picture but also an effort to repress or marginalize anything that fails so to fit. (...)
  41. Planetary Passport: Re-presentation, Accountability and Re-Generation. Janet McIntyre-Mills - 2017 - Cham: Springer.
    This book explores the implications of knowing our place in the universe and recognising our hybridity. It is a series of self-reflections and essays drawing on many diverse ways of knowing. The book examines the complex ethical challenges of closing the wide gap in living standards between rich and poor people/communities. The notion of an ecological citizen is presented with a focus on protecting current and future generations. The idea is to track the distribution and redistribution of resources in the (...)
  42. Network formation by reinforcement learning: The long and medium run. Brian Skyrms - unknown
    We investigate a simple stochastic model of social network formation by the process of reinforcement learning with discounting of the past. In the limit, for any value of the discounting parameter, small, stable cliques are formed. However, the time it takes to reach the limiting state in which cliques have formed is very sensitive to the discounting parameter. Depending on this value, the limiting result may or may not be a good predictor for realistic observation times.
    21 citations
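The model just described can be sketched as follows: each agent keeps visiting weights over the other agents, discounts all weights every round, and reinforces whoever it visited. The group size, round count, unit payoff, and discount value below are illustrative assumptions, not the paper's parameters:

```python
import random

random.seed(0)
AGENTS, ROUNDS, DISCOUNT = 6, 5000, 0.9

# weights[i][j]: agent i's propensity to visit agent j (no self-visits)
weights = [[0.0 if i == j else 1.0 for j in range(AGENTS)]
           for i in range(AGENTS)]

def visit(w):
    """Pick a partner with probability proportional to weight."""
    r = random.random() * sum(w)
    for j, x in enumerate(w):
        r -= x
        if r <= 0:
            return j
    return len(w) - 1

for _ in range(ROUNDS):
    for i in range(AGENTS):
        j = visit(weights[i])
        # discount the past, then reinforce the visited partner with payoff 1
        weights[i] = [DISCOUNT * x for x in weights[i]]
        weights[i][j] += 1.0

# Each agent's total weight approaches 1/(1 - DISCOUNT) = 10, and with
# discounting the visiting probabilities tend to concentrate on few partners.
```

The discounting is what bounds each agent's total weight (the geometric sum 1/(1-d)), so recent visits dominate; this is the mechanism behind the paper's observation that small, stable cliques form, with convergence time sensitive to the discount parameter.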
  43. A Revolutionary New Metaphysics, Based on Consciousness, and a Call to All Philosophers. Lorna Green - manuscript
    June 2022 A Revolutionary New Metaphysics, Based on Consciousness, and a Call to All Philosophers We are in a unique moment of our history unlike any previous moment ever. Virtually all human economies are based on the destruction of the Earth, and we are now at a place in our history where we can foresee if we continue on as we are, our own extinction. As I write, the planet is in deep trouble, heat, fires, great storms, and record flooding, (...)
  44. Reinforcement Learning and Counterfactual Reasoning Explain Adaptive Behavior in a Changing Environment. Yunfeng Zhang, Jaehyon Paik & Peter Pirolli - 2015 - Topics in Cognitive Science 7 (2):368-381.
    Animals routinely adapt to changes in the environment in order to survive. Though reinforcement learning may play a role in such adaptation, it is not clear that it is the only mechanism involved, as it is not well suited to producing rapid, relatively immediate changes in strategies in response to environmental changes. This research proposes that counterfactual reasoning might be an additional mechanism that facilitates change detection. An experiment is conducted in which a task state changes over time (...)
  45. State space search nogood learning: Online refinement of critical-path dead-end detectors in planning. Marcel Steinmetz & Jörg Hoffmann - 2017 - Artificial Intelligence 245 (C):1-37.
  46. An Experimental Study of Reward Design in Multiagent Continuing Tasks: The RoboCup Soccer Keepaway Task as an Example. Tanaka Nobuyuki & Arai Sachiyo - 2006 - Transactions of the Japanese Society for Artificial Intelligence 21 (6):537-546.
    In this paper, we discuss guidelines for the reward design problem, which defines when and what amount of reward should be given to the agent(s) within the context of the reinforcement learning approach. We take keepaway soccer as a standard task of the multiagent domain, one which requires skilled teamwork. The difficulties of designing a reward for this task are due to the following features: i) since it belongs to the class of continuing tasks, which have no explicit goal to (...)
  47. Evaluation of Runtime Search Reduction by a Cognitive-Distance-Learning Problem Solver and Analysis of the Learning Process. 宮本 裕司 & 山川 宏 - 2002 - Transactions of the Japanese Society for Artificial Intelligence 17:1-13.
    Our proposed cognitive-distance-learning problem solver generates sequences of actions from an initial state to goal states in a problem state space. The solver learns the cognitive distance of arbitrary combinations of two states; action generation at each state is the selection of the next state with minimum cognitive distance to the goal, like a Q-learning agent. In this paper, we first show by analytical simulation in a spherical (...) space that our proposed method reduces search cost compared with a conventional search method. Second, we show by computer simulation in a tile-world state space that the longer the prior learning period, and the more familiar the solver is with the environment, the more the average search cost is reduced. Third, we show by computer simulation that the proposed solver is superior to reinforcement learning techniques when the goal is changed. Fourth, we find that our simulation results are consistent with psychological experimental results.
  48. Counterfactual state explanations for reinforcement learning agents via generative deep learning. Matthew L. Olson, Roli Khanna, Lawrence Neal, Fuxin Li & Weng-Keen Wong - 2021 - Artificial Intelligence 295 (C):103455.
  49. Reinforcement learning for Golog programs with first-order state-abstraction. D. Beck & G. Lakemeyer - 2012 - Logic Journal of the IGPL 20 (5):909-942.
  50. Learning the Structure of Bayesian Networks: A Quantitative Assessment of the Effect of Different Algorithmic Schemes. Stefano Beretta, Mauro Castelli, Ivo Gonçalves, Roberto Henriques & Daniele Ramazzotti - 2018 - Complexity 2018:1-12.
    One of the most challenging tasks when adopting Bayesian networks is the one of learning their structure from data. This task is complicated by the huge search space of possible solutions and by the fact that the problem is NP-hard. Hence, a full enumeration of all the possible solutions is not always feasible and approximations are often required. However, to the best of our knowledge, a quantitative analysis of the performance and characteristics of the different heuristics to solve this problem has (...)
1 — 50 / 963