decision process | Pollux - Fachinformationsdienst Politikwissenschaft

1628 results

Open Access #1, 2006

Contribution to the resolution of decentralized Markov decision processes ; Contribution à la résolution des processus de décision markoviens décentralisés

The subject of this thesis is the optimal resolution of decentralized Markov decision processes (DEC-POMDPs). The DEC-POMDP model was introduced in 2000 and constitutes a formal framework for describing cooperative distributed decision problems under uncertainty. We present a first general overview of approaches to solving DEC-POMDPs optimally, including game theory, multi-agent planning and reinforcement learning. Our contributions constitute a theoretical approach for building optimal multi-agent systems. Approaches to solving DEC-POMDPs fall into two categories. If the underlying model of the system is known in advance, the optimal solution can be planned prior to execution in a centralized way. We introduce two new planning algorithms. The first is a point-based multi-agent dynamic programming approach, which constitutes a synthesis of classical multi-agent dynamic programming and point-based dynamic programming for single-agent POMDPs. Our approach is hence able to concentrate the computational effort on the relevant regions of the policy space. The second approach is an entirely new way of applying heuristic search techniques, such as A*, to decentralized decision problems. We introduce multi-agent A* (MAA*), the first heuristic search algorithm for solving DEC-POMDPs, over both finite and infinite horizons. If the underlying model is not known, an optimal policy can be obtained by a trial-and-error approach based on reinforcement learning methods. We analyse the additional constraints in multi-agent learning vs. planning, before introducing a new multi-agent reinforcement learning algorithm based on mutual notifications of changes in the value function. ; In this thesis we address the optimal resolution of decentralized Markov decision processes (DEC-POMDPs). The DEC-POMDP model constitutes a theoretical formalism for describing distributed, cooperative decision-making problems, and this thesis is one of the first to propose exact policy-search algorithms ...

Access (Open Access)

BASE

Open Access #3, 2015

Solving F^3MDPs: Markov decision processes with factored transitions, rewards and stochastics policies ; Résolution de PDMF^3 : processus décisionnels de Markov à transitions, récompenses et politiques stochastiques factorisées

Radoszycki, Julia; Peyrard, Nathalie; Sabbadin, Regis

National audience ; Markov Decision Processes with factored state and action spaces, usually referred to as FA-FMDPs, provide a rich framework to model problems of sequential decision under uncertainty, where both the state and action spaces are of high dimension and highly structured, as in robotics, conservation biology or disease management domains. However, even dedicated solution algorithms (exact or approximate) do not apply when the dimensions of the state and action spaces both exceed 20-30, except under strong assumptions about state transitions or value function. In this paper we introduce the F^3 MDP framework and associated approximate solution algorithms which can tackle much larger problems. An F^3 MDP is an FA-FMDP whose reward function is additively factored and solution policies are constrained to be factored and can be stochastic. The proposed algorithms belong to the family of Policy Iteration (PI) algorithms and exploit continuous optimization tools. We validate them on extensive experiments. On small problems, where the optimal policy is available, they provide policies close to optimal. On larger problems belonging to the subclass of GMDPs they compete well with state-of-the-art resolution algorithms in terms of quality. Finally, we show that our algorithms can tackle very large F^3 MDPs. Indeed, they can solve problems of disease management in crop fields with state and action spaces of size 2^100.

Access (Open Access)

BASE

Open Access #4, 2017

Rationaliser le politique : comment les décideurs incorporent la prépondérance des parties prenantes dans leur processus décisionnel en gouvernance des technologies de l'information ; Rationalizing politics : how decision makers incorporate stakeholders' salience in their decision process pertaining...

In: http://hdl.handle.net/11143/11402

Messabia, Nabil

The thesis aims to gain a better understanding of the process and dynamics through which decision makers consider stakeholder salience in their decision making in information technology governance (ITG). A grounded theory of political rationalization, named the theory of mass dynamics, is produced by means of a constructivist approach informed by Charmaz. The data are provided by 33 ITG practitioners through intensive interviews and charts generated with a web tool specially developed for the thesis. The theory of mass dynamics is an original contribution to the field of decision making in IT governance. It explains the circumstances in which politics acts as a lever for the successful implementation of decisions and those in which politics plays a devastating role. ; Abstract: This thesis adopts the stakeholder paradigm to investigate the decision-making process within the context of Information Technology Governance (ITG) as an integral part of enterprise governance. The Theory of Mass Dynamics (TMD) is produced in this thesis by means of a constructivist approach to grounded theory as informed by Charmaz. Data are provided by 33 ITG practitioners via intensive interviews and elicited charts generated using a web tool specially developed for the purposes of this research. The TMD suggests a three-level analysis framework describing how stakeholder salience interferes with the ITG decision-making process. First, at the organization-wide level, stakeholders are socially perceived as having unequal weight in terms of power. When an IT-related draft decision gets on the agenda of an enterprise governing body, stakeholders divide into two groups, change masses and status quo masses, which engage in dynamics that end up tipping the scale in favor of either adoption or rejection of the decision. The TMD rule suggests that the fate of the draft decision will be determined by the side on which the most powerful stakeholders eventually stand on the change/status quo scale. Second, at the governing body level, both change and status quo masses are directly or indirectly represented by members of the decision-making group. Bearing in mind the TMD rule, those stakeholders' representatives engage in a negotiation process aimed at moving the group from conflict towards compromise. Depending on whether conflict is motivated by rational, non-rational or mixed considerations, actors will search for compromise through boardroom negotiations, through corridor negotiations or through a mix of both. Finally, at the individual level of analysis, a decision maker involved in this process would engage in an attitude of political rationalization, through which he or she thinks through and sets out negotiation arguments bearing in mind the stakeholders' power. At this individual level of analysis, the TMD suggests a four-shade spectrum of political rationalization, described respectively as follows: rationalizing politics as an unavoidable evil, rationalizing mitigated intentions, rationalizing bad practices and rationalizing potentially fraudulent practices. Thanks to such non-technical knowledge, the TMD may contribute to filling the gap between ITG and boards of directors.

Access (Open Access)

BASE

Open Access #5, 2018

Markov Decision Processes for adjustable autonomy and heterogeneous interaction between autonomous and piloted robots ; Processus Décisionnels de Markov pour l'autonomie ajustable et l'interaction hétérogène entre engins autonomes et pilotés

Lelerre, Mathieu

Robots will be used more and more in both civil and military fields. These robots, operating in fleets, can accompany soldiers in combat, or accomplish a mission while being supervised by a control center. Given the requirements of a military operation, it is difficult to let robots decide on their actions without an operator's agreement or supervision, depending on the situation. In this thesis, we focus on two problems. First, we try to exploit adjustable autonomy to make a robot accomplish its mission as efficiently as possible while respecting restrictions on its autonomy level assigned by an operator. To this end, a restriction level can be defined for given sets of states and actions. This restriction can, for example, require the robot to be tele-operated in order to access a dangerous zone. Second, we consider that several robots can be deployed at the same time. These robots have to coordinate to accomplish their objectives. However, since operators can take control of some robots, coordination becomes harder. Indeed, the operator has preferences, perceptions, hesitation and stress that are not modeled by the agent. It is then hard to estimate the operator's next actions, and thus to coordinate with him. We propose in this thesis an approach that uses learning methods to estimate the policy executed by a tele-operated robot, based on the actions observed from this robot. The notion of planning is important in this work, which is based on planning models such as Markov Decision Processes. ; Robots will be used more and more in civil fields as well as in the military field. These robots, operating in fleets, can accompany soldiers in combat, or accomplish a mission while being supervised by a control station. Because of the requirements of a military operation, it is difficult to let robots decide on their actions without an operator's agreement or supervision, depending on the situation. In this thesis, we dwell on two problems: on the one hand, we seek to exploit autonomy ...

Access (Open Access)

BASE

Open Access #7, 2011

The impact of environmental public policy tools on consumer decision process: the purchase of low carbon emissions cars ; L'impact des instruments des politiques publiques environnementales sur les processus de décision du consommateur: l'achat de voitures à faibles émissions de carbone

Alaux, Christophe

Environmental public policy tools aim to influence consumer behavior. Nevertheless, the causal chain between the implementation of a public policy and behavior is full of disconnections. It should therefore be examined more deeply by combining the analysis of public policies with that of the consumer decision process. Indeed, the latter also depends on other psychosocial determinants of behavior and on other contextual forces. The specific impact of public policy tools needs to be distinguished among them. Our study of the French environmental public policy aimed at the acquisition of low-carbon-emission cars focuses on understanding the impact of public policy tools on the consumer buying decision process. Indeed, the attitude towards public policy tools affects the consumer decision process. The impact turns out not to be direct; rather, it moderates the relationships between the main determinants of behavior. These moderation effects depend on the psychological or structural nature of the public policy tools, which affect specific relationships within the consumer decision process. ; Environmental public policies seek to influence consumption behaviors. Nevertheless, the causal relationship between the public action implemented and the change in behavior is characterized by discontinuities. It must therefore be examined in greater depth by combining the analytical perspectives of public policy and of the consumer decision process. Indeed, the latter also depends on other psychosocial determinants and other contextual factors. The specific impact of public policy instruments must nevertheless be distinguishable within it. Our study of the French environmental public policy aimed at the acquisition of low-carbon-emission cars makes it possible to understand the impact of public policy instruments on the consumer's purchase decision process. Indeed, the attitude towards the instruments of public action produces effects on the consumer decision process. This impact is not ...

Access (Open Access)

BASE

Open Access #8, 2011

The impact of environmental public policy tools on consumer decision process: the purchase of low carbon emissions cars ; L'impact des instruments des politiques publiques environnementales sur les processus de décision du consommateur: l'achat de voitures à faibles émissions de carbone

Alaux, Christophe

Environmental public policy tools aim to influence consumer behavior. Nevertheless, the causal chain between the implementation of a public policy and behavior is full of disconnections. It should therefore be examined more deeply by combining the analysis of public policies with that of the consumer decision process. Indeed, the latter also depends on other psychosocial determinants of behavior and on other contextual forces. The specific impact of public policy tools needs to be distinguished among them. Our study of the French environmental public policy aimed at the acquisition of low-carbon-emission cars focuses on understanding the impact of public policy tools on the consumer buying decision process. Indeed, the attitude towards public policy tools affects the consumer decision process. The impact turns out not to be direct; rather, it moderates the relationships between the main determinants of behavior. These moderation effects depend on the psychological or structural nature of the public policy tools, which affect specific relationships within the consumer decision process. ; Environmental public policies seek to influence consumption behaviors. Nevertheless, the causal relationship between the public action implemented and the change in behavior is characterized by discontinuities. It must therefore be examined in greater depth by combining the analytical perspectives of public policy and of the consumer decision process. Indeed, the latter also depends on other psychosocial determinants and other contextual factors. The specific impact of public policy instruments must nevertheless be distinguishable within it. Our study of the French environmental public policy aimed at the acquisition of low-carbon-emission cars makes it possible to understand the impact of public policy instruments on the consumer's purchase decision process. Indeed, the attitude towards the instruments of public action produces effects on the consumer decision process. This impact is not direct, but it moderates the causal relationships between the main determinants of behavior. These moderating effects depend on the psychological or structural nature of the public policy instruments, which affect specific relationships within the consumer decision process.

Access (Open Access)

BASE

Article (print) #10, 2007

Making Foreign Policy. Presidential Management of the Decision-Making Process

In: Politique étrangère: PE ; revue trimestrielle publiée par l'Institut Français des Relations Internationales, issue 1, pp. 221-223

Battistella, Dario

ISSN: 0032-342X

Open Access #11, 1978

Les ambiguïtés du processus décisionnel ; The ambiguous process of decision in France

PERCEBOIS, Jacques

The point of view of EDF (the French national electricity company) must be considered but should not be predominant, as it has practically been up to now: national energy policy, which depends on long-range forecasts, is the Government's business. But as there is no general agreement about the aims and priorities, or about tomorrow's society, the Government should keep different possibilities open for the future and not be bound by irreversible choices in the field of energy production.

Access (Open Access)

BASE

Article (electronic) #12, 28 November 2018

Rethinking the international decision-making process: the relevance of local governments

In: La revue internationale et stratégique: revue trimestrielle publiée par l'Institut de Relations Internationales et Stratégiques (IRIS), volume 112, issue 4, pp. 95-100

Saiz, Emilia; Chaouad, Robert; Carlet, Fabien; Verzeroli, Marc

Open Access #13, 2009

THE DECISION-MAKING PROCESS IN COMPLEX SYSTEMS: AN ANALYSIS OF A SYSTEMIC INTERVENTION ; LE PROCESSUS DE DÉCISION DANS LES SYSTÈMES COMPLEXES : UNE ANALYSE D'UNE INTERVENTION SYSTÉMIQUE

Bérard, Céline

The objective of this thesis is to contribute to a better understanding of decision processes in a complex system, by analysing how systemic interventions produce changes in the decision-making process followed by individuals. More precisely, this research analyses the potential effects of the use of a systemic model by decision-makers, on both the constitutive activities and the dimensions of decision processes, while taking into consideration their potential determinants. The research involves an experiment based on one simulated case about the intellectual property system of biotechnological innovations: experimental sessions consist of individual interviews with policy-makers, and the systemic intervention concerns the use of a simulation model based on the system dynamics approach. The results suggest: 1) a multiple, cumulative, conjunctive, and iterative progression; 2) a decision-making procedure that is incremental, creative, and based on multiple perspectives; 3) a multiplicity of involved actors, with diversified interests and roles; 4) rationalities which are political, limited, contextual, and even socio-cognitive. Moreover, the results show that decision-makers who benefit from a systemic intervention tend to consider more analytical elements and scientific disciplines during their decision analysis, and to involve additional internal and external actors. ; The objective of this thesis is to contribute to a better understanding of decision processes in complex systems, by analysing how systemic interventions produce changes in the decision-making process implemented by individuals. More precisely, the research analyses the potential effects of decision-makers' use of a systemic model, both on the constitutive activities of the decision process and on its dimensions, while taking into consideration the determinants likely to exert an influence. It relies on an experiment based on a simulated decision case, which ...

Access (Open Access)

BASE

Open Access #15, 2015

Résolution de PDMF^3 : processus décisionnels de Markov à transitions, récompenses et politiques stochastiques factorisées

In: Actes des 10èmes Journées Francophones Planification, Décision, Apprentissage pour la conduite des systèmes. 2015; JFPDA 2015 - 10èmes Journées Francophones de Planification, Décision et Apprentissage, Rennes, FRA, 2015-07-01-2015-07-03

Radoszycki, Julia; Dubois Peyrard, Nathalie; Sabbadin, Regis

Markov Decision Processes with factored state and action spaces, usually referred to as FA-FMDPs, provide a rich framework to model problems of sequential decision under uncertainty, where both the state and action spaces are of high dimension and highly structured, as in robotics, conservation biology or disease management domains. However, even dedicated solution algorithms (exact or approximate) do not apply when the dimensions of the state and action spaces both exceed 20-30, except under strong assumptions about state transitions or value function. In this paper we introduce the F^3 MDP framework and associated approximate solution algorithms which can tackle much larger problems. An F^3 MDP is an FA-FMDP whose reward function is additively factored and solution policies are constrained to be factored and can be stochastic. The proposed algorithms belong to the family of Policy Iteration (PI) algorithms and exploit continuous optimization tools. We validate them on extensive experiments. On small problems, where the optimal policy is available, they provide policies close to optimal. On larger problems belonging to the subclass of GMDPs they compete well with state-of-the-art resolution algorithms in terms of quality. Finally, we show that our algorithms can tackle very large F^3 MDPs. Indeed, they can solve problems of disease management in crop fields with state and action spaces of size 2^100.

Access (Open Access)

BASE