User profiles for A. Tamae
Aviv Tamar, Technion. Verified email at technion.ac.il. Cited by 12274.
Multi-agent actor-critic for mixed cooperative-competitive environments
We explore deep reinforcement learning methods for multi-agent domains. We begin by
analyzing the difficulty of traditional algorithms in the multi-agent case: Q-learning is challenged …
Stress hormones increase cell proliferation and regulates interleukin-6 secretion in human oral squamous cell carcinoma cells
DG Bernabé, AC Tamae, ÉR Biasoli… - Brain, behavior, and …, 2011 - Elsevier
Patients with oral cancer can have high psychological distress levels, but the effects of stress-related
hormones on oral cancer cells and possible mechanisms underlying these …
Constrained policy optimization
For many applications of reinforcement learning it can be more convenient to specify both a
reward function and constraints, rather than trying to design behavior through the reward …
Cell‐type‐specific excitatory and inhibitory circuits involving primary afferents in the substantia gelatinosa of the rat spinal dorsal horn in vitro
…, MH Rashid, M Sonohata, A Tamae… - The Journal of …, 2007 - Wiley Online Library
The substantia gelatinosa (SG) of the spinal dorsal horn shows significant morphological
heterogeneity and receives primary afferent input predominantly from Aδ‐ and C‐fibres. …
Value iteration networks
We introduce the value iteration network (VIN): a fully differentiable neural network with
a 'planning module' embedded within. VINs can learn to plan, and are suitable for predicting …
Model-ensemble trust-region policy optimization
Model-free reinforcement learning (RL) methods are succeeding in a growing number of
tasks, aided by recent advances in deep learning. However, they tend to suffer from high …
Policy gradients with variance related risk criteria
Managing risk in dynamic decision problems is of cardinal importance in many fields such as
finance and process control. The most common approach to defining risk is through various …
Direct inhibition of substantia gelatinosa neurones in the rat spinal cord by activation of dopamine D2‐like receptors
A Tamae, T Nakatsuka, K Koga, G Kato… - The Journal of …, 2005 - Wiley Online Library
Dopaminergic innervation of the spinal cord is largely derived from the brain. To understand
the cellular mechanisms of antinociception mediated by descending dopaminergic pathways…
Learning to route
Recently, much attention has been devoted to the question of whether/when traditional network
protocol design, which relies on the application of algorithmic insights by human experts, …
Variance adjusted actor critic algorithms
We present an actor-critic framework for MDPs where the objective is the variance-adjusted
expected return. Our critic uses linear function approximation, and we extend the concept of …