About Me
I graduated from the Technion with a BSc in Electrical Engineering and a BA in Mathematics (both summa cum laude) in 1996. I then spent almost four years as an intelligence officer with the Israel Defense Forces, and was subsequently involved in a few ventures in the high-tech industry. I earned my PhD in Electrical Engineering from the Technion in 2002, under the supervision of Nahum Shimkin. I was then a Fulbright postdoctoral associate at LIDS (MIT) for two years, working with John Tsitsiklis. I was with the Department of Electrical and Computer Engineering at McGill University from July 2004 until August 2010, where I held a Canada Research Chair in Machine Learning from 2005 to 2009. I have been with the Department of Electrical Engineering at the Technion since 2008, where I am a professor. I am the father of Liam, Oliver, and Dylan.
Scientific objectives
I am in the business of being a professor because I want to understand how to act and make decisions in dynamic, complex and uncertain environments. In plain language, I want to build machines (e.g., software agents) that learn, evolve, and improve over time. I work mostly in machine learning, but also in certain application domains.
- Reinforcement learning
- High dimensional statistics and learning
- Uncertainty and risk in decision making
- Learning and modeling dynamics from data
- Systems with multiple decision makers: multi-agent, distributed, many-player, and adaptive systems
See my Publications page for more details.
More specific research interests
- Machine Learning (theory, algorithms, and applications). High-dimensional problems with uncertainty in the data, and modeling and learning of dynamics (e.g., in networks).
- Reinforcement Learning and Markov decision processes. Theory and application of Markov decision processes. I have worked quite a bit on adaptive control and learning algorithms for (large) stochastic systems, in what is known as reinforcement learning.
- Learning, optimization and control under uncertainty. Robust and stochastic optimization and statistical analysis of such approaches.
- Games. Stochastic, dynamic, network, and differential games; applications in networks and resource sharing.
- Multi-agent systems. Especially learning in such systems (e.g., online learning and learning in games). The goal here is to design economic systems (e.g., markets) whose equilibria are also good social outcomes.
- Optimization of large scale problems. Especially combinatorial optimization using heuristic and statistical methods (e.g., the Cross Entropy method) and stochastic optimization.
- Power Grid. Especially reliability, pricing, and decision making in large-scale power grids (smart grids). My approach is very much data-driven: I try to understand the actual dynamics of the grid so that I can propose concrete policies for controlling it, as well as evaluate market mechanisms and anomalies. See, for example, the EU-funded GARPUR project, which studies probabilistic reliability models for large-scale grids.
- Applications. I am interested in, and have worked on (reaching at least a semi-commercial prototype, or planning to), the following eclectic list of applications: large-scale communication network optimization, power management for laptops, adaptive compression of large databases, a learning agent for a combat-plane simulator, cognitive radio networks, human activity recognition and context identification on mobile devices, and stochastic approaches to decoding LDPC codes.
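To make the Cross Entropy method mentioned above concrete, here is a minimal sketch on a toy "one-max" problem (maximize the number of 1s in a binary string). The problem, the function name, and all parameter values are illustrative choices only, not taken from any of my projects:

```python
import random

def cross_entropy_onemax(n=20, pop=100, elite_frac=0.2, iters=50, alpha=0.7, seed=0):
    """Cross-entropy method on a toy one-max problem:
    maximize the number of 1s in a length-n binary string."""
    rng = random.Random(seed)
    p = [0.5] * n                       # Bernoulli sampling parameters
    n_elite = max(1, int(elite_frac * pop))
    for _ in range(iters):
        # 1. Sample candidate solutions from the current distribution.
        samples = [[1 if rng.random() < pi else 0 for pi in p] for _ in range(pop)]
        # 2. Score candidates and keep the elite fraction.
        samples.sort(key=sum, reverse=True)
        elite = samples[:n_elite]
        # 3. Refit the sampling distribution to the elite set,
        #    with smoothing to avoid premature convergence.
        elite_mean = [sum(x[i] for x in elite) / n_elite for i in range(n)]
        p = [alpha * m + (1 - alpha) * pi for m, pi in zip(elite_mean, p)]
    best = max(samples, key=sum)
    return best, sum(best)

best, score = cross_entropy_onemax()
```

The same sample-select-refit loop carries over to richer combinatorial problems by swapping in a different scoring function and parametric sampling family.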
Open Positions
- I am looking for a postdoc and a couple of graduate students to join my team. Please note that working with me requires very strong mathematical skills and/or true hacking capabilities. If you are interested, email me your resume and a brief explanation of what you would like to work on.
Staff
PhD Students
Chen Tessler
Focusing on Deep Reinforcement Learning, specifically on bridging the gap between theory and what we observe in practice.
Lior Shani
My main research areas include RL optimization algorithms, efficient exploration, and applying regularization in RL.
Guy Tennenholtz
My main research focus is on combining Reinforcement Learning with Natural Language and Causal Inference.
Nadav Merlis
Multi-armed and combinatorial bandits; theory of Reinforcement Learning.
Yonathan Efroni
Theory of Reinforcement Learning and Markov Decision Processes.
Nir Baram
Esther Derman
Reinforcement Learning under parameter uncertainty.
Shirli Di-Castro Shashua
Reinforcement Learning and robustness to uncertainties in large-scale dynamic systems.
MSc Students
Avrech Ben-David
Deep RL for Combinatorial Optimization.
*Joint supervision with Tamir Hazan
Benny Perets
Systems biology.
*Joint supervision with Shai Shen-Orr
Alumni
PhD
Kirill Dyagilev
(3/2009-8/2014)
Information Propagation in Social Networks: A Game Theoretic View
First position after PhD: Postdoc at Johns Hopkins
Daniel Vainsencher
(12/2009-12/2014)
Learning from Unruly Data
First position after PhD: Postdoc at Rutgers and Princeton
Aviv Tamar
(12/2011-8/2015)
Risk Aware Reinforcement Learning
First position after PhD: Postdoc at UC Berkeley. Currently a faculty member at the Technion
Maayan Gal-on
(3/2009-12/2015)
On Detection and Adaptation to Changes in a Learning Problem
First position after PhD: Researcher at Jether Energy Research
Orly Avner
(10/2010-8/2016)
Machine Learning inspired by Cognitive Radio Networks
First position after PhD: Researcher at Intel
Oren Anava
(10/2012-8/2016)
Online Learning for Time Series, jointly supervised with E. Hazan
First position after PhD: Researcher at Voleon (hedge fund)
Eli Meirom
(10/2011-12/2017)
Infection Detection from Weak Signatures, jointly supervised with A. Orda
First position after PhD: Researcher at SAIPS (a Ford company)
Assaf Halak
(12/2011-3/2018)
Model Selection in Markov Decision Processes
First position after PhD: Researcher at SAIPS (a Ford company)
Daniel Mankowitz
(10/2013-2/2018)
On Temporal Abstraction for Dynamic Decision Problems
First position after PhD: Researcher at Google DeepMind
Gal Dalal
(10/2013-6/2018)
Large Scale Dynamic Optimization for Power Networks
First position after PhD: Researcher at SAIPS (a Ford company)
Oran Richman
(10/2012-6/2019)
Uncertainty Management in Learning Problems
First position after PhD: Researcher at the Israel Defense Forces
Tom Zahavy
(10/2015-8/2019)
Reinforcement Learning and Deep Representations
First position after PhD: Researcher at Google DeepMind
MSc
Ofir Mebel
(9/2010-11/2012)
Lightning does not strike twice: Robust Markov Decision Processes with correlated uncertainty
Graduated with distinction
Yoav Haimovich
(3/2011-12/2012)
Large Scale Semi Supervised Sentiment Analysis, jointly with K. Crammer
Barak Almog
(10/2009-8/2013)
Learning Policies in One-on-one Dogfights
Akram Baransi
(10/2009-3/2015)
Best Empirical Sampled Average (BESA): an Algorithm for the Multi-Armed Bandits
Nadav Maoz
(10/2012-3/2016)
The Cocktail Party Effect for Pattern Discovery in Heterogeneous Time Series Data
Nir Levin
(10/2015-8/2017)
Rotting Bandits
Snir Cohen
(1/2016-12/2017)
Subsampling for Decision Problems
Guy Tennenholtz
(1/2016-5/2018)
The Stochastic Firefighter Problem. Graduated with distinction
Asaf Casel
(4/2016-6/2018)
A General Approach to Multi-Armed Bandits Under Risk Criteria
Postdoc
Loc Xuan Bui
(9/2011-1/2012, visitor)
Currently a faculty member at the Department of Electrical Engineering, Tan Tao University, Vietnam
Dotan Di-Castro
(1/2011-4/2013, postdoc)
Currently at Bosch AI Research
Aditya Gopalan
(10/2011-3/2014, postdoc)
Currently an assistant professor at IISc Bangalore
Timothy Mann
(2/2013-1/2015, postdoc)
Currently a researcher at Google DeepMind
Odalric-Ambrym Maillard
(2/2013-12/2014, postdoc)
Currently a researcher at Inria Saclay, France
Francois Schnitzler
(2/2013-9/2015, postdoc)
Currently a researcher at Technicolor, Rennes, France
Jiashi Feng
(2/2014-5/2014, visitor)
Currently an assistant professor at the Department of Computer Science at NUS
Elad Gilboa
(9/2014-6/2016, postdoc)
Currently a researcher at SAIPS, a Ford company
Balazs Szorenyi
(6/2015-12/2017, postdoc)
Currently at Yahoo! Research NYC
Chao Qu
(12/2017-12/2018, postdoc)
Currently at Alibaba
Mark Kozdoba
(3/2014-, postdoc and then staff researcher)
Avinash Mohan
(3/2019-, postdoc)