Jan-2025: Teaching a course on topics in RL. For details, click here.
Nov-2025: A paper entitled Policy Newton methods for Distortion Riskmetrics accepted for publication in AAAI (2026).
Jul-2025: Back at IITM after visiting C-MinDS at IITB from Aug-2024 to Jul-2025.
May-2025: A book entitled ‘Gradient-based algorithms for zeroth-order optimization’ published. See book page for the details.
May-2025: A paper entitled ‘Finite Time Analysis of Temporal Difference Learning for Mean-Variance in a Discounted MDP’ accepted for publication in Reinforcement Learning Conference (RLC). Click here for the arXiv report.
Mar-2025: Invited talk on ‘Distorted bandits or How I learned to be risk-seeking without regretting it’ at National Conference on Communications (NCC-2025) held at IIT Delhi. Click here for the slides.
Jan-2025: Invited talk on ‘Reinforcement Learning and Bandit Algorithms for Distortion Riskmetrics’ at Reinforcement learning workshop held at IISc, Bengaluru. Click here for the video.
Jan-2025: A paper entitled ‘Generalized Simultaneous Perturbation-based Gradient Search with Reduced Estimator Bias’ accepted for publication in IEEE Transactions on Automatic Control, and another paper entitled ‘Risk-sensitive Bandits: Arm Mixture Optimality and Regret-efficient Algorithms’ accepted for publication in AISTATS.
Jan-2025: Teaching a course on stochastic optimization. For details, click here.
Jun-2024: A paper entitled Optimization of utility-based shortfall risk: A non-asymptotic viewpoint accepted to IEEE Conference on Decision and Control (CDC).
Jun-2024: A paper entitled Online Estimation and Optimization of Utility-Based Shortfall Risk accepted for publication in Mathematics of Operations Research.
Feb-2024: Invited talk on ‘A cubic-regularized policy Newton for reinforcement learning’ at Reinforcement learning workshop held at IISc, Bengaluru. Click here for the video.
Jan-2024: A paper entitled A Cubic-regularized Policy Newton Algorithm for Reinforcement Learning accepted for publication in AISTATS.
Aug-2023: Teaching a course on operating systems. For details, click here.
Jul-2023: Invited talk on ‘Finite time analysis of temporal difference learning with linear function approximation’ at Data science: Probabilistic and optimization methods held at International Centre for Theoretical Sciences, Bengaluru. Click here for the video.
Jan-2023: A paper entitled A policy gradient approach for optimization of smooth risk measures accepted for publication in UAI.
Feb-2023: Invited talk on ‘Finite time analysis of temporal difference learning with linear function approximation: Tail averaging and regularisation’ at Networks Seminar Series held (in-person) at Indian Institute of Science. Click here for the video.
Feb-2023: Tutorial on risk-sensitive reinforcement learning at AAAI-2023. Click here for details.
Jan-2023: A paper entitled Finite time analysis of temporal difference learning with linear function approximation: Tail averaging and regularisation accepted for publication in AISTATS.
Jan-2023: Teaching a course on stochastic optimization. For details, click here.
Jan-2023: Invited talk on ‘A Wasserstein distance approach for concentration of empirical risk estimates’ at Information Theory and Data Science Workshop held (in-person) at National University of Singapore.
Aug-2022: A paper entitled A Wasserstein distance approach for concentration of empirical risk estimates accepted for publication in Journal of Machine Learning Research.
Jul-2022: Teaching a course on programming and data structures. For details, click here.
Jul-2022: Tutorial on Risk-Aware Multi-armed Bandits at SPCOM 2022. Slides here.
Jun-2022: A monograph entitled Risk-Sensitive Reinforcement Learning via Policy Gradient Search published by Foundations and Trends in Machine Learning.
Apr-2022: A survey article entitled A Survey of Risk-Aware Multi-Armed Bandits accepted at IJCAI-2022.
Feb-2022: Invited talk on ‘Concentration of risk measures: A Wasserstein distance approach’ at ‘IITB Workshop on Stochastic Models’.
Jan-2022: Teaching a course on object-oriented analysis using C++. For details, click here.
Oct-2021: Invited talk on ‘Concentration bounds for temporal difference learning with linear function approximation: The case of batch data and uniform sampling’ at IISc workshop on Deep Reinforcement Learning. For the video recording, click here.
Aug-2021: Teaching a course on RL. For details, click here.
Aug-2021: A paper entitled Non-asymptotic bounds for stochastic optimization with biased noisy gradient oracles accepted with minor revisions for publication in IEEE Transactions on Automatic Control.
Aug-2021: Serving on the ‘Senior Program Committee’ of AAAI-22.
Jul-2021: Nirav Bhavsar wins ‘Biswajit Sain MS Thesis Award 2021’.
Jul-2021: A paper entitled Smoothed functional-based gradient algorithms for off-policy reinforcement learning: A non-asymptotic viewpoint accepted for publication in Systems and Control Letters.
Feb-2021: Teaching a course on RL. Programming assignments facilitated by AIcrowd; see here, here, and here. For course details, click here.
Dec-2020: A paper entitled Estimation of Spectral Risk Measures accepted at AAAI-21.
Sep-2020: A paper entitled Concentration bounds for temporal difference learning with linear function approximation: The case of batch data and uniform sampling accepted for publication in the Machine Learning journal.
Aug-2020: Serving on the ‘Senior Program Committee’ of AAAI-21.
Aug-2020: Teaching a course on stochastic modeling and the theory of queues. For details, click here.
Jun-2020: A paper entitled Concentration bounds for CVaR estimation: The cases of light-tailed and heavy-tailed distributions accepted at ICML 2020.
Jan-2020: Tutorial on reinforcement learning at CCBR-IITM. Check here.
Dec-2019: Visited UMD College Park to collaborate with Prof. Michael Fu and Prof. Steve Marcus. Attended NeurIPS 2019 and ICC 2019. Visited TCS Research, Hyderabad.
Aug-2019: A paper entitled Concentration of risk measures: A Wasserstein distance approach accepted at NeurIPS 2019.
Jul-2019: A paper entitled Random directions stochastic approximation with deterministic perturbations accepted to IEEE Transactions on Automatic Control.
Jun-2019: Tutorial on reinforcement learning at the ACM India Summer School on theoretical and algorithmic aspects on Machine Learning. Hand-written notes here.
Apr-2019: A paper on Correlated bandits accepted at ICML 2019.
Jan-2019: Teaching courses on ML and bandits. For details, click here and here.
Nov-2018: Posted a survey article on “Risk-sensitive reinforcement learning: A constrained optimization viewpoint” to arXiv. Check it out here.
Nov-2018: A paper on concentration bounds for Conditional Value-at-Risk (CVaR) accepted to Operations Research Letters.
Aug-2018: Teaching a course on RL. For details, click here.
May-2018: DST-ECRA (Early Career Research Award).
Jan-2018: Tutorial on Simultaneous perturbation methods for simulation optimization at Indian Control Conference 2018. Slides here.
Dec-2017: A paper on Stochastic optimization using Cumulative Prospect Theory accepted to IEEE Transactions on Automatic Control.
Jul-2017: Gave a tutorial on Simultaneous perturbation methods for stochastic non-convex optimization at ACM MobiHoc 2017. Slides here.
Mar-2017: Joined Department of Computer Science and Engineering at Indian Institute of Technology Madras.
Nov-2016: Attended INFORMS annual meeting at Nashville to present work on weighted bandits in this session. Slides here. In related news, this work got accepted at AAAI 2017.
