Content caching is an effective technique for alleviating peak-hour traffic congestion, reducing backhaul pressure, and improving the user-perceived experience in wireless networks. The central design questions of content caching are where, when, and what to cache. The past few years have witnessed extensive progress on these questions from both the information-theoretic perspective and the optimization perspective, under the assumption that content popularity or user preference is known in advance. In this talk, we will investigate these questions from a machine learning perspective in the practical setting where content popularity and user preference are unknown and may change over time. Specifically, we formulate the caching optimization problem in multi-cell wireless networks within a multi-agent reinforcement learning framework. Both multi-agent Q-learning-based and multi-agent multi-armed-bandit-based algorithms will be designed, taking into account the cooperative nature of base stations in wireless networks. Simulation results based on a real-world dataset will demonstrate the advantages of our proposed learning algorithms.
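To make the bandit formulation concrete, the sketch below shows a minimal epsilon-greedy multi-armed bandit for a single cache: each file is an arm, a reward of 1 is earned when a cached file is requested (a cache hit), and the agent learns which files to cache from observed requests. This is an illustrative toy, not the algorithms presented in the talk; the class name, the Zipf-like request model, and all parameter values are invented for this example.

```python
import random

class EpsilonGreedyCache:
    """Toy bandit caching agent (illustrative only).

    Each file is an arm; the reward for a cached file is 1 if it is
    requested in the current slot (a hit) and 0 otherwise.
    """

    def __init__(self, num_files, cache_size, epsilon=0.1, seed=0):
        self.num_files = num_files
        self.cache_size = cache_size
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts = [0] * num_files    # how many slots each file was cached
        self.values = [0.0] * num_files  # running mean hit rate per file

    def choose_cache(self):
        # Explore with probability epsilon: cache a random set of files.
        if self.rng.random() < self.epsilon:
            return set(self.rng.sample(range(self.num_files), self.cache_size))
        # Exploit: cache the files with the highest estimated hit rates.
        ranked = sorted(range(self.num_files),
                        key=lambda f: self.values[f], reverse=True)
        return set(ranked[:self.cache_size])

    def update(self, cached, requested):
        # Incremental-mean update of the hit-rate estimate for cached files.
        for f in cached:
            hit = 1.0 if f in requested else 0.0
            self.counts[f] += 1
            self.values[f] += (hit - self.values[f]) / self.counts[f]

# Simulate requests with a skewed, Zipf-like popularity (file 0 most popular).
agent = EpsilonGreedyCache(num_files=20, cache_size=5, epsilon=0.1, seed=42)
req_rng = random.Random(1)
weights = [1.0 / (f + 1) for f in range(20)]
for _ in range(2000):
    cached = agent.choose_cache()
    requested = set(req_rng.choices(range(20), weights=weights, k=10))
    agent.update(cached, requested)

top = sorted(range(20), key=lambda f: agent.values[f], reverse=True)[:5]
print(sorted(top))  # the learned cache should favor the most popular files
```

The talk's setting extends this single-agent picture to multiple cooperating base stations, where each agent's best cache contents depend on what its neighbors cache.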