Abstract
The Stewart platform is an entirely parallel robot with mechanical differences from typical serial robotic manipulators, which has a wide application area ranging from flight and driving simulators to structural test platforms. This work concentrates on learning to control a complex model of the Stewart platform using state-of-the-art deep reinforcement learning (DRL) algorithms. In this regard, to enhance the reliability of the learning performance and to have a test bed capable of mimicking the behavior of the system completely, a precisely designed simulation environment is presented. Therefore, we first design a parametric representation for the kinematics of the Stewart platform in Gazebo and robot operating system (ROS) and integrate it with a Python class to conveniently generate the structures in simulation description format (SDF). Then, to control the system, we benefit from three DRL algorithms: the asynchronous advantage actor–critic (A3C), the deep deterministic policy gradient (DDPG), and the proximal policy optimization (PPO) to learn the control gains of a proportional integral derivative (PID) controller for a given reaching task. We chose to apply these algorithms due to the Stewart platform’s continuous action and state spaces, making them well-suited for our problem, where exact controller tuning is a crucial task. The simulation results show that the DRL algorithms can successfully learn the controller gains, resulting in satisfactory control performance.