OPRL

Modular Library for Off-Policy Reinforcement Learning

DeepMind Control Suite

Environment             DDPG     TQC
acrobot-swingup       155.59  269.49
ball_in_cup-catch     975.56  980.80
cartpole-balance      951.99  995.74
cartpole-swingup      837.24  877.58
cheetah-run           665.71  857.45
finger-spin           975.86  985.18
finger-turn_easy      869.67  971.58
finger-turn_hard      801.58  953.54
fish-upright          824.81  936.42
fish-swim             103.96  647.46
hopper-stand          825.56  827.44
hopper-hop            240.66  234.14
humanoid-stand        228.32  786.03
humanoid-walk         101.72  517.65
humanoid-run            1.12  164.55
pendulum-swingup      789.10  783.78
point_mass-easy       411.64  875.83
reacher-easy          890.17  981.49
reacher-hard          921.87  949.69
swimmer-swimmer6      358.72  587.06
walker-stand          974.92  985.22
walker-walk           957.28  971.97
walker-run            688.34  794.57
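
The scores above can be compared programmatically. The following is a minimal sketch, not part of OPRL itself: the helper names `parse_rows` and `better_algorithm` are hypothetical, and only three rows from the table are copied in for illustration. It assumes each row has the form "environment DDPG-score TQC-score" separated by whitespace.

```python
# Illustrative sketch only; these helpers are hypothetical and not part of OPRL.
# Three rows copied verbatim from the results table above.
ROWS = """\
acrobot-swingup 155.59 269.49
hopper-hop 240.66 234.14
pendulum-swingup 789.10 783.78
"""


def parse_rows(text):
    """Parse 'environment ddpg tqc' lines into (name, ddpg, tqc) tuples."""
    results = []
    for line in text.strip().splitlines():
        # Split from the right so the environment name stays in one piece.
        name, ddpg, tqc = line.rsplit(maxsplit=2)
        results.append((name, float(ddpg), float(tqc)))
    return results


def better_algorithm(ddpg, tqc):
    """Return the name of the higher-scoring algorithm (DDPG on a tie)."""
    return "TQC" if tqc > ddpg else "DDPG"


if __name__ == "__main__":
    for name, ddpg, tqc in parse_rows(ROWS):
        winner = better_algorithm(ddpg, tqc)
        print(f"{name:20s} {winner:4s} (TQC - DDPG = {tqc - ddpg:+.2f})")
```

Pasting the full table into `ROWS` should give the same per-environment comparison for all tasks, provided the header line is left out.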