OPRL
Modular Library for Off-Policy Reinforcement Learning
DeepMind Control Suite
| Environment | DDPG | TQC |
| --- | ---: | ---: |
| acrobot-swingup | 155.59 | 269.49 |
| ball_in_cup-catch | 975.56 | 980.80 |
| cartpole-balance | 951.99 | 995.74 |
| cartpole-swingup | 837.24 | 877.58 |
| cheetah-run | 665.71 | 857.45 |
| finger-spin | 975.86 | 985.18 |
| finger-turn_easy | 869.67 | 971.58 |
| finger-turn_hard | 801.58 | 953.54 |
| fish-upright | 824.81 | 936.42 |
| fish-swim | 103.96 | 647.46 |
| hopper-stand | 825.56 | 827.44 |
| hopper-hop | 240.66 | 234.14 |
| humanoid-stand | 228.32 | 786.03 |
| humanoid-walk | 101.72 | 517.65 |
| humanoid-run | 1.12 | 164.55 |
| pendulum-swingup | 789.10 | 783.78 |
| point_mass-easy | 411.64 | 875.83 |
| reacher-easy | 890.17 | 981.49 |
| reacher-hard | 921.87 | 949.69 |
| swimmer-swimmer6 | 358.72 | 587.06 |
| walker-stand | 974.92 | 985.22 |
| walker-walk | 957.28 | 971.97 |
| walker-run | 688.34 | 794.57 |