metadata
tags:
  - FrozenLake-v1-4x4-no_slippery
  - q-learning
  - reinforcement-learning
  - custom-implementation
model-index:
  - name: q-FrozenLake-v1-4x4-noSlippery
    results:
      - task:
          type: reinforcement-learning
          name: reinforcement-learning
        dataset:
          name: FrozenLake-v1-4x4-no_slippery
          type: FrozenLake-v1-4x4-no_slippery
        metrics:
          - type: mean_reward
            value: 1.00 +/- 0.00
            name: mean_reward
            verified: false

Q-Learning Agent playing FrozenLake-v1

This is a trained model of a Q-Learning agent playing FrozenLake-v1 on the 4x4 map with slipping disabled.

Usage

{'env_id': 'FrozenLake-v1',
 'max_steps': 99,
 'n_training_episodes': 10000,
 'n_eval_episodes': 100,
 'eval_seed': [],
 'learning_rate': 0.7,
 'gamma': 0.95,
 'max_epsilon': 1.0,
 'min_epsilon': 0.05,
 'decay_rate': 0.0005,
 'qtable': array([[0.73509189, 0.77378094, 0.77378094, 0.73509189],
       [0.73509189, 0.        , 0.81450625, 0.77378094],
       [0.77378094, 0.857375  , 0.77378094, 0.81450625],
       [0.81450625, 0.        , 0.77378094, 0.77378094],
       [0.77378094, 0.81450625, 0.        , 0.73509189],
       [0.        , 0.        , 0.        , 0.        ],
       [0.        , 0.9025    , 0.        , 0.81450625],
       [0.        , 0.        , 0.        , 0.        ],
       [0.81450625, 0.        , 0.857375  , 0.77378094],
       [0.81450625, 0.9025    , 0.9025    , 0.        ],
       [0.857375  , 0.95      , 0.        , 0.857375  ],
       [0.        , 0.        , 0.        , 0.        ],
       [0.        , 0.        , 0.        , 0.        ],
       [0.        , 0.9025    , 0.95      , 0.857375  ],
       [0.9025    , 0.95      , 1.        , 0.9025    ],
       [0.        , 0.        , 0.        , 0.        ]])}
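
The Q-table above can be read as a policy by taking, in each state, the action with the largest value. The following is a minimal sketch of that greedy rollout, not the author's evaluation code. It assumes the default 4x4 FrozenLake layout with slipping disabled and the usual action encoding (0 = left, 1 = down, 2 = right, 3 = up); the `step` and `greedy_rollout` helpers are hypothetical stand-ins for the real environment, written so the example runs without any dependencies.

```python
# Assumption: the default 4x4 FrozenLake map, is_slippery=False.
# S = start, F = frozen, H = hole, G = goal.
GRID = ["SFFF", "FHFH", "FFFH", "HFFG"]
N = 4

# Assumed action encoding: 0 = left, 1 = down, 2 = right, 3 = up.
MOVES = {0: (0, -1), 1: (1, 0), 2: (0, 1), 3: (-1, 0)}

# Q-table copied from the model dump above (16 states x 4 actions).
QTABLE = [
    [0.73509189, 0.77378094, 0.77378094, 0.73509189],
    [0.73509189, 0.0, 0.81450625, 0.77378094],
    [0.77378094, 0.857375, 0.77378094, 0.81450625],
    [0.81450625, 0.0, 0.77378094, 0.77378094],
    [0.77378094, 0.81450625, 0.0, 0.73509189],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.9025, 0.0, 0.81450625],
    [0.0, 0.0, 0.0, 0.0],
    [0.81450625, 0.0, 0.857375, 0.77378094],
    [0.81450625, 0.9025, 0.9025, 0.0],
    [0.857375, 0.95, 0.0, 0.857375],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.9025, 0.95, 0.857375],
    [0.9025, 0.95, 1.0, 0.9025],
    [0.0, 0.0, 0.0, 0.0],
]


def step(state, action):
    """Hypothetical deterministic stand-in for the environment's step()."""
    row, col = divmod(state, N)
    d_row, d_col = MOVES[action]
    # Moving into a wall leaves the agent where it is.
    row = min(max(row + d_row, 0), N - 1)
    col = min(max(col + d_col, 0), N - 1)
    tile = GRID[row][col]
    reward = 1.0 if tile == "G" else 0.0
    done = tile in "HG"  # the episode ends in a hole or at the goal
    return row * N + col, reward, done


def greedy_rollout(qtable, max_steps=99):
    """Follow the highest-valued action from the start state (state 0)."""
    state, path, total = 0, [0], 0.0
    for _ in range(max_steps):
        # Ties are broken in favour of the lowest action index.
        action = max(range(4), key=lambda a: qtable[state][a])
        state, reward, done = step(state, action)
        path.append(state)
        total += reward
        if done:
            break
    return path, total


path, total_reward = greedy_rollout(QTABLE)
print(path, total_reward)  # [0, 4, 8, 9, 13, 14, 15] 1.0
```

Under these assumptions the greedy policy reaches the goal in six moves, which is consistent with the reported mean reward of 1.00 +/- 0.00. With Gymnasium installed, the same loop should carry over to an environment created with `gym.make("FrozenLake-v1", map_name="4x4", is_slippery=False)` in place of the hand-written `step`.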