Reinforcement Q-Learning Flagged Maze
Explores reinforcement learning concepts by solving a maze problem with Q-learning. Discusses how the number of model states is determined, techniques for state reduction, and the impact of the learning rate (α) and the discount factor (γ) on decision-making and convergence. (GitHub)
Model State Determination and Reduction:
- The number of model states depends on the environment size.
- The state count can be reduced by treating certain equivalent positions as a single state.
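As a rough illustration of the two points above, the sketch below counts the states of a small grid and then merges positions that are equivalent under a left-right mirror symmetry. The 3x3 layout, the `#` wall marker, and the choice of mirror symmetry as the equivalence are all assumptions made for this example; the project may use a different maze and a different notion of equivalence.

```python
# Hypothetical 3x3 maze: '.' is a free cell, '#' is a wall, 'T' is the goal.
# The layout is deliberately mirror-symmetric about its middle column.
MAZE = [
    "...",
    ".#.",
    ".T.",
]


def free_cells(maze):
    """Return every (row, col) the agent can occupy, i.e. every non-wall cell."""
    return [
        (r, c)
        for r, row in enumerate(maze)
        for c, ch in enumerate(row)
        if ch != "#"
    ]


def canonical(cell, width):
    """Map a cell and its left-right mirror image to one representative.

    Assumed equivalence: in a mirror-symmetric maze, (r, c) and
    (r, width - 1 - c) behave identically once "left" and "right"
    are swapped, so they can share one entry in the Q-table.
    """
    r, c = cell
    return (r, min(c, width - 1 - c))


states = free_cells(MAZE)
reduced = {canonical(s, len(MAZE[0])) for s in states}

print(len(states))   # 8 states: 9 grid cells minus 1 wall
print(len(reduced))  # 5 states after merging mirrored positions
```

If such a reduction is used, the "left" and "right" actions have to be swapped whenever a mirrored position is looked up; the sketch leaves that step out.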
Concepts and Components:
- States: Correspond to agent positions in the environment.
- Actions: Define agent movements – “up”, “down”, “left”, “right”.
- Rewards: Define penalties and incentives.
- Goal State: End point to reach, marked as “T”.
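The four components above can be put together in a minimal environment sketch. The maze layout, the `S` and `#` markers, and all reward values are assumptions chosen for illustration; only the four action names and the "T" goal marker come from the description.

```python
# Hypothetical layout: 'S' start, '.' free cell, '#' wall, 'T' goal (terminal).
MAZE = [
    "S..",
    ".#.",
    "..T",
]

# Actions: each name maps to a (row delta, column delta).
ACTIONS = {
    "up": (-1, 0),
    "down": (1, 0),
    "left": (0, -1),
    "right": (0, 1),
}

# Rewards: the values below are assumed, not taken from the project.
STEP_REWARD = -1.0   # small penalty per move, encourages short paths
WALL_REWARD = -5.0   # penalty for bumping into a wall or the border
GOAL_REWARD = 10.0   # incentive for reaching 'T'


def step(state, action):
    """Apply an action to a (row, col) state.

    Returns (next_state, reward, done). Moves into a wall or off the
    grid leave the agent where it was.
    """
    dr, dc = ACTIONS[action]
    r, c = state[0] + dr, state[1] + dc
    inside = 0 <= r < len(MAZE) and 0 <= c < len(MAZE[0])
    if not inside or MAZE[r][c] == "#":
        return state, WALL_REWARD, False
    if MAZE[r][c] == "T":
        return (r, c), GOAL_REWARD, True
    return (r, c), STEP_REWARD, False


print(step((0, 0), "right"))  # ((0, 1), -1.0, False)
print(step((1, 2), "down"))   # ((2, 2), 10.0, True)
```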
Learning Rate (α) Impact:
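- Speed of convergence and oscillation: a higher α updates Q-values faster but can make them oscillate; a lower α converges more slowly but more smoothly.
- Exploration vs. exploitation balance: α sets how strongly newly observed outcomes override the existing Q-value estimates.
- Stabilization and solution accuracy: a small or decaying α helps the Q-values settle on stable, accurate estimates.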
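The role of α is easiest to see in the standard tabular Q-learning update, Q(s, a) ← Q(s, a) + α · (r + γ · max Q(s', ·) − Q(s, a)). The sketch below applies one such update with three different learning rates; the function name `q_update` and the numeric values are illustrative assumptions, not taken from the project.

```python
def q_update(q_sa, reward, max_q_next, alpha, gamma):
    """One tabular Q-learning update for a single (state, action) pair.

    q_sa        current estimate Q(s, a)
    reward      reward observed for the transition
    max_q_next  max over a' of Q(s', a')
    alpha       learning rate, 0 <= alpha <= 1
    gamma       discount factor, 0 <= gamma <= 1
    """
    td_target = reward + gamma * max_q_next
    return q_sa + alpha * (td_target - q_sa)


# Same transition, different learning rates (all numbers are made up).
# The target here is -1 + 0.9 * 10 = 8.
for alpha in (0.1, 0.5, 1.0):
    new_q = q_update(q_sa=0.0, reward=-1.0, max_q_next=10.0,
                     alpha=alpha, gamma=0.9)
    print(alpha, new_q)
```

With α = 0.1 the estimate moves a tenth of the way toward the target, with α = 1.0 it jumps straight to it. When rewards or transitions are noisy, that full jump is what produces oscillation.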
Discount Factor (γ) Impact:
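- Long-term vs. short-term rewards: a γ close to 1 weights future rewards heavily; a γ close to 0 makes the agent favor immediate rewards.
- Optimal policy and goal emphasis: a higher γ lets the goal reward propagate back to distant states, which favors policies that actually reach the goal.
- Convergence and temporal consistency: with γ < 1 the discounted return stays bounded, which supports convergence of the Q-values.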
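A short sketch can make the effect of γ concrete by computing the discounted return of one path to the goal under two discount factors. The reward sequence (three step penalties followed by a goal reward) and the helper name `discounted_return` are assumptions for illustration.

```python
def discounted_return(rewards, gamma):
    """Sum of gamma**t * r_t over a sequence of rewards r_0, r_1, ..."""
    return sum((gamma ** t) * r for t, r in enumerate(rewards))


# Hypothetical path: three moves costing -1 each, then +10 at the goal.
rewards = [-1.0, -1.0, -1.0, 10.0]

for gamma in (0.5, 0.9):
    print(gamma, round(discounted_return(rewards, gamma), 2))
# 0.5 -0.5
# 0.9 4.58
```

With γ = 0.5 the goal reward is discounted so heavily that the path looks unprofitable from the start state; with γ = 0.9 the same path has a clearly positive value, so the agent has a reason to head for the goal.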