Reinforcement Learning Bomberman
For the final project of my Artificial Intelligence for Robotics class, I worked in a team of three to code an AI capable of playing Bomberman across five different variants. The five variants are as follows:
Variant 1: No monsters
Variant 2: 1 stupid monster that moves randomly and is not aware of the character
Variant 3: 1 self-preserving monster that avoids explosions and attacks the character when they're within a 1-block distance
Variant 4: 1 aggressive monster, same as variant 3 but with the detection range increased to 2 blocks
Variant 5: 1 stupid monster and 1 aggressive monster
Approximate Q-Learning
We experimented with various techniques such as minimax, expectimax, and A*. I was responsible for the final implementation of approximate Q-Learning, which proved to be the most effective and met the survival rate the project required.
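At its core, approximate Q-Learning estimates Q(s, a) as a weighted sum of feature values and nudges each weight toward the observed reward after every step. Below is a minimal sketch of that update in Python; the feature dictionary and the learning-rate/discount values are illustrative, not our actual code.

```python
def q_value(weights, features):
    """Q(s, a) as a linear combination of feature values."""
    return sum(weights.get(name, 0.0) * value for name, value in features.items())


def update_weights(weights, features, reward, next_q_max, alpha=0.1, gamma=0.9):
    """One approximate Q-Learning update:

    difference = (r + gamma * max_a' Q(s', a')) - Q(s, a)
    w_i       <- w_i + alpha * difference * f_i(s, a)
    """
    difference = (reward + gamma * next_q_max) - q_value(weights, features)
    for name, value in features.items():
        weights[name] = weights.get(name, 0.0) + alpha * difference * value
    return weights
```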
The features I used for the algorithm include the following (see the sketch after this list):
Euclidean distance between the character and the monsters.
Shortest A* distance between the character and the exit.
If no A* path exists because of the walls, the Euclidean distance is used instead.
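Put together, the feature extraction looks roughly like the sketch below. The grid representation, the a_star_distance helper, and the feature names are assumptions made for illustration rather than our exact implementation.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two (x, y) grid cells."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def extract_features(character, monsters, exit_cell, a_star_distance):
    """Features fed to the approximate Q-learner.

    a_star_distance(start, goal) is assumed to return None when the walls
    block every path between the two cells.
    """
    features = {}

    # Euclidean distance from the character to the nearest monster.
    if monsters:
        features["dist_to_monster"] = min(
            euclidean_distance(character, m) for m in monsters
        )

    # Shortest A* distance to the exit, falling back to the Euclidean
    # distance when no path exists.
    path_length = a_star_distance(character, exit_cell)
    features["dist_to_exit"] = (
        path_length if path_length is not None
        else euclidean_distance(character, exit_cell)
    )
    return features
```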
We ran this algorithm on each variant 100 times with different seeds, updating the weight values after each run to arrive at the final optimized weights.
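The training procedure itself is a simple loop over seeds, along these lines; run_episode here is a hypothetical helper that plays one full game with the current weights and returns the updated ones.

```python
import random


def train(variant, run_episode, num_runs=100):
    """Train on one variant: num_runs games with different seeds,
    carrying the learned weights from one run into the next."""
    weights = {"dist_to_monster": 0.0, "dist_to_exit": 0.0}
    for run in range(num_runs):
        random.seed(run)                         # a different seed per run
        weights = run_episode(variant, weights)  # play one game, return updated weights
    return weights
```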
Variant 1
The easiest variant, 100% survival rate
Variant 3
A bit harder with the self-preserving monster, 90% survival rate
Variant 5
The hardest variant with 2 monsters, 80% survival rate
Future Improvements
Due to time constraints, there are ideas we haven't experimented with that would improve performance. Some improvements I'd like to implement in the future include:
By default, the character places a bomb only when it's next to a wall. The character could be improved to also place a bomb when it gets close to a monster.
After placing a bomb, the character runs away in a random diagonal direction. This could be changed so that it moves in the direction furthest away from the monsters instead (see the sketch below for both ideas).
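Both ideas come down to small decision rules. A rough sketch of what they could look like (the helper names, parameters, and the 2-block threshold are hypothetical):

```python
def should_place_bomb(next_to_wall, character, monsters, euclidean_distance,
                      danger_range=2):
    """Current rule: bomb only when next to a wall.
    Proposed addition: also bomb when a monster gets close."""
    near_monster = any(
        euclidean_distance(character, m) <= danger_range for m in monsters
    )
    return next_to_wall or near_monster


def escape_move(legal_moves, monsters, euclidean_distance):
    """Instead of a random diagonal, pick the legal move that ends up
    furthest from the nearest monster."""
    def nearest_monster_distance(cell):
        return min(euclidean_distance(cell, m) for m in monsters)
    return max(legal_moves, key=nearest_monster_distance)
```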