Reinforcement Learning Bomberman

For the final project of my Artificial Intelligence for Robotics class, I worked in a team of 3 to code an AI capable of playing Bomberman across 5 different variants. The 5 variants are as follows (the monster behaviors are sketched in code after the list):

Variant 1: No monsters

Variant 2: 1 stupid monster that moves randomly and is not aware of the character

Variant 3: 1 self-preserving monster that avoids explosions and attacks the character when they're within a 1-block distance

Variant 4: 1 aggressive monster, the same as variant 3 but with the detection range increased to 2 blocks

Variant 5: 1 stupid monster and 1 aggressive monster
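
To make the variants concrete, here is a minimal sketch of the two monster policies as described above. The grid helpers (legal_moves, in_blast_radius, result, distance) and the monster/state objects are assumptions about the game engine, not the course's actual API:

```python
import random

def stupid_monster_move(monster, state):
    # Variant 2: pick uniformly among legal moves, ignoring the character.
    return random.choice(state.legal_moves(monster))

def smart_monster_move(monster, state, detection_range):
    # Variants 3 and 4: never step into an active blast, and chase the
    # character once they come within detection_range (1 block for the
    # self-preserving monster, 2 blocks for the aggressive one).
    safe = [m for m in state.legal_moves(monster)
            if not state.in_blast_radius(state.result(monster, m))]
    moves = safe or state.legal_moves(monster)  # fall back if nothing is safe
    if state.distance(monster.pos, state.character_pos) <= detection_range:
        # Within detection range: take the move that closes the gap.
        return min(moves, key=lambda m: state.distance(
            state.result(monster, m), state.character_pos))
    return random.choice(moves)
```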

Approximate Q-Learning

We experimented with various techniques such as minimax, expectimax, and A*. I was responsible for the final implementation of approximate Q-Learning, which proved to be the most effective and met the survival rate the project required.

Approximate Q-Learning represents Q(s, a) as a weighted sum of features extracted from the game state; the features I used for the algorithm feed into this sum (see the sketch after the next paragraph).

We ran the algorithm on each variant 100 times with different seeds, updating the weights after each run to arrive at the final optimized weights.
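
For reference, here is a minimal sketch of the update rule behind this training loop, assuming a linear Q-function over a feature dictionary. The feature names (dist_to_exit, dist_to_monster, in_blast_radius) and the state API (preview, legal_actions, etc.) are placeholders for illustration, not our actual project code:

```python
ALPHA = 0.1   # learning rate
GAMMA = 0.9   # discount factor

def extract_features(state, action):
    nxt = state.preview(action)  # assumed helper: simulate taking the move
    return {
        "bias": 1.0,
        "dist_to_exit": nxt.dist_to_exit(),                # placeholder feature
        "dist_to_monster": nxt.dist_to_nearest_monster(),  # placeholder feature
        "in_blast_radius": float(nxt.in_blast_radius()),   # placeholder feature
    }

def q_value(weights, state, action):
    # Linear approximation: Q(s, a) = sum_i w_i * f_i(s, a)
    return sum(weights.get(f, 0.0) * v
               for f, v in extract_features(state, action).items())

def update(weights, state, action, reward, next_state):
    # Temporal-difference error:
    #   delta = r + gamma * max_a' Q(s', a') - Q(s, a)
    best_next = max((q_value(weights, next_state, a)
                     for a in next_state.legal_actions()), default=0.0)
    delta = reward + GAMMA * best_next - q_value(weights, state, action)
    # Each weight moves in proportion to its feature's activation:
    #   w_i += alpha * delta * f_i(s, a)
    for f, v in extract_features(state, action).items():
        weights[f] = weights.get(f, 0.0) + ALPHA * delta * v
```

During training, actions are typically chosen epsilon-greedily (mostly the argmax of Q, occasionally random) so the agent keeps exploring; once training runs finish, the learned weights are frozen and the agent simply plays greedily.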

Variant 1

The easiest variant, 100% survival rate

Variant 3

A bit harder with the self-preserving monster, 90% survival rate

Variant 5

The hardest variant with 2 monsters, 80% survival rate

Future Improvements

Due to time constraints, there are approaches we haven't experimented with that could improve performance. Some improvements I'd like to implement in the future include: