Weighted Majority / Multiplicative Weights Update Method Demo

Adapted from Jeremy Kun's MWUA demo (source: github)

There are four experts, and each round they compete to get a reward. You get to pick the rewards each round.

The Weighted Majority (WM) / Multiplicative Weights Update Method (MWUM) follows the advice of one expert in each round and, over time, tries to perform as well as the best expert.

The WM algorithm picks the expert with the maximum weight, while MWUM is a randomized algorithm that picks each expert with probability proportional to its weight.
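
The two selection rules described above can be sketched in Python. This is a minimal illustration, not the demo's actual source: the function names `pick_wm` and `pick_mwum` and the example weights are hypothetical, and breaking ties in favor of the earliest expert is an assumption.

```python
import random

EXPERTS = ["Alaknanda", "Bhagirathi", "Chambal", "Dhauliganga"]

def pick_wm(weights):
    """WM: return the index of the expert with the maximum weight.

    Assumption: ties go to the earliest expert in the list.
    """
    return max(range(len(weights)), key=lambda i: weights[i])

def pick_mwum(weights, rng=random):
    """MWUM: sample an expert index with probability proportional to its weight."""
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]

# Hypothetical weights, chosen only for illustration.
weights = [1.0, 2.0, 4.0, 1.0]
print("WM picks:", EXPERTS[pick_wm(weights)])
print("MWUM picks:", EXPERTS[pick_mwum(weights)])
```

With these weights WM always picks Chambal, while MWUM picks Chambal with probability 4/8 and may pick any of the others.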

Algorithm state

Experts: Alaknanda Bhagirathi Chambal Dhauliganga
Cumulative rewards: 0 0 0 0
Weights:
Weights (normalized):
Learning rate:
WM cumulative performance
MWUM expected performance
MWUM cumulative performance
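
The state fields listed above could be maintained roughly as in the sketch below. The page does not state its update rule, so the reward-form rule w_i <- w_i * (1 + learning_rate * r_i) is an assumption, as are the helper names `normalize`, `expected_reward`, and `update` and the example values.

```python
def normalize(weights):
    """Scale the weights so they sum to 1 (the selection probabilities)."""
    total = sum(weights)
    return [w / total for w in weights]

def expected_reward(weights, rewards):
    """Expected reward of the randomized rule for one round: sum of p_i * r_i."""
    return sum(p * r for p, r in zip(normalize(weights), rewards))

def update(weights, rewards, learning_rate):
    """Assumed multiplicative update: w_i <- w_i * (1 + learning_rate * r_i)."""
    return [w * (1 + learning_rate * r) for w, r in zip(weights, rewards)]

# Hypothetical round: uniform starting weights, rewards in [0, 1].
weights = [1.0, 1.0, 1.0, 1.0]
rewards = [0.0, 1.0, 0.5, 0.0]
learning_rate = 0.5

print(expected_reward(weights, rewards))  # 0.375
weights = update(weights, rewards, learning_rate)
print(weights)                            # [1.0, 1.5, 1.25, 1.0]
```

Under this assumed rule, experts that earn higher rewards gain weight faster, so both selection rules drift toward the best-performing expert over many rounds.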

Rewards (in the range [0,1])

Experts: Alaknanda Bhagirathi Chambal Dhauliganga
Round 1:

For round 1, MWUM picked while WM picked