Adapted from Jeremy Kun's MWUA demo (source: GitHub)

There are four experts, and each round they compete to get a reward. You get to pick the rewards each round.

The Weighted Majority (WM) algorithm and the Multiplicative Weights Update Method (MWUM) each follow the advice of one expert in every round and, over time, try to perform as well as the best expert.

The WM algorithm picks the expert with the maximum weight, while MWUM is a randomized algorithm that picks each expert with probability proportional to its weight.
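
The two selection rules can be sketched in Python as follows. This is a minimal illustration, not the demo's actual code: the function names pick_wm, pick_mwum, and update_weights, the default learning rate of 0.5, and the update rule that scales each weight by (1 + learning_rate * reward) are all assumptions made for the example.

```python
import random


def pick_wm(weights):
    """Deterministic rule: index of the expert with the largest weight.

    Ties are broken in favor of the earliest expert (an assumption).
    """
    return max(range(len(weights)), key=lambda i: weights[i])


def pick_mwum(weights, rng=random):
    """Randomized rule: sample an expert index with probability
    proportional to its weight."""
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]


def update_weights(weights, rewards, learning_rate=0.5):
    """Assumed multiplicative update for rewards in [0, 1]:
    each weight is scaled by (1 + learning_rate * reward)."""
    return [w * (1 + learning_rate * r) for w, r in zip(weights, rewards)]


experts = ["Alaknanda", "Bhagirathi", "Chambal", "Dhauliganga"]
weights = [1.0, 1.0, 1.0, 1.0]      # every expert starts with equal weight
rewards = [0.0, 1.0, 0.5, 0.0]      # hypothetical rewards for one round

weights = update_weights(weights, rewards)
wm_choice = experts[pick_wm(weights)]       # expert with the largest weight
mwum_choice = experts[pick_mwum(weights)]   # random, weighted by the weights
```

With these hypothetical rewards, Bhagirathi ends the round with the largest weight, so the WM rule would follow Bhagirathi next, while the MWUM rule could still pick any of the four experts.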

Algorithm state

Experts:	Alaknanda	Bhagirathi	Chambal	Dhauliganga
Cumulative rewards:	0	0	0	0
Weights:
Weights (normalized):
Learning rate:
WM cumulative performance
MWUM expected performance
MWUM cumulative performance
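
One plausible reading of the normalized-weights and expected-performance rows is sketched below. The helper names normalize and expected_reward are hypothetical, and treating expected performance as the weight-proportional average of a round's rewards is an assumption, not something the demo states.

```python
def normalize(weights):
    """Divide each weight by the total so the results sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]


def expected_reward(weights, rewards):
    """Expected reward of the randomized rule for one round:
    the sum over experts of (selection probability * reward)."""
    return sum(p * r for p, r in zip(normalize(weights), rewards))


weights = [1.0, 1.0, 1.0, 1.0]   # equal initial weights
rewards = [0.0, 1.0, 0.5, 0.0]   # hypothetical rewards for one round
round_expectation = expected_reward(weights, rewards)
```

Under this reading, the cumulative rows would be running sums: the expected-performance row adds each round's expectation, while the cumulative-performance rows add the reward actually earned by the expert each algorithm picked.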

Rewards (in the range [0,1])

Experts:	Alaknanda	Bhagirathi	Chambal	Dhauliganga
Round 1:

Weighted Majority / Multiplicative Weights Update Method Demo


For round 1, MWUM picked while WM picked