Adapted from Jeremy Kun's MWUA demo (source: GitHub)
There are four experts, and each round they compete to get a reward. You get to pick the rewards each round.
The Weighted Majority (WM) algorithm and the Multiplicative Weights Update Method (MWUM) each follow the advice of one expert per round, and over time try to perform as well as the best expert.
The WM algorithm picks the expert with the maximum weight, while the MWUM is a randomized algorithm that picks each expert with probability proportional to its weight.
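The two selection rules described above can be sketched as follows. This is only an illustration, not the demo's actual code: the function names `wm_pick` and `mwum_pick` and the example weights are made up for this sketch.

```python
import random

def wm_pick(weights):
    """Deterministic rule: index of the expert with the maximum weight
    (ties go to the earliest expert)."""
    return max(range(len(weights)), key=lambda i: weights[i])

def mwum_pick(weights, rng=random):
    """Randomized rule: sample an index with probability proportional
    to its weight."""
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]

experts = ["Alaknanda", "Bhagirathi", "Chambal", "Dhauliganga"]
weights = [1.0, 2.5, 0.5, 1.0]  # hypothetical values, for illustration only

print(experts[wm_pick(weights)])    # always the max-weight expert
print(experts[mwum_pick(weights)])  # varies from run to run
```

With these weights the deterministic rule always follows Bhagirathi, while the randomized rule follows Bhagirathi half of the time (2.5 out of a total weight of 5.0).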
Algorithm state
Experts: | Alaknanda | Bhagirathi | Chambal | Dhauliganga |
---|---|---|---|---|
Cumulative rewards: | 0 | 0 | 0 | 0 |
Weights: | | | | |
Weights (normalized): | | | | |
Learning rate: | | | | |
WA cumulative performance | | | | |
MWUA Expected performance | | | | |
MWUA cumulative performance |
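The rows of the state table could be computed roughly as below. The page does not state the update rule, so this sketch assumes a common reward-based form, w_i ← w_i · (1 + η · r_i), with all weights starting at 1. The function names and the learning rate of 0.5 are likewise assumptions for illustration.

```python
def normalize(weights):
    """Scale the weights so they sum to 1 (the 'normalized' row)."""
    total = sum(weights)
    return [w / total for w in weights]

def expected_reward(weights, rewards):
    """Expected reward of the randomized rule for one round."""
    return sum(p * r for p, r in zip(normalize(weights), rewards))

def update(weights, rewards, learning_rate):
    """Assumed multiplicative update: w_i <- w_i * (1 + eta * r_i)."""
    return [w * (1 + learning_rate * r) for w, r in zip(weights, rewards)]

weights = [1.0, 1.0, 1.0, 1.0]   # assumed initial weights
rewards = [1.0, 0.0, 0.5, 0.0]   # one round of rewards in [0, 1]
eta = 0.5                        # hypothetical learning rate

print(expected_reward(weights, rewards))  # 0.375
weights = update(weights, rewards, eta)
print(weights)                            # [1.5, 1.0, 1.25, 1.0]
```

Experts that earn higher rewards gain weight faster, so both selection rules drift toward them over successive rounds.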
Rewards (in the range [0,1])
Experts: | Alaknanda | Bhagirathi | Chambal | Dhauliganga |
---|---|---|---|---|
Round 1: |