Abstract:
Deep learning has witnessed a 300,000X growth in compute demand in the last 6 years, matched by comparable growth in hardware capability, primarily riding on half-precision compute and systolic arrays. Compute architects are now looking for the next 10-100X architectural opportunity, and there are no easy answers. This is where a theoretical understanding of how deep-learning compute manipulates probability distributions becomes necessary. We hope that this understanding will lead to ways of reducing compute and bandwidth needs and deliver the next 10X in performance. We will talk about some empirical results and promising directions (locality-sensitive hashing (LSH), projections, matrix factorization, sub-8-bit precision compute, and others), and try to motivate the TIFR community to delve into these problems.