It’s easy to find the bottom of a bowl no matter where you start — if you toss a marble anywhere into the bowl, it’ll roll downhill and find its way to bottom. What does this have to do with Machine Learning? A natural way to try to construct an accurate classifier is to minimize the number of prediction errors the classifier makes on training data. The trouble is, even for moderate-sized data sets, minimizing the number of training errors is a computationally intractable problem. A popular way around this’s to assign different training errors different costs and to minimize the total cost. If the costs are assigned in a certain way (according to a “convex loss function“), the total cost can be efficiently minimized the way a marble rolls to bottom of a bowl.”
In a recent paper, Rocco Servedio and Phil Long show that no algorithm that works this way can achieve a simple and natural theoretical noise-tolerance guarantee that can be achieved by other kinds of algorithms.