Welcome to MNIST Battleground

Deep Learning is full of choices:

What architecture to pick?
What loss function?
What learning rate?
What activation functions?
...and so on.

Many words have been spilled across the internet about all of these questions, but those words are rarely backed up with data.

That is what we're doing here.

We're starting with a simple dataset that everyone should be familiar with: MNIST. We'll be testing everything we can think of and posting the results here. More data. Less fluff.

In particular, we'll be presenting results in terms of accuracy (as a percentage, on the validation set) versus FLOPs spent on training. The latter is a twist on the usual practice of counting training batches, but when different architecture choices can involve substantially different computational costs, it's important to have an apples-to-apples comparison.
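
To make the FLOPs axis concrete, here's a minimal sketch of how training FLOPs for a small fully connected network could be estimated. The function names and the 6-FLOPs-per-parameter-per-example rule of thumb (one multiply plus one add per weight on the forward pass, roughly twice that cost for the backward pass) are illustrative assumptions, not a claim about how any particular framework counts:

```python
# Rough training-FLOPs estimate for a fully connected network.
# Illustrative assumptions:
#   - forward pass: ~2 FLOPs per parameter per example (multiply + add),
#   - backward pass: ~2x the forward pass,
#   - so training costs ~6 FLOPs per parameter per example.

def mlp_param_count(layer_sizes):
    """Number of weights and biases in a dense network, e.g. [784, 256, 10]."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

def training_flops(layer_sizes, num_examples):
    """Approximate total FLOPs to train on `num_examples` examples."""
    return 6 * mlp_param_count(layer_sizes) * num_examples

# Example: one epoch of MNIST (60,000 images) on a 784-256-10 MLP.
params = mlp_param_count([784, 256, 10])
flops = training_flops([784, 256, 10], 60_000)
print(f"{params:,} parameters, ~{flops:.3g} FLOPs per epoch")
```

Under these assumptions, one epoch on a 784-256-10 MLP costs roughly 7×10¹⁰ FLOPs; the horizontal axis of our plots is just this kind of count accumulated over a full training run.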