numpy.expand_dims(a, axis): Expand the shape of an array. Insert a new axis that will appear at the axis ...

Intuitively, SGD can be faster than batch gradient descent when the dataset contains redundancy: if the same point occurs many times, SGD could converge before batch gradient descent completes a single iteration.
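
The redundancy argument can be illustrated with a minimal sketch, assuming a toy noise-free least-squares problem (y = 3x) in which four distinct points are each repeated 250 times. The helper names (`make_redundant_data`, `batch_step`, `sgd_epoch`), the learning rate, and the data itself are hypothetical choices made for illustration and do not come from the text above. `np.expand_dims` is used only to turn the 1-D feature vector into an (N, 1) column matrix.

```python
import numpy as np

def make_redundant_data(n_unique=4, n_copies=250):
    # Hypothetical toy data: y = 3x, with each distinct point repeated n_copies times.
    x_unique = np.linspace(-1.0, 1.0, n_unique)
    x = np.tile(x_unique, n_copies)      # shape (N,), N = n_unique * n_copies
    X = np.expand_dims(x, axis=1)        # shape (N, 1): new axis inserted at position 1
    y = 3.0 * x
    return X, y

def batch_step(w, X, y, lr):
    # One full-batch gradient step on mean squared error; reads all N rows once.
    grad = 2.0 * X.T @ (X @ w - y) / len(y)
    return w - lr * grad

def sgd_epoch(w, X, y, lr, rng):
    # One pass of SGD: N single-example updates, also reading each row once.
    for i in rng.permutation(len(y)):
        xi = X[i]
        grad = 2.0 * xi * (xi @ w - y[i])
        w = w - lr * grad
    return w

X, y = make_redundant_data()
w0 = np.zeros(1)

# Both methods get the same budget: one pass over the N examples.
w_batch = batch_step(w0, X, y, lr=0.1)
w_sgd = sgd_epoch(w0, X, y, lr=0.1, rng=np.random.default_rng(0))

print("X shape:", X.shape)
print("batch GD after 1 iteration:", w_batch)
print("SGD after 1 pass:", w_sgd)
```

With the same budget of one pass over the data, batch gradient descent has taken a single step and its estimate is still far from the true weight of 3, while SGD has taken N steps and has effectively converged. The gap is this wide because the data are noise-free and heavily duplicated; on noisy or less redundant data the advantage would be smaller.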

