The loss function should be convex in shape to ensure convergence. A smooth, differentiable, convex loss function provides better convergence. As the learning proceeds, the value of the loss function should decrease and eventually become stable.