Polynomial decay

  1. Advanced techniques for early stopping: Learning rate schedules, adaptive optimization, and more
  2. Mini-Batch Gradient Descent: AI (Brace For These Hidden GPT Dangers)