Forwarded from Tensorflow (@CVision) (Vahid Reza Khazaie)
New Google Brain Optimizer Reduces BERT Pre-Training Time From Days to Minutes
Cutting the pre-training time of the BERT language model from three days to 76 minutes with a new optimizer!
Google Brain researchers have proposed LAMB (Layer-wise Adaptive Moments optimizer for Batch training), a new optimizer that reduces pre-training time for Google's NLP model BERT (Bidirectional Encoder Representations from Transformers) from three days to just 76 minutes.
Paper: https://arxiv.org/abs/1904.00962
Blog post: https://medium.com/syncedreview/new-google-brain-optimizer-reduces-bert-pre-training-time-from-days-to-minutes-b454e54eda1d
#BERT #language_model #optimizer
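
The core idea of LAMB is to rescale each layer's Adam-style update by a layer-wise trust ratio ||w|| / ||update||, which keeps per-layer step sizes stable even at very large batch sizes. Below is a minimal NumPy sketch of one LAMB step for a single parameter tensor; hyperparameter values are illustrative defaults (not the paper's tuned settings), and the paper's scaling function φ is taken as the identity here.

```python
import numpy as np

def lamb_update(w, g, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999,
                eps=1e-6, weight_decay=0.01):
    """One LAMB step for a single layer's parameter tensor w.

    m, v are the Adam-style first/second moment buffers; t is the
    1-based step count. Returns the updated (w, m, v).
    Illustrative sketch only, not the authors' implementation.
    """
    # Adam-style moment estimates with bias correction
    m = beta1 * m + (1 - beta1) * g
    v = beta2 * v + (1 - beta2) * g * g
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)

    # Adam direction plus decoupled weight decay
    update = m_hat / (np.sqrt(v_hat) + eps) + weight_decay * w

    # Layer-wise trust ratio: scale the step by ||w|| / ||update||
    w_norm = np.linalg.norm(w)
    u_norm = np.linalg.norm(update)
    trust_ratio = w_norm / u_norm if w_norm > 0 and u_norm > 0 else 1.0

    w = w - lr * trust_ratio * update
    return w, m, v
```

In a full training loop this update would be applied per layer, which is what distinguishes LAMB from plain Adam: each layer gets its own trust ratio, so layers with small weights are not overwhelmed by large-batch gradients.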