Data Science by ODS.ai 🦜
Emerging Cross-lingual Structure in Pretrained Language Models

tl;dr – the authors dissect mBERT & XLM and show that independently trained monolingual BERTs learn similar representations

They offer an ablation study on bilingual #MLM that considers all the relevant factors. Only when sharing is restricted to the top 2 layers of the #transformer does cross-lingual transfer finally break.
Importance of factors: parameter sharing >> domain similarity, anchor points, language-universal softmax, joint BPE

Monolingual BERT representations can be aligned at both the word and sentence level with an orthogonal mapping. CKA visualizes the similarity of monolingual & bilingual BERT.
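
To make those two tools concrete, here is a minimal sketch (not the authors' code): orthogonal Procrustes alignment between two feature spaces plus linear CKA similarity. The matrices X and Y, their dimensions, and the toy data are illustrative assumptions standing in for features extracted from two monolingual BERTs on parallel text.

```python
# Rough sketch, assuming X and Y are feature matrices (n_examples x hidden_dim)
# taken from two separately trained monolingual BERTs on aligned data.
import numpy as np

def orthogonal_map(X, Y):
    """Orthogonal Procrustes: W = argmin ||XW - Y||_F over orthogonal W."""
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt

def linear_cka(X, Y):
    """Linear CKA similarity between two representation matrices."""
    X = X - X.mean(axis=0)   # center each feature dimension
    Y = Y - Y.mean(axis=0)
    hsic = np.linalg.norm(Y.T @ X, "fro") ** 2
    return hsic / (np.linalg.norm(X.T @ X, "fro") * np.linalg.norm(Y.T @ Y, "fro"))

# Toy demo: Y is a rotated, noisy copy of X (a stand-in for the "other language").
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 768))
Q, _ = np.linalg.qr(rng.normal(size=(768, 768)))   # hidden orthogonal rotation
Y = X @ Q + 0.05 * rng.normal(size=X.shape)
W = orthogonal_map(X, Y)
print("alignment error before:", np.linalg.norm(X - Y))
print("alignment error after :", np.linalg.norm(X @ W - Y))
print("linear CKA(X, Y)      :", linear_cka(X, Y))  # high despite different axes
```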

Paper: https://arxiv.org/abs/1911.01464

#nlp #multilingual