Pre-training in Natural Language Processing
The Pre-training Process
Pre-trained Models
BERT
BERT (from Google): by Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova: BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
GPT
GPT (from OpenAI): by Alec Radford, Karthik Narasimhan, Tim Salimans, and Ilya Sutskever: Improving Language Understanding by Generative Pre-Training
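GPT's generative pre-training is a standard left-to-right language modeling objective: the Transformer predicts each token from the tokens before it. The sketch below (assuming PyTorch purely for illustration; the helper name is made up) shows the usual way the next-token cross-entropy loss is computed by shifting inputs and targets by one position.

```python
# Minimal sketch of the causal (next-token) language modeling loss used in
# generative pre-training: position t is predicted from positions < t.
import torch
import torch.nn.functional as F

def causal_lm_loss(logits, token_ids):
    """logits: (batch, seq_len, vocab); token_ids: (batch, seq_len)."""
    shift_logits = logits[:, :-1, :]     # predictions for positions 0..n-2
    shift_targets = token_ids[:, 1:]     # the "next token" at each position
    return F.cross_entropy(
        shift_logits.reshape(-1, shift_logits.size(-1)),
        shift_targets.reshape(-1),
    )

if __name__ == "__main__":
    vocab_size, batch, seq_len = 100, 2, 8
    token_ids = torch.randint(0, vocab_size, (batch, seq_len))
    logits = torch.randn(batch, seq_len, vocab_size)   # stand-in for model output
    print(causal_lm_loss(logits, token_ids))
```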
GPT-2
GPT-2 (from OpenAI): by Alec Radford, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei, and Ilya Sutskever: Language Models are Unsupervised Multitask Learners
Transformer-XL
Transformer-XL (from Google/CMU): by Zihang Dai, Zhilin Yang, Yiming Yang, Jaime Carbonell, Quoc V. Le, and Ruslan Salakhutdinov: Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context
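Transformer-XL extends the usable context past a fixed segment length through segment-level recurrence: the hidden states of the previous segment are cached without gradients and re-exposed as extra memory that the current segment can attend to, combined with relative positional encodings. The toy sketch below shows only the caching loop; encode is a hypothetical stand-in for a real Transformer layer, not Transformer-XL's actual attention.

```python
# Minimal sketch of segment-level recurrence: process a long sequence in
# fixed-length segments, caching each segment's hidden states as memory
# for the next one.
import torch

def encode(segment_emb, memory):
    """Hypothetical stand-in layer: mixes the segment with [memory; segment]."""
    context = torch.cat([memory, segment_emb], dim=1) if memory is not None else segment_emb
    return segment_emb + context.mean(dim=1, keepdim=True)

def process_long_sequence(embeddings, segment_len=4, mem_len=4):
    memory = None
    outputs = []
    for start in range(0, embeddings.size(1), segment_len):
        seg = embeddings[:, start:start + segment_len]
        hidden = encode(seg, memory)
        outputs.append(hidden)
        # Cache the newest hidden states as memory for the next segment;
        # gradients are not propagated through the cache.
        memory = hidden.detach()[:, -mem_len:]
    return torch.cat(outputs, dim=1)

if __name__ == "__main__":
    x = torch.randn(1, 16, 8)          # (batch, long sequence, hidden size)
    print(process_long_sequence(x).shape)
```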
XLNet
XLNet (from Google/CMU): by Zhilin Yang, Zihang Dai, Yiming Yang, Jaime Carbonell, Ruslan Salakhutdinov, and Quoc V. Le: XLNet: Generalized Autoregressive Pretraining for Language Understanding
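XLNet's generalized autoregressive objective is permutation language modeling: a random factorization order over the sequence is sampled, and each token is predicted from the tokens that precede it in that order, so the model sees bidirectional context while staying autoregressive. The sketch below only builds the attention mask implied by one sampled order; it is not XLNet's two-stream attention implementation.

```python
# Minimal sketch of permutation language modeling: sample a factorization
# order, then allow position i to attend only to positions earlier in it.
import torch

def permutation_attention_mask(seq_len):
    """Return (order, mask) where mask[i, j] = True means i may attend to j."""
    order = torch.randperm(seq_len)              # random factorization order
    rank = torch.empty(seq_len, dtype=torch.long)
    rank[order] = torch.arange(seq_len)          # rank[i] = position of token i in the order
    # Token i may attend to token j iff j comes strictly earlier in the order.
    mask = rank.unsqueeze(1) > rank.unsqueeze(0)
    return order, mask

if __name__ == "__main__":
    order, mask = permutation_attention_mask(5)
    print("factorization order:", order.tolist())
    print(mask)
```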
XLM
XLM (from Facebook): by Guillaume Lample and Alexis Conneau: Cross-lingual Language Model Pretraining
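Among the objectives in XLM is translation language modeling (TLM): a sentence and its translation are concatenated into a single input with language identifiers, and masked tokens can then be recovered using context from either language. The sketch below illustrates only that input construction; the tokens, separator, and language codes are illustrative, not XLM's real preprocessing.

```python
# Minimal sketch of a TLM-style training example: concatenate a parallel
# sentence pair, tag each position with a language id, and mask tokens on
# both sides so they must be predicted from bilingual context.
import random

MASK = "[MASK]"

def build_tlm_example(src_tokens, tgt_tokens, mask_prob=0.15):
    tokens = src_tokens + ["</s>"] + tgt_tokens                         # parallel pair in one stream
    langs = ["en"] * (len(src_tokens) + 1) + ["fr"] * len(tgt_tokens)   # per-position language ids
    corrupted, labels = [], []
    for tok in tokens:
        if tok != "</s>" and random.random() < mask_prob:
            corrupted.append(MASK)
            labels.append(tok)        # to be predicted from either language's context
        else:
            corrupted.append(tok)
            labels.append(None)
    return corrupted, langs, labels

if __name__ == "__main__":
    print(build_tlm_example(["the", "cat", "sleeps"], ["le", "chat", "dort"]))
```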