Introduction to Transformers for NLP: With the ...
: A high-level overview detailing how transformers became the go-to architecture not just for NLP, but also for computer vision and audio processing.
For a broader introduction to the field, these resources are also highly recommended:
"Attention Is All You Need" (2017): The seminal paper by Vaswani et al. that introduced the transformer architecture, replacing traditional recurrent networks with the self-attention mechanism.
: A systematic review from 2024 that highlights how these models solve various NLP problems across different languages and domains.
: A 2023 review that demystifies the architecture by breaking it down into its core components for beginners.
[2311.17633] Introduction to Transformers: an NLP Perspective
An essential paper for anyone starting out is "Introduction to Transformers: an NLP Perspective" by Tong Xiao and Jingbo Zhu (arXiv:2311.17633). It is a comprehensive 119-page guide that bridges the gap between basic concepts and recent advanced techniques.
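
For readers who want a concrete picture of the self-attention mechanism before opening these papers, here is a minimal sketch of scaled dot-product attention, softmax(Q K^T / sqrt(d_k)) V, in dependency-free Python. The toy token vectors and the function names (softmax, self_attention) are illustrative assumptions, not taken from any of the resources listed here. The sketch also assumes identity projections (Q = K = V), whereas real transformers apply learned projection matrices and use several attention heads.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(queries, keys, values):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V.

    Each argument is a list of vectors, one per token.
    Returns (outputs, weights), where weights[i][j] is how much
    token i attends to token j.
    """
    d_k = len(keys[0])
    d_v = len(values[0])
    outputs, weights = [], []
    for q in queries:
        # Similarity of this query with every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        row = softmax(scores)
        weights.append(row)
        # Output is the attention-weighted average of the value vectors.
        outputs.append(
            [sum(w * v[j] for w, v in zip(row, values)) for j in range(d_v)]
        )
    return outputs, weights


# Toy input: three tokens with 2-dimensional embeddings (made-up numbers).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# Assumption: identity projections, so Q = K = V = tokens.
outputs, weights = self_attention(tokens, tokens, tokens)

for i, row in enumerate(weights):
    print(f"token {i} attends with weights {[round(w, 3) for w in row]}")
```

Each row of weights sums to 1, so every output vector is a weighted average of all value vectors. Every token can draw on every other token in a single step, with no recurrence over the sequence.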