Elsevier interview question

How do transformers work? What is multi-head attention? What is positional encoding?
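One way to make the three parts of this question concrete is a small NumPy sketch. It is not a reference implementation: it assumes the sinusoidal positional encoding and scaled dot-product attention from the original Transformer design, a single unbatched sequence, and no masking, dropout, layer normalization, or feed-forward sublayer. All names here (`sinusoidal_positional_encoding`, `multi_head_attention`, `w_q`, `w_k`, `w_v`, `w_o`) and the toy sizes are illustrative choices, and the weights are random rather than trained.

```python
import numpy as np


def sinusoidal_positional_encoding(seq_len, d_model):
    """Fixed sine/cosine position signal, shape (seq_len, d_model). Assumes even d_model."""
    positions = np.arange(seq_len)[:, None]  # (seq_len, 1)
    # One frequency per pair of dimensions: 1 / 10000^(2i / d_model)
    freqs = np.exp(-np.log(10000.0) * np.arange(0, d_model, 2) / d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(positions * freqs)  # even dimensions
    pe[:, 1::2] = np.cos(positions * freqs)  # odd dimensions
    return pe


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


def multi_head_attention(x, w_q, w_k, w_v, w_o, num_heads):
    """Self-attention over x of shape (seq_len, d_model), split into num_heads heads."""
    seq_len, d_model = x.shape
    d_head = d_model // num_heads  # assumes d_model is divisible by num_heads

    def split_heads(t):
        # (seq_len, d_model) -> (num_heads, seq_len, d_head)
        return t.reshape(seq_len, num_heads, d_head).transpose(1, 0, 2)

    q = split_heads(x @ w_q)
    k = split_heads(x @ w_k)
    v = split_heads(x @ w_v)

    # Each head scores every position against every other position.
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)  # (num_heads, seq_len, seq_len)
    weights = softmax(scores, axis=-1)  # each row sums to 1
    heads = weights @ v  # (num_heads, seq_len, d_head)

    # Concatenate the heads and mix them with the output projection.
    concat = heads.transpose(1, 0, 2).reshape(seq_len, d_model)
    return concat @ w_o, weights


# Toy usage with random (untrained) embeddings and weights.
rng = np.random.default_rng(0)
seq_len, d_model, num_heads = 5, 8, 2
tokens = rng.normal(size=(seq_len, d_model))
# Attention alone ignores token order, so position information is added to the input.
x = tokens + sinusoidal_positional_encoding(seq_len, d_model)
w_q, w_k, w_v, w_o = (rng.normal(size=(d_model, d_model)) * 0.1 for _ in range(4))
out, weights = multi_head_attention(x, w_q, w_k, w_v, w_o, num_heads)
print(out.shape, weights.shape)  # (5, 8) (2, 5, 5)
```

Under these assumptions, each head works in its own `d_head`-dimensional slice of the query, key, and value projections, so different heads can attend to different relationships in the same sequence. A full transformer layer would wrap this step in residual connections and layer normalization and follow it with a position-wise feed-forward network.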