Exploring Attention Mechanisms In Transformer Models For Machine Translation

M. Sumanjali, P. Swathi

This work examines the use of attention mechanisms in Transformer models for machine translation, with the aim of improving both the effectiveness and the interpretability of these models. Through a systematic analysis of attention mechanisms such as self-attention and cross-attention, the study investigates their impact on translation accuracy and computational efficiency. The research also explores how different attention layers interact and how this interplay shapes the overall performance of Transformer-based systems across diverse linguistic contexts. Our findings indicate notable improvements in translation quality and model interpretability when advanced attention strategies are leveraged. The work provides a comprehensive evaluation of current techniques and proposes optimizations that could advance the state of the art in machine translation.
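To make the distinction between the two attention types discussed above concrete, the following is a minimal, illustrative sketch of scaled dot-product attention, showing self-attention (queries, keys, and values from the same sequence) versus cross-attention (decoder queries attending over encoder states). It is not the authors' implementation; the function name `scaled_dot_product_attention`, the toy dimensions, and the random inputs are assumptions for illustration, and learned projection matrices and multi-head splitting are omitted for brevity.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Return attended values and attention weights.

    Q: (n_q, d_k) queries, K: (n_k, d_k) keys, V: (n_k, d_v) values.
    """
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                        # (n_q, n_k) similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)         # softmax over keys
    return weights @ V, weights

rng = np.random.default_rng(0)
d_model = 8
src = rng.normal(size=(5, d_model))   # encoder states (source sentence, 5 tokens)
tgt = rng.normal(size=(3, d_model))   # decoder states (target prefix, 3 tokens)

# Self-attention: each target token attends to the other target tokens.
self_out, self_w = scaled_dot_product_attention(tgt, tgt, tgt)

# Cross-attention: target queries attend over source keys/values, which is
# where a translation model aligns target tokens to source tokens.
cross_out, cross_w = scaled_dot_product_attention(tgt, src, src)

print("self-attention weights:", self_w.shape)    # (3, 3)
print("cross-attention weights:", cross_w.shape)  # (3, 5)
```

Inspecting `cross_w` row by row is one simple way such attention maps can be read as soft alignments between target and source tokens, which is the kind of interpretability signal the abstract refers to.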