Aggression detection refers to the identification of offensive or harmful expressions in online communications and is crucial due to its significant societal implications and impact on online safety. However, detecting aggression in Bengali texts is particularly challenging because of the language's complex morphology and the scarcity of annotated datasets and computational tools. Although the task has been extensively studied in high-resource languages, research on the resource-constrained Bengali language remains limited. To address this gap, we develop a new dataset, BOLT (Bangla Offensive and Lethal Texts), consisting of 4,027 annotated instances classified into four classes by severity of aggression: no aggression, hate speech, vandalism, and atrocity. To optimize model performance, we propose a modified attention mechanism within transformers. Unlike conventional self-attention, which computes pairwise interactions between all tokens and treats their importance uniformly, our approach introduces a learnable token-wise weight matrix to assess the relevance of each token for aggressive-text classification. The attention scores are obtained by multiplying the transformer's token embeddings with this matrix, generating task-specific weights that better capture contextual significance. Moreover, we use a weighted ensemble of the optimized transformer models to combine their individual strengths. Experimental results show that our proposed approach outperforms existing methods and baseline models, achieving both an accuracy and a weighted F1-score of 87%. Our dataset and method pave the way for advances in Bengali NLP, support content moderation tools, and help create safer digital platforms through automated aggression detection.
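The token-wise weighting described above can be sketched in a few lines of NumPy. This is an illustrative toy, not the paper's implementation: the dimensions, random embeddings, and variable names are assumptions, and in practice the weight matrix `W` would be learned jointly with the transformer by backpropagation rather than sampled randomly.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D score vector.
    e = np.exp(x - x.max())
    return e / e.sum()

rng = np.random.default_rng(0)
n_tokens, d = 6, 8                 # toy sequence length and hidden size (assumed)
E = rng.normal(size=(n_tokens, d)) # stand-in for the transformer's token embeddings
W = rng.normal(size=(d, 1))        # learnable token-wise weight matrix (here: random)

# Multiply embeddings by the weight matrix to get one relevance score per token,
# then normalize into task-specific attention weights.
scores = softmax((E @ W).ravel())

# Attention-weighted pooling of the token embeddings into a single
# sequence representation for the aggression classifier head.
pooled = scores @ E

print(scores.shape, pooled.shape)  # (6,) (8,)
```

Unlike pairwise self-attention, this assigns each token a single scalar weight, so the pooled vector emphasizes tokens the classifier finds most indicative of aggression.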