Accurate and early diagnosis of eye disease is essential to prevent irreversible vision loss and to enable prompt clinical intervention. In this paper, we propose a hybrid ensemble deep learning framework for multi-class eye disease detection from fundus images. The framework combines two convolutional neural networks, ResNet50 and InceptionV3, with two transformer-based models, Vision Transformer (ViT) and Swin Transformer, to exploit both local feature extraction and global context information. Preprocessing consisted of normalization, data augmentation, and SMOTE, applied to increase data variability and correct class imbalance. Three ensembles were built and compared: a transfer learning ensemble, a transformer ensemble, and a hybrid ensemble combining all four models. Experimental results show that the hybrid ensemble performed best, achieving an overall accuracy of 90.52% with high precision, recall, and F1-scores across the eye disease classes. These findings demonstrate the effectiveness of ensemble deep learning for building scalable and accurate diagnostic systems in ophthalmology.
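
As a rough illustration of how the three ensembles could be formed from the same four base models, the sketch below averages per-class probabilities across models (soft voting) and takes the most probable class for each image. The abstract does not state the combination rule, so soft voting is an assumption here, as is the grouping of ResNet50 and InceptionV3 into the transfer learning ensemble. The function name `soft_vote`, the dictionary keys, and the probability values are hypothetical and do not come from the experiments.

```python
from typing import Dict, List, Sequence


def soft_vote(probs_by_model: Dict[str, Sequence[Sequence[float]]]) -> List[int]:
    """Average class probabilities across models; return the argmax class per image.

    Assumption: soft voting is used for illustration only. The actual
    combination rule of the framework is not specified in the abstract.
    """
    model_outputs = list(probs_by_model.values())
    n_models = len(model_outputs)
    n_images = len(model_outputs[0])
    predictions = []
    for i in range(n_images):
        n_classes = len(model_outputs[0][i])
        # Mean probability of each class over all models for image i
        mean_probs = [
            sum(output[i][c] for output in model_outputs) / n_models
            for c in range(n_classes)
        ]
        predictions.append(max(range(n_classes), key=mean_probs.__getitem__))
    return predictions


# Hypothetical softmax outputs for 2 images and 3 classes (not real results)
outputs = {
    "resnet50": [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]],
    "inception_v3": [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]],
    "vit": [[0.3, 0.6, 0.1], [0.2, 0.2, 0.6]],
    "swin": [[0.7, 0.2, 0.1], [0.3, 0.3, 0.4]],
}

# Assumed grouping: CNNs form the transfer learning ensemble,
# ViT and Swin form the transformer ensemble, all four form the hybrid.
transfer_preds = soft_vote({k: outputs[k] for k in ("resnet50", "inception_v3")})
transformer_preds = soft_vote({k: outputs[k] for k in ("vit", "swin")})
hybrid_preds = soft_vote(outputs)
```

Averaging probabilities rather than counting hard votes lets a confident model outweigh several uncertain ones, which is one plausible reason to prefer it when combining architecturally different networks.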