The retina, a highly sensitive part of the eye, is crucial for vision, and retinal disease can seriously impair it. Optical coherence tomography (OCT) provides high-quality cross-sectional images of the retinal layers, making early identification of ocular diseases possible. We present an iterative average ensemble method to improve OCT image classification accuracy. The ensemble combines three pre-trained models, ResNet101v2, DenseNet201, and NASNetMobile, leveraging their complementary strengths. Evaluated on a real-world retinal OCT dataset of 84,495 JPEG images categorized as Normal, CNV, DME, and Drusen, the ensemble achieves 91.51% accuracy, surpassing traditional CNNs and other state-of-the-art techniques. For comparison, the individual models achieve 89.75% (DenseNet201), 89.21% (ResNet101v2), and 86.17% (NASNetMobile), highlighting the effectiveness of the ensemble for diagnosing retinal diseases.
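The averaging step at the core of such an ensemble can be illustrated with a minimal sketch. The abstract does not spell out the iterative procedure, so the code below assumes plain (optionally weighted) averaging of per-class probabilities from the three models; the function names `average_ensemble` and `predict_labels`, and the toy probabilities, are hypothetical and for illustration only.

```python
import numpy as np

# Class order is an assumption; the source only names the four categories.
CLASSES = ["Normal", "CNV", "DME", "Drusen"]


def average_ensemble(prob_list, weights=None):
    """Average per-class probabilities from several models.

    prob_list: list of arrays, each of shape (n_samples, n_classes).
    weights:   optional per-model weights; uniform if omitted.
    """
    probs = np.stack(prob_list, axis=0)  # (n_models, n_samples, n_classes)
    if weights is None:
        weights = np.ones(len(prob_list))
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()  # normalize so rows still sum to 1
    # Contract the model axis: result has shape (n_samples, n_classes).
    return np.tensordot(weights, probs, axes=1)


def predict_labels(avg_probs):
    """Map each averaged probability row to its most likely class name."""
    return [CLASSES[i] for i in np.argmax(avg_probs, axis=1)]


# Toy softmax outputs for two images (made-up numbers, not from the paper).
resnet_probs = np.array([[0.6, 0.2, 0.1, 0.1], [0.1, 0.5, 0.3, 0.1]])
densenet_probs = np.array([[0.5, 0.3, 0.1, 0.1], [0.2, 0.3, 0.4, 0.1]])
nasnet_probs = np.array([[0.7, 0.1, 0.1, 0.1], [0.1, 0.3, 0.5, 0.1]])

avg = average_ensemble([resnet_probs, densenet_probs, nasnet_probs])
labels = predict_labels(avg)
print(labels)
```

In this toy case the first model alone would label the second image CNV, while the averaged probabilities favor DME, which shows how disagreement between models is resolved by the mean. In practice the probability arrays would come from the fine-tuned networks' softmax outputs on the same batch of OCT images.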