Medicinal plants including Ocimum tenuiflorum L. (Tulsi), Azadirachta indica A. Juss. (Neem), and Kalanchoe pinnata (Lam.) Pers. (Patharkuchi) are essential sources of bioactive compounds, yet leaf diseases threaten their yield and phytochemical integrity. This study proposes LSeTNet, a lightweight hybrid Convolutional Neural Network (CNN)-Transformer architecture with Squeeze-and-Excitation (SE) blocks. It achieves 99.72% accuracy, a macro F1-score of 1.00, and an AUC of 1.00 across 12 disease classes (1,000 images per class after augmentation) using only 9.38 M parameters and 2.50 GFLOPs. Five-fold cross-validation yielded 99.74% ± 0.14% accuracy, with rapid convergence and no sign of overfitting. Explainable Artificial Intelligence (XAI) analyses using Gradient-weighted Class Activation Mapping (Grad-CAM; mean intensity: 0.1664–0.2702), Local Interpretable Model-agnostic Explanations (LIME), and t-distributed Stochastic Neighbor Embedding (t-SNE; silhouette score: 0.87) confirmed biologically meaningful attention on pathological regions. External validation on the independent BD-MediLeaves dataset (8 classes, 8,000 samples) achieved 99.42% accuracy and a macro F1-score of 0.99. With an inference latency of 6.98 ms per image and a memory footprint of 35.81 MB, LSeTNet is suited to real-time, edge-based deployment. It significantly outperforms DenseNet169 (95.56%), ViT-B16 (95.61%), and LW-CNN+SE (95.39%) according to paired t-tests, establishing a transparent, efficient, and generalizable benchmark for precision phytopathology and sustainable medicinal plant cultivation.
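
Because accuracy and macro F1-score are reported side by side, it may help to recall how the two metrics differ. The following is a minimal sketch of their standard definitions in plain Python, not the authors' evaluation code; the function names `macro_f1` and `accuracy` and the three toy class labels are hypothetical and chosen only for illustration.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores (standard macro averaging)."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1          # correct prediction for class t
        else:
            fp[p] += 1          # class p was predicted wrongly
            fn[t] += 1          # class t was missed
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in labels:
        # F1 = 2TP / (2TP + FP + FN), equivalent to the harmonic mean
        # of precision and recall for class c.
        denom = 2 * tp[c] + fp[c] + fn[c]
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy labels: three balanced classes, one misclassified sample.
y_true = ["healthy"] * 4 + ["leaf_spot"] * 4 + ["blight"] * 4
y_pred = list(y_true)
y_pred[0] = "leaf_spot"

print(f"accuracy = {accuracy(y_true, y_pred):.4f}")
print(f"macro F1 = {macro_f1(y_true, y_pred):.4f}")
```

On a class-balanced test set the two metrics tend to track each other closely, and a macro F1-score is typically reported to two decimals, so a value just below 1 can round to 1.00.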