Accurate demand forecasting underpins inventory planning, promotions, and workforce scheduling in retail. Although deep learning has advanced time-series forecasting, forecasting on structured retail data is still dominated by tree ensembles because of their strong accuracy–efficiency trade-offs. We introduce a resource-efficient deep ensemble tailored for structured sales forecasting. A compact MLP backbone feeds multiple heads trained with a diversity regularizer, and a lightweight residual corrector further refines predictions. On the Walmart weekly sales dataset, the proposed multi-head ensemble matches the accuracy of a five-member deep ensemble while using about 80% fewer parameters and training 4–5× faster. The boosted variant, which adds the residual corrector, further reduces error, reaching an RMSE within 2–3% of XGBoost while remaining compact. Ablation studies on ensemble size and diversity weight confirm robustness, the residuals are centered with light tails, and a case study shows improved capture of holiday peaks. These findings demonstrate that efficient ensembling can make neural methods competitive for structured retail forecasting when both scale and cost matter.
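
To make the parameter saving concrete, the following minimal sketch counts the weights of a shared-backbone multi-head model against a conventional deep ensemble of independent networks. The feature count, hidden widths, and the use of single linear heads are illustrative assumptions, not the configuration used in the experiments; the point is only that when the backbone holds most of the parameters, sharing it across five members removes roughly four fifths of the total.

```python
def mlp_params(layer_sizes):
    """Count weights and biases of a fully connected network."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


# Hypothetical sizes chosen for illustration only (not from the paper).
N_FEATURES = 20
HIDDEN = [128, 64]
N_MEMBERS = 5

# Conventional deep ensemble: N_MEMBERS independent networks,
# each with its own backbone and a scalar output layer.
single_network = mlp_params([N_FEATURES, *HIDDEN, 1])
deep_ensemble = N_MEMBERS * single_network

# Multi-head ensemble: one shared backbone plus N_MEMBERS linear heads
# (assumed here to be a single linear layer each).
backbone = mlp_params([N_FEATURES, *HIDDEN])
heads = N_MEMBERS * mlp_params([HIDDEN[-1], 1])
multi_head = backbone + heads

# Fraction of parameters saved by sharing the backbone.
saving = 1 - multi_head / deep_ensemble

print(f"deep ensemble: {deep_ensemble} parameters")
print(f"multi-head:    {multi_head} parameters")
print(f"saving:        {saving:.1%}")
```

Under these assumed sizes the saving approaches 1 − 1/K for K members as the heads become small relative to the backbone, which is consistent with the figure of about 80% for five members.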