In this paper, a unique method for predicting water quality resilience has been demonstrated using deep reinforcement learning. We showed a step by step procedure to formalise a multi-objective Markov decision process (MOMDP) to make a simulated environment for predicting water quality resilience in the state of São Paulo, Brazil. A common approach to solve a historical dataset is supervised learning. However, if we turn the problem into a multi-objective optimisation problem in a dynamic environment, the traditional approach may not be suitable or can have difficulties to be implemented. Notwithstanding, if we want to address this problem using a simulated environment (e.g. gaming environment) and give an opportunity to the agent to learn from the environment by itself through trial and error method, it could have brought human level expertise in such a complex constrained environment. An intelligent agent traverses this environment to determine the compromising solutions among objectives to predict resilient areas. The outcome revealed that a water quality dataset can be modelled into a reinforcement learning settings that is predominantly used to solve sequential decision making. This innovative method may solve many other intractable engineering problems with the help of deep reinforcement learning.