Human interaction has always been a critical aspect of social communication. Human action tracking and human behavior recognition both assist in investigating and classifying human interaction. Several features are considered when classifying human interaction in images and videos, including shape, the position of human body parts, and the effects of the environment. This paper estimates different human body key points in order to track them under challenging conditions. Tracking critical body parts in this way requires numerous features. Therefore, we first estimate human pose using key points and 2D human skeleton features to obtain full-body features. The extracted features are then passed to t-DSNE to eliminate redundant features. Finally, the optimized features are fed to a k-ary tree hashing algorithm, which serves as the recognition engine. Experiments on two benchmark datasets yield an accuracy of 88.50% on the UCF Sports Action dataset and a mean recognition rate of 89.45% on the YouTube Action database. These results indicate that the proposed system achieves better human body part tracking and classification than other state-of-the-art techniques.
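As a minimal illustration of how 2D skeleton features might be derived from estimated key points, the following sketch computes two limb lengths and one joint angle. The joint names, the coordinates, the choice of features, and the function names are all assumptions made for illustration; they are not the feature set used in this work.

```python
import math

# Hypothetical 2D key points (x, y) in pixels; names and values are illustrative only.
keypoints = {
    "shoulder": (0.0, 0.0),
    "elbow": (3.0, 0.0),
    "wrist": (3.0, 4.0),
}

def limb_length(a, b):
    """Euclidean distance between two key points."""
    return math.hypot(b[0] - a[0], b[1] - a[1])

def joint_angle(a, joint, b):
    """Unsigned angle in degrees at `joint` between segments joint->a and joint->b."""
    v1 = (a[0] - joint[0], a[1] - joint[1])
    v2 = (b[0] - joint[0], b[1] - joint[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    cross = v1[0] * v2[1] - v1[1] * v2[0]
    return math.degrees(math.atan2(abs(cross), dot))

def skeleton_features(kp):
    """Assumed feature vector: upper-arm length, forearm length, elbow angle."""
    return [
        limb_length(kp["shoulder"], kp["elbow"]),
        limb_length(kp["elbow"], kp["wrist"]),
        joint_angle(kp["shoulder"], kp["elbow"], kp["wrist"]),
    ]

features = skeleton_features(keypoints)
```

A feature vector of this kind, computed per frame and per person, is the sort of input that a dimensionality reduction step and a subsequent classifier could consume.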