Identifying Neuroticism from User Generated Content of Social Media based on Psycholinguistic Cues
Social media has become a huge repository of text and images as users create posts, share views or news, and capture moments in photos. Posting statuses or tweets is a common feature of popular social networking sites such as Facebook, Twitter, and Google+. User-generated text such as statuses or tweets can be regarded as the essential language for communicating with others on social media. This paper investigates the possibility of identifying a negative personality trait from psycholinguistic cues extracted from the language used on social media. Predicting personality traits within the widely accepted Big Five Factor Model (BFFM) is a challenging task. According to the model, there are four positive traits, namely openness to experience, conscientiousness, agreeableness, and extraversion, and only one negative trait, neuroticism. Neuroticism refers to the tendency to experience negative emotions such as anger, sadness, anxiety, depression, and instability. We used psycholinguistic cues extracted with Linguistic Inquiry and Word Count (LIWC) to predict neuroticism, and applied five different classifiers to evaluate the prediction model.
Feature extraction, Facebook, Predictive models, Linguistics, Decision trees, Radio frequency