Paper Details


Title
Open source initiatives and frameworks addressing distributed real-time data analytics
Author
Sarwar Jahan Morshed
Email
sarawar.cse@diu.edu.bd
Abstract
The continuous evolution of digital services is resulting in the generation of extremely large data sets that are created in almost real time. Exploring new opportunities for improving the quality of these digital services and providing better-personalized experiences to digital users are two major challenges to be addressed. Different methods, tools, and techniques exist today to generate actionable insights from digital services data. Traditionally, big data problems are handled on historical data sets. However, there is a growing demand for real-time data analytics to offer new services to users and to provide, for example, proactive customer care, personalized ads, and emergency aid. Although a few frameworks for real-time analytics exist, using them to solve distributed real-time big data analytics problems still remains a challenge. Existing real-time data analytics (RTDA) frameworks do not cover all the features required for distributed computation in real time. Therefore, in this paper, we present a qualitative overview and analysis of some of the most widely used existing RTDA frameworks. Specifically, Apache Spark, Apache Flink, Apache Storm, and Apache Samza are covered and discussed.

Keywords
Real-time, data analytics, big data, streaming data, data analytics framework, distributed real-time data analysis
Journal or Conference Name
2016 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)
Publication Year
2016
Indexing
Scopus