Key Challenges in Online Social Networks Analysis: A Survey – IJERT

  • Linkage-based and Structural Analysis: The linkage behavior of the network is analyzed in order to determine important nodes, communities, links, and evolving regions of the network. Such an analysis provides a good overview of the global evolution behavior of the underlying network.

  • Adding Content-based Analysis: Many social networks contain a tremendous amount of content, which can be leveraged to improve the quality of the analysis. It has been observed that combining content-based analysis with linkage-based analysis provides more effective results in a wide variety of applications.
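As a minimal sketch of how the two kinds of analysis might be combined, the following Python example scores a pair of users by mixing a structural signal (overlap of their neighbor sets) with a content signal (overlap of their keyword sets). The toy data, the function names, and the weighting parameter `alpha` are all assumptions made for illustration; the survey does not prescribe a particular scoring scheme.

```python
from collections import defaultdict

# Hypothetical toy data: undirected links and per-user keyword sets.
edges = [("ann", "bob"), ("ann", "cat"), ("bob", "cat"), ("cat", "dan")]
keywords = {
    "ann": {"music", "travel"},
    "bob": {"music", "sports"},
    "cat": {"travel", "music"},
    "dan": {"cooking"},
}


def build_adjacency(edge_list):
    """Build an undirected adjacency map: node -> set of neighbors."""
    adj = defaultdict(set)
    for u, v in edge_list:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def jaccard(a, b):
    """Jaccard overlap of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def combined_similarity(u, v, adj, content, alpha=0.5):
    """Weighted mix of linkage-based and content-based similarity.

    alpha is an assumed tuning knob: 1.0 uses links only, 0.0 content only.
    """
    # Exclude the pair itself so a direct link does not distort the overlap.
    link_score = jaccard(adj[u] - {v}, adj[v] - {u})
    content_score = jaccard(content.get(u, set()), content.get(v, set()))
    return alpha * link_score + (1 - alpha) * content_score


adj = build_adjacency(edges)

# Purely structural view: degree as a simple proxy for node importance.
degree = {node: len(neighbors) for node, neighbors in adj.items()}
most_connected = max(degree, key=degree.get)
```

Here degree stands in for "important nodes" only as the simplest possible structural measure. Richer centrality or community-detection methods would slot into the same place.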

    The rapid growth of social networks has exposed several key challenges, such as data gathering techniques, heterogeneity, scalability, and missing data.

    The amount and kinds of data generated by social network usage are too rich to be captured by any one collection method. Data on OSNs may be collected (i) from the social network websites; (ii) from surveys, by asking participants about their behavior; and (iii) through deployed applications, by directly monitoring users as they share content online. Hence, we believe that a single data collection method is insufficient to capture all aspects of users' experience.

    The heterogeneity of data in OSNs is characterized by huge data sets and varied data types, both semistructured and unstructured (videos, images, audio, click-streams, weblogs, text, and e-mail). In essence, heterogeneous data comes from any number of sources, largely unknown and unlimited, and in many varying formats.
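One common way to cope with such variety is to map each source format into a shared record type before analysis. The sketch below is only an illustration under stated assumptions: the `Item` record, the JSON field names (`author`, `text`, `image_url`), and the `<user> <action> <target>` log layout are all hypothetical and not taken from any particular OSN.

```python
import json
from dataclasses import dataclass


@dataclass
class Item:
    """Common record that every source format is mapped into (assumed schema)."""
    source: str   # which adapter produced the record
    kind: str     # e.g. "text", "image", "click"
    user: str
    payload: str


def from_json_post(raw: str) -> Item:
    """Semi-structured input: a JSON post with hypothetical field names."""
    doc = json.loads(raw)
    kind = "image" if doc.get("image_url") else "text"
    payload = doc.get("image_url") or doc.get("text", "")
    return Item("json_post", kind, doc["author"], payload)


def from_weblog_line(line: str) -> Item:
    """Unstructured input: a log line assumed to be '<user> <action> <target>'."""
    user, action, target = line.split(maxsplit=2)
    return Item("weblog", action.lower(), user, target)


items = [
    from_json_post('{"author": "ann", "text": "hello"}'),
    from_weblog_line("bob CLICK /photos/42"),
]
```

Each new source then needs only its own small adapter function, while downstream analysis works against a single record type.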

    Scalability challenges include managing and processing a network consisting of hundreds of millions of edges on a single machine [5], distributing status updates to millions of users [4, 6], and managing and distributing user-generated content (UGC) to millions of geographically dispersed users [4, 2]. The growth and popularity of OSNs are unprecedented and pose unique challenges in terms of scaling, management, and maintenance [8].

    The increasing volume of generated data in OSNs, and the growing concerns of users will exacerbate the problem of missing data over time. Prediction of missing information is an important part of data analysis in social networks [1, 2, 3].
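To make the idea of predicting missing information concrete, the sketch below applies the common-neighbors heuristic, a simple and widely used baseline for guessing unobserved links. It is chosen here purely for illustration and is not necessarily the approach taken in the cited works. The toy graph and the function name are assumptions.

```python
from itertools import combinations


def common_neighbor_scores(adj):
    """Score every unlinked pair by the number of neighbors the two nodes share."""
    scores = {}
    for u, v in combinations(sorted(adj), 2):
        if v in adj[u]:
            continue  # link already observed, nothing to predict
        shared = len(adj[u] & adj[v])
        if shared:
            scores[(u, v)] = shared
    return scores


# Hypothetical observed network; the ann-dan link is assumed to be missing.
observed = {
    "ann": {"bob", "cat"},
    "bob": {"ann", "cat", "dan"},
    "cat": {"ann", "bob", "dan"},
    "dan": {"bob", "cat"},
}

# Highest-scoring pairs are the most plausible missing links.
ranked = sorted(common_neighbor_scores(observed).items(), key=lambda kv: -kv[1])
```

In this toy graph the only unlinked pair is ann and dan, who share two neighbors, so that pair is the sole candidate. The all-pairs loop is quadratic in the number of nodes, which is one reason missing-data prediction interacts with the scalability challenge on large networks.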

    This paper aims to present the current state of the art on key challenges in OSNs such as heterogeneity, scalability, and missing data. The rest of the paper is organized as follows: Section 2 presents the state of the art in the selected challenges. Section 3 analyzes the provisional research challenges. Finally, Section 4 presents the conclusions.