Identification of Leaders, Lurkers, Associates and Spammers in a Social Network: Context-Dependent and Context-Independent Approaches

Title	Identification of leaders, lurkers, associates and spammers in a social network: context-dependent and context-independent approaches
Publication Type	Journal Article
Year of Publication	2011
Authors	Fazeen, M, Dantu, R, Guturu, P
Journal	Social Network Analysis and Mining
Volume	1
Pagination	241-254
ISSN	1869-5469
Keywords	Context dependent and context independent data analysis, Fuzzy logic, MLP, Naive Bayesian classifier, Random Forest, Social networks, Twitter
Abstract	In this paper, we present two methods for classification of different social network actors (individuals or organizations) such as leaders (e.g., news groups), lurkers, spammers and close associates. The first method is a two-stage process with a fuzzy-set theoretic (FST) approach to evaluation of the strengths of network links (or equivalently, actor-actor relationships) followed by a simple linear classifier to separate the actor classes. Since this method uses a lot of contextual information, including actor profiles and actor-actor tweet and reply frequencies, it may be termed a context-dependent approach. To handle the situation of limited availability of actor data for learning network link strengths, we also present a second method that performs actor classification by matching their short-term (say, roughly 25 days) tweet patterns with the generic tweet patterns of the prototype actors of different classes. Since little contextual information is used here, this can be called a context-independent approach. Our experimentation with over 500 randomly sampled records from a Twitter database consisting of 441,234 actors, 2,045,804 links, 6,481,900 tweets, and 2,312,927 total reply messages indicates that, in the context-independent analysis, a multilayer perceptron outperforms the Bayes and Random Forest classifiers on both classification accuracy and a new F-measure for classification performance. However, as expected, the context-dependent analysis using link strengths evaluated using the FST approach in conjunction with some actor information reveals strong clustering of actor data based on their types, and hence can be considered a superior approach when data available for training the system is abundant.
URL	http://dx.doi.org/10.1007/s13278-011-0017-9
DOI	10.1007/s13278-011-0017-9

Publication Status:

Published

UNT Department:

Computer Science and Engineering (CSE)

UNT Center:

Center for Information and Cyber Security (CICS)

UNT Lab:

Network Security Laboratory (NSL)

Document:

leaders_lurkers_associates_spammers.pdf
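
Illustrative sketch:

The context-independent approach in the abstract classifies an actor by matching a short-term tweet pattern against prototype patterns for each class. The Python sketch below shows only that general idea, using a plain nearest-prototype rule with Euclidean distance. It is not the paper's method: the paper evaluates a multilayer perceptron, a Bayes classifier and Random Forest, and its feature representation is not given in this record. The prototype values, the 5-day window (the abstract mentions roughly 25 days) and the names PROTOTYPES, distance and classify are all hypothetical.

```python
import math

# Hypothetical prototype patterns: tweets per day over a 5-day window.
# These numbers are invented for illustration; they are not data from the paper.
PROTOTYPES = {
    "leader": [20, 22, 19, 21, 20],     # steady, high volume
    "lurker": [0, 1, 0, 0, 1],          # almost silent
    "spammer": [0, 80, 0, 75, 0],       # isolated bursts
    "associate": [3, 5, 4, 6, 4],       # steady, low volume
}


def distance(a, b):
    """Euclidean distance between two equal-length daily-count vectors."""
    if len(a) != len(b):
        raise ValueError("patterns must cover the same number of days")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify(pattern, prototypes=PROTOTYPES):
    """Return the label of the prototype closest to the observed pattern."""
    return min(prototypes, key=lambda label: distance(pattern, prototypes[label]))


if __name__ == "__main__":
    observed = [1, 70, 2, 82, 0]  # hypothetical actor with bursty activity
    print(classify(observed))
```

A trained classifier such as the multilayer perceptron reported in the abstract would replace the fixed distance rule with decision boundaries learned from labeled actors.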