Lecture Notes in Computer Science 8302

Commenced Publication in 1973


Founding and Former Series Editors:
Gerhard Goos, Juris Hartmanis, and Jan van Leeuwen

Editorial Board
David Hutchison
Lancaster University, UK
Takeo Kanade
Carnegie Mellon University, Pittsburgh, PA, USA
Josef Kittler
University of Surrey, Guildford, UK
Jon M. Kleinberg
Cornell University, Ithaca, NY, USA
Alfred Kobsa
University of California, Irvine, CA, USA
Friedemann Mattern
ETH Zurich, Switzerland
John C. Mitchell
Stanford University, CA, USA
Moni Naor
Weizmann Institute of Science, Rehovot, Israel
Oscar Nierstrasz
University of Bern, Switzerland
C. Pandu Rangan
Indian Institute of Technology, Madras, India
Bernhard Steffen
TU Dortmund University, Germany
Madhu Sudan
Microsoft Research, Cambridge, MA, USA
Demetri Terzopoulos
University of California, Los Angeles, CA, USA
Doug Tygar
University of California, Berkeley, CA, USA
Gerhard Weikum
Max Planck Institute for Informatics, Saarbruecken, Germany
Vasudha Bhatnagar Srinath Srinivasa (Eds.)

Big Data Analytics


Second International Conference, BDA 2013
Mysore, India, December 16-18, 2013
Proceedings

Volume Editors
Vasudha Bhatnagar
South Asian University
Department of Computer Science
Akbar Bhavan, Chanakyapuri
New Delhi, India
E-mail: [email protected]
Srinath Srinivasa
International Institute of Information Technology
Bangalore, India
E-mail: [email protected]

ISSN 0302-9743 e-ISSN 1611-3349


ISBN 978-3-319-03688-5 e-ISBN 978-3-319-03689-2
DOI 10.1007/978-3-319-03689-2
Springer Cham Heidelberg New York Dordrecht London

Library of Congress Control Number: 2013953235

CR Subject Classification (1998): H.3, H.2.8, H.2, I.2, H.4, I.5, F.2, G.2, H.5

LNCS Sublibrary: SL 3 – Information Systems and Applications, incl. Internet/Web, and HCI

© Springer International Publishing Switzerland 2013

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of
the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation,
broadcasting, reproduction on microfilms or in any other physical way, and transmission or information
storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology
now known or hereafter developed. Exempted from this legal reservation are brief excerpts in connection
with reviews or scholarly analysis or material supplied specifically for the purpose of being entered and
executed on a computer system, for exclusive use by the purchaser of the work. Duplication of this publication
or parts thereof is permitted only under the provisions of the Copyright Law of the Publisher’s location,
in its current version, and permission for use must always be obtained from Springer. Permissions for use
may be obtained through RightsLink at the Copyright Clearance Center. Violations are liable to prosecution
under the respective Copyright Law.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
While the advice and information in this book are believed to be true and accurate at the date of publication,
neither the authors nor the editors nor the publisher can accept any legal responsibility for any errors or
omissions that may be made. The publisher makes no warranty, express or implied, with respect to the
material contained herein.
Typesetting: Camera-ready by author, data conversion by Scientific Publishing Services, Chennai, India
Printed on acid-free paper
Springer is part of Springer Science+Business Media (www.springer.com)
Preface

The contemporary digital world comprises text, images, videos, and multiple
forms of semi-structured data that are interlinked and interrelated in complex
networks. Pervasive in both commercial and scientific domains, these data
present innumerable opportunities for discovering patterns, accompanied by
challenges of matching magnitude. The data deluge has fueled the creativity of
data-curious researchers and has led to the rapid emergence of new technologies
in data analytics. The major impetus has come from the variety, variability, and
veracity of data, in addition to their ever-growing volume (the 4 V's of Big Data).
The second edition of the International Conference on Big Data Analytics
(BDA 2013) was held during December 16–18, 2013, in Mysore, India, to bring
together researchers, practitioners, and policy makers to share their work and
experience in developing methods and algorithms for big data analytics.
The conference attracted 49 submissions in all, of which 45 were submitted to
the research track and four to the industry track. All submitted papers were
subjected to a plagiarism check before review. Each paper was reviewed by at
least three reviewers, and the review comments were communicated to the authors.
The review process resulted in the acceptance of nine regular papers and one
short paper for the research track. One industry paper was accepted, leading to
an overall acceptance rate of 22%. This volume includes the accepted papers,
tutorials, and invited papers presented during the conference.
The section “Mining Social Media Data” comprises four papers. The tutorial
article by Mehta and Subramaniam focuses on methods for entity analytics and
integration using large Twitter data sets. They review the state of the art and
present new ideas on handling common research problems such as event detection
from social media, summarization, location inference, and fusing external data
sources with social data. Sureka and Agrawal address the problem of detecting
copyright infringement of music videos on YouTube. They propose an algorithm
that mines both video and uploader meta-data, and uses a rule-based classifier
to predict the category (original or copyright-violated) of the video. Jain et al.
investigate user behavior based on the temporal dimension of tweets and
relate it to the evolution of topics on Twitter OSN. Based on a novel metric
called “tweet strength,” topics are identified along with the users driving the
evolution. Bhargava et al. study the problem of authorship attribution of tweets
for forensic purposes. The proposed method extracts stylometric information
from the collected data set to predict authors using classification algorithms.
The section “Perspectives on Big Data Analytics,” which comprises three
papers, opens with a tutorial paper by Lakshminarayan. The paper presents
some fundamental methods of dimensionality reduction and elaborates on the
main algorithms. The author also points to next-generation methods that seek to
identify structures within high-dimensional data that are not captured by
second-order statistics. The invited paper by Mondal discusses the role of
crowd-driven data collection in big data analytics and the opportunities
presented by such collections. Kiran addresses the intermittence problem in
large transaction databases. The paper introduces quasi-periodic-frequent
patterns, which provide useful information and are immune to the intermittence
problem.
The section “Graph Analytics” consists of papers related to mining of large
graphs. Das and Chakravarthy present a survey of graph algorithms and identify
the challenges of adapting/extending algorithms for the analysis of large graphs
using the Map-Reduce programming model. Tripathy et al. study the
characteristics of complex networks in the game of cricket, where dyadic
relationships exist among a group of players. Properties such as average degree,
average strength, and average clustering coefficients are found to be directly
related to the performances of the teams. Parveen and Nair propose techniques
for effective and efficient visualization of small-world networks in a similarity
space. An algorithm for the visual assessment of cluster tendency is presented
for efficient hierarchical graphical representation of large networks.
The section “Practice of Big Data Analytics” consists of three papers
describing practical applications. Elisabeth et al. present a tourist recommender
system using GPS data collected from rental tourist cars. Misra et al. present a
case study to demonstrate the performance advantage of Hadoop-based ETL tools
over traditional tools. Lakshminarayan and Baron investigate and report on
the application of big data analytics in the manufacturing of integrated circuits.
We gratefully acknowledge the support extended by the University of Delhi
and the University of Aizu. We owe gratitude to MYRA School of Business in
Mysore for organizing the conference and extending their hospitality. Thanks are
also due to our sponsors: eBay and IBM India Research Lab. We also thank
all the Program Committee members and external reviewers for their time and
diligent reviews. Ramesh Agrawal performed a plagiarism check on submissions;
thanks to him and his team. The Organization Committee and student volunteers
of BDA 2013 deserve special mention for their support. Special thanks to
the Steering Committee members. Finally, thanks to EasyChair for making our
task of generating this volume smooth and simple.

December 2013
Vasudha Bhatnagar
Srinath Srinivasa
Organization

Steering Committee
R.K. Arora IIT Delhi, Delhi, India
Subhash Bhalla University of Aizu, Japan
Sharma Chakravarthy University of Texas at Arlington, USA
Rattan Datta Indian Meteorological Department, Delhi, India
S.K. Gupta IIT, Delhi, India (Chair)
H.V. Jagadish University of Michigan, USA
D. Janakiram IIT Madras, India
N. Vijayaditya Government of India

Executive Committee
General Chair
D. Janakiram IIT Madras, India

Program Co-chairs
Srinath Srinivasa IIIT, Bangalore, India
Vasudha Bhatnagar University of Delhi, India

Organizing Chair
Shalini Urs ISiM, Mysore, India

Publicity Chair
Vikram Goyal IIIT, Delhi, India

Proceedings Chairs
Subhash Bhalla University of Aizu, Japan
Naveen Kumar University of Delhi, India

Industry Chair
Vijay Srinivas Agneeswaran Impetus Labs, India

Tutorials Chair
Jaideep Srivastava University of Minnesota, USA

PhD Symposium Chair


Maya Ramanath IIT Delhi, India

Local Organizing Committee


Abhinanda Sarkar MYRA School of Business, India
Naveen Kumar University of Delhi, India
Subhash Bhalla University of Aizu, Japan

Program Committee
Vijay Srinivas Agneeswaran Impetus Labs, Bangalore, India
Ramesh Agrawal Jawaharlal Nehru University, New Delhi, India
Avishek Anand Max Planck Institute, Germany
Amitabha Bagchi Indian Institute of Technology, Delhi, India
Srikanta Bedathur Indraprastha Institute of Information Technology (IIIT), Delhi, India
Subhash Bhalla University of Aizu, Japan
Raj Bhatnagar University of Cincinnati, USA
Arnab Bhattacharya Indian Institute of Technology, Kanpur, India
Indrajit Bhattacharya IBM Research, India
Gao Cong Nanyang Technological University, Singapore
Prasad Deshpande IBM Research, India
Lipika Dey TCS Innovation Labs, Delhi, India
Dejing Dou University of Oregon, USA
Haimonti Dutta Columbia University, USA
Shady Elbassuoni American University, Beirut, Lebanon
Rajeev Gupta IBM Research, India
Sharanjit Kaur University of Delhi, India
Akhil Kumar Penn State University, USA
Naveen Kumar University of Delhi, India
Choudur Lakshminarayan Hewlett-Packard Laboratories, USA
Ulf Leser Institut für Informatik, Humboldt-Universität zu Berlin, Germany
Ravi Madipadaga Carl Zeiss, India
Sameep Mehta IBM Research, India
Mukesh Mohania IBM Research, India
Yasuhiko Morimoto Hiroshima University, Japan
Joydeb Mukherjee Impetus Labs, India
Saikat Mukherjee Siemens, India
Mandar Mutalikdesai Siemens, India
Felix Naumann Hasso-Plattner-Institut, Potsdam, Germany
Hariprasad Nellitheertha Intel, India
Anjaneyulu Pasala Infosys Labs, India
Adrian Paschke Freie Universität Berlin, Germany
Jyoti Pawar Goa University, India
Lukas Pichl International Christian University, Japan
Krishna Reddy Polepalli International Institute of Information Technology, Hyderabad, India
Kompalli Pramod International Institute of Information Technology, Hyderabad, India
Mangsuli Purnaprajna Honeywell, India
Sriram Raghavan IBM Research, India
S. Rajagopalan International Institute of Information Technology, Bangalore, India
Muttukrishnan Rajarajan City University
Raman Ramakrishnan Honeywell, India
Chandrashekar Ramanathan International Institute of Information Technology, Bangalore, India
Markus Schaal University College Dublin, Ireland
Srinivasan Sengamedu Komli Labs, Bangalore, India
Shubhashis Sengupta Accenture, India
Mark Sifer University of Wollongong, Australia
Jaideep Srivastava University of Minnesota, USA
Shamik Sural Indian Institute of Technology, Kharagpur, India
Ashish Sureka Indraprastha Institute of Information Technology (IIIT), Delhi, India
Asoke Talukder Interpretomics Labs, Bangalore, India
Srikanta Tirthapura Iowa State University, USA
Sunil Tulasidasan Los Alamos National Laboratories, USA
Sujatha Upadhyaya Independent Consultant, Bangalore, India
Shalini Urs International School of Information Management, Mysore, India

Additional Reviewers
Adhikari, Animesh
Agarwal, Manoj
Burgoon, Erin
Correa, Denzil
Gupta, Shikha
Jog, Chinmay
Kulkarni, Sumant
Lal, Sangeeta
P, Deepak
Prateek, Satya
Puri, Charu
Rachakonda, Aditya
Ranu, Sayan
Ravindra, Padmashree
Sreevalsan-Nair, Jaya
Telang, Aditya
Table of Contents

Mining Social Media Data

Tutorial: Social Media Analytics
Sameep Mehta and L.V. Subramaniam

Temporal Analysis of User Behavior and Topic Evolution on Twitter
Mona Jain, S. Rajyalakshmi, Rudra M. Tripathy, and Amitabha Bagchi

Stylometric Analysis for Authorship Attribution on Twitter
Mudit Bhargava, Pulkit Mehndiratta, and Krishna Asawa

Copyright Infringement Detection of Music Videos on YouTube by Mining Video and Uploader Meta-data
Swati Agrawal and Ashish Sureka

Perspectives on Big Data Analytics

High Dimensional Big Data and Pattern Analysis: A Tutorial
Choudur K. Lakshminarayan

The Role of Incentive-Based Crowd-Driven Data Collection in Big Data Analytics: A Perspective
Anirban Mondal

Discovering Quasi-Periodic-Frequent Patterns in Transactional Databases
R. Uday Kiran and Masaru Kitsuregawa

Graph Analytics

Challenges and Approaches for Large Graph Analysis Using Map/Reduce Paradigm
Soumyava Das and Sharma Chakravarthy

Complex Network Characteristics and Team Performance in the Game of Cricket
Rudra M. Tripathy, Amitabha Bagchi, and Mona Jain

Visualization of Small World Networks Using Similarity Matrices
Saima Parveen and Jaya Sreevalsan-Nair

Big Data in Practice

Demonstrator of a Tourist Recommendation System
Erol Elisabeth, Richard Nock, and Fred Célimène

Performance Comparison of Hadoop Based Tools with Commercial ETL Tools – A Case Study
Sumit Misra, Sanjoy Kumar Saha, and Chandan Mazumdar

Pattern Recognition in Large-Scale Data Sets: Application in Integrated Circuit Manufacturing
Choudur K. Lakshminarayan and Michael I. Baron

Author Index
