BITS Pilani

  • Page last updated on Thursday, August 06, 2020


ADAPT Laboratory

  • About

     
     
    The Advanced Data Analytics and Parallel Technologies (ADAPT) Lab was started in 2011 with the aim of producing and disseminating high-quality research in Big Data Analytics. We work primarily in High Performance Computing and Data Mining. Our articles have been published in top-tier conferences (IEEE Cluster, IEEE Big Data, IEEE/ACM/ASA DSAA, ICDCN, etc.) and journals (see Publications for the complete list).
     
    Click here to see our GitHub page. 

  • Ongoing Research Projects

    Parallelization of Data Mining Algorithms
    As data volumes grow and decision making becomes more data-centric, data mining algorithms need to run well on HPC architectures: distributed memory, shared memory, and hybrids of the two. We have designed, and continue to design, parallel versions of data clustering algorithms for these architectures, including density-based clustering (DBSCAN, OPTICS), hierarchical clustering (SLINK), subspace clustering (ENCLUS, MAFIA, PROCLUS), and shared nearest neighbors (SNN). The proposed algorithms have been found to outperform state-of-the-art techniques.
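
    For readers unfamiliar with the algorithms named here, the following is a minimal sequential sketch of textbook DBSCAN in Python. It is not the lab's parallel implementation; the function names (`region_query`, `dbscan`) and the brute-force neighborhood search are illustrative assumptions. The repeated `region_query` calls are the part that an index or a parallel version would typically target.

```python
from math import dist

NOISE = -1  # label for points that belong to no cluster


def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (brute force, includes i)."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or NOISE."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue  # already visited
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later become a border point
            continue
        cluster += 1
        labels[i] = cluster
        frontier = [j for j in neighbors if j != i]
        while frontier:
            j = frontier.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                frontier.extend(j_neighbors)  # j is a core point: keep expanding
    return labels


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
```

    With these toy points, the two tight groups form two clusters and the isolated point at (50, 50) is labeled as noise.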
     
    Data Distribution Strategies for Parallel Data Mining Algorithms
    Parallel data mining algorithms on distributed memory architectures perform best only when the data distribution scheme balances the computational load across nodes. The scheme has to be specific to the kind of algorithm being executed. We have designed, and continue to design, schemes for distributing large datasets over a cluster of computing nodes that aim for optimal load balancing, so as to maximize the performance of various spatial data mining algorithms.
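
    As a rough illustration of what a data distribution scheme does, the sketch below splits a point set into spatially compact partitions of near-equal size by cutting recursively at the median, in the style of a kd-tree. This is a generic technique, not one of the lab's schemes, and `median_split` is a hypothetical name. Equal point counts are only a proxy for equal computational load, which in practice depends on the algorithm and on data density.

```python
def median_split(points, num_parts, depth=0):
    """Split points into num_parts (a power of two) partitions of
    near-equal size, cutting at the median along alternating axes."""
    if num_parts < 1 or num_parts & (num_parts - 1):
        raise ValueError("num_parts must be a power of two")
    if len(points) < num_parts:
        raise ValueError("need at least one point per partition")
    if num_parts == 1:
        return [list(points)]
    axis = depth % len(points[0])            # cycle through the dimensions
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2                  # median cut balances the halves
    return (median_split(ordered[:mid], num_parts // 2, depth + 1)
            + median_split(ordered[mid:], num_parts // 2, depth + 1))


# A 4 x 4 grid of points distributed over 4 hypothetical compute nodes.
grid = [(x, y) for x in range(4) for y in range(4)]
parts = median_split(grid, 4)
```

    Each partition could then be sent to one node; points near partition boundaries usually need extra handling, which this sketch leaves out.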
     
    Anytime Mining of Data Streams
    As data-generating devices become more widely deployed, there is a need for stream mining algorithms that process streams arriving at varying inter-arrival rates. These algorithms should also be capable of processing multiple streams by leveraging a high performance computing architecture. We have developed, and continue to develop, anytime stream mining algorithms for data mining tasks such as clustering, classification, and frequent itemset mining. These algorithms not only handle variable stream speeds but can also produce an immediate approximate mining result when the user requests one, and they improve the quality of the result as the time allowance increases.
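
    The anytime property can be illustrated with a deliberately small example: a nearest-neighbor classifier that can be interrupted after any number of inspected examples and still returns its best answer so far. This is a generic sketch, not one of the lab's algorithms. The names `anytime_nearest` and `process_stream` are hypothetical, and a step budget stands in for wall-clock time so that the behavior is deterministic.

```python
from math import dist


def anytime_nearest(query, examples, budget):
    """Return (label, distance) of the closest example found after
    inspecting at most `budget` examples. A larger budget can only
    keep or improve the answer, never worsen it."""
    best_label, best_d = None, float("inf")
    for inspected, (point, label) in enumerate(examples):
        if inspected >= budget:
            break  # interrupted: fall through with the best answer so far
        d = dist(query, point)
        if d < best_d:
            best_label, best_d = label, d
    return best_label, best_d


def process_stream(stream, examples, inspections_per_second):
    """Classify each stream item using the time until the next arrival.

    `stream` yields (query, gap_seconds) pairs; a short gap means a
    small budget and therefore a rougher answer."""
    results = []
    for query, gap_seconds in stream:
        budget = max(1, int(gap_seconds * inspections_per_second))
        results.append(anytime_nearest(query, examples, budget))
    return results


examples = [((5.0, 5.0), "b"), ((0.0, 1.0), "a"), ((0.0, 0.5), "a")]
quick = anytime_nearest((0.0, 0.0), examples, budget=1)
full = anytime_nearest((0.0, 0.0), examples, budget=3)
```

    Here `quick` is a coarse answer based on a single inspected example, while `full` has seen all three and finds a closer match.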
     
    Data Structure for Data Mining
    We have developed, and continue to develop, tailor-made indexing structures that speed up spatial queries such as neighborhood and nearest neighbor queries, which are commonly used in spatial data mining algorithms like DBSCAN, OPTICS, SNN, SLINK, and the k-NN classifier. The proposed data structures give better query performance than conventional data structures such as the R-tree and the kd-tree.
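
    To show what a neighborhood query is and why an index helps, here is a sketch of a uniform grid index in Python. The grid is a deliberately simple stand-in: it is neither one of the lab's structures nor an R-tree or kd-tree, and the class name `GridIndex` is an assumption. A query only examines the cells that can contain points within `eps`, instead of scanning the whole dataset.

```python
from collections import defaultdict
from itertools import product
from math import ceil, dist, floor


class GridIndex:
    """Uniform grid over d-dimensional points for eps-neighborhood queries."""

    def __init__(self, points, cell_size):
        self.points = points
        self.cell_size = cell_size
        self.cells = defaultdict(list)  # cell coordinates -> point indices
        for i, p in enumerate(points):
            self.cells[self._cell(p)].append(i)

    def _cell(self, p):
        return tuple(floor(c / self.cell_size) for c in p)

    def neighbors(self, q, eps):
        """Sorted indices of all indexed points within eps of q."""
        reach = ceil(eps / self.cell_size)  # how many cells eps can span
        base = self._cell(q)
        found = []
        for offset in product(range(-reach, reach + 1), repeat=len(base)):
            cell = tuple(b + o for b, o in zip(base, offset))
            for i in self.cells.get(cell, ()):
                if dist(q, self.points[i]) <= eps:  # exact distance check
                    found.append(i)
        return sorted(found)


pts = [(0.1, 0.1), (0.9, 0.9), (1.1, 1.0), (3.0, 3.0), (-0.5, 0.2)]
index = GridIndex(pts, cell_size=1.0)
near = index.neighbors((1.0, 1.0), eps=1.0)
```

    The result always matches a brute-force scan; the saving comes from skipping cells that are too far away to matter.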
     
    Domain Specific Language and Compiler for Parallel Data Mining Algorithms
    We have developed a domain specific language, DWARF, specifically for data clustering algorithms. It provides language constructs for efficient design and rapid prototyping of clustering algorithms, including density-based clustering (DBSCAN, OPTICS, RECOME), subspace clustering (ENCLUS, MAFIA, PROCLUS, etc.), partitioning-based clustering (K-means, EM Clustering, K-medoids, etc.), and hierarchical clustering (SLINK, CLINK, ALINK). Along with the language, we have designed a compiler that automatically parallelizes sequential code written in DWARF for HPC architectures such as distributed memory and shared memory. The parallel code generated by the compiler performs on par with state-of-the-art parallel algorithms. We are currently developing a DSL and a compiler for classification algorithms, and we are also working on a virtual machine to make DWARF platform independent.
     
    Social Media Analytics
    With the growing popularity of social media platforms such as Twitter and Facebook, analyzing social media data is becoming increasingly common. We work on Twitter streams, specifically on problems such as data visualization and event detection.
     
    Genome Sequence Assembly
    We are also working on the genome assembly problem: aligning and merging fragments of a longer DNA sequence in order to reconstruct the original sequence. We are building efficient assembly methods that leverage the performance gains offered by HPC architectures.
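
    The idea of merging fragments to recover the original sequence can be sketched with a textbook greedy overlap merge. This is a simplification for illustration only and not the lab's method: it assumes error-free fragments on a single strand, and it ignores repeats and fragments contained inside other fragments, all of which real assemblers must handle. The function names are hypothetical.

```python
def overlap(a, b, min_len=1):
    """Length of the longest suffix of a that equals a prefix of b
    (at least min_len long), or 0 if there is none."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(fragments, min_len=3):
    """Repeatedly merge the pair of fragments with the largest overlap."""
    frags = list(dict.fromkeys(fragments))  # drop duplicates, keep order
    while len(frags) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # no remaining pair overlaps enough to merge
        merged = frags[best_i] + frags[best_j][best_k:]
        frags = [f for idx, f in enumerate(frags)
                 if idx not in (best_i, best_j)] + [merged]
    return frags


reads = ["GTTAGC", "ATGCGTAC", "GTACGTTA"]
contigs = greedy_assemble(reads)
```

    The all-pairs overlap search is quadratic in the number of fragments, which is one reason assembly at genome scale benefits from parallel hardware.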
     
     

  • Sponsored Research Projects

    Projects Completed
     
    Project #1:
    Project Title: A New Distributed Computing Framework for Data Mining
    Funding Agency: Department of Electronics and Information Technology (DEITY), Ministry of Communication and Information Technology, Govt. of India.
    Starting Date: November, 2012
    Funding Amount: INR 120.20 Lakhs
    Project Duration: 3 Years
    Collaborators: Indian Agricultural Statistics Research Institute (IASRI), New Delhi
    Project in Pipeline: Designing and Developing a Scalable and Green High Performance Computing System for Big Data Analytics (to be submitted to Big Data Initiative Division, DST)
     
    Projects Submitted
     
    Project #1: 
    Project Title: Design and Implementation of an efficient Parallel De Novo Hybrid Whole Genome Assembler for handling Biologically Complex Genome
    Funding Agency: Dept. of Biotechnology, Ministry of Science & Technology, Govt. of India
    Funding Amount: INR 99.5 Lakhs
    Project Duration: 3 Years
    Collaborators: Indian Agricultural Statistics Research Institute (IASRI), New Delhi
     
    Project #2:
    Project Title: Designing and Developing a Scalable and Green High Performance Computing System for Big Data Analytics
    Funding Agency: Submitted to Big Data Initiative Division, Department of Science & Technology, Govt. of India
    Funding Amount: --
    Project Duration: 3 Years
    Collaborators: IISc Bangalore

  • People

    Faculty & Research Scholars

    Name Designation
    Navneet Goyal (Incharge) 
    Research Interests: 
    Data Warehousing, Data Mining,
    Query Performance
    Professor, Department of Computer Science and Information Systems
    Room No: 6121-K
    Email - goel@pilani.bits-pilani.ac.in
    Personal Website : Click here 
    Kamlesh Tiwari (Co-Incharge) 
    Research Interests: 
    Biometrics Security, Machine Learning
    Assistant Professor, Department of Computer Science and Information Systems
    Room No: 6120-N
    Email - kamlesh.tiwari@pilani.bits-pilani.ac.in
    Personal Website : Click here 
    Poonam Goyal
    Research Interests: 
    Data Mining, Algorithms
    Associate Professor & Head, Department of Computer Science & Information Systems 
    Room No: 6121-Q
    Email - poonam@pilani.bits-pilani.ac.in
    Personal Website : Click here
    Jagat Sesh Challa
    Research Interests: 
    Data Mining, High Performance Computing, Distributed and Concurrent Data Structures
    Assistant Professor, Dept. of Computer Science & Information Systems
    Room No: 6121-C
    Email - jagatsesh@pilani.bits-pilani.ac.in
    Personal Website : Click here
     

    Prerna Kaushik
    Research Interests: 
    Data Mining, Social Media Analytics
    Research Scholar (Funded from DEITY Project)
    Dept. of Computer Science & Information Systems
    Room No: 6115
    Email - p2013192@pilani.bits-pilani.ac.in
    Personal Website : Click here 

    Ayushi Gaur
    Research Interests: 
    Data Mining & Analytics, High Performance Computing
    Research Scholar
    Dept. of Computer Science & Information Systems
    Room No: 6115
    Email - p20190023@pilani.bits-pilani.ac.in
    Personal Website : Click here

    Former Faculty, Research Scholars & Project Fellows

    Name & Designation | Current Employment | Email and Website
    Sundar Balasubramaniam, Professor | Incubating a Startup | sundar.balasub@gmail.com
    Saiyedul Islam, TCS Research Scholar | Senior Software Development Engineer, AMD, Bangalore | saiyedul.islam@gmail.com
    Sonal Kumari, TCS Research Scholar | Senior Lead Engineer, Samsung R&D, Bangalore | sonalkumari1910@gmail.com
    Mohit Sati, Project Fellow | -- | mohitsati1@gmail.com

  • Publications

    • Jagat Sesh Challa, Poonam Goyal, Ajinkya Kokandakar, Pranet Verma, Dhananjay Mantri, Sundar Balasubramaniam and Navneet Goyal, "Anytime Clustering of Data Streams while handling Noise and Concept Drift", Journal of Experimental and Theoretical Artificial Intelligence. T&F (TETA). (Accepted)
    • Poonam Goyal, Prerna Kaushik, Pranjal Gupta, Dev Vashisth, Shavak Agarwal and Navneet Goyal, "Multilevel Event Detection, Storyline Generation and Summarization for Tweet Streams", IEEE Transactions on Computational Social Systems (TCSS). (Accepted)
    • Saiyedul Islam, Navneet Goyal, Sundar Balasubramaniam, Poonam Goyal, Achal Agarwal, Kirti Singh Rathore, and Nischay Singh, "Rapid Prototyping of Hierarchical Agglomerative Clustering Algorithms for Distributed Systems",  2019 IEEE International Conference on Big Data (Big Data 2019), December 9-12, Los Angeles, USA. (Accepted)
    • Saiyedul Islam, Sundar Balasubramaniam, Poonam Goyal, Ankit Sultana, Lakshit Bhutani, Saurabh Raje, Navneet Goyal, “A Rapid Prototyping Approach for High Performance Density-Based Clustering”, 2019 IEEE International Conference on Data Science and Advanced Analytics (DSAA 2019), October 5-8, Washington DC, USA.
    • Aditya Sarma A S, Poonam Goyal, Sonal Kumari, Anand Wani, Jagat Sesh Challa, Saiyedul Islam, Navneet Goyal, “MuDBSCAN: An Exact Scalable DBSCAN Algorithm for Big Data Exploiting Spatial Locality”, In 2019 IEEE International Conference on Cluster Computing (IEEE Cluster 2019), Sep 22-26, Albuquerque, USA.
    • Poonam Goyal, Sonal Kumari, Sumit Sharma, Sundar Balasubramaniam, Navneet Goyal, "Parallel SLINK for Big Data", In International Journal of Data Science and Analytics (JDSA), Springer. (Accepted). DOI: 10.1007/s41060-019-001884.
    • Jagat Sesh Challa, Poonam Goyal, Vijay M. Giri, Dhananjay Mantri and Navneet Goyal, “AnySC: Anytime Set-wise Classification of Variable Speed Data Streams”, 2018 IEEE International Conference on Big Data (Big Data 2018), December 10-13, Seattle, USA.
    • Saiyedul Islam, Sundar Balasubramaniam, Shruti Gupta, Shikhar Brajesh, Rohan Badlani, Nitin Labhishetty, Abhinav Baid, Poonam Goyal, and Navneet Goyal, “Pattern based Automatic Parallelization of Representative-based Clustering Algorithms”, IEEE International Conference on Data Science and Advanced Analytics (DSAA 2018), October 1-4, Turin, Italy.
    • Chandramani Chaudhary, Poonam Goyal, Joel R A Moniz, Navneet Goyal and Yi-Ping Phoebe Chen, “Linguistic pattern and cross modality based image retrieval for complex queries”, In 2018 ACM International Conference on Multimedia Retrieval (ICMR 2018), Yokohama, Japan.
    • Chandramani Chaudhary, Poonam Goyal, and Yi-Ping Phoebe Chen, "Exploiting Visual and Textual Neighborhood Information to Improve Image-Tag Relevance", In 2017 IEEE International Conference on Big Data (IEEE Big Data 2017), Boston, USA.
    • Poonam Goyal, Jagat Sesh Challa, Shivin Srivastava, and Navneet Goyal, "AnyFI: An Anytime Frequent Itemset Mining Algorithm for Data Streams", In 2017 IEEE International Conference on Big Data (IEEE Big Data 2017), December 11-14, Boston, USA.
    • Saiyedul Islam, Sundar Balasubramaniam, Poonam Goyal, Mohit Sati, and Navneet Goyal, "A Domain Specific Language for Clustering", Accepted for publication in 13th International Conference on Distributed Computing and Internet Technology (ICDCIT 2017), January 13-16, Bhubaneswar, India. (Poster Paper).
    • Sonal Kumari, Poonam Goyal, Ankit Sood, Dhruv Kumar, Navneet Goyal, and Sundar Balasubramaniam, “Exact, Fast and Scalable Parallel DBSCAN for Commodity Platforms”, Accepted for publication in 18th International Conference on Distributed Computing and Networking (ICDCN 2017), to be held January 4-7, Hyderabad, India.
    • Jagat Sesh Challa, Poonam Goyal, Nikhil S., Aditya Mangla, Sundar Balasubramaniam, Navneet Goyal, "DDR-Tree: A dynamic distributed data structure for efficient data distribution among cluster nodes for spatial data mining algorithms", Accepted for publication in 2016 IEEE International Conference on Big Data (IEEE Big Data 2016), to be held in December 5-8, Washington DC, USA.
    • Poonam Goyal, Sonal Kumari, Sumit Sharma, Dhruv Kumar, Vivek Kishore, Sundar Balasubramaniam, and Navneet Goyal, “A fast, Scalable SLINK Algorithm for Commodity Cluster Computing Exploiting Spatial Locality”, Accepted for publication in 18th International Conference on High Performance Computing and Communications (HPCC 2016), to be held December 12-14, Sydney, Australia.
    • Sonal Kumari, Saurabh Maurya, Poonam Goyal, Sundar Balasubramaniam, and Navneet Goyal, “A Parallel and Scalable Shared Nearest Neighbor Algorithm for Big Data”, Accepted for publication in 23rd IEEE International Conference on High Performance Computing, Data, and Analytics (HiPC 2016), December 19-22, Hyderabad, India.
    • Navneet Goyal, Sundar Balasubramaniam, Poonam Goyal, Saiyedul Islam, and Mohit Sati, "A High Performance Computing Framework for Data Mining". Accepted for publication in IEEE International Workshop on Foundations in Big Data Computing (HiPC 2016), December 19-22, Hyderabad, India.
    • Poonam Goyal, Sonal Kumari, Shubham Singh, Vivek Kishore, Sundar S Balasubramaniam and Navneet Goyal, "A Parallel Framework for Grid-based Bottom-up Subspace Clustering", Accepted for publication in IEEE 3rd International Conference on Data Science and Advanced Analytics (DSAA 2016), October 17-19, Montreal, Canada.
    • Poonam Goyal, Sonal Kumari, Sumit Sharma, Vivek Kishore, Navneet Goyal and Sundar S Balasubramaniam, "Spatial Locality Aware, Fast, and Scalable SLINK Algorithm for Commodity Clusters", Accepted for publication in IEEE Cluster 2016, September 13-15, Taipei, Taiwan. (Poster Paper)
    • Dhruv Kumar, Poonam Goyal and Navneet Goyal, "An Efficient Method for Batch Updates in OPTICS Cluster Ordering", in International Journal of Data Analysis Techniques and Strategies (IJDATS). (Accepted in June 2016)
    • Jagat Sesh Challa, Poonam Goyal, Nikhil S., Sundar Balasubramaniam, Navneet Goyal. "A concurrent k-NN search algorithm for R-tree". In 8th ACM Compute '15, October 2015, Ghaziabad, India. (Best Poster Paper Award)
    • Sonal Kumari, Anil Maheshwari, Poonam Goyal, and Navneet Goyal. "Parallel framework for efficient k-means clustering". In 8th ACM Compute '15, October 2015, Ghaziabad, India. (Best Paper Award)
    • Poonam Goyal, Sonal Kumari, Dhruv Kumar, Sundar Balasubramaniam, Navneet Goyal, Saiyedul Islam, and Jagat Sesh Challa. 2015. "Parallelizing OPTICS for Commodity Clusters". In Proceedings of the 2015 International Conference on Distributed Computing and Networking (ICDCN '15), January 2015, Goa, India.
    • Poonam Goyal, Sonal Kumari, Dhruv Kumar, Sundar Balasubramaniam, and Navneet Goyal. 2014. "Parallelizing OPTICS for multicore systems". In Proceedings of the 7th ACM India Computing Conference (COMPUTE '14), October 2014, Nagpur, India.

  • Opportunities Available

    Applications are invited for LOP/SOP for the upcoming semester (Aug-Dec 2020) in the ADAPT Lab for the following research areas under Prof. Navneet Goyal and/or Prof. Poonam Goyal.
     

  • Events

  • High Performance Computing Facilities

    32-node cluster: IBM x3250 M4 server, Intel Xeon Quad core 3.2 GHz Hyper Threaded Processor, 32 GB RAM, 2.2 TB HDD
    16-node cluster: HP Proliant DL 140, Two Intel Xeon dual core 2.92 GHz Processors, 16 GB RAM, 160 GB HDD
    48-core SMP: HP Proliant DL 580 gen8, Four Intel Xeon 12 core 2.29 GHz Hyper Threaded Processors, 192 GB RAM, 600 GB HDD
    12-core SMP: Dell PowerEdge T710, Two Intel Xeon Hex core 2.92 GHz Processors, 32 GB RAM, 2.2 TB HDD
    NAS: EMC2 VNXe 3100 Network Attached Storage Server, Capacity 36 TiB (Usable 21 TiB)
    Software: Intel Cluster Studio XE 2013; Vampir 8.4 Trace Analyzer; ROCKS Cluster Distribution

  • Contact Us

    6115, New Academic Building (NAB)
    Dept. of Computer Science & Information Systems
    BITS-Pilani, Pilani Campus
    Vidhya Vihar
    Pilani - 333031
    Rajasthan
     
    Ph - +91 1596-25-5437
     
    Email: goel@pilani.bits-pilani.ac.in 

An Institution Deemed to be University estd. vide Sec.3 of the UGC Act,1956 under notification # F.12-23/63.U-2 of Jun 18,1964

© 2020 Centre for Software Development,SDET Unit, BITS-Pilani, India.
