SOME OF OUR PROJECTS



Knowledge-Rich Deep Models
There are many optimization problems of industrial interest for which the usual methods of analysis or numerical optimization cannot be employed. A third alternative is to develop sampling-based methods that estimate the probability distribution of optimal or near-optimal solutions. We are interested in problems that have the following additional characteristics: (a) the cost of a data instance is hard to compute, so obtaining large numbers of data instances for distribution estimation is expensive; and (b) there is significant domain knowledge about the problem, in the form of rules of thumb and guidelines gained from years of industrial experience. This project is about developing Machine Learning techniques to address such problems. Specifically, we propose to develop new kinds of neuro-symbolic learning that can combine deep neural networks with learning in first-order logic.
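As a rough illustration of the sampling-based idea (not the project's own method), the sketch below runs a cross-entropy-style loop: sample candidates from a Gaussian, keep the lowest-cost fraction, and refit the Gaussian to those survivors. The cost function `toy_cost`, the sample sizes, and all names are assumptions made for illustration; the domain-knowledge and neuro-symbolic components are not modeled.

```python
import random
import statistics


def toy_cost(x):
    """Hypothetical stand-in for an expensive cost; minimum at x = 3."""
    return (x - 3.0) ** 2


def estimate_near_optimal(cost, mean=0.0, std=5.0, samples=50, elite=10,
                          rounds=20, seed=0):
    """Iteratively fit a Gaussian to the lowest-cost samples."""
    rng = random.Random(seed)
    for _ in range(rounds):
        candidates = [rng.gauss(mean, std) for _ in range(samples)]
        # Keep the `elite` candidates with the lowest cost.
        best = sorted(candidates, key=cost)[:elite]
        mean = statistics.fmean(best)
        # Floor the spread so sampling never collapses to a single point.
        std = max(statistics.pstdev(best), 1e-6)
    return mean, std


mean, std = estimate_near_optimal(toy_cost)
print(f"estimated optimum near {mean:.3f} (spread {std:.3g})")
```

Each round costs `samples` evaluations, which is the budget that matters when every cost evaluation is expensive.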
Grant SERB



Ashwin Srinivasan


Conceptual

Application


Developing Predictive Models for ‘druglikeness’ of small molecules
A chemical molecule is held together by the sharing of electrons to form bonds. These bonds are directional and define the structure of the molecule, which plays an important role in determining its properties. The traditional process of determining the intrinsic ‘physico-chemical’ properties of a molecule is time-consuming and expensive. Nevertheless, elucidating these properties allows one to understand the molecule and establish its ‘druglikeness’. Pharmacokinetic properties such as absorption, distribution, metabolism, excretion, and toxicity (ADMET) also play an important role in predicting the ‘druglikeness’ of a molecule. Although methods exist for predicting ADMET properties, their accuracy needs considerable improvement, so more accurate methods need to be developed for predicting these properties as well. Lastly, for a small molecule to be considered as a drug candidate, one needs to know its interaction profile with the potential target, and to consider the degrees of freedom in the small molecule that may or may not be restricted by interactions with the target. This project aims to construct machine-learning models in these three areas in order to develop better computational tools for predicting the ‘druglikeness’ of small molecules.
Grant DBT: decision awaited



Raviprasad Aduri, Ashwin Srinivasan, Sukanta Mondal


Application


The TCS DataLab
The DataLab is a research collaboration with TCS Research. Its goal is to conduct short-term projects (no more than 6 months) in the areas of Data Analysis, Decision Sciences and Artificial Intelligence. The scope of the research covers topics such as: Machine Perception; Knowledge Synthesis; Program Induction; Analytics and Insights for Decision Support; Intelligent Planning and Control; Machine Learning; and Quantum Computation. Project teams typically consist of at least one APPCAIR faculty member, a researcher at TCS, and a student at BITS Goa.
Grant TCS Research



Ashwin Srinivasan, Tirtharaj Dash


Application

Implementation


The Reflexis CoLab
The CoLab is a research collaboration with Reflexis Systems Inc. Its goal is to investigate the use of state-of-the-art Machine Learning in specific areas of relevance to Reflexis. The areas of application are: time-series forecasting, optimisation, and outlier-detection. The project team consists of at least one APPCAIR faculty member, a researcher at Reflexis, and a student at BITS Goa.
Grant Reflexis Systems Inc.



Ashwin Srinivasan, Tirtharaj Dash


Application

Implementation


An interdisciplinary study in Machine Learning and Classification for Astronomy
The project can be broadly classified into the following two goals:
• Searching for planets similar to the Earth: characterized by the Earth Similarity Index (ESI).
• Searching for life, either similar to Earth life or in an unknown form: characterized by the Planetary Habitability Index (PHI).
We plan to capture the baseline, set the tempo for future research in India and abroad, and prepare a scholastic primer that would serve as a standard document for future research, particularly for young astronomers.
This research project has two specific objectives:
• Develop efficient models for complex computer experiments and data-analytic techniques that can be used in astronomical data analysis in the short term, and in various related branches of the physical, statistical, and computational sciences.
• Develop a set of fundamentally correct rules of thumb and experiments, backed by solid mathematical theory, that lend the marriage of astronomy and Machine Learning stability and far-reaching impact. We will do this in the context of specific science problems of interest to the proposers: the classification of exoplanets, the quantitative estimation of habitability, and the classification of multi-wavelength sources.
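For readers unfamiliar with the ESI, a minimal sketch follows. It assumes the commonly quoted form of the index, a product of per-property similarity terms (1 − |x − x0| / (x + x0)) each raised to a weight divided by the number of properties. The weight values, the choice of four properties, and the approximate Mars figures are assumptions for illustration and are not taken from the project.

```python
# Weight exponents commonly quoted for the ESI; treated as assumptions here.
ESI_WEIGHTS = {
    "radius": 0.57,
    "density": 1.07,
    "escape_velocity": 0.70,
    "surface_temp": 5.58,
}

# Earth reference: first three in Earth units, temperature in kelvin.
EARTH = {"radius": 1.0, "density": 1.0, "escape_velocity": 1.0,
         "surface_temp": 288.0}


def earth_similarity_index(planet, reference=EARTH, weights=ESI_WEIGHTS):
    """Product of per-property similarity terms, each raised to w_i / n."""
    n = len(weights)
    esi = 1.0
    for name, w in weights.items():
        x, x0 = planet[name], reference[name]
        esi *= (1.0 - abs(x - x0) / (x + x0)) ** (w / n)
    return esi


# Approximate, illustrative values for Mars (assumed).
mars = {"radius": 0.53, "density": 0.71, "escape_velocity": 0.45,
        "surface_temp": 227.0}
print(round(earth_similarity_index(EARTH), 3))
print(round(earth_similarity_index(mars), 3))
```

The index equals 1 for a planet identical to the reference and falls toward 0 as any property diverges; the large temperature weight makes surface temperature the dominant term.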
Grant SERB-DST



Snehanshu Saha


Application

Implementation


Machine Learning in Multi-Wavelength Astronomy
Russian Investigators: Yuri Shchekinov, Evgeniya Kalinicheva. Russian Academy Of Sciences, 2018-21
Grant Indo-Russian Project



Snehanshu Saha (Co-PI); Jayant Murthy, IIAP Bangalore (PI)


Application


From Small Data to Big Data: Complex Computer Modeling and Optimization

Grant DST EMR



Snehanshu Saha (Co-PI); Pritam Ranjan, IIM Indore (PI)


Conceptual

Application


Performing Deep Learning on Smart Wearables and IoT for Smart Healthcare
The use of smart wearables enables regular monitoring of human activities, which can lower the risk of health complications due to cardiovascular diseases, diabetes, etc. The signal data generated by the accelerometer and gyroscope of the inertial measurement unit (IMU) within wearables helps in recognizing motion. In the past, researchers have used Signal Processing techniques for human activity recognition (HAR). However, these techniques do not adapt to variations in how different people perform the same activity. Machine Learning and Deep Learning are comparatively tunable but require more computational power, especially the latter. To circumvent high computational requirements, state-of-the-art solutions typically deploy the learning algorithms on a platform with more compute power, e.g., a smartphone or even the cloud. We are working on a solution that is suitable for deployment in the resource-constrained environment of smart wearables and IoT.
Grant



Vinayak Naik


Application

Implementation


Use of Machine Learning to Improve Performance of Large WiFi Networks
We are addressing the problem of automating network troubleshooting for large-scale WiFi networks. One example is unnecessary active scans, which are known to degrade WiFi performance. We collected 340 hours’ worth of data, containing several thousand episodes of active scans, to train various machine learning models. The data was collected in a controlled setting with 27 devices from different vendors in varied network setups. We studied unsupervised and supervised machine learning techniques and concluded that a multilayer perceptron is the best model for detecting the causes of active scanning. Further, we performed an in-vivo model validation in an uncontrolled real-world WiFi network. Our proposed mechanisms have the potential to be incorporated into existing WiFi controllers, such as those from Cisco and Aruba.
Grant



Vinayak Naik


Conceptual

Application


Technology Innovation Hub in the "Bio-CPS" vertical under National Mission on Interdisciplinary Cyber-Physical Systems of DST
Identifying novel biomolecules for developing new diagnostic tools, systems, and devices that closely monitor the health condition of patients is the need of the hour. These biomolecules could help in timely diagnosis with high sensitivity and fewer false positive/negative results, thereby influencing critical therapeutic decisions and significantly altering the outcome in different health conditions. Herein, we propose to employ our knowledge and expertise in identifying novel biological molecules to establish the key targets that are significantly altered in human diseases. These targets will be used to develop new diagnostic devices/tools using IoT, Machine Learning, and Artificial Intelligence techniques for monitoring the health condition of patients.
Grant BioCPS



Vinayak Naik


Conceptual

Application


Sarcasm Detection
Sarcasm detection plays an important role in sentiment analysis, as sarcasm is often used in social media to express a negative opinion using positive or intensified positive words. This intentional ambiguity makes sarcasm detection a significant task in sentiment analysis. Sarcasm detection is usually treated as a binary classification problem. Both traditional feature-rich models and deep learning models have been built successfully to predict sarcastic comments. We build attention-based deep learning models, and apply other recent techniques, to detect sarcasm in review comments.
Grant None



Dr. N.L.Bhanu Murthy


Application


Defect Prediction Models
In our quest to build an effective defect prediction model, we formulate defect prediction as a multi-objective optimization problem. With the aim of maximizing effectiveness and minimizing the cost involved in defect prediction, we propose a cross-version defect prediction approach. We are investigating the ability of the proposed model to identify more defect-prone classes at the same or lower cost than traditional single-objective algorithms. We also attempt to model the distributions of defects and of the object-oriented metrics of software systems that are used for defect prediction, using information-theoretic (criterion-based) approaches to model selection.
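To make the multi-objective framing concrete, the sketch below filters a set of candidate models down to those that are not dominated on two objectives: effectiveness (to be maximized) and cost (to be minimized). The model names, scores, and cost units are hypothetical; this illustrates Pareto dominance in general and is not the project's algorithm.

```python
def pareto_front(candidates):
    """Return names of candidates not dominated on (max effectiveness, min cost)."""
    front = []
    for name, eff, cost in candidates:
        # A candidate is dominated if another is at least as good on both
        # objectives and strictly better on at least one.
        dominated = any(
            e2 >= eff and c2 <= cost and (e2 > eff or c2 < cost)
            for _, e2, c2 in candidates
        )
        if not dominated:
            front.append(name)
    return front


# Hypothetical models: (name, fraction of defects found, inspection cost).
models = [
    ("A", 0.60, 10.0),
    ("B", 0.75, 18.0),
    ("C", 0.55, 12.0),  # dominated by A: finds fewer defects at higher cost
    ("D", 0.90, 30.0),
]
print(pareto_front(models))  # ['A', 'B', 'D']
```

A single-objective method would return one of these models; the Pareto front instead exposes the trade-off so that the cost budget can be chosen afterwards.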
Grant None



Dr. N.L.Bhanu Murthy


Application


From Small Data to Big Data: Complex Computer Modelling and Optimization
Rapid growth in computer power has made it possible to study complex physical phenomena that might otherwise have been too time-consuming or expensive to observe. Computer experiments are used extensively in settings such as national defence (for example, measuring the destruction due to a nuclear explosion), predicting natural calamities such as hurricanes, tornadoes, or earthquakes, assessing damage to a car in a crash test, modelling the population growth of pest species in agriculture, synthesizing effective compounds in drug discovery, and verifying complex cosmological phenomena such as the formation of stars, dark energy and matter, and the expansion of the universe. In such situations, computer models and their statistical surrogates are used to gain deeper understanding. Important current research topics include surrogate modelling, design and analysis of such experiments, integrating various sources of data (e.g., multi-fidelity simulators and field data), feature estimation, computer model calibration, uncertainty quantification, and, most importantly, working efficiently with big data, for instance in dynamic computer models. While developing methodologies for this research during the next three years, a number of applied research problems will be addressed. These include (1) sustainable computing in data centres provided by Amazon, Google, Microsoft, IBM and, in particular, a host of small and medium-scale enterprises; (2) flood forecasting using complex computer modelling; (3) prediction of pest infestation in agriculture, thereby measuring the reduction in crop quality or quantity; and (4) identifying the optimal number of control factors/parameters when designing industrial products, with applications in the pharmaceutical industry and bio-plastic production.



Snehanshu Saha


Conceptual

Application

Implementation


Machine Learning in Multi-Wavelength Astronomy
The models of stellar atmospheres and the theory of stellar evolution can serve as a link between the physical characteristics of a star (mass, radius, luminosity, etc.) and its observed spectrum and photometry. From these, along with parallax information from Hipparcos and Gaia, we determine the distance to the stars and the extinction along each line of sight. Our final products are catalogues of stellar photometric magnitudes with the physical characteristics of each star and a 3D dust map of the Galaxy, both of which will be released for general use. The principal objective of our study is to construct a 3D map of interstellar extinction in the Galaxy. To do so, we will develop a tool for determining stellar parameters and interstellar extinction values, using multicolor photometry from large modern sky surveys and the newly available Gaia DR2 parallaxes. The resulting 3D extinction map could allow definite conclusions to be drawn about the origin, physics, evolution, and distribution of the interstellar medium; help specify the distance scale in the Galaxy; and support accurate estimates of the parameters of various galactic objects and, by extrapolating our results beyond the Galaxy, of the distances to extragalactic objects. In addition, machine classification methods will be used to label extragalactic objects using SDSS and GALEX data. This is the first effort of its kind: previous attempts were restricted to binary object classification (Star-Quasar or Star-Galaxy, based on spectrometric data). Novel neural network methods will be developed for this purpose, and new activation functions will be explored for better classification. Additionally, hitherto unlabeled objects will be labeled and the quality of the labels will be tested, producing a new catalog based on photometric data, a very large task. The joint project is expected to yield an entirely new set of literature and data.
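The link between parallax, magnitudes, and extinction can be sketched with the standard distance-modulus relation m − M = 5 log10(d) − 5 + A, where d is the distance in parsecs and A is the extinction in magnitudes. The snippet below is a minimal illustration under simplifying assumptions: distance is taken as the naive inverse of the parallax, measurement errors are ignored, and the star's values are hypothetical. It is not the project's tool.

```python
import math


def distance_pc(parallax_mas):
    """Naive distance estimate in parsecs from a parallax in milliarcseconds."""
    if parallax_mas <= 0:
        raise ValueError("parallax must be positive for simple inversion")
    return 1000.0 / parallax_mas


def extinction_mag(apparent_mag, absolute_mag, parallax_mas):
    """Solve m - M = 5*log10(d) - 5 + A for the extinction A (magnitudes)."""
    d = distance_pc(parallax_mas)
    return apparent_mag - absolute_mag - 5.0 * math.log10(d) + 5.0


# Hypothetical star: parallax 10 mas (100 pc), absolute magnitude 4.8 from a
# stellar model, observed apparent magnitude 10.1.
print(extinction_mag(10.1, 4.8, 10.0))
```

Repeating this estimate for many stars at known distances along different lines of sight is what turns individual extinction values into a 3D map.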



Snehanshu Saha


Conceptual

Application

Implementation


Information Gathering from Encrypted Mobile Phone Applications
The past decade has witnessed phenomenal growth in communication technology. The development of user-friendly software platforms such as Facebook, WhatsApp, etc. has made communication easier, and people now freely share messages and digital content over the network. One of the biggest concerns while using these platforms is privacy. This content is often personal or sensitive, and people do not want it to be accessed by an unauthorized person. Encryption came as a solution to these concerns. Here, the sender and the intended receiver share some secret knowledge, and only with it can the messages transferred between them be encrypted and decrypted. Although the same message may be accessible to users other than the intended receiver, they cannot understand its content because they are unable to decode it. While encryption suits users because it gives them privacy, it becomes a big hurdle for authorities who want to keep an eye on network usage for security or monitoring. Furthermore, service providers perform network traffic analysis of their users’ data flows to identify the type of application, and use this information for various crucial control and monitoring systems, including billing, quality of service, security, etc. Performing network traffic analysis over encrypted data is hard, as nothing can be inferred from the payload of an intercepted packet. However, it turns out that packet traces can still reveal substantial information about the type of application associated with a particular traffic flow and, consequently, about the user of the application. Traffic analysis attacks on encrypted traffic are often referred to as side-channel information leaks.
In this project, our aim is to gain valuable insight into the encrypted data by analyzing and classifying the network flows using various machine learning techniques, and the gathered information will eventually be used for the profiling of mobile users.



Kamelesh Tiwari


Application

Implementation


Development of a Real-Time Multi-Objective Route Recommender System using AI
Route selection from a given point of origin to a given destination on a map is one of the important problems that must be solved for any in-vehicle navigation system. This task is not straightforward when optimality and real-time operation are taken into consideration. For example, considering only the distance between the two points is not sufficient to accurately calculate the time required to reach the destination. The system should also consider the current average vehicle speed and the current traffic density on different road segments. Additionally, minimizing travel time may not be the only objective of the route selection algorithm. The user may prefer a safer and more comfortable route over one that takes less time. This requirement is pertinent to the Indian scenario, where a few routes may not be safe, especially for female drivers, and a few roads are not in good condition. Finding the optimal solution to such a multi-objective shortest-route problem is not practical given the scale and dynamics of the data on which these algorithms have to work. We need algorithms that compute routes quickly, even if the routes are sub-optimal. In this project, our objective is to build a turn-by-turn, real-time route guidance system for the Indian scenario that is fast and dynamic, and that provides suggestions to users depending on their objective. The user’s objective could be to find fast, secure, cost-effective, or comfortable routes with predefined intermediate stops, depending on the mode of travel (car, bike, truck, pedestrian, etc.). The usefulness of AI and ML techniques such as neural networks, deep learning, and RNNs could also be explored.
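One simple way to see how a user's objective changes the recommended route is to collapse the criteria into a single weighted cost and run a standard shortest-path search. The sketch below does this with Dijkstra's algorithm. It assumes non-negative weights and criteria; the road graph, the `time` and `risk` criteria, and the weight values are hypothetical. It illustrates the idea only and does not address the real-time or large-scale aspects the project targets.

```python
import heapq


def best_route(graph, source, target, weights):
    """Dijkstra over a scalarized cost: a weighted sum of per-edge criteria."""
    queue = [(0.0, source, [source])]
    settled = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == target:
            return cost, path
        if node in settled:
            continue
        settled.add(node)
        for neighbor, criteria in graph.get(node, []):
            if neighbor not in settled:
                step = sum(weights[k] * criteria[k] for k in weights)
                heapq.heappush(queue, (cost + step, neighbor, path + [neighbor]))
    return float("inf"), []


# Hypothetical road segments: travel time in minutes and a risk score
# (higher means less safe).
roads = {
    "A": [("B", {"time": 5, "risk": 9}), ("C", {"time": 8, "risk": 1})],
    "B": [("D", {"time": 5, "risk": 9})],
    "C": [("D", {"time": 8, "risk": 1})],
}
fastest = best_route(roads, "A", "D", {"time": 1.0, "risk": 0.0})
safest = best_route(roads, "A", "D", {"time": 0.2, "risk": 1.0})
print(fastest[1])  # ['A', 'B', 'D']
print(safest[1])   # ['A', 'C', 'D']
```

A weighted sum finds only one compromise per weight setting; exploring the full set of trade-offs is what makes the multi-objective problem expensive at scale.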



Kamelesh Tiwari


Application

Implementation


Android Web Security Solution using Cross-device Federated Learning
Mobile and Web Security is an important research area in which researchers have been trying to apply Machine Learning, but data privacy concerns have limited its use. Federated Learning is emerging as a promising solution that not only addresses privacy concerns but also drastically reduces communication costs. We propose a Federated Learning solution for the security of Android-based devices. Mobile and Web Security solutions have evolved from signature-based detection to Machine Learning models trained over large centralized malware repositories. We have used Federated Learning to learn security patterns from users’ browsing data, which resides on individual devices and never leaves them. Each mobile device shares only the encrypted model parameters that it learns with the central server. The central server aggregates the locally trained models received from numerous mobile devices into a global model, which in turn is sent to the mobile devices for inference. Mobile security solutions using cross-device FL can scale to millions of devices and create robust machine learning models.
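The server-side aggregation step can be sketched as a sample-weighted average of client parameters, in the style of federated averaging. This is a minimal illustration under stated assumptions: models are flat lists of floats, the client updates are hypothetical, and the encryption of parameters mentioned above is omitted entirely.

```python
def federated_average(client_updates):
    """Average client weight vectors, weighting each by its local sample count."""
    total = sum(n for _, n in client_updates)
    dim = len(client_updates[0][0])
    return [
        sum(weights[i] * n for weights, n in client_updates) / total
        for i in range(dim)
    ]


# Hypothetical updates: (locally trained weights, number of local examples).
updates = [
    ([0.0, 2.0], 100),
    ([1.0, 4.0], 300),
]
global_weights = federated_average(updates)
print(global_weights)  # [0.75, 3.5]
```

Only the weight vectors and sample counts reach the server in this scheme; the browsing data used to train each local model stays on the device.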



Navneet Goyal


Application

Implementation


Cross-device & cross-silo Federated Learning Applications for Health Sector
The health sector in India is all set for a digital revamp with the Government announcing the National Health Mission (NHM). We are developing Federated Learning based applications without compromising the privacy of individuals or health organizations.



Navneet Goyal


Application

Implementation


Biodiversity Monitoring, Assessment & Awareness System (BioMAAS)
We base our proposal on recording raw mixed sound from different landscapes to monitor and assess the biodiversity of a region. The idea stems from the fact that when we walk through a landscape, we hear a lot of mixed sounds made by different animals, birds, insects, etc. We do not necessarily see them, but their sounds can be heard clearly. Can we analyze these mixed sounds using advanced deep learning techniques to identify the fauna, and use this information to assess the biodiversity of the area in terms of a simple metric such as the number of species?



Navneet Goyal


Application

Implementation


Improving prosecution rate using AI
Poor prosecution rate is a major problem faced by the police department. With FIR data now being available in digital form, it is possible to analyze the complete process, from filing of FIR to court ruling, to improve the prosecution rate. Process analytics will provide new insights into the shortcomings of the current practices.



Navneet Goyal


Application

Implementation


Improving data driven decision making in small-mid scale dairy cooperatives
We are developing an IoT-based real-time cold chain monitoring system to identify and rectify issues related to quality control and wastage of milk. Machine Learning techniques such as regression, classification, clustering, and anomaly detection will be applied to the IoT data to solve some of the critical problems faced by small and mid-scale dairy cooperatives. Blockchain technology will be used to improve provenance and traceability in the supply chain.



Navneet Goyal


Application

Implementation


Multi-modal Knowledge Graphs
Knowledge bases (KBs) store factual or commonsense knowledge about the world in a structured form. Existing KBs use either textual information or visual information alone to identify relations between entities (concepts). For example, text-based KBs miss information that is visually present in images/videos, and visual KBs miss parent-child relations when concepts are not visually similar. Existing KBs do not integrate textual and visual information, which is typically required to improve semantic understanding in applications involving both, such as visual question answering (VQA), video VQA, image captioning, video description, image/video retrieval for complex queries, etc. We propose to automatically construct a multi-modal knowledge base (MMKB) that is capable of learning semantically rich relations and representations, including commonsense knowledge, for concepts and their attributes, by integrating information from multiple modalities. The construction of an MMKB requires multi-modal knowledge straddling different domains, and thus presents an ideal setting for spawning new research directions in AI.
GRANT: DST SERB Submitted



Poonam Goyal


Application

Implementation


Biodiversity Awareness among Children
It is important to create awareness about the importance of biodiversity among children. We believe that increased awareness among children will go a long way towards preserving biodiversity in the decades to come. We plan to achieve this using an interactive multimodal question answering (MMQA) system, which will be developed to create interest and awareness among children of different age groups. It will be based on information retrieval using complex queries. MMQA extends visual question answering (VQA) to include text and audio modalities, so that questions posed as images, text, or audio can be answered with images, text, and audio.



Poonam Goyal


Application

Implementation


De novo genome assembly through Machine Learning
In general, plant genomes are large and complex. De novo assembly of such genomes is challenging for both biological and computational reasons: compared with other genomes, they can have much higher ploidy and higher rates of heterozygosity and repeats. De novo assembly of a plant genome can produce a highly fragmented result despite the availability of varied sequencing platforms. Assembling a genome requires the right combination of high coverage, long read length, and good read quality, and no existing sequencing platform ensures all of these. Combining data from several sequencing platforms can therefore be useful. Currently available assemblers are mostly platform dependent and cannot handle data coming from different platforms. We are working on an accurate, parallel de novo assembler that can assemble cross-platform data and handle biological complexities through ML models.



Poonam Goyal


Application

Implementation


Vision-based biomedical analysis of gait
Gait analysis in the field of medicine is typically performed using motion capture or force plate-based systems, for their reliability and accuracy [1]. These systems are expensive to install and are inaccessible to most individuals, as they can operate effectively only in dedicated laboratories, under the supervision of trained professionals. Consequently, using such systems to assess an individual’s gait on a regular basis may not be easy. Hence, there is a need for novel systems that can acquire and evaluate an individual’s gait in an unconstrained manner. This research project will explore vision-based methodologies to perform gait analysis in a cost-effective and non-invasive manner, without the need for trained professionals. It will focus on:

  • Acquisition of a database containing videos and metadata of individuals suffering from gait-related pathologies.
  • The use of machine learning strategies to develop a system that can provide a preliminary diagnosis of an individual’s gait.

GRANT: Research Initiation Grant



Tanmay T. Verlekar


Application

Implementation

Responsible


Unreliable synaptic transmission as a possible enhancer of learning performance in neural circuits
The unreliability of chemical synaptic transmission has classically been viewed as a source of detrimental noise that needs to be overcome for effective neural computation. The present project will investigate, via computational modeling, the hypothesis that unreliable synaptic transmission can improve learning performance of neural circuits in the brain.
GRANT: Grass Foundation, USA.



Venkat Ramaswamy


Conceptual



Contact Us

Ashwin Srinivasan
Senior Professor and Head, APPCAIR
BITS Pilani, K.K. Birla Goa Campus
ashwin@goa.bits-pilani.ac.in
+91 832 2580 111