Students' Abstracts


M. Saravanan (2008)
Thesis title: Ontology-Based Retrieval and Automatic Summarization of Legal Judgments
Ericsson R &D, Chennai


In the current scenario of information overload, the most challenging task before the computing society is to devise methods for efficient retrieval and timely delivery of relevant information in short and precise segments to the end user. The increasing availability of legal judgments in digital form creates opportunities and challenges for both the legal community and for information technology researchers. While digitized documents facilitate easy access to a large number of documents, finding all documents that are relevant to the task at hand and comprehending a vast number of them are non-trivial tasks. In this thesis we address the issues of legal judgment retrieval and of aiding in rapid comprehension of the retrieved documents. The usual practice of the legal community is that of reading the summaries (headnotes) instead of reading the entire judgments. Moreover, the legal community is also in search of an end-to-end legal information retrieval system which can give a solution for their day- to-day activities. In our thesis, a system has been proposed and tested for creating headnotes automatically from the retrieved legal judgments.

Rhetorical roles are used to represent the collection of sentences under common titles, according to their function in a narrative. Imposing a structure on a document using such roles greatly aids in comprehension. In our study, seven rhetorical roles namely identifying the case, establishing the facts of the case, arguing the case, history of the case, arguments, ratio decidendi and final decision have been identified for this purpose. We pose the problem of performing genre analysis for identifying these roles present in a judgment as one of automatic segmentation of a document. Conditional Random Fields, a discriminative graphical model has been employed in our work for text segmentation and the resulting labels are used to structure the judgments and the corresponding headnotes to aid in better comprehension.

The other major problem we address is that of retrieval of judgments relevant to the cases a legal user is currently involved in. To facilitate this we have developed a novel framework for constructing a legal knowledge base in the form of a legal ontology. Ontologies ensure efficient retrieval of resources by enabling inferences based on domain knowledge. The knowledge base developed is used to enhance the query given by the user in order to retrieve more relevant judgments. The retrieved judgments are considered for generating a final summary.

We have used term distribution model to extract important sentences from the legal judgments for the automatic summarization task. Another major issue to be handled in our study is to generate an "user-friendly" summary at the end. The rhetorical roles identified in the earlier phase have been used to improve the final summary. That is, we can significantly improve extraction-based summarization to generate coherent and concise outputs that incorporate important features of the legal judgment, namely the ratio decidendi and final decision.

The documents considered for study in this thesis are from three different sub-domains viz. rent control, income tax and sales tax related to civil court judgments. The proposed model was evaluated on a specific data collection spanning three legal sub domains. The performances of our system and other auto-summarizers available in the public domain were compared with the summary generated by a set of human subjects. It is found that, at different stages, our system-generated output is close to the outputs generated by human subjects, and it is better than the other tools considered in the study. Thus, the present work comprises almost all the aspects of finding relevant information in the document space for helping the legal communities in their information needs.

Pragathi Priyadarshini B. (2015)(BT, Jointly with Prof. V. Srinivasa Chakravarthy)
Thesis title: A Risk Based Decision Making Model Combining the Functions of Dopamine and Serotonin in the Basal Ganglia
Predoctoral fellow, IIT Madras


The research work presented in this thesis proposes a computational model that reconciles the various functions of neuromodulators dopamine (DA) and serotonin (5HT) in the basal ganglia (BG), viz., risk sensitivity, time scale of reward- punishment prediction, and reward-punishment sensitivity. A utility-based approach is proposed to be more suitable to model the actions of DA and 5HT in the BG, compared to the purely value-based approaches adopted in existing literature. The value function represents the expectation of the sampled reward outcomes, while the utility function is a combination of value and risk function that captures the variance associated with the observed reward samples. The thesis begins with an abstract, utility-based model that reconciles three divergent functions of DA and 5HT in BG-mediated decision making processes. This is further developed into a network model representation of the BG in the later chapters of the thesis.

Basal Ganglia (BG) is a group of subcortical nuclei involved in wide ranging functions such as cognition and decision making, voluntary motor control, timing, procedural memory and emotions. The diverse functions of the BG are coordinated by key neuromodulators including DA and 5HT. Loss of dopaminergic cells in Substantia Nigra pars compacta, a mesencephalic nucleus, is the primary etiology for Parkinson's disease (PD), a neurodegenerative disorder. There is evidence that, in addition to DA deficiency, PD is characterized by serotonergic changes. Models of the BG often aim to explain functions of BG in both control and PD conditions. The series of models presented in this thesis also seek to explain the BG functions in control and pathological conditions.

A large body of modeling literature has grown around the idea that the BG system is a Reinforcement Learning engine. A quantity known as temporal difference (TD) error is thought to be analogous to dopamine signal, while another parameter called the discount factor or time scale of prediction, is related to 5HT. The first computational model of the BG presented in this thesis (Chapter 4), applies these ideas to explain impairments in Parkinsonian gait (Muralidharan et al., 2014). We then introduce the utility function, as a preparation to the full abstract model presented later in Chapter 5, and explain features of precision grip performance in control and PD conditions (Gupta et al., 2013).

Although empirical studies show that 5HT plays many functional roles in risk-reward-punishment learning, computational models mostly focus on its role in behavioral inhibition or time scale of prediction. Then presented is a abstract, RL-based model of DA and 5HT function in the BG, a model that reconciles some of the diverse roles of 5HT. The model uses the concept of the utility function - a weighted sum of the traditional value function expressing the expected sum of the rewards, and a risk function expressing the variance observed in reward outcomes. Serotonin is represented by a weight parameter, used in this combination of value and risk functions, while the neuromodulator dopamine (DA) is represented as reward prediction error as in the classical models. The proposed 5HT-DA abstract model is applied to data from different experimental paradigms used to study the role of 5HT: 1) Risk-sensitive decision making, where 5HT controls the risk sensitivity; 2) Temporal reward prediction, where 5HT controls time-scale of reward prediction, and 3) Reward/Punishment Sensitivity, where punishment prediction error depends on 5HT levels. Thus this abstract and extended RL model explains the three diverse roles of 5HT in a single framework. The model is also shown to be efficient in explaining the effects of medications on reward/punishment learning in PD patients (Balasubramani et al., 2014).

Little is known about the neural correlates of risk computation in the subcortical BG system. The later part of the thesis deals with a network model that is conservatively built from the earlier described abstract model. At the core of the proposed network model is the following insight regarding cellular correlates of value and risk computation. Just as the DA D1 receptor (D1R) expressing medium spiny neurons (MSNs) of the striatum are thought to be neural substrates for value computation, we propose that DA D1R and D2R co-expressing MSNs that occupy a substantial proportion of the striatum, are capable of computing risk. This is the first-of-its-kind model to account for the significant computational possibilities of these co- expressing D1R-D2R MSNs, and describes how the DA-5HT mediated activity in these classes of neurons (D1R-, D2R-, D1R-D2R- MSNs) contribute to the BG dynamics. Firstly the network model is shown for consistently explaining all the results emerging out of the earlier abstract model. This includes reconciling the multifarious functioning of the DA-5HT in the BG through the network model-risk sensitivity, timescale of reward prediction and punishment sensitivity. Furthermore, the network model is also shown to capture the PD patients' behavior in a probabilistic learning paradigm. The model predicts that optimizing 5HT levels along with DA medication might be quintessential for improving the patients' reward-punishment learning (Balasubramani et al., submitted).

All the above experiments tested the accuracy in the action selection. Finally a study to investigate the efficiency of the developed network model in a task analyzing the reaction times of subjects, is presented. This task also employs a probabilistic learning paradigm tested on healthy controls and PD patients with and without Impulse Control Disorder (ICD). Impulsivity involves irresistibility in execution of actions and is prominent in ON medication condition of PD patients. Therefore, four kinds of subject groups-healthy controls, ON medication PD patients with ICD (PD-ON ICD) and without ICD (PD-ON non-ICD), OFF medication PD patients (PD-OFF)-are tested. The proposed network model is able to infer the neural circuitry responsible for displaying ICD in PD condition. Significant experimental results are increased reward sensitivity in PD-ON ICD patients, and increased punishment sensitivity in PD-OFF patients. The PD-ON ICD subjects had lower reaction times (RT) compared to that of the PD-ON non-ICD patients. The models for PD-OFF and PD-ON are found to have lower risk sensitivity, while that of the PD-ON also has lower punishment sensitivity especially in ICD condition. The model for healthy controls shows comparatively higher risk sensitivity (Balasubramani et al., accepted).

Anirban Santara (2020) (IIT Kgp, Jointly with Prof. Pabitra Mitra) [Google Scholar]
Thesis title: Reinforcement Learning for Safe and Efficient Planning in Autonomous Driving
Research Engineer, Google Research, Bengaluru


The success of autonomous driving is contingent on the development of safe and efficient motion planning algorithms capable of working in multi-agent environments under stochasticity and partial observability. Reinforcement Learning (RL) provides a powerful framework for learning to behave in such environments. The focus of this thesis is to advance the state-of-the-art in reinforcement learning based motion planning for autonomous driving. We present MADRaS, an opensource Multi-Agent DRiving Simulator that is capable of simulating challenging driving tasks of high variance. We demonstrate how MADRaS can be used as a curriculum learning platform for training RL agents to drive a wide range of cars in different road tracks, navigate through traffic in narrow roads and prevent congestion and deadlocks through multi-agent cooperation.

In reinforcement learning, the agent's objective is specified in terms of a scalar reward function making its accurate description crucial for success in achieving the desired behavior. In order to bypass the need to hand-engineer reward functions, Imitation learning algorithms like Generative Adversarial Imitation Learning (GAIL) estimate an expert's reward function from its demonstrations and then maximize it using RL. We observe that although GAIL is effective in matching (and often, exceeding) the expert at mean performance, high-cost trajectories, corresponding to tail-end events of catastrophic failure, are more likely to be encountered by GAIL agents than the expert. To address this issue, we develop Risk-Averse Imitation Learning (RAIL) as an alternative to GAIL in risk-sensitive applications that achieves up to 89% reduction in tail-risk at benchmark continuous control tasks of OpenAI Gym.

The sample efficiency and convergence time of an RL algorithm heavily depend on the exploration method used. While human beings use knowledge from prior experiences at related tasks while exploring a new task, most exploration algorithms for RL use the information only from the current task-environment. We develop ExTra, a framework for Transfer-guided Exploration in which we leverage a known optimal policy of a related task for efficient exploration in a new task. We demonstrate that ExTra is capable of exceeding and complementing the performance of traditional exploration algorithms.


Sriram Raghavan (2007)
Thesis title: Distributed Algorithms for Hierarchical Area Coverage Using Teams of Homogeneous Robots


Covering a given area using a team of mobile robots poses several challenges. One such challenge lies in guaranteeing efficient coverage in the absence of complete information. The robots (agents) must cover the area in a manner that the overlap (repeated coverage) in coverage across all the robots is minimized. In this thesis, we introduce “overlap_ratio” to explicitly measure the overlap in coverage for various algorithms. This measure has been shown to adequately represent the performance of any coverage system as it effectively captures the resource usage as well as the time required for coverage. We show systematically how the area coverage algorithms, which perform complete coverage with minimum overlap, can be designed for a team of mobile robots. We begin by understanding the behavior of robots in a random-walk model and successively refine the model to provide the robots with additional capabilities and minimize the overlap. We gradually transition from random decision-making to deterministic decision-making and suggest appropriate algorithms to suit various application needs and capabilities. We also prove that arbitrarily large areas can be covered with simple and elegant coverage algorithms by hierarchically composing it using smaller areas called primitives. An associated theorem called the H2C theorem that provides a linear scaling of overlap ratio with exponential increase in area has been proved. Further, this theorem is applicable across arbitrary number of levels in hierarchy. We demonstrate the same experimentally through simulation. Performance of such multi-robot (agent) applications critically depends on the communication architecture that facilitates coordination. A generic architecture often turns out to be burdensome on the robot (agent) due to overhead. With significant increase in the number of applications, a lightweight architecture that supports reliable and robust delivery of coordination messages is the need of the hour. We explain the design of a Multi-Robot Coordination Architecture, called Pseudonet, based on Bluetooth, to develop area coverage algorithms through successive refinement. Pseudonet supports a wide variety of multi-agent applications in which communication is the primary mode of coordination. We illustrate the use of Pseudonet architecture in the multi-robot area coverage application.

B. H. Sreenivasa Sarma (2007)
Thesis title: Intelligent Tutoring System using Reinforcement Learning
LSI Technologies, Bangalore


Intelligent Tutoring System (ITS) is an interactive and effective way of teaching students. ITS gives instructions about a topic to a student, who is using it. The student has to learn the topic from an ITS by solving problems. The system gives a problem and compares the solution it has with that of the student’s and then it evaluates the student based on the differences. The system continuously updates a student model by interacting with the student. Usually, ITS uses artificial intelligence techniques to customize its instructions according to the student’s need. For this purpose the system should have the knowledge of the student (student model) and the set of pedagogical rules. Disadvantages of these methods are, it is hard to encode all pedagogical rules, it is difficult to incorporate the knowledge that human teachers use and it is difficult to maintain a student model and also these systems are not adaptive to new student’s behavior. To overcome these disadvantages we use Reinforcement Learning (RL) to take teaching actions. RL learns to teach, given only the state description of the student and reward. Here, the student model is just the state description of the student, which reduces the cost and time of designing ITS. The RL agent uses the state description as the input to a function approximator, and uses the output of the function approximator and reward, to take a teaching action. To validate and to improve the ITS we have experimented with teaching a simulated student using the ITS. We have considered issues in teaching a special case of students suffering from autism. Autism is a disorder of communication, socialization and imagination. An Artificial Neural Network (ANN) with backpropagation is selected as simulated student, that shares formal qualitative similarities with the selective attention and generalization deficits seen in people with autism. Our experiments show that an autistic student can be taught as effectively as a normal student using the proposed ITS.

Vimal Mathew (2007)
Thesis title: Automated Spatio-Temporal Abstraction in Reinforcement Learning
PhD, UMass Amherst


Reinforcement learning is a machine learning framework in which an agent manipulates its environment through a series of actions, receiving a scalar feedback signal or reward after each action. The agent learns to select actions so as to maximize the expected total reward. The nature of the rewards and their distribution across the environment decides the task that the agent learns to perform. Consider a generic agent solving multiple tasks in a common environment. The effectiveness of such an agent increases with its ability to reuse past experience in new tasks. Identifying similar representations between tasks paves the way for a possible transfer of knowledge between such tasks. A sufficiently broad definition of similarity and an ability to effectively store and reuse previous experience can greatly increase the long-term performance of a multipurpose agent. This thesis introduces an automated task-independent abstraction method that generalizes across tasks sharing the same state-space topology and similar dynamics. New tasks performed on this common environment are able to reuse skills acquired over previous tasks. Pruning techniques are identified for specific situations to identify relevant skills. The interaction of the agent with the environment is modeled as a dynamical system. Identification of metastable regions in the agent dynamics leads to a partitioning of the state space. Transitions between metastable regions, which normally are events of low probability during a random-walk on the state space, are identified as useful skills.

Shravan Matthur Narayanamurthy (2007)
Thesis title: Abstraction using Symmetries in Markov Decision Processes
Yahoo Labs, Bangalore


Current approaches to solving Markov Decision Processes (MDPs), the de-facto standard for modeling stochastic sequential decision problems, scale poorly with the size of the MDP. When used to model real-world problems though, MDPs exhibit considerable implicit redundancy, especially in the form of symmetries. However, existing model minimization approaches do not efficiently exploit this redundancy due to symmetries. These approaches involve constructing a reduced model first and then solving them. Hence we term these as “explicit minimization” approaches. In the first part of this work, we address the question of finding symmetries of a given MDP. We show that the problem is Isomorphism Complete, that is, the problem is polynomially equivalent to verifying whether two graphs are isomorphic. Apart from the theoretical importance of this result it has an important practical application. The reduction presented can be used together with any off-the-shelf Graph Isomorphism solver, which performs well in the average case, to find symmetries of an MDP. In fact, we present results of using NAutY (the best Graph Isomorphism solver currently available), to find symmetries of MDPs. We next address the issue of exploiting the symmetries of a given MDP. We propose the use of an explicit model minimization algorithm called the G-reduced image algorithm that exploits symmetries in a time efficient manner. We present an analysis of the algorithm and corroborate it with empirical results on a probabilistic GridWorld domain and a single player GridWorld Soccer domain. We also present results of integrating the symmetry finding with the explicit model minimization approach to illustrate an end-to-end approach for “Abstraction using Symmetries in MDPs”. We then note some of the problems associated with this explicit scheme and as a solution present an approach wherein we integrate the symmetries into the solution technique implicitly. However, we should select a suitable solution technique so that the overheads due to integration do not outweigh the improvements. We validate this approach by modifying the Real Time Dynamic Programming (RTDP) algorithm and empirically demonstrate significantly faster learning and reduced overall execution time on several domains.

Aakanksha Gagrani (2007) (Jointly with Prof. Koshy Varghese, CE, IITM)
Thesis title: Image Modeling using Hierarchical Conditional Random Fields
Oracle, Hyderabad


Image modeling methods broadly fall under two categories, (a) Methods which consider bottom up cues , these methods take pixel statistics into consideration for classification. (b) Methods which consider top down cues as well, these methods utilize the context information as well for the modeling task. Natural images exhibit strong contextual dependencies in the form of spatial interactions among components. For example, neighboring pixels tend to have similar class labels, and different parts of an object are related through geometric constraints. Going beyond these, different regions e.g., objects such as, monitor and keyboard appear in restricted spatial configurations. Modeling these interactions is crucial for achieving good classification accuracy. In this thesis, we present a method based on Hierarchical Conditional Random Fields that can handle both bottom up and top down cues simultaneously. This offers tremendous advantages over existing methods which use either bottom up cues, or model only limited contextual information. The Tree Structured Conditional Random Field (TCRF) as proposed, is capable of capturing long range correlations at various levels of scales. TCRF has non loopy graph structure whichallows us to perform inference in time linear in number of nodes. The model is generic enough to be applied to several challenging computer vision tasks, such as object detection and image labeling; seamlessly within a single, unified framework.

Dinakar Jayarajan (2009)(Jointly with Dr. Dipti Deodhare, CAIR-DRDO)
Thesis title: Using Semantics in Document Representation: A Lexical Chains Approach


Automatic classification and clustering are two of the most common operations performed on text documents. Numerous algorithms have been proposed for this and invariably, all of these algorithms use some variation of the vector space model to represent the documents. Traditionally, the Bag of Words (BoW) representation is used to model the documents in a vector space. The BoW scheme is a simple and popular scheme, but it suffers from numerous drawbacks. The chief among them is that the feature vectors generated using BoW results in very large dimensional vectors. This creates problems with most machine learning algorithms where the high dimensionality severely affects the performance. This fact is manifested in the current thinking in the machine learning community, that some sort of a dimensionality reduction is a beneficial preprocessing step for applying machine learning algorithms on textual data. BoW also fails to capture the semantics contained in the documents properly. In this thesis, we present a new representation for documents based on lexical chains. This representation addresses both the problems with BoW - it achieves a significant reduction in the dimensionality and captures some of the semantics present in the data. We present an improved algorithm to compute lexical chains and generate feature vectors using these chains. We evaluate our approach using datasets derived from the 20 Newsgroups corpus. This corpus is a collection of 18,941 documents across 20 Usenet groups related to topics such as computers, sports, vehicles, religion, etc. We compare the performance of the lexical chain based document features against the BoW features on two machine learning tasks - clustering and classification. We also present a preliminary algorithm for soft clustering using the lexical chains based representation, to facilitate topic detection from a document corpus.

A. Yousuf (2011)
Thesis title: Visual Object detection using Frequent Pattern Mining
Nvidia, Bengaluru


Visual object detection is the problem of identifying an object from a visual scene or more generally, addressing object queries like, What are the objects present in the scene ? How many objects are there in the scene ?, etc. Performing reliable object detection is an important task as it is one of the key requirements for realizing a fully autonomous, general purpose robotic agent. Activities like playing a game, navigating a road, looking for a book in library, etc., are examples involving object detection. Humans perform these tasks incredibly well. But this is a very hard problem for a learning agent. Part of the difficulties arise from the visual processing involved as the objects can appear in different scale, lighting conditions, rotation, occlusion etc. Vision researchers have come up with representations which are robust to these visual variations and can reliably detect objects even in the case of noise. Apart from the visual challenges, there are machine learning challenges of generalization, discrimination and incremental addition. From the training samples the agent has to learn representations which can generalize well over objects within the class and discriminate between objects of different classes. For example, the representation learned for class cup should generalize various cups of different size, and shape model and should distinguish it from other classes say ball. As the agent interacts with the environment, it may encounter new objects. It needs to learn and update its knowledge base so that it can detect the new object class. This incremental update of knowledge base is very desirable for a vision system and the re-learning involved should be minimal.

In this work we propose a system which addresses the learning challenges explained above. The key idea here is to learn those visual properties which are common among the objects of the same class so that these properties can be later used for detecting the class. Shape, color, size, texture, etc., are examples of visual properties. A combination of these properties can represent each class and the learning is done in three phases - pattern mining phase, scoring phase and re-scoring phase. In the pattern mining phase, training image of each class is processed and the various visual properties are extracted. From these visual properties, those patterns that frequently occur are learned using frequent pattern mining. Since we are considering the frequency alone, it might happen that the operators learned will include those which covers more background pixels than the foreground ones or whose coverage is superseded by other operators. The scoring phase is used to eliminate such erroneous or extra operators. To handle discrimination and incremental addition, we have the re-scoring phase, which re- scores the operator sets learned based on how popular the operator sets are among other classes. Generally those operators which are common among many classes are considered as bad choices for discriminating the classes. Re-scoring phase tries to remove such operators by switching them with other alternatives. The system is empirically evaluated with the Caltech-101 data set and the results are presented.

Abhishek Ghose (2011)
Thesis title: Supervised Lexical Chaining
Adobe, Bengaluru


Lexical chaining is a method of grouping semantically related words in a document. Groups thus obtained are known as lexical chains. Lexical chains provide a rich representation of text, and have been used in various tasks like discourse analysis, summarization, corrections of malapropisms, amongst many others, with reasonable success. However, despite the general applicability of the method, its use is limited by the fact that chaining algorithms often group weakly related or unrelated words together. The large amount of time required for mining chains from a document also make it unsuitable for certain tasks. In our research, we look at applying supervised learning methods to address these drawbacks.

We propose two supervised algorithms as viable alternatives to classical algorithms. These algorithms rely on certain probabilistic properties of usage of words in text. We empirically establish the relevance of these properties to rapid construction of high quality lexical chains through experiments we have performed. Using lexical chains formed over sense tagged documents as training data, along with a knowledge of these properties, our algorithms are shown to be capable of reliably constructing chains on a test set of documents.

Although, we believe that exploring supervised learning for chaining is a worthy investigation in its own right, we provide certain encouraging results in defence of the approach. We compare our algorithms to a classical chaining algorithm and report a 44% improvement in quality and 55 times improvement in speed. Our experiments were performed on the SemCor 3.0 dataset.

R. Vijayasarathy (2012) (External, SETS)
Thesis title: A Systems Approach to Network Modelling for DDoS Attack Detection using Naive Bayes Classifier


Denial of Service attacks are a major threat to the modern electronic society. Carefully crafted attacks of large magnitude, better referred to as Distributed Denial of Service attacks (DDoS) have the ability to cause havoc at the highest level, national information infrastructure. The ease at which such attacks can be performed today is startling given the number of free online tools available in the open. An ideal situation would be to differentiate between good and bad packets as generally attempted and accomplished in the case of intrusion attempts, which in the case of DoS attacks is extremely hard as only the intent differs between a genuine user and an attacker.

There are multiple perspectives of dealing with the DoS attack problem, in order to mitigate, detect and cope with detected attack. Techniques dealing with each of these aspects can be integrated to make a fool-proof system. An important such perspective in terms of detecting DoS attacks is to view the problem as that of a classification problem on network state (and not on individual packets or other units) by modelling normal and attack traffic and classifying the current state of the network as good or bad, thereby detecting attacks when they happen. Classical machine learning algorithms are used to solve the problem.

In this work, the approach to a carefully engineered, practically realisable system to detect DoS attacks based on a Nai�ve Bayesian classifier is described. The system is designed to be near the target. The focus of the system is on two transport layer protocols - TCP and UDP, both of which work on a common framework of mechanisms, although with different input parameters.

For the TCP protocol, TCP flags are taken as the feature, and the probability space is engineered taking into account hardware implementability and operability at line speeds. Simple independence assumptions are made inside a window of packets, and the probability of a window is determined as a function of the probabilities of flag combinations observed inside the window. Threshold probability is determined by cross validation - a technique commonly used in data mining. For the UDP protocol, the feature is the Windows Arrival Time (WAT), the time taken for a window of packets to arrive at the mitigation device. Techniques similar to TCP are used.

Experiments are carried out with both tagged and untagged datasets, and the results are found to be promising. In order to assert the efficiency and usefulness of our system, the system was compared with a one-class SVM classifier, and found to be more suitable for hardware implementation.

S. Shivashankar(2012)
Thesis title: Multi-view based Approaches for Collective Learning and Retrieval Tasks
Ericsson R & D, Chennai


Machine learning techniques can be used to build systems to perform various tasks such as classification, clustering, retrieval, etc. In traditional machine learning algorithms, the data points are represented in a d-dimensional space which defines a view of the data. But many real-world datasets possess rich additional information that can be leveraged to improve the system's performance, reduce labeling cost, etc. For example, in webpage classification the content of the webpage and text on hyperlink pointing to this page can be used as two different views. Multi-view learning is a technique which allows the use of multiple views where a model built on one of the views can aid learning the models built on other views. It has been shown that multi-view approaches are more effective than single-view approaches for various domains. In this work, we propose multi-view based approaches for the following two tasks:
  • Linked Data Classification: In general, all the machine learning paradigms assume the data points to be independent and identically distributed (i.i.d). But many real-world datasets such as webpages, images and protein interaction networks possess additional link information which provides local label smoothness. For example, in webpage classification, links between webpages convey that there is a strong correlation between labels of the linked pages. This can be exploited to provide regularization and thereby improve the performance. Collective classification is one of the popular approaches that can handle this type of network data. It combines traditional machine learning and link based classification in an iterative procedure. This involves two important steps: 1) collective learning - training the base classifier on the appended content and link information with the given (partially) labeled network data; and 2) collective inference - applying the trained base classifier on the partially labeled network data and labeling it completely. Most of the collective classification frameworks assume that sufficient labeled data is available for training the base classifier. The trained base classifier is applied on the test data iteratively to perform collective inference. But, there has been little attention paid towards collective learning on partially labeled network data. We propose a multi-view learning based algorithm which utilizes both content and link views effectively to learn from the unlabeled data also. We empirically evaluate the effectiveness of the proposed framework by showing that it performs better than traditional single view approaches on sentiment analysis and standard linked datasets such as Cora, Citeseer and WebKB. We also performed sentiment analysis on automobile reviews.
  • Similar Protein Structures Retrieval: With the rapid expansion of protein structure databases, the task of efficiently retrieving structures similar to a given protein has gained importance. It involves two major issues: (i) an effective protein structure representation that captures inherent relationship between fragments and facilitates efficient comparison between structures, (ii) an effective framework to accommodate different retrieval requirements. Recently, researchers proposed a vector space model of proteins using bag of fragments representation (FragBag), which corresponds to a basic information retrieval model. In this work, first, we propose an improved representation of protein structures using Latent Dirichlet Allocation (LDA) topic model. Second, we handle the most important requirement which is to retrieve proteins, whether they are close or intermediate or remote homologs. Close, remote and intermediate homologs are defined based on Structural Alignment Score (SAS). The SAS thresholds typically are 2, 3.5 and 5 for close, intermediate and remote homologs respectively. In order to meet diverse objectives, we propose a multi-view based framework that combines multiple representations and retrieval techniques. We experimentally show that the proposed multi-view based retrieval framework performs better than state-of-the-art filter and match techniques across different retrieval requirements. The experiments are performed on FragBag dataset, which contains 2,930 sequence non-redundant structures selected from CATH version 2.4.

Priya Anna Mani (2012)
Thesis title: A Hierarchical Markov Logic Based Framework for Reasoning with Incomplete Visual Evidence
Arista Networks, Bengaluru


Visual perception is a key function for an embodied agent to interact with its environment for complex object manipulation tasks. The theory of visual routines suggests a framework for employing perception to solve high-level vision tasks in a cognitively oriented way. But a major challenge in building vision systems for embodied agents is that the evidence obtained from sensors is uncertain and incomplete, i.e., the results of operation of visual routines are not completely reliable. This is due to the inherent limitations of the equipment in terms of field-of-view and resolution of the camera, which causes the input image to be of low fidelity. Moreover, the application of visual operators may yield spurious or imprecise evidence and choosing the right parameters is hard. We propose a novel approach for inference over uncertain and incomplete evidence, using Markov Logic Networks (MLN) and active vision in a hierarchical framework and evaluate it using an object categorization task.

Markov Logic Networks belong to the class of Statistical Relational Learning (SRL) methods that combine the expressiveness of first-order logic and the ability of probability theory to handle uncertainty. MLNs extend Markov networks to a relational setting by expressing the knowledge as a set of weighted formulas. We propose a layered MLN design which performs stage-wise inference to allow for reasoning at multiple levels and at varying levels of uncertainty. Given that the information is incomplete, active vision is a mechanism for focused gathering of additional information. Our framework integrates active vision with the layered MLN model to gather missing evidence, facilitating reliable and tractable inference. Inspired by the ideas of active vision, in the event of missing evidence, our framework restricts the selective visual processing to specific regions of the input image and further inference is carried out incorporating the new evidence.

We present a cognitively motivated, complete end-to-end system for object categorization in a SRL framework. We use three different datasets for experimental evaluation: synthetic images generated using OpenCV library, images obtained from the iCub humanoid simulator and real images taken from Microsoft's Kinect Xbox (R). The system is evaluated with different levels of incompleteness and noise on these datasets and empirically prove its applicability to detect objects of complex structures.

K. V. N. Pradyot (2013)
Thesis title: Beyond Rewards: Learning from Richer Supervision
Hitech Robotics Systemz Ltd., Gurgaon


Robots have captivated human imagination for a long time. Numerous science fictions have promised that the era of robots is beckoning and the day androids walk shoulder to shoulder with us is not too far away. If this were to happen, the intelligent systems that would populate our environment will have to adapt and learn fast, possibly from humans too. For such systems, human teachers present a narrow but critical learning environment. Even between humans, the interactive teacher-student relationship is known to be effective. Endowing robots with human-like learning capabilities would facilitate a similar human- machine interaction, resulting in effective utilization of human knowledge by the robot. We model a subset of these interactions as instructions and propose a framework that enables humans to instruct robots and robots to exploit these supervisory inputs. In all of our experiments, the systems are based on Reinforcement Learning (RL).

In RL, rewards have been considered the most important feedback in understanding environments. However, recently there have been interesting forays into other modes such as sporadic human instructions. We utilize these instructions to identify structural regularities in the environment that can aid in taking behavioural decisions in unfamiliar situations. An important aspect of working with instructions is their proper interpretation. Our approach accommodates multiple interpretations of instructions and provides a handle to choose the best. In this regard, we have tested our approach on several domains and implemented it on a real robotic system.

Karthick Rangadurai (2014)
Thesis title: Sequence Modelling to Detect Network Intrusions
Cisco Systems, Bengaluru


Any activity aimed at disrupting a service or making a resource unavailable or gaining unauthorized access can be termed as an intrusion. Examples include buffer over-flow attacks, flooding attacks, system break-ins, etc. Intrusion Detection Systems (IDS) play a key role in detecting such malicious activities and enable administrators in securing network systems. Two key criteria should be met by an IDS for it to be effective: (a) ability to detect unknown attack types (b) having very less misclassification rate.

Over the last decade, using Machine Learning (ML) techniques to address the problem of detecting network intrusions has become prevalent. The ability to detect novel intrusions sets apart ML based systems from standard signature based intrusion detection methods. In this work, the problem of detecting network intrusions is posed as a classification problem. The classifier built should be able to separate good/legitimate traffic from bad/anomalous traffic. Machine Learning (ML) based classfiers are used to build our IDS. It profiles clean traffic traces and uses the model built to classify unseen traffic.

In this work, we describe a two stage network intrusion detection system (NIDS) that can operate at high packet rates. In the first stage, which is in-line with the network, we use Naïve Bayesian based probabilistic classifier to detect potential anomalies in the trafic. In the second stage, a cascaded Hidden Markov Model (HMM) is used to pin down the potential IP address of misbehaving connections. Various design choices that were made to make this system practical and the difficulties faced while integrating the system with existing models are also described. Experiments were conducted to evaluate the performance and to validate the proposed model. We show that this system achieves good performance empirically.

Vishnu Sankar (2015)
Thesis title: Scalable Community Detection and Centrality Algorithms for Network Analysis
VMWare, Bengaluru


Communication has become a lot easier with the advent of easy and cheap means of reaching people across the globe. This has allowed the development of large networked communities and, with the technology available to track them, has opened up the study of social networks at unprecedented scales. This has necessitated the scaling up of various network analysis algorithms that have been proposed earlier in the literature. While some algorithms can be readily adapted to large networks, in many cases the adaptation is not trivial.

Real-world networks typically exhibit non-uniform edge densities with there being a higher concentration of edges within modules or communities. Various scoring functions have been proposed to quantify the quality of such communities. The popu- lar scoring functions suffer from certain limitations. This thesis identifies the necessary features that a scoring function should incorporate in order to characterize good community structure and propose a new scoring function, CEIL (Community detection using External and Internal scores in Large networks), which conforms closely with the characterization. It also demonstrates experimentally the superiority of CEIL score over the existing scoring functions. Modularity, a very popular scoring function, exhibits a resolution limit, i.e., one cannot find communities that are much smaller in size compared to the size of the network. In many real world networks, community size does not grow in proportion to the network size. This implies that resolution limit is a serious problem in large networks. Still modularity is very popular since it offers many advantages such as fast algorithms for maximizing the score, and non-trivial community structures corresponding to the maxima. This thesis shows analytically that the CEIL score does not suffer from resolution limit. It also modifies the Louvain method, one of the fastest greedy algorithms for maximizing modularity, to maximize the CEIL score. The modified algorithm, known as CEIL algorithm, gives the expected communities in synthetic networks as opposed to maximizing modularity. The community labels given by CEIL algorithm matches closely with the ground-truth community labels in real world networks. It is on par with Louvain method in computation time and hence scales well to large networks.

This thesis also explores the scaling up of a class of node centrality algorithms based on cooperative game theory which were proposed earlier as efficient alternatives to traditional measure of information diffusion centrality. It presents the distributed versions of these algorithms in a Map-Reduce framework, currently the most popular distributed computing paradigm. The scaling behavior of the distributed version of algorithms on large synthetic networks is demonstrated empirically thereby establishing the utility of these methods in settings such as online social networks.

Pratik Gupte (2015)
Thesis title: Epsilon Equitable Partitions based Approach for Role & Positional Analysis of Social Networks, Chennai


In social network analysis, the fundamental idea behind the notion of position is to discover actors who have similar structural signatures. Positional analysis of social networks involves partitioning the actors into disjoint sets using a notion of equivalence which captures the structure of relationships among actors. Classical approaches to Positional Analysis, such as Regular equivalence and Equitable Partitions, are too strict in grouping actors and often lead to trivial partitioning of actors in real world networks. An epsilon-Equitable Partition (eEP) of a graph is an useful relaxation to the notion of structural equivalence which results in meaningful partitioning of actors. All these methods assume a single role per actor, actors in real world tend to perform multiple roles. For example, a Professor can possibly be in a role of "Advisor" to his PhD students, but in a role of "Colleague" to other Professors in his department.

In this thesis we propose epsilon-equitable partitions based approaches to perform scalable positional analysis and to discover positions performing multiple roles. First, we propose and implement a new scalable distributed algorithm based on MapReduce methodology to find eEP of a graph. Empirical studies on random power-law graphs show that our algorithm is highly scalable for sparse graphs, thereby giving us the ability to study positional analysis on very large scale networks.

Second, we propose a new notion of equivalence for performing positional analysis of networks using multiple epsilon-equitable partitions. These multiple partitions give us a better bound on identifying equivalent actor "positions" performing multiple "roles".

Evaluation of our methods on multi-role ground-truth networks and time evolving snapshots of real world social graphs show the importance of epsilon equitable partitions for discovering positions performing multiple roles and in studying the evolution of actors and their ties.

A. P. Sarath Chandar (2015)
Thesis title: Correlational Neural Networks for Common Representation Learning
PhD, University of Montreal, Canada


Classical Machine Learning (ML) algorithms learn to do a specific task, say classification by using the available training data and classify the test data which is unknown during the training. However, all these algorithms assume that the data is represented using a set of features, which are mostly hand-crafted by human experts. Learning useful features or representations from the raw data, also known as Representation Learning is one of the hardest problems in Machine Learning. Deep Learning is an emerging field in Machine Learning which solves the problem of representation learning using Deep Neural Networks (DNNs). These representations learned using Deep Neural Networks, when coupled with simple classification algorithms, perform significantly better than the complex state-of-the-art procedures in text, image and speech data.

Common Representation Learning (CRL), wherein different descriptions (or views) of the data are embedded in a common subspace, is receiving a lot of attention recently. Two popular paradigms in CRL world are Canonical Correlation Analysis (CCA) based approaches and Autoencoder (AE) based approaches. CCA based approaches learn a joint representation by maximizing correlation of the views when projected on the common subspace. AE based methods learn a common representation by minimizing the error of reconstructing the two views. Each of these approaches has its own advantages and disadvantages. For example, though CCA based approaches outperform AE based approaches for the task of transfer learning, they are not as scalable as the latter.

In this thesis, we propose Correlational Neural Networks (CorrNet), a class of neural networks that can learn common representation for two different views. CorrNet is an AE based approach, that explicitly maximizes correlation among the views when projected on the common subspace. The model can be efficiently trained with batch gradient descent and is thus scalable to big data. Apart from learning common representation for different views, the model also learns to predict the features in one view given the features in the other view, which has several interesting applications. CorrNet, unlike CCA, can be easily extended to more than two views. Experiments show that CorrNet outperforms CCA in learning common representation. Thus scalability is gained without compromising performance. The CorrNets come with another advantage, that is, it can make use of single view data when there are less parallel data and this is not possible in CCA.

CorrNet has been successfully applied in Natural Language Processing (NLP) to several cross lingual tasks where a learner has to learn from one language and perform in a different language. In the task of Cross Language Document Classification (CLDC), CorrNet based bilingual word representation learning algorithm performs significantly better than the current state of the art procedures and a strong Machine Translation baseline. The model is also applied in Transliteration Mining. It has been tested with several language pairs and works well even if the languages differ in scripts, since the model itself is not language dependent.

CorrNet can be considered as a variant of auto-encoder to handle multi-view data. In this thesis, we also propose ways to make the CorrNet deep. One of the deep versions of the CorrNet performs much better than several strong baselines in the task of Cross Language Sentiment Analysis (CLSA). All the mentioned studies leads us to believe that CorrNet is a promising tool from common representation learning.

Saket Gurukar (2015) (Jointly with Sayan Ranu, CSE)
Thesis title: A Scalable Approach to Mining Communication Motifs from Dynamic Networks
CodeNation, Bengaluru


Social networks have become an effective means of communication among people. Recently, there is a trend to analyze these social networks to infer the dynamics of human interaction. A fundamental problem in behavioral analysis of human interactions is to understand how communications unfold. In this thesis, we propose and solve the problem of mining Communication motifs from dynamic interaction networks. Simply stated, a communication motif is a recurring subgraph that has a similar sequence of information flow. Communication motifs provide a powerful mechanism to capture the dynamics of human interactions.

Existing work show that communication motifs reveal how the functional behavioral patterns evolve with time, how the structures of these patterns change with the social network, and finally, how the social network influences the speed and amount of information exchanged in communications between individuals. However, no technique is proposed for mining these motifs in a scalable manner. Mining communication motifs requires us to explore the exponential subgraph search space where existing techniques fail to scale. To tackle this scalability bottleneck, we develop a technique called COMMIT. COMMIT converts a dynamic graph into a database of sequences. Through careful analysis in the sequence space, only a small portion of the exponential search space is accessed to identify regions embedding communication motifs. We also store the pointers to these regions as a result costly subgraph enumeration step is avoided.

We perform extensive experiments on three real world datasets and evaluate the proposed COMMIT based on accuracy and scalabilty. We find that COMMIT is up to two orders of magnitude faster than baseline techniques. COMMIT can also mine large size communication motifs where existing algorithms fails. Furthermore, qualitative analysis demonstrate communication motifs to be effective in characterizing the recurring patterns of interactions while also revealing the role that the underlying social network plays in shaping human behavior.

S. S. Manimaran (2016)
Thesis title: Towards an Autonomous Human Inclusive Robot Learning Framework
Epic, Wisconsin, USA


Building intelligent robots has been one of the motivating problems in Artificial Intelligence research. Choosing what actions to perform is a fundamental problem every robot must solve. The aptness of the actions depends on the robot's understanding of its environment and the dynamics involved. This understanding is an ever-changing model based on the robot's observations and experiences. In this thesis, we propose a robot-learning architecture and discuss two key modules - a continuous space path planning algorithm and an activity recognition framework.

Path planning in continuous spaces is a challenging problem for which a variety of solutions have been proposed. The performance of sampling based path planning techniques relies on identifying a good approximation to the cost-to-go distance metric, in the case of systems with complex dynamics. We propose a technique that uses reinforcement learning to learn this distance metric on the fly from samples and combine it with existing sampling based planners to produce near optimal solutions. The resulting algorithm - Policy Iteration on Rapdily exploring Random Trees (RRTPI) can solve problems with complex dynamics in a sample-efficient manner while preserving asymptotic guarantees. We provide experimental evaluation of this technique on domains with under-actuated and underpowered dynamics such as the mountain car domain, the acrobot domain and a variant of the octopus arm domain.

Human activity recognition is a key component in enabling social robots to naturally interact with humans. While humans can easily distinguish whether an activity corresponds to a form of communication or an act of daily living, robots are yet not that insightful. This thesis addresses the problem of enabling a social robot to understand an activity as well as classify it as an action corresponding to a human instruction or an act of daily living. We address this problem by presenting a simple yet effective approach of modeling pose trajectories in terms of directions taken by skeletal joints over the duration of an activity. Specifically, we propose a novel activity descriptor: Histogram of direction vectors (HODV). The descriptor is computationally efficient, scale and speed invariant and provides state of the art classification accuracies using off the shelf classification algorithms on multiple datasets such as the Cornell Activity Dataset (CAD-60), UT-Kinect Dataset as well as on our novel RGBD dataset.

Avijit Saha (2016)
Thesis title: Scalable Bayesian Factorization Models for Recommender Systems
Flipkart, Bengaluru


In recent years, Recommender Systems (RSs) have become ubiquitous. Factorization models are a common approach to solve RSs problems, due to their simplicity, prediction quality and scalability. The idea behind such models is that preferences of a user are determined by a small number of unobserved latent factors. One of the most well studied factorization model is matrix factorization using the Frobenius norm as the loss function. In more challenging prediction scenarios where additional "side-information" is available and/or features capturing higher order may be needed, new challenges due to feature engineering needs arise. The side-information may include user specific features such as age, gender, demographics, and network information and item specific information such as product descriptions. While interactions are typically desired in such scenarios, the number of such features grows very quickly. This dilemma is cleverly addressed by the Factorization Machine (FM), which combines high prediction quality of factorization models with the flexibility of feature engineering. Interestingly, the framework of FM subsumes many successful factorization models like matrix factorization, SVD++, TimeSVD++, Pairwise Interaction Tensor Factorization (PITF), and factorized personalized Markov chains (FPMC). Also, due to the availability of large data, several sophisticated probabilistic factorization models have been developed. However, in spite of having a vast literature on factorization models, several problems exist with different factorization model algorithms.

In this thesis, we take a probabilistic approach to develop several factorization models. We adopt a fully Bayesian treatment of these models and develop scalable approximate inference algorithms for them.

Bayesian Probabilistic Matrix Factorization (BPMF), which is a Markov chain Monte Carlo (MCMC) based Gibbs sampling inference algorithm for matrix factorization, pro- vides state-of-the-art performance. BPMF uses multivariate Gaussian prior on latent factor vector which leads to cubic time complexity with respect to the dimension of latent space. To avoid this cubic time complexity, we develop the Scalable Bayesian Matrix Factorization (SBMF) which considers independent univariate Gaussian prior over latent factors. SBMF, which is a MCMC based Gibbs sampling inference algorithm for matrix factorization, has linear time complexity with respect to the dimension of latent space and linear space complexity with respect to the number of non-zero observations.

We then develop the Variational Bayesian Factorization Machine (VBFM) which is a batch scalable variational Bayesian inference algorithm for FM. VBFM converges faster than the existing state-of-the-art MCMC based inference algorithm for FM while providing similar performance. Additionally for large scale learning, we develop the Online Variational Bayesian Factorization Machine (OVBFM) which utilizes stochastic gradient descent to optimize the lower bound in variational approximation. OVBFM outperforms existing online algorithms for FM as validated by extensive experiments performed on numerous large-scale real world data.

Finally, the existing inference algorithm for FM assumes that the data is generated from a Gaussian distribution which may not be the best assumption for count data such as integer-valued ratings. Also, to get the best performance, one needs to cross-validate over the number of latent factors used for modeling the pairwise interaction in FM, a process that is computationally intensive. To overcome these problems, we develop the Nonparametric Poisson Factorization Machine (NPFM), which models count data using the Poisson distribution, provides both modeling and computational advantages for sparse data. The ideal number of latent factors is estimated from the data itself. We also consider a special case of NPFM, the Parametric Poisson Factorization Machine (PPFM), that considers a fixed number of latent factors. Both PPFM and NPFM have linear time and space com- plexity with respect to the number of non-zero observations. Using extensive experiments, we show that our methods outperform two strong baseline methods by large margins.

Prasanna Parthasarathy (2017)
Thesis title: Understanding Exploration Strategies in Model Based Reinforcement Learning
PhD, McGill University, Montreal, Canada


Reinforcement Learning (RL) is a collection of learning approaches that solve for a behavior of an agent in a world maximizing some notion of payoff. Under certain circumstances the agent's world, described by a set of parameters, is essential to be learnt for it to discover an optimal behavior. This paradigm of problems is studied under the broad topic of model-based RL, and the class of algorithms that help an agent in learning the model parameters are known as model learning algorithms. In this thesis, we discuss exploration strategies in model-based RL from two different perspectives - asymptotic agent, where early convergence to an optimal solution is desirable but not a necessity, and finite-episode agent, which has a constraint on the number of episodes to converge to an optimal behavior.

The first part of this thesis proposes a novel uncertainty based exploration strategy, Thompson Sampling with Exploration Bonus (TSEB), for the asymptotic agent case. This work draws insights from Thompson Sampling, a Bayesian approach to modelbased RL, and the homomorphism literature in RL. The proposed algorithm helps the agent learn close to true model parameters of the world with lesser sample complexity than the existing model learning algorithms. TSEB algorithm trades off the proposed exploration bonus with the sampled reward estimate in Thompson Sampling algorithm. This strategy, with the elegance of Bayesian approach, provides a better way to learn model parameters and is validated theoretically and empirically in this thesis.

Despite having sound strategies for the asymptotic case, the finite-episode case in RL has been seldom looked into. The other part of this thesis studies the exploration strategies of finite-episode agents and proposes a novel framework, Structuring FiniteEpisode Exploration (SFEE) in Markov Decision Processes, that adapts an asymptotic exploration strategy for finite-episode setting. This work justifies the need for such a framework by showing that the asymptotic exploration strategies don't suit the finite-episode setting which can be because of them not being conscious of the agent's lifetime. Aiming to make good use of the sound strategies of asymptotic learning algorithms, the proposed framework uses the policy of an asymptotic algorithm plugged into the framework for a certain number of episodes and then appropriately switches to a greedy policy. Further, this thesis shows, under some mild assumptions, that this is indeed the optimal way to explore in a finite-episode setting using that exploration algorithm. The framework also enables exporting theoretical guarantees from the plugged asymptotic algorithm to the proposed framework.

Subhojyoti Mukherjee (2018) (ID) (Jointly with Nandan Sudarsanam, DoMS)
Thesis title: Finite-time Analysis of Frequentist Strategies for Multi-armed Bandits
PhD, UMass Amherst, USA


This thesis studies the following topics in the area of Reinforcement Learning: classic Multi-armed bandits in stationary distribution with the goal of cumulative regret minimization using variance estimates and Thresholding bandits in pure exploration fixed-budget setting with the goal of instantaneous regret minimization also using variance estimates. The common underlying theme is the study of bandit theory and its application in various types of environments.

In the first part of the thesis, we study the classic multi-armed bandit problem with a stationary distribution, one of the first settings studied by the bandit community and which successively gave rise to several new directions in bandit theory. We propose a novel algorithm in this setting and compare both theoretically and empirically its performance against the available algorithms. Our proposed algorithm termed as Efficient-UCB-Variance (EUCBV) is the first arm-elimination algorithm which uses variance estimation to eliminate arms as well as achieve an order optimal regret bound. Empirically, we show that EUCBV outperforms most of the state-of-the-art algorithms in the considered environments.

In the next part, we study a specific type of stochastic multi-armed bandit setup called the thresholding bandit problem and discuss its usage, available state-of-the-art algorithms on this setting and our solution to this problem. We propose the Augmented-UCB (AugUCB) algorithm which again uses variance and mean estimation along with arm elimination technique to conduct exploration. We give theoretical guarantees on the expected loss of our algorithm and also analyze its performance against state-of-the-art algorithms in numerical simulations in multiple synthetic environments.

Bhanu Chandar V. (ED, Jointly with T. Asokan)
Thesis title: Geometrical Methods for Constructing 2D and 3D Maps using Single Perspective Image for Mobile Robot Path Planning
Schlumberger, Pune


In the field of mobile robotics, robot tasks usually involve navigation in a known or unknown environment, for which the robot needs a map of that environment and also needs to know its position within the operating area. Although there are techniques like SLAM (Simultaneous Localization and Mapping) where a map is constructed and/or updated while simultaneously localizing the robot, they do need a map for localizing. So, in any case, having a map at hand is advantageous.

Traditional map construction methods depend extensively on costly sensors. For indoor applications, the robot has to wander around the world numerous timesto catch as much data as possible to construct a map, mostly supervised by human. Vision based navigation systems are hence gaining popularity because of being cheap and easy to handle. While map building is easier using multiple vision systems or using multiple images, map construction from single image is definitely a harder problem, although its applications are immense.

In the domain of robotics, the idea of constructing maps from a single image taken from monocular cameras might solve several complex issues related to path planning and navigation. Owing to its diverse applications, the current research focus is on developing methods which would utilize the properties of the images. One such property being considered is Vanishing points (VPs). Existing methods on single camera vision systems depend either on multiple images or on prior information to be fed to the system. The focus of the research presented in this thesis is to develop geometrical methods that will take advantage of VPs for constructing scaled 2D free- space maps and 3D models from a single image, the image being either in two-point perspective (2PP) or three-point perspective (3PP).

Given an image with one or more objects in it, the proposed approach is to bring all the objects to a common ground depth w.r.t. the camera view point. This will ensure that all the objects' projections are to a common scale and thus a scaled map can be constructed. The principal contribution of this thesis is the concept of side view geometry of an image, which is first of its kind. All the mathematical derivations have been done using the proposed side view geometry. A systematic method has been developed, which has been proved as a generalized approach for both 2PP and 3PP cases. A generalized algorithm has been developed, coded, tested with some case studies, and the results are compared with the ground truths. The effectiveness of the proposed method is established through the results and the error plots.

A new methodology for path planning of mobile robots in known environments with static obstacles is also proposed in this thesis. A new concept of multi-bug system will be introduced. Multi Bug Path Planning (MBPP) works by assuming a virtual bug that move towards the goal from the start state. If in case it meets an obstacle, then that bug generates a new bug. The two bugs now move along walls of the obstacle in either direction until specific conditions are met. Bug generation continues whenever any of the current bugs hit a new obstacle, until target is reached by all live bugs. MBPP thus evaluates best possible paths for a given environment and chooses the best route that is supposed to be optimal. Experimentally, it will be shown that MBPP finds paths that are shorter and comparable with post-smoothed A* in lesser runtimes. Proposed work on MBPP is an attempt to combine the features of offline and online methods so that the same algorithm could be used for both cases. Current work on MBPP demonstrates the offline case with static obstacles in a priori known environment.


Ankit Malpani (Dual Degree) (Jointly with Prof. Hema A. Murthy)
Thesis title:Personalized Intelligent Tutoring System
MSIDC, Hyderabad


Throughout the educational system, teaching has traditionally followed a one-size-fits-all approach to learning, with a single set of instructions provided identically to everybody in a given class,regardless of differences in aptitude or interest. In recent years,a growing appreciation of individual preferences and aptitudes has led towards Personalized Intelligent Tutoring System where instructions are presented to students based on theisr needs, adapting to their learning behaviour to make learning both interesting and challenging. The US National Academy of Engineers identified it as one of the fourteen grand challenges for engineering in the 21st century. The system becomes even more relevant in the current Indian Educational scenario. In rural arears skilled and inspiring teaches are few and far to come by. An Intelligent Tutoring system can go a long way in bringing in good students from the marginalized sections of society into the main stream. Most of the present tutoring systems use extensive expert knowledge to either explicitly store teaching rules or have a finely labelled dataset specifying skills extending such a Tutoring System to newer domains difficult specially in India with numerous education boards and poorly labelled data-sets. In this project, we propose a system that works on coarsely labelled data-sets and uses Reinforcement Learning techniques to implicitly learn teaching rules. We use Student Models to simulate student's learning behaviour and show that our system helps in speeding up students learning process. As interactions with the student proceed the system also learns to identify between different types of students and presents instructions to suit individual student's need.

Kiran Kate
Thesis title: Positional Analysis of Social Networks
IBM-IRL, Bangalore


Social network analysis is an emerging area of research for computer scientists. It employs different concepts from graph theory, probability, and statistics to solve problems in a wide range of disciplines. Social network analysis can be performed on many real world networks from different domains.There are a number of social network analysis methods designed for various tasks and positional analysis is one of them. It involves partitioning of the set of actors into subsets such that actors in a subset are similar in their structural relationships with other actors. Traditional methods of positional analysis are either too strict or too loose. Their applicability to real world large graphs is severely limited as they result into almost trival partitions. We propose a useful relaxation to the concept of equitable partition called as e-equitable partition. A variant of epsilon equitable partition called maximal e-equitable partition is also proposed and formulated as an optimization problem. Various heuristics to compute a maximal e-equitable partition of a graph were tried and finally,a fast algorith is proposed. We also present the results of performing positional analysis on a number of networks and demonstrate empirically that positional analysis with our notion gives raise to non-trivial and meaningful partitions. Along with the static network anaysis with respect to positions, we have also studied the impact of positional analysis on the evolution of networks. Our results show that positional analysis indeed plays an important role in the evolution.

Rachit Arora
Thesis title: Latent Dirichlet Allocation and Singular Value Decomposition based Multi-Document Summarization
Microsoft, Redmond


Mixture models like Latent Dirichlet Allocation are used to represent documents as a mathematical model with a given probability distribution. We look at a novel way of applying this to one of the most challenging task in Statistical Natural Language Processing multi-document summarization. Representing the documents as a probability distribution using LDA can effectively capture and represent the topics of the corpus. We also look at the methods of improving the pre-existing algorithm for estimation and interference in LDA and give certain work-arounds for the limitations in the LDA Model which can be used in other topic models which are extension of the basic LDA model. Singular Value Decomposition is an effective tool of Principal Component Analysis used in the area of Text Retrieval. It is used to find orthogonal dimensions for the terms and we exploit this ability to find orthogonal dimensions to extract sentences orthogonal to each other and form a summary. In PCA one of the problems being looked into is a weighting scheme for the terms of the term-document matrix. The current weighting mechanisms dont capture the underlying structure that the documents represent as a set. they are based on the assumption that the documents are independant of each other. We show that improvements can be made by making the assumption that is central to the LDA theme that documents share a certain set of underlying latent topics. Using the probality distribution of the LDA model we see a way of weighting the terms of the documents and show its application by computing multi-document summary using this combined LDA-SVD model. LDA based topic models are very effective in capturing the correlations among the terms at the document level. Howvere many tasks in SNLP deal with the documents at the level of the sentences. The terms have stronger correlation among them at the sentence level and lesser at the document level. Thus we have come up with a new topic model, which we call Latent Dirichlet Segment and word allocation that captures the correlation among the terms of the documents both at the level of sentences or paragraphs and at the level of the documents. Thus we represent the documents of a corpus by capturing there underlying structure of latent topics, which is the theme of LDA and capturing the structure of the documents which is represented by the sentences of the documents. Finally we look at a News Aggregator in which we attempt to gather the common news articles from various websites and present them to the user under one heading. Thus a user can read the same news event of his favorite new websites and also compare them at one place. Together with news articles we can provide the user even more facilities like timeline and summary for the entire news event or the individual articles.

Simarjit Singh
Thesis title:Policy Learning in Partially Observable Environment
PGP, IIM Lucknow


Agents that act in real environments,whether physical or virtual, rarely have complete information about the state of environment in which they are working. Robots and complex systems have limitations in sensor capabilities as well as bottle necks in computational power,these limitations further restrict the real time information available to agent for decision-making. Predictive State Representation (PSRs) is a recently introduced class of models for discrete time dynamical systems with partial observability of the actual state of system. PSRs replace the traditional notion of state with a set of statistics about future actions and observations. PSRs have been extended to encompass continuous domains,since a lot of real world signals are continous. But the problem is the quality of the predicitions are dependant on the core test cases chosen, and in continous domains,there is not a good criterion yat, like linear independant in discrete domains. We have extended the existing literature in PSR to address the problems of number of test cases and test case length. We have also provided a firm theoretical foundation using Schwarz information theoretic criteria for test case selection. By modeling problems from field of economics in RL domain, we have shown our criterion for selection of test case length and count indeed provide a sufficient statistic, at the same time we have established the applicability of PSR to such complex stateless domains that are mainly addressed using time series analysis or chaos theory. Availability of long range real time data for macro-economics indicators, as well as complex and continous nature of state make economics domain as obvious for experimentation. Moreover Economic Systems are huge systems, it is not possible to find complete state and other machine learning techniques have been employed with only limited success. We found that relatively established fields in RL, Markov decision process (MDP) and Partially Observable Markov Decision Process (POMDP) are not suitable to address problems in economics domain. However we have shown that PSR(Predictive State Representation) along with Artificial Neural Networks (ANNs) can be efficiently employed with very high accuracy in prediction of future interest rate and industrial production average. We have further shown that Support Vector Machine (SVMs) regression can address the problem of prediction of future interest rate with higher accuracy than ANN.

P. Swaminathan
Thesis title:Co-SOFT-Clustering-An Informatic Theoretic Approach To Obtain Overlapping Clusters From Co-occurrence Data
Yahoo Research, Bangalore


Co-clustering exploits co-occurrence information, from contingency tables to cluster both rows and columns simultaneously. It has been established that co-clustering produces a better clustering structure as compared to conventional methods of clustering. So far, co-clustering has only been used as a technique for producing hard clusters. In this papaer, we present an algorithm using the information theoretic approach to generate overlapping (soft) clusters. The algorithm maintains probability membership for every instance to each of the possible clusters and iteratively tunes these membership values. The theoretical formulation of the criterion function is presented first, followed by the actual algorithm. We evalute the algorithm over document/word co-occurance information and present experimental results. We discuss the algorithm's applicability for polysemy detection experimental verification. We also discuss directions for future research.


S. Shonima
Thesis title:Models for Prediction of Angle Closure Glaucoma
Strand Life Sciences, Bangalore


Angle-closure glaucoma is a condition in which the iris in the eye shifts and blocks the exit passegeway of the aqueous humor,the fluid that fills the eye. This fluid breakup causes a rapid build-up of pressure in the eye. Angle closure glaucoma is an emergency condition that requires immediate medical treatment to preserve vision. Predicting the suspects of the various stages of the angle closure glaucoma at an early stage is important to avoid the progressive loss of vision in both the suspected eye and the other eye. The first part of the work deals with building classification models for prediction of angle closure glaucoma in the right eye. A variety of classification algorithms including Support Vector Machines, Random forests, Maximum Margin Classifiers with specified maximum false positive and false negative error rates and Layered Hierarchial Committee of Trees were used to bulid the classification models. In oder to address imbalance, methods like sampling, down sampling and modification of misclassification costs were employed to achieve a high classification accuracy. The second part of the work is a risk factor analysis of angle closure glaucoma where a subset of the features that are good predictors of angle closure glaucoma in the right eye are identified.

Deepthi Cheboli
Thesis title:Models For Detection Of Feratoconus
MS, Univ. of Minnesota, Minneapolis


Keratoconus, is a non-inflammatory disorder of the eye in which structural changes within the cornea cause it to thin and change to a more conical shape which leads to substantial distortion of vision. Detection of Keratoconus is important to avoid unpredictable results in refractive surgery. This surgery on a Keratoconic eye might lead to even more thinning of the cornea. Current methods for the detection of the disease mainly consistes of Neural Networks. We propose a new model, which integrates Manifold learning techniques with Semi-Supervised learning,to maximise the utility of unlabeled data. Through the empirical results,we show that highly accurate models can be bulit using these principles.

Aniket Ray
Thesis title:Models For Prediction Of Retinopathy Of Prematurity
Adobe, Mumbai


Retinopathy of Prematurity(ROP) is a major condition affecting prematurely born children in the developing world. In this work, we have attempted to use machine learning appraches to learn models for the prediction of Retinopathy of Prematurity. We conduct a study on several kinds of classification methods and the empirical results for them have been outlined. We propose a hierarchical random committee of trees model to closely model the disease. It has been shown through empirical results that a high level of accuracy is achieved while using this model.

Pranjal Awasthi
Thesis title:Image Segmentation using Multiscale Conditional Random Fields


This work is an attempt to study probablistic graphical models, specifically for image segmentation. The problem of image segmentation is of fundamental importance in computer vision. It is one of the priliminary tasks that are to be performed before any high level processing of the image can be done. Needless to say, good segmentation is crucial for the success of other vision tasks such as object detection and tracking, video sequence analysis etc. In recent years, a lot of research has focused on the use of graphical models for segmenting images. Graphical models combine ideas from graph theory and probability theory in an elegant manner to represent complex systems. Efficient training and interference procedures make them a popular choice for many problem domains. Markov random fields (MRFs) are the most popular form of graphical models used for image segmentation. They represent a generative framework by modeling the joint probability of the image pixels and the corresponding labels. There have been a lot of improvements suggested over the basic MRF framework. Some of them include hierarchical MRFs which try to capture information over a range of scales and multiscale random fields (MSRFs) which try to simplify the parameter estimation procedures. In parallel, work has also been done on the use of directed models such as tree structures belief networks(TSBNs) and dynamic trees (DTs). Recently with the introduction of conditional random fields (CRFs), there has been a shift towards the use of conditional models for image segementation. CRFs provide a lot of advantage over generative models. Multiscale conditional random field (mCRF) is a recently proposed conditional model for image segmentation. It belongs to the family of the latent state models which are able to capture long range association among pixel labels. After a comprehensive survey of the existing techniquies, we study the mCRF framework in detail and provide an efficient implementation of it. We also provide two possible extensions to the basic model and successfully demonstrate the increase in the performance which they achieve.

E. Siva Sowmya
Thesis title:Placement And Routing Of 3D-FPGA using Reinforcement Learning and Support Vector Machines
PhD, UPenn


The primary advantage of using 3D-FPGA over 2D-FPGA is that the vertical stacking of active layers reduce the Manhattan distance between the components in 3D-FPGA than when placed on 2D-FPGA. This results in a considerable reduction in total interconnect length. Reduced wire length eventually leads to reduction in delay and hence improved performance and speed. Design of an efficient placement and routing algorithm for 3D-FPGA that fully exploits the above-mentioned advantage is a problem of deep research and commercial interest. In this work, an efficient placement and routing algorithm is proposed for 3D-FPGAs which yields better results in terms of total-interconnect length and channel-width. The proposed algorithm employs two important techniques, namely, Reinforcement Learning(RL) and Support Vector Machines (SVMs), to perform the placement. The proposed algorithm is implemented and tested on standard benchmark circuits and the results obtained are encouraging. This is one of the very few instances where, reinforcement learning is used for solving a problem in the area of VLSI.