CoLISD: Collective Learning and Inference on Structured Data

Submission Deadline Extended to July 27, 2012.

Classical machine learning techniques assume that data are i.i.d., but real-world data are inherently relational and can generally be represented as graphs or variants of them. The importance of modelling structured data is evident from its growing presence: the WWW, social networks, organizational networks, images, protein sequences, relational data, etc. The field has recently received a lot of attention in the community under different themes, depending on the problem addressed and the nature of the solution. Researchers in different areas have proposed useful and successful frameworks:

  • Iterative collective classification uses a local classifier that embeds a node's own attributes and its neighbours' information in a feature vector, and classifies the nodes in an iterative procedure.
  • Statistical relational learning combines statistics (uncertainty) and relational information (first-order logic) to model the target domain.
  • Structured prediction involves machine learning techniques for structured objects, which embed the relationships between output classes.
  • Regularization-based frameworks impose a local smoothness constraint on the structured data, i.e., they use it as a constraint or side information.
  • Kernel methods for structured data develop similarity functions for objects of a domain that can handle relationships between objects (for example, a tree) as well as heterogeneous representations.
  • Message passing techniques on networks, such as belief propagation, loopy belief propagation, mean field and relaxation labelling.

Learning with network data involves many building blocks:

  • How to build the relationship network.
  • How to learn the prediction model (classification/regression) with fully supervised or partially supervised data.
  • How to use the learnt model for inference on fully unlabeled or partially labelled data.
  • How to manipulate the relational information to reduce the computational cost, such as lifted probabilistic inference in SRL.

Each area has individually made substantial research progress on these subtasks, and each field holds its own workshops, such as SRL, ILP, StarAI, GBR and GDM. Collective inference is common to all these subtasks, and it too has advanced considerably within individual areas. Past and upcoming workshops on collective inference include "CVPR WS on Inference in Graphical Models and Structured Potentials", "Propagation Algorithms on Graphs with Cycles: Theory and Applications", "Approximate inference-How far have we come?" and "Approximate Learning of Large Scale Graphical Models: Theory and Applications". We believe the current situation provides an opportunity for attempts at synthesis, forming a common core of problems and ideas, and cross-pollinating across subareas. There have been a few such attempts, with notable success in the MLG series. MLG addresses all general aspects of mining and learning with graphs, whereas CoLISD focuses on within-network learning and inference tasks, with special emphasis on collective inference. Inspired by the success of MLG, this workshop will attempt to reach out to the different groups working on the same theme and to explore together how to reach the goals of within-network learning and inference for each subfield mentioned above.
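
As an illustration of the iterative collective classification framework listed above, the following is a minimal sketch, not a reference implementation. It assumes a nearest-centroid local classifier, and the toy graph, attribute values and all function names are hypothetical. Each node's feature vector is its own attribute followed by the class distribution of its neighbours; unlabelled nodes are first labelled from attributes alone and then re-classified until no label changes.

```python
from collections import Counter

def neighbour_distribution(node, graph, labels, classes):
    """Fraction of the node's currently labelled neighbours in each class."""
    counts = Counter(labels[n] for n in graph[node] if n in labels)
    total = sum(counts.values())
    return [counts[c] / total if total else 0.0 for c in classes]

def fit_centroids(vectors, targets):
    """Nearest-centroid 'local classifier' (an assumption): one mean vector per class."""
    cents = {}
    for c in set(targets):
        rows = [v for v, t in zip(vectors, targets) if t == c]
        cents[c] = [sum(col) / len(rows) for col in zip(*rows)]
    return cents

def predict(vec, cents):
    """Class whose centroid is closest in squared Euclidean distance."""
    return min(cents, key=lambda c: sum((a - b) ** 2 for a, b in zip(vec, cents[c])))

def iterative_classification(graph, attrs, known, max_iters=10):
    classes = sorted(set(known.values()))
    unknown = [n for n in graph if n not in known]
    train_nodes = list(known)
    targets = [known[n] for n in train_nodes]

    # Bootstrap: label the unknown nodes from their own attributes only.
    attr_model = fit_centroids([attrs[n] for n in train_nodes], targets)
    labels = dict(known)
    for n in unknown:
        labels[n] = predict(attrs[n], attr_model)

    # Train the local classifier on attributes plus the neighbour class
    # distribution, computed from the known labels only.
    model = fit_centroids(
        [attrs[n] + neighbour_distribution(n, graph, known, classes)
         for n in train_nodes],
        targets,
    )

    # Re-classify unknown nodes in turn until the labelling stops changing.
    for _ in range(max_iters):
        changed = False
        for n in unknown:
            vec = attrs[n] + neighbour_distribution(n, graph, labels, classes)
            new_label = predict(vec, model)
            if new_label != labels[n]:
                labels[n] = new_label
                changed = True
        if not changed:
            break
    return labels

# Hypothetical toy network: two triangles joined by the edge c-d.
GRAPH = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"],
         "d": ["c", "e", "f"], "e": ["d", "f"], "f": ["d", "e"]}
ATTRS = {"a": [0.1], "b": [0.2], "c": [0.55], "d": [0.45], "e": [0.8], "f": [0.9]}
KNOWN = {"a": 0, "b": 0, "e": 1, "f": 1}

result = iterative_classification(GRAPH, ATTRS, KNOWN)
print(result)
```

In this toy network the attributes of nodes c and d alone point to the opposite class of the triangle they sit in; the neighbour distribution is what pulls each node towards the class of its own triangle.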

    Following up on the successful CoLISD workshop held last year at ECML PKDD, we propose to organize the second edition of the workshop this year. The first edition succeeded in bringing together researchers from various communities who look at different aspects of learning with structured data. For many of the participants it was the first time they were seriously looking at approaches from other disciplines. The widespread feeling was that the workshop should be continued, since there was scope for much cross-fertilization. We are currently working on a special issue of MLJ based on the outcome of the first workshop.

    Topics of Interest

    Original technical research papers and position papers on each of the topics are invited for submission on or before 27th July 2012 (extended from 29th June).

    Potential topics include (but are not limited to):

  • Collective Learning
  • Collective Inference
  • Representation of structured data: for example, embedding a node's attributes and neighbouring nodes' class distribution information as a flat vector
  • Cross-domain applications
  • Comparative studies exploring commonalities and differences between the topics mentioned above

    Submission Instructions

    Papers must be written in English and formatted according to the Springer-Verlag Lecture Notes in Artificial Intelligence guidelines. Author instructions and style files can be downloaded from the LNCS site. The maximum length of papers is 12 pages in this format. Submissions must be made through the EasyChair system.

    Organising Committee

  • Balaraman Ravindran, Dept. of CSE, IIT Madras
  • Kristian Kersting, Dept. of Knowledge Discovery, Fraunhofer IAIS
  • Sriraam Natarajan, Wake Forest University Baptist Medical Center

    Local Organiser

  • S. Shivashankar, Ericsson R&D, Chennai

    Program Committee

  • Annalisa Appice, Università degli Studi di Bari
  • Mustafa Bilgic, Illinois Institute of Technology
  • Joschka Boedecker, Osaka University
  • Ulf Brefeld, Yahoo Research, Barcelona, Spain
  • Michelangelo Ceci, Università degli Studi di Bari
  • Janardhan Rao Doppa, Oregon State University
  • Saket Joshi, Oregon State University
  • Sofus A. Macskassy, University of Southern California
  • Oliver Obst, CSIRO ICT Centre
  • Scott Sanner, Australian National University

    Invited Speakers

  • Charles Sutton, University of Edinburgh
  • Sebastian Riedel, University of Massachusetts, Amherst
  • Christopher Ré, University of Wisconsin-Madison

    Time         Description
    10:30-10:40  Welcome address
    10:40-11:40  Invited Talk by Charles Sutton, University of Edinburgh
                 Title: "Piecewise Training for Structured Prediction"
    11:40-12:05  Talk: Characterizing Retweeting Behaviors in Twitter: On the use of Text vs. Concepts
                 Authors: Sofus Macskassy
    12:05-12:30  Talk: Ranking Mechanisms for Maximizing Spread, Trust in Signed Social Networks
                 Authors: Ramasuri Narayanam, Ranga Suri N.N.R., Vikas K. Garg and Narasimha Murty M
    12:30-12:55  Talk: Link Prediction via Generalized Coupled Tensor Factorisation
                 Authors: Beyza Ermis, Evrim Acar and A. Taylan Cemgil
    12:55-13:00  Poster Highlights - Node Classification in Partially Labeled Social Networks
                 Authors: S Shamshu Dharwez, Subramanian Shivashankar and Balaraman Ravindran
    13:00-14:10  Lunch Break
    14:10-15:10  Invited Talk by Sebastian Riedel, University of Massachusetts, Amherst
    15:10-16:10  Invited Talk by Christopher Ré, University of Wisconsin-Madison
    16:10-16:35  Coffee Break
    16:35-17:00  Talk: Supervised Blockmodelling
                 Authors: Leto Peel
    17:00-17:30  Discussions