- The paper introduces structure2vec, a method that embeds posterior distributions of latent variables into discriminative feature spaces, surpassing traditional kernel approaches.
- It composes function mappings akin to mean field and belief propagation inference, and trains them with stochastic gradient descent, enabling efficient and scalable computation on structured data.
- Experiments demonstrate that structure2vec achieves high predictive accuracy while dramatically reducing model size and processing time on large benchmark datasets.
Discriminative Embeddings of Latent Variable Models for Structured Data
The paper introduces "structure2vec," a novel method for generating scalable and discriminative representations for structured data. This approach addresses limitations in traditional kernel methods, which are constrained by pre-defined feature spaces, by embedding latent variable models into discriminative feature spaces. The paper demonstrates the utility of this method across various structured data forms, such as sequences, trees, and graphs, particularly in domains like computational biology and drug design.
Methodology and Innovation
The core innovation in "structure2vec" is that it learns feature spaces informed by the discriminative task at hand, rather than relying on pre-configured kernels. The method applies a sequence of function mappings analogous to graphical model inference steps such as mean field and belief propagation, which lets it exploit the structure inherent in the data; a minimal sketch of this update scheme follows the list below.
- Embedding of Latent Variable Models: Each data point is paired with a latent variable graphical model, and the posterior distributions of its latent variables are embedded into a feature space, which is what makes the learned representation discriminative.
- Scalability: Traditional kernel methods struggle with large datasets because their memory and computational costs grow with the number of training examples. "structure2vec" avoids these costs by learning a compact, explicit feature map and training it with stochastic gradient descent, enabling efficient processing of large datasets.
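To make the inference-style updates concrete, here is a minimal NumPy sketch of a mean-field-style embedding pass over a graph. It is an illustration under stated assumptions, not the paper's exact parameterization: the function name `embed_graph`, the parameter matrices `W1` and `W2`, the tanh nonlinearity, the number of rounds, and the sum pooling are all hypothetical choices.

```python
import numpy as np

def embed_graph(adj, node_feats, W1, W2, n_rounds=4):
    """Mean-field-style embedding updates, loosely in the spirit of
    structure2vec: each node's embedding is refreshed from its own
    features plus the sum of its neighbors' current embeddings."""
    n_nodes, d = node_feats.shape[0], W1.shape[0]
    mu = np.zeros((n_nodes, d))              # initial node embeddings
    for _ in range(n_rounds):
        neighbor_sum = adj @ mu              # aggregate neighbor embeddings
        # nonlinear update combining node features and neighbor messages
        mu = np.tanh(node_feats @ W1.T + neighbor_sum @ W2.T)
    return mu.sum(axis=0)                    # pool into a graph-level vector

# toy usage: a triangle graph with 2-dimensional node features
rng = np.random.default_rng(0)
adj = np.array([[0., 1., 1.], [1., 0., 1.], [1., 1., 0.]])
feats = rng.normal(size=(3, 2))
W1, W2 = rng.normal(size=(8, 2)), rng.normal(size=(8, 8))
graph_vec = embed_graph(adj, feats, W1, W2)  # an 8-dimensional embedding
```

After a fixed number of rounds, the per-node embeddings are pooled into a single vector, which serves as the compact, explicit feature map mentioned above.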
The paper reports strong numerical results, showing that "structure2vec" achieves state-of-the-art predictive performance on several benchmark datasets. Notably, on the largest benchmark it runs roughly twice as fast and produces models roughly 10,000 times smaller than those of traditional kernel methods.
Experimental Validation
Experiments on benchmark datasets such as SCOP and NCI demonstrate the robustness of the proposed technique. The empirical results substantiate the claim that "structure2vec" not only matches but often surpasses traditional methods in predictive accuracy. Moreover, its application to a 2.3 million molecule dataset from the Harvard Clean Energy Project showcases its scalability and efficiency.
Theoretical and Practical Implications
Theoretically, the paper advances the field by demonstrating how discriminative information can be used to jointly learn feature spaces and classifiers, breaking away from the constraints of pre-defined kernels. Practically, this approach is of great importance in fields requiring the processing of large volumes of complex, structured data, such as genomics and cheminformatics.
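As a hedged sketch of what jointly learning the feature space and the classifier can look like, the PyTorch snippet below trains the embedding mappings and a linear read-out together with stochastic gradient descent. Everything here (the class name, layer sizes, loss, and toy data) is an assumption for illustration, not the authors' implementation.

```python
import torch

class Structure2VecClassifier(torch.nn.Module):
    """Hypothetical end-to-end model: the embedding parameters and the
    linear classifier receive gradients from the same task loss."""
    def __init__(self, n_feats, dim):
        super().__init__()
        self.node_lin = torch.nn.Linear(n_feats, dim)  # node-feature map
        self.msg_lin = torch.nn.Linear(dim, dim)       # neighbor-message map
        self.out = torch.nn.Linear(dim, 1)             # linear read-out

    def forward(self, adj, feats, n_rounds=4):
        mu = torch.zeros(feats.shape[0], self.msg_lin.in_features)
        for _ in range(n_rounds):
            mu = torch.tanh(self.node_lin(feats) + self.msg_lin(adj @ mu))
        return self.out(mu.sum(dim=0))  # graph-level logit

# one SGD step on a made-up two-node graph
model = Structure2VecClassifier(n_feats=2, dim=8)
opt = torch.optim.SGD(model.parameters(), lr=1e-2)
adj = torch.tensor([[0., 1.], [1., 0.]])
feats = torch.randn(2, 2)
label = torch.tensor([1.0])
loss = torch.nn.functional.binary_cross_entropy_with_logits(model(adj, feats), label)
opt.zero_grad()
loss.backward()
opt.step()
```

Because the task loss backpropagates through the embedding updates, the feature space itself adapts to the discriminative objective rather than being fixed in advance.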
Future Directions
Looking forward, the approach opens avenues for further exploration into combining graphical models with discriminative embeddings. This could lead to new methodologies in AI that utilize deep learning techniques in conjunction with probabilistic graphical model inference, potentially enhancing the efficacy of models dealing with a wide array of structured data types.
In summary, "structure2vec" represents a significant step toward scalable and effective structured data representation by embedding latent variable models into discriminative feature spaces, offering compelling advantages in both computational efficiency and prediction accuracy.