Kernel Distillation for Fast Gaussian Processes Prediction

(1801.10273)
Published Jan 31, 2018 in stat.ML and cs.LG

Abstract

Gaussian processes (GPs) are flexible models that can capture complex structure in large-scale datasets due to their non-parametric nature. However, the use of GPs in real-world applications is limited by their high computational cost at inference time. In this paper, we introduce a new framework, \textit{kernel distillation}, to approximate a fully trained teacher GP model with a kernel matrix of size $n \times n$ for $n$ training points. We combine the inducing point method with a sparse low-rank approximation in the distillation procedure. The distilled student GP model requires only $O(m^2)$ storage for $m$ inducing points, where $m \ll n$, and improves the inference time complexity. We demonstrate empirically that kernel distillation provides a better trade-off between prediction time and test performance compared to the alternatives.
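The abstract only outlines the distillation idea, so below is a minimal sketch, not the paper's actual procedure, of how an inducing-point, low-rank kernel approximation can compress a GP into a student model with $O(m^2)$ storage and cheap prediction. The function names (`rbf_kernel`, `distill`, `predict`) and the Nystrom / subset-of-regressors form of the approximation are illustrative assumptions rather than the authors' algorithm.

```python
# A minimal sketch (not the paper's exact distillation procedure) of an
# inducing-point, low-rank student model with O(m^2) storage.
# Names and the Nystrom / subset-of-regressors form are illustrative assumptions.
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0):
    """Squared-exponential kernel between the rows of A and B."""
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2.0 * A @ B.T
    return np.exp(-0.5 * d2 / lengthscale**2)

def distill(X, y, Z, noise_var=1e-2):
    """Compress (X, y) into a student that stores only O(m^2) quantities:
    the m inducing points Z, an m x m matrix A, and an m-vector alpha.
    The n x n teacher kernel matrix is never materialized."""
    K_mm = rbf_kernel(Z, Z)
    K_nm = rbf_kernel(X, Z)                        # n x m, used once and discarded
    A = K_mm + K_nm.T @ K_nm / noise_var           # m x m
    alpha = np.linalg.solve(A, K_nm.T @ y) / noise_var
    return Z, A, alpha

def predict(student, X_star):
    """Predictive mean and variance; cost per test point is O(m) / O(m^2)."""
    Z, A, alpha = student
    K_sm = rbf_kernel(X_star, Z)                   # n_* x m
    mean = K_sm @ alpha
    var = np.sum(K_sm * np.linalg.solve(A, K_sm.T).T, axis=1)
    return mean, var

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.uniform(-3.0, 3.0, size=(500, 1))
    y = np.sin(X[:, 0]) + 0.05 * rng.standard_normal(500)
    Z = X[rng.choice(500, size=20, replace=False)]  # m = 20 inducing points, m << n
    student = distill(X, y, Z)
    mean, var = predict(student, np.array([[0.5]]))
    print(mean, var)                                # mean should be near sin(0.5)
```

Under these assumptions the student keeps only $Z$, the $m \times m$ matrix $A$, and the weight vector, giving the $O(m^2)$ storage and fast per-point prediction that the abstract describes; the paper's actual procedure additionally uses a sparse low-rank approximation in the distillation step.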

