Papers
Topics
Authors
Recent
2000 character limit reached

LiteMORT: A memory efficient gradient boosting tree system on adaptive compact distributions (2001.09419v1)

Published 26 Jan 2020 in cs.LG and stat.ML

Abstract: Gradient boosted decision trees (GBDT) is the leading algorithm for many commercial and academic data applications. We give a deep analysis of this algorithm, especially the histogram technique, which is a basis for the regulized distribution with compact support. We present three new modifications. 1) Share memory technique to reduce memory usage. In many cases, it only need the data source itself and no extra memory. 2) Implicit merging for "merge overflow problem"."merge overflow" means that merge some small datasets to huge datasets, which are too huge to be solved. By implicit merging, we just need the original small datasets to train the GBDT model. 3) Adaptive resize algorithm of histogram bins to improve accuracy. Experiments on two large Kaggle competitions verified our methods. They use much less memory than LightGBM and have higher accuracy. We have implemented these algorithms in an open-source package LiteMORT. The source codes are available at https://github.com/closest-git/LiteMORT

Citations (2)

Summary

We haven't generated a summary for this paper yet.

Whiteboard

Video Overview

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Continue Learning

We haven't generated follow-up questions for this paper yet.

Authors (1)

Collections

Sign up for free to add this paper to one or more collections.