A Generic Algorithm for Top-K On-Shelf Utility Mining (2208.14230v1)

Published 27 Aug 2022 in cs.DB and cs.AI

Abstract: On-shelf utility mining (OSUM) is an emerging research direction in data mining. It aims to discover itemsets that have high relative utility during their selling time period. Compared with traditional utility mining, OSUM can find more practical and meaningful patterns in real-life applications. However, traditional OSUM has a major drawback: it is hard for ordinary users to choose a minimum threshold minutil that yields the right number of on-shelf high-utility itemsets. If the threshold is set too high, too few patterns are found; if it is set too low, too many patterns are discovered, wasting time and memory. A common remedy is to let the user directly specify a parameter k, so that only the top-k itemsets with the highest relative utility are considered. In this paper, we therefore propose a generic algorithm named TOIT for mining Top-k On-shelf hIgh-utility paTterns. TOIT applies a novel strategy to raise minutil based on the on-shelf datasets. In addition, two novel upper-bound strategies, named subtree utility and local utility, are applied to prune the search space. With these strategies, TOIT narrows the search space as early as possible, improving mining efficiency and reducing memory consumption. A series of experiments was conducted on real datasets with different characteristics to compare TOIT with the state-of-the-art KOSHU algorithm. The experimental results show that TOIT outperforms KOSHU in both running time and memory consumption.
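
To make the top-k idea concrete, the following is a minimal brute-force sketch, not the TOIT algorithm itself. It assumes that the relative utility of an itemset is its utility divided by the total utility of the time periods in which it appears in at least one transaction, and that all utilities are positive. The function name `top_k_relative_utility` and the data layout are hypothetical. The sketch only illustrates how keeping a min-heap of the best k candidates lets minutil rise automatically; it omits the subtree utility and local utility pruning bounds described in the abstract.

```python
import heapq
from itertools import combinations


def top_k_relative_utility(transactions, k):
    """Brute-force top-k search (illustrative only, no pruning).

    transactions: list of (period, {item: utility}) pairs, utilities > 0.
    Returns (list of (relative_utility, itemset) sorted descending, minutil).
    """
    # Total utility of every time period, used as the denominator.
    period_total = {}
    items = set()
    for period, row in transactions:
        period_total[period] = period_total.get(period, 0) + sum(row.values())
        items.update(row)

    heap = []      # min-heap holding the k best (relative_utility, itemset)
    minutil = 0.0  # internal threshold, raised as the heap fills up

    all_items = sorted(items)
    for size in range(1, len(all_items) + 1):
        for itemset in combinations(all_items, size):
            # Transactions that contain every item of the candidate.
            support = [(p, row) for p, row in transactions
                       if all(i in row for i in itemset)]
            if not support:
                continue
            utility = sum(row[i] for _, row in support for i in itemset)
            periods = {p for p, _ in support}
            ru = utility / sum(period_total[p] for p in periods)

            if len(heap) < k:
                heapq.heappush(heap, (ru, itemset))
            elif ru > heap[0][0]:
                heapq.heapreplace(heap, (ru, itemset))
            # Once k candidates are held, the smallest one is the threshold.
            if len(heap) == k:
                minutil = heap[0][0]

    return sorted(heap, reverse=True), minutil


# Toy data: two periods, four transactions (made up for illustration).
transactions = [
    (1, {"a": 4, "b": 2}),
    (1, {"a": 2, "c": 6}),
    (2, {"b": 3, "c": 3}),
    (2, {"c": 4}),
]
result, minutil = top_k_relative_utility(transactions, k=2)
```

The user supplies only k; the threshold is derived from the k-th best candidate seen so far, so no minutil has to be guessed up front. A practical algorithm would also use the rising threshold to discard candidates early, which this exhaustive enumeration does not attempt.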

Related Papers

Towards Target High-Utility Itemsets (2022)
Targeted Mining of Top-k High Utility Itemsets (2023)
TOPIC: Top-k High-Utility Itemset Discovering (2021)
On-shelf Utility Mining of Sequence Data (2020)
Beyond Frequency: Utility Mining with Varied Item-Specific Minimum Utility (2019)