Papers
Topics
Authors
Recent
Assistant
AI Research Assistant
Well-researched responses based on relevant abstracts and paper content.
Custom Instructions Pro
Preferences or requirements that you'd like Emergent Mind to consider when generating responses.
GPT-5.1
GPT-5.1
Gemini 2.5 Flash 128 tok/s Pro
Gemini 2.5 Pro 41 tok/s Pro
Kimi K2 203 tok/s Pro
Claude Sonnet 4.5 36 tok/s Pro
2000 character limit reached

Self-Supervised Log Parsing (2003.07905v1)

Published 17 Mar 2020 in cs.LG and cs.SE

Abstract: Logs are extensively used during the development and maintenance of software systems. They collect runtime events and allow tracking of code execution, which enables a variety of critical tasks such as troubleshooting and fault detection. However, large-scale software systems generate massive volumes of semi-structured log records, posing a major challenge for automated analysis. Parsing semi-structured records with free-form text log messages into structured templates is the first and crucial step that enables further analysis. Existing approaches rely on log-specific heuristics or manual rule extraction. These are often specialized in parsing certain log types, and thus, limit performance scores and generalization. We propose a novel parsing technique called NuLog that utilizes a self-supervised learning model and formulates the parsing task as masked language modeling (MLM). In the process of parsing, the model extracts summarizations from the logs in the form of a vector embedding. This allows the coupling of the MLM as pre-training with a downstream anomaly detection task. We evaluate the parsing performance of NuLog on 10 real-world log datasets and compare the results with 12 parsing techniques. The results show that NuLog outperforms existing methods in parsing accuracy with an average of 99% and achieves the lowest edit distance to the ground truth templates. Additionally, two case studies are conducted to demonstrate the ability of the approach for log-based anomaly detection in both supervised and unsupervised scenario. The results show that NuLog can be successfully used to support troubleshooting tasks. The implementation is available at https://github.com/nulog/nulog.

Citations (65)

Summary

We haven't generated a summary for this paper yet.

Dice Question Streamline Icon: https://streamlinehq.com

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Lightbulb Streamline Icon: https://streamlinehq.com

Continue Learning

We haven't generated follow-up questions for this paper yet.

List To Do Tasks Checklist Streamline Icon: https://streamlinehq.com

Collections

Sign up for free to add this paper to one or more collections.