Parameter-Efficient Log Anomaly Detection based on Pre-training model and LORA

Shiming He* (Corresponding Author), Ying Lei, Ying Zhang, Kun Xie, Pradip Kumar Sharma


Research output: Chapter in Book/Report/Conference proceeding › Published conference contribution

Abstract

Logs record both the normal and abnormal operating status of a system at any time, making them crucial data during system operation. Log anomaly detection can help with system debugging and root cause analysis for problems such as system faults, shutdown faults, null-pointer exceptions, illegal-argument exceptions, and class cast exceptions. Deep learning is widely applied to log anomaly detection to improve detection accuracy. However, deep learning models require a large number of labeled logs, which consume considerable labor and time. To tackle this labeling requirement, pre-trained models such as the Bidirectional Encoder Representations from Transformers (BERT) have been introduced. However, the pre-trained model brings new problems. The number of BERT parameters that need to be fine-tuned is huge, resulting in high training overhead. Besides, the direct word-sequence input representation of BERT ignores the semantic information among logs. Therefore, we propose a parameter-efficient log anomaly detection scheme (LogBP-LORA) based on BERT and Low-Rank Adaptation (LoRA), an effective parameter-tuning strategy. LogBP-LORA adds bypass weight matrices and updates only the bypass parameters instead of all the original parameters, reducing the training overhead. Additionally, LogBP-LORA exploits a log event sequence representation to obtain more semantic information with a shorter sequence length. Extensive experiments conducted on three public log datasets, BGL, Thunderbird, and HDFS, demonstrate that LogBP-LORA achieves favorable performance with lower resource consumption. When less labeled data is available, LogBP-LORA achieves an F1-score about 10%-99% higher than NeuralLog, DeepLog, MADDC, and LogAnomaly. The trainable parameters of LogBP-LORA are only 0.06% of the original BERT parameters.
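The bypass mechanism the abstract describes can be illustrated with a minimal low-rank adapter sketch. This is not the paper's implementation; the dimensions, rank, and initialization below are illustrative assumptions following the general LoRA recipe (frozen weight W, trainable low-rank factors B and A, bypass initialized to zero). The actual trainable-parameter ratio (0.06% in the paper) depends on which BERT layers are adapted and the chosen rank.

```python
import numpy as np

rng = np.random.default_rng(0)

d, r = 768, 4  # hidden size (BERT-base-like) and LoRA rank; illustrative values

# Frozen pre-trained weight matrix W: never updated during fine-tuning.
W = rng.standard_normal((d, d)) * 0.02

# Trainable low-rank bypass B @ A. A is random, B is zero, so the bypass
# contributes nothing at the start of training (standard LoRA initialization).
A = rng.standard_normal((r, d)) * 0.01
B = np.zeros((d, r))

def lora_forward(x):
    """Frozen path plus low-rank bypass: x W^T + x A^T B^T."""
    return x @ W.T + x @ A.T @ B.T

# Only A and B receive gradient updates; compare parameter counts.
trainable = A.size + B.size  # 2 * d * r
frozen = W.size              # d * d
print(f"trainable fraction: {trainable / frozen:.4%}")
```

Because the bypass adds 2·d·r parameters per adapted matrix while the d·d original stays frozen, the trainable fraction shrinks linearly with the rank r, which is what makes the fine-tuning parameter-efficient.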

Original language: English
Title of host publication: Proceedings - 2023 IEEE 34th International Symposium on Software Reliability Engineering, ISSRE 2023
Publisher: IEEE Computer Society
Pages: 207-217
Number of pages: 11
ISBN (Electronic): 9798350315943
DOIs
Publication status: Published - 2 Nov 2023
Event: 34th IEEE International Symposium on Software Reliability Engineering, ISSRE 2023 - Florence, Italy
Duration: 9 Oct 2023 - 12 Oct 2023

Publication series

Name: Proceedings - International Symposium on Software Reliability Engineering, ISSRE
ISSN (Print): 1071-9458

Conference

Conference: 34th IEEE International Symposium on Software Reliability Engineering, ISSRE 2023
Country/Territory: Italy
City: Florence
Period: 9/10/23 - 12/10/23

Keywords

  • BERT
  • Log anomaly detection
  • log event
  • log feature extraction
  • parameter-tuning strategy
