MSc, Adrian Murphy and Daniel Larsson: Towards Automated Log Message Embeddings for Anomaly Detection

From: 2024-01-24 10:30 to 11:30 Seminar

Date & Time: January 24th, 10:30-11:30
Location: Seminar Room M 3170-73 at Dept. of Automatic Control, LTH
Authors: Adrian Murphy and Daniel Larsson
Title: Towards Automated Log Message Embeddings for Anomaly Detection
Supervisors: Johan Eker (LTH); Ola Angelsmark and Fanny Söderlund (Advenica)
Examiner: Karl-Erik Årzén, LTH

Abstract: Log messages are implemented by developers to record important runtime information about a system. For that reason, system logs can provide insight into the state and health of a system and potentially be used to anticipate and discover errors. Manually inspecting these logs becomes impractical due to the high volume of messages generated by modern systems. Consequently, the research field of machine learning-based log anomaly detection has emerged to automatically identify irregularities. Parsing log messages into a structured, tractable format is a vital step in log anomaly detection. This degree project investigates the application of log message embeddings, a recently proposed log parsing method, for anomaly detection in complex IT systems and measures their resilience to concept drift, where the format of log messages changes over time, in comparison with a traditional parsing approach that utilizes Drain. Empirical analyses are conducted on two benchmark datasets, revealing that log message embeddings not only achieve anomaly detection results on par with traditional methods but also, as opposed to Drain, demonstrate considerable robustness against concept drift. A key focus of this project is on the application of large language models to automate the log embedding pipeline by handling out-of-vocabulary words and extracting synonymous and antonymous word relationships. These capabilities are important for distinguishing log messages that are identical except for one or more synonymous or antonymous word pairs. While large language models show promise in these tasks, experiments highlight the need for further refinement to match the performance achieved through manual operator feedback.

About the event

johan [dot] eker [at] control [dot] lth [dot] se