Unified Parallel Semantic Log Parsing based on Causal Graph Construction for Attack Attribution
Abstract
Multi-source logs offer a holistic view of system activities, enabling detailed analysis for detecting potential threats. A practical method for threat detection involves explicitly extracting entity triples (subject, action, object) to construct provenance graphs, which facilitate system behavior analysis. However, existing log parsing methods primarily focus on extracting parameters and events from raw logs, while entity extraction methods are often limited to processing a single log type. To address these limitations, we propose UTLParser, a novel, scalable, unified framework for log parsing and analysis. UTLParser adopts semantic analysis to construct causal graphs by merging multiple sub-graphs from diverse log sources within a labeled log dataset. It leverages domain-specific knowledge, such as Points of Interest for threat hunting, and implements parallel processing at two levels: sub-graph fusion and fine-grained parsing of individual logs. Additionally, UTLParser addresses log generation delays and provides optimized interfaces for temporal graph querying. Our experimental results demonstrate that UTLParser overcomes the limitations of existing log parsing approaches, achieving superior performance on certain log types. Moreover, UTLParser precisely extracts explicit causal threat information while maintaining compatibility with a wide range of downstream applications.
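To make the triple-based graph construction concrete, the following is a minimal sketch, not UTLParser's actual implementation. It assumes that (subject, action, object) triples have already been extracted from two hypothetical log sources (labeled "audit" and "dns" here), builds one sub-graph per source, merges them on shared entity names, and answers a simple time-window query. All names (`Triple`, `build_subgraph`, `merge_subgraphs`, `edges_between`) and all sample records are illustrative assumptions.

```python
from collections import defaultdict
from typing import NamedTuple


class Triple(NamedTuple):
    """One extracted event: subject performs action on object (hypothetical schema)."""
    timestamp: float
    subject: str
    action: str
    obj: str
    source: str  # which log the triple came from


def build_subgraph(triples):
    """Build an adjacency map: subject -> list of (timestamp, action, object, source)."""
    graph = defaultdict(list)
    for t in triples:
        graph[t.subject].append((t.timestamp, t.action, t.obj, t.source))
    return graph


def merge_subgraphs(*subgraphs):
    """Fuse per-source sub-graphs on shared node names; order each node's edges by time."""
    merged = defaultdict(list)
    for g in subgraphs:
        for node, edges in g.items():
            merged[node].extend(edges)
    for edges in merged.values():
        edges.sort(key=lambda e: e[0])
    return merged


def edges_between(graph, start, end):
    """Return (timestamp, subject, action, object) for edges with start <= timestamp <= end."""
    hits = [
        (ts, subj, action, obj)
        for subj, edges in graph.items()
        for ts, action, obj, _src in edges
        if start <= ts <= end
    ]
    return sorted(hits)


# Illustrative, made-up triples from two hypothetical log sources.
audit_triples = [
    Triple(10.0, "proc:bash", "exec", "proc:curl", "audit"),
    Triple(12.0, "proc:curl", "write", "file:/tmp/payload", "audit"),
]
dns_triples = [
    Triple(11.0, "proc:curl", "resolve", "domain:example.com", "dns"),
]

merged = merge_subgraphs(build_subgraph(audit_triples), build_subgraph(dns_triples))
window = edges_between(merged, 10.5, 12.0)
```

In this sketch the node `proc:curl` ends up with edges from both sources, which is the point of fusing sub-graphs: activity recorded in separate logs becomes a single time-ordered chain. A real system would also need entity normalization and handling of delayed log timestamps, neither of which is modeled here.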
