Towards Learning without Labels for Unsupervised Detection of Malicious Executables

Abstract

This dissertation presents novel unsupervised deep learning methodologies for detecting malicious executable files, addressing a gap in research on the unsupervised detection of Windows portable executables (PEs). It begins by compiling a dataset of real-world malicious and benign files, acknowledging the difficulty of manual labeling. The research then progresses from a feature learning approach to a distribution modeling strategy that incorporates a deep ensemble architecture and soft clustering to capture the contextual relationships between samples; this strategy outperforms traditional methods. The final methodology uses Convolutional Neural Networks (CNNs) and PE semantics to classify files without manually labeled data, and it proves effective against the growing threat of malicious executables. The methodologies are validated on several datasets and show improved performance over current state-of-the-art methods, offering organizations robust tools to mitigate malware threats.