This is the fourth in an ongoing series of blogs focused on AI/ML.
Malware detection is an important part of the Netskope Security Cloud platform, built on a secure access service edge (SASE) architecture, that we provide to our customers. Malware is malicious software designed to harm or exploit devices and computer systems. Various types of malware, such as viruses, worms, Trojan horses, ransomware, and spyware, remain a serious problem for corporations and government agencies. Traditional malware detection systems rely on anti-virus signatures, heuristics, and behavior patterns in sandboxes, which require a significant amount of manual analysis from security analysts and researchers. With new attacks and variants emerging every day, it is hard for organizations to keep pace with malware threats. In comparison, artificial intelligence (AI) and machine learning (ML) have the potential to detect unknown and zero-day malware by automatically learning malware patterns from large volumes of historical data. This capability has made AI/ML an indispensable part of a modern malware detection solution, complementing heuristic and signature-based approaches.
At Netskope, we have developed a comprehensive, multi-layered threat protection system to scan our customers’ network traffic. AI/ML is used to power multiple engines in the inline fast scan, as well as static and dynamic analysis-based deep scan. In this blog post, we will highlight three of them:
- Inline PE Classifier
- MS Office Classifier
- Cloud Sandbox
Inline PE Classifier
The Portable Executable (PE) file format is used by Windows executables, object code, and dynamic link libraries (DLLs). It’s one of the most common malware file formats. To stop malicious PE files in real time, we have developed the inline PE classifier. Trained with millions of malicious and benign PE samples, the ML-based classifier is able to identify malware patterns directly in raw bytes. Because the classifier doesn’t need to parse a PE file or extract features based on domain knowledge, it’s lightweight, fast, and suitable for inline predictions.
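To make the idea of classifying on raw bytes concrete, here is a minimal sketch. It is not the production classifier: the byte histogram, the entropy measure, and the linear model with its `weights` and `bias` parameters are all assumptions chosen for illustration. The point is only that a fixed-length input can be computed from the bytes of a file without parsing any PE structure.

```python
import math
from collections import Counter
from typing import List


def byte_histogram(data: bytes) -> List[float]:
    """Normalized 256-bin histogram of byte values; no PE parsing involved."""
    if not data:
        return [0.0] * 256
    counts = Counter(data)
    total = len(data)
    return [counts.get(value, 0) / total for value in range(256)]


def byte_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte, from 0.0 (constant) to 8.0 (uniform)."""
    return -sum(p * math.log2(p) for p in byte_histogram(data) if p > 0)


def score(data: bytes, weights: List[float], bias: float) -> float:
    """Logistic score in (0, 1) from a hypothetical linear model over the histogram."""
    z = bias + sum(w * x for w, x in zip(weights, byte_histogram(data)))
    return 1.0 / (1.0 + math.exp(-z))
```

In this sketch the weights would be learned from labeled malicious and benign samples. A model that learns patterns from the byte sequence itself would replace the histogram, but the inference path stays the same: read bytes, compute a score, compare against a threshold.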
The inline PE classifier complements the signature-based malware engines in fast scan. Since its launch, the classifier has detected unique malware samples that were undetectable to signature-based inline engines, without introducing any new false positives. Its runtime in production is just a few milliseconds.
This high-efficacy ML classifier shortens time to detection, because samples that only it detects can be blocked inline. It also complements the dynamic analysis with advanced forensics in the Advanced Threat Protection engines.
MS Office Classifier
Microsoft Office documents are another common source of malware. As part of Netskope’s Advanced Threat Protection, the Office Classifier combines heuristics and supervised machine learning to identify malicious code embedded in Office documents. The Office Classifier performs static analysis and extracts detailed information about the components in an Office file, including embedded macros (VBA), dynamic data exchange (DDE), and other embedded files such as JPG/MPEG or EXE/PE files. The extracted information is then mapped to hundreds of features, which are used to train ML classification models and to predict whether a new Office document is malicious.
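The step of mapping extracted components to features can be sketched as follows. This is an illustration under assumptions, not the Office Classifier itself: the `OfficeComponents` container, its field names, and the six features are hypothetical, and the real system uses hundreds of features. The auto-run names listed are standard VBA entry points that execute when a document or workbook is opened.

```python
from dataclasses import dataclass, field
from typing import List

# Standard VBA entry points that run automatically on open.
AUTO_EXEC_KEYWORDS = ("autoopen", "auto_open", "document_open", "workbook_open")


@dataclass
class OfficeComponents:
    """Hypothetical container for the output of static analysis."""
    macros: List[str] = field(default_factory=list)          # VBA source text
    dde_links: List[str] = field(default_factory=list)       # DDE field contents
    embedded_files: List[str] = field(default_factory=list)  # embedded file names


def to_features(doc: OfficeComponents) -> List[float]:
    """Map extracted components to a small fixed-length numeric vector."""
    macro_text = "\n".join(doc.macros).lower()
    exe_count = sum(
        name.lower().endswith((".exe", ".dll")) for name in doc.embedded_files
    )
    return [
        float(len(doc.macros)),                                       # macro count
        float(sum(macro_text.count(k) for k in AUTO_EXEC_KEYWORDS)),  # auto-run hits
        float("shell" in macro_text),                                 # shell call present
        float(len(doc.dde_links)),                                    # DDE link count
        float(len(doc.embedded_files)),                               # embedded files
        float(exe_count),                                             # embedded executables
    ]
```

A vector like this, computed for each labeled document, is the kind of input a supervised classifier would be trained on. The same function would then be applied to each new document before prediction.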
The Office Classifier provides proactive coverage against zero-day malware attacks that can evade signature-based detections. For example, the Office Classifier has detected downloads of multiple zero-day