Enhancing Security with AI/ML

Digital transformation has driven rapid enterprise adoption of cloud-delivered services such as SaaS, IaaS, and PaaS. This, in turn, has moved digital assets (that is, data) out of enterprise data centers and into cloud data centers that the enterprise does not control. In addition, the COVID-19 pandemic made remote work the norm. Together, these trends have forced a security transformation from the traditional stack of security appliances deployed in an enterprise data center to cloud-delivered security. Gartner coined the term security service edge (SSE) for this new platform, in which security services such as secure web gateway, cloud access security broker, zero trust network access, and egress firewall are delivered from the cloud so that users can do their work safely, with less risk of compromise and data loss.

There are a few key capabilities that are critical to SSE solutions:

Zero Trust Data Access – SSE solutions enforce security policies for accessing data based on contextual information such as user, device, application, application risk, user activity, and user risk. This contextual information becomes the virtual badge that determines whether a user's access to an enterprise's digital assets is allowed, denied, or allowed with coaching.
Insider Threat Detection – Enterprise users (employees and contractors) are entrusted with access to business-sensitive data to carry out their work. Security controls are needed to ensure that these insiders do not exfiltrate sensitive data, whether inadvertently or maliciously, and thereby put the business at risk.
External Threat Detection – Every enterprise is under attack from external bad actors looking to compromise its data for monetary or strategic gain. These actors range from individual hackers and organized cybercrime groups to nation-states. The attacks can be phishing, malware, ransomware, or sophisticated advanced persistent threat (APT) campaigns. SSE solutions provide threat detection, prevention, and remediation services as an added layer of defense to protect enterprise data.
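To make the "virtual badge" idea concrete, here is a minimal Python sketch of a policy function that maps context to a verdict. It is an illustration only: the field names, the 0 to 100 risk scales, the thresholds, and the rule order are all assumptions made for this example and are not taken from any Netskope product.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    ALLOW = "allow"
    COACH = "coach"  # let the user proceed after a warning or justification
    DENY = "deny"


@dataclass(frozen=True)
class AccessContext:
    """Hypothetical contextual signals; every field name is illustrative."""
    user: str
    device_managed: bool
    app_risk: int         # assumed scale: 0 (safe) to 100 (risky)
    user_risk: int        # assumed scale: 0 (safe) to 100 (risky)
    activity: str         # e.g. "view", "download", "upload"
    data_sensitive: bool


def evaluate(ctx: AccessContext) -> Verdict:
    """Return a verdict from context alone. Thresholds are made up."""
    if not ctx.data_sensitive:
        return Verdict.ALLOW
    # Sensitive data and a very risky user or application: deny outright.
    if ctx.user_risk >= 80 or ctx.app_risk >= 80:
        return Verdict.DENY
    # Sensitive data moving to or from an unmanaged device: deny.
    if not ctx.device_managed and ctx.activity in {"download", "upload"}:
        return Verdict.DENY
    # Borderline cases: coach the user instead of blocking.
    if ctx.app_risk >= 50 or not ctx.device_managed:
        return Verdict.COACH
    return Verdict.ALLOW


if __name__ == "__main__":
    ctx = AccessContext("alice", True, 60, 10, "upload", True)
    print(evaluate(ctx).value)
```

The point of the sketch is that no single signal decides the outcome: the same user gets a different verdict depending on the device, the application, the activity, and the data involved.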

The role of AI/ML in SSE solutions

The key underpinning of a powerful SSE solution is the ability to extract rich contextual information while processing network traffic and to use it to enforce zero trust data access policies. Two of the inputs to a data access decision are the sensitivity of the data leaving an enterprise and indicators of threat in data arriving from external sources. These are areas where artificial intelligence (AI) and machine learning (ML) have proven invaluable in improving the fidelity of detections. Let’s look at this in more detail:

Sensitive data classification

Legacy data security solutions use a combination of regular expressions, keywords, and dictionaries to identify sensitive data. This approach is error-prone and produces excessive false positives, which in turn forces security analysts to sift through mounds of alerts to find the real violations.
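A small example shows why pattern matching alone over-alerts. The Python below uses a typical regular expression for the shape of a US Social Security number; the sample strings are invented for illustration.

```python
import re

# Matches the *shape* ###-##-####, which is all a regex can check.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

# Invented sample lines: only the first one is actually sensitive.
samples = [
    "Employee SSN: 123-45-6789",
    "Order ref 555-12-3456 shipped today",
    "Part no. 987-65-4321, qty 4",
]


def regex_flags(text: str) -> bool:
    """Return True if the text contains something shaped like an SSN."""
    return SSN_PATTERN.search(text) is not None


flagged = [s for s in samples if regex_flags(s)]

if __name__ == "__main__":
    for line in flagged:
        print("ALERT:", line)
```

All three lines raise an alert even though only one contains personal data. The pattern has no notion of the surrounding document, so an order reference and a part number look exactly like an identifier.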

Machine learning-driven data classification can significantly reduce this burden and provide high-fidelity classification verdicts. Natural language processing (NLP) algorithms are well suited to this problem. Netskope has developed NLP models that classify common business documents such as tax forms, paychecks, business contracts, and non-disclosure agreements. With these pre-built models, security admins do not have to write cumbersome and error-prone regular expressions and other patterns to identify which documents contain sensitive information that needs to be protected from compromise.
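To give a feel for how a learned text classifier differs from hand-written patterns, here is a toy multinomial naive Bayes classifier in plain Python. This is a swapped-in textbook technique chosen for brevity, not a description of Netskope's models, and the six training snippets and three labels are invented for the example.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter()              # label -> number of docs
        self.vocab = set()

    def fit(self, docs):
        for text, label in docs:
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.doc_counts[label] += 1
            self.vocab.update(tokens)

    def predict(self, text):
        tokens = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.doc_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n_docs / total_docs)
            for tok in tokens:
                score += math.log((counts[tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented, tiny training set; a real model needs far more data.
TRAINING = [
    ("wages tips federal income tax withheld employer identification number", "tax_form"),
    ("taxable income deductions filing status refund owed", "tax_form"),
    ("the receiving party shall keep confidential information secret", "nda"),
    ("disclosing party confidential information obligations term of agreement", "nda"),
    ("gross pay net pay pay period hours worked deductions", "paycheck"),
    ("pay period earnings overtime hours net pay year to date", "paycheck"),
]

model = NaiveBayes()
model.fit(TRAINING)

if __name__ == "__main__":
    print(model.predict("both parties agree to protect confidential information"))
    print(model.predict("net pay for this pay period after deductions"))
```

The admin never writes a pattern here. The classifier learns which words are characteristic of each document type from labeled examples, which is the same idea, at a vastly smaller scale, as using pre-built document models.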

In the Netskope Security Cloud, 20% of the documents being scanned are images, such as JPG and PNG files. Additionally, many business documents have embedded images. The most common way of classifying images is to run them through an optical character recognition (OCR) engine and then inspect the extracted text. However, OCR engines perform only marginally on the image content commonly seen in enterprise traffic. This is another area where AI/ML can be leveraged to yield outstanding results, and a number of deep learning algorithms are suitable for classifying image data. The image detection AI/ML models deployed in the Netskope Security Cloud recognize passports, driver's licenses, other photo identification, computer screenshots, whiteboard images, and more. Given the rise of privacy regulations around the world, such as CCPA, GDPR, and LGPD, it is very important for enterprises that hold images containing PII to protect them from being compromised by insiders and external actors.

Threat detection

Insider threat continues to be one of the biggest issues facing enterprises today. Departing employees tend to take sensitive information with them, such as design documents and code that they contributed to while at the company. Malicious insiders also steal company data and share it externally. The Netskope Intelligent SSE solution keeps a log of all user activities and applies AI/ML algorithms to detect anomalous behavior. In addition to alerting admins about the anomalous behavior, the solution maintains a risk score for every user, similar to a personal credit score. The risk score is then fed into the zero trust data access policies as a matching criterion. For example, a user with a poor risk score can be denied access to sensitive data.
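The loop from activity log to anomaly to risk score to policy can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: it uses a plain z-score against the user's own history as the anomaly detector, and the 0 to 1000 score range, the penalty, and the access threshold are hypothetical values, not the scoring used by any product.

```python
from statistics import mean, stdev


def z_score(history, today):
    """Standard deviations by which today's count exceeds the user's baseline."""
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return 0.0 if today <= mu else float("inf")
    return (today - mu) / sigma


def update_risk_score(score, z, threshold=3.0, penalty=200):
    """Lower the score after anomalous activity; clamp to 0..1000 (assumed scale)."""
    if z >= threshold:
        score -= penalty
    return max(0, min(1000, score))


def may_access_sensitive(score, minimum=700):
    """Policy check: like a credit score, a higher value means more trust."""
    return score >= minimum


# Hypothetical daily file-download counts for one user.
history = [12, 9, 11, 10, 13, 8, 12]
today = 240  # e.g. a bulk download shortly before the user leaves the company

z = z_score(history, today)
after_one = update_risk_score(1000, z)
after_two = update_risk_score(after_one, z)

if __name__ == "__main__":
    print(after_one, may_access_sensitive(after_one))
    print(after_two, may_access_sensitive(after_two))
```

Comparing each user against their own baseline matters: 240 downloads may be routine for a build server account and highly unusual for an individual employee. A single anomaly lowers the score without locking the user out, while repeated anomalies eventually cross the policy threshold.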

A very common way to detect threats like malware and ransomware is to use vulnerability and exploit signatures. Indicators of compromise, such as bad file hashes and malicious URLs, are also used. These techniques are good at detecting known threats, but what about the unknown ones, commonly referred to as zero-day threats? This is where AI/ML comes to the rescue. Models trained on the vast number of known vulnerabilities and exploits are able to detect attacks that have yet to be discovered. Netskope has successfully developed AI/ML models to detect threats in Windows executable files (Portable Executable, or PE, files) as well as in common document formats like PDF and Microsoft Office documents.
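The limitation of exact-match indicators is easy to demonstrate. In the Python sketch below, a hash blocklist catches a known sample but misses a copy with one extra byte, while a simple content feature (byte entropy) barely moves. The byte strings are harmless stand-ins, and byte entropy is used here only as one example of the kind of static feature a model could be trained on; the source does not say which features Netskope's models use.

```python
import hashlib
import math
from collections import Counter


def sha256_hex(data: bytes) -> str:
    """Hex digest used as an exact-match indicator of compromise."""
    return hashlib.sha256(data).hexdigest()


def byte_entropy(data: bytes) -> float:
    """Shannon entropy of the byte distribution, in bits per byte (0 to 8)."""
    if not data:
        return 0.0
    n = len(data)
    counts = Counter(data)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


# Harmless stand-in for a "known bad" file; this is not real malware.
known_bad = bytes(range(256)) * 16
blocklist = {sha256_hex(known_bad)}

# A trivially modified variant: a single byte appended.
variant = known_bad + b"\x00"

hash_hit_original = sha256_hex(known_bad) in blocklist
hash_hit_variant = sha256_hex(variant) in blocklist
entropy_shift = abs(byte_entropy(known_bad) - byte_entropy(variant))

if __name__ == "__main__":
    print("hash match, original:", hash_hit_original)
    print("hash match, variant: ", hash_hit_variant)
    print("entropy shift (bits):", entropy_shift)
```

Any change to a file produces a completely different hash, so a blocklist only ever covers samples that someone has already seen. Features computed from the content change gradually with the content, which is what gives a trained model a chance to generalize to variants it was never shown.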

In the Netskope Next Gen Secure Web Gateway, AI/ML models are used to classify both the URLs and the web content of phishing sites that attempt to steal user credentials. AI/ML is also used to categorize websites and help block inappropriate content from being viewed by enterprise users.
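Before a model can classify a URL, the URL has to be turned into numbers. The Python sketch below extracts a few lexical features using only the standard library. The feature set is a hypothetical example chosen for illustration; it is not the feature set of the Netskope models, and a real classifier would combine many more signals, including the page content.

```python
import ipaddress
from urllib.parse import urlsplit


def url_features(url: str) -> dict:
    """Turn a URL into a small numeric feature vector (illustrative only)."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    try:
        ipaddress.ip_address(host)
        host_is_ip = True
    except ValueError:
        host_is_ip = False
    return {
        "url_length": float(len(url)),
        "host_dots": float(host.count(".")),
        "host_hyphens": float(host.count("-")),
        "host_is_ip": float(host_is_ip),            # raw IP instead of a domain
        "has_at_sign": float("@" in parts.netloc),  # userinfo can hide the real host
        "uses_https": float(parts.scheme == "https"),
    }


if __name__ == "__main__":
    for url in ("https://www.example.com/", "http://192.0.2.10/login"):
        print(url, url_features(url))
```

A vector like this would be the input to a trained classifier, which learns from labeled examples how the features combine. No single feature is conclusive on its own.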

Conclusion

In this blog, we have seen that AI/ML algorithms can help solve a variety of security problems commonly seen in enterprises. For SSE solutions, it is worth noting that these algorithms must be optimized to run and return a verdict in real time to be effective. Over time, there will be many more challenging use cases that AI/ML can solve effectively.

Krishna Narayanaswamy

A highly regarded and awarded researcher in security, behavioral anomaly detection, and deep packet inspection, Krishna Narayanaswamy brings two decades of technical and thought leadership as founder and chief technology officer at Netskope.
