In the modern, cloud-first era, traditional approaches to data protection struggle to keep up. Data is growing rapidly in volume, variety, and velocity. It is also becoming increasingly unstructured, which makes it harder to detect and, consequently, harder to protect. Most DLP solutions today rely only on textual analysis to decide what data is sensitive, applying regular-expression patterns and content matching techniques to “conventional” data types (such as Word documents and spreadsheets). These techniques were once revolutionary; today, they fall short.
Similarly, traditional zero trust network access (ZTNA) solutions often depend on broad wildcard domains and IP subnets, making secure access management cumbersome and inefficient.
Don’t get me wrong: DLP should be equipped with as many text analysis tools as possible. After all, when content is identifiable, it is the content itself that is sensitive. DLP must be able to recognize thousands of known sensitive data types and unambiguous regular expressions, and it must understand data formats specific to different countries and languages. For reliability, DLP must also be equipped with highly scalable data fingerprinting engines that can memorize and match specific information found in sensitive databases and documents. All of these engines depend on textual content that is clear and legible. To minimize false positives, it is now also essential to leverage rich context, deep learning, natural language processing (NLP), and other newer ML- and AI-based automated techniques.
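To make the pattern-matching idea concrete, here is a minimal Python sketch of a text-based data identifier for payment card numbers. It is an illustration only, not Netskope's implementation: the pattern `CARD_CANDIDATE` and the helpers `luhn_valid` and `find_card_numbers` are hypothetical names chosen for this example. A regular expression finds candidate digit runs, and a Luhn checksum then discards candidates that cannot be real card numbers, which is one simple way to cut false positives.

```python
import re

# Hypothetical pattern (an assumption for this sketch): 13 to 19 digits,
# optionally separated by single spaces or hyphens.
CARD_CANDIDATE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    # Walk from the rightmost digit; double every second digit.
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_card_numbers(text: str) -> list[str]:
    """Return normalized digit strings that match the pattern and pass Luhn."""
    hits = []
    for match in CARD_CANDIDATE.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_valid(digits):
            hits.append(digits)
    return hits


sample = "Pay with 4111-1111-1111-1111; ref 1234 5678 9012 3456."
print(find_card_numbers(sample))
```

Both digit runs in the sample match the pattern, but only the first passes the checksum. Note that this approach works only when the text can be extracted cleanly, which is exactly the limitation discussed above.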
ZTNA must also adapt to reduce operational overhead and strengthen security by simplifying policy management and configuration audits.
Modern business relies on unstructured data like images and screenshots to quickly share information. However, traditional optical character recognition (OCR) struggles with low-quality images and consumes excessive resources, leading to delays and unreliable results. As visual data sharing grows, smarter solutions are needed to accurately identify and protect sensitive information.
To address these challenges, Netskope has pioneered SkopeAI, a suite of AI and ML innovations that revolutionizes both data protection and secure access.
Evolving modern DLP
For modern businesses, DLP has to evolve. Modern DLP needs to function more like a human brain. Our brain doesn’t have to read the text on a picture ID to tell that the document is indeed a picture ID containing personally identifiable information (PII). Now, modern DLP can do the same.
To solve modern DLP challenges, Netskope has pioneered ML-enabled image classification. This technique leverages deep learning and convolutional neural networks (CNNs) to swiftly and accurately identify sensitive images without the need for text extraction. It mimics the human visual cortex, recognizing visual characteristics such as shapes and details to comprehend the image as a whole (much like how we can recognize that a passport is a passport without reading the details in it). ML enables feature recognition even in poor-quality images, much as the human eye does. This is crucial, because images can be blurry, damaged, or discolored and still contain sensitive information.
The importance of personalized data classifiers
Netskope’s industry-leading ML classifiers automate the identification of sensitive data, categorizing images and documents with exceptional precision. This technology detects and safeguards various sensitive data types, including source code, tax forms, patents, identification documents like passports and driver’s licenses, credit and debit cards, as well as full-screen screenshots and application screenshots. The ML classifiers work in conjunction with text-based DLP analysis (such as data identifiers, exact matching, document fingerprinting, and ML-based NLP and deep learning), complementing the DLP analysis of a file when text is indecipherable or harder to extract. They greatly enhance the detection accuracy and help