In the Blink of AI — How Artificial Intelligence is Changing the Way Enterprises Protect Sensitive Data in Images

Jul 27 2020

Co-authored by Yihua Liao and Yi Zhang

You have probably heard how AI technology is used to recognize cats, dogs, and humans in images, a task known as image classification. The same technology that identifies a cat or dog can also identify sensitive data (such as identification cards and medical records) in images traversing your corporate network. In this blog post, we will show you how we use convolutional neural networks (CNN), transfer learning, and generative adversarial networks (GAN) to provide image data protection for Netskope’s enterprise customers.

Image Data Security

Images represent over 25% of the corporate user traffic that goes through Netskope’s Data Loss Prevention (DLP) platform. Many of these images contain sensitive information, including customer or employee personally identifiable information (PII) (e.g., pictures of passports, driver’s licenses, and credit cards), screenshots of intellectual property, and confidential financial documents. By detecting sensitive information in images, documents, and application traffic flows, we help organizations comply with regulations and protect their assets.

The traditional approach to identifying sensitive data in an image has been to use optical character recognition (OCR) to extract text from the image. The extracted text is then used for pattern matching. This technology, though effective, is resource-intensive and delays detection of security violations. OCR also has difficulty identifying violations in low-quality images. In many cases, we only need to determine the classification of the input image. For example, we would like to find out whether an image shows a credit card, without knowing the 16-digit card number and other details in the image. Machine learning-based image classification is well suited to this task because of its accuracy, speed, and ability to work inline with granular policy controls. We can also combine image classification with OCR to generate more detailed violation alerts.
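As a rough illustration of the classify-first idea (a sketch, not Netskope's production code), the snippet below runs a classifier on every image and calls OCR only when a sensitive class is detected and extra detail is wanted. The names `inspect_image`, `Alert`, and `SENSITIVE_LABELS`, the label strings, and the 0.9 threshold are all hypothetical; the classifier and OCR engine are passed in as plain callables.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

# Hypothetical interfaces: a classifier returns (label, confidence),
# an OCR engine returns the text it extracted from the image.
Classifier = Callable[[bytes], Tuple[str, float]]
OcrEngine = Callable[[bytes], str]

# Assumed label names, chosen for illustration only.
SENSITIVE_LABELS = {"credit_card", "passport", "drivers_license"}


@dataclass
class Alert:
    label: str
    confidence: float
    ocr_text: Optional[str] = None  # filled in only when OCR was requested


def inspect_image(
    image: bytes,
    classify: Classifier,
    ocr: OcrEngine,
    threshold: float = 0.9,
    want_details: bool = False,
) -> Optional[Alert]:
    """Classify first; run the more expensive OCR step only on flagged images."""
    label, confidence = classify(image)
    if label not in SENSITIVE_LABELS or confidence < threshold:
        return None  # nothing sensitive detected, OCR is never invoked
    text = ocr(image) if want_details else None
    return Alert(label, confidence, text)
```

With this shape, most traffic pays only the classification cost, and OCR is reserved for the images where a more detailed alert is worth the extra processing.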

CNN and Transfer Learning

Deep learning and convolutional neural networks (CNN) were a huge breakthrough in image classification in the early 2010s. Since then, CNN-based image classification has been applied to many different domains, including medicine, autonomous vehicles, and security, with accuracy close to that of humans. Inspired by how the human visual cortex works, a CNN can effectively capture shapes, objects, and other qualities of an image to better understand its contents. A typical CNN has two parts (depicted in the diagram below):

  • The convolutional base, which consists of a stack of convolutional and pooling layers. The main goal of the convolutional base is to generate features from the image. It builds progressively higher-level features out of an input image. The early layers capture general features, such as edges, lines, and dots in the image. The later layers capture task-specific features, which are more human-interpretable, such as the logo on a credit card or application windows in a screenshot.
  • The classifier, which is usually composed of fully connected layers. Think of the classifier as a machine that sorts the features identified in the convolutional base. The classifier will tell you whether those features indicate a cat, a dog, a driver’s license, or an X-ray.
Diagram of CNN and transfer learning
Image Source: DOI: 10.3390/electronics8030292
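To make the two parts concrete, here is a minimal pure-Python sketch of a forward pass: a one-layer "convolutional base" (convolution, ReLU, max pooling) followed by a fully connected classifier. It is a teaching toy under simplifying assumptions (grayscale images as nested lists, hand-picked kernels, no training), and the function names are ours, not part of any library.

```python
from typing import List

Matrix = List[List[float]]


def conv2d(image: Matrix, kernel: Matrix) -> Matrix:
    """Valid (no padding) 2D cross-correlation, the operation CNN layers apply."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh)
                for dj in range(kw)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def relu(feature_map: Matrix) -> Matrix:
    """Keep positive activations, zero out the rest."""
    return [[max(0.0, v) for v in row] for row in feature_map]


def max_pool(feature_map: Matrix, size: int = 2) -> Matrix:
    """Non-overlapping max pooling; rows/columns that do not fill a window are dropped."""
    return [
        [
            max(feature_map[i + di][j + dj] for di in range(size) for dj in range(size))
            for j in range(0, len(feature_map[0]) - size + 1, size)
        ]
        for i in range(0, len(feature_map) - size + 1, size)
    ]


def convolutional_base(image: Matrix, kernels: List[Matrix]) -> List[float]:
    """Apply each kernel, then ReLU and pooling, and flatten into one feature vector."""
    features: List[float] = []
    for kernel in kernels:
        pooled = max_pool(relu(conv2d(image, kernel)))
        features.extend(v for row in pooled for v in row)
    return features


def classifier(features: List[float], weights: Matrix, biases: List[float]) -> List[float]:
    """Fully connected layer: one raw score per class."""
    return [
        sum(w * f for w, f in zip(class_weights, features)) + b
        for class_weights, b in zip(weights, biases)
    ]
```

Feeding an image with a vertical edge through a vertical-edge kernel such as `[[-1, 0, 1]] * 3` produces strong activations, while a horizontal-edge kernel produces none. This mirrors how early layers respond to simple patterns. A real CNN stacks many such layers and learns the kernel values from data.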

You may need millions of labeled images to train a CNN from scratch in order to achieve state-of-the-art classification accuracy. It is not trivial to collect a large number of images with proper labels, especially when you are dealing with sensitive data such as passports and credit cards. Fortunately, we can use transfer learning, a popular deep learning technique, to train a neural network with just hundreds or thousands of training samples. With transfer learning, we can take an existing convolutional neural network (e.g., ResNet or MobileNet) that was trained on a large dataset to classify other objects, and fine-tune it on additional images. Transfer learning allows us to train a CNN image classifier with a limited dataset and still achieve good performance while significantly reducing the training time.
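The sketch below illustrates one common flavor of transfer learning, feature extraction: the pretrained base is frozen and only a small new classifier head is trained on the limited dataset. Everything here is a toy assumption for illustration. `pretrained_base` is a hand-written stand-in (in practice you would load ResNet or MobileNet weights through a deep learning framework), and the head is a plain logistic regression trained with stochastic gradient descent.

```python
import math
from typing import Callable, List, Sequence, Tuple

Features = List[float]
Head = Tuple[List[float], float]  # (weights, bias) of the new classifier


def pretrained_base(pixels: Sequence[float]) -> Features:
    """Stand-in for a frozen, pretrained convolutional base (hypothetical).

    Its "weights" are fixed: nothing in this function changes during training.
    """
    mean = sum(pixels) / len(pixels)
    contrast = max(pixels) - min(pixels)
    return [mean, contrast]


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def train_head(
    base: Callable[[Sequence[float]], Features],
    images: List[Sequence[float]],
    labels: List[int],
    epochs: int = 500,
    lr: float = 0.5,
) -> Head:
    """Train only the new classifier head on features from the frozen base."""
    # The base never changes, so its features can be computed once up front.
    features = [base(image) for image in images]
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        for f, y in zip(features, labels):
            p = sigmoid(sum(w * x for w, x in zip(weights, f)) + bias)
            err = p - y  # gradient of the log loss with respect to the logit
            weights = [w - lr * err * x for w, x in zip(weights, f)]
            bias -= lr * err
    return weights, bias


def predict(
    base: Callable[[Sequence[float]], Features], head: Head, image: Sequence[float]
) -> float:
    """Probability that the image belongs to the positive (sensitive) class."""
    weights, bias = head
    return sigmoid(sum(w * x for w, x in zip(weights, base(image))) + bias)
```

Because only the head's handful of parameters is updated, a few labeled samples are enough in this toy setting. The same reasoning explains why fine-tuning a pretrained CNN needs far less data and time than training from scratch.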

Synthetic Training Data Generation

It’s very challenging to acquire real images for the sensitive categories we are interested in. To increase the size and diversity of the training dataset and further improve the accuracy of CNN classifiers, we use generative adversarial networks (GAN) to generate synthetic training data. The basic idea of a GAN is to train two neural networks that compete against each other (see the high-level architecture diagram below). One neural network, called the generator, generates fake data, while the other, the discriminator, evaluates it for authenticity. The generator’s goal is to produce data that is similar to the training data and fools the discriminator.

Diagram of GAN
Image Source: Deep Convolutional Generative Adversarial Networks
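The adversarial setup can be sketched with a deliberately tiny, one-dimensional example. Here the "images" are single numbers, the generator is a linear map of random noise, and the discriminator is a logistic unit. The class and function names, learning rate, and batch size are illustrative assumptions, and the gradients are written out by hand. The sketch shows only the alternating update pattern: image GANs use deep convolutional networks, and a simple loop like this one is not guaranteed to converge.

```python
import math
import random
from dataclasses import dataclass
from typing import Callable, List, Tuple


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


@dataclass
class Generator:
    a: float = 1.0
    b: float = 0.0

    def sample(self, z: float) -> float:
        """Map a noise value z to a fake data point."""
        return self.a * z + self.b


@dataclass
class Discriminator:
    w: float = 0.0
    c: float = 0.0

    def prob_real(self, x: float) -> float:
        """Estimated probability that x came from the real data."""
        return sigmoid(self.w * x + self.c)


def discriminator_step(d: Discriminator, real: List[float], fake: List[float], lr: float) -> None:
    """One gradient step on -[log D(real) + log(1 - D(fake))], averaged over the batch."""
    n = len(real)
    grad_w = grad_c = 0.0
    for x_real, x_fake in zip(real, fake):
        err_real = d.prob_real(x_real) - 1.0  # pushes D(real) toward 1
        err_fake = d.prob_real(x_fake)        # pushes D(fake) toward 0
        grad_w += (err_real * x_real + err_fake * x_fake) / n
        grad_c += (err_real + err_fake) / n
    d.w -= lr * grad_w
    d.c -= lr * grad_c


def generator_step(g: Generator, d: Discriminator, noise: List[float], lr: float) -> None:
    """One gradient step on -log D(G(z)): the generator tries to fool D."""
    n = len(noise)
    grad_a = grad_b = 0.0
    for z in noise:
        err = d.prob_real(g.sample(z)) - 1.0  # gradient of the loss w.r.t. D's logit
        grad_a += err * d.w * z / n           # chain rule through x = a * z + b
        grad_b += err * d.w / n
    g.a -= lr * grad_a
    g.b -= lr * grad_b


def train(
    real_sampler: Callable[[random.Random], float],
    steps: int = 200,
    batch: int = 16,
    lr: float = 0.05,
    seed: int = 0,
) -> Tuple[Generator, Discriminator]:
    """Alternate discriminator and generator updates on fresh batches."""
    rng = random.Random(seed)
    g, d = Generator(), Discriminator()
    for _ in range(steps):
        noise = [rng.gauss(0.0, 1.0) for _ in range(batch)]
        real = [real_sampler(rng) for _ in range(batch)]
        fake = [g.sample(z) for z in noise]
        discriminator_step(d, real, fake, lr)
        generator_step(g, d, noise, lr)
    return g, d
```

A call such as `train(lambda rng: rng.gauss(4.0, 0.5))` runs the loop against real data centered at 4. The discriminator first learns to score larger values as real, and the generator's offset `b` is then pushed in the direction that raises the discriminator's score on its samples.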

With a GAN, we are able to synthesize photorealistic images with varying degrees of change in rotation, color, blurring, background, and so on. Here are a few examples of the synthetic images:

Examples of synthetic images

Netskope’s Inline DLP Image Classifiers

At Netskope, we have developed CNN-based image classifiers as part of our Next Gen SWG and cloud inline solutions covering managed apps, unmanaged apps, custom apps, and public cloud service user traffic. The classifiers are able to accurately identify images with sensitive information, including passports, driver’s licenses, US social security cards, credit and debit cards, and full-screen and application screenshots. The inline classifiers provide granular policy controls in real time.

Examples of passports, driver’s licenses, social security numbers, and credit/debit cards
Screenshots of examples

Future Work

At Netskope, we are actively expanding our portfolio of inline image classifiers with the latest computer vision technology. We also have the capability to train custom classifiers and identify new types of images that our customers are interested in classifying. If your organization has unique assets that may be shared in images and you’d like to protect those assets, please contact us at [email protected] to learn more.

Yihua Liao
Dr. Yihua Liao is the Head of AI Labs at Netskope. His team develops cutting-edge AI/ML technology to tackle many challenging problems in cloud security, including data loss prevention, malware and threat protection, and user/entity behavior analytics. Previously, he led data science teams at Uber and Facebook.
