
The Eight “Must-Haves” for Successful Anomaly Detection

Feb 10 2016
Tags
Anomaly Detection
Cloud Access Security Broker
Cloud Best Practices
Cloud Computing
Cloud Security
Tools and Tips

Traditional anomaly detection methods are either rule-based or time-series based. Rule-based methods don’t generalize well because the rules are too specific to cover all possible scenarios. Time-series methods (time vs. quantity) are too low-dimensional to capture the complexity of real life. Real-life events have more dimensions (time, source and destination locations, activity type, object acted on, app used, etc.). A successful anomaly detection system will have eight “must-have” features.

Before we go through those features, at the highest level the system must be one that “allows” rather than “blocks” and is based on machine learning.

An allow list is critical because it studies the good guys. Bad guys try to hide and outsmart block-based platforms like anti-malware. A successful machine-learning anomaly detection system won’t chase bad guys, looking for “bad-X” in order to react with “anti-X.” Instead, an allow-based platform can study what is stable (good guys’ normal behavior) and then look out for outliers. This approach avoids engaging in a perpetual and futile arms race.

If you’re going to do anomaly detection the right way, you need to be able to scale to billions of events per day and beyond. It’s not practical at that scale to define allow lists a priori or to keep a perfect history of all observed behavior combinations. Instead, anomaly detection models should be “soft” in the sense that they always deal with conditional probabilities of event features and are ever-evolving.
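To make the idea of a “soft” model concrete, here is a minimal sketch that keeps running counts and answers with a smoothed conditional probability instead of a yes/no allow-list lookup. The class name, the add-one smoothing, and the example values are assumptions made for illustration; this is not a description of how any particular product is implemented.

```python
from collections import defaultdict


class ConditionalModel:
    """Running estimate of P(value | context) built from observed events."""

    def __init__(self, smoothing=1.0):
        self.smoothing = smoothing  # hypothetical additive smoothing constant
        self.counts = defaultdict(lambda: defaultdict(float))

    def observe(self, context, value):
        # Every event nudges the estimate; nothing is declared up front.
        self.counts[context][value] += 1.0

    def probability(self, context, value):
        seen = self.counts.get(context, {})
        total = sum(seen.values())
        # Reserve one extra slot of probability mass for never-seen values.
        distinct = len(seen) + 1
        return (seen.get(value, 0.0) + self.smoothing) / (total + self.smoothing * distinct)


model = ConditionalModel()
for _ in range(98):
    model.observe("upload", "Box")
for _ in range(2):
    model.observe("upload", "Dropbox")

p_usual = model.probability("upload", "Box")    # common app for this activity
p_new = model.probability("upload", "Slack")    # never seen for this activity
print(f"P(Box | upload)   = {p_usual:.4f}")
print(f"P(Slack | upload) = {p_new:.4f}")
```

A never-seen value gets a small but non-zero probability, so the model can rank it as rare without treating it as forbidden.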

The second high-level requirement is that a successful anomaly detection system must be machine learning-based. Virtually every CASB today uses this term, but few mean it. Machine learning means just what it says: pattern recognition should be done by the computer without it being explicitly told what to look for. There are two main types of machine learning: supervised and unsupervised. The former is where the computer learns from a dataset of labeled training data, whereas the latter is where the computer makes sense of unlabeled data and finds patterns that are hard to find otherwise. Both supervised and unsupervised machine learning are relevant for this blog, and from here on out I’ll simply refer to anomaly detection as “Machine Learned Anomaly Detection,” or “MLAD” for short.

Now that we have established some high-level requirements, let’s dive into the eight “must-haves” for effective MLAD.

Noise resistance: A common issue with all anomaly detection systems is false positives. In reality, it’s hard to avoid false positives entirely because in the real world there’s always an overlap between two distributions with unbounded ranges and different means. The chart below, which includes two distributions from the same data set of test results, shows this. Move the criterion threshold value to the right and you get fewer false positives (FPs). The problem is that by doing this you’ll also get a growing number of false negatives (FNs). There is always a tradeoff.
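The tradeoff can be worked through numerically. The sketch below assumes two Gaussian score distributions (normal events centered at 0, anomalous events centered at 3, both with standard deviation 1); those parameters are invented for illustration and are not taken from any real data set.

```python
import math


def normal_cdf(x, mean, std):
    """Cumulative distribution function of a Gaussian."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def error_rates(threshold, normal_mean=0.0, anomaly_mean=3.0, std=1.0):
    """Return (FP rate, FN rate) when scores above the threshold are flagged."""
    fp = 1.0 - normal_cdf(threshold, normal_mean, std)  # normal events flagged
    fn = normal_cdf(threshold, anomaly_mean, std)       # anomalies missed
    return fp, fn


for t in (1.0, 1.5, 2.0, 2.5):
    fp, fn = error_rates(t)
    print(f"threshold={t:.1f}  FP rate={fp:.4f}  FN rate={fn:.4f}")
```

As the threshold moves right, the FP rate falls and the FN rate rises; because both tails are unbounded, no threshold drives both to zero.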

While it is difficult to avoid false positives, a successful MLAD system will take steps to help the user filter noise. In cloud security, new users or devices will, by definition, generate patterns that are seen for the first time (a new IP address, a new application, a new account, etc.). Good MLAD will learn source habits over time and flag anomalies only when the event stream from a source, such as a user or device, is statistically seasoned, or established enough.
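One simple way to picture “seasoning” is a gate that suppresses alerts until a source has produced enough history. The cutoff of 50 events and the function name below are hypothetical; a real system would likely use a statistical criterion rather than a fixed count.

```python
from collections import Counter, defaultdict

MIN_EVENTS_SEASONED = 50  # hypothetical cutoff, chosen only for this example

event_counts = Counter()          # source -> number of events seen so far
seen_values = defaultdict(set)    # source -> values observed so far


def process(source, value):
    """Record the event; return True only for a new value from a seasoned source."""
    seasoned = event_counts[source] >= MIN_EVENTS_SEASONED
    is_new = value not in seen_values[source]
    event_counts[source] += 1
    seen_values[source].add(value)
    return seasoned and is_new


first = process("new_user", "10.0.0.1")       # everything is new: not flagged
for _ in range(MIN_EVENTS_SEASONED):
    process("new_user", "10.0.0.1")           # habits are being learned
later = process("new_user", "203.0.113.9")    # seasoned source, new IP: flagged
print(first, later)
```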

More critically, MLAD must support a likelihood metric per event. Operators can display only the top N most unlikely/unusual events, sorted from most to least unusual, while automatically filtering out any event that is more likely to occur than a chosen cutoff, such as “one in a thousand” or “one in a million.” Often these per-event likelihood metrics are based on the machine-learned statistical history of parameter values and their likelihood to appear together in context, for any source. It is up to the user to set the sensitivity thresholds to display what they want to see. This type of approach flags “rareness” and not “badness.”
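The operator view described here can be sketched as a filter plus a top-N selection. The event records, field names, and default cutoff are made up for the example; only the idea of a per-event probability and a user-set threshold comes from the text.

```python
import heapq


def top_unusual(events, n=3, max_probability=1e-3):
    """Return up to n events at or below the probability cutoff, rarest first."""
    rare = [e for e in events if e["probability"] <= max_probability]
    return heapq.nsmallest(n, rare, key=lambda e: e["probability"])


events = [
    {"id": "e1", "probability": 0.2},
    {"id": "e2", "probability": 4e-7},
    {"id": "e3", "probability": 8e-4},
    {"id": "e4", "probability": 0.05},
    {"id": "e5", "probability": 2e-5},
]

report = top_unusual(events, n=2)  # "one in a thousand" cutoff, top 2
for e in report:
    print(e["id"], f"about 1 in {round(1 / e['probability']):,}")
```

Tightening `max_probability` to `1e-6` would leave only the “one in a million” class of events in the report.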

Multi-dimensionality and generality: Successful MLAD platforms don’t rely on specific, hard-wired rules. Machine-learned anomalies are no longer unidimensional, such as “location-based,” “time-based,” etc. Instead, they are designed to detect anomalies in multiple, multi-dimensional spaces. You must look at every feature that you can collect and that makes sense for an event, and consider many features as a whole when calculating the likelihood of each combination. An anomaly may be triggered due to one unusual value in a dimension, or a combination of multiple dimensions falling out of bounds. Features can be categorical or numeric, ordered or not, cyclical or not, monotonic or non-monotonic.
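A small example shows why combinations matter: each value below is common on its own, yet the pairing has never been observed. The sketch counts single features and feature pairs over an invented history; the helper names and the restriction to pairs are simplifying assumptions.

```python
from collections import Counter
from itertools import combinations


def fit(events):
    """Count single (feature, value) items and pairs of items across events."""
    singles, pairs = Counter(), Counter()
    for event in events:
        items = sorted(event.items())
        singles.update(items)
        pairs.update(combinations(items, 2))
    return singles, pairs, len(events)


def rarest_evidence(event, singles, pairs, total):
    """Return the single item or pair of items with the lowest historical share."""
    items = sorted(event.items())
    share = {(item,): singles[item] / total for item in items}
    share.update({pair: pairs[pair] / total for pair in combinations(items, 2)})
    key = min(share, key=share.get)
    return key, share[key]


history = [{"activity": "download", "app": "Box"}] * 50
history += [{"activity": "upload", "app": "Drive"}] * 50
singles, pairs, total = fit(history)

# "upload" and "Box" each appear in half of all events, but never together.
evidence, share = rarest_evidence({"activity": "upload", "app": "Box"},
                                  singles, pairs, total)
print(evidence, share)
```

A purely one-dimensional check would pass this event, since neither value is unusual by itself.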


Robustness and ability to cope with missing data: Traditional batch machine learning clustering methods suffer from two critical issues:

  • They break in the face of incomplete data, such as missing dimensions in some events.
  • Due to the curse of dimensionality and the way distance metrics between multi-dimensional points are computed, they lose their effectiveness in high dimensions (typically beyond about five dimensions).

A good MLAD platform doesn’t rely on traditional batch clustering such as k-means. It is feature agnostic, dimension agnostic and can deal with missing or additional dimensions (features in an event) on the fly, as they appear.
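As a rough illustration of being dimension agnostic, the sketch below keeps one counter per feature name, creates counters lazily when a new feature shows up, and scores an event using only the features it actually carries. The class, the smoothing, and the “surprise in bits” score are assumptions for the example, not the platform’s actual algorithm.

```python
import math
from collections import Counter, defaultdict


class FeatureAgnosticModel:
    def __init__(self):
        self.values = defaultdict(Counter)  # feature name -> value counts

    def update(self, event):
        # Missing features are simply absent; new ones get a counter on the fly.
        for name, value in event.items():
            self.values[name][value] += 1

    def surprise(self, event):
        """Sum of -log2 P(value) over the features this event carries."""
        bits = 0.0
        for name, value in event.items():
            counts = self.values.get(name)
            if not counts:
                continue  # never-seen dimension: no history to judge it yet
            total = sum(counts.values())
            p = (counts[value] + 1) / (total + len(counts) + 1)  # add-one smoothing
            bits += -math.log2(p)
        return bits


model = FeatureAgnosticModel()
for _ in range(20):
    model.update({"app": "Box", "country": "US"})
model.update({"app": "Box"})                      # "country" is missing
model.update({"app": "Box", "device": "laptop"})  # extra "device" dimension

partial = model.surprise({"app": "Box"})
full = model.surprise({"app": "Box", "country": "FR"})
print(f"partial event: {partial:.2f} bits, unusual country: {full:.2f} bits")
```

No distance metric between points is involved, so an event with fewer or more dimensions never has to be padded or discarded.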

Adaptability and self-tuning: Over time, even the most persistent habits tend to change. Users may switch to other applications, move to new geographical locations, etc. A platform that is based on machine learning adapts over time to new patterns and user habits. Old unusual patterns become the new norm if they persist for a long enough period. All conditional event probabilities keep updating over time.
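One common way to let old habits fade is exponential decay of the counts, sketched below. The decay factor of 0.98 and the class name are arbitrary choices for the example, and the per-event loop over all keys favors clarity over efficiency.

```python
from collections import defaultdict


class DecayingCounts:
    """Value frequencies in which older observations gradually lose weight."""

    def __init__(self, decay=0.98):
        self.decay = decay  # hypothetical decay factor per observed event
        self.weights = defaultdict(float)

    def observe(self, value):
        for key in self.weights:
            self.weights[key] *= self.decay  # age everything seen so far
        self.weights[value] += 1.0

    def probability(self, value):
        total = sum(self.weights.values())
        return self.weights.get(value, 0.0) / total if total else 0.0


habits = DecayingCounts()
for _ in range(200):
    habits.observe("Box")        # the established habit
habits.observe("Drive")          # first use of a different app
p_early = habits.probability("Drive")
for _ in range(199):
    habits.observe("Drive")      # the new pattern persists
p_late = habits.probability("Drive")
print(f"P(Drive) right after the switch: {p_early:.3f}, after it persists: {p_late:.3f}")
```

The first use of the new app is rare and would stand out; after it persists, it dominates the estimate and becomes the new norm.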

Since organizations tend to differ widely in their usage profiles, cloud app mix, event dimensions, and number of users, it’s important to keep a separate model for each organization and let it shift as that organization changes.

Future-proofing: An MLAD platform is agnostic to the semantics of input features. All it cares about is the statistical probability of each feature to occur in its specific context. We can add features (think about installing a new security camera, or any other new sensor) without code changes to the platform. The moment a new event source is introduced as a new input is the moment that you can detect anomalies in it. In the coming months, we plan to enrich our data with more features that will enable us to introduce additional data flows and models into the Netskope MLAD platform.


Personalization: Good MLAD studies each source separately, yet all in parallel. In Netskope’s MLAD, a source can be anything: a user, device, department, etc. In the real world, different sources tend to be very different in their normal behavior. In our experience, this fine-grained study of each source separately greatly improves signal-to-noise ratios.

Scalability: The algorithm we use can process tens of thousands of events per second per tenant/model thread on standard hardware. We can run hundreds of such threads in parallel and scale horizontally as we grow. We can evaluate the probability of any event against all prior historical events in near-constant time: the time to calculate it is linear in the number of dimensions in the event, not in the size of the history. We detect anomalies as they come in, at a speed that is a small constant multiplier over plain I/O of the same data.

User-friendliness: Each anomalous event is dissected and explained in-context using “smoking gun” evidence. For example, we may say, “This event is highly unusual (1 in 9.67 million likelihood) because, for this particular user, the source location is unusual, and the time of day is unusual, and this application has never been used before.” We do this while contrasting rare and unusual events with normal or common patterns. We don’t pass judgment on the maliciousness of an event; we only focus on likelihoods based on historical evidence. It is up to the user, given the information they have (and we don’t) to decide whether to take action on the information our anomaly-detection platform provides.
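An explanation of this kind can be assembled from per-feature probabilities. The sketch below multiplies them under a naive independence assumption and lists the features that fall below a cutoff; the probabilities, the 0.01 cutoff, and the function name are invented for illustration and do not reproduce the figure quoted above.

```python
def explain(event, feature_probs, unusual_below=0.01):
    """Build a plain-language explanation from per-feature probabilities.

    Assumes every probability is positive (for example, smoothed estimates)
    and treats features as independent, which is a simplification.
    """
    likelihood = 1.0
    reasons = []
    for name, p in feature_probs.items():
        likelihood *= p
        if p < unusual_below:
            reasons.append(f"the {name} ({event[name]}) is unusual")
    one_in = round(1.0 / likelihood)
    text = f"This event is highly unusual (1 in {one_in:,} likelihood)"
    if reasons:
        text += " because, for this source, " + " and ".join(reasons)
    return one_in, reasons, text + "."


event = {"source location": "Reykjavik", "time of day": "03:12",
         "application": "NewApp", "activity type": "download"}
probs = {"source location": 0.002, "time of day": 0.005,
         "application": 0.001, "activity type": 0.9}

one_in, reasons, text = explain(event, probs)
print(text)
```

The common feature (“activity type”) contributes to the overall likelihood but is left out of the explanation, so the operator sees only the evidence that made the event rare.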

Dog food: As we were developing and testing MLAD at Netskope, we came across several interesting revelations. Users were downloading sensitive files from one app and then re-uploading those files to a separate app. After taking a closer look, we found that sensitive data may have been exfiltrated; file names included “Strategic Plan.pdf,” “passwords.txt” and “XYZ_litigation.docx.” This was one of the early indications that MLAD was kicking, and we were on the right track. Since then we’ve been discovering other unusual patterns that we had not anticipated before seeing MLAD in action.

“Solving security” is a tall order. Complex systems with many applications and users, and millions of possible access patterns can never be 100% secure. Our mission is to keep improving our tools. MLAD is one of these tools. We hope it will keep getting better and help our customers in their quest to keep their systems more secure.
