Over-Privileged Service Accounts Create Escalation of Privileges and Lateral Movement in Google Cloud

Introduction

In this blog, we analyze data from Netskope customers that includes the security settings of over 1 million entities in 156,737 Google Cloud (GCP) projects across hundreds of organizations (see Dataset and Methodology for more details on the dataset).

We will specifically look at the configuration of service accounts, see what’s commonly occurring in the real world, and analyze how multiple security misconfigurations can lead to escalation of privileges and lateral movement. In this Netskope dataset, we observed:

Service Accounts with user-managed keys: Over 38% (4,160) of service accounts use user-managed keys, which are often not needed, require work to keep secure (e.g. key rotation), and create a larger attack surface for compromised credentials. Service accounts with user-managed keys are common, existing in 82% of the customers in this dataset.

Service Accounts with old user-managed keys: Of the 4,160 service accounts with user-managed keys, 89% have keys older than 90 days, and over 71% have keys older than a year, with the oldest key at 8 years, 1 month. All of this shows how hard it is to keep up with best practices when self-managing keys, which in turn raises the risk of compromised-credential attacks.

Over-privileged service accounts with old, user-managed keys: Of the 4,160 service accounts with user-managed keys, 26% (1,102) not only have old keys but are also over-privileged with project-level administrator privileges, such as the project owner/editor roles. This greatly increases the impact after compromise, as it allows wider lateral movement and escalation of privileges.

Over-privileged service accounts with old, user-managed keys and access to multiple projects: Of these 1,102 high-risk service accounts that are over-privileged with old user-managed keys, 10% (110) have access privileges to multiple projects, providing more opportunities for lateral movement.

Service account access risk from users: Access to over-privileged service accounts (item #3) can be gained not just from service account key compromise (items #1 and #2), but also from compromise of user accounts. 92 (1.3%) users have the iam.serviceAccountUser or iam.serviceAccountTokenCreator roles at the project level, allowing them to access or impersonate ALL service accounts within a project.

As we can see, there is a cumulative, interrelated risk from the above misconfigurations around service accounts. We’ll be looking closer at:

Service account design in GCP
The interdependencies of these four controls
How misconfigurations in the four controls can be chained by attackers to gain broader access to your GCP environment

Service Accounts

Before we dive into the data and risk analysis, we need to be clear on what service accounts are and how they’re used within GCP. Service accounts are security principals created for use by user scripts, applications, or Google services such as virtual machines. The roles and permissions granted to a service account govern the access of the scripts, applications, or services that use it. Services such as virtual machines can have service accounts attached to them so they run as that particular service account. They differ from user accounts in that:

They do not have passwords
Instead, they use RSA public/private key pairs which are used for authentication for Google API access
Users can impersonate or “run as” a service account
Finally, they are resources so user access to service accounts can be protected by IAM policies

Types

There are two types of service accounts:

User-managed. These service accounts include default service accounts, for example, a default service account will be created when you enable the Compute Engine API in a project. However, most user-managed service accounts are created by users. The user is responsible for all management including the creation and rotation of keys.
GCP-managed (and created). An example of a GCP-managed service account is the Google APIs Service Agent, with an email address that has a format like: PROJECT_NUMBER@cloudservices.gserviceaccount.com. This service account runs internal Google processes on behalf of users. It is not listed in the Service Account section in the Console and is not modifiable or accessible by the user.

Keys

User-managed keys: User-managed service accounts can have optional, user-created (and user-managed) keys, which are commonly used by applications that run outside of GCP to authenticate as the service account and access your GCP environment. Each key is a public/private RSA key pair that serves as the credential for authenticating as the service account. The private key is typically delivered as a downloadable .json file, and that file is the main point of compromise for a service account with user-managed keys. User-managed keys create risk in two areas:

They involve a key file (credentials) that is often stored in an insecure manner on a less-secure endpoint running the code that needs it, and
The management of the keys relies on the user. Key rotation, as one example, is a management task that is usually not done by the user, as we see in our dataset.
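To make the first risk concrete, here is a minimal sketch of how one might recognize a downloaded key file when scanning an endpoint. It assumes the key file is a JSON object carrying "type", "private_key", and "client_email" fields with "type" set to "service_account"; the function name and marker set are our own illustrative choices, not part of any Google tooling.

```python
import json

# Fields we assume are present in a downloaded service account key file.
KEY_FILE_MARKERS = {"type", "private_key", "client_email"}


def looks_like_service_account_key(text: str) -> bool:
    """Return True if text parses as JSON that resembles a service account key file."""
    try:
        data = json.loads(text)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return False
    if not isinstance(data, dict):
        return False
    # All marker fields must be present and the type must identify a service account.
    return KEY_FILE_MARKERS.issubset(data) and data.get("type") == "service_account"
```

A check like this could be run over candidate .json files on laptops or build servers to find key files that have drifted outside the protections of the GCP environment.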

Google-managed keys: Google creates and manages keys for GCP-managed service accounts as well as user-managed service accounts, and these are used under the hood for the service attachment and impersonation of service accounts. The difference is that when Google manages the keys, it is more secure because private keys are not shipped around and the management (e.g. key rotation) of the keys is more likely to be done.

External vs. Internal

Service accounts that run inside GCP have more secure methods that do not involve downloadable key files containing private keys. Services that run within GCP, such as Compute Engine VMs, can be configured and attached to service accounts and the services will effectively run as that service account. No explicit user keys are needed in this case. Service account keys are described by Google in depth.

However, for scenarios where applications must run outside of GCP and do not have human interaction/authentication, the commonly used option is to have a user-managed key that results in a key file that is collocated or accessible by the application.

Considerations

When using service accounts, consider:

Are service accounts really needed in the first place?
Is the application or service inside or outside of the GCP environment?
Relatedly, can you have GCP manage the service account keys, rather than using user-managed keys?
Are you using Kubernetes workloads (GKE)?

Google has good guidance in its best practices for using and managing service accounts.

Generally, much of the risk with service accounts comes from abuse of user-managed keys. Their key files contain private keys that are downloaded, typically to locations outside of the GCP environment, and used by client application code. This creates a significant risk of credential compromise, and multiple CIS IAM controls aim to secure user-managed service account keys in order to reduce the chances of abuse should they be compromised. We’ll discuss this next.

CIS IAM Controls

The backdrop of the service account configurations highlighted in the introduction is the broader IAM controls in the CIS Foundations Benchmark for GCP v1.2. We analyzed 10,783 service accounts, 207,538 policies, and all generated encryption keys across 156,737 projects in several hundred organizations. Some of the key controls in the IAM section and the violations from our dataset are shown in the following table:

Table showing some of the key controls in the IAM section and the violations from our dataset

Controls 1.4 through 1.7 capture several recommended best practices regarding service accounts:

Control 1.4: Use GCP-managed service account keys whenever possible, since GCP will handle key rotation automatically and local key files will not be created.
Control 1.5: Ensure that service accounts do not run with broad admin privileges, so that it is less likely that compromised service accounts can be used for lateral movement and escalation of privileges.
Control 1.6: Ensure that users do not have access to all service accounts at the project level, as this almost always is giving users overly-broad privileges.
Control 1.7: If user-managed service account keys are used, ensure that the keys are rotated every 90 days or less, which reduces the attack surface for compromised credentials.

As we can also see from the table, some practices such as KMS key security and use of personal login credentials (e.g. @gmail.com) within policies have low numbers of violations and overall expose less risk.

By contrast, service account practices (1.4-1.7) are lagging and introduce risk in a significant (30+%) number of the 10,783 service accounts. This risk comes not only from the higher numbers of violations but also the cumulative risk from violations among these interdependent controls, as we’ll discuss below.

Service Account Risk

What’s of particular interest is how service accounts pose a common attack surface for access to your GCP environment, how various configuration options can increase the attack surface, and how multiple security issues can be chained together and exploited by an attacker to gain elevated privileges and access to all resources in a project.

Let’s go back and analyze the controls and risks associated with service accounts in our dataset.

1. User-managed keys create service account attack surface

An attacker will focus on service accounts with user-managed keys, because those keys offer a higher probability of compromise and of gaining access to the service account. A user-managed key has a JSON key file containing the private key, which is often downloaded and stored on disk local to the external application or script that needs access to the GCP environment. This attack surface is similar to any credential file stored locally on endpoints like laptops or work computers, outside the protections of your GCP environment’s IAM controls.

In our dataset, we found that 38% (4,160) of the 10,783 total service accounts had user-managed keys:

Graphic showing how many service accounts also had user-managed keys

38% shows that a large percentage of the overall service accounts have user-managed keys. In many cases, there are alternatives (to be discussed in a future blog) that should be explored to reduce the number of user-managed keys.

Note that service accounts with user-managed keys were seen in 82% of the customers (several hundred) in this dataset. It is not the result of a few bad organizations but rather it’s a common practice, and is our starting point for characterizing the service account attack surface.

2. Older keys increase the attack surface

Of the 4,160 service accounts with user-managed keys, 89% (3,721) of the service accounts have keys older than 90 days. 90 days is the recommended key rotation period in the CIS Benchmark.

Graphic showing how many of the service accounts with user-managed keys had keys that were more than 90 days old

Keys that are not rotated frequently increase the risk for service account compromise by increasing the time window for the validity of a compromised key.

Note that GCP-managed keys are rotated approximately every two weeks automatically. A frequent and consistent key rotation mitigates lost or stolen keys since they will be invalid after the next rotation cycle. This highlights the difficulty of managing key rotation by yourself.
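The 90-day check is straightforward to express in code. The sketch below assumes each key is available as a dictionary with "keyType" and "validAfterTime" fields, which is our reading of how the IAM API describes service account keys; the helper names are hypothetical, and the current time is passed in so the result is reproducible.

```python
from datetime import datetime, timedelta, timezone

ROTATION_LIMIT = timedelta(days=90)  # rotation period recommended by the CIS Benchmark


def parse_timestamp(value: str) -> datetime:
    """Parse an RFC 3339 UTC timestamp such as '2021-01-15T10:00:00Z'."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def is_stale_user_key(key: dict, now: datetime) -> bool:
    """True for a user-managed key created more than 90 days before `now`."""
    if key.get("keyType") != "USER_MANAGED":
        return False  # Google-managed keys are rotated for you
    created = parse_timestamp(key["validAfterTime"])
    return now - created > ROTATION_LIMIT


# Example: evaluate a key as of the end of the sampling period.
as_of = datetime(2021, 9, 26, tzinfo=timezone.utc)
example_key = {"keyType": "USER_MANAGED", "validAfterTime": "2021-01-15T10:00:00Z"}
print(is_stale_user_key(example_key, as_of))
```

Passing the evaluation time explicitly, rather than calling the clock inside the function, keeps the check easy to test and to rerun against historical snapshots.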

3. Really old keys (no rotation) create a very high impact from compromised keys

Over 71% of user-managed keys are older than a year, suggesting that the keys will never be rotated. It means that if the keys are ever compromised, they likely will still be valid, and the overall impact from compromise is even higher.

The age does make a difference. Is something just a little out of date at 92 days and will be rotated soon or is it a year old and probably never going to be rotated? The age of user-managed service account keys that are 90 days or older, breaks down into these age bands (% out of 3,721 old keys):

Pie chart showing how old the service account keys were.

Table outlining how old the service account keys were
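For readers who want to produce a similar breakdown from their own environment, here is a rough sketch of bucketing key ages into bands. The band edges and labels below are hypothetical choices for illustration and are not necessarily the bands used in the chart above.

```python
from collections import Counter

# Hypothetical band edges in days (upper bound inclusive) with their labels.
BANDS = [
    (90, "90 days or less"),
    (365, "90 days to 1 year"),
    (730, "1 to 2 years"),
]
OLDEST_BAND = "over 2 years"


def age_band(age_days: int) -> str:
    """Return the label of the first band whose upper bound covers the age."""
    for upper, label in BANDS:
        if age_days <= upper:
            return label
    return OLDEST_BAND


def band_distribution(ages_days) -> dict:
    """Return the fraction of keys that falls in each band."""
    counts = Counter(age_band(age) for age in ages_days)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}
```

Looking at the distribution rather than a single pass/fail count is what separates a key that is slightly overdue from one that has effectively never been rotated.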

The inaugural Rumpelstiltskin Insecurity Award goes to a service account with a key that is over 8 years and 1 month old.

The fuller risk profile from old keys is not apparent from just looking at the failure counts from the CIS Benchmark. By showing some of the breakdown/distribution, we see that it is a very serious problem. Key rotation within 90 days is rare (5%) for user-managed keys, and most (65%) of user-managed keys are likely never rotated.

4. Your fault, my fault, we all scream about defaults: insecurity by default

Part of the challenge could be that user-managed keys have a default expiration date of Dec 31, 9999 (essentially no expiration), allowing customers to easily continue to use these keys. Default settings often dictate default security practices and we seem to see this effect clearly here.
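A quick way to spot keys that still carry this default is to look at the expiration timestamp. The sketch below assumes the key's expiration is exposed as an RFC 3339 string in a "validBeforeTime" field, and the function name is our own.

```python
def never_expires(key: dict) -> bool:
    """True if the key's expiration timestamp falls in the year 9999."""
    # A missing field is treated as "unknown", not as "never expires".
    return key.get("validBeforeTime", "").startswith("9999-")


example_key = {"keyType": "USER_MANAGED", "validBeforeTime": "9999-12-31T23:59:59Z"}
print(never_expires(example_key))
```

Comparing the string prefix avoids date arithmetic near the maximum representable year, which some date libraries handle poorly.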

5. Over-privileged service accounts create an even higher impact from compromise

Let’s look at the potential impact from the privileges that could be obtained by compromising service accounts. In this dataset, here is how the service accounts that have old keys overlap with service accounts that are over-privileged:

Venn diagram showing how many service accounts are over-privileged and also have old keys

Almost 30% (1,102 out of 3,721) of the service accounts with old, user-managed keys are also over-privileged. So if old keys are compromised there’s nearly a 30% chance that the service account has an over-privileged administrator role.

What does it mean to be over-privileged? The CIS Benchmark checks for the classic project owner/editor roles as well as the numerous service admin roles such as cloudkms.admin, cloudsql.admin, and compute.securityAdmin.
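As a rough approximation of that check, the sketch below flags service accounts bound to the owner or editor roles, or to any role whose name ends in "admin" (case-insensitive). It assumes an IAM policy shaped as a dictionary with a "bindings" list of role/members entries; the suffix match is our simplification and not the Benchmark's exact audit procedure.

```python
BROAD_ROLES = {"roles/owner", "roles/editor"}


def is_admin_role(role: str) -> bool:
    """Heuristic: owner/editor, or any role name ending in 'admin' or 'Admin'."""
    return role in BROAD_ROLES or role.lower().endswith("admin")


def over_privileged_service_accounts(policy: dict) -> dict:
    """Map each service account member to the admin-like roles granted to it."""
    findings = {}
    for binding in policy.get("bindings", []):
        role = binding["role"]
        if not is_admin_role(role):
            continue
        for member in binding.get("members", []):
            # Only service accounts are in scope here; users are handled separately.
            if member.startswith("serviceAccount:"):
                findings.setdefault(member, []).append(role)
    return findings
```

Returning the roles per account, instead of a plain yes/no, makes it easier to see which grants to remove first.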

Here is the “Top-26 Most Granted Admin Privileges” List:

Table showing the Top-26 Most Granted Admin Privileges

Over 41% of the 1,102 over-privileged service accounts have full admin privileges on all resources within the project by virtue of having project owner or editor roles, which give broad administrative privileges on all resources within a project and are unfortunately still granted by default in many cases.
Over 50% of the service accounts have full administrative privileges on buckets, bucket objects, or BigQuery datasets.
Over 44% of the service accounts have full compute admin privileges, which are a common target for escalating privileges and moving laterally.

All of this greatly increases the impact after compromise, as it allows wider lateral movement and escalation of privileges.

6. Service accounts with access to multiple projects increase potential lateral movement

Of these 1,102 high-risk service accounts that are over-privileged with old user-managed keys, 10% (110) have access privileges to multiple projects, providing more opportunities for lateral movement.

These 110 service accounts have access to 1,283 unique projects. To give an idea of the distribution and skew, the service account with the most project access is a service account used for management/automation (Terraform) with access to 522 projects. There are 20 service accounts out of the 110 that have access to 10 or more projects, with the mean being 14 and the median 2, reflecting some skew upwards from the top 20 service accounts.
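The multi-project view can be derived by inverting per-project IAM policies. The sketch below assumes a mapping of project ID to policy dictionary (each with a "bindings" list) and treats any binding in a project's policy as access to that project; the function names and the toy numbers in the usage note are hypothetical.

```python
from collections import defaultdict
from statistics import mean, median


def projects_per_service_account(policies: dict) -> dict:
    """Given project ID -> IAM policy, return service account -> set of project IDs."""
    access = defaultdict(set)
    for project_id, policy in policies.items():
        for binding in policy.get("bindings", []):
            for member in binding.get("members", []):
                if member.startswith("serviceAccount:"):
                    access[member].add(project_id)
    return dict(access)


def multi_project_accounts(access: dict) -> dict:
    """Keep only service accounts that can reach more than one project."""
    return {member: projects for member, projects in access.items() if len(projects) > 1}


def skew_summary(project_counts) -> tuple:
    """Return (mean, median) of per-account project counts."""
    return mean(project_counts), median(project_counts)
```

As a toy illustration of the skew, skew_summary([2, 2, 2, 50]) gives a mean of 14 and a median of 2: a single automation account with wide reach pulls the mean far above what a typical account can access.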

7. Users with broad access to service accounts at the project level create more attack vectors for compromising service accounts with admin privileges

Access to over-privileged service accounts (item #5) can be gained not just from service account key compromise (items #1 and #2), but also from compromise of user accounts.

92 (1.3%) users have the iam.serviceAccountUser or iam.serviceAccountTokenCreator roles at the project level, allowing them to access or impersonate ALL service accounts within a project.

These 92 users exist in 61 projects. Of those 61 projects, 58 have 361 over-privileged admin accounts.

If the 92 users are compromised, they allow access to 361 over-privileged service accounts in 58 projects.
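Finding these users amounts to scanning the project-level IAM policy for the two roles named above. The sketch below assumes the policy is a dictionary with a "bindings" list and that human users appear as members with a "user:" prefix; the function name is our own.

```python
IMPERSONATION_ROLES = {
    "roles/iam.serviceAccountUser",
    "roles/iam.serviceAccountTokenCreator",
}


def users_with_project_wide_sa_access(project_policy: dict) -> set:
    """Return users granted an impersonation role in a project-level policy."""
    users = set()
    for binding in project_policy.get("bindings", []):
        if binding["role"] in IMPERSONATION_ROLES:
            # A project-level grant applies to every service account in the project.
            users.update(m for m in binding.get("members", []) if m.startswith("user:"))
    return users
```

The same roles granted on an individual service account resource, rather than on the project, would not show up in this scan and are the narrower alternative.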

Service Account Attack Paths

The preceding discussion started with CIS controls and compliance violations but has highlighted two different attack paths that can take advantage of chaining together potential service account misconfigurations. Below we describe these attack paths as well as the potential impact/reach characterized by the numbers of projects and key entities at risk:

Diagram outlining attack paths and the potential impact/reach characterized by the numbers of projects and key entities at risk

Exploiting misconfigurations that violate CIS controls 1.4, 1.5, and 1.7:

In this attack chain, the attacker would focus on compromising credentials by attacking key files from service accounts with user-managed keys. The attacker would target service accounts with administrator privileges, and then gain access to resources within the project(s).

In this dataset, there were 4,160 service accounts with user-managed keys, 89% of them with old keys, and 1,102 of those service accounts had administrative privileges. The 1,102 service accounts had access to 1,425 projects with a significant number of resources. These numbers are aggregated across a large dataset, but serve to show how the broader impact and risk can be seen from a chained attack.
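The percentages quoted throughout this post follow directly from the counts at each step of this chain. The short calculation below uses only figures stated in the text; the variable and function names are ours.

```python
def pct(part: int, whole: int) -> float:
    """Percentage of `whole` represented by `part`, rounded to one decimal."""
    return round(100 * part / whole, 1)


TOTAL_SERVICE_ACCOUNTS = 10_783
WITH_USER_MANAGED_KEYS = 4_160
WITH_OLD_KEYS = 3_721          # keys older than 90 days
OLD_AND_OVER_PRIVILEGED = 1_102
MULTI_PROJECT = 110

funnel = {
    "user-managed keys / all service accounts": pct(WITH_USER_MANAGED_KEYS, TOTAL_SERVICE_ACCOUNTS),
    "old keys / user-managed keys": pct(WITH_OLD_KEYS, WITH_USER_MANAGED_KEYS),
    "over-privileged / old keys": pct(OLD_AND_OVER_PRIVILEGED, WITH_OLD_KEYS),
    "over-privileged / user-managed keys": pct(OLD_AND_OVER_PRIVILEGED, WITH_USER_MANAGED_KEYS),
    "multi-project / over-privileged": pct(MULTI_PROJECT, OLD_AND_OVER_PRIVILEGED),
}

for step, value in funnel.items():
    print(f"{step}: {value}%")
```

Note that the 1,102 accounts are "almost 30%" when measured against the 3,721 accounts with old keys and 26% when measured against all 4,160 accounts with user-managed keys, which is why both figures appear in this post.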

Abusing misconfigurations that violate CIS control 1.6:

In this attack chain, the attacker focuses on compromising credentials of user accounts, then gaining access to the service accounts that the users may have access to, then focusing on the over-privileged service accounts, and then gaining access to all the resources in the project(s).

In this dataset, there were 92 users who had access to all service accounts at the project level. These users were in 58 projects and those projects had 326 service accounts with over-privileged administrator roles. The 58 projects had a significant number of resources that would be at risk. These numbers are also aggregated across a larger dataset, but show how this different attack path also puts valuable assets at risk.

Research and Tools

There has been in-depth research on how to identify the risk of escalation of privileges from GCP permission chaining by Colin Estep at Black Hat Europe 2020: Assessing IAM Exposure in GCP, along with an open-source tool: https://github.com/netskopeoss/iaas_permission_mining that can provide the foundation for an operational approach to identifying IAM privilege escalation risk.

Conclusion

There are standard best practices for service accounts but many GCP environments lag behind in implementing these best practices. We’ve shown how these violations in best practices can allow attackers to chain together attack steps to gain broader access to resources in a GCP environment and that real-world data supports the feasibility of these IAM-related attack vectors.

When focused on compliance, one can lose sight of the bigger security picture. In an upcoming blog, we’ll review the best practices for securing service accounts and IAM in GCP environments, based on real-world examples.

Netskope’s Public Cloud Security platform also can automate configuration checking of your GCP environment, implementing both compliance standards, as well as custom configuration checks.

Dataset and Methodology

Time Period: Data was sampled/analyzed from June 16, 2021, through September 26, 2021.

Source: The analysis presented in this blog post is based on anonymized usage data collected by the Netskope Security Cloud platform relating to a subset of Netskope customers with prior authorization.

Data Scope: The data included 156,737 GCP projects in several hundred organizations.

The data was composed of configuration settings across 1,089,352 entities in GCP including IAM users, IAM policies, password policy, buckets, databases, logs, compute instances, and firewall rules.

Jenko Hwong

Jenko has 15+ years of experience in research, product mgmt., and engineering in cloud security, routers/appliances, threat intel, vulnerability scanning and compliance.