In AWS, there are two ways to log access to S3 storage resources, i.e. buckets and bucket objects:
- server access logging (https://docs.aws.amazon.com/AmazonS3/latest/dev/ServerLogs.html)
- bucket object-level logging using CloudTrail (https://docs.aws.amazon.com/AmazonS3/latest/user-guide/enable-cloudtrail-events.html)
However, understanding how the two differ and how to configure each one can be confusing. In this post, we’ll explore the functionality and caveats of both, and why you would choose one over the other.
Server Access Logging
Server Access Logging is similar to HTTP server logging in the kind of information it records. It answers the general question, “Who is making what type of access to which objects?” Server access logging has several limitations that make it a non-starter for production IT/security needs, but it is straightforward to understand and configure.
Figure 1. Server Access Logging Architecture
In the flow above, a PDF file is written to the bucket with a CLI or API command. Server Access Logging generates one line per bucket operation and writes it to a text log file, which is uploaded to the S3 log bucket that you specify when you configure server access logging. The logged information is from the perspective of the “server,” which you can associate with the public REST API endpoint on the AWS side. The format is loosely structured: a known ordering of space-delimited fields, with double quotes around fields that contain spaces. You can use S3 features on the logging bucket to configure data retention for the logs, but you must build your own tooling to analyze the log records and to send notifications for events of interest.
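As a rough illustration of the configuration described above, the following sketch uses the boto3 calls put_bucket_logging and put_bucket_lifecycle_configuration to turn on server access logging and expire old log files. The function names, the rule ID, and the 90-day default are assumptions made for this example, not values from AWS or from this post. The sketch also assumes the log bucket already grants the S3 log delivery service permission to write to it, which is not handled here.

```python
def build_logging_status(target_bucket, target_prefix):
    """Build the BucketLoggingStatus payload for put_bucket_logging."""
    return {
        "LoggingEnabled": {
            "TargetBucket": target_bucket,   # bucket that receives the log files
            "TargetPrefix": target_prefix,   # key prefix for delivered log files
        }
    }


def build_log_retention(prefix, days):
    """Build a lifecycle configuration that expires log files after `days`."""
    return {
        "Rules": [
            {
                "ID": "expire-access-logs",  # hypothetical rule name
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Expiration": {"Days": days},
            }
        ]
    }


def enable_access_logging(source_bucket, log_bucket, prefix, retention_days=90):
    """Enable logging on source_bucket and set retention on log_bucket."""
    import boto3  # imported here so the builders above work without boto3

    s3 = boto3.client("s3")
    s3.put_bucket_logging(
        Bucket=source_bucket,
        BucketLoggingStatus=build_logging_status(log_bucket, prefix),
    )
    # Caution: this call replaces any lifecycle configuration that already
    # exists on the log bucket.
    s3.put_bucket_lifecycle_configuration(
        Bucket=log_bucket,
        LifecycleConfiguration=build_log_retention(prefix, retention_days),
    )
```

A call such as enable_access_logging("my-data-bucket", "my-log-bucket", "access-logs/") would then apply both settings; the bucket names here are placeholders.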
Logged Operations
Server access logging provides records of the requests that are made to a bucket. Each access is recorded as one text record, with the exception of a copy (which is recorded as two records, a GET and a PUT). Each log entry has 24 fields that fall into four general categories:
- HTTP/REST operation (e.g. GET, PUT, POST, OPTIONS, etc.)
- Requester information (including user agent, AWS account, IP)
- Resource (bucket or bucket object)
- Session information (such as data size, response times, authentication type).
The information logged summarizes the access operation but does not necessarily provide full payload details. For example, a server access log entry for a PUT ACL operation on an object does not include the new ACL definition.
Example
Here is an example entry with some of the more useful fields highlighted:
Figure 2. Server Access Logging Entry
Fields
The data fields are described in the AWS documentation (https://docs.aws.amazon.com/AmazonS3/latest/dev/LogFormat.html) and summarized in this table along with example requests:
Since the fields are space-delimited with double-quote escapes, a common way to parse these records for analysis or eventing is to use regular expressions. To avoid mis-parsing the data, take care to accommodate the range and type of values for each field, and do not assume a fixed field count, since fields have been added as recently as the past 12 months.
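A minimal sketch of this kind of regular expression parsing is shown here. The field names and their order are an assumption based on the 24 fields in the AWS log format documentation linked above, and the sample line is made up for illustration rather than taken from a real log. Any fields beyond the known 24 are kept in a separate list instead of being dropped, so that newly added fields do not break the parser.

```python
import re

# Assumed names for the 24 documented fields, in order.
FIELD_NAMES = [
    "bucket_owner", "bucket", "time", "remote_ip", "requester", "request_id",
    "operation", "key", "request_uri", "http_status", "error_code",
    "bytes_sent", "object_size", "total_time", "turn_around_time", "referer",
    "user_agent", "version_id", "host_id", "signature_version",
    "cipher_suite", "authentication_type", "host_header", "tls_version",
]

# One token is a [bracketed] timestamp, a "quoted" field, or a bare word.
TOKEN_RE = re.compile(r'\[([^\]]*)\]|"((?:[^"\\]|\\.)*)"|(\S+)')


def parse_line(line):
    """Split one log line into named fields; '-' becomes None."""
    tokens = []
    for match in TOKEN_RE.finditer(line):
        bracketed, quoted, bare = match.groups()
        if bracketed is not None:
            value = bracketed
        elif quoted is not None:
            value = quoted
        else:
            value = bare
        tokens.append(None if value == "-" else value)
    record = dict(zip(FIELD_NAMES, tokens))
    # Keep any trailing fields this parser does not know about.
    record["extra_fields"] = tokens[len(FIELD_NAMES):]
    return record


# Made-up sample line in the documented field order.
SAMPLE = (
    'ownerid123 example-bucket [06/Feb/2019:00:00:38 +0000] 192.0.2.3 '
    'requesterid456 3E57427F3EXAMPLE REST.PUT.OBJECT docs/report.pdf '
    '"PUT /docs/report.pdf HTTP/1.1" 200 - - 1024 25 10 "-" '
    '"aws-cli/1.16 Python/3.7" - hostidEXAMPLE= SigV4 '
    'ECDHE-RSA-AES128-GCM-SHA256 AuthHeader '
    'example-bucket.s3.amazonaws.com TLSv1.2'
)

if __name__ == "__main__":
    entry = parse_line(SAMPLE)
    print(entry["operation"], entry["key"], entry["http_status"])
```

Numeric fields are left as strings in this sketch; converting them to integers is a reasonable next step once you have decided how to treat missing values.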
Note also that S3 server access logging is a bucket-level configuration, so certain operations are not captured, such as listing buckets (a request against the S3 service itself rather than against a bucket) and bucket creation and deletion. This is in contrast to S3 Object-Level Logging with CloudTrail, which can recor