AWS S3 Access Management
Access control on Amazon S3 is subtle and complex. Here’s a high-level overview of the access controls that can be placed on S3 buckets and objects.
Background
Amazon Simple Storage Service (S3) behaves like a WebDAV filesystem. It’s ideal for files that are written once (or rarely) and read many times. Unlike a block-storage device, an S3 bucket has no fixed size; you’ll never see a “filesystem full” message from S3. There are even tools that allow a computer to make an S3 bucket look like a mounted directory. This approach makes reading from and writing to S3 seem like normal file operations, but no one should mistake S3 for a high-performance filesystem. S3 involves very high latencies and slow write operations.
Terms:
- Bucket: top-level “directory” of an object store. A bucket may contain objects or subdirectories.
- Object: A file or file fragment that exists within a bucket. Each object has a unique URL for access.
Beyond a simple file store, S3 offers some powerful services. S3 can
- provide BitTorrent access to any file under 5 GB in size
- provide version control for objects
- specify a lifecycle for moving an object from “frequent access” to “infrequent access” to “cold storage”; transitions can be made manually or automatically based on object age
- encrypt files at rest
- specify the level of redundancy for an object by choosing its storage class, or replicate objects to another bucket (replication requires version control)
- tag objects with metadata
- turn a bucket into a web site
- accelerate access to buckets
- provide notifications and logging based on user-provided criteria
Controlling Access
S3 access controls are as complex as the underlying service. ACLs can be placed on lifecycle management, replication, tagging, web site availability, versioning, and even on ACLs themselves. The upside is that access policies and controls can be very subtle and complex; the downside is that users are given enough rope to hang themselves.
Of First Importance
It’s crucial to know the difference between policies and ACLs.
- Policy: Every bucket may have a base policy that governs access in the absence of ACLs specific to objects or subdirectories. By default, S3 makes buckets and their objects private: only the owner can read from and write to them.
- ACL: Every bucket and object can be assigned its own ACL, which is evaluated alongside the bucket’s policy and can grant access that the policy does not. (That’s right: the policy may grant nothing to the public, but a more permissive ACL placed on the bucket itself still opens it up. The one thing an ACL cannot do is undo an explicit Deny in a policy.) A bucket may be declared private, but any and all objects within the bucket may be readable and/or writable by other users or even anonymous users.
It’d be cool to offer a tool that could read a bucket’s policy and then list files within the bucket that override that policy.
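The check at the heart of such a tool can be sketched in a few lines of Python. This is only a sketch: it assumes the ACL has already been fetched (for instance with aws s3api get-object-acl) and parsed from JSON, and the function name open_grants and the sample ACL are made up for illustration.

```python
# Group URIs that S3 uses for "anyone" and "any authenticated AWS user".
ALL_USERS = "http://acs.amazonaws.com/groups/global/AllUsers"
AUTHENTICATED_USERS = "http://acs.amazonaws.com/groups/global/AuthenticatedUsers"


def open_grants(acl):
    """Return (group, permission) pairs for grants that reach beyond named accounts."""
    found = []
    for grant in acl.get("Grants", []):
        grantee = grant.get("Grantee", {})
        uri = grantee.get("URI")
        if grantee.get("Type") == "Group" and uri in (ALL_USERS, AUTHENTICATED_USERS):
            # Keep only the last path segment of the URI, e.g. "AllUsers".
            found.append((uri.rsplit("/", 1)[-1], grant["Permission"]))
    return found


# Hypothetical ACL document with one owner grant and one public-read grant.
sample_acl = {
    "Owner": {"DisplayName": "eliza", "ID": "example-owner-id"},
    "Grants": [
        {"Grantee": {"Type": "CanonicalUser", "ID": "example-owner-id"},
         "Permission": "FULL_CONTROL"},
        {"Grantee": {"Type": "Group", "URI": ALL_USERS},
         "Permission": "READ"},
    ],
}

print(open_grants(sample_acl))
```

Running this over every object in a bucket, and reporting the keys for which the list is not empty, would give a first approximation of the tool described above.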
Canned ACLs
Amazon suspects that most access controls will be similar, so S3 offers a set of predefined grants, aka canned ACLs. They can easily make a bucket private (the default), public read-only, public read-write, read-only for authenticated users, readable by Amazon EC2 for fetching machine image bundles, etc.
Chances are, most buckets will be well served by these ACLs. My recommendation is to turn to them first. The canned ACLs are also easy to manage from the Web GUI; they can also be managed from the command line using the AWS command line interface:
aws s3api put-bucket-acl \
--acl private --bucket elizasbucket
aws s3api put-object-acl \
--acl public-read --bucket elizasbucket --key publicdir/index.html
Policies: the Available Knobs
An S3 policy has four main sections:
- Resources covered: buckets, objects
- Action requested (read, write, etc)
- Test of conditions (optional)
- Effect: either Allow or Deny the request
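As a rough illustration of how those sections fit together, the Python sketch below assembles a single statement and prints it as JSON. The helper name make_statement and the bucket name are hypothetical. Bucket policies also carry a Principal element saying who the statement applies to, so the sketch includes one and defaults it to everyone.

```python
import json


def make_statement(effect, actions, resources, principal="*", condition=None):
    """Assemble one policy statement from the sections listed above."""
    if effect not in ("Allow", "Deny"):
        raise ValueError("Effect must be 'Allow' or 'Deny'")
    statement = {
        "Effect": effect,
        "Principal": principal,
        "Action": actions,
        "Resource": resources,
    }
    # The condition block is optional, so it is only added when supplied.
    if condition is not None:
        statement["Condition"] = condition
    return statement


policy = {
    "Version": "2012-10-17",
    "Statement": [
        make_statement("Allow", "s3:GetObject", "arn:aws:s3:::mybucket/Public/*"),
    ],
}

print(json.dumps(policy, indent=2))
```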
Resources
In an S3 policy, access controls can be applied to
- An entire bucket
- Subdirectories within a bucket
- Objects: all of them, a group in a subdirectory, grouped by name (e.g., *.pdf), or individually
The wildcard syntax for specifying buckets or objects looks similar to that used by command-line tools to specify files:
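arn:aws:s3:::mybucket/Public/*
→ all directories and objects in a root-level subdirectory named “Public”
arn:aws:s3:::mybucket/*.pdf
→ all PDF files in the bucket; unlike a shell glob, the * wildcard also matches the / separator, so PDFs inside subdirectories are included
arn:aws:s3:::mybucket/Private
→ a root-level object named “Private”; the contents of a subdirectory with that name need a separate Private/* pattern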
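To make the matching rules concrete, here is a minimal Python sketch of wildcard matching against resource ARNs. It assumes, as the IAM documentation describes, that * matches any run of characters (including /) and ? matches exactly one character. The helper name arn_matches and the object keys are invented for illustration, and this is not a claim about how AWS implements the match.

```python
import re


def arn_matches(pattern, arn):
    """True if arn matches a resource pattern that uses * and ? wildcards."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")   # any run of characters, slashes included
        elif ch == "?":
            parts.append(".")    # exactly one character
        else:
            parts.append(re.escape(ch))
    return re.fullmatch("".join(parts), arn, flags=re.DOTALL) is not None


examples = [
    ("arn:aws:s3:::mybucket/Public/*", "arn:aws:s3:::mybucket/Public/img/logo.png"),
    ("arn:aws:s3:::mybucket/*.pdf", "arn:aws:s3:::mybucket/reports/2016/q3.pdf"),
    ("arn:aws:s3:::mybucket/Private", "arn:aws:s3:::mybucket/Private"),
    ("arn:aws:s3:::mybucket/Private", "arn:aws:s3:::mybucket/Private/key.txt"),
]

for pattern, arn in examples:
    print(pattern, arn, arn_matches(pattern, arn))
```

The first three pairs match and the last does not, which is why a policy that should cover a subdirectory and everything under it usually lists both the bare name and the name followed by /*.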
Action Requested
The S3 REST API Introduction details all of the many possible actions that can be requested. Generally speaking, actions apply to
- the S3 service itself: a read-only list of the buckets owned by the requester
- buckets: read or write access to their ACLs, lifecycle, policy, versioning, replication, etc.
- objects: read or write access to the objects themselves, their ACLs, etc.
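One practical consequence of this split is that a bucket-level action only takes effect on a bucket ARN, and an object-level action only on an object ARN. The Python sketch below flags actions in a statement that have no resource of the right kind. The action names are real S3 actions, but the lookup table is a small hand-picked sample, the function names are invented, and the check is a rough heuristic that ignores wildcards in bucket names.

```python
S3_ARN_PREFIX = "arn:aws:s3:::"

# A handful of S3 actions, grouped by the kind of resource they act on.
ACTION_LEVEL = {
    "s3:ListBucket": "bucket",
    "s3:GetBucketAcl": "bucket",
    "s3:PutBucketPolicy": "bucket",
    "s3:GetObject": "object",
    "s3:PutObject": "object",
    "s3:GetObjectAcl": "object",
}


def resource_level(arn):
    """Classify an S3 ARN as 'bucket' or 'object' by whether it contains a key part."""
    if not arn.startswith(S3_ARN_PREFIX):
        raise ValueError("not an S3 ARN: " + arn)
    return "object" if "/" in arn[len(S3_ARN_PREFIX):] else "bucket"


def unmatched_actions(actions, resources):
    """Return the actions that have no resource of the matching level."""
    levels = {resource_level(r) for r in resources}
    return [a for a in actions if ACTION_LEVEL[a] not in levels]


# s3:ListBucket needs the bucket ARN itself, not just the objects under it.
print(unmatched_actions(["s3:ListBucket", "s3:GetObject"],
                        ["arn:aws:s3:::mybucket/*"]))
```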
Conditions and Tests
AWS makes available at each request a variety of information for testing. At a high level, that information covers
- Date and time of request
- Identity of requestor (user name/ID, IP address, Amazon Resource Name, browser UserAgent)
- Security of request (https? temporary security credentials)
All of those variables can be tested, e.g.,
- Is this user allowed?
- Is this IP address allowed?
- Does the request come at the right time of day or month?
- Does the browser UserAgent meet our criteria?
- Is this request encrypted?
The IAM Policy Variables Overview provides more details about the specific conditions that can be tested.
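To show what such a test amounts to, here is a minimal Python sketch that evaluates a condition block against the facts known about a request. It covers only three operators (IpAddress, NotIpAddress and Bool) and assumes every tested key is present in the request. The function names and the request dictionary are illustrative stand-ins, not an AWS API.

```python
import ipaddress


def ip_in_any(source_ip, ranges):
    """True if source_ip falls inside at least one address or CIDR range."""
    addr = ipaddress.ip_address(source_ip)
    return any(addr in ipaddress.ip_network(r, strict=False) for r in ranges)


def condition_holds(condition, request):
    """Evaluate a small subset of condition operators; all tests must pass."""
    for operator, tests in condition.items():
        for key, expected in tests.items():
            actual = request[key]
            values = expected if isinstance(expected, list) else [expected]
            if operator == "IpAddress":
                ok = ip_in_any(actual, values)
            elif operator == "NotIpAddress":
                ok = not ip_in_any(actual, values)
            elif operator == "Bool":
                ok = str(actual).lower() in [str(v).lower() for v in values]
            else:
                raise NotImplementedError(operator)
            if not ok:
                return False
    return True


# Hypothetical request context: an HTTPS request from a documentation-range IP.
request = {"aws:SourceIp": "203.0.113.9", "aws:SecureTransport": "true"}

print(condition_holds({"IpAddress": {"aws:SourceIp": "203.0.113.0/24"}}, request))
print(condition_holds({"Bool": {"aws:SecureTransport": "false"}}, request))
```

The first call passes because the address lies inside the given range; the second fails because the request arrived over HTTPS.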
Command-line Example
First, create a policy that allows anonymous read-only access to the site-src subdirectory in the bucket named bucket-02, with the exception that the site-src/mgmt subdirectory and its contents are off-limits except to users from a couple of specific IPv4 addresses.
{
  "Version": "2012-10-17",
  "Id": "Policy1473195819235",
  "Statement": [
    {
      "Sid": "Stmt1473195808073",
      "Effect": "Allow",
      "Principal": { "AWS": "*" },
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::bucket-02/site-src/*"
    },
    {
      "Sid": "Stmt1473195808074",
      "Effect": "Deny",
      "Principal": { "AWS": "*" },
      "Action": "s3:*",
      "Resource": [
        "arn:aws:s3:::bucket-02/site-src/mgmt",
        "arn:aws:s3:::bucket-02/site-src/mgmt/*"
      ],
      "Condition": {
        "NotIpAddress": {
          "aws:SourceIp": [ "104.131.178.116", "162.209.76.176" ]
        }
      }
    }
  ]
}
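The second stanza in the statement, which denies most access to the mgmt subdirectory, uses a double negative to accomplish its goal: “Deny access to everyone who doesn’t come from IPv4 address 104.131.178.116 or 162.209.76.176.” Because an explicit Deny takes precedence over any Allow, this stanza carves the mgmt subdirectory out of the anonymous read access granted by the first stanza.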
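The interplay of the two stanzas can be traced with a small Python sketch. It models only anonymous s3:GetObject requests against the policy above, reduces each stanza to its effect, resources and IP exception, and applies the rule that an explicit Deny beats an Allow. The function names and the test keys are made up for illustration; real policy evaluation involves more inputs than this.

```python
import ipaddress
import re

EXEMPT_IPS = ["104.131.178.116", "162.209.76.176"]

# The two stanzas of the policy, reduced to what this sketch needs.
STATEMENTS = [
    {"Effect": "Allow",
     "Resource": ["arn:aws:s3:::bucket-02/site-src/*"],
     "NotIpAddress": None},
    {"Effect": "Deny",
     "Resource": ["arn:aws:s3:::bucket-02/site-src/mgmt",
                  "arn:aws:s3:::bucket-02/site-src/mgmt/*"],
     "NotIpAddress": EXEMPT_IPS},
]


def matches(pattern, arn):
    """Match an ARN against a pattern in which * stands for any characters."""
    regex = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    return re.fullmatch(regex, arn) is not None


def applies(statement, arn, source_ip):
    """True if the statement covers this resource and its IP condition holds."""
    if not any(matches(p, arn) for p in statement["Resource"]):
        return False
    exempt = statement["NotIpAddress"]
    if exempt is None:
        return True
    addr = ipaddress.ip_address(source_ip)
    # NotIpAddress holds only when the caller is none of the listed addresses.
    return all(addr != ipaddress.ip_address(ip) for ip in exempt)


def decide(key, source_ip):
    """Return 'Allow' or 'Deny' for an anonymous GetObject request."""
    arn = "arn:aws:s3:::bucket-02/" + key
    effects = [s["Effect"] for s in STATEMENTS if applies(s, arn, source_ip)]
    if "Deny" in effects:
        return "Deny"    # an explicit Deny always wins
    if "Allow" in effects:
        return "Allow"
    return "Deny"        # nothing matched: denied by default


for key, ip in [("site-src/index.html", "203.0.113.9"),
                ("site-src/mgmt/admin.html", "203.0.113.9"),
                ("site-src/mgmt/admin.html", "104.131.178.116")]:
    print(key, ip, decide(key, ip))
```

In this model the public page is readable from anywhere, while the mgmt page is readable only from one of the two listed addresses.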
Then apply the policy:
# create our bucket
aws s3 mb s3://bucket-02
# apply our policy file
aws s3api put-bucket-policy --bucket bucket-02 \
--policy file://bucket-policy.json
# upload our content
aws s3 sync /tmp/site-src s3://bucket-02/site-src