Amazon Simple Storage Service (S3)#
Posit Package Manager can also use Amazon Simple Storage Service (S3) as a storage provider. This integration requires AWS credentials and updates to the Package Manager configuration file.
Credentials#
As a best practice, AWS recommends that you specify credentials in the following order:
- Use IAM roles for Amazon EC2 (if your application is running on an Amazon EC2 instance). IAM roles provide temporary security credentials to your instance to make AWS calls, and provide an easy way to distribute and manage credentials on multiple Amazon EC2 instances.
- Use a shared credentials file. This credentials file is the same one used by other SDKs and the AWS CLI. If you’re already using a shared credentials file, you can also use it for this purpose.
- Use environment variables. Setting environment variables is useful if you’re doing development work on a machine other than an Amazon EC2 instance.
If you select IAM roles for Amazon EC2 instances, Package Manager will automatically use the instance’s credentials.
See the AWS CLI configuration documentation for details on configuring your environment to interact with AWS.
S3 Permissions#
The credentials Package Manager uses for S3 storage must have the following permissions for the bucket:
s3:GetObject
s3:ListBucket
s3:PutObject
s3:DeleteObject
s3:AbortMultipartUpload
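The permission list above can be expressed as an IAM policy. The following is a minimal sketch, not an official Posit or AWS template: the helper name `build_s3_policy` and the bucket name `my-s3-bucket` are placeholders, and the ARNs assume the standard `aws` partition. It attaches `s3:ListBucket` to the bucket itself and the object-level actions to the keys inside it, which is how S3 evaluates those actions.

```python
import json

# Actions taken from the permission list above.
BUCKET_ACTIONS = ["s3:ListBucket"]
OBJECT_ACTIONS = [
    "s3:GetObject",
    "s3:PutObject",
    "s3:DeleteObject",
    "s3:AbortMultipartUpload",
]


def build_s3_policy(bucket: str) -> dict:
    """Return an IAM policy document granting the listed actions on one bucket.

    `bucket` is a placeholder name; the ARN assumes the standard "aws" partition.
    """
    bucket_arn = f"arn:aws:s3:::{bucket}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Bucket-level action: the resource is the bucket itself.
                "Effect": "Allow",
                "Action": BUCKET_ACTIONS,
                "Resource": bucket_arn,
            },
            {
                # Object-level actions: the resource is every key in the bucket.
                "Effect": "Allow",
                "Action": OBJECT_ACTIONS,
                "Resource": f"{bucket_arn}/*",
            },
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_s3_policy("my-s3-bucket"), indent=2))
```

Running the script prints a JSON document that can be reviewed before it is attached to the role or user whose credentials Package Manager uses.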
Environment Variables#
For testing with environment variables, create and edit a new file at /etc/systemd/system/rstudio-pm.service.d/aws.conf:
[Service]
Environment="AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE"
Environment="AWS_SECRET_ACCESS_KEY=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
Environment="AWS_DEFAULT_REGION=us-west-2"
Then, reload the systemd configuration and restart the Package Manager service with:
sudo systemctl daemon-reload
sudo systemctl restart rstudio-pm
Configuration#
On the Package Manager side, the Storage and S3Storage sections must be updated. Here is a simple example using the bucket my-s3-bucket, the region us-east-1, and a shared configuration:
[Storage]
; Sets all storage classes to use S3 instead of the `DataDir`
Default = s3
; Default S3 settings. These are the minimum required settings for using S3.
[S3Storage]
Bucket = my-s3-bucket
Region = us-east-1
EnableSharedConfig = true
Users with advanced or specific needs can configure storage classes individually. For example, the following configuration stores only internal R and CRAN packages in S3 and uses local storage for everything else:
[Storage]
Packages = s3
CRAN = s3
[S3Storage]
Bucket = my-s3-bucket
Region = us-east-1
EnableSharedConfig = true
; Override default S3 settings for the "packages" class. This demonstrates
; all the available S3 configuration settings.
[S3Storage "packages"]
Bucket = another-s3-bucket
Prefix = rspm-packages
Profile = dev-rspm
Region = us-west-1
EnableSharedConfig = true
For more information on the storage classes, refer to the appendix.
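To make the per-class example easier to follow, here is a small Python sketch that reads that configuration and reports which S3 settings each storage class would end up with. It is an illustration only, not Package Manager's own logic. The function name `effective_s3_settings` and the class name `other` are hypothetical, and the key-by-key merge of an override section over `[S3Storage]` is an assumption; for the example shown, the override section sets every key it needs, so the result does not depend on that assumption.

```python
import configparser

# The per-class configuration from the example above.
EXAMPLE = """
[Storage]
Packages = s3
CRAN = s3

[S3Storage]
Bucket = my-s3-bucket
Region = us-east-1
EnableSharedConfig = true

[S3Storage "packages"]
Bucket = another-s3-bucket
Prefix = rspm-packages
Profile = dev-rspm
Region = us-west-1
EnableSharedConfig = true
"""


def effective_s3_settings(config_text: str, storage_class: str):
    """Return the S3 settings for a storage class, or None if it is not on S3.

    Assumption: an [S3Storage "<class>"] section is merged key by key over
    the [S3Storage] defaults. configparser lowercases key names.
    """
    parser = configparser.ConfigParser()
    parser.read_string(config_text)

    storage = parser["Storage"] if parser.has_section("Storage") else {}
    # A class-specific entry wins; otherwise fall back to Default, if set.
    backend = storage.get(storage_class) or storage.get("Default")
    if backend != "s3":
        return None

    settings = dict(parser["S3Storage"]) if parser.has_section("S3Storage") else {}
    override = f'S3Storage "{storage_class.lower()}"'
    if parser.has_section(override):
        settings.update(parser[override])
    return settings


if __name__ == "__main__":
    for name in ("packages", "CRAN", "other"):
        print(name, effective_s3_settings(EXAMPLE, name))
```

In this sketch the `packages` class resolves to `another-s3-bucket`, `CRAN` falls back to the `[S3Storage]` defaults, and the hypothetical `other` class returns `None` because it has no entry in `[Storage]` and no `Default` is set.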
Encryption Key#
While all application data including package and cache information can be stored in S3, Package Manager requires an encryption key available at boot to function properly. Preserving this key between instances is necessary for both high availability and ephemeral environments.
The key is generated automatically and stored at the Server.EncryptionKeyPath path. For ephemeral environments, or environments where an environment variable is preferred, the PACKAGEMANAGER_ENCRYPTION_KEY environment variable can also be used to supply this key. Refer to the environment variables section of the appendix for more information.
Client-side Encryption#
For users with strict security requirements, Package Manager supports client-side encryption with S3 and KMS. This setup requires a symmetric KMS key and additional credential permissions:
kms:Encrypt
kms:Decrypt
kms:GenerateDataKey
It also requires including the KMS Key ID in the Package Manager configuration file, for example:
[S3Storage]
Bucket = my-s3-bucket
Region = us-east-1
EnableSharedConfig = true
KMSKeyID = XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX
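The additional KMS permissions can be granted with one more IAM policy statement scoped to the key. The sketch below is illustrative only: the helper name `build_kms_statement` is hypothetical, and the region and account ID in the ARN are placeholders to replace with the values for your own symmetric KMS key.

```python
import json

# Actions taken from the KMS permission list above.
KMS_ACTIONS = ["kms:Encrypt", "kms:Decrypt", "kms:GenerateDataKey"]


def build_kms_statement(key_arn: str) -> dict:
    """Return an IAM policy statement allowing the listed actions on one KMS key."""
    return {
        "Effect": "Allow",
        "Action": KMS_ACTIONS,
        "Resource": key_arn,
    }


if __name__ == "__main__":
    # Placeholder ARN: region, account ID, and key ID are not real values.
    key_arn = (
        "arn:aws:kms:us-east-1:111122223333:"
        "key/XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX"
    )
    policy = {"Version": "2012-10-17", "Statement": [build_kms_statement(key_arn)]}
    print(json.dumps(policy, indent=2))
```

Scoping the statement to a single key ARN, rather than `*`, keeps the credentials from being usable with other keys in the account.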
Warning
Package Manager uses Transport Layer Security (TLS) for all communication with S3. For customers who want additional security, we instead recommend using server-side encryption for Amazon S3 buckets.
Client-side encryption uses the Go implementation of AES/GCM. As a result, each object is loaded fully into memory before it can be encrypted or decrypted. Users must allocate additional memory to avoid allocation failures. This also results in slower upload and download speeds for clients.