external-resource-rds-logs actions #214

Open · wants to merge 3 commits into main

Conversation

@chassing chassing (Member) commented Jul 30, 2025

Summary

This PR implements a new automated action external-resource-rds-logs that enables users to retrieve and export RDS database logs to S3 storage. The action fetches logs from specified RDS instances and packages them into zip files for easy download and analysis.

Ticket: APPSRE-12231

Technical Implementation

  • Stream RDS log files instead of downloading them whole
  • Use zipstream to create a streaming zip file without holding everything in memory (see the sketch below)
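
To illustrate the two bullets above, here is a rough sketch of the streaming approach. It uses boto3's download_db_log_file_portion to read an RDS log file in chunks; the zipstream wiring is left as hedged comments because the exact library API used in this PR may differ.

import boto3

def stream_rds_log(rds_client, identifier: str, log_file: str):
    """Yield one RDS log file in chunks so it never sits fully in memory."""
    marker = "0"
    while True:
        resp = rds_client.download_db_log_file_portion(
            DBInstanceIdentifier=identifier,
            LogFileName=log_file,
            Marker=marker,
        )
        if resp.get("LogFileData"):
            yield resp["LogFileData"].encode()
        if not resp.get("AdditionalDataPending"):
            break
        marker = resp["Marker"]

rds_client = boto3.client("rds")
# Hypothetical wiring with a zipstream-style library (names are illustrative):
# zs = zipstream.ZipFile()
# for name in log_files:
#     zs.write_iter(name, stream_rds_log(rds_client, "glitchtip-dev", name))
# The resulting zip stream is then uploaded to S3 without buffering it fully.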

Usage

$ automated-actions external-resource-rds-logs --account app-sre-stage --identifier glitchtip-dev
---
action_id: 3d11c83f-e5ca-4dd9-8418-7714cc9010cc
created_at: 1753869790.500306
name: external-resource-rds-logs
owner: cassing
result: null
status: PENDING
task_args: {}
updated_at: 1753869790.500306

$ automated-actions action-detail --action-id 3d11c83f-e5ca-4dd9-8418-7714cc9010cc
---
action_id: 77ddfe88-5385-4325-9477-a947229185bc
created_at: 1753800348.085379
name: external-resource-rds-logs
owner: cassing
result: "Download the RDS logs from the following URL(s): https://automated-actions.s3.amazonaws.com/....

Testing

  • ✅ Unit tests for all new components
  • ✅ Integration tests

Additionally

This PR splits the Celery external_resources.tasks into separate ones to maintain readability.

Dependencies

@chassing chassing self-assigned this Jul 30, 2025
@chassing chassing force-pushed the APPSRE-12231/action-RDS-logs branch from c9577b3 to 584bd9b Compare July 30, 2025 13:31
@chassing chassing force-pushed the APPSRE-12231/action-RDS-logs branch from 584bd9b to d4dae76 Compare July 30, 2025 13:34
@rporres rporres (Contributor) left a comment

As for this PR, I really love the implementation and the refactoring of the Celery tasks. I would somehow limit the amount of logs downloaded by default, as there can be a lot of them.

But I don't understand very well why we need this or who has asked for it, when we can (and we actually do) send RDS logs to CloudWatch, where it is quite convenient to analyze them and where we can have more retention... Maybe there's a use case I'm missing, though...

Comment on lines 204 to 213
def list_rds_logs(self, identifier: str) -> list[str]:
    """Lists the log files for a specified RDS instance."""
    response = self.rds_client.describe_db_log_files(
        DBInstanceIdentifier=identifier
    )
    return [
        log["LogFileName"]
        for log in response["DescribeDBLogFiles"]
        if log["LogFileName"]
    ]
Contributor

How many logs does this get? All of them? That can be a non-trivial amount of data.

Contributor

And if the response is paginated, this will only get the first page.

Member Author

Fixed in 413d05b
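
For reference, a paginated variant could look roughly like this (a sketch using boto3's describe_db_log_files paginator; the actual change in 413d05b may differ):

def list_rds_logs(self, identifier: str) -> list[str]:
    """Lists all log files for a specified RDS instance, following pagination."""
    paginator = self.rds_client.get_paginator("describe_db_log_files")
    return [
        log["LogFileName"]
        # iterate every page instead of only the first response
        for page in paginator.paginate(DBInstanceIdentifier=identifier)
        for log in page["DescribeDBLogFiles"]
        if log["LogFileName"]
    ]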

* **Description**: Retrieves logs from an Amazon RDS instance and stores them in an S3 bucket.
* **Use Case**: Typically used for troubleshooting database issues, analyzing performance problems, or collecting logs for audit purposes.
* **Required Parameters**: The AWS account name and the RDS instance identifier.
* **Optional Parameters**: Expiration time in days (1-7, default: 7), S3 target file name (defaults to '{account}-{identifier}.zip').
Contributor

I would clarify that this is the expiration date of the URL you are going to get as a result.

@hemslo hemslo (Contributor) left a comment

Using S3 to store large content is a pattern we need for complex tasks, but for this RDS log download there can be several problems:

  • cost: some logs can be very large; downloading them and uploading them to S3 can introduce significant traffic and storage cost, especially when people want to recheck logs and duplicate the process multiple times.
  • security: RDS logs can contain PII data; storing them in S3 means the bucket needs a higher data security level and cross-account/region compliance.

It would be easier to create IAM policies like AmazonRDSReadOnlyAccess, but with the resource limited to the exact instance, and let people view logs directly in the AWS console.
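
As a rough illustration of that suggestion, an instance-scoped policy statement could look like the following (a hypothetical sketch expressed as a Python dict; the actions, region, account ID, and instance ARN are placeholders):

# Hypothetical policy document limiting RDS log access to one instance.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "rds:DescribeDBLogFiles",
                "rds:DownloadDBLogFilePortion",
            ],
            # placeholder ARN for a single DB instance
            "Resource": "arn:aws:rds:us-east-1:123456789012:db:glitchtip-dev",
        }
    ],
}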

Comment on lines +306 to +314
def generate_s3_download_url(
    self, bucket: str, s3_key: str, expiration_secs: int = 3600
) -> str:
    """Generate a pre-signed URL for downloading an object from S3."""
    return self.s3_client.generate_presigned_url(
        "get_object",
        Params={"Bucket": bucket, "Key": s3_key},
        ExpiresIn=expiration_secs,
    )
Contributor

Do we have a lifecycle policy set to actually delete the file?

Member Author

The lifecycle policy is on the buckets
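
For context, a bucket-level expiration rule could look roughly like this (a hypothetical sketch; the real bucket configuration is managed elsewhere, and the bucket name and retention period are placeholders):

import boto3

s3_client = boto3.client("s3")
# Hypothetical example: expire objects (e.g. uploaded log archives) after 7 days.
s3_client.put_bucket_lifecycle_configuration(
    Bucket="automated-actions",
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "expire-rds-log-archives",
                "Status": "Enabled",
                "Filter": {"Prefix": ""},
                "Expiration": {"Days": 7},
            }
        ]
    },
)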

@chassing (Member Author)

Using S3 to store large content is a pattern we need for complex tasks, but for this RDS log download there can be several problems:

  • cost: some logs can be very large; downloading them and uploading them to S3 can introduce significant traffic and storage cost, especially when people want to recheck logs and duplicate the process multiple times.

I don't expect such high usage of this action that the costs will become a factor.

  • security: RDS logs can contain PII data; storing them in S3 means the bucket needs a higher data security level and cross-account/region compliance.

The S3 bucket is readable and writable by the bucket owner only. I don't see any risk here.

It would be easier to create IAM policies like AmazonRDSReadOnlyAccess, but with the resource limited to the exact instance, and let people view logs directly in the AWS console.

This is still an option for our tenants, and nothing prevents them from using it. This action is simply an addition.

@chassing (Member Author)

/retest

@chassing chassing force-pushed the APPSRE-12231/action-RDS-logs branch from 916fa9f to 413d05b Compare July 31, 2025 11:30
@chassing (Member Author)

But I don't understand very well why we need this or who has asked for it, when we can (and we actually do) send RDS logs to CloudWatch, where it is quite convenient to analyze them and where we can have more retention... Maybe there's a use case I'm missing, though...

Together with an upcoming openshift-pods-logs action, this would be a good way of gathering "debugging information" for an application.
