feat: implement Python storage cleaner with Appwrite integration #338

zeus2611 · 2025-09-30T09:40:23Z

What does this PR do?

This PR implements a Python translation of the existing Node.js AppwriteService for cleaning storage buckets by removing files older than a retention period. It also provides a Python version of the serverless handler function that validates environment variables, authenticates the API key from request headers, and triggers the bucket cleanup process.

Test Plan

The changes were verified by running the Python service locally with mock environment variables and headers to ensure the cleanup function executes without error and correctly deletes files as intended. Additionally, error handling was tested by providing invalid or missing API keys to confirm appropriate error responses are returned.

Related PRs and Issues

#310

Have you read the Contributing Guidelines on issues?

Yes, I have read and followed the contributing guidelines.

Summary by CodeRabbit

New Features
- Python Storage Cleaner function that deletes files older than a configurable retention period from a specified bucket; supports GET invocation and cron scheduling, validates inputs, and returns success or error responses.
Documentation
- Added README describing usage, configuration, environment variables, runtime, permissions, cron, and timeout.
Chores
- Added Python .gitignore and introduced the Appwrite SDK runtime dependency.

coderabbitai · 2025-09-30T09:40:32Z

Walkthrough

Adds a Python storage-cleaner function under python/storage-cleaner: a Python-specific .gitignore, README.md, and requirements.txt (adds appwrite>=13.1.0,<14.0.0). New source files: src/utils.py (get_expiry_date computes UTC expiry from RETENTION_PERIOD_DAYS; throw_if_missing validates keys), src/appwrite.py (AppwriteService initializes Client from APPWRITE_FUNCTION_API_ENDPOINT, APPWRITE_FUNCTION_PROJECT_ID and provided API key; clean_bucket lists files with createdAt < expiry in batches (limit 25), deletes them, and aggregates failures), and src/main.py (entrypoint main(context) validating env vars, checking x-appwrite-key header, invoking clean_bucket, and returning HTTP responses).

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 33.33% which is insufficient. The required threshold is 80.00%.	You can run `@coderabbitai generate docstrings` to improve docstring coverage.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check	✅ Passed	The title concisely describes the primary feature added by the pull request, namely the implementation of a Python storage cleaner integrated with Appwrite, aligning closely with the objectives and scope of the changes.

✨ Finishing touches

📝 Generate Docstrings

🧪 Generate unit tests

Create PR with unit tests
Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

🧪 Early access (Sonnet 4.5): enabled

We are currently testing the Sonnet 4.5 model, which is expected to improve code review quality. However, this model may lead to increased noise levels in the review comments. Please disable the early access features if the noise level causes any inconvenience.

Note:

Public repositories are always opted into early access features.
You can enable or disable early access features from the CodeRabbit UI or by updating the CodeRabbit configuration file.

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 6

🧹 Nitpick comments (4)

python/storage-cleaner/src/appwrite.py (4)
8-8: Remove unused import.

The ID import is not used anywhere in this module.
-from appwrite.id import ID
17-24: Validate required environment variables.

The __init__ method reads APPWRITE_FUNCTION_API_ENDPOINT and APPWRITE_FUNCTION_PROJECT_ID without validation. If these are missing or None, the Appwrite client may fail later during API calls with unclear errors.

Consider adding validation:
 def __init__(self, api_key: str):
+    endpoint = os.getenv("APPWRITE_FUNCTION_API_ENDPOINT")
+    project_id = os.getenv("APPWRITE_FUNCTION_PROJECT_ID")
+    
+    if not endpoint or not project_id:
+        raise ValueError("Missing required environment variables: APPWRITE_FUNCTION_API_ENDPOINT and APPWRITE_FUNCTION_PROJECT_ID")
+    
     client = (
         Client()
-        .set_endpoint(os.getenv("APPWRITE_FUNCTION_API_ENDPOINT"))
-        .set_project(os.getenv("APPWRITE_FUNCTION_PROJECT_ID"))
+        .set_endpoint(endpoint)
+        .set_project(project_id)
         .set_key(api_key)
     )
     self.storage = Storage(client)
43-43: Use defensive dictionary access.

Directly accessing f["$id"] will raise a KeyError if the API response format changes or is malformed. Consider using .get() with a fallback or explicit validation.
-            self.storage.delete_file(bucket_id, f["$id"])
+            file_id = f.get("$id")
+            if file_id:
+                self.storage.delete_file(bucket_id, file_id)
45-46: Consider more Pythonic empty check.

Using not files is more idiomatic in Python than len(files) == 0.
-        if len(files) == 0:
+        if not files:
             break

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 79a7bcc and d786c8c.

📒 Files selected for processing (6)

python/storage-cleaner/.gitignore (1 hunks)
python/storage-cleaner/README.md (1 hunks)
python/storage-cleaner/requirements.txt (1 hunks)
python/storage-cleaner/src/appwrite.py (1 hunks)
python/storage-cleaner/src/main.py (1 hunks)
python/storage-cleaner/src/utils.py (1 hunks)

🧰 Additional context used

🧬 Code graph analysis (2)

python/storage-cleaner/src/appwrite.py (1)

python/storage-cleaner/src/utils.py (1)

get_expiry_date (9-23)

python/storage-cleaner/src/main.py (2)

python/storage-cleaner/src/appwrite.py (2)

AppwriteService (13-46)

clean_bucket (26-46)

python/storage-cleaner/src/utils.py (1)

throw_if_missing (26-40)

🪛 markdownlint-cli2 (0.18.1)

python/storage-cleaner/README.md

11-11: Emphasis used instead of a heading

(MD036, no-emphasis-as-heading)

🪛 Ruff (0.13.1)

python/storage-cleaner/src/utils.py

40-40: Avoid specifying long messages outside the exception class

(TRY003)

🔇 Additional comments (2)

python/storage-cleaner/README.md (1)

24-24: Verify timeout sufficiency for large buckets.

The 15-second timeout may be insufficient when cleaning buckets with thousands of old files. Each batch fetches 25 files, and with network latency for delete operations, larger buckets could exceed this limit and cause incomplete cleanup.

Consider testing with a representative bucket size, or document the expected maximum bucket size this function can handle within 15 seconds. You may need to increase the timeout or implement cursor-based resumption for very large buckets.

python/storage-cleaner/src/main.py (1)

7-10: Environment validation logic approved.

The use of throw_if_missing correctly validates required environment variables before proceeding with bucket operations.

python/storage-cleaner/requirements.txt

python/storage-cleaner/src/appwrite.py

python/storage-cleaner/src/main.py

python/storage-cleaner/src/utils.py

…leaner

coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (1)

python/storage-cleaner/src/main.py (1)
9-14: Fix header access to prevent KeyError.

Line 9 uses direct dictionary access context.req.headers["x-appwrite-key"], which will raise a KeyError if the header is missing—before the check on Line 11 can execute. This exposes internal errors to clients.

Apply this diff:
-    api_key = context.req.headers["x-appwrite-key"]
+    api_key = context.req.headers.get("x-appwrite-key")
 
     if not api_key:

🧹 Nitpick comments (1)

python/storage-cleaner/src/appwrite.py (1)
17-24: Consider validating required environment variables.

The __init__ method reads APPWRITE_FUNCTION_API_ENDPOINT and APPWRITE_FUNCTION_PROJECT_ID from the environment without validation. If these are missing, the Appwrite SDK client will likely fail with an unclear error during subsequent API calls.

You can either:

Let the SDK fail naturally (current approach—acceptable if calling code validates upstream).

Add explicit validation here for clearer error messages:
def __init__(self, api_key: str):
    endpoint = os.getenv("APPWRITE_FUNCTION_API_ENDPOINT")
    project_id = os.getenv("APPWRITE_FUNCTION_PROJECT_ID")
    
    if not endpoint or not project_id:
        raise ValueError(
            "Missing required environment variables: "
            "APPWRITE_FUNCTION_API_ENDPOINT, APPWRITE_FUNCTION_PROJECT_ID"
        )
    
    client = (
        Client()
        .set_endpoint(endpoint)
        .set_project(project_id)
        .set_key(api_key)
    )
    self.storage = Storage(client)

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d786c8c and accc817.

📒 Files selected for processing (4)

python/storage-cleaner/requirements.txt (1 hunks)
python/storage-cleaner/src/appwrite.py (1 hunks)
python/storage-cleaner/src/main.py (1 hunks)
python/storage-cleaner/src/utils.py (1 hunks)

🧰 Additional context used

🧠 Learnings (1)

📚 Learning: 2025-09-30T10:09:36.561Z

Learnt from: zeus2611
PR: appwrite/templates#338
File: python/storage-cleaner/src/utils.py:26-40
Timestamp: 2025-09-30T10:09:36.561Z
Learning: In the Python storage-cleaner template at python/storage-cleaner/src/utils.py, the throw_if_missing function intentionally treats 0 as an invalid value to prevent a retention period of 0, which would cause files to be deleted immediately and risk accidental data loss.

Applied to files:

python/storage-cleaner/src/utils.py

🧬 Code graph analysis (2)

python/storage-cleaner/src/appwrite.py (1)

python/storage-cleaner/src/utils.py (1)

get_expiry_date (9-23)

python/storage-cleaner/src/main.py (2)

python/storage-cleaner/src/appwrite.py (2)

AppwriteService (12-65)

clean_bucket (26-65)

python/storage-cleaner/src/utils.py (1)

throw_if_missing (26-41)

🪛 Ruff (0.13.1)

python/storage-cleaner/src/appwrite.py

45-47: Avoid specifying long messages outside the exception class

(TRY003)

46-46: Use explicit conversion flag

Replace with conversion flag

(RUF010)

56-56: Do not catch blind exception: Exception

(BLE001)

63-65: Avoid specifying long messages outside the exception class

(TRY003)

python/storage-cleaner/src/main.py

23-23: Do not catch blind exception: Exception

(BLE001)

python/storage-cleaner/src/utils.py

41-41: Avoid specifying long messages outside the exception class

(TRY003)

🔇 Additional comments (5)

python/storage-cleaner/requirements.txt (1)

1-1: LGTM!

The pinned version constraint ensures reproducible builds and prevents breaking changes from future major releases.

python/storage-cleaner/src/main.py (1)

18-25: LGTM!

The exception handling properly wraps bucket operations, distinguishing between validation errors (400) and unexpected failures (500) while logging details for debugging.

python/storage-cleaner/src/utils.py (2)

9-23: LGTM!

The function uses timezone-aware datetime (Python 3.12+ compatible), handles invalid environment values gracefully, and returns a properly formatted ISO 8601 timestamp.

26-41: LGTM!

The validation logic correctly treats 0 as invalid to prevent immediate deletion and data loss. The inline comment on Line 36 documents this business rule for maintainers.

python/storage-cleaner/src/appwrite.py (1)

41-65: LGTM!

The error handling properly wraps API calls, tracks successes and failures, and provides detailed diagnostics when deletions fail. The loop correctly exits when no more files are returned.

python/storage-cleaner/src/appwrite.py

…ing API key gracefully

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (2)

python/storage-cleaner/src/main.py (1)
20-20: Consider a more informative success message.

The message "Buckets cleaned" is plural but only one bucket is cleaned. Consider making it singular or adding details about the operation.
-        return context.res.text("Buckets cleaned", 200)
+        return context.res.text("Bucket cleaned successfully", 200)
Alternatively, if you want to return structured data:
-        return context.res.text("Buckets cleaned", 200)
+        return context.res.json({"message": "Bucket cleaned successfully", "bucket_id": os.environ["APPWRITE_BUCKET_ID"]}, 200)
python/storage-cleaner/src/appwrite.py (1)
50-58: File deletion loop handles errors well.

The loop safely extracts file IDs, wraps deletions in error handling, and tracks both successes and failures. The use of the file_id variable in the exception handler correctly avoids the KeyError issue flagged in previous reviews.

Optional improvement: Files without a $id field are silently skipped (line 53's conditional). Consider logging these cases for visibility:
             try:
                 file_id = f.get("$id")
                 if file_id:
                     self.storage.delete_file(bucket_id, file_id)
                     deleted_files_count += 1
+                else:
+                    print(f"Warning: File object missing $id in bucket {bucket_id}")
             except Exception as e:
                 failed_files.append({"id": file_id, "error": str(e)})

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between accc817 and 0d103d6.

📒 Files selected for processing (2)

python/storage-cleaner/src/appwrite.py (1 hunks)
python/storage-cleaner/src/main.py (1 hunks)

🧰 Additional context used

🧬 Code graph analysis (2)

python/storage-cleaner/src/main.py (2)

python/storage-cleaner/src/appwrite.py (2)

AppwriteService (12-65)

clean_bucket (26-65)

python/storage-cleaner/src/utils.py (1)

throw_if_missing (26-41)

python/storage-cleaner/src/appwrite.py (1)

python/storage-cleaner/src/utils.py (1)

get_expiry_date (9-23)

🪛 Ruff (0.13.1)

python/storage-cleaner/src/main.py

23-23: Do not catch blind exception: Exception

(BLE001)

python/storage-cleaner/src/appwrite.py

45-47: Avoid specifying long messages outside the exception class

(TRY003)

46-46: Use explicit conversion flag

Replace with conversion flag

(RUF010)

56-56: Do not catch blind exception: Exception

(BLE001)

63-65: Avoid specifying long messages outside the exception class

(TRY003)

🔇 Additional comments (8)

python/storage-cleaner/src/main.py (3)

1-3: LGTM! Clean and minimal imports.

The imports are appropriate for the serverless function's requirements.

9-14: Excellent API key validation!

The safe header access using .get() combined with the empty-value check properly prevents KeyError exceptions and provides clear error messages.

18-25: Exception handling properly addresses previous concerns.

The try-except structure correctly catches and handles errors from the bucket cleaning operation, logs details internally, and returns sanitized error messages. The broad Exception catch is appropriate in this serverless context to prevent unhandled exceptions from crashing the function.

Note: The static analysis warning about catching Exception is a false positive here - broad exception handling is appropriate for serverless entry points to ensure graceful degradation.

python/storage-cleaner/src/appwrite.py (5)

1-9: LGTM! Well-organized imports and documentation.

The module docstring and imports are clean and appropriate for the storage cleanup service.

17-24: Client initialization looks correct.

The fluent API usage for configuring the Appwrite client is clean and idiomatic. Ensure that the required environment variables (APPWRITE_FUNCTION_API_ENDPOINT and APPWRITE_FUNCTION_PROJECT_ID) are validated at the entry point (see comment in main.py).

26-39: Good query setup and pagination strategy.

The batch size of 25 is reasonable for iterative deletion, and the counter initialization properly tracks success and failure states.

41-48: Robust error handling for list operation.

The try-except block properly catches API failures, preserves the exception chain with from e, and provides clear context. The broad Exception catch is appropriate here since the Appwrite SDK may raise various exception types.

Note: Static analysis warnings (TRY003, RUF010) are pedantic in this context - the current implementation is clear and maintainable.

59-65: Appropriate error handling for partial failures.

The loop termination logic is correct, and raising a RuntimeError for partial failures ensures callers are aware of issues. The detailed error message (including counts and failed file details) is valuable for debugging, despite the static analysis warning.

Note: The current behavior raises an exception even if some files were successfully deleted. This is a conservative approach that ensures operators are notified of any failures. If you prefer to log warnings instead and only raise on total failure, consider adjusting the logic - but the current behavior is safer.

python/storage-cleaner/src/main.py

coderabbitai · 2025-09-30T10:28:58Z

python/storage-cleaner/src/main.py

+            {"error": "Missing API key in x-appwrite-key header"}, 401
+        )
+
+    appwrite = AppwriteService(api_key)


⚠️ Potential issue | 🟠 Major

Validate all required environment variables.

The AppwriteService constructor depends on APPWRITE_FUNCTION_API_ENDPOINT and APPWRITE_FUNCTION_PROJECT_ID from the environment, but these are not validated in the throw_if_missing call on line 7. If either is missing, the Appwrite client initialization will fail with unclear errors.

Add the missing variables to the validation list:

- throw_if_missing(os.environ, ["RETENTION_PERIOD_DAYS", "APPWRITE_BUCKET_ID"]) + throw_if_missing(os.environ, [ + "RETENTION_PERIOD_DAYS", + "APPWRITE_BUCKET_ID", + "APPWRITE_FUNCTION_API_ENDPOINT", + "APPWRITE_FUNCTION_PROJECT_ID" + ])

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

appwrite = AppwriteService(api_key)

# Validate required environment variables before anything else

throw_if_missing(os.environ, [

"RETENTION_PERIOD_DAYS",

"APPWRITE_BUCKET_ID",

"APPWRITE_FUNCTION_API_ENDPOINT",

"APPWRITE_FUNCTION_PROJECT_ID"

])

🤖 Prompt for AI Agents

In python/storage-cleaner/src/main.py around line 16, the AppwriteService is instantiated but the required environment variables APPWRITE_FUNCTION_API_ENDPOINT and APPWRITE_FUNCTION_PROJECT_ID are not validated; update the throw_if_missing call (around line 7) to include "APPWRITE_FUNCTION_API_ENDPOINT" and "APPWRITE_FUNCTION_PROJECT_ID" so missing values are caught early and a clear error is raised before creating the AppwriteService.

@Meldiron In the Node.js implementation of the storage cleaner, we only validate the RETENTION_PERIOD_DAYS and APPWRITE_BUCKET_ID environment variables. My understanding is that APPWRITE_FUNCTION_API_ENDPOINT and APPWRITE_FUNCTION_PROJECT_ID might be automatically populated at build or runtime. Could you please clarify if these two variables need explicit validation here, or if they are always guaranteed to be set by the environment?

Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!

feat: implement Python storage cleaner with Appwrite integration

d786c8c

coderabbitai bot reviewed Sep 30, 2025

View reviewed changes

feat: update Appwrite service and improve error handling in storage c…

accc817

…leaner

coderabbitai bot reviewed Sep 30, 2025

View reviewed changes

python/storage-cleaner/src/appwrite.py Outdated Show resolved Hide resolved

fix: improve error handling for failed file deletions and handle miss…

0d103d6

…ing API key gracefully

coderabbitai bot reviewed Sep 30, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

feat: implement Python storage cleaner with Appwrite integration #338

feat: implement Python storage cleaner with Appwrite integration #338

zeus2611 commented Sep 30, 2025 •

edited by coderabbitai bot

Loading

Uh oh!

coderabbitai bot commented Sep 30, 2025 •

edited

Loading

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

coderabbitai bot Sep 30, 2025 •

edited

Loading

Uh oh!

zeus2611 Sep 30, 2025

Uh oh!

coderabbitai bot Sep 30, 2025

Uh oh!

Uh oh!

-    appwrite = AppwriteService(api_key)
+     # Validate required environment variables before anything else
+    throw_if_missing(os.environ, [
+        "RETENTION_PERIOD_DAYS",
+        "APPWRITE_BUCKET_ID",
+        "APPWRITE_FUNCTION_API_ENDPOINT",
+        "APPWRITE_FUNCTION_PROJECT_ID"
+    ])

feat: implement Python storage cleaner with Appwrite integration #338

Are you sure you want to change the base?

feat: implement Python storage cleaner with Appwrite integration #338

Conversation

zeus2611 commented Sep 30, 2025 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What does this PR do?

Test Plan

Related PRs and Issues

Have you read the Contributing Guidelines on issues?

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Sep 30, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Pre-merge checks and finishing touches

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot Sep 30, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

zeus2611 Sep 30, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Sep 30, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

zeus2611 commented Sep 30, 2025 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Sep 30, 2025 •

edited

Loading

coderabbitai bot Sep 30, 2025 •

edited

Loading