Cannot use templated to/from datetime fields in S3DeleteObjectsOperator #42363

Open
mgorsk1 opened this issue Sep 20, 2024 · 0 comments
Labels
area:providers kind:bug This is a clearly a bug provider:amazon-aws AWS/Amazon - related issues

Comments

@mgorsk1
Contributor

mgorsk1 commented Sep 20, 2024

Apache Airflow Provider(s)

amazon

Versions of Apache Airflow Providers

apache-airflow-providers-amazon==8.28.0

Apache Airflow version

2.9.1

Operating System

Debian GNU/Linux 11 (bullseye)

Deployment

Other 3rd-party Helm chart

Deployment details

No response

What happened

S3DeleteObjectsOperator fails when to_datetime or from_datetime is defined as an Airflow macro (i.e. a templated string).

File "/opt/venv/lib/python3.10/site-packages/airflow/models/taskinstance.py", line 465, in _execute_task
    result = _execute_callable(context=context, **execute_callable_kwargs)
  File "/opt/venv/lib/python3.10/site-packages/airflow/models/taskinstance.py", line 432, in _execute_callable
    return execute_callable(context=context, **execute_callable_kwargs)
  File "/opt/venv/lib/python3.10/site-packages/airflow/models/baseoperator.py", line 400, in wrapper
    return func(self, *args, **kwargs)
  File "/opt/venv/lib/python3.10/site-packages/airflow/providers/amazon/aws/operators/s3.py", line 535, in execute
    keys = self.keys or s3_hook.list_keys(
  File "/opt/venv/lib/python3.10/site-packages/airflow/providers/amazon/aws/hooks/s3.py", line 132, in wrapper
    return func(*bound_args.args, **bound_args.kwargs)
  File "/opt/venv/lib/python3.10/site-packages/airflow/providers/amazon/aws/hooks/s3.py", line 869, in list_keys
    return self._list_key_object_filter(keys, from_datetime, to_datetime)
  File "/opt/venv/lib/python3.10/site-packages/airflow/providers/amazon/aws/hooks/s3.py", line 670, in _list_key_object_filter
    return [k["Key"] for k in keys if _is_in_period(k["LastModified"])]
  File "/opt/venv/lib/python3.10/site-packages/airflow/providers/amazon/aws/hooks/s3.py", line 670, in <listcomp>
    return [k["Key"] for k in keys if _is_in_period(k["LastModified"])]
  File "/opt/venv/lib/python3.10/site-packages/airflow/providers/amazon/aws/hooks/s3.py", line 666, in _is_in_period
    if to_datetime is not None and input_date > to_datetime:
TypeError: '>' not supported between instances of 'datetime.datetime' and 'str'
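The failure can be reproduced in isolation: Jinja renders template_fields to strings, while S3 reports LastModified as a timezone-aware datetime, so the comparison in _is_in_period mixes the two types. A minimal sketch (the values below are illustrative, not provider code):

```python
from datetime import datetime, timezone

# After Jinja rendering, to_datetime arrives as a plain string...
to_datetime = "2024-08-21"  # e.g. the rendered result of {{ macros.ds_add(ds, -30) }}

# ...while S3 reports each object's LastModified as a tz-aware datetime.
last_modified = datetime(2024, 9, 1, tzinfo=timezone.utc)

try:
    last_modified > to_datetime  # mirrors `input_date > to_datetime` in _is_in_period
except TypeError as exc:
    print(exc)  # '>' not supported between instances of 'datetime.datetime' and 'str'
```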

What you think should happen instead

Since to_datetime and from_datetime are declared as template_fields, the rendered string values should be converted back to datetime at the appropriate place before the comparison in _list_key_object_filter.

How to reproduce

Create a DAG with the following task:

to_datetime = "{{ macros.ds_add(ds, -30) }}"

task = S3DeleteObjectsOperator(
    task_id='delete_old_logs',
    bucket='mybucket',
    prefix='logs/',
    to_datetime=to_datetime,
    aws_conn_id=aws_conn_id
)

Anything else

It works if I pass to_datetime (or from_datetime) as a real datetime, e.g. datetime.now() + timedelta(days=-30). Given that, it is confusing that these fields are listed in template_fields at all.
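Until the fields are coerced, the working pattern above can be sketched as follows (assuming the same 30-day window as the reproduction; note the value is computed at DAG-parse time, not per run):

```python
from datetime import datetime, timedelta, timezone

# Passing a real datetime bypasses the broken string comparison entirely.
cutoff = datetime.now(tz=timezone.utc) - timedelta(days=30)

# task = S3DeleteObjectsOperator(
#     task_id="delete_old_logs",
#     bucket="mybucket",
#     prefix="logs/",
#     to_datetime=cutoff,   # datetime, not a templated string
#     aws_conn_id=aws_conn_id,
# )
```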

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

@mgorsk1 mgorsk1 added area:providers kind:bug This is a clearly a bug needs-triage label for new issues that we didn't triage yet labels Sep 20, 2024
@dosubot dosubot bot added the provider:amazon-aws AWS/Amazon - related issues label Sep 20, 2024
@shahar1 shahar1 removed the needs-triage label for new issues that we didn't triage yet label Sep 20, 2024