Skip to content

Conversation

@naquad
Copy link

@naquad naquad commented May 14, 2025

An implementation of the reporter, storing the updates in the Atom feed at the specified location.

@naquad
Copy link
Author

naquad commented Jul 3, 2025

Erm are we going somewhere with this or it'll just fail time to time because of the formatting settings?

@thp
Copy link
Owner

thp commented Jul 7, 2025

Erm are we going somewhere with this or it'll just fail time to time because of the formatting settings?

[...]/lib/urlwatch/reporters.py:1173:1: E302 expected 2 blank lines, found 1
[...]/lib/urlwatch/reporters.py:1231:57: W504 line break after binary operator

@thp
Copy link
Owner

thp commented Jul 7, 2025

The two remaining issues are pycodestyle bugs, fixed in later versions, see: PyCQA/flake8#1845 (comment)

This is taken care of in #847 (and should be fixed once you merge upstream thp:master into your branch).

Copy link
Owner

@thp thp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comments. I think it's feature-complete, but making the _e() thing a bit less smart would make maintaining this easier.


logger.warning("%s: invalid atom feed", self.config['path'])
except etree.LxmlError as e:
logger.warning("failed to parse %s: %s", self.config['path'], e)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe mention here that a new file will be created. Should the old file be copied somewhere to not cause data loss?

Comment on lines +1207 to +1210
# fix the namespaces
for elem in tree.iter():
if hasattr(elem, 'tag') and elem.tag.startswith(nspfx):
elem.tag = elem.tag[len(nspfx):]
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this be done instead with using namespace features in etree?

.iter() doesn't seem to support namespaces, but .iterfind() seems to do..

Comment on lines +1229 to +1237
def _attrs_equal(self, a, b, exist):
for k in a.keys() | b.keys():
if (
k not in exist and a.get(k) != b.get(k)
or k in exist and k not in a
):
return False

return True
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a comment here, explaining in words what this is supposed to do in terms of "a", "b" and "exist"? It seems to compare "a" and "b" for equality, somehow taking "exist" into account.

def _entry_updated(self, entry):
"""Tries to fetch the updated timestamp from the entry"""
updated = entry.find('./updated')
return updated is not None and updated.text or '2099-01-01T00:00:00Z'
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the magical value of 2099? Can we just not have a value here, or use the current time/date?


maxitems = self.config.get('maxitems', 0)
if maxitems < 0:
logger.warning("atom: maxitems can't be negative")
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, what is the effect here? Ignoring and using all items? Maybe the warning should say that, something like "maxitems can't be negative, not limiting amount of items" or something.

'enabled': False,
'maxitems': 50,
'path': '/path/to/feed.xml',
'title': 'URLWatch Updates',
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Branding.

Suggested change
'title': 'URLWatch Updates',
'title': 'urlwatch Updates',

.\" indent \\n[an-margin]
.\" old: \\n[rst2man-indent\\n[rst2man-indent-level]]
.nr rst2man-indent-level -1
.\" new: \\n[rst2man-indent\\n[rst2man-indent-level]]
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possibly revert this file change for now. It's updated mechanically on release, and it seems to have more than just the added changes, so we shouldn't do the file update as part of this PR.

Comment on lines +1317 to +1329
e = functools.partial(self._e, entry)

e("id", self._mkuuid())
e("title", f'{job_state.verb}: {job.pretty_name()}')

if job.location_is_url():
e("link", job.get_location(), target='href')
else:
e("summary", job.get_location())

content = self._format_content(job_state, cfg['diff'])
e("content", str(content), target='cdata', type='html')
e("updated", self._tsfmt(timestamp))
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not yet sure if I like the whole _e thing. It's probably clever and stuff, but I don't fully get it, and it's probably hard to maintain(?) if I don't understand it. Would it be possible to split it up into multiple functions? If not, why not?

# Optional: Unique feed ID (automatically generated if omitted)
id: "urn:uuid:ffa6dc6e-7436-48f6-bc99-020ab1e7d429"
# Optional: Title of the feed
title: "URLWatch"
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Branding.

Suggested change
title: "URLWatch"
title: "urlwatch changes"

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants