Add github_username to CustomUser for stats tracking (#615)#618
Add github_username to CustomUser for stats tracking (#615)#618Moueen-Togarvi wants to merge 26 commits intodjangonaut-space:developfrom
Conversation
|
Hi @tim-schilling and team! 👋 I'd like to work on this issue. I've reviewed @raffaellasuardini's library and analyzed how to integrate it into SessionAdmin. My approach would be:
Would it be okay if I take this on? Happy to discuss the implementation approach first if needed! |
|
@Moueen-Togarvi yup, yup that sounds good. If you feel like you're making some complex decisions, let us know and we can discuss. Otherwise I think this should be relatively contained. |
|
Thanks |
tim-schilling
left a comment
There was a problem hiding this comment.
Hi @Moueen-Togarvi, it looks like you're using an LLM to work on this. That's fine, but please review the code it's generating (tests are different). I mention this because this is a lot of code for something I'm hoping is a very small feature. I've also got some small comments since the codebase is currently under active development, so things have changed.
|
Hi @Moueen-Togarvi, I wanted to check-in regarding the state of the PR. How are things going? |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## develop #618 +/- ##
===========================================
+ Coverage 95.19% 95.49% +0.30%
===========================================
Files 179 182 +3
Lines 8628 9149 +521
Branches 282 301 +19
===========================================
+ Hits 8213 8737 +524
+ Misses 369 366 -3
Partials 46 46 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
- Removed the deprecated action for collecting Djangonaut stats from the admin interface. - Added a new link for collecting GitHub stats directly in the admin session view. - Updated the GitHubStatsCollector to utilize the Search API for improved performance in collecting PRs and issues. - Enhanced the collect_stats_view to handle date validation and caching of report data for download. - Improved logging for better traceability during stats collection.
|
I have merged the develop branch into this, so now github_username issue is sorted. I replaced the search with search api instead of looping over all the issues. (It was taking too long for two usernames on my pc.). I have tested it with my username and it worked on Django. I added caching to handle the download csv and download txt file. So we do not have to redo the work. I am not clear how we will test this ... |
|
I would like to slim down this change significantly. I don't think we need to have the multiple formats of data being shown. |
@FarhanAliRaza we'll need to mock the GitHub service to match what it would return. It means we wouldn't capture the changes in the GitHub API though, but that's probably fine for our needs. To test the GitHub API doesn't change, we would need to define a fixed repo and user that would return the same information consistently. Not impossible, but it would be brittle. |
…indymeet into djangonaut-stats
Address review comments from PR djangonaut-space#618 by removing unnecessary complexity: - Remove excessive try-catch error logging (Sentry handles this) - Remove CSV and TXT download functionality (not requested) - Remove caching infrastructure for downloads - Simplify collect_stats_view to focus on core stats collection - Keep only HTML formatting for admin display This reduces code by 207 lines while maintaining core functionality. Fixes: discussion_r2607950514 Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
|
The work done so far is impressive. I think I found some interesting inspiration for my project. It could be beneficial to include the distinction between closed PRs and merged PRs, as I did in my latest update. |
Add is_closed property and display closed PRs with 🚧 emoji.
55de078 to
c9b0345
Compare
dd1d10f to
cff81b0
Compare
for more information, see https://pre-commit.ci
- Switches to Django's template renderer for the report, removes ReportFormatter - Uses a form for handling the report request flow - Rearranges the view have to main flows of post and get
| DJANGONAUT_MONITORED_REPOS = [ | ||
| {"owner": "django", "repos": ["django"]}, | ||
| {"owner": "wagtail", "repos": ["wagtail"]}, | ||
| {"owner": "django-commons", "repos": ["*"]}, # All repos in django-commons | ||
| {"owner": "djangopackages", "repos": ["djangopackages"]}, | ||
| ] |
There was a problem hiding this comment.
I don't think this is necessary. We have the Project model that has the repo url for all the projects on the session. This would avoid us having to make a code change for each session. We would need to add a flag for the system to look at all repo's in the org for participation though. django CMS has several repos that djangonauts contribute to. Where as djangopackages just has the one.
There was a problem hiding this comment.
i also think its not sustainable to have the repos in the settings means any change we would have to touch the settings file and commit. in my opinion error prone and too much work
tim-schilling
left a comment
There was a problem hiding this comment.
I pushed this a bit further forward, but it needs a bit more work yet. Let me know if I need to clarify anything please.
- Convert StatsReport methods to cached_property and extract state constants
- Move CompareAvailabilityForm to forms.py and collect_stats_view to views/sessions.py
- Extract AuthorTestimonialMixin to DRY up testimonial Update/Delete views
- Replace order_by("?") with random.choice for hero testimonial selection
- Only generate testimonial slug on creation, not every save
- Add error handling for GitHub API calls in collect_stats_view
- Fix gettext import to use django.utils.translation
- Remove ad-hoc verify_pr_stats.py script
- Add monitor_all_organization_repos flag to Project model - Add github_repo_config property to Project for stats collection format - Add get_monitored_github_repos() to Session using available_projects - Remove hardcoded DJANGONAUT_MONITORED_REPOS from settings - Update collect_stats_view to pull repos from the session - Improve github_repo URL parsing to use urlparse
…indymeet into djangonaut-stats
|
@tim-schilling @FarhanAliRaza Some folks are moving away from github will we support other platforms like gitlab or codeberg in the future if the projects are hosted there? |
Django repository is still on GitHub, and I don’t think there’s a willingness to move away in a near future. If one of the projects is going to be on other platforms there could be a chance to open a new PR. |
- Remove redundant docstrings from self-evident dataclass properties and methods - Rename _parse_date to _to_date with simplified logic - Inline _build_author (was just Author(x, x)) - Simplify _build_search_queries to use f-string instead of filtered join - Rename _get_repos_from_config to _resolve_wildcard_repos, move wildcard check to caller - Streamline collection loops by removing intermediate variables - Use Django admin's closelink class for Cancel button instead of custom styling
Cover previously untested branches flagged by Codecov on PR djangonaut-space#618: - Project.github_repo with short GitHub path (< 2 path parts) - Project.github_repo_config returning None for non-GitHub URLs - Project.github_repo_config returning single repo config - Session.get_monitored_github_repos skipping non-GitHub projects - Session.get_monitored_github_repos deduplicating repos from same owner - Session.get_monitored_github_repos with no projects - collect_stats_view handling GithubException with error redirect - collect_stats_view re-rendering form on invalid POST - Testimonial.save including slug in update_fields on creation
tim-schilling
left a comment
There was a problem hiding this comment.
I think there was a bad merge at some point. It looks like the PR is out of sync.
| if start_date > today: | ||
| raise forms.ValidationError( | ||
| _("Start date cannot be in the future. Today is %(today)s."), | ||
| params={"today": today}, | ||
| ) |
There was a problem hiding this comment.
This is different than how we manage an invalid end_date where the end_date is silently changed. We may want to be consistent in our choice here.
| migrations.AlterField( | ||
| model_name="session", | ||
| name="djangonauts_have_access", | ||
| field=models.BooleanField( | ||
| default=False, | ||
| help_text="Whether Djangonauts can access their team detail pages. Automatically set to True when team welcome emails are sent. This will be ignored once session start date is in the past.", | ||
| ), | ||
| ), |
There was a problem hiding this comment.
This is likely due to a bad merge and should be removed.
|
|
||
| repos_by_owner.setdefault(owner, []) | ||
| for repo_name in configured_repos: | ||
| if repo_name not in repos_by_owner[owner]: |
There was a problem hiding this comment.
nitpick: I think we can drop this line due to the setdefault above.
| self.slug = slugify(f"{name}-{self.title}-{unique}") | ||
| if update_fields is not None and "title" in update_fields: | ||
| update_fields = {"slug"}.union(update_fields) | ||
| if not self.slug: |
There was a problem hiding this comment.
I think this may be due to a bad merge.
|
Hey @FarhanAliRaza, I'm writing my initial thoughts |
…lugs, and simplify repo deduplication - CollectStatsForm: return today instead of raising ValidationError for future start/end dates - Testimonial.save(): always regenerate slug on save, not just on creation - Simplify repo deduplication in Session using dict.fromkeys() instead of manual loop - Remove duplicate AlterField migration for djangonauts_have_access help text from 0051 - Add CollectStatsFormTests for future date normalization behavior - Remove stale test_save_with_update_fields_includes_slug test
raffaellasuardini
left a comment
There was a problem hiding this comment.
I also suggest using the team logic when creating the query to also be able to specify the project, plus the team members.
| self, repos: list[dict] | ||
| ) -> tuple[set[str], list[str]]: | ||
| """Build allowed repos for filtering and search scope terms for querying.""" | ||
| allowed_repos = set() |
There was a problem hiding this comment.
the allowed repos are not necessary because in the query, when org is checked automatically, search and only the linked repos
|
|
||
| return allowed_repos, list(dict.fromkeys(scope_terms)) | ||
|
|
||
| def collect_prs_for_repo( |
There was a problem hiding this comment.
this is only used on the test, even collect_issues_for_repo
| ) | ||
| if part | ||
| ) | ||
| for username in unique_usernames |
There was a problem hiding this comment.
for every djangonauts a query is created, but for this session, we have 20 djangonauts. I think we can take advantage of team, where there are lists of all members for a project
| Search results do not include the user's profile name. Using the login | ||
| avoids an extra GitHub API request for every result. | ||
| """ | ||
| return Author(github_username=github_username, name=github_username) |
There was a problem hiding this comment.
in the database, there are stored names for all djangonauts. not every login is self explainatory
| scope_clause, | ||
| f"author:{username}", | ||
| f"type:{item_type}", | ||
| f"created:{start_date}..{end_date}", |
There was a problem hiding this comment.
If you use the term 'created', it will correctly retrieve the list of open PRs, but not every merged PR. Only those opened and merged within the specified time period.
If a PR was created 10 days ago but merged today, it will not be retrieved.
I suggest making another API call with the 'merged' period.
There will also be consideration to avoid having merged PRs considered twice.
…, fail loudly on rate limits Drive GitHub stats collection off session.teams so each query is scoped to one team's project with that team's djangonauts as authors, and return a nested StatsReport(teams=[TeamReport(...)]) the admin can render directly per team. Each team runs a created: and merged: PR search (deduped by (repo, number)) so PRs opened before the window but merged inside it aren't dropped, plus one issue search. Author display uses the real profile name alongside the GitHub login. PyGithub's default retry is swapped for urllib3 Retry(total=0) so rate-limit 403s raise GithubException (already caught by the view) instead of silently sleeping. Removes the now-unused Project.github_repo_config, Session.get_monitored_github_repos, and the per-repo collector helpers; adds a logger entry in base.py so collector INFO output shows up in the console.
|
Thank you @raffaellasuardini, for the review. A bit late, but I fixed the points you reviewed. Now the admin view is also based on teams.
|





No description provided.