@whilo whilo commented Jun 23, 2025

  • Remove priority calculation from math pipeline (conversation.py)

    • Delete _importance_metric and _priority_metric methods
    • Remove priority computation from recompute() method
    • Math pipeline now focuses on PCA, clustering, and representativeness
  • Add dedicated priority calculation script (502_calculate_priorities.py)

    • Implements PriorityCalculator class with group-based extremity
    • Matches Clojure priority formula: (importance * scaling_factor)^2
    • Retrieves extremity values from Delphi_CommentExtremity table
    • Updates priorities in Delphi_CommentRouting table
  • Update pipeline execution order (run_delphi.py, run_delphi.sh)

    • Math pipeline → UMAP pipeline → Extremity calculation → Priority calculation
    • Ensures priorities use group-based extremity instead of PCA-based
    • Maintains separation of concerns between mathematical and priority calculations
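
The ordering above can be sketched as a small sequential runner. This is only an illustration: the stage names mirror the order listed in this description, while `run_pipeline`, `make_stage`, and the stage callables are hypothetical stand-ins, not the actual entry points in run_delphi.py or run_delphi.sh.

```python
from typing import Callable, List, Tuple

# A stage is a (name, callable) pair; the callable takes a conversation id.
Stage = Tuple[str, Callable[[str], None]]


def run_pipeline(conversation_id: str, stages: List[Stage]) -> List[str]:
    """Run stages strictly in order; an exception in one stage stops the rest."""
    completed = []
    for name, stage in stages:
        stage(conversation_id)  # raises on failure, so later stages never run
        completed.append(name)
    return completed


def make_stage(name: str, log: list) -> Callable[[str], None]:
    """Hypothetical stand-in for a real stage: it only records that it ran."""
    def stage(conversation_id: str) -> None:
        log.append((name, conversation_id))
    return stage


# Order taken from the PR description: priority runs last because it
# depends on the extremity values produced by the step before it.
STAGE_ORDER = ["math", "umap", "extremity", "priority"]

log: list = []
stages = [(name, make_stage(name, log)) for name in STAGE_ORDER]
completed = run_pipeline("36324", stages)
print(completed)
```

Stopping at the first failure is a design choice of this sketch. It keeps the priority step from running against missing extremity values.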

This fixes the priority calculation bug in which every priority came out as 0 because extremity values were missing. Priorities now use group-based extremity, as requested for the Pakistan conversation analysis.
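
As a rough illustration of the formula quoted above, `(importance * scaling_factor)^2`, here is a minimal sketch. It takes `importance` and `scaling_factor` as given, since this description does not say how either is derived. The function name `priority_from_importance` is made up for the example.

```python
def priority_from_importance(importance: float, scaling_factor: float) -> float:
    """Square of the product, as quoted in the PR description.

    How importance and scaling_factor are computed is not specified here;
    both are treated as plain inputs (an assumption of this sketch).
    """
    return (importance * scaling_factor) ** 2


# Squaring widens the gap between comments with different importance.
high = priority_from_importance(0.5, 2.0)
low = priority_from_importance(0.25, 2.0)
# Any input that drives the product to zero yields a priority of zero.
zero = priority_from_importance(0.0, 2.0)
print(high, low, zero)
```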

🤖 Generated with Claude Code

whilo and others added 2 commits June 23, 2025 18:04
- Extract priority formulas into polismath/conversation/priority.py
  * Create PriorityCalculator class with static methods for core formulas
  * Port importance_metric and priority_metric from Clojure implementation
  * Add convenience methods: calculate_comment_priority, validate_inputs, explain_priority
  * Pure mathematical logic with no I/O dependencies for better testability

- Refactor umap_narrative/502_calculate_priorities.py to use extracted formulas
  * Rename class from PriorityCalculator to PriorityService (clearer distinction)
  * Import and use PriorityCalculator.calculate_comment_priority()
  * Remove duplicate formula implementations (38 lines removed)
  * Focus service on DynamoDB operations and data orchestration

Benefits:
- Better separation of concerns: formulas vs data processing
- Improved testability: mathematical logic can be unit tested independently
- Enhanced reusability: priority formulas can be used in other contexts
- Easier maintenance: formula changes only need to happen in one place
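
The split described in this commit could look roughly like the sketch below. The class names `PriorityCalculator` and `PriorityService` and the method name `calculate_comment_priority` come from the commit message. The method's signature, the `PriorityStore` protocol, `InMemoryStore`, and `update` are assumptions made for illustration. The real service talks to DynamoDB, which the in-memory store here only stands in for.

```python
from typing import Dict, Protocol, Tuple


class PriorityCalculator:
    """Pure formulas with no I/O. The signature below is assumed, not the real one."""

    @staticmethod
    def calculate_comment_priority(importance: float, scaling_factor: float) -> float:
        return (importance * scaling_factor) ** 2


class PriorityStore(Protocol):
    """Hypothetical storage interface standing in for the DynamoDB tables."""

    def load_inputs(self, conversation_id: str) -> Dict[str, Tuple[float, float]]: ...

    def save_priorities(self, conversation_id: str, priorities: Dict[str, float]) -> None: ...


class InMemoryStore:
    """Test double: keeps inputs and results in plain dictionaries."""

    def __init__(self, inputs: Dict[str, Dict[str, Tuple[float, float]]]) -> None:
        self.inputs = inputs
        self.saved: Dict[str, Dict[str, float]] = {}

    def load_inputs(self, conversation_id: str) -> Dict[str, Tuple[float, float]]:
        return self.inputs[conversation_id]

    def save_priorities(self, conversation_id: str, priorities: Dict[str, float]) -> None:
        self.saved[conversation_id] = priorities


class PriorityService:
    """Orchestrates loading, calculating, and saving; holds no formula logic."""

    def __init__(self, store: PriorityStore) -> None:
        self.store = store

    def update(self, conversation_id: str) -> Dict[str, float]:
        inputs = self.store.load_inputs(conversation_id)
        priorities = {
            comment_id: PriorityCalculator.calculate_comment_priority(imp, scale)
            for comment_id, (imp, scale) in inputs.items()
        }
        self.store.save_priorities(conversation_id, priorities)
        return priorities


store = InMemoryStore({"demo": {"c1": (0.5, 2.0), "c2": (0.0, 2.0)}})
result = PriorityService(store).update("demo")
print(result)
```

Because the calculator has no dependencies, it can be unit tested with plain numbers, and the service can be tested against an in-memory store without touching a database.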

Tested successfully with conversation 36324:
- 807 comments processed
- Priority statistics: min=0, max=3, avg=2.36
- All priorities calculated using group-based extremity values

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
@whilo whilo marked this pull request as ready for review June 24, 2025 02:44