remove duplicates of languages from tooling-data.yml #1512

Closed

Conversation

Vishv0407
Contributor

What kind of change does this PR introduce?
Bug fix: removes language entries that duplicate one another when compared case-insensitively, such as 'JavaScript' and 'Javascript'.

Issue Number:

Screenshots/videos:
image

If relevant, did you update the documentation?
No.

Summary
This PR mainly updates the language entries that contained duplicates.

Does this PR introduce a breaking change?
No

@Vishv0407 Vishv0407 requested a review from a team as a code owner March 12, 2025 19:34

github-actions bot commented Mar 12, 2025

built with Refined Cloudflare Pages Action

⚡ Cloudflare Pages Deployment

Name Status Preview Last Commit
website ✅ Ready (View Log) Visit Preview 7cedaf6


codecov bot commented Mar 12, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 100.00%. Comparing base (219521e) to head (7cedaf6).

Additional details and impacted files
@@            Coverage Diff            @@
##              main     #1512   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           10        10           
  Lines          396       396           
  Branches       106       106           
=========================================
  Hits           396       396           


@Vishv0407
Contributor Author

Vishv0407 commented Mar 12, 2025

@benjagm I recently opened a PR (since closed), but it failed a check because it changed an unauthorized file. Here is the proposal before I make changes to those files.

Feature - Adds case-insensitive unique validation for language entries

Screenshots/videos:
I deliberately introduced mistakes in a language name:
image

The validator catches the mistake:
image

Summary
This PR introduces case-insensitive unique validation for language entries in the tooling data to solve several existing problems:

  1. Inconsistent language casing across tools (e.g., "JavaScript" vs "javascript" vs "JAVASCRIPT")
  2. Potential confusion for users seeing the same language listed multiple times

My solution:
Implements a custom AJV keyword caseInsensitiveUnique that:

  • Detects and reports case-insensitive duplicates using a Set
  • Provides clear error messages for easy fixes
    // Assumes `ajv` is an existing Ajv instance.
    ajv.addKeyword({
      keyword: 'caseInsensitiveUnique',
      type: 'array',
      // Named function expression so `validate.errors` below resolves to this function.
      validate: function validate(schema, data) {
        if (!Array.isArray(data)) return false;

        // Collect every distinct spelling and every distinct lowercase form.
        const languagesSet = new Set();
        const languagesLowercaseSet = new Set();
        data.forEach((tool) => {
          if (tool && Array.isArray(tool.languages)) {
            tool.languages.forEach((language) => {
              languagesSet.add(language);
              languagesLowercaseSet.add(language.toLowerCase());
            });
          }
        });

        // Fewer lowercase forms than spellings means some spellings differ only by case.
        if (languagesSet.size !== languagesLowercaseSet.size) {
          console.error('Duplicate languages found');

          // Count how many distinct spellings share each lowercase form.
          const lowercaseMap = new Map();
          languagesSet.forEach((language) => {
            const key = language.toLowerCase();
            lowercaseMap.set(key, (lowercaseMap.get(key) || 0) + 1);
          });
          lowercaseMap.forEach((count, key) => {
            if (count > 1) {
              console.error('Duplicate found for:', key);
            }
          });

          // Ajv reads custom-keyword errors from the validate function itself.
          validate.errors = [{
            keyword: 'caseInsensitiveUnique',
            message: 'array contains case-insensitive duplicates',
            params: { keyword: 'caseInsensitiveUnique' }
          }];
          return false;
        }
        return true;
      }
    });
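
To make the error easier to act on, the duplicate check can also report which spellings collide, not only the lowercase key. The following is a minimal, dependency-free sketch of that idea; `findCaseInsensitiveDuplicates` and the sample tool objects are hypothetical names used for illustration and are not part of this PR or of Ajv.

```javascript
// Hypothetical helper (not part of the PR): groups language names by their
// lowercase form and returns only the groups that have more than one spelling.
function findCaseInsensitiveDuplicates(tools) {
  const variantsByKey = new Map();
  for (const tool of tools) {
    // Skip entries without a usable `languages` array.
    if (!tool || !Array.isArray(tool.languages)) continue;
    for (const language of tool.languages) {
      if (typeof language !== 'string') continue;
      const key = language.toLowerCase();
      if (!variantsByKey.has(key)) variantsByKey.set(key, new Set());
      variantsByKey.get(key).add(language);
    }
  }

  // Keep only the keys that were spelled in more than one way.
  const duplicates = {};
  for (const [key, variants] of variantsByKey) {
    if (variants.size > 1) duplicates[key] = [...variants];
  }
  return duplicates;
}

// Made-up sample data for illustration only.
const sampleTools = [
  { name: 'tool-a', languages: ['JavaScript', 'Python'] },
  { name: 'tool-b', languages: ['Javascript'] },
  { name: 'tool-c' },
];
console.log(findCaseInsensitiveDuplicates(sampleTools));
```

With the sample data, the returned object maps `javascript` to both spellings, which could be included in the error message so maintainers know exactly which entries to fix.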

For Tool Maintainers:

# Before
languages:
  - "JavaScript"
  - "javascript"
  - "Python"
  - "PYTHON"

# After
languages:
  - "JavaScript"  # Use consistent casing from schema enum
  - "Python"

What's your feedback on this approach?

@DhairyaMajmudar
Member

Thanks for the PR @Vishv0407 but it seems these changes are already covered in your PR #1513

Closing this in favor of it.

Development

Successfully merging this pull request may close these issues.

🐛 Bug: Two JavaScript labels for filter on Tools page
2 participants