@cielbellerose cielbellerose commented Nov 6, 2025

Changes

Rule documents can now be uploaded on the ruleset page. An upload creates a new ruleset and populates it with the parsed rules. Parsed rules can currently be viewed on the edit rules page.

Notes

  • Extracts bulleted child rules (a, b, c, etc.) from a rule's content and adds them as separate rules
  • Duplicate rule codes currently just get a .duplicate suffix to avoid Prisma errors
  • The FSAE parser can get page numbers if we need them for images later

Test Cases

FSAE 2025

  • Rules parsed: 2824
  • Unsure if we want to cut ones like this into subrules
[Screenshot: example rule]

Screenshots

[Screenshots: 5 images attached]

Todo

  • FSAE: still need to remove the page footer "Formula SAE® Rules 2025 © 2024 SAE International Page 23 of 143 Version 1.0 31 Aug 2024" from some rules
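
One possible way to handle the footer cleanup, sketched under the assumption that the footer shows up inside a rule's text in exactly the format quoted in the Todo. The helper name `stripInlineFooterFSAE` and the sample sentence are hypothetical and not part of this PR.

```typescript
// Hypothetical helper (not in the PR): removes an FSAE page footer that was
// merged into the middle of a rule's content, along with surrounding whitespace.
const INLINE_FOOTER_FSAE =
  /\s*Formula SAE® Rules \d{4} © \d{4} SAE International Page \d+ of \d+ Version \d+\.\d+ \d{1,2} [A-Z][a-z]{2} \d{4}\s*/g;

const stripInlineFooterFSAE = (content: string): string =>
  // Replace each footer with a single space so the neighbouring words stay separated
  content.replace(INLINE_FOOTER_FSAE, ' ').trim();

// Made-up sample text with the footer from the Todo embedded in it
const sample =
  'The vehicle must be open wheeled Formula SAE® Rules 2025 © 2024 SAE International Page 23 of 143 Version 1.0 31 Aug 2024 and open cockpit';

console.log(stripInlineFooterFSAE(sample));
// → "The vehicle must be open wheeled and open cockpit"
```

If the footer text varies between documents (different year, version, or date format), the pattern would need to be loosened accordingly.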

Checklist

It can be helpful to check the Checks and Files changed tabs.
Please review the contributor guide and reach out to your Tech Lead if anything is unclear.
Please request reviewers and ping on slack only after you've gone through this whole checklist.

  • All commits are tagged with the ticket number
  • No linting errors / newline at end of file warnings
  • All code follows repository-configured prettier formatting
  • No merge conflicts
  • All checks passing
  • Screenshots of UI changes (see Screenshots section)
  • Remove any non-applicable sections of this template
  • Assign the PR to yourself
  • No yarn.lock changes (unless dependencies have changed)
  • Request reviewers & ping on Slack
  • PR is linked to the ticket (fill in the closes line below)

Closes #3622

@cielbellerose cielbellerose self-assigned this Nov 6, 2025
@chpy04 chpy04 force-pushed the feature/rules-dashboard branch from bf6ef93 to ceb5ff8 Compare November 21, 2025 12:07
Copilot AI left a comment

Pull request overview

This PR implements a rule dashboard parsing feature that allows users to upload PDF rule documents to create rulesets with automatically parsed rules. The feature supports FSAE and FHE parser types, extracts bulleted child rules (a, b, c) from rule content, and handles duplicate rule codes by appending a .duplicate suffix.

Changes:

  • Added PDF parsing functionality with support for FSAE and FHE ruleset formats
  • Implemented file upload workflow with validation and rule extraction
  • Enhanced UI with disabled state for buttons when rulesets have no rules
  • Added lazy loading for child rules in the rule tree view

Reviewed changes

Copilot reviewed 21 out of 22 changed files in this pull request and generated 10 comments.

| File | Description |
| --- | --- |
| yarn.lock | Added pdf-parse-new dependency for PDF parsing functionality |
| src/shared/src/types/rules-types.ts | Added ruleAmount field to Ruleset interface |
| src/frontend/src/utils/urls.ts | Added API URL endpoints for ruleset parsing and file upload |
| src/frontend/src/pages/RulesPage/components/RulesetTable.tsx | Added disabled state for Edit/View buttons when rulesets have no rules |
| src/frontend/src/pages/RulesPage/components/AddNewFileModal.tsx | Completely refactored file upload modal with improved UX and validation |
| src/frontend/src/pages/RulesPage/RulesetViewPage.tsx | Moved useSingleRuleset hook from RulesetEditPage for reusability |
| src/frontend/src/pages/RulesPage/RulesetTypePage.tsx | Added React import for proper typing |
| src/frontend/src/pages/RulesPage/RulesetPage.tsx | Implemented file upload, ruleset creation, and parsing workflow with error handling |
| src/frontend/src/pages/RulesPage/RulesetEditPage.tsx | Updated to use actual API hooks and lazy loading for rules |
| src/frontend/src/pages/RulesPage/RuleRow.tsx | Added lazy loading support for child rules |
| src/frontend/src/hooks/rules.hooks.ts | Added hooks for ruleset parsing, creation, and file upload |
| src/frontend/src/apis/rules.api.ts | Added API functions for parsing, creating rulesets, and uploading files |
| src/backend/tests/unit/rule.test.ts | Removed console.log statement |
| src/backend/src/utils/parse.utils.ts | Implemented comprehensive PDF parsing logic for FSAE and FHE formats |
| src/backend/src/transformers/rules.transformer.ts | Added ruleAmount calculation to ruleset transformer |
| src/backend/src/services/rules.services.ts | Added parseRuleset, uploadRulesetFile, and getRulesetById methods |
| src/backend/src/routes/rules.routes.ts | Added routes for ruleset parsing and file upload |
| src/backend/src/prisma/seed.ts | Reorganized seed data for rules and ruleset types |
| src/backend/src/prisma/seed-data/rules.seed.ts | Added FHE ruleset type to seed data |
| src/backend/src/prisma-query-args/rules.query-args.ts | Added getRulesetPreviewQueryArgs |
| src/backend/src/controllers/rules.controllers.ts | Added controllers for parsing rulesets and uploading files |
| src/backend/package.json | Added pdf-parse-new dependency |

Comment on lines 10 to 28
try {
const options = {
// max page number to parse, 0 = all pages
max: 0,
// errors: 0, warnings: 1, infos: 5
verbosityLevel: 0 as const
};
const pdfData = await pdf(buffer, options);

if (parserType === 'FSAE') {
return parseFSAERules(pdfData.text);
}
if (parserType === 'FHE') {
return parseFHERules(pdfData.text);
}
throw new Error(`Invalid parser type: ${parserType}. Must be 'FSAE' or 'FHE'`);
} catch (error) {
throw error;
}
Copilot AI Jan 19, 2026

This catch block with re-throw is unnecessary. The try-catch block adds no value since it just re-throws the error without any additional handling or transformation. Remove the try-catch wrapper entirely.

Suggested change
- try {
-   const options = {
-     // max page number to parse, 0 = all pages
-     max: 0,
-     // errors: 0, warnings: 1, infos: 5
-     verbosityLevel: 0 as const
-   };
-   const pdfData = await pdf(buffer, options);
-   if (parserType === 'FSAE') {
-     return parseFSAERules(pdfData.text);
-   }
-   if (parserType === 'FHE') {
-     return parseFHERules(pdfData.text);
-   }
-   throw new Error(`Invalid parser type: ${parserType}. Must be 'FSAE' or 'FHE'`);
- } catch (error) {
-   throw error;
- }
+ const options = {
+   // max page number to parse, 0 = all pages
+   max: 0,
+   // errors: 0, warnings: 1, infos: 5
+   verbosityLevel: 0 as const
+ };
+ const pdfData = await pdf(buffer, options);
+ if (parserType === 'FSAE') {
+   return parseFSAERules(pdfData.text);
+ }
+ if (parserType === 'FHE') {
+   return parseFHERules(pdfData.text);
+ }
+ throw new Error(`Invalid parser type: ${parserType}. Must be 'FSAE' or 'FHE'`);

* @param organizationId The ID of the organization the ruleset belongs to
* @returns The ruleset if found, otherwise throws an error
*/
static async getRulesetById(rulesetId: string, organizationId: string): Promise<RulesetPreview> {
Copilot AI Jan 19, 2026

Incorrect return type: The function signature indicates it returns RulesetPreview, but the frontend expects a Ruleset type (see useGetRuleset hook). The return type should be changed from RulesetPreview to Ruleset to match what rulesetTransformer returns and what the frontend expects.

Suggested change
- static async getRulesetById(rulesetId: string, organizationId: string): Promise<RulesetPreview> {
+ static async getRulesetById(rulesetId: string, organizationId: string): Promise<Ruleset> {

const ruleset = await prisma.ruleset.findFirst({
where: {
rulesetId,
deletedBy: null,
Copilot AI Jan 19, 2026

The where clause uses 'deletedBy: null' but the Ruleset model likely has a 'deletedByUserId' field. This should be 'deletedByUserId: null' to match the field name used elsewhere in the codebase (e.g., line 1317 in the parseRuleset method).

Suggested change
- deletedBy: null,
+ deletedByUserId: null,

Comment on lines 108 to 128
const handleDuplicateCodes = (rules: ParsedRule[]): ParsedRule[] => {
const seenRuleCodes = new Map<string, number>();

return rules.map((rule) => {
const originalCode = rule.ruleCode;

if (seenRuleCodes.has(originalCode)) {
// duplicate found
const count = seenRuleCodes.get(originalCode)!;
seenRuleCodes.set(originalCode, count + 1);
const suffix = count === 1 ? '.duplicate' : `.duplicate${count}`;

return {
...rule,
ruleCode: `${originalCode}${suffix}`
};
}
seenRuleCodes.set(originalCode, 1);
return rule;
});
};
Copilot AI Jan 19, 2026

When duplicate rule codes are renamed by appending .duplicate suffix, the parentRuleCode fields of their child rules are not updated to match. This will cause parent-child relationships to break. For example, if rule "T.1" is duplicated and becomes "T.1.duplicate", any child rules with parentRuleCode "T.1" will fail to find their parent. Consider updating parent references after handling duplicates, or tracking the mapping during the duplicate renaming process.

Contributor Author

yeah we should decide if we want to keep the .duplicate suffix (or possibly something else?) and def update the parent stuff here
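
A rough sketch of the "track the mapping during renaming" idea from the comment above. It keeps the PR's `.duplicate` suffix scheme and additionally repoints each child at the most recent rule that carried its parent's original code. This assumes rules arrive in document order with children after their parent, which may not hold for every ruleset. The names `handleDuplicateCodesWithParents` and `latestCodeFor` are hypothetical.

```typescript
interface ParsedRule {
  ruleCode: string;
  ruleContent: string;
  parentRuleCode?: string;
}

// Sketch only: same renaming as handleDuplicateCodes, plus parent remapping.
// Assumption: a child belongs to the latest preceding rule with its parent's code.
const handleDuplicateCodesWithParents = (rules: ParsedRule[]): ParsedRule[] => {
  const seenRuleCodes = new Map<string, number>();
  // original code -> unique code most recently assigned to it
  const latestCodeFor = new Map<string, string>();

  return rules.map((rule) => {
    const originalCode = rule.ruleCode;
    const count = seenRuleCodes.get(originalCode) ?? 0;
    seenRuleCodes.set(originalCode, count + 1);

    // Same suffix scheme as the PR: ".duplicate", then ".duplicate2", ".duplicate3", ...
    const suffix = count === 0 ? '' : count === 1 ? '.duplicate' : `.duplicate${count}`;
    const ruleCode = `${originalCode}${suffix}`;

    // Follow the parent to whatever unique code it ended up with
    const parentRuleCode =
      rule.parentRuleCode === undefined
        ? undefined
        : latestCodeFor.get(rule.parentRuleCode) ?? rule.parentRuleCode;

    latestCodeFor.set(originalCode, ruleCode);
    return { ...rule, ruleCode, parentRuleCode };
  });
};

const remapped = handleDuplicateCodesWithParents([
  { ruleCode: 'T.1', ruleContent: 'first', parentRuleCode: 'T' },
  { ruleCode: 'T.1.a', ruleContent: 'first child', parentRuleCode: 'T.1' },
  { ruleCode: 'T.1', ruleContent: 'second', parentRuleCode: 'T' },
  { ruleCode: 'T.1.a', ruleContent: 'second child', parentRuleCode: 'T.1' }
]);
console.log(remapped.map((r) => `${r.ruleCode} <- ${r.parentRuleCode}`));
// last entry: "T.1.a.duplicate <- T.1.duplicate"
```

Whether the suffix stays `.duplicate` or becomes something else, keeping the rename and the parent lookup in the same pass avoids a second traversal to repair references.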

  rulesetTypeTransformer,
  ruleTransformer
- } from './transformers/rules.transformers';
+ } from '../../../backend/src/transformers/rules.transformer';
Copilot AI Jan 19, 2026

The import path is incorrect. This should import from './transformers/rules.transformers' instead of '../../../backend/src/transformers/rules.transformer'. Frontend code should not import from backend code. The transformers should exist in the frontend's own transformers directory.

Suggested change
- } from '../../../backend/src/transformers/rules.transformer';
+ } from './transformers/rules.transformers';

if (parsedRules.length === 0) {
throw new HttpException(400, 'No rules found in provided file');
}
} catch (error) {
Copilot AI Jan 19, 2026

The error handling here swallows the original error details. If parseRulesFromPdf throws an HttpException with a specific status code or message, it will be replaced with a generic 500 error. Consider checking if the error is already an HttpException and rethrowing it, or at least preserving the original error message in development environments.

Suggested change
- } catch (error) {
+ } catch (error) {
+   if (error instanceof HttpException) {
+     throw error;
+   }
+   if (process.env && process.env.NODE_ENV === 'development') {
+     const message =
+       error instanceof Error ? error.message : 'Unknown error';
+     throw new HttpException(
+       500,
+       `Error parsing rules from PDF file: ${message}`
+     );
+   }

data: ruleSeedData.rulesetType1(batman.userId, ner.organizationId)
});

const fheRulesetType = await prisma.ruleset_Type.create({
Copilot AI Jan 19, 2026

Unused variable fheRulesetType.

Suggested change
- const fheRulesetType = await prisma.ruleset_Type.create({
+ await prisma.ruleset_Type.create({

data: ruleSeedData.rulesetType2(batman.userId, ner.organizationId)
});

const emptyRulesetType = await prisma.ruleset_Type.create({
Copilot AI Jan 19, 2026

Unused variable emptyRulesetType.

Suggested change
- const emptyRulesetType = await prisma.ruleset_Type.create({
+ await prisma.ruleset_Type.create({

data: ruleSeedData.ruleset1(fergus.carId, batman.userId, fsaeRulesetType.rulesetTypeId)
});

const secondActiveRuleset = await prisma.ruleset.create({
Copilot AI Jan 19, 2026

Unused variable secondActiveRuleset.

Suggested change
- const secondActiveRuleset = await prisma.ruleset.create({
+ await prisma.ruleset.create({

Copilot AI left a comment

Pull request overview

Copilot reviewed 21 out of 22 changed files in this pull request and generated 14 comments.


Comment on lines +1359 to +1365
const createdRules = await tx.rule.findMany({
where: { rulesetId },
select: {
ruleId: true,
ruleCode: true
}
});
Copilot AI Jan 19, 2026

The transaction fetches all rules for the ruleset using 'where: { rulesetId }', but if the ruleset already had existing rules from a previous parse attempt, this will include those old rules in the rule map. Consider adding a filter to only fetch rules created in this transaction, or ensure the ruleset is empty before parsing.
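
A minimal sketch of the "ensure the ruleset is empty before parsing" option. `RuleCounter` is a hypothetical stand-in for the small slice of the Prisma transaction client this would touch, `assertRulesetIsEmpty` is an illustrative name, and a plain `Error` is used here where the service would presumably throw its own `HttpException`.

```typescript
// Hypothetical stand-in for the part of the transaction client used below,
// so the sketch runs without a database.
interface RuleCounter {
  rule: { count(args: { where: { rulesetId: string } }): Promise<number> };
}

// Rejects if the ruleset already contains rules, so a later
// findMany({ where: { rulesetId } }) only sees rules from this parse.
const assertRulesetIsEmpty = async (tx: RuleCounter, rulesetId: string): Promise<void> => {
  const existing = await tx.rule.count({ where: { rulesetId } });
  if (existing > 0) {
    throw new Error(`Ruleset ${rulesetId} already has ${existing} rules; refusing to parse again`);
  }
};

// In-memory fake for illustration: pretends the ruleset holds `n` rules
const fakeTx = (n: number): RuleCounter => ({ rule: { count: async () => n } });

assertRulesetIsEmpty(fakeTx(0), 'ruleset-1').then(() => console.log('empty, safe to parse'));
assertRulesetIsEmpty(fakeTx(3), 'ruleset-1').catch((e: Error) => console.log(e.message));
```

Running the guard inside the same transaction as the inserts keeps the check and the writes consistent with each other.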

}

// "ARTICLE A1 FORMULA HYBRID + ELECTRIC OVERVIEW"
// Caputres "A1" as rule code, removes "ARTICLE" and adds rest as content
Copilot AI Jan 19, 2026

The comment "Caputres 'A1' as rule code" contains a spelling error. It should be "Captures" instead of "Caputres".

Suggested change
- // Caputres "A1" as rule code, removes "ARTICLE" and adds rest as content
+ // Captures "A1" as rule code, removes "ARTICLE" and adds rest as content

rulesRouter.post(
'/ruleset/:rulesetId/parse',
nonEmptyString(body('fileId')),
nonEmptyString(body('parserType')), // 'FSAE' or 'FHE'
Copilot AI Jan 19, 2026

The parserType validation only checks if it's a non-empty string but doesn't validate that it's specifically 'FSAE' or 'FHE'. Consider adding validation using express-validator's isIn(['FSAE', 'FHE']) to ensure only valid parser types are accepted at the API boundary.

Suggested change
- nonEmptyString(body('parserType')), // 'FSAE' or 'FHE'
+ nonEmptyString(body('parserType')), // 'FSAE' or 'FHE'
+ body('parserType').isIn(['FSAE', 'FHE']),
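
Alongside route-level validation, the accepted values could also live in one place as a type guard. The sketch below is illustrative only; `PARSER_TYPES`, `ParserType`, and `isParserType` are hypothetical names that do not appear in the PR.

```typescript
// Single source of truth for the accepted parser types (hypothetical constant)
const PARSER_TYPES = ['FSAE', 'FHE'] as const;
type ParserType = (typeof PARSER_TYPES)[number]; // 'FSAE' | 'FHE'

// Narrows an untrusted request value to the ParserType union
const isParserType = (value: unknown): value is ParserType =>
  typeof value === 'string' && PARSER_TYPES.some((t) => t === value);

const fromRequest: unknown = 'FHE';
if (isParserType(fromRequest)) {
  // Inside this branch, fromRequest has type 'FSAE' | 'FHE'
  console.log(`parsing with ${fromRequest} parser`);
}
```

A guard like this lets the service layer narrow the string itself instead of relying only on the route having validated it.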

Comment on lines 1 to 381
import pdf from 'pdf-parse-new';

export interface ParsedRule {
ruleCode: string;
ruleContent: string;
parentRuleCode?: string;
}

export const parseRulesFromPdf = async (buffer: Buffer, parserType: 'FSAE' | 'FHE'): Promise<ParsedRule[]> => {
const options = {
// max page number to parse, 0 = all pages
max: 0,
// errors: 0, warnings: 1, infos: 5
verbosityLevel: 0 as const
};
const pdfData = await pdf(buffer, options);

if (parserType === 'FSAE') {
return parseFSAERules(pdfData.text);
}
if (parserType === 'FHE') {
return parseFHERules(pdfData.text);
}
throw new Error(`Invalid parser type: ${parserType}. Must be 'FSAE' or 'FHE'`);
};

/**
* Extracts lettered sub-rules from rule content (a, b, c, etc.)
* "EV.5.2 Main text a. Sub-rule" becomes:
* - EV.5.2 Main text
* - EV.5.2.a Sub-rule
* If no subrules exist, returns the original rule
* @param ruleCode parent rule code
* @param content rule content to extract from
* @returns array of parsed rules including main rule and any subrules
*/
const extractSubRules = (ruleCode: string, content: string): ParsedRule[] => {
const letterPattern = /\s+([a-z])\.\s+/g;
const matches = [...content.matchAll(letterPattern)];

if (matches.length === 0) {
// no subrules found, return original rule
return [
{
ruleCode,
ruleContent: content.trim(),
parentRuleCode: findParentRuleCode(ruleCode)
}
];
}
const subRules: ParsedRule[] = [];

// Extract the main rule content (everything before the first lettered item)
const firstMatchIndex = matches[0].index!;
const mainContent = content.substring(0, firstMatchIndex).trim();

// add main rule
subRules.push({
ruleCode,
ruleContent: mainContent,
parentRuleCode: findParentRuleCode(ruleCode)
});

// Extract lettered sub-rules
for (let i = 0; i < matches.length; i++) {
const [, letter] = matches[i];
const startIndex = matches[i].index! + matches[i][0].length;

// Find where this sub-rule ends (either at next letter or end of rule content)
const endIndex = i < matches.length - 1 ? matches[i + 1].index! : content.length;
const subRuleContent = content.substring(startIndex, endIndex).trim();
const subRuleCode = `${ruleCode}.${letter}`;

subRules.push({
ruleCode: subRuleCode,
ruleContent: subRuleContent,
parentRuleCode: ruleCode
});
}
return subRules;
};

/**
* Determines parent rule code by removing last value.
* Top level rules return undefined.
* EV.5.2.2 -> EV.5.2
* GR -> undefined
* @param ruleCode rule code to find a parent for
* @returns Parent rule code, or undefined if top level
*/
const findParentRuleCode = (ruleCode: string): string | undefined => {
const parts = ruleCode.split('.');
if (parts.length <= 1) {
return undefined;
}
return parts.slice(0, -1).join('.');
};

/**
* Updates rules with duplicate rule codes by appending .duplicate suffix
* @param rules array of parsed rules
* @returns array of rules without duplicate rule codes
*/
const handleDuplicateCodes = (rules: ParsedRule[]): ParsedRule[] => {
const seenRuleCodes = new Map<string, number>();

return rules.map((rule) => {
const originalCode = rule.ruleCode;

if (seenRuleCodes.has(originalCode)) {
// duplicate found
const count = seenRuleCodes.get(originalCode)!;
seenRuleCodes.set(originalCode, count + 1);
const suffix = count === 1 ? '.duplicate' : `.duplicate${count}`;

return {
...rule,
ruleCode: `${originalCode}${suffix}`
};
}
seenRuleCodes.set(originalCode, 1);
return rule;
});
};

/**************** FSAE ****************/

const parseFSAERules = (text: string): ParsedRule[] => {
const rules: ParsedRule[] = [];
const lines = text.split('\n');

let currentRule: { code: string; text: string } | null = null;

const saveCurrentRule = () => {
if (!currentRule) return;
const parsedRules = extractSubRules(currentRule.code, currentRule.text);
rules.push(...parsedRules);
};

for (const line of lines) {
const trimmedLine = line.trim();
if (!trimmedLine) continue;

// Skip page headers/footers
if (isHeaderFooterFSAE(trimmedLine)) {
continue;
}

// Skip table of contents
if (/\.{4,}\s+\d+\s*$/.test(trimmedLine)) {
continue;
}

// Check if this line starts a new rule
const rule = parseRuleNumberFSAE(trimmedLine);
if (rule) {
saveCurrentRule();
currentRule = {
code: rule.ruleCode,
text: rule.ruleContent
};
} else if (currentRule) {
currentRule.text += ' ' + trimmedLine; // else append to existing rule
}
}
saveCurrentRule();

const fixedRules = fixOrphanedRulesFSAE(rules);
return handleDuplicateCodes(fixedRules);
};

/**
* Determines if this line starts a new rule, if so extracts code and content of the rule
* Matches rule pattern (e.g. GR.1.1 some text) or section pattern (e.g. GR - TEXT)
* @param line single line in the extracted text from the ruleset pdf
* @returns rule code and content, or null if this line does not start a new rule
*/
const parseRuleNumberFSAE = (line: string): ParsedRule | null => {
// Match rule patterns like "GR.1.1" followed by text
const rulePattern = /^([A-Z]{1,4}(?:\.[\d]+)+)\s+(.+)$/;
// Match section patterns like "GR - GENERAL REGULATIONS or PS - PRE-COMPETITION SUBMISSIONS"
const sectionPattern = /^([A-Z]{1,4})\s*-\s*(.+)$/;

const match = line.match(rulePattern) || line.match(sectionPattern);
if (match) {
const cleanContent = match[2].replace(/\.{5,}/g, '.....');
return {
ruleCode: match[1],
ruleContent: cleanContent
};
}
return null;
};

/**
* Checks if a line is a page header/footer that should be skipped
* @param line line to check
* @returns true if line should be skipped
*/
const isHeaderFooterFSAE = (line: string): boolean => {
const trimmed = line.trim();

// Match FSAE headers like "Formula SAE® Rules 2025 © 2024 SAE International Page 7 of 143 Version 1.0 31 Aug 2024"
if (/Formula SAE.*Rules.*\d{4}.*SAE International.*Page \d+ of \d+/i.test(trimmed)) {
return true;
}
// Match standalone page numbers
if (/^Page \d+ of \d+$/i.test(trimmed)) {
return true;
}
// Match version strings
if (/^Version \d+\.\d+.*\d{1,2}\s+(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)\s+\d{4}$/i.test(trimmed)) {
return true;
}

return false;
};

/**
* Updates rules to point to nearest existing parent if their assigned parent doesn't exist.
* D.8.1.2 -> checks for D.8.1, if missing goes to D.8, then D
* @param rules array of parsed rules
* @returns rules with corrected parent references
*/
const fixOrphanedRulesFSAE = (rules: ParsedRule[]): ParsedRule[] => {
const existingCodes = new Set(rules.map((r) => r.ruleCode));

return rules.map((rule) => {
// skip if no parent or parent exists
if (!rule.parentRuleCode || existingCodes.has(rule.parentRuleCode)) {
return rule; // Top-level rule
}

// Set parent doesn't exist, walk up the hierarchy
const parts = rule.ruleCode.split('.');
for (let i = parts.length - 2; i > 0; i--) {
const ancestorCode = parts.slice(0, i).join('.');
if (existingCodes.has(ancestorCode)) {
return { ...rule, parentRuleCode: ancestorCode };
}
}

// No ancestor exists, becomes top-level
return { ...rule, parentRuleCode: undefined };
});
};

/**************** FHE *****************/

const parseFHERules = (text: string): ParsedRule[] => {
const rules: ParsedRule[] = [];
const lines = text.split('\n');
let inRulesSection = false;
let currentRule: { code: string; text: string } | null = null;

const saveCurrentRule = () => {
if (!currentRule) return;
const parsedRules = extractSubRules(currentRule.code, currentRule.text);
rules.push(...parsedRules);
};

for (const line of lines) {
const trimmedLine = line.trim();
if (!trimmedLine) continue;
if (/^Index of Tables/i.test(trimmedLine)) {
inRulesSection = true;
}
// Skip table of contents
if (inRulesSection) {
if (/^2025 Formula Hybrid.*Rules/i.test(trimmedLine)) {
saveCurrentRule();
currentRule = null;
continue;
}

// Check if this line starts a new rule
const rule = parseRuleNumberFHE(trimmedLine);
if (rule) {
saveCurrentRule();
currentRule = {
code: rule.ruleCode,
text: rule.ruleContent
};
} else if (currentRule) {
// Append to existing rule
currentRule.text += ' ' + trimmedLine;
}
}
}
saveCurrentRule();

const fixedRules = fixOrphanedRulesFHE(rules);
return handleDuplicateCodes(fixedRules);
};

/**
* Determines if this line starts a new rule, if so extracts code and content of the rule
* Matches three patterns: rule ("1T3.17.1 Text"), part ("PART A1 - Text"), and article ("ARTICLE A1 Text")
* @param line single line in the extracted text from the ruleset pdf
* @returns rule code and content, or null if this line does not start a new rule
*/
const parseRuleNumberFHE = (line: string): ParsedRule | null => {
// Match FHE rule patterns like "1T3.17.1" followed by text
const rulePattern = /^(\d+[A-Z]+\d+(?:\.\d+)*)\s+(.+)$/;

// "PART A1 - ADMINISTRATIVE REGULATIONS" removes "PART" and captures "A1" as rule code, rest as content
const partMatch = line.match(/^PART\s+([A-Z0-9]+)\s+-\s+(.+)$/);
if (partMatch) {
return {
ruleCode: partMatch[1], // "A1", not "PART A1"
ruleContent: partMatch[2]
};
}

// "ARTICLE A1 FORMULA HYBRID + ELECTRIC OVERVIEW"
// Caputres "A1" as rule code, removes "ARTICLE" and adds rest as content
const articleMatch = line.match(/^ARTICLE\s+([A-Z]+\d+)\s+(.+)$/);
if (articleMatch) {
return {
ruleCode: articleMatch[1], // "A11", not "ARTICLE A11"
ruleContent: articleMatch[2]
};
}

const match = line.match(rulePattern);
if (match) {
return {
ruleCode: match[1],
ruleContent: match[2]
};
}

return null;
};

/**
* Updates rules to point to nearest existing parent if their assigned parent doesn't exist.
* D.8.1.2 -> checks for D.8.1, if missing goes to D.8, then D
* Also for FHE formatting 1A11.1 -> checks for 1A11, if missing tries A11 (article format)
* @param rules array of parsed rules
* @returns rules with corrected parent references
*/
const fixOrphanedRulesFHE = (rules: ParsedRule[]): ParsedRule[] => {
const existingCodes = new Set(rules.map((r) => r.ruleCode));

return rules.map((rule) => {
// skip if no parent or parent exists
if (!rule.parentRuleCode || existingCodes.has(rule.parentRuleCode)) {
return rule;
}

// Set parent doesn't exist, walk up the hierarchy
const parts = rule.ruleCode.split('.');
for (let i = parts.length - 2; i > 0; i--) {
const ancestorCode = parts.slice(0, i).join('.');

if (existingCodes.has(ancestorCode)) {
return { ...rule, parentRuleCode: ancestorCode };
}

// Also check stripped version (1A5 -> A5)
if (/^\d+[A-Z]+/.test(ancestorCode)) {
const strippedAncestor = ancestorCode.replace(/^\d+/, '');
if (existingCodes.has(strippedAncestor)) {
return { ...rule, parentRuleCode: strippedAncestor };
}
}
}

// Special case: if parent is like "1A11" and doesn't exist, try "A11" (article format)
// This handles rules like "1A11.1" whose parent "1A11" doesn't exist but should be "A11"
if (rule.parentRuleCode && /^\d+[A-Z]+\d+$/.test(rule.parentRuleCode)) {
const withoutLeadingDigit = rule.parentRuleCode.substring(1); // "1A11" -> "A11"
if (existingCodes.has(withoutLeadingDigit)) {
return { ...rule, parentRuleCode: withoutLeadingDigit };
}
}

return { ...rule, parentRuleCode: undefined };
});
};
Copilot AI Jan 19, 2026

The new parsing logic in parse.utils.ts lacks test coverage. Given the complexity of PDF parsing, text extraction, sub-rule extraction, duplicate handling, and parent-child relationship fixing, comprehensive unit tests should be added to ensure the parsing logic works correctly for various edge cases and input formats.
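
As a starting point for such coverage, here is a small self-contained sketch that exercises `findParentRuleCode`. The function body is copied from the quoted file so the example runs on its own. In the repository the helper would need to be exported to be testable, and `expectEqual` is a hypothetical stand-in for whatever assertion API the project's test runner provides.

```typescript
// Copied from the PR's parse.utils.ts so this sketch is runnable in isolation
const findParentRuleCode = (ruleCode: string): string | undefined => {
  const parts = ruleCode.split('.');
  if (parts.length <= 1) {
    return undefined;
  }
  return parts.slice(0, -1).join('.');
};

// Hypothetical stand-in for a test framework assertion
const expectEqual = <T>(actual: T, expected: T, label: string): void => {
  if (actual !== expected) {
    throw new Error(`${label}: expected ${String(expected)}, got ${String(actual)}`);
  }
};

// Cases taken from the doc comments in the file, plus an FHE-style code
expectEqual(findParentRuleCode('EV.5.2.2'), 'EV.5.2', 'nested FSAE rule');
expectEqual(findParentRuleCode('GR'), undefined, 'top-level section');
expectEqual(findParentRuleCode('EV.5.2.a'), 'EV.5.2', 'lettered sub-rule');
expectEqual(findParentRuleCode('1T3.17.1'), '1T3.17', 'FHE-style rule');
console.log('findParentRuleCode cases passed');
```

Similar table-style cases could then be added for sub-rule extraction, duplicate handling, and the orphan-fixing functions.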

return rule; // Top-level rule
}

// Set parent doesn't exist, walk up the hierarchy
Copilot AI Jan 19, 2026

The comment says "Set parent doesn't exist" which is grammatically unclear. It should be "If parent doesn't exist" or "Parent doesn't exist" to improve clarity.

Suggested change
- // Set parent doesn't exist, walk up the hierarchy
+ // If parent doesn't exist, walk up the hierarchy

data: { parentRuleId: parentId }
});
})
.filter(Boolean);
Copilot AI Jan 19, 2026

Promise.all is used with an array that may contain null values (from .filter(Boolean)). While filter(Boolean) removes falsy values, TypeScript may not properly narrow the type. Consider using a type assertion or filtering with a proper type guard to ensure type safety, or explicitly type the filtered array.

Suggested change
- .filter(Boolean);
+ .filter(
+   (update): update is ReturnType<typeof tx.rule.update> =>
+     update !== null
+ );
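
The same narrowing can be written once as a reusable generic guard. This is a sketch with made-up sample data; `isPresent` is a hypothetical helper name and is not part of the PR.

```typescript
// Generic guard: tells TypeScript that null/undefined entries are gone after filtering
const isPresent = <T>(value: T | null | undefined): value is T =>
  value !== null && value !== undefined;

// Stand-in data: parent IDs where some lookups failed and produced null
const maybeParentIds: (string | null)[] = ['rule-1', null, 'rule-2'];

// With the guard the result is typed string[];
// with .filter(Boolean) the element type would stay string | null.
const parentIds: string[] = maybeParentIds.filter(isPresent);
console.log(parentIds);
```

Unlike `filter(Boolean)`, this guard also keeps values such as `0` or `''`, which may or may not be wanted depending on the array's contents.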

return rule;
}

// Set parent doesn't exist, walk up the hierarchy
Copilot AI Jan 19, 2026

The comment says "Set parent doesn't exist" which is grammatically unclear. It should be "If parent doesn't exist" or "Parent doesn't exist" to improve clarity.

Suggested change
- // Set parent doesn't exist, walk up the hierarchy
+ // If parent doesn't exist, walk up the hierarchy

@Aryan0102 Aryan0102 left a comment

LGTM!

@Aryan0102 Aryan0102 merged commit 23158ca into feature/rules-dashboard Jan 20, 2026
4 checks passed
@Aryan0102 Aryan0102 deleted the 3622-rulesdashboard-parsing branch January 20, 2026 03:48
5 participants