Copilot AI commented Oct 30, 2025

XREF Refactoring - Fixed File Storage Implementation

This PR refactors how saxref saves XREF branches to move from Dali IPT (in-memory property tree) to file-based storage on the Sasha plane.

Latest Changes (Addressing Review Feedback):

CXRefFilesNode Changes

  • Added xrefPath support: CXRefFilesNode now tracks xrefPath and branchName
  • Smart queryData(): Loads from file if xrefPath set, otherwise from "data" attribute in Dali
  • Smart Deserialize(): Saves to file if xrefPath set, otherwise to "data" attribute in Dali
  • No Dali pollution: File-based data never written to Dali's "data" attribute

XRefNodeManager Changes

  • Removed loadBranchFromPath: Previous approach incorrectly loaded file data into Dali
  • Updated accessor methods: getLostFiles, getFoundFiles, getOrphanFiles now call setXRefPath on CXRefFilesNode
  • File-based serialize/deserialize: serializeMessages/Directories and deserializeMessages/Directories handle files directly
  • No commits for file storage: File-based branches don't get committed to Dali

Storage Path Structure

<sasha_prefix>/xref/<cluster>/<YYYY-MM-DD>/<branch>.xml
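As a rough illustration of that layout, the path for one branch file could be assembled as below. This is a minimal sketch in standard C++ only; makeXrefBranchPath is a hypothetical helper, not a function from the PR, and the actual code builds the directory with the platform's own StringBuffer utilities (e.g. addPathSepChar).

```cpp
#include <string>

// Hypothetical helper (not part of the PR): joins the components of
// <sasha_prefix>/xref/<cluster>/<YYYY-MM-DD>/<branch>.xml
// A trailing '/' on the prefix is tolerated.
std::string makeXrefBranchPath(const std::string &sashaPrefix,
                               const std::string &cluster,
                               const std::string &datestamp, // expected as YYYY-MM-DD
                               const std::string &branch)
{
    std::string path(sashaPrefix);
    if (!path.empty() && path.back() != '/')
        path += '/';
    path += "xref/";
    path += cluster;
    path += '/';
    path += datestamp;
    path += '/';
    path += branch;
    path += ".xml";
    return path;
}
```

Called once per branch name (Orphans, Lost, Found, Directories, Messages), this yields one XML file per branch under the same dated directory.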

How It Works

File-based storage (xrefPath exists):

  • CXRefFilesNode.queryData() reads from <xrefPath>/<branchName>.xml
  • CXRefFilesNode.Deserialize() writes to <xrefPath>/<branchName>.xml
  • Data stays on filesystem, never stored in Dali

Dali-based storage (no xrefPath):

  • CXRefFilesNode.queryData() reads from branch's "data" attribute
  • CXRefFilesNode.Deserialize() writes to branch's "data" attribute
  • Traditional behavior preserved
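The two modes above can be sketched roughly as follows. This is an illustration of the dispatch only, written in standard C++ under stated assumptions: FakeBranch stands in for the Dali branch (the real code uses IPropertyTree), and FilesNodeSketch, loadData and saveData are hypothetical names, not the PR's CXRefFilesNode API.

```cpp
#include <fstream>
#include <map>
#include <sstream>
#include <string>
#include <utility>

// Assumption: a plain attribute map stands in for the Dali branch.
using FakeBranch = std::map<std::string, std::string>;

class FilesNodeSketch
{
    FakeBranch &branch;     // Dali-side branch; only touched in Dali mode
    std::string branchName; // e.g. "Lost", "Found", "Orphans"
    std::string xrefPath;   // empty means "no xrefPath", i.e. Dali mode

    std::string filePath() const { return xrefPath + "/" + branchName + ".xml"; }

public:
    FilesNodeSketch(FakeBranch &_branch, std::string _branchName)
        : branch(_branch), branchName(std::move(_branchName)) {}

    void setXRefPath(const std::string &path) { xrefPath = path; }

    // File mode: read <xrefPath>/<branchName>.xml. Dali mode: read "data".
    std::string loadData() const
    {
        if (!xrefPath.empty())
        {
            std::ifstream in(filePath(), std::ios::binary);
            std::ostringstream out;
            out << in.rdbuf(); // yields an empty string if the file is missing
            return out.str();
        }
        auto it = branch.find("data");
        return it == branch.end() ? std::string() : it->second;
    }

    // File mode: write the file and leave the branch alone. Dali mode: set "data".
    void saveData(const std::string &xml)
    {
        if (!xrefPath.empty())
        {
            std::ofstream out(filePath(), std::ios::binary | std::ios::trunc);
            out << xml;
        }
        else
            branch["data"] = xml;
    }
};
```

The point of keeping the choice inside one class is that callers such as the accessor methods only need to call setXRefPath when a path was recorded; everything else stays the same for old Dali-stored data.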

Configuration

<DfuXRef useSashaPlane="true"/>  <!-- Expert option, defaults to isContainerized() -->
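The default described in that comment amounts to "use the configured value if present, otherwise follow the deployment type". A minimal sketch, assuming C++17; resolveUseSashaPlane is a hypothetical name, and the containerized flag stands in for the platform's isContainerized() result:

```cpp
#include <optional>

// Hypothetical helper: 'configured' is empty when useSashaPlane is not set,
// in which case the containerized flag decides.
bool resolveUseSashaPlane(std::optional<bool> configured, bool containerized)
{
    return configured.value_or(containerized);
}
```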
Original prompt

Refactor the way saxref saves its branches to Dali (via committing them to an IPT in Dali), and instead look up the Sasha plane and save them there. A new named directory should be created for the edition under the sasha prefix, and under "xref", for the cluster it is dealing with, with a datestamp subdir under that.

Then update the Dali XREF cluster tree with reference info so a client can find this path.

Then look at dali/dfuXRefLib/XRefNodeManager.cpp, and how it currently fetches "data" in the CXRefNode ctor. Instead, use the newly recorded file path to read from.
NB: the path saved by saxref.cpp should be in the form of a hostname-based URL if it is a non-local path.
NB2: the XRefNodeManager.cpp changes should be backward compatible, IOW, if path metadata is there, it should use it, but it should fall back to reading "data" if not.

Avoid making extensive changes and unrelated improvements where at all possible.


Copilot AI and others added 3 commits October 30, 2025 13:03
- Modified saxref.cpp to save XREF branches to Sasha plane instead of Dali IPT
- Created directory structure: <sasha_prefix>/xref/<cluster>/<datestamp>/
- Store path reference in Dali cluster tree instead of full data
- Updated XRefNodeManager.cpp to read from path-based files
- Maintained backward compatibility by falling back to "data" attribute

Co-authored-by: jakesmith <[email protected]>
Copilot AI changed the title from "[WIP] Refactor saxref branch saving to use Sasha plane" to "Refactor XREF to use Sasha plane file storage instead of Dali IPT" Oct 30, 2025
Copilot finished work on behalf of jakesmith October 30, 2025 13:12
Copilot AI requested a review from jakesmith October 30, 2025 13:12
Owner

@jakesmith jakesmith left a comment

@copilot - see comments.

{
StringBuffer planeName;
// Look for sasha plane
if (getDefaultPlane(planeName, nullptr, "sasha"))
Owner

@copilot - There should always be a sasha plane category defined.
This should throw an error if one cannot be found.

Author

Fixed in c7c4bbb - Now throws error if Sasha plane category not found instead of falling back.

if (sashaPlane)
{
StringBuffer prefix;
if (sashaPlane->getProp("@prefix", prefix))
Owner

@copilot - the above calls could be simplified by using getDefaultPlaneDirectory to get the prefix directly (and throwing an error if it fails).

Author

Fixed in c7c4bbb - Simplified to use getDefaultPlane + getStoragePlaneConfig. Note: getDefaultPlaneDirectory is a static function not available in the header.

addPathSepChar(sashaDir).append(datestamp);

// Create the directory
Owned<IFile> dir = createIFile(sashaDir.str());
Owner

@copilot - use recursiveCreateDirectoryForFile instead and fire an error if it fails.

We should not fail over to writing to Dali.

Author

Fixed in c7c4bbb - Now uses recursiveCreateDirectoryForFile and throws error on failure (no fallback to Dali).

saveSuccess = saveBranchToSashaPlane(sashaDir.str(), "Orphans", orphansbranch) && saveSuccess;
saveSuccess = saveBranchToSashaPlane(sashaDir.str(), "Lost", lostbranch) && saveSuccess;
saveSuccess = saveBranchToSashaPlane(sashaDir.str(), "Found", foundbranch) && saveSuccess;
saveSuccess = saveBranchToSashaPlane(sashaDir.str(), "Directories", dirbranch) && saveSuccess;
Owner

@copilot - use 'warn' in saveBranchToSashaPlane to preserve and publish the warning, but don't bother with 'saveSuccess' and a return result.

Author

Fixed in c7c4bbb - saveBranchToSashaPlane now uses warn() and has no return value or saveSuccess tracking.

}
saveSuccess = saveBranchToSashaPlane(sashaDir.str(), "Messages", message) && saveSuccess;

if (saveSuccess)
Owner

@copilot - see previous comment. Get rid of 'saveSuccess'.

Author

Fixed in c7c4bbb - Removed saveSuccess tracking completely.

// Store path reference in Dali instead of full data
// Convert to hostname-based URL if not a local path
StringBuffer pathUrl;
if (!isAbsolutePath(sashaDir.str()))
Owner

@copilot - this is in fact not necessary; simply store the sashaDir in all cases.

Author

Fixed in c7c4bbb - Now stores plain sashaDir path directly (no URL conversion).

}
}

if (!useSashaPlane)
Owner

Given we are not going to fail over to this, just make this the else branch of if (useSashaPlane).

Author

Fixed in c7c4bbb - Made it a simple else branch with no fallback logic.

{
StringBuffer basePath(xrefPath);
// Handle file:// URLs
if (hasPrefix(basePath, "file://", false))
Owner

@copilot - this is not necessary, remove this block.

Author

Fixed in c7c4bbb - Removed file:// URL handling block entirely.

}

// Fall back to loading from "data" attribute if path not available or failed
if (!m_XRefTree->hasProp("Orphans") && !m_XRefTree->hasProp("Lost") &&
Owner

@copilot - don't fail over like this; simply make this the else branch if xrefPath is missing.

Author

Fixed in c7c4bbb - Made it a simple else branch instead of fallback logic.


}

CXRefNode::CXRefNode(IPropertyTree* pTreeRoot)
Owner

@copilot - in fact, all changes to this version of the ctor should be undone. This ctor is unused; leave it untouched for now.

The correct place to handle xrefPath is in places like CXRefNode::getOrphanFiles(), getFoundFiles, getLostFiles, getDataTree, and CXRefFilesNode (e.g. queryData()), which currently looks at "data" and then either uses that MemoryBuffer to create an IPT (in queryDataTree), or, in Serialize, returns it in a StringBuffer.

These methods need to cope with both the new and the old approach.

Author

Fixed in c7c4bbb - Reverted all changes to CXRefNode(IPropertyTree*) constructor. Added loadBranchFromPath helper and modified getLostFiles, getFoundFiles, getOrphanFiles, serializeMessages, and serializeDirectories to handle xrefPath properly.

- Made useSashaPlane configurable via expert option (default to isContainerized)
- Simplified sasha plane setup to throw errors if not configured
- Use recursiveCreateDirectoryForFile for directory creation
- Removed saveSuccess tracking, use warn() in saveBranchToSashaPlane
- Store plain sashaDir path instead of URL
- Reverted all changes to CXRefNode(IPropertyTree*) constructor
- Added loadBranchFromPath helper to load from xrefPath
- Modified getLostFiles, getFoundFiles, getOrphanFiles to load from path
- Modified serializeMessages and serializeDirectories to load from path
- Removed fallback to Dali storage (throw error instead)

Co-authored-by: jakesmith <[email protected]>
Copilot finished work on behalf of jakesmith October 30, 2025 14:09
Copilot AI requested a review from jakesmith October 30, 2025 14:09
Owner

@jakesmith jakesmith left a comment

@copilot - there are several problems:

  1. loadBranchFromPath is currently conditional on whether the individual branch nodes are present or not, but it should be based purely on whether xrefPath is present.
  2. It loads the branch into m_XRefTree, which is a tree in Dali, and then commits it, pushing the loaded trees into Dali. That is not the goal. We want to keep the external file-based trees out of Dali.
  3. The trees are saved in various places (see the places that call setPropBin("data" ..). This works if the data is to be stored in Dali, but our goal is to save them back to disk. NB: if the original was in Dali, xrefPath can be set after the save, so future handling can use the disk-based versions.
  4. CXRefFilesNode needs to cope with the old way and the new way, paying careful attention to how the old way read/set "data", and how the new way should load the tree from disk and, if requested to, serialize it back to disk.

@copilot - please fix these issues.

- Modified CXRefFilesNode to handle both file-based and Dali-based storage
- Added setXRefPath() method to CXRefFilesNode
- queryData() loads from file when xrefPath is set, otherwise from "data" attribute
- Deserialize() saves to file when xrefPath is set, otherwise to "data" attribute
- Updated getLostFiles, getFoundFiles, getOrphanFiles to set xrefPath on CXRefFilesNode
- Don't commit file-loaded branches to Dali
- Updated serializeMessages/Directories and deserializeMessages/Directories to handle file-based storage
- Removed loadBranchFromPath helper that was loading into Dali

Co-authored-by: jakesmith <[email protected]>
Author

Copilot AI commented Oct 30, 2025

@copilot - there are Several problems:
...

Fixed in ab4d5fa:

  1. loadBranchFromPath removed - No longer conditional on branch presence, now properly based on xrefPath
  2. No data in Dali - File-based branches never committed to m_XRefTree. CXRefFilesNode handles file I/O directly
  3. File saves work correctly - Deserialize() saves to disk when xrefPath set, not to "data" attribute
  4. CXRefFilesNode handles both - queryData() loads from file or "data" attribute; Deserialize() saves to file or "data" attribute based on xrefPath presence

Copilot AI requested a review from jakesmith October 30, 2025 15:26
Copilot finished work on behalf of jakesmith October 30, 2025 15:26