37 changes: 37 additions & 0 deletions src/lotman.cpp
@@ -1718,3 +1718,40 @@ int lotman_get_context_int(const char *key, int *output, char **err_msg) {
return -1;
}
}

int lotman_get_max_mpas_for_period(int64_t start_ms, int64_t end_ms, bool include_deletion, char **output,
char **err_msg) {
try {
// Call internal function
auto [result, error] = lotman::get_max_mpas_for_period_internal(start_ms, end_ms, include_deletion);

// Check for errors from internal function
if (!error.empty()) {
if (err_msg) {
*err_msg = strdup(error.c_str());
}
return -1;
}

// Build output JSON
json output_obj;
output_obj["start_ms"] = start_ms;
output_obj["end_ms"] = end_ms;
output_obj["include_deletion"] = include_deletion;
output_obj["max_dedicated_GB"] = result.max_dedicated_GB;
output_obj["max_opportunistic_GB"] = result.max_opportunistic_GB;
output_obj["max_combined_GB"] = result.max_combined_GB;
output_obj["max_num_objects"] = result.max_num_objects;

// Convert to string and allocate output
std::string output_str = output_obj.dump();
*output = strdup(output_str.c_str());

return 0;
} catch (std::exception &exc) {
if (err_msg) {
*err_msg = strdup(exc.what());
}
return -1;
}
}
66 changes: 66 additions & 0 deletions src/lotman.h
@@ -936,6 +936,72 @@ int lotman_get_context_int(const char *key, int *output, char **err_msg);
A reference to a char array that can store any error messages.
*/

int lotman_get_max_mpas_for_period(int64_t start_ms, int64_t end_ms, bool include_deletion, char **output,
char **err_msg);
/**
DESCRIPTION: A function for determining the maximum summed Management Policy Attributes (MPAs)
across all overlapping lots during a specified time period. This function uses a sweep line
algorithm to efficiently calculate the peak resource allocation at any point during the period.
This is useful for capacity planning and scheduling systems that need to determine available
space for new lot allocations, e.g. "Can I create a lot with 50GB dedicated storage from time
A to time B without overcommitting resources?"

RETURNS: Returns 0 on success. Any other values indicate an error.

INPUTS:
start_ms:
A Unix timestamp in milliseconds indicating the start of the query period (inclusive).

end_ms:
A Unix timestamp in milliseconds indicating the end of the query period (inclusive).
Must be greater than start_ms or the function will return an error.

include_deletion:
A boolean indicating which lot endpoint to consider:
- When false: lots are considered active until their expiration_time
- When true: lots are considered active until their deletion_time
For most capacity planning scenarios, false is recommended since expired lots may still
consume resources even if they become opportunistic.
Comment on lines +961 to +964
Copilot AI Dec 11, 2025

The documentation is internally inconsistent regarding the include_deletion parameter. Lines 963-964 state "false is recommended since expired lots may still consume resources even if they become opportunistic", but if expired lots still consume resources, then true (using deletion_time) would be more accurate for capacity planning, not false. The PR description, however, states that expiration_time is "recommended for capacity planning", so the rationale given here does not support the recommendation. Please clarify the intended semantics: do lots stop consuming dedicated resources at expiration_time but continue consuming opportunistic resources until deletion_time, or do they continue consuming all resources until deletion_time?

Suggested change
-   - When false: lots are considered active until their expiration_time
-   - When true: lots are considered active until their deletion_time
-   For most capacity planning scenarios, false is recommended since expired lots may still
-   consume resources even if they become opportunistic.
+   - When false: lots are considered active until their expiration_time. After expiration, lots stop consuming dedicated resources, but may continue to consume opportunistic resources until deletion_time.
+   - When true: lots are considered active until their deletion_time, including both dedicated and opportunistic resource consumption.
+   For most capacity planning scenarios, true is recommended, since expired lots may still consume resources (opportunistic or otherwise) until deletion_time. Use true to ensure all resource consumption is accounted for.
+   Use false only if you wish to exclude opportunistic resource usage after expiration_time.

output:
A reference to a char * that will be allocated and populated with a JSON string containing
the results. The caller is responsible for freeing this memory.

err_msg:
A reference to a char array that can store any error messages.

Output JSON Specification:
The output JSON contains both the query parameters (for logging/debugging) and the results:
{
"start_ms": <input start_ms value, in unix milliseconds>,
"end_ms": <input end_ms value, in unix milliseconds>,
"include_deletion": <input include_deletion value>,
"max_dedicated_GB": <maximum sum of dedicated_GB at any point during the period>,
"max_opportunistic_GB": <maximum sum of opportunistic_GB at any point during the period>,
"max_combined_GB": <maximum sum of (dedicated_GB + opportunistic_GB) at any point during the period>,
"max_num_objects": <maximum sum of max_num_objects at any point during the period>
}

Notes:
- max_dedicated_GB represents the maximum cumulative storage Lotman has dedicated to lots during the
specified period
- max_combined_GB sums over both opportunistic and dedicated storage, representing the total maximum storage
Lotman has allocated to lots during the specified period
- max_opportunistic_GB and max_combined_GB may be produced by different sets of overlapping lots
Copilot AI Dec 11, 2025

The note should clarify that max_dedicated_GB, max_opportunistic_GB, and max_combined_GB can all occur at different times. The current wording only mentions that max_opportunistic_GB and max_combined_GB may be produced by different sets, but max_dedicated_GB can also occur at a different time than the others. Consider rephrasing to: "max_dedicated_GB, max_opportunistic_GB, and max_combined_GB may each be produced by different sets of overlapping lots at different points in time".

Suggested change
-   - max_opportunistic_GB and max_combined_GB may be produced by different sets of overlapping lots
+   - max_dedicated_GB, max_opportunistic_GB, and max_combined_GB may each be produced by different sets of overlapping lots at different points in time
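
The point about peaks occurring at different times can be illustrated with a small standalone sweep. This is only a sketch: `LotSpan`, `Peaks`, and `sweep_peaks` are hypothetical names, not part of LotMan, and it assumes inclusive lot lifetimes with removals applied before additions at equal timestamps.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one lot row; not a LotMan type.
struct LotSpan {
    int64_t start;         // first timestamp at which the lot is active (inclusive)
    int64_t end;           // last timestamp at which the lot is active (inclusive)
    double dedicated;      // dedicated_GB
    double opportunistic;  // opportunistic_GB
};

struct Peaks {
    double max_dedicated;
    double max_opportunistic;
    double max_combined;
};

// Sweep over +/- events in time order, tracking each running maximum separately.
Peaks sweep_peaks(const std::vector<LotSpan> &lots) {
    struct Event {
        int64_t time;
        double ded_delta;
        double opp_delta;
        bool is_removal;
    };
    std::vector<Event> events;
    events.reserve(lots.size() * 2);
    for (const auto &lot : lots) {
        events.push_back({lot.start, lot.dedicated, lot.opportunistic, false});
        // Inclusive end: the lot stops counting at end + 1.
        // (The sketch assumes end is well below INT64_MAX.)
        events.push_back({lot.end + 1, -lot.dedicated, -lot.opportunistic, true});
    }
    // Assumption: at equal timestamps, removals come first so that
    // back-to-back lots are never counted together.
    std::sort(events.begin(), events.end(), [](const Event &a, const Event &b) {
        if (a.time != b.time) {
            return a.time < b.time;
        }
        return a.is_removal && !b.is_removal;
    });

    Peaks peaks{0.0, 0.0, 0.0};
    double ded = 0.0, opp = 0.0;
    for (const auto &ev : events) {
        ded += ev.ded_delta;
        opp += ev.opp_delta;
        peaks.max_dedicated = std::max(peaks.max_dedicated, ded);
        peaks.max_opportunistic = std::max(peaks.max_opportunistic, opp);
        peaks.max_combined = std::max(peaks.max_combined, ded + opp);
    }
    return peaks;
}
```

With lots `{0, 10, 10.0, 0.0}`, `{20, 30, 2.0, 8.0}`, and `{25, 35, 1.0, 1.0}`, the dedicated peak (10) occurs while only the first lot is active, whereas the opportunistic peak (9) and the combined peak (12) occur once the second and third lots overlap. The combined peak is therefore not the sum of the other two maxima.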

- If no lots overlap the specified period, all maximum values will be zero (0.0 for the GB fields, 0 for max_num_objects)

Example:
If the query period contains three overlapping lots:
- Lot A: 10GB dedicated, 5GB opportunistic
- Lot B: 8GB dedicated, 0GB opportunistic
- Lot C: 7GB dedicated, 3GB opportunistic

When all three lots are active at the same moment within the query interval:
- max_dedicated_GB = 25.0 (10 + 8 + 7)
- max_opportunistic_GB = 8.0 (5 + 0 + 3)
- max_combined_GB = 33.0 (25 + 8, the sum of total dedicated plus total opportunistic)
*/

// int lotman_get_matching_lots(const char *criteria_JSON, char ***output, char **err_msg);
// int lotman_check_db_health(char **err_msg); // Should eventually check that data structure conforms to expectations.
// If there's a cycle or a non-self-parent root, something is wrong
147 changes: 147 additions & 0 deletions src/lotman_internal.cpp
@@ -2182,6 +2182,153 @@ bool lotman::Checks::will_be_orphaned(const std::string &LTBR, const std::string
return false;
}

/**
* Implementation of sweep line algorithm for finding maximum MPAs during a time period.
*
* This implements the classic sweep line algorithm for interval scheduling problems.
* See: https://www.geeksforgeeks.org/maximum-number-of-overlapping-intervals/
*
* The algorithm works by:
* 1. Creating "events" for each lot's start (creation) and end (expiration/deletion)
* 2. Sorting all events by time
* 3. Sweeping through events chronologically, tracking current resource usage with deltas
* that correspond to each event's attributes
* 4. Recording the maximum usage observed at any point
*
* Key semantic: Lot lifetimes are INCLUSIVE intervals [creation_time, end_time].
* A lot is active at both its start and end timestamps. Therefore, we schedule
* removal events at end_time + 1 (the first moment the lot is no longer active).
*/

std::pair<lotman::MaxMPAResult, std::string> lotman::get_max_mpas_for_period_internal(int64_t start_ms, int64_t end_ms,
bool include_deletion) {
// Validate input
if (start_ms >= end_ms) {
return {{0.0, 0.0, 0.0, 0}, "Error: start_ms must be less than end_ms"};
}

auto &storage = lotman::db::StorageManager::get_storage();

// Determine which time field to use for lot end time
using MPA = lotman::db::ManagementPolicyAttributes;
using Parent = lotman::db::Parent;
using namespace sqlite_orm;

// Query lots that overlap with the period, filtering to only ROOT lots.
//
// IMPORTANT: We only count root lots (self-parent lots) to avoid double-counting in hierarchies.
// A root lot is one where the lot has only itself as a parent in the parents table.
// Child lots consume quota from their parents, so counting both would be incorrect.
//
// For example, if parent_lot has 5GB and child_lot (child of parent_lot) has 3GB,
// the maximum capacity usage should be 5GB (from the parent), not 8GB (parent + child).
//
// Overlap condition for inclusive intervals: creation_time <= end_ms AND end_time >= start_ms
// This correctly handles all overlap cases including point-in-time overlaps at boundaries.
//
// Root lot condition: EXISTS exactly one parent record WHERE parent = lot_name
// We use a SQL subquery to identify root lots directly in the database for optimal performance.
std::string time_field = include_deletion ? "deletion_time" : "expiration_time";
std::string query = "SELECT mpa.lot_name, mpa.dedicated_GB, mpa.opportunistic_GB, mpa.max_num_objects, "
" mpa.creation_time, mpa." +
time_field +
" "
"FROM management_policy_attributes mpa "
"WHERE mpa.creation_time <= ? AND mpa." +
time_field +
" >= ? "
" AND mpa.lot_name IN ( "
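// Group over ALL parent rows so that COUNT(*) counts every parent of the lot; filtering to
// self-parent rows first would make the count 1 even for lots that also have other parents.
" SELECT p.lot_name "
" FROM parents p "
" GROUP BY p.lot_name "
" HAVING COUNT(*) = 1 AND MIN(p.parent) = p.lot_name "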
" )";

std::map<int64_t, std::vector<int>> query_int_map{{end_ms, {1}}, {start_ms, {2}}};
auto rp = lotman::db::SQL_get_matches_multi_col(query, 6, std::map<std::string, std::vector<int>>(), query_int_map);

if (!rp.second.empty()) {
return {{0.0, 0.0, 0.0, 0}, "Database query failed: " + rp.second};
}

auto &lots = rp.first;

// If no root lots overlap, return zeros with no error
if (lots.empty()) {
return {{0.0, 0.0, 0.0, 0}, ""};
}

// Event structure for sweep line algorithm
struct Event {
int64_t time;
double ded_delta; // Change in dedicated storage
double opp_delta; // Change in opportunistic storage
int64_t obj_delta; // Change in object count
bool is_start; // true for creation event, false for expiration/deletion event
};

std::vector<Event> events;
events.reserve(lots.size() * 2); // Each lot creates at most 2 events

// Build event list from query results
// Each row contains: [lot_name, dedicated_GB, opportunistic_GB, max_num_objects, creation_time, end_time]
for (const auto &lot_row : lots) {
// Parse query results from string vector (columns 0-5)
// lot_row[0] = lot_name (string, not used in sweep line)
double dedicated = std::stod(lot_row[1]); // dedicated_GB
double opportunistic = std::stod(lot_row[2]); // opportunistic_GB
int64_t objects = std::stoll(lot_row[3]); // max_num_objects
int64_t creation = std::stoll(lot_row[4]); // creation_time
int64_t end_time = std::stoll(lot_row[5]); // expiration_time or deletion_time

// Clamp lot start to query range (if lot starts before start_ms, treat as starting at start_ms)
int64_t effective_start = std::max(start_ms, creation);

// Add creation/start event at the lot's effective start time
events.push_back({effective_start, dedicated, opportunistic, objects, true});

// Add expiration/deletion event AFTER the lot ends (since lot is active through end_time inclusive)
// Only add if the lot ends before the query period ends
if (end_time < end_ms) {
// Schedule removal at end_time + 1 (first moment lot is no longer active)
events.push_back({end_time + 1, -dedicated, -opportunistic, -objects, false});
Copilot AI Dec 11, 2025

Potential integer overflow when calculating end_time + 1. If end_time is INT64_MAX, adding 1 overflows, which is undefined behavior for signed integers in C++ and in practice would typically schedule the removal event at a negative timestamp. This is unlikely with Unix millisecond timestamps (INT64_MAX milliseconds corresponds to roughly year 292 million), and the enclosing end_time < end_ms check already keeps end_time below INT64_MAX on this path. Consider documenting that invariant, or adding an explicit check, so the addition stays safe if the guard changes.
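
One way to make the bound explicit is a small checked increment. The helper below is a sketch only; `checked_next_ms` is a hypothetical name and is not part of LotMan.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper: writes t + 1 into `next` and returns true, or returns
// false (leaving `next` untouched) when t + 1 would overflow int64_t.
bool checked_next_ms(int64_t t, int64_t &next) {
    if (t == std::numeric_limits<int64_t>::max()) {
        return false;
    }
    next = t + 1;
    return true;
}
```

A caller could skip the removal event when the helper returns false, since a lot that is active through INT64_MAX never leaves the query window.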
}
// If end_time >= end_ms, the lot extends beyond our query range, so no removal event needed
}

// Sort events chronologically, with end events before start events at the same timestamp
std::sort(events.begin(), events.end(), [](const Event &a, const Event &b) {
if (a.time != b.time) {
return a.time < b.time;
}
// Removal events are already scheduled at end_time + 1, the first moment the lot is inactive,
// so at the same timestamp process ends before starts (false sorts before true). Otherwise a
// lot that ended at t would be counted together with a lot created at t + 1.
return a.is_start < b.is_start;
});

// Sweep through events chronologically, tracking current and maximum resource usage
double current_ded = 0.0, current_opp = 0.0, current_combined = 0.0;
double max_ded = 0.0, max_opp = 0.0, max_combined = 0.0;
int64_t current_obj = 0, max_obj = 0;

for (const auto &event : events) {
// Update current resource usage based on event deltas
current_ded += event.ded_delta;
current_opp += event.opp_delta;
current_obj += event.obj_delta;
current_combined = current_ded + current_opp;

// Track the maximum values observed at any point
max_ded = std::max(max_ded, current_ded);
max_opp = std::max(max_opp, current_opp);
max_combined = std::max(max_combined, current_combined);
max_obj = std::max(max_obj, current_obj);
}

return {{max_ded, max_opp, max_combined, max_obj}, ""};
}

void lotman::Context::set_caller(const std::string caller) {
m_caller = std::make_shared<std::string>(caller);
}
38 changes: 38 additions & 0 deletions src/lotman_internal.h
@@ -424,4 +424,42 @@ class Checks {
// a parent/child, which should update data for the child
static bool will_be_orphaned(const std::string &LTBR, const std::string &child);
};

/**
* Result structure for maximum MPA queries.
*/
struct MaxMPAResult {
double max_dedicated_GB;
double max_opportunistic_GB;
double max_combined_GB;
int64_t max_num_objects;
};

/**
* Calculate maximum Management Policy Attributes (MPAs) during a time period using sweep line algorithm.
*
* This function implements a sweep line algorithm to efficiently find the maximum resource usage
* across all overlapping lots during a specified time period. The algorithm is based on the
* classic interval scheduling problem solution described at:
* https://www.geeksforgeeks.org/maximum-number-of-overlapping-intervals/
*
* Time Complexity: O(n log n) where n is the number of lots overlapping the query period
* Space Complexity: O(n) for the event list
*
* IMPORTANT: Lot lifetimes are treated as INCLUSIVE intervals [creation_time, end_time].
* A lot is considered active at both its creation_time and its end_time (expiration or deletion).
* This means a lot with creation_time=100 and expiration_time=200 is active during the entire
* range [100, 200], including both endpoints.
*
* @param start_ms Start of the query period in milliseconds since Unix epoch (inclusive)
* @param end_ms End of the query period in milliseconds since Unix epoch (inclusive)
* @param include_deletion If true, use deletion_time as lot end; if false, use expiration_time
* @return A pair containing:
* - first: MaxMPAResult struct with all maximum values
* - second: Error message string (empty on success, descriptive message on error)
* On error, all numeric values in the result struct are set to 0.
*/
std::pair<MaxMPAResult, std::string> get_max_mpas_for_period_internal(int64_t start_ms, int64_t end_ms,
bool include_deletion);

} // namespace lotman
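
The inclusive-interval overlap condition described in the comments above (creation_time <= end_ms AND end_time >= start_ms) can be sketched as a standalone predicate. `overlaps_inclusive` is a hypothetical helper for illustration and does not appear in this PR.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: two inclusive intervals overlap when each one starts
// no later than the other one ends.
bool overlaps_inclusive(int64_t lot_start, int64_t lot_end, int64_t query_start, int64_t query_end) {
    return lot_start <= query_end && lot_end >= query_start;
}
```

Under this definition a lot active over [100, 200] overlaps a query over [200, 300], because both include timestamp 200, but it does not overlap a query over [201, 300].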