Releases: openBackhaul/MicroWaveDeviceInventory
Specification of MicroWaveDeviceInventory v2.1.0
This release adds the following changes:
- a configurable list of attributes/subclasses to be excluded from the quality measurement process
- a different approach for deriving the deviceType (uses equipment-augment/device-model-name instead of air-interface-capability information and regex), with the vendor mapping adjusted accordingly
- the timestamp of the last complete ControlConstruct update, added to the respective ControlConstruct
For more details, see issue collection MWDI v2.1.0_spec.
Implementation of MicroWaveDeviceInventory v2.0.1.f_impl
| S.No | Issue Number | Status |
|---|---|---|
| 1 | #1501 | Fixed |
| 2 | #1464 | Fixed |
| 3 | #1350 | Fixed |
| 4 | #1349 | Fixed |
| 5 | #1348 | Fixed |
| 6 | #1347 | Fixed |
| 7 | #1082 | Fixed |
| 8 | #1065 | Fixed |
| 9 | #1057 | Fixed |
| 10 | #901 | Fixed |
| 11 | #1504 | Fixed |
| 12 | #1503 | Fixed |
| 13 | #1501 | Fixed |
| 14 | #1500 | Fixed |
| 15 | #1498 | Fixed |
| 16 | #1494 | Fixed |
| 17 | #1377 | Fixed |
| 18 | #1505 | Fixed |
| 19 | #1048 | Fixed |
| 20 | #1507 | Fixed |
| 21 | #1494 | Fixed |
| 22 | #923 | Fixed |
Fix for the performance problem:
Reference: #1505 (comment)
Node.js executes JavaScript on a single thread, so our REST server and the 4 high-CPU tasks share the same event loop.
When those CPU-intensive tasks run, they block the event loop and prevent it from handling incoming HTTP requests promptly.
As a result, API responses slow down because the event loop is busy with the continuous cyclic operations instead of REST requests.
To fix this, each of the 4 high-CPU tasks now runs in its own worker thread, which has its own heap space and event loop. The business logic of the application is unchanged; each background task simply runs as a separate thread inside the single application.
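The idea can be illustrated with a minimal sketch based on the built-in `worker_threads` module. This is not the MWDI implementation: the inline worker source, the `runInWorker` helper, and the summation loop are hypothetical stand-ins for one high-CPU background task, and the interval timer stands in for REST request handling on the main thread.

```javascript
// Minimal sketch: run a CPU-bound loop in a worker thread so that the main
// event loop (which would serve REST requests) stays free.
const { Worker } = require('node:worker_threads');

// Hypothetical stand-in for one high-CPU background task.
const workerSource = `
  const { parentPort, workerData } = require('node:worker_threads');
  let sum = 0;
  for (let i = 0; i < workerData.iterations; i++) sum += i % 7;
  parentPort.postMessage(sum);
`;

function runInWorker(iterations) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, {
      eval: true, // treat the string above as the worker's script
      workerData: { iterations },
    });
    worker.once('message', resolve);
    worker.once('error', reject);
  });
}

// The main thread keeps firing timers while the worker computes.
let ticks = 0;
const timer = setInterval(() => { ticks++; }, 10);

const done = runInWorker(5e7).then((sum) => {
  clearInterval(timer);
  console.log(`worker result: ${sum}, main-thread ticks meanwhile: ${ticks}`);
  return sum;
});
```

If the same loop ran directly on the main thread, the interval callback could not fire until the loop had finished.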
MWDI – Current State, Challenges, and v2.0.1 Enhancements
Current Production State (MWDI v1.2.0)
- MWDI v1.2.0 is currently running in production.
- Configured with a sliding window size of 700.
- The application consists of:
- REST Interface (asynchronous, event‑driven)
- Background Sliding Window Process (continuous)
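The release note does not define the sliding window in detail. Assuming it means that a fixed number of device updates is kept in flight and each finished update frees a slot for the next device, the mechanism could be sketched as follows. The function name, the fake device list, and the `fakeUpdate` call are hypothetical; production uses a window size of 700, while the example uses 7 to stay small.

```javascript
// Sketch of a sliding window: at most `windowSize` device updates are in
// flight; whenever one finishes, the next device in the list takes its slot.
async function processWithSlidingWindow(devices, windowSize, updateDevice) {
  let next = 0;
  let inFlight = 0;
  let maxInFlight = 0;

  async function slot() {
    while (next < devices.length) {
      const device = devices[next++];
      inFlight++;
      maxInFlight = Math.max(maxInFlight, inFlight);
      try {
        await updateDevice(device);
      } catch (err) {
        // A failed update must not stall the window; real code would log or retry.
      } finally {
        inFlight--;
      }
    }
  }

  const slotCount = Math.min(windowSize, devices.length);
  await Promise.all(Array.from({ length: slotCount }, () => slot()));
  return { processed: next, maxInFlight };
}

// Usage with hypothetical data: 50 fake devices and a window of 7.
const devices = Array.from({ length: 50 }, (_, i) => `device-${i}`);
const fakeUpdate = () => new Promise((resolve) => setTimeout(resolve, 5));
const windowRun = processWithSlidingWindow(devices, 7, fakeUpdate);
windowRun.then((stats) => console.log(stats));
```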
Performance Snapshot
- Cache updates complete in around 3 hours (approximately 38K devices processed).
- When REST traffic increases, overall performance degrades.
- Notification Processing Disabled
- v1.2.0 and earlier versions could not handle notification load.
- Therefore, notification processing is disabled in production.
New Features Introduced in v2.0.1
Version 2.0.1 introduces multiple new background processes:
1. Kafka Consumer
- Consumes messages from Kafka topics.
- Continuous background process.
2. DeviceMetaDataList Update Process
- Periodic background process.
3. Cache Quality Measurement
- Periodic background task to evaluate cache health.
Total Processes in v2.0.1
- REST Server
- 2 Periodic High‑CPU Tasks
- 2 Continuous High‑CPU Tasks
Total: 5 parallel processes
Root Cause Analysis (Node.js Limitation)
As referenced in:
#1505 (comment)
Key points:
- Node.js executes JavaScript on a single thread.
- All processes share the same event loop:
- REST Server
- 2 periodic high‑CPU tasks
- 2 continuous high‑CPU background tasks
Impact
- CPU‑intensive background loops block the event loop.
- Incoming HTTP requests slow down significantly.
- REST APIs become slow or unresponsive under load.
In traditional multithreaded environments (e.g., Java), such tasks would naturally run on separate threads, avoiding contention.
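The blocking effect is easy to reproduce in isolation. The following sketch is not taken from MWDI; `busyWork` and `measureTimerDelay` are hypothetical helpers. It schedules a 10 ms timer, which stands in for an incoming HTTP request, and then runs a synchronous loop on the same thread. The timer cannot fire until the loop has finished.

```javascript
// Measures how late a 10 ms timer fires while a synchronous CPU loop runs
// on the same thread.
function busyWork(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) {
    // burn CPU on the main thread
  }
}

function measureTimerDelay(blockMs) {
  return new Promise((resolve) => {
    const scheduled = Date.now();
    setTimeout(() => resolve(Date.now() - scheduled), 10);
    busyWork(blockMs); // blocks the event loop before the timer can fire
  });
}

const delayDemo = measureTimerDelay(200).then((observedMs) => {
  console.log(`timer asked for 10 ms, fired after ${observedMs} ms`);
  return observedMs;
});
```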
Solution Approach in v2.0.1.f – Worker Threads
To overcome Node.js event loop limitations:
- Worker Threads introduced for all background tasks.
- Each background process receives:
- Its own execution thread
- Its own heap memory
- No contention with REST APIs
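As a rough illustration of this isolation, the sketch below starts two workers and lets each report its own thread ID and heap limit. The task names and the heap sizes are assumptions chosen for the example, not values from the MWDI configuration; `resourceLimits` and `threadId` are part of the standard `worker_threads` API.

```javascript
// Each worker is its own thread with its own V8 heap, which can be capped
// individually through resourceLimits.
const { Worker } = require('node:worker_threads');

const taskSource = `
  const { parentPort, workerData, threadId, resourceLimits } =
    require('node:worker_threads');
  parentPort.postMessage({
    task: workerData.task,
    threadId,
    maxOldGenerationSizeMb: resourceLimits.maxOldGenerationSizeMb,
  });
`;

function startTask(task, maxOldGenerationSizeMb) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(taskSource, {
      eval: true,
      workerData: { task },
      resourceLimits: { maxOldGenerationSizeMb }, // heap cap for this worker only
    });
    worker.once('message', resolve);
    worker.once('error', reject);
  });
}

// Hypothetical task names and limits.
const isolationDemo = Promise.all([
  startTask('slidingWindow', 256),
  startTask('qualityMeasurement', 128),
]).then((reports) => {
  console.log(reports);
  return reports;
});
```

A worker that exceeds its own heap limit is terminated with an error, while the main thread and the other workers keep running.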
Expected Outcome
- REST APIs remain responsive.
- Background processing runs independently.
- Overall throughput and stability improve.
This solution must be validated in pre‑production with ~40K devices to confirm real‑world performance gains.
Key Challenges Faced During Development
1. Notification Processing Was Never Tested in Production
- Disabled from day one due to performance issues.
- Notification processing logic was untested.
- The real bottleneck existed inside the application’s notification processing loop.
- Required a complete rewrite (currently under testing).
2. Large Effort Estimation Gap
- Kafka integration + total redesign of notification processing.
- Initial estimates did not account for this complexity.
Testing Constraints
- Development and test environments initially lacked notification simulation.
- Multiple test builds were released for partial functionality testing in pre‑prod:
- test_alarm_fix_1_v2.0.1
- test_slidingW_analysis_1_v2.0.1
- test_slidingwindow_analysis_2_v2.0.1
- test_slidingwindow_analysis_3_v2.0.1
- Pre‑production environment could not be disturbed.
- Testing limited to:
- Master Controller‑3
- Up to 17k devices
Summary
- v1.2.0 limitations stem from single‑threaded execution and disabled notification handling.
- v2.0.1 introduces multiple high‑CPU background processes, revealing Node.js scalability limits.
- v2.0.1.f addresses these challenges using Worker Threads, properly isolating workloads.
- The solution is architecturally sound but needs large‑scale pre‑production validation.
- Significant development effort was required due to:
- Missing load simulation environments
- Necessary redesign of core processing logic
- MWDI being a mega service rather than a microservice; a long‑term fix requires breaking it into smaller, isolated applications to eliminate scalability bottlenecks.
Specification of MicroWaveDeviceInventory v2.0.1
MWDI v2.0.1 spec
This specification contains updates and fixes for findings from implementation of MWDI 2.0.0_spec.
Issue collection:
The list of changes can be found in issue collection MWDI v2.0.1_spec
What's Changed
- Specification of MicroWaveDeviceInventory v1.1.1 by @openBackhaul in #735
- MicroWaveDeviceInventory 1.2.0_spec by @openBackhaul in #1032
- Implementation for new link services by @IswaryaaS in #1088
- develop to v1.2.0 by @PrathibaJee in #1132
- Adding the code to branch by @ManasaBM1 in #1108
- IswaryaaS/issue1089 by @IswaryaaS in #1134
- Merge v1.2.2_spec from develop into main by @kmohr-soprasteria in #1143
- adding response for cache filter services by @ManasaBM1 in #1106
- Kmohr-soprasteria/issue1249 by @kmohr-soprasteria in #1250
- (Notifications) Create notificationPollingInterval profileInstance by @kmohr-soprasteria in #1252
- Kmohr-soprasteria/issue1246 by @kmohr-soprasteria in #1256
- Kmohr-soprasteria/issue1253 by @kmohr-soprasteria in #1257
- Kmohr-soprasteria/issue1255 by @kmohr-soprasteria in #1259
- Add callbacks to OAS fixes #1258 by @kmohr-soprasteria in #1261
- Kmohr-soprasteria/issue1254 by @kmohr-soprasteria in #1262
- Kmohr-soprasteria/issue1204 by @kmohr-soprasteria in #1263
- Kmohr-soprasteria/issue1264 by @kmohr-soprasteria in #1265
- (Notifications) Kafka parameters required - update spec accordingly by @kmohr-soprasteria in #1268
- Refine QM process description by @kmohr-soprasteria in #1275
- [Refine-QM]: Refine /v1/provide-cache-quality-statistics by @kmohr-soprasteria in #1277
- [Refinement-Notifications] Mark regard services as deprecated by @kmohr-soprasteria in #1281
- Redchy/issue1267 by @redchy in #1278
- Kmohr-soprasteria/issue1282 by @kmohr-soprasteria in #1285
- Kmohr-soprasteria/issue1286 by @kmohr-soprasteria in #1290
- Redchy/issue1175 (Create testcases for invalidResourcePath) by @redchy in #1283
- Kmohr-soprasteria/issue1287 by @kmohr-soprasteria in #1291
- Kmohr-soprasteria/issue1289 by @kmohr-soprasteria in #1293
- update description by @kmohr-soprasteria in #1294
- Kmohr-soprasteria/issue1292 by @kmohr-soprasteria in #1295
- [Refinement] Refine workings of deviceList by @kmohr-soprasteria in #1296
- Apply appPattern changes for Kafka by @kmohr-soprasteria in #1298
- Kmohr-soprasteria/issue1284 by @kmohr-soprasteria in #1302
- [Refinement-Notifications] Update readme for latest changes by @kmohr-soprasteria in #1303
- Redchy/issue1175 by @redchy in #1304
- Remove unneeded operationClient for provide metadata services by @kmohr-soprasteria in #1307
- Refine calling regard services internally by @kmohr-soprasteria in #1310
- Add responseCode 530 to /v1/provide-cache-quality-statistics by @kmohr-soprasteria in #1313
- Kmohr-soprasteria/issue1311 by @kmohr-soprasteria in #1314
- Kmohr-soprasteria/issue1308_resolve merge conflicts from main to develop by @kmohr-soprasteria in #1316
- Merge spec_v2.0.0 into main after resolving conflicts in develop by @kmohr-soprasteria in #1317
- Restore changes that were lost due to the conflict resolve for merging main into develop by @kmohr-soprasteria in #1319
- Kmohr-soprasteria/issue1306 by @kmohr-soprasteria in #1322
- Add a new genericResponseProfileInstance for v2.0.1 by @kmohr-soprasteria in #1323
- OAS Meta Information by @kmohr-soprasteria in #1324
- Fix linting errors in spec by @kmohr-soprasteria in #1326
- Rename proper_notifications topic to device_change_notifications by @kmohr-soprasteria in #1337
- Kmohr-soprasteria/issue1334 by @kmohr-soprasteria in #1338
- Kmohr-soprasteria/issue1335 by @kmohr-soprasteria in #1340
- [Testing] Update invalidOrMissingRequestBody/receiver testcases by @kmohr-soprasteria in #1342
- Remove callback PromptForEmbeddingCausesCyclicNotificationProcessingToApplyControllerAttributeValueChange from /v1/embed-yourself by @kmohr-soprasteria in #1345
- JEST TEST CASES by @ManasaBM1 in #1165
- [Testing] Update completeness/receiver tests by @kmohr-soprasteria in #1352
- Kmohr-soprasteria/issue1356 by @kmohr-soprasteria in #1357
- Incorrect use of console.log.debug, console.log.error, and console.log.warn in the NotificationManagement NotificationStreamManagement file by @ManasaBM1 in #1355
- Kmohr-soprasteria/issue1327 by @kmohr-soprasteria in #1361
- Review /v1/provide-device-status-metadata requestBody schema and possibly modify it by @kmohr-soprasteria in #1362
- Simplify response of /v1/provide-cache-quality-statistics by @kmohr-soprasteria in #1363
- Kmohr-soprasteria/issue1329 by @kmohr-soprasteria in #1365
- [Testing] Update unknownTargetObject/dataprovider testcases by @kmohr-soprasteria in #1366
- update tests by @kmohr-soprasteria in #1368
- V1.2.0 impl to develop branch by @PrathibaJee in #1370
- Kmohr-soprasteria/issue1305 by @kmohr-soprasteria in #1374
- Kmohr-soprasteria/issue1375 by @kmohr-soprasteria in #1379
- Kmohr-soprasteria/issue1378 by @kmohr-soprasteria in #1380
- Update CC schema (add LP to LTP for sync) by @kmohr-soprasteria in #1382
- Update MicroWaveDeviceInventory.yaml with corrected sync ltp schema by @kmohr-soprasteria in #1385
- Thorsten/issue1135 by @kmohr-soprasteria in #1408
- Reduce required statement of CC by @kmohr-soprasteria in #1409
- Make robust against new Profile definitions by @kmohr-soprasteria in #1410
- Make robust against new LP definitions by @kmohr-soprasteria in #1411
- Changes in the Kafka consumer. by @PrathibaJee in #1412
- Revert "Changes in the Kafka consumer." by @PrathibaJee in https://github.com/openB...
Implementation of MicroWaveDeviceInventory v2.0.1.e (for Defne's Master Thesis)
Fixed issues:
Alarm-related issues
Others
Upcoming release:
During our analysis of various cache update-related issues, we identified a common underlying cause: concurrency.
To address this, we are implementing a generic solution in the upcoming package that will handle concurrency more effectively across the system.
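The release note does not describe the planned solution. Purely as an illustration of the kind of problem involved, the sketch below shows one common way to avoid lost updates: all asynchronous read-modify-write operations on the same cache entry are chained so that they run one after another. The `KeyedQueue` class, the in-memory `cache` map, and `addAlarm` are hypothetical and do not reflect the actual MWDI or Elasticsearch code.

```javascript
// Serializes async tasks per key by chaining them on a per-key promise.
class KeyedQueue {
  constructor() {
    this.tails = new Map();
  }

  run(key, task) {
    const previous = this.tails.get(key) ?? Promise.resolve();
    // Start the task only after the previous one for this key has settled.
    const current = previous.catch(() => {}).then(() => task());
    this.tails.set(key, current);
    const cleanup = () => {
      if (this.tails.get(key) === current) this.tails.delete(key);
    };
    current.then(cleanup, cleanup);
    return current;
  }
}

// Hypothetical cache entry and update operation.
const cache = new Map([['device-A', { alarms: 0 }]]);
const pause = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function addAlarm(mountName) {
  const entry = { ...cache.get(mountName) }; // read
  await pause(5);                            // simulated database round trip
  entry.alarms += 1;                         // modify
  cache.set(mountName, entry);               // write
}

const queue = new KeyedQueue();
const queueDemo = Promise.all([
  queue.run('device-A', () => addAlarm('device-A')),
  queue.run('device-A', () => addAlarm('device-A')),
]).then(() => cache.get('device-A').alarms);

queueDemo.then((alarms) => console.log(`alarms recorded: ${alarms}`));
```

Without the queue, both calls would read the entry before either had written it back, and one of the two increments would be lost.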
What's Changed
- IswaryaaS/issue1484 by @IswaryaaS in #1492
- Fix for concurrency problem in Elasticsearch and fix for excluding invalid mountnames. Related issues are 1472, 901 and 1455 by @KarthikeyanV-TechM in #1495
- IswaryaaS/issue1464 by @IswaryaaS in #1497
New Contributors
- @KarthikeyanV-TechM made their first contribution in #1495
Full Changelog: v2.0.1.d_impl...v2.0.1.e_impl
Implementation of MicroWaveDeviceInventory v2.0.1.d (for Defne's Master Thesis)
Implementation of MicroWaveDeviceInventory v2.0.1.c (for Defne's Master Thesis)
The following issues are solved:
#1387
#1386
#1388
#1451
#1449
What's Changed
- Notification based updates by @IswaryaaS in #1453
Full Changelog: v2.0.1.b_impl...v2.0.1.c_impl
Implementation of MicroWaveDeviceInventory v2.0.1.b (for Defne's Master Thesis)
The following issues are solved:
#1444
#1440
#1439
#1437
#1433
#1430
What's Changed
- Regexp profile by @ManasaBM1 in #1441
- IswaryaaS/issue1425 by @IswaryaaS in #1446
- IswaryaaS/issue1399 by @IswaryaaS in #1447
- /v1/provide-cache-quality-statistics API is not responding by @PrathibaJee in #1448
Full Changelog: v2.0.1.a_impl...v2.0.1.b_impl
Implementation of MicroWaveDeviceInventory v2.0.1.a (for Defne's Master Thesis)
This release contains the changes required for Defne's Master Thesis, which integrates Kafka into the existing MWDI architecture.
The following issues are solved:
What's Changed
- Implementation for new link services by @IswaryaaS in #1088
- develop to v1.2.0 by @PrathibaJee in #1132
- Adding the code to branch by @ManasaBM1 in #1108
- IswaryaaS/issue1089 by @IswaryaaS in #1134
- adding response for cache filter services by @ManasaBM1 in #1106
- Incorrect use of console.log.debug, console.log.error, and console.log.warn in the NotificationManagement NotificationStreamManagement file by @ManasaBM1 in #1355
- V1.2.0 impl to develop branch by @PrathibaJee in #1370
- Changes in the Kafka consumer. by @PrathibaJee in #1414
- PrathibaJee/issue1396 by @PrathibaJee in #1416
- Modifications in the openApi.yaml file + few changes to the error handling. by @PrathibaJee in #1422
- PrathibaJee/issue1396 by @PrathibaJee in #1426
- PrathibaJee/issue1396 by @PrathibaJee in #1431
- IswaryaaS/issue1389 by @IswaryaaS in #1432
- IswaryaaS/issue1433 by @IswaryaaS in #1435
New Contributors
- @IswaryaaS made their first contribution in #1088
- @PrathibaJee made their first contribution in #1431
Full Changelog: 1.1.2.k_impl...v2.0.1.a_impl
Release note of MWDI v1.2.2-hotfix2_impl
Tickets closed or mitigated:
#1350 - Missing OAM_PATH_CONTROLLER_ATTRIBUTE_* Constants in configConstants.js
#1048 - [Performance] Observing slowness in retrieving the cache APIs when MWDI APIs are queried continuously by other application
Analyzed but not an MWDI bug:
#1105 - Observing error message from ELK when bulk controller notification is triggered
Release note of MWDI v1.2.2-hotfix.1_impl
Please read the notes below carefully.
Tickets closed:
#772 - [Data] Cache air-interface capability value and live air-interface capability value are not same
#853 - [Data] /core-model-1-4:network-control-domain=cache/control-construct=/equipment=/actual-equipment not working in tag test_v1.1.2
#901 - [NotificationService] Cleared alarm notification is not updating the cache - To be validated
#989 - [Subscription] Unable to end the subscription in MWDI
#1001 - [NotificationService] Observing 409 error from /v1/regard-device-object-creation/deletion
#1004 - [LinkService] GET /core-model-1-4:network-control-domain=cache/link={uuid}/link-port={localId} does not return data (500)
#1050 - tag_1.1.2h:Observing 500 Internal server error with Object Deletion
#1051 - tag_1.1.2h: Link port creation fails with 500 response
#1054 - Device is cache is not available in ELK whereas device connected and Live is requested successfully
#1092 - /v1/regard-device-attribute-value-change - value not updating in MWDI cache - To be validated
#1107 - Observing 500 Internal server error when POST:/v1/provide-list-of-parallel-links
Tickets to be closed or addressed properly:
#974 - [Ops] observing Bad request error message in docker logs without any information on what caused bad request. - This is not an issue; the log only reports that the problem lies in the requesting application, which asks for information that does not exist. The exception is handled properly.
#981 - [BasicService] /v1/inform-about-application returns the wrong release number - This issue cannot be validated in this release, since it is built on 1.2.1, not on 1.1.2.
#1112 - Observing 502 BadGateway from NDL "/v1/provide-inventory-of-device" - Not an MWDI problem; the software works as expected and rejects requests with wrong or special characters.
#1135 - Observing 500 for GET /core-model-1-4:network-control-domain=cache/control-construct={mountName} - Not a software fault; the validation file does not match the control construct coming from ODL.
#1168 - Observing 500 Internal server error when we invoke POST:/core-model-1-4:network-control-domain=cache/link={uuid} - Not an MWDI problem; the software works as expected and rejects requests with wrong or special characters.