|
64 | 64 |
|
65 | 65 | <ul> |
66 | 66 |
|
| 67 | + <li class="toctree-l3"><a href="#openmpf-63x">OpenMPF 6.3.x</a></li> |
| 68 | + |
| 69 | + |
67 | 70 | <li class="toctree-l3"><a href="#openmpf-62x">OpenMPF 6.2.x</a></li> |
68 | 71 |
|
69 | 72 |
|
|
317 | 320 |
|
318 | 321 | <p><strong>NOTICE:</strong> This software (or technical data) was produced for the U.S. Government under contract, and is subject to the |
319 | 322 | Rights in Data-General Clause 52.227-14, Alt. IV (DEC 2007). Copyright 2021 The MITRE Corporation. All Rights Reserved.</p> |
| 323 | +<h1 id="openmpf-63x">OpenMPF 6.3.x</h1> |
| 324 | +<h2>6.3.0: September 2021</h2> |
| 325 | + |
| 326 | +<h3>Documentation</h3> |
| 327 | + |
| 328 | +<ul> |
| 329 | +<li>Updated the API documents, Development Environment Guide, Node Guide, Install Guide, User Guide, Admin Guide, and |
| 330 | + others to clarify the difference between Docker and non-Docker behaviors.</li> |
| 331 | +<li>Transformed Packaging and Registering a Component document into Component Descriptor Reference.</li> |
| 332 | +<li>Split Media Segmentation Guide from User Guide.</li> |
| 333 | +<li>Updated and renamed the Workflow Manager document to Workflow Manager Architecture.</li> |
| 334 | +<li>Updated the various Docker guides to clarify the difference between building Docker images from scratch versus |
| 335 | + building them using pre-built base images on Docker Hub, emphasizing the latter.</li> |
| 336 | +<li>Updated the Contributor Guide to document the hotfix pull request process.</li> |
| 337 | +</ul> |
| 338 | +<h3>TiesDb Integration</h3> |
| 339 | + |
| 340 | +<ul> |
| 341 | +<li>TiesDb is a PostgreSQL DB with a RESTful API that stores media metadata. The metadata entries are queried using the |
| 342 | + hash (sha256, md5) of the media file. TIES stands |
| 343 | + for <a href="https://github.com/Noblis/ties-lib">Triage Import Export Schema</a>. TiesDb is deployed and managed externally to |
| 344 | + OpenMPF. For more information please contact us.</li> |
| 345 | +<li>When a job completes, OpenMPF can post assertions to media entries that exist in TiesDb. In general, one assertion is |
| 346 | + generated for each algorithm run on a piece of media. It contains the job status, algorithm name, detection |
| 347 | + type (<code>FACE</code>, <code>TEXT</code>, <code>MOTION</code>, etc.), and number of tracks generated, as well as a link to the full JSON output |
| 348 | + object.</li> |
| 349 | +<li>Each assertion serves as a lasting record so that job producers may first check TiesDb to see if an algorithm was run |
| 350 | + on a piece of media before submitting the same job to OpenMPF again.</li> |
| 351 | +<li>To enable TiesDb support, set the <code>TIES_DB_URL</code> job property or <code>ties.db.url</code> system property to |
| 352 | + the <code>&lt;http|https&gt;://&lt;host&gt;:&lt;port&gt;</code> part of the URL. The Workflow Manager will append |
| 353 | + the <code>/api/db/supplementals?sha256Hash=&lt;hash&gt;</code> part. Here is an example of a TiesDb POST:</li> |
| 354 | +</ul> |
| 355 | +<pre><code class="json">{ |
| 356 | + "dataObject": { |
| 357 | + "sha256OutputHash": "1f8f2a8b2f5178765dd4a2e952f97f5037c290ee8d011cd7e92fb8f57bc75f17", |
| 358 | + "outputType": "FACE", |
| 359 | + "algorithm": "FACECV", |
| 360 | + "processDate": "2021-09-09T21:37:30.516-04:00", |
| 361 | + "pipeline": "OCV FACE DETECTION PIPELINE", |
| 362 | + "outputUri": "file:///home/mpf/git/openmpf-projects/openmpf/trunk/install/share/output-objects/1284/detection.json", |
| 363 | + "jobStatus": "COMPLETE", |
| 364 | + "jobId": 1284, |
| 365 | + "systemVersion": "6.3", |
| 366 | + "trackCount": 1, |
| 367 | + "systemHostname": "openmpf-master" |
| 368 | + }, |
| 369 | + "system": "OpenMPF", |
| 370 | + "securityTag": "UNCLASSIFIED", |
| 371 | + "informationType": "OpenMPF FACE", |
| 372 | + "assertionId": "4874829f666d79881f7803207c7359dc781b97d2c68b471136bf7235a397c5cd" |
| 373 | +} |
| 374 | +</code></pre> |
| 375 | + |
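<p>As a rough illustration of how the base URL and the appended path fit together, the following Python sketch hashes a media file and builds the full TiesDb URL. The function names, the example host, and the chunk size are assumptions made for illustration; this is not the Workflow Manager's actual implementation.</p>

```python
import hashlib
from urllib.parse import urlencode

# Path the release notes say the Workflow Manager appends to the base URL.
TIES_DB_PATH = "/api/db/supplementals"


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the hex SHA-256 digest of a file, read in chunks.

    Hypothetical helper; the chunk size is an arbitrary choice.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as media_file:
        for chunk in iter(lambda: media_file.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_ties_db_post_url(base_url, media_sha256):
    """Join the <http|https>://<host>:<port> part with the appended path.

    Hypothetical helper; base_url plays the role of TIES_DB_URL / ties.db.url.
    """
    query = urlencode({"sha256Hash": media_sha256})
    return f"{base_url.rstrip('/')}{TIES_DB_PATH}?{query}"


if __name__ == "__main__":
    # Example host and hash are placeholders, not real endpoints or media.
    print(build_ties_db_post_url("https://tiesdb.example.com:8443", "abc123"))
```

<p>Only the scheme, host, and port belong in the property value; the path and query string are added afterwards, so a trailing slash on the base URL is stripped here to avoid a doubled separator.</p>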
| 376 | +<h3>Natural Language Processing (NLP) Text Correction Component</h3> |
| 377 | + |
| 378 | +<ul> |
| 379 | +<li>This component utilizes the <a href="https://github.com/MSeal/cython_hunspell">CyHunspell</a> library, which is a Python |
| 380 | + port of the <a href="https://github.com/hunspell/hunspell">Hunspell</a> spell-checking library, to perform post-processing |
| 381 | + correction of OCR text. In general, it's intended to be used in a pipeline after a component like |
| 382 | + TesseractOCRTextDetection that generates <code>TEXT</code> tracks. These tracks are then fed forward into NlpTextCorrection, |
| 383 | + which will add a <code>CORRECTED TEXT</code> property to the existing tracks. |
| 384 | + The <code>TESSERACT OCR TEXT DETECTION WITH NLP TEXT CORRECTION PIPELINE</code> performs this behavior. The component can also |
| 385 | + run on its own to process plain text files. Refer to |
| 386 | + the <a href="https://github.com/openmpf/openmpf-components/tree/master/python/NlpTextCorrection#readme">README</a> for details.</li> |
| 387 | +</ul> |
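<p>The feed-forward idea can be sketched in a few lines of Python. This toy version does not use CyHunspell or the OpenMPF component API: it swaps in the standard library's <code>difflib</code> for the spell checker, uses a tiny hypothetical word list, and assumes the OCR output lives in a <code>TEXT</code> track property. It only shows the shape of the step, in which a new <code>CORRECTED TEXT</code> property is added while the original text is left in place.</p>

```python
import difflib

# Hypothetical word list standing in for a real Hunspell dictionary.
VOCABULARY = ["the", "quick", "brown", "fox", "jumps"]


def correct_text(text, vocabulary=VOCABULARY):
    """Replace each unknown word with its closest vocabulary match, if any."""
    corrected = []
    for word in text.split():
        if word.lower() in vocabulary:
            corrected.append(word)
            continue
        # difflib is a stand-in for Hunspell's suggestion lookup.
        matches = difflib.get_close_matches(word.lower(), vocabulary, n=1, cutoff=0.6)
        corrected.append(matches[0] if matches else word)
    return " ".join(corrected)


def add_corrected_text(track_properties):
    """Return a copy of the track properties with CORRECTED TEXT added.

    The "TEXT" input key is an assumption about the upstream OCR output.
    """
    updated = dict(track_properties)
    updated["CORRECTED TEXT"] = correct_text(track_properties.get("TEXT", ""))
    return updated


if __name__ == "__main__":
    print(add_corrected_text({"TEXT": "the quikc brown fox"}))
```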
| 388 | +<h3>Azure Cognitive Services (ACS) Read Component</h3> |
| 389 | + |
| 390 | +<ul> |
| 391 | +<li>This component utilizes |
| 392 | + the <a href="https://westcentralus.dev.cognitive.microsoft.com/docs/services/computer-vision-v3-1-ga/operations/5d986960601faab4bf452005">Azure Cognitive Services Read Detection REST endpoint</a> |
| 393 | + to extract formatted text from documents (PDFs), images, and videos. Refer to |
| 394 | + the <a href="https://github.com/openmpf/openmpf-components/tree/master/python/AzureReadTextDetection#readme">README</a> for |
| 395 | + details.</li> |
| 396 | +</ul> |
| 397 | +<h3>Updates</h3> |
| 398 | + |
| 399 | +<ul> |
| 400 | +<li>[<a href="https://github.com/openmpf/openmpf/issues/1151">#1151</a>] Now supports <code>IN_PROGRESS_WITH_WARNINGS</code> status</li> |
| 401 | +<li>[<a href="https://github.com/openmpf/openmpf/issues/1234">#1234</a>] Now sorts JSON output object media by media id</li> |
| 402 | +<li>[<a href="https://github.com/openmpf/openmpf/issues/1341">#1341</a>] Added job id to all batch-job-specific Workflow Manager log |
| 403 | + messages</li> |
| 404 | +<li>[<a href="https://github.com/openmpf/openmpf/issues/1349">#1349</a>] Improved reporting and recording job status</li> |
| 405 | +<li>[<a href="https://github.com/openmpf/openmpf/issues/1353">#1353</a>] Updated the Workflow Manager to remove and warn about |
| 406 | + zero-size detections</li> |
| 407 | +<li>[<a href="https://github.com/openmpf/openmpf/issues/1382">#1382</a>] Updated Tika version to 1.27 for TikaImageDetection and |
| 408 | + TikaTextDetection components</li> |
| 409 | +<li>[<a href="https://github.com/openmpf/openmpf/issues/1387">#1387</a>] Markup can now be configured in a |
| 410 | + component's <code>descriptor.json</code></li> |
| 411 | +</ul> |
| 412 | +<h3>Bug Fixes</h3> |
| 413 | + |
| 414 | +<ul> |
| 415 | +<li>[<a href="https://github.com/openmpf/openmpf/issues/1080">#1080</a>] Batch jobs no longer prematurely set to 100% completion |
| 416 | + during artifact extraction</li> |
| 417 | +<li>[<a href="https://github.com/openmpf/openmpf/issues/1106">#1106</a>] When a job ends in <code>ERROR</code> or <code>CANCELLED_BY_SHUTDOWN</code> the |
| 418 | + job status UI now shows an End Date</li> |
| 419 | +<li>[<a href="https://github.com/openmpf/openmpf/issues/1158">#1158</a>] JSON output object URI no longer changes when callback fails</li> |
| 420 | +<li>[<a href="https://github.com/openmpf/openmpf/issues/1317">#1317</a>] TikaTextDetection no longer generates first PDF track |
| 421 | + at <code>PAGE_NUM</code> 2</li> |
| 422 | +<li>[<a href="https://github.com/openmpf/openmpf/issues/1337">#1337</a>] Now using <code>MPF_BAD_FRAME_SIZE</code> instead |
| 423 | + of <code>MPF_DETECTION_FAILED</code> for OpenCV empty/resize exception</li> |
| 424 | +<li>[<a href="https://github.com/openmpf/openmpf/issues/1359">#1359</a>] Image detection tracks no longer |
| 425 | + have <code>endOffsetFrameInclusive</code> set to 1</li> |
| 426 | +<li>[<a href="https://github.com/openmpf/openmpf/issues/1373">#1373</a>] Large files uploaded through the Workflow Manager web |
| 427 | + UI are no longer truncated to the first 865032704 bytes</li> |
| 428 | +<li>[<a href="https://github.com/openmpf/openmpf/issues/1379">#1379</a>] TikaImageDetection component now avoids conflicts by no |
| 429 | + longer using the same path when extracting images for jobs with multiple pieces of media</li> |
| 430 | +<li>[<a href="https://github.com/openmpf/openmpf/issues/1386">#1386</a>] FeedForwardFrameCropper in the Python SDK now handles |
| 431 | + negative coordinates properly</li> |
| 432 | +<li>[<a href="https://github.com/openmpf/openmpf/issues/1391">#1391</a>] If a job is configured to upload markup and markup fails, |
| 433 | + the job no longer gets stuck</li> |
| 434 | +</ul> |
| 435 | +<h3>Known Issues</h3> |
| 436 | + |
| 437 | +<ul> |
| 438 | +<li>[<a href="https://github.com/openmpf/openmpf/issues/1372">#1372</a>] TikaImageDetection misses images in PowerPoint and Word |
| 439 | + documents</li> |
| 440 | +<li>[<a href="https://github.com/openmpf/openmpf/issues/1389">#1389</a>] NlpTextCorrection does not properly read the value |
| 441 | + of <code>FULL_TEXT_CORRECTION_OUTPUT</code></li> |
| 442 | +</ul> |
320 | 443 | <h1 id="openmpf-62x">OpenMPF 6.2.x</h1> |
321 | 444 | <h2>6.2.0: May 2021</h2> |
322 | 445 |
|
|