Commit 41e4e31

Update release notes for R6.3.0. (#137)
1 parent f91cf23 commit 41e4e31

6 files changed: +271 -30 lines changed

docs/docs/Release-Notes.md

Lines changed: 114 additions & 0 deletions
@@ -1,6 +1,120 @@
 **NOTICE:** This software (or technical data) was produced for the U.S. Government under contract, and is subject to the
 Rights in Data-General Clause 52.227-14, Alt. IV (DEC 2007). Copyright 2021 The MITRE Corporation. All Rights Reserved.
 
+# OpenMPF 6.3.x
+
+<h2>6.3.0: September 2021</h2>
+
+<h3>Documentation</h3>
+
+- Updated the API documents, Development Environment Guide, Node Guide, Install Guide, User Guide, Admin Guide, and
+others to clarify the difference between Docker and non-Docker behaviors.
+- Transformed Packaging and Registering a Component document into Component Descriptor Reference.
+- Split Media Segmentation Guide from User Guide.
+- Updated and renamed the Workflow Manager document to Workflow Manager Architecture.
+- Updated the various Docker guides to clarify the difference between building Docker images from scratch versus
+building them using pre-built base images on Docker Hub, emphasizing the latter.
+- Updated the Contributor Guide to document the hotfix pull request process.
+
+<h3>TiesDb Integration</h3>
+
+- TiesDb is a PostgreSQL DB with a RESTful API that stores media metadata. The metadata entries are queried using the
+hash (sha256, md5) of the media file. TIES stands
+for [Triage Import Export Schema](https://github.com/Noblis/ties-lib). TiesDb is deployed and managed externally to
+OpenMPF. For more information please contact us.
+- When a job completes, OpenMPF can post assertions to media entries that exist in TiesDb. In general, one assertion is
+generated for each algorithm run on a piece of media. It contains the job status, algorithm name, detection
+type (`FACE`, `TEXT`, `MOTION`, etc.), and number of tracks generated, as well as a link to the full JSON output
+object.
+- Each assertion serves as a lasting record so that job producers may first check TiesDb to see if an algorithm was run
+on a piece of media before submitting the same job to OpenMPF again.
+- To enable TiesDb support, set the `TIES_DB_URL` job property or `ties.db.url` system property to
+the `<http|https>://<host>:<port>` part of the URL. The Workflow Manager will append
+the `/api/db/supplementals?sha256Hash=<hash>` part. Here is an example of a TiesDb POST:
+```json
+{
+  "dataObject": {
+    "sha256OutputHash": "1f8f2a8b2f5178765dd4a2e952f97f5037c290ee8d011cd7e92fb8f57bc75f17",
+    "outputType": "FACE",
+    "algorithm": "FACECV",
+    "processDate": "2021-09-09T21:37:30.516-04:00",
+    "pipeline": "OCV FACE DETECTION PIPELINE",
+    "outputUri": "file:///home/mpf/git/openmpf-projects/openmpf/trunk/install/share/output-objects/1284/detection.json",
+    "jobStatus": "COMPLETE",
+    "jobId": 1284,
+    "systemVersion": "6.3",
+    "trackCount": 1,
+    "systemHostname": "openmpf-master"
+  },
+  "system": "OpenMPF",
+  "securityTag": "UNCLASSIFIED",
+  "informationType": "OpenMPF FACE",
+  "assertionId": "4874829f666d79881f7803207c7359dc781b97d2c68b471136bf7235a397c5cd"
+}
+```
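Note on the TiesDb URL described in this hunk: a minimal Python sketch of how the full POST URL could be assembled from the configured base URL and the media file's SHA-256 hash. The function names and the example host are hypothetical and are not taken from the Workflow Manager source; only the `/api/db/supplementals?sha256Hash=<hash>` suffix comes from the release notes.

```python
import hashlib
from urllib.parse import urlencode, urlsplit

# Path quoted in the release notes; everything else here is illustrative.
SUPPLEMENTALS_PATH = "/api/db/supplementals"


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_ties_db_post_url(base_url, media_sha256):
    """Append the supplementals path and sha256Hash query to the base URL.

    base_url is the <http|https>://<host>:<port> value that would be set in
    TIES_DB_URL or ties.db.url (hypothetical helper, not the OpenMPF API).
    """
    parts = urlsplit(base_url)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"expected <http|https>://<host>:<port>, got {base_url!r}")
    query = urlencode({"sha256Hash": media_sha256})
    return f"{base_url.rstrip('/')}{SUPPLEMENTALS_PATH}?{query}"
```

A job producer could compute `sha256_of_file(media_path)` once and reuse the same hash both to query TiesDb for existing assertions and to recognize the URL the Workflow Manager would post to.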
+
+<h3>Natural Language Processing (NLP) Text Correction Component</h3>
+
+- This component utilizes the [CyHunspell](https://github.com/MSeal/cython_hunspell) library, which is a Python
+port of the [Hunspell](https://github.com/hunspell/hunspell) spell-checking library, to perform post-processing
+correction of OCR text. In general, it's intended to be used in a pipeline after a component like
+TesseractOCRTextDetection that generates `TEXT` tracks. These tracks are then fed-forward into NlpTextCorrection,
+which will add a `CORRECTED TEXT` property to the existing tracks.
+The `TESSERACT OCR TEXT DETECTION WITH NLP TEXT CORRECTION PIPELINE` performs this behavior. The component can also
+run on its own to process plain text files. Refer to
+the [README](https://github.com/openmpf/openmpf-components/tree/master/python/NlpTextCorrection#readme) for details.
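Note on the feed-forward correction step described in this hunk: a rough Python sketch of adding a `CORRECTED TEXT` property alongside an existing `TEXT` property. This is not the NlpTextCorrection implementation and does not use CyHunspell; the standard-library `difflib` is swapped in as a stand-in spell checker so the sketch runs without extra dependencies. The tiny vocabulary, the plain-dict track properties, and both function names are assumptions made for illustration.

```python
import difflib
import re

# Tiny illustrative word list; the real component relies on Hunspell dictionaries.
VOCABULARY = ["detection", "media", "pipeline", "text", "the", "track"]


def correct_word(word, vocabulary=VOCABULARY):
    """Return the closest vocabulary word, or the word unchanged if none is close."""
    if word.lower() in vocabulary:
        return word
    matches = difflib.get_close_matches(word.lower(), vocabulary, n=1, cutoff=0.8)
    return matches[0] if matches else word


def add_corrected_text(track_properties):
    """Return a copy of the properties with a CORRECTED TEXT entry added.

    The original TEXT value is left untouched, mirroring the idea that the
    correction is added to the existing track rather than replacing its text.
    """
    text = track_properties.get("TEXT", "")
    corrected = re.sub(r"[A-Za-z]+", lambda m: correct_word(m.group(0)), text)
    return {**track_properties, "CORRECTED TEXT": corrected}
```

Keeping the original `TEXT` value next to the corrected one lets downstream consumers compare the raw OCR output with the correction.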
+
+<h3>Azure Cognitive Services (ACS) Read Component</h3>
+
+- This component utilizes
+the [Azure Cognitive Services Read Detection REST endpoint](https://westcentralus.dev.cognitive.microsoft.com/docs/services/computer-vision-v3-1-ga/operations/5d986960601faab4bf452005)
+to extract formatted text from documents (PDFs), images, and videos. Refer to
+the [README](https://github.com/openmpf/openmpf-components/tree/master/python/AzureReadTextDetection#readme) for
+details.
+
+<h3>Updates</h3>
+
+- [[#1151](https://github.com/openmpf/openmpf/issues/1151)] Now supports `IN_PROGRESS_WITH_WARNINGS` status
+- [[#1234](https://github.com/openmpf/openmpf/issues/1234)] Now sorts JSON output object media by media id
+- [[#1341](https://github.com/openmpf/openmpf/issues/1341)] Added job id to all batch-job-specific Workflow Manager log
+messages
+- [[#1349](https://github.com/openmpf/openmpf/issues/1349)] Improved reporting and recording job status
+- [[#1353](https://github.com/openmpf/openmpf/issues/1353)] Updated the Workflow Manager to remove and warn about
+zero-size detections
+- [[#1382](https://github.com/openmpf/openmpf/issues/1382)] Updated Tika version to 1.27 for TikaImageDetection and
+TikaTextDetection components
+- [[#1387](https://github.com/openmpf/openmpf/issues/1387)] Markup can now be configured in a
+component's `descriptor.json`
+
+<h3>Bug Fixes</h3>
+
+- [[#1080](https://github.com/openmpf/openmpf/issues/1080)] Batch jobs no longer prematurely set to 100% completion
+during artifact extraction
+- [[#1106](https://github.com/openmpf/openmpf/issues/1106)] When a job ends in `ERROR` or `CANCELLED_BY_SHUTDOWN` the
+job status UI now shows an End Date
+- [[#1158](https://github.com/openmpf/openmpf/issues/1158)] JSON output object URI no longer changes when callback fails
+- [[#1317](https://github.com/openmpf/openmpf/issues/1317)] TikaTextDetection no longer generates first PDF track
+at `PAGE_NUM` 2
+- [[#1337](https://github.com/openmpf/openmpf/issues/1337)] Now using `MPF_BAD_FRAME_SIZE` instead
+of `MPF_DETECTION_FAILED` for OpenCV empty/resize exception
+- [[#1359](https://github.com/openmpf/openmpf/issues/1359)] Image detection tracks no longer
+have `endOffsetFrameInclusive` set to 1
+- [[#1373](https://github.com/openmpf/openmpf/issues/1373)] When uploading large files through the Workflow Manager web
+UI, now more than the first 865032704 bytes get written
+- [[#1379](https://github.com/openmpf/openmpf/issues/1379)] TikaImageDetection component now avoids conflicts by no
+longer using the same path when extracting images for jobs with multiple pieces of media
+- [[#1386](https://github.com/openmpf/openmpf/issues/1386)] FeedForwardFrameCropper in the Python SDK now handles
+negative coordinates properly
+- [[#1391](https://github.com/openmpf/openmpf/issues/1391)] If a job is configured to upload markup and markup fails,
+the job no longer gets stuck
+
+<h3>Known Issues</h3>
+
+- [[#1372](https://github.com/openmpf/openmpf/issues/1372)] TikaImageDetection misses images in PowerPoint and Word
+documents
+- [[#1389](https://github.com/openmpf/openmpf/issues/1389)] NlpTextCorrection does not properly read the value
+of `FULL_TEXT_CORRECTION_OUTPUT`
+
 # OpenMPF 6.2.x
 
 <h2>6.2.0: May 2021</h2>

docs/site/License-And-Distribution/index.html

Lines changed: 1 addition & 2 deletions
@@ -303,8 +303,7 @@ <h2 id="open-source">Open Source</h2>
 non-commercial and re-distribution restrictions imposed by the dependencies that the OpenMPF software uses. Building
 OpenMPF and linking in these dependencies at build time or run time may result in creating a derivative work under the
 terms of the GNU General Public License. Refer to <a href="../Acknowledgements/index.html">Acknowledgements</a> for more information
-about these dependencies.
-</p>
+about these dependencies.</p>
 </blockquote>
 <h2 id="docker-distribution">Docker Distribution</h2>
 <p>The OpenMPF Docker images are released under <a href="https://www.gnu.org/licenses/old-licenses/gpl-2.0.html">GPLv2</a>, unless

docs/site/Release-Notes/index.html

Lines changed: 123 additions & 0 deletions
@@ -64,6 +64,9 @@
 
 <ul>
 
+<li class="toctree-l3"><a href="#openmpf-63x">OpenMPF 6.3.x</a></li>
+
+
 <li class="toctree-l3"><a href="#openmpf-62x">OpenMPF 6.2.x</a></li>
 
 
@@ -317,6 +320,126 @@
 
 <p><strong>NOTICE:</strong> This software (or technical data) was produced for the U.S. Government under contract, and is subject to the
 Rights in Data-General Clause 52.227-14, Alt. IV (DEC 2007). Copyright 2021 The MITRE Corporation. All Rights Reserved.</p>
+<h1 id="openmpf-63x">OpenMPF 6.3.x</h1>
+<h2>6.3.0: September 2021</h2>
+
+<h3>Documentation</h3>
+
+<ul>
+<li>Updated the API documents, Development Environment Guide, Node Guide, Install Guide, User Guide, Admin Guide, and
+others to clarify the difference between Docker and non-Docker behaviors.</li>
+<li>Transformed Packaging and Registering a Component document into Component Descriptor Reference.</li>
+<li>Split Media Segmentation Guide from User Guide.</li>
+<li>Updated and renamed the Workflow Manager document to Workflow Manager Architecture.</li>
+<li>Updated the various Docker guides to clarify the difference between building Docker images from scratch versus
+building them using pre-built base images on Docker Hub, emphasizing the latter.</li>
+<li>Updated the Contributor Guide to document the hotfix pull request process.</li>
+</ul>
+<h3>TiesDb Integration</h3>
+
+<ul>
+<li>TiesDb is a PostgreSQL DB with a RESTful API that stores media metadata. The metadata entries are queried using the
+hash (sha256, md5) of the media file. TIES stands
+for <a href="https://github.com/Noblis/ties-lib">Triage Import Export Schema</a>. TiesDb is deployed and managed externally to
+OpenMPF. For more information please contact us.</li>
+<li>When a job completes, OpenMPF can post assertions to media entries that exist in TiesDb. In general, one assertion is
+generated for each algorithm run on a piece of media. It contains the job status, algorithm name, detection
+type (<code>FACE</code>, <code>TEXT</code>, <code>MOTION</code>, etc.), and number of tracks generated, as well as a link to the full JSON output
+object.</li>
+<li>Each assertion serves as a lasting record so that job producers may first check TiesDb to see if an algorithm was run
+on a piece of media before submitting the same job to OpenMPF again.</li>
+<li>To enable TiesDb support, set the <code>TIES_DB_URL</code> job property or <code>ties.db.url</code> system property to
+the <code>&lt;http|https&gt;://&lt;host&gt;:&lt;port&gt;</code> part of the URL. The Workflow Manager will append
+the <code>/api/db/supplementals?sha256Hash=&lt;hash&gt;</code> part. Here is an example of a TiesDb POST:</li>
+</ul>
+<pre><code class="json">{
+  &quot;dataObject&quot;: {
+    &quot;sha256OutputHash&quot;: &quot;1f8f2a8b2f5178765dd4a2e952f97f5037c290ee8d011cd7e92fb8f57bc75f17&quot;,
+    &quot;outputType&quot;: &quot;FACE&quot;,
+    &quot;algorithm&quot;: &quot;FACECV&quot;,
+    &quot;processDate&quot;: &quot;2021-09-09T21:37:30.516-04:00&quot;,
+    &quot;pipeline&quot;: &quot;OCV FACE DETECTION PIPELINE&quot;,
+    &quot;outputUri&quot;: &quot;file:///home/mpf/git/openmpf-projects/openmpf/trunk/install/share/output-objects/1284/detection.json&quot;,
+    &quot;jobStatus&quot;: &quot;COMPLETE&quot;,
+    &quot;jobId&quot;: 1284,
+    &quot;systemVersion&quot;: &quot;6.3&quot;,
+    &quot;trackCount&quot;: 1,
+    &quot;systemHostname&quot;: &quot;openmpf-master&quot;
+  },
+  &quot;system&quot;: &quot;OpenMPF&quot;,
+  &quot;securityTag&quot;: &quot;UNCLASSIFIED&quot;,
+  &quot;informationType&quot;: &quot;OpenMPF FACE&quot;,
+  &quot;assertionId&quot;: &quot;4874829f666d79881f7803207c7359dc781b97d2c68b471136bf7235a397c5cd&quot;
+}
+</code></pre>
+
+<h3>Natural Language Processing (NLP) Text Correction Component</h3>
+
+<ul>
+<li>This component utilizes the <a href="https://github.com/MSeal/cython_hunspell">CyHunspell</a> library, which is a Python
+port of the <a href="https://github.com/hunspell/hunspell">Hunspell</a> spell-checking library, to perform post-processing
+correction of OCR text. In general, it's intended to be used in a pipeline after a component like
+TesseractOCRTextDetection that generates <code>TEXT</code> tracks. These tracks are then fed-forward into NlpTextCorrection,
+which will add a <code>CORRECTED TEXT</code> property to the existing tracks.
+The <code>TESSERACT OCR TEXT DETECTION WITH NLP TEXT CORRECTION PIPELINE</code> performs this behavior. The component can also
+run on its own to process plain text files. Refer to
+the <a href="https://github.com/openmpf/openmpf-components/tree/master/python/NlpTextCorrection#readme">README</a> for details.</li>
+</ul>
+<h3>Azure Cognitive Services (ACS) Read Component</h3>
+
+<ul>
+<li>This component utilizes
+the <a href="https://westcentralus.dev.cognitive.microsoft.com/docs/services/computer-vision-v3-1-ga/operations/5d986960601faab4bf452005">Azure Cognitive Services Read Detection REST endpoint</a>
+to extract formatted text from documents (PDFs), images, and videos. Refer to
+the <a href="https://github.com/openmpf/openmpf-components/tree/master/python/AzureReadTextDetection#readme">README</a> for
+details.</li>
+</ul>
+<h3>Updates</h3>
+
+<ul>
+<li>[<a href="https://github.com/openmpf/openmpf/issues/1151">#1151</a>] Now supports <code>IN_PROGRESS_WITH_WARNINGS</code> status</li>
+<li>[<a href="https://github.com/openmpf/openmpf/issues/1234">#1234</a>] Now sorts JSON output object media by media id</li>
+<li>[<a href="https://github.com/openmpf/openmpf/issues/1341">#1341</a>] Added job id to all batch-job-specific Workflow Manager log
+messages</li>
+<li>[<a href="https://github.com/openmpf/openmpf/issues/1349">#1349</a>] Improved reporting and recording job status</li>
+<li>[<a href="https://github.com/openmpf/openmpf/issues/1353">#1353</a>] Updated the Workflow Manager to remove and warn about
+zero-size detections</li>
+<li>[<a href="https://github.com/openmpf/openmpf/issues/1382">#1382</a>] Updated Tika version to 1.27 for TikaImageDetection and
+TikaTextDetection components</li>
+<li>[<a href="https://github.com/openmpf/openmpf/issues/1387">#1387</a>] Markup can now be configured in a
+component's <code>descriptor.json</code></li>
+</ul>
+<h3>Bug Fixes</h3>
+
+<ul>
+<li>[<a href="https://github.com/openmpf/openmpf/issues/1080">#1080</a>] Batch jobs no longer prematurely set to 100% completion
+during artifact extraction</li>
+<li>[<a href="https://github.com/openmpf/openmpf/issues/1106">#1106</a>] When a job ends in <code>ERROR</code> or <code>CANCELLED_BY_SHUTDOWN</code> the
+job status UI now shows an End Date</li>
+<li>[<a href="https://github.com/openmpf/openmpf/issues/1158">#1158</a>] JSON output object URI no longer changes when callback fails</li>
+<li>[<a href="https://github.com/openmpf/openmpf/issues/1317">#1317</a>] TikaTextDetection no longer generates first PDF track
+at <code>PAGE_NUM</code> 2</li>
+<li>[<a href="https://github.com/openmpf/openmpf/issues/1337">#1337</a>] Now using <code>MPF_BAD_FRAME_SIZE</code> instead
+of <code>MPF_DETECTION_FAILED</code> for OpenCV empty/resize exception</li>
+<li>[<a href="https://github.com/openmpf/openmpf/issues/1359">#1359</a>] Image detection tracks no longer
+have <code>endOffsetFrameInclusive</code> set to 1</li>
+<li>[<a href="https://github.com/openmpf/openmpf/issues/1373">#1373</a>] When uploading large files through the Workflow Manager web
+UI, now more than the first 865032704 bytes get written</li>
+<li>[<a href="https://github.com/openmpf/openmpf/issues/1379">#1379</a>] TikaImageDetection component now avoids conflicts by no
+longer using the same path when extracting images for jobs with multiple pieces of media</li>
+<li>[<a href="https://github.com/openmpf/openmpf/issues/1386">#1386</a>] FeedForwardFrameCropper in the Python SDK now handles
+negative coordinates properly</li>
+<li>[<a href="https://github.com/openmpf/openmpf/issues/1391">#1391</a>] If a job is configured to upload markup and markup fails,
+the job no longer gets stuck</li>
+</ul>
+<h3>Known Issues</h3>
+
+<ul>
+<li>[<a href="https://github.com/openmpf/openmpf/issues/1372">#1372</a>] TikaImageDetection misses images in PowerPoint and Word
+documents</li>
+<li>[<a href="https://github.com/openmpf/openmpf/issues/1389">#1389</a>] NlpTextCorrection does not properly read the value
+of <code>FULL_TEXT_CORRECTION_OUTPUT</code></li>
+</ul>
 <h1 id="openmpf-62x">OpenMPF 6.2.x</h1>
 <h2>6.2.0: May 2021</h2>
 

docs/site/index.html

Lines changed: 1 addition & 1 deletion
@@ -456,5 +456,5 @@ <h1 id="overview">Overview</h1>
 
 <!--
 MkDocs version : 0.16.0
-Build Date UTC : 2021-08-02 20:18:09
+Build Date UTC : 2021-09-09 20:26:58
 -->

docs/site/mkdocs/search_index.json

Lines changed: 7 additions & 2 deletions
Large diffs are not rendered by default.
