Microservice that exports data from Kaleidos to be published on Themis. An export can be triggered manually or scheduled via a publication-activity in Kaleidos.
Add the following snippet to your docker-compose.yml:
themis-export:
  image: kanselarij-vlaanderen/themis-export-service
  links:
    - database:database
  volumes:
    - ./data/exports/:/share
The final result of the export will be written to the volume mounted in /share.
An export can be triggered in 2 ways:
- By manually calling the API endpoint for a specific meeting
- By creating a themis-publication-activity in the Kaleidos DB with a (planned) start date and related to a meeting
The following environment variables can be configured:
- MU_SPARQL_ENDPOINT (default: http://database:8890/sparql): mu-authorization based SPARQL endpoint of the internal triple store. Intermediate results are written to this endpoint so that they trigger delta notifications.
- VIRTUOSO_SPARQL_ENDPOINT (default: http://triplestore:8890/sparql): SPARQL endpoint of the Virtuoso triple store, used to perform fast queries that skip mu-authorization and to extract the ttl files
- EXPORT_BATCH_SIZE (default: 1000): number of triples to export per batch in the final dump
- PUBLICATION_CRON_PATTERN (default: 0 * * * * * = every minute): frequency at which scheduled publications are fetched from Kaleidos
- PUBLICATION_WINDOW_MILLIS (default: 24h): max window to fetch publication-activities in Kaleidos for. The window determines the period in which scheduled publications will still be executed (with a delay) if this export service was not running at the moment the publication was originally planned.
- NB_OF_VIRTUOSO_QUERY_RETRIES (default: 6): max number of times to retry a query to Virtuoso. Sometimes Virtuoso is busy creating a checkpoint, and a request made at that moment will fail. To remedy this, failed requests are retried a number of times.
- VIRTUOSO_QUERY_RETRY_MILLIS (default: 1000): the wait time in milliseconds between query retries
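The retry behaviour configured by NB_OF_VIRTUOSO_QUERY_RETRIES and VIRTUOSO_QUERY_RETRY_MILLIS can be illustrated with a minimal Python sketch. It is not the service's actual code: `query_with_retries` and `run_query` are hypothetical names, and it assumes that the retry count excludes the initial attempt (so 6 retries means at most 7 attempts).

```python
import os
import time


def query_with_retries(run_query, retries=None, wait_millis=None, sleep=time.sleep):
    """Call run_query(), retrying with a fixed wait when it raises.

    run_query is a hypothetical zero-argument callable that performs the
    SPARQL request and raises an exception when Virtuoso rejects it.
    """
    if retries is None:
        retries = int(os.environ.get("NB_OF_VIRTUOSO_QUERY_RETRIES", "6"))
    if wait_millis is None:
        wait_millis = int(os.environ.get("VIRTUOSO_QUERY_RETRY_MILLIS", "1000"))

    attempt = 0
    while True:
        try:
            return run_query()
        except Exception:
            if attempt >= retries:
                raise  # retries exhausted, surface the last error
            attempt += 1
            sleep(wait_millis / 1000)  # fixed wait before the next attempt
```

The `sleep` parameter exists only so the wait can be replaced in tests. A fixed wait is used here because the configuration describes a single interval, not a backoff schedule.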
Prefix | URI |
---|---|
dct | http://purl.org/dc/terms/ |
adms | http://www.w3.org/ns/adms# |
prov | http://www.w3.org/ns/prov# |
ext | http://mu.semte.ch/vocabularies/ext |
Resource representing an export job. Jobs are executed one by one using the FIFO approach. When an export job fails, it will be retried up to 5 times before being permanently marked as a failed job.
ext:PublicExportJob
Name | Predicate | Range | Definition |
---|---|---|---|
status | adms:status | rdfs:Resource | Status of the export job, initially set to <http://data.kaleidos.vlaanderen.be/public-export-job-statuses/scheduled> |
meeting | prov:used | rdfs:Resource | Meeting (in Kaleidos) the export job is executed for |
created | dct:created | xsd:dateTime | Datetime of creation of the job |
scope | ext:scope | xsd:string | Scope of the export jobs. Possible values are newsitems and documents. A job may contain multiple scopes. |
results | prov:generated | rdfs:Resource | The resources generated by the export job. |
source | dct:source | rdfs:Resource | Source of the export job (e.g. a publication-activity in Kaleidos) |
The status of the export job will be updated to reflect the progress of the job. The following statuses are known:
- http://data.kaleidos.vlaanderen.be/public-export-job-statuses/scheduled
- http://data.kaleidos.vlaanderen.be/public-export-job-statuses/ongoing
- http://data.kaleidos.vlaanderen.be/public-export-job-statuses/success
- http://data.kaleidos.vlaanderen.be/public-export-job-statuses/failure
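The job lifecycle (FIFO execution, up to 5 retries, status updates) can be sketched in Python as below. This is an illustration under stated assumptions, not the service's implementation: jobs are plain in-memory dicts instead of resources in the triple store, `run_jobs` and `execute` are hypothetical names, and a failed job is assumed to be rescheduled at the back of the queue.

```python
from collections import deque

STATUS_BASE = "http://data.kaleidos.vlaanderen.be/public-export-job-statuses/"
SCHEDULED = STATUS_BASE + "scheduled"
ONGOING = STATUS_BASE + "ongoing"
SUCCESS = STATUS_BASE + "success"
FAILURE = STATUS_BASE + "failure"

MAX_RETRIES = 5  # a failing job is retried up to 5 times


def run_jobs(jobs, execute):
    """Process scheduled jobs one by one, oldest first.

    jobs: list of dicts with a "status" key (hypothetical stand-in for
    ext:PublicExportJob resources), ordered by creation time.
    execute: hypothetical callable performing the export; raises on failure.
    """
    queue = deque(job for job in jobs if job["status"] == SCHEDULED)
    while queue:
        job = queue.popleft()  # FIFO: take the oldest waiting job
        job["status"] = ONGOING
        try:
            execute(job)
            job["status"] = SUCCESS
        except Exception:
            if job.get("retries", 0) >= MAX_RETRIES:
                job["status"] = FAILURE  # permanently failed
            else:
                job["retries"] = job.get("retries", 0) + 1
                job["status"] = SCHEDULED
                queue.append(job)  # assumption: retry after the other jobs
```

With these assumptions a job that always fails is attempted 6 times in total (the initial run plus 5 retries) before it ends in the failure status.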
Resource representing a planned publication for a meeting. This resource resides in the Kaleidos DB.
ext:ThemisPublicationActivity
Name | Predicate | Range | Definition |
---|---|---|---|
meeting | prov:used | rdfs:Resource | Meeting (in Kaleidos) the publication is planned for |
planned-start | prov:startedAtTime | xsd:dateTime | Datetime on which the publication is scheduled. |
scope | ext:scope | xsd:string | Scope of the publication. Possible values are newsitems and documents. A publication may contain multiple scopes. |
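How the planned-start date interacts with PUBLICATION_WINDOW_MILLIS can be sketched as a simple time check. This is a hedged illustration only: `is_due` is a hypothetical helper, the inclusive bounds are an assumption, and tracking which activities have already been executed is left out.

```python
from datetime import datetime, timedelta, timezone

WINDOW = timedelta(hours=24)  # default publication window (24h)


def is_due(planned_start, now, window=WINDOW):
    """True if a publication planned at planned_start should run at now.

    A publication is picked up once its planned start has passed, and it is
    still executed (with a delay) as long as it falls inside the window.
    """
    return now - window <= planned_start <= now


now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
missed_recently = now - timedelta(hours=3)   # service was down, still executed
missed_long_ago = now - timedelta(hours=30)  # outside the window, skipped
```

Here `is_due(missed_recently, now)` holds while `is_due(missed_long_ago, now)` does not, which matches the description of the window as the period in which delayed publications are still executed.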
The data model used for the exported data is documented on the Themis documentation website.
Trigger the publication of the Kaleidos meeting with the given :uuid. In case the meeting has already been published before, the new publication will be linked to the previous one on Themis.
Example request body:
{
  "data": {
    "type": "publication-activity",
    "attributes": {
      "scope": ["newsitems", "documents"],
      "source": "http://themis.vlaanderen.be/publication-activity/933ea4cc-3786-4a5a-bace-8c99ce8c44aa"
    }
  }
}
The following attributes can be set on the publication-activity:
- scope: determines the scope of the export. Supported values are "newsitems" and "documents". Documents can only be exported if the newsitems are exported as well. To unpublish a meeting, send an empty array as scope.
- source (optional): URI of the publication-activity in Kaleidos that triggered the export
- 202 Accepted on successful trigger of an export. The Location response header contains the endpoint to monitor the progress of the job.
- 400 Bad Request on invalid scope in the request body
- 404 Not Found if a meeting with the given id cannot be found in Kaleidos
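A client can check the scope rules before sending the request. The Python sketch below builds a request body in the shape of the example above and applies the documented constraints. `validate_scope` and `build_request_body` are hypothetical helper names, and the exact conditions under which the service answers with 400 Bad Request may differ from this client-side check.

```python
import json

SUPPORTED_SCOPES = {"newsitems", "documents"}


def validate_scope(scope):
    """Return None if scope looks acceptable, else an error message."""
    if not isinstance(scope, list):
        return "scope must be an array"
    unsupported = [s for s in scope if s not in SUPPORTED_SCOPES]
    if unsupported:
        return "unsupported scope value(s): " + ", ".join(map(str, unsupported))
    if "documents" in scope and "newsitems" not in scope:
        return "documents can only be exported together with newsitems"
    return None  # also covers the empty array, which unpublishes the meeting


def build_request_body(scope, source=None):
    """Serialize a publication-activity request body as JSON."""
    error = validate_scope(scope)
    if error:
        raise ValueError(error)
    attributes = {"scope": scope}
    if source is not None:  # source is optional
        attributes["source"] = source
    return json.dumps(
        {"data": {"type": "publication-activity", "attributes": attributes}}
    )
```

For example, `build_request_body([])` produces a body that unpublishes the meeting, while `build_request_body(["documents"])` raises a `ValueError` because newsitems are missing from the scope.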
Get the details, including the status, of an export job
- 200 OK with job details in the response body
- 404 Not Found if a job with the given id cannot be found
Get a summary of the triggered export jobs. Contains the number of export jobs, grouped per status.
- 200 OK with the summary in the response body