Initial sketches for a ResourceSync Source Framework
While sketching a generic implementation of the source side of a ResourceSync framework, we lean heavily on rspub-core, a file-system-specific implementation of the source side, and on omtd-rspub-elastic, an implementation using the Elasticsearch storage system.
It appears that by taking the Generator out of the Executor in these implementations and making the Generators pluggable, we can create a generic framework that can easily be adapted to various types of resource management systems.
In a fully implemented plugin, a Generator has specialisations for two syncing strategies: a ResourceGenerator and a ChangeGenerator. The ResourceGenerator is capable of listing metadata on initial resources. The ChangeGenerator is capable of listing metadata on newly created, updated and deleted resources with reference to a previous state of the body of resources.
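The comparison a ChangeGenerator makes against a previous state can be sketched as follows. This is a minimal illustration, not code from rspub-core or omtd-rspub-elastic: the function name `detect_changes` and the representation of a state as a mapping from resource location to content hash are assumptions made for the example.

```python
from typing import Dict, Iterator, Tuple


def detect_changes(previous: Dict[str, str],
                   current: Dict[str, str]) -> Iterator[Tuple[str, str]]:
    """Yield (change, loc) pairs.

    Both arguments are hypothetical state snapshots that map a resource
    location to a content hash.
    """
    for loc in sorted(current):
        if loc not in previous:
            yield ("created", loc)
        elif previous[loc] != current[loc]:
            yield ("updated", loc)
    for loc in sorted(previous):
        if loc not in current:
            yield ("deleted", loc)


previous_state = {"a.txt": "h1", "b.txt": "h2", "c.txt": "h3"}
current_state = {"a.txt": "h1", "b.txt": "h2-changed", "d.txt": "h4"}
changes = list(detect_changes(previous_state, current_state))
```

In a real plugin the previous state would more likely be read back from the last published resourcelist and changelists than kept in memory.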
We will first walk through the components of such a system, their tasks and interactions, and then look at the required variability in the system.
Fig. 1. Component Diagram for a ResourceSync Source Framework. Optional components and
interfaces are in dotted lines. (See )
The generic framework houses all functionality needed to do a complete ResourceSync source-side synchronisation, except for picking the resources.
The generic framework has the following points of interaction with its environment.
Required Interfaces
- Connection point for an implementation-specific Generator.
- Connection point for implementation-specific parameters and configuration.
Provided Interfaces
- Central entrance point of the framework. Start a ResourceSync run, possibly, in time, with a repeater function for successive sync runs and a time gap between runs.
- Observable: Register observers that will receive notifications of events taking place during a ResourceSync execution. A default observer could be Reporter, which keeps a journal of successive runs by writing a summary of each run (start time, how many resources were affected, end time).
- IParas: Read and write parameters and configuration, compute derived parameters.
- IXml: Read and write ResourceSync sitemap documents from xml to a class structure and vice versa.
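The observer mechanism with a journal-keeping default observer could look roughly like this. The sketch is hypothetical: the event names (`start`, `resource`, `end`), the method names and the shape of a journal entry are assumptions made for the example, not part of any existing implementation.

```python
from datetime import datetime, timezone
from typing import Callable, Dict, List, Optional

Observer = Callable[[str, Dict], None]


class Observable:
    """Keeps a list of observers and notifies them of events."""

    def __init__(self) -> None:
        self._observers: List[Observer] = []

    def register(self, observer: Observer) -> None:
        self._observers.append(observer)

    def notify(self, event: str, **details) -> None:
        for observer in self._observers:
            observer(event, details)


class Reporter:
    """Default observer: journals start time, resource count and end time."""

    def __init__(self) -> None:
        self.journal: List[Dict] = []
        self._current: Optional[Dict] = None

    def __call__(self, event: str, details: Dict) -> None:
        now = datetime.now(timezone.utc)
        if event == "start":
            self._current = {"start": now, "resources": 0}
        elif event == "resource" and self._current is not None:
            self._current["resources"] += 1
        elif event == "end" and self._current is not None:
            self._current["end"] = now
            self.journal.append(self._current)
            self._current = None


# Simulate one run that touches two resources.
run = Observable()
reporter = Reporter()
run.register(reporter)
run.notify("start")
run.notify("resource", loc="http://example.com/a.txt")
run.notify("resource", loc="http://example.com/b.txt")
run.notify("end")
```

A persistent Reporter would append each summary to a file rather than to an in-memory list.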
Central component, director of execution, orchestrating execution.
Question: Should it also be responsible for wiring the other components together, or is this a task for an external plugin framework?
Required interfaces:
- Parameters, configuration.
- IExecute: For executing a resourcesync run.
- ISend: Optional. For moving/copying/sending resourcesync metadata files and/or resources to the document root of a web server.
Provided interface:
- Start a ResourceSync run.
- Observable: Register observers that will receive notifications of events taking place during ResourceSync execution.
Component capable of converting resourcesync sitemaps from xml to classes and vice versa.
Provided interface:
- Convert to and from hierarchical classes and xml streams/files.
Component capable of validating, computing and persisting parameters and configuration details for different configurations.
Required interface:
- Optional. Provide functionality for persisting implementation specific parameters and configuration details. Make implementation specific parameters accessible.
Provided interface:
- Read and write parameters and configuration details from/to file, validate parameters, list configurations.
Provided interface:
- Validate and compute implementation specific parameters.
Pluggable component capable of yielding metadata items (the data in the element url of a urlset).
A fully implemented Generator has both a ResourceGenerator and a ChangeGenerator. The ResourceGenerator is in use for strategies ResourceList and ResourceDump; the ChangeGenerator for ChangeList and ChangeDump.
For each applicable resource encountered, a ResourceGenerator yields through its provided interface at least the values for the elements/attributes:
<loc>, <lastmod>, <rs:md hash, length, type/>
For each applicable resource encountered, a ChangeGenerator yields through its provided interface at least the values for the elements/attributes:
<loc>, <lastmod>, <rs:md change, datetime, hash, length, type/>
Both can also give (values for) the location of the resource on the local file system and/or the identifier for the resource.
A Generator that has the faculty dump-capable is able to supply a path to, or a stream of, the resource so that it can be packed in a dump.
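As an illustration of what a file-system based ResourceGenerator might yield, here is a minimal sketch. The class `ResourceMetadata`, the function `generate_resources` and the choice of MD5 for the hash are assumptions made for the example; they are not taken from rspub-core.

```python
import hashlib
import mimetypes
import os
import tempfile
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Iterator, Optional


@dataclass
class ResourceMetadata:
    """Hypothetical metadata item: the values for one url element."""
    loc: str
    lastmod: str
    hash: str
    length: int
    mime_type: Optional[str]
    path: Optional[str] = None  # local path, needed by a dump-capable Generator


def generate_resources(root: str, url_prefix: str) -> Iterator[ResourceMetadata]:
    """Walk a directory tree and yield one metadata item per file."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            with open(path, "rb") as handle:
                data = handle.read()
            relative = os.path.relpath(path, root).replace(os.sep, "/")
            mtime = datetime.fromtimestamp(os.path.getmtime(path), tz=timezone.utc)
            yield ResourceMetadata(
                loc=url_prefix.rstrip("/") + "/" + relative,
                lastmod=mtime.strftime("%Y-%m-%dT%H:%M:%SZ"),
                hash="md5:" + hashlib.md5(data).hexdigest(),
                length=len(data),
                mime_type=mimetypes.guess_type(path)[0],
                path=path,
            )


# Try it on a throwaway directory holding a single file.
with tempfile.TemporaryDirectory() as root:
    with open(os.path.join(root, "hello.txt"), "wb") as handle:
        handle.write(b"hello")
    items = list(generate_resources(root, "http://example.com/res"))
```

A ChangeGenerator item would carry the same fields plus the change type and its datetime. Reading each file fully into memory keeps the sketch short; large resources would call for chunked hashing.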
Required interfaces:
- Optional. Select resources. (For Generators based on indexing systems this is probably integrated in the query of the generator itself.)
- IFilter: Optional, pluggable. Filter resources. Filtering can be based on metadata on the resource or the content of the resource itself. (Filtering based on other criteria?)
- IXml: Optional. Required if the Generator is responsible for comparing the present resource state with the previous resource state.
Provided interface:
- Yield applicable resource metadata items.
Component capable of yielding resourcelists or changelists, resourcedumps or changedumps.
The Executor delivers a cohesive set of ResourceSync sitemap documents under exactly one capability list and updates the description.
The Executor stages a ResourceSync execution:
- Start processing
- Prepare metadata directory
- Generate ResourceSync documents
- (Pack resources) (Only needed for dump variants.)
- Post process ResourceSync documents
- Create indexes (if applicable)
- Create/update capability list
- Create/update description
- End processing
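The staging above could be expressed as a template method in which optional stages are skipped. The sketch below is hypothetical: the class `StagedExecutor` and its flags are invented for the example, and each stage only records that it ran instead of doing real work.

```python
from typing import List


class StagedExecutor:
    """Hypothetical skeleton of the stages; subclasses would fill them in."""

    def __init__(self, is_dump: bool = False, needs_index: bool = False) -> None:
        self.is_dump = is_dump          # True for resourcedump and changedump
        self.needs_index = needs_index  # True if the documents need an index
        self.trace: List[str] = []

    def _stage(self, name: str) -> None:
        # Placeholder for real work: just remember that the stage ran.
        self.trace.append(name)

    def execute(self) -> List[str]:
        self.trace = []
        self._stage("start processing")
        self._stage("prepare metadata directory")
        self._stage("generate ResourceSync documents")
        if self.is_dump:
            self._stage("pack resources")
        self._stage("post process ResourceSync documents")
        if self.needs_index:
            self._stage("create indexes")
        self._stage("create/update capability list")
        self._stage("create/update description")
        self._stage("end processing")
        return self.trace


list_trace = StagedExecutor().execute()
dump_trace = StagedExecutor(is_dump=True, needs_index=True).execute()
```

Each `_stage` call is also a natural place to notify registered observers.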
Required interfaces:
- Source of applicable resource metadata items.
- IXml: For producing sitemaps, xml streams/files.
- IParas: Source of validated parameters, derived parameters.
Provided interface:
- Execute a specific resourcesync run.
Provides logistic services for handling resourcesync metadata and resources after an execution. After a successful ResourceSync execution several scenarios are possible:
- Resources and ResourceSync metadata are already under the document root of a web server. Synchronization was done 'in situ'. No further action is needed.
- Resources and metadata are on a file-sync service such as ownCloud and a share is mounted on the web server machine. No further action is needed.
- Local copy. Resources are on the same machine as the web server, but published resources have to be moved/copied under the document root of the web server.
- Remote copy. Resources and metadata are on a different machine than the web server. Resources and metadata have to be moved by means of secure copy protocol.
- Zip. Pack resources and metadata in a zip-file that can be handed to a systems admin.
Required interface:
- Source of validated parameters, derived parameters.
Provided interface:
- Move/copy/send resourcesync metadata files and/or resources to the document root of a web server or pack them in a zip file.
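Two of these scenarios, local copy and zip, need nothing beyond the Python standard library. The following is a sketch under assumptions: the function names `local_copy` and `zip_up` are invented for the example, and the remote copy scenario is left out because it depends on the SSH tooling chosen.

```python
import os
import shutil
import tempfile
import zipfile


def local_copy(source_dir: str, document_root: str) -> None:
    """Copy published files under the document root (needs Python 3.8+)."""
    shutil.copytree(source_dir, document_root, dirs_exist_ok=True)


def zip_up(source_dir: str, zip_path: str) -> str:
    """Pack everything under source_dir in a zip file outside source_dir."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for dirpath, _dirnames, filenames in os.walk(source_dir):
            for name in filenames:
                path = os.path.join(dirpath, name)
                archive.write(path, os.path.relpath(path, source_dir))
    return zip_path


# Exercise both scenarios on throwaway directories.
with tempfile.TemporaryDirectory() as source, \
        tempfile.TemporaryDirectory() as docroot, \
        tempfile.TemporaryDirectory() as outbox:
    with open(os.path.join(source, "resourcelist.xml"), "w") as handle:
        handle.write("<urlset/>")
    local_copy(source, docroot)
    copied = sorted(os.listdir(docroot))
    archive_path = zip_up(source, os.path.join(outbox, "publication.zip"))
    with zipfile.ZipFile(archive_path) as archive:
        zipped = sorted(archive.namelist())
```

Writing the zip file to a directory outside `source_dir` avoids packing the archive into itself.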
Select resources.
Filter resources. Pluggable. For scenarios where intricate selecting or filtering of resources is needed. Examples: only schema-valid xml resources must be published; resources that contain certain key words must be grouped in sets under different capability lists; metadata on the resource in a database is decisive for publishing the resource.
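A pluggable filter can be thought of as a set of predicates that a resource must pass. The sketch below is hypothetical: the names `gate`, `is_well_formed_xml` and `contains_keyword` are invented for the example, and the well-formedness check is only a stand-in for real schema validation, which would need an XML schema library.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Iterable, Iterator, Tuple

Predicate = Callable[[str], bool]


def is_well_formed_xml(content: str) -> bool:
    """Stand-in for schema validation: only checks well-formedness."""
    try:
        ET.fromstring(content)
        return True
    except ET.ParseError:
        return False


def contains_keyword(keyword: str) -> Predicate:
    """Build a predicate that passes content containing the keyword."""
    return lambda content: keyword in content


def gate(resources: Iterable[Tuple[str, str]],
         *predicates: Predicate) -> Iterator[str]:
    """Yield the location of each resource that passes every predicate."""
    for loc, content in resources:
        if all(predicate(content) for predicate in predicates):
            yield loc


candidates = [
    ("a.xml", "<doc>climate</doc>"),
    ("b.xml", "<doc>climate"),       # not well-formed
    ("c.xml", "<doc>other</doc>"),   # lacks the keyword
]
passed = list(gate(candidates, is_well_formed_xml, contains_keyword("climate")))
```

Grouping resources under different capability lists would amount to running the same candidates through several gates, one per set.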
The generic framework should facilitate a plugin mechanism for Selectors and Gates.
The system is expected to have variability at several points. The variability can be in the configuration of external plugins and in the program flow of the process. Both kinds of variability can be interdependent.
Fig. 2. Variability points. (See )
Fig. 3. Variability model. (See )
The variability model shows the variability points (triangles) and their corresponding variants or choices (rectangles). Optional choices are indicated with a dotted line between point and variant. Mandatory choices and the cardinality are indicated with a slash-notation (1/4). Dependencies between points and choices are indicated with dotted arrows. Alternatively, interdependencies are marked with colored areas. (The modelling of variability is described in Software Product Line Engineering (Pohl, Klaus, Böckle, Günter, van der Linden, Frank J.).) Plugin variants, or better, variants outside the generic framework, are depicted against a grey background.
There are required core parameters (metadata_dir, url_prefix, strategy etc.) and implementation specific parameters.
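Validation of the core parameters, plus one derived parameter, might be sketched like this. The class `CoreParas`, the validation rules and the way the capability list URL is derived are assumptions made for the example; only the parameter names `metadata_dir`, `url_prefix` and `strategy` come from the text.

```python
from dataclasses import dataclass
from urllib.parse import urlparse

STRATEGIES = ("resourcelist", "changelist", "resourcedump", "changedump")


@dataclass
class CoreParas:
    """Hypothetical holder for the required core parameters."""
    metadata_dir: str
    url_prefix: str
    strategy: str

    def __post_init__(self) -> None:
        if not self.metadata_dir:
            raise ValueError("metadata_dir must not be empty")
        parsed = urlparse(self.url_prefix)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            raise ValueError("url_prefix must be an absolute http(s) URL")
        if self.strategy not in STRATEGIES:
            raise ValueError("strategy must be one of " + ", ".join(STRATEGIES))

    @property
    def capability_list_url(self) -> str:
        # Derived parameter; this URL layout is an assumption for illustration.
        return "{}/{}/capabilitylist.xml".format(
            self.url_prefix.rstrip("/"), self.metadata_dir.strip("/"))


paras = CoreParas("metadata", "http://example.com/rs/", "resourcelist")
```

Implementation-specific parameters could live in a second class of the same kind, supplied by the plugin.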
What should the executor produce? Simply an enumeration of the 4 types: resourcelist, changelist, resourcedump and changedump. Strategy is coupled one-to-one with VP03 Execution Type.
Choice of executor: ResourceList, ChangeList, ResourceDump or ChangeDump. The Execution Type is dependent upon VP04 Generator Type and VP0 Generator Faculties. With a Generator Implementation that is not Dump-Capable, no dumps can be made. With a Generator Implementation that has no Change Generator, no ChangeLists or ChangeDumps can be made.
Type of generator. Resource Generator, Change Generator or both. An implementation must provide at least one type of generator.
Faculties of the implementation. Optional variant Dump-Capable. An implementation is Dump-Capable if resources can be obtained in order to create dumps.
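The dependencies between execution type, generator type and generator faculties can be written down as a small rule set. The function name `allowed_execution_types` and its boolean arguments are assumptions made for this sketch.

```python
from typing import Set


def allowed_execution_types(has_resource_generator: bool,
                            has_change_generator: bool,
                            dump_capable: bool) -> Set[str]:
    """Return the execution types a Generator implementation can support."""
    if not (has_resource_generator or has_change_generator):
        raise ValueError("an implementation must provide at least one generator type")
    allowed = set()
    if has_resource_generator:
        allowed.add("resourcelist")
        if dump_capable:
            allowed.add("resourcedump")
    if has_change_generator:
        allowed.add("changelist")
        if dump_capable:
            allowed.add("changedump")
    return allowed


lists_only = allowed_execution_types(True, True, dump_capable=False)
resources_only = allowed_execution_types(True, False, dump_capable=True)
```

A framework could run such a check when a plugin is loaded and reject a configured strategy that the plugin cannot serve.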
The actual generator plugin.
Resource Gate adapted for the type of implementation.
Various ways to move resources and metadata to the document root of a web server.