19 changes: 18 additions & 1 deletion README.md
@@ -19,9 +19,10 @@ jobs:
        uses: actions/checkout@v4

      - name: 🛡️ Validate Zenodo Metadata
        uses: vsoch/zenodo-validator@main
        uses: vsoch/zenodo-validator@main
        with:
          path: '.zenodo.json'
          allowed_extra_properties: 'pub_id' # Optional: allow extra properties
```


@@ -42,6 +43,7 @@ docker run --rm \
  -e INPUT_PATH=.zenodo.json \
  -e INPUT_SCHEMA_PATH=/schema.json \
  -e INPUT_ERROR_FORMAT=text \
  -e INPUT_ALLOWED_EXTRA_PROPERTIES=pub_id \
  zenodo-validator
```

@@ -60,6 +62,21 @@ And you can also use the image provided: `ghcr.io/vsoch/zenodo-validator`.
| :--- | :--- | :--- |
| `path` 📍 | Where is your `.zenodo.json`? | `.zenodo.json` |
| `error_format` 🎨 | `text`, `json`, or `pretty-json` | `text` |
| `allowed_extra_properties` ✨ | Comma-separated list of extra property names to allow (e.g., `pub_id,custom_field`) | `''` (empty) |
Review comment (Owner): Maybe just `extra_properties`? I think "allowed" is implied.


### 🔓 Allowing Extra Properties

Some Invenio instances (like [RODARE](https://rodare.hzdr.de)) require additional properties beyond the standard Zenodo schema. You can explicitly allow these properties using the `allowed_extra_properties` input:
Review comment (Owner): What if a property is both allowed and required, or needs more than just appearing in the list? Notably, this adds the allowed extra properties without defining their types or any other constraints.


```yaml
- name: 🛡️ Validate Zenodo Metadata
  uses: vsoch/zenodo-validator@main
  with:
    path: '.rodare.json'
    allowed_extra_properties: 'pub_id'
```

This will allow the specified properties while still validating all other fields against the Zenodo schema. Multiple properties can be specified as a comma-separated list: `'pub_id,custom_field,another_field'`.

---

5 changes: 5 additions & 0 deletions action.yaml
@@ -13,6 +13,10 @@ inputs:
    description: 'Output format (text, json, pretty-json)'
    required: false
    default: 'text'
  allowed_extra_properties:
    description: 'Comma-separated list of extra property names to allow (e.g., "pub_id,custom_field")'
    required: false
    default: ''

runs:
  using: 'docker'
@@ -21,3 +25,4 @@ runs:
    INPUT_PATH: ${{ inputs.path }}
    INPUT_SCHEMA_PATH: ${{ inputs.schema_path }}
    INPUT_ERROR_FORMAT: ${{ inputs.error_format }}
    INPUT_ALLOWED_EXTRA_PROPERTIES: ${{ inputs.allowed_extra_properties }}
73 changes: 72 additions & 1 deletion docker/entrypoint.sh
@@ -22,10 +22,81 @@ fi
echo "🔮 Summoning the JSON spirits to check your work..."
echo "📜 Using Schema: $SCHEMA_PATH"
echo "🎨 Output Format: $INPUT_ERROR_FORMAT"

# Handle allowed extra properties
FINAL_SCHEMA_PATH="$SCHEMA_PATH"
if [ -n "$INPUT_ALLOWED_EXTRA_PROPERTIES" ]; then
    echo "✨ Allowing extra properties: $INPUT_ALLOWED_EXTRA_PROPERTIES"
    FINAL_SCHEMA_PATH="/tmp/modified_schema.json"

    # Use Python to modify the schema and add allowed extra properties
    if ! SCHEMA_PATH_ESC="$SCHEMA_PATH" \
Review comment (Owner): For this block, let's:

  • Write an actual Python script with argparse
  • Run the command here, providing the environment variables

That will be easier for a developer reading this to understand than the current approach.
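
A minimal sketch of what such a standalone script could look like. The file name `add_extra_properties.py`, the function names, and the flags `--schema`, `--output`, and `--allow` are assumptions for illustration and not part of this PR. The logic mirrors the inline heredoc: each listed top-level property is registered with an empty schema `{}`.

```python
"""Hypothetical add_extra_properties.py: allow extra top-level properties in a JSON schema."""
import argparse
import json
import sys


def parse_names(raw):
    """Split a comma-separated string into stripped, non-empty names."""
    return [name.strip() for name in raw.split(",") if name.strip()]


def add_extra_properties(schema, names):
    """Register each name under schema["properties"] with {} (accepts any value)."""
    properties = schema.setdefault("properties", {})
    for name in names:
        # Keep an existing definition if the schema already declares the name.
        properties.setdefault(name, {})
    return schema


def get_parser():
    # Flag names are assumptions made for this sketch.
    parser = argparse.ArgumentParser(description="Allow extra properties in a JSON schema")
    parser.add_argument("--schema", required=True, help="path to the original schema")
    parser.add_argument("--output", required=True, help="where to write the modified schema")
    parser.add_argument("--allow", required=True, help="comma-separated property names")
    return parser


def main(argv=None):
    args = get_parser().parse_args(argv)
    names = parse_names(args.allow)
    if not names:
        sys.exit("No valid property names found in --allow")
    # No try/except: a missing file or invalid JSON raises and shows the traceback.
    with open(args.schema) as fd:
        schema = json.load(fd)
    with open(args.output, "w") as fd:
        json.dump(add_extra_properties(schema, names), fd, indent=2)


# The script file would end with the usual entry point:
#     if __name__ == "__main__":
#         main()
```

With something like this in the image, `entrypoint.sh` could replace the heredoc with a single call such as `python3 add_extra_properties.py --schema "$SCHEMA_PATH" --output "$FINAL_SCHEMA_PATH" --allow "$INPUT_ALLOWED_EXTRA_PROPERTIES"` (the script location is hypothetical).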

        FINAL_SCHEMA_PATH_ESC="$FINAL_SCHEMA_PATH" \
        ALLOWED_PROPS_ESC="$INPUT_ALLOWED_EXTRA_PROPERTIES" \
        python3 << 'PYTHON_EOF'
import json
import os
import sys

try:
Review comment (Owner): LLMs have a tendency to wrap everything in try/except. My preference would be not to do that. We aren't catching any specific error here, especially given the various checks. If there is an error, I want it to raise so I can see it; the additional wrapping isn't needed.

    schema_path = os.environ['SCHEMA_PATH_ESC']
    final_schema_path = os.environ['FINAL_SCHEMA_PATH_ESC']
    allowed_props_str = os.environ['ALLOWED_PROPS_ESC']
Review comment (Owner): Nit: variable names should not include their types.


    # Read the original schema
    with open(schema_path, 'r') as f:
        schema = json.load(f)

    # Ensure schema has the expected structure
    if not isinstance(schema, dict):
        print('❌ ERROR: Schema must be a JSON object', file=sys.stderr)
        sys.exit(1)

    # Parse comma-separated list of allowed properties
    allowed_props = [prop.strip() for prop in allowed_props_str.split(',') if prop.strip()]

    if not allowed_props:
        print('❌ ERROR: No valid property names found in allowed_extra_properties', file=sys.stderr)
Review comment (Owner): You can just call `sys.exit("Message")`; it prints the message and exits with an error status.
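
A small illustration of that behavior, as a sketch: passing a string to `sys.exit` raises `SystemExit` carrying that string, and when the exception is not caught the interpreter prints the message to stderr and exits with status 1. The helper name `require_names` is hypothetical.

```python
import sys


def require_names(names):
    """Return the names unchanged, or exit with an error message if the list is empty."""
    if not names:
        # Same effect as print(..., file=sys.stderr) followed by sys.exit(1).
        sys.exit("ERROR: No valid property names found in allowed_extra_properties")
    return names


# SystemExit is caught here only to inspect what it carries;
# a real script would let it propagate and terminate the process.
try:
    require_names([])
except SystemExit as exc:
    print(f"would exit with message: {exc.code}")
```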

        sys.exit(1)

    # Ensure properties object exists
    if 'properties' not in schema:
        schema['properties'] = {}

    # Add each allowed property to the schema's properties object
    # Use empty schema {} which allows any type
    for prop in allowed_props:
        if prop not in schema['properties']:
            schema['properties'][prop] = {}
Review comment (Owner): What happens if the property is nested? And where is the full definition of the property?
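
One possible direction for both questions, sketched under assumptions: instead of a bare list of names, the input could map a dotted path to a full property definition. The dotted-path syntax, the function name `add_property`, and the example definitions below are hypothetical and not part of this PR. The sketch only descends through nested `properties` objects; array `items` and `$ref` are not handled.

```python
def add_property(schema, path, definition):
    """Insert `definition` at a dotted `path`, descending through nested "properties"."""
    *parents, name = path.split(".")
    node = schema
    for parent in parents:
        # Each parent must already be declared; a KeyError is raised otherwise.
        node = node["properties"][parent]
    # Keep an existing definition if the name is already declared.
    node.setdefault("properties", {}).setdefault(name, definition)
    return schema


schema = {
    "type": "object",
    "additionalProperties": False,
    "properties": {
        "custom": {"type": "object", "additionalProperties": False, "properties": {}},
    },
}

# Top-level property with an explicit type instead of the permissive {}.
add_property(schema, "pub_id", {"type": "string"})
# Nested property: registered under custom.properties, not at the top level.
add_property(schema, "custom.internal_id", {"type": "integer"})

print(sorted(schema["properties"]))
print(schema["properties"]["custom"]["properties"])
```

Making a property required as well as allowed would additionally mean appending its name to the parent object's `required` list, which this sketch leaves out.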


# Write the modified schema
Review comment (Owner): Make sure that comments are useful and meaningful.

    with open(final_schema_path, 'w') as f:
        json.dump(schema, f, indent=2)

    print(f'✅ Modified schema written to {final_schema_path}')
except KeyError as e:
    print(f'❌ ERROR: Missing environment variable: {e}', file=sys.stderr)
    sys.exit(1)
except FileNotFoundError:
    print(f'❌ ERROR: Schema file not found: {schema_path}', file=sys.stderr)
    sys.exit(1)
except json.JSONDecodeError as e:
    print(f'❌ ERROR: Invalid JSON in schema file: {e}', file=sys.stderr)
    sys.exit(1)
except Exception as e:
    print(f'❌ ERROR: Failed to modify schema: {e}', file=sys.stderr)
    sys.exit(1)
PYTHON_EOF
    then
        echo "❌ Failed to modify schema for extra properties"
        exit 1
    fi
fi

echo ""

# Run check-jsonschema with the corrected flag: --output-format
if check-jsonschema --schemafile "$SCHEMA_PATH" "$FILE_TO_VALIDATE" --output-format "$INPUT_ERROR_FORMAT"; then
if check-jsonschema --schemafile "$FINAL_SCHEMA_PATH" "$FILE_TO_VALIDATE" --output-format "$INPUT_ERROR_FORMAT"; then
    echo ""
    echo "🥳 HOORAY! Your .zenodo.json is shiny and valid!"
    echo "🏆 You're a hero of Open Science! 🚀✨"