-
Notifications
You must be signed in to change notification settings - Fork 0
added dynamic schema #1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: gl-master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 5 out of 5 changed files in this pull request and generated 5 comments.
Co-authored-by: Copilot <[email protected]>
…into dynamic_schema_formkeep
tap_formkeep/schema.py
Outdated
| return ["null", "number"] | ||
|
|
||
| if isinstance(value, list): | ||
| return ["null", "array"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need unit test for this. It should also handle nested fields.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 6 out of 6 changed files in this pull request and generated 7 comments.
tap_formkeep/client.py
Outdated
|
|
||
| LOGGER = get_logger() | ||
| REQUEST_TIMEOUT = 300 | ||
| DEFAULT_USER_AGENT = 'Singer.io Bing Ads Tap' |
Copilot
AI
Nov 28, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
User-Agent references 'Bing Ads Tap' instead of 'Formkeep Tap'. This should be corrected to accurately identify this tap.
| DEFAULT_USER_AGENT = 'Singer.io Bing Ads Tap' | |
| DEFAULT_USER_AGENT = 'Singer.io Formkeep Tap' |
| return {"type": ["null", "number"]} | ||
|
|
||
| if isinstance(value, str): | ||
| if DATETIME_REGEX.match(value) or DATE_REGEX.match(value): |
Copilot
AI
Nov 28, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The DATETIME_REGEX pattern allows space as a date-time separator (pattern includes [T ]), but this does not produce RFC 3339 compliant datetime strings. Per coding guideline #1000000 section 3, only RFC 3339 format with 'T' separator should be accepted. Remove the space alternative from the regex.
| raw_ids = config.get("form_ids", []) | ||
|
|
||
| if isinstance(raw_ids, str): | ||
| form_ids = ast.literal_eval(raw_ids) |
Copilot
AI
Nov 28, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using ast.literal_eval() on user configuration input without error handling is risky. If the string is malformed, it will raise a ValueError. Add try-except to handle parsing errors gracefully and log a meaningful error message.
| form_ids = ast.literal_eval(raw_ids) | |
| try: | |
| form_ids = ast.literal_eval(raw_ids) | |
| except (ValueError, SyntaxError) as e: | |
| LOGGER.error(f"Failed to parse 'form_ids' from config: {raw_ids!r}. Error: {e}") | |
| form_ids = [] |
|
|
||
| DATE_REGEX = re.compile(r"^\d{4}-\d{2}-\d{2}$") | ||
| TIME_REGEX = re.compile(r"^\d{2}:\d{2}(:\d{2})?$") |
Copilot
AI
Nov 28, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] Module-level regex patterns should be named with uppercase convention (already correct) but consider adding docstrings explaining their purpose and the formats they match to improve code maintainability.
| DATE_REGEX = re.compile(r"^\d{4}-\d{2}-\d{2}$") | |
| TIME_REGEX = re.compile(r"^\d{2}:\d{2}(:\d{2})?$") | |
| # Matches dates in the format YYYY-MM-DD (e.g., 2024-06-01). | |
| DATE_REGEX = re.compile(r"^\d{4}-\d{2}-\d{2}$") | |
| # Matches times in the format HH:MM or HH:MM:SS (e.g., 14:30 or 14:30:59). | |
| TIME_REGEX = re.compile(r"^\d{2}:\d{2}(:\d{2})?$") | |
| # Matches datetimes in the format YYYY-MM-DDTHH:MM:SSZ, with optional timezone offsets or ' UTC'. | |
| # Examples: 2024-06-01T14:30:59Z, 2024-06-01 14:30:59+02:00, 2024-06-01T14:30:59 UTC |
| if not submissions: | ||
| continue | ||
|
|
||
| first_submission = submissions[0] |
Copilot
AI
Nov 28, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Inferring schema from only the first submission is unreliable. If the first submission has null values or missing fields, the schema will be incomplete. Consider iterating through multiple submissions and merging their schemas, or at minimum, document this limitation.
* initial push for sync changes * Update tap_formkeep/streams/abstracts.py Co-authored-by: Copilot <[email protected]> * remove spam from params * integration tests * discovery test * integration tests * change logic for get records * change page size * user agent update * default page size * remove unwanted import * update camel case --------- Co-authored-by: Copilot <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Observed field value like these "description of what you'd like to discuss": "[email protected] description" where there are spaces in between field keys. We should replace those spaces by _.
Description of change
This PR introduces the implementation of dynamic schema for formkeep.
(SAC-29462)
Manual QA steps
Risks
Rollback steps