fix: prevent the backfill from running forever. #1065
Merged
There's an edge case where an author that no longer exists can still be assigned to a post. This throws the backfill script into an infinite loop, because the respective author-term is never found/created, and so the underlying problem of missing author-term records is never resolved. The loop never terminates because, at the end of each pass of the while loop, the script asks for "remaining posts which need author terms" and gets the same rows back over and over.
This fix addresses this in 2 ways:
1. If an author is not found, we look for the most prolific author on the site and assign the posts to them. If there is no prolific author, one is created. And if one can't be created, an exception is thrown so that the script can't proceed.
2. Checks have been added so that the script can't go beyond what should be the maximum number of rows needing to be addressed.
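The second safeguard can be sketched in plain PHP. This is only an illustration under stated assumptions, not the plugin's code: `fetch_posts_missing_terms()` and `backfill_with_guard()` are hypothetical names, and posts are modelled as a simple map of post ID to `post_author` ID instead of database rows.

```php
<?php
/**
 * Hypothetical stand-in for the "remaining posts which need author terms"
 * query. Posts that were never fixed are returned again on every call.
 *
 * @param array $posts      Map of post ID => post_author ID.
 * @param array $done       Post IDs that already received an author term.
 * @param int   $batch_size Maximum number of rows to return.
 * @return array Post IDs still missing an author term.
 */
function fetch_posts_missing_terms( array $posts, array $done, int $batch_size ): array {
	$remaining = array_diff( array_keys( $posts ), $done );
	return array_slice( $remaining, 0, $batch_size );
}

/**
 * Backfills in batches, but never processes more rows than existed at the start.
 *
 * @param array $posts      Map of post ID => post_author ID.
 * @param array $user_ids   IDs of users that still exist.
 * @param int   $batch_size Rows per batch.
 * @return array Counts of processed rows and of posts that received a term.
 */
function backfill_with_guard( array $posts, array $user_ids, int $batch_size = 2 ): array {
	$max_rows  = count( $posts ); // Upper bound on the work that should be needed.
	$processed = 0;
	$done      = [];

	$batch = fetch_posts_missing_terms( $posts, $done, $batch_size );
	while ( ! empty( $batch ) ) {
		foreach ( $batch as $post_id ) {
			$processed++;
			if ( in_array( $posts[ $post_id ], $user_ids, true ) ) {
				$done[] = $post_id; // Author exists, so the term can be created.
			}
		}
		// Guard: without this check, posts with a missing author come back forever.
		if ( $processed >= $max_rows ) {
			break;
		}
		$batch = fetch_posts_missing_terms( $posts, $done, $batch_size );
	}

	return [ 'processed' => $processed, 'fixed' => count( $done ) ];
}

// Post 11 points at user 99, which does not exist.
$result = backfill_with_guard( [ 10 => 1, 11 => 99, 12 => 2, 13 => 1 ], [ 1, 2 ], 2 );
// Stops after 4 processed rows; posts 10 and 12 receive terms.
```

In this sketch the guard only guarantees termination. Rows that keep coming back use up the budget, so a post such as 13 may be left unprocessed, which is one reason to also deal with the missing authors themselves.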
iuravic
requested changes
Oct 18, 2024
leogermani
reviewed
Oct 18, 2024
On Fri, Oct 18, 2024 at 8:02 AM, leogermani commented on this pull request, in php/class-wp-cli.php:
> @@ -1233,6 +1255,185 @@ private function get_posts_with_missing_terms( $author_taxonomy, $post_types = [
// phpcs:enable
}
+ /**
+ * This function handles obtaining an author account which should have the most posts assigned to it. If unable
+ * to find an appropriate account, this function will attempt to create an author account for use.
+ *
+ * @return WP_User
+ * @throws Exception If unable to successfully create an author user account.
+ */
+ public function get_most_prolific_author() {
This is a very opinionated approach. It makes sense, but I don't think creating an author with a random display name does.
What do you think of simply assigning the posts to the admin instead?
This approach is more faithful to the current state of the site. If the post author doesn't exist on the site, you wouldn't be able to see the post in question in an author archive anyway. Skipping the post, instead of reassigning it to the first available admin user, is a cleaner solution.
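The skipping strategy can be pictured with a small standalone sketch. It assumes posts are given as a map of post ID to `post_author` ID, and `partition_posts_by_author()` is a hypothetical helper, not a function from the plugin.

```php
<?php
/**
 * Splits posts into those whose author still exists and those to skip.
 *
 * @param array $posts    Map of post ID => post_author ID (assumed data shape).
 * @param array $user_ids IDs of users that still exist.
 * @return array Post IDs grouped under 'backfill' and 'skipped'.
 */
function partition_posts_by_author( array $posts, array $user_ids ): array {
	$existing = array_flip( $user_ids ); // Keyed by user ID for fast lookups.
	$result   = [ 'backfill' => [], 'skipped' => [] ];

	foreach ( $posts as $post_id => $author_id ) {
		$bucket              = isset( $existing[ $author_id ] ) ? 'backfill' : 'skipped';
		$result[ $bucket ][] = $post_id;
	}

	return $result;
}

// Post 11 points at user 99, which does not exist, so it is skipped.
$split = partition_posts_by_author( [ 10 => 1, 11 => 99, 12 => 2 ], [ 1, 2 ] );
// $split['backfill'] holds 10 and 12; $split['skipped'] holds 11.
```

Only the posts in the `backfill` group would go on to receive author terms; the skipped ones are left exactly as they were.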
naxoc
reviewed
Oct 24, 2024
naxoc
approved these changes
Oct 24, 2024
This looks good to go!
iuravic
approved these changes
Oct 24, 2024
leogermani
added a commit
that referenced
this pull request
Nov 5, 2024
* Increase composer.json required PHP version to 7.4
* Update README to match required PHP version 7.4
* Remove PHP 7.1 from integration tests
* PHP 7.4: Use array_key_first()
  Slightly cleaner to use the native function for getting the first key's value from an array.
* PHP 7.4: Use instanceof
* PHP 7.4: Use null coalescing
* PHP 7.4: Add return types
* PHP 7.4: Collapse nested dirname() calls
* CI: Remove MySQL workaround for PHP <= 7.3
* Increase WordPress required version to 5.9
* Update integration tests to use WordPress 5.9
* Remove unnecessary phpunit versions for WordPress 5.9
* CI: Update tested versions
  Doesn't make sense to test WP versions with unsupported PHP versions (e.g. WP 5.9 with PHP 8.3).
* Composer: Update dev-dependencies
* PHPCS: Consolidate config into config file
  The PHPCS in the composer.json was duplicating but obscuring some aspects of what was in the `phpcs.xml.dist` file. This change consolidates the Composer commands and the config file.
* Support for Yoast %%name%% variable
* CI: Update deploy.yml
  Increase actions/checkout dependency version.
* CI: Update integrate.yml action versions
* Contents edited to consolidate instructions within the Wiki and bring more attention to its existence (#1055)
* add: created a new CLI cmd to backfill missing author terms for posts. (#1060)
  * add: created a new CLI cmd to backfill missing author terms for posts.
  * add: adding some comments to the new and old backfill commands.
    The comments are meant to clarify the key differences between the two commands, and that the new one should be preferred over the old one.
  * add: batching is the default, pass `--unbatched` flag to run w/o it.
  ---------
  Co-authored-by: Gary Jones
  Co-authored-by: Alec Geatches
* Fix/missing wp user type (#988)
  * fix: preventing loss of fact that a guest author might also be a WP_User
  * fix: making the update operation dependent on $append flag.
    This might be a problematic decision. But the way I justify this change is that if you are appending co-authors, there may already be a WP_User set as the author. So we don't really have to care whether one is passed or not. Because of this, we do not need to forcibly return a `false` flag since that is confusing to the caller, especially because we actually do save the guest authors which are given in the call! Instead, if the $append flag is false, we should expect that at least one user will be a WP_User. In that case, if none is passed in, then there is a mismatch of the intended authors. Because now, the `wp_posts.post_author` column will have an old `wp_users.ID` which remains set and most likely isn't the intent of the caller.
  * fix: attempting DB update only when $new_author is not empty.
    Also, returning the actual response from the DB, to make this call even more accurate in terms of what is actually happening at the DB layer.
  * fix: need to ensure pure WP_User is processed correctly as post_author.
    A pure WP_User (i.e. a WP_User that IS NOT linked to a Guest Author) needs to be handled specially.
  * fix: a necessary refactor of the `get_coauthor_by` function.
    This refactor is absolutely necessary in order for all the previous fixes to work as expected. Without this fix, what happens is that when you use `get_coauthor_by` by searching with a Guest Author, if that Guest Author has a valid link to a WP_User, it is summarily ignored. Functions like `add_coauthors` expect at least one coauthor to be a valid WP_User so that the `wp_posts.post_author` column can be appropriately updated. The only case where this function is returning an expected value is when you search by the WP_User first. When it arrives at `$guest_author = $this->guest_authors->get_guest_author_by( $key, $value, $force );`, `$guest_author === false`. It is then forced to move to the switch statement to find a user via their WP_User data.
    With this refactor, `get_coauthor_by` will now check if the `linked_account` attribute is set. If so, it will attempt to find the corresponding user for the Guest Account. It still gives priority to returning a Guest Author. When a Guest Author is not found, it will search for a WP_User. If found, it will also search to see if a linked Guest Author account exists. If it does, it will return that Guest Author object instead, without losing the fact that this account also has a WP_User associated with it.
  * fix: returning a plain WP_User if guest authors is not enabled.
    I forgot to run tests on my previous commit. This satisfies the test Test_CoAuthors_Plus::test_get_coauthor_by_when_guest_authors_not_enabled which is expecting a WP_User when the plugin is not enabled.
  * feat: adding additional tests for co-authors-plus.php functionality.
  * fix: renaming user_login's for new authors introduced for new tests.
    These user_login's were causing other tests to fail because you cannot create another user with the same user_login.
  * fix: removing use of assertObjectHasProperty
    Older versions of PHPUnit do not have this function available. Updating to workaround: `assertTrue( property_exists( $obj, 'prop' ) )`
  * fix: typo in function call
  * fix: using strict comparison instead of function call `is_null`
  * fix: using more descriptive assertion for array validation.
  * fix: using `create_and_get` post factory func, to avoid query call.
  * fix: removing use of newly introduced is_wp_user property.
    Relying instead on wp_user property which has already been used before.
  * fix: PHPCS fixes and added commentary/descriptions to docblocks.
  * fix: some small quick fixes for formatting and documentation
  * fix: removing repetitive test.
  * add: new assertion func that determines if an obj is not a WP_User class
  * add: new assertion to help determine if a Post has the correct Authors
  * add: new test solely for CoAuthorPlus::get_coauthor_by().
    By fully testing CoAuthorPlus::get_coauthor_by(), we can remove some repetitive assertions that don't directly relate to what's being tested.
  * fix: was passing string values when I should've been passing Author objs
  * fix: using a data provider for very similar tests
  ---------
  Co-authored-by: Gary Jones
* bumping version to 3.6.2 (#1064)
  * bumping version to 3.6.2
  * Update CHANGELOG.md
    Co-authored-by: Gary Jones
  * add changelog link
  ---------
  Co-authored-by: Gary Jones
* fix: prevent the backfill from running forever. (#1065)
  * fix: prevent the backfill from running forever.
  * fix: obtaining the first available admin user account instead.
  * fix: updating output to reflect that the ID belongs to an Admin account.
  * fix: this function should be private
  * fix: switching tactic to skipping posts that have missing post_author.
  * fix: removed unused references from a past commit
  * fix: appeasing PHPCS
* Bump versions to 3.6.3 (#1070)
---------
Co-authored-by: Alec Geatches
Co-authored-by: Gary Jones
Co-authored-by: claudiulodro
Co-authored-by: Yoli Hodde
Co-authored-by: Eddie Carrasco
Deploy Notes
Are there any new dependencies added that should be taken into account when deploying to WordPress.org?
No.
Steps to Test
[...] `SELECT MAX(ID) FROM wp_users`
[...] `.post_author` column for those posts to IDs that do not exist in the `wp_users` table (any ID above MAX(ID)).
[...] (`wp co-authors-plus create-author-terms-for-posts`). Be prepared to kill it. Notice how it goes beyond 100%.