Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Queue tasks from in parallel #378

Merged
merged 17 commits into from
Jan 13, 2023
Merged

Queue tasks from in parallel #378

merged 17 commits into from
Jan 13, 2023

Conversation

npm1
Copy link
Collaborator

@npm1 npm1 commented Nov 22, 2022

Since DiscoverFromExternalSource is run in parallel, this PR does some changes:

  1. Queue tasks when we need to fetch or create a new Credential object.
  2. Add 'on' methods which receive the fetch result and continue the execution of the FedCM algorithm with the results from the fetch.

Relevant issue: #261


Preview | Diff

@npm1
Copy link
Collaborator Author

npm1 commented Nov 22, 2022

CC @bvandersloot-mozilla @annevk in case you're interested in reviewing.

@npm1
Copy link
Collaborator Author

npm1 commented Nov 23, 2022

I added a bunch of 'on' functions to receive the fetch results. @johannhof, could you take a look?

@npm1 npm1 requested a review from samuelgoto November 28, 2022 19:38
@npm1
Copy link
Collaborator Author

npm1 commented Dec 5, 2022

Ping @samuelgoto can you take a look please?

@npm1
Copy link
Collaborator Author

npm1 commented Dec 13, 2022

@domfarolino any chance you could take a look? I believe this PR is not too complicated and improves Fetch integration but Sam didn't feel comfortable taking the first look. Perhaps you can lend us your Fetch expertise :)

Copy link

@domfarolino domfarolino left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here a few comments from skimming through this this evening, but I'll give it a more thorough look later.

spec/index.bs Show resolved Hide resolved
spec/index.bs Outdated Show resolved Hide resolved
spec/index.bs Outdated Show resolved Hide resolved
spec/index.bs Outdated Show resolved Hide resolved
spec/index.bs Outdated Show resolved Hide resolved
spec/index.bs Outdated Show resolved Hide resolved
spec/index.bs Outdated Show resolved Hide resolved
spec/index.bs Outdated Show resolved Hide resolved
Copy link
Collaborator Author

@npm1 npm1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review so far!

spec/index.bs Show resolved Hide resolved
spec/index.bs Outdated Show resolved Hide resolved
spec/index.bs Outdated Show resolved Hide resolved
Copy link

@domfarolino domfarolino left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing I'd recommend is prefacing each algorithm with its return type, so we can more-easily track that all of the return values are appropriately used. I think I spotted some cases where algorithms were returning things like "false" and "an empty list" to nobody that was doing anything with those values, so it might be good to structure the algorithms as I mentioned, and do an audit.

The common way to do this is:

"To foo given a bar and baz, run the following steps. They return a qux."

spec/index.bs Outdated Show resolved Hide resolved
spec/index.bs Outdated Show resolved Hide resolved
spec/index.bs Outdated Show resolved Hide resolved
spec/index.bs Outdated Show resolved Hide resolved
spec/index.bs Outdated Show resolved Hide resolved
spec/index.bs Outdated Show resolved Hide resolved
spec/index.bs Outdated Show resolved Hide resolved
spec/index.bs Outdated Show resolved Hide resolved
spec/index.bs Outdated Show resolved Hide resolved
@npm1
Copy link
Collaborator Author

npm1 commented Jan 4, 2023

One thing I'd recommend is prefacing each algorithm with its return type, so we can more-easily track that all of the return values are appropriately used. I think I spotted some cases where algorithms were returning things like "false" and "an empty list" to nobody that was doing anything with those values, so it might be good to structure the algorithms as I mentioned, and do an audit.

The common way to do this is:

"To foo given a bar and baz, run the following steps. They return a qux."

Thanks, added return type to the algorithms which had one.

Copy link
Collaborator Author

@npm1 npm1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ready for another review! Added asserts and removed some unneeded task queueing.

spec/index.bs Outdated Show resolved Hide resolved
spec/index.bs Outdated Show resolved Hide resolved
Copy link

@domfarolino domfarolino left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This generally LGTM. I think it's probably fine to clean the hand-wavy "Wait" part up in a follow-up, provided we clarify what we're actually doing with the thing we're waiting for.

spec/index.bs Outdated Show resolved Hide resolved
@npm1
Copy link
Collaborator Author

npm1 commented Jan 9, 2023

@samuelgoto @domfarolino lemme know if there are any objections to merging this! I want to get this merged to work on other stuff in the spec (this touches everything so easier to have it merged before)

@samuelgoto
Copy link
Collaborator

samuelgoto commented Jan 9, 2023 via email

spec/index.bs Outdated Show resolved Hide resolved
spec/index.bs Outdated Show resolved Hide resolved
spec/index.bs Outdated Show resolved Hide resolved
spec/index.bs Outdated Show resolved Hide resolved
spec/index.bs Outdated Show resolved Hide resolved
spec/index.bs Outdated Show resolved Hide resolved
spec/index.bs Outdated Show resolved Hide resolved
spec/index.bs Outdated Show resolved Hide resolved
spec/index.bs Outdated Show resolved Hide resolved
spec/index.bs Outdated Show resolved Hide resolved
spec/index.bs Outdated Show resolved Hide resolved
spec/index.bs Outdated Show resolved Hide resolved
spec/index.bs Outdated Show resolved Hide resolved
spec/index.bs Outdated Show resolved Hide resolved
spec/index.bs Outdated Show resolved Hide resolved
spec/index.bs Outdated Show resolved Hide resolved
spec/index.bs Outdated Show resolved Hide resolved
spec/index.bs Outdated Show resolved Hide resolved
spec/index.bs Outdated Show resolved Hide resolved
spec/index.bs Outdated Show resolved Hide resolved
spec/index.bs Outdated Show resolved Hide resolved
spec/index.bs Outdated Show resolved Hide resolved
spec/index.bs Outdated Show resolved Hide resolved
spec/index.bs Outdated Show resolved Hide resolved
spec/index.bs Outdated Show resolved Hide resolved
spec/index.bs Outdated Show resolved Hide resolved
spec/index.bs Outdated Show resolved Hide resolved
Copy link
Collaborator

@samuelgoto samuelgoto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR is looking excellent: massive progress in making the spec a lot more precise in terms of Fetch and being concrete about the algorithms. I made a few suggestions to try to remove further the chaining and use "async/await" more to make it easier to read, but otherwise this is LGTM.

Thanks for putting this together, looking forward to merging this!

@domfarolino
Copy link

I haven't had time to look deeply at the changes to the chaining and "async/await" as @samuelgoto as put it, can you summarize it briefly?

@npm1
Copy link
Collaborator Author

npm1 commented Jan 10, 2023

I haven't had time to look deeply at the changes to the chaining and "async/await" as @samuelgoto as put it, can you summarize it briefly?

In the latest change, I used the following style. In an algorithm (eg. fetch X):

1. Steps in the DOM manipulation task source.
2. Let |returnVar| be null
3. Fetch and in the fetch callback set |returnVar|.
4. In parallel, wait for |returnVar| to become non-null, and return it.

That is kinda the closest I could get to the await/async syntax that Sam wanted to simplify the methods / avoid a long chaining of callbacks.

Copy link
Collaborator

@samuelgoto samuelgoto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@domfarolino can you take a last look and approve too?

Once @domfarolino LGTMs and this last round of feedback is addressed (I think there is a slightly cleaner way to deal with the async/awaits), I'm planning to squash and merge.

spec/index.bs Outdated Show resolved Hide resolved
spec/index.bs Outdated Show resolved Hide resolved
spec/index.bs Outdated Show resolved Hide resolved
spec/index.bs Outdated Show resolved Hide resolved
spec/index.bs Show resolved Hide resolved
spec/index.bs Outdated Show resolved Hide resolved
spec/index.bs Outdated Show resolved Hide resolved
spec/index.bs Outdated Show resolved Hide resolved
@domfarolino
Copy link

In the latest change, I used the following style. In an algorithm (eg. fetch X):

2. Let |returnVar| be null
3. Fetch and in the fetch callback set |returnVar|.
4. In parallel, wait for |returnVar| to become non-null, and return it.

That is kinda the closest I could get to the await/async syntax that Sam wanted to simplify the methods / avoid a long chaining of callbacks.

On first scan, this looks nice but I think it has some issues. First, you'd need to be more explicit about posting tasks. An algorithm that has its control flow broken up by "Waiting in parallel" would need to explicitly come back to the original-calling thread to return a value. The common trend I see for this elsewhere is:

1. Do some main-thread stuff
2. Wait in-parallel for some main-thread state to change, then post a task back to the main thread to run the following steps
3. (Back on main thread), do something with the value `foo` that we now know has been changed to what we want

That would normally work, depending on what you do with foo. The problem with your algorithms, it seems, is that they need to return the value foo that you waited for in a non-blocking way. But I don't think you can really return from an algorithm once its control flow has been interrupted by waiting "in-parallel" (as to not block the main thread). By the time we come back to the main thread in step 3, the original caller of the algorithm has already totally finished invoking it, so I don't think we can return a concrete value after that. That's why async/await works in JS: async algorithms don't actually return concrete values, they all return promises so that the caller can only retrieve the value asynchronously after the algorithm was invoked, but I don't think we have a spec prose equivalent.

Do your algorithms have to return values, or can they just modify state? That is, can you do the following? :

4. In parallel, wait for `returnVar` to become non-null, and do something with it (but not return it)

@npm1
Copy link
Collaborator Author

npm1 commented Jan 10, 2023

Hey @samuelgoto @domfarolino the latest PR has changes that hopefully satisfy both of you. Now most stuff is run in parallel and we only go into a task when necessary, thus making it possible to wait for the results of fetches while still allowing the algorithms to return the awaited values. PTAL and let me know what you think!

Copy link
Collaborator

@samuelgoto samuelgoto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The last commit makes changes that I think meet the readability bar that I was trying to maintain. If this meets the fetch precision bar, I think we should merge this.

Dominic, WDYT?

spec/index.bs Show resolved Hide resolved
spec/index.bs Outdated Show resolved Hide resolved
spec/index.bs Outdated Show resolved Hide resolved
Copy link
Collaborator

@samuelgoto samuelgoto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still waiting some confirmation from farolino that this works, but here is another round of feedback while we wait for his approval.

Again, I think this is ready to be merged as far as I'm concerned, and these can be resolved in a separate PR.

spec/index.bs Outdated Show resolved Hide resolved
spec/index.bs Show resolved Hide resolved
spec/index.bs Outdated Show resolved Hide resolved
spec/index.bs Show resolved Hide resolved
spec/index.bs Show resolved Hide resolved
spec/index.bs Outdated Show resolved Hide resolved
spec/index.bs Outdated Show resolved Hide resolved
spec/index.bs Outdated Show resolved Hide resolved
spec/index.bs Outdated Show resolved Hide resolved
spec/index.bs Outdated
1. Let |json| be the result of [=extract the JSON fetch response=] from |response|.
1. [=converted to an IDL value|Convert=] |json| to an {{IdentityProviderAPIConfig}} stored
in |config|.
1. Wait for both |check| and |config| to be set.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think |check| doesn't exist at this point yet because it was defined for the first time in the fetch response handler.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed |check| and now just waiting for 'both fetch responses to be completed'. Let me know if that works?

spec/index.bs Outdated Show resolved Hide resolved
@domfarolino
Copy link

Before I do a full scan, here's one example of a red flag I'm seeing kind of frequently: https://github.com/fedidcg/FedCM/pull/378/files#diff-40cc3a1ba233cc3ca7b6d5873260da9676f6ae20bb897b62f7871c80d0bda4e9R1284. Basically we're throwing a DOMException from in parallel at least 4 times in that algorithm, and I think elsewhere as well. But DOMExceptions are platform objects and must be created/thrown with a surrounding agent / global, so I don't think what's happening now is technically valid. This is why I mentioned on chat the other day when we discussed introducing "queue fetch":

As long as everything besides those steps can run in parallel and doesn't need to be on the main thread
Which I wasn't under the impression of earlier, but if that's the case that would be great

It seems like there are more things besides Fetch that currently expect to run on the main thread, which is a little worrying. I'm not sure what the full implementation flow is in Chromium so I don't know what to recommend for these algorithms, but it seems like spec-wise they all used to run on the main thread with the exception of "waiting" for fetches to complete, but the "waiting" got complicated and callback-ey, so you switched it so the algorithms ran entirely "in parallel" with the exception of kicking off main-thread tasks for URL parsing and Fetching, but this causes two issues:

  1. There are other things that have to run on the main thread that are currently not
  2. While the way we're kicking off tasks for URL parsing and Fetching is technically correct, it kind of hints to me that there is a design issue. I think it's uncommon to make fetches where the URL is parsed asynchronously with respect to the Fetching, and this can have observable consequences too (i.e., you're allowing the <base> URL of the document to change in between parsing and fetching, whereas I'm guessing that is not possible in the implementation?).

I think (1) needs to be fixed regardless, but (2) might just be a symptom of the unsafe-no-cors Fetch mode not being ironed out yet. For example, as per whatwg/fetch#1533 (comment) we're going to want to Fetch these requests from some sort of detached context (I think in-parallel), whereas this PR sticks with today's precedent of only fetching from the main thread, which is best for now, but is in part why we have this weird asynchronous gap between parsing and Fetching and maybe more things. I think (2) can probably be overlooked for now, though (1) shouldn't be. But I don't know the best way to fix (1) without reverting the algorithms to run on the main thread and wait in parallel, but I am happy to hear other ideas @npm1 since you probably have the spec paged into your head more than me right now.

@npm1
Copy link
Collaborator Author

npm1 commented Jan 12, 2023

Before I do a full scan, here's one example of a red flag I'm seeing kind of frequently: https://github.com/fedidcg/FedCM/pull/378/files#diff-40cc3a1ba233cc3ca7b6d5873260da9676f6ae20bb897b62f7871c80d0bda4e9R1284. Basically we're throwing a DOMException from in parallel at least 4 times in that algorithm, and I think elsewhere as well. But DOMExceptions are platform objects and must be created/thrown with a surrounding agent / global, so I don't think what's happening now is technically valid.

Hmm yea, perhaps we can a) change those cases to 'return null', and then at the main method just queue a task to throw if the result was null b) add a helper so that any throw becomes 'queue a task that throws', and change all throws to also return too (this might be tricky because the parent will not know, so I think a) is better).

It seems like there are more things besides Fetch that currently expect to run on the main thread, which is a little worrying. I'm not sure what the full implementation flow is in Chromium

The algorithm in Chromium is implemented in the browser process. Unfortunately there is no such equivalent in the spec? It seems that the closest thing is 'in parallel' but we are forced to go back to main for things that supposedly cannot be handled in parallel, but which we can do in the browser process.

  1. While the way we're kicking off tasks for URL parsing and Fetching is technically correct, it kind of hints to me that there is a design issue. I think it's uncommon to make fetches where the URL is parsed asynchronously with respect to the Fetching, and this can have observable consequences too (i.e., you're allowing the <base> URL of the document to change in between parsing and fetching, whereas I'm guessing that is not possible in the implementation?).

Yea that's true although the fetches are not relative to the document.

I think (1) needs to be fixed regardless

Is there anything else you see besides the exceptions? We could do what I suggested if that's the only thing, but I may have missed other examples.

@npm1
Copy link
Collaborator Author

npm1 commented Jan 12, 2023

By the way the exception thing is a problem coming from the caller as well as this already assumes you can throw exceptions from in parallel: https://w3c.github.io/webappsec-credential-management/#ref-for-dom-credential-discoverfromexternalsource-slot%E2%91%A0:~:text=Set%20result%20to%20the%20result%20of%20executing%20result%E2%80%99s

But thinking about it, why is that not allowed? They are thrown and caught from in parallel (and I'm assuming we're always carrying around the |globalObject| anyways).

@npm1
Copy link
Collaborator Author

npm1 commented Jan 12, 2023

Answering to myself: at least DOMExceptions are Objects so they cannot be created from in parallel. I will return failure except at the end, and there 'queue a task to throw'.

@npm1
Copy link
Collaborator Author

npm1 commented Jan 12, 2023

I think I fixed exception handling for the most part, let me know what you think!


Issue: The credentialed fetch in this algorithm can lead to a timing attack that leaks the user's identities before the user grants permission. This is an active area of investigation that is being explored [here](https://github.com/fedidcg/FedCM/issues/230#issuecomment-1089040953).

1. [=Fetch request=] with |request|, |globalObject|, and <var ignore=>processResponseConsumeBody</var>

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to break up the list and reset the counter at 1 in the rendered HTML. I'm not sure what the solution for that is to be honest, but maybe you have to resort to manual numbering?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I vehemently oppose manual numbering :) It seems issues/notes require an extra indentation compared with numbered items (that fixed it, here and in a couple other places).

spec/index.bs Show resolved Hide resolved
Copy link
Collaborator Author

@npm1 npm1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ptal

spec/index.bs Show resolved Hide resolved

Issue: The credentialed fetch in this algorithm can lead to a timing attack that leaks the user's identities before the user grants permission. This is an active area of investigation that is being explored [here](https://github.com/fedidcg/FedCM/issues/230#issuecomment-1089040953).

1. [=Fetch request=] with |request|, |globalObject|, and <var ignore=>processResponseConsumeBody</var>
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I vehemently oppose manual numbering :) It seems issues/notes require an extra indentation compared with numbered items (that fixed it, here and in a couple other places).

Copy link

@domfarolino domfarolino left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK it looks like the exception problem is fixed now with the exception (pun intended) of one case I think? Otherwise, I think your "failure" propagation works nicely.


Issue: Support choosing accounts from multiple [=Identity Provider=]s, as described [here](https://github.com/fedidcg/FedCM/issues/319).
1. Run {{setTimeout}} passing a [=task=] which throws a {{NetworkError}}, after a timeout of
60 seconds.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the "network error" here should be a failure like you now have elsewhere, since this is all in parallel.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree but DiscoverFromExternalSource actually accepts exceptions being thrown at it somehow. I filed w3c/webappsec-credential-management#211 since I'm not sure that's acceptable (plus need the globalObject). I suspect it is done that way so that error messages can be set by whoever implements the DiscoverFromExternalSource. So yes this method is still throwing exceptions in two places, but I propose waiting for the linked issue to be resolved before modifying.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That seems fine, while slightly unfortunate 👍

@samuelgoto samuelgoto merged commit 69c064f into w3c-fedid:main Jan 13, 2023
github-actions bot added a commit that referenced this pull request Jan 13, 2023
SHA: 69c064f
Reason: push, by samuelgoto

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants