Deterministic values #1413
Comments
I might understand somewhat what you are going for, but anyways: Could you explain the total difference between your proposal and using |
@Shinigami92, using the example from the docs:
faker.seed(123);
const firstRandom = faker.datatype.number();
// Setting the seed again resets the sequence.
faker.seed(123);
const secondRandom = faker.datatype.number();
console.log(firstRandom === secondRandom);
In this case I can guarantee that If my proposal existed, I would be able to do something like:
const deterministicNumber = faker.datatype.number({min: 0, max: 100});
console.log(deterministicNumber === 50);
And I would be able to tell you exactly what the number would be, literally. To double down on the benefit of such a reality existing, I could write a story like:
function Counter({count}) {
return <div className="my-stylistic-genius">{count}</div>;
}
export default {
component: Counter,
}
export const Default = {
args: {
count: faker.datatype.number(),
},
};
I could load this story using normal faker mode and see how the component renders with infinite different numbers, but if I want to run a regression test against changes to the styles, I would want the count to be the same, so that there would be no pixels influenced by the prop. In this case, I can just write the story with normal faker values, and when I need |
Why is it important that the value is Please note that even with your suggestion of |
Great questions and alternatives @ST-DDT.
Why is it important that the value is 50 instead of the "same as last time"?
The However, I want to be able to basically make a seed that lasts across multiple sessions. I want to be able to run visual testing tools using faker and be able to ensure that if I run the same test in 2 months, it will give me back the same data. I don't want to have to manually write a bunch of static mock data also because faker literally exists to avoid doing that. I totally understand the drawbacks of this solution. It's not perfect. You will have to pass some values statically sometimes, but with this we can minimize those values to only the critical
Problems with mocking with files
In the real world trying to implement the
I've been looking into this as a solution, but each time I look at it, it seems like just making faker return static values solves the whole problem.
Another alternative
Would another approach be offering a We could wrap each faker method in a dot method I imagine it would look something like...
import {faker} from '@faker-js/faker';
faker.datatype.number.mockImplementation((options) => {
if (typeof options === 'number') {
return options / 2;
}
if (options?.min && options?.max) {
const range = options.max - options.min;
return options.max - (range / 2);
}
if (options?.max) {
return options.max / 2;
}
if (options?.min) {
return options.min * 2;
}
return 10;
});
faker.datatype.number(2); // 1
faker.datatype.number({min:0, max: 100}); // 50
faker.datatype.number({min: 10}); // 20
faker.datatype.number({max: 10}); // 5
@ST-DDT, can you point out some of your concerns if something like either of these two proposals were implemented? What are the downsides I'm missing? |
I still don't understand why you don't just use We won't implement something like |
Just for clarification purposes:
faker.seed(1337); // Assuming 1337 to be a hardcoded value
faker.name.firstName(); // => Devyn
This will result in 100% reproducible results (if you use the same faker version, e.g. 7.5.0). We have test suites to ensure we always get predictable results: fakerjs test snapshots
You can try them as well:
As for your proposal overwriting the methods. In v8 (development will start soon) we plan to modularize faker some more including the ability to add or omit certain parts to/from faker. You can use that to create your own faker instance that returns the values you need. https://github.com/faker-js/faker/milestone/11 Does that help you? |
Hi. So after digging deeper into Here is a sample repo I setup: https://github.com/Shopify/faker-seed-test I added chromatic and storybook to test visual snapshots using faker functions to populate the data. When there is a
However, if you change any of the faker calls, or even their order, you end up changing all of the data, which causes all snapshots to fail.
I think what I need is to set a new faker seed for each story, so that the data is affected only if the calls to faker in that story change. This does not seem easy to implement, though, because the calls to faker often happen before the story even renders. There are also a lot of different places where a call to faker might happen, and even one change can totally throw off all the data. Any ideas on how I could work around this issue with existing functionality in faker? |
Tried some more approaches and maybe found one that works. If I wrap every story definition in a function that sets the seed then I can change the order of stories, and add any number of extraneous faker calls without really breaking the data produced. Seems to work. I will continue trying to break it but it seems indeed that |
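A minimal sketch of the per-story wrapper idea described above. The generator is a small stand-in (a mulberry32-style PRNG) so the example runs without dependencies; it is not faker's implementation, and `createGenerator` and `seededStory` are hypothetical names. With faker itself, the reseeding step would presumably be a call to `faker.seed(...)` at the top of each wrapper.

```javascript
// Stand-in for a seedable data generator (NOT faker's actual PRNG).
function createGenerator() {
  let state = 0;
  return {
    // Reset the sequence, analogous to faker.seed(value).
    seed(value) {
      state = value >>> 0;
    },
    // Next pseudo-random integer in [min, max], using a mulberry32-style step.
    number({ min = 0, max = 99999 } = {}) {
      state = (state + 0x6d2b79f5) >>> 0;
      let t = state;
      t = Math.imul(t ^ (t >>> 15), t | 1);
      t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
      const unit = ((t ^ (t >>> 14)) >>> 0) / 4294967296;
      return min + Math.floor(unit * (max - min + 1));
    },
  };
}

// Hypothetical helper: reseed before building each story's args, so a
// story's data depends only on its own seed and its own generator calls.
function seededStory(gen, seed, buildArgs) {
  gen.seed(seed);
  return { args: buildArgs(gen) };
}

const gen = createGenerator();
const counterArgs = (g) => ({ count: g.number({ min: 0, max: 100 }) });

const first = seededStory(gen, 1, counterArgs);
gen.number(); // an unrelated extra call elsewhere in the file
const second = seededStory(gen, 1, counterArgs);
console.log(first.args.count === second.args.count); // true
```

Because each wrapper resets the sequence, adding, removing, or reordering calls outside a story no longer shifts the values that story receives.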
@ST-DDT Can you correct me if I'm wrong: using a seed should result in 100% deterministic data? Does that include date/time calls? It looks like date methods, and git methods that use date-time strings, return different values each time. Is this expected? |
Currently yes, you need to pass a |
It seems the only method which is inconsolably random is |
Yes, the git commit method should have an option to use a fixed ref date in order to allow for reproducible results. Your contribution will be appreciated. We are also considering adding a faker.fork method to simplify creating deterministic value sets (e.g. multiple persons). See this comment for details: #627 (comment) |
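To illustrate why a fixed reference date matters for reproducibility, here is a dependency-free sketch. `recentDate` and its `refDate` option are hypothetical names for this example and are not faker's API. The point is only that a date computed relative to "now" changes between runs even when the random part is seeded, whereas a date computed relative to a fixed reference does not.

```javascript
// Hypothetical helper: a date up to `days` days before `refDate`.
// `unit` is a pseudo-random value in [0, 1) that a seeded generator would supply.
function recentDate(unit, { days = 1, refDate = new Date() } = {}) {
  const msPerDay = 24 * 60 * 60 * 1000;
  const offset = Math.floor(unit * days * msPerDay);
  return new Date(new Date(refDate).getTime() - offset);
}

// With a fixed reference date, the same `unit` always yields the same date.
const fixed = { days: 10, refDate: "2022-01-01T00:00:00.000Z" };
const a = recentDate(0.5, fixed);
const b = recentDate(0.5, fixed);
console.log(a.toISOString() === b.toISOString()); // true
console.log(a.toISOString()); // 2021-12-27T00:00:00.000Z

// Without `refDate`, the result depends on the current time of the run.
console.log(recentDate(0.5, { days: 10 }).toISOString());
```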
@ST-DDT I've created a branch with the feat/fix (not sure if this should count as a bug fix or a new feature) But I cannot push to the repo. I didn't read anything in the contributing guide about needing to fork for contributions. Am I missing something? Could you point me in the right direction?
Is what I get in the terminal. Some kind of auth issue it seems. |
You have to fork the repository in order to propose changes. You can delete the fork after the changes have been merged.
Would you like to create a second PR that extends the contributing guide? |
@KevinMind Do you consider this fixed once #1512 is merged, or is here more to do? |
I'd like to run another check across all the methods to ensure I can get 100% deterministic values with a specific seed set but in principle yes. |
I'll consider this fixed and close the issue. If you think otherwise feel free to reopen. |
Clear and concise description of the problem
We use faker in several different contexts. It is an amazing tool. However, in some contexts it is valuable to have fully idempotent values returned from faker functions. For example:
args
ensuring the only difference in baseline/comparison pixels is caused by the code and not the data. This feature would obviously apply to any visual testing framework where consistent data is essential.
To be utterly clear, when I say idempotent I mean that the faker function should return the exact same value every time it is called with the same parameters. Essentially, we set the randomness to 0.
Suggested solution
I could imagine a few solutions, but perhaps the simplest would be to consider faker to be in one mode or another globally. We could do this with an environment-level feature flag, FAKER_IDEMPOTENT_MODE=true.
This flag can be stored in the global faker instance and used to control the logic of modules.
We could implement the underlying logic in a couple of ways:
Using this approach we can also implement idempotent modules over time, we don't need to do it in a big bang. When the flag is enabled we could add a warning log that this feature is under development, and that only certain modules have been implemented with idempotent mode. If a module is not yet implemented, it will return the normal module/method.
Given a file script.js
Running normally you get a random value, in this case 42.
With idempotent mode, given the logic I've defined for how to implement idempotence we should get the average of the min/max values which would always be 50.
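A rough sketch of how the proposed mode could behave, under stated assumptions. The `number` function below is a hypothetical stand-in, not faker's implementation. The flag name `FAKER_IDEMPOTENT_MODE` comes from the proposal, but reading it from `process.env` and rounding the midpoint are choices made only for this example.

```javascript
// Hypothetical number generator with an "idempotent mode" switch.
// `random` is any function returning a value in [0, 1), e.g. Math.random.
function number(
  { min = 0, max = 99999 } = {},
  { idempotent = false, random = Math.random } = {}
) {
  if (idempotent) {
    // Proposed rule: always return the average of min and max.
    return Math.round((min + max) / 2);
  }
  return min + Math.floor(random() * (max - min + 1));
}

// Assumption: the flag is read from the environment when one is available.
const idempotent =
  typeof process !== "undefined" &&
  process.env.FAKER_IDEMPOTENT_MODE === "true";

// Random in normal mode, the midpoint when the flag is set.
console.log(number({ min: 0, max: 100 }, { idempotent }));
// Forcing the mode on always gives the midpoint.
console.log(number({ min: 0, max: 100 }, { idempotent: true })); // 50
```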
Alternative
Another solution that is perhaps more elegant, but much more complex would be to introduce a concept of randomness to faker. You could specify on each call, or perhaps globally the level of randomness faker should use.
This value could then be used to determine the precise level of randomness to introduce into the underlying methods. Since we don't have true randomness in CS we could theoretically do this, though I imagine it would be very difficult and for my particular use case is well beyond over-engineering.
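One way such a randomness level might work, sketched purely as an illustration: `numberWithRandomness` is a hypothetical function, and linear blending between the midpoint and a sampled value is just one possible interpretation of the idea.

```javascript
// Hypothetical: `randomness` in [0, 1] scales how far a value may stray
// from the midpoint of the range. 0 is fully fixed, 1 is fully random.
function numberWithRandomness({ min, max }, randomness, random = Math.random) {
  const level = Math.min(1, Math.max(0, randomness)); // clamp to [0, 1]
  const midpoint = (min + max) / 2;
  const sampled = min + random() * (max - min);
  return midpoint + (sampled - midpoint) * level;
}

console.log(numberWithRandomness({ min: 0, max: 100 }, 0)); // 50
console.log(numberWithRandomness({ min: 0, max: 100 }, 1)); // anywhere in [0, 100)
```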
One final solution to this might be to use faker when you want random values and static values when you want consistent values. This solves the problem at the wrong level of the stack though. Most people who use faker wrap it in functions that generate objects of specific types (it is literally the first example of the docs) and needing to switch between these functions and some other data source on the fly is a lot of complexity that would be much better integrated into faker itself.
Additional context
I would be more than happy to work on this feature as I would definitely benefit from having it!
I'm not 100% familiar with how all of faker is implemented and I imagine there might be some edge cases I'm missing where idempotency might be tricky. For example, how would this feature interact with the concept of seeding? What if you want to mix idempotent and dynamic values, for example when generating a list of Users where the name should be random but the age should be the same?