On the site https://leaf.cmu.edu/, FEMNIST is said to have 3,550 users and 805,263 samples. However, I ran the provided command
./preprocess.sh -s niid --sf 1.0 -k 0 -t sample
to get the full-sized dataset, and then ran ./stats.sh to get the statistics. The output is as follows:
0 1
20 4
40 11
60 5
80 16
100 66
120 125
140 394
160 1241
180 329
200 47
220 62
240 95
260 107
280 125
300 167
320 168
340 185
360 172
380 149
400 87
420 36
440 3
460 1
480 0
Summing the number of clients gives 3,597 rather than 3,550. I also counted the total number of samples in train/ and test/, and got 817,851 rather than 805,263.
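The client total can be re-checked from the pasted histogram with a short script. This is only a sketch: it assumes each line is a bin start followed by the number of clients in that bin, and the name `total_clients` is made up for illustration.

```python
# Histogram exactly as pasted above.
# Assumed layout: "<bin start> <number of clients in that bin>".
HISTOGRAM = """\
0 1
20 4
40 11
60 5
80 16
100 66
120 125
140 394
160 1241
180 329
200 47
220 62
240 95
260 107
280 125
300 167
320 168
340 185
360 172
380 149
400 87
420 36
440 3
460 1
480 0
"""


def total_clients(text):
    """Sum the second column of every two-field line."""
    total = 0
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 2:
            total += int(fields[1])
    return total


if __name__ == "__main__":
    print(total_clients(HISTOGRAM))
```

On the bins exactly as listed, this sum comes out one lower than 3,597. That may mean one client falls outside the printed bins, but this has not been verified. Either way, the total is well above 3,550.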
Is there anything wrong with my preprocessing command that would lead to this inconsistency?
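For anyone who wants to reproduce the sample count, here is a minimal sketch of counting users and samples directly from the generated files. It assumes the files in train/ and test/ are JSON shards with a `users` list and a parallel `num_samples` list. The helper name `count_users_and_samples` and the directory arguments are illustrative and not part of the repository.

```python
import json
from pathlib import Path


def count_users_and_samples(*dirs):
    """Return (distinct users, total samples) over all *.json shards in dirs.

    Assumes each shard has a "users" list and a parallel "num_samples" list.
    Users are collected in a set because, with a per-sample split (-t sample),
    the same user is expected to appear in both train/ and test/.
    """
    users = set()
    samples = 0
    for directory in dirs:
        for path in sorted(Path(directory).glob("*.json")):
            with open(path) as f:
                shard = json.load(f)
            users.update(shard["users"])
            samples += sum(shard["num_samples"])
    return len(users), samples
```

It would be called with the two output directories, for example `count_users_and_samples("train", "test")`, with the paths adjusted to wherever preprocess.sh wrote its output.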