README.md: 9 additions & 7 deletions
@@ -55,9 +55,9 @@ We use a modularized file structure to distribute DiffusionDB. The 2 million ima
 ./
 ├── diffusiondb-large-part-1
 │ ├── part-000001
-│ │ ├── 3bfcd9cf-26ea-4303-bbe1-b095853f5360.png
-│ │ ├── 5f47c66c-51d4-4f2c-a872-a68518f44adb.png
-│ │ ├── 66b428b9-55dc-4907-b116-55aaa887de30.png
+│ │ ├── 0a8dc864-1616-4961-ac18-3fcdf76d3b08.webp
+│ │ ├── 0a25cacb-5d91-4f27-b18a-bd423762f811.webp
+│ │ ├── 0a52d584-4211-43a0-99ef-f5640ee2fc8c.webp
 │ │ ├── [...]
 │ │ └── part-000001.json
 │ ├── part-000002
@@ -66,9 +66,9 @@ We use a modularized file structure to distribute DiffusionDB. The 2 million ima
 │ └── part-010000
 ├── diffusiondb-large-part-2
 │ ├── part-010001
-│ │ ├── 3bfcd9cf-26ea-4303-bbe1-b095853f5360.png
-│ │ ├── 5f47c66c-51d4-4f2c-a872-a68518f44adb.png
-│ │ ├── 66b428b9-55dc-4907-b116-55aaa887de30.png
+│ │ ├── 0a68f671-3776-424c-91b6-c09a0dd6fc2d.webp
+│ │ ├── 0a0756e9-1249-4fe2-a21a-12c43656c7a3.webp
+│ │ ├── 0aa48f3d-f2d9-40a8-a800-c2c651ebba06.webp
 │ │ ├── [...]
 │ │ └── part-000001.json
 │ ├── part-010002
@@ -107,7 +107,9 @@ The data fields are:
 
 To help you easily access prompts and other attributes of images without downloading all the Zip files, we include two metadata tables `metadata.parquet` and `metadata-large.parquet` for DiffusionDB 2M and DiffusionDB Large, respectively.
 
-The shape of `metadata.parquet` is (2000000, 13) and the shape of `metatable-large.parquet` is (14000000, 13). Two tables share the same schema, and each row represents an image. We store these tables in the Parquet format because Parquet is column-based: you can efficiently query individual columns (e.g., prompts) without reading the entire table. Below are three random rows from `metadata.parquet`.
+The shape of `metadata.parquet` is (2000000, 13) and the shape of `metatable-large.parquet` is (14000000, 13). Two tables share the same schema, and each row represents an image. We store these tables in the Parquet format because Parquet is column-based: you can efficiently query individual columns (e.g., prompts) without reading the entire table.
+
+Below are three random rows from `metadata.parquet`.