Commit bfee615

Added export script and documentation.
0 parents  commit bfee615

2 files changed: +104 -0 lines changed

README.md

+62
@@ -0,0 +1,62 @@
# Google Firebase Firestore JSON Export

Recursively exports a Google Firestore database, including all collections and
subcollections, into plain JSON files.

Uses the built-in `recursive()` method of the Query class:
https://cloud.google.com/python/docs/reference/firestore/latest/query#recursive

Allows you to analyze the data in your Firestore database locally without being
constrained by Firestore or BigQuery limitations. For example:

- Finding incomplete documents
- Finding broken document references
- Finding documents without fields
- Querying subcollections of collections without documents

You can use custom scripts or the powerful [jq command-line JSON processor](https://jqlang.github.io/jq/)
to perform complex lookups, such as [this example](https://unix.stackexchange.com/a/466241/228730),
or the following, which extracts all top-level documents whose `createdBy` key is missing or null:

```console
jq -r 'with_entries(select(.value.createdBy == null))' collection.json
```
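
For a custom script instead of jq, a minimal Python sketch of the same filter could look like this. It assumes a file shaped like the ones `main.py` writes (a JSON object keyed by document ID). The function name `documents_missing_key` and the sample data are hypothetical and only serve the illustration.

```python
import json


def documents_missing_key(documents, key):
    """Return the top-level documents whose `key` is absent or null."""
    return {
        doc_id: doc
        for doc_id, doc in documents.items()
        if isinstance(doc, dict) and doc.get(key) is None
    }


# Hypothetical stand-in for the contents of an exported collection.json.
sample = json.loads("""
{
    "doc1": {"title": "A", "createdBy": "alice"},
    "doc2": {"title": "B"},
    "doc3": {"title": "C", "createdBy": null}
}
""")

missing = documents_missing_key(sample, "createdBy")
print(sorted(missing))
```

Unlike the jq filter, this sketch silently skips top-level values that are not objects instead of raising an error.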

:warning: **Warning**: A complete export causes one read operation for every document in
your Firestore database. Depending on the size of your database, this may
incur costs for database reads.

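Because every exported document counts as one read, a rough cost estimate is simple arithmetic. The sketch below assumes a price quoted per 100,000 reads and ignores any free quota. The function name and both input numbers are placeholders, not real pricing or a real database size.

```python
def estimate_read_cost(document_count, price_per_100k_reads):
    """Estimate the cost of one full export, at one read per document."""
    return document_count / 100_000 * price_per_100k_reads


# Placeholder values: look up the current price for your project's region.
print(estimate_read_cost(document_count=2_500_000, price_per_100k_reads=0.05))
```
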
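## Requirements

- Python 3
- A service account key file for your Firebase project


## Setup and Usage

1. Install firebase-admin via pip.
   ```console
   $ pip3 install firebase-admin
   ```

2. Save your service account key as `../credentials.json`, the path `main.py` reads it from.

3. Run the export to JSON. Each top-level collection is written to its own `<collection>.json` file in the current directory.
   ```console
   $ python3 main.py
   ```

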
## Todos

- Serialize `DatetimeWithNanoseconds` values properly (instead of converting them to strings)
  https://code.luasoftware.com/tutorials/google-cloud-firestore/python-firestore-query-documents-to-json

- Serialize document references (e.g. as the document path instead of the object's memory address)
  ```
  "ref": "<google.cloud.firestore_v1.document.DocumentReference object at 0x111ef1c50>"
  ```

- Import into the local Firestore emulator

- Allow exporting as line-delimited JSON (LDJSON) to allow streaming and avoid
  loading the whole database into memory when importing.
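
One possible approach to the first two todos is a custom `default` hook for `json.dump`. This is only a sketch: it assumes timestamp values are `datetime` subclasses and that references expose a string `path` attribute, as `doc.reference.path` in `main.py` suggests. The names `firestore_default` and `FakeReference` and the `__ref__` marker are hypothetical. `FakeReference` stands in for a real reference so the sketch runs without Firestore.

```python
import json
from datetime import datetime, timezone


def firestore_default(value):
    """Fallback for json.dump: called only for values json cannot serialize."""
    if isinstance(value, datetime):
        # ISO 8601 keeps the value machine-readable (microsecond precision).
        return value.isoformat()
    path = getattr(value, "path", None)
    if isinstance(path, str):
        # Reference-like object: store the document path, not the object repr.
        return {"__ref__": path}
    return str(value)


class FakeReference:
    """Hypothetical stand-in for a Firestore document reference."""

    def __init__(self, path):
        self.path = path


doc = {
    "createdAt": datetime(2023, 5, 1, 12, 30, tzinfo=timezone.utc),
    "owner": FakeReference("users/alice"),
}
encoded = json.dumps(doc, default=firestore_default, sort_keys=True)
print(encoded)
```

Passing `default=firestore_default` instead of `default=str` would keep the fallback to `str` for any other unknown type.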

main.py

+42
@@ -0,0 +1,42 @@
import firebase_admin
from firebase_admin import credentials
from firebase_admin import firestore
import json

key_file = "../credentials.json"
cred = credentials.Certificate(key_file)
firebase_admin.initialize_app(cred)

db = firestore.client()

# @thanks https://stackoverflow.com/a/57561744/811306
# Named set_nested to avoid shadowing the built-in `set`. Takes a list of keys
# instead of a dotted string so that document IDs containing '.' stay intact.
def set_nested(my_dict, keys, value):
    """Given `foo`, ['key1', 'key2', 'key3'], 'something', set foo['key1']['key2']['key3'] = 'something'"""
    here = my_dict
    for key in keys[:-1]:
        # Create key with empty dictionary if it does not exist and move pointer.
        here = here.setdefault(key, {})
    here[keys[-1]] = value

# for collection in [{id: 'dayplans'}]:
for collection in db.collections():
    collection_name = collection.id
    print(f"Exporting {collection_name}...")
    json_file = collection_name + '.json'

    # documents = db.collection(collection_name).recursive().limit(100).get()
    documents = db.collection(collection_name).recursive().get()
    data = {}
    for doc in documents:
        print(doc.reference.path)
        # Drop the top-level collection name; the remaining path segments become nested keys.
        keys = doc.reference.path.split('/')[1:]
        set_nested(data, keys, doc.to_dict())

    # print(json.dumps(data, indent=2, default=str))

    # Write UTF-8 explicitly because ensure_ascii=False emits non-ASCII characters as-is.
    with open(json_file, 'w', encoding='utf-8') as file:
        json.dump(data, file, indent=4, sort_keys=True, default=str, ensure_ascii=False)

    print(f"☑️ Exported to {json_file}.")
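
To illustrate how the export loop turns document paths into nested JSON, the following sketch replays the nesting step on hand-written paths. It needs no Firestore connection. The function name `nest_documents` and the sample paths are hypothetical.

```python
def nest_documents(path_to_fields):
    """Build a nested dict from {document path: fields}, dropping the top-level collection name."""
    data = {}
    for path, fields in path_to_fields.items():
        keys = path.split("/")[1:]
        here = data
        for key in keys[:-1]:
            # Create missing intermediate levels (parent documents, subcollections).
            here = here.setdefault(key, {})
        # Copy the fields so the input mapping is never mutated.
        here[keys[-1]] = dict(fields)
    return data


sample = {
    "users/alice": {"name": "Alice"},
    "users/alice/posts/p1": {"title": "Hello"},
    "users/bob/posts/p2": {"title": "Orphan"},
}
nested = nest_documents(sample)
print(nested)
```

Two properties of this layout are worth keeping in mind. Subcollection names such as `posts` end up as keys next to the parent document's own fields, so a field with the same name would collide. The result also relies on parent documents arriving before their children, since a parent written later would overwrite the subtree already collected under its key.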
