BSON performance #528
Comments
Here's another set of benchmarks that might be helpful. |
Nice! |
I can add them for comparison, the benchmark suggestion was mostly to evaluate other formats. |
If you can that would be extremely helpful! |
The benchmarks have been updated with numbers for |
BSON is the format that MongoDB uses both for data storage and to communicate with drivers, so it won't be possible to change the driver to use another format. You can greatly speed up driver performance by utilizing a e.g.

    #[derive(Deserialize, Serialize, Debug)]
    struct MyType { /* fields here */ }

    let coll = db.collection::<MyType>("my_coll");
    coll.insert_one(MyType::new(...), None).await?;
    let mt: MyType = coll.find_one(doc! {}, None).await?.unwrap();

Also, we're currently working on introducing a number of raw-BSON wrapper types, borrowing a lot of code from the

    #[derive(Debug, Deserialize, Serialize)]
    struct MyTypeRef<'a> {
        some_borrowed_field: &'a str,
    }

    let coll = db.collection::<RawDocumentBuf>("my_coll");
    coll.insert_one(bson::to_raw_document_buf(MyType::new(...))?, None).await?;
    let rawdoc: RawDocumentBuf = coll.find_one(doc! {}, None).await?.unwrap();
    let mt: MyTypeRef = bson::from_slice(rawdoc.as_bytes())?;

BSON won't ever reach the speeds of NoProto or some of these other high-performance serialization formats due to its dynamic / self-describing nature, but for most driver use cases this won't really matter, since the majority of the driver's execution time will be spent on network I/O between the driver and the server, with (de)serialization being negligible in comparison. That being said, we're always striving to improve the performance of |
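To make the "borrowed" idea above concrete, here is a minimal std-only sketch of reading a top-level string field straight out of BSON bytes without copying it. It follows the element layout in the BSON specification (int32 length prefix, type byte, NUL-terminated key, value, trailing NUL) and is not the `bson` crate's implementation. The names `read_i32`, `find_str`, and `sample_document` are hypothetical, and only a handful of element types are handled.

```rust
/// Reads a little-endian i32 at `pos`, or None if the buffer is too short.
fn read_i32(buf: &[u8], pos: usize) -> Option<i32> {
    let b = buf.get(pos..pos.checked_add(4)?)?;
    Some(i32::from_le_bytes([b[0], b[1], b[2], b[3]]))
}

/// Looks up a top-level string field and returns a `&str` that borrows
/// from `doc` (no allocation). Hypothetical helper: it only understands
/// double, string, document, array, bool, datetime, int32 and int64
/// elements, and gives up (None) on anything else.
fn find_str<'a>(doc: &'a [u8], key: &str) -> Option<&'a str> {
    let total = read_i32(doc, 0)? as usize;
    if total < 5 || total > doc.len() || doc[total - 1] != 0 {
        return None;
    }
    let end = total - 1; // index of the document's trailing NUL
    let mut pos = 4;
    while pos < end {
        let tag = doc[pos];
        pos += 1;
        // The key is a NUL-terminated cstring.
        let key_len = doc[pos..end].iter().position(|&b| b == 0)?;
        let name = &doc[pos..pos + key_len];
        pos += key_len + 1;
        // Work out how many bytes the value occupies.
        let size = match tag {
            0x01 | 0x09 | 0x12 => 8, // double, datetime, int64
            0x08 => 1,               // boolean
            0x10 => 4,               // int32
            0x02 => {
                // string: int32 length (including NUL) + bytes + NUL
                let len = read_i32(doc, pos)?;
                if len < 1 {
                    return None;
                }
                4 + len as usize
            }
            0x03 | 0x04 => {
                // embedded document / array: length prefix covers the whole value
                let len = read_i32(doc, pos)?;
                if len < 5 {
                    return None;
                }
                len as usize
            }
            _ => return None, // other element types omitted in this sketch
        };
        if pos.checked_add(size)? > end {
            return None;
        }
        if tag == 0x02 && name == key.as_bytes() {
            // Skip the length prefix and drop the trailing NUL.
            return std::str::from_utf8(&doc[pos + 4..pos + size - 1]).ok();
        }
        pos += size;
    }
    None
}

/// Hand-encodes { "b": true, "string": "hello world" } for demonstration.
fn sample_document() -> Vec<u8> {
    let mut body = Vec::new();
    body.push(0x08);
    body.extend_from_slice(b"b\0");
    body.push(1);
    body.push(0x02);
    body.extend_from_slice(b"string\0");
    let s = b"hello world";
    body.extend_from_slice(&(s.len() as i32 + 1).to_le_bytes());
    body.extend_from_slice(s);
    body.push(0);
    let total = (4 + body.len() + 1) as i32;
    let mut doc = total.to_le_bytes().to_vec();
    doc.extend_from_slice(&body);
    doc.push(0);
    doc
}
```

Calling `find_str(&sample_document(), "string")` yields a `&str` that points into the original buffer. This is the same property the `&'a str` field in `MyTypeRef` relies on: the deserialized value cannot outlive the bytes it borrows from.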
Thank you very much Patrick! I've got a few further questions:
|
Yep, and also it deserializes it directly from BSON without having to go through
Our API currently doesn't support doing this, but I wouldn't think so unless the filter was really huge.
I invoked that explicitly in that example so that I could use a single
Yep, and you can actually do this today via

    #[derive(Debug, Serialize)]
    struct MyData {
        strings: Vec<String>,
    }

    let md = MyData { strings: vec!["a".to_string()] };
    let raw_bson = bson::to_vec(&md)?; // vec of BSON bytes whose "strings" field is an array

Once the raw BSON work is done, you'll be able to serialize to a

If you're talking about directly serializing iterables of structs for the purposes of inserting them, you can also do that today via

    let collection = db.collection::<MyType>("my_coll");
    collection.insert_many(vec![
        MyType::new(),
        MyType::new(),
        ...
    ], None).await?;

This is a lot faster than calling |
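For readers wondering what "BSON bytes whose `strings` field is an array" looks like on the wire, here is a small std-only sketch that hand-encodes the `MyData { strings: vec!["a"] }` value from the example above. It follows the BSON specification, in which an array is stored as an embedded document keyed by "0", "1", and so on. The helpers `frame`, `encode_string_array`, and `encode_my_data` are hypothetical names used for illustration and are not part of the `bson` crate.

```rust
/// Wraps element bytes in document framing:
/// int32 total length (little-endian), the elements, then a trailing NUL.
fn frame(body: &[u8]) -> Vec<u8> {
    let total = (4 + body.len() + 1) as i32;
    let mut out = total.to_le_bytes().to_vec();
    out.extend_from_slice(body);
    out.push(0);
    out
}

/// Encodes a list of strings as a BSON array, i.e. a document whose
/// keys are the decimal indices "0", "1", ...
fn encode_string_array(items: &[&str]) -> Vec<u8> {
    let mut body = Vec::new();
    for (i, s) in items.iter().enumerate() {
        body.push(0x02); // element type: string
        body.extend_from_slice(i.to_string().as_bytes());
        body.push(0); // end of key
        // String length prefix counts the trailing NUL.
        body.extend_from_slice(&(s.len() as i32 + 1).to_le_bytes());
        body.extend_from_slice(s.as_bytes());
        body.push(0);
    }
    frame(&body)
}

/// Encodes { "strings": [ ... ] } as a top-level BSON document.
fn encode_my_data(strings: &[&str]) -> Vec<u8> {
    let mut body = vec![0x04]; // element type: array
    body.extend_from_slice(b"strings\0");
    body.extend_from_slice(&encode_string_array(strings));
    frame(&body)
}
```

For a single element, `encode_my_data(&["a"])` produces 28 bytes. Every element carries its own type byte and key, which is the self-describing overhead mentioned earlier in the thread.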
Perfect, thanks very much again @patrickfreed ! |
No problem, happy to help! Leaving this open sounds fine to me. Once the raw BSON stuff is merged, I'll circle back with some updated examples (the API isn't completely set in stone just yet). |
Nice work! When will the new version be released? |
We've released betas of both the driver and the BSON library which contain support for the raw BSON features I mentioned above. To start using them, update your

Note that network latency and DB processing constitute a large amount of the time spent waiting on a query, so you may not see huge performance improvements by using borrowed deserialization instead of regular owned deserialization to a

Here's an example program that demonstrates how to use raw BSON with the driver:

    use mongodb::{
        bson::{
            rawdoc, spec::BinarySubtype, Binary, RawArray, RawBsonRef, RawDocument, RawDocumentBuf,
        },
        Client,
    };
    use serde::Deserialize;

    #[derive(Debug, Deserialize)]
    struct MyBorrowedData<'a> {
        #[serde(borrow)]
        string: &'a str,
        #[serde(borrow)]
        bin: &'a [u8],
        #[serde(borrow)]
        doc: &'a RawDocument,
        #[serde(borrow)]
        array: &'a RawArray,
    }

    #[tokio::main]
    async fn main() -> anyhow::Result<()> {
        let client = Client::with_uri_str("mongodb://localhost:27017").await?;
        let coll = client.database("foo").collection::<MyBorrowedData>("bar");

        coll.clone_with_type::<RawDocumentBuf>()
            .insert_one(
                rawdoc! {
                    "string": "hello world",
                    "bin": Binary {
                        bytes: vec![1, 2, 3, 4],
                        subtype: BinarySubtype::Generic
                    },
                    "doc": {
                        "a": "subdoc",
                        "b": true
                    },
                    "array": [
                        12,
                        12.5,
                        false
                    ]
                },
                None,
            )
            .await?;

        let mut cursor = coll.find(None, None).await?;
        while cursor.advance().await? {
            let data = cursor.deserialize_current()?;
            println!("{:#?}", data);
            println!("doc.a => {}", data.doc.get_str("a")?);
            println!(
                "doc.array => {:#?}",
                data.array
                    .into_iter()
                    .collect::<mongodb::bson::raw::Result<Vec<RawBsonRef>>>()?
            );
        }

        Ok(())
    }

And this prints the following:
|
Performance is the key concern in my current circumstances. I will update all related libs and let you know if I run into any issues! Thanks so much for your hard work! Really appreciate that! |
Hello Everyone!
My team is currently writing a very traffic-heavy server, so our main goals are performance and security (which are Rust's lead perks). I was extremely happy with Rust's actix-web framework performance, before introducing Bson objects.
I've started reading about this issue and found those benchmarks, and also an alternative for document operations.
https://github.com/only-cliches/NoProto
I'm wondering if it's possible to replace BSON with NoProto documents? They seem to have the same functionality, but NoProto is around 160x faster for decodes, and 85x faster for updates of a single document.
I understand that Document functionality is one of the core MongoDB features, but using BSON for it is a major performance hit for the Rust driver. Changing it might raise the performance several times!
Thanks for your time and attention!
My bench results: