
Feature: use a pool of scans slices #24

Closed
kovalromank opened this issue Dec 4, 2020 · 2 comments

@kovalromank

Thanks for the great library.

While looking through this library I noticed that each time RowScanner.Scan is called, a new interface slice is allocated.

Since the row scanner caches column names and field indexes, I wanted to see whether there would be a benefit to using a pool of slices rather than allocating a new one on each scan.
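
To make the idea concrete, here is a minimal stdlib-only sketch of the pattern I mean (this is not scany's actual code, just an illustration of pooling fixed-length scans slices; the column count and names are arbitrary):

```go
package main

import (
	"fmt"
	"sync"
)

// scansPool reuses []interface{} slices between Scan calls. Because the
// row scanner caches column info, every pooled slice has the same length.
type scansPool struct {
	pool sync.Pool
}

func newScansPool(columns int) *scansPool {
	return &scansPool{
		pool: sync.Pool{
			New: func() interface{} {
				return make([]interface{}, columns)
			},
		},
	}
}

// get returns a slice of scan destinations, reused if one is available.
func (p *scansPool) get() []interface{} {
	return p.pool.Get().([]interface{})
}

// put clears the slice so pooled entries don't keep row data alive,
// then returns it to the pool for the next Scan call.
func (p *scansPool) put(scans []interface{}) {
	for i := range scans {
		scans[i] = nil
	}
	p.pool.Put(scans)
}

func main() {
	p := newScansPool(3)

	// In RowScanner.Scan this would wrap rows.Scan(scans...);
	// here the slice is just filled to show the get/put cycle.
	scans := p.get()
	for i := range scans {
		scans[i] = i
	}
	fmt.Println(scans) // [0 1 2]
	p.put(scans)
}
```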

I created a data struct with 1024 columns and added some quick benchmarks to my fork of scany here. The data struct and the benchmarks are in two new files in the fork, bench_data_test.go and bench_test.go, if anyone wants to run them.

Results of benchmarks:

goos: darwin
goarch: amd64
pkg: github.com/georgysavva/scany
BenchmarkStructPool
BenchmarkStructPool-8   	   16312	     84675 ns/op	      44 B/op	       1 allocs/op
BenchmarkStruct
BenchmarkStruct-8       	   13929	     81237 ns/op	   16397 B/op	       1 allocs/op
BenchmarkMapPool
BenchmarkMapPool-8      	    5966	    171132 ns/op	   57429 B/op	    2050 allocs/op
BenchmarkMap
BenchmarkMap-8          	    6478	    171839 ns/op	   73760 B/op	    2050 allocs/op
PASS

Using a pool of slices reduces memory usage by over 16000 B/op when scanning into either a struct or a map. For a struct specifically, the allocated bytes stay effectively constant even though there are 1024 different columns.

This is a good fit for sync.Pool: thanks to RowScanner's caching, the allocated slices have the same length every time Scan is called. I think it would be useful for RowScanner to provide an option to use a pool instead of allocating a new slice on every call.

@georgysavva
Owner

Hi. Any reason for closing?

@kovalromank
Author

I ran some different benchmarks and noticed that caching the column-to-field-index maps would help a lot more when a new row scanner is created on each iteration, instead of reusing one as I'm doing here. I opened a new issue, #25, explaining the cache.

I think creating a new row scanner each time is a lot more common than reusing one, because of functions like ScanAll/ScanOne.
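
For context, this is roughly the kind of cache I have in mind; it's only an illustrative sketch (the names indexCache and getColumnToFieldIndex are made up here, and a real implementation would also have to honor db tags and nested structs):

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
	"sync"
)

// columnToFieldIndex maps a column name to the index of the struct field
// it scans into.
type columnToFieldIndex map[string]int

// indexCache holds one column-to-field-index map per destination struct
// type, so a row scanner created fresh for every ScanAll/ScanOne call
// doesn't have to redo the reflection work.
var indexCache sync.Map // reflect.Type -> columnToFieldIndex

func getColumnToFieldIndex(structType reflect.Type) columnToFieldIndex {
	if cached, ok := indexCache.Load(structType); ok {
		return cached.(columnToFieldIndex)
	}
	index := make(columnToFieldIndex, structType.NumField())
	for i := 0; i < structType.NumField(); i++ {
		// This sketch just lowercases the field name to get the column name.
		index[strings.ToLower(structType.Field(i).Name)] = i
	}
	actual, _ := indexCache.LoadOrStore(structType, index)
	return actual.(columnToFieldIndex)
}

func main() {
	type User struct {
		ID   int
		Name string
	}
	fmt.Println(getColumnToFieldIndex(reflect.TypeOf(User{}))) // map[id:0 name:1]
}
```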
