gdal2tiles optimization or acceleration? #10803
Please give a reproducible example of what you do and what timings you get. Or at least the gdal2tiles command with the source data.
It took about 62 hours to tile a .vrt that is 255145 x 255145 pixels. The VRT file is composed of 64 subtiles that are each 31900 x 31900 pixels. I can't make the image available, nor would it be practical. I don't think the actual pixels make any difference though. When running, all 12 cores are fully saturated. The machine has 128 GB of RAM. Here is the command I ran:

`gdal2tiles.py --exclude --xyz --processes=12 --tiledriver=PNG --resampling=lanczos --zoom=18-18 --tilesize=512 image_z18.vrt z18_tiles`

I know lanczos is slower than the other resampling options, but it is only slightly slower than bilinear, and nearest isn't suitable.
It is often (usually) possible to do reasonable tests with datasets smaller than all-we-have-in-production, and with images that are similar to the real ones and available under some free license. Or the images may be artificial and created with, for example, https://gdal.org/en/latest/programs/gdal_create.html

Please show what gdalinfo prints about one of the 31900 x 31900 pixel subtiles. You can hide the coordinates; what interests me is whether they are tiled and whether they have overviews.

Please help us get a view of the scale of the issue by calculating some statistics: how many tiles there will be at zoom level 18, how many tiles gdal2tiles is generating now, and what tiles/second or tiles/minute rate would satisfy you. And perhaps the output as megabytes of image data per minute (or in some other appropriate unit) that gdal2tiles can handle, and what your optimized pipeline gives.
Below is the output from one of the tiles. They are PNG with transparency. I once tried first creating a TIFF of the entire .vrt using gdal_translate with creation options set to use tiling. The pre-operation took a long time, and it didn't seem to speed up the tiling process enough to make the whole run faster. I haven't tried creating tiled TIFFs of the individual 31.9K images, though I suppose I should. I am working on a prototype area. Eventually, I will create tiles at Z18 for the entire United States. The prototype area produces 955,221 512x512 tiles. If I recall correctly, the US will have 668 million 512x512 tiles. I don't really have a rate in mind; I just thought I would inquire, because out of all the intensive image processing I am doing, this is by far the slowest part of production, which is counterintuitive to me. I can generate this imagery from many layers in less than 24 hours. If it were possible to achieve similar or better, that would be fantastic.

Driver: PNG/Portable Network Graphics
Large PNG files are very unsuitable for gdal2tiles, and generally whenever there is a need to read data from a small area inside the image. The whole PNG image, in this case 31900x31900 pixels, must be decoded even if the need is only to get source data for one 256x256 tile. The first thing to do is to convert the source images into a more suitable format. Tiled GeoTIFFs are fine. I am not sure if gdal2tiles can benefit from overview levels, or if it only accesses the full-resolution image and the other zoom levels are then sampled from the base tiles. Doing any hardware acceleration or micro-optimization is useless at this point. Here is a base command for the format conversion:
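The exact command was not preserved in this copy of the thread; a typical conversion from a large PNG to a tiled GeoTIFF (file names are placeholders, creation options are a common choice, not necessarily the commenter's) would look something like:

```shell
# Convert a large PNG into a tiled GeoTIFF so small windows can be read
# without decoding the whole image. File names are placeholders.
gdal_translate -of GTiff \
  -co TILED=YES -co BLOCKXSIZE=512 -co BLOCKYSIZE=512 \
  -co COMPRESS=DEFLATE \
  subtile.png subtile.tif
```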
Thanks. That is what I've done before on the whole VRT. I'll experiment, though, and actually record some benchmarks.
I just ran two benchmarks, (A) using PNGs and (B) using tiled TIFFs. They took nearly the same time. The tiled TIFF run was slightly faster, but I think that was because I was copying a file during the first test, which slowed down disk access. For (A) I created a .vrt using four of the 31.9K PNGs. For (B) I created tiled TIFFs (using the command you provided) of the same four PNGs and created a .vrt of the TIFFs. (A) took 3 hours 59 minutes. Since it seems the only difference came from copying a file, I think it must not be using the tile features of the TIFF, which would explain why it's taking so long for me and why tiling did not help when I tried it before. The good news is that creating a TIFF with tile support is quite fast when I run it on the 31.9K PNGs, which was not the case when I ran it on a VRT of all the PNGs.
I did some benchmarking with my laptop. The source image is an RGB image with 12000x12000 pixels.
That a program is slow is not a bug by itself, but for me the speed really does not feel good. I believe that, for example, MapServer can perform much better. I can't say whether there is much to improve in gdal2tiles as a Python script, or whether it should be rewritten entirely. The original author chose the latter approach and created MapTiler.
I started to think that the "png files in a directory tree" approach feels somehow old-fashioned. I made some more tests with MBTiles and GeoPackage output, which do fundamentally the same thing: cut the source raster into tiles and save the PNG blobs somewhere on disk. Both options were at least 3 times faster than gdal2tiles and wrote the output in 50 seconds. Yes, I know that a tile directory is simple on the server side.
What I don't understand is that the following gdalwarp command runs in less than 7 seconds and writes warped tiles into GeoPackage at a rate of 315 tiles/second or more. I noticed that:
But anyway, 7 seconds is amazingly fast compared to 50 seconds with gdal_translate into MBTiles output, and 180 seconds with gdal2tiles. Even generating a cloud optimized GeoTIFF is slower, 20 seconds with
I know that this COG has overviews while the GPKG does not; sorry for the lazy test plan.
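The commands the commenter timed were lost when this thread was copied. For orientation only, the two conversions described above could look roughly like the following (file names and creation options are guesses, not the originals):

```shell
# Warp into a GeoPackage raster; the GPKG driver cuts and stores
# the tiles itself, so no separate tiling step is needed.
gdalwarp -of GPKG source.tif tiles.gpkg -co TILE_FORMAT=PNG

# Generate a cloud optimized GeoTIFF for comparison.
gdal_translate -of COG source.tif output_cog.tif -co COMPRESS=DEFLATE
```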
I believe one of the major sources of slowdown in gdal2tiles is that it requests the data corresponding to a tile into a temporary buffer whose dimensions are 4 times the target size, and then applies downscaling using the resampling algorithm. Cf

This is a very inefficient (and dubious) way of proceeding. It should directly use the warping resampling code to fill a buffer at the expected dimensions without doing that prior upsampling. I guess the rationale for that odd behavior is that gdal2tiles dates back to a time when the warping code didn't properly take the scaling factor into account when warping.
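A back-of-the-envelope sketch of the overhead described above; the 4x factor matches the default querysize behavior being discussed, and the numbers are purely illustrative:

```python
# gdal2tiles reads each tile into a temporary buffer 4x the tile size
# per axis, then downsamples; a direct warped read would fill the
# target tile buffer without the oversized intermediate read.
tile_size = 512
querysize = 4 * tile_size                 # temporary buffer edge: 2048

pixels_read_oversized = querysize ** 2    # pixels handled per tile today
pixels_read_direct = tile_size ** 2       # pixels actually needed per tile

overhead = pixels_read_oversized / pixels_read_direct
print(overhead)  # -> 16.0: 16x more pixels read and resampled per tile
```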
Very interesting, thanks guys! Question: is gdalwarp needed at all if the image is already in the target projection? Or is that what scales the tiles? I wasn't aware gdal_translate did tiling. Is it possible to create a TILING_SCHEME for slippy tiles? Does it create the tile pyramid, or just tiles at the native resolution of the image?
Warping may be needed for getting the right pixel size, but depending on the output driver it may be that gdalwarp is not needed. In my first two examples (MBTiles and GeoPackage with GoogleMapsCompatible) the data gets warped because the output schema requires that, but gdalwarp with -t_srs EPSG:3857 does not work, because the resolution would not match what is required by the fixed tiling schema. However, gdal_translate works, because then GDAL somehow knows automatically to warp into EPSG:3857 and to change the pixel size to match the well-known tiling schema.

gdal_translate does tiling if the output format supports or requires it. Database-backed formats are such (MBTiles, GeoPackage, SpatiaLite, TileDB; there may be more). Some file formats also support tiling (GeoTIFF/COG, HFA etc.), but I guess those tiles are not as concrete as the database tiles but rather well-defined sections of the codestream. There is no "slippy tiles" format: https://gdal.org/en/latest/drivers/raster/index.html

My thoughts are more theoretical. But if you look at the raster table of a GeoPackage you can see that it has done the same thing as gdal2tiles. Maybe it would not make sense to extract the PNG blobs and save them into files named by zoom_level, tile_column, and tile_row, but it would certainly be possible. The overviews in GPKG are documented in https://gdal.org/en/latest/drivers/raster/gpkg.html#overviews. Overviews in MBTiles are created the same way.
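As a concrete sketch of the translate-then-overview workflow described above (file names are placeholders; the creation options are real GPKG driver options, but the combination is an illustration, not the commenter's exact commands):

```shell
# Let gdal_translate warp and tile into the well-known XYZ scheme;
# the GPKG driver handles reprojection and tile cutting itself.
gdal_translate -of GPKG source.tif tiles.gpkg \
  -co TILING_SCHEME=GoogleMapsCompatible -co TILE_FORMAT=PNG

# Build the overview (lower-zoom) tiles inside the same GeoPackage.
gdaladdo -r average tiles.gpkg 2 4 8 16
```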
fwiw, I wrote a version in R that is almost feature complete and parallelizes well: https://github.com/hypertidy/filearchy I think it would probably be easier to rewrite gdal2tiles
The meat of the task is here: https://github.com/hypertidy/filearchy/blob/main/R/gdal_tiles.R#L32-L40 I could probably flesh out a basic go at that with osgeo.gdal, if others were interested in helping with the broader Python scaffolding.

What's the pythoniest way to generate a table of te and zoom from a bbox and shape? That's the kind of scaffolding I mean; it's a basic reusable component.
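One way to build that table with only the standard library: enumerate XYZ tile indices covering a WGS84 bbox at a given zoom, and emit each tile's EPSG:3857 extent in the order gdalwarp's -te expects (xmin ymin xmax ymax). This is a hypothetical sketch of the scaffolding being asked about, not code from filearchy or gdal2tiles:

```python
import math

# Web Mercator constants: sphere radius and half the world extent in metres.
R = 6378137.0
ORIGIN = math.pi * R

def lonlat_to_tile(lon, lat, zoom):
    """Return the XYZ tile indices containing a lon/lat point."""
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
    return x, y

def tiles_for_bbox(min_lon, min_lat, max_lon, max_lat, zoom):
    """Yield (x, y, zoom, (xmin, ymin, xmax, ymax)) rows, extents in EPSG:3857."""
    x0, y0 = lonlat_to_tile(min_lon, max_lat, zoom)  # top-left tile
    x1, y1 = lonlat_to_tile(max_lon, min_lat, zoom)  # bottom-right tile
    size = 2 * ORIGIN / (2 ** zoom)                  # tile edge in metres
    for x in range(x0, x1 + 1):
        for y in range(y0, y1 + 1):
            xmin = -ORIGIN + x * size
            ymax = ORIGIN - y * size
            yield x, y, zoom, (xmin, ymax - size, xmin + size, ymax)

for row in tiles_for_bbox(-0.1, -0.1, 0.1, 0.1, 1):
    print(row)
```

Each extent tuple can be passed straight to `gdalwarp -te` (with `-t_srs EPSG:3857`) to cut one tile.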
I ran my test again using four 31.9K images using actual size:
It created 62,364 512x512 tiles from the TIFFs in 7 minutes! (148 tiles per second) Just for good measure, I ran the same test with PNGs and it took 21 minutes, or 3x as long, so it shows gdal2tiles is using the TIFF tiles to its advantage. I only created tiles at zoom level 18 for the test, but as far as I can tell, they look perfect. Is there any reason why I would need to make other changes?
So setting querysize to the actual tile_size definitely breaks the overview tiles. I tried using 4x for the overview tiles, but something is still not right. I don't understand why.
Feature description
I have an extensive imagery processing pipeline, including computation-intensive histogram matching, many image adjustments, and many compositing steps, but running gdal2tiles takes far longer than any part of my processing. I can create the base image from scratch much faster than I can tile it.
I'm not complaining per se, but I'm wondering why it is slow and whether there is room for optimization. I'm a developer, but I don't have any experience profiling and optimizing code on other platforms. The machines that do the bulk of my processing run Ubuntu.
Has profiling been done on gdal2tiles to see the most expensive processes?
Besides any obvious optimizations, it would be really nice to have hardware acceleration for the most expensive processes, which I guess is the scaling operation.
I am willing to work on it if I have an idea of what would make the biggest improvement.