> Some image formats, such as GIF or PNG, can use a palette, which is a table of (usually) 256 colors that allows for better compression. Basically, instead of representing each pixel with its full color triplet, which takes 24 bits (plus possibly 8 more for transparency), they use an 8-bit index that represents the position inside the palette, and thus the color. -- https://docs.geoserver.org/2.22.x/en/user/tutorials/palettedimage/palettedimage.html
So those mask files that look like color images are single-channel uint8 arrays under the hood. When PIL reads them, it (correctly) gives you a two-dimensional (H, W) array; OpenCV, as far as I know, does not handle the palette and expands the file into a color image. If what you get instead is a three-dimensional H*W*3 array, then your mask is not actually a paletted mask but just a colored image. Reading and saving a paletted mask through OpenCV or MS Paint would destroy the palette.
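
A quick way to tell the two apart is to inspect the image mode and array shape with PIL (a minimal sketch; `mask.png` is a placeholder path):

```python
import numpy as np
from PIL import Image

img = Image.open("mask.png")
print(img.mode)  # 'P' for a paletted mask; 'RGB' for a plain colored image

arr = np.array(img)
print(arr.shape, arr.dtype)  # a paletted mask gives (H, W) uint8;
                             # (H, W, 3) means the palette is gone or never existed
```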
Our code, when asked to generate multi-object segmentations (e.g., DAVIS 2017/YouTubeVOS), always reads and writes single-channel masks. If there is a palette in the input, we will reuse it in the output. The code does not care whether a palette is actually used -- we can read grayscale images just fine.
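
For illustration, here is a minimal sketch of that read-with-palette / write-with-palette round trip in PIL; the file names are placeholders and the actual processing step is omitted:

```python
import numpy as np
from PIL import Image

src = Image.open("input_mask.png")
palette = src.getpalette()   # None if the input has no palette
mask = np.array(src)         # (H, W) uint8 array of object IDs

# ... run segmentation / edit the mask here ...

out = Image.fromarray(mask.astype(np.uint8))  # single-channel image, mode 'L'
if palette is not None:
    out.putpalette(palette)  # reattach the input palette (mode becomes 'P')
out.save("output_mask.png")
```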
Importantly, we use np.unique to determine the number of objects in the mask (see the sanity-check sketch after this list). This would fail if:
- Colored images are used instead of paletted masks.
- The masks have "smooth" edges produced by feathering/downsizing/compression. For example, when you draw a mask in painting software, make sure you set the brush hardness to maximum.
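
The sketch below illustrates how the np.unique check behaves and how both failure modes show up; the file name and the threshold of 50 are arbitrary illustrations, not values from our code:

```python
import numpy as np
from PIL import Image

arr = np.array(Image.open("mask.png"))

if arr.ndim != 2:
    raise ValueError("Three-dimensional array: a colored image, not a paletted mask")

ids = np.unique(arr)                  # e.g., [0, 1, 2] -> background + two objects
num_objects = len(ids) - (0 in ids)   # index 0 is conventionally the background
print("object IDs:", ids, "-> objects:", num_objects)

# Feathered or compressed edges produce many spurious intermediate values,
# each of which would be miscounted as a separate object:
if len(ids) > 50:
    print("Suspiciously many unique values -- the edges are probably not hard")
```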