Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor set_at() and it's usage in draw.c #2517

Open
7 tasks
MyreMylar opened this issue Oct 14, 2023 · 1 comment · May be fixed by #2551
Open
7 tasks

Refactor set_at() and it's usage in draw.c #2517

MyreMylar opened this issue Oct 14, 2023 · 1 comment · May be fixed by #2551
Assignees
Labels
draw pygame.draw Performance Related to the speed or resource usage of the project

Comments

@MyreMylar
Copy link
Member

Currently the function set_at() in draw.c looks like this:

static int
set_at(SDL_Surface *surf, int x, int y, Uint32 color)
{
    SDL_PixelFormat *format = surf->format;
    Uint8 *pixels = (Uint8 *)surf->pixels;
    Uint8 *byte_buf, rgb[4];

    if (x < surf->clip_rect.x || x >= surf->clip_rect.x + surf->clip_rect.w ||
        y < surf->clip_rect.y || y >= surf->clip_rect.y + surf->clip_rect.h)
        return 0;

    switch (format->BytesPerPixel) {
        case 1:
            *((Uint8 *)pixels + y * surf->pitch + x) = (Uint8)color;
            break;
        case 2:
            *((Uint16 *)(pixels + y * surf->pitch) + x) = (Uint16)color;
            break;
        case 4:
            *((Uint32 *)(pixels + y * surf->pitch) + x) = color;
            break;
        default: /*case 3:*/
            SDL_GetRGB(color, format, rgb, rgb + 1, rgb + 2);
            byte_buf = (Uint8 *)(pixels + y * surf->pitch) + x * 3;
#if (SDL_BYTEORDER == SDL_LIL_ENDIAN)
            *(byte_buf + (format->Rshift >> 3)) = rgb[0];
            *(byte_buf + (format->Gshift >> 3)) = rgb[1];
            *(byte_buf + (format->Bshift >> 3)) = rgb[2];
#else
            *(byte_buf + 2 - (format->Rshift >> 3)) = rgb[0];
            *(byte_buf + 2 - (format->Gshift >> 3)) = rgb[1];
            *(byte_buf + 2 - (format->Bshift >> 3)) = rgb[2];
#endif
            break;
    }
    return 1;
}

There has already been one attempt to improve it by making an 'unsafe' version that doesn't do clip checking.

This does not go far enough. This function is used all over draw.c inside of while loops that draw lines containing hundreds of pixels, and shapes that may contain many lines.

For each pixel of these lines we are checking what the bit depth of the surface is. That could be a thousand checks per polygon for a large polygon - but we only need one check.

I propose, unless set_at() is being called directly by a user to change literally a single pixel, we instead make all our tests for surface bit depth (and ideally shape clipping) happen outside of any per pixel while loops and inside of them we call very specific functions that just set the pixel.

e.g.

// This 'safe' version assumes we have determined the surface bit depth but does clipping per pixel,
// ideally we could clip our draw shapes at a higher level
static int set_at_32bit_safe(Uint8 *pixels, int pitch, int x, int y, Uint32 color, SDL_Rect clip_rect)
{
    if (x < clip_rect.x || x >= clip_rect.x + clip_rect.w ||
        y < clip_rect.y || y >= clip_rect.y + clip_rect.h)
        return 0;

    *((Uint32 *)(pixels + y * pitch) + x) = color;
    return 1;
}
//This 'unsafe' version just sets a 32bit surface pixel to a colour and assumes safe postions (i.e. inside the surface)
//and that we have already determined the needed pixel tpye before calling
static void set_at_32bit_unsafe(Uint8 *pixels, int pitch, int x, int y, Uint32 color)
{
    *((Uint32 *)(pixels + y * pitch) + x) = color;
}

In all cases we should also prioritise the 32 bit path - the current set_at() has an 8bit surface depth as the first case, before testing for a 16 bit surface, then a 24bit one before finally testing for a 32bit. These switch cases should be in surface commonality order which is 32bit (4byte) then 24bit (3byte), then 8bit (1byte) and finally the rarely seen 16bit (2 byte) surface.

To Do list:

  • Identify all the places where set_at() is eventually used inside of a pixel loop. For a refactorings list.
  • Create simple pixel setter functions like the 32bit onese above for all the normal bit depth surfaces (32bit, 24bit, 16bit, 8bit).
  • copy the surface depth switch case that currently sits inside set_at() to sit outside each of the pixel loops (or higher) so we only test the surface bit depth once per user call to draw.
  • use the new simple pixel setter functions inside the pixel loops now separated cleanly by bit depth.
  • make sure the tests still pass.
  • check the performance versus current main branch across however many draw functions this problem affects.
  • can any of this be macro'd up to reduce repetition?
@MyreMylar MyreMylar added Performance Related to the speed or resource usage of the project draw pygame.draw labels Oct 14, 2023
@MyreMylar
Copy link
Member Author

Just for some incentive, a quick test I did to check the potential of this saw performance jump by about 50% for single pixel diagonal lines.

If we can also strip the clip test out I expect it will do even better.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
draw pygame.draw Performance Related to the speed or resource usage of the project
Projects
None yet
2 participants