Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SerializerCache: remove synchronization when only _sharedMap is accessed with a read call #4459

Open
wants to merge 2 commits into
base: 2.18
Choose a base branch
from

Conversation

pjfanning
Copy link
Member

  • there is a lot of synchronization in SerializerCache
  • most of it may be necessary on the update side but on the read side, we can remove some
  • we can follow up and replace the remaining synchronization later

@cowtowncoder
Copy link
Member

Ok, please see my notes on #4442: I am not sure these are safe to remove.
Problem I fear is that JsonSerializer instances are added to Map before dependency resolution is complete (since dependency resolution in fact will need to link partially initialized serializers for specific case of cyclic dependencies), and without synchronization partially initialized instances may escape (whether they'd be used before initialization completes depends on timing of course).

@pjfanning
Copy link
Member Author

pjfanning commented Mar 28, 2024

The code that I changed just reads values. I really don't think the existing synchronization helps these methods except in the unlikely event that they are called slightly after an update method in a different thread that locks and then blocks the read. This would only be by chance - there is no determinism. If the read methods here were doing something like this

  • check state and if valid return
  • no state found, then create the state and return it

That pattern benefits from locking.

These plain reads do not benefit from locking imho.

@cowtowncoder
Copy link
Member

@pjfanning But synchronization does have wider scope so it is NOT just guarding reads.

That's what I realized reading through the code: synchronized locks the whole SerializerCache instance which then affects (for better and worse) all code paths that use synchronized. And in particular calls to resolve() (ResolvableSerializer).
You could say locking is a side-effect, but that's what guarantees that half-resolved instances are not accessible.

@pjfanning
Copy link
Member Author

pjfanning commented Mar 28, 2024

_sharedMap is created in the constructor and is final. This coding approach makes it fully safe to read without locking. The JVM basically guarantees that the constructor must complete before anything else can use the instance. Lazy initialization requires locking. Eager initialization in the constructor does not.

There is more complicated locking in other methods in this class and that is why I've left them alone for now but imho, these simpler cases are safe to change.

@pjfanning pjfanning changed the title SerializerCache: remove synchronization when only _sharedMap is access with a read call SerializerCache: remove synchronization when only _sharedMap is accessed with a read call Mar 28, 2024
@cowtowncoder
Copy link
Member

cowtowncoder commented Apr 5, 2024

_sharedMap is created in the constructor and is final. This coding approach makes it fully safe to read without locking. The JVM basically guarantees that the constructor must complete before anything else can use the instance. Lazy initialization requires locking. Eager initialization in the constructor does not.

There is more complicated locking in other methods in this class and that is why I've left them alone for now but imho, these simpler cases are safe to change.

I am not worried about existence of _sharedMap -- that is perfectly fine, yes.
My concern has to do with synchronized(this) being wider global lock being relied on, around access to _sharedMap. Specifically, these accessors are now blocked due to:

    public void addAndResolveNonTypedSerializer(Class<?> rawType, JavaType fullType,
            JsonSerializer<Object> ser,
            SerializerProvider provider)
        throws JsonMappingException
    {
        synchronized (this) {
            Object ob1 = _sharedMap.put(new TypeKey(rawType, false), ser);
            Object ob2 = _sharedMap.put(new TypeKey(fullType, false), ser);
            if ((ob1 == null) || (ob2 == null)) {
                _readOnlyMap.set(null);
            }
            if (ser instanceof ResolvableSerializer) {
                ((ResolvableSerializer) ser).resolve(provider);
            }
        }
    }

so that ser.resolve(provider) is guarded. It's horrible from access perspective wrt shared map access but prevents concurrent attempts to resolve which is intentional.

So why doesn't performance suck really bad under normal circumstances? Because there is the "read-only map" that is used instead of shared one for all access, once initial construction of serializers settles.
This:

    public ReadOnlyClassToSerializerMap getReadOnlyLookupMap()

is what SerializerProvider gets:

        /* Non-blueprint instances do have a read-only map; one that doesn't
         * need synchronization for lookups.
         */
        _knownSerializers = _serializerCache.getReadOnlyLookupMap();

and over time most (or perhaps all) access to serializers will go through these immutable, non-synchronized maps.
It can still be potential concurrency problem during initial warmup phase, before all type->serializer mappings are resolved.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants