Proposed SysV structures and Mixin functionality #64

kyewei · 2015-11-12T21:04:34Z

Old massive PR: #54
Small PR 1: #62

Possibly outstanding issues

The SysV versions of the structures need permissions and a name, so Semian::Simple::SlidingWindow.new(max_size: 4) and Semian::SysV::SlidingWindow.new(max_size: 4, permissions: 0660, name: "sample_int") have different interfaces. Ways to solve this:
- do def initialize(max_size:, **) (current choice)
- do def initialize(options = {}) or something of the sort
- don't pass in permissions and name at all; they would then have to come from somewhere else without leaking the implementation, possibly from Semian.register
Slightly extraneous: it was previously mentioned that test suite naming is a bit off, for example TestCircuitBreaker lives in circuit_breaker_test.rb
Naming issue of execute_atomically: you know what, I changed it to #synchronize, similar to classes like Mutex and Monitor. It's still aliased to #transaction
#shared? It's private, and just returns @using_shared_memory. Hopefully won't ruffle any feathers, or maybe just the ivar is enough?
It was mentioned (not sure if still applies) that giving name and permissions to the structures was bad. Personally I disagree, but if so, start a discussion below about it.
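To make the interface question concrete, here is a minimal sketch of the `def initialize(max_size:, **)` option. The `Sketch` namespace and both classes are hypothetical stand-ins, not the real Semian classes; the point is only that one options hash can be handed to either variant.

```ruby
# Hypothetical stand-ins for illustration; not the real Semian classes.
module Sketch
  module Simple
    class SlidingWindow
      attr_reader :max_size

      # The bare ** swallows keywords this class has no use for
      # (for example name: and permissions:).
      def initialize(max_size:, **)
        @max_size = max_size
        @window = []
      end
    end
  end

  module SysV
    class SlidingWindow < Simple::SlidingWindow
      attr_reader :name, :permissions

      def initialize(max_size:, name:, permissions:)
        super(max_size: max_size)
        @name = name
        @permissions = permissions
      end
    end
  end
end

options = { max_size: 4, name: "sample_int", permissions: 0660 }
simple = Sketch::Simple::SlidingWindow.new(**options) # extra keys ignored
sysv   = Sketch::SysV::SlidingWindow.new(**options)   # extra keys used
```

The trade-off is that a misspelled keyword passed to the Simple variant is silently ignored instead of raising ArgumentError.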

Content
Alright, here's what I propose for adding in the SysV structures. I didn't include the tests yet, they will obviously fail because there's no C code to provide actual shared memory in this commit.

The three SysV classes inherit directly from the Simple equivalents. They also mixin the Semian::SysVSharedMemory module, which provides the common functionality.

From the C perspective, to get the memory allocation and things to work, it needs to do a bunch of things. Certain ways of accomplishing them are IMO worse than others, but I'll list the things that the C side needs to do, and the different ways to do them, and why/how I decided on this way.
C needs to accomplish:

Hooking the allocation of a Ruby object so a backing C struct is made when the class is instantiated, whether or not acquire_memory_object is called. The C code needs this when unpacking the Ruby VALUE self object into a proper C semian_shm_object *
Hook the implementations of the shared methods like #acquire_shared_memory at a shared level
Override the specific implementations of a class, like #value=, #<<, etc

This is about to get heavy.

There were 3 ways of doing the above, and there is a diamond problem here. I originally had (1), but I changed to (2).

(1): Have a superclass SharedMemoryObject, and Simple structures inherit from that. The SysV further inherit from the Simple

Pros: Only hook access methods, allocation once, can replace functions at any point in time and it will work
Cons: Weird to have class Simple::Integer < SharedMemoryObject even though it doesn't do sharing at all

(2): Mixin a SysVSharedMemory module that shares common functionality into the SysV variants only

Pros: Makes sense from class organization perspective, mixin a module if you want shared-ness

Cons: Although you can hook methods like acquire_memory_object in one spot, you have to replace the three different allocation methods separately. There is also a timing issue involved. You can't simply do this:

module SysVSharedMemory
  ...
  def self.included(base)
    # This doesn't work! The classes that mix in this module
    # are defined before semian.so is loaded,
    # meaning this isn't defined when self.included is called.
    call_c_function_to_override_alloc()
  end
end

Instead, as I found out, you had to do something like this in the C extension:

void Init_semian_shm_object() {
  ...
  call_c_function_to_override_alloc_for(klass);
  ...
}
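The timing problem can be reproduced in plain Ruby. This is a sketch only: `SharedSketch`, `IntegerSketch` and `override_alloc_for` are hypothetical names, and reopening the module stands in for semian.so being loaded later.

```ruby
module SharedSketch
  def self.included(base)
    # Runs at include time. The "native" hook below does not exist yet,
    # so all we can do here is record that it was missing.
    base.instance_variable_set(:@hook_present_at_include,
                               respond_to?(:override_alloc_for))
  end
end

class IntegerSketch
  include SharedSketch # self.included fires here, before the hook exists
end

# Stands in for the extension being loaded afterwards and defining the hook.
module SharedSketch
  def self.override_alloc_for(klass)
    klass.instance_variable_set(:@alloc_overridden, true)
  end
end

# What the Init function does instead: name each class explicitly
# once the hook is actually available.
SharedSketch.override_alloc_for(IntegerSketch)
```

Because the override is driven from the init side, each class that mixes in the module has to be listed there, which is the "three different allocation methods" cost mentioned above.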

(3): Make the SysV classes inherit from SharedMemoryObject. But what do we do about the common functionality between Simple and SysV versions? Put that in a module, and include it into things that need the functionality, like this:

class Simple::Enum
  module EnumImplementation
    ...
  end
  include EnumImplementation
end
class SysV::Enum < SharedMemoryObject
  include Simple::Enum::EnumImplementation
end

Pros: Share things. Single point of replacing methods.
Cons: Again, this is ugly.

If you made it this far, 👍, it was really long.
I chose (2) because it's the most structurally clean, and I have a working solution to the timing problem, and there are no random abuses of Ruby features like Mixins beyond what is necessary.

I've isolated the locking and unlocking: the separate lock() and unlock() functions are gone in favor of a single synchronize(), and with some delegation I wrap the methods so that lock() is taken first and unlock() is guaranteed by an ensure.
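As a rough illustration of that shape, here is a pure-Ruby sketch. `LockSketch` is a hypothetical class whose lock and unlock only record events; in the PR the real locking happens in C on a SysV semaphore.

```ruby
# Hypothetical sketch: lock/unlock only log, they do not really lock.
class LockSketch
  attr_reader :events

  def initialize
    @events = []
  end

  # Single entry point: take the lock, run the block, always release.
  def synchronize
    lock
    begin
      yield
    ensure
      unlock
    end
  end

  private

  def lock
    @events << :lock
  end

  def unlock
    @events << :unlock
  end
end

guard = LockSketch.new
guard.synchronize { :work }
begin
  guard.synchronize { raise ArgumentError, "boom" }
rescue ArgumentError
  # The error propagates, but unlock has already run.
end
```

With lock and unlock private, callers cannot forget the release on an early exit; the only way in is through synchronize.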

Here's the typical structure of an acquire that encompasses the C features as well just so we're on the same page:

:acquire_memory_object
  # calculate byte_size, verify semaphores enabled and _acquire exists, else exit
  :_acquire (in C as semian_shm_object_acquire())
    # receive name, permissions, byte_size, check validity
    # hash the name, 
    :bind_initialize_memory_callback
      # It binds a void* function pointer to the C struct. 
         the function is a callback that initializes memory of a given size.
         this is specialized per SysV structure.
    semian_shm_object_acquire_semaphore()
      # call semget, set permissions
    semian_shm_object_synchronize()
      semian_shm_object_lock_without_gvl()
      semian_shm_object_synchronize_with_block() (in begin block)
        semian_shm_object_check_and_resize_if_needed()
          # check requested byte_size against a possible existing memory block, 
             resizing if necessary by calling shmctl to check sizes
          semian_shm_object_acquire_memory()
            # call callback bound by :bind_initialize_memory_callback to initialize the acquired memory
            # call shmat to attach
      semian_shm_object_synchronize_restore_lock_status() (in ensure block)
  # set @using_shared_memory to true

Here's the structure of a synchronized access. Any method defined using define_method_with_synchronize is synchronized.

:value
  :synchronize
    semian_shm_object_lock_without_gvl()
    semian_shm_object_synchronize_with_block() (in begin block)
      semian_shm_object_check_and_resize_if_needed()
      :value_inner (all the _inner methods are private)
        # access value from C struct
    semian_shm_object_synchronize_restore_lock_status() (in ensure block)

define_method_with_synchronize basically defines the original method, and then calls do_with_sync :method_name
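For readers who have not opened the branch, this is roughly what a do_with_sync-style wrapper looks like in plain Ruby. It is a sketch under assumptions: `SynchronizeSketch` and `CounterSketch` are hypothetical, and a Monitor stands in for the semaphore-backed lock that the C code provides.

```ruby
require "monitor"

# Hypothetical sketch of the wrapping technique, not the code in the PR.
module SynchronizeSketch
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    # Keep the original as a private <name>_inner and redefine <name>
    # so that it calls the original inside synchronize.
    def do_with_sync(*names)
      names.each do |name|
        inner = :"#{name}_inner"
        alias_method inner, name
        private inner
        define_method(name) do |*args, &block|
          synchronize { send(inner, *args, &block) }
        end
      end
    end
  end

  def synchronize
    @lock ||= Monitor.new # reentrant, so wrapped methods may nest
    @lock.synchronize { yield }
  end
end

class CounterSketch
  include SynchronizeSketch

  def initialize
    @value = 0
  end

  def value
    @value
  end

  def increment(delta = 1)
    @value += delta
  end

  do_with_sync :value, :increment
end

counter = CounterSketch.new
counter.increment(2)
counter.value
```

The wrapping happens once per class at definition time, so there is no per-call metaprogramming cost beyond the extra send.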

Discussion, opinions, etc would be great. I also want suggestions as to how to go forward with this.

A version with everything working is on the branch with the large PR, #54
@sirupsen @byroot

byroot · 2015-11-16T18:33:16Z

lib/semian/simple_integer.rb

You don't need the _ here. If you disregard everything.

So just def initialize(**) ? I didn't know that.

byroot · 2015-11-16T18:41:14Z

do def initialize(max_size: , **_) current choice

Fine by me. As said though ** is enough.

Naming issue of execute_atomically: you know what, I changed it to #synchronize

Synchronize is definitely much better.

#shared? It's private, and just returns @using_shared_memory. Hopefully won't ruffle any feathers, or maybe just the ivar is enough?

Nah, an accessor is always good.

The rest of the PR is out of my expertise.

kyewei · 2015-11-16T18:45:11Z

@csfrancis Any suggestions or comments on the C side? A working version can be found here in the old branch which I keep updated, but probably reading the long PR description up there ^ would be enough.

kyewei · 2015-11-16T19:06:23Z

lib/semian/sysv_shared_memory.rb

@byroot probably the most interesting change that you could provide an opinion for is this function here. I call it from C to replace a regular accessor or setter that accesses shared memory and is not an atomic call, and wrap it in a synchronize. I wrap all the functions with this, so it's essentially a do_with_sync :value=, :value, :reset, :increment and do_with_sync :size=, :max_size, :<<, :push, :pop etc

I just want to make sure there are no disagreements here.

Relevant C code is
https://github.com/Shopify/semian/blob/circuit-breaker-per-host/ext/semian/semian_shared_memory_object.c#L319-L324
where define_method_with_synchronize calls do_with_sync

and
https://github.com/Shopify/semian/pull/54/files#diff-99c3116b9c2c62cc89a4ae4cc84fadaaR262
which is where define_method_with_synchronize is used in place of rb_define_method

I'd prefer to avoid this kind of monkey patching.

I'd rather have a no-op synchronize method in Simple.

While looking over the C code base, one of the scarier things was that because I provided the lock() and unlock() as separate functions, it was very difficult to track where I had to re-unlock, since errors could occur in the middle of a function, and then you'd have to return right there, and insert an unlock() there. It just isn't safe to have many exit points sprawled all over the code. So I then opted to write ensures in, but doing so manually would add twice the number of functions and make things very manual.

Here's what a begin-ensure would look like:

return rb_ensure(semian_shm_object_synchronize_with_block, self, semian_shm_object_synchronize_restore_lock_status, (VALUE)&status);

For reference: https://github.com/Shopify/semian/pull/54/files#diff-905e62ddb202d77b001ca1dbac3cbf06R310

Essentially each function needed a wrapper, and probably also some custom struct to overcome the one-argument limit of the function rb_ensure. This naturally led to me doing it in Ruby instead since it doesn't have these limitations, which is why synchronize does what it does. IMO it's not a big deal since it only does it once per class, and it reduces code and complexity by a bit. Maybe using prepend with its use of super would make it cleaner than doing what I do currently, which is having an #{method_name}_inner as the original and ##{method_name} as the synchronized version.
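A sketch of the prepend idea, for comparison with the _inner renaming. Everything here is hypothetical (`PlainInteger`, `SynchronizedAccess`, `SharedInteger`, and the `sync_count` counter that exists only to make the wrapping observable); a Monitor stands in for the real lock.

```ruby
require "monitor"

# Hypothetical sketch of the prepend alternative, not code from the PR.
class PlainInteger
  def initialize
    @value = 0
  end

  def value
    @value
  end

  def value=(new_value)
    @value = new_value
  end
end

module SynchronizedAccess
  # Only here so the wrapping can be observed.
  def sync_count
    @sync_count || 0
  end

  # A prepended module sits in front of the class in the ancestor chain,
  # so super reaches the original method without any renaming.
  def value
    synchronize { super() }
  end

  def value=(new_value)
    synchronize { super(new_value) }
  end

  private

  def synchronize
    @monitor ||= Monitor.new
    @monitor.synchronize do
      @sync_count = sync_count + 1
      yield
    end
  end
end

class SharedInteger < PlainInteger
  prepend SynchronizedAccess
end

shared = SharedInteger.new
shared.value = 5
shared.value
```

This keeps a single public name per method, at the cost of writing one small wrapper per method in the module instead of a one-line do_with_sync call.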

With regards to the noop synchronize, do you mean something more along the lines of (1) , with SharedMemoryObject -> Simple::Integer -> SysV::Integer ? Otherwise I don't understand what you mean by noop synchronize. Unless you mean just having one there to keep the interfaces the same?

@kyewei I think what he means with noop synchronize is that the SysV implementations continue to inherit from Simple::Integer and friends, which then look like this:

class Semian::Simple::Integer
  # .. stuff
  def increment(delta)
    synchronize { self.value += delta }
  end

  private

  def synchronize
    yield
  end
  # more stuff
end

When this mixin is included, it just overrides it. The implementation can then be simplified when all the fallback checking is done at boot-time because you can just override it in C, and not have it in Ruby.
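A runnable sketch of that suggestion, using hypothetical names throughout (`SimpleIntegerSketch`, `SharedMemorySketch`, `SysVIntegerSketch`); the Monitor and the `lock_count` counter stand in for the real shared-memory lock.

```ruby
require "monitor"

# Hypothetical sketch of the no-op synchronize idea, not the real classes.
class SimpleIntegerSketch
  attr_accessor :value

  def initialize
    @value = 0
  end

  def increment(delta = 1)
    synchronize { self.value += delta }
  end

  private

  # No-op in the simple implementation: just run the block.
  def synchronize
    yield
  end
end

module SharedMemorySketch
  # Only here so the override can be observed.
  def lock_count
    @lock_count || 0
  end

  private

  # Included into a subclass, this sits between the subclass and
  # SimpleIntegerSketch in the ancestor chain, so it replaces the no-op.
  def synchronize
    @lock_count = lock_count + 1
    (@monitor ||= Monitor.new).synchronize { yield }
  end
end

class SysVIntegerSketch < SimpleIntegerSketch
  include SharedMemorySketch
end

shared = SysVIntegerSketch.new
shared.increment(2)
```

Note that only methods written in terms of synchronize are protected this way; a plain reader such as value goes straight to the ivar in this sketch.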

I see what you're saying, but that also doesn't work in protecting, let's say, Semian::SysV::Integer#value. If the #value is not protected in C code, and it directly replaces the SysV::Integer, there is no entry point to override and replace with synchronize { ... } except through what do_with_sync does. If you do protect in C, well, that goes back to the problem of making 2x the functions to call rb_ensure(callback, arg1, ensure_fn, arg2)

For example, in code,

def increment(val = 1)
  synchronize { self.value += val } # OK
end

def value
  # How do you put a synchronize in here?
  # You can't call self.value; that would just recurse and crash.
  @value
end

csfrancis · 2015-11-17T15:01:33Z

lib/semian.rb

Should these requires be conditional depending on whether the target platform supports SysV?

kyewei · 2015-11-17T20:20:52Z

@sirupsen wanted to see the tests, so I'll include them now. Some of those new tests will fail because the C component isn't included in this PR :( They do succeed on the other branch however.

sirupsen · 2015-11-19T09:09:15Z

lib/semian/simple_sliding_window.rb

Why do you need this now?

sirupsen · 2015-11-19T09:17:05Z

I think that 2 is the right approach as well.

sirupsen · 2015-11-19T09:18:42Z

lib/semian/sysv_shared_memory.rb

Why not just use @type_size[type.to_sym] and make this easier to read? I assume the only reason you need the respond_to? is because you don't have the C implementation just yet?

It's for memoizing the sizes after it's read once from C. I don't want to hardcode sizes, they differ by platform.

I agree with that, but I don't see why you need the local variable when you have the class instance variable.

Oh. For raising the TypeError. Maybe that should be moved to C instead?
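For context, the memoize-and-raise shape being discussed could look something like this. It is a sketch with assumed names (`SizeofSketch`, `native_sizeof`, `PACK_DIRECTIVES`); the real sizes come from the C extension, and `Array#pack` with native directives is used here only as a stand-in that also varies by platform.

```ruby
# Hypothetical sketch; the PR reads sizes from C, not from Array#pack.
module SizeofSketch
  # Native-size pack directives: "i" is a C int, "l!" is a C long.
  PACK_DIRECTIVES = { int: "i", long: "l!" }.freeze

  @type_size = {}
  @native_calls = 0

  class << self
    attr_reader :native_calls

    # Look the size up once, then serve it from @type_size.
    def sizeof(type)
      key = type.to_sym
      @type_size.fetch(key) do
        size = native_sizeof(key)
        raise TypeError, "#{key} is not a valid C type" if size.nil?
        @type_size[key] = size
      end
    end

    private

    # Stand-in for the call into the extension.
    def native_sizeof(key)
      @native_calls += 1
      directive = PACK_DIRECTIVES[key]
      directive && [0].pack(directive).bytesize
    end
  end
end

SizeofSketch.sizeof(:int)
SizeofSketch.sizeof("int") # served from the memo, no second native call
```

The fetch block keeps the lookup, the error and the memoization in one expression, so no extra local is needed for the TypeError path.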

sirupsen · 2015-11-19T09:19:36Z

lib/semian/sysv_shared_memory.rb

Why does sizeof need to be a class method?

sirupsen · 2015-11-19T09:24:18Z

Fundamentally I think what makes this PR more complicated than it needs to be is that the fallback is done within the SysV driver, instead of at boot time as @csfrancis also pointed out. This will simplify a bunch of things because you can make more assumptions.
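A minimal sketch of deciding at boot time instead of inside the driver. All names are hypothetical (`BootSketch`, `sysv_available?`, `integer_class`), and the rule "SysV only on Linux" is an assumption made for the example, not something this thread states.

```ruby
require "rbconfig"

# Hypothetical sketch of a boot-time choice between implementations.
module BootSketch
  class SimpleInteger; end
  class SysVInteger < SimpleInteger; end

  # Assumption for illustration only: treat Linux as the one platform
  # where the SysV-backed classes are usable.
  def self.sysv_available?(host_os = RbConfig::CONFIG["host_os"])
    host_os.include?("linux")
  end

  # Decide once; callers never need a per-call fallback check.
  def self.integer_class(host_os = RbConfig::CONFIG["host_os"])
    sysv_available?(host_os) ? SysVInteger : SimpleInteger
  end
end

chosen = BootSketch.integer_class
```

Once the class is picked up front, the SysV code can assume shared memory is present and drop its internal "am I actually shared?" branches.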

sirupsen · 2015-11-19T09:26:44Z

test/sysv_sliding_window_test.rb

This is a little bit of a nitpick, but I'd call this KLASS instead of making it a method. CLASS is too close to the reserved keyword.

sirupsen · 2015-11-23T08:04:55Z

lib/semian/sysv_state.rb

If you did alias_method :to_i, :value and always called value.to_i in Simple::State, you wouldn't need to override these. Duck typing hard!

kyewei added 2 commits November 12, 2015 19:50

Proposed SysV structures and Mixin functionality

ace7741

Stub tests that test fallback not-shared functionality

bcb187d

kyewei force-pushed the shared_memory_and_sysv branch from 6a888c3 to bcb187d Compare November 13, 2015 22:04

byroot reviewed Nov 16, 2015

kyewei reviewed Nov 16, 2015

Naming and nit-pick

144e8ea

csfrancis reviewed Nov 17, 2015

More nitpick

9d117e2

sirupsen reviewed Nov 19, 2015

sirupsen reviewed Nov 19, 2015

sirupsen reviewed Nov 19, 2015

Made SysV override on boot-up

89075a2

kyewei force-pushed the shared_memory_and_sysv branch from c300b81 to 89075a2 Compare November 20, 2015 21:14

sirupsen reviewed Nov 23, 2015

Small changes

8c6f6dd

kyewei closed this Nov 23, 2015

kyewei mentioned this pull request Nov 23, 2015

Entire SysV::Integer code and associated shared libraries #67

Open

epk deleted the shared_memory_and_sysv branch June 26, 2019 01:54
