Can't send message with serialized NumPy array that is larger than 2 GB in size #4768

Open
wigging opened this issue Feb 11, 2025 · 0 comments

wigging commented Feb 11, 2025

I'm using pyzmq to send a large NumPy array from a client to a server. See the pyzmq discussion for more details. The client and server computers are both MacBook Pro laptops running macOS 15.3 with 32 GB of memory. I noticed that if the NumPy array is larger than about 2 GB, it fails to send. Since pyzmq does not set a size limit, does libzmq impose a size limit on messages sent over TCP sockets?
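As a rough check on where the failure starts, the array sizes mentioned in the client code can be compared against the signed 32-bit integer maximum, 2**31 - 1 bytes. That threshold is an assumption picked for comparison, not a confirmed libzmq or macOS limit, and the helper name `square_array_nbytes` is hypothetical. The sketch only does the arithmetic for square float64 arrays (8 bytes per element, as produced by `np.random.rand`).

```python
# Candidate threshold (assumption): largest signed 32-bit integer, in bytes.
INT32_MAX = 2**31 - 1


def square_array_nbytes(n: int, itemsize: int = 8) -> int:
    """Return the data size in bytes of an n-by-n array with the given item size."""
    return n * n * itemsize


if __name__ == "__main__":
    # Values of n taken from the comment in client.py.
    for n in (8000, 11500, 16000, 17000):
        nbytes = square_array_nbytes(n)
        print(f"n={n:>6}  {nbytes:>13} bytes  exceeds INT32_MAX: {nbytes > INT32_MAX}")
```

With these numbers, n = 16000 stays just under the threshold and n = 17000 goes over it, which matches the observed behavior. That agreement may point to a 32-bit length somewhere in the send path, but it does not identify where.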

I found another issue describing a similar problem, but it seems abandoned and doesn't specifically deal with NumPy arrays.

Here is my Python code for serializing the NumPy array and sending it. This works fine as long as the NumPy array is less than 2 GB in size.

# client.py

import sys
import numpy as np
import zmq

class Client:
    """Client for sending/receiving messages."""

    def __init__(self, address="tcp://localhost:5555"):
        context = zmq.Context()
        socket = context.socket(zmq.REQ)
        socket.connect(address)
        self.socket = socket

    def send_array(self, array: np.ndarray):
        md = {"dtype": str(array.dtype), "shape": array.shape}
        self.socket.send_json(md, zmq.SNDMORE)  # send metadata
        self.socket.send(array, copy=False)     # send NumPy array data

    def recv_message(self):
        reply = self.socket.recv_string()
        print("Received reply:", reply)

def main():
    # Create array
    n = 16000  # 8000 is ~512 MB, 11500 is ~1.06 GB, 16000 is ~2.05 GB, 17000 (~2.31 GB) fails to send
    x = np.random.rand(n, n)
    print(f"Array shape:           {x.shape}")
    print(f"First three elements:  {x[0, 0:3]}")
    print(f"Size of array data:    {x.nbytes} bytes, {x.nbytes / 1000**2} MB")
    print(f"Size of array object:  {sys.getsizeof(x)} bytes, {sys.getsizeof(x) / 1000**2} MB")

    # Create client and send array
    client = Client()
    client.send_array(x)
    client.recv_message()

if __name__ == "__main__":
    main()

# server.py

from typing import Any
import zmq
import numpy as np

class Server:
    """Server for receiving/sending messages."""

    def __init__(self, address="tcp://localhost:5555"):
        context = zmq.Context()
        socket = context.socket(zmq.REP)
        socket.bind(address)
        self.socket = socket
        print("Server started, waiting for array...")

    def _recv_array(self):
        md: Any = self.socket.recv_json()               # receive metadata
        msg: Any = self.socket.recv(copy=False)         # receive NumPy array data
        array = np.frombuffer(msg, dtype=md["dtype"])   # reconstruct the NumPy array
        return array.reshape(md["shape"])

    def run(self):
        """Run the server."""
        while True:
            # Receive the NumPy array
            array = self._recv_array()
            print("Received array with shape:", array.shape)
            print(f"First three elements:  {array[0, 0:3]}")

            # Send a confirmation reply
            self.socket.send_string("Array received")

def main():
    server = Server()
    server.run()

if __name__ == "__main__":
    main()
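One possible workaround, untested against this failure, is to split the array's bytes into several pieces that each stay well below 2 GB and reassemble them on the receiving side. The sketch below shows only the splitting and joining, using the standard library so it runs without pyzmq. The names `iter_chunks` and `join_chunks` are hypothetical, and the chunk size is an arbitrary choice.

```python
from typing import Iterable, Iterator


def iter_chunks(buf, chunk_bytes: int) -> Iterator[memoryview]:
    """Yield zero-copy byte slices of buf, each at most chunk_bytes long.

    buf can be any C-contiguous object supporting the buffer protocol,
    such as bytes, bytearray, or a C-contiguous NumPy array.
    """
    if chunk_bytes <= 0:
        raise ValueError("chunk_bytes must be positive")
    view = memoryview(buf).cast("B")  # flat view of the raw bytes, no copy
    for start in range(0, len(view), chunk_bytes):
        yield view[start:start + chunk_bytes]


def join_chunks(chunks: Iterable) -> bytes:
    """Concatenate the received chunks back into one bytes object (this copies)."""
    return b"".join(chunks)


if __name__ == "__main__":
    data = bytearray(range(10))
    chunks = list(iter_chunks(data, 4))
    print([len(c) for c in chunks])
    print(join_chunks(chunks) == bytes(data))
```

With pyzmq, each chunk could be sent as its own frame, passing `zmq.SNDMORE` on every frame except the last, and the server could collect the frames with `recv_multipart` before calling `np.frombuffer`. Whether sending multiple smaller frames actually avoids the failure described here is an assumption that would need to be verified.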