Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for fixed ContentLength #32

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

gbonnefille
Copy link
Contributor

Some servers do not support chunked message.
Sending whole part in a single message
would help to interact with such servers.

Some servers do not support chunked message.
Sending whole *part* in a single message
would help to interact with such servers.
@Acconut
Copy link
Member

Acconut commented Feb 7, 2020

Can you elaborate more about what the problem is and how this PR fixes it?

@gbonnefille
Copy link
Contributor Author

For what I understood, the previous implementation of tus-java-client is based on chunked transfer encoding and the server I tried to use seems to not support this encoding.

Effect: the server considers the client sent an empty message.

The fix consists in adding an upload method. This method disable the chunked transfer, and send a message of size requestPayloadSize in a single run.

I tested it yesterday and it gives full satisfaction.

Copy link
Member

@Acconut Acconut left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the explanation. That makes sense.

src/main/java/io/tus/java/client/TusUploader.java Outdated Show resolved Hide resolved
src/main/java/io/tus/java/client/TusUploader.java Outdated Show resolved Hide resolved
The Java HTTP client library is able to automatically set the Content-Length.
@foxware00
Copy link

foxware00 commented Apr 23, 2020

Is there any plans to merge this, we could benefit from this change. It'll also close off #30

@Acconut this would be really useful for us as it's currently preventing us from using TUS in our Android setting

@EverydayPineapple
Copy link

I'd also like to know on plans for merging this @Acconut

Copy link
Member

@Acconut Acconut left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apologies for the delay on my side, this PR slipped under my radar. This is a sensitive area of tus-java-client so I might be very picky about changes here :)

@@ -115,6 +116,39 @@ public boolean removeFingerprintOnSuccessEnabled() {
return removeFingerprintOnSuccessEnabled;
}

/**
* Enable chunked transfer encoding.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you expand the comment a bit to explain what the advantages/disadvantages/impacts of enabling/disabling chunked transfer encoding has?

src/main/java/io/tus/java/client/TusClient.java Outdated Show resolved Hide resolved
src/main/java/io/tus/java/client/TusUploader.java Outdated Show resolved Hide resolved
bytesToRead = Math.min(getChunkSize(), bytesRemainingForRequest);
else {
// No chunked tranfer encoding, so we upload a single chunk.
bytesToRead = getChunkSize();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there is still some confusion about the chunk size vs payload size. Traditionally the payload size determines how big the body of a HTTP can get. The chunk size on the other hand says, how much data is read from the input stream and written to the TCP connection in one iteration. Usually, you write multiple chunks per HTTP request, so the payload size is bigger than the chunk size.

However, if I understand this correctly, the chunk size now determines how big a HTTP request gets and the payload size is not used anymore. This would be contrary to how tus-java-client behaves when chunked transfer is enabled.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did the opposite reasoning:
chunk size exists because something in the chain does not support more than that. So, when chunk transfer is not supported, it seems better to use TUS protocol in place of chunk transfer and then, send many complete HTTP+TUS messages based on chunk size.

Furthermore, as a dumb user, I only wish things work without having to fine tune chunk size or payload size, depending on the server's capabilities. I like automagical approach. :-)

But I'm completly new to all of this. So I don't know what sort of logic is expected.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

chunk size exists because something in the chain does not support more than that.

That's not entirely true. The reason for chunk size is that the user can control how much data is read from the input source in one go. This is important to control how big the in-memory buffer of tus-java-client is, in order to avoid excess memory usage.

You are referring to the payload size when you say "something [e.g. a proxy] in the chain does not support more than that". Does that make sense?

Furthermore, as a dumb user, I only wish things work without having to fine tune chunk size or payload size, depending on the server's capabilities. I like automagical approach. :-)

Yes, I would like that as well and it will likely come in the next major release but not right now :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is important to control how big the in-memory buffer of tus-java-client is, in order to avoid excess memory usage.

OK, so you state that client (memory) is "something in the chain does not support more than that", first part of the chain.

@Acconut
Copy link
Member

Acconut commented May 1, 2020

@EverydayPineapple @foxware00 Thanks for reminding me about this. There are no plans for merging it but I got some more feedback we need to look into.

@foxware00
Copy link

@EverydayPineapple @foxware00 Thanks for reminding me about this. There are no plans for merging it but I got some more feedback we need to look into.

@Acconut thank you. This is critical for me, for the usage of Tus. A present we're running this branch internally until it's upstreamed. Many thanks

@Acconut
Copy link
Member

Acconut commented May 1, 2020

Let's see if @gbonnefille is still interested in this PR. Otherwise someone else would have to pick it up since I don't have time to work on it directly. If you are interested in doing so, let me know :)

@foxware00
Copy link

Let's see if @gbonnefille is still interested in this PR. Otherwise someone else would have to pick it up since I don't have time to work on it directly. If you are interested in doing so, let me know :)

Sure if @gbonnefille isn't happy to do it. I would be happy to apply your changes

@gbonnefille
Copy link
Contributor Author

I'm still interested on it, but spare time is lacking now. I hope to do something on this during the week.

@Acconut
Copy link
Member

Acconut commented May 4, 2020

Perfect, let me know if you need help.

@dmytrohridin
Copy link

Hey guys
Any progress on this, need some help maybe?
cc @Acconut

@Acconut
Copy link
Member

Acconut commented Oct 23, 2020

I think there are still some open comments from me that would need to be addressed before we can merge. If there is still some interest in getting this merged, I am happy to review the work! In the future I hope to replace the HTTPUrlConnection with a more flexible HTTP Client (maybe OkHttp or something), which would make this code not necessary anymore.

@foxware00
Copy link

@Acconut I was picking this back up as I've noticed an issue in this logic around chunking/payload size. If I understand your open question correctly. Is it that payload size == chunk size when chunked transfer is disabled?

This seems to make sense to me, I'm interested in a way to solve it, but one way is removing the extra code to close the connection and force chunk/payload to be the smaller of the two when chunkedTransferEncoding is disabled.

In doing so, you negate the need for the additional code closing the connection and the logic within uploadChunk remains identical for both paths.

My question to you is how this fits in the the library, I'm happy to make the change. I feel we just need to document logic around chunk/payload being only valid in chunkedEncodingMode.

Such as below

if (!client.chunkedTransferEncoding()) {
    if (getChunkSize() < getRequestPayloadSize()) {
        setRequestPayloadSize(getChunkSize());
    } else {
        setChunkSize(getRequestPayloadSize());
    }
}

Another option is to ignore the chunkSize when chunkedEncoding is disabled and fallback to payload size.
That feels a bit more obvious to the user of the library.

Then, getChunkSize becomes the following.

/**
 * Returns the current chunk size set using {@link #setChunkSize(int)}. 
 * If chunked transfer encoding is disabled, the payload size is returned instead
 *
 * @return Current chunk size
 */
public int getChunkSize() {
    if (client.chunkedTransferEncoding()) {
        return buffer.length;    
    }
    return requestPayloadSize;
}

@gbonnefille
Copy link
Contributor Author

@Acconut I was picking this back up as I've noticed an issue in this logic around chunking/payload size. If I understand your open question correctly. Is it that payload size == chunk size when chunked transfer is disabled?

This seems to make sense to me, I'm interested in a way to solve it, but one way is removing the extra code to close the connection and force chunk/payload to be the smaller of the two when chunkedTransferEncoding is disabled.

In doing so, you negate the need for the additional code closing the connection and the logic within uploadChunk remains identical for both paths.

My question to you is how this fits in the the library, I'm happy to make the change. I feel we just need to document logic around chunk/payload being only valid in chunkedEncodingMode.

Such as below

if (!client.chunkedTransferEncoding()) {
    if (getChunkSize() < getRequestPayloadSize()) {
        setRequestPayloadSize(getChunkSize());
    } else {
        setChunkSize(getRequestPayloadSize());
    }
}

Another option is to ignore the chunkSize when chunkedEncoding is disabled and fallback to payload size. That feels a bit more obvious to the user of the library.

Then, getChunkSize becomes the following.

/**
 * Returns the current chunk size set using {@link #setChunkSize(int)}. 
 * If chunked transfer encoding is disabled, the payload size is returned instead
 *
 * @return Current chunk size
 */
public int getChunkSize() {
    if (client.chunkedTransferEncoding()) {
        return buffer.length;    
    }
    return requestPayloadSize;
}

See #32 (review)
IMHO, chunkSize < payloadSize and @Acconut stated this is because chunkSize is the max size supported by client due to memory management.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants