Describe the bug
Every call to ingestFromBlob() with a URL-based BlobSourceInfo emits a WARN log:
Blob 'https://...' was sent for ingestion without specifying its raw data size
This fires once per blob ingestion event. In production this generates thousands
of warnings per minute, polluting logs and making it harder to identify real issues.
The root cause is that BlobSourceInfo has a private blobExactSize field with no
public setter. The only way to set it is via internal static factory methods
(fromFile, fromStream) which are not applicable for URL-based blob ingestion.
There is currently no public API path to set blobExactSize when constructing
BlobSourceInfo from a URL — yet the SDK logs a WARN on every call where it is unset.
To Reproduce
- Construct a BlobSourceInfo using the URL constructor:
BlobSourceInfo blobSourceInfo = new BlobSourceInfo("https://<storage-account>.blob.core.windows.net/<container>/<blob>");
- Call ingestFromBlob:
ingestClient.ingestFromBlob(blobSourceInfo, ingestionProperties);
- Observe the following WARN log emitted on every call:
Blob 'https://...' was sent for ingestion without specifying its raw data size
Expected behavior
Either:
- Option A: Expose a public setBlobExactSize(long size) setter on BlobSourceInfo
so callers can provide the blob size when constructing via URL. We can obtain the size
via Azure Storage SDK (BlobProperties.getBlobSize()) and would happily pass it if a
setter were available.
- Option B: Downgrade this log from WARN to DEBUG. A WARN implies something
is wrong with the call; in this case the blob is ingested successfully regardless —
the size is an optional optimization hint, not a correctness requirement. High-volume
WARN logs that do not indicate a real problem make production monitoring harder.
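To make the two options concrete, here is a minimal, self-contained sketch of the behavior described above. It does not use the SDK: SketchBlobSourceInfo, setBlobExactSize, and sizeWarningFor are hypothetical stand-ins written for this issue, and the null check only mirrors what we observe from the outside, not the actual QueuedIngestClientImpl code.

```java
public class BlobSizeSketch {

    /** Hypothetical stand-in for BlobSourceInfo; NOT the SDK class. */
    static final class SketchBlobSourceInfo {
        private final String blobPath;
        private Long blobExactSize; // null means "size was not provided"

        SketchBlobSourceInfo(String blobPath) {
            this.blobPath = blobPath;
        }

        // Option A: the public setter this issue asks for (does not exist in the SDK today).
        void setBlobExactSize(long size) {
            this.blobExactSize = size;
        }

        Long getBlobExactSize() {
            return blobExactSize;
        }

        String getBlobPath() {
            return blobPath;
        }
    }

    /**
     * Models the observed check: returns the warning text when the size is
     * unset, or null when the caller supplied a size.
     */
    static String sizeWarningFor(SketchBlobSourceInfo info) {
        if (info.getBlobExactSize() == null) {
            return "Blob '" + info.getBlobPath()
                    + "' was sent for ingestion without specifying its raw data size";
        }
        return null;
    }

    public static void main(String[] args) {
        SketchBlobSourceInfo info =
                new SketchBlobSourceInfo("https://example.invalid/container/blob");

        // Today: no way to set the size for a URL-based source, so the warning fires.
        System.out.println("before: " + sizeWarningFor(info));

        // With Option A: the caller passes a size it already knows, for example
        // one read from the Azure Storage SDK via BlobProperties.getBlobSize().
        info.setBlobExactSize(1024L); // 1024 is an arbitrary illustrative value
        System.out.println("after: " + sizeWarningFor(info));
    }
}
```

With a setter like this, callers who already know the size never hit the warning path. Option B would leave the check as-is and only lower the level of the log statement.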
Screenshots
N/A
Setup (please complete the following information):
- JRE Version: OpenJDK 17
- SDK Version: kusto-ingest 7.0.6
Desktop (please complete the following information):
- OS: Linux (production), macOS (development)
- Version: Ubuntu 20.04 / macOS 14
Additional context
- The warning originates in QueuedIngestClientImpl and fires unconditionally when
blobExactSize is null on the BlobSourceInfo object.
- Switching to fromFile() or fromStream() is not viable — our blobs already reside
in Azure Storage and downloading them locally before re-ingesting would add double
network cost, local disk pressure, and significant latency, defeating the purpose of
URL-based queued ingestion.
- We checked BlobSourceInfo in SDK version 8.0.1 — setBlobExactSize() is still
not publicly exposed, so upgrading does not resolve this.
- Related class: com.microsoft.azure.kusto.ingest.source.BlobSourceInfo