BlobSourceInfo has no public setBlobExactSize() setter — causing high-volume WARN log on every queued ingest #477

Description

@mittalprince

Describe the bug
Every call to ingestFromBlob() with a URL-based BlobSourceInfo emits a
WARN log:
Blob 'https://...' was sent for ingestion without specifying its raw data size
This fires once per blob ingestion event. In production this generates thousands
of warnings per minute, polluting logs and making it harder to identify real issues.

The root cause is that BlobSourceInfo has a private blobExactSize field with no
public setter. The only way to set it is via internal static factory methods
(fromFile, fromStream) which are not applicable for URL-based blob ingestion.
There is currently no public API path to set blobExactSize when constructing
BlobSourceInfo from a URL — yet the SDK logs a WARN on every call where it is unset.
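
The shape of the problem can be modeled with a small stand-in. This is only a sketch: UrlBlobSource, MissingSizeDemo, and warnIfSizeMissing are hypothetical names rather than SDK classes, and java.util.logging stands in for whatever logging facade the SDK actually uses. It shows why a source built from a URL always takes the warning path when the size field has no public setter.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

// Hypothetical stand-in for a URL-based blob source; NOT the SDK's BlobSourceInfo.
final class UrlBlobSource {
    private final String blobPath;
    private Long exactSize; // private, and deliberately no public setter

    UrlBlobSource(String blobPath) {
        this.blobPath = blobPath;
    }

    String getBlobPath() { return blobPath; }
    Long getExactSize() { return exactSize; }
}

final class MissingSizeDemo {
    static final Logger LOG = Logger.getLogger("MissingSizeDemo");

    // Mirrors the reported behavior: warn whenever the size is unset.
    static void warnIfSizeMissing(UrlBlobSource source) {
        if (source.getExactSize() == null) {
            LOG.warning("Blob '" + source.getBlobPath()
                    + "' was sent for ingestion without specifying its raw data size");
        }
    }

    // Runs `calls` simulated ingestions and returns how many WARNING records were logged.
    static int countWarnings(int calls) {
        List<LogRecord> seen = new ArrayList<>();
        Handler capture = new Handler() {
            @Override public void publish(LogRecord entry) {
                if (entry.getLevel() == Level.WARNING) seen.add(entry);
            }
            @Override public void flush() {}
            @Override public void close() {}
        };
        LOG.setUseParentHandlers(false); // keep the demo's console output quiet
        LOG.addHandler(capture);
        try {
            for (int i = 0; i < calls; i++) {
                warnIfSizeMissing(new UrlBlobSource("https://example.invalid/container/blob-" + i));
            }
        } finally {
            LOG.removeHandler(capture);
        }
        return seen.size();
    }
}
```

In this model the warning count equals the call count, because the caller has no way to populate the size.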

To Reproduce

  1. Construct a BlobSourceInfo using the URL constructor:
    BlobSourceInfo blobSourceInfo = new BlobSourceInfo("https://<storage-account>.blob.core.windows.net/<container>/<blob>");
  2. Call ingestFromBlob:
    ingestClient.ingestFromBlob(blobSourceInfo, ingestionProperties);
  3. Observe the following WARN log emitted on every call:
    Blob 'https://...' was sent for ingestion without specifying its raw data size

Expected behavior
Either:

  • Option A: Expose a public setBlobExactSize(long size) setter on BlobSourceInfo
    so callers can provide the blob size when constructing via URL. We can obtain the size
    via Azure Storage SDK (BlobProperties.getBlobSize()) and would happily pass it if a
    setter were available.
  • Option B: Downgrade this log from WARN to DEBUG. A WARN implies something
    is wrong with the call; in this case the blob is ingested successfully regardless —
    the size is an optional optimization hint, not a correctness requirement. High-volume
    WARN logs that do not indicate a real problem make production monitoring harder.
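
To make the two proposals concrete, here is a minimal sketch of both on a stand-in class. SizedBlobSource and ProposalDemo are hypothetical names, not SDK classes, the setter signature is the one suggested in Option A, and java.util.logging's FINE level is used as an assumed equivalent of DEBUG. The actual SDK change could look quite different.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

// Hypothetical stand-in illustrating both proposals; NOT the SDK's BlobSourceInfo.
final class SizedBlobSource {
    private final String blobPath;
    private Long blobExactSize;

    SizedBlobSource(String blobPath) {
        this.blobPath = blobPath;
    }

    // Option A: a public setter so URL-based callers can pass a size they already know.
    public void setBlobExactSize(long size) {
        if (size < 0) {
            throw new IllegalArgumentException("size must be non-negative");
        }
        this.blobExactSize = size;
    }

    String getBlobPath() { return blobPath; }
    Long getBlobExactSize() { return blobExactSize; }
}

final class ProposalDemo {
    static final Logger LOG = Logger.getLogger("ProposalDemo");

    // Option B: a missing size is reported at FINE (debug-like) instead of WARNING.
    static void noteMissingSize(SizedBlobSource source) {
        if (source.getBlobExactSize() == null) {
            LOG.fine("Blob '" + source.getBlobPath()
                    + "' was sent for ingestion without specifying its raw data size");
        }
    }

    // Returns the levels of all records logged while handling one source.
    static List<Level> levelsLoggedFor(SizedBlobSource source) {
        List<Level> levels = new ArrayList<>();
        Handler capture = new Handler() {
            @Override public void publish(LogRecord entry) { levels.add(entry.getLevel()); }
            @Override public void flush() {}
            @Override public void close() {}
        };
        LOG.setUseParentHandlers(false); // keep the demo's console output quiet
        LOG.setLevel(Level.ALL);         // let FINE records reach the capture handler
        LOG.addHandler(capture);
        try {
            noteMissingSize(source);
        } finally {
            LOG.removeHandler(capture);
        }
        return levels;
    }
}
```

With Option A, a caller that has already read the size from BlobProperties.getBlobSize() would pass it through the setter and no message would be logged at all. With Option B alone, the message still exists but stays out of WARN-level monitoring.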

Screenshots
N/A

Setup (please complete the following information):

  • JRE Version: OpenJDK 17
  • SDK Version: kusto-ingest 7.0.6

Desktop (please complete the following information):

  • OS: Linux (production), macOS (development)
  • Version: Ubuntu 20.04 / macOS 14

Additional context

  • The warning originates in QueuedIngestClientImpl and fires on every call where
    blobExactSize is null on the BlobSourceInfo object.
  • Switching to fromFile() or fromStream() is not viable: our blobs already reside
    in Azure Storage, and downloading them locally before re-ingesting would double the
    network cost and add local disk pressure and significant latency, defeating the
    purpose of URL-based queued ingestion.
  • We checked BlobSourceInfo in SDK version 8.0.1 — setBlobExactSize() is still
    not publicly exposed, so upgrading does not resolve this.
  • Related class: com.microsoft.azure.kusto.ingest.source.BlobSourceInfo
