@phaethon

Description

This is an initial implementation of Starrocks support as a separate destination. It implements two load methods: Stream Load and INSERT INTO SELECT FROM FILES (the latter only if S3 compatible staging is used).
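
For readers unfamiliar with Stream Load: it is an HTTP PUT against the Starrocks frontend. The sketch below only builds the pieces of such a request and sends nothing. It is not code from this PR; the function name, the `dlt_` label prefix, and the default port 8030 are assumptions made for illustration.

```python
import base64
import json
import uuid


def build_stream_load_request(fe_host, database, table, user, password, rows, http_port=8030):
    """Return (url, headers, body) for a JSON Stream Load request; nothing is sent.

    Hypothetical helper for illustration, not the PR's implementation.
    """
    url = f"http://{fe_host}:{http_port}/api/{database}/{table}/_stream_load"
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    headers = {
        "Authorization": f"Basic {token}",
        "Expect": "100-continue",
        # A unique label lets the server reject an accidental duplicate load.
        "label": f"dlt_{uuid.uuid4().hex}",
        "format": "json",
        # The body is a JSON array with one object per row.
        "strip_outer_array": "true",
    }
    body = json.dumps(rows).encode("utf-8")
    return url, headers, body


url, headers, body = build_stream_load_request(
    "localhost", "demo", "events", "root", "", [{"id": 1, "name": "a"}]
)
```

The frontend usually answers with a redirect to a backend node, so a real client has to re-send the credentials after the redirect (the equivalent of `curl --location-trusted`).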

@netlify

netlify bot commented Apr 12, 2025

Deploy Preview for dlt-hub-docs ready!

Name Link
🔨 Latest commit 2e5e407
🔍 Latest deploy log https://app.netlify.com/projects/dlt-hub-docs/deploys/68b6efaaee22f000085c333a
😎 Deploy Preview https://deploy-preview-2518--dlt-hub-docs.netlify.app

@rudolfix
Collaborator

@phaethon thanks for this! code looks pretty good, I assume you run it in production somewhere? I have a few questions:

  1. We are not familiar with Starrocks. Is there an OSS version of it, or any other way to run tests against it?
  2. We can enable our standard tests on this. There will be a few issues to fix for sure. Do you still have time to work on it?
  3. Similar question regarding docs. Our docs are pretty standardized, and if you provide a basic version we can improve them ourselves.
  4. There's support for S3. Are other buckets like Azure Blob Storage supported?

thanks again!

@phaethon
Author

phaethon commented Apr 23, 2025

  1. Starrocks is a fork of Apache Doris. Yes, it has an OSS version. It normally needs at least two different types of nodes: frontend and backend. This image includes both node types in a single all-in-one container: https://hub.docker.com/r/starrocks/allin1-ubuntu
  2. Yes, I can contribute more code. Hints on where to find samples in other dlt destinations would be helpful. E.g. Starrocks has different types of tables (primary key, duplicate key, aggregate key), and it is currently not clear to me what kind of API would naturally fit dlt for specifying table types and their parameters.
  3. I will try to come up with at least a basic tutorial.
  4. Starrocks supports Azure, GCS, S3, and S3 compatible storage. My use case required the S3 compatible implementation; the others can be added.
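
To make the table-type question in point 2 concrete, here is one possible shape for such a hint, sketched under assumptions: a per-table key type that the destination turns into the key clause of the CREATE TABLE statement. The names `KeyType` and `render_key_clause` are hypothetical; nothing here is an existing dlt or PR API.

```python
from typing import Literal, Sequence

# Hypothetical hint values, mirroring the three table types named above.
KeyType = Literal["primary", "duplicate", "aggregate"]

_KEY_CLAUSE = {
    "primary": "PRIMARY KEY",
    "duplicate": "DUPLICATE KEY",
    "aggregate": "AGGREGATE KEY",
}


def render_key_clause(key_type: KeyType, key_columns: Sequence[str]) -> str:
    """Render the key clause of a CREATE TABLE statement for a table-type hint."""
    if key_type not in _KEY_CLAUSE:
        raise ValueError(f"unsupported table type: {key_type!r}")
    if not key_columns:
        raise ValueError("at least one key column is required")
    columns = ", ".join(f"`{name}`" for name in key_columns)
    return f"{_KEY_CLAUSE[key_type]} ({columns})"


clause = render_key_clause("primary", ["id", "tenant_id"])
```

A hint like this could be attached per resource, while type-specific parameters (for example aggregate functions on value columns) would need additional column-level hints that this sketch leaves out.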

@rudolfix
Collaborator

rudolfix commented Jun 5, 2025

@phaethon we went through several discussions about whether we can take this destination into our core, and we decided we do not have the resources to do that. first we'd need to enable it for all the standard tests, which is a pretty big amount of work, and then we'd need to maintain those tests and run them with every PR.
we'd also need to create documentation.

what we surely can do is add your repo to our docs alongside other destinations and provide instructions on how to use it. it would be clearly marked as community provided and we OFC can attribute you.

lmk if you want to do that.

@phaethon
Author

Yes, I would like this to be linked from the official docs if you see that as the best way to make it available to potential users. I don't have any commercial interest in this, but sooner or later there might be someone else to share maintenance with.

This raises a couple of practical questions from the user's perspective. Do I understand correctly that you would recommend installing the fork from my repo instead of the official dlt package? To install with pip, would the user need to point to the GitHub repo, or would it make sense to have a package named dlt-starrocks or similar? And to create and update documentation, would I need to open a PR against the official repo?

@Gunnnn Gunnnn left a comment

i like it! @rudolfix Please make it happen!

@Gunnnn

Gunnnn commented Jul 30, 2025

I want to participate and help work through this PR. @rudolfix Please point me to where I should start.

@phaethon
Author

phaethon commented Aug 8, 2025

@Gunnnn if you have ideas for contributions or Starrocks specific issues, you can add them to the fork https://github.com/phaethon/dlt (starrocks branch). If there is traction on the fork repo, it will be more likely to get integrated into the main repo over time. And feel free to ask questions about how to use it; we can base the first pieces of documentation on them.

@rudolfix
Collaborator

hey @phaethon! we're doing a cleanup of all community destinations that we are not merging into the core library. they'll get a docs page where we link to your repo. the proposal for the page is here:
#3326

pls tell me

  • if you still want us to link to your fork
  • if you want to change anything in there (you can also do a PR or request changes to comments)

we'll close this PR soon

@phaethon
Author

Yes, please add the link. I have no immediate comments on the proposed text. I suppose I can do a PR at a later stage, too, and if it is just a text update, it shouldn't be hard for anyone involved.
Meanwhile, I can confirm that I have been using this myself for the last half year, and for what it does, it works well for my own needs. I am going to update it soon for the latest dlt release.

@rudolfix rudolfix closed this Nov 21, 2025