-
Could you please specify what "expands" means? What do you want Airflow to do, and how? What is missing? We already have a way to add Java to the stack (https://airflow.apache.org/docs/docker-stack/recipes.html#apache-hadoop-stack-installation). Airflow also has multiple ways of running JVM tasks: DockerOperator, KubernetesPodOperator, @docker, and soon @kubernetes, to name a few. I'd love it if you could elaborate on precisely how "expand JVM" should work, and how that is better than, and relates to, the existing ways I described. What exactly do you mean by the "Minimal JVM support" you proposed? What should it consist of?
-
BTW, it's really funny to see you like your own post, with all the reactions coming from you.
-
I don't think the Hadoop stack installation is enough at all. I'd like to be able to declare a DAG in Java code exactly the same way we can declare a DAG in Python code today. As the most basic example, I'd like to be able to follow this docs page with Java code inside the blocks instead of Python. Why? Because it would enable a lot for people who use JVM languages and JVM libraries. The image-based solution is indeed an option, but it's not much different from running Python that executes jar files.
-
I'd like to know whether anybody thinks it would be possible to expand JVM support in Airflow into native support, so that one could write DAGs in Java that Airflow would build and compile, without having to write Python code whose only job is to run JVM-based code. Even if Airflow used Python behind the scenes to do this, the benefit could be huge. Even if Airflow just took the DAG and executed a build task with some default config parameters, it could be super useful to many people. Currently, lots of people use Airflow only for its scheduler, to manage JVM-based code that they launch from Python wrapper code, and they can't enjoy the real Airflow experience.
Minimal JVM support could open Airflow up to a whole new community, and at the same time bring Airflow the benefits of better integration with JVM-based libraries, which are already pretty common in Airflow usage. Suddenly people would be able to run parts of a DAG in Scala and other parts with PySpark, for example.
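To make the "Python code that just runs JVM-based code" pattern concrete, here is a minimal sketch of what such a wrapper usually boils down to. The jar path, class name, and arguments are made up for illustration, and `build_java_command` is a hypothetical helper, not part of Airflow.

```python
import shlex


def build_java_command(jar_path, main_class, args=()):
    """Build the shell command a Python task would run to launch JVM code."""
    # shlex.join quotes each part, so the result is safe to hand to a shell.
    return shlex.join(["java", "-cp", jar_path, main_class, *args])


# Hypothetical job: the jar path, class name, and argument are assumptions.
command = build_java_command(
    "/opt/jobs/etl.jar", "com.example.Etl", ["--date", "2024-01-01"]
)
print(command)
```

In a DAG, a string like this would typically be passed as `bash_command` to a `BashOperator`. Airflow then sees only one opaque shell task, and nothing of the structure inside the JVM job, which is the limitation this post is about.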