Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FLINK-37088][k8s]support flink applicaton taskmanager log mounting #25944

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

baimafeima-yf
Copy link

What is the purpose of the change

flink application support host mount log for the JobManager/TaskManager Container

Brief change log

  • add FlinkLogDecorator to the decorator chains in KubernetesJobManagerFactory and KubernetesTaskManagerFactory.

Verifying this change

This change is already covered by existing tests..

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): no
  • The public API, i.e., is any changed class annotated with @Public(Evolving): no
  • The serializers: no
  • The runtime per-record code paths (performance sensitive):no
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
  • The S3 file system connector: no

Documentation

  • Does this pull request introduce a new feature? yes
  • If yes, how is the feature documented?no--will add an intro doc in later PR.

@flinkbot
Copy link
Collaborator

flinkbot commented Jan 10, 2025

CI report:

Bot commands The @flinkbot bot supports the following commands:
  • @flinkbot run azure re-run the last Azure build



public static final ConfigOption<String> KUBERNETES_LOG_VOLUMES_MOUNT_MOUNTPATH =
key("kubernetes.log.volumemounts.mountpath")
Copy link
Contributor

@davidradl davidradl Jan 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not a k8s expert. I have some basic questions:

  • we should document these new config options.
  • are there not issues in supplying default locations to write logs in a PR, as suddenly there will be a lot of data written to locations that was previously not written. I would assume that /apps/log/flink and /opt/flink/log are folders and could get full - so size would need to be managed; maybe having criteria to delete / archive older logs. If there is new management required of these logs, then the considerations should be documented.
  • It would be helpful to describe in the Jira in more detail what the problem is and how this fixes it.
    • what is the life cycle of HostPathVolumeSource - it it associated with the pod, so are we trying to keep
      logs that would have disappeared with the container, in storage associated with the pod and we would
      lose the logs if the pods die.
  • I wonder what the impact is of enabling this for all containers after this PR.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you very much for your review. Yes, we can configure a container's PersistentVolume (PV) and PersistentVolumeClaim (PVC) to achieve log mounting.
However, when starting a Flink application via code invocation, we found that the log mounting was not effective. Therefore, we modified this section of the code to implement log mounting and added a toggle kubernetes.decorator.log-mount.enabled. If users encounter the same situation as us, they only need to enable this toggle to achieve log mounting; it is disabled by default.

@davidradl
Copy link
Contributor

Reviewed by Chi on 16/01/2025 Go back to the submitter with review comments.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants