Troubleshooting

When an application pod fails to start or behaves abnormally, the logs of the JuiceFS CSI Driver are usually the first place to look. How you view these logs depends on the CSI Driver version, as described below.

Check JuiceFS CSI Driver version

First, check the version of the JuiceFS CSI Driver installed in the current Kubernetes cluster with the following command:

kubectl -n kube-system get pod -l app=juicefs-csi-controller -o jsonpath="{.items[*].spec.containers[*].image}"

The command outputs something like juicedata/juicefs-csi-driver:v0.13.2, where the trailing tag v0.13.2 is the version of the JuiceFS CSI Driver.
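The version check can also be scripted. The following is a minimal sketch, not part of JuiceFS: the function names `image_tag` and `is_v0_10_or_later` are hypothetical, and it assumes the image reference ends in a `:vMAJOR.MINOR...` tag rather than a digest. Because the jsonpath expression selects every container in the pod, the output may list more than one image, so the usage comment filters for the driver image first.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration; they are not shipped with JuiceFS.

# Print the tag (the text after the last ':') of an image reference.
image_tag() {
  local image="$1"
  printf '%s\n' "${image##*:}"
}

# Succeed (exit status 0) if a version such as v0.13.2 is v0.10 or later.
is_v0_10_or_later() {
  local version="${1#v}"        # drop the leading "v"
  local major="${version%%.*}"  # text before the first dot
  local rest="${version#*.}"    # text after the first dot
  local minor="${rest%%.*}"     # text before the next dot
  [ "$major" -gt 0 ] || [ "$minor" -ge 10 ]
}

# Usage (requires cluster access):
#   for image in $(kubectl -n kube-system get pod -l app=juicefs-csi-controller \
#       -o jsonpath="{.items[*].spec.containers[*].image}"); do
#     case "$image" in
#       *juicefs-csi-driver*) image_tag "$image" ;;
#     esac
#   done
```

The result of `is_v0_10_or_later` indicates which of the two log-viewing procedures below applies.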

View JuiceFS CSI Driver logs

v0.10+

:::tip
It is recommended to continuously collect and store the logs of the JuiceFS Mount Pod to facilitate subsequent troubleshooting. For details, please refer to the "Collect Mount Pod Logs" document.
:::

Find mount pod

  1. Find the node where the pod is deployed. For example, if your pod is named juicefs-app:

    $ kubectl get pod juicefs-app -o wide
    NAME          READY   STATUS              RESTARTS   AGE   IP       NODE          NOMINATED NODE   READINESS GATES
    juicefs-app   0/1     ContainerCreating   0          9s    <none>   172.16.2.87   <none>           <none>

    From the output above, the node is 172.16.2.87.

  2. Find the volume ID of the PersistentVolume (PV) used by your pod.

    For example, if the PersistentVolumeClaim (PVC) used by your pod is named juicefs-pvc:

    $ kubectl get pvc juicefs-pvc
    NAME          STATUS   VOLUME       CAPACITY   ACCESS MODES   STORAGECLASS   AGE
    juicefs-pvc   Bound    juicefs-pv   10Pi       RWX                           42d

    From the output above, the name of the PV is juicefs-pv. Next, get the YAML of this PV:

    $ kubectl get pv -o yaml juicefs-pv
    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: juicefs-pv
      ...
    spec:
      ...
      csi:
        driver: csi.juicefs.com
        fsType: juicefs
        volumeHandle: juicefs-volume-abc
        ...

    From the output above, spec.csi.volumeHandle is the volume ID, i.e. juicefs-volume-abc.

  3. Find the JuiceFS mount pod by node name and volume ID. For example:

    $ kubectl -n kube-system get pod -l app.kubernetes.io/name=juicefs-mount -o wide | grep 172.16.2.87 | grep juicefs-volume-abc
    juicefs-172.16.2.87-juicefs-volume-abc   1/1     Running   0          20h    172.16.2.100   172.16.2.87   <none>           <none>

    From the output above, the name of the JuiceFS mount pod is juicefs-172.16.2.87-juicefs-volume-abc.
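The three lookup steps above can be chained so that nothing has to be copied by hand. This is a sketch under stated assumptions: the function names are hypothetical, the application pod and PVC live in the current namespace, and the mount pods run in kube-system as in the example. It reads the same fields shown in the example output, using `kubectl` jsonpath instead of visual inspection.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration; kubectl must be able to reach the cluster.

# Step 1: print the node that an application pod is scheduled on.
pod_node() {
  kubectl get pod "$1" -o jsonpath='{.spec.nodeName}'
}

# Step 2: print the volume ID (spec.csi.volumeHandle) of the PV bound to a PVC.
pvc_volume_id() {
  local pv
  pv="$(kubectl get pvc "$1" -o jsonpath='{.spec.volumeName}')"
  kubectl get pv "$pv" -o jsonpath='{.spec.csi.volumeHandle}'
}

# Step 3: list mount pods whose output line mentions both the node and the volume ID.
find_mount_pod() {
  local node="$1" volume_id="$2"
  kubectl -n kube-system get pod -l app.kubernetes.io/name=juicefs-mount -o wide \
    | grep -F -- "$node" | grep -F -- "$volume_id"
}

# Usage:
#   find_mount_pod "$(pod_node juicefs-app)" "$(pvc_volume_id juicefs-pvc)"
```

The fixed-string match (`grep -F`) keeps the dots in a node name such as 172.16.2.87 from being treated as regular-expression wildcards.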

Get logs of mount pod

  1. Get the JuiceFS mount pod logs. For example:

    kubectl -n kube-system logs juicefs-172.16.2.87-juicefs-volume-abc

  2. Look for log lines that contain WARNING, ERROR, or FATAL.
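Scanning for those three levels is easy to automate with `grep`. The sketch below assumes the level names appear literally in each log line; the function name `filter_problem_logs` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: keep only lines that contain WARNING, ERROR, or FATAL.
# Reads log text on standard input and writes matching lines to standard output.
filter_problem_logs() {
  grep -E 'WARNING|ERROR|FATAL'
}

# Usage:
#   kubectl -n kube-system logs juicefs-172.16.2.87-juicefs-volume-abc | filter_problem_logs
```

Like `grep` itself, the function exits with a non-zero status when no line matches, which is convenient in scripts that should only react when a problem is logged.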

Before v0.10

  1. Find the node where the pod is deployed. For example, if your pod is named juicefs-app:

    $ kubectl get pod juicefs-app -o wide
    NAME          READY   STATUS              RESTARTS   AGE   IP       NODE          NOMINATED NODE   READINESS GATES
    juicefs-app   0/1     ContainerCreating   0          9s    <none>   172.16.2.87   <none>           <none>

    From the output above, the node is 172.16.2.87.

  2. Find the JuiceFS CSI driver pod on the same node. For example:

    $ kubectl describe node 172.16.2.87 | grep juicefs-csi-node
    kube-system                 juicefs-csi-node-hzczw                  1 (0%)        2 (1%)      1Gi (0%)         5Gi (0%)       61m

    From the output above, the JuiceFS CSI driver pod name is juicefs-csi-node-hzczw.

  3. Get the JuiceFS CSI driver logs. For example:

    kubectl -n kube-system logs juicefs-csi-node-hzczw -c juicefs-plugin

  4. Look for log lines that contain WARNING, ERROR, or FATAL.
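As an alternative to grepping the `kubectl describe node` output, the pod lookup and log retrieval can be sketched with a field selector. The function names are hypothetical, and the sketch assumes the driver pod runs in kube-system with a name containing `juicefs-csi-node`, as in the example above.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration; kubectl must be able to reach the cluster.

# Print the name of the JuiceFS CSI driver pod scheduled on the given node.
csi_node_pod() {
  kubectl -n kube-system get pod --field-selector "spec.nodeName=$1" \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\n"}{end}' \
    | grep juicefs-csi-node
}

# Print the juicefs-plugin container logs of that pod.
csi_node_logs() {
  local pod
  pod="$(csi_node_pod "$1" | head -n 1)"
  [ -n "$pod" ] || return 1   # no driver pod found on this node
  kubectl -n kube-system logs "$pod" -c juicefs-plugin
}

# Usage:
#   csi_node_logs 172.16.2.87 | grep -E 'WARNING|ERROR|FATAL'
```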

Diagnosis script

You can also use the diagnosis script to collect logs and related information.

  1. Download the diagnosis script to a node where kubectl can be run.

    wget https://raw.githubusercontent.com/juicedata/juicefs-csi-driver/master/scripts/diagnose.sh

  2. Add execute permission to the script.

    chmod a+x diagnose.sh

  3. Collect diagnostic information using the script. For example, if your JuiceFS CSI Driver is deployed in the kube-system namespace and you want to collect information from the node named kube-node-2:

    $ ./diagnose.sh
    Usage:
        ./diagnose.sh COMMAND [OPTIONS]
    COMMAND:
        help
            Display this help message.
        collect
            Collect pods logs of juicefs.
    OPTIONS:
        -no, --node name
            Set the name of node.
        -n, --namespace name
            Set the namespace of juicefs csi driver.
    
    $ ./diagnose.sh -n kube-system -no kube-node-2 collect
    Start collecting, node-name=kube-node-2, juicefs-namespace=kube-system
    ...
    please get diagnose_juicefs_1628069696.tar.gz for diagnostics

    All relevant information is collected and packaged in a gzip-compressed tar archive (.tar.gz) in the directory where the script was run.
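To inspect the collected files, the archive can be unpacked with `tar`. The sketch below makes no assumption about what the archive contains; the function name `unpack_diagnose_archive` and the default output directory `diagnose_output` are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: unpack a diagnose archive and list the extracted files.
#   $1  path to the .tar.gz archive
#   $2  destination directory (optional, defaults to ./diagnose_output)
unpack_diagnose_archive() {
  local archive="$1" dest="${2:-diagnose_output}"
  mkdir -p "$dest" && tar -xzf "$archive" -C "$dest" && find "$dest" -type f
}

# Usage:
#   unpack_diagnose_archive diagnose_juicefs_1628069696.tar.gz
```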