diff --git a/docs/ceph_archive/mount_ceph_volume.md b/docs/ceph/mount_ceph_volume.md
similarity index 100%
rename from docs/ceph_archive/mount_ceph_volume.md
rename to docs/ceph/mount_ceph_volume.md
diff --git a/docs/ceph_archive/performance/ceph_benchmark.md b/docs/ceph/performance/ceph_benchmark.md
similarity index 100%
rename from docs/ceph_archive/performance/ceph_benchmark.md
rename to docs/ceph/performance/ceph_benchmark.md
diff --git a/docs/kubespray_offline_installation/README.md b/docs/kubespray_offline_installation/README.md
deleted file mode 100644
index 8c6e41c..0000000
--- a/docs/kubespray_offline_installation/README.md
+++ /dev/null
@@ -1,265 +0,0 @@
-# How to deploy Kubernetes cluster with Kubespray
-> Kubespray is an open-source project used to deploy a production-ready Kubernetes cluster by using Ansible. [Read more](https://kubespray.io/#/)
-
-## Ansible prerequisites
-Since **Kubespray** uses **Ansible** to deploy a Kubernetes cluster, you first
-need to give Ansible password-less access to every node of the cluster.
-* Enable **SSH** access without a passphrase
-  - Create an authentication key pair for SSH
-    ```
-    [tmax@c30 ~]$ ssh-keygen
-    Generating public/private rsa key pair.
-    Enter file in which to save the key (/home/tmax/.ssh/id_rsa):
-    Created directory '/home/tmax/.ssh'.
-    Enter passphrase (empty for no passphrase):
-    Enter same passphrase again:
-    Your identification has been saved in /home/tmax/.ssh/id_rsa.
-    Your public key has been saved in /home/tmax/.ssh/id_rsa.pub.
-    The key fingerprint is:
-    SHA256:FUnSTSKhFhz//IjmvI+JIqtH2QIyuYSIpx7VyNRIowI tmax@c30
-    The key's randomart image is:
-    +---[RSA 2048]----+
-    |E .oo.o.++++.    |
-    |. .o...+ ooo.    |
-    |=oo o o .  .     |
-    |X..+ o +         |
-    |o*.o S o         |
-    |o.+ . . o        |
-    |.... o . .       |
-    | .o . = o        |
-    |.o.o  .. *o.     |
-    +----[SHA256]-----+
-    ```
-
-  - Copy the public key to each node
-    ```
-    [tmax@c30 ~]$ ssh-copy-id -i ~/.ssh/id_rsa tmax@192.168.0.31
-    /usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/home/tmax/.ssh/id_rsa.pub"
-    The authenticity of host '192.168.0.31 (192.168.0.31)' can't be established.
-    ECDSA key fingerprint is SHA256:BYEB4B9fYUgzYs52jih/eDn3GkibncvdcTM6kUvha+s.
-    ECDSA key fingerprint is MD5:69:6f:cf:50:4f:13:1a:91:1a:e1:8f:0d:7a:5f:69:ee.
-    Are you sure you want to continue connecting (yes/no)? yes
-    /usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
-    /usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
-    tmax@192.168.0.31's password:
-
-    Number of key(s) added: 1
-
-    Now try logging into the machine, with: "ssh 'tmax@192.168.0.31'"
-    and check to make sure that only the key(s) you wanted were added.
-    ```
-
-* Enable **sudo** privileges without a password on all nodes (a quick way to verify both settings is sketched below)
-  ```shell
-  sudo -i
-  echo 'tmax ALL=(ALL) NOPASSWD:ALL' > /etc/sudoers.d/tmax
-  # change 'tmax' to your username
-  ```
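-
-  Before moving on, you can check that Ansible will be able to reach every node without prompting. This is a minimal sketch; the user name `tmax` and the node IPs are the example values used throughout this guide.
-  ```shell
-  # Should print "<hostname>: ok" for each node, with no password prompts
-  for ip in 192.168.0.31 192.168.0.32 192.168.0.33; do
-    ssh -o BatchMode=yes tmax@"${ip}" 'sudo -n true && echo "${HOSTNAME}: ok"'
-  done
-  ```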
-
-# Online Environment
-In an **online** environment, where all of your cluster nodes can access the
-Internet, deploying a Kubernetes cluster can be done with just a few commands.
-
-* Get Kubespray from GitHub
-  ```shell
-  git clone https://github.com/kubernetes-sigs/kubespray.git
-  cd kubespray
-  ```
-
-* Install dependencies from `requirements.txt`
-  ```shell
-  sudo pip3 install -r requirements.txt
-  ```
-
-* Copy `inventory/sample` as `inventory/mycluster`
-  ```shell
-  cp -rfp inventory/sample inventory/mycluster
-  ```
-
-* Update the Ansible inventory file with the inventory builder
-  ```shell
-  declare -a IPS=(192.168.0.31 192.168.0.32 192.168.0.33)
-  CONFIG_FILE=inventory/mycluster/hosts.yaml python3 contrib/inventory_builder/inventory.py ${IPS[@]}
-  ```
-
-* Review and change the parameters under `inventory/mycluster/group_vars` to deploy your desired cluster
-  ```shell
-  cat inventory/mycluster/group_vars/all/all.yml
-  cat inventory/mycluster/group_vars/k8s-cluster/k8s-cluster.yml
-  ```
-
-* Deploy the Kubernetes cluster
-  ```shell
-  ansible-playbook -i inventory/mycluster/hosts.yaml --become --become-user=root cluster.yml
-  ```
-
-# Offline Environment
-In an **offline** environment, where your cluster nodes cannot access the
-Internet, deploying a Kubernetes cluster can be complicated and troublesome.
-
-* To deploy a Kubernetes cluster, you will need to:
-  1. **Install Kubespray's required packages** (Python packages)
-     > Packages (requirements.txt): ansible==2.7.16, jinja2==2.10.1, netaddr==0.7.19,
-     pbr==5.2.0, hvac==0.8.2, jmespath==0.9.4, ruamel.yaml==0.15.96
-
-  2. **Install Kubernetes' required packages** (deb for Ubuntu, rpm for CentOS)
-     > Packages: docker, python-apt, aufs-tools, apt-transport-https, software-properties-common, ebtables, etc...
-
-  3. **Download binary files and Docker images**
-     > Binaries: kubeadm, kubelet, kubectl, etcd, etcdctl, calicoctl, etc...
-
-     > Images: kube-proxy, kube-apiserver, kube-controller-manager, kube-scheduler,
-     etcd, pause, calico_node, calico_cni, calico_kube-controllers, k8s-dns-node-cache, coredns, nginx, etc...
-
-  4. **Modify parameter variables**
-     > Modify some parameters to install from the cache or a private server
-
-  5. **Execute Ansible-playbook**
-     > Deploy or reset the Kubernetes cluster
-
-## 1. Install Kubespray's required packages (Python packages)
-To install Python packages on an offline node, you first need to download all required Python packages and their dependencies, copy them to the node from which you will run *ansible-playbook* to deploy the Kubernetes cluster, and install them manually.
-> Assume that all required Python packages and dependencies are already downloaded (one way to prepare them is sketched at the end of this section).
-[Read more](pip_install_kubespray_requirements.md)
-
-* Installing the packages listed in `requirements.txt` from a local directory
-  ```shell
-  sudo pip3 install --no-index --find-links=/path/to/pkg/ -r requirements.txt
-  ```
-* In case you want to install from a private web server (example: 192.168.0.200)
-  ```shell
-  sudo pip3 install --index-url http://192.168.0.200/pip-pkg/ -r requirements.txt
-  ```
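-
-* One way to prepare that local directory, assuming you have a machine with Internet access and the same Python version as the target node:
-  ```shell
-  # On the online machine: fetch the requirements plus all their dependencies
-  pip3 download -r requirements.txt -d /path/to/pkg/
-  # Then copy /path/to/pkg/ to the offline node (USB, scp over the private network, ...)
-  ```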
-
-## 2. Install Kubernetes' required packages
-To install packages on an offline node, you first need to download all required packages and dependencies, copy them to all nodes, and then install them manually. However, I personally recommend creating a **local repository**.
-> Assume that all required packages and dependencies are already downloaded.
-
-* Installing **deb** packages on Ubuntu
-  ```shell
-  sudo dpkg -i *.deb
-  ```
-  If you want to create an Ubuntu local repository, [read this](create_ubuntu_repository.md)
-
-* Installing **rpm** packages on CentOS
-  ```shell
-  sudo rpm -ivh *.rpm
-  ```
-  If you want to create a CentOS local repository, [read this](create_centos_repository.md)
-
-## 3. Download binary files and Docker images
-When we execute *ansible-playbook* to deploy the Kubernetes cluster, Kubespray downloads binary files (`kubeadm`, `kubelet`, `kubectl`, `etcdctl`, `calicoctl`, ...) and pulls container images (`kube-proxy`, `kube-apiserver`, `kube-controller-manager`, `kube-scheduler`, `etcd`, ...) to initialize the cluster. However, this will fail because the nodes have no Internet connection to download them from the official URLs.
-
-To solve this problem, you have two choices:
-  1. [**Cache**] Download all required *binary* files and store them in the `download_cache_dir` location. For *Docker images*, pull and save the container images as `tar` files and store them in the `download_cache_dir`/images location. [Read more](kubespray_offline_with_cache.md)
-
-  2. [**Local Webserver**] Download all required *binary* files and upload them to a local web server. For *Docker* images, create a local Docker registry,
-  pull the required images from the official registries, and push them to the local registry.
-  [Read more](kubespray_offline_with_private_server.md)
-
-
-### Binary files
-Ensure that you have downloaded the correct **version** of each binary, as declared in `roles/download/defaults/main.yml`.
-```yaml
-download_cache_dir: /tmp/kubespray_cache
-
-# Arch of Docker images and needed packages
-image_arch: "{{host_architecture | default('amd64')}}"
-
-# Versions
-kube_version: v1.17.2
-etcd_version: v3.3.12
-# More...
-
-# Binary File Download URLs
-kubelet_download_url: "https://storage.googleapis.com/kubernetes-release/release/{{ kube_version }}/bin/linux/{{ image_arch }}/kubelet"
-kubectl_download_url: "https://storage.googleapis.com/kubernetes-release/release/{{ kube_version }}/bin/linux/{{ image_arch }}/kubectl"
-kubeadm_download_url: "https://storage.googleapis.com/kubernetes-release/release/{{ kube_version }}/bin/linux/{{ image_arch }}/kubeadm"
-etcd_download_url: "https://github.com/coreos/etcd/releases/download/{{ etcd_version }}/etcd-{{ etcd_version }}-linux-{{ image_arch }}.tar.gz"
-cni_download_url: "https://github.com/containernetworking/plugins/releases/download/{{ cni_version }}/cni-plugins-linux-{{ image_arch }}-{{ cni_version }}.tgz"
-calicoctl_download_url: "https://github.com/projectcalico/calicoctl/releases/download/{{ calico_ctl_version }}/calicoctl-linux-{{ image_arch }}"
-crictl_download_url: "https://github.com/kubernetes-sigs/cri-tools/releases/download/{{ crictl_version }}/crictl-{{ crictl_version }}-{{ ansible_system | lower }}-{{ image_arch }}.tar.gz"
-```
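-
-For example, the kubelet entry above expands to the following download. This is only a sketch for one binary; the cache file name mirrors the `download_cache_dir` listing shown later in this document.
-```shell
-VERSION=v1.17.2; ARCH=amd64
-mkdir -p /tmp/kubespray_cache
-# Fetch kubelet and record its checksum for later verification
-wget -O /tmp/kubespray_cache/kubelet-${VERSION}-${ARCH} \
-  "https://storage.googleapis.com/kubernetes-release/release/${VERSION}/bin/linux/${ARCH}/kubelet"
-sha256sum /tmp/kubespray_cache/kubelet-${VERSION}-${ARCH}
-```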
-
-### Docker images
-Ensure that you have downloaded the correct **tag** of each image, as declared in `roles/download/defaults/main.yml`.
-```yaml
-# image repo define
-gcr_image_repo: "gcr.io"
-kube_image_repo: "k8s.gcr.io"
-docker_image_repo: "docker.io"
-quay_image_repo: "quay.io"
-
-# Container image name and tag
-kube_proxy_image_repo: "{{ kube_image_repo }}/kube-proxy"
-kube_proxy_image_tag: "{{ kube_version }}"
-etcd_image_repo: "{{ quay_image_repo }}/coreos/etcd"
-etcd_image_tag: "{{ etcd_version }}{%- if image_arch != 'amd64' -%}-{{ image_arch }}{%- endif -%}"
-calico_node_image_repo: "{{ docker_image_repo }}/calico/node"
-calico_node_image_tag: "{{ calico_version }}"
-coredns_image_repo: "{{ docker_image_repo }}/coredns/coredns"
-coredns_image_tag: "1.6.7"
-# More...
-```
-## 4. Modify parameter variables
-After you have downloaded the binary files and container images, you will need to
-change some **variables** to tell Kubespray to install in offline mode.
-Those variables can be modified in `roles/download/defaults/main.yml`.
-
-> It is recommended to add or modify these variables in `inventory/mycluster/group_vars/k8s-cluster/k8s-cluster.yml` instead, because values there override the defaults in `roles/download/defaults/main.yml`.
-
-* **If you choose to install from cache**, [Read more](kubespray_offline_with_cache.md)
-  ```yaml
-  # Download cache directory
-  download_cache_dir: /tmp/kubespray_cache
-  # Run download binary files and container images only once
-  download_run_once: true
-  # Use the local_host for download_run_once mode
-  download_localhost: true
-  ```
-
-* **If you choose to install from a local web server**, [Read more](kubespray_offline_with_private_server.md)
-  ```yaml
-  # Download binary URL
-  hcs_url: "http://192.168.0.200/binary"
-  kubelet_download_url: "{{ hcs_url }}/storage.googleapis.com/kubernetes-release/release/{{ kube_version }}/bin/linux/{{ image_arch }}/kubelet"
-  kubectl_download_url: "{{ hcs_url }}/storage.googleapis.com/kubernetes-release/release/{{ kube_version }}/bin/linux/{{ image_arch }}/kubectl"
-  kubeadm_download_url: "{{ hcs_url }}/storage.googleapis.com/kubernetes-release/release/{{ kube_version }}/bin/linux/{{ image_arch }}/kubeadm"
-  etcd_download_url: "{{ hcs_url }}/github.com/coreos/etcd/releases/download/{{ etcd_version }}/etcd-{{ etcd_version }}-linux-{{ image_arch }}.tar.gz"
-  cni_download_url: "{{ hcs_url }}/github.com/containernetworking/plugins/releases/download/{{ cni_version }}/cni-plugins-linux-{{ image_arch }}-{{ cni_version }}.tgz"
-  calicoctl_download_url: "{{ hcs_url }}/github.com/projectcalico/calicoctl/releases/download/{{ calico_ctl_version }}/calicoctl-linux-{{ image_arch }}"
-  crictl_download_url: "{{ hcs_url }}/github.com/kubernetes-sigs/cri-tools/releases/download/{{ crictl_version }}/crictl-{{ crictl_version }}-{{ ansible_system | lower }}-{{ image_arch }}.tar.gz"
-
-  # Docker registry
-  docker_insecure_registries:
-    - 192.168.0.200:5000
-  hcs_image_repo: "192.168.0.200:5000"
-  docker_image_repo: "{{ hcs_image_repo }}"
-  quay_image_repo: "{{ hcs_image_repo }}"
-  gcr_image_repo: "{{ hcs_image_repo }}"
-  kube_image_repo: "{{ gcr_image_repo }}/google-containers"
-  ```
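-
-Before running the playbook in web-server mode, it is worth checking that the endpoints configured above are reachable from a cluster node. A minimal sketch, using the example server `192.168.0.200` from this guide:
-```shell
-# The binary web server should answer with HTTP 200 for a known file
-curl -sI "http://192.168.0.200/binary/storage.googleapis.com/kubernetes-release/release/v1.17.2/bin/linux/amd64/kubelet" | head -n 1
-# The Docker registry should list its repositories
-curl -s "http://192.168.0.200:5000/v2/_catalog"
-```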
-
-## 5. Execute Ansible-playbook
-* To deploy the cluster:
-  ```shell
-  ansible-playbook -i inventory/mycluster/hosts.yaml \
-      --become --become-user=root cluster.yml
-  ```
-
-* To reset the cluster:
-  ```shell
-  ansible-playbook -i inventory/mycluster/hosts.yaml \
-      --become --become-user=root reset.yml
-  ```
-
-## Related documents
-* [How to deploy Kubernetes cluster in offline environment with cache (Kubespray)](kubespray_offline_with_cache.md)
-* [How to deploy Kubernetes cluster in offline environment with local WebServer (Kubespray)](kubespray_offline_with_private_server.md)
-* [How to create local repository (Ubuntu)](create_ubuntu_repository.md)
-* [How to create local repository (CentOS)](create_centos_repository.md)
-* [How to create local registry (Docker)](create_docker_registry.md)
-* [How to install Python packages in offline environment (pip3 install)](pip_install_kubespray_requirements.md)
diff --git a/docs/kubespray_offline_installation/create_centos_repository.md b/docs/kubespray_offline_installation/create_centos_repository.md
deleted file mode 100644
index fa46f50..0000000
--- a/docs/kubespray_offline_installation/create_centos_repository.md
+++ /dev/null
@@ -1,166 +0,0 @@
-# How to create a local Yum repository
-> Tested on CentOS 7
-
-## Requirements
-  1. Access to a user account with **root** or **sudo** privileges
-  2. Packages:
-     * yum: Yellowdog Updater Modified (installed by default)
-     * yum-utils: utilities based around the yum package manager
-     * httpd: web server (Apache)
-     * createrepo: a tool used to create a yum repository
-  3. rpm package files ({package_name}.rpm)
-
-## Installing the Required Packages
-  ```shell
-  sudo yum install yum-utils
-  sudo yum install httpd
-  sudo yum install createrepo
-  ```
-
-## Create the Repository Directory Structure
-* Create a directory for an HTTP repository:
-  ```shell
-  sudo mkdir -p /var/www/html/hcs-yum/packages
-  ```
-
-* Move your rpm files (.rpm) to the repository package directory
-  ```shell
-  sudo mv /path/to/my-packages/*.rpm /var/www/html/hcs-yum/packages
-  ```
-
-## [Optional] Synchronize HTTP repositories (mirror)
-> You can download a local copy of the original official CentOS
-repositories to your server by using the `reposync` command.
-
-* Create directories to store the repositories
-  ```shell
-  sudo mkdir -p /var/www/html/hcs-yum/{base,centosplus,extras,updates}
-  ```
-
-* To download the official CentOS **base** repository:
-  ```shell
-  sudo reposync -g -l -d -m --repoid=base --newest-only --download-metadata --download_path=/var/www/html/hcs-yum/
-  ```
-
-* To download the official CentOS **centosplus** repository:
-  ```shell
-  sudo reposync -g -l -d -m --repoid=centosplus --newest-only --download-metadata --download_path=/var/www/html/hcs-yum/
-  ```
-
-* To download the official CentOS **extras** repository:
-  ```shell
-  sudo reposync -g -l -d -m --repoid=extras --newest-only --download-metadata --download_path=/var/www/html/hcs-yum/
-  ```
-
-* To download the official CentOS **updates** repository:
-  ```shell
-  sudo reposync -g -l -d -m --repoid=updates --newest-only --download-metadata --download_path=/var/www/html/hcs-yum/
-  ```
-
-* In the previous commands, the options are as follows:
-  ```shell
-  -g – lets you remove or uninstall packages on CentOS that fail a GPG check
-  -l – yum plugin support
-  -d – lets you delete local packages that no longer exist in the repository
-  -m – lets you download comps.xml files, useful for bundling groups of packages by function
-  --repoid – specify repository ID
-  --newest-only – only download the latest package version, helps manage the size of the repository
-  --download-metadata – download non-default metadata
-  --download_path – specifies the location to save the packages
-  ```
-
-## Create the Repository
-* Use the **createrepo** utility to create the repository
-  ```shell
-  sudo createrepo /var/www/html/hcs-yum
-  ```
-
-## Make the Apache HTTP Server accessible
-* Start the HTTP service
-  ```shell
-  sudo systemctl restart httpd
-  ```
-
-* Enable the HTTP service to start automatically on system boot
-  ```shell
-  sudo systemctl enable httpd
-  ```
-
-* Check all the allowed services
-  ```shell
-  sudo firewall-cmd --list-all
-  ```
-
-* Check whether the http service is enabled
-  ```shell
-  [root@c30 html]# sudo firewall-cmd --list-all
-  public (active)
-    target: default
-    icmp-block-inversion: no
-    interfaces: enp0s3
-    sources:
-    services: dhcpv6-client ssh
-    ports:
-    protocols:
-    masquerade: no
-    forward-ports:
-    source-ports:
-    icmp-blocks:
-    rich rules:
-  ```
-  Only the **dhcpv6-client** and **ssh** services are enabled;
-  the **http** service (or port **80**) needs to be enabled as well.
-
-* Add the HTTP service and/or port 80
-  ```shell
-  sudo firewall-cmd --add-service=http --permanent
-  sudo firewall-cmd --add-port=80/tcp --permanent
-  ```
-
-* Reload firewalld to apply the changes
-  ```shell
-  sudo firewall-cmd --reload
-  ```
-* Check that the http service / port 80 is enabled
-  ```shell
-  [root@c30 html]# sudo firewall-cmd --list-all
-  public (active)
-    target: default
-    icmp-block-inversion: no
-    interfaces: enp0s3
-    sources:
-    services: dhcpv6-client http ssh
-    ports: 80/tcp
-    protocols:
-    masquerade: no
-    forward-ports:
-    source-ports:
-    icmp-blocks:
-    rich rules:
-  ```
-
-## On the client system, set up the local yum repository
-* Prevent **yum** from downloading from the wrong location
-  ```shell
-  mkdir -p /tmp/yum_repo_backup
-  sudo mv /etc/yum.repos.d/*.repo /tmp/yum_repo_backup/
-  ```
-
-* Create a new repository config file:
-  ```shell
-  cat << EOF > /etc/yum.repos.d/hcs.repo
-  [hcs]
-  name=HyperCloud-sds Repository
-  baseurl=http://192.168.0.30/hcs-yum
-  enabled=1
-  gpgcheck=0
-  EOF
-
-  # change '192.168.0.30' to your server IP address
-  ```
-
-* Install new packages
-  ```shell
-  sudo yum clean all
-  sudo yum update
-  sudo yum install MY_PACKAGE_NAME
-  ```
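-
-* To confirm that the client is actually resolving packages from the new repository, a quick check (the repo id `hcs` comes from the config file above):
-  ```shell
-  # The 'hcs' repository should appear with a non-zero package count
-  yum repolist enabled
-  # Show which repository a given package would come from
-  yum info MY_PACKAGE_NAME | grep -i repo
-  ```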
diff --git a/docs/kubespray_offline_installation/create_docker_registry.md b/docs/kubespray_offline_installation/create_docker_registry.md
deleted file mode 100644
index ed583f7..0000000
--- a/docs/kubespray_offline_installation/create_docker_registry.md
+++ /dev/null
@@ -1,81 +0,0 @@
-# Create Docker registry
-> Tested on Ubuntu-server 18.04.3 LTS (bionic)
-
-## Setup Docker environment
-* Install packages to allow apt to use a repository over HTTP/HTTPS
-```shell
-apt-get update && apt-get install -y \
-  apt-transport-https ca-certificates curl software-properties-common gnupg2
-```
-* Add Docker's official GPG key
-```shell
-curl -fsSL https://download.docker.com/linux/ubuntu/gpg | apt-key add -
-```
-
-* Add the Docker apt repository.
-```shell
-add-apt-repository \
-  "deb [arch=amd64] https://download.docker.com/linux/ubuntu \
-  $(lsb_release -cs) stable"
-```
-* Install Docker CE.
-```shell
-apt-get update && apt-get install -y \
-  containerd.io=1.2.10-3 \
-  docker-ce=5:19.03.4~3-0~ubuntu-$(lsb_release -cs) \
-  docker-ce-cli=5:19.03.4~3-0~ubuntu-$(lsb_release -cs)
-```
-* Set up the daemon.
-```shell
-IP=$(hostname -I | cut -d' ' -f1)
-# Allow plain-HTTP access to this host's registry (it will listen on port 5000, see below)
-cat > /etc/docker/daemon.json <<EOF
-{
-  "insecure-registries": ["${IP}:5000"]
-}
-EOF
-```
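-
-After changing `daemon.json`, restart Docker and confirm the setting took effect. A short sketch, assuming systemd:
-```shell
-sudo systemctl daemon-reload
-sudo systemctl restart docker
-# The registry address should appear under "Insecure Registries"
-docker info | grep -A1 "Insecure Registries"
-```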
-
-## Run the registry
-> Assume the server IP is `192.168.0.200`.
-
-* Start the Docker registry
-```shell
-sudo docker run -it -d -p 5000:5000 \
-  -v ~/docker_images:/var/lib/registry \
-  --name hcs-registry registry:latest
-```
-* Check the Docker registry
-```shell
-curl -X GET http://192.168.0.200:5000/v2/_catalog
-```
-* Pull some images from the original hub
-```shell
-sudo docker pull gcr.io/google-containers/kube-proxy:v1.17.2
-```
-* Tag the image so that it points to our registry
-```shell
-sudo docker tag gcr.io/google-containers/kube-proxy:v1.17.2 \
-  192.168.0.200:5000/google-containers/kube-proxy:v1.17.2
-```
-* Push the image to our registry
-```shell
-sudo docker push 192.168.0.200:5000/google-containers/kube-proxy:v1.17.2
-```
-* Check the Docker registry again
-```shell
-curl -X GET 192.168.0.200:5000/v2/_catalog
-curl -X GET 192.168.0.200:5000/v2/google-containers/kube-proxy/tags/list
-```
diff --git a/docs/kubespray_offline_installation/create_ubuntu_repository.md b/docs/kubespray_offline_installation/create_ubuntu_repository.md
deleted file mode 100644
index 85a2204..0000000
--- a/docs/kubespray_offline_installation/create_ubuntu_repository.md
+++ /dev/null
@@ -1,91 +0,0 @@
-# How to Create an Authenticated Repository
-> Tested on Ubuntu-server 18.04.3 LTS (bionic)
-
-## Requirements
-  1. Packages: apt-utils (installed by default), dpkg-dev, a web server (apache2 or
-  nginx), and dpkg-sig
-  2. A base directory for the repository
-  3. .deb files (deb package files)
-
-## Installing the Required Packages
-```shell
-sudo apt-get install dpkg-dev
-sudo apt-get install apache2
-sudo apt-get install dpkg-sig
-```
-
-## Create the Repository Directory Structure
-* When using the apache2 web server, the repository must be created under the `/var/www/html` directory.
-```shell
-sudo mkdir -p /var/www/html/hcs-repo/binary
-```
-* [OR] You can use a symbolic link instead.
-```shell
-sudo ln -s ~/repo-dir /var/www/html/repo-dir
-```
-* Move the .deb package files into the binary directory
-```shell
-sudo mv /path/to/my/Packages.deb /var/www/html/hcs-repo/binary
-```
-
-## Authenticating the Repository and Packages
-* Create a GPG key pair
-```shell
-gpg --gen-key
-# Input Real name, Email, and passphrase
-```
-* Check the GPG key
-```shell
-gpg --list-keys
-# Output:
-#/home/tmax/.gnupg/pubring.kbx
-#-----------------------------
-#pub   rsa3072 2020-03-12 [SC] [expires: 2022-03-12]
-#      9359DA7C2594A5C90E90421E1965FFAEB9D75E4B
-#uid           [ultimate] hcs-ck34
-#sub   rsa3072 2020-03-12 [E] [expires: 2022-03-12]
-```
-* Export the public key
-```shell
-sudo gpg --output GPGkey --armor --export 9359DA7C2594A5C90E90421E1965FFAEB9D75E4B
-```
-* Copy the public key (GPGkey) into the repository directory
-```shell
-sudo mv GPGkey /var/www/html/hcs-repo/GPGkey
-```
-* Change the ownership of the directory structure
-```shell
-sudo chown -R tmax:tmax /var/www/html/hcs-repo
-```
-* In the same directory as the .deb files, create the `Packages` and `Packages.gz` index files
-```shell
-cd /var/www/html/hcs-repo/binary
-apt-ftparchive packages . > Packages
-gzip -c Packages > Packages.gz
-```
-* In the same directory as the .deb files, create the `Release`, `InRelease`, and `Release.gpg` files
-```shell
-cd /var/www/html/hcs-repo/binary
-apt-ftparchive release . > Release
-gpg --yes --clearsign -o InRelease Release
-gpg --yes -abs -o Release.gpg Release
-```
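-* You can verify the signatures before publishing; a quick sketch using the key generated above:
-```shell
-cd /var/www/html/hcs-repo/binary
-# Verify the detached and the clear-signed release signatures
-gpg --verify Release.gpg Release
-gpg --verify InRelease
-```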
-
-## On the Client node
-* Add our repository URL (assume the repository server IP is `192.168.0.200`.)
-```shell
-echo 'deb http://192.168.0.200/hcs-repo/binary /' >> /etc/apt/sources.list
-```
-* Download and add the repository's public key (GPGkey)
-```shell
-wget -O - http://192.168.0.200/hcs-repo/GPGkey | sudo apt-key add -
-```
-* Check the added public key
-```shell
-apt-key list
-```
-* Update and install packages
-```shell
-sudo apt update
-sudo apt install my_package
-```
diff --git a/docs/kubespray_offline_installation/download_binary.sh b/docs/kubespray_offline_installation/download_binary.sh
deleted file mode 100755
index bd85a09..0000000
--- a/docs/kubespray_offline_installation/download_binary.sh
+++ /dev/null
@@ -1,110 +0,0 @@
-#!/bin/bash
-set -eo pipefail
-
-if [ "$EUID" -ne 0 ]; then
-  echo "Please run as root"
-  exit 1
-fi
-
-if [ -z "$1" ]; then
-  echo "USAGE: $0 /path/to/k8s-cluster.yml"
-  exit 1
-fi
-
-BINARY_DIR="/var/www/html/binary"
-CLUSTER_YML="$1"
-
-echo "Download location: ${BINARY_DIR}"
-
-# download kubelet kubectl kubeadm
-ARCH="amd64"
-VERSION=$(grep ^kube_version "${CLUSTER_YML}" | cut -d ' ' -f2 | tr -d '"')
-LOC="storage.googleapis.com/kubernetes-release/release/${VERSION}/bin/linux/${ARCH}"
-for FILE in "kubelet" "kubectl" "kubeadm"; do
-  echo -e "\n[Download] ${FILE}:${VERSION}"
-  mkdir -p "${BINARY_DIR}/${LOC}"
-  TARGET="${BINARY_DIR}/${LOC}/${FILE}"
-  wget -q -O "${TARGET}" "https://${LOC}/${FILE}"
-  ls -lh "${TARGET}"
-  CHKSUM=$(sha256sum "${TARGET}" | awk '{print $1}')
-  echo -e "[SHA256SUM] ${FILE}:${VERSION} - ${CHKSUM}"
-  case ${FILE} in
-    "kubelet" )
-      sed -i "s/^\(\s*kubelet_binary_checksum\s*:\s*\).*/\1${CHKSUM}/" "${CLUSTER_YML}"
-      ;;
-    "kubectl" )
-      sed -i "s/^\(\s*kubectl_binary_checksum\s*:\s*\).*/\1${CHKSUM}/" "${CLUSTER_YML}"
-      ;;
-    "kubeadm" )
-      sed -i "s/^\(\s*kubeadm_binary_checksum\s*:\s*\).*/\1${CHKSUM}/" "${CLUSTER_YML}"
-      ;;
-    * )
-      echo "[ERROR] Not match binary file: ${FILE}"
-      exit 1
-      ;;
-  esac
-  echo "------------------------------------------------------------"
-done
-
-# download etcd binary
-VERSION=$(grep ^etcd_version "${CLUSTER_YML}" | cut -d ' ' -f2 | tr -d '"')
-LOC="github.com/coreos/etcd/releases/download/${VERSION}"
-FILE="etcd-${VERSION}-linux-${ARCH}.tar.gz"
-echo -e "\n[Download] ${FILE}:${VERSION}"
-mkdir -p "${BINARY_DIR}/${LOC}"
-TARGET="${BINARY_DIR}/${LOC}/${FILE}"
-wget -q -O "${TARGET}" "https://${LOC}/${FILE}"
-ls -lh "${TARGET}"
-CHKSUM=$(sha256sum "${TARGET}" | awk '{print $1}')
-echo -e "[SHA256SUM] ${FILE}:${VERSION} - ${CHKSUM}"
-sed -i "s/^\(\s*etcd_binary_checksum\s*:\s*\).*/\1${CHKSUM}/" "${CLUSTER_YML}"
-echo "------------------------------------------------------------"
-
-# download cni-plugins
-VERSION=$(grep ^cni_version "${CLUSTER_YML}" | cut -d ' ' -f2 | tr -d '"')
-LOC="github.com/containernetworking/plugins/releases/download/${VERSION}"
-FILE="cni-plugins-linux-${ARCH}-${VERSION}.tgz"
-echo -e "\n[Download] ${FILE}:${VERSION}"
-mkdir -p "${BINARY_DIR}/${LOC}"
-TARGET="${BINARY_DIR}/${LOC}/${FILE}"
-wget -q -O "${TARGET}" "https://${LOC}/${FILE}"
-ls -lh "${TARGET}"
-CHKSUM=$(sha256sum "${TARGET}" | awk '{print $1}')
-echo -e "[SHA256SUM] ${FILE}:${VERSION} - ${CHKSUM}"
-sed -i "s/^\(\s*cni_binary_checksum\s*:\s*\).*/\1${CHKSUM}/" "${CLUSTER_YML}"
-echo "------------------------------------------------------------"
${BINARY_DIR}/"${LOC}" -TARGET="${BINARY_DIR}/${LOC}/${FILE}" -wget -q -O $"{TARGET}" "https://${LOC}/${FILE}" -ls -lh $"{TARGET}" -CHKSUM=$(sha256sum "${TARGET}" | awk '{print $1}') -echo -e "[SHA256SUM] ${FILE}:${VERSION} - ${CHKSUM}" -sed -i "s/^\(\s*calicoctl_binary_checksum\s*:\s*\).*/\1${CHKSUM}/" "${CLUSTER_YML}" -echo "------------------------------------------------------------" - -# download crictl -crictl_versions="v1.15.0 v1.16.1 v1.17.0" -kube_version=$(grep ^kube_version "${CLUSTER_YML}" | cut -d ' ' -f2 | tr -d '"') -major_version=${kube_version::-2} # v1.15.3 -> v1.15 -for VERSION in ${crictl_versions}; do - if [[ ${VERSION} =~ ${major_version} ]]; then - LOC="github.com/kubernetes-sigs/cri-tools/releases/download/${VERSION}" - FILE="crictl-${VERSION}-linux-${ARCH}.tar.gz" - echo -e "\n[Download] ${FILE}" - mkdir -p ${BINARY_DIR}/"${LOC}" - TARGET="${BINARY_DIR}/${LOC}/${FILE}" - wget -q -O "${TARGET}" "https://${LOC}/${FILE}" - ls -lh "${TARGET}" - CHKSUM=$(sha256sum "${TARGET}" | awk '{print $1}') - echo -e "[SHA256SUM] ${FILE}:${VERSION} - ${CHKSUM}" - sed -i "s/^\(\s*crictl_binary_checksum\s*:\s*\).*/\1${CHKSUM}/" "${CLUSTER_YML}" - echo "------------------------------------------------------------" - break - fi -done diff --git a/docs/kubespray_offline_installation/kubespray_offline_with_cache.md b/docs/kubespray_offline_installation/kubespray_offline_with_cache.md deleted file mode 100644 index 6b54acd..0000000 --- a/docs/kubespray_offline_installation/kubespray_offline_with_cache.md +++ /dev/null @@ -1,98 +0,0 @@ -# How to run kubespray in offline environment (cache) -> tested on ubuntu-server 18.04.3 LTS (bionic) - -## Get kubespray -* clone kubespray github -```shell -git clone https://github.com/kubernetes-sigs/kubespray.git -cd kubespray -``` -* Install dependencies from `requirements.txt` -```shell -sudo pip install -r requirements.txt -``` -* Copy `inventory/sample` as `inventory/mycluster` -```shell -cp -rfp inventory/sample inventory/mycluster -``` -* Update Ansible inventory file with inventory builder -```shell -declare -a IPS=(192.168.0.201 192.168.0.202 192.168.0.203) -CONFIG_FILE=inventory/mycluster/hosts.yaml python3 contrib/inventory_builder/inventory.py ${IPS[@]} -``` -* Review and change parameters under `inventory/mycluster/group_vars` -```shell -cat inventory/mycluster/group_vars/all/all.yml -cat inventory/mycluster/group_vars/k8s-cluster/k8s-cluster.yml -``` -* Modify or Add environment variables in `k8s-cluster.yml` -```yml -# Download cache directory -download_cache_dir: /tmp/kubespray_cache -# Run download binary files and container images only once -download_run_once: true -# Use the local_host for download_run_once mode -download_localhost: true -``` - -* Make sure that `download_cache_dir` (/tmp/kubespray_cache) contains all required binary files and images -```shell -$ tree /tmp/kubespray_cache -. 
-
-* Make sure the required packages are already installed, or create a local repository;
-see [create_ubuntu_repository](create_ubuntu_repository.md)
-> ~/kubespray/roles/kubernetes/preinstall/vars/ubuntu.yml
-  required_pkgs: docker, python-minimal, python-apt, aufs-tools, apt-transport-https, software-properties-common, ebtables, etc...
-
-* Edit the Ubuntu Docker repository URL and GPGkey
-File: `kubespray/roles/container-engine/docker/vars/ubuntu-amd64.yml`
-  ```yaml
-  docker_repo_key_info:
-    pkg_key: apt_key
-    url: 'http://192.168.0.200/hcs-repo/GPGkey'
-    repo_keys:
-      - 65B6A53C5C3D875175F3F59F906CC4D42DF1595D
-
-  docker_repo_info:
-    pkg_repo: apt_repository
-    repos:
-      - >
-        deb http://192.168.0.200/hcs-repo/deb /
-  ```
-
-## Run ansible-playbook to create/reset the Kubernetes cluster
-* To create the cluster:
-```shell
-ansible-playbook -i inventory/mycluster/hosts.yaml \
-    --become --become-user=root cluster.yml
-```
-* To reset the cluster:
-```shell
-ansible-playbook -i inventory/mycluster/hosts.yaml \
-    --become --become-user=root reset.yml
-```
diff --git a/docs/kubespray_offline_installation/kubespray_offline_with_private_server.md b/docs/kubespray_offline_installation/kubespray_offline_with_private_server.md
deleted file mode 100644
index c660741..0000000
--- a/docs/kubespray_offline_installation/kubespray_offline_with_private_server.md
+++ /dev/null
@@ -1,146 +0,0 @@
-# How to run Kubespray in an offline environment with a private server
-> Tested on ubuntu-server 18.04.3 LTS (bionic)
-
-## Get Kubespray
-* Clone the Kubespray GitHub repository
-```shell
-git clone https://github.com/kubernetes-sigs/kubespray.git
-cd kubespray
-```
-* Install dependencies from `requirements.txt`
-```shell
-sudo pip3 install -r requirements.txt
-```
-* Copy `inventory/sample` as `inventory/mycluster`
-```shell
-cp -rfp inventory/sample inventory/mycluster
-```
-* Update the Ansible inventory file with the inventory builder
-```shell
-declare -a IPS=(192.168.0.201 192.168.0.202 192.168.0.203)
-CONFIG_FILE=inventory/mycluster/hosts.yaml python3 contrib/inventory_builder/inventory.py ${IPS[@]}
-```
-* Review and change the parameters under `inventory/mycluster/group_vars`
-```shell
-cat inventory/mycluster/group_vars/all/all.yml
-cat inventory/mycluster/group_vars/k8s-cluster/k8s-cluster.yml
-```
-* Edit the environment variables in `k8s-cluster.yml`
-```yml
-# Default cluster component
-container_manager: docker
-kube_network_plugin: calico
-# Version
-kube_version: v1.17.2
-etcd_version: v3.3.12
-cni_version: "v0.8.3"
-calico_version: "v3.11.1"
-calico_ctl_version: "v3.11.1"
-calico_cni_version: "v3.11.1"
-calico_policy_version: "v3.11.1"
-nodelocaldns_version: "1.15.8"
-coredns_version: "1.6.0"
-dnsautoscaler_version: 1.6.0
-pod_infra_version: 3.1
-nginx_image_tag: 1.17
-dashboard_image_tag: "v1.10.1"
-# Download binary
-hcs_url: "http://192.168.0.200/binary"
-kubelet_download_url: "{{ hcs_url }}/storage.googleapis.com/kubernetes-release/release/{{ kube_version }}/bin/linux/{{ image_arch }}/kubelet"
-kubectl_download_url: "{{ hcs_url }}/storage.googleapis.com/kubernetes-release/release/{{ kube_version }}/bin/linux/{{ image_arch }}/kubectl"
-kubeadm_download_url: "{{ hcs_url }}/storage.googleapis.com/kubernetes-release/release/{{ kube_version }}/bin/linux/{{ image_arch }}/kubeadm"
-etcd_download_url: "{{ hcs_url }}/github.com/coreos/etcd/releases/download/{{ etcd_version }}/etcd-{{ etcd_version }}-linux-{{ image_arch }}.tar.gz"
-cni_download_url: "{{ hcs_url }}/github.com/containernetworking/plugins/releases/download/{{ cni_version }}/cni-plugins-linux-{{ image_arch }}-{{ cni_version }}.tgz"
-calicoctl_download_url: "{{ hcs_url }}/github.com/projectcalico/calicoctl/releases/download/{{ calico_ctl_version }}/calicoctl-linux-{{ image_arch }}"
-crictl_download_url: "{{ hcs_url }}/github.com/kubernetes-sigs/cri-tools/releases/download/{{ crictl_version }}/crictl-{{ crictl_version }}-{{ ansible_system | lower }}-{{ image_arch }}.tar.gz"
-# Checksums (these values are updated automatically when download_binary.sh is executed)
-etcd_binary_checksum: dc5d82df095dae0a2970e4d870b6929590689dd707ae3d33e7b86da0f7f211b6
-cni_binary_checksum: 29a092bef9cb6f26c8d5340f3d56567b62c7ebdb1321245d94b1842c80ba20ba
-kubelet_binary_checksum: 680d6afa09cd51061937ebb33fd5c9f3ff6892791de97b028b1e7d6b16383990
-kubectl_binary_checksum: 4475f68c51af23925d7bd7fc3d1bd01bedd3d4ccbb64503517d586e31d6f607c
-kubeadm_binary_checksum: 366a7f260cbd1aaa2661b1e3b83a7fc8781c8a8b07c71944bdaf66d49ff5abae
-calicoctl_binary_checksum: 045fdbfdb30789194c499ba17c8eac6d1704fe20d05e3c10027eb570767386db
-crictl_binary_checksum: c3b71be1f363e16078b51334967348aab4f72f46ef64a61fe7754e029779d45a
-# Docker registry
-docker_insecure_registries:
-  - 192.168.0.200:5000
-hcs_image_repo: "192.168.0.200:5000"
-docker_image_repo: "{{ hcs_image_repo }}"
-quay_image_repo: "{{ hcs_image_repo }}"
-gcr_image_repo: "{{ hcs_image_repo }}"
-kube_image_repo: "{{ gcr_image_repo }}/google-containers"
-```
-
-## Add required packages into our local repository
-> Suppose that `192.168.0.200` is the server IP.
-
-* Create an Ubuntu repository;
-see [here](create_ubuntu_repository.md)
-* Check the required packages
-> ~/kubespray/roles/kubernetes/preinstall/vars/ubuntu.yml
-  required_pkgs: python-apt, aufs-tools, apt-transport-https, software-properties-common, ebtables
-
-* Download the required packages and their dependencies
-```shell
-sudo apt-get install --download-only <package-name>
-```
-All downloaded deb files are saved in the `/var/cache/apt/archives` directory.
-* [optional] Use `apt-rdepends` to get all dependencies of a package
-```shell
-sudo apt install apt-rdepends
-sudo apt download $(apt-rdepends vim | grep -v "^ ")
-```
-* Copy the downloaded deb packages into our repository directory
-```shell
-cp *.deb /var/www/html/hcs-repo/bionic/
-```
-* Update the `Release` and index files
-```shell
-cd /var/www/html/hcs-repo/bionic
-apt-ftparchive packages . > Packages
-gzip -c Packages > Packages.gz
-apt-ftparchive release . > Release
-gpg --yes --clearsign -o InRelease Release
-gpg --yes -abs -o Release.gpg Release
-```
-
-## Add required binary files into our local web server
-> Suppose that `192.168.0.200` is the server IP.
-
-* Create a local web server using apache2;
-ref: [create_ubuntu_repository](create_ubuntu_repository.md)
-* Check the required binary files
-> ~/kubespray/roles/download/defaults/main.yml
-required_files: kubeadm, kubelet, kubectl, etcd, cni, calicoctl, crictl
-
-* Download the required binary files from their original URLs and put them into our local web server.
-```shell
-sudo ./download_binary.sh ~/kubespray/inventory/mycluster/group_vars/k8s-cluster/k8s-cluster.yml
-```
-script: [download_binary.sh](download_binary.sh)
-This downloads all required binary files at the versions defined in `k8s-cluster.yml` and stores them in our local web server under `/var/www/html/binary/`.
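-
-A quick way to confirm a served binary matches the checksum variables in `k8s-cluster.yml` (sketched for kubelet, using the URL pattern from above):
-```shell
-# Should print the same value as kubelet_binary_checksum in k8s-cluster.yml
-curl -s "http://192.168.0.200/binary/storage.googleapis.com/kubernetes-release/release/v1.17.2/bin/linux/amd64/kubelet" | sha256sum
-```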
-
-## Add required Docker images into our local Docker registry
-> Suppose that `192.168.0.200:5000` is the Docker registry URL.
-
-* Create a local Docker registry;
-see [here](create_docker_registry.md)
-
-* Pull and push the required Docker images into the local registry
-```shell
-sudo ./push_docker_image.sh 192.168.0.200:5000 \
-    ~/kubespray/inventory/mycluster/group_vars/k8s-cluster/k8s-cluster.yml
-```
-script: [push_docker_image.sh](push_docker_image.sh)
-
-## Run Kubespray's ansible-playbook to deploy the Kubernetes cluster
-* To create the cluster:
-```shell
-ansible-playbook -i inventory/mycluster/hosts.yaml \
-    --become --become-user=root cluster.yml
-```
-* To reset the cluster:
-```shell
-ansible-playbook -i inventory/mycluster/hosts.yaml \
-    --become --become-user=root reset.yml
-```
diff --git a/docs/kubespray_offline_installation/pip_install_guide.md b/docs/kubespray_offline_installation/pip_install_guide.md
deleted file mode 100644
index 8a97be0..0000000
--- a/docs/kubespray_offline_installation/pip_install_guide.md
+++ /dev/null
@@ -1,109 +0,0 @@
-# How to install Python packages via pip
-> pip is the package installer for Python. You can use pip to install packages
-from the `Python Package Index` and other indexes. [read more](https://pypi.org/project/pip/)
-
-## Requirements for installing packages
-* Ensure `Python` is installed and you can run `Python` from the command line.
-  ```shell
-  # the latest version of Python is 3.x
-  python3 --version
-
-  # Python 3.6.9
-  ```
-
-* Ensure `pip3` is installed and you can run `pip3` from the command line.
-  ```shell
-  pip3 --version
-
-  # pip 9.0.1 from /usr/lib/python3/dist-packages (python 3.6)
-  ```
-
-* If `pip3` is not installed, install it
-  ```shell
-  sudo apt-get update
-  sudo apt-get install -y python3-pip
-  ```
-* Ensure pip, setuptools, and wheel are up to date
-  ```shell
-  python3 -m pip install --upgrade pip setuptools wheel
-  ```
-
-## Usage
-> We use `pip` in the examples below; change it to `pip3` if you are using Python 3.x.
-
-### Installing packages
-- To install the latest version of **"SomeProject"**:
-  ```shell
-  pip install "SomeProject"
-  ```
-- To install a specific version:
-  ```shell
-  pip install "SomeProject==1.4"
-  ```
-- To install greater than or equal to one version and less than another:
-  ```shell
-  pip install "SomeProject>=1,<2"
-  ```
-- To install a version that's "compatible" with a certain version:
-  ```shell
-  pip install "SomeProject~=1.4.2"
-  ```
-### Upgrading packages
-* Upgrade an already installed **"SomeProject"** to the latest version:
-  ```shell
-  pip install --upgrade SomeProject
-  ```
-### Installing to the user site
-* To install packages that are isolated to the current user, use the --user flag:
-  ```shell
-  pip install --user SomeProject
-  ```
-### Installing packages from requirement files
-* List of packages in `requirements.txt`
-  ```
-  pkg1
-  pkg2==2.1.8
-  pkg3>=1.3.0
-  pkg4>=1.0,<=2.0
-  ```
-* To install the list of requirements specified in a `requirements.txt`:
-  ```shell
-  pip install -r requirements.txt
-  ```
-### Installing from VCS
-* To install a project from VCS in 'editable' mode:
-  ```shell
-  pip install -e git+https://git.repo/some_pkg.git#egg=SomeProject          # from git
-  pip install -e hg+https://hg.repo/some_pkg#egg=SomeProject                # from mercurial
-  pip install -e svn+svn://svn.repo/some_pkg/trunk/#egg=SomeProject         # from svn
-  pip install -e git+https://git.repo/some_pkg.git@feature#egg=SomeProject  # from a branch
-  ```
### Installing from other indexes
-* To install from an alternative index:
-  ```shell
-  pip install --index-url http://my.package.repo/simple/ SomeProject
-  ```
-* To search an additional index during install:
-  ```shell
-  pip install --extra-index-url http://my.package.repo/simple SomeProject
-  ```
-### Installing from a local src tree
-* Installing from local src in `Development` mode, i.e. in such a way that the project appears to be installed but is still editable from the src tree:
-  ```shell
-  pip install -e <path/to/project>
-  ```
-* Install normally from source:
-  ```shell
-  pip install <path/to/project>
-  ```
-### Installing from local archives
-* To install a particular source archive file:
-  ```shell
-  pip install ./downloads/SomeProject-1.0.4.tar.gz
-  ```
-* To install from a local directory containing archives:
-  ```shell
-  pip install --no-index --find-links=file:///local/dir/ SomeProject
-  pip install --no-index --find-links=/local/dir/ SomeProject
-  pip install --no-index --find-links=relative/dir/ SomeProject
-  ```
diff --git a/docs/kubespray_offline_installation/pip_install_kubespray_requirements.md b/docs/kubespray_offline_installation/pip_install_kubespray_requirements.md
deleted file mode 100644
index 8b1bdf5..0000000
--- a/docs/kubespray_offline_installation/pip_install_kubespray_requirements.md
+++ /dev/null
@@ -1,151 +0,0 @@
-# How to install Kubespray's required packages via pip3 offline
-> pip is the package installer for Python. You can use pip to install packages
-from the `Python Package Index` and other indexes. [read more](pip_install_guide.md)
-
-## Kubespray requirement packages
-* List of required packages in `requirements.txt`
-  ```shell
-  $ cat ./kubespray/requirements.txt
-  ansible==2.7.16
-  jinja2==2.10.1
-  netaddr==0.7.19
-  pbr==5.2.0
-  hvac==0.8.2
-  jmespath==0.9.4
-  ruamel.yaml==0.15.96
-  ```
-* To deploy the cluster with Ansible, you need to install the dependencies from
-`requirements.txt`
-  ```shell
-  sudo pip3 install -r requirements.txt
-  ```
-  Make sure that `pip3` is installed and can be run from the command line.
-  ```shell
-  pip3 --version
-  ```
-  However, the required packages won't install successfully without an Internet connection.
-
-## Installing pip3
-* With an Internet connection (online), you can easily install it:
-  ```shell
-  sudo apt-get update
-  sudo apt-get install -y python3-pip
-  ```
-* Without an Internet connection (offline), you have to install it manually.
-On Ubuntu, you can download the deb package files and their dependencies, then install
-them with `dpkg -i *.deb` or create your own [private repository](create_ubuntu_repository.md).
-  Based on Ubuntu 18.04.3 LTS, these deb packages need to be installed.
-  ```
-  binutils_2.30-21ubuntu1~18.04.2_amd64.deb
-  binutils-common_2.30-21ubuntu1~18.04.2_amd64.deb
-  binutils-x86-64-linux-gnu_2.30-21ubuntu1~18.04.2_amd64.deb
-  build-essential_12.4ubuntu1_amd64.deb
-  cpp_4%3a7.4.0-1ubuntu2.3_amd64.deb
-  cpp-7_7.5.0-3ubuntu1~18.04_amd64.deb
-  dh-python_3.20180325ubuntu2_all.deb
-  dpkg-dev_1.19.0.5ubuntu2.3_all.deb
-  fakeroot_1.22-2ubuntu1_amd64.deb
-  g++_4%3a7.4.0-1ubuntu2.3_amd64.deb
-  g++-7_7.5.0-3ubuntu1~18.04_amd64.deb
-  gcc_4%3a7.4.0-1ubuntu2.3_amd64.deb
-  gcc-7_7.5.0-3ubuntu1~18.04_amd64.deb
-  gcc-7-base_7.5.0-3ubuntu1~18.04_amd64.deb
-  libalgorithm-diff-perl_1.19.03-1_all.deb
-  libalgorithm-diff-xs-perl_0.04-5_amd64.deb
-  libalgorithm-merge-perl_0.08-3_all.deb
-  libasan4_7.5.0-3ubuntu1~18.04_amd64.deb
-  libatomic1_8.3.0-26ubuntu1~18.04_amd64.deb
-  libbinutils_2.30-21ubuntu1~18.04.2_amd64.deb
-  libc6-dev_2.27-3ubuntu1_amd64.deb
-  libcc1-0_8.3.0-26ubuntu1~18.04_amd64.deb
-  libc-dev-bin_2.27-3ubuntu1_amd64.deb
-  libcilkrts5_7.5.0-3ubuntu1~18.04_amd64.deb
-  libdpkg-perl_1.19.0.5ubuntu2.3_all.deb
-  libexpat1-dev_2.2.5-3ubuntu0.2_amd64.deb
-  libfakeroot_1.22-2ubuntu1_amd64.deb
-  libfile-fcntllock-perl_0.22-3build2_amd64.deb
-  libgcc-7-dev_7.5.0-3ubuntu1~18.04_amd64.deb
-  libgomp1_8.3.0-26ubuntu1~18.04_amd64.deb
-  libisl19_0.19-1_amd64.deb
-  libitm1_8.3.0-26ubuntu1~18.04_amd64.deb
-  liblsan0_8.3.0-26ubuntu1~18.04_amd64.deb
-  libmpc3_1.1.0-1_amd64.deb
-  libmpx2_8.3.0-26ubuntu1~18.04_amd64.deb
-  libpython3.6_3.6.9-1~18.04_amd64.deb
-  libpython3.6-dev_3.6.9-1~18.04_amd64.deb
-  libpython3.6-minimal_3.6.9-1~18.04_amd64.deb
-  libpython3.6-stdlib_3.6.9-1~18.04_amd64.deb
-  libpython3-dev_3.6.7-1~18.04_amd64.deb
-  libquadmath0_8.3.0-26ubuntu1~18.04_amd64.deb
-  libstdc++-7-dev_7.5.0-3ubuntu1~18.04_amd64.deb
-  libtsan0_8.3.0-26ubuntu1~18.04_amd64.deb
-  libubsan0_7.5.0-3ubuntu1~18.04_amd64.deb
-  linux-libc-dev_4.15.0-91.92_amd64.deb
-  make_4.1-9.1ubuntu1_amd64.deb
-  manpages-dev_4.15-1_all.deb
-  python3.6_3.6.9-1~18.04_amd64.deb
-  python3.6-dev_3.6.9-1~18.04_amd64.deb
-  python3.6-minimal_3.6.9-1~18.04_amd64.deb
-  python3-crypto_2.6.1-8ubuntu2_amd64.deb
-  python3-dev_3.6.7-1~18.04_amd64.deb
-  python3-distutils_3.6.9-1~18.04_all.deb
-  python3-keyring_10.6.0-1_all.deb
-  python3-keyrings.alt_3.0-1_all.deb
-  python3-lib2to3_3.6.9-1~18.04_all.deb
-  python3-pip_9.0.1-2.3~ubuntu1.18.04.1_all.deb
-  python3-secretstorage_2.3.1-2_all.deb
-  python3-setuptools_39.0.1-2_all.deb
-  python3-wheel_0.30.0-0.2_all.deb
-  python3-xdg_0.25-4ubuntu1_all.deb
-  python-pip-whl_9.0.1-2.3~ubuntu1.18.04.1_all.deb
-  ```
-
-## Downloading Kubespray's required packages
-* On a node with an Internet connection, download the required packages from `requirements.txt`
-  ```shell
-  sudo pip3 download -r requirements.txt
-  ```
-  Then copy the downloaded files to the offline node for installation.
-* List of downloaded packages
-  ```shell
-  $ tree /tmp/pip_download_pkg_dir/
-  .
-  ├── ansible-2.7.16.tar.gz
-  ├── bcrypt-3.1.7-cp34-abi3-manylinux1_x86_64.whl
-  ├── certifi-2019.11.28-py2.py3-none-any.whl
-  ├── cffi-1.14.0-cp36-cp36m-manylinux1_x86_64.whl
-  ├── chardet-3.0.4-py2.py3-none-any.whl
-  ├── configparser-4.0.2-py2.py3-none-any.whl
-  ├── cryptography-2.8-cp34-abi3-manylinux1_x86_64.whl
-  ├── hvac-0.8.2-py2.py3-none-any.whl
-  ├── idna-2.9-py2.py3-none-any.whl
-  ├── ipaddress-1.0.23-py2.py3-none-any.whl
-  ├── Jinja2-2.10.1-py2.py3-none-any.whl
-  ├── jmespath-0.9.4-py2.py3-none-any.whl
-  ├── MarkupSafe-1.1.1-cp36-cp36m-manylinux1_x86_64.whl
-  ├── netaddr-0.7.19-py2.py3-none-any.whl
-  ├── paramiko-2.7.1-py2.py3-none-any.whl
-  ├── pbr-5.2.0-py2.py3-none-any.whl
-  ├── pycparser-2.20-py2.py3-none-any.whl
-  ├── PyNaCl-1.3.0-cp34-abi3-manylinux1_x86_64.whl
-  ├── PyYAML-5.3.1.tar.gz
-  ├── requests-2.23.0-py2.py3-none-any.whl
-  ├── ruamel.yaml-0.15.96-cp36-cp36m-manylinux1_x86_64.whl
-  ├── ruamel.yaml-0.16.10-py2.py3-none-any.whl
-  ├── ruamel.yaml.clib-0.2.0-cp36-cp36m-manylinux1_x86_64.whl
-  ├── setuptools-46.1.1-py3-none-any.whl
-  ├── six-1.14.0-py2.py3-none-any.whl
-  └── urllib3-1.25.8-py2.py3-none-any.whl
-
-  0 directories, 26 files
-  ```
-
-## Installing Kubespray's required packages
-* To install the packages from a local directory:
-  ```shell
-  sudo pip3 install --no-index --find-links=/tmp/pip_download_pkg_dir/ -r requirements.txt
-  ```
-* To install the packages from a private index (repository):
-  ```shell
-  sudo pip3 install --index-url http://192.168.0.200/hcs-repo/pip/ -r requirements.txt
-  ```
diff --git a/docs/kubespray_offline_installation/push_docker_image.sh b/docs/kubespray_offline_installation/push_docker_image.sh
deleted file mode 100755
index b394098..0000000
--- a/docs/kubespray_offline_installation/push_docker_image.sh
+++ /dev/null
@@ -1,78 +0,0 @@
-#!/bin/bash
-
-if [ "$EUID" -ne 0 ]; then
-  echo "Please run as root"
-  exit 1
-fi
-
-if [ "$#" -ne 2 ]; then
-  echo "[Usage]: $0 {Docker Registry URL} {Path/to/k8s-cluster.yml}"
-  echo "Example: $0 192.168.0.200:5000 inventory/mycluster/group-vars/k8s-cluster/k8s-cluster.yml"
-  exit 1
-fi
-
-REG_URL="$1"
-CLUSTER_YML="$2"
-IMAGE_LIST=()
-
-status_code=$(curl -I -k -s "${REG_URL}" | head -n 1 | cut -d ' ' -f 2)
-if [[ "$status_code" != "200" ]]; then
-  echo "[ERROR] Docker registry (${REG_URL}) is not running."
-  exit 1
-fi
-
-set -eo pipefail
-# Kubespray-based container images
-TAG=$(grep ^nginx_image_tag "${CLUSTER_YML}" | cut -d ' ' -f2 | tr -d '"')
-IMAGE_LIST+=("docker.io/library/nginx:${TAG}")
-
-TAG=$(grep ^calico_version "${CLUSTER_YML}" | cut -d ' ' -f2 | tr -d '"')
-IMAGE_LIST+=("docker.io/calico/node:${TAG}")
-
-TAG=$(grep ^calico_cni_version "${CLUSTER_YML}" | cut -d ' ' -f2 | tr -d '"')
-IMAGE_LIST+=("docker.io/calico/cni:${TAG}")
-
-TAG=$(grep ^calico_policy_version "${CLUSTER_YML}" | cut -d ' ' -f2 | tr -d '"')
-IMAGE_LIST+=("docker.io/calico/kube-controllers:${TAG}")
-
-TAG=$(grep ^coredns_version "${CLUSTER_YML}" | cut -d ' ' -f2 | tr -d '"')
-IMAGE_LIST+=("docker.io/coredns/coredns:${TAG}")
-
-TAG=$(grep ^nodelocaldns_version "${CLUSTER_YML}" | cut -d ' ' -f2 | tr -d '"')
-IMAGE_LIST+=("gcr.io/google-containers/k8s-dns-node-cache:${TAG}")
-
-TAG=$(grep ^kube_version "${CLUSTER_YML}" | cut -d ' ' -f2 | tr -d '"')
-IMAGE_LIST+=("gcr.io/google-containers/kube-proxy:${TAG}")
-IMAGE_LIST+=("gcr.io/google-containers/kube-apiserver:${TAG}")
-IMAGE_LIST+=("gcr.io/google-containers/kube-scheduler:${TAG}")
-IMAGE_LIST+=("gcr.io/google-containers/kube-controller-manager:${TAG}")
-
-TAG=$(grep ^dnsautoscaler_version "${CLUSTER_YML}" | cut -d ' ' -f2 | tr -d '"')
-IMAGE_LIST+=("gcr.io/google-containers/cluster-proportional-autoscaler-amd64:${TAG}")
-
-TAG=$(grep ^dashboard_image_tag "${CLUSTER_YML}" | cut -d ' ' -f2 | tr -d '"')
-IMAGE_LIST+=("gcr.io/google_containers/kubernetes-dashboard-amd64:${TAG}")
-
-TAG=$(grep ^pod_infra_version "${CLUSTER_YML}" | cut -d ' ' -f2 | tr -d '"')
-IMAGE_LIST+=("gcr.io/google-containers/pause:${TAG}")
-IMAGE_LIST+=("gcr.io/google_containers/pause-amd64:${TAG}")
-
-TAG=$(grep ^etcd_version "${CLUSTER_YML}" | cut -d ' ' -f2 | tr -d '"')
-IMAGE_LIST+=("quay.io/coreos/etcd:${TAG}")
-
-
-for image in "${IMAGE_LIST[@]}"
-do
-  echo -e "\n[PULL]<- ${image}"
-  docker pull "${image}"
-
-  ORI_URL=$(echo "${image}" | cut -d '/' -f1)
-  NEW_IMAGE="${image//${ORI_URL}/${REG_URL}}"
-
-  echo -e "\n[PUSH]-> ${NEW_IMAGE}"
-  docker tag "${image}" "${NEW_IMAGE}"
-  docker push "${NEW_IMAGE}"
-done
-exit
diff --git a/docs/rook/README.md b/docs/rook/README.md
new file mode 100644
index 0000000..0f049ff
--- /dev/null
+++ b/docs/rook/README.md
@@ -0,0 +1,43 @@
+# Rook Ceph Cluster
+
+> Rook Ceph is the module installed to provide highly available storage in hypercloud-sds. This project provides CephFS (file system) and RBD (block storage).
+
+## How to configure the Rook Ceph Cluster
+
+* See this [document](./ceph-cluster-setting.md) for details.
+
+## Removing the Rook-Ceph Cluster
+
+> After uninstalling, the following steps must be performed to reset the environment.
+
+- Delete the `/var/lib/rook` directory on every node of the k8s cluster.
+
+  ```shell
+  $ rm -rf /var/lib/rook
+  ```
+- Delete the backend directory or device of every OSD.
+  * If the OSD backend is a directory (example backend directory path: `/mnt/cephdir`)
+
+    ```shell
+    $ rm -rf /mnt/cephdir
+    ```
+  * If the OSD backend is a device (example backend device: `sdb`)
+
+    ```shell
+    # Remove the device's partition information
+    $ sgdisk --zap-all /dev/sdb
+
+    # Remove leftover ceph-volume entries from the device mapper (only needs to be run once per node)
+    $ ls /dev/mapper/ceph-* | xargs -I% -- dmsetup remove %
+
+    # Remove leftover files under /dev
+    $ rm -rf /dev/ceph-*
+    ```
+## Version upgrade
+
+- The upgrade guide for the latest version is in [this document](https://github.com/tmax-cloud/hypercloud-sds/tree/release-1.4/docs/upgrade).
+- The Rook version upgrade guide for each version is on that version's release branch.
+
+## Reference
+
+* [Rook Ceph Storage official page](https://rook.github.io/docs/rook/v1.4/ceph-storage.html)
\ No newline at end of file
diff --git a/docs/block-multipool.md b/docs/rook/block-multipool.md
similarity index 100%
rename from docs/block-multipool.md
rename to docs/rook/block-multipool.md
diff --git a/docs/block.md b/docs/rook/block.md
similarity index 100%
rename from docs/block.md
rename to docs/rook/block.md
diff --git a/docs/ceph-cluster-setting.md b/docs/rook/ceph-cluster-setting.md
similarity index 100%
rename from docs/ceph-cluster-setting.md
rename to docs/rook/ceph-cluster-setting.md
diff --git a/docs/ceph-cluster-update.md b/docs/rook/ceph-cluster-update.md
similarity index 100%
rename from docs/ceph-cluster-update.md
rename to docs/rook/ceph-cluster-update.md
diff --git a/docs/ceph-command.md b/docs/rook/ceph-command.md
similarity index 100%
rename from docs/ceph-command.md
rename to docs/rook/ceph-command.md
diff --git a/docs/ceph_monitoring.md b/docs/rook/ceph_monitoring.md
similarity index 100%
rename from docs/ceph_monitoring.md
rename to docs/rook/ceph_monitoring.md
diff --git a/docs/cluster-tuning.md b/docs/rook/cluster-tuning.md
similarity index 100%
rename from docs/cluster-tuning.md
rename to docs/rook/cluster-tuning.md
diff --git a/docs/file-multipool.md b/docs/rook/file-multipool.md
similarity index 100%
rename from docs/file-multipool.md
rename to docs/rook/file-multipool.md
diff --git a/docs/file.md b/docs/rook/file.md
similarity index 100%
rename from docs/file.md
rename to docs/rook/file.md
diff --git a/docs/object-store.md b/docs/rook/object-store.md
similarity index 100%
rename from docs/object-store.md
rename to docs/rook/object-store.md
diff --git a/docs/rook.md b/docs/rook/rook.md
similarity index 100%
rename from docs/rook.md
rename to docs/rook/rook.md
diff --git a/docs/scripts/rook-ceph-objectstore-test-with-s3.sh b/docs/rook/scripts/rook-ceph-objectstore-test-with-s3.sh
similarity index 100%
rename from docs/scripts/rook-ceph-objectstore-test-with-s3.sh
rename to docs/rook/scripts/rook-ceph-objectstore-test-with-s3.sh
diff --git a/docs/rook/troubleshooting.md b/docs/rook/troubleshooting.md
new file mode 100644
index 0000000..25512a8
--- /dev/null
+++ b/docs/rook/troubleshooting.md
@@ -0,0 +1,379 @@
+# Rook Ceph Issues
+
+## First of all
+
+Check that the conditions below are met.
+
+### When no OSD is deployed
+
+- The storage devices listed in the `cluster.yaml` file must exist on the node and be unmounted.
+- If you deployed version `v1.3.0` or later, the `cluster.yaml` file must not contain any `directories:` settings.
+
+### When the ceph cluster is health_warn
+
+- Check that the `ntp` package is installed and, in particular, that the ntp service is enabled on the nodes where the monitors are deployed.
+
+## Main commands
+
+K8s commands that are useful for debugging.
+
+### Status check
+
+- Check that all of the pods below are deployed and their status is Running.
+  - csi-cephfsplugin
+  - csi-cephfsplugin-provisioner
+  - csi-rbdplugin
+  - csi-rbdplugin-provisioner
+  - rook-ceph-crashcollector
+  - rook-ceph-mgr
+  - rook-ceph-mon
+  - rook-ceph-operator
+  - rook-ceph-osd
+  - rook-ceph-osd-prepare
+  - rook-discover
+- The number of deployed pods can differ depending on the `cluster.yaml` settings.
+
+``` shell
+$ kubectl get pod -n rook-ceph
+NAME                                                  READY   STATUS      RESTARTS   AGE
+csi-cephfsplugin-4fdds                                3/3     Running     0          19h
+csi-cephfsplugin-provisioner-74964d6869-h8g9q         5/5     Running     0          19h
+csi-cephfsplugin-provisioner-74964d6869-vpwh5         5/5     Running     0          19h
+csi-rbdplugin-ph28f                                   3/3     Running     0          19h
+csi-rbdplugin-provisioner-79cb7f7cb4-9p8vd            6/6     Running     0          19h
+csi-rbdplugin-provisioner-79cb7f7cb4-fdh8m            6/6     Running     0          19h
+rook-ceph-crashcollector-hyeongbin-759985c655-pp4f5   1/1     Running     0          19h
+rook-ceph-mgr-a-797d9b578-6smnx                       1/1     Running     0          19h
+rook-ceph-mon-a-55f4754f4f-6tsrs                      1/1     Running     0          19h
+rook-ceph-operator-657fb97bf9-9lwdg                   1/1     Running     0          19h
+rook-ceph-osd-0-8dcfdbf5b-z5rw4                       1/1     Running     0          19h
+rook-ceph-osd-prepare-hyeongbin-sqkgq                 0/1     Completed   0          19h
+rook-discover-hjxzj                                   1/1     Running     0          19h
+```
+
+### When no OSD is deployed
+
+This is the case when there are fewer rook-ceph-osd pods than rook-ceph-osd-prepare pods, or no rook-ceph-osd pod exists at all.
+
+``` shell
+# Check which nodes have no OSD pod deployed.
+$ kubectl get pod -n rook-ceph -o wide
+
+# Re-check the OSD deployment settings and device information in cluster.yaml.
+```
+
+### When the ceph cluster is health_warn
+
+``` shell
+# Check the ceph cluster status
+$ kubectl describe cephcluster -n rook-ceph rook-ceph
+
+....
+Status:
+  Ceph:
+    Details:
+      MON_CLOCK_SKEW:
+        Message:     clock skew detected on mon.b
+        Severity:    HEALTH_WARN
+    Health:          HEALTH_WARN
+    Last Checked:    2020-07-20T01:24:30Z
+```
+
+If the detail message is MON_CLOCK_SKEW, check whether the ntp package is installed and enabled, because otherwise the Ceph monitor daemons may fail to keep their data synchronized with each other.
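+
+A quick way to check time synchronization on a monitor node (a sketch; the service name may be `ntp` or `ntpd` depending on the distribution):
+
+``` shell
+# Confirm the system clock is synchronized
+timedatectl status | grep -i synchronized
+# Make sure the NTP service is enabled and running ('ntpd' on CentOS)
+systemctl enable --now ntp
+```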
+
+----------
+
+## Issue (1)
+> Cause: unknown
+
+> !webhook
+> !cdi-apiserver pod
+> !apiserver pod
+> !version
+> !authentication info
+> !finalizer
+
+#### Situation
+- Creating or deleting a datavolume fails with an error such as:
+```
+Internal error occurred: failed calling webhook "datavolume-mutate.cdi.kubevirt.io": Post https://cdi-api.cdi.svc:443/datavolume-mutate?timeout=30s
+```
+
+#### Tests
+- Check whether every other CRUD API for the CRs managed by CDI fails with the same error.
+  - e.g. `kubectl get dv -A`, `kubectl describe dv`
+  - If they succeed instead of failing, **this is not your issue.**
+- Check whether all pods in the cdi namespace are in the `RUNNING` state.
+  - If any pod is in another state such as `ERROR`, **this is not your issue.**
+
+#### Resolution
+- **The exact cause and fix have not been identified yet.**
+  - The symptom is that the apiserver pod in the kube-system namespace cannot reach the cdi-apiserver pod in the cdi namespace, but the root cause is unclear. Possible causes include:
+    - a `NetworkPolicy` applied to the cdi namespace
+    - a `network problem on the node` where the cdi-apiserver pod runs
+    - a `version mismatch` at CDI install time
+    - deleted or expired `authentication data` used by CDI
+    - [similar issue](https://github.com/kubevirt/containerized-data-importer/issues/1117)
+- As a workaround, remove the entire CDI module and reinstall it.
+  - If a namespace gets stuck in Terminating while removing the CDI module, refer to the following link:
+  - [how to force-delete a namespace](https://success.docker.com/article/kubernetes-namespace-stuck-in-terminating)
+
+----------
+
+## Issue (2)
+> Cause: namespace, resourcequota
+
+> !cdi-deployment pod
+> !limits.cpu
+> !importer pod
+> !v1.11
+> !pvcbound
+
+#### Situation
+
+- **The importer pod is never created after a dv creation request**:
+  - `kubectl get pod` in the namespace shows no importer pod, and the cdi-deployment pod's log contains an error message of the following form:
+
+```
+importer pod API create errored: pods importer-XXXXX is forbidden: failed quota: {$Namespace name} : must specify limits.cpu, limits.memory
+```
+
+#### Tests
+
+- Test whether a sample pod can be created normally in the namespace.
+  - Try creating an arbitrary pod, such as an nginx pod, with only the `required parameters` and check that it comes up.
+  - If it is created normally, **this is not your issue.**
+
+- Check the CDI version.
+  - `kubectl -n cdi describe deployments.apps cdi-deployment`
+  - If the `operator.cdi.kubevirt.io` version shown in `Labels` is `v1.12.0` or later, **this is not your issue.**
+
+- Check whether a resourceQuota object exists in the namespace where the dv was requested.
+  - `kubectl -n {$Namespace} get resourceQuota`
+  - If one exists, describe it and check whether any of `limits.cpu, limits.memory, requests.cpu, requests.memory` is set.
+  - If none exists, **this is not your issue.**
+
+#### Resolution
+
+- If the tests above confirm this issue, fix it as follows:
+  - Upgrading CDI to v1.12.0 or later resolves everything.
+  - If you want to keep the current CDI version, create a limitRange object specifying default resources that match the resourceQuota in **the namespace where the datavolume will be created**, as sketched below.
+  - [how to create a limitRange](https://kubernetes.io/docs/tasks/administer-cluster/manage-resources/memory-default-namespace/)
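+
+A minimal LimitRange sketch, assuming a hypothetical namespace `my-ns`; the default values are placeholders to size for your workload:
+
+``` yaml
+# Hypothetical defaults: containers (such as the importer pod) that do not
+# declare their own requests/limits inherit these, satisfying the resourceQuota.
+apiVersion: v1
+kind: LimitRange
+metadata:
+  name: default-limits
+  namespace: my-ns
+spec:
+  limits:
+    - type: Container
+      default:           # default limits.cpu / limits.memory
+        cpu: 500m
+        memory: 512Mi
+      defaultRequest:    # default requests.cpu / requests.memory
+        cpu: 250m
+        memory: 256Mi
+```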
+
+----------
+## Issue (3)
+> Cause: namespace, resourcequota
+
+> !importer pod
+> !v1.11
+> !resourceQuota !resourcequota !quota
+> !exceeded !exceeded quota
+> !pvcbound
+
+#### Situation
+
+- **The importer pod is never created after a dv creation request**:
+  - `kubectl get pod` in the namespace shows no importer pod, and the cdi-deployment pod's log contains an error message of the following form:
+
+```
+import-controller.go:297] error processing pvc "hpcd-ccf03101/hpcd-d03451e2": scratch PVC API create errored: persistentvolumeclaims "hpcd-d03451e2-scratch" is forbidden: exceeded quota: hpcd-ccf03101-quota, requested: requests.storage=20Gi, used: requests.storage=82Gi, limited: requests.storage=100Gi
+```
+
+#### Tests
+
+- Check whether a resourceQuota object exists in the namespace where the dv was requested, and how much of it is currently used.
+  - `kubectl -n {$Namespace} describe resourceQuota`
+
+#### Resolution
+
+- If the test above confirms this issue, fix it as follows:
+  - The namespace has simply run out of usable resourceQuota, so increase it with `kubectl edit`.
+  - **Note that, because of how the CDI module works, this error can occur even when the remaining quota looks sufficient.**
+    - e.g. the same issue occurs with requested: 30Gi, used: 60Gi, limited: 100Gi.
+    - For the CDI module to work normally, the remaining storage quota must be at least **twice** the requested size.
+    - This is because CDI temporarily creates one more PVC of the same size for the data import and deletes it once the import completes.
+    - So a requested size of 30Gi needs 60Gi of headroom, but only 40Gi remains, hence the error.
+
+----------
+
+## Issue (4)
+> Cause: image, registry
+
+> !importer pod
+> !crashloopbackoff
+> !namespace
+> !docker
+> !registry
+> !insecure registry
+> !configmap
+> !networkpolicy
+
+#### Situation
+
+- The dv's source url points to a registry rather than http, and **after the dv creation request the importer pod is created in the namespace but keeps restarting, with its Status cycling between Error and CrashLoopBackOff.**
+
+#### Tests
+
+- Collect and save the logs of the cdi-deployment pod and the importer pod.
+- Create a sample pod and check whether the same problem occurs.
+  - 1) Create a busybox pod in the same namespace and on the same node as the importer pod.
+  - 2) Attach to the busybox pod with `kubectl exec -it` and check whether ping or curl to the registry works.
+  - 3) Create two busybox pods in different namespaces and on different nodes and check whether they can communicate with each other.
+- Check that the docker registry contains all of the CDI images at the exact versions.
+- Check that the image:tag specified in the dv request exists in the docker registry (verify with a curl GET).
+- Check that every node can reach the docker registry.
+  - Check whether the registry is registered as an insecure registry.
+  - Check that /etc/docker/daemon.json exists on all nodes.
+- Check that the registry has been added to the cdi configmap.
+
+#### Resolution
+
+- If the sample pod reproduces the problem, it is a network issue.
+  - Check things such as networkpolicy.
+- If the CDI images are missing from the docker registry, push them.
+- If the image:tag from the dv request is missing from the docker registry, push it.
+- If a specific node cannot reach the docker registry, check whether that node has the registry registered as insecure and add it to /etc/docker/daemon.json, as sketched after this list.
+- If the cdi configmap has no registry information, add it. ([changing the cdi configmap](./cdi.md))
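+
+A minimal /etc/docker/daemon.json sketch; the registry address `192.168.0.200:5000` is only an example, and if the file already exists, merge the key into it instead of overwriting:
+
+``` shell
+# Register the private registry as insecure on the node (example address),
+# then restart the docker daemon so the setting takes effect.
+cat <<'EOF' | sudo tee /etc/docker/daemon.json
+{
+  "insecure-registries": ["192.168.0.200:5000"]
+}
+EOF
+sudo systemctl restart docker
+```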
+
+----------
+
+## Issue (5)
+> Cause: network, storage
+
+> !importer pod
+> !pending
+> !containercreating
+> !storageclass
+> !cdiconfig
+> !stuck
+
+#### Situation
+
+- **The importer pod is stuck in the Pending or ContainerCreating state after the dv creation request.**
+
+#### Tests
+
+- Check whether a storage class was specified in the dv request.
+  - If not, check with `kubectl get sc` that there is exactly one default sc.
+- Check that the storage class specified in the dv request is currently healthy.
+  - the provisioner pod is in Running status
+  - creating a PVC with that sc gets a PV dynamically provisioned (a test PVC is sketched after this issue)
+  - a pod mounting a PVC of that sc starts normally
+- Check that the storageClassName recorded in the cdiconfig's status.scratchSpaceStorageClass is currently usable.
+  - run `kubectl describe cdiconfig` and check `status.scratchSpaceStorageClass`
+- Check that the pv and pvc the dv needs are created normally.
+  - `kubectl get pv,pvc -n {$Namespace}`
+  - check that **two** PVCs, one starting with the dv name and one named dvName-scratch, are created and bound to PVs
+- Check that pod-to-pod communication works.
+  - create busybox pods on each node and check that pings between the pods succeed
+
+#### Resolution
+
+- If no storage class was specified in the dv request and there is no default sc, or more than one, pick one and make it the default.
+  - Set the default sc:
+    - `kubectl patch storageclass {$StorageClassName} -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'`
+  - Unset a default sc:
+    - `kubectl patch storageclass {$StorageClassName} -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"false"}}}'`
+- If the sc specified in the dv request is unhealthy, inspect that storageclass.
+- If the storageclass recorded in the cdiconfig is no longer in use, change it to another storageclass. ([changing the cdiconfig](./cdi.md))
+- If the pvc and pv starting with the dvName are not created normally, inspect the storageclass.
+- If pods cannot communicate with each other, inspect the network.
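+
+A minimal test PVC sketch for the dynamic-provisioning check above; the name, namespace, and size are placeholders:
+
+``` yaml
+# Hypothetical test claim: it should reach the Bound state shortly
+# if the StorageClass can dynamically provision volumes.
+apiVersion: v1
+kind: PersistentVolumeClaim
+metadata:
+  name: sc-test-pvc
+  namespace: default
+spec:
+  accessModes:
+    - ReadWriteOnce
+  resources:
+    requests:
+      storage: 1Gi
+  storageClassName: {$StorageClassName}
+```
+
+Apply it with `kubectl apply -f`, check the result with `kubectl get pvc -n default sc-test-pvc`, and delete it after the test.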
config","stacktrace":"kubevirt.io/containerized-data-importer/vendor/github.com/go-logr/zapr.(*zapLogger).Error\n\t/go/src/kubevirt.io/containerized-data-importer/vendor/github.com/go-logr/zapr/zapr.go:128\nkubevirt.io/containerized-data-importer/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/src/kubevirt.io/containerized-data-importer/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:217\nkubevirt.io/containerized-data-importer/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1\n\t/go/src/kubevirt.io/containerized-data-importer/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:158\nkubevirt.io/containerized-data-importer/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1\n\t/go/src/kubevirt.io/containerized-data-importer/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133\nkubevirt.io/containerized-data-importer/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/go/src/kubevirt.io/containerized-data-importer/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:134\nkubevirt.io/containerized-data-importer/vendor/k8s.io/apimachinery/pkg/util/wait.Until\n\t/go/src/kubevirt.io/containerized-data-importer/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88"} + +=> 중요 로그 : "request":"/cdi","error":"*v1beta1.CustomResourceDefinition /datavolumes.cdi.kubevirt.io missing last applied config" +``` + +#### 테스트 + +- log 에 적힌 crd 가 kubernetes cluster 에 존재하는지 확인 + - 위의 로그의 예에서는 `datavolumes.cdi.kubevirt.io` 를 확인합니다. + - `kubectl get crd | grep datavolumes.cdi.kubevirt.io` + - ``` + datavolumes.cdi.kubevirt.io {$과거 날짜} + ``` + - cdi 모듈 install 을 시도한 날짜가 아닌 {$과거 날짜} 가 조회된다면, 해당 k8s cluster 에 과거 cdi 모듈을 install 하였고, 삭제 과정에서 비정상적으로 삭제되었다는 것을 의미합니다. + - 존재하지 않는다면 **해당 이슈가 아닙니다.** + + +#### 해결방법 + +- 위의 테스트 결과 해당 이슈가 맞다면 해결하는 방법은 다음과 같습니다. + - `kubectl delete -f cdi-cr.yaml` 과 `kubectl delete -f cdi-operator.yaml` 을 통해 cdi 관련 resource 전체를 삭제합니다. + - 과거 delete 할 당시 미처 삭제되지 않은 cdi 관련 crd 를 `kubectl get crd` 를 통해 조회 후 삭제합니다. + - `kubectl apply` 를 통해 재설치하면 정상적으로 설치됩니다.