Commit 0b5fe5e

Merge pull request #230 from stfc/chatops_docs
Doc for ChatOps
2 parents 52b1925 + b680cbc commit 0b5fe5e

18 files changed, +481 -34 lines changed

.github/workflows/chatops.yaml

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
1-
name: ChatOps checks
1+
name: Linting
22

33
on:
44
push:

chatops_deployment/INSTALL.md

Whitespace-only changes.

chatops_deployment/README.md

Lines changed: 27 additions & 2 deletions
@@ -1,4 +1,29 @@
11
# ChatOps Deployment
22

3-
This project outlines the deployment of the Cloud ChatOps application found [here](https://github.com/stfc/cloud-docker-images/tree/master/cloud-chatops).
4-
The goal is to create an easily deployable and highly available infrastructure to run the Docker image on.
3+
![Linting](https://github.com/stfc/SCD-OpenStack-Utils/actions/workflows/chatops.yaml/badge.svg)
4+
5+
## Contents
6+
7+
- [About](#about)
8+
9+
### About
10+
11+
This project outlines the deployment of the Cloud ChatOps application
12+
found [here](https://github.com/stfc/cloud-docker-images/tree/master/cloud-chatops). The goal is to create an easily
13+
deployable and highly available infrastructure to run the Docker container on. We achieve this by using Terraform and
14+
Ansible to provision and configure a virtual machine the services will run on.
15+
16+
This includes:
17+
18+
- Load balanced application traffic
19+
- Infrastructure-wide service logging to a central location
20+
- Service monitoring with visual dashboards and alerting notifications
21+
- Multi-environment deployment (e.g. dev, staging, prod)
22+
23+
To get started with the deployment, see [INSTALL.md](docs/INSTALL.md).
24+
25+
For information about what services are deployed, see [SERVICES.md](docs/SERVICES.md)
26+
27+
To understand what the Terraform modules do, see [TERRAFORM.md](docs/TERRAFORM.md)
28+
29+
To know what and where variables are stored, see [VARIABLES.md](docs/VARIABLES.md)

chatops_deployment/ansible/roles/haproxy/tasks/certbot.yml

Lines changed: 8 additions & 8 deletions
@@ -35,26 +35,26 @@
3535
become: true
3636
ansible.builtin.stat:
3737
path: /etc/haproxy/{{ domain }}.crt
38-
register: certificate_file
38+
register: haproxy_certificate_file
3939

4040
- name: Generate the certificate for the first time
4141
become: true
4242
ansible.builtin.command: |
4343
certbot certonly --standalone --non-interactive --agree-tos --expand --domains \
4444
{{ domain }},chatops.{{ domain }},prometheus.{{ domain }},grafana.{{ domain }},alertmanager.{{ domain }},kibana.{{ domain }} \
4545
46-
register: generate_cert
47-
changed_when: generate_cert.rc == 0
48-
when: not certificate_file.stat.exists
46+
register: haproxy_generate_cert
47+
changed_when: haproxy_generate_cert.rc == 0
48+
when: not haproxy_certificate_file.stat.exists
4949

5050
- name: Copy certificate for the first time
5151
become: true
5252
ansible.builtin.command: |
5353
cat /etc/letsencrypt/live/{{ domain }}/privkey.pem \
5454
/etc/letsencrypt/live/{{ domain }}/fullchain.pem > /etc/haproxy/{{ domain }}.crt
55-
register: copy_cert
56-
changed_when: copy_cert.rc == 0
57-
when: not certificate_file.stat.exists
55+
register: haproxy_copy_cert
56+
changed_when: haproxy_copy_cert.rc != 0
57+
when: not haproxy_certificate_file.stat.exists
5858

5959
- name: Create a cron job for the renewal of certificates
6060
become: true
@@ -99,4 +99,4 @@
9999
ansible.builtin.systemd_service:
100100
state: restarted
101101
name: haproxy.service
102-
when: copy_cert.rc == 0
102+
when: haproxy_copy_cert.rc == 0

chatops_deployment/ansible/roles/haproxy/tasks/haproxy.yml

Lines changed: 2 additions & 2 deletions
@@ -19,11 +19,11 @@
1919
become: true
2020
ansible.builtin.stat:
2121
path: /etc/haproxy/{{ domain }}.crt
22-
register: certificate_file
22+
register: haproxy_certificate_file
2323

2424
- name: Make sure haproxy.service is running
2525
become: true
2626
ansible.builtin.systemd_service:
2727
state: restarted
2828
name: haproxy.service
29-
when: certificate_file.stat.exists
29+
when: haproxy_certificate_file.stat.exists

chatops_deployment/ansible/roles/ssh_known_hosts/tasks/main.yml

Lines changed: 16 additions & 16 deletions
@@ -18,50 +18,50 @@
1818
send "{{ bastion_key_passphrase }}\r"
1919
expect eof
2020
EOF
21-
register: _
22-
changed_when: _.rc == 0
21+
register: ssh_known_hosts_
22+
changed_when: ssh_known_hosts_.rc != 0
2323

2424
- name: Remove FIP known hosts
2525
ansible.builtin.command: 'ssh-keygen -R "{{ terraform_floating_ip }}"'
26-
register: _
27-
changed_when: _.rc == 0
26+
register: ssh_known_hosts_
27+
changed_when: ssh_known_hosts_.rc != 0
2828

2929
- name: Remove private VM known host entries
3030
ansible.builtin.command: "ssh-keygen -R {{ item }}"
3131
loop: "{{ groups['private'] }}"
32-
register: _
33-
changed_when: _.rc == 0
32+
register: ssh_known_hosts_
33+
changed_when: ssh_known_hosts_.rc != 0
3434

3535
- name: Add FIP fingerprint to known hosts
3636
ansible.builtin.command: 'ssh-keyscan "{{ terraform_floating_ip }}" >> ~/.ssh/known_hosts'
37-
register: _
38-
changed_when: _.rc == 0
37+
register: ssh_known_hosts_
38+
changed_when: ssh_known_hosts_.rc != 0
3939

4040
- name: Get private VM fingerprints and retrieve to local host
4141
delegate_to: "{{ terraform_floating_ip }}"
4242
block:
4343
- name: Add private VM fingerprints to known hosts on LB
4444
ansible.builtin.command: 'ssh-keyscan "{{ item }}" >> ~/.ssh/known_hosts'
4545
loop: "{{ groups['private'] }}"
46-
register: _
47-
changed_when: _.rc == 0
46+
register: ssh_known_hosts_
47+
changed_when: ssh_known_hosts_.rc != 0
4848

4949
- name: Retrieve known hosts from LB
5050
ansible.builtin.fetch:
5151
src: "~/.ssh/known_hosts"
5252
dest: "private_known_hosts.tmp"
5353
flat: true
54-
register: _
55-
changed_when: _.rc == 0
54+
register: ssh_known_hosts_
55+
changed_when: ssh_known_hosts_.rc != 0
5656

5757
- name: Append fetched known hosts to localhost
5858
ansible.builtin.command: "cat private_known_hosts.tmp >> ~/.ssh/known_hosts"
59-
register: _
60-
changed_when: _.rc == 0
59+
register: ssh_known_hosts_
60+
changed_when: ssh_known_hosts_.rc != 0
6161

6262
- name: Remove private_known_hosts.tmp
6363
ansible.builtin.file:
6464
path: "private_known_hosts.tmp"
6565
state: absent
66-
register: _
67-
changed_when: _.rc == 0
66+
register: ssh_known_hosts_
67+
changed_when: ssh_known_hosts_.rc != 0

chatops_deployment/ansible/roles/terraform/tasks/deploy.yml

Lines changed: 5 additions & 5 deletions
@@ -4,29 +4,29 @@
44
- name: Check clouds.yaml
55
ansible.builtin.stat:
66
path: "~/.config/openstack/clouds.yaml"
7-
register: clouds_yaml_state
7+
register: terraform_clouds_yaml_state
88

99
- name: Fail if clouds.yaml does not exist
1010
ansible.builtin.fail:
1111
msg: "Could not find a clouds.yaml in ~/.config/openstack/clouds.yaml"
12-
when: not clouds_yaml_state.stat.exists
12+
when: not terraform_clouds_yaml_state.stat.exists
1313

1414
- name: Check public and private keys
1515
block:
1616
# We can ignore this warning as this command doesn't change anything when it runs.
1717
- name: Check Bastion public key is valid # noqa: no-changed-when
1818
ansible.builtin.command: "ssh-keygen -l -f '../terraform/bastion-key.pub'"
1919
ignore_errors: true
20-
register: public_key_state
20+
register: terraform_public_key_state
2121

2222
# We can ignore this warning as this command doesn't change anything when it runs.
2323
- name: Check Bastion private key is valid # noqa: no-changed-when
2424
ansible.builtin.command: "ssh-keygen -l -f '../ansible/bastion-key'"
2525
ignore_errors: true
26-
register: private_key_state
26+
register: terraform_private_key_state
2727

2828
- name: Generate an SSH key pair and copy to directories
29-
when: public_key_state.rc != 0 or private_key_state.rc != 0
29+
when: terraform_public_key_state.rc != 0 or terraform_private_key_state.rc != 0
3030
block:
3131
- name: Generate key
3232
community.crypto.openssh_keypair:

chatops_deployment/docs/INSTALL.md

Lines changed: 181 additions & 0 deletions
@@ -0,0 +1,181 @@
1+
# Deployment
2+
3+
## Contents:
4+
5+
- [Quick Start](#quick-start)
6+
7+
## Quick Start:
8+
9+
- If you are deploying from scratch, start at [Setting up localhost](#setting-up-localhost)
10+
- If you already have the repository cloned, the vault password saved and the project's clouds.yaml, then start
11+
at [Deploy infrastructure](#deploy-infrastructure).
12+
- If you only need to make changes to an existing deployment then start
13+
at [Configure infrastructure](#configure-infrastructure)
14+
- To destroy all infrastructure, see [Destroy infrastructure](#destroy-infrastructure)
15+
16+
## OpenStack Project Requirements:
17+
18+
The project `Cloud-MicroServices` is already set up with all the prerequisites. The variables in this repository
19+
reference that project. If you are using a different project for a deployment not used by the Cloud Team you will
20+
require the following:
21+
22+
- A floating IP (e.g. 130.246.X.Y)
23+
- DNS records:
24+
- `<your-domain> CNAME host-130-246-X-Y.nubes.stfc.ac.uk`
25+
- **AND**
26+
- ```
27+
# EITHER
28+
*.<your-domain> CNAME host-130-246-X-Y.nubes.stfc.ac.uk
29+
# OR
30+
kibana.<your-domain>. CNAME host-130-246-X-Y.nubes.stfc.ac.uk.
31+
grafana.<your-domain>. CNAME host-130-246-X-Y.nubes.stfc.ac.uk.
32+
prometheus.<your-domain>. CNAME host-130-246-X-Y.nubes.stfc.ac.uk.
33+
alertmanager.<your-domain>. CNAME host-130-246-X-Y.nubes.stfc.ac.uk.
34+
chatops.<your-domain>. CNAME host-130-246-X-Y.nubes.stfc.ac.uk.
35+
```
36+
- Ports 80 and 443 open inbound from the internet
37+
- OpenStack Volume for the VM ~10GB
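
The hostnames in the DNS list above can be enumerated with a small shell function. This is a minimal sketch and not part of the repository: `required_hosts` is a hypothetical helper, and its subdomain list is assumed to match the one passed to certbot in the haproxy role (chatops, prometheus, grafana, alertmanager, kibana).

```shell
# Hypothetical helper (not in the repository): print every hostname that
# needs a DNS record for a given base domain.
required_hosts() {
    domain="$1"
    # The bare domain itself
    echo "$domain"
    # One record per service subdomain (assumed list, taken from the certbot task)
    for sub in chatops prometheus grafana alertmanager kibana; do
        echo "$sub.$domain"
    done
}
```

Each name could then be checked with a resolver, for example `required_hosts <your-domain> | xargs -n1 dig +short`, assuming `dig` is installed on the machine.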
38+
39+
### Deploying the Infrastructure:
40+
41+
You can run the deployment from any machine (including your local laptop).
42+
However, we suggest you make a dedicated "seed VM" in OpenStack as the
43+
deployment will create files such as SSL certificates and SSH keys which you
44+
will need to keep for further maintenance.
45+
46+
Machine requirements:
47+
48+
- Python3
49+
- Snap (to install Terraform)
50+
- Pip or equivalent (to install Ansible)
51+
52+
#### Setting up localhost:
53+
54+
1. Install Ansible and collections
55+
```shell
56+
# Install venv and Ansible
57+
apt install python3-venv ansible
58+
59+
# Create a virtual environment
60+
python3 -m venv venv
61+
source venv/bin/activate
62+
63+
# Install collections using Ansible Galaxy
64+
ansible-galaxy install -r requirements.yml
65+
66+
# Install dependencies
67+
pip install -r requirements.yml
68+
```
69+
70+
2. Create a vault password file to avoid repeated inputs
71+
```shell
72+
# Either
73+
74+
echo "chatops_vault_password" > ~/.chatops_vault_pass
75+
76+
# or
77+
78+
vim ~/.chatops_vault_pass # and enter the vault password as plain text
79+
```
80+
81+
3. Change permissions and attributes to protect the file
82+
```shell
83+
chmod 400 ~/.chatops_vault_pass
84+
chattr +i ~/.chatops_vault_pass
85+
```
86+
87+
4. Copy the project's clouds.yaml to `~/.config/openstack/clouds.yaml`
88+
```shell
89+
cp <path-to>/clouds.yaml ~/.config/openstack/clouds.yaml
90+
```
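
Before running any playbook, the vault password file from steps 2 and 3 can be sanity-checked. The sketch below is an illustration only: `check_vault_file` is a hypothetical helper, and it assumes GNU `stat` (the `-c '%a'` form), which is what a Debian/Ubuntu machine using `apt` would normally provide.

```shell
# Hypothetical helper (not in the repository): confirm the vault password
# file exists and is read-only for its owner (mode 400).
check_vault_file() {
    file="$1"
    if [ ! -f "$file" ]; then
        echo "vault password file not found: $file" >&2
        return 1
    fi
    # GNU stat prints the octal permission bits, e.g. 400
    perms=$(stat -c '%a' "$file")
    if [ "$perms" != "400" ]; then
        echo "expected mode 400 on $file, found $perms" >&2
        return 2
    fi
    return 0
}
```

A typical call would be `check_vault_file ~/.chatops_vault_pass`; a non-zero return code means the file is missing (1) or has different permissions (2).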
91+
92+
#### Deploy infrastructure:
93+
94+
You can deploy both development and production environments on the same machine but not at the same time.
95+
96+
1. Clone this repository
97+
```shell
98+
git clone https://github.com/stfc/SCD-OpenStack-Utils
99+
```
100+
101+
2. Change into the `ansible` directory
102+
```shell
103+
cd SCD-OpenStack-Utils/chatops_deployment/ansible
104+
```
105+
106+
3. Deploy the infrastructure, using `-i` to specify which inventory to use (dev or prod)
107+
```shell
108+
ansible-playbook deploy.yml --vault-password-file=~/.chatops_vault_pass -i <environment>
109+
```
110+
111+
#### Configure infrastructure
112+
113+
1. Configure the VMs. This step will take ~15 minutes
114+
```shell
115+
ansible-playbook configure.yml --vault-password-file=~/.chatops_vault_pass -i <environment>
116+
```
117+
118+
#### Destroy infrastructure
119+
120+
To destroy the infrastructure and all locally generated files run the destroy playbook.
121+
122+
1. Destroy the infrastructure and locally generated files
123+
```shell
124+
ansible-playbook destroy.yml --vault-password-file=~/.chatops_vault_pass -i <environment>
125+
```
126+
127+
## Debugging:
128+
129+
### Terraform
130+
131+
To debug the Terraform deployment, it is best to use Terraform directly rather than through Ansible.
132+
When you run the deploy.yml playbook, a `terraform.tfvars` file is created which allows you to run the Terraform modules
133+
separately from Ansible.
134+
135+
1. Ensure you have run deploy.yml at least once to generate the variables file `terraform.tfvars`
136+
137+
2. Change to the terraform directory
138+
```shell
139+
# Assuming you are in the ansible directory
140+
cd ../terraform
141+
```
142+
143+
3. Check and change Terraform workspace. Terraform separates environments into workspaces. Make sure you are using the
144+
correct workspace before making changes.
145+
```shell
146+
# List all workspaces. You should see at most "default, dev, prod"
147+
terraform workspace list
148+
149+
# Select the workspace you want to affect
150+
terraform workspace select <environment>
151+
```
152+
153+
4. Now you can make changes to the deployment. It is advisable you only use the Terraform commands directly if there is
154+
something very wrong. The Ansible playbooks should be the first choice.
155+
```shell
156+
# For example, plan and apply changes
157+
terraform plan -out plan
158+
terraform apply plan
159+
160+
# Refresh the state to check API connections
161+
terraform refresh
162+
163+
# Validate the config
164+
terraform validate
165+
```
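
The workspace check in step 3 can be wrapped in a guard so that a plan or apply never runs against the wrong environment. This is a sketch under assumptions: `ensure_workspace` is a hypothetical function, and it assumes the only valid environments are `dev` and `prod`, as in the workspace list above. It relies only on the `terraform workspace show` and `terraform workspace select` subcommands.

```shell
# Hypothetical guard (not in the repository): switch to the requested
# Terraform workspace, refusing names other than dev or prod.
ensure_workspace() {
    target="$1"
    case "$target" in
        dev|prod) ;;
        *)
            echo "unknown environment: $target (expected dev or prod)" >&2
            return 1
            ;;
    esac
    # Ask Terraform which workspace is currently selected
    current=$(terraform workspace show) || return 2
    if [ "$current" != "$target" ]; then
        terraform workspace select "$target" || return 2
    fi
}
```

It could be chained in front of the commands from step 4, for example `ensure_workspace dev && terraform plan -out plan`.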
166+
167+
### Ansible
168+
169+
Each role in the Ansible playbook is tagged in its play. This enables you to run only parts of the playbooks. This is
170+
important as it takes ~15 minutes to run the entire playbook. So, when you only want to make changes to certain parts
171+
of the deployment you can use `--tags <some-tag>` to run only that part of the play.
172+
173+
For example, if you change the Prometheus config file template you can just run the playbook with the **prometheus** tag.
175+
```shell
176+
ansible-playbook configure.yml --vault-password-file=~/.chatops_vault_pass -i dev --tags prometheus
177+
```
178+
179+
It is not recommended to use tags when making changes to the production deployment. As changes are promoted to
180+
production the entire playbook should be run. This avoids any changes being missed out and ensures the entire deployment
181+
is running the latest configuration.
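
The recommendation above could be enforced with a thin wrapper around the configure playbook. The following is a sketch only: `run_configure` and the `ANSIBLE_PLAYBOOK` override variable are hypothetical names introduced for illustration, and the wrapper assumes the vault password file lives at `~/.chatops_vault_pass` as set up earlier.

```shell
# Hypothetical wrapper (not in the repository): run configure.yml, but
# refuse to combine --tags with the prod inventory.
run_configure() {
    env_name="$1"
    tags="${2:-}"
    # ANSIBLE_PLAYBOOK is an assumed override hook, useful for dry runs
    playbook_cmd="${ANSIBLE_PLAYBOOK:-ansible-playbook}"

    if [ "$env_name" = "prod" ] && [ -n "$tags" ]; then
        echo "refusing to run a tagged play against prod; run the whole playbook" >&2
        return 1
    fi

    if [ -n "$tags" ]; then
        "$playbook_cmd" configure.yml --vault-password-file="$HOME/.chatops_vault_pass" -i "$env_name" --tags "$tags"
    else
        "$playbook_cmd" configure.yml --vault-password-file="$HOME/.chatops_vault_pass" -i "$env_name"
    fi
}
```

For example, `run_configure dev prometheus` runs only the prometheus-tagged part against dev, while `run_configure prod` always runs the full playbook.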
