This project is a crawler proxy built on GCP. It runs Squid, a forward proxy server, as a docker image on a VM group, and uses a GCP TCP proxy load balancer as the access entry point.
- build the Squid forward proxy docker image.
$ ./script/build_docker_push.sh
- create the GCS bucket for the terraform backend and add backend.conf
$ ./script/create_state_bucket.sh
backend.conf
bucket = "<tf backend gcs>"
prefix = "<terraform state prefix>"
- init terraform
$ terraform init -backend-config=backend.conf
- create terraform.tfvars with the values for your environment.
project_id = "<GCP PROJECT ID>"
service_account = "<SVC NAME>@<GCP PROJECT ID>.iam.gserviceaccount.com"
region = "us-central1"
target_size = 100
- apply terraform
$ terraform apply
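The two configuration files in the steps above share the same `key = value` layout, so they can be generated rather than typed by hand. The following is a minimal sketch under that assumption; `render_assignments` is a hypothetical helper, not part of this repository, and it only handles plain string, number, and boolean values.

```python
import json
from typing import Mapping, Union

Scalar = Union[str, int, float, bool]


def render_assignments(values: Mapping[str, Scalar]) -> str:
    """Render one `key = value` line per entry, in insertion order."""
    # json.dumps gives "quoted" strings, bare numbers, and true/false,
    # which matches the literal syntax used in backend.conf and terraform.tfvars.
    return "".join(f"{key} = {json.dumps(value)}\n" for key, value in values.items())


if __name__ == "__main__":
    # Placeholder values copied from the steps above; replace them with your own.
    tfvars = {
        "project_id": "<GCP PROJECT ID>",
        "service_account": "<SVC NAME>@<GCP PROJECT ID>.iam.gserviceaccount.com",
        "region": "us-central1",
        "target_size": 100,
    }
    print(render_assignments(tfvars), end="")
```

Redirecting the printed output into `terraform.tfvars` (or into `backend.conf` for the `bucket` and `prefix` pair) gives a file in the same shape as the hand-written ones.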
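import requests

# Send HTTPS requests through the proxy; the proxy endpoint itself is addressed over plain HTTP.
proxies = {
    'https': 'http://35.190.69.208:8085',  # your gcp tcp proxy address
}
res = requests.get('https://ifconfig.me/', proxies=proxies, timeout=10)
res.raise_for_status()  # fail loudly if the proxy or the target returns an error
print(res.text)  # ifconfig.me reports the IP address the request arrived from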
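Building on the request above, a quick way to see how the load balancer spreads traffic over the VM group is to repeat the call and collect the exit IPs that come back. This is a sketch, not a definitive check: `fetch_exit_ip` and `distinct_exit_ips` are hypothetical names, `https://ifconfig.me/ip` is assumed to return the caller's IP as plain text, and whether more than one IP shows up depends on how the VMs reach the internet, which this README does not specify. It uses the standard library's `urllib` so that it runs without extra packages.

```python
import urllib.request
from typing import Callable, Set

PROXY_URL = "http://35.190.69.208:8085"  # placeholder: your gcp tcp proxy address
IP_ECHO_URL = "https://ifconfig.me/ip"   # assumption: plain-text IP echo endpoint


def fetch_exit_ip(proxy_url: str = PROXY_URL, timeout: float = 10.0) -> str:
    """Make one proxied request and return the IP the target site saw."""
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    opener = urllib.request.build_opener(handler)
    with opener.open(IP_ECHO_URL, timeout=timeout) as res:
        return res.read().decode("utf-8").strip()


def distinct_exit_ips(fetch: Callable[[], str] = fetch_exit_ip,
                      attempts: int = 20) -> Set[str]:
    """Call fetch() `attempts` times and return the set of IPs observed."""
    return {fetch() for _ in range(attempts)}
```

Calling `distinct_exit_ips()` after `terraform apply` has finished returns the set of addresses seen over 20 requests; passing a different `fetch` callable makes it easy to try the function without touching the network.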
- https://harry-lin.blogspot.com/2019/05/docker-azuredockersquid-proxy.html
- https://github.com/sameersbn/docker-squid#configuration
- https://medium.com/google-cloud/squid-proxy-cluster-with-ssl-bump-on-google-cloud-7871ee257c27
- https://cloud.google.com/load-balancing/docs/tcp/setting-up-tcp#configuring_the_load_balancer