This repository provides Terraform configurations to automate the deployment of CANedge data processing infrastructure on Google Cloud Platform. The deployment is split into three parts:
- Input Bucket Deployment: Creates an input bucket for storing uploaded CANedge log files
- MF4-to-Parquet Deployment: Creates an output bucket and Cloud Function for DBC decoding MF4 files to Parquet
- BigQuery Deployment: Creates BigQuery resources for querying Parquet data
- Log in to the Google Cloud Console
- Select your project from the dropdown (top left)
- Click on the Cloud Shell icon (>_) to open Cloud Shell (top right)
- Once Cloud Shell is open, run the following command to clone this repository:
```bash
cd ~ && rm -rf canedge-google-cloud-terraform && git clone https://github.com/CSS-Electronics/canedge-google-cloud-terraform.git && cd canedge-google-cloud-terraform
```
If you're just getting started, first deploy the input bucket where your CANedge devices will upload MF4 files:
```bash
chmod +x deploy_input_bucket.sh && ./deploy_input_bucket.sh --project YOUR_PROJECT_ID --region YOUR_REGION --bucket YOUR_BUCKET_NAME
```
Replace:
- `YOUR_PROJECT_ID` with your active Google Cloud project ID (e.g. `bigquery7-464008`)
- `YOUR_REGION` with your desired region (e.g. `europe-west1` - see this link for available regions)
- `YOUR_BUCKET_NAME` with your desired bucket name (e.g. `canedge-test-bucket-20`)
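For example, using the sample values above (substitute your own project ID, region, and bucket name):

```bash
# Example invocation with the sample values shown above
chmod +x deploy_input_bucket.sh && ./deploy_input_bucket.sh \
  --project bigquery7-464008 \
  --region europe-west1 \
  --bucket canedge-test-bucket-20
```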
Once you have an input bucket set up, you can optionally deploy the processing pipeline to automatically DBC decode uploaded MF4 files to Parquet format and provide backlog/aggregation processing capabilities:
```bash
chmod +x deploy_mdftoparquet.sh && ./deploy_mdftoparquet.sh --project YOUR_PROJECT_ID --bucket YOUR_INPUT_BUCKET_NAME --id YOUR_UNIQUE_ID --email YOUR_EMAIL
```
Replace:
- `YOUR_PROJECT_ID` with your Google Cloud project ID
- `YOUR_INPUT_BUCKET_NAME` with your input bucket name
- `YOUR_UNIQUE_ID` with a short unique identifier (e.g. `datalake1`)
- `YOUR_EMAIL` with your email address to receive notifications
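For example (the email address below is a placeholder; the other values reuse the samples from the input bucket deployment):

```bash
# Example invocation - substitute your own project, bucket, ID and email
chmod +x deploy_mdftoparquet.sh && ./deploy_mdftoparquet.sh \
  --project bigquery7-464008 \
  --bucket canedge-test-bucket-20 \
  --id datalake1 \
  --email user@example.com
```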
Optional parameters:
- `--zip YOUR_FUNCTION_ZIP`: Override the default main function ZIP file
- `--zip-backlog YOUR_BACKLOG_FUNCTION_ZIP`: Override the default backlog function ZIP
- `--zip-aggregation YOUR_AGGREGATION_FUNCTION_ZIP`: Override the default aggregation function ZIP

Download the ZIP files from the CANedge Intro (Process/MF4 decoders/Parquet data lake/Google).
Note: Make sure to upload all the ZIP files to your input bucket root before deployment, e.g. as shown below.
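If the ZIPs are available in your Cloud Shell session, one way to upload them is with gsutil; the ZIP file names below are hypothetical placeholders for the files downloaded from the CANedge Intro:

```bash
# Upload the three function ZIPs to the input bucket root
# (file names are hypothetical placeholders - use the actual downloaded names)
gsutil cp main-function.zip backlog-function.zip aggregation-function.zip \
  gs://canedge-test-bucket-20/
```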
Important: If the deployment fails with a message regarding Eventarc propagation delay, simply re-run the deployment after a few minutes to complete it.
After setting up the MF4-to-Parquet pipeline, you can deploy BigQuery to query your Parquet data lake:
```bash
chmod +x deploy_bigquery.sh && ./deploy_bigquery.sh --project YOUR_PROJECT_ID --bucket YOUR_INPUT_BUCKET_NAME --id YOUR_UNIQUE_ID --dataset YOUR_DATASET_ID
```
Replace:
- `YOUR_PROJECT_ID` with your Google Cloud project ID
- `YOUR_INPUT_BUCKET_NAME` with your input bucket name
- `YOUR_UNIQUE_ID` with a short unique identifier (e.g. `datalake1`)
- `YOUR_DATASET_ID` with your BigQuery dataset ID (e.g. `canedge_data`)
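For example, reusing the sample values from the previous deployments:

```bash
# Example invocation - substitute your own project, bucket, ID and dataset
chmod +x deploy_bigquery.sh && ./deploy_bigquery.sh \
  --project bigquery7-464008 \
  --bucket canedge-test-bucket-20 \
  --id datalake1 \
  --dataset canedge_data
```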
Optional parameters:
- `--zip YOUR_FUNCTION_ZIP`: Override the default BigQuery function ZIP file

Download the ZIP from the CANedge Intro (Process/MF4 decoders/Parquet data lake/Google).
Note: Make sure to upload the ZIP to your input bucket root before deployment, e.g. as shown below.
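As with the pipeline ZIPs, the upload can be done with gsutil, and after deployment the dataset can be inspected with the bq CLI; the ZIP file name below is a hypothetical placeholder:

```bash
# Upload the BigQuery function ZIP to the input bucket root
# (file name is a hypothetical placeholder - use the actual downloaded name)
gsutil cp bigquery-function.zip gs://canedge-test-bucket-20/

# After deployment, list the tables mapped into the dataset
bq ls --project_id=bigquery7-464008 canedge_data
```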
If you encounter issues with either deployment:
- When deploying the MF4-to-Parquet pipeline for the first time in a Google project, the deployment may fail due to propagation delay on Eventarc permissions - in this case, simply re-run the deployment after a few minutes
- Make sure you have proper permissions in your Google Cloud project
- Use unique identifiers with the `--id` parameter to avoid resource conflicts
- Check the Google Cloud Console logs for detailed error messages (or read them from Cloud Shell as shown after this list)
- For the MF4-to-Parquet and BigQuery deployments, ensure the relevant function ZIP files are uploaded to your input bucket before deployment
- Contact us if you need deployment support
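For example, logs for a deployed Cloud Function can be read directly from Cloud Shell; the function name and region below are placeholders:

```bash
# Read the 50 most recent log entries for a deployed Cloud Function
# (function name and region are placeholders)
gcloud functions logs read YOUR_FUNCTION_NAME --region=europe-west1 --limit=50
```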
Repository structure:
- `input_bucket/` - Terraform configuration for input bucket deployment
- `mdftoparquet/` - Terraform configuration for MF4-to-Parquet pipeline deployment
  - `modules/` - Terraform modules specific to the MF4-to-Parquet pipeline
    - `output_bucket/` - Module for creating the output bucket
    - `iam/` - Module for setting up IAM permissions
    - `cloud_function/` - Module for deploying the main Cloud Function
    - `cloud_function_backlog/` - Module for deploying the Backlog Cloud Function
    - `cloud_function_aggregation/` - Module for deploying the Aggregation Cloud Function
    - `cloud_scheduler_backlog/` - Module for the Backlog Cloud Scheduler (paused, manual trigger)
    - `cloud_scheduler_aggregation/` - Module for the Aggregation Cloud Scheduler
    - `monitoring/` - Module for setting up monitoring configurations
- `bigquery/` - Terraform configuration for BigQuery deployment
  - `modules/` - Terraform modules specific to the BigQuery deployment
    - `dataset/` - Module for creating the BigQuery dataset
    - `service_accounts/` - Module for setting up service accounts
    - `cloud_function/` - Module for deploying the BigQuery mapping function
    - `cloud_scheduler_map_tables/` - Module for the BigQuery Map Tables Cloud Scheduler (paused, manual trigger)
- `bigquery-function/` - Source code for BigQuery table mapping function
- `deploy_input_bucket.sh` - Script for input bucket deployment
- `deploy_mdftoparquet.sh` - Script for MF4-to-Parquet pipeline deployment
- `deploy_bigquery.sh` - Script for BigQuery deployment