|
1 | 1 | --- |
2 | | -title: Secrets encryption |
| 2 | +title: Secrets encryption and PII redaction |
3 | 3 | description: Keep your secrets a secret |
4 | | -sidebar_position: 10 |
5 | 4 | --- |
6 | 5 |
|
7 | 6 | ## What's the risk? |
8 | 7 |
|
9 | | -As you interact with an AI coding assistant, sensitive data like passwords and |
10 | | -access tokens can be unintentionally exposed to third-party providers through |
11 | | -the code snippets and files you share as context. These secrets may become part |
12 | | -of the training data used to improve the AI model and potentially be exposed to |
13 | | -other users. |
| 8 | +As you interact with an AI coding assistant, sensitive data like passwords, |
| 9 | +access tokens, and even personally identifiable information (PII) can be |
| 10 | +unintentionally exposed to third-party providers through the code and files you |
| 11 | +share as context. Besides the privacy and regulatory implications of exposing |
| 12 | +this information, it may become part of the AI model's training data and |
| 13 | +potentially be exposed to future users. |
14 | 14 |
|
15 | 15 | ## How CodeGate helps |
16 | 16 |
|
17 | 17 | CodeGate helps you protect sensitive information from being accidentally exposed |
18 | 18 | to AI models and third-party AI provider systems by redacting detected secrets |
19 | | -from your prompts using encryption. |
| 19 | +and PII found in your prompts. |
20 | 20 |
|
21 | 21 | ## How it works |
22 | 22 |
|
23 | | -CodeGate automatically scans all prompts for secrets such as: |
| 23 | +CodeGate automatically scans all prompts for secrets and PII. This happens |
| 24 | +transparently without requiring a specific prompt. Without interrupting your |
| 25 | +development flow, CodeGate protects your data by encrypting secrets and |
| 26 | +anonymizing PII. These changes are made before the prompt is sent to the LLM and |
| 27 | +are restored when the result is returned to your machine. |
24 | 28 |
|
25 | | -- API keys and tokens |
26 | | -- Private keys and certificates |
27 | | -- Database credentials |
28 | | -- SSH keys |
29 | | -- Cloud provider credentials |
30 | | - |
31 | | -This scan happens transparently without requiring a specific prompt. |
| 29 | +When a secret or PII is detected, CodeGate adds a message to the LLM's output |
| 30 | +and an alert is recorded in the [dashboard](../how-to/dashboard.md) (PII alerts |
| 31 | +in the dashboard are coming soon). |
32 | 32 |
|
33 | 33 | :::info |
34 | 34 |
|
35 | 35 | Since CodeGate runs locally, your secrets never leave your system unprotected. |
36 | 36 |
|
37 | 37 | ::: |
38 | 38 |
|
39 | | -CodeGate transparently encrypts secrets before sending the prompt to the LLM. |
40 | | -This way, CodeGate protects your sensitive data without blocking your |
41 | | -development flow. This is performed on the fly using AES256-GCM encryption with |
42 | | -a temporary per-session key that is securely erased from memory after the |
43 | | -response is delivered to your plugin. |
44 | | - |
45 | 39 | ```mermaid |
46 | 40 | sequenceDiagram |
47 | 41 | participant Client as AI coding<br>assistant |
48 | 42 | participant CodeGate as CodeGate<br>(local) |
49 | 43 | participant LLM as AI model<br>(remote) |
50 | 44 |
|
51 | | - Client ->> CodeGate: Prompt with<br>plaintext secrets |
| 45 | + Client ->> CodeGate: Prompt with<br>plaintext secrets/PII |
52 | 46 | activate CodeGate |
53 | | - CodeGate ->> LLM: Prompt with<br>encrypted secrets |
| 47 | + CodeGate ->> LLM: Prompt with<br>redacted secrets/PII |
54 | 48 | deactivate CodeGate |
55 | 49 | activate LLM |
56 | | - note right of LLM: LLM only sees<br>encrypted values |
57 | | - LLM -->> CodeGate: Response with<br>encrypted secrets |
| 50 | + note right of LLM: LLM only sees<br>redacted values |
| 51 | + LLM -->> CodeGate: Response with<br>redacted data |
58 | 52 | deactivate LLM |
59 | 53 | activate CodeGate |
60 | | - CodeGate -->> Client: Response with<br>plaintext secrets |
| 54 | + CodeGate -->> Client: Response with<br>original data |
61 | 55 | deactivate CodeGate |
62 | 56 | ``` |
| 57 | + |
| 58 | +### Secrets encryption |
| 59 | + |
| 60 | +CodeGate uses pattern matching to detect secrets such as: |
| 61 | + |
| 62 | +- API keys and tokens |
| 63 | +- Private keys and certificates |
| 64 | +- Database credentials |
| 65 | +- SSH keys |
| 66 | +- Cloud provider credentials |
| 67 | +- ...and more - see the |
| 68 | + [signatures file](https://github.com/stacklok/codegate/blob/main/signatures.yaml) |
| 69 | + in the project repo |
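
The pattern-matching step described above could be sketched roughly as follows. This is a minimal illustration, not CodeGate's implementation: the `SECRET_PATTERNS` table and the `find_secrets` helper are hypothetical names, and the three regular expressions are assumed examples of well-known token shapes rather than entries copied from the project's signatures file.

```python
import re

# Hypothetical, illustrative patterns only; NOT CodeGate's actual signatures.
SECRET_PATTERNS = {
    "AWS access key ID": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "GitHub personal access token": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
    "PEM private key header": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
}


def find_secrets(prompt: str) -> list[tuple[str, str]]:
    """Return (label, matched_text) pairs for every pattern hit in the prompt."""
    hits = []
    for label, pattern in SECRET_PATTERNS.items():
        for match in pattern.finditer(prompt):
            hits.append((label, match.group(0)))
    return hits
```

A caller would pass the outgoing prompt text to `find_secrets` and treat each returned match as a candidate for encryption before the prompt leaves the machine.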
| 70 | + |
| 71 | +CodeGate transparently encrypts secrets before sending the prompt to the LLM. |
| 72 | +This is performed on the fly using AES256-GCM encryption with a temporary |
| 73 | +per-session key. When the LLM returns a response, CodeGate decrypts the secret |
| 74 | +before delivering it to your coding assistant, then securely erases the |
| 75 | +temporary key from memory. |
| 76 | + |
| 77 | +### PII redaction |
| 78 | + |
| 79 | +CodeGate scans for common types of PII such as: |
| 80 | + |
| 81 | +- Email addresses |
| 82 | +- Phone numbers |
| 83 | +- Government identification numbers |
| 84 | +- Credit card numbers |
| 85 | +- Bank accounts and crypto wallet IDs |
| 86 | + |
| 87 | +CodeGate anonymizes PII by replacing each string with a unique identifier before |
| 88 | +sending the prompt to the LLM. This way, CodeGate protects your sensitive data |
| 89 | +without blocking your development flow. When the LLM returns a response, |
| 90 | +CodeGate looks up each identifier and replaces it with the original value. |
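
The anonymize-and-restore round trip described for PII might look something like the sketch below. It is an assumption-laden illustration, not CodeGate's code: `EMAIL_PATTERN`, `anonymize`, `restore`, and the `<pii-...>` placeholder format are all hypothetical, and only email addresses are handled to keep the example short.

```python
import re
import uuid

# Hypothetical, simplified email pattern; real PII detection covers more types.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def anonymize(prompt: str) -> tuple[str, dict[str, str]]:
    """Replace each email with a unique placeholder and return the mapping."""
    mapping: dict[str, str] = {}  # placeholder -> original value
    seen: dict[str, str] = {}     # original value -> placeholder

    def _swap(match: re.Match) -> str:
        original = match.group(0)
        if original not in seen:
            placeholder = f"<pii-{uuid.uuid4()}>"
            seen[original] = placeholder
            mapping[placeholder] = original
        return seen[original]

    return EMAIL_PATTERN.sub(_swap, prompt), mapping


def restore(response: str, mapping: dict[str, str]) -> str:
    """Put the original values back wherever a placeholder appears."""
    for placeholder, original in mapping.items():
        response = response.replace(placeholder, original)
    return response
```

Reusing one placeholder per distinct value keeps repeated mentions consistent, so the model can still tell that two occurrences refer to the same item without seeing the value itself.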