Commit a82d0b3

Merge pull request #2763 from rraws/bedrock-batch-infernece

New Serverless Pattern - S3 to Bedrock Batch Inference

File tree: 10 files changed, +610 −0 lines
s3-bedrock-batch-inference-cdk/README.md

Lines changed: 103 additions & 0 deletions
# Amazon S3 to Amazon Bedrock Batch Inference using an Amazon EventBridge Rule

This pattern demonstrates how to trigger an Amazon Bedrock batch inference job when the job's input file is uploaded to S3. The pattern is implemented using AWS CDK.

Learn more about this pattern at [Serverless Land Patterns](https://serverlessland.com/patterns/s3-eventbridge-bedrock-batch-cdk).

Important: this application uses various AWS services and there are costs associated with these services after the Free Tier usage - please see the AWS Pricing page for details. You are responsible for any AWS costs incurred. No warranty is implied in this example.

## Prerequisites

- [Create an AWS account](https://portal.aws.amazon.com/gp/aws/developer/registration/index.html) if you do not already have one and log in. The IAM user that you use must have sufficient permissions to make the necessary AWS service calls and manage AWS resources.
- **[IMPORTANT]** This pattern uses an example input that is based on the [Messages API format](https://docs.aws.amazon.com/bedrock/latest/userguide/batch-inference-data.html#batch-inference-data-ex-text) of Anthropic Claude (a sample record is sketched after this list). Only specific models support batch inference in specific regions. Check the [supported regions and models](https://docs.aws.amazon.com/bedrock/latest/userguide/batch-inference-supported.html) section of the batch inference documentation and make sure you have access to the [Claude model](https://docs.aws.amazon.com/bedrock/latest/userguide/model-access-modify.html) that you want to use.
- [AWS CLI](https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html) installed and configured
- [Git installed](https://git-scm.com/book/en/v2/Getting-Started-Installing-Git)
- [AWS CDK](https://docs.aws.amazon.com/cdk/latest/guide/cli.html) installed and configured
- [Python 3.13](https://www.python.org/downloads/) installed
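For reference, here is a minimal sketch of one line of a Messages API batch input file. The record ID, prompt, and token limit are illustrative placeholders, not values taken from this pattern's `model_input/input.jsonl`:

```python
import json

# Hypothetical example of one batch inference record in the
# Anthropic Claude Messages API format (one JSON object per line).
record = {
    "recordId": "RECORD-001",  # placeholder record ID
    "modelInput": {
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": 1024,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": "Summarize serverless computing in one sentence."}
                ],
            }
        ],
    },
}

# Append the record to the input file as a single JSONL line.
with open("input.jsonl", "a") as f:
    f.write(json.dumps(record) + "\n")
```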
## Deployment Instructions

1. Create a new directory, navigate to that directory in a terminal and clone the GitHub repository:

    ```shell
    git clone https://github.com/aws-samples/serverless-patterns
    ```

2. Change directory to the pattern directory:

    ```shell
    cd serverless-patterns/s3-bedrock-batch-inference-cdk
    ```

3. Create and activate a Python virtual environment:

    ```shell
    python3 -m venv .venv
    source .venv/bin/activate
    ```

4. Install the required dependencies:

    ```shell
    pip3 install -r requirements.txt
    ```

5. Set your AWS region (replace `us-west-2` with your desired region):

    ```shell
    export AWS_REGION=us-west-2
    ```

6. Deploy the stack by running the command below. Replace `ModelARN` with the ARN of the model you want to use. For example, if you want to use **Claude 3.5 Sonnet** in **us-west-2**, the ARN would be `arn:aws:bedrock:us-west-2::foundation-model/anthropic.claude-3-5-sonnet-20240620-v1:0`.

    **Note:** You can find the complete list of foundation models that support batch inference in the [Amazon Bedrock documentation](https://docs.aws.amazon.com/bedrock/latest/userguide/batch-inference-supported.html). The ARN format is `arn:aws:bedrock:<region>::foundation-model/<model-id>` (a sketch for assembling it programmatically follows these steps).

    ```shell
    cdk deploy --parameters ModelARN=<ARN of the model>
    ```

    **Note:** You will be prompted to confirm the deployment. Type `y` and press Enter to proceed.
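As referenced in the note above, the model ARN can be assembled from a region and model ID. A minimal sketch; both values below are examples, so substitute your own:

```python
# Assemble a foundation model ARN from a region and model ID.
region = "us-west-2"
model_id = "anthropic.claude-3-5-sonnet-20240620-v1:0"  # example model ID

model_arn = f"arn:aws:bedrock:{region}::foundation-model/{model_id}"
print(model_arn)
# arn:aws:bedrock:us-west-2::foundation-model/anthropic.claude-3-5-sonnet-20240620-v1:0
```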
## How it works

![End to End Architecture](images/architecture.png)

This pattern creates an S3 bucket to store the input and output of the batch inference job. It also creates an EventBridge rule that is triggered when a batch inference input file is uploaded.

When the EventBridge rule is triggered, it uses an AWS-managed Lambda function (created automatically by the `AwsApi` target) to call the Bedrock `createModelInvocationJob` API. This Lambda function creates a new Bedrock batch inference job with the specified model and the uploaded input file. The output of the batch inference job is stored in the same S3 bucket.

**Architecture Flow:**

1. Input file (`input.jsonl`) is uploaded to the S3 bucket under the `input/` prefix
2. S3 sends an event to EventBridge when the object is created
3. EventBridge rule matches the event and triggers an AWS-managed Lambda function
4. The Lambda function calls the Bedrock `createModelInvocationJob` API (a boto3 sketch of this call follows the list)
5. Bedrock processes the batch inference job and stores results in S3 under the `output/` prefix
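To make step 4 concrete, here is a rough boto3 equivalent of the call the managed Lambda function makes. The bucket name and role ARN are placeholders for the resources the stack creates, and the model ARN is the earlier example:

```python
import time

import boto3

bedrock = boto3.client("bedrock", region_name="us-west-2")

# Placeholders: substitute the bucket and IAM role created by the stack.
bucket = "<S3 bucket name>"
role_arn = "<Bedrock batch inference role ARN>"

# Start a batch inference job reading from input/ and writing to output/.
response = bedrock.create_model_invocation_job(
    jobName=f"batch-inference-job-{int(time.time())}",
    modelId="arn:aws:bedrock:us-west-2::foundation-model/anthropic.claude-3-5-sonnet-20240620-v1:0",
    roleArn=role_arn,
    inputDataConfig={"s3InputDataConfig": {"s3Uri": f"s3://{bucket}/input/input.jsonl"}},
    outputDataConfig={"s3OutputDataConfig": {"s3Uri": f"s3://{bucket}/output/"}},
)
print(response["jobArn"])
```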
## Testing

- Once the pattern is deployed successfully, you should see the name of the S3 bucket in the CDK output.
- Upload the sample `input.jsonl` to this bucket by running this command:

    ```shell
    aws s3 cp model_input/input.jsonl s3://<S3 bucket name>/input/
    ```

- The upload will trigger the batch inference job. You can check the status of the job on the [AWS Console](https://console.aws.amazon.com/bedrock/home?#/batch-inference) or, alternatively, you can use this command:

    ```shell
    aws bedrock list-model-invocation-jobs | jq '.invocationJobSummaries[] | {jobArn, status, submitTime}'
    ```

- Once the batch inference job's status is "Completed" (this may take 30+ minutes), you can check the results by navigating to the [S3 bucket](https://console.aws.amazon.com/s3/buckets?&bucketType=general) and checking the `output/` prefix.
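If you would rather check job status from Python than the console or the CLI command above, here is a minimal sketch, assuming boto3 and your default credentials and region are configured:

```python
import boto3

bedrock = boto3.client("bedrock")

# Print the ARN, status, and submit time of recent batch inference jobs,
# mirroring the jq filter used in the CLI example above.
for job in bedrock.list_model_invocation_jobs()["invocationJobSummaries"]:
    print(job["jobArn"], job["status"], job["submitTime"])
```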
## Cleanup

1. Empty the S3 bucket before deleting the stack (see the note after these steps if objects remain):

    ```bash
    aws s3 rm s3://<S3 bucket name> --recursive
    ```

2. Delete the stack by running the command below:

    ```bash
    cdk destroy
    ```
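Note that the stack creates the bucket with versioning enabled, and `aws s3 rm --recursive` removes only the current object versions. If older versions or delete markers remain, a sketch that purges them all; the bucket name is a placeholder:

```python
import boto3

s3 = boto3.resource("s3")
bucket = s3.Bucket("<S3 bucket name>")  # placeholder bucket name

# Delete every object version and delete marker in the versioned bucket.
bucket.object_versions.delete()
```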
---

Copyright 2025 Amazon.com, Inc. or its affiliates. All Rights Reserved.

SPDX-License-Identifier: MIT-0

s3-bedrock-batch-inference-cdk/app.py

Lines changed: 10 additions & 0 deletions

#!/usr/bin/env python3

import aws_cdk as cdk

from bedrock_batch_inference.bedrock_batch_inference_pattern_stack import BedrockBatchInferencePatternStack

app = cdk.App()
stack = BedrockBatchInferencePatternStack(app, "BedrockBatchInferencePatternStack")

app.synth()

s3-bedrock-batch-inference-cdk/bedrock_batch_inference/__init__.py

Whitespace-only changes.
s3-bedrock-batch-inference-cdk/bedrock_batch_inference/bedrock_batch_inference_pattern_stack.py

Lines changed: 108 additions & 0 deletions
#!/usr/bin/env python3
import time

from aws_cdk import (
    Stack,
    aws_s3 as s3,
    aws_events as events,
    aws_events_targets as targets,
    aws_iam as iam,
    CfnOutput, CfnParameter,
)
from aws_cdk.aws_iam import PolicyStatement
from constructs import Construct


class BedrockBatchInferencePatternStack(Stack):

    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        # Accept the model ARN as a parameter
        model_id = CfnParameter(self, "ModelARN",
                                type="String",
                                description="ARN of the Claude model that you want to use for batch inference.",
                                )

        # Create an S3 bucket with default encryption
        bucket = s3.Bucket(
            self,
            "BatchInfBucket",
            encryption=s3.BucketEncryption.S3_MANAGED,
            enforce_ssl=True,
            versioned=True,
            event_bridge_enabled=True,  # Enable EventBridge notifications
        )

        # Create IAM role for Bedrock
        bedrock_role = iam.Role(
            self,
            "BedrockBatchInferenceRole",
            assumed_by=iam.ServicePrincipal("bedrock.amazonaws.com")
        )

        # Add S3 permissions to the role
        bedrock_role.add_to_policy(iam.PolicyStatement(
            actions=["s3:GetObject", "s3:PutObject", "s3:ListBucket"],
            resources=[
                f"{bucket.bucket_arn}",
                f"{bucket.bucket_arn}/*",
            ]
        ))

        # Allow the EventBridge target to start the batch job and pass the Bedrock role
        statement = PolicyStatement(
            actions=["bedrock:CreateModelInvocationJob", "iam:PassRole"],
            resources=[f"arn:aws:bedrock:{Stack.of(self).region}:{Stack.of(self).account}:model-invocation-job/*",
                       model_id.value_as_string,
                       bedrock_role.role_arn],
            effect=iam.Effect.ALLOW
        )

        # Create EventBridge rule to trigger Bedrock model invocation when the input.jsonl file is
        # uploaded to S3 under the input/ prefix
        # Note: The AwsApi target creates an AWS-managed Lambda function that makes the actual API call
        events.Rule(
            self,
            "S3ToBedrockBatchInferenceRule",
            event_pattern=events.EventPattern(
                source=["aws.s3"],
                detail_type=["Object Created"],
                detail={
                    "bucket": {
                        "name": [bucket.bucket_name]
                    },
                    "object": {
                        "key": ["input/input.jsonl"]
                    }
                }
            ),
            targets=[targets.AwsApi(
                service="bedrock",
                action="createModelInvocationJob",
                parameters={
                    "modelId": model_id.value_as_string,
                    # Note: the timestamp is evaluated at synth time, not per upload
                    "jobName": f"batch-inference-job-{int(time.time())}",
                    "inputDataConfig": {
                        "s3InputDataConfig": {
                            "s3Uri": f"s3://{bucket.bucket_name}/input/input.jsonl"
                        }
                    },
                    "outputDataConfig": {
                        "s3OutputDataConfig": {
                            "s3Uri": f"s3://{bucket.bucket_name}/output/"
                        }
                    },
                    "roleArn": bedrock_role.role_arn
                },
                policy_statement=statement
            )]
        )

        # Output the bucket name
        CfnOutput(
            self,
            "BatchInferenceBucketName",
            value=bucket.bucket_name,
            description="S3 bucket to store the batch inference input and output"
        )
s3-bedrock-batch-inference-cdk/cdk.json

Lines changed: 79 additions & 0 deletions
{
  "app": "python3 app.py",
  "watch": {
    "include": [
      "**"
    ],
    "exclude": [
      "README.md",
      "cdk*.json",
      "requirements*.txt",
      "source.bat",
      "**/__init__.py",
      "**/__pycache__",
      "tests"
    ]
  },
  "context": {
    "@aws-cdk/aws-lambda:recognizeLayerVersion": true,
    "@aws-cdk/core:checkSecretUsage": true,
    "@aws-cdk/core:target-partitions": [
      "aws",
      "aws-cn"
    ],
    "@aws-cdk-containers/ecs-service-extensions:enableDefaultLogDriver": true,
    "@aws-cdk/aws-ec2:uniqueImdsv2TemplateName": true,
    "@aws-cdk/aws-ecs:arnFormatIncludesClusterName": true,
    "@aws-cdk/aws-iam:minimizePolicies": true,
    "@aws-cdk/core:validateSnapshotRemovalPolicy": true,
    "@aws-cdk/aws-codepipeline:crossAccountKeyAliasStackSafeResourceName": true,
    "@aws-cdk/aws-s3:createDefaultLoggingPolicy": true,
    "@aws-cdk/aws-sns-subscriptions:restrictSqsDescryption": true,
    "@aws-cdk/aws-apigateway:disableCloudWatchRole": true,
    "@aws-cdk/core:enablePartitionLiterals": true,
    "@aws-cdk/aws-events:eventsTargetQueueSameAccount": true,
    "@aws-cdk/aws-ecs:disableExplicitDeploymentControllerForCircuitBreaker": true,
    "@aws-cdk/aws-iam:importedRoleStackSafeDefaultPolicyName": true,
    "@aws-cdk/aws-s3:serverAccessLogsUseBucketPolicy": true,
    "@aws-cdk/aws-route53-patters:useCertificate": true,
    "@aws-cdk/customresources:installLatestAwsSdkDefault": false,
    "@aws-cdk/aws-rds:databaseProxyUniqueResourceName": true,
    "@aws-cdk/aws-codedeploy:removeAlarmsFromDeploymentGroup": true,
    "@aws-cdk/aws-apigateway:authorizerChangeDeploymentLogicalId": true,
    "@aws-cdk/aws-ec2:launchTemplateDefaultUserData": true,
    "@aws-cdk/aws-secretsmanager:useAttachedSecretResourcePolicyForSecretTargetAttachments": true,
    "@aws-cdk/aws-redshift:columnId": true,
    "@aws-cdk/aws-stepfunctions-tasks:enableEmrServicePolicyV2": true,
    "@aws-cdk/aws-ec2:restrictDefaultSecurityGroup": true,
    "@aws-cdk/aws-apigateway:requestValidatorUniqueId": true,
    "@aws-cdk/aws-kms:aliasNameRef": true,
    "@aws-cdk/aws-autoscaling:generateLaunchTemplateInsteadOfLaunchConfig": true,
    "@aws-cdk/core:includePrefixInUniqueNameGeneration": true,
    "@aws-cdk/aws-efs:denyAnonymousAccess": true,
    "@aws-cdk/aws-opensearchservice:enableOpensearchMultiAzWithStandby": true,
    "@aws-cdk/aws-lambda-nodejs:useLatestRuntimeVersion": true,
    "@aws-cdk/aws-efs:mountTargetOrderInsensitiveLogicalId": true,
    "@aws-cdk/aws-rds:auroraClusterChangeScopeOfInstanceParameterGroupWithEachParameters": true,
    "@aws-cdk/aws-appsync:useArnForSourceApiAssociationIdentifier": true,
    "@aws-cdk/aws-rds:preventRenderingDeprecatedCredentials": true,
    "@aws-cdk/aws-codepipeline-actions:useNewDefaultBranchForCodeCommitSource": true,
    "@aws-cdk/aws-cloudwatch-actions:changeLambdaPermissionLogicalIdForLambdaAction": true,
    "@aws-cdk/aws-codepipeline:crossAccountKeysDefaultValueToFalse": true,
    "@aws-cdk/aws-codepipeline:defaultPipelineTypeToV2": true,
    "@aws-cdk/aws-kms:reduceCrossAccountRegionPolicyScope": true,
    "@aws-cdk/aws-eks:nodegroupNameAttribute": true,
    "@aws-cdk/aws-ec2:ebsDefaultGp3Volume": true,
    "@aws-cdk/aws-ecs:removeDefaultDeploymentAlarm": true,
    "@aws-cdk/custom-resources:logApiResponseDataPropertyTrueDefault": false,
    "@aws-cdk/aws-s3:keepNotificationInImportedBucket": false,
    "@aws-cdk/aws-ecs:reduceEc2FargateCloudWatchPermissions": true,
    "@aws-cdk/aws-dynamodb:resourcePolicyPerReplica": true,
    "@aws-cdk/aws-ec2:ec2SumTImeoutEnabled": true,
    "@aws-cdk/aws-appsync:appSyncGraphQLAPIScopeLambdaPermission": true,
    "@aws-cdk/aws-rds:setCorrectValueForDatabaseInstanceReadReplicaInstanceResourceId": true,
    "@aws-cdk/core:cfnIncludeRejectComplexResourceUpdateCreatePolicyIntrinsics": true,
    "@aws-cdk/aws-lambda-nodejs:sdkV3ExcludeSmithyPackages": true,
    "@aws-cdk/aws-stepfunctions-tasks:fixRunEcsTaskPolicy": true,
    "@aws-cdk/aws-ec2:bastionHostUseAmazonLinux2023ByDefault": true
  }
}
Lines changed: 81 additions & 0 deletions
{
  "title": "Amazon S3 to Amazon Bedrock Batch Inference using an Amazon EventBridge Rule",
  "description": "This pattern demonstrates how to trigger an Amazon Bedrock batch inference job when the job's input file is uploaded to S3. The pattern is implemented using AWS CDK.",
  "language": "Python",
  "level": "200",
  "framework": "CDK",
  "introBox": {
    "headline": "How it works",
    "text": [
      "This pattern creates an S3 bucket to store the input and output of the batch inference job.",
      "It also creates an EventBridge rule that is triggered when a batch inference input file is uploaded.",
      "When the EventBridge rule is triggered, it uses an AWS-managed Lambda function (created automatically by the AwsApi target) to call the Bedrock createModelInvocationJob API.",
      "This Lambda function creates a new Bedrock batch inference job with the specified model and the uploaded input file.",
      "The output of the batch inference job is stored in the same S3 bucket."
    ]
  },
  "gitHub": {
    "template": {
      "repoURL": "https://github.com/aws-samples/serverless-patterns/tree/main/s3-bedrock-batch-inference-cdk",
      "templateURL": "serverless-patterns/s3-bedrock-batch-inference-cdk",
      "projectFolder": "s3-bedrock-batch-inference-cdk",
      "templateFile": "app.py"
    }
  },
  "resources": {
    "bullets": [
      {
        "text": "Amazon Bedrock Batch Inference Documentation",
        "link": "https://docs.aws.amazon.com/bedrock/latest/userguide/batch-inference.html"
      },
      {
        "text": "Supported Models and Regions for Batch Inference",
        "link": "https://docs.aws.amazon.com/bedrock/latest/userguide/batch-inference-supported.html"
      },
      {
        "text": "EventBridge Targets Documentation",
        "link": "https://docs.aws.amazon.com/eventbridge/latest/userguide/eb-targets.html"
      },
      {
        "text": "AWS CDK EventBridge Targets",
        "link": "https://docs.aws.amazon.com/cdk/api/v2/python/aws_cdk.aws_events_targets.html"
      }
    ]
  },
  "deploy": {
    "text": [
      "python3 -m venv .venv",
      "source .venv/bin/activate",
      "pip3 install -r requirements.txt",
      "export AWS_REGION=us-west-2",
      "cdk deploy --parameters ModelARN=<ARN of the model>"
    ]
  },
  "testing": {
    "text": [
      "Upload the sample input file: <code>aws s3 cp model_input/input.jsonl s3://{S3 bucket name}/input/</code>",
      "Check batch inference job status: <code>aws bedrock list-model-invocation-jobs | jq '.invocationJobSummaries[] | {jobArn, status, submitTime}'</code>",
      "Once completed (30+ minutes), check results in the S3 bucket under the 'output' prefix"
    ]
  },
  "cleanup": {
    "text": [
      "Empty the S3 bucket: <code>aws s3 rm s3://{S3 bucket name} --recursive</code>",
      "Delete the stack: <code>cdk destroy</code>"
    ]
  },
  "authors": [
    {
      "name": "Biswanath Mukherjee",
      "image": "https://serverlessland.com/assets/images/resources/contributors/biswanath-mukherjee.jpg",
      "bio": "I am a Sr. Solutions Architect working at AWS India.",
      "linkedin": "biswanathmukherjee"
    },
    {
      "name": "Rakshith Rao",
      "image": "https://serverlessland.com/assets/images/resources/contributors/rakshith-rao.png",
      "bio": "I am a Senior Solutions Architect at AWS and help our strategic customers build and operate their key workloads on AWS.",
      "linkedin": "rakshithrao"
    }
  ]
}
s3-bedrock-batch-inference-cdk/images/architecture.png (binary image, 21.6 KB)
