New Serverless Pattern - S3 to Bedrock Batch Inference #2763

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 11 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
103 changes: 103 additions & 0 deletions s3-bedrock-batch-inference-cdk/README.md
# Amazon S3 to Amazon Bedrock Batch Inference using an Amazon EventBridge Rule

This pattern demonstrates how to trigger an Amazon Bedrock batch inference job when the job's input file is uploaded to Amazon S3. The pattern is implemented using the AWS CDK.

Learn more about this pattern at [Serverless Land Patterns](https://serverlessland.com/patterns/s3-eventbridge-bedrock-batch-cdk)

Important: this application uses various AWS services and there are costs associated with these services after the Free Tier usage - please see the AWS Pricing page for details. You are responsible for any AWS costs incurred. No warranty is implied in this example.

## Prerequisites

- [Create an AWS account](https://portal.aws.amazon.com/gp/aws/developer/registration/index.html) if you do not already have one and log in. The IAM user that you use must have sufficient permissions to make necessary AWS service calls and manage AWS resources.
- **[IMPORTANT]** This pattern uses an example input [based on the Messages API format](https://docs.aws.amazon.com/bedrock/latest/userguide/batch-inference-data.html#batch-inference-data-ex-text) of Anthropic Claude; a sample record is shown after this list. Batch inference is supported only for specific models in specific Regions. Check the [supported Regions and models](https://docs.aws.amazon.com/bedrock/latest/userguide/batch-inference-supported.html) section of the batch inference documentation and make sure you have access to [the Claude model](https://docs.aws.amazon.com/bedrock/latest/userguide/model-access-modify.html) that you want to use.
- [AWS CLI](https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html) installed and configured
- [Git Installed](https://git-scm.com/book/en/v2/Getting-Started-Installing-Git)
- [AWS CDK](https://docs.aws.amazon.com/cdk/latest/guide/cli.html) installed and configured
- [Python 3.13](https://www.python.org/downloads/) installed
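
Each line of the batch input file (`model_input/input.jsonl`) follows the record format described in the documentation linked above. The record below is only an illustrative sketch with placeholder values; the file shipped with this pattern is the authoritative sample:

```json
{"recordId": "CALL0000001", "modelInput": {"anthropic_version": "bedrock-2023-05-31", "max_tokens": 1024, "messages": [{"role": "user", "content": [{"type": "text", "text": "Summarize serverless computing in one sentence."}]}]}}
```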

## Deployment Instructions

1. Create a new directory, navigate to that directory in a terminal and clone the GitHub repository:

```shell
git clone https://github.com/aws-samples/serverless-patterns
```
2. Change directory to the pattern directory:

```shell
cd serverless-patterns/s3-bedrock-batch-inference-cdk
```

3. Create and activate a Python virtual environment:

```shell
python3 -m venv .venv
source .venv/bin/activate
```

4. Install the required dependencies:

```shell
pip3 install -r requirements.txt
```

5. Set your AWS region (replace `us-west-2` with your desired region):

```shell
export AWS_REGION=us-west-2
```

6. Deploy the stack by running the command below. Replace `ModelARN` with the ARN of the model you want to use. For example, if you want to use **Claude 3.5 Sonnet** in **us-west-2**, the ARN would be `arn:aws:bedrock:us-west-2::foundation-model/anthropic.claude-3-5-sonnet-20240620-v1:0`.

**Note:** You can find the complete list of foundation models that support batch inference in the [Amazon Bedrock documentation](https://docs.aws.amazon.com/bedrock/latest/userguide/batch-inference-supported.html). The ARN format is: `arn:aws:bedrock:<region>::foundation-model/<model-id>`

```shell
cdk deploy --parameters ModelARN=<ARN of the model>
```

**Note:** You will be prompted to confirm the deployment. Type `y` and press Enter to proceed.

## How it works
![End to End Architecture](images/architecture.png)

This pattern creates an S3 bucket to store the input and output of the batch inference job. It also creates an EventBridge rule that is triggered when a batch inference input file is uploaded.
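
The rule uses an event pattern equivalent to the following (shown here for reference; the bucket name is a placeholder that the stack resolves at deployment time):

```json
{
  "source": ["aws.s3"],
  "detail-type": ["Object Created"],
  "detail": {
    "bucket": { "name": ["<S3 bucket name>"] },
    "object": { "key": ["input/input.jsonl"] }
  }
}
```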

When the EventBridge rule is triggered, it uses an AWS-managed Lambda function (created automatically by the `AwsApi` target) to call the Bedrock `createModelInvocationJob` API. This Lambda function creates a new Bedrock batch inference job with the specified model and the uploaded input file. The output of the batch inference job is stored in the same S3 bucket.
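
For reference, the call made by that managed function is roughly equivalent to the boto3 sketch below. The bucket name, role ARN, and model ARN are placeholders, not values hard-coded by this pattern:

```python
import time

import boto3

# Placeholder values for illustration; the deployed rule supplies the real ones.
BUCKET = "<S3 bucket name>"
ROLE_ARN = "<Bedrock batch inference role ARN>"
MODEL_ARN = "arn:aws:bedrock:us-west-2::foundation-model/anthropic.claude-3-5-sonnet-20240620-v1:0"

bedrock = boto3.client("bedrock")

# Submit a batch inference job that reads input.jsonl and writes results to the output/ prefix
response = bedrock.create_model_invocation_job(
    jobName=f"batch-inference-job-{int(time.time())}",
    modelId=MODEL_ARN,
    roleArn=ROLE_ARN,
    inputDataConfig={"s3InputDataConfig": {"s3Uri": f"s3://{BUCKET}/input/input.jsonl"}},
    outputDataConfig={"s3OutputDataConfig": {"s3Uri": f"s3://{BUCKET}/output/"}},
)
print(response["jobArn"])
```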

**Architecture Flow:**
1. The input file (`input.jsonl`) is uploaded to the S3 bucket under the `input/` prefix
2. S3 sends an event to EventBridge when the object is created
3. The EventBridge rule matches the event and triggers an AWS-managed Lambda function
4. The Lambda function calls the Bedrock `createModelInvocationJob` API
5. Bedrock processes the batch inference job and stores the results in the S3 bucket under the `output/` prefix

## Testing
- Once the pattern is deployed successfully, you should see the name of the S3 bucket in the CDK output.
- Upload the sample `input.jsonl` to this bucket by running this command:
```shell
aws s3 cp model_input/input.jsonl s3://<S3 bucket name>/input/
```
- The upload triggers the batch inference job. You can check the status of the job in the [AWS Console](https://console.aws.amazon.com/bedrock/home?#/batch-inference) or with this command:
```shell
aws bedrock list-model-invocation-jobs | jq '.invocationJobSummaries[] | {jobArn, status, submitTime}'
```
- Once the batch inference job's status is "Completed" (this may take 30+ minutes), you can check the results by navigating to the [S3 bucket](https://console.aws.amazon.com/s3/buckets?&bucketType=general) and checking the `output` prefix, or by downloading them as shown below.
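- To inspect the results locally, you can also download everything under the `output` prefix (Bedrock typically writes the results to a job-specific subfolder):
```shell
aws s3 cp s3://<S3 bucket name>/output/ ./batch-output/ --recursive
```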


## Cleanup

1. Empty the S3 bucket before deleting the stack:
```bash
aws s3 rm s3://<S3 bucket name> --recursive
```

2. Delete the stack by running the command below:
```bash
cdk destroy
```

---

Copyright 2025 Amazon.com, Inc. or its affiliates. All Rights Reserved.

SPDX-License-Identifier: MIT-0
10 changes: 10 additions & 0 deletions s3-bedrock-batch-inference-cdk/app.py
#!/usr/bin/env python3

import aws_cdk as cdk

from bedrock_batch_inference.bedrock_batch_inference_pattern_stack import BedrockBatchInferencePatternStack

app = cdk.App()
stack = BedrockBatchInferencePatternStack(app, "BedrockBatchInferencePatternStack")

app.synth()
Empty file.
108 changes: 108 additions & 0 deletions s3-bedrock-batch-inference-cdk/bedrock_batch_inference/bedrock_batch_inference_pattern_stack.py
#!/usr/bin/env python3
import time

from aws_cdk import (
    Stack,
    aws_s3 as s3,
    aws_events as events,
    aws_events_targets as targets,
    aws_iam as iam,
    CfnOutput, CfnParameter,
)
from aws_cdk.aws_iam import PolicyStatement
from constructs import Construct


class BedrockBatchInferencePatternStack(Stack):

    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        # Accept the model ARN as a parameter
        model_id = CfnParameter(self, "ModelARN",
                                type="String",
                                description="ARN of the Claude model that you want to use for batch inference.",
                                )

        # Create an S3 bucket with default encryption
        bucket = s3.Bucket(
            self,
            "BatchInfBucket",
            encryption=s3.BucketEncryption.S3_MANAGED,
            enforce_ssl=True,
            versioned=True,
            event_bridge_enabled=True,  # Enable EventBridge notifications
        )

        # Create IAM role for Bedrock
        bedrock_role = iam.Role(
            self,
            "BedrockBatchInferenceRole",
            assumed_by=iam.ServicePrincipal("bedrock.amazonaws.com")
        )

        # Add S3 permissions to the role
        bedrock_role.add_to_policy(iam.PolicyStatement(
            actions=["s3:GetObject", "s3:PutObject", "s3:ListBucket"],
            resources=[
                f"{bucket.bucket_arn}",
                f"{bucket.bucket_arn}/*",
            ]
        ))

        # Permissions used by the AwsApi target when it calls the Bedrock API
        statement = PolicyStatement(
            actions=["bedrock:CreateModelInvocationJob", "iam:PassRole"],
            resources=[f"arn:aws:bedrock:{Stack.of(self).region}:{Stack.of(self).account}:model-invocation-job/*",
                       model_id.value_as_string,
                       bedrock_role.role_arn],
            effect=iam.Effect.ALLOW
        )

        # Create EventBridge rule to trigger Bedrock model invocation when the input.jsonl file is
        # uploaded to S3 under the input/ prefix
        # Note: The AwsApi target creates an AWS-managed Lambda function that makes the actual API call
        events.Rule(
            self,
            "S3ToBedrockBatchInferenceRule",
            event_pattern=events.EventPattern(
                source=["aws.s3"],
                detail_type=["Object Created"],
                detail={
                    "bucket": {
                        "name": [bucket.bucket_name]
                    },
                    "object": {
                        "key": ["input/input.jsonl"]
                    }
                }
            ),
            targets=[targets.AwsApi(
                service="bedrock",
                action="createModelInvocationJob",
                parameters={
                    "modelId": model_id.value_as_string,
                    # The job name is resolved at synthesis time, so each deployment uses a single fixed name
                    "jobName": f"batch-inference-job-{int(time.time())}",
                    "inputDataConfig": {
                        "s3InputDataConfig": {
                            "s3Uri": f"s3://{bucket.bucket_name}/input/input.jsonl"
                        }
                    },
                    "outputDataConfig": {
                        "s3OutputDataConfig": {
                            "s3Uri": f"s3://{bucket.bucket_name}/output/"
                        }
                    },
                    "roleArn": bedrock_role.role_arn
                },
                policy_statement=statement
            )]
        )

        # Output the bucket name
        CfnOutput(
            self,
            "BatchInferenceBucketName",
            value=bucket.bucket_name,
            description="S3 bucket to store the batch inference input and output"
        )
79 changes: 79 additions & 0 deletions s3-bedrock-batch-inference-cdk/cdk.json
{
"app": "python3 app.py",
"watch": {
"include": [
"**"
],
"exclude": [
"README.md",
"cdk*.json",
"requirements*.txt",
"source.bat",
"**/__init__.py",
"**/__pycache__",
"tests"
]
},
"context": {
"@aws-cdk/aws-lambda:recognizeLayerVersion": true,
"@aws-cdk/core:checkSecretUsage": true,
"@aws-cdk/core:target-partitions": [
"aws",
"aws-cn"
],
"@aws-cdk-containers/ecs-service-extensions:enableDefaultLogDriver": true,
"@aws-cdk/aws-ec2:uniqueImdsv2TemplateName": true,
"@aws-cdk/aws-ecs:arnFormatIncludesClusterName": true,
"@aws-cdk/aws-iam:minimizePolicies": true,
"@aws-cdk/core:validateSnapshotRemovalPolicy": true,
"@aws-cdk/aws-codepipeline:crossAccountKeyAliasStackSafeResourceName": true,
"@aws-cdk/aws-s3:createDefaultLoggingPolicy": true,
"@aws-cdk/aws-sns-subscriptions:restrictSqsDescryption": true,
"@aws-cdk/aws-apigateway:disableCloudWatchRole": true,
"@aws-cdk/core:enablePartitionLiterals": true,
"@aws-cdk/aws-events:eventsTargetQueueSameAccount": true,
"@aws-cdk/aws-ecs:disableExplicitDeploymentControllerForCircuitBreaker": true,
"@aws-cdk/aws-iam:importedRoleStackSafeDefaultPolicyName": true,
"@aws-cdk/aws-s3:serverAccessLogsUseBucketPolicy": true,
"@aws-cdk/aws-route53-patters:useCertificate": true,
"@aws-cdk/customresources:installLatestAwsSdkDefault": false,
"@aws-cdk/aws-rds:databaseProxyUniqueResourceName": true,
"@aws-cdk/aws-codedeploy:removeAlarmsFromDeploymentGroup": true,
"@aws-cdk/aws-apigateway:authorizerChangeDeploymentLogicalId": true,
"@aws-cdk/aws-ec2:launchTemplateDefaultUserData": true,
"@aws-cdk/aws-secretsmanager:useAttachedSecretResourcePolicyForSecretTargetAttachments": true,
"@aws-cdk/aws-redshift:columnId": true,
"@aws-cdk/aws-stepfunctions-tasks:enableEmrServicePolicyV2": true,
"@aws-cdk/aws-ec2:restrictDefaultSecurityGroup": true,
"@aws-cdk/aws-apigateway:requestValidatorUniqueId": true,
"@aws-cdk/aws-kms:aliasNameRef": true,
"@aws-cdk/aws-autoscaling:generateLaunchTemplateInsteadOfLaunchConfig": true,
"@aws-cdk/core:includePrefixInUniqueNameGeneration": true,
"@aws-cdk/aws-efs:denyAnonymousAccess": true,
"@aws-cdk/aws-opensearchservice:enableOpensearchMultiAzWithStandby": true,
"@aws-cdk/aws-lambda-nodejs:useLatestRuntimeVersion": true,
"@aws-cdk/aws-efs:mountTargetOrderInsensitiveLogicalId": true,
"@aws-cdk/aws-rds:auroraClusterChangeScopeOfInstanceParameterGroupWithEachParameters": true,
"@aws-cdk/aws-appsync:useArnForSourceApiAssociationIdentifier": true,
"@aws-cdk/aws-rds:preventRenderingDeprecatedCredentials": true,
"@aws-cdk/aws-codepipeline-actions:useNewDefaultBranchForCodeCommitSource": true,
"@aws-cdk/aws-cloudwatch-actions:changeLambdaPermissionLogicalIdForLambdaAction": true,
"@aws-cdk/aws-codepipeline:crossAccountKeysDefaultValueToFalse": true,
"@aws-cdk/aws-codepipeline:defaultPipelineTypeToV2": true,
"@aws-cdk/aws-kms:reduceCrossAccountRegionPolicyScope": true,
"@aws-cdk/aws-eks:nodegroupNameAttribute": true,
"@aws-cdk/aws-ec2:ebsDefaultGp3Volume": true,
"@aws-cdk/aws-ecs:removeDefaultDeploymentAlarm": true,
"@aws-cdk/custom-resources:logApiResponseDataPropertyTrueDefault": false,
"@aws-cdk/aws-s3:keepNotificationInImportedBucket": false,
"@aws-cdk/aws-ecs:reduceEc2FargateCloudWatchPermissions": true,
"@aws-cdk/aws-dynamodb:resourcePolicyPerReplica": true,
"@aws-cdk/aws-ec2:ec2SumTImeoutEnabled": true,
"@aws-cdk/aws-appsync:appSyncGraphQLAPIScopeLambdaPermission": true,
"@aws-cdk/aws-rds:setCorrectValueForDatabaseInstanceReadReplicaInstanceResourceId": true,
"@aws-cdk/core:cfnIncludeRejectComplexResourceUpdateCreatePolicyIntrinsics": true,
"@aws-cdk/aws-lambda-nodejs:sdkV3ExcludeSmithyPackages": true,
"@aws-cdk/aws-stepfunctions-tasks:fixRunEcsTaskPolicy": true,
"@aws-cdk/aws-ec2:bastionHostUseAmazonLinux2023ByDefault": true
}
}
81 changes: 81 additions & 0 deletions s3-bedrock-batch-inference-cdk/example.json
{
"title": "Amazon S3 to Amazon Bedrock Batch Inference using an Amazon EventBridge Rule",
"description": "This pattern demonstrates how to trigger a Bedrock batch inference job when an object, that is the input to the batch inference job, is uploaded to S3. The pattern is implemented using AWS CDK.",
"language": "Python",
"level": "200",
"framework": "CDK",
"introBox": {
"headline": "How it works",
"text": [
"This pattern creates an S3 bucket to store the input and output of the batch inference job.",
"It also creates an EventBridge rule that is triggered when a batch inference input file is uploaded.",
"When the EventBridge rule is triggered, it uses an AWS-managed Lambda function (created automatically by the AwsApi target) to call the Bedrock createModelInvocationJob API.",
"This Lambda function creates a new Bedrock batch inference job with the specified model and the uploaded input file.",
"The output of the batch inference job is stored in the same S3 bucket."
]
},
"gitHub": {
"template": {
"repoURL": "https://github.com/aws-samples/serverless-patterns/tree/main/s3-bedrock-batch-inference-cdk",
"templateURL": "serverless-patterns/s3-bedrock-batch-inference-cdk",
"projectFolder": "s3-bedrock-batch-inference-cdk",
"templateFile": "app.py"
}
},
"resources": {
"bullets": [
{
"text": "Amazon Bedrock Batch Inference Documentation",
"link": "https://docs.aws.amazon.com/bedrock/latest/userguide/batch-inference.html"
},
{
"text": "Supported Models and Regions for Batch Inference",
"link": "https://docs.aws.amazon.com/bedrock/latest/userguide/batch-inference-supported.html"
},
{
"text": "EventBridge Targets Documentation",
"link": "https://docs.aws.amazon.com/eventbridge/latest/userguide/eb-targets.html"
},
{
"text": "AWS CDK EventBridge Targets",
"link": "https://docs.aws.amazon.com/cdk/api/v2/python/aws_cdk.aws_events_targets.html"
}
]
},
"deploy": {
"text": [
"python3 -m venv .",
"source ./bin/activate",
"pip3 install -r requirements.txt",
"export AWS_REGION=us-west-2",
"cdk deploy --parameters ModelARN=<ARN of the model>"
]
},
"testing": {
"text": [
"Upload the sample input file: <code>aws s3 cp model_input/input.jsonl s3://{S3 bucket name}/input/</code>",
"Check batch inference job status: <code>aws bedrock list-model-invocation-jobs | jq '.invocationJobSummaries[] | {jobArn, status, submitTime}'</code>",
"Once completed (30+ minutes), check results in S3 bucket under 'output' prefix"
]
},
"cleanup": {
"text": [
"Empty the S3 bucket: <code>aws s3 rm s3://{S3 bucket name} --recursive</code>",
"Delete the stack: <code>cdk destroy</code>"
]
},
"authors": [
{
"name": "Biswanath Mukherjee",
"image": "https://serverlessland.com/assets/images/resources/contributors/biswanath-mukherjee.jpg",
"bio": "I am a Sr. Solutions Architect working at AWS India.",
"linkedin": "biswanathmukherjee"
},
{
"name": "Rakshith Rao",
"image": "https://serverlessland.com/assets/images/resources/contributors/rakshith-rao.png",
"bio": "I am a Senior Solutions Architect at AWS and help our strategic customers build and operate their key workloads on AWS.",
"linkedin": "rakshithrao"
}
]
}