Claude on AWS Bedrock, Google Vertex AI & Anthropic Cookbook Notes

Notes from Anthropic's cloud platform courses covering Claude on Amazon Bedrock, Google Vertex AI, and the Anthropic Cookbook — with practical code examples.

Posted Mar 7, 2026 Updated Mar 20, 2026

Cloud computing infrastructure

By YuXuan Yan

6 min read

Claude on AWS Bedrock, Google Vertex AI & Anthropic Cookbook Notes

Notes from Anthropic’s cloud platform courses (Claude With Amazon Bedrock, Claude With Google Cloud Vertex AI) plus the free Jupyter notebook tutorials from the Anthropic Cookbook on GitHub. All free resources available at anthropic.skilljar.com and github.com/anthropics/anthropic-cookbook.

☁️ 1. Claude With Amazon Bedrock

Amazon Bedrock is AWS’s managed AI platform. Claude models are available in Bedrock, letting you use Claude inside your existing AWS infrastructure with IAM, VPC, CloudWatch, and all the enterprise controls you already have.

Why Bedrock?

No separate API keys — use AWS IAM credentials you already manage
Data residency — keep requests within your chosen AWS region
Cost consolidation — Claude usage on the same AWS bill
Compliance — AWS’s SOC 2, HIPAA, and other certifications apply

Setup

pip install boto3
aws configure  # or use IAM roles in production

Basic Invocation

  
import boto3
import json

client = boto3.client("bedrock-runtime", region_name="us-east-1")

body = json.dumps({
    "anthropic_version": "bedrock-2023-05-31",
    "max_tokens": 1024,
    "messages": [
        {"role": "user", "content": "What is the capital of Singapore?"}
    ]
})

response = client.invoke_model(
    modelId="anthropic.claude-opus-4-5-20241022-v2:0",
    body=body,
    contentType="application/json",
    accept="application/json"
)

result = json.loads(response["body"].read())
print(result["content"][0]["text"])

Streaming With Bedrock

  
response = client.invoke_model_with_response_stream(
    modelId="anthropic.claude-sonnet-4-5-20241022-v2:0",
    body=body,
    contentType="application/json",
    accept="application/json"
)

for event in response["body"]:
    chunk = json.loads(event["chunk"]["bytes"])
    if chunk["type"] == "content_block_delta":
        print(chunk["delta"].get("text", ""), end="", flush=True)

Available Models on Bedrock

Model	Bedrock Model ID
Claude Opus	anthropic.claude-opus-4-5-…
Claude Sonnet	anthropic.claude-sonnet-4-5-…
Claude Haiku	anthropic.claude-haiku-3-5-…

Always check the Bedrock console for the latest available model IDs in your region — they change with new releases.

Production Patterns

Use Bedrock Guardrails to add content filtering on top of Claude’s built-in safety
Use Bedrock Knowledge Bases to connect Claude to your private documents via RAG
Monitor with CloudWatch — Bedrock emits token usage metrics automatically

🔵 2. Claude With Google Cloud’s Vertex AI

Vertex AI is Google Cloud’s ML platform. Claude models are available as managed endpoints via Anthropic’s partnership with Google.

Why Vertex AI?

Tight integration with Google Cloud services (BigQuery, Cloud Storage, Pub/Sub)
Use existing GCP IAM and billing
Deploy alongside other Google AI models (Gemini, etc.) in the same platform
Ideal if your data is already in Google Cloud

Setup

pip install google-cloud-aiplatform anthropic[vertex]
gcloud auth application-default login

Basic Call

  
import anthropic

client = anthropic.AnthropicVertex(
    project_id="your-gcp-project-id",
    region="us-east5"
)

message = client.messages.create(
    model="claude-opus-4-5@20241022",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Describe the architecture of a transformer model."}
    ]
)

print(message.content[0].text)

The AnthropicVertex client uses Google Cloud credentials instead of an Anthropic API key. Everything else — messages format, tool use, streaming — works identically to the standard Anthropic SDK.

Integrating With Google Services

Read from Cloud Storage:

  
from google.cloud import storage

def read_gcs_file(bucket: str, blob: str) -> str:
    client = storage.Client()
    bucket = client.bucket(bucket)
    return bucket.blob(blob).download_as_text()

# Pass to Claude
content = read_gcs_file("my-bucket", "report.txt")
message = vertex_client.messages.create(
    model="claude-sonnet-4-5@20241022",
    max_tokens=512,
    messages=[{"role": "user", "content": f"Summarise this report:\n\n{content}"}]
)

📒 3. Anthropic Cookbook — GitHub Notebook Notes

The Anthropic Cookbook is a collection of Jupyter notebooks with practical, runnable examples.

API Fundamentals

Core concepts every Claude developer should know:

Message structure — always [{"role": "user"/"assistant", "content": "..."}]
max_tokens — always set this; Claude will stop when reached
stop_sequences — stop generation at specific strings (useful for structured output)

  
# Stop generation at a custom delimiter
response = client.messages.create(
    model="claude-haiku-3-5",
    max_tokens=256,
    stop_sequences=["</answer>"],
    messages=[{"role": "user", "content": "Answer in tags: What is ML? <answer>"}]
)

Prompt Engineering Techniques

1. Role Assignment

System: You are an expert radiologist with 20 years of experience interpreting CT scans.

2. Chain of Thought

Before answering, think through this step by step inside <thinking> tags.
Then give your final answer inside <answer> tags.

3. Few-Shot Examples — showing Claude examples of the format you want:

Input: "The cat sat on the mat."
Output: {"subject": "cat", "verb": "sat", "location": "mat"}

Input: "Birds fly south in winter."
Output:

4. XML Tags for Structure — Claude responds well to XML-structured prompts:

<task>Classify the sentiment of the following review</task>
<review>The product broke after two days. Very disappointed.</review>
<output_format>Return only: POSITIVE, NEGATIVE, or NEUTRAL</output_format>

5. Avoiding Hallucination:

Answer only from the provided context. If the answer is not in the context,
say "I don't have enough information to answer this."

Structured Extraction Pattern

  
import json

response = client.messages.create(
    model="claude-haiku-3-5",
    max_tokens=512,
    messages=[{
        "role": "user",
        "content": """Extract the following from this job posting and return as JSON:
        - job_title
        - company
        - required_skills (list)
        - salary_range

        Job posting: {posting}

        Return only valid JSON, no explanation.""".format(posting="...")
    }]
)

data = json.loads(response.content[0].text)

Evaluations

Building evals is critical for production — you can’t improve what you don’t measure.

Simple eval loop:

  
test_cases = [
    {"input": "What is 2+2?", "expected": "4"},
    {"input": "Capital of France?", "expected": "Paris"},
]

correct = 0
for case in test_cases:
    response = client.messages.create(
        model="claude-haiku-3-5",
        max_tokens=64,
        messages=[{"role": "user", "content": case["input"]}]
    )
    output = response.content[0].text
    if case["expected"].lower() in output.lower():
        correct += 1

print(f"Accuracy: {correct}/{len(test_cases)} = {correct/len(test_cases)*100:.1f}%")

LLM-as-judge — use Claude to evaluate Claude:

  
def evaluate_response(question, answer, rubric):
    judge_prompt = f"""
    Question: {question}
    Answer: {answer}
    Rubric: {rubric}

    Rate this answer 1-5 and explain briefly. Return JSON: score
    """
    response = client.messages.create(
        model="claude-sonnet-4-5",
        max_tokens=256,
        messages=[{"role": "user", "content": judge_prompt}]
    )
    return json.loads(response.content[0].text)

LLM-as-judge is the most scalable evaluation approach for open-ended outputs. Use it for assessing quality when ground-truth answers don’t exist.

Tool Use (Advanced Patterns)

Parallel tool calls — Claude can call multiple tools in one response:

  
# Claude may return multiple tool_use blocks
tool_calls = [b for b in response.content if b.type == "tool_use"]
# Execute all in parallel, then return all results

Tool choice forcing — make Claude always use a specific tool:

  
response = client.messages.create(
    model="claude-opus-4-5",
    max_tokens=1024,
    tools=tools,
    tool_choice={"type": "tool", "name": "get_weather"},  # Force this tool
    messages=[...]
)

💡 Summary: Which Platform to Use?

Scenario	Recommendation
Building a new app	Direct Anthropic API — simplest, latest models
Already on AWS	Amazon Bedrock — IAM, compliance, same bill
Already on GCP	Google Vertex AI — GCP native, BigQuery integration
Experimentation	Anthropic Cookbook notebooks — great starting point

Sources:

Part of my Anthropic developer notes series. Next: building your first MCP server in Python.

AI, LLMs

This post is licensed under CC BY 4.0 by the author.

☁️ 1. Claude With Amazon Bedrock

Why Bedrock?

Setup

Basic Invocation

Streaming With Bedrock

Available Models on Bedrock

Production Patterns

🔵 2. Claude With Google Cloud’s Vertex AI

Why Vertex AI?

Setup

Basic Call

Integrating With Google Services

📒 3. Anthropic Cookbook — GitHub Notebook Notes

API Fundamentals

Prompt Engineering Techniques

Structured Extraction Pattern

Evaluations

Tool Use (Advanced Patterns)

💡 Summary: Which Platform to Use?

Trending Tags