Claude on AWS Bedrock, Google Vertex AI & Anthropic Cookbook Notes
Notes from Anthropic’s cloud platform courses (Claude With Amazon Bedrock, Claude With Google Cloud Vertex AI) plus the free Jupyter notebook tutorials from the Anthropic Cookbook on GitHub. All free resources available at anthropic.skilljar.com and github.com/anthropics/anthropic-cookbook.
☁️ 1. Claude With Amazon Bedrock
Amazon Bedrock is AWS’s managed AI platform. Claude models are available in Bedrock, letting you use Claude inside your existing AWS infrastructure with IAM, VPC, CloudWatch, and all the enterprise controls you already have.
Why Bedrock?
- No separate API keys — use AWS IAM credentials you already manage
- Data residency — keep requests within your chosen AWS region
- Cost consolidation — Claude usage on the same AWS bill
- Compliance — AWS’s SOC 2, HIPAA, and other certifications apply
Setup
```bash
pip install boto3
aws configure  # or use IAM roles in production
```
Basic Invocation
```python
import boto3
import json

client = boto3.client("bedrock-runtime", region_name="us-east-1")

body = json.dumps({
    "anthropic_version": "bedrock-2023-05-31",
    "max_tokens": 1024,
    "messages": [
        {"role": "user", "content": "What is the capital of Singapore?"}
    ]
})

response = client.invoke_model(
    modelId="anthropic.claude-opus-4-5-20241022-v2:0",
    body=body,
    contentType="application/json",
    accept="application/json"
)

result = json.loads(response["body"].read())
print(result["content"][0]["text"])
```
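Bedrock enforces per-account rate limits, so production callers should back off and retry on throttling errors rather than failing outright. A minimal generic sketch — the `invoke_with_retry` helper and its parameters are my own scaffolding, not part of boto3:

```python
import random
import time

def invoke_with_retry(fn, max_retries=5, base_delay=1.0, retryable=(Exception,)):
    """Call fn(), retrying with exponential backoff plus jitter on retryable errors."""
    for attempt in range(max_retries):
        try:
            return fn()
        except retryable:
            if attempt == max_retries - 1:
                raise  # out of retries -- surface the error
            # 1s, 2s, 4s, ... plus jitter to avoid synchronized retries
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, base_delay))

# With Bedrock, retry on throttling (assumes `client` and `body` from above):
# from botocore.exceptions import ClientError
# result = invoke_with_retry(
#     lambda: client.invoke_model(
#         modelId="anthropic.claude-opus-4-5-20241022-v2:0", body=body,
#         contentType="application/json", accept="application/json"),
#     retryable=(ClientError,),
# )
```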
Streaming With Bedrock
```python
response = client.invoke_model_with_response_stream(
    modelId="anthropic.claude-sonnet-4-5-20241022-v2:0",
    body=body,
    contentType="application/json",
    accept="application/json"
)

for event in response["body"]:
    chunk = json.loads(event["chunk"]["bytes"])
    if chunk["type"] == "content_block_delta":
        print(chunk["delta"].get("text", ""), end="", flush=True)
```
Available Models on Bedrock
| Model | Bedrock Model ID |
|---|---|
| Claude Opus | anthropic.claude-opus-4-5-… |
| Claude Sonnet | anthropic.claude-sonnet-4-5-… |
| Claude Haiku | anthropic.claude-haiku-3-5-… |
Always check the Bedrock console for the latest available model IDs in your region — they change with new releases.
Production Patterns
- Use Bedrock Guardrails to add content filtering on top of Claude’s built-in safety
- Use Bedrock Knowledge Bases to connect Claude to your private documents via RAG
- Monitor with CloudWatch — Bedrock emits token usage metrics automatically
🔵 2. Claude With Google Cloud’s Vertex AI
Vertex AI is Google Cloud’s ML platform. Claude models are available as managed endpoints via Anthropic’s partnership with Google.
Why Vertex AI?
- Tight integration with Google Cloud services (BigQuery, Cloud Storage, Pub/Sub)
- Use existing GCP IAM and billing
- Deploy alongside other Google AI models (Gemini, etc.) in the same platform
- Ideal if your data is already in Google Cloud
Setup
```bash
pip install google-cloud-aiplatform "anthropic[vertex]"  # quotes guard against shell globbing
gcloud auth application-default login
```
Basic Call
```python
import anthropic

client = anthropic.AnthropicVertex(
    project_id="your-gcp-project-id",
    region="us-east5"
)

message = client.messages.create(
    model="claude-opus-4-5@20241022",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Describe the architecture of a transformer model."}
    ]
)

print(message.content[0].text)
```
The `AnthropicVertex` client uses Google Cloud credentials instead of an Anthropic API key. Everything else — messages format, tool use, streaming — works identically to the standard Anthropic SDK.
Integrating With Google Services
Read from Cloud Storage:
```python
from google.cloud import storage

def read_gcs_file(bucket_name: str, blob_name: str) -> str:
    client = storage.Client()
    bucket = client.bucket(bucket_name)
    return bucket.blob(blob_name).download_as_text()

# Pass to Claude (vertex_client is the AnthropicVertex client from above)
content = read_gcs_file("my-bucket", "report.txt")
message = vertex_client.messages.create(
    model="claude-sonnet-4-5@20241022",
    max_tokens=512,
    messages=[{"role": "user", "content": f"Summarise this report:\n\n{content}"}]
)
```
📒 3. Anthropic Cookbook — GitHub Notebook Notes
The Anthropic Cookbook is a collection of Jupyter notebooks with practical, runnable examples.
API Fundamentals
Core concepts every Claude developer should know:
- Message structure — always `[{"role": "user"/"assistant", "content": "..."}]`
- max_tokens — always set this; Claude will stop when the limit is reached
- stop_sequences — stop generation at specific strings (useful for structured output)
```python
# Stop generation at a custom delimiter
response = client.messages.create(
    model="claude-haiku-3-5",
    max_tokens=256,
    stop_sequences=["</answer>"],
    messages=[{"role": "user", "content": "Answer in tags: What is ML? <answer>"}]
)
```
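A related Cookbook trick is prefilling the assistant turn: put the opening tag in an `assistant` message so Claude continues from inside it, then pair that with `stop_sequences` so the reply is the bare answer with no tags to strip. A small sketch — the `prefill_messages` helper is illustrative naming, not part of the SDK:

```python
def prefill_messages(question: str, tag: str = "answer") -> list:
    """Build a messages list whose last turn prefills Claude's response
    with an opening tag, so the reply starts inside <answer>...</answer>."""
    return [
        {"role": "user", "content": question},
        # Prefill: Claude continues generating from exactly this text
        {"role": "assistant", "content": f"<{tag}>"},
    ]

messages = prefill_messages("What is ML?")
# response = client.messages.create(..., stop_sequences=["</answer>"], messages=messages)
```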
Prompt Engineering Techniques
1. Role Assignment
```
System: You are an expert radiologist with 20 years of experience interpreting CT scans.
```
2. Chain of Thought
```
Before answering, think through this step by step inside <thinking> tags.
Then give your final answer inside <answer> tags.
```
3. Few-Shot Examples — showing Claude examples of the format you want:
```
Input: "The cat sat on the mat."
Output: {"subject": "cat", "verb": "sat", "location": "mat"}

Input: "Birds fly south in winter."
Output:
```
4. XML Tags for Structure — Claude responds well to XML-structured prompts:
```xml
<task>Classify the sentiment of the following review</task>
<review>The product broke after two days. Very disappointed.</review>
<output_format>Return only: POSITIVE, NEGATIVE, or NEUTRAL</output_format>
```
5. Avoiding Hallucination:
```
Answer only from the provided context. If the answer is not in the context,
say "I don't have enough information to answer this."
```
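In a RAG setup this instruction is usually combined with the XML-tag pattern above, wrapping the retrieved text in tags so the model can't confuse context with instructions. A small builder sketch — the `grounded_prompt` helper is my own naming, not from the Cookbook:

```python
def grounded_prompt(context: str, question: str) -> str:
    """Wrap retrieved context in XML tags and restrict Claude to it."""
    return (
        f"<context>\n{context}\n</context>\n\n"
        "Answer only from the provided context. If the answer is not in the context,\n"
        'say "I don\'t have enough information to answer this."\n\n'
        f"Question: {question}"
    )

prompt = grounded_prompt("Acme's Q3 revenue was $12M.", "What was Acme's Q3 revenue?")
# messages=[{"role": "user", "content": prompt}]
```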
Structured Extraction Pattern
```python
import json

response = client.messages.create(
    model="claude-haiku-3-5",
    max_tokens=512,
    messages=[{
        "role": "user",
        "content": """Extract the following from this job posting and return as JSON:
- job_title
- company
- required_skills (list)
- salary_range

Job posting: {posting}

Return only valid JSON, no explanation.""".format(posting="...")
    }]
)

data = json.loads(response.content[0].text)
```
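Calling `json.loads` on raw model output is brittle: even with a "return only valid JSON" instruction, replies sometimes arrive wrapped in markdown fences or with a stray lead-in sentence. A defensive parsing sketch — the `extract_json` helper is my own, not from the Cookbook:

```python
import json

def extract_json(text: str) -> dict:
    """Pull the first {...} object out of a model reply, tolerating
    markdown fences or surrounding prose."""
    start = text.find("{")
    end = text.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in reply")
    return json.loads(text[start:end + 1])

# data = extract_json(response.content[0].text)
```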
Evaluations
Building evals is critical for production — you can’t improve what you don’t measure.
Simple eval loop:
```python
test_cases = [
    {"input": "What is 2+2?", "expected": "4"},
    {"input": "Capital of France?", "expected": "Paris"},
]

correct = 0
for case in test_cases:
    response = client.messages.create(
        model="claude-haiku-3-5",
        max_tokens=64,
        messages=[{"role": "user", "content": case["input"]}]
    )
    output = response.content[0].text
    if case["expected"].lower() in output.lower():
        correct += 1

print(f"Accuracy: {correct}/{len(test_cases)} = {correct/len(test_cases)*100:.1f}%")
```
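One pitfall in the substring check above: `"4" in output` also matches "40" or "14", over-crediting wrong answers. A slightly stricter whole-word match sketch — the `answer_matches` helper is my own refinement, not from the Cookbook:

```python
import re

def answer_matches(expected: str, output: str) -> bool:
    """True when the expected answer appears as a whole word in the output."""
    pattern = r"\b" + re.escape(expected.lower()) + r"\b"
    return re.search(pattern, output.lower()) is not None
```

Swap it in for the `in` test in the loop above: `if answer_matches(case["expected"], output):`.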
LLM-as-judge — use Claude to evaluate Claude:
```python
def evaluate_response(question, answer, rubric):
    judge_prompt = f"""
Question: {question}
Answer: {answer}
Rubric: {rubric}

Rate this answer 1-5 and explain briefly. Return JSON with keys "score" and "explanation".
"""
    response = client.messages.create(
        model="claude-sonnet-4-5",
        max_tokens=256,
        messages=[{"role": "user", "content": judge_prompt}]
    )
    return json.loads(response.content[0].text)
```
LLM-as-judge is the most scalable evaluation approach for open-ended outputs. Use it for assessing quality when ground-truth answers don’t exist.
Tool Use (Advanced Patterns)
Parallel tool calls — Claude can call multiple tools in one response:
```python
# Claude may return multiple tool_use blocks
tool_calls = [b for b in response.content if b.type == "tool_use"]
# Execute all in parallel, then return all results
```
Tool choice forcing — make Claude always use a specific tool:
```python
response = client.messages.create(
    model="claude-opus-4-5",
    max_tokens=1024,
    tools=tools,
    tool_choice={"type": "tool", "name": "get_weather"},  # force this tool
    messages=[...]
)
# Other tool_choice values: {"type": "auto"} (default) and {"type": "any"} (must use some tool)
```
💡 Summary: Which Platform to Use?
| Scenario | Recommendation |
|---|---|
| Building a new app | Direct Anthropic API — simplest, latest models |
| Already on AWS | Amazon Bedrock — IAM, compliance, same bill |
| Already on GCP | Google Vertex AI — GCP native, BigQuery integration |
| Experimentation | Anthropic Cookbook notebooks — great starting point |
Sources:
- Anthropic Learning Platform
- Anthropic Cookbook (GitHub)
- Anthropic API Docs
- AWS Bedrock Docs
- Google Vertex AI Docs
Part of my Anthropic developer notes series. Next: building your first MCP server in Python.