5 Pitfalls of Logging Failed Requests with CloudFront + Lambda@Edge
How we built a dual Lambda@Edge solution to capture full request headers and body for failed CloudFront requests — and the 5 pitfalls we hit along the way.
The Problem
We had a straightforward requirement: log every failed request that passes through our CloudFront distribution. “Failed” meant two things — requests blocked by AWS WAF (HTTP 403) and requests that made it to the origin but returned a 4xx or 5xx status code. For each failure, we needed the full request headers and body so the operations team could investigate issues and replay requests if needed.
Sounds simple, right? CloudFront sits in front of everything, Lambda@Edge lets you hook into the request lifecycle — just grab the data when something goes wrong and send it to CloudWatch. We figured this would take an afternoon.
It took considerably longer than that. Along the way, we discovered five pitfalls that forced us to rethink our approach multiple times before landing on a solution that actually works. If you’re building anything similar, this post might save you the same headaches.
A Quick Primer: CloudFront’s Four Event Stages
Before diving into the pitfalls, it helps to understand the four stages where Lambda@Edge can intercept requests:
- Viewer Request — fires when CloudFront receives a request from the client
- Origin Request — fires before CloudFront forwards the request to the origin (only on cache misses)
- Origin Response — fires when CloudFront receives a response from the origin
- Viewer Response — fires before CloudFront returns the response to the client
Each stage has different capabilities and limitations. Understanding these differences is the key to everything that follows.
Pitfall 1: Origin-Response Cannot Access the Request Body
Our first instinct was the most obvious one: attach a Lambda@Edge function to the origin-response event. When the response status is 4xx or 5xx, log the request details. Clean and simple.
We wrote the function, deployed it, and immediately hit a wall: the request body is not available in the origin-response event.
In CloudFront’s Lambda@Edge model, the request body is only accessible during the viewer-request and origin-request stages — and only if you explicitly enable the “Include Body” option in the CloudFront trigger configuration. By the time execution reaches origin-response, the body has been stripped from the event object.
This makes a certain kind of sense from a performance perspective (why carry the body through stages that don’t need it?), but it completely broke our initial approach. We needed the response status code to determine failure, but we also needed the request body for logging. These two pieces of information are available in different stages and never overlap.
Lesson: Always check which data is available at each Lambda@Edge event stage before designing your solution. The AWS docs on Lambda@Edge event structure list exactly what fields are present in each stage.
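A quick way to internalize this difference is a helper (hypothetical, for illustration only) that pulls the body out of a Lambda@Edge event. On viewer-request and origin-request events with "Include Body" enabled it finds one; on origin-response events, request.body is simply absent:

```javascript
// Hypothetical helper: the request body only exists on viewer-request and
// origin-request events (with "Include Body" enabled). By origin-response,
// it has been stripped from the event object.
function extractRequestBody(event) {
  const request = event.Records[0].cf.request;
  return request.body && request.body.data ? request.body.data : null;
}

// Mock events mirroring the CloudFront event shapes for the two stages
const originRequestEvent = {
  Records: [{ cf: { request: { uri: '/api', body: { data: 'aGVsbG8=' } } } }]
};
const originResponseEvent = {
  Records: [{ cf: { request: { uri: '/api' }, response: { status: '502' } } }]
};

console.log(extractRequestBody(originRequestEvent)); // 'aGVsbG8='
console.log(extractRequestBody(originResponseEvent)); // null
```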
Pitfall 2: Two Different Header Size Limits
Once we realized we needed to pass the request body from an earlier stage (origin-request) to a later stage (origin-response), the obvious mechanism was custom headers. Store the body in a custom header during origin-request, then read it back in origin-response.
But how large can that header be? The CloudFront docs mention a limit of 1,783 characters for custom headers. That seemed tiny — most of our POST request bodies were well over 2KB.
After more digging, we discovered that CloudFront actually has two separate header limits that apply in different contexts:
| Context | Limit | Applies To |
|---|---|---|
| Static Origin Custom Headers (configured in the CloudFront console) | 1,783 characters per header value | Headers you define in the distribution settings |
| Lambda@Edge dynamic headers | 20 KB total request size (all headers combined) | Headers added or modified by Lambda@Edge functions |
The 1,783-character limit applies to the static headers you configure in the CloudFront distribution settings. When Lambda@Edge adds or modifies headers at runtime, the relevant limit is the total request size of 20 KB across all headers. This gave us significantly more room to work with.
For most API request bodies, 20 KB is plenty. For the edge cases where the body exceeds this limit, we truncate and add a flag indicating the body was trimmed. In practice, we found that over 98% of our failed requests had bodies well under this threshold.
Lesson: CloudFront’s documentation can be ambiguous about limits because different limits apply in different contexts. When you hit a limit, check whether it applies to your specific use case or to a different configuration path.
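To make the relevant limit concrete, here is a small sketch (our naming, not AWS's) that totals the header bytes in a CloudFront request object and checks them against the 20 KB budget that applies to Lambda@Edge-modified requests:

```javascript
// Approximate the total header size of a CloudFront request object.
// The 20 KB budget applies when Lambda@Edge adds or modifies headers;
// the 1,783-character limit only applies to static custom headers
// configured in the distribution settings.
const HEADER_BUDGET_BYTES = 20 * 1024;

function totalHeaderBytes(headers) {
  let total = 0;
  for (const values of Object.values(headers)) {
    for (const { key, value } of values) {
      total += Buffer.byteLength(key, 'utf8') + Buffer.byteLength(value, 'utf8');
    }
  }
  return total;
}

// A 4 KB body stored in a custom header fits comfortably
const headers = {
  'x-original-body': [{ key: 'X-Original-Body', value: 'a'.repeat(4096) }],
  'user-agent': [{ key: 'User-Agent', value: 'curl/8.0' }]
};
console.log(totalHeaderBytes(headers) <= HEADER_BUDGET_BYTES); // true
```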
Pitfall 3: Custom Error Pages Look Perfect But Lose Context
While researching alternatives, we came across CloudFront’s Custom Error Pages feature. It lets you configure CloudFront to route specific error status codes (like 403 or 500) to a designated origin path — for example, an API Gateway backed by a Lambda function.
On paper, this was ideal: CloudFront detects the error, routes to our logging Lambda, and we capture everything. No need for Lambda@Edge at all.
We built a proof of concept and quickly discovered the fatal flaw: when CloudFront invokes a Custom Error Page, it sends a fresh GET request to the error page URL. The original request’s headers and body are completely gone. All you get is a handful of query string parameters that CloudFront appends, like the original URL and the status code.
This is by design — Custom Error Pages were built to serve user-friendly error pages, not to provide programmatic access to the original request. But it meant we would lose exactly the data we needed most: the headers (which contain auth tokens, session IDs, and tracing information) and the body (which contains the payload we’d want to replay).
Lesson: Custom Error Pages are for user experience, not for operational logging. If you need the original request context when handling errors, you need a different approach.
Pitfall 4: Lambda@Edge Pricing Adds Up
As we committed to the Lambda@Edge approach, we ran the cost numbers and got a mild surprise. Lambda@Edge pricing is notably different from standard Lambda:
| | Standard Lambda | Lambda@Edge |
|---|---|---|
| Request price | $0.20 / 1M requests | $0.60 / 1M requests |
| Compute price (128 MB) | $0.0000000021 / ms | $0.00000625 / 128 MB-second |
| Free tier | 1M requests + 400K GB-seconds / month | None |
| Memory limit | Up to 10,240 MB | 128 MB (viewer-facing) / up to 10,240 MB (origin-facing) |
| Timeout | Up to 15 minutes | 5 seconds (viewer-facing) / 30 seconds (origin-facing) |
Lambda@Edge is 3x more expensive per request than regular Lambda, and there is no free tier. The constraints on viewer-facing functions are also significantly tighter: 128 MB of memory and a 5-second timeout.
That said, when we ran the numbers for our actual workload — approximately 1 million requests per day, with the origin-request function running on every request and the origin-response logging function doing real work only on the ~2% that fail — the total came to around $40 per month.
Here is the breakdown:
- Origin-request function: 30M requests/month x $0.60/1M = $18 for requests; compute for a simple header-copying operation (~5ms each at 128 MB) adds only about a dollar.
- Origin-response function: another 30M x $0.60/1M = $18 for requests, and only ~600K/month actually write to CloudWatch, so compute costs are minimal.
- Total: ~$40/month including CloudWatch log ingestion
For production error observability with full request capture, $40/month is entirely reasonable. But it’s worth doing this math upfront — if you’re processing hundreds of millions of requests, the costs scale linearly and can become significant.
Lesson: Lambda@Edge is not expensive in absolute terms for most workloads, but the 3x multiplier and lack of free tier mean you should model costs before committing. Also factor in that you’re paying for the function to execute on every request, even when most requests succeed and the function’s work is trivial.
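The request-price side of that math is easy to sanity-check. Here is a rough model (ours, using only the per-unit prices from the table above; it ignores CloudWatch ingestion and other ancillary charges):

```javascript
// Back-of-the-envelope Lambda@Edge cost model using the published unit prices:
// $0.60 per 1M requests, $0.00000625 per 128 MB-second of compute.
function edgeMonthlyCost({ requestsPerMonth, avgDurationMs }) {
  const requestCost = (requestsPerMonth / 1e6) * 0.60;
  const computeCost = requestsPerMonth * (avgDurationMs / 1000) * 0.00000625;
  return { requestCost, computeCost };
}

// 30M requests/month at ~5 ms each, as in the workload described above
const cost = edgeMonthlyCost({ requestsPerMonth: 30e6, avgDurationMs: 5 });
console.log(cost.requestCost); // 18
```

The request charge dominates: at these durations, compute is well under a dollar per function, which is why keeping the per-request work trivial matters less for cost than the raw request count does.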
Pitfall 5: Logs Go to the Edge Region, Not us-east-1
Lambda@Edge functions must be deployed in us-east-1 — that’s a hard requirement. So naturally, we assumed the CloudWatch logs would show up in us-east-1 as well.
They don’t. Lambda@Edge functions execute at the CloudFront edge location closest to the user, and their CloudWatch logs are written to the region of that edge location. If a user in Tokyo triggers your function, the logs go to the ap-northeast-1 CloudWatch. A user in Frankfurt? eu-central-1. A user in Virginia? Only then do they land in us-east-1.
This means your logs are scattered across every AWS region where CloudFront has an edge presence — which is a lot of regions. If you’re trying to search logs for a specific failed request, you might need to check a dozen different CloudWatch log groups.
We addressed this in two ways:
- For real-time alerting: The origin-response Lambda writes structured JSON to CloudWatch. We set up CloudWatch cross-region log aggregation to funnel everything into a central account.
- For the logging Lambda itself: Instead of relying on CloudWatch as the final destination, the function writes failure records to a centralized data store (in our case, an SQS queue that feeds into a DynamoDB table in us-east-1).
Lesson: When debugging Lambda@Edge, always check the CloudWatch logs in the region where the edge location that served the request is located. Better yet, design your logging to write to a centralized destination from the start.
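On the centralized-destination point, the origin-response function only has to build a message and send it. A sketch of the message construction, with a placeholder queue URL (the actual send would use SendMessageCommand from @aws-sdk/client-sqs):

```javascript
// Build SQS SendMessage parameters for a failure record. The queue URL is a
// placeholder; the real queue lives in us-east-1 so that every edge region
// writes to one central destination instead of scattering records across
// regional CloudWatch log groups.
const QUEUE_URL = 'https://sqs.us-east-1.amazonaws.com/123456789012/failed-requests'; // placeholder

function buildFailureMessage(record) {
  return {
    QueueUrl: QUEUE_URL,
    MessageBody: JSON.stringify({ type: 'FAILED_REQUEST', ...record })
  };
}

const msg = buildFailureMessage({ status: '503', uri: '/api/orders' });
console.log(JSON.parse(msg.MessageBody).type); // 'FAILED_REQUEST'
```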
Solution Comparison: The Six Approaches We Evaluated
Before arriving at our final architecture, we evaluated six different approaches. Here is how they stack up:
| Approach | Full Headers | Request Body | Error Status | Cost | Complexity |
|---|---|---|---|---|---|
| A. App-level logging | Yes | Yes | Yes | Free (app changes only) | Low, but requires app modification |
| B. CloudFront Real-Time Logs + Kinesis | Partial (selected headers) | No | Yes | ~$30/month for Kinesis | Medium |
| C. ALB Access Logs | Partial | No | Yes | Free (S3 storage cost only) | Low |
| D. Custom Error Pages + Lambda | No | No | Yes | Low | Medium |
| E. Origin-Request + Real-Time Log Correlation | Yes | Yes | Requires async join | ~$50/month | High |
| F. Dual Lambda@Edge (our solution) | Yes | Yes | Yes | ~$40/month | Medium |
Approach A (app-level logging) is the simplest if you control the origin and can modify it. But if you have multiple origins, legacy services, or third-party backends, it’s not always feasible. It also doesn’t capture WAF-blocked requests that never reach the origin.
Approach B (CloudFront Real-Time Logs) sends selected request fields to a Kinesis Data Stream. It’s great for analytics but only supports a predefined set of fields — you can select specific headers to include, but you cannot capture the request body.
Approach C (ALB Access Logs) only works if your origin is behind an ALB, and the logs contain limited header information and no body.
Approach D (Custom Error Pages) fails for the reasons described in Pitfall 3 — you lose the original request context.
Approach E involves logging the full request in origin-request and correlating it with the response status from Real-Time Logs. This works in theory but requires an async pipeline to join the two data streams, adding latency and complexity.
Approach F (Dual Lambda@Edge) is what we ultimately chose. It solves every requirement with moderate complexity and predictable costs.
The Final Architecture: Dual Lambda@Edge
The solution uses two Lambda@Edge functions working in tandem:
- Origin-Request function: Copies the request body into a custom header (x-original-body). Runs on every request but does minimal work.
- Origin-Response function: Checks the response status. If it’s 400 or above, it extracts the original headers and body (from the custom header) and logs the complete failure record.
Here is a diagram of the flow:
Client → CloudFront → [Viewer Request]
→ [Origin Request] ← Lambda copies body to x-original-body header
→ Origin Server
→ [Origin Response] ← Lambda checks status, logs failures
→ [Viewer Response]
→ Client
For WAF-blocked requests (403), the origin-request function never fires because WAF evaluates before the request reaches the origin. To capture those, we use a separate WAF logging configuration that sends blocked request data to an S3 bucket via Kinesis Data Firehose. This is a standard WAF feature and works reliably.
Origin-Request Function
// origin-request.js
exports.handler = async (event) => {
const request = event.Records[0].cf.request;
// If the request has a body (POST, PUT, PATCH), store it in a custom header
if (request.body && request.body.data) {
request.headers['x-original-body'] = [{
key: 'X-Original-Body',
value: request.body.data
}];
}
return request;
};
This function is intentionally minimal. It runs on every cache miss, so keeping execution time low is critical. It reads the body from request.body.data — Base64-encoded when request.body.encoding is “base64”, which is the norm once “Include Body” is enabled — and stores it in a custom header that will still be present on the request object in the origin-response stage.
Important configuration: When setting up the CloudFront trigger for this function, you must enable the “Include Body” option. Without it, request.body will be undefined.
Origin-Response Function
// origin-response.js
// No SDK client needed: console.log writes structured JSON straight to this function's CloudWatch log stream.
exports.handler = async (event) => {
const response = event.Records[0].cf.response;
const request = event.Records[0].cf.request;
const status = parseInt(response.status, 10);
// Only log failed requests
if (status < 400) {
return response;
}
// Extract the original body from our custom header
const originalBody = request.headers['x-original-body']
? request.headers['x-original-body'][0].value
: null;
// Build the failure record
const failureRecord = {
timestamp: new Date().toISOString(),
status: response.status,
statusDescription: response.statusDescription,
method: request.method,
uri: request.uri,
querystring: request.querystring,
headers: sanitizeHeaders(request.headers),
body: originalBody ? decodeBody(originalBody) : null,
clientIp: request.clientIp,
responseHeaders: response.headers
};
// Log to CloudWatch (or send to SQS/Kinesis for centralized collection)
console.log(JSON.stringify({
type: 'FAILED_REQUEST',
...failureRecord
}));
return response;
};
function sanitizeHeaders(headers) {
const sanitized = {};
for (const [key, values] of Object.entries(headers)) {
// Skip our internal transport header
if (key === 'x-original-body') continue;
sanitized[key] = values.map(v => v.value);
}
return sanitized;
}
function decodeBody(data) {
try {
// request.body.data is Base64-encoded
return Buffer.from(data, 'base64').toString('utf-8');
} catch (e) {
return data;
}
}
The origin-response function does the heavy lifting but only when the response status indicates a failure. On the ~98% of requests that succeed, it returns immediately after a single integer comparison. For failures, it constructs a structured JSON log entry with everything the operations team needs to investigate and potentially replay the request.
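Because the logic is pure event-in/response-out apart from the console.log, it is easy to exercise locally with a mocked CloudFront event. A minimal harness (ours, not part of the deployed code) covering the two branches might look like this:

```javascript
// Simplified versions of the origin-response decision points, for local testing.
function shouldLog(event) {
  return parseInt(event.Records[0].cf.response.status, 10) >= 400;
}

function extractOriginalBody(event) {
  const h = event.Records[0].cf.request.headers['x-original-body'];
  return h ? Buffer.from(h[0].value, 'base64').toString('utf-8') : null;
}

// Build a mock CloudFront origin-response event with an optional Base64 body
function mockEvent(status, bodyB64) {
  return {
    Records: [{
      cf: {
        request: {
          headers: bodyB64
            ? { 'x-original-body': [{ key: 'X-Original-Body', value: bodyB64 }] }
            : {}
        },
        response: { status: String(status) }
      }
    }]
  };
}

const failed = mockEvent(502, Buffer.from('{"order":42}').toString('base64'));
console.log(shouldLog(failed)); // true
console.log(extractOriginalBody(failed)); // '{"order":42}'
console.log(shouldLog(mockEvent(200))); // false
```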
Deployment Notes
A few practical tips for deploying this:
- IAM Role: Both functions need a single IAM role with logs:CreateLogGroup, logs:CreateLogStream, and logs:PutLogEvents permissions. If you’re writing to SQS or DynamoDB, add those permissions too.
- Memory: Set both functions to 128 MB. Viewer-facing Lambda@Edge functions are capped at 128 MB; origin-facing functions can be configured with more, but for this workload 128 MB is more than enough.
- Timeout: We set the origin-request function to 1 second and the origin-response function to 5 seconds. The origin-response function needs more time because it writes to CloudWatch on failures.
- Versioning: Lambda@Edge requires you to deploy a numbered version (not $LATEST). Automate this in your CI/CD pipeline to avoid manual version publishing.
- Body size handling: If your API accepts large payloads, add a size check in the origin-request function and truncate bodies that would push the total header size over the 20 KB limit:
const MAX_BODY_SIZE = 15000; // Leave room for other headers
if (request.body && request.body.data) {
const bodyData = request.body.data;
request.headers['x-original-body'] = [{
key: 'X-Original-Body',
value: bodyData.length > MAX_BODY_SIZE
? bodyData.substring(0, MAX_BODY_SIZE)
: bodyData
}];
if (bodyData.length > MAX_BODY_SIZE) {
request.headers['x-body-truncated'] = [{
key: 'X-Body-Truncated',
value: 'true'
}];
}
}
Wrapping Up
The dual Lambda@Edge approach is not the only way to solve this problem, and it may not be the best fit for every situation. If you control your origin and can modify the application code, app-level logging (Approach A) is simpler and cheaper. If you only need headers and not the body, CloudFront Real-Time Logs (Approach B) might be sufficient.
But if you need full request headers and body for failed requests across multiple origins, with real-time capture at the CDN edge, the dual Lambda@Edge pattern is a solid choice. The two functions are small, the cost is predictable, and the solution works for any origin without requiring changes to the backend.
The five pitfalls we hit along the way are all documented in the AWS docs if you know where to look. The challenge is that the relevant information is spread across multiple documentation pages covering Lambda@Edge event structure, CloudFront limits, Custom Error Pages, pricing, and CloudWatch log routing. Hopefully, having them collected in one place here saves you some time.