Complete Failed Request Logging and Async Replay with CloudFront and Lambda@Edge
A dual Lambda@Edge architecture for recording full request headers and body of failed requests — WAF blocks and origin errors — without modifying origin code, with async replay from S3.
1. Introduction
In architectures that use Amazon CloudFront for global acceleration alongside AWS WAF for security, enterprise customers frequently face an operational challenge: how to fully record the details of every failed request — including request headers and request body — to support subsequent asynchronous data compensation and request replay.
Standard logging mechanisms in CloudFront (access logs, real-time logs) capture metadata like URL, status code, and selected headers, but they do not include the request body. When a legitimate business request is mistakenly blocked by WAF or the origin encounters a transient failure, the operations team needs the complete request payload to investigate the root cause and replay the request for data recovery.
This article presents a dual Lambda@Edge architecture that solves this problem without modifying any origin code. The solution captures full headers and body for both WAF-blocked requests and origin error responses, funnels them into S3 via CloudWatch Logs and Kinesis Data Firehose, and supports async replay.
2. Overview
2.1 Business Scenario
“Failed requests” fall into two categories:
- WAF-blocked requests: Requests that trigger an AWS WAF rule and receive an immediate 403 response. They never reach the origin server.
- Origin error responses: Requests that reach the origin but receive a 4xx or 5xx status code in return.
Why do we need data compensation and replay? AWS WAF rules can produce false positives that block legitimate business requests. Origins can also return 500 errors due to transient failures (database timeouts, deployment windows, dependency outages). In both cases, the business needs a way to recover the lost data by replaying the original request once the issue is resolved.
2.2 Solution Benefits
| Dimension | Benefit |
|---|---|
| Information completeness | Records full request headers and body, not just metadata |
| Zero intrusiveness | No modifications required to origin code or application logic |
| Cost efficiency | Logging logic only executes on failed requests; successful requests pass through with minimal overhead |
| Architectural simplicity | Logs automatically aggregate to a single S3 bucket via a managed pipeline |
| Replay capability | Complete request data in S3 enables automated or manual async replay |
3. Solution Architecture
The architecture uses two Lambda@Edge functions deployed at the origin-request and origin-response stages of the CloudFront request lifecycle, plus a separate WAF logging pipeline for blocked requests.
3.1 Core Components
| Component | Role |
|---|---|
| Amazon CloudFront | CDN distribution with Lambda@Edge triggers |
| AWS WAF | Web application firewall attached to the CloudFront distribution |
| Lambda@Edge (origin-request) | Copies the request body into a custom header for downstream access |
| Lambda@Edge (origin-response) | Detects error responses and logs the complete request record |
| CloudWatch Logs | Receives structured JSON logs from Lambda@Edge functions |
| Kinesis Data Firehose | Streams logs from CloudWatch to S3 |
| Amazon S3 | Final storage for failed request records; source for async replay |
3.2 Request Flow
There are two distinct paths depending on where the failure occurs:
Path 1 — WAF-Blocked Requests:
    Client → CloudFront → AWS WAF [BLOCK] → 403 Response
                             ↓
                          WAF Logging → CloudWatch Logs → Kinesis Data Firehose → S3
WAF has its own logging mechanism that captures the full request headers and the first 8 KB of the request body. This is configured separately from the Lambda@Edge pipeline and requires enabling WAF logging with a CloudWatch Logs destination.
Path 2 — Origin Error Responses:
    Client → CloudFront → [Origin Request] → Origin Server → [Origin Response] → Client
                                 ↓                                   ↓
                          Lambda copies                   Lambda detects 4xx/5xx,
                          body to header                  logs full request record
                                 ↓                                   ↓
                                 └──────────── CloudWatch Logs → Kinesis → S3
The origin-request Lambda stores the request body in a custom header (X-Original-Body). The origin-response Lambda checks the response status; if it is 400 or above, it extracts the headers and body and writes a structured JSON log entry to CloudWatch.
4. Solution Evaluation and Selection
Before arriving at the dual Lambda@Edge approach, we evaluated six candidate solutions. Understanding why each alternative falls short is critical to appreciating the design decisions in the final architecture.
Candidate Solutions
| Approach | Description | Full Headers | Request Body | Error Status | Requires App Changes | Complexity |
|---|---|---|---|---|---|---|
| A. Application-layer logging | Origin application logs failed requests internally | Yes | Yes | Yes | Yes | Low |
| B. CloudFront Real-Time Logs | Stream selected fields to Kinesis Data Stream | Partial | No | Yes | No | Medium |
| C. ALB Access Logs | Standard ALB logging to S3 | Partial | No | Yes | No | Low |
| D. Custom Error Pages | Route error responses to a Lambda via API Gateway | No | No | Yes | No | Medium |
| E. Origin-Request + Real-Time Log Correlation | Log full request at origin-request, correlate with response status asynchronously | Yes | Yes | Requires async join | No | High |
| F. Dual Lambda@Edge (recommended) | Origin-request stores body; origin-response detects errors and logs | Yes | Yes | Yes | No | Medium |
Why Each Alternative Falls Short
Approach A works well if you control the origin, but it requires modifying every backend service. It also cannot capture WAF-blocked requests that never reach the origin. For organizations with multiple origins, legacy services, or third-party backends, this approach is not feasible.
Approach B (CloudFront Real-Time Logs) can stream a predefined set of request fields to Kinesis. You can select specific headers to include, but the request body is never available in real-time logs. This is a fundamental limitation of the feature.
Approach C (ALB Access Logs) only works for origins behind an ALB. The logs contain limited header information and no request body.
Approach D (Custom Error Pages) initially looks promising — you can route 4xx/5xx responses to a Lambda function for processing. However, when CloudFront invokes a Custom Error Page, it sends a fresh GET request to the error page URL. The original request’s headers and body are completely lost. All you receive are a few query string parameters (original URL, status code).
Approach E involves logging the complete request in the origin-request stage and later correlating it with the response status from Real-Time Logs using the CloudFront request ID. This works in theory but requires building an async pipeline to join two data streams, adding significant complexity and latency.
Approach F solves the problem at the edge in real-time, with full headers and body, no origin modifications, and manageable complexity.
Key Technical Insights from the Selection Process
Two critical technical details drove the architecture decisions:
- The origin-response stage cannot access the request body. In CloudFront’s Lambda@Edge model, the request body is only available during the viewer-request and origin-request stages (and only when “Include Body” is explicitly enabled). By the time origin-response executes, the body has been stripped from the event object. This is why the origin-request function must cache the body in a custom header.
- CloudFront has two different header size limits. Static origin custom headers configured in the distribution settings are limited to 1,783 characters per header value. Headers added or modified dynamically by Lambda@Edge fall under a different limit: the total request length, counting the URL, query string, and all headers but not the body, must not exceed 20,480 bytes (20 KB). This larger limit is what makes storing the request body in a custom header viable for most API payloads.
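To make the header budget concrete, the following minimal sketch estimates whether a body of a given raw size still fits once it is Base64-encoded. The 20,480-byte limit is the figure quoted above; the function names and the `other_bytes` default (a guess at the size of the URL plus all other headers) are illustrative assumptions, not measured values.

```python
import math

# Limit quoted in the text for the request line plus all headers.
MAX_REQUEST_BYTES = 20_480


def base64_length(raw_bytes: int) -> int:
    """Length of the padded Base64 encoding of raw_bytes bytes."""
    return 4 * math.ceil(raw_bytes / 3)


def fits_in_header(raw_bytes: int, other_bytes: int = 4_000) -> bool:
    """Return True if the encoded body fits next to the rest of the request.

    other_bytes is a hypothetical estimate of the URL, query string, and
    all other headers; measure it for your own traffic.
    """
    header_name = len("X-Original-Body: ")
    return base64_length(raw_bytes) + header_name + other_bytes <= MAX_REQUEST_BYTES


if __name__ == "__main__":
    for size in (1_000, 10_000, 15_000):
        print(size, base64_length(size), fits_in_header(size))
```

Base64 inflates the payload by roughly one third, so the usable raw body size is noticeably smaller than the header budget itself.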
5. Implementation Details
5.1 Prerequisites
- An existing CloudFront distribution with an origin configured
- AWS WAF associated with the distribution (for Path 1 logging)
- IAM permissions to create Lambda functions, CloudWatch Logs resources, and Kinesis Data Firehose delivery streams
- Lambda@Edge functions must be created in us-east-1 (this is a hard requirement)
5.2 Step 1: Create the Origin-Request Lambda@Edge Function
This function runs on every origin request (cache miss) and copies the request body into a custom header so it is accessible in the origin-response stage.
    'use strict';

    exports.handler = async (event) => {
      const request = event.Records[0].cf.request;

      // If the request has a body (POST, PUT, PATCH), store it in a custom header
      if (request.body && request.body.data) {
        request.headers['x-original-body'] = [
          { key: 'X-Original-Body', value: request.body.data }
        ];
        request.headers['x-original-body-encoding'] = [
          { key: 'X-Original-Body-Encoding', value: request.body.encoding || 'base64' }
        ];
      }

      return request;
    };
Configuration requirements:
- Runtime: Node.js 20.x
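- Memory: 128 MB (ample for this function; the 128 MB cap applies to viewer-facing triggers, while origin-facing Lambda@Edge functions can be configured with more)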
- Timeout: 1 second
- Trigger: CloudFront origin-request event
- Include Body: Must be enabled in the CloudFront trigger configuration. Without this, request.body will be undefined.
Body size handling: If your API accepts large payloads, add a size check to avoid exceeding the 20 KB header limit:
    'use strict';

    // Leave room for other headers. The value is a multiple of 4, so a
    // truncated Base64 string still ends on a complete 4-character group.
    const MAX_BODY_HEADER_SIZE = 15000;

    exports.handler = async (event) => {
      const request = event.Records[0].cf.request;

      if (request.body && request.body.data) {
        const bodyData = request.body.data;
        const isTruncated = bodyData.length > MAX_BODY_HEADER_SIZE;

        request.headers['x-original-body'] = [{
          key: 'X-Original-Body',
          value: isTruncated ? bodyData.substring(0, MAX_BODY_HEADER_SIZE) : bodyData
        }];
        request.headers['x-original-body-encoding'] = [{
          key: 'X-Original-Body-Encoding',
          value: request.body.encoding || 'base64'
        }];

        // Flag truncation so log consumers know the body is incomplete
        if (isTruncated) {
          request.headers['x-original-body-truncated'] = [{
            key: 'X-Original-Body-Truncated',
            value: 'true'
          }];
        }
      }

      return request;
    };
5.3 Step 2: Create the Origin-Response Lambda@Edge Function
This function inspects every origin response. For successful responses (status < 400), it returns immediately, adding only the cost of the invocation itself. For failed responses, it extracts the complete request information and writes a structured JSON log entry.
    'use strict';

    exports.handler = async (event) => {
      const cf = event.Records[0].cf;
      const response = cf.response;
      const request = cf.request;

      // Safety check: if response is missing, return a 502
      if (!response) {
        return { status: '502', statusDescription: 'Bad Gateway' };
      }

      // response.status is a string; parse it explicitly as base 10
      const status = parseInt(response.status, 10);

      // Only process failed requests
      if (status >= 400) {
        // Extract original headers, excluding our internal transport headers.
        // Only the first value of each header is kept.
        const headers = Object.fromEntries(
          Object.entries(request.headers)
            .filter(([k]) => !k.startsWith('x-original-body'))
            .map(([k, v]) => [k, v[0].value])
        );

        // Build and log the complete failure record
        console.log(JSON.stringify({
          type: 'origin_error',
          requestId: cf.config.requestId,
          timestamp: new Date().toISOString(),
          status: status,
          statusDescription: response.statusDescription,
          method: request.method,
          uri: request.uri,
          querystring: request.querystring,
          clientIp: request.clientIp,
          headers: headers,
          body: request.headers['x-original-body']?.[0]?.value ?? null,
          bodyEncoding: request.headers['x-original-body-encoding']?.[0]?.value ?? null,
          bodyTruncated: request.headers['x-original-body-truncated']?.[0]?.value === 'true',
          responseStatus: response.status,
          responseHeaders: Object.fromEntries(
            Object.entries(response.headers).map(([k, v]) => [k, v[0].value])
          )
        }));
      }

      return response;
    };
Configuration requirements:
- Runtime: Node.js 20.x
- Memory: 128 MB
- Timeout: 5 seconds (needs more time than origin-request because it writes to CloudWatch on failures)
- Trigger: CloudFront origin-response event
5.4 Step 3: Configure the CloudFront Distribution
- Associate the Lambda@Edge functions with the appropriate cache behavior:
  - Origin-request trigger → Step 1 function (published version, not $LATEST)
  - Origin-response trigger → Step 2 function (published version, not $LATEST)
- Enable “Include Body” on the origin-request trigger. This is the checkbox in the CloudFront console under the trigger configuration (or IncludeBody: true in CloudFormation/CDK).
- Attach AWS WAF to the distribution if not already done. Configure WAF logging (see Step 5).
- IAM execution role for both functions must include:
  - logs:CreateLogGroup
  - logs:CreateLogStream
  - logs:PutLogEvents
  - The trust policy must allow edgelambda.amazonaws.com and lambda.amazonaws.com to assume the role.
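The role requirements in the last item can be written out as two policy documents. The sketch below builds them as Python dictionaries and prints the JSON; the wildcard log resource is an assumption chosen because log groups are created in many Regions, and you may want to scope it more tightly.

```python
import json

# Trust policy: both service principals named above may assume the role.
TRUST_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Service": ["lambda.amazonaws.com", "edgelambda.amazonaws.com"]
            },
            "Action": "sts:AssumeRole",
        }
    ],
}

# Permissions policy with the three logging actions listed above.
# Assumption: a wildcard resource, since logs land in multiple Regions.
LOGGING_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "logs:CreateLogGroup",
                "logs:CreateLogStream",
                "logs:PutLogEvents",
            ],
            "Resource": "arn:aws:logs:*:*:*",
        }
    ],
}

if __name__ == "__main__":
    print(json.dumps(TRUST_POLICY, indent=2))
    print(json.dumps(LOGGING_POLICY, indent=2))
```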
5.5 Step 4: Set Up the Log Aggregation Pipeline
Lambda@Edge functions execute at CloudFront edge locations, and their CloudWatch logs are written to the AWS Region closest to the location where the function ran, not to us-east-1 where the function is deployed. Logs therefore end up scattered across every Region that serves your traffic.
To aggregate these logs into a single S3 bucket, set up the following pipeline in each active region (or use a CloudFormation StackSet for automation):
- CloudWatch Logs Subscription Filter: Attach a subscription filter to the Lambda@Edge log group (/aws/lambda/us-east-1.<function-name>) that forwards log events to a Kinesis Data Firehose delivery stream.
- Kinesis Data Firehose Delivery Stream: Configure a delivery stream that:
  - Receives log events from CloudWatch
  - Optionally transforms the data (e.g., decompress, parse JSON, add metadata)
  - Delivers to an S3 bucket with a partitioned prefix like logs/year=YYYY/month=MM/day=DD/hour=HH/
- S3 Bucket: A centralized bucket that receives all failed request logs. Enable lifecycle policies to transition old logs to S3 Glacier or delete them after a retention period.
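The optional transformation step could look roughly like the sketch below: a Firehose data-transformation Lambda that unpacks the gzip-compressed CloudWatch Logs payload and emits one JSON record per line. It assumes the Node.js log line layout (timestamp, request ID, level, then the JSON message, tab-separated) and keeps only records whose type is origin_error; the added logGroup field and the helper name `extract_records` are illustrative choices.

```python
import base64
import gzip
import json


def extract_records(payload: dict) -> str:
    """Turn one CloudWatch Logs subscription payload into NDJSON."""
    lines = []
    for log_event in payload.get("logEvents", []):
        message = log_event["message"]
        start = message.find("{")
        if start == -1:
            continue  # START/END/REPORT lines carry no JSON
        try:
            record = json.loads(message[start:])
        except json.JSONDecodeError:
            continue  # not a structured log line
        if record.get("type") == "origin_error":
            record["logGroup"] = payload.get("logGroup")
            lines.append(json.dumps(record))
    return "".join(line + "\n" for line in lines)


def handler(event, context):
    """Firehose transformation entry point."""
    output = []
    for rec in event["records"]:
        payload = json.loads(gzip.decompress(base64.b64decode(rec["data"])))
        ndjson = ""
        if payload.get("messageType") == "DATA_MESSAGE":
            ndjson = extract_records(payload)
        if not ndjson:
            # Control messages and batches without failures are dropped
            output.append({"recordId": rec["recordId"], "result": "Dropped"})
            continue
        output.append({
            "recordId": rec["recordId"],
            "result": "Ok",
            "data": base64.b64encode(ndjson.encode("utf-8")).decode("ascii"),
        })
    return {"records": output}
```

Writing newline-delimited JSON at this stage means any downstream consumer can parse the S3 objects line by line without knowing about the CloudWatch envelope.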
Alternative approach for simpler setups: If cross-region log aggregation is too complex, the origin-response Lambda can write directly to a centralized destination (SQS queue, DynamoDB table, or S3 bucket in us-east-1) instead of relying on CloudWatch Logs. This adds latency to the Lambda execution but simplifies the pipeline.
5.6 Step 5: Configure WAF Logging for Blocked Requests
For requests blocked by WAF (Path 1), configure WAF logging separately:
- In the AWS WAF console, enable logging for your Web ACL.
- Set the destination to a CloudWatch Logs log group (e.g., aws-waf-logs-<distribution-name>). The log group name must begin with aws-waf-logs-.
- WAF logs include full request headers and the first 8 KB of the request body (when body inspection is enabled in the WAF rule).
- Set up a CloudWatch Logs subscription filter to forward WAF logs to the same Kinesis Data Firehose delivery stream and S3 bucket used for origin error logs.
5.7 Known Limitations
| Limitation | Impact | Mitigation |
|---|---|---|
| Request body stored in header is limited to ~20 KB | Large POST/PUT bodies will be truncated | Add truncation flag; for very large payloads, consider writing to S3 directly from origin-request |
| Lambda@Edge logs go to edge region, not us-east-1 | Logs are scattered across multiple regions | Use Kinesis Data Firehose or direct writes to a centralized store |
| Lambda@Edge has no free tier | Cost applies from the first request | Acceptable for production workloads; see cost analysis below |
| WAF body inspection limited to first 8 KB | WAF-blocked request logs may have incomplete bodies | Sufficient for most API payloads; very large payloads are rare in WAF-blocked scenarios |
| Lambda@Edge requires publishing a numbered version | Cannot use $LATEST | Automate version publishing in CI/CD pipeline |
| Origin-request function runs on every cache miss | Adds latency to all requests, not just failures | Function is minimal (~5ms); impact is negligible |
6. Cost Analysis
The following cost estimate is based on a scenario with 1 million requests per day (30 million per month), of which approximately 2% are failed requests (600,000 failures per month).
Lambda@Edge Costs
| Item | Calculation | Monthly Cost |
|---|---|---|
| Origin-request function requests | 30M x $0.60/1M | $18.00 |
| Origin-request function compute | 30M x 5ms avg x 128 MB x $0.00000625/128MB-s | $0.94 |
| Origin-response function requests | 30M x $0.60/1M | $18.00 |
| Origin-response function compute (failures only) | 600K x 20ms avg x 128 MB x $0.00000625/128MB-s | $0.08 |
| Origin-response function compute (successes) | 29.4M x 1ms avg x 128 MB x $0.00000625/128MB-s | $0.18 |
| Lambda@Edge subtotal | | ~$37.20 |
Note: Lambda@Edge is approximately 3x more expensive per request than standard Lambda and has no free tier. The origin-request function runs on every cache miss, while the origin-response function’s compute cost is minimal for successful requests (immediate return).
AWS WAF Costs
| Item | Calculation | Monthly Cost |
|---|---|---|
| Web ACL | 1 x $5.00 | $5.00 |
| Rules | 5 rules x $1.00 | $5.00 |
| Requests inspected | 30M x $0.60/1M | $18.00 |
| WAF logging | Included with CloudWatch Logs destination | $0.00 |
| WAF subtotal | | ~$28.00 |
Log Aggregation Costs
| Item | Calculation | Monthly Cost |
|---|---|---|
| CloudWatch Logs ingestion | 600K records x ~1 KB avg = ~600 MB x $0.50/GB | $0.30 |
| Kinesis Data Firehose | 600K records x ~1 KB = ~600 MB x $0.029/GB | $0.02 |
| S3 storage | ~600 MB/month (before compression) | $0.01 |
| S3 PUT requests | 600K x $0.005/1K | $3.00 |
| Log aggregation subtotal | | ~$3.33 |
Total Monthly Cost
| Component | Cost |
|---|---|
| Lambda@Edge | ~$37.20 |
| AWS WAF | ~$28.00 |
| Log aggregation | ~$3.33 |
| Total | ~$68.53/month |
For the core logging functionality alone (Lambda@Edge + log aggregation, excluding WAF which you likely already have), the incremental cost is approximately $40/month. If you already have WAF configured, the additional cost for this solution is primarily the Lambda@Edge execution costs.
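To rerun the Lambda@Edge arithmetic for your own traffic, a small calculator along these lines may help. It uses only the unit prices and average durations stated in this section; the function names are illustrative, and actual pricing should be checked against the current AWS price list.

```python
REQUEST_PRICE_PER_MILLION = 0.60             # USD per 1M invocations (from this section)
COMPUTE_PRICE_PER_128MB_SECOND = 0.00000625  # USD per 128 MB-second (from this section)


def request_cost(invocations: int) -> float:
    """Per-invocation charge."""
    return invocations / 1_000_000 * REQUEST_PRICE_PER_MILLION


def compute_cost(invocations: int, avg_ms: float) -> float:
    """Compute charge for a 128 MB function running avg_ms per invocation."""
    seconds = invocations * avg_ms / 1000
    return seconds * COMPUTE_PRICE_PER_128MB_SECOND


def lambda_edge_monthly_cost(total: int, failure_rate: float) -> float:
    """Sum the five Lambda@Edge line items for one month of traffic."""
    failures = round(total * failure_rate)
    successes = total - failures
    return (
        request_cost(total)             # origin-request invocations
        + compute_cost(total, 5)        # origin-request compute, 5 ms avg
        + request_cost(total)           # origin-response invocations
        + compute_cost(failures, 20)    # origin-response compute on failures
        + compute_cost(successes, 1)    # origin-response compute on successes
    )


if __name__ == "__main__":
    print(f"{lambda_edge_monthly_cost(30_000_000, 0.02):.2f}")
```

The two per-invocation charges dominate the result, which is why reducing invocations (higher cache hit ratio, fewer behaviors) matters more than shaving milliseconds off the functions.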
Cost Optimization Tips
- Use CloudFront cache effectively: The origin-request function only fires on cache misses. Higher cache hit ratios directly reduce Lambda@Edge invocations.
- Filter by behavior: Only attach the Lambda@Edge functions to cache behaviors that need logging, not to all behaviors in the distribution.
- Consider log sampling: For very high-traffic distributions, you can add sampling logic in the origin-response function to log only a percentage of failures.
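The sampling idea in the last tip can be sketched as a deterministic, hash-based decision, shown here in Python for consistency with the replay code; the same logic ports directly to the Node.js handler. The function name and the choice of SHA-256 are assumptions for illustration.

```python
import hashlib


def should_log(request_id: str, sample_rate: float) -> bool:
    """Deterministically keep roughly sample_rate of failed requests.

    Hashing the request ID (instead of calling a random generator) means
    the same request always gets the same decision.
    """
    if sample_rate >= 1.0:
        return True
    if sample_rate <= 0.0:
        return False
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    # Map the first 8 bytes of the digest to a number in [0, 1)
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return bucket < sample_rate


if __name__ == "__main__":
    kept = sum(should_log(f"req-{i}", 0.1) for i in range(10_000))
    print(kept)
```

Keep in mind that any failure dropped by sampling cannot be replayed later, so sampling trades recovery completeness for cost.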
7. Frequently Asked Questions
Q1: How much latency does this add to requests?
The origin-request function adds approximately 3-5ms to each cache miss. It performs a single operation (copying the body to a header) with no external API calls. The origin-response function adds less than 1ms to successful requests (a single integer comparison). For failed requests, the console.log call adds approximately 5-10ms. In practice, this latency is negligible compared to origin response times.
Q2: Can I add custom headers in the origin-request function for other purposes?
Yes. Lambda@Edge can add, modify, or delete headers at the origin-request stage. However, be mindful of the 20 KB total request size limit for all headers combined. If you are already storing the request body in X-Original-Body, you have less room for other custom headers. Also ensure your custom header names do not conflict with headers expected by the origin server.
Q3: What happens if the request body exceeds the header size limit?
If the Base64-encoded request body is larger than the available header space (~15 KB after accounting for other headers), the body will be truncated. The implementation includes a truncation flag (X-Original-Body-Truncated: true) so the log consumer knows the body is incomplete. For workloads with very large payloads, consider an alternative approach: have the origin-request function write the body directly to S3 and store only the S3 key in the custom header.
Q4: Why are my Lambda@Edge logs not appearing in us-east-1?
Lambda@Edge functions execute at CloudFront edge locations, and their CloudWatch logs are written to the AWS region of that edge location. If a request is served by an edge location in Tokyo, the logs go to ap-northeast-1. If served from Frankfurt, they go to eu-central-1. To find logs for a specific request, you need to check the CloudWatch log group in the region corresponding to the edge location that served the request. This is why the log aggregation pipeline (Step 4) is essential for operational use.
Q5: How do I capture WAF-blocked requests? Lambda@Edge does not seem to fire for them.
Correct. When AWS WAF blocks a request, the request never reaches the origin stages of the CloudFront lifecycle. The origin-request and origin-response Lambda@Edge functions do not execute for WAF-blocked requests. To capture these, you must configure WAF logging separately (Step 5). WAF logging captures full request headers and up to 8 KB of the request body. The WAF logs are sent to CloudWatch Logs and can be forwarded to the same S3 bucket via Kinesis Data Firehose, giving you a unified log store for both types of failures.
8. Async Replay from S3
Once failed request records are stored in S3, you can build an async replay mechanism:
- S3 Event Notification triggers a Lambda function when new log files arrive.
- The Lambda function parses the JSON records, decodes the Base64 body, and reconstructs the original HTTP request.
- The reconstructed request is sent to the origin (bypassing CloudFront) with the original headers and body.
- Replay results (success/failure) are logged for audit purposes.
    import base64
    import json
    from urllib.parse import unquote_plus

    import boto3
    import urllib3

    http = urllib3.PoolManager()
    s3 = boto3.client('s3')

    def handler(event, context):
        for record in event['Records']:
            bucket = record['s3']['bucket']['name']
            # Object keys in S3 event notifications are URL-encoded
            key = unquote_plus(record['s3']['object']['key'])
            obj = s3.get_object(Bucket=bucket, Key=key)
            # Assumes the pipeline delivers uncompressed, newline-delimited
            # JSON records in the shape logged by the Step 2 function
            content = obj['Body'].read().decode('utf-8')

            for line in content.splitlines():
                if not line.strip():
                    continue
                entry = json.loads(line)

                # A truncated body cannot be replayed faithfully
                if entry.get('bodyTruncated'):
                    continue

                # Decode the body if present
                body = None
                if entry.get('body'):
                    encoding = entry.get('bodyEncoding', 'base64')
                    if encoding == 'base64':
                        body = base64.b64decode(entry['body'])
                    else:
                        body = entry['body'].encode('utf-8')

                # Reconstruct and send the request
                url = f"https://origin.example.com{entry['uri']}"
                if entry.get('querystring'):
                    url += f"?{entry['querystring']}"

                response = http.request(
                    entry['method'],
                    url,
                    headers=entry.get('headers', {}),
                    body=body
                )

                print(json.dumps({
                    'type': 'replay_result',
                    'originalRequestId': entry.get('requestId'),
                    'replayStatus': response.status,
                    'originalStatus': entry.get('status')
                }))
Important considerations for replay:
- Add idempotency checks to avoid duplicate processing
- Implement rate limiting to avoid overwhelming the origin
- Use a dead-letter queue for replay failures
- Add a replay window limit (e.g., only replay requests from the last 24 hours)
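Two of these considerations, idempotency and the replay window, can be combined into a single guard that runs before each replay attempt. The sketch below is a minimal in-memory version: the `seen_ids` set stands in for a durable store (for example a database table with conditional writes), and the function name and 24-hour constant are illustrative assumptions.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional, Set

REPLAY_WINDOW = timedelta(hours=24)


def should_replay(entry: dict, seen_ids: Set[str],
                  now: Optional[datetime] = None) -> bool:
    """Decide whether a logged failure is eligible for replay."""
    now = now or datetime.now(timezone.utc)

    # Idempotency: never replay the same request twice
    request_id = entry.get("requestId")
    if not request_id or request_id in seen_ids:
        return False

    # An incomplete body would produce a corrupt request
    if entry.get("bodyTruncated"):
        return False

    # Timestamps come from toISOString(), e.g. 2024-01-01T00:00:00.000Z
    logged_at = datetime.fromisoformat(entry["timestamp"].replace("Z", "+00:00"))
    if now - logged_at > REPLAY_WINDOW:
        return False

    seen_ids.add(request_id)
    return True
```

In the replay handler, the check would wrap the request-sending step, for example `if should_replay(entry, seen_ids): ...`, so skipped entries never reach the origin.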
9. Conclusion
The dual Lambda@Edge architecture provides a robust, non-intrusive solution for capturing complete failed request data at the CloudFront edge. The core value propositions are:
- Zero intrusiveness: No modifications to origin code, application logic, or existing infrastructure
- Information completeness: Full request headers and body for both WAF-blocked requests and origin errors
- Cost efficiency: Well under $100/month for 1 million requests per day in the scenario analyzed, with costs scaling roughly linearly with traffic
- Architectural simplicity: Two small Lambda functions, a standard log aggregation pipeline, and a single S3 bucket
The approach does have trade-offs — the 20 KB header size limit constrains body capture for large payloads, and cross-region log aggregation adds operational complexity. But for the vast majority of API workloads, these limitations are manageable, and the solution provides the complete observability needed for effective incident response and data recovery.