/plushcap/analysis/datadog/cloudfront-real-time-logs

Manage CloudFront real-time logs in Datadog

What's this blog post about?

Amazon CloudFront is a content delivery network (CDN) that minimizes latency by caching your content on AWS edge locations around the world. With CloudFront real-time logging, you can understand how efficiently CloudFront is distributing your content and responding to requests. You can collect CloudFront real-time logs in Datadog—in addition to CloudFront metrics—to get deep visibility into the health and performance of your CloudFront distribution. In this post, we’ll show you how to configure CloudFront to send real-time logs to Datadog, and how to organize, analyze, and alert on your logs. From CloudFront to Datadog, CloudFront sends real-time logs to Amazon Kinesis, a managed streaming data service. You can configure Kinesis to forward your logs to a destination of your choice, such as Datadog. To create your log configuration, provide a name for the configuration and specify its log sampling rate—the percentage of logs generated by CloudFront that you want to send to Kinesis. Next, select the fields to include in your logs. By default, your log configuration will include all of the available CloudWatch log fields, as shown in the screenshot below. You can easily configure Datadog to parse this format automatically. If you only want to log a subset of the available fields—for example, to reduce the amount of data in your stream—you can modify the log configuration’s Fields list and then create a custom log pipeline to parse your modified log format. Next, designate a Kinesis Data Stream as the endpoint to which CloudFront will send your logs. If you already have a stream you want to use, enter its Amazon Resource Name (ARN) in the Endpoint field. Or to create a new stream, click the Kinesis link, then fill in the fields shown in the screenshot below. Once Kinesis has created your data stream, copy its ARN as shown here, then navigate back to the Create real-time log configuration page and paste the ARN into the Endpoint field. To finish creating your real-time log configuration, specify the IAM role CloudFront will use to send your logs to Kinesis. Finally, select the CloudFront distribution and cache behaviors that will generate the logs, and click Create configuration. Kinesis Data Firehose is a managed service that can route streaming data in near real time to AWS services, HTTP endpoints, and third-party services like Datadog. Kinesis Data Firehose makes it easy to stream AWS service logs into Datadog—including real-time logs from CloudFront. To route your logs into Datadog, create a Kinesis Data Firehose delivery stream and choose the Kinesis Data Stream you created above as the source: Next, specify Datadog as your delivery stream’s destination, as shown in the screenshot below. Finally, enter your Datadog API key, select the appropriate HTTP endpoint URL, and provide the required additional configuration details. Once Datadog is ingesting your CloudFront real-time logs, you can use the Log Explorer to view, search, and filter your logs to better understand the performance of your CloudFront distribution. In this section, we’ll show you examples of CloudFront log fields you can use to investigate the source of errors and latency. But first, we’ll explain how you can use tags to make it easy to explore your CloudFront logs. Tags give you the ability to group and filter your logs on multiple dimensions and correlate them with data from other services you’re monitoring. Datadog automatically applies AWS tags like region and aws_account to your CloudFront logs, and you can add your own tags to associate your logs with related metrics from CloudFront and other AWS services like Amazon RDS or Elastic Load Balancing. To apply a custom tag to your CloudFront logs, create a parameter on your Kinesis Data Firehose delivery stream. In the screenshot below, we’ve added a parameter with a key of distributionid so we can isolate log data by distribution. If you’ve created a Kinesis Data Firehose delivery stream for each of your CloudFront distributions, then you can use the distributionid tag to easily correlate your logs with metrics from the same distribution. Now that you’re collecting and tagging your real-time CloudFront logs, you can analyze them in Datadog to monitor your distributions’ performance. You can create facets and measures based on the data in your logs to help you improve user experience by tuning parameters like cache size and time to live (TTL). For example, the x-edge-response-result-type field in your logs shows values such as Hit and Miss that describe how an edge location responded to each request. You can also create log-based metrics to monitor long-term trends in your logs. Like other metrics, log-based metrics are retained for 15 months at full granularity. In the screenshot below, we’ve created a log-based metric to track latency experienced by users in different geographies. To generate the metric, we queried the time-to-first-byte log field—which tells how long CloudFront took between receiving the request and beginning to respond—and grouped it by a log facet we created on the edge-location field. We can graph this metric to see the latency values logged by each of the edge locations used by our distributions, as shown in the screenshot below. Alerts Searching and filtering your logs is a powerful tool for investigation, but you can also use log monitors to proactively notify your team any time your CloudFront logs indicate a potential problem. The screenshot below shows an example of a log monitor that watches for an increase in the rate of 502 errors, which can occur when CloudFront is unable to communicate with your distribution’s origin server.

Company
Datadog

Date published
Nov. 13, 2020

Author(s)
David M. Lentz

Word count
1196

Language
English

Hacker News points
None found.