Steps to Configure AWS CloudWatch Logs with FluentD

In the world of computing, AWS is like a pizzeria offering a specialty pizza with a different flavor in every slice. With roughly 175 services, AWS gives you a feature-rich platform for building a tech stack that is highly scalable, secure, and reliable.

Choosing AWS as your cloud platform is the first step toward standing up your hosting stack. However, the job is only half done: it is equally important to monitor your AWS services and ensure that they remain available without disruption.

To help with that, AWS offers CloudWatch, a near-real-time monitoring platform for AWS cloud resources that provides actionable insights through metrics, logs, and alarms. Its metrics and logs help you monitor critical infrastructure resources, including bandwidth consumption, CPU usage, latency, and memory. For EKS/Kubernetes platforms specifically, the Fluentd centralized logging solution is a great addition to CloudWatch, working as a central layer for collecting log data and forwarding it to CloudWatch.

In this tutorial, we will walk you through setting up AWS CloudWatch Logs using a Fluentd Daemonset. By the end of it, you will have AWS CloudWatch Log Groups set up for your cluster, ready for critical analysis and projections.

Prerequisites

1. An active AWS account with CloudWatch enabled and the necessary IAM policy (CloudWatchAgentServerPolicy) attached to the worker nodes
2. An active Kubernetes cluster with Role-Based Access Control (RBAC) enabled
3. The kubectl command-line tool installed and configured on your local machine to connect to the Kubernetes cluster
4. Kubelet with Webhook authorization enabled
5. Your AWS cluster name and region name (used as attributes in the commands that follow)

Step 1: Setting Up Container Insights

CloudWatch Container Insights collects and analyzes metrics and logs for AWS resources, including memory, latency, and CPU. It uses a containerized version of the CloudWatch agent to discover all running containers in a cluster, then collects and helps analyze data at every layer of the performance stack.

A. Start by deploying CloudWatch Container Insights with the following command, which downloads the quickstart manifest, fills in the cluster and region placeholders, and pipes the result to kubectl apply:

$ curl https://raw.githubusercontent.com/aws-samples/amazon-cloudwatch-container-insights/latest/k8s-deployment-manifest-templates/deployment-mode/daemonset/container-insights-monitoring/quickstart/cwagent-fluentd-quickstart.yaml | sed "s/{{cluster_name}}/cluster-darwin/;s/{{region_name}}/maidstone-ln/" | kubectl apply -f -

Here, cluster-darwin and maidstone-ln are the AWS cluster and region, respectively, to which the CloudWatch logs will be published. Replace them with your own cluster_name and region_name values before executing the command.
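
Because sed fills in the placeholders before kubectl ever sees the manifest, it can help to preview the substitution on its own first. The following is a minimal sketch, assuming the template uses the {{cluster_name}} and {{region_name}} placeholders that appear in the command in this step. The variable names CLUSTER_NAME and REGION_NAME, the function name fill_placeholders, and the two-line sample template are illustrative and are not taken from the AWS manifest.

```shell
# Illustrative values; replace them with your own cluster and region.
CLUSTER_NAME="cluster-darwin"
REGION_NAME="maidstone-ln"

# Substitute both placeholders in whatever text arrives on stdin.
# The g flag covers lines where a placeholder appears more than once.
fill_placeholders() {
  sed "s/{{cluster_name}}/${CLUSTER_NAME}/g;s/{{region_name}}/${REGION_NAME}/g"
}

# Hypothetical two-line template standing in for the downloaded manifest.
sample_template='cluster_name: {{cluster_name}}
region: {{region_name}}'

# Render the sample and print it for inspection.
rendered=$(printf '%s\n' "$sample_template" | fill_placeholders)
printf '%s\n' "$rendered"
```

The same function could sit between curl and kubectl apply in place of the inline sed expression, which keeps the cluster and region values in one place.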

Step 2: Initializing Namespace

A. Start by creating a Namespace object file called project-darwin.yaml (you may use another name of your choice) in the nano editor with the following command:

$ nano project-darwin.yaml

B. Once the editor opens the file, enter the Namespace object as shown below:

project-darwin.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: darwin-cloudwatch

After entering the above details, save the file and close the editor. Note that this step only saves the Namespace object file project-darwin.yaml; it does not create the Namespace itself.
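
If you would rather skip the editor, for example when scripting the setup, the same file can be written with a shell here-document. This is a sketch of an alternative to the nano step, not a required part of the procedure; it produces the same project-darwin.yaml contents shown above.

```shell
# Write the Namespace manifest without opening an editor.
# The quoted 'EOF' delimiter stops the shell from expanding anything inside.
cat > project-darwin.yaml <<'EOF'
apiVersion: v1
kind: Namespace
metadata:
  name: darwin-cloudwatch
EOF

# Print the file so the contents can be checked before it is applied.
cat project-darwin.yaml
```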

C. After saving the Namespace file, create the Namespace itself with the kubectl create command as shown below:

$ kubectl create -f project-darwin.yaml

Executing the above command returns the following output on successful creation of the Namespace:

namespace/darwin-cloudwatch created

D. Next, verify that the Namespace was created correctly:

$ kubectl get namespaces

And this is what you should see:

NAME                STATUS    AGE
default             Active    23m
darwin-cloudwatch   Active    1m
kube-system         Active    23m

Once you are done with this, let us move on to the next step: setting up Fluentd in the darwin-cloudwatch Namespace.

Step 3: Deploying the Fluentd Daemonset

A. To have the Fluentd Daemonset collect logs from every worker node of the cluster, we begin by creating a ConfigMap:

$ kubectl create configmap darwin-info \
--from-literal=cluster.name=cluster-darwin \
--from-literal=logs.region=maidstone-ln -n darwin-cloudwatch

Here, darwin-info is the ConfigMap name, while cluster-darwin and maidstone-ln are the AWS cluster and region, respectively, to which the Fluentd logs are sent.

B. Next, download the Fluentd Daemonset manifest for the cluster-darwin cluster with the following command:

$ wget https://raw.githubusercontent.com/aws-samples/amazon-cloudwatch-container-insights/latest/k8s-deployment-manifest-templates/deployment-mode/daemonset/container-insights-monitoring/fluentd/fluentd.yaml -O fluentd.yaml

C. Once the fluentd.yaml file is downloaded, open it in the nano editor to set the correct Namespace:

$ nano fluentd.yaml

Then change the namespace of the Service Account in the editor as shown below:

fluentd.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: fluentd
  namespace: darwin-cloudwatch
  labels:
    app: fluentd

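With this change, applying the manifest creates a Service Account called fluentd in the darwin-cloudwatch Namespace. Make the same change to any other namespace field in the file so that every object lands in the same Namespace.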
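
The namespace edit can also be made without an editor. The sketch below assumes that the downloaded manifest declares its objects with "namespace: amazon-cloudwatch"; check your copy of fluentd.yaml before relying on that value. The file names fluentd-sample.yaml and fluentd-edited.yaml are hypothetical stand-ins so that the example runs on its own.

```shell
# Assumption: the manifest uses "namespace: amazon-cloudwatch" for its objects.
OLD_NS="amazon-cloudwatch"
NEW_NS="darwin-cloudwatch"

# Hypothetical stand-in for the downloaded fluentd.yaml.
cat > fluentd-sample.yaml <<'EOF'
apiVersion: v1
kind: ServiceAccount
metadata:
  name: fluentd
  namespace: amazon-cloudwatch
EOF

# Rewrite every matching namespace field into a new file,
# leaving the original untouched for comparison.
sed "s/namespace: ${OLD_NS}\$/namespace: ${NEW_NS}/" fluentd-sample.yaml > fluentd-edited.yaml

# List the namespace lines to confirm the change.
grep -n 'namespace:' fluentd-edited.yaml
```

Writing to a second file rather than editing in place makes it easy to diff the two versions before running kubectl apply.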

D. Once done, run the kubectl apply command to complete deployment of the Fluentd Daemonset.

$ kubectl apply -f fluentd.yaml

E. As the last step, verify the deployment of the Daemonset with the following command:

$ kubectl get ds --namespace=darwin-cloudwatch

And this is what you should see:

NAME      DESIRED   CURRENT   READY     UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
fluentd   18        18        18        18           18          <none>          58s
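
When this check runs inside a script, comparing the DESIRED and READY columns by eye is not an option. The following sketch parses the tabular output with awk; the function name daemonset_ready is hypothetical, and the sample text is the output shown in this step, standing in for a live cluster.

```shell
# Read `kubectl get ds` output on stdin and succeed only if the named
# DaemonSet shows the same number under DESIRED (column 2) and READY (column 4).
daemonset_ready() {
  awk -v name="$1" '
    $1 == name { found = 1; ok = ($2 == $4) }
    END { if (found && ok) exit 0; exit 1 }
  '
}

# Sample output copied from this tutorial, used in place of a live cluster.
sample_output='NAME      DESIRED   CURRENT   READY     UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
fluentd   18        18        18        18           18          <none>          58s'

if printf '%s\n' "$sample_output" | daemonset_ready fluentd; then
  echo "fluentd: all desired pods are ready"
else
  echo "fluentd: not ready yet"
fi
```

Against a real cluster, the function would be fed directly: kubectl get ds --namespace=darwin-cloudwatch | daemonset_ready fluentd.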

The kubectl get ds output shows that the Daemonset has been successfully deployed on 18 worker nodes within the cluster. To view the logs, log in to the CloudWatch console and look for the following log groups under the maidstone-ln region:

/aws/containerinsights/cluster-darwin/application
/aws/containerinsights/cluster-darwin/host
/aws/containerinsights/cluster-darwin/dataplane
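
The three names follow one pattern, so they can be generated for any cluster name. This is a small sketch based only on the names listed above; the function name log_groups_for is hypothetical.

```shell
# Print the three Container Insights log group names for a given cluster.
# The /aws/containerinsights/<cluster>/<type> pattern comes from the list above.
log_groups_for() {
  for kind in application host dataplane; do
    printf '/aws/containerinsights/%s/%s\n' "$1" "$kind"
  done
}

log_groups_for cluster-darwin
```

If the AWS CLI is installed and has credentials, the shared prefix can be checked from the terminal instead of the console, for example with aws logs describe-log-groups --log-group-name-prefix /aws/containerinsights/cluster-darwin, run against your own region.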

Final Comments

Without a doubt, AWS CloudWatch remains a powerful tool for monitoring resource usage through its default and custom metrics and its insightful logs. However, it is important to keep an eye on CloudWatch pricing tiers, especially the free allowance of metrics, alarms, and log volume your account is eligible for. By default, AWS provides a set of default metrics, logs, and alarms for free, which is a useful starting point. However, custom metrics and any usage beyond the free tier limits are billed according to usage. To learn more about CloudWatch pricing tiers, you can also use this pricing calculator.

We hope this tutorial helped you configure the Fluentd Daemonset for AWS CloudWatch. While the above steps were specifically for creating Log Groups, there is a slightly different approach to creating Cluster Metrics, which we shall explore in our next article.

Do let us know your feedback or any questions, and we shall be happy to respond. Till then, happy coding!


Sonali Sengupta