In Module 8, we explored the cloud-native ecosystem and CNCF projects like Prometheus and Fluentd. Now, we’ll dive into observability, the practice of understanding what’s happening in your Kubernetes applications by monitoring metrics and collecting logs. We’ll focus on Prometheus for monitoring and Fluentd for logging, with a hands-on exercise to see them in action. By the end of this module, you’ll know how to monitor your applications and troubleshoot issues.
Observability is like having a dashboard in a restaurant kitchen (our Kubernetes analogy from earlier modules) that shows how the food trucks (pods) are performing. It answers questions like: Are the trucks serving customers on time? Are any trucks broken? What’s causing delays?
In Kubernetes, observability involves three pillars:
Metrics: numeric measurements such as CPU usage, memory, and request rates (the focus of Prometheus).
Logs: text records of what applications and cluster components are doing (the focus of Fluentd).
Traces: records of how a single request travels through multiple services.
Analogy: Observability is the kitchen manager who watches gauges (metrics) and reads chef notes (logs) to keep the restaurant running smoothly.
Prometheus is a CNCF-graduated project for monitoring. It collects metrics from Kubernetes clusters and applications, stores them, and lets you query or visualize them (often with Grafana, a dashboard tool).
Example: For an Nginx web server, Prometheus tracks request counts or resource usage.
Analogy: Prometheus is a health inspector checking food trucks’ speed and efficiency, alerting you if something’s wrong.
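To get a feel for what querying metrics looks like, Prometheus uses a query language called PromQL. Two illustrative queries, using standard metric names that also appear later in this module:

# Per-second rate of HTTP requests over the last 5 minutes
rate(http_requests_total[5m])

# CPU usage per pod, from the kubelet's cAdvisor metrics
sum(rate(container_cpu_usage_seconds_total[5m])) by (pod)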
Fluentd is a CNCF project for logging. It collects logs from pods, processes them, and sends them to a central system (e.g., Elasticsearch or a file).
Example: If an Nginx pod logs a “404 Not Found” error, Fluentd captures it for review.
Analogy: Fluentd is a note-taker recording every chef’s activity log for later review.
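To make that concrete, here is a minimal sketch of a Fluentd configuration that tails an Nginx access log and simply prints each entry to stdout; the file paths and tag are illustrative, and in production the match block would forward to a central store such as Elasticsearch instead:

# fluent.conf (sketch): tail the Nginx access log and print entries to stdout
<source>
  @type tail
  path /var/log/nginx/access.log
  pos_file /var/log/fluentd-nginx-access.pos
  tag nginx.access
  <parse>
    @type nginx
  </parse>
</source>

<match nginx.access>
  @type stdout
</match>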
Grafana visualizes Prometheus metrics in dashboards, showing graphs like pod CPU usage or app response times. We’ll explore Grafana conceptually, as sandbox setups may limit its availability.
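Although we won’t deploy Grafana in this exercise, connecting it to Prometheus is mostly a matter of adding a data source that points at the Prometheus service. A minimal provisioning file might look like this (the URL assumes the prometheus-service we create later in this module):

# Grafana data source provisioning file (e.g., provisioning/datasources/prometheus.yaml)
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://prometheus-service:9090
    isDefault: true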
Let’s deploy an Nginx application and monitor it with Prometheus in Killercoda, a free, browser-based sandbox that supports Kubernetes and monitoring scenarios (we use it in place of the now-retired Katacoda). Due to sandbox constraints, we’ll focus on Prometheus and simulate Fluentd logging conceptually.
First, verify that the sandbox cluster is running:
kubectl get nodes
What to expect: Nodes listed with status Ready.
Create a deployment and service for Nginx. Save this as nginx-deployment.yaml:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx
---
apiVersion: v1
kind: Service
metadata:
  name: nginx-service
spec:
  type: NodePort
  selector:
    app: nginx
  ports:
  - port: 80
    targetPort: 80
    nodePort: 30080
Apply it:
kubectl apply -f nginx-deployment.yaml
Check the deployment and service:
kubectl get deployments
kubectl get services
What to expect: Two Nginx pods and a NodePort service.
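If you want to confirm the service responds, and generate a few access-log entries for the logging step later on, send a request to the NodePort (replace <node-ip> with a node IP from kubectl get nodes -o wide):

curl http://<node-ip>:30080

What to expect: The HTML of the default "Welcome to nginx!" page.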
Killercoda provides pre-configured scenarios for Prometheus. Use a simplified setup:
Create a Prometheus deployment and service. Save this as prometheus.yaml:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: prometheus
spec:
  replicas: 1
  selector:
    matchLabels:
      app: prometheus
  template:
    metadata:
      labels:
        app: prometheus
    spec:
      containers:
      - name: prometheus
        image: prom/prometheus
        ports:
        - containerPort: 9090
---
apiVersion: v1
kind: Service
metadata:
  name: prometheus-service
spec:
  type: NodePort
  selector:
    app: prometheus
  ports:
  - port: 9090
    targetPort: 9090
    nodePort: 30090
Apply it:
kubectl apply -f prometheus.yaml
Check Prometheus:
kubectl get pods
kubectl get services
What to expect: A Prometheus pod and a NodePort service.
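One note before querying: the prom/prometheus image ships with a default configuration that only scrapes Prometheus itself, and the stock nginx image doesn’t expose application metrics on its own (an exporter sidecar such as nginx/nginx-prometheus-exporter is typically added for that). In a fuller setup you would mount your own prometheus.yml, for example from a ConfigMap, with a scrape job that discovers pods. A minimal sketch, where the job name and annotation convention are illustrative:

# prometheus.yml (sketch): scrape pods annotated with prometheus.io/scrape: "true"
scrape_configs:
  - job_name: kubernetes-pods
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: "true"

Running Prometheus in-cluster with pod discovery also requires RBAC permissions to list pods, which the simplified deployment above omits.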
Open the Prometheus UI via the NodePort (e.g., http://<node-ip>:30090).
In the query bar, run http_requests_total to look for request metrics. With the minimal setup above, Nginx-specific metrics only appear once an exporter and scrape job are in place (see the note above); Prometheus’s own metrics, such as prometheus_http_requests_total, are available immediately.
What to expect: A graph or table showing request counts.
Run container_cpu_usage_seconds_total for pod CPU usage (this metric comes from the kubelet’s cAdvisor endpoint, which likewise has to be in the scrape configuration).
What to expect: CPU metrics for your Nginx pods once the kubelet is being scraped.
Fluentd setup is complex in Killercoda due to resource limits, so we’ll simulate logging instead:
Open a shell in one of the Nginx pods:
kubectl exec -it <pod-name> -- sh
Replace <pod-name> with a pod name from kubectl get pods.
View the access log:
cat /var/log/nginx/access.log
What to expect: Access logs (e.g., requests to the Nginx server).
Leave the pod’s shell with exit.
Note: In production, Fluentd would collect these logs centrally.
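In the official nginx image the access and error logs are symlinked to the container’s stdout and stderr, so the file above may appear empty; the same entries are then available through the pod’s log stream, which is also what a collector like Fluentd would pick up:

kubectl logs <pod-name>

What to expect: One line per request served by that pod (visit http://<node-ip>:30080 first if there’s no traffic yet).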
Delete the resources:
kubectl delete -f nginx-deployment.yaml
kubectl delete -f prometheus.yaml
What to expect: All resources removed.
Note for Beginners: Killercoda simplifies Prometheus setup for learning. If you can’t access the UI or prefer not to run commands, follow along to grasp the concepts. In a real cluster, Prometheus and Fluentd run as pods with more configuration.
For a local setup (advanced):
Install Minikube, Helm, and kubectl.
Start a cluster: minikube start.
Install Prometheus with Helm:
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm install prometheus prometheus-community/prometheus
Forward the Prometheus UI: kubectl port-forward svc/prometheus-server 9090:9090.
Open http://localhost:9090 to query metrics.
Clean up with helm uninstall prometheus, delete the Nginx resources, and run minikube stop.
Troubleshooting tips: If the Prometheus or Nginx UI doesn’t load, check the services and NodePorts with kubectl get services. If the access log is empty, visit the Nginx service (http://<node-ip>:30080) to populate logs. If something looks stuck, check pod status with kubectl get pods.
Quick knowledge check:
1. What is observability in Kubernetes?
A) Managing pod networking.
B) Monitoring and troubleshooting apps with metrics and logs.
C) Deploying containers.
2. What does Prometheus collect?
A) Logs from pods.
B) Metrics like CPU usage and request rates.
C) Persistent storage data.
3. What is Fluentd used for?
A) Visualizing metrics.
B) Collecting and aggregating logs.
C) Managing deployments.
Answers: 1-B, 2-B, 3-B
In Module 10, we’ll wrap up with Next Steps and Certification Prep, covering how to prepare for the KCNA exam, explore career paths, and continue your cloud-native journey.