Guoping Jia

Handling application performance monitoring on HPE GreenLake for Private Cloud Enterprise – Part 3: K8s monitoring using Apache SkyWalking

January 11, 2023

Why is Kubernetes monitoring so important?

HPE GreenLake for Private Cloud Enterprise delivers a modern private cloud to support your app workloads running in any combination across your edges, colocations, and data centers. It contains one HPE service, called HPE GreenLake for Private Cloud Enterprise: Containers, which provides an enterprise-grade container management service using open source Kubernetes. It allows customers to create a Kubernetes cluster, view details about existing clusters, and launch the service console.

Though Kubernetes dramatically simplifies application deployment in containers and across clouds, it adds a new set of complexities for managing, securing, and troubleshooting applications. Container-based applications are dynamic and increasingly designed as microservices, which increases the number of components by an order of magnitude.

Securing Kubernetes requires configuration that is typically specified as code, whether in Kubernetes YAML manifests, Helm charts, or templating tools. Properly configuring workloads, clusters, networks, and infrastructure is crucial for averting issues and limiting the impact if a breach occurs. Dynamic provisioning via infrastructure as code, automated configuration management, and orchestration also add to monitoring and troubleshooting complexity.

Since Kubernetes is widely used for processing customer workloads, unavailability of the workloads or of the cluster itself, whether from misconfiguration of core components or from network connectivity issues, can adversely impact productivity, business continuity, and user experience. To avoid this, enterprises must closely monitor the status of the objects managed and the operations performed by Kubernetes, proactively capture abnormalities, and resolve them well before end users notice.

Kubernetes monitoring is critical to managing application performance, service uptime, and troubleshooting. However, it challenges a traditional, static monitoring approach and emphasizes the need for real-time monitoring. A good application performance monitoring (APM) tool is becoming essential for Kubernetes monitoring.

In my first blog post, I walked through some of the best APM tools, described their key features, and discussed their strengths and weaknesses in detail. In this blog post, I pick one of those tools, Apache SkyWalking, and describe in detail how to set it up in HPE GreenLake for Private Cloud Enterprise to monitor a Kubernetes cluster.

Apache SkyWalking

Apache SkyWalking is an open source application performance monitoring (APM) tool, especially designed for microservices, cloud native, and container-based architectures.

Apache SkyWalking is lightweight and scalable. It can be easily set up as a self-managed APM tool within an on-premises data center. This avoids leaking customer data to third-party services and matches well with the strict security requirements of the HPE GreenLake for Private Cloud environment.

Prerequisites

Before starting, make sure you have the following:

  • A Kubernetes cluster provisioned in HPE GreenLake for Private Cloud Enterprise
  • The kubectl CLI tool, together with the HPE kubectl plugin and the kubeconfig file of the Kubernetes cluster. You can download them from the launched service console in HPE GreenLake for Private Cloud Enterprise. To simplify the setup process, export the environment variable KUBECONFIG and point it to the downloaded kubeconfig file, as shown in the sketch after this list.
  • The Helm CLI tool, version 3.8.1 or later
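
As a minimal sketch of that environment setup (the kubeconfig file name below is hypothetical; use the path of the file you actually downloaded):

$ # Point kubectl and the HPE kubectl plugin at the downloaded kubeconfig file
$ export KUBECONFIG=$HOME/Downloads/mycluster-kubeconfig.yaml
$ # Quick check that the cluster is reachable
$ kubectl get nodes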

Your user account must also have permissions to create and update the following resources in the Kubernetes cluster:

  • Kubernetes Service Account(s)
  • Kubernetes ClusterRoles & ClusterRoleBindings

Set up Apache SkyWalking for Kubernetes monitoring

Apache SkyWalking leverages the Kubernetes kube-state-metrics service for collecting metrics data from the Kubernetes cluster. It then uses the OpenTelemetry collector to transfer these Kubernetes metrics to the OpenTelemetry receiver in the Apache SkyWalking Observability Analysis Platform (OAP) for Kubernetes monitoring.
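
Putting these pieces together, the metrics pipeline built in the following sections looks, in simplified form, like this:

kube-state-metrics / cAdvisor --(Prometheus scrape)--> OpenTelemetry collector
OpenTelemetry collector --(OpenCensus gRPC, port 11800)--> SkyWalking OAP (OpenTelemetry receiver)
SkyWalking OAP --> Elasticsearch storage --> SkyWalking UI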

Deploy Apache SkyWalking

In this blog post, I set up Apache SkyWalking as a self-managed APM tool within the Kubernetes cluster created in HPE GreenLake for Private Cloud Enterprise. This approach matches the strict security requirements of the HPE GreenLake for Private Cloud environment.

To start, install Apache SkyWalking using its Helm chart, with Elasticsearch as the storage backend:

$ git clone https://github.com/apache/skywalking-kubernetes 
$ cd skywalking-kubernetes/chart
$ helm repo add elastic https://helm.elastic.co
$ helm dep up skywalking
$ kubectl create ns skywalking
$ helm install skywalking skywalking -n skywalking \
--set oap.image.tag=9.2.0 \
--set oap.storageType=elasticsearch \
--set ui.image.tag=9.2.0 \
--set elasticsearch.imageTag=7.1.1 \
--set elasticsearch.persistence.enabled=true \
--set elasticsearch.sysctlInitContainer.enabled=false \
--set oap.env.SW_OTEL_RECEIVER=default \
--set oap.env.SW_OTEL_RECEIVER_ENABLED_OC_RULES="k8s-cluster\,k8s-service\,k8s-node"

After running the above commands, Apache SkyWalking is installed in the skywalking namespace of the Kubernetes cluster. The installation deploys Elasticsearch as a StatefulSet, running a pod on each worker node, and runs the Apache SkyWalking OAP with two replicas to provide high availability.

The last two options, oap.env.SW_OTEL_RECEIVER=default and oap.env.SW_OTEL_RECEIVER_ENABLED_OC_RULES="k8s-cluster,k8s-service,k8s-node", enable the OpenTelemetry receiver and select the analysis rules for Kubernetes cluster, service, and node metrics. This requires the Apache SkyWalking OAP to have access to the Kubernetes API server to query the metadata.
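
As an optional sanity check (not part of the original steps), you can confirm that these two options landed in the OAP deployment's environment:

$ kubectl get deployment skywalking-oap -n skywalking \
  -o jsonpath='{.spec.template.spec.containers[0].env}'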

You can check the detailed Apache SkyWalking installation by typing the following kubectl command:

$ kubectl get all -n skywalking
NAME                                  READY   STATUS      RESTARTS   AGE
pod/elasticsearch-master-0            1/1     Running     0          8m7s
pod/elasticsearch-master-1            1/1     Running     0          8m7s
pod/elasticsearch-master-2            1/1     Running     0          8m7s
pod/skywalking-es-init-m9t5c          0/1     Completed   0          8m7s
pod/skywalking-oap-7f757c7668-nq2cz   1/1     Running     0          8m8s
pod/skywalking-oap-7f757c7668-q8z7m   1/1     Running     0          8m8s
pod/skywalking-ui-549dc5989f-jq9b9    1/1     Running     0          8m8s

NAME                                    TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)               AGE
service/elasticsearch-master            ClusterIP   10.110.35.173    <none>        9200/TCP,9300/TCP     8m5s
service/elasticsearch-master-headless   ClusterIP   None             <none>        9200/TCP,9300/TCP     8m5s
service/skywalking-oap                  ClusterIP   10.108.29.84     <none>        11800/TCP,12800/TCP   8m5s
service/skywalking-ui                   ClusterIP   10.102.186.131   <none>        80/TCP                8m5s

NAME                             READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/skywalking-oap   2/2     2            2           8m6s
deployment.apps/skywalking-ui    1/1     1            1           8m6s

NAME                                        DESIRED   CURRENT   READY   AGE
replicaset.apps/skywalking-oap-7f757c7668   2         2         2       8m9s
replicaset.apps/skywalking-ui-549dc5989f    1         1         1       8m9s

NAME                                    READY   AGE
statefulset.apps/elasticsearch-master   3/3     8m5s

NAME                           COMPLETIONS   DURATION   AGE
job.batch/skywalking-es-init   1/1           7m27s      8m6s

You can edit the deployed SkyWalking UI service skywalking-ui and change its type from ClusterIP to NodePort. The service will be automatically mapped to the gateway host with an assigned port.

$ kubectl edit service/skywalking-ui -n skywalking
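
If you prefer a non-interactive change over opening the service in an editor, an equivalent one-liner is a merge patch (a sketch of an alternative command, not a step from the original setup):

$ kubectl patch service skywalking-ui -n skywalking -p '{"spec":{"type":"NodePort"}}'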

$ kubectl describe service/skywalking-ui -n skywalking
Name:                     skywalking-ui
Namespace:                skywalking
Labels:                   app=skywalking
                          app.kubernetes.io/managed-by=Helm
                          chart=skywalking-4.2.0
                          component=ui
                          heritage=Helm
                          hpecp.hpe.com/hpecp-internal-gateway=true
                          release=skywalking
Annotations:              hpecp-internal-gateway/80: gl2-caas.gl-hpe.local:10037
                          meta.helm.sh/release-name: skywalking
                          meta.helm.sh/release-namespace: skywalking
Selector:                 app=skywalking,component=ui,release=skywalking
Type:                     NodePort
IP:                       10.102.186.131
Port:                     <unset>  80/TCP
TargetPort:               8080/TCP
NodePort:                 <unset>  32748/TCP
Endpoints:                10.192.7.25:8080
Session Affinity:         None
External Traffic Policy:  Cluster
Events:                   <none>

As shown in the Annotations section of the service description above, the SkyWalking UI can then be accessed in your browser at the address gl2-caas.gl-hpe.local:10037.
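
If your workstation cannot reach the gateway host, a standard kubectl port-forward is an alternative way to reach the UI (a generic sketch, not part of the original setup):

$ kubectl port-forward service/skywalking-ui 8080:80 -n skywalking
$ # The UI is then reachable at http://localhost:8080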

Deploy kube-state-metrics service

The Kubernetes kube-state-metrics service will be deployed to listen to the Kubernetes API server and generate metrics about the state of the Kubernetes objects.

$ helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
$ helm install kube-state-metrics -n skywalking prometheus-community/kube-state-metrics
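
Before wiring in the collector, you can verify that kube-state-metrics is up and serving (an optional check; the label selector assumes the chart's default labels):

$ kubectl get pods,svc -n skywalking -l app.kubernetes.io/name=kube-state-metrics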

Set up OpenTelemetry collector

The OpenTelemetry collector needs to be installed and configured to transfer the Kubernetes metrics to the OpenTelemetry receiver of the SkyWalking OAP server. I use the standard Docker image otel/opentelemetry-collector:0.50.0 to deploy the OpenTelemetry collector to the Kubernetes cluster.

Set up role-based access control (RBAC)

Kubernetes RBAC is a key security control to ensure that cluster users and workloads have access only to resources required to execute their roles. It is important to ensure that, when designing permissions for cluster users, the cluster administrator understands the areas where privilege escalation could occur, to reduce the risk of excessive access leading to security incidents.

To set up RBAC, you create a ServiceAccount and a ClusterRole, then connect the two with a ClusterRoleBinding.

1. Create a YAML file otel-sa-kubernetes-monitor.yaml for the service account:
apiVersion: v1
kind: ServiceAccount
metadata:
  name: otel-sa-kubernetes-monitor
2. Create a YAML file otel-role-kubernetes-monitor.yaml for the ClusterRole:
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: otel-role-kubernetes-monitor
rules:
  - apiGroups: [ "" ]
    resources:
      # @feature: kubernetes-monitor; permissions to read resources
      - "endpoints"
      - "pods"
      - "services"
      - "nodes"
      - "nodes/metrics"
      - "nodes/proxy"
    verbs: [ "get", "watch", "list" ]
3. Create a YAML file otel-role-binding-kubernetes-monitor.yaml to bind the service account to the ClusterRole:
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: otel-role-binding-kubernetes-monitor
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: otel-role-kubernetes-monitor
subjects:
  - kind: ServiceAccount
    name: otel-sa-kubernetes-monitor
    namespace: skywalking
4. Deploy the service account, the ClusterRole, and the ClusterRoleBinding:
$ kubectl apply -f otel-sa-kubernetes-monitor.yaml -n skywalking
$ kubectl apply -f otel-role-kubernetes-monitor.yaml -n skywalking
$ kubectl apply -f otel-role-binding-kubernetes-monitor.yaml -n skywalking
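
To confirm the binding took effect, kubectl auth can-i can impersonate the new service account; it should answer yes for the verbs granted above (an optional check, not part of the original steps):

$ kubectl auth can-i list nodes --as=system:serviceaccount:skywalking:otel-sa-kubernetes-monitor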

Deploy OpenTelemetry collector

1. Create a YAML file otel-collector-config.yaml to set the OpenTelemetry config to scrape the Kubernetes metrics:
apiVersion: v1
kind: ConfigMap
metadata:
  name: otel-collector-conf-kubernetes-monitor
  labels:
    app: otel-kubernetes-monitor
data:
  otel-collector-config: |
    service:
      pipelines:
        metrics:
          receivers: [ prometheus ]
          exporters: [ opencensus, logging ]
    exporters:
      opencensus:
        endpoint: "skywalking-oap.skywalking.svc.cluster.local:11800"
        tls:
          insecure: true
      logging:
        loglevel: debug
    receivers:
      prometheus:
        config:
          scrape_configs:
          # @feature: kubernetes-monitor; configuration to scrape Kubernetes Nodes metrics
          - job_name: 'kubernetes-cadvisor'
            scheme: https
            tls_config:
              ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
            bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
            kubernetes_sd_configs:
              - role: node
            relabel_configs:
              - action: labelmap
                regex: __meta_kubernetes_node_label_(.+)
              - source_labels: []
                target_label: cluster
                replacement: cfe-iac-clu
              - target_label: __address__
                replacement: kubernetes.default.svc:443
              - source_labels: [__meta_kubernetes_node_name]
                regex: (.+)
                target_label: __metrics_path__
                replacement: /api/v1/nodes/$${1}/proxy/metrics/cadvisor
              - source_labels: [instance]
                separator: ;
                regex: (.+)
                target_label: node
                replacement: $$1
                action: replace
          # @feature: kubernetes-monitor; configuration to scrape Kubernetes Endpoints metrics
          - job_name: kube-state-metrics
            metrics_path: /metrics
            kubernetes_sd_configs:
            - role: endpoints
            relabel_configs:
            - source_labels: [__meta_kubernetes_service_label_app_kubernetes_io_name]
              regex: kube-state-metrics
              replacement: $$1
              action: keep
            - action: labelmap
              regex: __meta_kubernetes_service_label_(.+)
            - source_labels: []
              target_label: cluster
              replacement: cfe-iac-clu
2. Create a YAML file otel-collector-deploy.yaml for the OpenTelemetry collector deployment:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: otel-deployment-kubernetes-monitor
  labels:
    app: otel-kubernetes-monitor
spec:
  replicas: 1
  selector:
    matchLabels:
      app: otel-kubernetes-monitor
  template:
    metadata:
      labels:
        app: otel-kubernetes-monitor
      annotations:
        sidecar.istio.io/inject: "false"
    spec:
      serviceAccountName: otel-sa-kubernetes-monitor
      containers:
        - name: otel-kubernetes-monitor
          image: otel/opentelemetry-collector:0.50.0
          command:
            - "/otelcol"
            - "--config=/conf/otel-collector-config.yaml"
          volumeMounts:
            - name: otel-collector-config-vol-kubernetes-monitor
              mountPath: /conf
      volumes:
        - name: otel-collector-config-vol-kubernetes-monitor
          configMap:
            name: otel-collector-conf-kubernetes-monitor
            items:
              - key: otel-collector-config
                path: otel-collector-config.yaml
3. Deploy the OpenTelemetry collector:
$ kubectl apply -f otel-collector-config.yaml -n skywalking
$ kubectl apply -f otel-collector-deploy.yaml -n skywalking
$ kubectl  get all -n skywalking -l app=otel-kubernetes-monitor
NAME                                                      READY   STATUS    RESTARTS   AGE
pod/otel-deployment-kubernetes-monitor-798cdd8486-gz885   1/1     Running   0          93d

NAME                                                 READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/otel-deployment-kubernetes-monitor   1/1     1            1           96d

NAME                                                            DESIRED   CURRENT   READY   AGE
replicaset.apps/otel-deployment-kubernetes-monitor-798cdd8486   1         1         1       96d
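
Because the logging exporter was set to debug in the ConfigMap above, the collector logs should show the scraped metrics being exported. A quick way to spot-check (optional):

$ kubectl logs deployment/otel-deployment-kubernetes-monitor -n skywalking | head -50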

After all setup steps are finished, the Kubernetes metrics show up in the SkyWalking UI under the Kubernetes tab. From there you can check the Kubernetes cluster overview, the list of nodes, the details of each worker node, and the Kubernetes services.

Conclusion

In this blog post, I discussed the challenges of Kubernetes monitoring and why it's important in HPE GreenLake for Private Cloud Enterprise. I then chose Apache SkyWalking as the application performance monitoring (APM) tool and showed the detailed process of setting it up as a self-managed environment in HPE GreenLake for Private Cloud Enterprise for monitoring a Kubernetes cluster. This setup provides a way to gain visibility into the objects managed and operations performed by Kubernetes, and to resolve issues in the cluster.
