Troubleshooting Kubernetes with Robusta

kmaster 2022. 2. 27. 10:17

Robusta

Robusta is an open-source platform for troubleshooting Kubernetes. It sits on top of your monitoring stack (Prometheus, Elasticsearch, etc.) and tells you why an alert fired and how to fix it.

Architecture

Source: https://docs.robusta.dev/master/architecture.html

There are two essential components.

robusta-forwarder

Connects to the API server, watches for Kubernetes changes, and forwards them to robusta-runner.

robusta-runner

Runs the playbooks.

How it works

triggers:
  - on_prometheus_alert:
      alert_name: KubePodCrashLooping
actions:
  - logs_enricher: {}
sinks:
  - slack

A playbook has three main parts.

Triggers : when to run ( alerts, logs, changes, etc. )
Actions : what to do ( more than 50 built-in actions )
Sinks : where to send the results ( Slack, etc. )
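
To make the trigger / action / sink split concrete, here is a minimal sketch in plain Python. It is only a conceptual model: `Playbook`, `on_alert`, `describe_pod`, and the in-memory `received` sink are hypothetical names made up for illustration, not Robusta classes or its actual dispatch logic.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical types for illustration only; these are not Robusta classes.
Event = Dict[str, str]
Action = Callable[[Event], str]
Sink = Callable[[str], None]


@dataclass
class Playbook:
    trigger: Callable[[Event], bool]  # when to run
    actions: List[Action]             # what to do
    sinks: List[Sink]                 # where to send the results

    def handle(self, event: Event) -> int:
        """Run every action and deliver each result to every sink.

        Returns the number of findings delivered (0 if the trigger did not match).
        """
        if not self.trigger(event):
            return 0
        findings = [action(event) for action in self.actions]
        for finding in findings:
            for sink in self.sinks:
                sink(finding)
        return len(findings)


def on_alert(alert_name: str) -> Callable[[Event], bool]:
    """Build a trigger that matches events carrying the given alert name."""
    return lambda event: event.get("alert_name") == alert_name


def describe_pod(event: Event) -> str:
    """Stand-in action: summarise the event instead of fetching real logs."""
    return f"{event['alert_name']} fired on pod {event['pod']}"


received: List[str] = []  # stand-in sink that collects messages in memory

playbook = Playbook(
    trigger=on_alert("KubePodCrashLooping"),
    actions=[describe_pod],
    sinks=[received.append],
)

playbook.handle({"alert_name": "KubePodCrashLooping", "pod": "crashpod"})
playbook.handle({"alert_name": "SomethingElse", "pod": "other"})  # ignored
print(received)
```

Only the first event matches the trigger, so only one message reaches the sink. In Robusta itself the same wiring is declared in YAML rather than built in code.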

Many automations are built in, but you can also write your own in Python.

Example:

from robusta.api import *

# this runs on Prometheus alerts you specify in the YAML
@action
def my_enricher(event: PrometheusKubernetesAlert):
    # we have full access to the pod on which the alert fired
    pod = event.get_pod()
    pod_name = pod.metadata.name
    pod_logs = pod.get_logs()
    pod_processes = pod.exec("ps aux")

    # this is how you send data to slack or other destinations
    event.add_enrichment([
        MarkdownBlock("*Oh no!* An alert occurred on " + pod_name),
        FileBlock("crashing-pod.log", pod_logs)
    ])

Installation

Add the Helm chart repository and install Robusta with Helm.

# helm repo add robusta https://robusta-charts.storage.googleapis.com && helm repo update
# helm show values robusta/robusta > values.yaml
  -> Edit values.yaml as needed. ( clusterName is required, so set it to any value. )

# helm install robusta robusta/robusta -f ./values.yaml -n robusta --create-namespace

# kubectl get pods -n robusta
NAME                                READY   STATUS    RESTARTS   AGE
robusta-forwarder-f9fd44b9c-h95n7   1/1     Running   0          3m29s
robusta-runner-7c64df6675-lm9tx     2/2     Running   0          3m29s

Example

Add the following to values.yaml.

...

sinksConfig:
- webhook_sink:
    name: webhook_sink
    url: "https://alert.test.svc.cluster.local/robusta-alerts"

...

customPlaybooks:
- triggers:
    - on_deployment_update: {}
  actions:
    - resource_babysitter:
        omitted_fields: []
        fields_to_monitor: ["spec.replicas"]

...

1) When a change to replicas in a Deployment spec is detected, a webhook is sent to https://alert.test.svc.cluster.local/robusta-alerts.

2) The service at https://alert.test.svc.cluster.local/robusta-alerts prints the received payload to stdout.

The source for robusta-alerts is available at https://github.com/kmaster8/kubernetes/tree/main/alerts.
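
For readers who do not want to open the repository, a receiver of this kind can be sketched with the Python standard library alone. This is not the code from the linked repository; it is a minimal stand-in that assumes the webhook arrives as an HTTP POST with the text in the body. The handler name, the in-memory `received` list, and the made-up test payload are all illustrative.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

received = []  # payloads seen so far, kept only so the demo can inspect them


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        # Read exactly Content-Length bytes; a missing header means an empty body.
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        received.append(body)
        print("Webhook Received\nPayload:")
        print(body, flush=True)
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"ok")

    def log_message(self, format, *args) -> None:
        pass  # silence the default per-request access log


# Port 0 asks the OS for any free port; a real deployment would pin one.
server = HTTPServer(("127.0.0.1", 0), WebhookHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

# Simulate one delivery with a made-up payload.
connection = http.client.HTTPConnection("127.0.0.1", server.server_port, timeout=5)
connection.request("POST", "/robusta-alerts", body=b"*spec.replicas*: 1 ==> 2")
status = connection.getresponse().status
connection.close()

server.shutdown()
server.server_close()
```

The demo starts the server, posts one payload to itself, and shuts down, so it can be run as-is. Inside a cluster you would instead keep `serve_forever()` running and expose it through a Service.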

After making the changes, upgrade the release as follows.

# helm upgrade robusta robusta/robusta --values=values.yaml -n robusta

Now scale the Deployment's replicas and check the robusta-alerts logs.

# kubectl scale --replicas=2 -n test deployments/rollouts-bluegreen

# kubectl get po -n test alert-6bbccd56d5-qgxv8
NAME                     READY   STATUS    RESTARTS   AGE
alert-6bbccd56d5-qgxv8   1/1     Running   0          26s
# kubectl logs -f -n test alert-6bbccd56d5-qgxv8
 * Environment: production
   WARNING: This is a development server. Do not use it in a production deployment.
   Use a production WSGI server instead.
 * Debug mode: off
 * Running on all addresses.
   WARNING: This is a development server. Do not use it in a production deployment.
 * Running on http://172.32.183.189:5000/ (Press CTRL+C to quit)
 
 Webhook Received
Payload:
b'\ndeployment/test/rollouts-bluegreen.yaml updated\n\nSource: mycluster\n\nUpdates to significant fields: 0 additions, 0 deletions, 1 changes.\n\n*spec.replicas*: 1 ==> 2\n'
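
The payload is plain text, so a receiver that wants to act on it has to pull the fields out itself. The sketch below assumes the `*field*: old ==> new` line format seen in the log above stays stable; `parse_changes` is a hypothetical helper, not part of Robusta.

```python
import re
from typing import Dict, Tuple

# Matches lines such as "*spec.replicas*: 1 ==> 2".
CHANGE_LINE = re.compile(r"^\*(?P<field>[^*]+)\*: (?P<old>.*) ==> (?P<new>.*)$")


def parse_changes(payload: bytes) -> Dict[str, Tuple[str, str]]:
    """Map each changed field to its (old, new) values, both kept as strings."""
    changes = {}
    for line in payload.decode("utf-8").splitlines():
        match = CHANGE_LINE.match(line.strip())
        if match:
            changes[match["field"]] = (match["old"], match["new"])
    return changes


payload = (
    b"\ndeployment/test/rollouts-bluegreen.yaml updated\n\nSource: mycluster\n\n"
    b"Updates to significant fields: 0 additions, 0 deletions, 1 changes.\n\n"
    b"*spec.replicas*: 1 ==> 2\n"
)
print(parse_changes(payload))  # {'spec.replicas': ('1', '2')}
```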

Next, consider a Pod that goes into CrashLoopBackOff. Deploy a deliberately crashing Pod:

# kubectl apply -f https://gist.githubusercontent.com/robusta-lab/283609047306dc1f05cf59806ade30b6/raw

...

Webhook Received
Payload:
b'\nCrashing pod crashpod-d969884cb-jshpv in namespace default\n\nSource: mycluster\n\n*crashpod* restart count: 2\n\n*crashpod* waiting reason: CrashLoopBackOff\n\n*crashpod* termination reason: Error\n'

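As shown, you can receive a variety of webhooks. With a webhook you can send email or SMS, or integrate with other messaging services. Beyond the features Robusta provides out of the box, you can also extend it with custom actions written in Python. In a follow-up post, we will test how to use Python code to diagnose system problems.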
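
As one illustration of the email case, the receiver could wrap each payload in a message before handing it to a mail server. This is a sketch under assumptions: `build_alert_email` is a hypothetical helper, the addresses are placeholders, and nothing is actually sent here.

```python
from email.message import EmailMessage


def build_alert_email(payload: bytes, sender: str, recipient: str) -> EmailMessage:
    """Wrap a webhook payload in an email; the first non-empty line becomes the subject."""
    text = payload.decode("utf-8").strip()
    subject = text.splitlines()[0] if text else "Robusta alert"
    message = EmailMessage()
    message["From"] = sender
    message["To"] = recipient
    message["Subject"] = subject
    message.set_content(text)
    return message


alert_email = build_alert_email(
    b"\nCrashing pod crashpod-d969884cb-jshpv in namespace default\n\nSource: mycluster\n",
    sender="robusta@example.com",    # placeholder address
    recipient="oncall@example.com",  # placeholder address
)
print(alert_email["Subject"])
```

Delivery could then go through `smtplib.SMTP(...).send_message(alert_email)` against whatever mail server is available in your environment.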
