Eugeny Shtoltc

IT Cloud



container="POD", container_name="POD", device="/dev/vda1", id="/kubepods.slice/kubepods-besteffort.slice/kubepods-besteffort-pod5a815a40_f2de_11ea_88d2_0242ac110032.slice/docker-76711789af076c8f2331d8212dad4c044d263c5cc3fa333347921bd6de7950a4.scope", image="k8s.gcr.io/pause:3.1", instance="controlplane", job="kubernetes-cadvisor", kubernetes_io_arch="amd64", kubernetes_io_hostname="controlplane", kubernetes_io_os="linux", name="k8s_POD_kube-proxy-nhzhn_kube-system_5a815a40-f2de-11ea-88d2-0242ac110032_0", namespace="kube-system", pod="kube-proxy-nhzhn", pod_name="kube-proxy-nhzhn"}

      253741748224

      The same metric also covers RAM-backed file systems, which appear under the tmpfs device: "container_fs_limit_bytes{device="tmpfs"} / 1000 / 1000 / 1000"

      {beta_kubernetes_io_arch="amd64", beta_kubernetes_io_os="linux", device="tmpfs", id="/", instance="controlplane", job="kubernetes-cadvisor", kubernetes_io_arch="amd64", kubernetes_io_hostname="controlplane", kubernetes_io_os="linux"} 0.209702912

      {beta_kubernetes_io_arch="amd64", beta_kubernetes_io_os="linux", device="tmpfs", id="/", instance="node01", job="kubernetes-cadvisor", kubernetes_io_arch="amd64", kubernetes_io_hostname="node01", kubernetes_io_os="linux"} 0.409296896

      If we want the smallest disk, we need to exclude the RAM device from the selection: "min(container_fs_limit_bytes{device!="tmpfs"} / 1000 / 1000 / 1000)"

      {} 253.74174822400002
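The filtering and unit conversion performed by this query can be mimicked outside Prometheus. The following is a minimal sketch, not how Prometheus evaluates the expression: the three series are the values shown in the results above (converted back to bytes for tmpfs), the label sets are trimmed to two labels, and the helper names to_gb and min_limit_gb are made up for illustration.

```python
# Sample values taken from the query results in the text, in bytes.
# Only the device and instance labels are kept to keep the example short.
series = [
    ({"device": "/dev/vda1", "instance": "controlplane"}, 253741748224),
    ({"device": "tmpfs", "instance": "controlplane"}, 209702912),
    ({"device": "tmpfs", "instance": "node01"}, 409296896),
]


def to_gb(num_bytes):
    """Decimal gigabytes, the same as dividing by 1000 three times."""
    return num_bytes / 1000 / 1000 / 1000


def min_limit_gb(series, excluded_device="tmpfs"):
    """Smallest limit in GB among series whose device label differs from excluded_device."""
    values = [to_gb(value) for labels, value in series
              if labels["device"] != excluded_device]
    return min(values)


print(min_limit_gb(series))
```

Without the device filter the minimum would be the 0.2 GB tmpfs entry, which is why the selector excludes it before min is applied.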

      Besides metrics that report a current value directly, there are counters, whose names usually end in "_total". Plotted as is, a counter is an ever-rising line. To get a useful value, we take its rate of change with the rate function over a time window given in square brackets, for example rate(name_metric_total[time]). The window is usually given in seconds or minutes: the suffix "s" denotes seconds (40s, 60s) and the suffix "m" denotes minutes (2m, 5m). Note that the window cannot be shorter than the exporter polling interval: rate needs at least two samples inside the window, otherwise the metric will not be displayed.
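To make the idea concrete, here is a simplified sketch of what a rate over a window amounts to. It is an illustration only: the real rate function in Prometheus also handles counter resets and extrapolates to the window edges, which this sketch ignores. The names parse_duration and simple_rate, and the sample data (a counter polled every 15 seconds), are hypothetical.

```python
from typing import List, Optional, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, counter value)

UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600}


def parse_duration(text: str) -> int:
    """Convert a duration such as '40s' or '5m' to seconds."""
    return int(text[:-1]) * UNIT_SECONDS[text[-1]]


def simple_rate(samples: List[Sample], now: float, window: str) -> Optional[float]:
    """Per-second increase over the window, or None with fewer than two samples."""
    start = now - parse_duration(window)
    in_window = [s for s in samples if start <= s[0] <= now]
    if len(in_window) < 2:
        return None  # nothing to compare, so no value can be shown
    (t0, v0), (t1, v1) = in_window[0], in_window[-1]
    return (v1 - v0) / (t1 - t0)


# Hypothetical counter polled every 15 seconds.
samples = [(0, 100.0), (15, 130.0), (30, 160.0), (45, 190.0), (60, 220.0)]

print(simple_rate(samples, now=60, window="1m"))   # 2.0
print(simple_rate(samples, now=60, window="10s"))  # None
```

The second call shows the limitation mentioned above: a 10-second window contains only one sample of a counter polled every 15 seconds, so there is no difference to compute.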

      The names of the available metrics can be seen at the /metrics path:

      controlplane $ curl https://2886795314-9090-ollie08.environments.katacoda.com/metrics 2>/dev/null | head

      # HELP go_gc_duration_seconds A summary of the GC invocation durations.

      # TYPE go_gc_duration_seconds summary

      go_gc_duration_seconds{quantile="0"} 3.536e-05

      go_gc_duration_seconds{quantile="0.25"} 7.5348e-05

      go_gc_duration_seconds{quantile="0.5"} 0.000163193

      go_gc_duration_seconds{quantile="0.75"} 0.001391603

      go_gc_duration_seconds{quantile="1"} 0.246707852

      go_gc_duration_seconds_sum 0.388611299

      go_gc_duration_seconds_count 74

      # HELP go_goroutines Number of goroutines that currently exist.
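The output above follows a simple line-based layout: comment lines start with "#", and every other line is a metric name, optional labels in braces, and a value. As a rough sketch of how the names can be pulled out of such text, the following parses a shortened copy of the output; the function name metric_names is made up, and the parser deliberately ignores edge cases such as timestamps or escaped characters inside label values.

```python
def metric_names(exposition_text):
    """Return distinct metric names in order of first appearance."""
    names = []
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # The name ends at the first "{" (labels) or at the first space (value).
        name = line.split("{", 1)[0].split()[0]
        if name not in names:
            names.append(name)
    return names


# Shortened copy of the /metrics output shown above.
text = """
# HELP go_gc_duration_seconds A summary of the GC invocation durations.
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds{quantile="0"} 3.536e-05
go_gc_duration_seconds{quantile="1"} 0.246707852
go_gc_duration_seconds_sum 0.388611299
go_gc_duration_seconds_count 74
"""

print(metric_names(text))
```

For real use, a maintained parser from a Prometheus client library is a safer choice than splitting strings by hand.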

      Setting up the Prometheus and Grafana stack

      So far we have examined metrics in an already configured Prometheus. Now we will start Prometheus and configure it ourselves:

      essh@kubernetes-master:~$ docker run -d --net=host --name prometheus prom/prometheus

      09416fc74bf8b54a35609a1954236e686f8f6dfc598f7e05fa12234f287070ab

      essh@kubernetes-master:~$ docker ps -f name=prometheus

      CONTAINER ID   IMAGE             NAMES

      09416fc74bf8   prom/prometheus   prometheus

      Open the UI with graphs for displaying the metrics:

      essh@kubernetes-master:~$ firefox localhost:9090

      Add the go_gc_duration_seconds{quantile="0"} metric from the list:

      essh@kubernetes-master:~$ curl localhost:9090/metrics 2>/dev/null | head -n 4

      # HELP go_gc_duration_seconds A summary of the GC invocation durations.

      # TYPE go_gc_duration_seconds summary

      go_gc_duration_seconds{quantile="0"} 1.0097e-05

      go_gc_duration_seconds{quantile="0.25"} 1.7841e-05

      In the UI at localhost:9090, select Graph in the menu. To add a metric to the chart, choose it from the "insert metric at cursor" list. The list contains the same metrics as localhost:9090/metrics, but grouped by name without labels, for example just go_gc_duration_seconds. Select the go_gc_duration_seconds metric and click the Execute button. The Console tab then shows the metric values:

      go_gc_duration_seconds{instance="localhost:9090", job="prometheus", quantile="0"} 0.000009186

      go_gc_duration_seconds{instance="localhost:9090", job="prometheus", quantile="0.25"} 0.000012056

      go_gc_duration_seconds{instance="localhost:9090", job="prometheus", quantile="0.5"} 0.000023256

      go_gc_duration_seconds{instance="localhost:9090", job="prometheus", quantile="0.75"} 0.000068848

      go_gc_duration_seconds{instance="localhost:9090", job="prometheus", quantile="1"} 0.00021869

      The Graph tab shows their graphical representation.
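The values listed on the Console tab can also be requested without the UI. The sketch below assumes the instant-query endpoint of the Prometheus HTTP API (/api/v1/query) on the local instance; it only builds the request URL and reads a response that has already been decoded from JSON. The helper names and sample_response are illustrative: the response is written by hand in the shape of an instant-query result, with two values copied from the Console output and a made-up timestamp.

```python
from urllib.parse import urlencode


def build_query_url(base_url, promql):
    """URL of an instant query against the Prometheus HTTP API."""
    return base_url.rstrip("/") + "/api/v1/query?" + urlencode({"query": promql})


def extract_values(response):
    """Map the quantile label of each returned series to its float value."""
    return {
        item["metric"].get("quantile"): float(item["value"][1])
        for item in response["data"]["result"]
    }


url = build_query_url("http://localhost:9090", "go_gc_duration_seconds")
print(url)

# Hand-written example response (assumption, not captured output).
sample_response = {
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"__name__": "go_gc_duration_seconds",
                        "instance": "localhost:9090", "job": "prometheus",
                        "quantile": "0"},
             "value": [1600000000, "0.000009186"]},
            {"metric": {"__name__": "go_gc_duration_seconds",
                        "instance": "localhost:9090", "job": "prometheus",
                        "quantile": "0.25"},
             "value": [1600000000, "0.000012056"]},
        ],
    },
}

print(extract_values(sample_response))
```

Sample values arrive as strings in the response, which is why extract_values converts them with float.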

      Prometheus now collects metrics from the current node: go_*, net_*, process_*, prometheus_*, promhttp_*, scrape_* and up. To collect metrics from Docker, we tell the Docker daemon to expose its metrics in Prometheus format on port 9323:

      essh@kubernetes-master:~$ curl http://localhost:9323/metrics 2>/dev/null | head -n 20

      # HELP builder_builds_failed_total Number of failed image builds

      # TYPE builder_builds_failed_total counter

      builder_builds_failed_total{reason="build_canceled"} 0

      builder_builds_failed_total{reason="build_target_not_reachable_error"} 0

      builder_builds_failed_total{reason="command_not_supported_error"} 0

      builder_builds_failed_total{reason="dockerfile_empty_error"} 0

      builder_builds_failed_total{reason="dockerfile_syntax_error"} 0

      builder_builds_failed_total{reason="error_processing_commands_error"} 0

      builder_builds_failed_total{reason="missing_onbuild_arguments_error"} 0

      builder_builds_failed_total{reason="unknown_instruction_error"} 0

      # HELP builder_builds_triggered_total Number of triggered image builds

      # TYPE builder_builds_triggered_total counter

      builder_builds_triggered_total 0

      # HELP engine_daemon_container_actions_seconds The number of seconds it takes to process each container action

      # TYPE engine_daemon_container_actions_seconds histogram

      engine_daemon_container_actions_seconds_bucket{action="changes", le="0.005"} 1

      engine_daemon_container_actions_seconds_bucket{action="changes", le="0.01"} 1

      engine_daemon_container_actions_seconds_bucket{action="changes", le="0.025"} 1

      engine_daemon_container_actions_seconds_bucket{action="changes", le="0.05"} 1

      engine_daemon_container_actions_seconds_bucket{action="changes", le="0.1"} 1

      For the Docker daemon to apply the new parameters, it must be restarted. The restart stops all containers, and when the daemon starts again they are brought back up according to their restart policy:

      essh@kubernetes-master:~$ sudo chmod a+w /etc/docker/daemon.json

      essh@kubernetes-master:~$ echo '{"metrics-addr": "127.0.0.1:9323", "experimental": true}' | jq -M -f /dev/null > /etc/docker/daemon.json

      essh@kubernetes-master:~$ cat /etc/docker/daemon.json

      {

      "metrics-addr": "127.0.0.1:9323",

      "experimental": true

      }

      essh@kubernetes-master:~$
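Redirecting echo into daemon.json replaces the whole file, so any settings already present are lost. A possible alternative, sketched here under assumptions, is to merge the two keys into the existing JSON. The function name enable_docker_metrics is hypothetical, and the demonstration writes to a temporary file rather than /etc/docker/daemon.json; the pre-existing "log-driver" entry is only there to show that other keys survive.

```python
import json
import os
import tempfile


def enable_docker_metrics(path, addr="127.0.0.1:9323"):
    """Merge the metrics settings into the JSON file at path, keeping other keys."""
    config = {}
    if os.path.exists(path) and os.path.getsize(path) > 0:
        with open(path) as f:
            config = json.load(f)
    config["metrics-addr"] = addr
    config["experimental"] = True
    with open(path, "w") as f:
        json.dump(config, f, indent=2)
        f.write("\n")
    return config


# Demonstration on a temporary file instead of /etc/docker/daemon.json.
demo_path = os.path.join(tempfile.mkdtemp(), "daemon.json")
with open(demo_path, "w") as f:
    json.dump({"log-driver": "json-file"}, f)  # example of an existing setting

result = enable_docker_metrics(demo_path)
print(json.dumps(result, indent=2))
```

As with the shell version, the daemon reads the file only at startup, so the change takes effect after the restart described above.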