Prometheus monitoring error: context deadline exceeded
Tags: Troubleshooting, Monitoring, Prometheus
Symptoms
Prometheus reports the following error for a Greenplum target:
    Get "http://127.0.0.1:9297/metrics": context deadline exceeded
At the same time, Grafana stops displaying any monitoring data.
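In Prometheus, `context deadline exceeded` on a target generally means the scrape did not finish within the configured `scrape_timeout` (10s by default). One way to confirm this is to time the exporter endpoint by hand. The following is a minimal Python sketch, not part of the original troubleshooting steps; the function name `time_scrape` and its 10-second default are illustrative assumptions. Note that the `urllib` timeout applies to each blocking socket operation, so it only approximates Prometheus's overall scrape deadline.

```python
import time
import urllib.request


def time_scrape(url, timeout_s=10.0):
    """Fetch url once and return (ok, elapsed_seconds, detail).

    timeout_s is an assumed stand-in for Prometheus's scrape_timeout.
    It limits each socket operation, not the total request time.
    """
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            body = resp.read()
            detail = f"HTTP {resp.status}, {len(body)} bytes"
        return True, time.monotonic() - start, detail
    except OSError as exc:
        # URLError, HTTPError, and socket timeouts are all OSError subclasses.
        return False, time.monotonic() - start, repr(exc)
```

Calling `time_scrape("http://127.0.0.1:9297/metrics")` on the exporter host shows whether the endpoint answers within the limit; a failure after roughly `timeout_s` seconds points to slow metric collection in the exporter rather than a network problem.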
Checking the greenplum_exporter service shows errors in its output:
    [root@mdw1 ~]# systemctl status greenplum_exporter.service
    ● greenplum_exporter.service - greenplum exporter
       Loaded: loaded (/etc/systemd/system/greenplum_exporter.service; enabled; vendor preset: disabled)
       Active: active (running) since Mon 2023-01-30 14:05:27 CST; 1 day 6h ago
     Main PID: 97507 (greenplum_expor)
        Tasks: 38
       CGroup: /system.slice/greenplum_exporter.service
               └─97507 /usr/local/greenplum_exporter/bin/greenplum_exporter --log.level=error

    Jan 30 14:05:27 mdw1 systemd[1]: Started greenplum exporter.
    Jan 30 14:44:41 mdw1 greenplum_exporter[97507]: time="2023-01-30T14:44:41+08:00" level=error msg="get metrics for scraper:segment_scraper failed, error:pq: canceling statement ...ctor.go:100"
    Jan 30 22:46:20 mdw1 greenplum_exporter[97507]: time="2023-01-30T22:46:20+08:00" level=error msg="get metrics for scraper:segment_scraper failed, error:pq: canceling statement ...ctor.go:100"
    Jan 30 22:49:36 mdw1 greenplum_exporter[97507]: time="2023-01-30T22:49:36+08:00" level=error msg="get metrics for scraper:segment_scraper failed, error:pq: canceling statement ...ctor.go:100"
    Jan 31 15:33:41 mdw1 greenplum_exporter[97507]: time="2023-01-31T15:33:41+08:00" level=error msg="get metrics for scraper:segment_scraper failed, error:pq: canceling statement ...ctor.go:100"
    Jan 31 16:57:43 mdw1 greenplum_exporter[97507]: time="2023-01-31T16:57:43+08:00" level=error msg="get metrics for scraper:database_size_scraper failed, error:context deadline e...ctor.go:100"
    Jan 31 19:58:06 mdw1 greenplum_exporter[97507]: time="2023-01-31T19:58:06+08:00" level=error msg="get metrics for scraper:database_size_scraper failed, error:pq: canceling stat...ctor.go:100"
    Hint: Some lines were ellipsized, use -l to show in full.
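To see which scrapers fail most often, the log lines can be tallied by scraper name. The sketch below is an illustration only: the helper `count_scraper_failures` is a hypothetical name, and the regular expression assumes the `get metrics for scraper:<name> failed` message format seen in the output above.

```python
import re
from collections import Counter

# Matches the message format seen in the greenplum_exporter log above.
_FAILED = re.compile(r"get metrics for scraper:(\w+) failed")


def count_scraper_failures(lines):
    """Return a Counter mapping scraper name to number of failure lines."""
    counts = Counter()
    for line in lines:
        match = _FAILED.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

The function accepts any iterable of lines, for example `sys.stdin` when fed from `journalctl -u greenplum_exporter.service --no-pager`. For the six error lines shown above it would report four failures for `segment_scraper` and two for `database_size_scraper`.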