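windows_exporter makes it very easy to give Prometheus the ability to monitor Windows Server machines.
Normally the default configuration is enough to monitor CPU, memory, network, and services. In some cases, however, such as when the SafeDog (安全狗) security software is installed on the server, certain settings may prevent the exporter from reading the state of some services. A custom configuration is then needed, for example one that monitors only specific services.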
windows_exporter configuration notes
Source
https://github.com/prometheus-community/windows_exporter
Description
A Prometheus exporter for Windows machines.
Compatibility
windows_exporter supports Windows Server 2008R2 and later, and desktop Windows 7 and later.
Deployment
Download the exporter:
https://github.com/prometheus-community/windows_exporter/releases/download/v0.16.0/windows_exporter-0.16.0-amd64.exe
You can run the .exe file directly, or start it with custom options. Running it directly uses the default configuration.
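Once the exporter is running, Prometheus still needs a scrape job that points at it. The fragment below is a minimal sketch of such a job for prometheus.yml. The port 9182 and the /metrics path are the exporter defaults listed in the flags; the job name and the target address 192.0.2.10 are placeholders chosen for illustration and are not part of the original setup.

```yaml
# prometheus.yml (fragment): a sketch, not a complete configuration
scrape_configs:
  - job_name: windows              # hypothetical job name
    metrics_path: /metrics         # exporter default (--telemetry.path)
    static_configs:
      - targets:
          - "192.0.2.10:9182"      # placeholder host; 9182 is the default --telemetry.addr port
```

If the exporter is started with a different --telemetry.addr or --telemetry.path, the target and metrics_path have to be changed to match.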
Custom configuration
Flags:
  -h, --help
      Show context-sensitive help (also try --help-long and --help-man).
  --collectors.dfsr.sources-enabled="connection,folder,volume"
      Comma-seperated list of DFSR Perflib sources to use.
  --collectors.exchange.list
      List the collectors along with their perflib object name/ids
  --collectors.exchange.enabled=""
      Comma-separated list of collectors to use. Defaults to all, if not specified.
  --collector.iis.site-whitelist=".+"
      Regexp of sites to whitelist. Site name must both match whitelist and not match blacklist to be included.
  --collector.iis.site-blacklist=COLLECTOR.IIS.SITE-BLACKLIST
      Regexp of sites to blacklist. Site name must both match whitelist and not match blacklist to be included.
  --collector.iis.app-whitelist=".+"
      Regexp of apps to whitelist. App name must both match whitelist and not match blacklist to be included.
  --collector.iis.app-blacklist=COLLECTOR.IIS.APP-BLACKLIST
      Regexp of apps to blacklist. App name must both match whitelist and not match blacklist to be included.
  --collector.logical_disk.volume-whitelist=".+"
      Regexp of volumes to whitelist. Volume name must both match whitelist and not match blacklist to be included.
  --collector.logical_disk.volume-blacklist=""
      Regexp of volumes to blacklist. Volume name must both match whitelist and not match blacklist to be included.
  --collector.msmq.msmq-where=COLLECTOR.MSMQ.MSMQ-WHERE
      WQL 'where' clause to use in WMI metrics query. Limits the response to the msmqs you specify and reduces the size of the response.
  --collectors.mssql.classes-enabled="accessmethods,availreplica,bufman,databases,dbreplica,genstats,locks,memmgr,sqlstats,sqlerrors,transactions"
      Comma-separated list of mssql WMI classes to use.
  --collectors.mssql.class-print
      If true, print available mssql WMI classes and exit. Only displays if the mssql collector is enabled.
  --collector.net.nic-whitelist=".+"
      Regexp of NIC:s to whitelist. NIC name must both match whitelist and not match blacklist to be included.
  --collector.net.nic-blacklist=""
      Regexp of NIC:s to blacklist. NIC name must both match whitelist and not match blacklist to be included.
  --collector.process.whitelist=".*"
      Regexp of processes to include. Process name must both match whitelist and not match blacklist to be included.
  --collector.process.blacklist=""
      Regexp of processes to exclude. Process name must both match whitelist and not match blacklist to be included.
  --collector.service.services-where=""
      WQL 'where' clause to use in WMI metrics query. Limits the response to the services you specify and reduces the size of the response.
  --collector.smtp.server-whitelist=".+"
      Regexp of virtual servers to whitelist. Server name must both match whitelist and not match blacklist to be included.
  --collector.smtp.server-blacklist=COLLECTOR.SMTP.SERVER-BLACKLIST
      Regexp of virtual servers to blacklist. Server name must both match whitelist and not match blacklist to be included.
  --collector.textfile.directory="C:\\Program Files\\windows_exporter\\textfile_inputs"
      Directory to read text files with metrics from.
  --config.file=CONFIG.FILE
      YAML configuration file to use. Values set in this file will be overriden by CLI flags.
  --web.config.file=""
      [EXPERIMENTAL] Path to configuration file that can enable TLS or authentication.
  --telemetry.addr=":9182"
      host:port for exporter.
  --telemetry.path="/metrics"
      URL path for surfacing collected metrics.
  --telemetry.max-requests=5
      Maximum number of concurrent requests. 0 to disable.
  --collectors.enabled="cpu,cs,logical_disk,net,os,service,system,textfile"
      Comma-separated list of collectors to use. Use '[defaults]' as a placeholder for all the collectors enabled by default.
  --collectors.print
      If true, print available collectors and exit.
  --scrape.timeout-margin=0.5
      Seconds to subtract from the timeout allowed by the client. Tune to allow for overhead or high loads.
  --log.level="info"
      Only log messages with the given severity or above. Valid levels: [debug, info, warn, error, fatal]
  --log.format="logger:stderr"
      Set the log target and format. Example: "logger:syslog?appname=bob&local=7" or "logger:stdout?json=true"
  --version
      Show application version.
Using a configuration file
A YAML configuration file can be specified with the --config.file flag. For example:
.\windows_exporter.exe --config.file=config.yml
The format of config.yml is as follows; adjust the contents according to the configuration documentation:
    collectors:
      enabled: cpu,cs,net,service
    collector:
      service:
        services-where: "Name='windows_exporter'"
    log:
      level: warn
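Because services-where is a WQL 'where' clause, several services can be selected by joining conditions with OR. The variant below is a sketch of that idea. W3SVC and MSSQLSERVER are used only as illustrative service names (the usual IIS and default SQL Server instance services) and should be replaced with the services that actually exist on the host.

```yaml
# config.yml: a sketch that limits the service collector to a few named services
collectors:
  enabled: cpu,cs,net,service
collector:
  service:
    # WQL where clause; the service names here are illustrative assumptions
    services-where: "Name='windows_exporter' OR Name='W3SVC' OR Name='MSSQLSERVER'"
log:
  level: warn
```

Restricting the query this way also keeps the exporter from touching services whose state cannot be read, which is the situation described at the beginning.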
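Rules configuration reference
The rules below include alerts for CPU usage above 90%, memory usage above 90%, disk usage above 90%, failures of windows_exporter's own collectors, and service status. As mentioned at the beginning, all services are monitored when no service filter is configured; in many cases it is enough to monitor only specific services. The rule group goes under the groups: key of a Prometheus rules file.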
    - name: WindowsServer
      rules:
        - alert: WindowsServerCpuUsage
          expr: 100 - (avg by (instance) (rate(windows_cpu_time_total{mode="idle"}[2m])) * 100) > 90
          for: 0m
          labels:
            severity: warning
          annotations:
            summary: Windows Server CPU Usage (instance {{ $labels.instance }})
            description: "CPU Usage is more than 90%\n VALUE = {{ $value }}\n LABELS = {{ $labels }}"
        - alert: WindowsServerMemoryUsage
          expr: 100 - ((windows_os_physical_memory_free_bytes / windows_cs_physical_memory_bytes) * 100) > 90
          for: 2m
          labels:
            severity: warning
          annotations:
            summary: Windows Server memory Usage (instance {{ $labels.instance }})
            description: "Memory usage is more than 90%\n VALUE = {{ $value }}\n LABELS = {{ $labels }}"
        - alert: WindowsServerDiskSpaceUsage
          expr: 100.0 - 100 * ((windows_logical_disk_free_bytes / 1024 / 1024 ) / (windows_logical_disk_size_bytes / 1024 / 1024)) > 90
          for: 2m
          labels:
            severity: critical
          annotations:
            summary: Windows Server disk Space Usage (instance {{ $labels.instance }})
            description: "Disk usage is more than 90%\n VALUE = {{ $value }}\n LABELS = {{ $labels }}"
        - alert: WindowsServerCollectorError
          expr: windows_exporter_collector_success == 0
          for: 5m
          labels:
            severity: critical
          annotations:
            summary: Windows Server collector Error (instance {{ $labels.instance }})
            description: "Collector {{ $labels.collector }} was not successful\n VALUE = {{ $value }}\n LABELS = {{ $labels }}"
        - alert: WindowsServerServiceStatus
          expr: windows_service_status{status="ok"} != 1
          for: 1m
          labels:
            severity: critical
          annotations:
            summary: Windows Server service Status (instance {{ $labels.instance }})
            description: "Windows Service state is not OK\n VALUE = {{ $value }}\n LABELS = {{ $labels }}"
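When only one specific service matters, a narrower rule may be easier to act on than the general status rule. The following is a sketch under two assumptions that should be checked against the exporter's /metrics output for the version in use: that the service collector exposes windows_service_state with name and state labels, and that the name label holds the service name in lower case. The group name, alert name, and the w3svc service are hypothetical.

```yaml
# Sketch of a rules file that alerts on a single named service
groups:
  - name: WindowsServices                  # hypothetical group name
    rules:
      - alert: WindowsServiceNotRunning    # hypothetical alert name
        # Assumes windows_service_state{name, state} is 1 for the current state;
        # "w3svc" is an illustrative service name
        expr: windows_service_state{name="w3svc", state="running"} == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: Windows service not running (instance {{ $labels.instance }})
          description: "Service {{ $labels.name }} is not in the running state\n VALUE = {{ $value }}"
```

Pairing a rule like this with a matching services-where filter on the exporter side keeps the number of exported service series small.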
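With Prometheus it is very simple to set up monitoring for web server clusters and database clusters. Such monitoring not only shows the state of the cluster in real time, but also provides information that can be used to tune the servers, database parameters in particular. 月萌API will share more related articles in the future.
Reference: https://blog.csdn.net/qq_43021786/article/details/118809772
Copyright: 月萌API www.moonapi.com. Please credit the source when reposting.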