Metrics vs Logging, Continued
Christopher Covington
cov at fb.com
Wed Jan 3 13:22:39 AEDT 2018
Hi Michael, Patrick,
I probably should have hopped on this list months ago. Thanks for your patience as I come up to
speed on your code, configure my mail client to suit this list, and so on.
> Prometheus metrics is fundamentally a pull model, not a push model. If you have a pull model,
> it greatly simplifies the dependencies:
> - Pull metrics internally or externally (daemons listen on 127.0.0.1, optionally reverse proxy
> that through your web service).
An option for on-demand metrics (as opposed to periodic, always-on monitoring) is nice. I would
use it to scrutinize upgrades in progress more closely, for example.
> - Optionally run the metrics server or not depending on configuration.
I agree it should fail gracefully when there is no server present, and think this generalizes to
other network services, even NTP and DHCP.
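A minimal sketch of that fail-gracefully behaviour, assuming a plain TCP receiver. The function
name `try_send`, the host and port arguments, and the 2-second default timeout are hypothetical
and not taken from any OpenBMC code:

```python
import socket


def try_send(payload: bytes, host: str, port: int, timeout: float = 2.0) -> bool:
    """Try to deliver payload; return False rather than raising if the server is absent."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.sendall(payload)
        return True
    except OSError:
        # OSError covers refused connections, timeouts, and name-resolution failures.
        return False
```

The caller can then drop or buffer the sample and carry on; a missing server never takes the
daemon down with it.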
> - Pull model naturally self-limits in performance-limited cases... you don’t have a thundering
> herd of daemons trying to push metrics. In case metrics server gets loaded it will naturally
> slow down polls to backend daemons.
At large scale you'll either need multiple pollers or load-balancing for the receiving server. I'm
not sure what the best solution is. Is load-balancing perhaps more commonplace?
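For the multiple-pollers option, one possible way to split the fleet is to hash each hostname
onto a poller. This is only a sketch: `assign_poller` and `shard` are hypothetical names, and
plain modulo hashing reassigns most hosts whenever the number of pollers changes.

```python
import hashlib


def assign_poller(host: str, num_pollers: int) -> int:
    """Map a hostname to a stable poller index in the range [0, num_pollers)."""
    digest = hashlib.sha256(host.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_pollers


def shard(hosts, num_pollers):
    """Group hosts into one list per poller, using the stable hash above."""
    shards = [[] for _ in range(num_pollers)]
    for host in hosts:
        shards[assign_poller(host, num_pollers)].append(host)
    return shards
```

Each poller would then scrape only its own list, and the assignment stays the same across
restarts because it depends only on the hostname and the poller count.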
> But what I think would be pretty nice is if you could point Grafana/Prometheus towards every
> BMC on your network to get nice graphs of temp, fan speeds, etc.
For metrics/counters, I've been centrally pulling/polling from a fleet running the following RESTful
API:
https://github.com/facebook/openbmc/tree/helium/common/recipes-rest/rest-api/files
But polling the whole fleet doesn't seem ideal, so I'm wondering about a push model.
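For reference, central polling of this kind can be sketched roughly as below. The hostnames, the
endpoint path passed in as `path`, and the helper names are assumptions for illustration; the
sketch only assumes that each BMC serves JSON over HTTP.

```python
import json
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def fetch_json(url, timeout=5.0):
    """GET a URL and decode the response body as JSON."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def poll_fleet(hosts, path, fetch=fetch_json, workers=16):
    """Poll every host concurrently; hosts that fail map to None instead of aborting the run."""

    def poll_one(host):
        try:
            return host, fetch(f"http://{host}{path}")
        except (OSError, ValueError):
            # OSError: unreachable host, timeout, HTTP error. ValueError: malformed JSON.
            return host, None

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(poll_one, hosts))
```

The cost is visible in the sketch: the central poller holds one connection per BMC per cycle, so
its work grows linearly with fleet size.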
Prometheus looks interesting, thanks for the pointer. It does seem to support a push model, via
its Pushgateway:
https://prometheus.io/docs/instrumenting/pushing/
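A rough sketch of what a push from a BMC could look like, using only the Python standard library.
The Pushgateway accepts metrics in the Prometheus text format at `/metrics/job/<job>/instance/<instance>`;
the gateway address, job and instance names, and the metric names used here are hypothetical.

```python
import urllib.parse
import urllib.request


def format_metrics(samples):
    """Render a {name: value} dict as Prometheus text-format lines, newline-terminated."""
    lines = [f"{name} {value}" for name, value in sorted(samples.items())]
    return ("\n".join(lines) + "\n").encode("utf-8")


def build_push_request(gateway, job, instance, samples):
    """Build a PUT request that replaces this job/instance group on the gateway."""
    job_part = urllib.parse.quote(job, safe="")
    instance_part = urllib.parse.quote(instance, safe="")
    url = f"{gateway}/metrics/job/{job_part}/instance/{instance_part}"
    return urllib.request.Request(
        url,
        data=format_metrics(samples),
        method="PUT",
        headers={"Content-Type": "text/plain; version=0.0.4"},
    )


def push(gateway, job, instance, samples, timeout=5.0):
    """Push samples; return False instead of raising if the gateway is unreachable."""
    try:
        request = build_push_request(gateway, job, instance, samples)
        with urllib.request.urlopen(request, timeout=timeout):
            return True
    except OSError:
        return False
```

A call such as `push("http://pushgateway.example:9091", "bmc", "bmc-01", {"bmc_fan_rpm": 8000})`
would send one sample. Note that the linked page recommends the Pushgateway mainly for
short-lived jobs, so it may not be the intended fit for long-running BMC daemons.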
Do Go language applications run reasonably well on ASPEED AST2400 SoCs?
I've heard that OpenWRT uses collectd: https://wiki.openwrt.org/doc/howto/statistic.collectd
Thanks,
Christopher Covington