If you want to be confident that your users can achieve their goals using your service, there’s more to do than monitoring the health of individual micro-services. You need assurance that your micro-services are working well together and, when they aren’t, you need the information necessary to fix any problems as quickly as you can. This blog post follows one Tes team’s mission to better identify and diagnose problems, enabling them to move fast and ship with confidence.
A friend of mine tells a great story of a team avoiding a lot of grief. All of their system health checks were green, but the live graph of purchases dropped to zero and stayed there. Despite the many positive system indicators, the team were able to see they had a problem and were able to react quickly to find and fix it. It turned out that the number of user purchases was a key indicator of success.