- The system is not monolithic.
- OpenStack "is not only" OpenStack.
- Do not rely on monitoring just the default metrics.
- Proper procedures avoid more failures.
1) The system is not monolithic.
How can we know the real impact on our service when a specific component fails?
Also, for each major component, identify the specific services whose failure can impact your own services.
Simply put: Know the relations between all components in the cloud.
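One way to make those relations explicit is to write them down as a dependency map and ask it which services sit downstream of a failed component. The sketch below is only an illustration: the component names and the edges between them are assumptions made for the example, not an inventory of any real deployment, and `impacted_by` is a hypothetical helper.

```python
from collections import deque

# Hypothetical dependency map: each key depends on the components in its list.
# Names and edges are illustrative assumptions, not a real deployment.
DEPENDS_ON = {
    "web-frontend": ["nova-compute", "neutron-l3-agent"],
    "database-vm": ["nova-compute", "cinder-volume"],
    "nova-compute": ["rabbitmq", "keystone"],
    "cinder-volume": ["rabbitmq", "keystone", "ceph"],
    "neutron-l3-agent": ["rabbitmq", "keystone"],
}


def impacted_by(failed, depends_on=DEPENDS_ON):
    """Return every component that directly or transitively depends on `failed`."""
    # Invert the edges so we can walk from a component to its dependents.
    dependents = {}
    for component, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(component)

    # Breadth-first walk over the inverted graph.
    seen = set()
    queue = deque([failed])
    while queue:
        current = queue.popleft()
        for parent in dependents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


if __name__ == "__main__":
    for component in ("ceph", "rabbitmq"):
        print(component, "->", sorted(impacted_by(component)))
```

With a map like this, a failure in a shared component such as the message queue immediately shows up as a risk to every final service that depends on it, which is the kind of relation the question above asks about.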
In conclusion, we need to monitor the end of the chain and run consistency tests across all final services: monitor storage, monitor networking, monitor the hypervisor layer, monitor every component individually, and be prepared to analyze logs and metrics when failures do happen. And again: be able to relate things!
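As a rough sketch of what an end-of-chain consistency test could look like, the example below runs a set of named checks and reports which ones failed. Everything here is an assumption made for illustration: `storage_roundtrip`, `run_checks`, and `failed_checks` are hypothetical helpers, and the write-then-read test against a local directory merely stands in for a check against a real volume or mount.

```python
import os
import tempfile


def storage_roundtrip(directory):
    """Write a small file, read it back, and confirm the content matches."""
    payload = b"consistency-check"
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(payload)
        with open(path, "rb") as handle:
            return handle.read() == payload
    finally:
        os.remove(path)


def run_checks(checks):
    """Run each named check; a raised exception counts as a failure."""
    results = {}
    for name, check in checks.items():
        try:
            results[name] = bool(check())
        except Exception:
            results[name] = False
    return results


def failed_checks(results):
    """Return the sorted names of the checks that did not pass."""
    return sorted(name for name, ok in results.items() if not ok)


if __name__ == "__main__":
    checks = {
        # Stand-in for a check against a mounted volume.
        "storage": lambda: storage_roundtrip(tempfile.gettempdir()),
    }
    print(failed_checks(run_checks(checks)))
```

Checks for networking or the hypervisor layer would plug into the same dictionary, so a single run gives one view of the final services rather than one dashboard per component.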
3) Do not rely on monitoring just the default metrics.
4) Proper procedures avoid more failures.