know about what’s going on in your systems can be terrifying to an IT professional. Sometimes it comes at 3 am and wakes you from a sound sleep, and other times it’s just a nagging realization in the back of your mind, reminding you to remain diligent, keep learning, and hope for the best. Twenty years as a Systems Administrator and Director of IT gave me a lot of perspective on what it’s like to be responsible for a lot of systems and users over which you have only marginal control.
I made do with limited resources while meeting rapidly evolving challenges, often because business was good and growing fast. I tried to come prepared and mostly succeeded, but the landscape is changing all the time. And then there are those moments when a Heartbleed or a WannaCry makes you think, “I dodged a bullet this time, but if we had been hit, how would I have been able to identify the activity quickly, to minimize the damage? How do I even claim, with a straight face, to have a handle on these unpredictable developments? I can’t spend all my time reading security bulletins and system logs.”
It’s challenging to maintain responsible controls, communicate issues, and convince leaders to support appropriate solutions and strategies. Accelerating change makes it hard to feel that I’m conscientiously keeping up. Increasing costs and tightening budgets compound to leave me feeling increasingly exposed.
I’m watching G Suite, Office 365, SharePoint, a small number of servers running LAMP, and a proprietary longitudinal reporting database that I designed myself. The database is an enormously rewarding and effective project, on which I can spend 10% of my time, if I’m lucky. I have previously requested an experienced systems administrator position to take some of the load off diagnosing, researching, and providing staff user services. I run audit logs on SharePoint to keep the users honest. I make people run updates. I check for new firmware. I review configs. I’m constantly overwhelmed, and I just get done what I can.
I have long known what it’s like to be understaffed and to have to search for resources with little or no peer support, but I can read quickly and I’m good at Googling problems. Sometimes the broader context of peer experience is more important than the day-to-day clicks and keystrokes, and I’m always wondering if I’ve missed some crucial insight. Having someone to provide context, an alternate experience, can be invaluable. User services can handle the day-to-day grind of tickets, helping people log in and print, but I need an expert to whom I can hand off the systems monitoring that requires real knowledge. I need someone who knows intuitively when escalation is appropriate. I need a resource I can trust.
Trying to convince non-IT professionals that a given tool or architecture is obsolete, or that an emerging technology is a must-have, is sometimes a bit like trying to save people from themselves. Over the years, I have won some and lost some, but as long as the business could execute and grow, I have been mostly satisfied. Lately, I have begun to dream about the day when I have the staff and the tools to execute with full knowledge and precision, with grace. Like the most well-orchestrated ballet, I want my staff to glide and weave together into a beautiful tapestry of vision and synchrony, or at least, I will settle for well-herded sheep and a loyal border collie.
Once in a while, a solution comes along that you can instantly recognize as relevant, effective, and feasible. You have a one-word response, and it is, “Yes.” Those days are rare, but so welcome.
So I allowed myself to dream boldly of ballets, or sheep. What if I could have a senior SysAdmin who could read every single log entry of every single host in my environment, across multiple cities, without domain restrictions? What if I could trust them to report to me on anything they came across that was unusual or interesting? What if they could research ahead of time, and at least help me triage some of the more common or serious issues my systems might face, to keep me in the driver’s seat, to give me that first nudge toward a solution, right at the moment I’m seeing the issue? That would go a long way toward making me feel human again.
The answer is, “Yes.” I found them, I hired them, and then I joined them as an employee.
Loom Systems delivers an AIOps-powered log analytics solution, Sophie, to predict and prevent problems in the digital business. Loom collects logs and metrics from the entire IT stack, continually monitors them, and gives a heads-up when something is likely to deviate from the norm. When it does, Loom sends out an alert and a recommended resolution so DevOps and IT managers can proactively attend to the issue before anything goes down.