We are the modern “SysAdmins”, and we are very lazy (in a manner of speaking).
Yes, we are lazy, but not in a negative sense of the word. We, “The Modern SysAdmin”, don’t like to do manual things more than once. That’s buried way back in the past. In the “DevOps/SysOps” world of today, the entire infrastructure has to be deployed with automated tools. Nobody else (unless it’s someone living in the past, or with a self-imposed desire to suffer) thinks about any sort of infrastructure (especially cloud based ones) without some degree of automation involved.
Automation can cover everything from the OS installation down to application configuration management.
Cloud environments like AWS and OpenStack use “cloud-formation” tools, which allow the “sysadmin” to create complete infrastructures (instances, storage, networks, etc.) with the use of simple YAML or JSON based templates with very clear and generally easy to use declarative languages. These templates also ease the deployment of applications inside the virtual instances once they’ve booted the first time.
In the following sections we’ll go through the many ways we have to automate our infrastructure. Also, and because we are firm believers that all things in the infra should be properly monitored, we’ll see how these tools help us to accomplish this very important task.
Our roots: Shell scripts and command line tools
The very first thing any sysadmin should learn is how to live in the console. The modern operating systems (Linux, FreeBSD, and even modern Windows) provide a console with some kind of scripting language. On old and almost forgotten systems like “MS DOS” we had the “.bat” files (batch files) enabling some degree of automation within the operating system. Today, we have very basic yet powerful tools like bash and PowerShell, which include very useful programmatic constructs like loops, conditionals, functions, etc. This allows a great degree of freedom and with that, a way to automate common tasks within the operating system.
With the use of command line tools, or even scripts written in more complex languages (like Python or Perl), the basic core functions in the shell can be extended to cover almost any administrative task.
In the monitoring arena, these basic scripts and their related tools are the base for extending the metrics included on common monitoring solutions (from basic net-snmp to more serious things like Zabbix and Prometheus). In order to see this concept in action, see the following screenshot from a Zabbix installation:
If you know anything about Zabbix, you’re probably asking yourself: “Hey, Zabbix doesn’t include any metrics for monitoring DNS queries. How did he make this happen?” The solution begins with a simple shell script running in a crontab, which uses the BIND RNDC command and some smart scripting to generate the desired metrics. The rest is just Zabbix normal procedures used to include custom metrics.
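The idea above can be sketched in a few lines. The original solution is a shell script around RNDC; the version below is only an illustration in Python, and it assumes the statistics dump has already been written to disk and follows the layout shown in SAMPLE_STATS (a hypothetical excerpt). The function names and the "dns.queries" key prefix are made up for this example and are not part of Zabbix or BIND.

```python
import re

# Hypothetical excerpt in the style of a BIND statistics dump.
SAMPLE_STATS = """\
+++ Statistics Dump +++ (1500000000)
++ Incoming Requests ++
                 175 QUERY
++ Incoming Queries ++
                 120 A
                  40 AAAA
                  15 MX
++ Outgoing Queries ++
--- Statistics Dump --- (1500000000)
"""

SECTION_RE = re.compile(r"^\+\+ (.+) \+\+$")   # e.g. "++ Incoming Queries ++"
COUNTER_RE = re.compile(r"^\s*(\d+)\s+(\S+)$")  # e.g. "        120 A"


def parse_incoming_queries(stats_text):
    """Return {record_type: count} from the 'Incoming Queries' section."""
    counts = {}
    in_section = False
    for line in stats_text.splitlines():
        header = SECTION_RE.match(line)
        if header:
            in_section = header.group(1) == "Incoming Queries"
            continue
        if in_section:
            counter = COUNTER_RE.match(line)
            if counter:
                counts[counter.group(2)] = int(counter.group(1))
    return counts


def to_metric_lines(counts, prefix="dns.queries"):
    """Format counters as 'key value' lines (the key naming is an assumption)."""
    return [f"{prefix}[{rtype}] {count}" for rtype, count in sorted(counts.items())]
```

A cron job could run something like this every minute and hand the resulting lines to whatever custom-metric mechanism the monitoring system offers.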
In conclusion, the shell scripts are the simplest way to automate things (and a very effective one). Also, with monitoring in our view, shell scripting is the most commonly used technique for customized metrics creation.
Unattended installation tools: Part 1 (a tale of snakes and seeds)
Shell scripting is a good start when the system is already installed and running, but what about the operating system itself? What if we need to automate the installation of, let’s say, 50 servers and all of them need to be exactly the same and have applications already configured? Would you risk doing this task manually? What about the “human error” factor?
This is where unattended installation tools enter the scene. In the Linux world, the most widespread distributions (Debian and Red Hat) and their many children (CentOS, Ubuntu, Mint, and Scientific Linux) rely on one of two installers: Anaconda (for Red Hat based distributions) and Debian-Installer (for Debian based distributions).
Both Debian-Installer and Anaconda include a way to automate and structure the operating system’s installation. For Debian-Installer, the tool is called preseed and for Anaconda, kickstart. The automated install goes from simple things like language, root password, and partition schemes, to more complex things like package installations, extra repository configuration, and even a post-install stage where you can define more tasks combining shell scripts, command line tools, and almost anything else needed to end up with a “properly installed” server.
Find a sample kickstart (anaconda) template here.
This template will automate the basic operating system installation and perform additional “post-install” actions as well.
If you think about monitoring, consider this: any monitoring package or configuration can be included in your post-install stage. Your servers can auto-enroll in whatever monitoring solution you have, so you’ll not only automate the bare-metal install but also automate the monitoring provisioning.
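To make the post-install idea concrete, here is a minimal sketch that renders a kickstart file from Python. It is not the sample template mentioned above and it is not a complete, production-ready kickstart: the directive values are illustrative, and the "zabbix-agent" package and service name in the usage note are assumptions that depend on your repositories.

```python
def render_kickstart(packages, post_commands, lang="en_US.UTF-8", timezone="UTC"):
    """Render a small kickstart file with a %packages and a %post section."""
    lines = [
        "# Generated kickstart sketch; values are illustrative only",
        f"lang {lang}",
        "keyboard us",
        f"timezone {timezone}",
        "rootpw --lock",              # lock root instead of embedding a password
        "zerombr",
        "clearpart --all --initlabel",
        "autopart",
        "reboot",
        "",
        "%packages",
        "@core",
        *packages,                    # extra packages, e.g. a monitoring agent
        "%end",
        "",
        "%post",
        *post_commands,               # shell commands run at the end of the install
        "%end",
    ]
    return "\n".join(lines) + "\n"
```

For example, render_kickstart(["zabbix-agent"], ["systemctl enable zabbix-agent"]) would produce a template whose post-install stage enables a monitoring agent, assuming that package is available to the installer.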
Unattended installation tools: Part 2 (PXE your bare metal)
The last idea, where you can automate the operating system installation, and post-install some extra stuff on your servers, can be extended even more with the use of PXE. But, what exactly is PXE? The acronym stands for Preboot Execution Environment.
With the basic unattended installation described previously, you still need to use a CD (or USB drive) with a basic netinstall version of your operating system, and write some basic instructions that will let the installer know from where to download the unattended installation template (preseed or Kickstart). This also leaves room for human error, as you can incorrectly enter the template name or URL and then lose time rebooting and starting over repeatedly until entering the correct info in the netinstall prompt for a successful installation.
With PXE, you can actually automate this part. Just turn on the server and you’ll be presented with a menu of available operating systems (or templates) and with just the cursor and a tap on “enter”, select the proper template and let the installation proceed on its own with no further intervention on your part.
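The menu described above is usually just a small text file served over TFTP. As a hedged illustration, the sketch below renders a PXELINUX-style menu where each entry points the installer at an unattended template. The host name deploy.example.com, the file paths, and the entry names in the usage note are hypothetical, and the exact boot arguments depend on the installer version you use.

```python
def render_pxe_menu(entries, title="Unattended install menu", timeout_tenths=300):
    """Render a PXELINUX-style menu; each entry is a dict describing one boot option."""
    lines = [
        "DEFAULT menu.c32",
        "PROMPT 0",
        f"TIMEOUT {timeout_tenths}",   # PXELINUX counts the timeout in tenths of a second
        f"MENU TITLE {title}",
        "",
    ]
    for entry in entries:
        lines += [
            f"LABEL {entry['label']}",
            f"  MENU LABEL {entry['description']}",
            f"  KERNEL {entry['kernel']}",
            f"  APPEND initrd={entry['initrd']} {entry['boot_args']}",
            "",
        ]
    return "\n".join(lines)


# Hypothetical entries: one kickstart-driven, one preseed-driven.
EXAMPLE_ENTRIES = [
    {
        "label": "centos-web",
        "description": "CentOS web server (kickstart)",
        "kernel": "images/centos/vmlinuz",
        "initrd": "images/centos/initrd.img",
        "boot_args": "inst.ks=http://deploy.example.com/ks/web.cfg",
    },
    {
        "label": "debian-web",
        "description": "Debian web server (preseed)",
        "kernel": "images/debian/linux",
        "initrd": "images/debian/initrd.gz",
        "boot_args": "auto=true priority=critical url=http://deploy.example.com/preseed/web.cfg",
    },
]
```

Generating the menu from data like this keeps the template URLs in one place, which removes the typing errors described earlier.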
Some tools in the market exploit this even more, like Fuel and Canonical MaaS (metal as a service), which are used to deploy OpenStack based clouds in an orderly and very automated fashion, from the operating system install to the specific OpenStack nodes deployment, all using a combination of PXE, unattended installation templates, and a lot of smart scripting as a base. Fuel in action:
MaaS (Metal as a Service) running:
In conclusion, PXE can further extend your bare-metal installation options and give you a better degree of automation and control over your datacenter.
Speaking of monitoring again, some of these options (like Fuel) can be extended with “plugins” to include extra monitoring in your cloud. And, if you’re constructing your very own PXE solution, you're completely free to add any kind of monitoring auto-provisioning for your deployed servers, to your operating system template.
Enter the application configuration automation tools
Some time ago, in the not so distant past, a group of information technology geeks and nerds identified that something was missing in the modern datacenter: a way to properly install, configure, and maintain configurations. Then CFEngine was born, and with it the “Configuration Management” (CM) concept. After CFEngine, other CM tools that were easier to use and had more features became available in the open source world: Puppet, Ansible, Chef, and many others.
CM as a paradigm includes the following basic premises:
- The configuration items are managed in a centralized way. Multiple hosts in the datacenter can be configured from a single server.
- Any unintended change by a “third party” made in a managed node will make the CM re-apply the intended configuration defined on the CM matrix.
- The CM can install packages and allows defining the update policies for them.
- The CMs are idempotent. This actually means that, if a running cycle of the CM detects that no changes are needed in the managed node, then no unnecessary changes are enforced.
- The CMs can run scripts and create files in the managed nodes. These files can have any desired content, also managed by the CM, and can be presented as templates too.
- The CMs can detect the “facts” of the managed node, and make installation/configuration decisions based on these facts (operating system versions, networking, “DNS” information, etc.)
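Two of the premises above, idempotence and re-applying the intended configuration after an unintended change, can be illustrated with a tiny sketch. This is not how any particular CM tool is implemented; ensure_file and demo are hypothetical names used only for this example.

```python
import os
import tempfile


def ensure_file(path, content):
    """Make `path` contain exactly `content`; return True only if a change was made."""
    try:
        with open(path, "r", encoding="utf-8") as handle:
            if handle.read() == content:
                return False          # already in the desired state: do nothing
    except FileNotFoundError:
        pass                          # missing file counts as "needs a change"
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(content)
    return True


def demo():
    """Run three 'CM cycles' against a temporary file and report which ones changed it."""
    desired = "managed by CM\n"
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "motd")
        first = ensure_file(path, desired)    # file is created
        second = ensure_file(path, desired)   # nothing to do
        with open(path, "w", encoding="utf-8") as handle:
            handle.write("edited by hand\n")  # unintended change by a third party
        third = ensure_file(path, desired)    # drift is corrected
        return [first, second, third]
```

Calling demo() returns [True, False, True]: the second cycle makes no change, and the third restores the intended content after the manual edit.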
If you combine the unattended tools described above with these CM tools, then we’re talking about full datacenter automation. Also, and because you really want to keep your systems properly monitored, you can actually deploy the monitoring agents in the servers using CM tools. Later, if you need to include new metrics in a specific set of servers, you can centralize this action on the CM server and in a controlled way, deploy all your monitoring changes in your CM tool managed servers.
A brief example can be seen here. This Ansible playbook will detect if the server is running a Red Hat or Debian based distribution, and then take the proper steps in order to install apache on the server.
Remember what we said about the CM being able to detect the managed node “facts”? These facts include almost anything from the operating system to the actual installed packages. With this information, you can simplify your CM templates, and make a single application installation template that covers many different operating system releases.
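The same fact-driven decision can be sketched outside of any CM tool. The snippet below is not the Ansible playbook referenced above; it is a small Python illustration that reads os-release style text, derives the distribution family, and picks the matching Apache package. The function names are hypothetical, and the family lists are a simplification.

```python
def parse_os_release(text):
    """Parse KEY=value lines (os-release style) into a dict of facts."""
    facts = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        facts[key] = value.strip().strip("\"'")
    return facts


def apache_install_command(facts):
    """Choose the package manager and package name from the detected OS family."""
    families = {facts.get("ID", "")} | set(facts.get("ID_LIKE", "").split())
    if families & {"rhel", "fedora", "centos"}:
        return ["yum", "install", "-y", "httpd"]
    if families & {"debian", "ubuntu"}:
        return ["apt-get", "install", "-y", "apache2"]
    raise ValueError(f"unsupported distribution: {facts.get('ID', 'unknown')}")
```

On a real host the input would typically come from /etc/os-release; a CM tool gathers this kind of fact for you and exposes it to your templates.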
The cloud era: Cloud-formation is the answer
All the aforementioned tools (shell scripts, unattended installs, and even CMs) are OK for a traditional bare-metal based datacenter. Even virtualization systems like VMware, Proxmox, and VirtualBox can benefit from the use of CMs to post-configure deployed servers. But what about the cloud?
This is where, thanks to the Amazon Web Services ultra-tech-ninjas, we can both have our cake and eat it too. Its name is Cloud Formation.
In cloud environments, virtual machines are just one of the many possible elements you can create. Actually, in clouds like OpenStack and AWS you can deploy:
- Virtual machines (aka: instances)
- Volumes (block storage)
- REST/HTTP/HTTPS accessible files (object storage)
- IP load balancers
- Floating IPs / Elastic IPs
- Virtual Networks
- Routers (Internet Gateways)
- Database Engines
- Auto-scaling groups (with alarms and scale-up/scale-down policies)
- Many other things...
Under normal conditions, you can deploy these elements manually, but that’s not always the right way to do it, especially in production environments. Remember, modern “sysadmins” hate to do manual and repetitive tasks more than once!
So, what is cloud formation? It’s a way to define your complete cloud infrastructure (with all the elements listed before) in a template, with an easy to use declarative language. Those templates (normally JSON or YAML files) after being fed to the cloud, will activate the creation and launching of all your cloud objects the way you defined them.
One interesting thing is the “metadata service” present on most modern clouds (and again, AWS and OpenStack are the leaders here). You can include specific instructions that will be executed inside your virtual instances at first boot. It is very similar to the post-install stage on the preseed/Kickstart unattended templates, but on steroids!
The Cloud formation templates can include the post-install instructions that will be passed to the cloud instances through these metadata services. These instructions can include installation steps to enroll the virtual instance in a CM solution, and then let a central CM server in your cloud fine-tune the machine provisioning.
This metadata service can be used in a very smart way with tags that specify extra information such as the environment (development, testing, or production) and the service layer (web servers, mail servers, etc.).
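As a rough illustration of these ideas on AWS, the sketch below builds a very small CloudFormation-style template as a Python dictionary: one instance, first-boot instructions passed as user data, and tags for environment and service layer. The logical name WebInstance, the tag keys, and the AMI ID in the usage note are placeholders chosen for this example.

```python
import json


def build_template(environment, layer, image_id, user_data, instance_type="t2.micro"):
    """Return a minimal CloudFormation-style template for one tagged instance."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": f"{layer} instance for the {environment} environment",
        "Resources": {
            "WebInstance": {
                "Type": "AWS::EC2::Instance",
                "Properties": {
                    "ImageId": image_id,
                    "InstanceType": instance_type,
                    # Script handed to the instance and executed at first boot.
                    "UserData": {"Fn::Base64": user_data},
                    "Tags": [
                        {"Key": "environment", "Value": environment},
                        {"Key": "layer", "Value": layer},
                    ],
                },
            }
        },
    }


def to_json(template):
    """Serialize the template so it can be handed to the cloud tooling."""
    return json.dumps(template, indent=2, sort_keys=True)
```

For instance, to_json(build_template("testing", "web", "ami-00000000", "#!/bin/bash\necho bootstrap\n")) yields a JSON document with a placeholder image ID; a real deployment would use an image that exists in your account and region.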
A basic “hello world” template (used in OpenStack) is presented here.
The template above (YAML, for OpenStack) will allow the selection of a web app install (apache or nginx). The application install steps are passed to the virtual machine from the metadata service by reading the template user_data section. This will basically bootstrap the instance with the desired application.
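For readers without access to that template, here is a hedged sketch of the same idea, written as a Python dictionary in the shape of a Heat (HOT) template. It is not the original template: the resource name web_instance, the default flavor, and the apt-get based bootstrap script are assumptions. Because JSON is a subset of YAML, the serialized output should be usable wherever a YAML template is expected.

```python
import json


def build_heat_template():
    """Return a HOT-style template that installs apache2 or nginx at first boot."""
    return {
        "heat_template_version": "2015-04-30",
        "description": "Single instance with a selectable web server",
        "parameters": {
            "webserver": {
                "type": "string",
                "default": "apache2",
                "constraints": [{"allowed_values": ["apache2", "nginx"]}],
            },
            "image": {"type": "string"},
            "flavor": {"type": "string", "default": "m1.small"},
        },
        "resources": {
            "web_instance": {
                "type": "OS::Nova::Server",
                "properties": {
                    "image": {"get_param": "image"},
                    "flavor": {"get_param": "flavor"},
                    "user_data_format": "RAW",
                    # The chosen package name is substituted into the boot script.
                    "user_data": {
                        "str_replace": {
                            "template": "#!/bin/bash\napt-get update\napt-get install -y $PACKAGE\n",
                            "params": {"$PACKAGE": {"get_param": "webserver"}},
                        }
                    },
                },
            }
        },
    }


def render(template):
    """Serialize the template as JSON (which is also valid YAML)."""
    return json.dumps(template, indent=2)
```

The allowed_values constraint is what turns the template into a small menu: the cloud rejects any value other than the two listed web servers.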
Speaking of monitoring (again), in both AWS and OpenStack you can use the cloud monitoring infrastructure to create alarms on the already-included metrics. Because these alarms are cloud objects too, you can automate their deployment using Cloud formation. If you include your own monitoring solution in your cloud infrastructure, you can also use the metadata services to help your monitoring self-servicing tooling to enroll the virtual instances in the monitoring services. It should be noted, if your monitoring services include any kind of categories or classes, you can use tags to correctly identify the right monitoring category where your deployed instances will reside.
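Since alarms are cloud objects, they can live in the same template as the instances they watch. The sketch below builds a CloudWatch alarm resource in CloudFormation form for the built-in CPU metric; the function name, the default threshold, and the idea of referencing an instance by its logical ID are illustrative choices, not a recommendation.

```python
def cpu_alarm_resource(instance_logical_id, threshold_percent=80, topic_arn=None):
    """Return a CloudFormation-style alarm resource on an instance's CPU usage."""
    properties = {
        "AlarmDescription": f"CPU above {threshold_percent}% on {instance_logical_id}",
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Statistic": "Average",
        "Period": 300,                 # seconds per evaluation window
        "EvaluationPeriods": 2,        # consecutive windows before the alarm fires
        "Threshold": threshold_percent,
        "ComparisonOperator": "GreaterThanThreshold",
        "Dimensions": [
            {"Name": "InstanceId", "Value": {"Ref": instance_logical_id}}
        ],
    }
    if topic_arn is not None:
        # Optional notification target, e.g. an SNS topic (hypothetical here).
        properties["AlarmActions"] = [topic_arn]
    return {"Type": "AWS::CloudWatch::Alarm", "Properties": properties}
```

The returned dictionary would be added under the "Resources" key of a template, next to the instance whose logical ID it references.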
Time for black magic: Canonical Juju: The APT on the cloud
Anyone who has used a Debian-based distro before, surely knows about APT. APT is the package manager in all of those distributions, including Debian as well as Canonical Ubuntu.
Canonical is known to be one of the companies that are heavily involved in OpenStack development, and in cloud tools development in general. Some years ago, they released a “CM with steroids” called “Juju”. Juju is really more than a simple CM because it works at cloud level, allowing the deployment and configuration of applications in a cloud environment in a very simple way, and also makes it possible to relate/link those applications in very smart ways. The way Juju works (from a user perspective) is very similar to APT, which is the reason it’s called “the APT on the cloud”.
Juju can deploy applications to bare-metal servers (actually creating a “bare-metal” based cloud), but it can also deploy applications to known clouds like AWS, public OpenStack based clouds, and private/hybrid OpenStack based clouds. It is not limited to deploying applications on virtual instances in the cloud; it can also deploy applications inside LXD-based containers. Its only limitation for the moment is that Juju works mostly on Ubuntu.
If you compare Juju with the solutions we have been describing in this article, you’ll note that it includes a lot of features from CMs, Cloud formation, unattended, and shell scripts - basically everything!
Both examples show a “WordPress” application deployed in Juju, with an IP load balancer and a MySQL database.
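To show why the APT comparison fits, here is a sketch of the kind of command sequence that could drive such a deployment. It only builds the command lists and does not run anything. The charm names and the relation pairs are assumptions for illustration, and the relation subcommand depends on the Juju release (older releases use add-relation, while Juju 3.x uses integrate), so it is left as a parameter.

```python
def juju_wordpress_commands(relate_verb="add-relation"):
    """Return the CLI calls for a WordPress + MySQL + load balancer deployment."""
    return [
        ["juju", "deploy", "mysql"],
        ["juju", "deploy", "wordpress"],
        ["juju", "deploy", "haproxy"],
        # Link the applications so they can exchange connection details.
        ["juju", relate_verb, "wordpress", "mysql"],
        ["juju", relate_verb, "haproxy", "wordpress"],
        # Make only the load balancer reachable from outside.
        ["juju", "expose", "haproxy"],
    ]


def as_shell_lines(commands):
    """Join each command into a printable line for review before running it."""
    return [" ".join(command) for command in commands]
```

Much like "apt-get install", each deploy line names what you want and leaves the how to the tool.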
Juju includes a ready-to-use library of common applications called “Juju charms”. These “charms” include some monitoring agents and complete monitoring solutions, as well. In conclusion, using Juju to include a monitoring solution for your infrastructure is within your reach.
Pedal to the Metal: Configuration management tools now can do cloud too
CM tools like Ansible, Chef, and Puppet can do clouds too. They can be used almost the same way cloud formation is used to deploy cloud elements. This is just an extension of the objects those CM tools can manage. In the example here, you can see a typical Ansible playbook used to deploy an autoscale group with virtual instances on AWS and issue (through metadata services) post-install instructions to the instance.
The last playbook includes a form of monitoring too, due to the fact that “Auto Scale” groups need policies based on alarms and monitoring of specific metrics.
Orchestrating the orchestrator: Docker infrastructure automation on AWS and OpenStack
Everyone using Docker containers on big platforms knows that container orchestration is a “must” to make the container infrastructure reliable and scalable. The most widely used Docker orchestration solutions are Docker Swarm, Google Kubernetes, and Apache Mesos.
AWS uses its own cluster solution (AWS ECS) for dockerized clusters, but OpenStack includes direct support in its container infrastructure offering (called Magnum) for Swarm, Kubernetes, and Mesos.
Both solutions are fully deployable using cloud formation tools. It is kind of interesting how the cloud, which is an “orchestration engine” by nature, can create other orchestrator solutions inside.
Below, you’ll see a Swarm cluster being created inside OpenStack’s Magnum:
Did you notice that the actual component that creates the cluster is the “Orchestration” (cloud-formation) component in OpenStack? Indeed, Magnum relies on Heat (an OpenStack cloud formation component) to create all docker-cluster components. Another shot showing cluster support in OpenStack’s Magnum:
And what about monitoring? With the docker cluster fully working, you can deploy container-based monitoring solutions, and/or apply container monitoring techniques, in order to maintain a view of your infrastructure. Remember that monitoring containers is a complex task that will include monitoring the cluster nodes and querying the cluster in order to determine the container states. With this information you can provision your monitoring platform. Here is where automation and smart scripting will help you a lot (going back to the first part of this article: Shell scripts).
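As one small example of that kind of smart scripting, the sketch below summarizes container states from line-delimited JSON, the sort of output a command such as docker ps -a --format '{{json .}}' produces on recent Docker versions. The presence of a "State" field is an assumption about that output, and the sample records are invented.

```python
import json
from collections import Counter

# Invented sample: one JSON object per line, one line per container.
SAMPLE_OUTPUT = """\
{"Names": "web-1", "State": "running"}
{"Names": "web-2", "State": "exited"}
{"Names": "db-1", "State": "running"}
"""


def summarize_container_states(json_lines):
    """Count containers per state; records without a State field count as 'unknown'."""
    states = Counter()
    for line in json_lines.splitlines():
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        states[record.get("State", "unknown")] += 1
    return dict(states)


def containers_not_running(json_lines):
    """Return the names of containers whose state is anything other than 'running'."""
    names = []
    for line in json_lines.splitlines():
        if line.strip():
            record = json.loads(line)
            if record.get("State") != "running":
                names.append(record.get("Names", "unknown"))
    return names
```

The per-state counts can feed a dashboard, while the list of non-running containers is a natural input for an alert.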
Conclusions: Many ways to skin a cat.
Infrastructure automation is here to stay. With the many techniques mentioned in this brief article, modern sysadmins can automate every single component of the server infrastructure in today’s datacenters. You can choose the method best suited to your needs:
- Use “unattended” with “PXEBoot” to install your bare metal.
- Use single shell scripts in order to automate single tasks or when extending monitoring metrics is needed.
- Use “CM’s” in order to keep configurations controlled and up to date.
- Use “cloud formation” if you have AWS or OpenStack in order to automate your deployment.
- Use “Juju” if your infrastructure is Canonical-based.
And finally, mix all these things in a smart way in order to fully automate your life as a sysadmin. Do that, and you’ll be able to sleep more and better.