There’s an old Chinese DevOps saying that “there are three important decisions in life – who you love, who you fight, and how you ship your logs”.
When it comes to shipping logs, two popular open source tools come to mind: Logstash and Fluentd. Both address the collection, processing and transport aspects of centralized logging – but the differences are quite significant.
We’ll assume you are either well into, or inevitably entering, the world of log analysis, so we’ll stick to the basics to avoid overkill.
Popularity and Liveliness of the Tool
Logstash enjoys (at the time of writing) 7K stars on GitHub, whereas Fluentd has only 4K. While Logstash is more popular by that measure, both enjoy a lively community with frequent releases and responsive support on GitHub.
Logstash: 7K stars on GitHub, IRC channel and a forum.
Fluentd: 4K stars on GitHub, Slack channel, newsletters and Google group.
Logstash is part of the popular ELK stack, so if you plan on using Elastic, you will probably prefer Logstash (although Fluentd also has excellent support for Elastic).
Logstash: part of the ELK stack.
Fluentd: part of the CNCF.
Fluentd is the only one of the two tools which has an Enterprise support option.
Fluentd: offers enterprise support.
Platform Compatibility and Extensibility
Both tools run on Windows and Linux, are written in Ruby, and have an extensive and mature plugin system. The Logstash plugin ecosystem is centralized, while Fluentd’s isn’t.
There are about 200 plugins under the logstash-plugins GitHub repo. You can find over 500 Fluentd plugins, but only 10 under the official repository (not necessarily the most popular among users).
One difference that might be important to you is that Logstash is based on JRuby (you’ll need to have Java installed) while Fluentd uses CRuby.
Logstash: Windows and Linux. About 200 plugins on GitHub.
Fluentd: Windows and Linux. About 500 plugins on GitHub.
Logstash is limited to an in-memory queue that holds 20 events (fixed size) and relies on an external queue for persistence across restarts. This is a known limitation of Logstash, and the lack of buffering can be overcome by using Redis or Kafka as a central buffer. The benefit of Logstash’s approach is simplicity; however, you are compelled to deploy Redis or Kafka alongside Logstash for improved reliability in production.
Fluentd, on the other hand, has a highly configurable buffering system that can be in-memory or on-disk, with a seemingly endless array of parameters. The downside is that configuring the parameters for its built-in reliability can take some getting used to.
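To make the trade-off concrete, here is a minimal Python sketch of a fixed-size in-memory queue like the one described above. It is not Logstash code: only the capacity of 20 comes from the text, while `offer_events` and the non-blocking "reject when full" behaviour are assumptions chosen so the example terminates. A real shipper would more likely block the producer until space frees up.

```python
import queue

QUEUE_CAPACITY = 20  # size quoted in the text; everything else is illustrative


def offer_events(events, capacity=QUEUE_CAPACITY):
    """Try to enqueue events without blocking; return (buffer, rejected)."""
    buffer = queue.Queue(maxsize=capacity)
    rejected = []
    for event in events:
        try:
            buffer.put_nowait(event)
        except queue.Full:
            # Assumption: we record the overflow instead of blocking,
            # purely so the sketch can run without a consumer thread.
            rejected.append(event)
    return buffer, rejected


buffer, rejected = offer_events([f"event-{i}" for i in range(25)])
print(buffer.qsize(), len(rejected))  # 20 events queued, 5 could not be
```

Because the queue lives in process memory, anything still in it is lost when the process stops, which is why an external buffer is needed for persistence across restarts.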
Logstash: limited to an in-memory queue that holds 20 events.
Fluentd: highly configurable buffering system.
This is a debatable and fast-changing topic. Both perform well in most use cases, though Fluentd has a slightly better reputation when it comes to performance.
Fluentd: slightly better reputation.
Monitoring & Tuning
This splits into the simpler “knowing that it’s generally working” and the more complex “where is it spending time” and “does it sometimes fail for some of the events”.
Both products produce logs which could be improved. Both products offer a heartbeat output which can be used to make sure “it’s generally working”.
Logstash offers the metrics filter, which you can use to track and report the rate of all or specific processing chains. These rates can then be sent to tools such as Graphite and visualized using Grafana. It also offers a RESTful monitoring API, which can be used to understand resource consumption at a fairly detailed level.
Fluentd has a built-in monitoring agent which can be queried for the status of specific tags within the flow. There are also several monitoring plugins which should enable integrating with whatever monitoring stack you use.
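As a rough illustration of querying the monitoring agent, the Python sketch below flags output plugins that are retrying or backing up. The URL is the agent's commonly used default and is an assumption about your setup, as are the field names (`output_plugin`, `retry_count`, `buffer_queue_length`), which should be checked against the report your Fluentd version returns. `outputs_needing_attention` and the sample report are hypothetical, made up for a dry run without a live Fluentd.

```python
import json
from urllib.request import urlopen

# Assumed address of Fluentd's monitoring agent; adjust for your setup.
MONITOR_URL = "http://localhost:24220/api/plugins.json"


def fetch_plugin_status(url=MONITOR_URL):
    """Fetch the agent's JSON report (requires a running Fluentd)."""
    with urlopen(url, timeout=5) as response:
        return json.load(response)


def outputs_needing_attention(report, max_queue_length=0):
    """Return ids of output plugins that are retrying or backing up."""
    flagged = []
    for plugin in report.get("plugins", []):
        if not plugin.get("output_plugin"):
            continue  # inputs and filters have no output buffer to inspect
        retrying = (plugin.get("retry_count") or 0) > 0
        backed_up = (plugin.get("buffer_queue_length") or 0) > max_queue_length
        if retrying or backed_up:
            flagged.append(plugin.get("plugin_id"))
    return flagged


# Hypothetical report, shaped like the agent's output, for a dry run.
sample_report = {
    "plugins": [
        {"plugin_id": "in_tail_app", "type": "tail", "output_plugin": False},
        {"plugin_id": "out_es", "type": "elasticsearch", "output_plugin": True,
         "retry_count": 3, "buffer_queue_length": 12},
        {"plugin_id": "out_stdout", "type": "stdout", "output_plugin": True,
         "retry_count": 0, "buffer_queue_length": 0},
    ]
}
print(outputs_needing_attention(sample_report))  # ['out_es']
```

In practice you would pass the result of `fetch_plugin_status()` instead of the sample and feed the flagged ids into whatever alerting you already run.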
If you want to be a pro, make sure to have the tool also forward its own operational logs. Don’t overlook the middleware logs: they’re part of your production environment. Have these logs centralized and constantly review them (automatically, of course!).
Logstash: metrics filter, whose output can be sent to third-party tools such as Graphite and visualized there.
Fluentd: built-in monitoring agent, plug-ins to integrate with your monitoring stack.
Both log collectors support routing, but their approaches are different. Logstash routes all data into a single stream and then uses conditional if-then statements to send events to the correct destination. Fluentd uses tags to route events. Each Fluentd event has a tag that tells Fluentd where it needs to be routed.
Logstash’s approach is procedural in comparison to Fluentd’s more declarative approach. Review example configuration files from each of the tools (Logstash examples, Fluentd examples) and see which one fits you better.
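The two routing models can be contrasted with a small Python sketch. This is neither tool's configuration syntax: the event fields, tags and destination names are invented for illustration, and `fnmatch` is only a stand-in for Fluentd's tag patterns, whose wildcard rules differ in detail.

```python
from fnmatch import fnmatch


def route_by_conditions(event):
    """Logstash-style: one stream, if/else checks on fields pick the output."""
    if event.get("type") == "nginx" and event.get("status", 0) >= 500:
        return "alerts"
    elif event.get("type") == "nginx":
        return "web_index"
    return "default_index"


# Fluentd-style: each event carries a tag; the first matching pattern wins.
TAG_ROUTES = [
    ("nginx.error", "alerts"),
    ("nginx.*", "web_index"),
    ("*", "default_index"),
]


def route_by_tag(tag):
    """Return the destination of the first pattern that matches the tag."""
    for pattern, destination in TAG_ROUTES:
        if fnmatch(tag, pattern):
            return destination
    return None


print(route_by_conditions({"type": "nginx", "status": 502}))  # alerts
print(route_by_tag("nginx.access"))                           # web_index
```

In the first style the routing logic grows as nested conditionals; in the second it grows as an ordered table of patterns, so the order of entries matters.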
* Configuration files start small, then soon come to resemble Mr. Creosote. Split your configuration into smaller files to keep the peace with your future self.
Logstash: routes everything into a single stream, then uses conditionals to send events to the correct destination.
Fluentd: routing based on event tags.
Additional things to know
License – both are licensed under the Apache 2.0 license.
Configuration reload – newer versions of both tools support reloading the configuration without having to restart.
Which of the alternatives would you choose for your stack, Logstash or Fluentd? If you can spare a few minutes, let the world know in the comments below.