Elastic Stack: Elasticsearch, Logstash and Kibana
AUGUST 11, 2017
2. COMPONENTS
Elasticsearch is a real-time, distributed, open source full-text search and analytics engine.
It is a RESTful, distributed search engine built on top of Apache Lucene and released under
an Apache license. It is Java-based and can search and index document files in diverse
formats.
Logstash is a data collection engine that unifies data from disparate sources, normalizes it
and distributes it. The product was originally optimized for log data, but its scope has since
expanded to take in data from many other sources.
Beats are “data shippers” that are installed on servers as agents used to send different types
of operational data to Elasticsearch either directly or through Logstash, where the data might
be enhanced or archived.
Kibana is an open source data visualization and exploration tool from Elastic that is specialized for
large volumes of streaming and real-time data. The software makes huge and complex data
streams easier and quicker to understand through graphic representation.
Packetbeat is a network packet analyzer that ships information about the transactions exchanged
between your application servers.
Filebeat ships log files from your servers.
Metricbeat is a server monitoring agent that periodically collects metrics from the operating systems and
services running on your servers.
Winlogbeat ships Windows event logs.
As noted above, Beats can send data directly to Elasticsearch or send it via Logstash, which can parse and transform the data before indexing. The data is then available to Kibana for visualization.
5. ELASTICSEARCH
Elasticsearch is a highly available and distributed search engine.
• Built on top of Apache Lucene
• NoSQL Datastore
• Schema-free
• JSON Document
• RESTful APIs
5.1. FEATURES
• Distributed
• Scalable
• Highly available
• Near Real Time (NRT) search
• Full Text Search
• Client libraries for Java, .NET, PHP, Python, Perl, Ruby (and plain curl over HTTP)
• Hadoop & Spark integration via Elasticsearch-Hadoop (ES-Hadoop)
Elasticsearch is distributed, which means that indices can be divided into shards and each shard can have
zero or more replicas. By default, an index is created with 5 shards and 1 replica per shard (5/1).
Rebalancing and routing of shards are done automatically.
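As a minimal sketch (the index name myindex is hypothetical), an index can be created with explicit shard and replica settings through the REST API:
curl -XPUT 'localhost:9200/myindex?pretty' -H 'Content-Type: application/json' -d'
{
  "settings": {
    "number_of_shards": 5,
    "number_of_replicas": 1
  }
}'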
5.2. ELASTICSEARCH TERMINOLOGY
We will discuss a few important Elasticsearch terms: Index, Type, Document, Key, Value, etc.
Analogy of Elasticsearch with an RDBMS: an Index corresponds to a Database, a Type to a Table, a Document to a Row and a Field to a Column.
Document: In Elasticsearch, a Document is an instance of a Type. It contains data as key and value pairs.
A Document is similar to a row in a table in the relational database world: the key is the column name and the value is
the column value.
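For example, a minimal sketch of a document, with purely illustrative keys and values:
{
  "name": "John Doe",
  "age": 32,
  "city": "Pune"
}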
5.5. INSTALLATION
At the time of writing, Elastic Stack 5.5.1 is the latest version.
On Ubuntu
curl -L -O https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-5.5.1.deb
sudo dpkg -i elasticsearch-5.5.1.deb
On Centos
curl -L -O https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-5.5.1.rpm
sudo rpm -i elasticsearch-5.5.1.rpm
5.6. CONFIGURATION
Open the configuration file /etc/elasticsearch/elasticsearch.yml
Change cluster.name to a descriptive name for your cluster (optional)
Change node.name to a descriptive name for your node (optional)
Change network.host to bind Elasticsearch to a specific IP address (for example, localhost)
Then restart Elasticsearch using the following command.
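Assuming a systemd-based system (SysV init installs use the service command instead):
sudo systemctl restart elasticsearch
The cluster can then be checked, for example with:
curl -XGET 'localhost:9200/_cluster/health?pretty'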
We can see that our cluster named "my-application" is up with a green status.
Node status
curl -XGET 'localhost:9200/_cat/nodes?v&pretty'
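Create a document (CREATE): a minimal sketch, where the JSON body is purely illustrative:
curl -XPUT 'localhost:9200/customer/external/1?pretty' -H 'Content-Type: application/json' -d'
{
  "name": "John Doe"
}'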
A response of "created": true means a new document has been added to the index customer, with type external and document id 1.
All of the above information is called the metadata of the document.
Get the document (READ)
curl -XGET 'localhost:9200/customer/external/1?pretty'
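Update the document (UPDATE): re-indexing the same document id replaces the document and increments its version; a minimal sketch with an illustrative body:
curl -XPUT 'localhost:9200/customer/external/1?pretty' -H 'Content-Type: application/json' -d'
{
  "name": "Jane Doe"
}'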
The response shows that the document has been updated and a new version has been created.
6. LOGSTASH
Logstash is a pipeline that collects data from a source, optionally transforms it, and sends it to a target.
Source: Logstash can collect data from a variety of sources using input plugins, e.g. syslog, files, websites, databases (RDBMS and NoSQL), sensors and IoT devices.
Target: Logstash can send the data to a variety of targets using output plugins, e.g.
For analysis: datastores such as Elasticsearch and MongoDB
For archiving: HDFS, S3, Google Cloud Storage
For monitoring: Nagios, Graphite, Ganglia
For alerting: Email, Slack, HipChat, Watcher (Elastic Stack)
INPUT PLUGIN
The input plugin is used to read events from a source. Some of the most used input plugins are stdin, file, beats and syslog. The basic configuration for stdin is as follows:
stdin {
}
FILTER PLUGIN
Filter plugins are used to process and transform events, for example parsing unstructured log text into named fields with the grok filter:
filter {
grok { match => { "message" => "Duration: %{NUMBER:duration}" } }
}
OUTPUT PLUGIN
The output plugin is used to send data to a destination. It is the final section required in the
Logstash configuration file. Some of the most used output plugins are as follows.
elasticsearch: sends event data to Elasticsearch.
file: writes event data to a file on disk.
stdout
This is a fairly simple plugin, which outputs the data to the standard output of the shell. It is useful for
debugging the configurations used for the plugins. This is mostly used to validate whether Logstash is
parsing the input and applying filters (if any) properly to provide output as required.
stdout {
}
CODEC PLUGIN
Codec plugins are used to encode and decode data. Input data can arrive in various formats, so codecs are used to read and write data in those formats. Some of the codec plugins are as follows.
rubydebug
The rubydebug codec is a fairly simple plugin that outputs the data to the standard output of the shell,
which prints the data using the Ruby Awesome Print library.
The basic configuration for rubydebug is as follows:
rubydebug {
}
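In a pipeline, a codec is set inside an input or output plugin; for example, to pretty-print events on the console:
stdout { codec => rubydebug }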
6.3. INSTALLATION
On Ubuntu
curl -L -O https://artifacts.elastic.co/downloads/logstash/logstash-5.5.1.deb
sudo dpkg -i logstash-5.5.1.deb
On Centos
curl -L -O https://artifacts.elastic.co/downloads/logstash/logstash-5.5.1.rpm
sudo rpm -i logstash-5.5.1.rpm
6.4. CONFIGURATION
The basic configuration of Logstash is shown below. The input and output plugins are mandatory, while the filter
plugin is optional. The input plugins consume data from a source, the filter plugins modify the data as you specify,
and the output plugins write the data to a destination.
Let's see an example configuration of a Logstash pipeline.
Here the input is a file, read with the file plugin; the path option points at the log file and start_position is set
so the log is read from the beginning. However, you'll notice that the format of the raw log messages is not ideal. You want to
parse the log messages to create specific, named fields from the logs. To do this, you'll use the grok filter
plugin. The grok filter plugin is one of several plugins that are available by default in Logstash.
The output of the pipeline is Elasticsearch. Logstash uses the HTTP protocol to connect to Elasticsearch. The
example assumes that Logstash and Elasticsearch are running on the same instance; you can specify a
remote Elasticsearch instance by using the hosts configuration option.
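A minimal sketch of such a pipeline configuration, assuming a hypothetical log path and a local Elasticsearch instance (the grok pattern depends on your log format):
input {
  file {
    path => "/var/log/myapp/access.log"
    start_position => "beginning"
  }
}
filter {
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
  }
}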
cd /usr/share/logstash
bin/logstash -e 'input { stdin { } } output { stdout{ } }'
The -e flag enables you to specify a configuration directly on the command line. The pipeline in the
example takes input from the standard input, stdin, and moves that input to the standard
output, stdout, in a structured format.
The same approach can take the data from standard input and send it both to standard output and to the
Elasticsearch database, here with the index 'testing' (the default type is logs and the document id is autogenerated).
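A minimal sketch of such a command, assuming Elasticsearch is reachable on localhost:9200:
bin/logstash -e 'input { stdin { } } output { stdout { } elasticsearch { hosts => ["localhost:9200"] index => "testing" } }'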
Event data can also be written to a file on disk using the file output plugin:
output {
  file {
    path => "/tmp/output.log"
  }
}
A pipeline saved in a configuration file is run with the -f flag:
bin/logstash -f first_logstash.conf
7. KIBANA
7.2. INSTALLATION
On Ubuntu
curl -L -O https://artifacts.elastic.co/downloads/kibana/kibana-5.5.1-amd64.deb
On Centos
curl -L -O https://artifacts.elastic.co/downloads/kibana/kibana-5.5.1-x86_64.rpm
7.3. CONFIGURATION
Open the Kibana configuration file /etc/kibana/kibana.yml
Change server.host to the IP address Kibana should listen on
Change elasticsearch.url to the URL of your Elasticsearch instance
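A minimal sketch of the relevant settings, assuming Kibana should listen on all interfaces and Elasticsearch runs locally:
server.host: "0.0.0.0"
elasticsearch.url: "http://localhost:9200"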
Once Kibana is started, you have to tell it about the Elasticsearch indices that you want to explore by
configuring one or more index patterns.
To create an index pattern to connect to Elasticsearch, open Management > Index Patterns, enter a pattern that matches your indices (for example logstash-*), select the time field if your data is time based, and click Create.
7.4. DISCOVER
Time Filter: This filters the data for a specific time range
Search Box: This is used to search and query the data
Toolbar: This contains options such as new search, save search, open saved search, and share
Index Name: This displays the name of the selected index
Fields List: This displays the name of all the fields that are present within the selected index
Number of Hits: This displays the total number of documents that match the search query within the specified time interval
Filter: You can filter the search results to display only those documents that contain a particular value in a
field. You can also create negative filters that exclude documents that contain the specified field value
Viewing the data stats: From the Fields list, you can see how many of the documents in the Documents
table contain a particular field, what the top 5 values are, and what percentage of documents contain
each value.
7.6. DASHBOARD
A Kibana dashboard displays a collection of saved visualizations.
In edit mode you can arrange and resize the visualizations as needed and save dashboards so they can be
reloaded and shared.
Building the Dashboard:
You can edit, delete, resize and move the visualizations within the dashboard.
You can also share the dashboard with colleagues or the outside world using a link.
8. BEATS
The Beats are open source data shippers that you install as agents on your servers to send different types
of operational data to Elasticsearch. Beats can send data directly to Elasticsearch or send it to
Elasticsearch via Logstash, which you can use to parse and transform the data.
8.1. METRICBEAT
Metricbeat helps you monitor your servers and the services they host by collecting metrics from the
operating system and services.
INSTALLATION
On Ubuntu:
curl -L -O https://artifacts.elastic.co/downloads/beats/metricbeat/metricbeat-5.5.1-amd64.deb
On Centos
curl -L -O https://artifacts.elastic.co/downloads/beats/metricbeat/metricbeat-5.5.1-x86_64.rpm
CONFIGURATION
To configure Metricbeat, you edit the configuration file /etc/metricbeat/metricbeat.yml
Metricbeat uses modules to collect metrics. Uncomment or comment out the modules and metricsets you require for your monitoring.
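As a minimal sketch, the system module (enabled by default) collects core operating system metrics:
metricbeat.modules:
- module: system
  metricsets: ["cpu", "memory", "network", "process"]
  period: 10s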
output.elasticsearch:
hosts: ["localhost:9200"]
Once the configuration is done, restart Metricbeat using the following command.
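Assuming a systemd-based system:
sudo systemctl restart metricbeat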
To load the sample Kibana dashboards that ship with Metricbeat:
cd /usr/share/metricbeat
./scripts/import_dashboards
8.2. FILEBEAT
INSTALLATION
On Ubuntu
curl -L -O https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-5.5.1-amd64.deb
On Centos
curl -L -O https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-5.5.1-x86_64.rpm
CONFIGURATION
To configure Filebeat, you edit the configuration file /etc/filebeat/filebeat.yml
Define the paths of your log files
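A minimal sketch, with a purely illustrative path:
filebeat.prospectors:
- input_type: log
  paths:
    - /var/log/*.log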
output.elasticsearch:
hosts: ["localhost:9200"]
If you want to use Logstash to perform additional processing on the data collected by Filebeat, you need
to configure Filebeat to use Logstash instead and comment out the output.elasticsearch section, since only one output may be enabled.
output.logstash:
hosts: ["127.0.0.1:5044"]
Once the configuration is done, restart Filebeat using the following command.
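Assuming a systemd-based system:
sudo systemctl restart filebeat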
9. REFERENCES
Elastic Stack: https://www.elastic.co/learn
Elasticsearch: https://www.elastic.co/guide/en/elasticsearch/reference/current/index.html
Logstash: https://www.elastic.co/guide/en/logstash/current/index.html
Kibana: https://www.elastic.co/guide/en/kibana/current/index.html
Beats: https://www.elastic.co/guide/en/beats/libbeat/current/index.html
Contact me @vikshinde