docker-elk/README.md

440 lines
19 KiB
Markdown

# Elastic stack (ELK) on Docker
[![Elastic Stack version](https://img.shields.io/badge/Elastic%20Stack-8.0.0-00bfb3?style=flat&logo=elastic-stack)](https://www.elastic.co/blog/category/releases)
[![Build Status](https://github.com/deviantony/docker-elk/workflows/CI/badge.svg?branch=main)](https://github.com/deviantony/docker-elk/actions?query=workflow%3ACI+branch%3Amain)
[![Join the chat at https://gitter.im/deviantony/docker-elk](https://badges.gitter.im/Join%20Chat.svg)](https://gitter.im/deviantony/docker-elk?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge)
Run the latest version of the [Elastic stack][elk-stack] with Docker and Docker Compose.
It gives you the ability to analyze any data set by using the searching/aggregation capabilities of Elasticsearch and
the visualization power of Kibana.
![Animated demo](https://user-images.githubusercontent.com/3299086/140641708-cea70d17-cc04-459f-89d9-3fcb5c58bc35.gif)
*:information_source: The Docker images backing this stack include [X-Pack][xpack] with [paid features][paid-features]
enabled by default (see [How to disable paid features](#how-to-disable-paid-features) to disable them). **The [trial
license][trial-license] is valid for 30 days**. After this license expires, you can continue using the free features
seamlessly, without losing any data.*
Based on the official Docker images from Elastic:
* [Elasticsearch](https://github.com/elastic/elasticsearch/tree/master/distribution/docker)
* [Logstash](https://github.com/elastic/logstash/tree/master/docker)
* [Kibana](https://github.com/elastic/kibana/tree/master/src/dev/build/tasks/os_packages/docker_generator)
Other available stack variants:
* [`tls`](https://github.com/deviantony/docker-elk/tree/tls): TLS encryption enabled in Elasticsearch
* [`searchguard`](https://github.com/deviantony/docker-elk/tree/searchguard): Search Guard support
---
## Philosophy
We aim at providing the simplest possible entry into the Elastic stack for anybody who feels like experimenting with
this powerful combo of technologies. This project's default configuration is purposely minimal and unopinionated. It
does not rely on any external dependency or custom automation to get things up and running.
Instead, we believe in good documentation so that you can use this repository as a template, tweak it, and make it _your
own_. [sherifabdlnaby/elastdocker][elastdocker] is one example among others of project that builds upon this idea.
---
## Contents
1. [Requirements](#requirements)
* [Host setup](#host-setup)
* [Docker Desktop](#docker-desktop)
* [Windows](#windows)
* [macOS](#macos)
1. [Usage](#usage)
* [Initial setup](#initial-setup)
* [Setting up user authentication](#setting-up-user-authentication)
* [Injecting data](#injecting-data)
* [Cleanup](#cleanup)
* [Version selection](#version-selection)
1. [Configuration](#configuration)
* [How to configure Elasticsearch](#how-to-configure-elasticsearch)
* [How to configure Kibana](#how-to-configure-kibana)
* [How to configure Logstash](#how-to-configure-logstash)
* [How to disable paid features](#how-to-disable-paid-features)
* [How to scale out the Elasticsearch cluster](#how-to-scale-out-the-elasticsearch-cluster)
* [How to reset a password programmatically](#how-to-reset-a-password-programmatically)
1. [Extensibility](#extensibility)
* [How to add plugins](#how-to-add-plugins)
* [How to enable the provided extensions](#how-to-enable-the-provided-extensions)
1. [JVM tuning](#jvm-tuning)
* [How to specify the amount of memory used by a service](#how-to-specify-the-amount-of-memory-used-by-a-service)
* [How to enable a remote JMX connection to a service](#how-to-enable-a-remote-jmx-connection-to-a-service)
1. [Going further](#going-further)
* [Plugins and integrations](#plugins-and-integrations)
* [Swarm mode](#swarm-mode)
## Requirements
### Host setup
* [Docker Engine](https://docs.docker.com/install/) version **17.05** or newer
* [Docker Compose](https://docs.docker.com/compose/install/) version **1.20.0** or newer
* 1.5 GB of RAM
*:information_source: Especially on Linux, make sure your user has the [required permissions][linux-postinstall] to
interact with the Docker daemon.*
By default, the stack exposes the following ports:
* 5044: Logstash Beats input
* 5000: Logstash TCP input
* 9600: Logstash monitoring API
* 9200: Elasticsearch HTTP
* 9300: Elasticsearch TCP transport
* 5601: Kibana
**:warning: Elasticsearch's [bootstrap checks][booststap-checks] were purposely disabled to facilitate the setup of the
Elastic stack in development environments. For production setups, we recommend users to set up their host according to
the instructions from the Elasticsearch documentation: [Important System Configuration][es-sys-config].**
### Docker Desktop
#### Windows
If you are using the legacy Hyper-V mode of _Docker Desktop for Windows_, ensure [File Sharing][win-filesharing] is
enabled for the `C:` drive.
#### macOS
The default configuration of _Docker Desktop for Mac_ allows mounting files from `/Users/`, `/Volume/`, `/private/`,
`/tmp` and `/var/folders` exclusively. Make sure the repository is cloned in one of those locations or follow the
instructions from the [documentation][mac-filesharing] to add more locations.
## Usage
**:warning: You must rebuild the stack images with `docker-compose build` whenever you switch branch or update the
[version](#version-selection) of an already existing stack.**
### Initial setup
Clone this repository onto the Docker host that will run the stack, then start the Elasticsearch service locally using
Docker Compose:
```console
$ docker-compose up -d elasticsearch
```
We will start the rest of the Elastic components _after_ completing the initial setup described in this section. These
steps only need to be performed _once_.
**:warning: Starting with Elastic v8.0.0, it is no longer possible to run Kibana using the bootstraped privileged
`elastic` user. If you are starting the stack for the very first time, you MUST initialize a password for the [built-in
`kibana_system` user][builtin-users] to be able to start and access Kibana. Please read the section below attentively.**
#### Setting up user authentication
*:information_source: Refer to [Security settings in Elasticsearch][es-security] to disable authentication.*
The stack is pre-configured with the following **privileged** bootstrap user:
* user: *elastic*
* password: *changeme*
For increased security, we will reset this bootstrap password, and generate a set of passwords to be used by
unprivileged [built-in users][builtin-users] within components of the Elastic stack.
1. Initialize passwords for built-in users
The commands below generate random passwords for the `elastic` and `kibana_system` users. Take note of them.
```console
$ docker-compose exec -T elasticsearch bin/elasticsearch-reset-password --batch --user elastic
```
```console
$ docker-compose exec -T elasticsearch bin/elasticsearch-reset-password --batch --user kibana_system
```
If the need for it arises (e.g. if you want to [collect monitoring information][ls-monitoring] through Beats and
other components), feel free to repeat this operation at any time for the rest of the [built-in
users][builtin-users].
1. Replace usernames and passwords in configuration files
Replace the password of the `kibana_system` user inside the Kibana configuration file (`kibana/config/kibana.yml`)
with the password generated in the previous step.
Replace the password of the `elastic` user inside the Logstash pipeline file (`logstash/pipeline/logstash.conf`)
with the password generated in the previous step.
*:information_source: Do not use the `logstash_system` user inside the Logstash **pipeline** file, it does not have
sufficient permissions to create indices. Follow the instructions at [Configuring Security in Logstash][ls-security]
to create a user with suitable roles.*
See also the [Configuration](#configuration) section below.
1. Unset the bootstrap password (_optional_)
Remove the `ELASTIC_PASSWORD` environment variable from the `elasticsearch` service inside the Compose file
(`docker-compose.yml`). It is only used to initialize the keystore during the initial startup of Elasticsearch, and
is ignored on subsequent runs.
1. Start Kibana and Logstash
```console
$ docker-compose up -d
```
The `-d` flag runs all services in the background (detached mode).
On subsequent runs of the Elastic stack, it is sufficient to execute the above command in order to start all
components.
*:information_source: Learn more about the security of the Elastic stack at [Secure the Elastic
Stack][sec-cluster].*
#### Injecting data
Give Kibana about a minute to initialize, then access the Kibana web UI by opening <http://localhost:5601> in a web
browser and use the following credentials to log in:
* user: *elastic*
* password: *\<your generated elastic password>*
Now that the stack is running, you can go ahead and inject some log entries. The shipped Logstash configuration allows
you to send content via TCP:
```console
# Using BSD netcat (Debian, Ubuntu, MacOS system, ...)
$ cat /path/to/logfile.log | nc -q0 localhost 5000
```
```console
# Using GNU netcat (CentOS, Fedora, MacOS Homebrew, ...)
$ cat /path/to/logfile.log | nc -c localhost 5000
```
You can also load the sample data provided by your Kibana installation.
### Cleanup
Elasticsearch data is persisted inside a volume by default.
In order to entirely shutdown the stack and remove all persisted data, use the following Docker Compose command:
```console
$ docker-compose down -v
```
### Version selection
This repository stays aligned with the latest version of the Elastic stack. The `main` branch tracks the current major
version (8.x).
To use a different version of the core Elastic components, simply change the version number inside the `.env` file. If
you are upgrading an existing stack, please carefully read the note in the next section.
**:warning: Always pay attention to the [official upgrade instructions][upgrade] for each individual component before
performing a stack upgrade.**
Older major versions are also supported on separate branches:
* [`release-7.x`](https://github.com/deviantony/docker-elk/tree/release-7.x): 7.x series
* [`release-6.x`](https://github.com/deviantony/docker-elk/tree/release-6.x): 6.x series (End-of-life)
* [`release-5.x`](https://github.com/deviantony/docker-elk/tree/release-5.x): 5.x series (End-of-life)
## Configuration
*:information_source: Configuration is not dynamically reloaded, you will need to restart individual components after
any configuration change.*
### How to configure Elasticsearch
The Elasticsearch configuration is stored in [`elasticsearch/config/elasticsearch.yml`][config-es].
You can also specify the options you want to override by setting environment variables inside the Compose file:
```yml
elasticsearch:
environment:
network.host: _non_loopback_
cluster.name: my-cluster
```
Please refer to the following documentation page for more details about how to configure Elasticsearch inside Docker
containers: [Install Elasticsearch with Docker][es-docker].
### How to configure Kibana
The Kibana default configuration is stored in [`kibana/config/kibana.yml`][config-kbn].
It is also possible to map the entire `config` directory instead of a single file.
Please refer to the following documentation page for more details about how to configure Kibana inside Docker
containers: [Install Kibana with Docker][kbn-docker].
### How to configure Logstash
The Logstash configuration is stored in [`logstash/config/logstash.yml`][config-ls].
It is also possible to map the entire `config` directory instead of a single file, however you must be aware that
Logstash will be expecting a [`log4j2.properties`][log4j-props] file for its own logging.
Please refer to the following documentation page for more details about how to configure Logstash inside Docker
containers: [Configuring Logstash for Docker][ls-docker].
### How to disable paid features
Switch the value of Elasticsearch's `xpack.license.self_generated.type` setting from `trial` to `basic` (see [License
settings][trial-license]).
You can also cancel an ongoing trial before its expiry date — and thus revert to a basic license — either from the
[License Management][license-mngmt] panel of Kibana, or using Elasticsearch's [Licensing APIs][license-apis].
### How to scale out the Elasticsearch cluster
Follow the instructions from the Wiki: [Scaling out Elasticsearch](https://github.com/deviantony/docker-elk/wiki/Elasticsearch-cluster)
### How to reset a password programmatically
If for any reason your are unable to use Kibana to change the password of your users (including [built-in
users][builtin-users]), you can use the Elasticsearch API instead and achieve the same result.
In the example below, we reset the password of the `elastic` user (notice "/user/elastic" in the URL):
```console
$ curl -XPOST -D- 'http://localhost:9200/_security/user/elastic/_password' \
-H 'Content-Type: application/json' \
-u elastic:<your current elastic password> \
-d '{"password" : "<your new password>"}'
```
## Extensibility
### How to add plugins
To add plugins to any ELK component you have to:
1. Add a `RUN` statement to the corresponding `Dockerfile` (eg. `RUN logstash-plugin install logstash-filter-json`)
1. Add the associated plugin code configuration to the service configuration (eg. Logstash input/output)
1. Rebuild the images using the `docker-compose build` command
### How to enable the provided extensions
A few extensions are available inside the [`extensions`](extensions) directory. These extensions provide features which
are not part of the standard Elastic stack, but can be used to enrich it with extra integrations.
The documentation for these extensions is provided inside each individual subdirectory, on a per-extension basis. Some
of them require manual changes to the default ELK configuration.
## JVM tuning
### How to specify the amount of memory used by a service
By default, both Elasticsearch and Logstash start with [1/4 of the total host
memory](https://docs.oracle.com/javase/8/docs/technotes/guides/vm/gctuning/parallel.html#default_heap_size) allocated to
the JVM Heap Size.
The startup scripts for Elasticsearch and Logstash can append extra JVM options from the value of an environment
variable, allowing the user to adjust the amount of memory that can be used by each component:
| Service | Environment variable |
|---------------|----------------------|
| Elasticsearch | ES_JAVA_OPTS |
| Logstash | LS_JAVA_OPTS |
To accomodate environments where memory is scarce (Docker for Mac has only 2 GB available by default), the Heap Size
allocation is capped by default to 256MB per service in the `docker-compose.yml` file. If you want to override the
default JVM configuration, edit the matching environment variable(s) in the `docker-compose.yml` file.
For example, to increase the maximum JVM Heap Size for Logstash:
```yml
logstash:
environment:
LS_JAVA_OPTS: -Xmx1g -Xms1g
```
### How to enable a remote JMX connection to a service
As for the Java Heap memory (see above), you can specify JVM options to enable JMX and map the JMX port on the Docker
host.
Update the `{ES,LS}_JAVA_OPTS` environment variable with the following content (I've mapped the JMX service on the port
18080, you can change that). Do not forget to update the `-Djava.rmi.server.hostname` option with the IP address of your
Docker host (replace **DOCKER_HOST_IP**):
```yml
logstash:
environment:
LS_JAVA_OPTS: -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.port=18080 -Dcom.sun.management.jmxremote.rmi.port=18080 -Djava.rmi.server.hostname=DOCKER_HOST_IP -Dcom.sun.management.jmxremote.local.only=false
```
## Going further
### Plugins and integrations
See the following Wiki pages:
* [External applications](https://github.com/deviantony/docker-elk/wiki/External-applications)
* [Popular integrations](https://github.com/deviantony/docker-elk/wiki/Popular-integrations)
### Swarm mode
Experimental support for Docker [Swarm mode][swarm-mode] is provided in the form of a `docker-stack.yml` file, which can
be deployed in an existing Swarm cluster using the following command:
```console
$ docker stack deploy -c docker-stack.yml elk
```
If all components get deployed without any error, the following command will show 3 running services:
```console
$ docker stack services elk
```
*:information_source: To scale Elasticsearch in Swarm mode, configure seed hosts with the DNS name `tasks.elasticsearch`
instead of `elasticsearch`.*
[elk-stack]: https://www.elastic.co/what-is/elk-stack
[xpack]: https://www.elastic.co/what-is/open-x-pack
[paid-features]: https://www.elastic.co/subscriptions
[es-security]: https://www.elastic.co/guide/en/elasticsearch/reference/current/security-settings.html
[trial-license]: https://www.elastic.co/guide/en/elasticsearch/reference/current/license-settings.html
[license-mngmt]: https://www.elastic.co/guide/en/kibana/current/managing-licenses.html
[license-apis]: https://www.elastic.co/guide/en/elasticsearch/reference/current/licensing-apis.html
[elastdocker]: https://github.com/sherifabdlnaby/elastdocker
[linux-postinstall]: https://docs.docker.com/install/linux/linux-postinstall/
[booststap-checks]: https://www.elastic.co/guide/en/elasticsearch/reference/current/bootstrap-checks.html
[es-sys-config]: https://www.elastic.co/guide/en/elasticsearch/reference/current/system-config.html
[win-filesharing]: https://docs.docker.com/desktop/windows/#file-sharing
[mac-filesharing]: https://docs.docker.com/desktop/mac/#file-sharing
[builtin-users]: https://www.elastic.co/guide/en/elasticsearch/reference/current/built-in-users.html
[ls-security]: https://www.elastic.co/guide/en/logstash/current/ls-security.html
[ls-monitoring]: https://www.elastic.co/guide/en/logstash/current/monitoring-with-metricbeat.html
[sec-cluster]: https://www.elastic.co/guide/en/elasticsearch/reference/current/secure-cluster.html
[connect-kibana]: https://www.elastic.co/guide/en/kibana/current/connect-to-elasticsearch.html
[index-pattern]: https://www.elastic.co/guide/en/kibana/current/index-patterns.html
[config-es]: ./elasticsearch/config/elasticsearch.yml
[config-kbn]: ./kibana/config/kibana.yml
[config-ls]: ./logstash/config/logstash.yml
[es-docker]: https://www.elastic.co/guide/en/elasticsearch/reference/current/docker.html
[kbn-docker]: https://www.elastic.co/guide/en/kibana/current/docker.html
[ls-docker]: https://www.elastic.co/guide/en/logstash/current/docker-config.html
[log4j-props]: https://github.com/elastic/logstash/tree/7.6/docker/data/logstash/config
[esuser]: https://github.com/elastic/elasticsearch/blob/7.6/distribution/docker/src/docker/Dockerfile#L23-L24
[upgrade]: https://www.elastic.co/guide/en/elasticsearch/reference/current/setup-upgrade.html
[swarm-mode]: https://docs.docker.com/engine/swarm/