Elasticsearch¶
Example cluster definition¶
Example ElasticsearchCluster resource:
apiVersion: navigator.jetstack.io/v1alpha1
kind: ElasticsearchCluster
metadata:
  name: demo
spec:
  ## Omitting the minimumMasters fields will cause navigator to automatically
  ## determine a quorum of masters to use.
  # minimumMasters: 2
  version: 5.6.2
  securityContext:
    runAsUser: 1000
  pilotImage:
    repository: quay.io/jetstack/navigator-pilot-elasticsearch
    tag: v0.1.0-alpha.1
    pullPolicy: Always
  nodePools:
  - name: master
    replicas: 3
    roles:
    - master
    resources:
      requests:
        cpu: "500m"
        memory: "2Gi"
      limits:
        cpu: "1"
        memory: "3Gi"
    persistence:
      enabled: true
      # size of the volume
      size: 10Gi
      # storageClass of the volume
      storageClass: standard
  - name: mixed
    replicas: 2
    roles:
    - data
    - ingest
    resources:
      requests:
        cpu: "500m"
        memory: "2Gi"
      limits:
        cpu: "1"
        memory: "3Gi"
    persistence:
      enabled: true
      # size of the volume
      size: 10Gi
      # storageClass of the volume
      storageClass: standard
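When minimumMasters is omitted, as in the commented-out field above, Navigator determines a quorum of masters automatically. A minimal sketch of the standard Elasticsearch majority rule (N/2 + 1) that such a quorum follows; the helper name here is hypothetical, not part of Navigator:

```python
# Majority quorum for master-eligible Elasticsearch nodes.
# Assumption: the auto-computed value follows the standard
# Elasticsearch N // 2 + 1 majority rule; this helper only
# illustrates that rule, it is not Navigator code.
def master_quorum(master_replicas: int) -> int:
    """Smallest number of masters forming a strict majority."""
    return master_replicas // 2 + 1
```

For the 3-replica master nodepool above, this gives a quorum of 2, which is why a fixed minimumMasters: 2 is shown as an equivalent explicit setting.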
Node Pools¶
The Elasticsearch nodes in a Navigator ElasticsearchCluster are configured and grouped by role; in Navigator, these groups of nodes are called nodepools.
Note
Other than the following whitelisted fields, updates to nodepool configuration are not allowed:
replicas
persistence
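For example, scaling the mixed nodepool of the demo cluster above is an allowed update, since only the whitelisted replicas field changes. A sketch of the modified fragment:

```yaml
# Allowed update: only the whitelisted replicas field changes;
# all other nodepool fields are left exactly as first applied.
spec:
  nodePools:
  - name: mixed
    replicas: 3   # was 2
```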
Configure Scheduler Type¶
If a custom scheduler type is required (for example if you are deploying with stork or another storage provider), this can be set on each nodepool:
spec:
  nodePools:
  - name: "ringnodes-1"
    schedulerName: "fancy-scheduler"
  - name: "ringnodes-2"
    schedulerName: "fancy-scheduler"
If the schedulerName field is not specified for a nodepool, the default scheduler is used.
Managing Compute Resources for Clusters¶
Each nodepool has a resources attribute which defines the resource requirements and limits for each database node (pod) in that pool.
In the example above, each database node will request 0.5 CPU core and 2GiB of memory, and will be limited to 1 CPU core and 3GiB of memory.
The resources field follows exactly the same specification as the Kubernetes Pod API (pod.spec.containers[].resources).
See Managing Compute Resources for Containers for more information.
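The quantity strings used in these examples ("500m", "2Gi") follow the Kubernetes quantity notation. As an illustration only (this helper is hypothetical; real parsing is done by the Kubernetes apimachinery resource.Quantity type), the common suffixes decode as:

```python
import re

# Decode Kubernetes resource quantity strings such as "500m" (CPU)
# and "2Gi" (memory) into plain numbers. Illustrative sketch only --
# not how Kubernetes itself parses quantities.
_SUFFIXES = {
    "m": 0.001,        # milli, used for CPU ("500m" = half a core)
    "Ki": 1024 ** 1,   # binary suffixes, used for memory
    "Mi": 1024 ** 2,
    "Gi": 1024 ** 3,
    "Ti": 1024 ** 4,
    "k": 1000 ** 1,    # decimal suffixes
    "M": 1000 ** 2,
    "G": 1000 ** 3,
}

def parse_quantity(q: str) -> float:
    """Return the numeric value of a quantity string."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)([A-Za-z]*)", q)
    if not match:
        raise ValueError(f"malformed quantity: {q!r}")
    number, suffix = match.groups()
    return float(number) * _SUFFIXES.get(suffix, 1)
```

So the request of "500m" CPU is 0.5 of a core, and "2Gi" of memory is 2 × 1024³ bytes.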
Pilots¶
Navigator creates one Pilot resource for every database node.
Pilot resources have the same name and namespace as the Pod for the corresponding database node.
The Pilot.Spec is read by the pilot process running inside a Pod and contains its desired configuration.
The Pilot.Status is updated by the pilot process and contains the discovered state of a single database node.
Other Supplementary Resources¶
Navigator will also create a number of supplementary resources for each cluster.
For example, it will create a serviceaccount, a role and a rolebinding so that pilot pods in a cluster have read-only access to the API resources containing cluster configuration, and so that pilot pods can update the status of their corresponding Pilot resource and the leader election configmap.
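The shape of such a role can be sketched as follows; the exact resource names, verbs and API groups are Navigator internals, so treat every name in this fragment as an illustrative assumption rather than the role Navigator actually creates:

```yaml
# Illustrative sketch only -- the real role created by Navigator may differ.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: demo-pilot            # hypothetical name
rules:
- apiGroups: ["navigator.jetstack.io"]
  resources: ["pilots"]
  verbs: ["get", "list", "watch", "update"]   # read config, update Pilot status
- apiGroups: [""]
  resources: ["configmaps"]
  verbs: ["get", "update"]                    # leader-election configmap
```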
System Configuration for Elasticsearch Nodes¶
Elasticsearch requires important system configuration settings to be applied globally on the host operating system.
You must either ensure that Navigator is running in a Kubernetes cluster where all the nodes have been configured this way, or use node labels and node selectors to ensure that the pods of an Elasticsearch cluster are only scheduled to nodes with the required configuration.
See Using Sysctls in a Kubernetes Cluster, and Taints and Tolerations for more information.
One way to apply these settings is to deploy a DaemonSet that runs the configuration commands from within a privileged container on each Kubernetes node.
Here’s a simple example of such a DaemonSet:
$ kubectl apply -f docs/quick-start/sysctl-daemonset.yaml
# Apply sysctl configuration required by Elasticsearch
#
# This DaemonSet will re-run sysctl every 60s on all nodes.
#
# XXX See CronJob daemonset which will allow scheduling one-shot or repeated
# jobs across nodes:
# https://github.com/kubernetes/kubernetes/issues/36601
apiVersion: "extensions/v1beta1"
kind: "DaemonSet"
metadata:
  name: "navigator-elasticsearch-sysctl"
  namespace: "kube-system"
spec:
  template:
    metadata:
      labels:
        app: "navigator-elasticsearch-sysctl"
    spec:
      containers:
      - name: "apply-sysctl"
        image: "busybox:latest"
        resources:
          limits:
            cpu: "10m"
            memory: "8Mi"
          requests:
            cpu: "10m"
            memory: "8Mi"
        securityContext:
          privileged: true
        command:
        - "/bin/sh"
        - "-c"
        - |
          set -o errexit
          set -o xtrace
          while sysctl -w vm.max_map_count=262144
          do
            sleep 60s
          done