Cassandra

Example cluster definition

Example CassandraCluster resource:

apiVersion: "navigator.jetstack.io/v1alpha1"
kind: "CassandraCluster"
metadata:
  name: "demo"
spec:
  version: "3.11.1"
  nodePools:
  - name: "ringnodes"
    replicas: 3
    datacenter: "demo-datacenter"
    rack: "demo-rack"
    persistence:
      enabled: true
      size: "5Gi"
      storageClass: "default"
    nodeSelector: {}
    resources:
      requests:
        cpu: "500m"
        memory: "2Gi"
      limits:
        cpu: "1"
        memory: "3Gi"
  image:
    repository: "cassandra"
    tag: "3"
    pullPolicy: "IfNotPresent"
  pilotImage:
    repository: "quay.io/jetstack/navigator-pilot-cassandra"
    tag: "v0.1.0-alpha.1"

Node Pools

The C* nodes in a Navigator CassandraCluster are configured and grouped by rack and data center. In Navigator, these groups of nodes are called nodepools.

All the C* nodes (pods) in a nodepool share the same configuration. The following sections describe the configuration options that are available.

Note

Other than the following whitelisted fields, updates to nodepool configuration are not allowed:

  • replicas
  • persistence

Configure Scheduler Type

If a custom scheduler type is required (for example, if you are deploying with Stork or another storage provider), this can be set on each nodepool:

spec:
  nodePools:
  - name: "ringnodes-1"
    schedulerName: "fancy-scheduler"
  - name: "ringnodes-2"
    schedulerName: "fancy-scheduler"

If the schedulerName field is not specified for a nodepool, the default scheduler is used.

Cassandra Across Multiple Availability Zones

With rack awareness

Navigator supports running Cassandra with rack- and datacenter-aware replication. To deploy this, you must run a nodePool in each availability zone and mark each as a separate Cassandra rack.

The nodeSelector field of a nodePool constrains scheduling of that nodePool's pods to the set of Kubernetes nodes matching the given labels. This should be used with a node label such as failure-domain.beta.kubernetes.io/zone.

The datacenter and rack fields mark all Cassandra nodes in a nodepool as being located in that datacenter and rack. This information can then be used with the NetworkTopologyStrategy keyspace replica placement strategy. If these are not specified, Navigator will select an appropriate name for each: datacenter defaults to a static value, and rack defaults to the nodePool’s name.

As an example, here is the nodePools section of a CassandraCluster spec for deploying into GKE in europe-west1 with rack awareness enabled:

nodePools:
- name: "np-europe-west1-b"
  replicas: 3
  datacenter: "europe-west1"
  rack: "europe-west1-b"
  nodeSelector:
    failure-domain.beta.kubernetes.io/zone: "europe-west1-b"
  persistence:
    enabled: true
    size: "5Gi"
    storageClass: "default"
- name: "np-europe-west1-c"
  replicas: 3
  datacenter: "europe-west1"
  rack: "europe-west1-c"
  nodeSelector:
    failure-domain.beta.kubernetes.io/zone: "europe-west1-c"
  persistence:
    enabled: true
    size: "5Gi"
    storageClass: "default"
- name: "np-europe-west1-d"
  replicas: 3
  datacenter: "europe-west1"
  rack: "europe-west1-d"
  nodeSelector:
    failure-domain.beta.kubernetes.io/zone: "europe-west1-d"
  persistence:
    enabled: true
    size: "5Gi"
    storageClass: "default"

Without rack awareness

Because the rack name defaults to the nodepool name, rack awareness is disabled by setting the rack field to the same static value in every nodepool.

A simplified example:

nodePools:
- name: "np-europe-west1-b"
  replicas: 3
  datacenter: "europe-west1"
  rack: "default-rack"
  nodeSelector:
    failure-domain.beta.kubernetes.io/zone: "europe-west1-b"
- name: "np-europe-west1-c"
  replicas: 3
  datacenter: "europe-west1"
  rack: "default-rack"
  nodeSelector:
    failure-domain.beta.kubernetes.io/zone: "europe-west1-c"
- name: "np-europe-west1-d"
  replicas: 3
  datacenter: "europe-west1"
  rack: "default-rack"
  nodeSelector:
    failure-domain.beta.kubernetes.io/zone: "europe-west1-d"

Managing Compute Resources for Clusters

Each nodepool has a resources attribute which defines the resource requirements and limits for each database node (pod) in that pool.

In the example above, each database node will request 0.5 CPU core and 2GiB of memory, and will be limited to 1 CPU core and 3GiB of memory.

The resources field follows exactly the same specification as the Kubernetes Pod API (pod.spec.containers[].resources).

See Managing Compute Resources for Containers for more information.

Connecting to Cassandra

If you apply the YAML manifest from the example above, Navigator will create a Cassandra cluster with three C* nodes running in three pods. The IP addresses assigned to each C* node may change when pods are rescheduled or restarted, but there are stable DNS names which allow you to connect to the cluster.

Services and DNS Names

Navigator creates two headless services for every Cassandra cluster. Each service has a corresponding DNS domain name:

  1. The nodes service (e.g. cass-demo-nodes) has a DNS domain name which resolves to the IP addresses of all the C* nodes in the cluster (nodes 0, 1, and 2 in this example).
  2. The seeds service (e.g. cass-demo-seeds) has a DNS domain name which resolves to the IP addresses of only the seed nodes (node 0 in this example).

These DNS names have multiple A (host) records, one for each healthy C* node IP address.

Note

The DNS server only includes healthy nodes when answering requests for these two services.

The DNS names can be resolved from any pod in the Kubernetes cluster:

  • If the pod is in the same namespace as the Cassandra cluster, you need only use the left-most label of the DNS name, e.g. cass-demo-nodes (as in the example below).
  • If the pod is in a different namespace, you must use the fully qualified DNS name, e.g. cass-demo-nodes.my-namespace.svc.cluster.local.
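
For example, the A records behind the nodes service can be listed from a pod in the same namespace with a short Python snippet (the cass-demo-nodes name matches the example cluster above):

import socket

# Resolve the headless "nodes" service; each A record is the IP address of a
# healthy C* node.
hostname, aliases, addresses = socket.gethostbyname_ex("cass-demo-nodes")
for ip in addresses:
    print(ip)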

Note

Read DNS for Services and Pods for more information about DNS in Kubernetes.

TCP Ports

The C* nodes all listen on the following TCP ports:

  1. 9042: For CQL client connections.
  2. 8080: For Prometheus metrics scraping (see the sketch after this list).
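
As a sketch, the metrics of a single C* pod could be fetched from inside the cluster; the /metrics path follows the usual Prometheus convention and is an assumption here, as is using the nodes service DNS name as the host.

import urllib.request

# Fetch Prometheus metrics from one of the C* pods behind the nodes service.
# The /metrics path is assumed, following the usual Prometheus convention.
url = "http://cass-demo-nodes:8080/metrics"
with urllib.request.urlopen(url, timeout=5) as response:
    print(response.read().decode("utf-8"))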

Connect using a CQL Client

Navigator configures all the nodes in a Cassandra cluster to listen on TCP port 9042 for CQL client connections, and there are CQL drivers for most popular programming languages. Most drivers can connect to a single node and then discover all the other cluster nodes.

For example, you could use the Datastax Python driver to connect to the Cassandra cluster as follows:

from cassandra.cluster import Cluster

# Connect via the nodes service; the driver discovers the other C* nodes.
cluster = Cluster(['cass-demo-nodes'], port=9042)
session = cluster.connect()
rows = session.execute('SELECT ... FROM ...')
for row in rows:
    print(row)

Note

The IP address to which the driver makes the initial connection depends on the DNS server and operating system configuration.

Pilots

Navigator creates one Pilot resource for every database node. A Pilot resource has the same name and namespace as the Pod for the corresponding database node. The Pilot.Spec is read by the pilot process running inside that Pod and contains its desired configuration. The Pilot.Status is updated by the pilot process and contains the discovered state of a single database node.
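
A minimal sketch of listing these resources with the Kubernetes Python client, assuming the Pilot resources are served under the same navigator.jetstack.io/v1alpha1 API group as the CassandraCluster and use the plural name pilots:

from kubernetes import client, config

config.load_kube_config()
api = client.CustomObjectsApi()

# List the Pilot resources in the namespace of the Cassandra cluster.
pilots = api.list_namespaced_custom_object(
    group="navigator.jetstack.io",
    version="v1alpha1",
    namespace="default",
    plural="pilots",
)
for pilot in pilots["items"]:
    # Each Pilot has the same name as the pod of its database node.
    print(pilot["metadata"]["name"])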

Other Supplementary Resources

Navigator also creates a number of supplementary resources for each cluster. For example, it creates a serviceaccount, a role, and a rolebinding so that pilot pods in a cluster have read-only access to the API resources containing cluster configuration, and so that pilot pods can update the status of their corresponding Pilot resource and the leader-election configmap.

The Life Cycle of a Navigator Cassandra Cluster

Changes to the configuration of an established Cassandra cluster must be carefully sequenced in order to maintain the health of the cluster, so Navigator is conservative about the configuration changes that it supports.

The following sections describe the configuration changes that are supported and those that are not yet supported.

Supported Configuration Changes

Navigator supports the following changes to a Cassandra cluster:

  • Create Cluster: Add all initially configured node pools and nodes.
  • Scale Out: Increase CassandraCluster.Spec.NodePools[0].Replicas to add more C* nodes to a nodepool.

Navigator does not currently support any other changes to the Cassandra cluster configuration.

Unsupported Configuration Changes

The following configuration changes are not currently supported but will be supported in the near future:

  • Minor Upgrade: Trigger a rolling Cassandra upgrade by increasing the minor and / or patch components of CassandraCluster.Spec.Version.
  • Scale In: Decrease CassandraCluster.Spec.NodePools[0].Replicas to remove C* nodes from a nodepool.

The following configuration changes are not currently supported:

  • Add Rack: Add a nodepool for a new rack.
  • Remove Rack: Remove a nodepool.
  • Add Data Center: Add a nodepool for a new data center.
  • Remove Data Center: Remove all the nodepools in a data center.
  • Major Upgrade: Upgrade to a new major Cassandra version.

Create Cluster

When you first create a CassandraCluster resource, Navigator will add nodes, one at a time, in order of NodePool and according to the process described in Scale Out (below). The order of node creation is determined by the order of the entries in the CassandraCluster.Spec.NodePools list. You can look at CassandraCluster.Status.NodePools to see the current state.

Scale Out

When you first create a cluster, or when you increase CassandraCluster.Spec.NodePools[i].Replicas, Navigator will add C* nodes, one at a time, until the desired number of nodes is reached. You can look at CassandraCluster.Status.NodePools[<nodepoolname>].ReadyReplicas to see the current number of healthy C* nodes in each nodepool.
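
For example, the per-nodepool readiness could be read back with the Kubernetes Python client; the cassandraclusters plural name and the lower-camel-case status field names are assumptions here.

from kubernetes import client, config

config.load_kube_config()
api = client.CustomObjectsApi()

cluster = api.get_namespaced_custom_object(
    group="navigator.jetstack.io",
    version="v1alpha1",
    namespace="default",
    plural="cassandraclusters",
    name="demo",
)

# status.nodePools is keyed by nodepool name; readyReplicas reports the number
# of healthy C* nodes currently in that pool.
for name, pool_status in cluster.get("status", {}).get("nodePools", {}).items():
    print(name, pool_status.get("readyReplicas"))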