Deployment

How to run Replicator on your infrastructure

1: Run Replicator in Docker

2: Kubernetes

2.1: Helm
2.2: Monitoring

Replicator is optimised to run in Kubernetes, where you can deploy it using a Helm chart we provide. However, you can also run it on a virtual machine using Docker Compose.

1 - Run Replicator in Docker

Running Replicator with Docker Compose

You can run Replicator using Docker Compose on any machine that has Docker installed.

We prepared a complete set of files for this scenario. You can find those files in the Replicator repository.

The Compose file includes the following components:

Replicator itself
Prometheus, pre-configured to scrape the Replicator metrics endpoint
Grafana, pre-configured to use Prometheus, with the Replicator dashboard included

Configuration

Before spinning up this setup, you need to change the replicator.yml file. The Replicator settings are described on the Configuration page. We included a sample configuration file in the repository.

The sample configuration file includes a JavaScript transform configuration as an example. It is not suitable for production purposes, so make sure you remove it from your configuration.

The sample configuration enables verbose logging using the REPLICATOR_DEBUG environment variable. For production deployments, you should remove it from the configuration.

Monitoring

When you start all the components using docker-compose up, you can open the Replicator web UI at http://localhost:5000 and Grafana at http://localhost:3000. Use the default admin/admin credentials for Grafana. The Replicator dashboard is included in the deployment, so you can find it in the dashboards list.
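As a rough sketch of how to confirm that both web endpoints are up after starting the stack, the helper below polls a URL until it answers. The function name wait_for_url, the retry count, and the two-second pause are illustrative assumptions, not part of the Replicator repository; only the two URLs come from this page.

```shell
# Hypothetical helper: poll a URL until it returns a non-error HTTP status
# or the given number of attempts (default 30) is used up.
wait_for_url() {
  url=$1
  attempts=${2:-30}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # --fail makes curl exit non-zero on HTTP errors (status 400 and above).
    if curl --silent --fail --output /dev/null "$url"; then
      echo "up: $url"
      return 0
    fi
    i=$((i + 1))
    sleep 2
  done
  echo "not reachable: $url" >&2
  return 1
}
```

After docker-compose up -d, calling wait_for_url http://localhost:5000 and then wait_for_url http://localhost:3000 would report whether the Replicator UI and Grafana respond.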

Watch out for the replication gap and ensure that it decreases.

Grafana dashboard

2 - Kubernetes

Deploying Replicator to a Kubernetes cluster

You can run Replicator in a Kubernetes cluster in the same cloud as your managed EventStoreDB cloud cluster. The Kubernetes cluster workloads must be able to reach the managed EventStoreDB cluster. Usually, with a proper VPC (or VN) peering between your VPC and Event Store Cloud network, it works without issues.

We provide guidelines about connecting managed Kubernetes clusters.

2.1 - Helm

Deploy Replicator with our Helm chart

The easiest way to deploy Replicator to Kubernetes is by using the provided Helm chart. This page gives detailed instructions for using the Replicator Helm chart.

Add Helm repository

Ensure you have Helm 3 installed on your machine:

$ helm version
version.BuildInfo{Version:"v3.5.2", GitCommit:"167aac70832d3a384f65f9745335e9fb40169dc2", GitTreeState:"dirty", GoVersion:"go1.15.7"}

If you don’t have Helm, follow the Helm installation guide.

Add the Replicator repository:

$ helm repo add es-replicator https://eventstore.github.io/replicator
$ helm repo update

Provide configuration

Configure the Replicator options using a new values.yml file:

replicator:
  reader:
    connectionString: "GossipSeeds=node1.esdb.local:2113,node2.esdb.local:2113,node3.esdb.local:2113; HeartBeatTimeout=500; UseSslConnection=False;  DefaultUserCredentials=admin:changeit;"
  sink:
    connectionString: "esdb://admin:changeit@[cloudclusterid].mesdb.eventstore.cloud:2113"
    partitionCount: 6
  filters:
    - type: eventType
      include: "."
      exclude: '((Bad|Wrong)\w+Event)'
  transform:
    type: http
    config: "http://transform.somenamespace.svc:5000"
prometheus:
  metrics: true
  operator: true
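To get a feel for what the exclude pattern in the sample filter does, the sketch below runs an equivalent expression over a few made-up event type names. The event names are hypothetical. The pattern is a POSIX approximation in which [[:alnum:]_] stands in for \w, and it assumes the filter matches anywhere in the event type name; Replicator itself may evaluate the expression differently.

```shell
# Hypothetical event type names, used only to illustrate the filter.
event_types="OrderPlaced
BadOrderEvent
WrongPaymentEvent
PaymentReceived
BadEvent"

# POSIX ERE approximation of ((Bad|Wrong)\w+Event).
exclude='(Bad|Wrong)[[:alnum:]_]+Event'

# Event types the exclude filter would drop.
excluded=$(printf '%s\n' "$event_types" | grep -E "$exclude")

# Event types that would still be replicated.
replicated=$(printf '%s\n' "$event_types" | grep -Ev "$exclude")

printf 'excluded:\n%s\n' "$excluded"
printf 'replicated:\n%s\n' "$replicated"
```

Note that BadEvent is not excluded in this sketch: the pattern requires at least one word character between the Bad or Wrong prefix and the Event suffix.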

Available options are:

Option	Description	Default
`replicator.reader.connectionString`	Connection string for the source cluster or instance	nil
`replicator.reader.protocol`	Reader protocol	`tcp`
`replicator.reader.pageSize`	Reader page size (only applicable for TCP protocol)	`4096`
`replicator.sink.connectionString`	Connection string for the target cluster or instance	nil
`replicator.sink.protocol`	Writer protocol	`grpc`
`replicator.sink.partitionCount`	Number of partitioned concurrent writers	`1`
`replicator.sink.partitioner`	Custom JavaScript partitioner	`null`
`replicator.sink.bufferSize`	Size of the sink buffer, in events	`1000`
`replicator.scavenge`	Enable real-time scavenge	`true`
`replicator.runContinuously`	Set to `false` if you want Replicator to stop when it reaches the end of `$all` stream.	`true`
`replicator.filters`	Add one or more of provided filters	`[]`
`replicator.transform`	Configure the event transformation
`replicator.transform.bufferSize`	Size of the prepare buffer (filtering and transformations), in events	`1000`
`prometheus.metrics`	Enable annotations for Prometheus	`false`
`prometheus.operator`	Create `PodMonitor` custom resource for Prometheus Operator	`false`
`resources.requests.cpu`	CPU request	`250m`
`resources.requests.memory`	Memory request	`512Mi`
`resources.limits.cpu`	CPU limit	`1`
`resources.limits.memory`	Memory limit	`1Gi`
`pvc.storageClass`	Persistent volume storage class name	`null`
`terminationGracePeriodSeconds`	Timeout for graceful shutdown of the workload; it must be long enough for the sink buffer to flush	`300`
`jsConfigMaps`	List of existing config maps to be used as JS code files (for JS transform, for example)	`{}`

Note:

As Replicator uses the 20.10 TCP client, you have to specify UseSslConnection=False in the connection string, as in the sample above, when connecting to an insecure cluster or instance.
Only increase the partition count if you don’t care about the $all stream order (regular streams will be in order anyway).
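The following sketch illustrates why per-stream order can survive partitioning while the $all order does not: if every event of a stream is routed to the same writer, its events stay in sequence, but different writers progress independently. The stream names and the checksum-based hash are assumptions made for illustration; this is not Replicator's actual partitioner.

```shell
partition_count=6

# Map a stream name to a partition: CRC checksum of the name modulo the
# partition count. Illustrative stand-in, not Replicator's real algorithm.
partition_for() {
  crc=$(printf '%s' "$1" | cksum | cut -d ' ' -f 1)
  echo $((crc % partition_count))
}

# Hypothetical stream names; order-1 appears twice and lands on the same partition.
for stream in order-1 order-2 customer-7 order-1; do
  echo "$stream -> partition $(partition_for "$stream")"
done
```

Because the mapping depends only on the stream name, both order-1 events go to the same writer, whereas events from different streams may be written by different writers in a different relative order than in the source.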

You should at least provide both connection strings and ensure that workloads in your Kubernetes cluster can reach both the source and the target EventStoreDB clusters or instances.

See also the page about monitoring the Replicator process in Kubernetes.

Configuring a JavaScript transform

Follow the documentation to configure a JavaScript transform in your values.yml file.

Then append the following option to your helm install command:

--set-file transformJs=./transform.js

Configuring a custom partitioner

Follow the documentation to configure a custom partitioner in your values.yml file.

Then append the following option to your helm install command:

--set-file partitionerJs=./partitioner.js
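Put together, an install that attaches both scripts might look like the sketch below. The wrapper function install_with_scripts and its file checks are hypothetical conveniences; the helm arguments themselves are the ones shown on this page, and the file paths assume both scripts sit in the current directory.

```shell
# Hypothetical wrapper: install the chart with both optional scripts attached.
install_with_scripts() {
  for file in ./transform.js ./partitioner.js; do
    # Fail early, because --set-file reads the file from the local disk.
    if [ ! -f "$file" ]; then
      echo "missing $file" >&2
      return 1
    fi
  done

  helm install es-replicator es-replicator/es-replicator \
    --values values.yml \
    --namespace es-replicator \
    --set-file transformJs=./transform.js \
    --set-file partitionerJs=./partitioner.js
}
```

If you use only one of the two scripts, drop the other --set-file line and the matching file check.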

Complete the deployment

When the values.yml file is complete, deploy the release using Helm. Remember to set the current kubectl context to the cluster you are deploying to.

helm install es-replicator \
  es-replicator/es-replicator \
  --values values.yml \
  --namespace es-replicator

You can choose another namespace, but the namespace must exist before you deploy.
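One way to make sure the namespace exists is to create it right before the install, as in this minimal sketch. The function name deploy_replicator and its default arguments are assumptions for illustration; the helm command is the one shown above.

```shell
# Hypothetical helper: create the namespace if it is missing, then install the chart.
deploy_replicator() {
  namespace=${1:-es-replicator}
  values_file=${2:-values.yml}

  # "kubectl get namespace" exits non-zero when the namespace does not exist.
  if ! kubectl get namespace "$namespace" >/dev/null 2>&1; then
    kubectl create namespace "$namespace"
  fi

  helm install es-replicator es-replicator/es-replicator \
    --values "$values_file" \
    --namespace "$namespace"
}
```

Helm 3.2 and later can also create the namespace for you if you add the --create-namespace flag to helm install.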

The replication starts immediately after the deployment, assuming that all the connection strings are correct, and the Replicator workload has network access to both source and sink EventStoreDB instances.

The checkpoint is stored on a persistent volume, which is provisioned as part of the Helm release. If you delete the release, the volume will be deleted by the cloud provider, and the checkpoint will be gone. If you deploy the tool again, it will start from the beginning of the $all stream and will produce duplicate events.

2.2 - Monitoring

Observe Replicator in Kubernetes

When the deployment finishes, you should be able to connect to the Replicator service by using port forwarding:

$ kubectl port-forward -n es-replicator svc/es-replicator 5000

The Replicator web interface should then be accessible via http://localhost:5000. The UI displays the replication progress, source read and target write positions, number of events written, and the replication gap. Note that the write rate is shown for a single writer. When you use concurrent writers, the speed will be higher than shown.

Prometheus

If you have Prometheus in your Kubernetes cluster, we recommend enabling the prometheus.metrics option. If the prometheus.operator option is set to false, the deployment will be annotated with prometheus.io/scrape.

If you have Prometheus managed by Prometheus Operator, the scrape annotation won’t work. You can set both prometheus.metrics and prometheus.operator options to true, so the Helm release will include the PodMonitor custom resource. Make sure that your Prometheus custom resource is properly configured with regard to podMonitorNamespaceSelector and podMonitorSelector, so it will not ignore the Replicator pod monitor.
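To check the selectors mentioned above, something along these lines could help. The function name check_pod_monitor is hypothetical, and the sketch assumes the Prometheus Operator custom resource definitions are installed and that the release lives in the es-replicator namespace unless you pass another one.

```shell
# Hypothetical helper: show the PodMonitor from the release next to the
# selectors of every Prometheus custom resource in the cluster.
check_pod_monitor() {
  namespace=${1:-es-replicator}

  # PodMonitor objects in the Replicator namespace.
  kubectl get podmonitors.monitoring.coreos.com --namespace "$namespace"

  # Selectors each Prometheus instance uses to pick up PodMonitor objects.
  kubectl get prometheuses.monitoring.coreos.com --all-namespaces \
    --output custom-columns='NAME:.metadata.name,POD_MONITOR_SELECTOR:.spec.podMonitorSelector,POD_MONITOR_NS_SELECTOR:.spec.podMonitorNamespaceSelector'
}
```

If the labels on the PodMonitor or on its namespace do not match the selectors printed in the second table, Prometheus will ignore the Replicator pod monitor.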

Grafana

The best way to monitor the replication progress is using Prometheus and Grafana. If the pod is being properly scraped for metrics, you can use the Grafana dashboard, which you can create by importing it from the JSON file.

Watch out for the replication gap and ensure that it decreases.

Grafana dashboard