Distribute ROS 2 across machines with MicroK8s
sidfaber
on 17 November 2020
Introduction
Our simple ROS 2 talker and listener setup runs well on a single Kubernetes node; now let's distribute it across multiple computers. This article builds on the talker / listener setup from part 2 by running it on multiple K8s nodes.
At the end of this setup, expect to have a ROS 2 Kubernetes cluster running MicroK8s on three different machines. Applying a single configuration file then distributes the ROS 2 workload across all of them.
This is the third article in a series of four posts describing ROS 2 applications on Kubernetes with MicroK8s:
- Part 1: ROS 2 and Kubernetes basics
- Part 2: ROS 2 on Kubernetes: a simple talker and listener setup
- Part 3 (this article): Distribute ROS 2 across machines with Kubernetes
- Part 4: Exploring ROS 2 Kubernetes configurations
K8s design
Before starting installation and configuration, consider a few important Kubernetes principles that influence this setup:
- As always, keep ROS and Kubernetes term collisions in mind: nodes, namespaces and services exist in both systems but have very different meanings.
- All machines in the cluster should appear very similar to the Kubernetes infrastructure. When reviewing configurations, look for anything which describes the physical machine but might actually differ between Kubernetes nodes. For example, each machine in this prototype will be configured with the same network interface name since the Multus configuration depends on the machine’s master interface name.
- For high availability to work properly, all Kubernetes resources (including pods, services and deployments) need to migrate smoothly between nodes. This includes system pods in the kube-system namespace. Add the -A option to commands (e.g., microk8s.kubectl get all -A) to access resources in all the namespaces within the cluster.
Prerequisites: cluster hardware
Building a cluster with multiple machines requires a bit of infrastructure before beginning to install and configure software.
Use three or more machines to create a high availability MicroK8s cluster. Although this may seem like a lot of resources, don’t let this requirement hold you back. Ubuntu 20.04 and MicroK8s can easily run on older hardware. The master node described below runs on an HP Intel Core i3 laptop (circa 2015). The second node is a Dell Optiplex 755 desktop computer (circa 2008), and the third virtual node runs on a VMWare ESX server.
Each node needs a unique hostname, and each node must be able to resolve the names of all other nodes in DNS. The three nodes used for this article are micro1, micro2 and micro3.
Create the MicroK8s cluster
Begin with the same initial setup process for each K8s node. First configure networking, then install MicroK8s, and finally join the cluster.
Much of the installation for the first node may have been completed in part 2 of this series. If so, simply follow any steps below that have not yet been performed. If at any time system configuration changes appear to have created conflicts with MicroK8s, start over by removing the snap and its related data with the following command:
sudo snap remove microk8s --purge
Configure networking
Kubernetes (and Multus in particular) expects all nodes to use the same network interface name; however, Ubuntu Server by default uses predictable interface names, so each node will likely have a different one. Netplan provides a way to reliably rename the interface. The configuration below consistently renames each host's primary interface to eth0 based on the interface's MAC address.
Begin by identifying the MAC address of the interface to be used for K8s traffic. List all the interfaces on the node with the command ip a. Your results should look similar to the following (this laptop has both a wired and a wireless interface):
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: enp7s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 58:20:b1:7f:32:10 brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.21/16 brd 192.168.255.255 scope global enp7s0
       valid_lft forever preferred_lft forever
    inet6 fe80::5a20:b1ff:fe7f:3210/64 scope link
       valid_lft forever preferred_lft forever
3: wlp13s0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 40:b8:9a:1b:25:a3 brd ff:ff:ff:ff:ff:ff
Look for the interface that is UP and holds the machine's IP address (enp7s0 above); this is your primary network interface. Find the link/ether (i.e., MAC) address of this interface (58:20:b1:7f:32:10 in the example above).
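When automating this setup across many machines, the MAC address can be extracted from ip a output rather than copied by hand. A minimal sketch (the mac_of helper name is ours, and the parsing assumes standard iproute2 output):

```shell
#!/bin/sh
# mac_of: print the link/ether (MAC) address of a named interface,
# reading `ip a`-style text on stdin. Usage: ip a | mac_of enp7s0
mac_of() {
  awk -v ifc="$1" '
    $0 ~ "^[0-9]+: " ifc ":" { grab = 1; next }  # found the interface block
    grab && /link\/ether/    { print $2; exit }  # first MAC line inside it
  '
}

# Demo against the sample listing shown above:
sample='2: enp7s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500
    link/ether 58:20:b1:7f:32:10 brd ff:ff:ff:ff:ff:ff'
printf '%s\n' "$sample" | mac_of enp7s0   # prints 58:20:b1:7f:32:10
```

On a live node, `ip a | mac_of enp7s0` would produce the same value for use in the netplan file below.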
Use this information to edit the current netplan configuration. The netplan configuration is normally the only file in the directory /etc/netplan; the default file for Ubuntu 20.04 server is /etc/netplan/00-installer-config.yaml. Modify your configuration similar to the following:
# This is the network config written by 'subiquity'
network:
  ethernets:
    lan:
      match:
        macaddress: 58:20:b1:7f:32:10
      set-name: eth0
      addresses:
        - 192.168.1.21/16
      gateway4: 192.168.1.1
      nameservers:
        addresses: [192.168.1.1]
  version: 2
This netplan configuration uses the match directive with the macaddress property to select the proper network adapter, then uses set-name to assign the interface the name eth0.
In addition to naming the network interface, this configuration also gives the interface a static IP address of 192.168.1.21, a default gateway of 192.168.1.1, and a DNS nameserver of 192.168.1.1.
After saving changes to the netplan configuration, reboot for the interface name change to take effect.
DNS must also be set up properly for K8s nodes to locate each other. Although beyond the scope of this article, ensure that each node can resolve the IP address of other nodes in the cluster by host name.
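If no local DNS server is available, static entries in /etc/hosts on every node are enough for a small lab cluster. A sketch (micro1's address matches the netplan configuration above; the other addresses are examples to adjust for your network):

```
# /etc/hosts (append the same entries on every node)
192.168.1.21  micro1
192.168.1.22  micro2
192.168.1.23  micro3
```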
Install MicroK8s on each node
With networking properly configured on your node, it's time to install the MicroK8s snap. Also grant your user account permission to use microk8s commands:
sudo snap install microk8s --classic
sudo usermod -a -G microk8s $USER
sudo chown -f -R $USER ~/.kube
newgrp microk8s
Set up the primary node
Select one node in your cluster as a primary node. Normally this will be the first machine on which MicroK8s has been installed. Although all Kubernetes nodes in the cluster are essentially identical, one node serves as the master node which hosts the control plane. Other nodes join the cluster through this primary node.
On the primary node only, enable the DNS and Multus plugins used with the ROS 2 configuration:
microk8s enable multus dns
These plugins do not need to be explicitly enabled on other cluster nodes; they are enabled automatically as each node joins the cluster. This control plane work will shift to a standby node should the primary become unavailable.
Join the cluster
In order to add a second node to the cluster, first run the microk8s.add-node command on the master node. Output should look similar to the following:
From the node you wish to join to this cluster, run the following:
microk8s join 192.168.1.21:25000/f3338b610728cffbca327fe12c7e78a5
If the node you are adding is not reachable through the default interface
you can use one of the following:
microk8s join 192.168.1.21:25000/f3338b610728cffbca327fe12c7e78a5
microk8s join 10.1.224.64:25000/f3338b610728cffbca327fe12c7e78a5
This URL includes key material needed by the new node to join the cluster. Issue the join command on the new node to add it to the cluster:
microk8s join 192.168.1.21:25000/f3338b610728cffbca327fe12c7e78a5
Contacting cluster at 192.168.1.21
Waiting for this node to finish joining the cluster. ..
The microk8s.add-node command must be run to generate new key material each time a new node joins the cluster.
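When scripting cluster growth, the join command can be scraped from the add-node output instead of copied by hand. A sketch of the extraction step, demonstrated on sample output like that shown above (the first_join helper name is ours; actually running microk8s add-node on the primary and the join over ssh is left to your automation):

```shell
#!/bin/sh
# first_join: print the first `microk8s join ...` line from
# `microk8s add-node` output read on stdin.
first_join() { grep '^microk8s join' | head -n 1; }

# Demo on sample add-node output:
sample='From the node you wish to join to this cluster, run the following:
microk8s join 192.168.1.21:25000/f3338b610728cffbca327fe12c7e78a5

If the node you are adding is not reachable through the default interface
you can use one of the following:
microk8s join 10.1.224.64:25000/f3338b610728cffbca327fe12c7e78a5'
printf '%s\n' "$sample" | first_join
```

On the primary node, something like `ssh micro2 "$(microk8s add-node | first_join)"` would then join each worker, generating fresh key material per node.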
Use the command microk8s status to monitor the new node as it joins the cluster and configures the necessary system services:
microk8s is running
high-availability: no
datastore master nodes: 192.168.1.21:19001
datastore standby nodes: none
addons:
enabled:
dns # CoreDNS
ha-cluster # Configure high availability on the current node
multus # Multus CNI enables attaching multiple network interfaces to pods
disabled:
ambassador # Ambassador API Gateway and Ingress
cilium # SDN, fast with full network policy
...
Notice that the DNS and Multus plugins have been enabled as part of the process for joining the cluster.
Repeat this step to add a third K8s node to the cluster. Once the cluster contains three or more nodes, microk8s status will show that high availability has been automatically enabled.
Explore the cluster
If deployments were configured on the master node before adding additional nodes, these should still be running on the master. However, if the cluster does not have a deployment configured yet, apply the ROS talker / listener configuration as described in part 2 of this series. With this initial set of running pods and containers, take a look at which nodes are actually running the pods. Then experiment with scaling the number of running pods, and draining a node before taking it out of service.
List pods by node
Begin by checking the status of available nodes with the command microk8s.kubectl get nodes. This command can be executed on any of the K8s nodes.
NAME STATUS ROLES AGE VERSION
micro2 Ready <none> 3m v1.19.2-34+1b3fa60b402c1c
micro1 Ready <none> 5h v1.19.2-34+1b3fa60b402c1c
micro3 Ready <none> 2m1s v1.19.2-34+1b3fa60b402c1c
In order to identify which K8s node hosts each pod, use the command:
microk8s.kubectl get pods -o wide
This returns the state of each pod, along with its primary IP address and the node hosting it:
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
ros-talker-deployment-6c447f496c-vf6nx 1/1 Running 0 4m 10.1.222.70 micro1
ros-listener-deployment-575bfddd-czz9j 1/1 Running 0 4m 10.1.222.73 micro1
ros-talker-deployment-6c447f496c-hmzpw 1/1 Running 0 4m 10.1.222.66 micro1
Add more pods
The results above show that the first node in the cluster, micro1, still hosts all the pods. However, scaling up the number of talkers creates new pods on the other nodes:
microk8s.kubectl scale deployment ros-talker-deployment --replicas=10
Watch the output of microk8s.kubectl get all -o wide as these new pods start across different nodes.
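The same scale-out can be expressed declaratively by setting replicas in the deployment manifest and re-applying it. Only the relevant fields are sketched here; the deployment name follows the article's example and the rest of the spec is elided:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ros-talker-deployment
spec:
  replicas: 10   # the scheduler spreads these pods across the available nodes
  # ... selector and pod template unchanged from part 2 ...
```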
Take a node out of service
Running pods should be removed from service before shutting down a node. This is known as draining the node; the following command drains all work off the micro3 node:
microk8s.kubectl drain micro3 --ignore-daemonsets
DaemonSet pods are ignored (if any exist) since they generally run on all nodes and cannot be migrated off a node.
Monitor the cluster as new pods are launched on micro1 and micro2 while pods on micro3 are shut down:
node/micro3 already cordoned
WARNING: ignoring DaemonSet-managed Pods: kube-system/calico-node-skfr5, kube-system/kube-multus-ds-amd64-g98cc
evicting pod default/ros-talker-deployment-6c447f496c-b9wzj
evicting pod default/ros-talker-deployment-6c447f496c-hmzpw
evicting pod default/ros-talker-deployment-6c447f496c-t55pl
evicting pod default/ros-talker-deployment-6c447f496c-bcdws
evicting pod default/ros-talker-deployment-6c447f496c-mmt2s
pod/ros-talker-deployment-6c447f496c-b9wzj evicted
pod/ros-talker-deployment-6c447f496c-mmt2s evicted
pod/ros-talker-deployment-6c447f496c-hmzpw evicted
pod/ros-talker-deployment-6c447f496c-t55pl evicted
pod/ros-talker-deployment-6c447f496c-bcdws evicted
node/micro3 evicted
Notice that each evicted pod restarts in a fresh container and the talker counter resets. Once the drain completes successfully, the command microk8s.kubectl get nodes shows the status of micro3 as SchedulingDisabled:
NAME STATUS ROLES AGE VERSION
micro1 Ready <none> 2h v1.19.3-34+a56971609ff35a
micro2 Ready <none> 2h v1.19.3-34+a56971609ff35a
micro3 Ready,SchedulingDisabled <none> 2h v1.19.3-34+a56971609ff35a
Finally, when work on micro3 is complete, issue the command microk8s.kubectl uncordon micro3 to return the node to service.
Conclusion
We have a ROS system running across three different machines, and we’re able to distribute Kubernetes pods across all the machines. In the final post of this series, we’ll take a look at a few alternate configurations for our talker/listener setup to better understand how to troubleshoot your setup.