Go BPF libraries: cilium ebpf vs iovisor gobpf

Published 06-14-2020 16:06:47

Introduction

This post does not cover the basics of eBPF; there are plenty of resources online for that now (the cilium documentation and other blog posts). Instead, we compare these two Go libraries from the point of view of loading BPF programs from ELF files. Both libraries have additional functionality (bcc for iovisor and asm for cilium) that we are not concerned with here. For quick troubleshooting bcc is fine, but if you want to build a long-running service based on BPF then I think loading the programs from ELF files is the way to go, because it also cuts down on dependencies such as the kernel source or headers.

Loading a SOCKET_FILTER BPF program type

We will use a simple sample we found in the kernel source for testing:

#include <linux/bpf.h>
#include <linux/if_ether.h>
#include <linux/if_packet.h>
#include <linux/ip.h>
#include "include/bpf_helpers.h"
#include <linux/types.h>
#include "include/types.h"

#ifndef offsetof
#define offsetof(TYPE, MEMBER) ((size_t) & ((TYPE *)0)->MEMBER)
#endif

struct bpf_map_def SEC("maps") my_map = {
	.type = BPF_MAP_TYPE_ARRAY,
	.key_size = sizeof(__u32),
	.value_size = sizeof(long),
	.max_entries = 256,
};

SEC("socket1")
int filter(struct __sk_buff *skb)
{
	char fmt0[] = "Hello from filter !";
	bpf_trace_printk(fmt0, sizeof(fmt0));
	int index = load_byte(skb, ETH_HLEN + offsetof(struct iphdr, protocol));
	long *value;

/*  if (skb->pkt_type != PACKET_OUTGOING)
		return 0;
*/
	value = bpf_map_lookup_elem(&my_map, &index);
	if (value)
		__sync_fetch_and_add(value, skb->len);

	return 0;
}
char _license[] SEC("license") = "GPL";

For compiling the restricted C code out of the kernel tree I use a folder structure like this:

[Image: folder structure]

There are several header files taken out of the kernel source which I keep in the include folder. We may not need all of them for the examples in this post, but I found that, depending on the program you write, those files may be needed.

Compile the program with clang (the kernel headers package must be installed):

cd bpf_prog; clang -D__KERNEL__ -D__ASM_SYSREG_H -Wno-unused-value -Wno-pointer-sign -Wno-compare-distinct-pointer-types -Wunused -Wall -Werror -O2 -g -target bpf -c filter_test2.c -o filter.o

Iovisor gobpf/elf library

The iovisor gobpf library uses CGO quite a lot, so in order to compile the Go program you will need the kernel headers. Iovisor uses its own map definition format and encourages you to use it; otherwise loading the maps will fail, because the library checks that each map definition is 280 bytes, which is the size of the custom map struct they use.

#define BUF_SIZE_MAP_NS 256

typedef struct bpf_map_def {
	unsigned int type;
	unsigned int key_size;
	unsigned int value_size;
	unsigned int max_entries;
	unsigned int map_flags;
	unsigned int pinning;
	char namespace[BUF_SIZE_MAP_NS];
} bpf_map_def;

enum bpf_pin_type {
	PIN_NONE = 0,
	PIN_OBJECT_NS,
	PIN_GLOBAL_NS,
	PIN_CUSTOM_NS,
};

It is here: https://github.com/iovisor/gobpf/blob/master/elf/include/bpf_map.h and you can add it to your include folder. You should also comment out the struct bpf_map_def in the original bpf_helpers.h file. So the first part of our C BPF program now looks like this (it includes the iovisor bpf_map_def and the map is modified as per the new map definition):

#include <linux/bpf.h>
#include <linux/if_ether.h>
#include <linux/if_packet.h>
#include <linux/ip.h>
#include "include/bpf_helpers.h"
#include "include/iovisor_gobpf_map_def.h"
#include <linux/types.h>
#include "include/types.h"

#ifndef offsetof
#define offsetof(TYPE, MEMBER) ((size_t) & ((TYPE *)0)->MEMBER)
#endif

struct bpf_map_def SEC("maps/my_map") my_map = {
	.type = BPF_MAP_TYPE_ARRAY,
	.key_size = sizeof(__u32),
	.value_size = sizeof(long),
	.max_entries = 256,
	.pinning = 0,
	.namespace = "",
};

Notice that the section name of the map has to be changed for iovisor, as it looks for maps in sections of the format maps/<map_name>.
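The 280-byte figure mentioned earlier can be checked with a bit of arithmetic: six 4-byte unsigned ints plus the 256-byte namespace buffer. Below is a minimal sketch of that check in Go. The gobpfMapDef type and the mapDefSize() helper are hypothetical names I made up for illustration; they only mirror the C struct above and are not part of the gobpf API.

```go
package main

import (
	"fmt"
	"unsafe"
)

// bufSizeMapNS mirrors BUF_SIZE_MAP_NS from the gobpf bpf_map.h header.
const bufSizeMapNS = 256

// gobpfMapDef is a hypothetical Go mirror of gobpf's C struct bpf_map_def.
// It exists only to check the size arithmetic, it is not a gobpf type.
type gobpfMapDef struct {
	Type       uint32
	KeySize    uint32
	ValueSize  uint32
	MaxEntries uint32
	MapFlags   uint32
	Pinning    uint32
	Namespace  [bufSizeMapNS]byte
}

// mapDefSize returns the in-memory size of the mirrored struct.
func mapDefSize() uintptr {
	return unsafe.Sizeof(gobpfMapDef{})
}

func main() {
	// 6 fields * 4 bytes + 256 bytes of namespace = 280
	fmt.Println("map definition size:", mapDefSize())
}
```

A map declared with the plain kernel-sample bpf_map_def is smaller than this, which is consistent with the size check failing when you keep the original struct.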

With that out of the way, we can now use the gobpf/elf library in Go to load the BPF socket program into the kernel VM and use it. Loading a BPF socket filter type program goes like this:

mod := elf.NewModule("bpf_prog/filter.o")

if err := mod.Load(nil); err != nil {
  panic(err)
}

sf := mod.SocketFilter("socket1")

sock, err := openRawSock(index)
if err != nil {
  panic(err)
}
defer syscall.Close(sock)

if err := elf.AttachSocketFilter(sf, sock); err != nil {
  panic(err)
}

myMap := mod.Map("my_map")

var key uint32 = 6
var value int64

for {
  time.Sleep(1 * time.Second)

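  // LookupElement returns an error, so check it instead of dropping it.
  if err := mod.LookupElement(myMap, unsafe.Pointer(&key), unsafe.Pointer(&value)); err != nil {
    panic(err)
  }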
  fmt.Println("The value is: ", value)
}


So what happens here is:

  • Each ELF file is parsed into a module, a concept introduced by the library developers that holds all the BPF maps and programs the ELF contains. The module is what we later use to get at whatever we need.
  • The Load() method loads the BPF programs into the kernel via the bpf() syscall and sets up the maps found in the ELF.
  • Because we have a socket filter type program, we use the SocketFilter() method to get a socket filter struct which contains (most importantly) the FD of the program loaded into the kernel. We use it in the attach step below, although the actual FD stays abstracted away.
  • We open a raw socket and bind it to the interface on which we want to attach our filter. You can check out that function in the full code file.
  • With the AttachSocketFilter() method we attach the BPF program, through its FD, to the raw socket FD with a setsockopt() syscall.
  • We get the map.
  • We poll the map once per second and print the value. For every packet, the BPF program adds the len field of the skb to the entry for the packet’s key. The key is the protocol field of the IP header, which identifies the next-layer (transport) protocol. Here we read only key 6 (TCP), since it will account for the majority of packets and we are just playing.
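To make the accounting in the last step more concrete, here is a small user-space sketch in Go of what the BPF filter does per packet. It is only a simulation under my own assumptions: account() and the constants are hypothetical names, the frame is a hand-built byte slice rather than a real packet, and, like the BPF program, it does not check that the frame actually carries IPv4.

```go
package main

import "fmt"

const (
	ethHLen          = 14 // ETH_HLEN: length of an Ethernet header
	ipProtocolOffset = 9  // offsetof(struct iphdr, protocol)
)

// account mimics the filter in user space: it reads the IP protocol byte
// and adds the frame length to the counter stored under that key.
func account(counters *[256]int64, frame []byte) {
	off := ethHLen + ipProtocolOffset
	if len(frame) <= off {
		return // too short to contain the protocol byte
	}
	counters[frame[off]] += int64(len(frame))
}

func main() {
	var counters [256]int64 // stands in for the 256-entry BPF array map

	// A fake 60-byte frame whose IP protocol field is 6 (TCP).
	frame := make([]byte, 60)
	frame[ethHLen+ipProtocolOffset] = 6

	account(&counters, frame)
	account(&counters, frame)

	fmt.Println("bytes seen for protocol 6:", counters[6]) // 2 * 60 = 120
}
```

The real program does the same thing with load_byte() and __sync_fetch_and_add(), and the Go loop above simply reads the counter for key 6 back out of the map.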

Full Go code is here

Cilium ebpf library

Cilium ebpf library has no special requirements so we can leave the original BPF program as it is and simply load it.

spec, err := ebpf.LoadCollectionSpecFromReader(bytes.NewReader(program))
if err != nil {
  panic(err)
}

coll, err := ebpf.NewCollection(spec)
if err != nil {
  panic(err)
}
defer coll.Close()

prog := coll.DetachProgram("filter")
if prog == nil {
  panic("no program named filter found")
}
defer prog.Close()

sock, err := openRawSock(index)
if err != nil {
  panic(err)
}
defer syscall.Close(sock)

if err := syscall.SetsockoptInt(sock, syscall.SOL_SOCKET, SO_ATTACH_BPF, prog.FD()); err != nil {
  panic(err)
}

fmt.Printf("Filtering on eth index: %d\n", index)
fmt.Println("Packet stats:")

myMap := coll.DetachMap("my_map")
if myMap == nil {
  panic(fmt.Errorf("no map named my_map found"))
}
defer myMap.Close()

var key uint32 = 6
var value int64

for {

  time.Sleep(time.Second)

  if err := myMap.Lookup(key, &value); err != nil {
    if strings.Contains(err.Error(), "key does not exist") {
      log.Printf("Key does not exist yet !")
    } else {
      panic(err)
    }
  }

  fmt.Printf("Value: %d\n", value)

}

The basic steps are the same as for iovisor/gobpf, but the terminology is different. Here the programs and maps that LoadCollectionSpecFromReader() finds in the ELF are called a collection spec.

So what happens is:

  • By running NewCollection() the programs are loaded into the kernel and we get a collection object, which we then use to get the program FD and the maps.
  • By calling DetachProgram() we get the program object containing the FD.
  • We open the raw socket, bind it to the interface and get the FD.
  • We attach the BPF program FD to the raw socket FD with the setsockopt() syscall.
  • We get the map.
  • We poll the map once per second and print the value for our key.

Full Go code is here

Loading a KPROBE BPF program type

This simple kprobe program will get the filename of any file that has been opened in the system and print it in the trace buffer:

#include <uapi/linux/bpf.h>
#include <uapi/linux/ptrace.h>
#include <linux/version.h>
#include <bpf/bpf_helpers.h>

#define PT_REGS_PARM2(x) ((x)->si)

SEC("kprobe/do_sys_open")
int kprobe__do_sys_open(struct pt_regs *ctx)
{
		char file_name[256];

		bpf_probe_read(file_name, sizeof(file_name), PT_REGS_PARM2(ctx));

		char fmt[] = "file %s\n";
		bpf_trace_printk(fmt, sizeof(fmt), &file_name);

		return 0;
}

char _license[] SEC("license") = "GPL";
u32 _version SEC("version") = LINUX_VERSION_CODE;

For simplicity we don’t use a map this time, but with either library you could use one to store and read values just like in the SOCKET_FILTER example above. The same limitation applies to the iovisor library: if you use a map, you have to use its own map format. After we load the program into the kernel we will be reading the trace ring buffer like this:

cat  /sys/kernel/debug/tracing/trace_pipe

Iovisor gobpf/elf library

We can load it with the following Go program:

package main

import (
	"fmt"
	"time"

	"github.com/iovisor/gobpf/elf"
)

func main() {
	mod := elf.NewModule("bpf_prog/kprobe_example.o")

	err := mod.Load(nil)
	if err != nil {
		panic(err)
	}

	err = mod.EnableKprobes(0)
	if err != nil {
		panic(err)
	}

	for {
		fmt.Println("Waiting...")
		time.Sleep(10 * time.Second)
	}
}

As you can see it is pretty straightforward. The iovisor library takes care of most of the operations needed and all we see is the magic.

Cilium ebpf library

We load it with the following Go program:

package main

import (
	"bytes"
	"fmt"
	"io/ioutil"
	"strconv"
	"strings"
	"time"

	"github.com/cilium/ebpf"
	"golang.org/x/sys/unix"
)

func getTracepointID(eventName string) (uint64, error) {
	data, err := ioutil.ReadFile("/sys/kernel/debug/tracing/events/kprobes/" + eventName + "/id")
	if err != nil {
		// Report the event we actually tried to read, not a hard-coded name.
		return 0, fmt.Errorf("failed to read tracepoint ID for %q: %v", eventName, err)
	}
	tid := strings.TrimSuffix(string(data), "\n")
	return strconv.ParseUint(tid, 10, 64)
}
func createTracepoint(eventName string) error {

	var out = "p:kprobes/" + eventName + "_est123 " + eventName
	fmt.Println("Create event buff:", out)
	if err := ioutil.WriteFile("/sys/kernel/debug/tracing/kprobe_events", []byte(out), 0644); err != nil {
		return err
	}
	return nil
}

var bpfprogramFile string = "bpf_prog/kprobe_example.o"

func main() {

	program, err := ioutil.ReadFile(bpfprogramFile)
	if err != nil {
		panic("Error reading BPF program:" + err.Error())
	}

	spec, err := ebpf.LoadCollectionSpecFromReader(bytes.NewReader(program))
	if err != nil {
		panic(err)
	}

	coll, err := ebpf.NewCollection(spec)
	if err != nil {
		panic(err)
	}

	prog := coll.DetachProgram("kprobe__do_sys_open")
	if prog == nil {
		panic("no program named kprobe__do_sys_open found")
	}
	defer prog.Close()

	fmt.Println("Program file descriptor: ", prog.FD())

	if err := createTracepoint("do_sys_open"); err != nil {
		panic("Cannot create kprobe event: " + err.Error())
	}

	eid, errGetTr := getTracepointID("do_sys_open_est123")
	if errGetTr != nil {
		// Use errGetTr here: err is nil at this point and err.Error() would crash.
		panic("Could not get TracepointID:" + errGetTr.Error())
	}

	attr := unix.PerfEventAttr{
		Type:        unix.PERF_TYPE_TRACEPOINT,
		Config:      eid,
		Sample_type: unix.PERF_SAMPLE_RAW,
		Sample:      1,
		Wakeup:      1,
	}
	efd, err := unix.PerfEventOpen(&attr, -1, 0, -1, unix.PERF_FLAG_FD_CLOEXEC)
	if err != nil {
		panic("Unable to open perf events:" + err.Error())
	}

	if _, _, err := unix.Syscall(unix.SYS_IOCTL, uintptr(efd), unix.PERF_EVENT_IOC_ENABLE, 0); err != 0 {
		panic("Unable to enable perf events:" + err.Error())
	}
	if _, _, err := unix.Syscall(unix.SYS_IOCTL, uintptr(efd), unix.PERF_EVENT_IOC_SET_BPF, uintptr(prog.FD())); err != 0 {
		panic("Unable to attach bpf program to perf events:" + err.Error())
	}
	for {
		fmt.Println("Waiting...")
		time.Sleep(10 * time.Second)
	}
}

As you can see, the magic fades away with the cilium library, as we have to go through more steps to load the program and activate it. What happens here is the following:

  • The program gets loaded with the NewCollection() call.
  • We register the event in /sys/kernel/debug/tracing/kprobe_events with the createTracepoint() function.
  • We get the event ID with getTracepointID(). We need it in order to obtain the event file descriptor.
  • We do the perf_event_open syscall, which returns the event file descriptor.
  • We use the ioctl syscall to enable the event.
  • We use the ioctl syscall to attach the BPF program to the event, using the event FD and the program FD.
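The two tracefs steps boil down to writing one definition line and reading one number back. Here is a minimal sketch of just those string operations, so they can be tried without root. kprobeEventLine() and parseEventID() are hypothetical helpers, not part of the cilium library; the line format is the same "p:kprobes/<event> <symbol>" string that createTracepoint() builds above.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// kprobeEventLine builds the definition written to kprobe_events:
// "p:" for a kprobe, the group/event name, then the kernel symbol.
func kprobeEventLine(event, symbol string) string {
	return "p:kprobes/" + event + " " + symbol
}

// parseEventID converts the content of the event's "id" file
// (a decimal number followed by a newline) into an integer.
func parseEventID(data []byte) (uint64, error) {
	return strconv.ParseUint(strings.TrimSpace(string(data)), 10, 64)
}

func main() {
	fmt.Println(kprobeEventLine("do_sys_open_est123", "do_sys_open"))

	// "1234\n" stands in for what the id file would contain.
	id, err := parseEventID([]byte("1234\n"))
	if err != nil {
		panic(err)
	}
	fmt.Println("event id:", id)
}
```

The resulting ID is what goes into the Config field of unix.PerfEventAttr in the program above.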

In both cases we need an infinite loop at the end of the program to keep the userspace program running. If it is not running, the BPF programs get unloaded from the kernel.

iovisor supported BPF program types

https://github.com/iovisor/gobpf/blob/e6b321d3210387d6a09bde4feba22a09e8c6f4ae/elf/elf.go#L561

cilium supported BPF program types

https://github.com/cilium/ebpf/blob/7acf5cc039f43cc55e927f1c4b2fd161535aad26/elf_reader.go#L577

Conclusions

The cilium library supports loading more BPF program types than its iovisor counterpart. Although the iovisor library provides more abstraction, it does so by relying on CGO, which has a performance cost because a call through CGO is much slower than a native Go call.

The iovisor library also has an interesting and useful feature: when it loads the ELF, it can replace the kernel version in your BPF program’s version section with the version of the kernel it is running on. If you specify the “magic” version 0xFFFFFFFE in your version section, like u32 _version SEC("version") = 0xFFFFFFFE, then when the library reads the ELF and encounters that value it replaces it with the version it finds on the running system. This avoids compiling a BPF program for different kernel versions or recompiling on every minor kernel version change. Of course the BPF developers added the version check for good reason: for a kprobe program, for example, kernel function names may change at any time and break your program, so use this at your own risk. But if we take the chance and assume that they do not change that often, we get a more portable BPF program. This is a cool feature that could also be added to the cilium library.
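As a rough illustration of the magic version idea, here is a sketch in Go. It is not the gobpf implementation: kernelVersionCode() and resolveVersion() are hypothetical helpers, and the encoding assumed is the classic KERNEL_VERSION(a, b, c) layout used for LINUX_VERSION_CODE, (a << 16) + (b << 8) + c.

```go
package main

import "fmt"

// magicCurrentVersion is the "magic" value from the version section
// that asks the loader to substitute the running kernel's version.
const magicCurrentVersion = 0xFFFFFFFE

// kernelVersionCode encodes a kernel version the way the
// KERNEL_VERSION(a, b, c) macro does.
func kernelVersionCode(major, minor, patch uint32) uint32 {
	return major<<16 + minor<<8 + patch
}

// resolveVersion returns the version to hand to the kernel: the running
// kernel's code if the ELF carries the magic value, else the ELF's own.
func resolveVersion(elfVersion, running uint32) uint32 {
	if elfVersion == magicCurrentVersion {
		return running
	}
	return elfVersion
}

func main() {
	running := kernelVersionCode(5, 4, 0) // pretend we run on 5.4.0

	fmt.Printf("magic resolves to %#x\n", resolveVersion(magicCurrentVersion, running))
	fmt.Printf("fixed version stays %#x\n", resolveVersion(kernelVersionCode(4, 15, 18), running))
}
```

A loader for the cilium library could apply the same substitution to the collection spec before calling NewCollection(), which is the kind of addition suggested above.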