Karapace

Your Apache Kafka® essentials in one tool: an open-source implementation of Kafka REST and Schema Registry.

Overview

Karapace supports storing schemas in a central repository, which clients can access to
serialize and deserialize messages. The schemas also maintain their own version histories and can be
checked for compatibility between their different respective versions.
Karapace REST provides a RESTful interface to your Apache Kafka cluster, letting you perform tasks such
as producing and consuming messages and doing administrative cluster work, all while using the
language of the web.

Features

  • Drop-in replacement, on both the client and server side, for pre-existing Schema Registry /
    Kafka REST Proxy setups
  • Moderate memory consumption
  • Asynchronous architecture based on aiohttp
  • Supports Avro, JSON Schema, and Protobuf
  • Leader/Replica architecture for HA and load balancing

Compatibility details

Karapace is compatible with Schema Registry 6.1.1 at the API level and supports all operations in the API.
When a new version of Schema Registry is released, the goal is to support it within a reasonable time.
There are some caveats: schema normalization, and error messages matching Schema Registry exactly,
cannot always be fully guaranteed.

Setup

Using Docker

To get you up and running with the latest build of Karapace, a Docker image is available:

# Fetch the latest build from main branch
docker pull ghcr.io/aiven/karapace:develop

# Fetch the latest release
docker pull ghcr.io/aiven/karapace:latest

An example setup, including configuration and a Kafka connection, is available as a docker-compose file:

docker-compose -f ./container/docker-compose.yml up -d

Then you should be able to reach two sets of endpoints: the Karapace schema registry at http://localhost:8081 and Karapace REST at http://localhost:8082.
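As a quick smoke test, both services can be probed with plain curl (the endpoints are the same ones used in the Quickstart below; the empty-list response assumes a fresh cluster with no subjects yet):

$ curl http://localhost:8081/subjects
[]
$ curl http://localhost:8082/topics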

Configuration

Each configuration key can be overridden with an environment variable prefixed with KARAPACE_,
the exception being configuration keys that themselves start with the string karapace. For example, to
override the bootstrap_uri config value, one would use the environment variable
KARAPACE_BOOTSTRAP_URI. An example configuration file in the repository gives an idea of
what you need to change.
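As a minimal sketch of this convention (bootstrap_uri is the key named above; the host key is an assumption about the config schema, shown only to illustrate the pattern):

# Override the bootstrap_uri config key via the KARAPACE_ prefix
export KARAPACE_BOOTSTRAP_URI=localhost:9092
# Hypothetical: any other non-karapace-prefixed key follows the same pattern
export KARAPACE_HOST=0.0.0.0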

Source install

Alternatively, you can do a source install using:

python setup.py install
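In a typical checkout this would look like the following (repository URL assumed from the container image namespace above):

git clone https://github.com/aiven/karapace.git
cd karapace
python setup.py install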

Quickstart

To register the first version of a schema under the subject “test-key” using an Avro schema:

$ curl -X POST -H "Content-Type: application/vnd.schemaregistry.v1+json" \
  --data '{"schema": "{\"type\": \"record\", \"name\": \"Obj\", \"fields\":[{\"name\": \"age\", \"type\": \"int\"}]}"}' \
  http://localhost:8081/subjects/test-key/versions
{"id":1}

To register a version of a schema using JSON Schema, one needs to use the schemaType property:

$ curl -X POST -H "Content-Type: application/vnd.schemaregistry.v1+json" \
  --data '{"schemaType": "JSON", "schema": "{\"type\": \"object\",\"properties\":{\"age\":{\"type\": \"number\"}},\"additionalProperties\":true}"}' \
  http://localhost:8081/subjects/test-key-json-schema/versions
{"id":2}

To list all subjects (including the one created just above):

$ curl -X GET http://localhost:8081/subjects
["test-key"]

To list all the versions of a given schema (including the one just created above):

$ curl -X GET http://localhost:8081/subjects/test-key/versions
[1]

To fetch back the schema whose global id is 1 (i.e. the one registered above):

$ curl -X GET http://localhost:8081/schemas/ids/1
{"schema":"{"fields":[{"name":"age","type":"int"}],"name":"Obj","type":"record"}"}

To get the specific version 1 of the schema just registered, run:

$ curl -X GET http://localhost:8081/subjects/test-key/versions/1
{"subject":"test-key","version":1,"id":1,"schema":"{"fields":[{"name":"age","type":"int"}],"name":"Obj","type":"record"}"}

To get the latest version of the schema under subject test-key, run:

$ curl -X GET http://localhost:8081/subjects/test-key/versions/latest
{"subject":"test-key","version":1,"id":1,"schema":"{"fields":[{"name":"age","type":"int"}],"name":"Obj","type":"record"}"}

In order to delete version 10 of the schema registered under subject “test-key” (if it exists):

$ curl -X DELETE http://localhost:8081/subjects/test-key/versions/10
10

To delete all versions of the schema registered under subject “test-key”:

$ curl -X DELETE http://localhost:8081/subjects/test-key
[1]

Test the compatibility of a schema with the latest schema under subject “test-key”:

$ curl -X POST -H "Content-Type: application/vnd.schemaregistry.v1+json" \
  --data '{"schema": "{\"type\": \"int\"}"}' \
  http://localhost:8081/compatibility/subjects/test-key/versions/latest
{"is_compatible":true}

Get the current global backwards compatibility setting:

$ curl -X GET http://localhost:8081/config
{"compatibilityLevel":"BACKWARD"}

Change compatibility requirements for all subjects where it’s not
specifically defined otherwise:

$ curl -X PUT -H "Content-Type: application/vnd.schemaregistry.v1+json" \
  --data '{"compatibility": "NONE"}' http://localhost:8081/config
{"compatibility":"NONE"}

Change compatibility requirement to FULL for the test-key subject:

$ curl -X PUT -H "Content-Type: application/vnd.schemaregistry.v1+json" \
  --data '{"compatibility": "FULL"}' http://localhost:8081/config/test-key
{"compatibility":"FULL"}

List topics:

$ curl "http://localhost:8082/topics"

Get info for one particular topic:

$ curl "http://localhost:8082/topics/my_topic"

Produce a message backed by the schema registry:

$ curl -X POST -H "Content-Type: application/vnd.kafka.avro.v2+json" \
  --data '{"value_schema": "{\"namespace\": \"example.avro\", \"type\": \"record\", \"name\": \"simple\", \"fields\": [{\"name\": \"name\", \"type\": \"string\"}]}", "records": [{"value": {"name": "name0"}}]}' \
  http://localhost:8082/topics/my_topic

Create a consumer:

$ curl -X POST -H "Content-Type: application/vnd.kafka.v2+json" -H "Accept: application/vnd.kafka.v2+json" \
  --data '{"name": "my_consumer", "format": "avro", "auto.offset.reset": "earliest"}' \
  http://localhost:8082/consumers/avro_consumers

Subscribe to the topic we previously published to:

$ curl -X POST -H "Content-Type: application/vnd.kafka.v2+json" --data '{"topics":["my_topic"]}' \
  http://localhost:8082/consumers/avro_consumers/instances/my_consumer/subscription

Consume the previously published message:

$ curl -X GET -H "Accept: application/vnd.kafka.avro.v2+json" \
  "http://localhost:8082/consumers/avro_consumers/instances/my_consumer/records?timeout=1000"

Commit offsets for a particular topic partition:

$ curl -X POST -H "Content-Type: application/vnd.kafka.v2+json" --data '{}' \
  http://localhost:8082/consumers/avro_consumers/instances/my_consumer/offsets

Delete consumer:

$ curl -X DELETE -H "Accept: application/vnd.kafka.v2+json" \
  http://localhost:8082/consumers/avro_consumers/instances/my_consumer

Backing up your Karapace

Karapace natively stores its data in a Kafka topic whose name you can configure freely;
by default it is called _schemas.
Karapace includes a tool for backing up and restoring this data. To back up, run:

karapace_schema_backup get --config karapace.config.json --location schemas.log

You can also back up the data using Kafka’s Java console
consumer:

./kafka-console-consumer.sh --bootstrap-server brokerhostname:9092 --topic _schemas --from-beginning --property print.key=true --timeout-ms 1000 1> schemas.log

Restoring Karapace from backup

Your backup can be restored with Karapace by running:

karapace_schema_backup restore --config karapace.config.json --location schemas.log

Alternatively, Kafka’s Java console producer can be used to restore the data
to a new Kafka cluster.
You can restore the data from the previous step by running:

./kafka-console-producer.sh --broker-list brokerhostname:9092 --topic _schemas --property parse.key=true < schemas.log

Performance comparison…