Extending the behavior of third-party Docker images on Kubernetes

What's this blog post about?

Airbyte is an open-source project that orchestrates Docker containers to sync data from various sources to data warehouses and lakes. A sync involves two types of containers: source containers that read data and destination containers that write it. Both implement the Airbyte Protocol, which specifies a command-line interface and the structure of the stdout messages passed between them. When Airbyte first orchestrated third-party containers on Kubernetes, it ran into trouble extending container entrypoints to perform syncs. Because Kubernetes does not allow inspecting Docker entrypoints, Airbyte needed a strategy for identifying the entrypoint when launching a container. The team considered three options: retrieving the entrypoint from the Docker registry APIs, requiring a fixed entrypoint, and setting the entrypoint in an environment variable. Airbyte ultimately chose the third option, requiring connector developers to add an environment variable containing their entrypoint as a string that can be executed with eval. This lets Airbyte wrap Docker entrypoints on the fly and inject complex logic for reading stdin from a network connection, relaying stdout/stderr over the network, and performing other operations.
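As a rough illustration of the environment-variable approach, the sketch below builds a Kubernetes container command that overrides the image's entrypoint with a small shell wrapper. This is not Airbyte's actual implementation. The variable name CONNECTOR_ENTRYPOINT, the pipe path /pipes/stdout, the image name, and both function names are hypothetical, chosen only for this example. It assumes the connector image sets the variable itself (for instance with an ENV line in its Dockerfile) and that a shell is available in the image.

```python
import shlex

# Hypothetical variable name; the connector image is assumed to define it,
# e.g. with `ENV CONNECTOR_ENTRYPOINT="python /app/main.py"` in its Dockerfile.
ENTRYPOINT_VAR = "CONNECTOR_ENTRYPOINT"


def build_wrapped_command(connector_args, stdout_pipe="/pipes/stdout"):
    """Build a command list that replaces the image entrypoint with a wrapper.

    The wrapper evals the entrypoint string stored in the environment variable
    and redirects its stdout to a pipe (a hypothetical path) that another
    process could relay over the network.
    """
    # After the shell's double-quote processing, eval receives the string
    #   <entrypoint> "$@"
    # so the connector arguments are passed as positional parameters and are
    # never re-parsed as shell syntax.
    script = f'eval "${ENTRYPOINT_VAR} \\"\\$@\\"" > {shlex.quote(stdout_pipe)}'
    # The "sh" after the script fills $0; the remaining items become "$@".
    return ["sh", "-c", script, "sh", *connector_args]


def build_container_spec(name, image, connector_args):
    """Return a minimal container entry for a pod spec, as a plain dict.

    In Kubernetes, the `command` field overrides the image's ENTRYPOINT.
    """
    return {
        "name": name,
        "image": image,
        "command": build_wrapped_command(connector_args),
    }


if __name__ == "__main__":
    spec = build_container_spec(
        "source",
        "example/source-connector:dev",  # hypothetical image name
        ["read", "--config", "/config/source.json"],
    )
    print(spec["command"])
```

Passing the connector arguments as positional parameters, rather than splicing them into the eval string, keeps arguments that contain spaces or shell metacharacters intact. Only the entrypoint string supplied by the image is evaluated as shell code.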

Company
Airbyte

Date published
Dec. 7, 2021

Author(s)
Jared Rhizor

Word count
1266

Language
English

Hacker News points
None found.


By Matt Makai. 2021-2024.