Extending the behavior of third-party Docker images on Kubernetes
Airbyte is an open-source project that orchestrates Docker containers to sync data from various sources to data warehouses and lakes. A sync involves two types of containers: source containers that read data and destination containers that write it. Both implement the Airbyte Protocol, which specifies a command-line interface and the structure of the stdout messages passed between them. When Airbyte first orchestrated third-party containers on Kubernetes, it ran into trouble extending container entrypoints to perform syncs. Because Kubernetes does not allow inspecting Docker entrypoints, Airbyte needed a way to identify the entrypoint when launching a container. The team considered three options: retrieving the entrypoint from the Docker registry APIs, requiring a fixed entrypoint, and setting the entrypoint in an environment variable. Airbyte ultimately chose the third option: developers implementing connectors must add an environment variable containing their entrypoint as a string that can be executed with eval. This lets Airbyte wrap Docker entrypoints on the fly and inject more complex logic, such as reading stdin from a network connection, relaying stdout and stderr over the network, and performing other operations.
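The environment-variable approach can be illustrated with a minimal sketch. It is not Airbyte's actual wrapper: the variable name `CONNECTOR_ENTRYPOINT`, the function `run_wrapped`, and the log line are hypothetical, and a local `sh` process stands in for the container. The wrapper script never needs to know the image's real entrypoint; it evaluates whatever string the image author stored in the variable, with its own logic placed around it.

```python
import os
import subprocess

# Hypothetical variable name; the image author would set it to the
# command the image normally runs as its entrypoint.
ENTRYPOINT_VAR = "CONNECTOR_ENTRYPOINT"

# Shell wrapper: run injected logic first, then eval the entrypoint
# string with the caller's arguments appended.
WRAPPER = (
    'echo "wrapper: launching entrypoint" >&2; '
    'eval "$' + ENTRYPOINT_VAR + '" "$@"'
)


def run_wrapped(entrypoint, args, stdin_data=""):
    """Run `entrypoint` through the wrapper and return (code, stdout, stderr).

    In a real container the variable would already be baked into the
    image; here it is set explicitly to simulate that.
    """
    env = dict(os.environ)
    env[ENTRYPOINT_VAR] = entrypoint
    result = subprocess.run(
        # "wrapper" becomes $0 inside the script; args become $1, $2, ...
        ["sh", "-c", WRAPPER, "wrapper", *args],
        input=stdin_data,
        capture_output=True,
        text=True,
        env=env,
    )
    return result.returncode, result.stdout, result.stderr


if __name__ == "__main__":
    code, out, err = run_wrapped("tr a-z A-Z", [], "hello\n")
    print(code, repr(out), repr(err))
```

Note that `eval` re-splits its arguments on whitespace, so arguments containing spaces would need extra quoting in a production wrapper. The `echo` line marks the spot where logic such as wiring stdin or stdout to a network connection could be injected.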
Company: Airbyte
Date published: Dec. 7, 2021
Author(s): Jared Rhizor
Word count: 1266
Language: English
Hacker News points: None found.