
Production Docker Image for Apache Airflow

  • You want to deploy Airflow using container images
  • You want to contribute to Airflow in the DevOps area
  • You want to learn about best practices for using Airflow containers
  • You are a curious person who wants to learn something new
  • Standard unit of software
  • Packages code and its dependencies
  • Lightweight execution package of software
  • Container images - binary packages

FROM ubuntu:18.04
COPY . /app
RUN make /app && make install
WORKDIR /bin/project
ENTRYPOINT ["/bin/project"]
CMD ["--help"]
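The ENTRYPOINT and CMD lines in the Dockerfile above work together: with an exec-form ENTRYPOINT, CMD only supplies default arguments, and any arguments passed to `docker run` replace it. The sketch below models that rule in plain shell so it can be tried without Docker; `effective_command` is a hypothetical helper written for this illustration, not a Docker command.

```shell
#!/bin/sh
# Model of how Docker combines an exec-form ENTRYPOINT with CMD.
# Usage: effective_command ENTRYPOINT DEFAULT_CMD [run arguments...]
# Prints the command line the container would end up executing.
effective_command() {
    entrypoint=$1
    default_cmd=$2
    shift 2
    if [ "$#" -gt 0 ]; then
        # Arguments given after the image name in `docker run` replace CMD.
        printf '%s %s\n' "$entrypoint" "$*"
    else
        # With no run arguments, CMD provides the default arguments.
        printf '%s %s\n' "$entrypoint" "$default_cmd"
    fi
}

# Equivalent of: docker run IMAGE
effective_command /bin/project --help
# Equivalent of: docker run IMAGE --version
effective_command /bin/project --help --version
```

The first call prints `/bin/project --help` and the second prints `/bin/project --version`, which is why a default of `--help` is a common choice for CMD.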

  • Specify base image
  • Run commands
  • Copy files
  • Set working directory
  • Define entrypoint
  • Define default command
  • Predictable, consistent development & test environment
  • Predictable, consistent execution environment
  • Lightweight but isolated: each container gets its own sandboxed view of the OS
  • Build once: run anywhere
  • Kubernetes runs containers natively
  • Bridge: “Development → Operations”
  • Builds an optimised image
  • Highly customizable (ARGs)
  • Multi-segmented (build + main)
Usage

docker build . -t yourcompany/airflow:1.10.11-BUILD_ID

FROM apache/airflow:1.10.11

# change to root user temporarily
USER root

RUN apt-get update \
    && apt-get install -y --no-install-recommends \
    emacs \
    && apt-get autoremove -yqq --purge \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*
    
# Change back to the airflow user
USER airflow

# Add extra dependencies
RUN pip install --user numpy

# Embed DAGs (Optionally) - DAGs can be baked in but also
# they can be git-synced or mounted from shared volume
COPY --chown=airflow:root dags-folder ${AIRFLOW_HOME}/dags/

Pros

  • Uses released images
  • Simple build command
  • Own Dockerfile
  • No need for Airflow sources

Cons

  • Potentially bigger size
  • Predefined extras only
  • Installs a limited set of Python dependencies
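The build step for an extended image can be scripted so that every build gets a unique tag. The following is a minimal sketch, assuming the build identifier comes from a `BUILD_ID` environment variable set by a CI system; `image_tag` and `build_command` are hypothetical helpers, and the script only prints the commands instead of running them.

```shell
#!/bin/sh
# Compose an image tag of the form REPOSITORY:AIRFLOW_VERSION-BUILD_ID.
# Usage: image_tag REPOSITORY AIRFLOW_VERSION BUILD_ID
image_tag() {
    printf '%s:%s-%s\n' "$1" "$2" "$3"
}

# Print (do not run) the docker build invocation for a given tag.
# Usage: build_command TAG
build_command() {
    printf 'docker build . -t %s\n' "$1"
}

# BUILD_ID is assumed to be provided by CI; 42 is only a placeholder default.
tag=$(image_tag yourcompany/airflow 1.10.11 "${BUILD_ID:-42}")
build_command "$tag"
```

With `BUILD_ID` unset this prints `docker build . -t yourcompany/airflow:1.10.11-42`. Piping the output to `sh` from the directory that holds your Dockerfile would run the build.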

git clone git@github.com:apache/airflow.git

cd airflow

git checkout v1-10-stable

docker build .

  • Installs Airflow from PyPI (== 1.10.11)
  • Additional Airflow extras, dev, runtime deps …
  • Does not use local sources (can be run from master including entrypoint)
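Customization of this build goes through Dockerfile ARGs, which are set with the `--build-arg` flag of `docker build`. The sketch below assembles such a command from `NAME=VALUE` pairs and prints it without running it. `build_args_command` is a hypothetical helper, and the two ARG names in the usage example are assumptions; check them against the Dockerfile in the branch you checked out before relying on them.

```shell
#!/bin/sh
# Print a `docker build` command with one --build-arg per NAME=VALUE pair.
# Values are assumed to contain no whitespace; quote them yourself otherwise.
# Usage: build_args_command NAME=VALUE [NAME=VALUE...]
build_args_command() {
    cmd="docker build ."
    for pair in "$@"; do
        cmd="$cmd --build-arg $pair"
    done
    printf '%s\n' "$cmd"
}

# ARG names below are assumptions for illustration; verify them in the Dockerfile.
build_args_command \
    PYTHON_BASE_IMAGE=python:3.7-slim-buster \
    ADDITIONAL_PYTHON_DEPS=numpy
```

Printing the command first makes it easy to review the chosen arguments before starting a long image build.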


  • Breeze - development and test environment
  • Supports building production image
  • Auto-complete of options
  • New Breeze video showing building production images:
  • ./breeze build-image --help

See BREEZE.rst in the Airflow repo


  • Docker and Docker-Compose - not recommended for production
  • Managed Container Services
    • Amazon ECS, Google Container on VMs, Azure Container Instances
  • Kubernetes on-prem:
    • Airflow Operator (not recommended yet)
  • Managed Kubernetes: Amazon EKS, Google GKE, Azure AKS
  • OpenShift (also Kubernetes)

  • Last modified: 2021/01/28 00:07
  • Author: 127.0.0.1