Understanding Airflow XComs: Passing Data Between Tasks
We all know Airflow is a GOAT-ed framework in the data industry! Love it or hate it, it is one of the most widely used orchestrators, adopted by companies like Salesforce, Apple, Reddit, etc. As a Data Engineer, understanding how it works conceptually and technically is extremely important for building data pipelines, and XComs are a crucial part of that.
What are XComs?
XComs, short for cross-communications, are a way to pass data between tasks in an Airflow DAG. Don’t get the wrong idea from the word “data”, though: it doesn’t mean the data that you extract from source systems. It goes without saying, but let me repeat: Airflow is not an ETL tool; it is an orchestrator that schedules your pipelines. Yes, cron can do that too, but it struggles once you have to schedule and maintain hundreds of pipelines. Coming back to XComs: one task pushes a small value to the XCom store, which by default lives in Airflow’s own metadata database (not a separate one), and another task retrieves it by pulling it.
Real World Example
Let’s go over a simple pipeline; imagine you are extracting data from Postgres to S3, transforming it, and storing it in Snowflake for downstream analytics. You are using Airflow and different tasks to set up the workflow. There are 3 tasks in total:
Task 1 (ingest_data_from_postgres): extracts the data and uploads it to S3, then does an XCom push of the file’s S3 path.
Task 2 (transform_data): since this is a separate task, it first does an XCom pull to get the file’s path, then reads the file and applies the transformation and business logic.
Task 3 (load_data_to_snowflake): once the transformation is done, loads the result into Snowflake.
How to use XComs?
Here’s how you can do XCom Push and Pull:
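Let’s start with the push side. This is a minimal sketch, assuming a PythonOperator-style callable that receives the task instance as `ti`. The key name `s3_file_path`, the bucket path, and the `FakeTaskInstance` class are made up for illustration so the function can be tried without a running scheduler; the only real Airflow call here is `ti.xcom_push(key=..., value=...)`.

```python
class FakeTaskInstance:
    """Hypothetical in-memory stand-in for Airflow's TaskInstance.

    Not part of Airflow; it only lets us run the callable locally.
    """

    def __init__(self):
        self.store = {}

    def xcom_push(self, key, value):
        self.store[key] = value


def ingest_data_from_postgres(ti):
    # Assumed path; in a real task it would come from the upload step.
    s3_file_path = "s3://example-bucket/raw/orders.csv"
    # ... extract from Postgres and upload to S3 here ...
    # Push only the small reference (the path), never the dataset itself.
    ti.xcom_push(key="s3_file_path", value=s3_file_path)


fake_ti = FakeTaskInstance()
ingest_data_from_postgres(fake_ti)
print(fake_ti.store)
```

In an actual DAG you would pass `ingest_data_from_postgres` as the `python_callable` of a PythonOperator, and Airflow would supply the real `ti`.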
XComs store data as key-value pairs, and you pull the value associated with a key. In the push step, the value we push is the path of the file. Let’s look at how we can pull it:
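A rough sketch of the pull side, under the same assumptions: the callable receives the task instance as `ti`, the key is the hypothetical `s3_file_path`, and `FakeTaskInstance` is again a made-up stand-in so the example runs locally. The real Airflow call is `ti.xcom_pull(task_ids=..., key=...)`.

```python
class FakeTaskInstance:
    """Hypothetical stand-in for Airflow's TaskInstance (not part of Airflow)."""

    def __init__(self, xcoms):
        self.xcoms = xcoms  # maps (task_id, key) -> value

    def xcom_pull(self, task_ids, key):
        return self.xcoms.get((task_ids, key))


def transform_data(ti):
    # Pull the path that the upstream task pushed under this key.
    s3_file_path = ti.xcom_pull(
        task_ids="ingest_data_from_postgres", key="s3_file_path"
    )
    # ... read the file and apply business logic here ...
    # Assumed naming convention for the transformed output.
    transformed_path = s3_file_path.replace("/raw/", "/transformed/")
    return transformed_path


fake_ti = FakeTaskInstance(
    {("ingest_data_from_postgres", "s3_file_path"): "s3://example-bucket/raw/orders.csv"}
)
print(transform_data(fake_ti))
```

One handy detail: with a PythonOperator, whatever the callable returns is itself pushed as an XCom under the default key `return_value`, so the load task could pull the transformed path the same way.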
Common Use-Cases
Passing metadata between tasks
Triggering tasks dynamically
Sharing information across different operators
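To make the “triggering tasks dynamically” case a bit more concrete, here is a small sketch of a branching callable, the kind you could hand to a BranchPythonOperator, which runs whichever task_id the callable returns. The `row_count` key and the `skip_transform` task are hypothetical, and `FakeTaskInstance` is a made-up stand-in for trying the logic locally.

```python
class FakeTaskInstance:
    """Hypothetical stand-in for Airflow's TaskInstance (not part of Airflow)."""

    def __init__(self, xcoms):
        self.xcoms = xcoms  # maps (task_id, key) -> value

    def xcom_pull(self, task_ids, key):
        return self.xcoms.get((task_ids, key))


def choose_next_task(ti):
    # Pull a small piece of metadata pushed by the upstream task.
    row_count = ti.xcom_pull(task_ids="ingest_data_from_postgres", key="row_count")
    # Return the task_id to run next; a missing or zero count means nothing to do.
    if row_count:
        return "transform_data"
    return "skip_transform"


with_rows = FakeTaskInstance({("ingest_data_from_postgres", "row_count"): 120})
no_rows = FakeTaskInstance({("ingest_data_from_postgres", "row_count"): 0})
print(choose_next_task(with_rows), choose_next_task(no_rows))
```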
XComs are a powerful mechanism in Airflow, but they must be used wisely: to pass small pieces of data, coordinate workflows, and enable dynamic behavior, not to move the datasets themselves.



