Access data in your data warehouse with Data Pipeline

Sync your Stripe account with Snowflake, Amazon Redshift, Google Cloud Storage, and other data storage services.

Data Pipeline is a no-code product that sends all your Stripe data to a variety of data storage destinations. This allows you to centralize your Stripe data with other business data to help close your books and get more detailed business insights.

With Data Pipeline, you can:

- Automatically export your complete Stripe data quickly and reliably.
- Stop relying on third-party extract, transform, and load (ETL) pipelines or home-built API integrations.
- Combine data from all your Stripe accounts into one data warehouse.
- Integrate Stripe data with your other business data for more complete business insights.

Caution

Because of data localization requirements, Stripe doesn’t offer Data Pipeline services to customers, businesses, or users in India.

Note

If you have questions regarding support for your data destination, let us know at [email protected].

Destination support

Stripe Data Pipeline supports two types of destinations: data warehouses and cloud storage.

Data warehouses (Snowflake, Amazon Redshift)
- For data warehouse destinations, Stripe sends a data share to your data warehouse.
- After you accept the data share, you can access your core Stripe data in Snowflake or Amazon Redshift within 12 hours.
- After the initial load, your Stripe data refreshes regularly, delivering an incremental or full load of data every 3 hours.

Cloud storage (Google Cloud Storage, Azure Blob Storage)
- For cloud storage destinations, Stripe sends Parquet files directly to a cloud storage location you own.
- After the initial load, your Stripe data refreshes regularly, delivering a new full load of your data every 6 hours.

Database schemas

Your warehouse data is split into two database schemas based on the API mode used to create the data.

Schema name	Description
`STRIPE`	Data populated from live mode
`STRIPE_TESTMODE`	Data populated from test mode
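
As an illustration of how the two schemas keep live and test data apart, here is a minimal sketch that uses an in-memory SQLite database as a stand-in for your warehouse. The attached databases play the role of the `STRIPE` and `STRIPE_TESTMODE` schemas. The table contents, the IDs, and the `total_amount` helper are assumptions made for this example, not real Stripe data or a Stripe API.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the warehouse. Each attached
# database plays the role of one schema (STRIPE or STRIPE_TESTMODE).
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS stripe")
conn.execute("ATTACH DATABASE ':memory:' AS stripe_testmode")

ALLOWED_SCHEMAS = ("stripe", "stripe_testmode")

# Create the same (simplified, hypothetical) table in both schemas.
for schema in ALLOWED_SCHEMAS:
    conn.execute(
        f"CREATE TABLE {schema}.balance_transactions (id TEXT, amount INTEGER)"
    )

# Hypothetical rows: one live mode transaction, one test mode transaction.
conn.execute(
    "INSERT INTO stripe.balance_transactions VALUES ('bt_live_example', 500)"
)
conn.execute(
    "INSERT INTO stripe_testmode.balance_transactions VALUES ('bt_test_example', 100)"
)


def total_amount(schema: str) -> int:
    """Sum balance transaction amounts in one schema (hypothetical helper)."""
    # The schema name is interpolated into the SQL text, so only accept
    # the two known schema names.
    if schema not in ALLOWED_SCHEMAS:
        raise ValueError(f"unknown schema: {schema}")
    row = conn.execute(
        f"SELECT COALESCE(SUM(amount), 0) FROM {schema}.balance_transactions"
    ).fetchone()
    return row[0]


live_total = total_amount("stripe")
test_total = total_amount("stripe_testmode")
```

Qualifying every table with its schema name, as the queries above do, helps keep test mode rows out of reports that are meant to cover live mode only.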

Multiple Stripe accounts with the same data warehouse

If you share data from multiple Stripe accounts with the same data warehouse, you can still tell the accounts apart. Every table has a merchant_id column, which allows you to filter the data by account.
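
The following sketch shows one way to use the merchant_id column, both to filter rows to a single account and to aggregate per account. It again uses an in-memory SQLite database as a stand-in for your warehouse. The account IDs, transaction IDs, and amounts are hypothetical values invented for this example.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the warehouse, with a
# simplified balance_transactions table that includes merchant_id.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE balance_transactions "
    "(id TEXT, merchant_id TEXT, amount INTEGER, fee INTEGER)"
)

# Hypothetical rows from two Stripe accounts sharing one warehouse.
conn.executemany(
    "INSERT INTO balance_transactions VALUES (?, ?, ?, ?)",
    [
        ("bt_example_1", "acct_example_A", 500, 50),
        ("bt_example_2", "acct_example_A", 1200, 65),
        ("bt_example_3", "acct_example_B", 800, 53),
    ],
)

# Filter to a single account with a parameterized WHERE clause.
rows_a = conn.execute(
    "SELECT id, amount FROM balance_transactions "
    "WHERE merchant_id = ? ORDER BY id",
    ("acct_example_A",),
).fetchall()

# Aggregate per account with GROUP BY merchant_id.
totals = dict(
    conn.execute(
        "SELECT merchant_id, SUM(amount) FROM balance_transactions "
        "GROUP BY merchant_id"
    ).fetchall()
)
```

The same WHERE and GROUP BY clauses should carry over to Snowflake or Amazon Redshift, although the SQL dialects differ in other details.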

Example use case

In some cases, you might want to combine information from your proprietary data with Stripe data. The following example shows an orders table that a company uses to record data about each order:

date	order_no	stripe_txn_no	customer_name	price	items
11/2/2024	1	bt_xcVXgHcBfi83m94	John Smith	5	1 book

The table above doesn’t contain data regarding transaction fees or payouts because that data exists solely within Stripe. In Stripe, the balance_transactions table contains the following information, but lacks proprietary data regarding customer names and items purchased:

id	amount	available_on	fee	net	automatic_transfer_id
bt_xcVXgHcBfi83m94	500	11/2/2024	50	450	po_rC4ocAkjGy8zl3j

To access your proprietary data alongside your Stripe data, combine the orders table with Stripe’s balance_transactions table:

-- Match each order to its Stripe balance transaction by ID.
select
  orders.date,
  orders.order_no,
  orders.stripe_txn_no,
  bts.amount,
  bts.fee,
  bts.automatic_transfer_id
from mycompany.orders
join stripe.balance_transactions bts
  on orders.stripe_txn_no = bts.id;

After the query completes, the following information is available:

date	order_no	stripe_txn_no	amount	fee	automatic_transfer_id
11/2/2024	1	bt_xcVXgHcBfi83m94	500	50	po_rC4ocAkjGy8zl3j
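
To try the join locally, the sketch below loads the two example rows into an in-memory SQLite database (a stand-in for your warehouse) and runs the same join. It also adds a LEFT JOIN that lists orders with no matching balance transaction, which can help when reconciling. The second order (order_no 2, with the made-up ID bt_not_yet_synced) is hypothetical and exists only to show an unmatched row.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the warehouse. Attached
# databases play the role of the mycompany and stripe schemas.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS mycompany")
conn.execute("ATTACH DATABASE ':memory:' AS stripe")

conn.execute(
    "CREATE TABLE mycompany.orders (date TEXT, order_no INTEGER, "
    "stripe_txn_no TEXT, customer_name TEXT, price INTEGER, items TEXT)"
)
conn.execute(
    "CREATE TABLE stripe.balance_transactions (id TEXT, amount INTEGER, "
    "available_on TEXT, fee INTEGER, net INTEGER, automatic_transfer_id TEXT)"
)

# Rows taken from the example tables above.
conn.execute(
    "INSERT INTO mycompany.orders VALUES "
    "('11/2/2024', 1, 'bt_xcVXgHcBfi83m94', 'John Smith', 5, '1 book')"
)
conn.execute(
    "INSERT INTO stripe.balance_transactions VALUES "
    "('bt_xcVXgHcBfi83m94', 500, '11/2/2024', 50, 450, 'po_rC4ocAkjGy8zl3j')"
)
# Hypothetical order with no matching balance transaction.
conn.execute(
    "INSERT INTO mycompany.orders VALUES "
    "('11/3/2024', 2, 'bt_not_yet_synced', 'Jane Doe', 7, '1 pen')"
)

# Inner join: only orders that have a matching balance transaction.
joined = conn.execute(
    """
    SELECT orders.date, orders.order_no, orders.stripe_txn_no,
           bts.amount, bts.fee, bts.automatic_transfer_id
    FROM mycompany.orders AS orders
    JOIN stripe.balance_transactions AS bts
      ON orders.stripe_txn_no = bts.id
    """
).fetchall()

# Left join: orders whose transaction ID has no match in the Stripe data.
unmatched = conn.execute(
    """
    SELECT orders.order_no
    FROM mycompany.orders AS orders
    LEFT JOIN stripe.balance_transactions AS bts
      ON orders.stripe_txn_no = bts.id
    WHERE bts.id IS NULL
    """
).fetchall()
```

Here `joined` holds the single row shown in the result table, and `unmatched` holds only the hypothetical order 2.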

Datasets

You can see a list of available datasets under Datasets in the schema documentation page in the Dashboard. Available datasets might vary by region, subject to local product availability and regulations. Data Pipeline separately shares each dataset, which contains one or more warehouse tables, as data becomes available. Data Pipeline updates some tables on different schedules based on the availability of new data. See data freshness for more information on available datasets and refresh schedules.

Email notifications

You can subscribe to email notifications for critical updates in the Dashboard.

Turn off Data Pipeline

You can turn off Data Pipeline in the Dashboard settings page by clicking Turn off. After you turn it off, you lose access to your data share immediately.