Download your Stripe account to a SQL database
Table Dog (tdog) is a CLI tool that downloads your Stripe account into a SQL database (MySQL, Postgres or SQLite). On the first run it will download all objects. After that, events are applied to bring your database up to date.
The CLI can be run once to get all data up to the present moment, or it can run as a daemon in the background to poll Stripe’s /events URL and keep your database continuously up to date (less than 1 second behind).
Self-contained, portable, performant
- Uses less than 20MB of RAM.
- Compiled to machine code using memory safe Rust.
- Runs natively on Linux, Mac and Windows.
- Docker image provided.
- No runtime dependencies.
Comparison grid
Table Dog | Sigma | Misc Cloud Sync Service | |
Avoids unencrypted data passing through third party services. |
Runs on-premises No external access by third parties |
Data remains inside of Stripes network | Plain text data passes through possibly-insecure black box server, shared with other customers data |
Updated in real time 100% of the dataset is in RAM next to your code for fast custom queries |
Daily only Must export the whole dataset everyday |
Schedule used (E.g. "hourly") | |
Can handle 2M+ writes per month without delay | Efficient native machine code runs next to your database server | Batch process once a day |
No dedicated resources (shared multi-tenant server). High network latency for each write between sync server and your remote DB |
Exact process/schema can be archived/vendored into version control | tdog is a single binary file | Cannot pin older version | Cannot pin older version |
Supports the database engine you currently use Allows leveraging your existing SQL tools (libraries, ORM's, GUI's, admin processes) and developer skill |
Cannot reuse SQL tools (custom SQL dialect) | ||
Snapshots |
SQL dump Incremental disk based snapshots |
Latest dataset only, no export |
SQL dump Incremental disk based snapshots |
No DB server needed |
SQLite can handle billions of rows. Only needs a local file system |
Managed (custom SQL dialect) | Must provide internet accessible IP/user/pass of your database server (insecure) |
Fast first download Avoid the global Stripe account rate limit lock which could cause down time for your current systems Note: A full download is only required on the first run |
Set a custom concurrent requests per second Optionally abort the entire download when a HTTP 429 is encountered |
Wait a day |
Default low concurrency setting No abort on 429 Could take days for large accounts |
Web UI, Shareable query URL's |
Datasette | Just data transfer to your SQL server |
Datasets, not request/responses.
Having your complete Stripe account in a SQL database allows you to think in datasets instead of requests and responses.
- Join with your existing business data.
- Avoid the N+1 problem with efficient SQL joins, or by using SQLite to avoid network latency.
- Use your language’s SQL API library and tools instead of the Stripe API (writes will still be done via the Stripe API directly).
- Use the decades of optimizations of your SQL engines query planner to efficiently get a result set to your code.
How tdog can save you time
The tdog CLI can help you with the following tasks:
- Reporting
- SQL was designed for business reports. Having your dataset in one place allows you to iteratively design SQL queries in a fast feedback loop resulting in less developer time spent.
- You can snapshot your Stripe account at a certain point in time with database snapshots; the HTTP API does not support read transactions so you can never get the same result when you run the same report again.
- Many analytics tools can import data from a SQL database.
- JSON API
- For desktop, mobile and web clients that show a customer billing summary, you can use multiple SQL queries to create a single JSON response.
- Many ORM-like tools can generate code models and API's from an existing SQL schema.
- Reactive functions (via polling)
- Reactive functions are functions that are run in response to an event or a specific state. E.g. "When X happens run function Y". SQL can be used to determine X and run Y reactively in response.
- Polling a SQL query allows you to select a very specific result set that can be some combination of multiple Stripe objects and your own business data. DB transactions can be used to ensure events are processed by a single worker process.
- It is easier to move your Stripe data to another system from a static SQL dataset.
- You can browse the tables and understand the data from the list of values, as opposed to studying the Stripe docs.
- The tdog SQL dataset could be an intermediate cache before pushing data to Redis, Kafka or other cloud systems.
Secure
- Your unencrypted data is never routed through third party servers (like most existing cloud-based solutions).
- No need to depend on third party web services being reliable, secure and operational in the future.
- The CLI can run next to your database server which results in higher throughput due to the minimal network latency between the CLI and the DB server.
- The CLI process can be sandboxed, and only needs read-only API access to Stripe, and LAN access to the database (or a disk to write to for SQLite).
Avoid having a public server for Stripe webhooks
To keep your database up to date Stripe gives you two options: a public HTTP endpoint for webhooks, or poll the /events URL.
Tdog polls the /events URL and applies the events to your database, so you can just query for the state you need without having either a publicly accessible web server, or a long running process to poll the /events URL.
No database needed
You do not need a database server to leverage SQL – tdog can write directly to an SQLite file. Use your languages existing SQLite API to run SQL queries against the database file.
Alternatives
- Stripe Sigma
- Sigma does not include the last 24 hours, which removes the option for reactive functions or hourly reports.
- It has its own SQL language dialect, which means you cannot use your existing knowledge or tooling for your preferred SQL engine.
- It is remote so your queries will have added network latency – there is no way to efficiently move the dataset into RAM for custom code queries.
- There are no snapshots, so you are unable snapshot the inputs (the entire database) to your queries to reproduce the outputs.
- It is expensive for high throughput systems.
- Many cloud-based solutions.
- Your data is always unencrypted as it passes through cloud-based systems. Your database must be reachable from the public internet, and your credentials shared with the cloud service. If their system is hacked, goes down, or their business closes your business will be negatively affected too. For optimal security database servers should not be reachable from the public internet.
- Network latency between their server and your database may be high. Writes may only be applied hourly which does not give you the option for executing time sensitive reactive functions.
- DIY.
- There is a developer time cost to developing and maintaining your own code, but if you only need very basic reports this could be preferable.
Contact
Table Dog is designed and developed by Enzo from England. Say Hello: