We’re evaluating Stitch for our BI infrastructure. We’ve currently set up only one integration, from Mixpanel to Postgres, and we’re wondering why we’re seeing a different count of rows exported on the Stitch dashboard vs. the row count of the resulting tables in Postgres. I’m assuming one Stitch row is equivalent to one DB table row, right? Could it be that rows were replicated multiple times due to, e.g., historical runs? I’d appreciate any clarification on this.
Also, I’m seeing the following errors in the destination table
table name mixpanel_funnels__steps__selector_params__property_filter_params_list too long (69)
table name mixpanel_funnels__steps__selector_params__property_filter_params_list__filter__operand too long (86)
There are two things happening here - let’s break it down:
Source rows ≠ Stitch rows ≠ PostgreSQL destination rows
To be blunt, the row totals you see in Stitch will almost never equal what’s in your source or in your destination. There are a few reasons for this, the main one being that row totals in Stitch are cumulative across all replication jobs. Other factors such as an integration’s replication schedule, replication methods used by tables, the structure of the data, and the destination you’re using can all impact row usage.
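To make the cumulative-counting point concrete, here’s a minimal sketch (hypothetical data, not Stitch’s actual internals): every row replicated in every job adds to Stitch’s running total, while the destination upserts on primary key, so re-replicated rows don’t create new table rows.

```python
# Hypothetical sketch: why Stitch's cumulative row total exceeds the
# destination table's row count. Stitch counts every row it replicates
# across every job; the destination upserts on primary key.

stitch_rows_replicated = 0
destination = {}  # simulates a destination table keyed on primary key

jobs = [
    [(1, "a"), (2, "b")],                     # initial job
    [(2, "b-updated"), (3, "c")],             # incremental job re-sends row 2
    [(1, "a"), (2, "b-updated"), (3, "c")],   # historical (full) re-sync
]

for job in jobs:
    for pk, value in job:
        stitch_rows_replicated += 1  # counted toward Stitch's total
        destination[pk] = value      # upsert: insert or overwrite

print(stitch_rows_replicated)  # 7 rows counted by Stitch
print(len(destination))        # 3 rows actually in the table
```

The same three logical rows produce a Stitch total of seven because two jobs re-replicated data the destination already had.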
For your destination (PostgreSQL), the replicated row total will typically be higher than the source record count because nested data is de-nested into subtables, and Mixpanel’s data is heavily nested.
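Here’s a simplified illustration of de-nesting (the field names and key column are hypothetical, not Stitch’s exact output): one source record containing a nested list becomes one parent row plus one child-table row per list element, so destination rows outnumber source records.

```python
# Hypothetical sketch of de-nesting: a single nested source record is
# split into a parent row plus a subtable row per nested element.

source_record = {
    "funnel_id": 42,
    "name": "signup",
    "steps": [  # nested list -> split out into a subtable like funnels__steps
        {"event": "page_view"},
        {"event": "form_submit"},
        {"event": "account_created"},
    ],
}

# Parent row: the record minus its nested list
parent_row = {k: v for k, v in source_record.items() if k != "steps"}

# Child rows: one per nested element, linked back to the parent
child_rows = [
    {"_source_key_funnel_id": source_record["funnel_id"], **step}
    for step in source_record["steps"]
]

print(1 + len(child_rows))  # 1 source record -> 4 destination rows
```

One Mixpanel funnel record with three steps lands as four rows across two tables, which is why the destination count outruns the source count.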
Additionally, your question is very timely - I’m the documentation manager for Stitch and I’ve been working on some updates to better explain this exact issue. You can take a look at a preview of those changes here. I hope the visual example in that section helps explain things a bit more.
Unfortunately, these errors in _sdc_rejected are a known issue. PostgreSQL limits identifiers, including table names, to 63 bytes by default; these subtable names are 69 and 86 characters, respectively. Because the names exceed PostgreSQL’s limit, Stitch is unable to create the tables and load the data.
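You can verify the limit yourself. This small check (a sketch, not part of Stitch) measures the two generated subtable names from your errors against PostgreSQL’s default 63-byte identifier limit:

```python
# Check generated subtable names against PostgreSQL's default identifier
# limit of 63 bytes (NAMEDATALEN - 1).

PG_MAX_IDENTIFIER_BYTES = 63

names = [
    "mixpanel_funnels__steps__selector_params__property_filter_params_list",
    "mixpanel_funnels__steps__selector_params__property_filter_params_list"
    "__filter__operand",
]

for name in names:
    n = len(name.encode("utf-8"))
    status = "ok" if n <= PG_MAX_IDENTIFIER_BYTES else "too long"
    print(f"{n} bytes ({status})")  # 69 bytes and 86 bytes, both too long
```

Both names exceed the 63-byte cap, which matches the (69) and (86) in the error messages.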