Does anyone here use Snowpipe to batch up data loaded by Stitch into S3 before loading it into Snowflake with a scheduled job? How easy is it to set up, and how useful do you find it?
We currently load data from Stitch directly into Snowflake using a dedicated virtual warehouse, but we do use Snowpipe to load other data streaming from Kinesis Firehose into S3. Not quite what you were asking there, but maybe an adjacent use case is also instructive.
Setting up that integration is not trivial on the Snowflake side, since you have to create and configure multiple Snowflake objects to make it work (a storage integration, a stage, a destination table, and a pipe), plus the corresponding IAM resources in AWS so Snowflake can read the bucket. I did write a script to automate some of that workflow, but it's a multi-stage affair that still requires some manual labor in our case, because we manage our AWS resources in Terraform. It's not pretty.
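For a rough picture of what that object setup looks like, here's a sketch of the DDL involved, as Python strings. All the names (MY_S3_INT, MY_STAGE, RAW_EVENTS, MY_PIPE), the bucket path, and the IAM role ARN are made-up placeholders, and your file format and COPY options will differ; this just shows the four objects and the order you'd create them in.

```python
# Hypothetical Snowflake DDL for a Snowpipe setup over S3.
# All object names, the bucket, and the role ARN are illustrative placeholders.
ddl_statements = [
    # 1. Storage integration: delegates S3 access to an AWS IAM role.
    """CREATE STORAGE INTEGRATION MY_S3_INT
         TYPE = EXTERNAL_STAGE
         STORAGE_PROVIDER = 'S3'
         ENABLED = TRUE
         STORAGE_AWS_ROLE_ARN = 'arn:aws:iam::123456789012:role/snowflake-access'
         STORAGE_ALLOWED_LOCATIONS = ('s3://my-bucket/firehose/')""",
    # 2. External stage pointing at the bucket via the integration.
    """CREATE STAGE MY_STAGE
         URL = 's3://my-bucket/firehose/'
         STORAGE_INTEGRATION = MY_S3_INT
         FILE_FORMAT = (TYPE = JSON)""",
    # 3. Destination table for the raw records.
    """CREATE TABLE RAW_EVENTS (
         payload VARIANT,
         loaded_at TIMESTAMP_NTZ DEFAULT CURRENT_TIMESTAMP())""",
    # 4. Pipe: auto-ingests new files landing on the stage into the table.
    """CREATE PIPE MY_PIPE AUTO_INGEST = TRUE AS
         COPY INTO RAW_EVENTS (payload)
         FROM (SELECT $1 FROM @MY_STAGE)""",
]

for stmt in ddl_statements:
    # Print just the first line of each statement as a summary.
    print(stmt.splitlines()[0].strip())
```

On top of this, the AWS side needs the IAM role and its trust policy (pointing at the external ID Snowflake generates for the integration), and the S3 event notifications that trigger auto-ingest, which is where the Terraform back-and-forth comes in.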
However, it does work beautifully, with low load latency and dramatically lower loading costs. We would be thrilled if Stitch could use Snowpipe to load data, instead of a dedicated virtual warehouse.