Replies: 1 comment
I should also note that this would require the capability to be present in the sdk, the |
Feature scope
Taps (catalog, state, stream maps, tests, etc.)
Description
Add backfill support
What I mean by a backfill job: it is a bulk data integration that acts on incremental streams. Its purpose is to integrate records that incremental replication would otherwise not fetch unless they were updated. The backfill would not update the state.
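To make the distinction concrete, here is a minimal sketch in plain Python. It does not use the SDK; `Record`, `State`, `incremental_sync`, and `backfill` are hypothetical names made up for illustration, and the replication key is assumed to be an ISO date string.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Record:
    id: int
    updated_at: str  # ISO date, so string comparison matches date order


@dataclass
class State:
    bookmark: str  # replication key value of the last synced record


def incremental_sync(source: List[Record], state: State) -> List[Record]:
    """Emit records newer than the bookmark, then advance the bookmark."""
    emitted = [r for r in source if r.updated_at > state.bookmark]
    if emitted:
        state.bookmark = max(r.updated_at for r in emitted)
    return emitted


def backfill(source: List[Record], predicate: Callable[[Record], bool]) -> List[Record]:
    """Emit records matching the operator's filter; state is never touched."""
    return [r for r in source if predicate(r)]
```

With a bookmark of `2023-02-01`, the incremental sync only picks up later records and moves the bookmark forward, while a backfill with a filter such as `lambda r: r.updated_at < "2023-02-15"` re-emits older records and leaves the bookmark where it was.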
In what scenarios is a backfill useful?
Backfills are useful when adding new fields to a stream's definition, because they avoid having to do a full refresh.
Normally, supporting this request would require running a full refresh. However, a full refresh could be expensive time-wise and may require API calls against the sales database, where the API limit is shared between different integrations. Additionally, a full refresh may impose load on the sales database and require special scheduling.
Backfills are also useful when something goes wrong in the data loading process and the state gets updated even though the data has failed to sync.
How can this be implemented?
A backfill operation, if a stream supports it, should allow the operator to specify a backfill filter at command time. The filter would then be passed to the method making the request. Something like this
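As a rough sketch only, assuming a hook-based design: the class and method names below (`Stream`, `supports_backfill`, `build_params`, `backfill_params`, `AccountsStream`) and the query parameter names are hypothetical and are not the SDK's actual API. The idea is that the base class routes an operator-supplied filter to a per-stream hook and otherwise falls back to the usual bookmark-based parameters.

```python
from typing import Optional


class Stream:
    """Hypothetical base class, standing in for whatever the SDK would provide."""

    supports_backfill = False  # streams opt in explicitly

    def __init__(self, bookmark: Optional[str] = None):
        self.bookmark = bookmark  # incremental state; backfills never change it

    def build_params(self, backfill_filter: Optional[dict] = None) -> dict:
        """Build query parameters for the method that makes the request."""
        if backfill_filter is not None:
            if not self.supports_backfill:
                raise NotImplementedError(
                    f"{type(self).__name__} does not support backfill"
                )
            return self.backfill_params(backfill_filter)
        # Normal incremental run: filter on the stored bookmark, if any.
        return {"updated_since": self.bookmark} if self.bookmark else {}

    def backfill_params(self, backfill_filter: dict) -> dict:
        """Hook for taps: translate the operator's filter into API parameters."""
        raise NotImplementedError


class AccountsStream(Stream):
    """Example tap stream that opts in to backfills."""

    supports_backfill = True

    def backfill_params(self, backfill_filter: dict) -> dict:
        return {
            "created_after": backfill_filter["start"],
            "created_before": backfill_filter["end"],
        }
```

An operator-supplied filter such as `{"start": "2022-01-01", "end": "2022-06-30"}` would then reach `backfill_params`, while a run without a filter behaves like a normal incremental sync. Keeping the filter as an argument rather than as stored state is what lets the bookmark stay untouched.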
This would also require taps to support the backfill operation, which is why I feel the SDK is the best place to implement the hooks that will enable this functionality down the road.