Backfill
Once your scheduling and orchestration are set up, you might encounter the following scenarios:
| Use Case | Description | Solution |
|---|---|---|
| Initial Backfill | You have just set up Hubble and would like to ingest historical data | Option 1: Data Import. Pros: cheap and fast. Option 2: Re-trigger DAGs for past dates. Cons: slow and expensive. |
| Bug Fix | You resolved a bug and need to re-ingest or retroactively correct one or more data columns | Option 1: JS UDF. Pros: cheap and fast. Cons: may require optimized queries and running in batches. Option 2: Re-trigger DAGs for past dates. Cons: slow and expensive. |
| New data column extraction | You added one or more data columns as part of a feature request and need to backfill data for them | Option 1: JS UDF. Pros: cheap and fast. Cons: may require optimized queries and running in batches. Option 2: Re-trigger DAGs for past dates. Cons: slow and expensive. |
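
Both the JS UDF option (which may need to run in batches) and re-triggering DAGs for past dates involve walking a historical date range in pieces. As a minimal sketch of that batching step only, the helper below splits a date range into fixed-size windows. The function name `date_batches` and the window size are illustrative assumptions, not part of Hubble; what you run per window (a query or a DAG trigger) depends on your setup.

```python
from datetime import date, timedelta
from typing import Iterator, Tuple


def date_batches(start: date, end: date, days_per_batch: int) -> Iterator[Tuple[date, date]]:
    """Yield inclusive (batch_start, batch_end) windows covering start..end.

    Hypothetical helper for illustration; not a Hubble API.
    """
    if days_per_batch < 1:
        raise ValueError("days_per_batch must be at least 1")
    current = start
    while current <= end:
        # Clamp the last window so it never runs past the requested end date.
        batch_end = min(current + timedelta(days=days_per_batch - 1), end)
        yield current, batch_end
        current = batch_end + timedelta(days=1)


if __name__ == "__main__":
    # Each window would drive one backfill query or one DAG re-trigger.
    for batch_start, batch_end in date_batches(date(2024, 1, 1), date(2024, 1, 10), 4):
        print(batch_start.isoformat(), batch_end.isoformat())
```

Smaller windows keep each run cheaper and easier to retry, at the cost of more runs overall.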
Backfill using JS UDF
This document outlines methods to extract required fields from the XDR of raw data.
Data Import
This document outlines methods to perform the initial backfill when setting up Hubble.