[Pull-based Ingestion] Support message mappers to support different input formats and raw payloads #19765
3fab3ae to 72773f3
    | ❌ Gradle check result for 72773f3: TIMEOUT Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? | 
| ❌ Gradle check result for 72773f3: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? | 
72773f3 to 85dae2e
    Signed-off-by: Varun Bharadwaj <[email protected]>
85dae2e to e9e9f3f
    | ❌ Gradle check result for e9e9f3f: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? | 
| Codecov Report: ❌ Patch coverage is
Additional details and impacted files
@@             Coverage Diff              @@
##               main   #19765      +/-   ##
============================================
- Coverage     73.10%   73.10%   -0.01%     
+ Complexity    70959    70932      -27     
============================================
  Files          5737     5740       +3     
  Lines        324766   324816      +50     
  Branches      46981    46986       +5     
============================================
+ Hits         237425   237460      +35     
- Misses        68226    68245      +19     
+ Partials      19115    19111       -4
 | 
| @timojohlo Can you confirm that this approach would solve your use case? This will index the JSON document output from the OTel collector as-is with no transformation. | 
| So, in case of  | 
| 
 @timojohlo The important point here is that there would be no transformation of the source JSON document. It would be indexed as-is, and your field mappings for example would have to correspond to that source document structure. | 
Description
This PR refactors the pull-based indexing flow to support message mappers. A default message mapper is created to retain current behavior. Alternatively, a raw payload mapper is added to support ingesting from any given streaming source.
In raw payload mode, the Kafka offset or Kinesis sequence number will be used as the document ID. This ensures that duplicate documents are not created on rewind or replay. Document versioning will not be supported, and only an eventually consistent view of documents can be expected during message replays, because an older message can overwrite a newer one on replay until the lag is caught up. This will be an append-only indexing mode.
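To illustrate why using the stream position as the document ID avoids duplicates on replay, here is a minimal, self-contained sketch. It is not the code from this PR: `MessageMapper`, `RawPayloadMapper`, `StreamMessage`, `IndexOp`, and the in-memory `Map` standing in for an index are all hypothetical names chosen for illustration, and the real interfaces in the change may look different.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class RawPayloadMapperSketch {

    /** Hypothetical stream message: a position (Kafka offset or Kinesis sequence number) plus raw bytes. */
    public record StreamMessage(String pointer, byte[] payload) {}

    /** Hypothetical index operation: a document ID plus the untransformed source document. */
    public record IndexOp(String id, String source) {}

    /** Hypothetical mapper abstraction; assumed for this sketch only. */
    public interface MessageMapper {
        IndexOp map(StreamMessage message);
    }

    /** Raw payload mode: the stream position becomes the document ID and the payload is passed through as-is. */
    public static final class RawPayloadMapper implements MessageMapper {
        @Override
        public IndexOp map(StreamMessage message) {
            String source = new String(message.payload(), StandardCharsets.UTF_8);
            return new IndexOp(message.pointer(), source);
        }
    }

    /** Stand-in for an index: writing an existing ID overwrites that document instead of adding a new one. */
    public static void ingest(Map<String, String> index, MessageMapper mapper, List<StreamMessage> batch) {
        for (StreamMessage message : batch) {
            IndexOp op = mapper.map(message);
            index.put(op.id(), op.source());
        }
    }

    /** Convenience helper that builds a message from a numeric offset and a JSON string. */
    public static StreamMessage msg(long offset, String json) {
        return new StreamMessage(Long.toString(offset), json.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        Map<String, String> index = new LinkedHashMap<>();
        MessageMapper mapper = new RawPayloadMapper();
        List<StreamMessage> batch = List.of(
            msg(0, "{\"service\":\"checkout\",\"level\":\"info\"}"),
            msg(1, "{\"service\":\"checkout\",\"level\":\"error\"}")
        );

        ingest(index, mapper, batch); // first pass
        ingest(index, mapper, batch); // replay after a rewind: same offsets, same IDs

        System.out.println("documents after replay: " + index.size());
    }
}
```

Because the ID is derived only from the stream position, replaying a batch rewrites the same documents rather than creating new ones. The trade-off described above still applies: there is no version check, so the index is only eventually consistent while a replay is in progress.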
This model should allow the flexibility to support other formats in the future, when needed.
Related Issues
Resolves #19548
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.