[WIP] Notebook-friendly connectors as importable classes #2685
Generate `elastic-connectors` package with notebook-friendly connector classes

A lot of changes, huh? Don't worry, 95% is autogenerated code :) You can skip `package/generated` and `package/docs`; instead, provide feedback on the actual templates in `scripts/package/codegen/templates`.

See the package in action: colab notebook gist
Changes
I added a flow that auto-generates package code, turning the connectors framework into standalone, importable connector classes that can be used independently of the framework application (which follows the connectors protocol and depends on special connector indices).
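
As a rough illustration of what "standalone importable connector classes" could mean in practice, here is a minimal, self-contained sketch. The names `async_get_docs`, `get_docs`, `logger` and `download_content` come from this description; everything else (the stand-in `ConnectorBase` body, `InMemoryConnector`, the `raw` field, and decoding bytes in place of Apache Tika extraction) is a hypothetical assumption and not the code in this PR.

```python
import asyncio
import logging


class ConnectorBase:
    """Hypothetical stand-in for package.connector_base.ConnectorBase."""

    def __init__(self, logger=None, download_content=True):
        # A custom logger can be passed in; otherwise fall back to a default one.
        self.logger = logger or logging.getLogger("connector")
        self.download_content = download_content

    async def async_get_docs(self):
        # Wrap get_docs() and optionally attach extracted content.
        async for doc in self.get_docs():
            raw = doc.pop("raw", None)
            if self.download_content and raw is not None:
                self.logger.debug("extracting content for %s", doc["_id"])
                # Assumption: plain decoding stands in for Tika extraction.
                doc["body"] = raw.decode("utf-8")
            yield doc


class InMemoryConnector(ConnectorBase):
    """Hypothetical generated wrapper that serves documents from a list."""

    def __init__(self, documents, **kwargs):
        super().__init__(**kwargs)
        self.documents = documents

    async def get_docs(self):
        for doc in self.documents:
            yield dict(doc)  # copy so the caller's data is not mutated


async def collect(connector):
    return [doc async for doc in connector.async_get_docs()]


docs = [{"_id": "1", "raw": b"hello"}, {"_id": "2"}]
with_content = asyncio.run(collect(InMemoryConnector(docs)))
without_content = asyncio.run(
    collect(InMemoryConnector(docs, download_content=False))
)
```

In a notebook, where an event loop is usually already running, the `async for` loop would typically be written directly in a cell rather than going through `asyncio.run`.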
The generated wrapper classes live under `package/generated`. They add `DataSource` config fields as constructor arguments, and use `label` together with `tooltip` to build docstrings.

The code gen scripts, along with the `jinja2` templates, live under `scripts/package`.

The `package.connector_base.ConnectorBase` class is the class from which the generated classes inherit. It provides some utilities, such as:
- `async_get_docs`, which both calls `get_docs` and uses the local Apache Tika lib for content extraction
- `logger`, which lets you pass a custom logger to attach to the data provider logic
- `download_content` flag, which can be disabled (so Apache Tika is not fetched) when syncing with e.g. a SQL database where we don't do content extraction (since there are no files to download)

New requirements are specified under:
- `requirements/package.txt`, used by the package logic
- `requirements/package-dev.txt`, used to build the package

Automated code generation and packaging logic
This happens under the hood when you call `make build_connector_package`.

Steps are as follows:
- `scripts/package/codegen` scripts and templates to update defs in `package/generated/connectors` as well as `/package/*` and put it in temp folder `/package/elastic_connectors`
- `elastic_connectors` namespace
- `lazydocs`
- `package/setup.py`
- `twine` (for now to testpypi)

Constructor + Docstrings = Hints from python language server
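
To make the heading above concrete, here is a minimal sketch of how config fields with a `label` and `tooltip` could be rendered into a class whose constructor signature and docstring are what a language server surfaces as hints. It uses plain `str.format` as a stand-in for the `jinja2` templates, and `CONFIG_FIELDS`, `CLASS_TEMPLATE`, `render_connector_class` and `ExampleConnector` are hypothetical names, not the templates in this PR.

```python
import inspect

# Hypothetical stand-in for a data source's config: each field has a label,
# an optional tooltip, and a default value.
CONFIG_FIELDS = {
    "host": {
        "label": "Host",
        "tooltip": "Address of the server to sync from.",
        "value": "localhost",
    },
    "port": {"label": "Port", "tooltip": None, "value": 5432},
}

# Assumption: str.format replaces the jinja2 template used by the real codegen.
CLASS_TEMPLATE = '''\
class {class_name}:
    """{class_name} (generated).

    Args:
{arg_docs}
    """

    def __init__(self, {signature}):
{assignments}
'''


def render_connector_class(class_name, fields):
    """Render Python source for a wrapper class from config fields."""
    arg_docs, params, assignments = [], [], []
    for name, field in fields.items():
        description = field["label"]
        if field.get("tooltip"):
            description += ". " + field["tooltip"]
        type_name = type(field["value"]).__name__
        arg_docs.append(f"        {name} ({type_name}): {description}")
        params.append(f"{name}={field['value']!r}")
        assignments.append(f"        self.{name} = {name}")
    return CLASS_TEMPLATE.format(
        class_name=class_name,
        arg_docs="\n".join(arg_docs),
        signature=", ".join(params),
        assignments="\n".join(assignments),
    )


source = render_connector_class("ExampleConnector", CONFIG_FIELDS)
namespace = {}
exec(source, namespace)  # load the generated class for inspection
ExampleConnector = namespace["ExampleConnector"]

print(inspect.signature(ExampleConnector))
print(ExampleConnector.__doc__)
```

Because the config fields become real keyword arguments with defaults, editors can offer completion and inline documentation without any knowledge of the framework itself.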
Potential improvements
- `elastic_connectors[google_drive]` or `elastic_connectors[sharepoint_online]` - so that you only install stuff that you need
- `/connectors` code is accessible in the package (we don't document but folks could e.g. import `ConcurrentTask`)
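
The per-connector install idea listed under Potential improvements could, as one possible approach, be expressed with setuptools extras. The sketch below only builds the mapping that would be passed as `extras_require` to `setuptools.setup`; the dependency names are placeholders, and nothing here reflects the actual `package/setup.py`.

```python
# Hypothetical extras mapping: one entry per connector, listing only the
# third-party dependencies that connector needs. Names are placeholders.
EXTRAS_REQUIRE = {
    "google_drive": ["placeholder-google-client"],
    "sharepoint_online": ["placeholder-sharepoint-client"],
}

# Convenience extra that pulls in every connector's dependencies.
EXTRAS_REQUIRE["all"] = sorted(
    {dep for deps in EXTRAS_REQUIRE.values() for dep in deps}
)

# This dict would be passed as setuptools.setup(..., extras_require=EXTRAS_REQUIRE),
# so that `pip install "elastic_connectors[google_drive]"` installs only that subset.
```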