DSL revamp in Python #698
1 file reviewed, 5 comments
examples/bookstore.py
Outdated
    db = helix.Db()

    class Chapter(db.Node):
        @index
syntax: @index decorator not imported - will cause NameError at runtime
Suggested change: replace `@index` with `index: helix.I64`
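A side note on the failure mode, as a minimal sketch: which error Python reports depends on what follows the decorator line, which this excerpt does not show. Assuming the `@index` line sits directly above an annotated field, Python rejects the file at parse time with a `SyntaxError`, because a decorator may only precede a `def` or `class`. The `NameError` only appears when the undefined decorator is applied to a function or class. The `classify` helper and the two source strings below are hypothetical and use `int` in place of `helix.I64`.

```python
def classify(source: str) -> str:
    """Report whether a snippet fails at parse time, at run time, or not at all."""
    try:
        code = compile(source, "<example>", "exec")
    except SyntaxError:
        return "SyntaxError"
    try:
        exec(code, {})
    except NameError:
        return "NameError"
    return "ok"


# Hypothetical reconstruction: a decorator directly above an annotated field.
decorated_field = "class Chapter:\n    @index\n    index: int\n"

# The same undefined decorator applied to a method instead.
decorated_method = "class Chapter:\n    @index\n    def index(self): ...\n"

print(classify(decorated_field))
print(classify(decorated_method))
```

Either way the example file cannot be loaded until the `@index` line is removed or `index` is defined.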
    def searchdocs_rag(query: helix.Vector, k: helix.I32) -> helix.Iterator[dict[str, helix.Value]]:
        # TODO
        vecs = db.search_vector(query, k)
        chapters = vecs.incoming_nodes[Contains]
logic: wrong node type retrieved - based on schema, vectors are embedded in SubChapter nodes, not Chapter nodes. Old DSL shows subchapters <- vecs::In<EmbeddingOf>
Suggested change: replace `chapters = vecs.incoming_nodes[Contains]` with `subchapters = vecs.incoming_nodes[SubChapter]`
    class ArgSubchapter(helix.Struct):
        title: helix.String
        content: helix.String
        chunk: helix.Vector
style: helix.Vector lacks dimension specification - should match EmbeddingVector (1536 dimensions) for consistency
    @db.query
    def loaddocs_rag(chapters: helix.List[ArgChapter]) -> str:
        for c in chapters:
            c_node = db.add_node(Chapter(index=c.id))
style: property name mismatch - old DSL uses chapter_index (line 31), new uses index
        # TODO
        vecs = db.search_vector(query, k)
        chapters = vecs.incoming_nodes[Contains]
        return chapters.map(lambda c: {"index": c.index})
style: variable name misleading - these are subchapters, not chapters
dumb question: Can users call third-party libraries, access the standard library, etc.? If not, maybe we should use another file suffix to avoid confusion.
AFAIC we're just stealing Python's syntax, right?
They can, but I don't think everything will work (we will try to raise an error when it doesn't). For example, a function such as:
    def foo(x): return x ** 2 // 9
will work perfectly fine, because you can give it an input that records all the operations, which basically creates a recording of the AST.
We aren't just stealing Python's syntax; this is literally Python code that will get run by a Python runtime, and by abusing how dynamic Python is we will construct a DB schema and a query AST.
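The recording idea described above can be sketched roughly as follows. This is not the Helix implementation: `Expr`, `Param`, `Const`, `BinOp`, and `lift` are hypothetical names invented for illustration, and only the two operators used by `foo` are overloaded.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Expr:
    """Base class: applying an operator to an Expr yields a larger Expr."""

    def __pow__(self, other):
        return BinOp("**", self, lift(other))

    def __floordiv__(self, other):
        return BinOp("//", self, lift(other))


@dataclass(frozen=True)
class Param(Expr):
    name: str


@dataclass(frozen=True)
class Const(Expr):
    value: object


@dataclass(frozen=True)
class BinOp(Expr):
    op: str
    left: Expr
    right: Expr


def lift(value):
    """Wrap plain Python values so they can sit in the recorded tree."""
    return value if isinstance(value, Expr) else Const(value)


def foo(x):
    return x ** 2 // 9


# Calling foo with a recording input runs ordinary Python,
# but the result is a tree of operations instead of a number.
tree = foo(Param("x"))
print(tree)
```

The same `foo` still works on plain integers, which is why unmodified user code can be traced this way. Anything the recorder does not overload (for example a comparison used in an `if`) would not be captured, which is presumably where the erroring mentioned above comes in.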
How are you thinking about enforcing the DSL boundary? I thought about validating the module via an AST pass and only allowing constructs that are statically analyzable. However, would arbitrary Python code raise an error, or just be ignored for the sake of simplicity?
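For reference, an allow-list AST pass of the kind suggested here could look roughly like this, using only the standard `ast` module. The `ALLOWED` set and the `dsl_violations` name are assumptions made for the sketch, not a proposal for the actual list of permitted constructs, and this version reports violations rather than silently ignoring them.

```python
import ast

# Hypothetical allow-list: just enough node types for simple arithmetic queries.
ALLOWED = (
    ast.Module, ast.FunctionDef, ast.arguments, ast.arg, ast.Return,
    ast.Assign, ast.Expr, ast.Name, ast.Load, ast.Store, ast.Constant,
    ast.Attribute, ast.Call, ast.BinOp, ast.Pow, ast.FloorDiv,
)


def dsl_violations(source: str) -> list[str]:
    """Return one message per AST node that falls outside the allow-list."""
    problems = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ALLOWED):
            line = getattr(node, "lineno", "?")
            problems.append(f"line {line}: {type(node).__name__} is not allowed")
    return problems


good = "def foo(x):\n    return x ** 2 // 9\n"
bad = "import os\ndef f(x):\n    while x:\n        x = x // 2\n    return x\n"

print(dsl_violations(good))
print(dsl_violations(bad))
```

A static pass like this runs before any user code executes, so it complements the runtime recording approach rather than replacing it.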
This PR marks the start of the new Helix DSL, in Python.
We chose Python instead of creating a new language because:
Greptile Overview
Greptile Summary
Introduces new Python DSL for HelixDB as a compile-time schema definition language that compiles to static Rust code. The example demonstrates a bookstore RAG system with chapters, subchapters, and vector embeddings.
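As a rough illustration of how class definitions alone can produce a schema, the sketch below collects field names and types when a node class is declared. It is an assumption-laden stand-in, not Helix's API: this `Db` class and its `schema` dictionary are hypothetical, and built-in types replace `helix.I64` and `helix.String`.

```python
import typing


class Db:
    """Hypothetical stand-in for helix.Db that records node schemas."""

    def __init__(self):
        self.schema = {}
        db = self

        class Node:
            def __init_subclass__(cls, **kwargs):
                super().__init_subclass__(**kwargs)
                # Record the declared fields as soon as the subclass is created.
                db.schema[cls.__name__] = typing.get_type_hints(cls)

        self.Node = Node


db = Db()


class Chapter(db.Node):
    index: int


class SubChapter(db.Node):
    title: str
    content: str


print(db.schema)
```

No instance of `Chapter` or `SubChapter` is ever created here; declaring the classes is enough for the schema to be known, which is what makes a later compile step possible.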
Critical Issues:
- `@index` decorator on line 62 will cause NameError (not imported)
- `searchdocs_rag` retrieves wrong node type - should get `SubChapter` nodes from vectors, not `Chapter` nodes

Style Inconsistencies:
- Property name mismatch (`index` vs `chapter_index`)
- Missing vector dimension on `ArgSubchapter.chunk`
- Misleading variable name in `searchdocs_rag` (`chapters` should be `subchapters`)

The Python DSL approach is sound for compile-time schema generation, but this example needs fixes before it can run successfully.
Important Files Changed
File Analysis
examples/bookstore.py: ... (`@index` decorator) and logic error in vector search query that retrieves wrong node type
Sequence Diagram
sequenceDiagram
    participant User
    participant PythonDSL as Python DSL (bookstore.py)
    participant HelixCompiler as Helix Compiler
    participant Database as HelixDB
    User->>PythonDSL: Define schema (Chapter, SubChapter, EmbeddingVector)
    PythonDSL->>HelixCompiler: Compile to AST
    HelixCompiler->>Database: Generate static Rust code
    User->>Database: Call loaddocs_rag(chapters)
    loop For each chapter
        Database->>Database: Create Chapter node
        loop For each subchapter
            Database->>Database: Create SubChapter node with embedding
            Database->>Database: Create Contains edge (Chapter→SubChapter)
        end
    end
    Database-->>User: Return "Success"
    User->>Database: Call searchdocs_rag(query, k)
    Database->>Database: Search vectors by similarity
    Database->>Database: Traverse to SubChapter nodes
    Database-->>User: Return subchapter data