promote self-hosted C tokenizer and parser to a general-purpose C tokenizer and parser in the standard library #4045

@andrewrk

Zig currently has the file src-self-hosted/c_tokenizer.zig, which tokenizes C macros for translate-c. src-self-hosted/translate_c.zig also contains functions that parse those C tokens into AST nodes:

  • parseCPrimaryExpr
  • parseCExpr
  • parseCSuffixOpExpr
  • etc
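As a rough illustration of the layering those function names suggest (primary expression, then suffix operators, then the full expression), here is a minimal recursive-descent sketch. It is not the translate_c.zig code: the names `parse_primary`, `parse_suffix`, `parse_expr`, and `c_expr_to_rpn` are hypothetical. It assumes a tiny grammar (identifiers, integers, parentheses, calls, indexing, and `+` only), reads characters directly instead of tokens, and renders the result in postfix notation instead of building an AST.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical parser state for this sketch only. */
typedef struct {
    const char *p;   /* cursor into the source text */
    char *out;       /* caller-provided output buffer */
    size_t cap, len; /* buffer capacity and bytes used */
    int ok;          /* cleared on the first syntax or capacity error */
} Parser;

/* Append one space-separated item to the postfix output. */
static void emit(Parser *ps, const char *s, size_t n) {
    size_t need = n + (ps->len ? 1 : 0);
    if (ps->len + need + 1 > ps->cap) { ps->ok = 0; return; }
    if (ps->len) ps->out[ps->len++] = ' ';
    memcpy(ps->out + ps->len, s, n);
    ps->len += n;
    ps->out[ps->len] = '\0';
}

static void skip_ws(Parser *ps) {
    while (isspace((unsigned char)*ps->p)) ps->p++;
}

/* Skip whitespace, then consume `c` if it is the next character. */
static int accept(Parser *ps, char c) {
    skip_ws(ps);
    if (*ps->p != c) return 0;
    ps->p++;
    return 1;
}

static void parse_expr(Parser *ps);

/* primary := identifier | integer | '(' expr ')' */
static void parse_primary(Parser *ps) {
    if (accept(ps, '(')) {
        parse_expr(ps);
        if (!accept(ps, ')')) ps->ok = 0;
        return;
    }
    const char *start = ps->p; /* accept() already skipped whitespace */
    while (isalnum((unsigned char)*ps->p) || *ps->p == '_') ps->p++;
    if (ps->p == start) { ps->ok = 0; return; }
    emit(ps, start, (size_t)(ps->p - start));
}

/* suffix := primary ( '[' expr ']' | '(' [ expr { ',' expr } ] ')' )* */
static void parse_suffix(Parser *ps) {
    parse_primary(ps);
    while (ps->ok) {
        if (accept(ps, '[')) {
            parse_expr(ps);
            if (!accept(ps, ']')) ps->ok = 0;
            emit(ps, "[]", 2);
        } else if (accept(ps, '(')) {
            if (!accept(ps, ')')) {
                do {
                    parse_expr(ps);
                } while (ps->ok && accept(ps, ','));
                if (!accept(ps, ')')) ps->ok = 0;
            }
            emit(ps, "call", 4);
        } else {
            break;
        }
    }
}

/* expr := suffix ( '+' suffix )*   (left-associative) */
static void parse_expr(Parser *ps) {
    parse_suffix(ps);
    while (ps->ok && accept(ps, '+')) {
        parse_suffix(ps);
        emit(ps, "+", 1);
    }
}

/* Returns 1 if all of `src` parsed; `out` then holds the postfix form. */
int c_expr_to_rpn(const char *src, char *out, size_t cap) {
    assert(src != NULL && out != NULL && cap > 0);
    Parser ps = { src, out, cap, 0, 1 };
    out[0] = '\0';
    parse_expr(&ps);
    skip_ws(&ps);
    return ps.ok && *ps.p == '\0';
}
```

With this sketch, `c_expr_to_rpn("f(a)[1] + (b + c)", buf, sizeof buf)` returns 1 and leaves `f a call 1 [] b c + +` in `buf`. Each grammar level calls only the level below it, which is what makes precedence fall out of the call structure.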

Zig has set a (in my opinion successful) precedent in exposing its own tokenizer, parser, and AST in the standard library, in these files:

  • lib/std/zig/ast.zig
  • lib/std/zig/parse.zig
  • lib/std/zig/render.zig
  • lib/std/zig/tokenizer.zig

Now it is time to do the same for C:

  • Decouple C tokenization and AST building from translate-c
  • Expose C tokenization and AST building (and rendering?) in lib/std/c/* in the same way as for Zig.
  • Complete the implementation of C tokenization and AST building. Make it compatible with the C specification and with GCC extensions, and make it robust. Someone should be able to write a C compiler in Zig comfortably by using the standard library C tokenizer and parser.
  • translate-c should use this generic and robust C tokenizer and parser API, rather than its current, only-what-we-need-to-make-it-work implementation.
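To make "decoupled" concrete, the following is a minimal sketch of a pull-style tokenizer that knows nothing about its consumer: the caller initializes it with source text and repeatedly asks for the next token. Everything here is an assumption for illustration, not a proposed API: the names `Tokenizer`, `Token`, `tokenizer_init`, and `tokenizer_next` are hypothetical. The lexing is also far from complete: it has no comments, no preprocessor handling, no character literals, only single-character punctuators, and a loose rule for numbers.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

typedef enum {
    TOK_EOF, TOK_IDENT, TOK_NUMBER, TOK_STRING, TOK_PUNCT, TOK_INVALID
} TokenKind;

/* A token is a kind plus a byte range into the source; no text is copied. */
typedef struct {
    TokenKind kind;
    size_t start, end; /* end is exclusive */
} Token;

typedef struct {
    const char *src; /* NUL-terminated source text, owned by the caller */
    size_t pos;      /* offset of the next unread byte */
} Tokenizer;

Tokenizer tokenizer_init(const char *src) {
    assert(src != NULL);
    Tokenizer t = { src, 0 };
    return t;
}

/* Return the next token; keeps returning TOK_EOF once the input is used up. */
Token tokenizer_next(Tokenizer *t) {
    const char *s = t->src;
    while (isspace((unsigned char)s[t->pos])) t->pos++;

    Token tok = { TOK_EOF, t->pos, t->pos };
    unsigned char c = (unsigned char)s[t->pos];
    if (c == '\0') return tok;

    if (isalpha(c) || c == '_') {
        tok.kind = TOK_IDENT;
        while (isalnum((unsigned char)s[t->pos]) || s[t->pos] == '_') t->pos++;
    } else if (isdigit(c)) {
        /* Simplification: accept any run of alphanumerics and '.', which
           covers forms like 42, 0x1F, 10UL, and 1.5f without validating them. */
        tok.kind = TOK_NUMBER;
        while (isalnum((unsigned char)s[t->pos]) || s[t->pos] == '.') t->pos++;
    } else if (c == '"') {
        tok.kind = TOK_STRING;
        t->pos++; /* opening quote */
        while (s[t->pos] != '\0' && s[t->pos] != '"') {
            /* Skip the character after a backslash so \" does not end the string. */
            if (s[t->pos] == '\\' && s[t->pos + 1] != '\0') t->pos++;
            t->pos++;
        }
        if (s[t->pos] == '"') t->pos++;      /* closing quote */
        else tok.kind = TOK_INVALID;         /* unterminated string */
    } else if (ispunct(c)) {
        tok.kind = TOK_PUNCT; /* single-character punctuators only */
        t->pos++;
    } else {
        tok.kind = TOK_INVALID;
        t->pos++;
    }
    tok.end = t->pos;
    return tok;
}
```

A consumer such as translate-c, or a hypothetical C compiler, would loop on `tokenizer_next` until it sees `TOK_EOF`. Because tokens carry only offsets, the same tokenizer can serve a parser, a renderer, or an error reporter without any of them depending on each other.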

cc @Vexu

Labels

  • enhancement: Solving this issue will likely involve adding new logic or components to the codebase.
  • frontend: Tokenization, parsing, AstGen, Sema, and Liveness.
  • standard library: This issue involves writing Zig code for the standard library.
  • translate-c: C to Zig source translation feature (@cImport)

    Type

    No type

    Projects

    No projects

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions