DrXiao
Collaborator

@DrXiao DrXiao commented Aug 13, 2025

The proposed changes enhance shecc to generate dynamically linked executables. When the --dynlink flag is specified, shecc produces sections such as .plt and .got for the compiled programs, allowing the executables to leverage the ELF interpreter and the GNU C library to run.

This pull request is still a work in progress due to the following incomplete tasks:

  • Fix the potential issues. (The bootstrapping process still fails for dynamically linked shecc.)
  • Improve code quality and commit messages.
  • Implement dynamic linking for RISC-V output.
  • Make the dynamically linked shecc run the test suites.
  • Improve README.md to describe dynamic linking.
  • Enhance GitHub workflows to verify the dynamically linked shecc.
  • Validate the proposed changes on an Arm machine such as a BeagleBone Black or Raspberry Pi.
  • Refine c.c and c.h to avoid duplication.

Updated usage:

# Run the bootstrapping process for the dynamically linked shecc.
$ make DYNLINK=1

# Add '--dynlink' to generate a dynamically linked executable.
$ shecc [-o output] [+m] [--dump-ir] [--no-libc] [--dynlink] <input.c>

# Run the generated executable, passing the ELF interpreter prefix to QEMU.
$ qemu-arm -L /usr/arm-linux-gnueabi/ <executable>

@DrXiao
Collaborator Author

DrXiao commented Aug 13, 2025

Currently, only the stage 0 and stage 1 compilers can be generated, and the stage 1 compiler crashes with a segmentation fault when run.

However, the stage 0 compiler can still compile a simple program and run the executable via QEMU:

/* test.c */
int main(void)
{
    printf("%x %x %x\n", 1, 2, 3);
    printf("%x %x %x %x\n", 1, 2, 3, 4);
    printf("%x %x %x %x %x\n", 1, 2, 3, 4, 5);
    return 0;
}
$ out/shecc --dynlink -o test test.c

Then, we can use arm-linux-gnueabi-readelf or arm-linux-gnueabi-objdump to check the executable. For example, check the relocation information:

$ arm-linux-gnueabi-readelf --relocs test

Relocation section '.rel.plt' at offset 0x260 contains 2 entries:
 Offset     Info    Type            Sym.Value  Sym. Name
000102a8  00000116 R_ARM_JUMP_SLOT   00000000   __libc_start_main
000102ac  00000216 R_ARM_JUMP_SLOT   00000000   printf
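
For anyone decoding the Info column by hand, here is a minimal sketch of how the ELF32 r_info word splits into a symbol index and a relocation type. The helper names are illustrative and are not taken from shecc; the value 22 for R_ARM_JUMP_SLOT comes from the Arm ELF ABI.

```c
#include <stdint.h>

/* Relocation type number of R_ARM_JUMP_SLOT in the Arm ELF ABI (0x16). */
#define R_ARM_JUMP_SLOT 22u

/* These helpers mirror the standard ELF32_R_SYM / ELF32_R_TYPE macros.
 * The names are illustrative, not shecc identifiers. */
static uint32_t rel_sym(uint32_t r_info)
{
    return r_info >> 8; /* upper 24 bits: index into .dynsym */
}

static uint32_t rel_type(uint32_t r_info)
{
    return r_info & 0xffu; /* low 8 bits: relocation type */
}
```

With the two entries above, rel_sym(0x00000116) is 1 (__libc_start_main), rel_sym(0x00000216) is 2 (printf), and both have type 0x16, which readelf prints as R_ARM_JUMP_SLOT.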

However, when I ran the test under qemu-arm, the program reached main(), but the output of some printf() calls was incorrect.

$ qemu-arm -L /usr/arm-linux-gnueabi/ test
1 2 3
1 2 3 40830000
1 2 3 40830000 10224

Notice that the second and third printf() calls have more than four arguments, so the extra arguments must be passed on the stack according to the Arm calling convention.

I suspect that shecc places the wrong values on the stack, which makes glibc's printf() produce incorrect results.

@DrXiao
Collaborator Author

DrXiao commented Aug 13, 2025

FWIW, I disassembled the test executable:

test.asm
$ arm-linux-gnueabi-objdump -d test

test:     file format elf32-littlearm


Disassembly of section .text:

000100b4 <.text>:
   100b4:	e3a0b000 	mov	fp, #0
   100b8:	e3a0e000 	mov	lr, #0
   100bc:	e49d1004 	pop	{r1}		@ (ldr r1, [sp], #4)
   100c0:	e1a0200d 	mov	r2, sp
   100c4:	e52d2004 	push	{r2}		@ (str r2, [sp, #-4]!)
   100c8:	e52d0004 	push	{r0}		@ (str r0, [sp, #-4]!)
   100cc:	e3a0c000 	mov	ip, #0
   100d0:	e52dc004 	push	{ip}		@ (str ip, [sp, #-4]!)
   100d4:	e30000ec 	movw	r0, #236	@ 0xec
   100d8:	e3400001 	movt	r0, #1
   100dc:	e3a03000 	mov	r3, #0
   100e0:	eb000067 	bl	0x10284
   100e4:	e3a0007f 	mov	r0, #127	@ 0x7f
   100e8:	eb000005 	bl	0x10104
   100ec:	e1a09000 	mov	r9, r0
   100f0:	e1a0a001 	mov	sl, r1
   100f4:	e3008004 	movw	r8, #4
   100f8:	e3408000 	movt	r8, #0
   100fc:	e04dd008 	sub	sp, sp, r8
   10100:	e1a0c00d 	mov	ip, sp
   10104:	eb000005 	bl	0x10120
   10108:	e3008004 	movw	r8, #4
   1010c:	e3408000 	movt	r8, #0
   10110:	e08dd008 	add	sp, sp, r8
   10114:	e1a00000 	nop			@ (mov r0, r0)
   10118:	e3a07001 	mov	r7, #1
   1011c:	ef000000 	svc	0x00000000
   10120:	e1a00009 	mov	r0, r9
   10124:	e1a0100a 	mov	r1, sl
   10128:	eaffffff 	b	0x1012c
   1012c:	e50de004 	str	lr, [sp, #-4]
   10130:	e3008044 	movw	r8, #68	@ 0x44
   10134:	e3408000 	movt	r8, #0
   10138:	e04dd008 	sub	sp, sp, r8
   1013c:	e3000224 	movw	r0, #548	@ 0x224
   10140:	e3400001 	movt	r0, #1
   10144:	e3a01001 	mov	r1, #1
   10148:	e3a02002 	mov	r2, #2
   1014c:	e3a03003 	mov	r3, #3
   10150:	e58d0004 	str	r0, [sp, #4]
   10154:	e58d1008 	str	r1, [sp, #8]
   10158:	e58d200c 	str	r2, [sp, #12]
   1015c:	e58d3010 	str	r3, [sp, #16]
   10160:	e59d0004 	ldr	r0, [sp, #4]
   10164:	e59d1008 	ldr	r1, [sp, #8]
   10168:	e59d200c 	ldr	r2, [sp, #12]
   1016c:	e59d3010 	ldr	r3, [sp, #16]
+  10170:	eb000046 	bl	0x10290
   10174:	e300022e 	movw	r0, #558	@ 0x22e
   10178:	e3400001 	movt	r0, #1
   1017c:	e3a01001 	mov	r1, #1
   10180:	e3a02002 	mov	r2, #2
   10184:	e3a03003 	mov	r3, #3
   10188:	e3a04004 	mov	r4, #4
   1018c:	e58d0014 	str	r0, [sp, #20]
   10190:	e58d1018 	str	r1, [sp, #24]
   10194:	e58d201c 	str	r2, [sp, #28]
   10198:	e58d3020 	str	r3, [sp, #32]
   1019c:	e58d4024 	str	r4, [sp, #36]	@ 0x24
   101a0:	e59d0014 	ldr	r0, [sp, #20]
   101a4:	e59d1018 	ldr	r1, [sp, #24]
   101a8:	e59d201c 	ldr	r2, [sp, #28]
   101ac:	e59d3020 	ldr	r3, [sp, #32]
   101b0:	e59d4024 	ldr	r4, [sp, #36]	@ 0x24
+  101b4:	eb000035 	bl	0x10290
   101b8:	e300023b 	movw	r0, #571	@ 0x23b
   101bc:	e3400001 	movt	r0, #1
   101c0:	e3a01001 	mov	r1, #1
   101c4:	e3a02002 	mov	r2, #2
   101c8:	e3a03003 	mov	r3, #3
   101cc:	e3a04004 	mov	r4, #4
   101d0:	e3a05005 	mov	r5, #5
   101d4:	e58d0028 	str	r0, [sp, #40]	@ 0x28
   101d8:	e58d102c 	str	r1, [sp, #44]	@ 0x2c
   101dc:	e58d2030 	str	r2, [sp, #48]	@ 0x30
   101e0:	e58d3034 	str	r3, [sp, #52]	@ 0x34
   101e4:	e58d4038 	str	r4, [sp, #56]	@ 0x38
   101e8:	e58d503c 	str	r5, [sp, #60]	@ 0x3c
   101ec:	e59d0028 	ldr	r0, [sp, #40]	@ 0x28
   101f0:	e59d102c 	ldr	r1, [sp, #44]	@ 0x2c
   101f4:	e59d2030 	ldr	r2, [sp, #48]	@ 0x30
   101f8:	e59d3034 	ldr	r3, [sp, #52]	@ 0x34
   101fc:	e59d4038 	ldr	r4, [sp, #56]	@ 0x38
   10200:	e59d503c 	ldr	r5, [sp, #60]	@ 0x3c
+  10204:	eb000021 	bl	0x10290
   10208:	e3a00000 	mov	r0, #0
   1020c:	e1a00000 	nop			@ (mov r0, r0)
   10210:	e3008044 	movw	r8, #68	@ 0x44
   10214:	e3408000 	movt	r8, #0
   10218:	e08dd008 	add	sp, sp, r8
   1021c:	e51de004 	ldr	lr, [sp, #-4]
   10220:	e12fff3e 	blx	lr

Disassembly of section .plt:

00010270 <.plt>:
   10270:	e52de004 	push	{lr}		@ (str lr, [sp, #-4]!)
   10274:	e300a2a4 	movw	sl, #676	@ 0x2a4
   10278:	e340a001 	movt	sl, #1
   1027c:	e1a0e00a 	mov	lr, sl
   10280:	e59ef000 	ldr	pc, [lr]
   10284:	e300c2a8 	movw	ip, #680	@ 0x2a8
   10288:	e340c001 	movt	ip, #1
   1028c:	e59cf000 	ldr	pc, [ip]
+  10290:	e300c2ac 	movw	ip, #684	@ 0x2ac
   10294:	e340c001 	movt	ip, #1
   10298:	e59cf000 	ldr	pc, [ip]

0x10290 is the starting address of printf@plt. The text section contains three bl instructions that call printf(), and each one is preceded by several str and ldr instructions that manipulate the stack.
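
The branch targets and the GOT addresses in the listing can be checked by hand. The sketch below shows the arithmetic, assuming plain A32 encodings; the function names are illustrative and do not come from shecc.

```c
#include <stdint.h>

/* Target of an A32 B/BL instruction: the low 24 bits hold a signed word
 * offset, applied relative to the instruction address plus 8. */
static uint32_t bl_target(uint32_t insn_addr, uint32_t insn)
{
    uint32_t imm24 = insn & 0x00ffffffu;
    /* Sign-extend the 24-bit field to 32 bits. */
    if (imm24 & 0x00800000u)
        imm24 |= 0xff000000u;
    /* Unsigned wrap-around gives the right result for backward branches. */
    return insn_addr + 8u + (imm24 << 2);
}

/* Address built by a movw/movt pair: movw writes the low half-word and
 * movt writes the high half-word. */
static uint32_t movw_movt(uint32_t low16, uint32_t high16)
{
    return (high16 << 16) | (low16 & 0xffffu);
}
```

For example, bl_target(0x10170, 0xeb000046) gives 0x10290, and the same holds for the calls at 0x101b4 and 0x10204. movw_movt(684, 1) gives 0x102ac, which is the R_ARM_JUMP_SLOT offset that readelf reported for printf.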

@jserv jserv requested review from ChAoSUnItY and fennecJ August 14, 2025 03:11
@sysprog21 sysprog21 deleted a comment from bito-code-review bot Aug 14, 2025
@DrXiao DrXiao force-pushed the feat/support-dynamic-linking branch from b698f38 to 13dfa35 Compare August 16, 2025 12:59
@sysprog21 sysprog21 deleted a comment from bito-code-review bot Aug 17, 2025
@jserv jserv requested review from nosba0957 and vacantron August 19, 2025 08:22
@jserv
Collaborator

jserv commented Aug 19, 2025

Consider the minimal change below:

--- a/src/main.c
+++ b/src/main.c
@@ -85,7 +85,7 @@ int main(int argc, char *argv[])
     global_init();

     /* include libc */
-    if (libc)
+    if (libc && !dynlink)
         libc_generate();

     /* load and parse source code into IR */

It disables the built-in libc when --dynlink is enabled, since dynamic linking should use the system libc.

@jserv
Collaborator

jserv commented Aug 19, 2025

Notice that the second and third printf() calls have more than four arguments, so the extra arguments must be passed on the stack according to the Arm calling convention.

OP_assign just does a register-to-register move (__mov_r(__AL, rd, rn)). The key issue seems to be in how the register mapping works. The problem shows up under the Arm calling convention when more than 4 arguments are passed to variadic functions like printf(). Under the AAPCS (Procedure Call Standard for the Arm Architecture):

  • First 4 arguments (r0-r3) are passed in registers
  • Arguments beyond 4 must be pushed to the stack
  • The stack must be properly aligned and arguments placed correctly

This suggests that shecc handles parameter passing differently. Currently, the virtual registers (0-7) are mapped to ARM physical registers (r0-r7). Looking at the code, rd = ph2_ir->dest directly uses the virtual register number as the ARM register number. This means:

  • Virtual register 0 → ARM r0
  • Virtual register 1 → ARM r1
  • Virtual register 2 → ARM r2
  • Virtual register 3 → ARM r3
  • Virtual registers 4-7 → ARM r4-r7

In the ARM calling convention, arguments that do not fit in r0-r3 should go on the stack, not in r4-r7. This is definitely a bug: ir->dest = args++ assigns argument 4 to virtual register 4 (r4), argument 5 to virtual register 5 (r5), and so on, whereas the calling convention requires the fifth and later arguments (index 4 and up) to be placed on the stack.
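
To make the rule concrete, here is a minimal sketch of where a caller is expected to put each argument, assuming every argument is a 32-bit integer or pointer. The type and function names are hypothetical, and 64-bit or floating-point arguments follow additional rules that this sketch ignores.

```c
#include <assert.h>

/* Location of one outgoing argument under the AAPCS base rules.
 * Hypothetical type, used for illustration only. */
typedef struct {
    int in_register; /* 1: passed in r<reg>; 0: passed on the stack */
    int reg;         /* register number, valid when in_register is 1 */
    int sp_offset;   /* byte offset from sp at the call, valid otherwise */
} arg_loc_t;

/* Maps a 0-based argument index to its location. */
static arg_loc_t aapcs_arg_location(int index)
{
    arg_loc_t loc = {0, -1, -1};
    assert(index >= 0);
    if (index < 4) {
        /* Arguments 0-3 travel in r0-r3. */
        loc.in_register = 1;
        loc.reg = index;
    } else {
        /* Later arguments occupy consecutive words starting at [sp]. */
        loc.sp_offset = (index - 4) * 4;
    }
    return loc;
}
```

For printf("%x %x %x %x %x\n", 1, 2, 3, 4, 5), index 4 (the value 4) maps to [sp] and index 5 (the value 5) maps to [sp, #4], never to r4 or r5.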

@jserv
Collaborator

jserv commented Aug 19, 2025

The original code had a bug where function calls with more than 4 arguments violated the AAPCS:

  • Arguments 0-3 should go in registers r0-r3
  • Arguments 4+ should be placed on the stack

However, shecc was incorrectly placing all arguments (0-7) in registers r0-r7, causing stack-based arguments to be passed incorrectly.

Consider the changes below:

diff --git a/src/reg-alloc.c b/src/reg-alloc.c
index c66a061..51cd2ea 100644
--- a/src/reg-alloc.c
+++ b/src/reg-alloc.c
@@ -520,12 +520,42 @@ void reg_alloc(void)
                         is_pushing_args = 1;
                     }
 
-                    src0 = prepare_operand(bb, insn->rs1, -1);
-                    ir = bb_add_ph2_ir(bb, OP_assign);
-                    ir->src0 = src0;
-                    ir->dest = args++;
-                    REGS[ir->dest].var = insn->rs1;
-                    REGS[ir->dest].polluted = 0;
+                    /* Check if next call is to external function (for ARM
+                     * calling convention)
+                     */
+                    insn_t *next_insn = insn->next;
+                    func_t *target_func = NULL;
+                    bool is_external_call = false;
+
+                    /* Look ahead for the OP_call to determine if it's external
+                     */
+                    while (next_insn && next_insn->opcode == OP_push)
+                        next_insn = next_insn->next;
+                    if (next_insn && next_insn->opcode == OP_call) {
+                        target_func = find_func(next_insn->str);
+                        is_external_call = target_func && !target_func->bbs;
+                    }
+
+                    /* ARM calling convention for external functions: first 4
+                     * args in r0-r3, rest on stack
+                     */
+                    if (is_external_call && args >= 4) {
+                        /* Arguments 4+: keep on stack, don't load into
+                         * registers. The variable is already on stack from
+                         * earlier spill_alive().
+                         */
+                    } else {
+                        /* Normal behavior for internal functions or first 4
+                         * args
+                         */
+                        src0 = prepare_operand(bb, insn->rs1, -1);
+                        ir = bb_add_ph2_ir(bb, OP_assign);
+                        ir->src0 = src0;
+                        ir->dest = args;
+                        REGS[ir->dest].var = insn->rs1;
+                        REGS[ir->dest].polluted = 0;
+                    }
+                    args++;
                     break;
                 case OP_call:
                     callee_func = find_func(insn->str);
@@ -535,8 +565,8 @@ void reg_alloc(void)
                     ir = bb_add_ph2_ir(bb, OP_call);
                     strcpy(ir->func_name, insn->str);
                     if (dynlink) {
-                        func_t *target_func = find_func(ir->func_name);
-                        target_func->is_used = true;
+                        func_t *target_fn = find_func(ir->func_name);
+                        target_fn->is_used = true;
                     }
 
                     is_pushing_args = 0;

Before the fix: All arguments were always loaded into sequential registers (r0, r1, r2, r3, r4, r5, r6, r7).
After the fix:

  • Arguments 0-3: Still loaded into registers r0-r3 (normal behavior)
  • Arguments 4+ for external calls: Skip register assignment entirely, keeping them on the stack where spill_alive() already placed them

Before Fix (Incorrect)
For a call like printf("Format %d %d %d %d %d", 1, 2, 3, 4, 5):

  load %x0, stack  # arg 0 → r0 ✓
  load %x1, stack  # arg 1 → r1 ✓
  load %x2, stack  # arg 2 → r2 ✓
  load %x3, stack  # arg 3 → r3 ✓
  load %x4, stack  # arg 4 → r4 ❌ (violates ARM calling convention)
  load %x5, stack  # arg 5 → r5 ❌ (violates ARM calling convention)
  call @printf

After Fix (Correct)
For the same call:

  load %x0, stack  # arg 0 → r0 ✓
  load %x1, stack  # arg 1 → r1 ✓
  load %x2, stack  # arg 2 → r2 ✓
  load %x3, stack  # arg 3 → r3 ✓
                   # args 4,5 stay on stack ✓ (ARM compliant)
  call @printf

@jserv

This comment was marked as resolved.

@jserv
Collaborator

jserv commented Aug 24, 2025

I would like to ask @lecopzer to review this.

@DrXiao
Collaborator Author

DrXiao commented Aug 25, 2025

I tried to fix the Arm calling convention issue, but I found that the main problem does not seem to be register allocation. The actual problem is stack manipulation.

Consider a call such as printf("%x %x %x %x %x\n", 1, 2, 3, 4, 5). When compiled with arm-linux-gnueabi-gcc, it produces the following machine code:

   1042c:	e3a03005 	mov	r3, #5
+  10430:	e58d3004 	str	r3, [sp, #4]
   10434:	e3a03004 	mov	r3, #4
+  10438:	e58d3000 	str	r3, [sp]
   1043c:	e3a03003 	mov	r3, #3
   10440:	e3a02002 	mov	r2, #2
   10444:	e3a01001 	mov	r1, #1
   10448:	e59f0010 	ldr	r0, [pc, #16]	@ 10460 <main+0x40>
   1044c:	ebffffac 	bl	10304 <printf@plt>

Notice that only 4 and 5 are stored on the stack. The first four arguments are passed in r0-r3.

However, when compiled with shecc, the same call produces the following:

   10148:	e30001b4 	movw	r0, #436	@ 0x1b4
   1014c:	e3400001 	movt	r0, #1
   10150:	e3a01001 	mov	r1, #1
   10154:	e3a02002 	mov	r2, #2
   10158:	e3a03003 	mov	r3, #3
   1015c:	e3a04004 	mov	r4, #4
   10160:	e3a05005 	mov	r5, #5
+  10164:	e58d0004 	str	r0, [sp, #4]
+  10168:	e58d1008 	str	r1, [sp, #8]
+  1016c:	e58d200c 	str	r2, [sp, #12]
+  10170:	e58d3010 	str	r3, [sp, #16]
+  10174:	e58d4014 	str	r4, [sp, #20]
+  10178:	e58d5018 	str	r5, [sp, #24]
   1017c:	e59d0004 	ldr	r0, [sp, #4]
   10180:	e59d1008 	ldr	r1, [sp, #8]
   10184:	e59d200c 	ldr	r2, [sp, #12]
   10188:	e59d3010 	ldr	r3, [sp, #16]
   1018c:	e59d4014 	ldr	r4, [sp, #20]
   10190:	e59d5018 	ldr	r5, [sp, #24]
   10194:	eb00001b 	bl	0x10208

The machine code uses r0-r5 to hold the arguments, stores all of them on the stack, and then loads the values back from the stack. As a result, glibc's printf() receives incorrect values for the fourth and fifth variadic arguments (4 and 5), which it expects at [sp] and [sp, #4].
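
The effect can be replayed on a small model of the frame. This is only a sketch: it assumes sp at the bl is the same sp used by the str instructions (nothing in the listing adjusts it in between), and UNWRITTEN is a made-up stand-in for whatever happened to be at [sp]. All names are illustrative.

```c
#include <stdint.h>

#define FRAME_WORDS 8
#define UNWRITTEN 0xdeadbeefu /* stand-in for the stale word at [sp] */

/* Replays the six str instructions from the shecc listing into a model
 * of the frame, where frame[i] stands for the word at [sp, #4*i]. */
static void shecc_spill(uint32_t frame[FRAME_WORDS], const uint32_t regs[6])
{
    for (int i = 0; i < FRAME_WORDS; i++)
        frame[i] = UNWRITTEN;
    for (int r = 0; r < 6; r++)
        frame[r + 1] = regs[r]; /* str rN, [sp, #4 + 4*N] */
}

/* What an AAPCS callee reads as its n-th stack-passed argument
 * (n = 0 is the fifth argument overall). */
static uint32_t callee_stack_arg(const uint32_t frame[FRAME_WORDS], int n)
{
    return frame[n];
}
```

With regs = {0x101b4, 1, 2, 3, 4, 5}, the callee reads the stale word for the value that should be 4, and 0x101b4 (the format string address spilled from r0) for the value that should be 5. The earlier run shows the same pattern: the third printf() printed 10224, which is the address that the first listing stored at [sp, #4].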

I think shecc generates machine code that pushes all arguments onto stack because spill_alive() may eventually call spill_var() to generate the OP_global_store / OP_store opcodes in the phase 2 IR.

shecc/src/reg-alloc.c

Lines 581 to 584 in 6a97bd7

if (!is_pushing_args) {
    spill_alive(bb, insn);
    is_pushing_args = 1;
}

I'm not sure why shecc behaves as described above, but I will try to review and fix it for AAPCS compliance.

@jserv

This comment was marked as resolved.

@DrXiao DrXiao force-pushed the feat/support-dynamic-linking branch from 13dfa35 to 1aceea3 Compare August 26, 2025 14:55
@DrXiao
Collaborator Author

DrXiao commented Aug 26, 2025

I have rebased onto the master branch and updated the commits, so that we can review the updated implementation.

With the changes below, I can proceed with stage 1 compilation via make DYNLINK=1:

I have also temporarily created a new commit to apply part of the changes, and I will review everything to resolve any potential issues.

@sysprog21 sysprog21 deleted a comment from cubic-dev-ai bot Sep 4, 2025
cubic-dev-ai[bot]

This comment was marked as outdated.

@DrXiao DrXiao force-pushed the feat/support-dynamic-linking branch 2 times, most recently from f81adb8 to a96619d Compare September 7, 2025 07:59
@sysprog21 sysprog21 deleted a comment from cubic-dev-ai bot Sep 7, 2025

@cubic-dev-ai cubic-dev-ai bot left a comment

8 issues found across 15 files

lib/c.h, line 27

@@ -0,0 +1,47 @@
+
+/* string-related functions */
+int strlen(char *str);
+int strcmp(char *s1, char *s2);
+int strncmp(char *s1, char *s2, int len);
+char *strcpy(char *dest, char *src);

@cubic-dev-ai cubic-dev-ai bot Sep 7, 2025

Add const to strcmp parameters for correctness and compatibility with libc.

Suggested change:

-int strcmp(char *s1, char *s2);
+int strcmp(const char *s1, const char *s2);

src/elf.c, line 220

@@ -175,50 +193,68 @@ void elf_generate_header(void)
+        phdr.p_memsz = elf_interp->size + elf_relplt->size + elf_plt->size +
+                       elf_got->size + elf_dynstr->size + elf_dynsym->size +
+                       elf_dynamic->size; /* size in memory */
+        phdr.p_flags = 7;                 /* flags */
+        phdr.p_align = 4;                 /* alignment */
+        elf_write_blk(elf_program_header, &phdr, sizeof(elf32_phdr_t));

@cubic-dev-ai cubic-dev-ai bot Sep 7, 2025

The second PT_LOAD marks the interpreter/dynamic data segment executable; drop the execute bit to avoid an RWX mapping.
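
As a reading aid for p_flags = 7, here is a small sketch of the segment permission bits. The FLAG_* constants use the same values as PF_X, PF_W and PF_R in the ELF specification; the names and the helper are illustrative only.

```c
#include <stdint.h>

/* Same values as PF_X, PF_W and PF_R in the ELF specification. */
#define FLAG_X 0x1u
#define FLAG_W 0x2u
#define FLAG_R 0x4u

/* Returns 1 when a segment is both writable and executable, which is
 * the kind of mapping the review comment warns about. */
static int is_writable_and_executable(uint32_t p_flags)
{
    return (p_flags & FLAG_W) && (p_flags & FLAG_X);
}
```

A value of 7 is FLAG_R | FLAG_W | FLAG_X, while 6 would be read/write only. Note that the p_memsz sum quoted above includes elf_plt->size, so this segment appears to carry the PLT stubs as well; if that reading is right, clearing the execute bit would first require placing .plt in an executable segment. That is an inference from the snippet, not something confirmed in this thread.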

mk/riscv.mk, line 10

@@ -7,4 +7,12 @@ ARCH_DEFS = \
     \#define ARCH_PREDEFINED \"__riscv\" /* Older versions of the GCC toolchain defined __riscv__ */\n$\
     \#define ELF_MACHINE 0xf3\n$\
     \#define ELF_FLAGS 0\n$\
+    \#define DYN_LINKER \"/lib/ld-linux.so.3\"\n$\
+    \#define LIBC_SO \"libc.so.6\"\n$\
+    \#define PLT_FIXUP_SIZE 20\n$\

@cubic-dev-ai cubic-dev-ai bot Sep 7, 2025

DYN_LINKER points to ARM's /lib/ld-linux.so.3; set this to the correct RISC-V dynamic loader path to avoid an invalid interpreter in the produced executables.

src/arm-codegen.c, line 299

@@ -282,15 +296,23 @@ void emit_ph2_ir(ph2_ir_t *ph2_ir)
     case OP_call:
         func = find_func(ph2_ir->func_name);
-        emit(__bl(__AL, func->bbs->elf_offset - elf_code->size));
+        if (func->bbs)
+            ofs = func->bbs->elf_offset - elf_code->size;
+        else

@cubic-dev-ai cubic-dev-ai bot Sep 7, 2025

Potential null dereference of func; check func for NULL before accessing bbs to handle unresolved/external symbols safely.

src/main.c, line 61

@@ -58,6 +58,8 @@ int main(int argc, char *argv[])
             hard_mul_div = true;
         else if (!strcmp(argv[i], "--no-libc"))
             libc = false;
+        else if (!strcmp(argv[i], "--dynlink"))
+            dynlink = true;
         else if (!strcmp(argv[i], "-o")) {

@cubic-dev-ai cubic-dev-ai bot Sep 7, 2025

Combining --dynlink with --no-libc is not validated; the build may produce unusable output. Consider rejecting conflicting flags or reconciling libc inclusion with dynlink mode.

@DrXiao DrXiao force-pushed the feat/support-dynamic-linking branch 2 times, most recently from 2a23bfc to e76c3af Compare September 8, 2025 14:47
@sysprog21 sysprog21 deleted a comment from cubic-dev-ai bot Sep 8, 2025
@sysprog21 sysprog21 deleted a comment from cubic-dev-ai bot Sep 8, 2025

@cubic-dev-ai cubic-dev-ai bot left a comment

13 issues found across 15 files

src/arm-codegen.c, line 500

@@ -456,13 +474,42 @@ void emit_ph2_ir(ph2_ir_t *ph2_ir)
+                            (elf_code_start + elf_code->size)));
+        /* Goto the 'exit' code snippet if __libc_start_main returns */
+        emit(__mov_i(__AL, __r0, 127));
+        emit(__bl(__AL, 28));
 
-    /* start */

@cubic-dev-ai cubic-dev-ai bot Sep 8, 2025

A hard-coded branch to an 'exit' snippet is emitted in the dynlink path, but the exit snippet is only generated for static linking, leading to an invalid/incorrect branch target.

Collaborator Author

These instructions are wrong. I will fix them.

@cubic-dev-ai cubic-dev-ai bot left a comment

10 issues found across 15 files

mk/riscv.mk, line 10

@@ -7,4 +7,12 @@ ARCH_DEFS = \
     \#define ARCH_PREDEFINED \"__riscv\" /* Older versions of the GCC toolchain defined __riscv__ */\n$\
     \#define ELF_MACHINE 0xf3\n$\
     \#define ELF_FLAGS 0\n$\
+    \#define DYN_LINKER \"/lib/ld-linux.so.3\"\n$\
+    \#define LIBC_SO \"libc.so.6\"\n$\
+    \#define PLT_FIXUP_SIZE 20\n$\

@cubic-dev-ai cubic-dev-ai bot Sep 8, 2025

The dynamic linker path is for ARM; RISC-V uses an arch-specific ld-linux-riscv32-*.so.1. Use the correct RISC-V interpreter path or make it configurable.

src/arm-codegen.c, line 320

@@ -299,7 +314,10 @@ void emit_ph2_ir(ph2_ir_t *ph2_ir)
+        if (func->bbs)
+            ofs = elf_code_start + func->bbs->elf_offset;
+        else
+            ofs = elf_plt_start + func->plt_offset;
         emit(__movw(__AL, __r8, ofs));
         emit(__movt(__AL, __r8, ofs));

@cubic-dev-ai cubic-dev-ai bot Sep 8, 2025

Taking the address of an external function unconditionally uses the PLT when the function body is missing; guard this with dynlink to prevent invalid addresses in static builds.

src/arm-codegen.c, line 303

@@ -287,7 +297,12 @@ void emit_ph2_ir(ph2_ir_t *ph2_ir)
+        if (func->bbs)
+            ofs = func->bbs->elf_offset - elf_code->size;
+        else
+            ofs = (elf_plt_start + func->plt_offset) -
+                  (elf_code_start + elf_code->size);
+        emit(__bl(__AL, ofs));

@cubic-dev-ai cubic-dev-ai bot Sep 8, 2025

Calls to external functions unconditionally use the PLT when the function body is missing; this should be gated by dynlink to avoid referencing a non-existent PLT in static builds.

Collaborator Author

TODO:

if (func->bbs)
    ofs = func->bbs->elf_offset - elf_code->size;
else if (dynlink)
    ofs = (elf_plt_start + func->plt_offset) -
          (elf_code_start + elf_code->size);
else
    fatal("The function is not implemented");

src/arm-codegen.c, line 521

@@ -471,24 +518,26 @@ void code_generate(void)
-    emit(__mov_r(__AL, __r0, __r0));
-    emit(__mov_i(__AL, __r7, 1));
-    emit(__svc());
+    if (!dynlink) {
+        /* exit - only for static linking */
+        emit(__movw(__AL, __r8, GLOBAL_FUNC->stack_size));

@cubic-dev-ai cubic-dev-ai bot Sep 8, 2025

The unconditional branch still skips 56 bytes even when the exit/syscall block is omitted for dynlink, likely causing control flow to jump into the wrong place. Make the branch size conditional (or omit it) when dynlink is enabled.

lib/c.h, line 31

@@ -0,0 +1,47 @@
+int strncmp(char *s1, char *s2, int len);
+char *strcpy(char *dest, char *src);
+char *strncpy(char *dest, char *src, int len);
+char *memcpy(char *dest, char *src, int count);
+int memcmp(void *s1, void *s2, int n);
+void *memset(void *s, int c, int n);

@cubic-dev-ai cubic-dev-ai bot Sep 8, 2025

memcpy should use void * for pointers, const on src, and size_t for count to match libc and avoid size truncation.
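
For comparison with the declarations in lib/c.h, the sketch below spells out the standard prototypes as function pointer types. A conforming compiler diagnoses each initializer if the system header declares the function with a different type. The variable names are illustrative.

```c
#include <stddef.h>
#include <string.h>

/* Each pointer type documents the ISO C prototype of the function it is
 * initialized with; a mismatch would draw a compiler diagnostic. */
static void *(*const std_memcpy)(void *, const void *, size_t) = memcpy;
static int (*const std_strcmp)(const char *, const char *) = strcmp;
static size_t (*const std_strlen)(const char *) = strlen;
```

Whether shecc's parser can accept these exact signatures (const, void *, size_t) is a separate question that this sketch does not answer.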

func->bbs = arena_calloc(BB_ARENA, 1, sizeof(basic_block_t));
/* In dynamic mode, __syscall won't be implemented but needs to exist
* for parsing the built-in libc. It will be treated as external. */
func->bbs = NULL;
Copy link

@cubic-dev-ai cubic-dev-ai bot Sep 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assigning NULL to func->bbs causes a null dereference in RISC-V cfg_flatten where func->bbs is used unconditionally.

Prompt for AI agents
Address the following comment on src/parser.c at line 5237:

<comment>Assigning NULL to func->bbs causes a null dereference in RISC-V cfg_flatten where func->bbs is used unconditionally.</comment>

<file context>
@@ -5232,7 +5232,13 @@ void parse_internal(void)
-    func->bbs = arena_calloc(BB_ARENA, 1, sizeof(basic_block_t));
+    /* In dynamic mode, __syscall won't be implemented but needs to exist
+     * for parsing the built-in libc. It will be treated as external. */
+    func->bbs = NULL;
+    if (!dynlink) {
+        /* Otherwise, allocate a basic block to implement in static mode. */
</file context>
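One way to avoid the dereference described above is to skip functions that have no basic blocks before walking them. The sketch below uses simplified, hypothetical stand-ins for `func_t` and `basic_block_t` (the real shecc structures have more fields), and `flatten_blocks` is an illustrative name, not the actual `cfg_flatten`.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified, hypothetical stand-ins for the shecc structures. */
typedef struct basic_block {
    struct basic_block *next;
    int visited;
} basic_block_t;

typedef struct func {
    basic_block_t *bbs; /* NULL for functions treated as external */
} func_t;

/* Walks the block list and returns how many blocks were visited.
 * External functions (bbs == NULL) are skipped instead of dereferenced. */
int flatten_blocks(func_t *func)
{
    if (!func || !func->bbs)
        return 0;

    int count = 0;
    for (basic_block_t *bb = func->bbs; bb; bb = bb->next) {
        bb->visited = 1;
        count++;
    }
    return count;
}
```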

@jserv
Collaborator

jserv commented Sep 8, 2025

@DrXiao, you can click the "Resolve conversation" button once you are confident that you have addressed the review messages or suggestions from @cubic-dev-ai.

@DrXiao DrXiao force-pushed the feat/support-dynamic-linking branch from e76c3af to 8a22ab9 Compare September 9, 2025 14:37
@DrXiao
Collaborator Author

DrXiao commented Sep 9, 2025

8a22ab9 is a temporary fix for the Arm calling convention, so that the stage 0 compiler can handle the following test code and produce an executable that runs as expected:

int main(int argc, char **argv)
{
    int ret[8], c = 0;
    ret[c++] = printf("%d %p %c %c\n", argc, argv, 'A', 'B');
    ret[c++] = printf("%x %x %x %x %x %x\n", 1, 2, 3, 4, 5, 6);
    ret[c++] = printf("%x %x %x %x\n", 1, 2, 3, 4);
    ret[c++] = printf("%x %x %x %x %x %x %x\n", 1, 2, 3, 4, 5, 6, 7);
    ret[c++] = printf("%x %x %x %x %x\n", 1, 2, 3, 4, 5);
    ret[c++] = printf("%x %x %x\n", 1, 2, 3);
    printf("ret[] =");
    for (int i = 0; i < c; i++)
        printf(" %d", ret[i]);
    printf("\n");
    return 0;
}
$ qemu-arm test
1 0x407fff64 A B
1 2 3 4 5 6
1 2 3 4
1 2 3 4 5 6 7
1 2 3 4 5
1 2 3
ret[] = 17 12 8 14 10 6

However, the bootstrapping process still fails, and I will keep looking for the remaining issues.
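For context, the Arm procedure call standard (AAPCS) passes the first four word-sized arguments in r0 to r3 and the rest on the stack, which is why the test above mixes calls with fewer and more than four arguments. A minimal sketch of that split follows; the helper names and `ARM_ARG_REGS` are hypothetical and this is not the code from 8a22ab9.

```c
#include <assert.h>

#define ARM_ARG_REGS 4 /* r0-r3 carry the first four word-sized arguments */

/* Number of arguments that must be passed on the stack. */
int stack_arg_count(int args)
{
    return args > ARM_ARG_REGS ? args - ARM_ARG_REGS : 0;
}

/* Stack space, in bytes, taken by those arguments (4 bytes per word). */
int stack_arg_bytes(int args)
{
    return stack_arg_count(args) * 4;
}
```

For example, the `printf()` call with seven `%x` values has eight arguments including the format string, so four of them go on the stack.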


edit1:
Use alias to simplify the use of qemu-arm.

$ alias qemu-arm="qemu-arm -L /usr/arm-linux-gnueabi/"

@jserv
Collaborator

jserv commented Sep 10, 2025

edit1: Use alias to simplify the use of qemu-arm.

$ alias qemu-arm="qemu-arm -L /usr/arm-linux-gnueabi/"

FYI: AMaCC has handy detection logic for the dynamic linker: https://github.com/jserv/amacc/blob/master/mk/arm.mk

@DrXiao
Collaborator Author

DrXiao commented Sep 10, 2025

Consider the following code:

#include <stdbool.h>

bool dynlink;
int main(void)
{
    dynlink = false;
    printf("dynlink = %p\n", &dynlink);
    printf("dynlink = %p\n", &dynlink);
    return 0;
}

When the stage 0 compiler compiles this code, the resulting executable prints an incorrect result (the two addresses should be identical):

$ out/shecc --dynlink -o test test.c
$ qemu-arm -L /usr/arm-linux-gnueabi/ test
dynlink = 0x407ffdec
dynlink = 0x407ffd08

After disassembling the generated executable with arm-linux-gnueabi-objdump, I believe the cause is that register r12, which holds the global pointer, is modified during the first printf() call. Since the executable relies on r12 to access the "global stack", it reads a wrong value from r12 after printf() returns, which yields an incorrect address for dynlink.

FWIW, applying the following changes allows the updated stage 0 compiler to handle the above case correctly, but the bootstrapping process still fails.

diff --git a/src/arm-codegen.c b/src/arm-codegen.c
index ede6e8d..aab2904 100644
--- a/src/arm-codegen.c
+++ b/src/arm-codegen.c
@@ -71,6 +71,8 @@ void update_elf_offset(ph2_ir_t *ph2_ir)
     case OP_write:
     case OP_push:
     case OP_pop:
+    case OP_push_gp:
+    case OP_pop_gp:
     case OP_jump:
     case OP_call:
     case OP_load_func:
@@ -292,6 +294,12 @@ void emit_ph2_ir(ph2_ir_t *ph2_ir)
     case OP_pop:
         emit(__add_i(__AL, __sp, __sp, rn * 4));
         return;
+    case OP_push_gp:
+        emit(__push_reg(__AL, __r12));
+        break;
+    case OP_pop_gp:
+        emit(__pop_word(__AL, __r12));
+        break;
     case OP_branch:
         emit(__teq(rn));
         if (ph2_ir->is_branch_detached) {
diff --git a/src/defs.h b/src/defs.h
index 4244c45..7af8693 100644
--- a/src/defs.h
+++ b/src/defs.h
@@ -269,6 +269,8 @@ typedef enum {
     OP_indirect, /* indirect call with function pointer */
     OP_return,   /* explicit return */
     OP_pop,      /* eliminate arguments */
+    OP_push_gp,  /* preserve global pointer */
+    OP_pop_gp,   /* restore global pointer */
 
     OP_allocat, /* allocate space on stack */
     OP_assign,
diff --git a/src/reg-alloc.c b/src/reg-alloc.c
index 2b2de3c..407964a 100644
--- a/src/reg-alloc.c
+++ b/src/reg-alloc.c
@@ -770,6 +770,7 @@ void reg_alloc(void)
 
                     if (dynlink) {
                         callee_func->is_used = true;
+                        ir = bb_add_ph2_ir(bb, OP_push_gp);
                         /* Push args to stack for Arm output */
                         if (!callee_func->bbs && args > 4) {
                             int regs = 0;
@@ -789,6 +790,7 @@ void reg_alloc(void)
                             ir = bb_add_ph2_ir(bb, OP_pop);
                             ir->src0 = args - 4;
                         }
+                        ir = bb_add_ph2_ir(bb, OP_pop_gp);
                     }
 
                     is_pushing_args = false;
$ out/shecc --dynlink -o test test.c
$ qemu-arm -L /usr/arm-linux-gnueabi/ test
dynlink = 0x407ffdec
dynlink = 0x407ffdec

# After applying the above changes, the stage 1 compiler can execute
# global_init() and libc_generate(), but fails at parse().
$ qemu-arm -L /usr/arm-linux-gnueabi/ out/shecc-stage1.elf -o shecc src/main.c
[Error]: Unexpected token
Occurs at source location 0.
/*
^ Error occurs here
qemu: uncaught target signal 6 (Aborted) - core dumped
Aborted (core dumped)
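The save/restore idea behind OP_push_gp and OP_pop_gp in the diff above can be modeled in plain C. Under the AAPCS, r12 (IP) is an intra-procedure-call scratch register that a call through a PLT stub may overwrite, so a value kept there has to be preserved around external calls. Everything in this sketch (`toy_cpu_t`, `external_call`, the clobber value) is a hypothetical toy model, not shecc code.

```c
#include <assert.h>

/* Toy model of the relevant machine state. */
typedef struct {
    unsigned r12;       /* holds the "global pointer" */
    unsigned stack[16]; /* tiny upward-growing stack */
    int sp;             /* index of the next free slot */
} toy_cpu_t;

/* Counterparts of OP_push_gp / OP_pop_gp in this model. */
static void push_gp(toy_cpu_t *cpu) { cpu->stack[cpu->sp++] = cpu->r12; }
static void pop_gp(toy_cpu_t *cpu) { cpu->r12 = cpu->stack[--cpu->sp]; }

/* Stands in for a call through the PLT, which may clobber r12. */
static void external_call(toy_cpu_t *cpu) { cpu->r12 = 0xdeadbeefu; }

/* Without protection, r12 no longer holds the global pointer afterwards. */
unsigned call_unprotected(toy_cpu_t *cpu)
{
    external_call(cpu);
    return cpu->r12;
}

/* With push/pop around the call, r12 is restored. */
unsigned call_protected(toy_cpu_t *cpu)
{
    push_gp(cpu);
    external_call(cpu);
    pop_gp(cpu);
    return cpu->r12;
}
```

This only illustrates why the second printf() saw a different address; it says nothing about the remaining parse() failure in the stage 1 compiler.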
