Skip to content

Conversation

@wmitsuda
Copy link
Member

fixes #10285

however, looking at the code, a few lines above, I'm wondering why tx actually is nil here, can tx, err = e.db.BeginRwNosync(ctx) return tx == nil when there is no error?

not clear to me what value tx is capturing inside the closure, I'm afraid this fix may hide another bug, I'm a little confused.

@Giulio2002
Copy link
Contributor

what bug are we talking about?

@Giulio2002
Copy link
Contributor

oh wait when the fuck did this get here

@wmitsuda
Copy link
Member Author

see the crash in the original issue

@Giulio2002
Copy link
Contributor

Let's wait for Alex to check this. @somnathb1 fixed a DB inflation bug here. while @AskAlexSharov introduced it so they have better understanding of why that was there

@wmitsuda
Copy link
Member Author

actually, now I see in the code that tx can be reassigned to nil later and there is a goto, so I think it is capturing that reassigment.

not sure if that was the original intention of the code because of the weird closure capture behaviors in golang, so may you take a deeper look at it @Giulio2002 pls?

@wmitsuda
Copy link
Member Author

Let's wait for Alex to check this. @somnathb1 fixed a DB inflation bug here. while @AskAlexSharov introduced it so they have better understanding of why that was there

yeah, it is confusing, better have more eyes on it because I'm not an expert in this flow.

@Giulio2002
Copy link
Contributor

Regardless, this fix makes sense. so it should be still approved. I think it was an oversight. I am looking at the code right now and I am 99.9999% confident it is a genuine fix. However, I want to see what others think too. I can make mistakes :)

@somnathb1
Copy link
Contributor

The code overall around this is quite confusing and needs quite the focus to understand.
Setting tx to nil somewhere and then checking later for a rollback() call - pretty twisted to begin with (Not this PR)

@wmitsuda can you please check tx.Rollback() isn't being called twice, courtesy L137 and L335?

@AskAlexSharov
Copy link
Collaborator

  1. All destructors must be double-call-safe. tx.Rollback() is. We also can make it nil-call-safe.

  2. Why it's here: because E3 require support of --sync.loop.block.limit (enabled by default) in both --internalcl and --externalcl cases

  3. @somnathb1 did revert this feature in E2 Revert "StageSenders: --sync.loop.block.limit support" #10060 - but I was not able to merge it in E3 - because E3 also need fill rawdbv3.TxNums index which is non-trivial in updateForkChoice. But there was no follow-up Ticket/PR to simplify updateForkChoice after revert.

  4. updateForkChoice needs to be refactored to 2 functions: loop and loopIteration. (Similar to what we have in StageLoop + StageLoopIteration). Then loopIteration func will able to begin/commit/rollback and run all stages for 1 --sync.loop.block.limit (without new Begin and goto).

  5. I'm scare to refactor updateForkChoice because it's not like usual golang code - it does send errors to channel and in many cases it send special status-code to channel (i scare to loose some special status-code).

  6. Implementation of txs here:


TooBigJumpStep:
   tx = begin()
    if tooBigJump {
       runStagesFor1KBlocks(initialCycle=true)
   } else {
       runStagesForAllBlocks(initialCycle=false)
   }
   tx.Commit()
   tx = nil 
   if tooBigJump {
       goto TooBigJumpStep
   }
  1. Why closure:
type A struct {
	v int
}

func (a *A) Print() {
	fmt.Printf("Print: %d\n", a.v)
}
func TestName(t *testing.T) {
	a := &A{v: 1}
	defer func() {
		fmt.Printf("Alex: %d\n", a.v)
	}()
	defer a.Print()
	a = nil
	a = &A{v: 2}
}

Output:

Print: 1
Alex: 2

defer a.Print evaluate expression in non-lazy way (right now) and only pointer to func it exec at the end (and this func belongs to first object).
while defer func(int) {}(5+5) also execute expression 5+5 (right now) and execute pointer to func at the end (this func will go to global variable a and use it's value. current value of a at end of test is 2).


So - we need split updateForkChoice func to 2 - then we will not need worry about closures/rollbacks_if_nil/etc...
or even better: we need take a look at feature --sync.loop.block.limit and make sure it supported everywhere in consistent way (maybe add tests).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[e3] crash on sepolia

5 participants