Skip to content

Conversation

@pau-hedgehog
Copy link
Contributor

No description provided.

@pau-hedgehog pau-hedgehog self-assigned this Sep 1, 2025
@pau-hedgehog pau-hedgehog added ci:+release Enable VLAB release tests ci:+hlab Enable hybrid VLAB tests labels Sep 1, 2025
@github-actions
Copy link

github-actions bot commented Sep 1, 2025

Test Results

  8 files  32 suites   2h 4m 48s ⏱️
 20 tests  6 ✅  14 💤 0 ❌
160 runs  48 ✅ 112 💤 0 ❌

Results for commit 95db236.

♻️ This comment has been updated with latest results.

@pau-hedgehog pau-hedgehog marked this pull request as ready for review September 2, 2025 07:17
@pau-hedgehog pau-hedgehog requested a review from a team as a code owner September 2, 2025 07:17
Copy link
Contributor

@edipascale edipascale left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good, just a couple of small things

@pau-hedgehog pau-hedgehog force-pushed the pau/dhcp_renew_rel_test branch 2 times, most recently from d446dcf to 3f07c55 Compare September 3, 2025 16:22
@pau-hedgehog pau-hedgehog removed the ci:+hlab Enable hybrid VLAB tests label Sep 3, 2025
@edipascale
Copy link
Contributor

So I'm a little conflicted about this test the way it currently is, because I do not believe it's testing DHCP renewals anymore. Instead we are reconfiguring the interfaces and getting brand new leases. It's not a bad thing to have additional coverage on this, but it's not what we're saying we are doing, right? For instance, l3vni actual renewals should not have the short->long lease cycle, or at least that's not what I would expect. I think to test renewals you need to set the lease to a reasonably short time via the DHCPOptions and wait long enough for it to expire, then check what you get afterwards. It might be possible to force a renew earlier via networkctl, although this question appears to imply hat it doesn't really work that way, and your experience seems to confirm it.

It's also easy to break this if we ever change the lease time values we use in the DHCP server. This is inevitable to a certain extent, although we could at least set the lease time as a DHCPOption for the "large lease" value instead of using the default and risking it changing.

None of the above means that we should reject the PR, but I think it's worth thinking about what we're doing.

The one thing I think we should change regardless is the code to parse the DHCP lease - instead of duplicating it let's use the existing function or adapt it if it doesn't fit your needs.

@pau-hedgehog
Copy link
Contributor Author

So I'm a little conflicted about this test the way it currently is, because I do not believe it's testing DHCP renewals anymore. Instead we are reconfiguring the interfaces and getting brand new leases. It's not a bad thing to have additional coverage on this, but it's not what we're saying we are doing, right? For instance, l3vni actual renewals should not have the short->long lease cycle, or at least that's not what I would expect. I think to test renewals you need to set the lease to a reasonably short time via the DHCPOptions and wait long enough for it to expire, then check what you get afterwards. It might be possible to force a renew earlier via networkctl, although this question appears to imply hat it doesn't really work that way, and your experience seems to confirm it.

It's also easy to break this if we ever change the lease time values we use in the DHCP server. This is inevitable to a certain extent, although we could at least set the lease time as a DHCPOption for the "large lease" value instead of using the default and risking it changing.

None of the above means that we should reject the PR, but I think it's worth thinking about what we're doing.

The one thing I think we should change regardless is the code to parse the DHCP lease - instead of duplicating it let's use the existing function or adapt it if it doesn't fit your needs.

All the points you raise are 100% valid. Let me iterate once more to address all the concerns

@pau-hedgehog pau-hedgehog marked this pull request as draft September 4, 2025 08:19
@pau-hedgehog pau-hedgehog force-pushed the pau/dhcp_renew_rel_test branch 2 times, most recently from 46e14bf to d23ea05 Compare September 25, 2025 17:48
@Frostman Frostman requested a review from Copilot September 30, 2025 20:54
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds a new DHCP renewal test to verify that VPC-attached interfaces properly handle DHCP lease renewals with shorter lease times. The test configures VPC DHCP options with a reduced lease time and monitors the automatic renewal process across servers.

  • Adds comprehensive DHCP renewal testing with concurrent execution support
  • Implements VPC DHCP configuration updates with proper reversion handling
  • Adds lease parsing and validation utilities for monitoring renewal behavior

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

@pau-hedgehog pau-hedgehog changed the title chore: add DHCP renew release test Add DHCP renew release test Oct 7, 2025
@pau-hedgehog pau-hedgehog force-pushed the pau/dhcp_renew_rel_test branch 2 times, most recently from 1b5ae33 to f796072 Compare November 3, 2025 16:40
@pau-hedgehog
Copy link
Contributor Author

So I'm a little conflicted about this test the way it currently is, because I do not believe it's testing DHCP renewals anymore. Instead we are reconfiguring the interfaces and getting brand new leases. It's not a bad thing to have additional coverage on this, but it's not what we're saying we are doing, right? For instance, l3vni actual renewals should not have the short->long lease cycle, or at least that's not what I would expect. I think to test renewals you need to set the lease to a reasonably short time via the DHCPOptions and wait long enough for it to expire, then check what you get afterwards. It might be possible to force a renew earlier via networkctl, although this question appears to imply hat it doesn't really work that way, and your experience seems to confirm it.
It's also easy to break this if we ever change the lease time values we use in the DHCP server. This is inevitable to a certain extent, although we could at least set the lease time as a DHCPOption for the "large lease" value instead of using the default and risking it changing.
None of the above means that we should reject the PR, but I think it's worth thinking about what we're doing.
The one thing I think we should change regardless is the code to parse the DHCP lease - instead of duplicating it let's use the existing function or adapt it if it doesn't fit your needs.

All the points you raise are 100% valid. Let me iterate once more to address all the concerns

Test has been updated to test the renew as suggested

@pau-hedgehog pau-hedgehog requested a review from Copilot November 3, 2025 17:08
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 7 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@github-actions
Copy link

github-actions bot commented Nov 3, 2025

Release Tests

 12 files  48 suites   3h 36m 30s ⏱️
 25 tests  7 ✅  16 💤 2 ❌
300 runs  79 ✅ 219 💤 2 ❌

For more details on these failures, see this check.

Results for commit 2b280e9.

♻️ This comment has been updated with latest results.

@pau-hedgehog pau-hedgehog force-pushed the pau/dhcp_renew_rel_test branch from f796072 to 7376c67 Compare November 4, 2025 00:01
@pau-hedgehog pau-hedgehog force-pushed the pau/dhcp_renew_rel_test branch from 7376c67 to 2b280e9 Compare November 4, 2025 08:04
@pau-hedgehog pau-hedgehog requested a review from Copilot November 4, 2025 13:09
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 5 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

}

// Look for short lease (L3VNI 2-step process)
if !shortLeaseFound && info.ValidLifetime >= 5 && info.ValidLifetime <= 15 {
Copy link

Copilot AI Nov 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The magic numbers 5 and 15 for the short lease detection range should be defined as named constants to clarify what this range represents in the L3VNI 2-step process.

Copilot uses AI. Check for mistakes.
}

// Wait for full lease
if info.ValidLifetime >= int(shortLeaseTime)-10 {
Copy link

Copilot AI Nov 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tolerance value of 10 seconds should be defined as a named constant to improve code clarity and maintainability.

Copilot uses AI. Check for mistakes.
@pau-hedgehog pau-hedgehog marked this pull request as ready for review November 4, 2025 13:14
Copy link
Contributor

@edipascale edipascale left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @pau-hedgehog, this is now a proper DHCP renewal test. We could merge it as is, but I believe the code can be made much simpler

// Sets VPC DHCPOptions to a shorter lease and reconfigures servers via networkctl
// Waits for DHCP renewal and checks lease time
func (testCtx *VPCPeeringTestCtx) dhcpRenewalTest(ctx context.Context) (bool, []RevertFunc, error) {
servers, err := testCtx.getServersWithVPCAttachments(ctx)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand what we're doing here - you've created a helper function where you are already fetching a bunch of things, but you are only returning the servers - and then you are fetching some of that stuff again in the caller. Can't we fetch everything that we need in the helper directly?

More in general, if the aim is to find any VPC that has some severs attached to it, there's code already to do that - see lines 1416-1460 in the static external test

slog.Info("Testing DHCP renewal", "servers", len(testServers), "vpc", testVPC.Name)

// Save original DHCP options for ALL subnets in the VPC
originalSubnetOptions := make(map[string]*vpcapi.VPCDHCPOptions)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you should be able to use the autogenerated deep copy functions, but also why can't we just update the lease time without overwriting the entire options? then we would just need to store the old lease time and replace it in the same way

continue
}

_, err = GetServerNetconfCmd(conn, subnet.VLAN, testCtx.setupOpts.HashPolicy)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's the purpose of this if we are not storing the command?

return fmt.Errorf("getting ssh config for server %s: %w", serverName, err)
}

_, _, err = ssh.Run(ctx, fmt.Sprintf("sudo networkctl reconfigure %s", ifName))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I don't know if this produces any stderr output in case of failure, but if so we could add it to the error message to ease troubleshooting. when in doubt I generally add it

return info, nil
}

func checkDHCPLeaseInRange(leaseInfo *DHCPLeaseInfo, expectedLease, tolerance int) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there's already a checkDHCPLease function in the release test. Let's unify them if at all possible, this file is already huge as it is

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ci:+release Enable VLAB release tests

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants