Skip to content

Conversation

@opswill
Copy link
Contributor

@opswill opswill commented Oct 29, 2025

Change summary

pppoe interfaces may lose QoS config on reconnect. This PR reapplies the QoS policy when the interface comes up.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Code style update (formatting, renaming)
  • Refactoring (no functional changes)
  • Migration from an old Vyatta component to vyos-1x, please link to related PR inside obsoleted component
  • Other (please describe):

Related Task(s)

https://vyos.dev/T7965

Related PR(s)

How to test / Smoketest result

After the server-side disconnection,and when the pppoe0 link is up:

vyos@vyos:/etc/ppp/ip-up.d$ show qos cake interface pppoe0
qdisc cake 1: root refcnt 2 bandwidth 60Mbit diffserv3 flows nat nowash no-ack-filter split-gso rtt 100ms raw overhead 0
 Sent 666822251 bytes 3329477 pkt (dropped 16, overlimits 396425 requeues 0)
 backlog 0b 0p requeues 0
 memory used: 474624b of 4Mb
 capacity estimate: 60Mbit
 min/max network layer size:           29 /    1480
 min/max overhead-adjusted size:       29 /    1480
 average network hdr offset:            0

                   Bulk  Best Effort        Voice
  thresh       3750Kbit       60Mbit       15Mbit
  target            5ms          5ms          5ms
  interval        100ms        100ms        100ms
  pk_delay          0us        162us         10us
  av_delay          0us         15us          3us
  sp_delay          0us          1us          2us
  backlog            0b           0b           0b
  pkts                0      3326153         3340
  bytes               0    666668856       173760
  way_inds            0       247740            0
  way_miss            0       179041            7
  way_cols            0            0            0
  drops               0           16            0
  marks               0            0            0
  ack_drop            0            0            0
  sp_flows            0            3            0
  bk_flows            0            1            0
  un_flows            0            0            0
  max_len             0        21600           60
  quantum           300         1514          457

Checklist:

  • I have read the CONTRIBUTING document
  • I have linked this PR to one or more Phabricator Task(s)
  • I have run the components SMOKETESTS if applicable
  • My commit headlines contain a valid Task id
  • My change requires a change to the documentation
  • I have updated the documentation accordingly

@github-actions
Copy link

github-actions bot commented Oct 29, 2025


PR title does not match the required format

@github-actions
Copy link

github-actions bot commented Oct 29, 2025

All contributors have signed the CLA ✍️ ✅
Posted by the CLA Assistant Lite bot.

@opswill
Copy link
Contributor Author

opswill commented Oct 29, 2025

I have read the CLA Document and I hereby sign the CLA

vyosbot added a commit to vyos/vyos-cla-signatures that referenced this pull request Oct 29, 2025
Copy link
Member

@c-po c-po left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

File is named 99-vyos-ppp-qos and thus executed prior to 99-vyos-pppoe-callback - please rename it to 99-vyos-pppoe-qos

@opswill
Copy link
Contributor Author

opswill commented Oct 29, 2025

@c-po I have renamed the file name

Copy link
Member

@c-po c-po left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed the file permissions to be executable.

Call QoS CLI helper script when PPPoE interface reconnected and QoS was configured on interface.

@c-po c-po added bp/sagitta Create automatic backport for sagitta LTS version bp/circinus Create automatic backport for circinus labels Oct 29, 2025
Copy link
Member

@sever-sever sever-sever left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-apply QoS config for PPP interface
I have not tested it

# Check if QoS is configured for the interface
qos_path = ['qos', 'interface', interface]
if conf.exists(qos_path):
call('/usr/libexec/vyos/conf_mode/qos.py')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looping in @jestabro - Do you think this is the best way to call the conf script from here?

@sever-sever sever-sever self-requested a review November 4, 2025 15:43
Copy link
Contributor

@jestabro jestabro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not considered for merge as is, for performance reasons: (1) at the very least, it should not be a python script (2) revisit other approaches.

@sever-sever sever-sever marked this pull request as draft November 6, 2025 16:43
Copy link
Member

@c-po c-po left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @opswill,

after an internal review with all sub-teams, I need to withdraw my initial approval of this PR. While your solution works nicely for client connections, it poses serious scalability issues when VyOS operates as a BRAS.

The main concern is that this helper is implemented in Python. Each time a PPP interface is created by the kernel - which includes both PPPoE client interfaces and PPP subscriber interfaces when VyOS acts as a BRAS - the system incurs the Python interpreter startup overhead.

To put this in perspective: assuming a Python interpreter startup time of around one second, a scenario with 1000 subscribers connecting over 60 seconds would add roughly 1000 seconds of cumulative overhead. That translates to the control plane being overloaded just handling connection events, severely impacting performance and subscriber experience.

When @jestabro mentioned “other approaches,” he was referring to a possible short-term mitigation: implementing this helper as a Bash script instead. The script could quickly exit if the interface name does not start with pppoe, so PPP subscriber interfaces would only incur the lightweight Bash startup penalty, avoiding the heavy Python startup cost.

In the longer term, @jestabro and I have discussed the idea of developing a vyos-netlinkd daemon. Such a background daemon could listen for NETLINK messages from the kernel and handle interface creation events efficiently. Since the Python interpreter would already be running, the overhead per event would be negligible. The issue you’re addressing here would be a strong use case for that daemon approach.

I hope this clarifies the reasoning behind our change of stance. It would be great if you could adapt this helper into a shell script that conditionally calls the Python helper only when the interface name begins with pppoe. That would provide an immediate fix and keep the design future-proof for when vyos-netlinkd becomes available.

@github-actions github-actions bot added the rebase label Nov 7, 2025
@github-actions
Copy link

github-actions bot commented Nov 7, 2025

CI integration 👍 passed!

Details

CI logs

  • CLI Smoketests (no interfaces) 👍 passed
  • CLI Smoketests VPP ❌ failed
  • CLI Smoketests (interfaces only) 👍 passed
  • Config tests 👍 passed
  • Config tests VPP 👍 passed
  • RAID1 tests 👍 passed
  • TPM tests 👍 passed

@opswill
Copy link
Contributor Author

opswill commented Nov 7, 2025

Hi @c-po
Thank you for the detailed explanation and for clarifying the reasoning behind withdrawing the initial approval. I appreciate you providing the global context regarding the severe BRAS scalability issues.

My primary intent was simply to fix the specific problem of lost QoS configuration, and I agree that the hook script approach may not be the optimal long-term solution.

I also noticed that there are already several other Python scripts present in /etc/ppp/ip-up.d/. Based on my understanding, all scripts in this directory will execute upon PPP dial/reconnect, meaning they likely also contribute to the existing performance overhead.

To implement the short-term mitigation you suggested, I have submitted another PR #4837 to use a Bash script to conditionally call the Python helper, which should significantly improve performance in the interim.

I look forward to your team's final solution with the vyos-netlinkd daemon.

@opswill opswill closed this Nov 7, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bp/circinus Create automatic backport for circinus bp/sagitta Create automatic backport for sagitta LTS version current rebase

Development

Successfully merging this pull request may close these issues.

5 participants