-
Notifications
You must be signed in to change notification settings - Fork 6
Fix concurrency of fibtable using thread-local cache of read handles #917
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix concurrency of fibtable using thread-local cache of read handles #917
Conversation
c42743c to
202bdad
Compare
c772a0d to
79683ab
Compare
202bdad to
021f73f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR fixes concurrency issues in the FIB table by implementing a thread-local cache for read handles, replacing the direct lookup mechanism. The primary change is transitioning from the previous concurrent access pattern to using left-right-tlcache for efficient thread-local caching of FIB readers.
Key changes:
- Replace
FibIdwithFibKeythroughout the codebase for better type semantics - Implement thread-local caching for FIB read handles using
left-right-tlcache - Simplify FIB table API by removing direct Arc returns and using factories instead
Reviewed Changes
Copilot reviewed 13 out of 14 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| routing/src/rib/vrftable.rs | Updated VRF table to use FibKey instead of FibId, simplified FIB access patterns |
| routing/src/fib/fibtype.rs | Renamed FibId to FibKey, added Hash trait and Identity implementation for thread-local caching |
| routing/src/fib/fibtable.rs | Major refactoring to use thread-local cache with FibReaderFactory pattern |
| routing/src/interfaces/*.rs | Updated interface attachment logic to use FibKey |
| left-right-tlcache/src/lib.rs | Extended cache with refresh and iterator functionality |
| dataplane/src/packet_processor/ipforward.rs | Simplified packet forwarding by using cached FIB readers |
| routing/src/errors.rs | Added new error types for FIB table access failures |
| if !matches!(id, FibKey::Id(_)) { | ||
| panic!("Attempting to set invalid Id of {} to fib", id); | ||
| } |
Copilot
AI
Oct 17, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The panic message contains a formatting placeholder {} but the id parameter is not passed to the format string. This will cause a runtime panic. Should be: panic!(\"Attempting to set invalid Id of {id} to fib\");
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please move id to be inline here. Co-pilot is wrong about this causing a panic, but there is no reason not to inline this here.
routing/src/rib/vrftable.rs
Outdated
| assert_eq!( | ||
| fibtr.enter().unwrap().len(), | ||
| 1, | ||
| "only defaylt fib should remain" |
Copilot
AI
Oct 17, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Corrected spelling of 'defaylt' to 'default'
| "only defaylt fib should remain" | |
| "only default fib should remain" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please address this comment too.
routing/src/rib/vrftable.rs
Outdated
| debug!("Vrf with Id {vrfid} no longer has a VNI associated"); | ||
| debug!("Vrf with Id {vrfid} no longer has a vni {vni} associated"); | ||
| } else { | ||
| debug!("Vrf {vrfid} has no vni configured") |
Copilot
AI
Oct 17, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing semicolon at the end of the debug statement. Should be: debug!(\"Vrf {vrfid} has no vni configured\");
| debug!("Vrf {vrfid} has no vni configured") | |
| debug!("Vrf {vrfid} has no vni configured"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please fix this co-pilot comment.
79683ab to
eda6dab
Compare
A FibReaderFactory object allows creating FibReaders. Signed-off-by: Fredi Raspall <[email protected]>
021f73f to
39dcde0
Compare
mvachhar
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please remove the unsafe coerced pointer casts in favor of a checked conversion.
| Ok(()) | ||
| } | ||
| } | ||
| impl Display for FibTable { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even if the fields are owned, aren't they properly protected by left-right? And if so, can't this stay as is?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It can, but we're not using it, so I removed that Display impl.
routing/src/rib/vrftable.rs
Outdated
| debug!("Vrf with Id {vrfid} no longer has a VNI associated"); | ||
| debug!("Vrf with Id {vrfid} no longer has a vni {vni} associated"); | ||
| } else { | ||
| debug!("Vrf {vrfid} has no vni configured") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please fix this co-pilot comment.
routing/src/rib/vrftable.rs
Outdated
| assert_eq!( | ||
| fibtr.enter().unwrap().len(), | ||
| 1, | ||
| "only defaylt fib should remain" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please address this comment too.
This commit contains 3 changes:
1) Rename FibId -> FibKey and clarify usage. The old FibId is
used in two ways: one, to identify a Fib; another, to look up
fibs in the Fibtable. Rename it to FibKey since, while still
used as an Id, the main purpose is to act as a key.
Also, change some internal APIs to make sure that Fibs are
identified by a single FibKey (based on the vrf id).
2) Restructure FibTable: FibReaders (which wrap ReadHandles<Fib>)
cannot be shared among threads. Instead of storing a FibReader
for each Fib, store a FibReaderFactory from which readers
can be obtained on a per-thread basis. In this patch, every
time a Fib needs to be accessed, a new FibReader is created,
which is very inefficient due to the need for locking
when creating ReadHandles. Subsequent commits will fix this.
We also keep the Id of a Fib along with its factory. This is
needed to be able to efficiently implement the traits needed
for the readhandle cache. On this, note that:
- the fibtable could keep fibreaders as it did, since new
can be obtained via cloning. However, keeping instead
factories protects us from errorneously handing off a reference
to a FibReader to more than a thread.
- if we would store fibreaders (and clone them on demand),
we would not need to store the Id (since it could be obtained
entering the fib via the readhandle), but the number of
readers would be increased.
3) Elements in the new FibTable are stored within an Arc. This is
to ensure that the compiler automatically determines that
FibTable is Sync.
4) Avoid needing to derive/implement Clone() for FibTable by adding
a dummy sync_from() Absorb implementation, by publishing on
FibTable creation.
Signed-off-by: Fredi Raspall <[email protected]>
Signed-off-by: Fredi Raspall <[email protected]>
The thread-local cache of prior commit stores ReadHandle<Fib>. However, the rest of the code uses transparent wrapper FibReader, with additional methods. Implement AsRef so that we can have FibReaders with zero cost. Signed-off-by: Fredi Raspall <[email protected]>
Add left-right-tlcache as dependency to implement the fibtable cache. Implement trait Identity for Fib so that the thread-local cache of ReadHandle<Fib>s can determine if a read handle points to the right object or the entry needs to be invalidated. Signed-off-by: Fredi Raspall <[email protected]>
Versioning the fibtable will allow us to determine the validity of cached readhandles without needing to ask the provider, which would add an extra lookup (e.g. hash or similar), on every cache lookup. This patch adds the version for the FibTable and the logic to monotonically increment its value anytime it is updated. Note that version is u64 instead of AtomicU64 since we don't need to synchronize multiple threads on its value. Signed-off-by: Fredi Raspall <[email protected]>
Implement AsRef<ReadHandleFactory<Fib>> for FibReaderFactory so that it can be treated as a ReadHandleFactory<Fib>. Signed-off-by: Fredi Raspall <[email protected]>
Declare thread-local ReadHandleCache `FIBTABLE_CACHE`. Implement trait ReadHandleProvider for FibTable so that it can be polled by the cache, per thread. We could implement ReadHandleProvider for FibTableReader instead. However, implementing it for the FibTable is more general in that it would also work if the FibTable itself was not wrapped in left- right. Signed-off-by: Fredi Raspall <[email protected]>
... from Rc<ReadHandle<Fib>>. The cache stores ReadHandle<T>'s for some T. In the case of the FibTable, T=Fib and we want to hand off FibReaders (a thin wrapper of ReadHandle<Fib>) to the threads using the cache. Add a method to perform such a conversion. Signed-off-by: Fredi Raspall <[email protected]>
Add the main method to lookup FibReaders from a FibKey from the thread-local cache and a FibTableReader. The method will check the local cache to return FibReaders owned by the calling thread. On failure, it will pull data from provider (fibtable) accessible from the same FibTableReader. Signed-off-by: Fredi Raspall <[email protected]>
Let the ipforward NF look up FibReader from thread-local cache. Signed-off-by: Fredi Raspall <[email protected]>
Signed-off-by: Fredi Raspall <[email protected]>
Extend the thread-local cache with full refresh logic and iterator. This requires adding a new method to trait ReadHandleProvider. Signed-off-by: Fredi Raspall <[email protected]>
impl ReadHandleProvider get_iter() for FibTable. This is needed to allow the thread handling the cli to safely iterate over the fibs. Signed-off-by: Fredi Raspall <[email protected]>
The existing iterator implementation is subobtimal in that each call to get_iter() needs to request the provider a full iterator for the whole set of read handle factories, unless refresh arg is set to false, and then refresh the local cache with read handles built from each factory. This is inefficient in case the set of objects in the provider rarely changes (which may be the case in most setups). Avoid pulling an iterator from the provider if the version did not change, as this means that no objects have been added or removed. If anything, we remove unusable readers. Signed-off-by: Fredi Raspall <[email protected]>
Fix the code so that a cache keeps the minimal number of rhandles after a refresh. The previous code would keep an independent rhandle for aliases, meaning that if there are N objects each with one allias, 2 x N rhandles would exist after a refresh. Solve this by letting aliases use the same rhandle as their primary. Signed-off-by: Fredi Raspall <[email protected]>
Signed-off-by: Fredi Raspall <[email protected]>
Signed-off-by: Fredi Raspall <[email protected]>
39dcde0 to
9a9c275
Compare
c0accdb
into
pr/fredi/tl_left_right_readhandles
This rides on top of #903
Fixes #909
Note: this PR includes a couple of commits that extend the left-right readhandle thread-local cache with a refresh method and iterator that I thought would be needed to inspect the FIBs (e.g. from the CLI). In the end, those methods are not needed (currently), but I've decided to keep them even if not currently used. These extension require implementing a very simple additional trait method for the Fibtable.