Now that std's Mutex is implemented without boxing when futex is available, it might make sense to use it instead of parking_lot's. Some benchmarks hint at better performance from std.
It might also make testing with tools such as loom easier.
Is it worth a trial and a benchmark?
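
As a starting point for such a trial, here is a minimal sketch of a contention micro-benchmark that uses only std. The helper name `contended_increments` and its parameters are hypothetical and do not come from this code base. It measures a single contended counter, which is not necessarily representative of the real workload.

```rust
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical helper: `threads` workers each lock a shared std Mutex
/// `iters` times and increment a counter.
/// Returns the final count and the wall-clock time taken.
fn contended_increments(threads: usize, iters: usize) -> (usize, Duration) {
    let counter = Arc::new(Mutex::new(0usize));
    let start = Instant::now();

    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..iters {
                    // std's lock() returns a Result because of poisoning.
                    *counter.lock().unwrap() += 1;
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    let elapsed = start.elapsed();
    let total = *counter.lock().unwrap();
    (total, elapsed)
}
```

Calling `contended_increments(8, 1_000_000)` and printing the returned duration gives a rough number for std. To compare against parking_lot, the same loop could be repeated with its Mutex, whose `lock()` returns the guard directly, so the `unwrap()` calls would go away. A proper comparison would still need a benchmark harness and realistic workloads.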