This repository was archived by the owner on Oct 16, 2020. It is now read-only.
etcd v2 failing after update to 1298.5.0 stable #1838
Open
Description
Bug
Container Linux Version
1298.5.0 stable
Expected Behavior
The etcd cluster works after an update.
Actual Behavior
The etcd cluster reports a failure immediately after the update to 1298.5.0 stable.
"etcd cluster is unavailable or misconfigured" - error from locksmith
Other Information
Logs from IRC, where this was reported:
hi all. has anyone seen any regressions in etcd with the latest stable release?
5:38 PM we've been having massive problems across our cluster these past 24 hours or so
5:39 PM various things that depend on etcd are exhibiting odd behaviour
5:39 PM fleet has inexplicably shut down all managed units when it timed out talking to etcd proxy
5:39 PM locksmith appeared to fail and allowed two nodes in a group to reboot
5:39 PM at the same time
5:40 PM "etcd cluster is unavailable or misconfigured" - error from locksmith
5:40 PM we don't know if this is a regression or if we've reached some kind of scale limit with our 5-node etcdv2 cluster
5:41 PM certainly nothing has changed config-wise with etcd. cluster has been amazingly stable until now
5:42 PM etcd may be a red herring here because etcd2 has been at the same version for so long
5:42 PM it could be a kernel bug affecting networking
5:43 PM one thing we noted was that some nodes in the cluster updated to the latest -stable in just the last few hours
so we were running a mix of releases for a while
5:44 PM I think etcd *could* be a red herring (etcd2 has been on the same version for so long now)
5:45 PM it could be a kernel bug...just spitballing here tho
5:45 PM we had the previous stable release on some nodes up until just now
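
Since the log mentions running a mix of releases for a while, it may help to record which release each node was on when the failures occurred. The following is only a minimal sketch, assuming the contents of each node's `/etc/os-release` have already been collected by some other means; the node names, the placeholder version `0.0.0`, and the helper names `parse_os_release` and `group_by_version` are hypothetical and not part of any CoreOS tooling.

```python
from collections import defaultdict


def parse_os_release(text):
    """Parse os-release style KEY=VALUE lines into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        fields[key] = value.strip().strip('"')
    return fields


def group_by_version(os_release_by_node):
    """Map each VERSION value to the sorted list of nodes reporting it."""
    groups = defaultdict(list)
    for node, text in os_release_by_node.items():
        version = parse_os_release(text).get("VERSION", "unknown")
        groups[version].append(node)
    return {version: sorted(nodes) for version, nodes in groups.items()}


if __name__ == "__main__":
    # Hypothetical sample data; "0.0.0" is a placeholder, not a real release.
    sample = {
        "node-a": "ID=coreos\nVERSION=1298.5.0\n",
        "node-b": "ID=coreos\nVERSION=0.0.0\n",
    }
    for version, nodes in group_by_version(sample).items():
        print(version, nodes)
```

More than one key in the result would indicate the cluster was still in a mixed-release state at the time the data was gathered.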