-
Notifications
You must be signed in to change notification settings - Fork 212
Switch log line to metric for detecting hash collisions #786
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Changes from 2 commits
4c97534
9352cfe
c5c21ae
2241869
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -16,6 +16,9 @@ | |
|
|
||
| package io.reactivex.mantis.network.push; | ||
|
|
||
| import io.mantisrx.common.metrics.Counter; | ||
| import io.mantisrx.common.metrics.Metrics; | ||
| import io.mantisrx.common.metrics.spectator.MetricGroupId; | ||
| import java.util.ArrayList; | ||
| import java.util.HashMap; | ||
| import java.util.List; | ||
|
|
@@ -37,12 +40,21 @@ public class ConsistentHashingRouter<K, V> extends Router<KeyValuePair<K, V>> { | |
| private static long validCacheAgeMSec = 5000; | ||
| private HashFunction hashFunction; | ||
| private AtomicReference<SnapshotCache<SortedMap<Long, AsyncConnection<KeyValuePair<K, V>>>>> cachedRingRef = new AtomicReference<>(); | ||
| private final MetricGroupId metricGroup = new MetricGroupId("ConsistentHashingRouter"); | ||
| private final Metrics metrics; | ||
| private final Counter collisionsCounter; | ||
|
|
||
| public ConsistentHashingRouter(String name, | ||
| Func1<KeyValuePair<K, V>, byte[]> dataEncoder, | ||
| HashFunction hashFunction) { | ||
| super("ConsistentHashingRouter_" + name, dataEncoder); | ||
| this.hashFunction = hashFunction; | ||
| this.metrics = new Metrics.Builder() | ||
| .id(metricGroup) | ||
| .addCounter("numHashCollisions") | ||
| .build(); | ||
|
|
||
| this.collisionsCounter = this.metrics.getCounter("numHashCollisions"); | ||
| } | ||
|
|
||
| @Override | ||
|
|
@@ -108,7 +120,7 @@ private void computeRing(Set<AsyncConnection<KeyValuePair<K, V>>> connections) { | |
| byte[] connectionBytes = (connectionId + "-" + i).getBytes(); | ||
| long hash = hashFunction.computeHash(connectionBytes); | ||
| if (ring.containsKey(hash)) { | ||
| logger.error("Hash collision when computing ring. {} hashed to a value already in the ring.", connectionId + "-" + i); | ||
| this.collisionsCounter.increment(); | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think it might still be nice to have logging for which connections are duplicates. Could we maybe track at the connection level (instead of the partition level) and emit one log line with all of the duplicate connections (not virtual nodes in the hash ring)
Collaborator
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Added a debug log that should come out in a structure like: I chose a debug log that we'll have to explicitly opt into because I expect the structure to be huge and potentially cause this pausing like behavior we've been seeing today.
Collaborator
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Oh and @Override
public String toString() {
return "AsyncConnection [host=" + host + ", port=" + port
+ ", groupId=" + groupId + ", slotId=" + slotId + ", id=" + id
+ "]";
}
Collaborator
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Hm, I think toString will be called by the logger lib so I don't need to call it myself |
||
| } | ||
| ring.put(hash, connection); | ||
| } | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don' t think this is ever going to get registered. The underlying
Routerimplementation has agetMetricsmethod that is used to register the metrics. You could override that here but then you would need to re-instantiatenumEventsRoutedandnumEventsProcessedto make sure those metrics are still sent (and probably use a metricsGroup like the parent class to avoid a breaking change).There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think you need to register with something like "MetricsRegistry.getInstance().registerAndGet"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gracias y'all, used the singleton to register the metrics!