Improved NetFlowV9 support #21
much of this is still wip
currently fixing tests
Aggregating the data flowsets does not work, because all records are based on the packet's timestamp. To simplify parsing in the codec, the aggregator now collects all templates/option templates into a protobuf and adds received and buffered data flows to be parsed. The NetFlow packets are preserved completely and can also contain templates; the codec will not use them, though, only the aggregator does.
private void queueBufferedPackets(Set<TemplateKey> templates, Set<ChannelBuffer> packetsToSend, TemplateKey templateKey) {
This actually requires checking all templates used by a packet, so this code will change soon and is not the final version.
The codec-aggregator is null in the super class, so the put call added it after the raw-message handler; thus the code never ran and broke parsing.
The previous implementation only checked that a single template id was present for each packet, which in general is wrong if not all templates arrive at the same time (which might happen for large numbers of active templates). The new implementation manually checks each packet's template requirements against the ids of received templates, for the current remote address/source id combination.
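A minimal sketch of that check, with hypothetical names (not the plugin's actual classes): packets are buffered per exporter (remote address + source id) until the set of received template ids covers everything the packet references.

```java
import java.util.*;

// Illustrative sketch only: buffer packets per exporter until every
// template id the packet requires has been received for that exporter.
public class TemplateBuffer {
    // key: "remoteAddress/sourceId", value: template ids received so far
    private final Map<String, Set<Integer>> receivedTemplates = new HashMap<>();
    // packets waiting for templates, stored whole so they can be re-parsed later
    private final Map<String, List<byte[]>> bufferedPackets = new HashMap<>();

    public void addTemplate(String exporterKey, int templateId) {
        receivedTemplates.computeIfAbsent(exporterKey, k -> new HashSet<>()).add(templateId);
    }

    // Returns true if the packet can be parsed now; otherwise buffers it.
    public boolean offer(String exporterKey, byte[] packet, Set<Integer> requiredTemplates) {
        Set<Integer> seen = receivedTemplates.getOrDefault(exporterKey, Collections.emptySet());
        if (seen.containsAll(requiredTemplates)) {
            return true;
        }
        bufferedPackets.computeIfAbsent(exporterKey, k -> new ArrayList<>()).add(packet);
        return false;
    }

    public int bufferedCount(String exporterKey) {
        return bufferedPackets.getOrDefault(exporterKey, Collections.emptyList()).size();
    }
}
```

The key point versus the old code is `containsAll`: a packet referencing two templates is released only once both have arrived, not when any single id is present.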
I am seeing a field nf_nf_field_153 when ingesting v9 via pmacctd. Is this a dynamically generated field? Also, the nf prefix is duplicated.
Ah shit. Yeah the prefix is broken now.
I'll fix it tomorrow
I also saw this. Can this still happen? I thought we are waiting until we get a template before we pass on the packet.
That's weird. Do you have a config that causes this?
The field 153 is because our default field definition list is missing that type. It is "flowEndMilliseconds" from https://www.iana.org/assignments/ipfix/ipfix.xhtml
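For reference, element 153 in the IANA registry is an 8-byte unsigned integer of milliseconds since the UNIX epoch, sent in network byte order. A small decoding sketch (class and method names are illustrative, not the plugin's actual API):

```java
import java.nio.ByteBuffer;
import java.time.Instant;

// Sketch: decode IANA IPFIX element 153 (flowEndMilliseconds).
public class FlowEndMillis {
    public static Instant decode(byte[] value) {
        // ByteBuffer reads big-endian by default, matching the wire format
        long millis = ByteBuffer.wrap(value).getLong();
        return Instant.ofEpochMilli(millis);
    }
}
```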
I would strongly advise against this, as IPFIX ("NetFlow version 10") has some incompatible fields with NetFlow version 9. For example, id 1 is "octetDeltaCount" in IPFIX (8 bytes), while it's the number of incoming bytes in NetFlow 9 (4 bytes) (see http://netflow.caligare.com/netflow_v9.htm).
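A sketch of why mixing the two definition lists is dangerous (hypothetical names): reading field id 1 with the wrong width not only yields a wrong value, it also shifts the offset of every field that follows in the record.

```java
import java.nio.ByteBuffer;

// Sketch: field id 1 is a 4-byte incoming-bytes counter in NetFlow v9,
// but the 8-byte octetDeltaCount in IPFIX.
public class CounterField {
    public static long readFieldId1(ByteBuffer buf, boolean ipfix) {
        return ipfix ? buf.getLong()                         // 8 bytes in IPFIX
                     : Integer.toUnsignedLong(buf.getInt()); // 4 bytes in v9
    }
}
```

With the same 8 wire bytes, the v9 reader consumes 4 of them and leaves the buffer positioned mid-value for the IPFIX interpretation, so every subsequent field would be misparsed.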
fixes double prefixing of fields
LGTM 👍
* move template caching and flow buffering in custom message aggregator (much of this is still wip)
* remove template cache
* wip for codec aggregation and parsing custom format of v9 (currently fixing tests)
* wip
* migrate tests
* fix v9 parsing by preserving the complete packets during buffering: aggregating the data flowsets does not work, because all records are based on the packet's timestamp; to simplify parsing in the codec, the aggregator now collects all templates/option templates into a protobuf and adds received and buffered data flows to be parsed; the netflow packets are preserved completely and can also contain templates; the codec will not use them, though, only the aggregator does
* remove duplicated license header
* update protobuf comment
* update comment
* tweak license header to avoid diff
* fix guice setup for transport
* fix handler setup: the codec-aggregator is null in the super class, so the put call added it after the raw-message handler; thus the code never ran and broke parsing
* change how the packet cache works: the previous implementation only checked that a single template id was present for each packet, which in general is wrong if not all templates arrive at the same time (which might happen for large numbers of active templates); the new implementation manually checks each packet's template requirements against the ids of received templates, for the current remote address/source id combination
* prefix netflow v9 fields with nf_
* remove unused optional template atomic ref
* don't prefix unknown field names, that is done centrally now (fixes double prefixing of fields)
* don't forget to fix test
* add flow timestamp fields from ipfix
* fix test after field definition list update

(cherry picked from commit 4e9d68f)
I can confirm that #20 looks solved now: I am running rc5 and it's been dealing with our full netflow configuration without any apparent problem for some hours.
Hi everybody, I'm having trouble with an Invalid FlowVersion exception (Invalid NetFlow version 0) when trying to log the flows of a Netgear switch. Graylog v2.3.1+9f2c6ef. And this is the server.log result: Thanks in advance!
@moadiv This plugin currently doesn't support sFlow, see #3. Please post questions about this plugin to our discussion forum or join the #graylog channel on freenode IRC. Thank you!
This PR changes the way templates are handled.
Since in V9 the template flowsets are not sent with every packet, the implementation must buffer packets until it has received the templates needed to parse them. The same is true for option templates.
This implementation moves the buffering and template aggregation into a custom codec aggregator, so that the codec itself, which runs after journalling the message, can assume that it has all the templates it needs to successfully parse a packet. This is even more important when processing a journal after a restart.
The RFC requires that templates not be written to disk (or otherwise stored independently of the data flows), because they might change at any time. We therefore colocate them with the data itself, taking the performance hit of writing more bytes to disk in exchange for a safer implementation.
Thus this implementation does not lose data when it does not yet have the templates; the exporter re-sends them regularly for each observation domain.
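The colocation described above can be sketched roughly as follows. This is a hypothetical simplification with invented names; the real plugin serializes the equivalent structure as a protobuf.

```java
import java.util.*;

// Sketch of a record as journalled: the templates known for an exporter
// travel together with the complete raw packets, so the codec can re-parse
// everything after a restart without an external template cache.
public class AggregatedRecord {
    // template id -> raw template definition bytes
    public final Map<Integer, byte[]> templates = new LinkedHashMap<>();
    // complete NetFlow v9 packets, preserved as received
    public final List<byte[]> packets = new ArrayList<>();

    // The cost of colocation: extra bytes written to disk per record.
    public int payloadBytes() {
        int total = 0;
        for (byte[] t : templates.values()) total += t.length;
        for (byte[] p : packets) total += p.length;
        return total;
    }
}
```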
To be compatible with Graylog 2.3, this change comes with a custom codec aggregator, for 3.0 we can migrate the code back into the server.
fixes #18
fixes #19
fixes #20