Previously, option templates and their data records were not correctly
recognized. This is fixed now. Collectors can now use the
V9ExportPacket.options field to get a list of V9OptionsDataRecord, with
scopes and data fields.
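A minimal usage sketch, assuming an already parsed V9ExportPacket instance and that each V9OptionsDataRecord exposes its scopes and data as dicts:

```python
def print_options(export):
    """Sketch only: dump the options data records of a parsed V9ExportPacket."""
    for record in export.options:
        print("scopes:", record.scopes)  # scope fields of the options record
        print("data:  ", record.data)    # the actual options data fields
```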
Data templates and option templates are now mixed together in the same
templates dict. Let's hope exporters do not mix them up (re-use the same
IDs for both template types).
During development, the search for the correct template was refactored.
The templates dict is no longer passed into the V9DataFlowSet; only the
single matching template is passed into V9DataFlowSet and
V9OptionsDataFlowset, as it should be.
Refs #30
This is a hacky workaround to handle V9 options templates without
implementing the full corresponding spec. It resolves cases where missing
templates raise a V9TemplateNotRecognized exception even though the
exporter might be doing everything correctly.
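A rough sketch of how a collector can deal with that exception by buffering the packet until a matching template arrives (the import path and constructor signature are assumptions; the buffering mirrors the retry logic described further below):

```python
# Sketch only: import path and V9ExportPacket signature are assumptions.
from netflow.v9 import V9ExportPacket, V9TemplateNotRecognized

templates = {}   # template cache shared across packets
to_retry = []    # raw packets whose template has not been seen yet

def handle(data):
    try:
        return V9ExportPacket(data, templates)
    except V9TemplateNotRecognized:
        # Keep the packet and retry it after the next template set arrives
        to_retry.append(data)
        return None
```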
Refs #29
Refs #30
This commit replaces multiple occurrences of newer language features that
are not available in Python 3.5.3, the reference backwards-compatibility
version for this package. That version is based on the current Python
version in Debian Stretch (oldstable). According to pkgs.org, all other
distros ship 3.6+, so 3.5.3 is the lower boundary.
Changes:
* Add maxsize argument to functools.lru_cache decorator
* Replace f"" with .format()
* Replace variable type hints "var: type = val" with "# type:" comments
* Replace pstats.SortKey enum with strings in performance tests
Additionally, various styling fixes were applied.
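A minimal sketch of the kinds of replacements listed above (illustrative snippets, not the actual diff):

```python
import functools

# lru_cache is called with an explicit maxsize; the bare @functools.lru_cache
# form (without parentheses) only works on Python 3.8+
@functools.lru_cache(maxsize=128)
def field_name(field_type):
    return "field-{}".format(field_type)

# f"" strings (3.6+) are replaced with str.format()
version = 9
header_line = "NetFlow v{} header".format(version)

# variable annotations "var: type = val" (3.6+) become type comments
packet_count = 0  # type: int
```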
The version compatibility was tested with tox, pyenv and Python 3.5.3,
but there is no tox.ini yet which automates this test.
Bump patch version number to 0.10.3
Update author's email address.
Resolves #27
The collector should catch both v9 and IPFIX template errors; a syntax
error was corrected. The v9 ExportPacket.templates attribute is now a
read-only @property.
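A minimal sketch of what such a read-only property can look like (attribute names are assumptions):

```python
class ExportPacket:
    def __init__(self, data, templates):
        self._templates = dict(templates)
        # ... header and flowset parsing would update self._templates here ...

    @property
    def templates(self):
        # No setter is defined, so assigning to packet.templates raises AttributeError
        return self._templates
```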
At different points in the tool set, NetFlow v9 was assumed as the default
case. Now that IPFIX is on its way to being supported as well, all
occurrences where the two must be differentiated were adapted.
The collector is able to parse templates in an export packet and then use
these templates to parse data flowsets inside the same packet. But the
test implementation was based on the assumption that the templates always
arrive first in the packet. Now a mixed order is also processed
successfully. A test is included.
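One way to handle the mixed ordering, sketched here as a two-pass approach (names are illustrative, not the library's API):

```python
def parse_packet(flowsets, templates):
    """Register all templates of a packet first, then decode its data flowsets."""
    data_flowsets = []
    for kind, payload in flowsets:        # e.g. ("template", {...}) or ("data", b"...")
        if kind == "template":
            templates.update(payload)     # payload maps template IDs to field lists
        else:
            data_flowsets.append(payload)
    # All templates from this packet are known now, so decoding cannot fail
    # just because a template arrived after its data records.
    return [decode(payload, templates) for payload in data_flowsets]

def decode(payload, templates):
    return {"raw": payload, "known_templates": sorted(templates)}  # placeholder
```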
Until now, the V1DataFlow and V5DataFlow classes used a verbose way of
unpacking the byte stream into their specific fields. With this commit,
both use a list of field names, a single struct.unpack call and a mapping
for-loop over the fields.
Additionally, the upper boundary of the passed data slice was added.
With the self.__dict__.update() call, all fields are now also accessible
as direct attributes of the corresponding instance, e.g. flow.PROTO to
access flow.data["PROTO"]. This works for flows of all three versions.
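A minimal sketch of that pattern (field list shortened and the struct format illustrative, not the real v5 record layout):

```python
import struct

class V5DataFlow:
    FIELDS = ["IPV4_SRC_ADDR", "IPV4_DST_ADDR", "IN_PKTS", "IN_OCTETS", "PROTO"]

    def __init__(self, data):
        self.data = {}
        # One struct.unpack call over the sliced record, then map names to values
        values = struct.unpack("!IIIIB", data[:17])
        for name, value in zip(self.FIELDS, values):
            self.data[name] = value
        # Expose every field as a direct attribute, e.g. flow.PROTO
        self.__dict__.update(self.data)

flow = V5DataFlow(bytes(17))
assert flow.PROTO == flow.data["PROTO"]
```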
The tests were adapted to reflect this new implementation.
Beginning with this commit, the reference implementations of the
collector and analyzer are now included in the package. They are
callable by running `python3 -m netflow.collector` or `.analyzer`, with
the same flags as before. Use `-h` to list them.
Additional fixes are contained in this commit as well, e.g. adding more
version prefixes and moving parts of code from __init__ to utils, to fix
circular imports.
Until now, every NetFlow version file used similar names for their
classes, e.g. "Header". These are now prefixed with their respective
version, e.g. "V1Header", to avoid confusion in imports etc.
The README and setup.py were adapted to the current state, preparing for
PyPI upload and package info.
In v9, the header received an additional .json property, which exports
the header as a dict to allow JSON serialization in the export file.
This export is used in main.py.
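A minimal sketch of such a property (header fields abbreviated, names are assumptions):

```python
import json

class V9Header:
    def __init__(self, version, count, uptime, timestamp):
        self.version = version
        self.count = count
        self.uptime = uptime
        self.timestamp = timestamp

    @property
    def json(self):
        # Plain dict, so the header can be fed straight into json.dumps()
        return {"version": self.version, "count": self.count,
                "uptime": self.uptime, "timestamp": self.timestamp}

print(json.dumps(V9Header(9, 2, 100, 1570000000).json))
```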
Previously, the analyzer assumed that two consecutive flows would be a
pair. This proved unreliable, therefore a new comparison algorithm is
used. It utilizes the IP addresses and the 'first_switched' parameter
to identify two flows of the same connection.
More improvements can be made, especially in filtering and in identifying
the initiating peer.
Tests still fail and have to be adapted to the new dicts and gzip.
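A rough sketch of the pairing idea (field keys and the time tolerance are assumptions, not the analyzer's exact logic):

```python
def are_paired(flow_a, flow_b, tolerance=1):
    """Treat two flows as one connection if they mirror each other's
    addresses and started at (almost) the same time."""
    mirrored = (flow_a["IPV4_SRC_ADDR"] == flow_b["IPV4_DST_ADDR"]
                and flow_a["IPV4_DST_ADDR"] == flow_b["IPV4_SRC_ADDR"])
    close_in_time = abs(flow_a["FIRST_SWITCHED"] - flow_b["FIRST_SWITCHED"]) <= tolerance
    return mirrored and close_in_time
```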
As mentioned by @pR0Ps in 6b9d20c8a6/analyze_json.py (L83),
IP addresses, especially IPv6 ones, are better stored as parsed strings
instead of their raw integer values. Implemented.
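A minimal sketch of that conversion using the standard library (the raw values are just examples):

```python
import ipaddress

src_v4 = ipaddress.ip_address(3232235777)                          # 192.168.1.1
src_v6 = ipaddress.ip_address(0x20010DB8000000000000000000000001)  # 2001:db8::1
print(str(src_v4), str(src_v6))
```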
- Moved the netflow library out of the src directory
- The UDP listener was restructured so that multiple threads can receive
packets and push them into a queue. The main thread then pulls the
packets off the queue one at a time and processes them. This means
that the collector will never drop a packet because it was blocked on
  processing the previous one (a minimal sketch of this pattern follows the list).
- Adds a property to the ExportPacket class to expose whether any new
  templates are contained in it.
- The collector will now only retry parsing past packets when a new
template is found. Also refactored the retry logic a bit to remove
duplicate code (retrying just pushes the packets back into the main
queue to be processed again like all the other packets).
- The collector no longer continually reads and writes to/from the disk.
It just caches the data in memory until it exits instead.
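A minimal sketch of the queue-based listener described in the bullets above (port, thread count and the processing step are illustrative, not the collector's exact code):

```python
import queue
import socket
import threading

packet_queue = queue.Queue()

def listen(sock):
    # Receiver threads only read from the socket and enqueue raw packets,
    # so slow parsing never blocks reception.
    while True:
        data, addr = sock.recvfrom(4096)
        packet_queue.put((addr, data))

def process(data):
    print("received {} bytes".format(len(data)))  # placeholder for parsing

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(("0.0.0.0", 2055))
for _ in range(2):
    threading.Thread(target=listen, args=(sock,), daemon=True).start()

# The main thread pulls packets off the queue one at a time and parses them
while True:
    addr, data = packet_queue.get()
    process(data)
```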