commit 63abf52ec640a019f8c45c1208f0dfb585641781
Padding: add offset!=length check to reduce safety check calls
Adds another check when parsing a set. The check "offset !=
self.header.length" allows skipping the padding checks when the offset
already equals the set length, avoiding a needless call to
rest_is_padding_zeroes and the wasted CPU time.
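A minimal sketch of the guard, assuming a standalone
rest_is_padding_zeroes helper (in the library this lives on the parser
class; everything beyond the names mentioned above is illustrative):

    def rest_is_padding_zeroes(data: bytes, offset: int, length: int) -> bool:
        """Return True if every byte from offset up to length is zero."""
        return all(byte == 0 for byte in data[offset:length])

    def finish_set(data: bytes, offset: int, length: int) -> None:
        # Fast path: offset == length means no bytes are left in the set,
        # so the padding scan can be skipped entirely.
        if offset != length:
            if not rest_is_padding_zeroes(data, offset, length):
                raise ValueError("Trailing set bytes are not zero padding")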
commit 8d1cf9cac12c45c0af70591b646d898ba5c923fc
Finish IPFIX padding handling
Tested implementation of IPFIX set padding handling. Uses TK-Khaw's
proposed no_padding_last_offset calculation, extended into a modulo
calculation so it matches sets containing multiple data records.
Tests were conducted by capturing live traffic on a test machine with
tcpdump; the capture file was then read by softflowd 1.1.0, with
collector.py as the export target. The exported IPFIX (v10) packets
contained both padded and unpadded sets, so both cases could be
validated.
Closes #34
Signed-off-by: Dominik Pataky <software+pynetflow@dpataky.eu>
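A sketch of the calculation under the assumption of fixed-length data
records; the 4-byte set header size comes from RFC 7011, the function
shape itself is illustrative:

    IPFIX_SET_HEADER_LENGTH = 4  # set ID (2 bytes) + set length (2 bytes)

    def no_padding_last_offset(set_length: int, record_length: int) -> int:
        # Offset (relative to the set start) where the last complete data
        # record ends; everything between it and set_length is padding.
        # The modulo makes this work for any number of records in the set.
        payload = set_length - IPFIX_SET_HEADER_LENGTH
        return IPFIX_SET_HEADER_LENGTH + payload - (payload % record_length)

    # Example: a 40-byte set holding 11-byte records -> records end at
    # offset 37, leaving 3 bytes of padding.
    assert no_padding_last_offset(40, 11) == 37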
commit 51ce4eaa268e4bda5be89e1d430477d12fc8a72c
Fix and optimize padding calculation for IPFIX sets.
Refs #34
commit 9d3c4135385ca9714b7631a0c5af46feb891a9fb
Author: Khaw Teng Kang <tk.khaw@attrelogix.com>
Date: Tue Jul 5 16:29:12 2022 +0800
Reverted changes to template_record; data_length is now computed using the field lengths in the template.
Signed-off-by: Khaw Teng Kang <tk.khaw@attrelogix.com>
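A sketch of that computation; the TemplateField shape is an assumption,
and the field IDs in the example are the IANA sourceIPv4Address (8) and
sourceTransportPort (7):

    from collections import namedtuple

    TemplateField = namedtuple("TemplateField", ["type", "length"])  # assumed shape

    def data_record_length(fields) -> int:
        # data_length is the sum of the field lengths from the template
        return sum(field.length for field in fields)

    assert data_record_length([TemplateField(8, 4), TemplateField(7, 2)]) == 6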
commit 3c4f8e62892876d4a2d42288843890b97244df55
IPFIX: handle padding (zero bytes) in sets
Adds a check to each IPFIX set ID branch, testing whether the remaining
bytes in the set are padding (zeroes).
Refs #34
Signed-off-by: Dominik Pataky <software+pynetflow@dpataky.eu>
Previously, option templates and their data records were not correctly
recognized. This is fixed now. Collectors can now use the
V9ExportPacket.options field to get a list of V9OptionsDataRecord, with
scopes and data fields.
Data templates and option templates are mixed in the same templates
dict. Let's hope exporters do not mix them up (re-use the same IDs for
both template types).
During development, the lookup of the matching template was refactored.
The whole templates dict is no longer passed into V9DataFlowSet. Only
the single matching template is passed into V9DataFlowSet and
V9OptionsDataFlowset, as it should be.
Refs #30
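A sketch of the refactored lookup; V9TemplateNotRecognized is the
exception mentioned below, the helper itself is illustrative:

    class V9TemplateNotRecognized(KeyError):
        """Raised when a FlowSet references a template ID not seen yet."""

    def single_matching_template(templates: dict, flowset_id: int):
        # Instead of the whole templates dict, only the one matching
        # template is handed to V9DataFlowSet / V9OptionsDataFlowset.
        if flowset_id not in templates:
            raise V9TemplateNotRecognized(flowset_id)
        return templates[flowset_id]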
This is a hacky workaround to handle V9 options templates without
implementing the full corresponding spec. It fixes cases where missing
templates raised a V9TemplateNotRecognized exception even though the
exporter was doing everything correctly.
Refs #29
Refs #30
The parse_packet function is one of the main entry points for using
this library in other scripts. It works, but was under-documented until
now. Especially the 'templates' parameter might cause confusion for new
users who have not yet worked with templates. This commit should make
things clearer.
Refs #28
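A sketch of the intended usage, inferred from the parameter described
above (the handling around the call is illustrative):

    import netflow

    templates = {}  # filled as template records arrive, shared across packets

    def handle(payload: bytes):
        # parse_packet consults and updates the templates dict, so data
        # records can be decoded even when their template arrived in an
        # earlier packet.
        export = netflow.parse_packet(payload, templates)
        return export.header.version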
Signals INT and TERM were not correctly handled in the 'while True' loop
of the yielding listener function. Now, the loop breaks as expected,
terminating the listener thread and the application.
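One way such a loop can be made interruptible; this is a sketch of the
idea, not necessarily the exact fix (the queue, event and timeout are
assumptions):

    import queue
    import signal
    import threading

    shutdown = threading.Event()

    def _stop(signum, frame):
        shutdown.set()

    signal.signal(signal.SIGINT, _stop)
    signal.signal(signal.SIGTERM, _stop)

    def get_exports(q: queue.Queue):
        # The timeout lets the loop re-check the event instead of blocking
        # forever in q.get(), so INT/TERM actually break the loop and the
        # listener thread can terminate.
        while not shutdown.is_set():
            try:
                yield q.get(timeout=0.5)
            except queue.Empty:
                continue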
This commit replaces multiple occurrences of newer features that are
not available in Python 3.5.3, the reference backwards-compatibility
version for this package. That version is based on the current Python
version in Debian Stretch (oldstable). According to pkgs.org, all other
distros ship 3.6+, so 3.5.3 is the lower boundary.
Changes:
* Add maxsize argument to functools.lru_cache decorator
* Replace f"" with .format()
* Replace variable type hints "var: type = val" with "# type:" comments
* Replace pstats.SortKey enum with strings in performance tests
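A sketch of the replaced spellings (names are illustrative):

    from functools import lru_cache

    # functools.lru_cache needs an explicit maxsize argument on 3.5:
    @lru_cache(maxsize=None)
    def field_name(field_type):
        # str.format() instead of an f-string:
        return "field_{}".format(field_type)

    # Type comment instead of the 3.6+ annotated assignment "count: int = 0":
    count = 0  # type: int

    # In the performance tests, plain strings replace the pstats.SortKey
    # enum, e.g. stats.sort_stats("cumulative").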
Additionally, various styling fixes were applied.
The version compatibility was tested with tox, pyenv and Python 3.5.3,
but there is no tox.ini yet which automates this test.
Bump patch version number to 0.10.3
Update author's email address.
Resolves #27
Templates may be withdrawn as per RFC 7011. Receiving a template with an
existing template_id and a field_count of 0 now triggers deletion of
this template.
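A sketch of the withdrawal handling, assuming templates are stored in a
plain dict keyed by template_id:

    def handle_template(templates: dict, template_id: int, field_count: int,
                        fields=None):
        # RFC 7011: a template record with a field count of 0 withdraws
        # the template with that ID.
        if field_count == 0:
            templates.pop(template_id, None)
        else:
            templates[template_id] = fields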
Parts of the IPFIXFieldTypes class were extracted into the new
IPFIXDataTypes class, to increase readability and stability.
The IPFIXDataRecord class and its field parser are now more in tune
with the specification, handling signed and unsigned integers as well as
floats, booleans, UTF-8 strings and more.
Corresponding tests were extended with softflowd packets (level
"ethernet") and value checks (e.g. MAC address).
Resolves #25
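A sketch of how such type-aware decoding can look; the table shape and
helper are illustrative, the boolean encoding follows RFC 7011
(1 = true, 2 = false):

    import struct

    DATA_TYPE_FORMATS = {
        ("unsigned8", 1): "B", ("signed8", 1): "b",
        ("unsigned16", 2): "H", ("signed16", 2): "h",
        ("unsigned32", 4): "I", ("signed32", 4): "i",
        ("unsigned64", 8): "Q", ("signed64", 8): "q",
        ("float32", 4): "f", ("float64", 8): "d",
        ("boolean", 1): "B",
    }

    def decode_field(data_type: str, raw: bytes):
        if data_type == "string":
            return raw.decode("utf-8")      # UTF-8 string fields
        fmt = DATA_TYPE_FORMATS.get((data_type, len(raw)))
        if fmt is None:
            return raw                      # fall back to raw bytes
        value = struct.unpack("!" + fmt, raw)[0]
        if data_type == "boolean":
            return value == 1               # 1 = true, 2 = false
        return value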
In IPFIX, template fields can be signed or unsigned, or even pure bytes
or unicode strings. This differentiation was extended in this commit.
Additionally, the IPFIX_FIELD_TYPES dict mapping int -> str was replaced
by a more verbose version which also includes the standardized IANA data
types. The class's methods provide access to this fixed data set, which
is then used in the IPFIXDataRecord parser.
Refs #25
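A sketch of the shape of the more verbose table (the entries are taken
from the IANA registry, the access helper is illustrative):

    # int -> (name, IANA abstract data type) instead of a bare int -> str:
    IPFIX_FIELDS = {
        1: ("octetDeltaCount", "unsigned64"),
        4: ("protocolIdentifier", "unsigned8"),
        8: ("sourceIPv4Address", "ipv4Address"),
        27: ("sourceIPv6Address", "ipv6Address"),
    }

    def by_id(field_id: int):
        # Fixed data set, exposed through class methods in the library.
        return IPFIX_FIELDS.get(field_id)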
The function send_recv_packets in tests stored all processed
ExportPackets by default in a list. Memory usage tests were therefore
based on this high amount of stored objects, since no instance of any
ExportPacket was deleted until exit.
With the new parameter store_packets the caller can define how many
packets should be stored during receiving, to allow testing multiple
scenarios.
Three such scenarios are implemented: store no packets, store a maximum
of 500 at a time, and store all packets. This comes much closer to the
real-world scenario of the collector, which uses a "for export in
listener.get" loop, dumping each new ExportPacket to file immediately
and then deleting the object.
Still, the case where all packets are stored must also be covered,
because the collector might not be the only implementation which uses
listener.get, so memory leaks should remain detectable.
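A sketch of the storing behaviour; the eviction policy when the cap is
reached is an assumption:

    def send_recv_packets(packets, store_packets=-1):
        # -1 stores every packet, 0 stores none, N keeps at most N at a time.
        stored = []
        for export in packets:          # stands in for receive-and-parse
            if store_packets != 0:
                if 0 < store_packets <= len(stored):
                    stored.clear()      # assumed: drop the batch at the cap
                stored.append(export)
        return stored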
Analyzer test was missing imports.
IPFIX template fields of 16 bytes were handled as a special case, since
struct does not natively support converting them to int. The new
implementation still treats them separately, but now uses struct's "s"
unpack format descriptor.
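A sketch of the unpacking, with the "s" descriptor for the 16-byte
case (the surrounding helper is illustrative):

    import struct

    def unpack_int_field(raw: bytes) -> int:
        # struct has no native 16-byte integer format, so such fields
        # (e.g. IPv6 addresses) go through "16s" and int.from_bytes.
        if len(raw) == 16:
            (chunk,) = struct.unpack("!16s", raw)
            return int.from_bytes(chunk, byteorder="big")
        fmt = {1: "!B", 2: "!H", 4: "!I", 8: "!Q"}[len(raw)]
        return struct.unpack(fmt, raw)[0]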
The collector should catch both v9 and IPFIX template errors - a syntax
error was corrected. The v9 ExportPacket.templates attribute is now a
@property and read-only.
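A sketch of the read-only property (the class internals are
illustrative):

    class V9ExportPacket:
        def __init__(self, templates):
            self._templates = templates

        @property
        def templates(self):
            # No setter is defined, so the attribute cannot be reassigned.
            return self._templates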
The tests are now located in tests/. They are also split into multiple
files, beginning with test_netflow and test_analyzer. The tests for
IPFIX will be added to test_ipfix.
Python struct does not natively support 16-byte fields. But since IPFIX
uses 16-byte fields for at least IPv6 addresses, the IPFIX parser must
be able to process them. This commit adds support for 16-byte fields by
handling them as special struct.unpack cases.
At different points in the tool set, NetFlow (v9) was set as the
default case. Now that IPFIX is on its way to being supported as well,
all occurrences where a differentiation must be made were adapted.
Second half of the IPFIX implementation now adds the support for data
records. The templates are also extracted, allowing the collector to use
them across exports.
The field types were extracted from the IANA assignment list at
https://www.iana.org/assignments/ipfix/ipfix-information-elements.csv
Please note that the IPFIX implementation was made from scratch and
differs from the NetFlow v9 implementation, as there was little
copy/paste.
Adds a new module, IPFIX. The collector already recognizes version 10
in the header, meaning IPFIX. The parser is able to dissect the export
packet and all sets with their headers.
Still missing is the handling of templates in the data sets, a feature
needed for the whole parsing process to complete.
The collector is able to parse templates in an export and then use
these templates to parse dataflows inside the same export packet. But
the test implementation was based on the assumption that the templates
always arrive first in the packet. Now, a mixed order is also processed
successfully. Test included.
To get closer to a stable package, netflow now offers the parse_packet
function in its top-level __init__ file. This function was also enhanced
to handle multiple input formats (str, bytes, hex bytes).
Updated README accordingly.
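A sketch of such input normalization; the helper name is illustrative:

    def _to_bytes(data):
        # Raw bytes pass through; str input is interpreted as hex.
        if isinstance(data, bytes):
            return data
        if isinstance(data, str):
            return bytes.fromhex(data)
        raise ValueError("Cannot handle data of type {}".format(type(data)))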
Until now, the V1DataFlow and V5DataFlow classes unpacked the byte
stream into their specific fields in a verbose way. With this commit,
both use a list of field names, a single struct.unpack call and a
mapping for-loop for each field.
Additionally, the upper boundary of the passed data slice was added.
With the self.__dict__.update() call all fields are now also accessible
as direct attributes of the corresponding instance, e.g. flow.PROTO to
access flow.data["PROTO"]. This works for flows of all three versions.
The tests were adapted to reflect this new implementation.
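A condensed sketch of the pattern (field list truncated, format string
illustrative):

    import struct

    V1_FIELDS = ["IPV4_SRC_ADDR", "IPV4_DST_ADDR", "NEXT_HOP"]  # truncated

    class V1DataFlow:
        LENGTH = 12  # upper boundary of the data slice for these fields

        def __init__(self, data):
            # One struct.unpack call over a bounded slice, then a mapping
            # loop from field names to the unpacked values.
            values = struct.unpack("!III", data[:self.LENGTH])
            self.data = dict(zip(V1_FIELDS, values))
            # Fields become direct attributes too, e.g. flow.NEXT_HOP
            # instead of flow.data["NEXT_HOP"]:
            self.__dict__.update(self.data)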
Beginning with this commit, the reference implementations of the
collector and analyzer are now included in the package. They are
callable by running `python3 -m netflow.collector` or `.analyzer`, with
the same flags as before. Use `-h` to list them.
Additional fixes are contained in this commit as well, e.g. adding more
version prefixes and moving parts of code from __init__ to utils, to fix
circular imports.
Until now, every NetFlow version file used similar names for their
classes, e.g. "Header". These are now prefixed with their respective
version, e.g. "V1Header", to avoid confusion in imports etc.
The README and setup.py were adapted to the current state, preparing for
PyPI upload and package info.
In v9, the header received an additional .json property, which exports
the header as a dict to allow JSON serialization in the export file.
This export is used in main.py
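A sketch of such a property (the header fields shown are illustrative):

    class V9Header:
        def __init__(self, version, count):
            self.version = version
            self.count = count

        @property
        def json(self):
            # Dict form of the header, ready for json.dumps() when
            # writing the export file.
            return {"version": self.version, "count": self.count}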
Previously, the analyzer assumed that two consecutive flows form a
pair. This proved unreliable, so a new comparison algorithm is used. It
utilizes the IP addresses and the 'first_switched' parameter to identify
two flows of the same connection.
More improvements can be done, especially filtering and in the
identification of the initiating peer.
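A sketch of such a pairing key; the record field names follow common
NetFlow v9 naming and are assumptions here:

    def pairing_key(flow):
        # Two flows of one connection share their endpoints (in either
        # direction) and the same 'first_switched' timestamp, so the
        # unordered address pair plus that timestamp identifies the pair.
        addrs = frozenset((flow["IPV4_SRC_ADDR"], flow["IPV4_DST_ADDR"]))
        return (addrs, flow["FIRST_SWITCHED"])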
Tests still fail; they have to be adapted to the new dicts and gzip.