99 commits

Author SHA1 Message Date
Dominik Pataky 7ea24a900c Small addition to grafolean fix (comments, endianness hint) 2022-07-02 11:50:18 +02:00
Anze 92b221aa10 Fix: f-strings might not be supported 2022-05-08 22:01:34 +02:00
Anze 1bffe3a2a3 Performance improvement: rearrange netflow v9 packet parsing (use struct.unpack to extract all of the values at once) 2022-05-08 18:31:06 +02:00
Anze c12507343b Performance improvement: no need to copy a part of the buffer when using struct.unpack_from() 2022-05-08 18:30:11 +02:00
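The difference between slicing and struct.unpack_from() can be sketched as follows. This is a minimal illustration, not the library's parser: the buffer layout and variable names are made up for the example.

```python
import struct

# Hypothetical 8-byte buffer: two 16-bit fields followed by one 32-bit field,
# all in network byte order.
buffer = struct.pack("!HHI", 9, 2, 3600)

# Slicing creates a new bytes object before unpacking.
version_copy, count_copy = struct.unpack("!HH", buffer[0:4])

# unpack_from() reads at an offset inside the original buffer, so no slice
# (and therefore no copy) is needed.
version, count = struct.unpack_from("!HH", buffer, 0)
(uptime,) = struct.unpack_from("!I", buffer, 4)
```

Both variants return the same values; only the intermediate copy differs.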
Anze 77da7b16b6 Performance improvement: use struct.unpack instead of manually constructing bytes when possible 2022-05-08 17:54:05 +02:00
Anze b10dc5faef Performance improvement: rearrange code so that instead of converting IP addresses to integers first, we construct them from bytes directly 2022-05-08 17:52:51 +02:00
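A rough sketch of the idea, assuming the standard ipaddress module is used (the commit does not say so explicitly): ipaddress.ip_address() accepts packed bytes, so the detour via an integer can be skipped. The sample address is illustrative only.

```python
import ipaddress
import struct

# Four raw bytes as they might appear in a flow record (illustrative value).
raw = bytes([192, 0, 2, 1])

# Detour: convert the bytes to an integer first, then build the address.
as_int = struct.unpack("!I", raw)[0]
via_int = ipaddress.ip_address(as_int)

# Direct: packed bytes of length 4 (IPv4) or 16 (IPv6) are accepted as-is.
direct = ipaddress.ip_address(raw)
loopback_v6 = ipaddress.ip_address(bytes(15) + b"\x01")
```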
Anze ef99464fc5 Performance improvement: when checking if a field contains an IP address, compare the keys (which are integers) instead of values (strings) 2022-05-08 17:51:56 +02:00
Dominik Pataky 0e24ad9e64 Merge branch 'medigateio-fix/avoid-infinite-loop-in-V9ExportPacket-constructor' 2022-04-25 20:29:59 +02:00
Dominik Pataky 8b5675913d Small changes to PR #37 preventing infinite loops; bump version
Closes #37
2022-04-25 20:26:04 +02:00
Vitali Sepetnitsky b8e911a40a avoid infinite loop in V9ExportPacket's constructor 2022-02-16 18:39:15 +02:00
Dominik Pataky 87c1bfb892 Release v0.11.0; adds Netflow v9 option headers 2021-11-14 17:53:27 +01:00
cookie a86fe7c731
Merge pull request #35 from bitkeks/add_v9_options
Add v9 options
2021-11-14 17:45:11 +01:00
Dominik Pataky 3b207c3568 Update README 2021-05-02 16:15:38 +02:00
Dominik Pataky ab32ce93b5 Fix counters in options templates
Counters in 4-packs used '/ 4' instead of '// 4', passing a float into
range(), instead of int.

Refs #30
2021-05-02 15:48:20 +02:00
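The bug class described here can be reproduced in isolation. The variable names below are hypothetical and not taken from the package:

```python
# Hypothetical length of a block that is read in packs of 4 bytes.
length_in_bytes = 16

float_count = length_in_bytes / 4   # true division always yields a float
int_count = length_in_bytes // 4    # floor division keeps the int

# range() only accepts integers, so the float variant fails.
try:
    range(float_count)
    float_accepted = True
except TypeError:
    float_accepted = False

indices = list(range(int_count))
```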
Dominik Pataky 5adde00aec Implement options templates/records handling for V9
Previously, option templates and their data records were not correctly
recognized. This is fixed now. Collectors can now use the
V9ExportPacket.options field to get a list of V9OptionsDataRecord, with
scopes and data fields.

Templates are mixed in the templates dict. They will have both data
templates and option templates. Let's hope exporters do not mix them
(re-use the same IDs for both template types).

During development, the search for the correct template was refactored.
The templates are not passed into the V9DataFlowSet any more. Only the
one single matching template is passed into V9DataFlowSet and
V9OptionsDataFlowset, as should be.

Refs #30
2021-04-05 13:07:32 +02:00
Dominik Pataky e43980fe4a Add stub implementation to store V9 options templates
This is a hacky workaround to handle V9 options templates, without
implementing the full corresponding spec. This solves missing templates
which raise a V9TemplateNotRecognized exception, even though an exporter
might do everything correctly.

Refs #29
Refs #30
2021-04-04 20:42:49 +02:00
Dominik Pataky 3f62e4a163 Merge branch 'j-licht-master' 2021-04-04 10:54:02 +02:00
Dominik Pataky 54e19af8c2 Adapt new V9OptionsTemplateFlowSet stub
Resolves #29
2021-04-04 10:35:08 +02:00
cookie fcddb49a6a Update run_tests.yml
Add pyenv action to support Python 3.5.3
2021-04-04 10:15:40 +02:00
cookie 3981721900 Create run_tests.yml 2021-04-04 10:15:40 +02:00
cookie 699ec116a4
Update run_tests.yml
Add pyenv action to support Python 3.5.3
2021-04-04 07:23:18 +00:00
cookie 536277eac6
Create run_tests.yml 2021-04-04 07:09:25 +00:00
Jonas Licht 5b823052f1 Stub parsing of option templates so that option datasets can be ignored 2021-03-26 16:46:27 +01:00
Dominik Pataky 06d7c0c5d0 Improve parse_packet documentation and error handling (exception)
The parse_packet function is one of the main functions for usage of this
library in other scripts. It works, but was under-documented until now.
Especially the 'templates' parameter might lead to confusion for new
users who have not yet worked with templates. This commit should make
things clearer.

Refs #28
2020-08-01 12:33:40 +02:00
Dominik Pataky 81d57f3c4c Handle SIGINT and SIGTERM in yielding listener
Signals INT and TERM were not correctly handled in the 'while True' loop
of the yielding listener function. Now, the loop breaks as expected,
terminating the listener thread and the application.
2020-08-01 10:46:35 +02:00
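One common way to make such a loop stoppable is a flag that the signal handler sets and the loop checks. The sketch below assumes this pattern; it is not the commit's actual code, and all names (stop_requested, request_stop, install_handlers, yield_exports) are hypothetical.

```python
import queue
import signal
import threading

# Set by the signal handler, checked by the loop.
stop_requested = threading.Event()


def request_stop(signum, frame):
    """Signal handler: only record that a shutdown was requested."""
    stop_requested.set()


def install_handlers():
    # signal.signal() may only be called from the main thread.
    signal.signal(signal.SIGINT, request_stop)
    signal.signal(signal.SIGTERM, request_stop)


def yield_exports(source, poll_interval=0.1):
    """Yield items from a queue until a stop was requested."""
    while not stop_requested.is_set():
        try:
            yield source.get(timeout=poll_interval)
        except queue.Empty:
            continue  # nothing received, re-check the stop flag
```

The timeout on get() matters: a blocking get() without it would never return to the loop condition, so the flag would not be seen.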
Dominik Pataky 5cdb514ffc Ensure compatibility with Python 3.5.3
This commit replaces multiple occurrences of new features which were not
yet implemented with Python 3.5.3, which is the reference backwards
compatibility version for this package. The version is based on the
current Python version in Debian Stretch (oldstable). According to
pkgs.org, all other distros use 3.6+, so 3.5.3 is the lower boundary.

Changes:
  * Add maxsize argument to functools.lru_cache decorator
  * Replace f"" with .format()
  * Replace variable type hints "var: type = val" with "# type:" comments
  * Replace pstats.SortKey enum with strings in performance tests

Additionally, various styling fixes were applied.
The version compatibility was tested with tox, pyenv and Python 3.5.3,
but there is no tox.ini yet which automates this test.

Bump patch version number to 0.10.3
Update author's email address.

Resolves #27
2020-04-24 16:52:25 +02:00
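The first three replacements in the list above follow a general pattern, sketched here with made-up names (describe_version, packet_count); the snippet runs on Python 3.5 as well as on newer versions.

```python
import functools


# On Python 3.5 the decorator must be called; passing maxsize explicitly
# makes that unambiguous. The bare "@functools.lru_cache" form needs 3.8+.
@functools.lru_cache(maxsize=128)
def describe_version(version):
    # str.format() instead of an f-string (f-strings need Python 3.6+).
    return "NetFlow v{}".format(version)


# Type comment instead of the annotation "packet_count: int = 0" (3.6+).
packet_count = 0  # type: int
```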
Dominik Pataky 5d1c5b8710 IPFIX: add template withdrawal handling; bump version to v0.10.2
Templates may be withdrawn as per RFC7011. Receiving a template with an
existing template_id and a field_count of 0 now triggers deletion of
this template.
2020-04-06 17:27:26 +02:00
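The withdrawal rule can be sketched with a plain dict as template store. The function name and the field lists are hypothetical; the package's real data structures may differ.

```python
def apply_template_record(templates, template_id, fields):
    """Store a template, or delete it if the record carries no fields.

    A record with field_count 0 (here: an empty field list) for a known
    template ID is treated as a withdrawal of that template.
    """
    if len(fields) == 0:
        # pop() with a default also tolerates withdrawals of unknown IDs.
        templates.pop(template_id, None)
    else:
        templates[template_id] = fields


templates = {}
apply_template_record(templates, 256, ["sourceIPv4Address", "octetDeltaCount"])
known_before = 256 in templates
apply_template_record(templates, 256, [])  # withdrawal
known_after = 256 in templates
```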
Dominik Pataky 742f5a0a48 IPFIX: enhance (data|field) types and parsing; extend tests
Parts of the IPFIXFieldTypes class were extracted into the new
IPFIXDataTypes class, to increase readability and stability.

The IPFIXDataRecord class and its field parser is now more in tune with
the specifications, handling signed and unsigned, as well as float,
boolean and UTF8 strings etc.

Corresponding tests were extended with softflowd packets (level
"ethernet") and value checks (e.g. MAC address).

Resolves #25
2020-04-06 17:02:52 +02:00
Dominik Pataky 405f9c6a67 IPFIX: replace IPFIX_FIELD_TYPES with class; handle signed
In IPFIX, template fields can be signed or unsigned, or even be pure
bytes or unicode string. This differentiation was extended in this
commit.

Additionally, the IPFIX_FIELD_TYPES dict mapping from int->str was
replaced by a more verbose version, which also includes the standardized
IANA data types. The class's methods provide access to the fixed data
set. This is then used in the IPFIXDataRecord parser.

Refs #25
2020-04-04 15:21:53 +02:00
Dominik Pataky f7a44852c3 Tests: add memory performance test for v1 and v5; bump version to 0.10.1 2020-04-04 10:58:06 +02:00
Dominik Pataky 959f8d3c2c Tests: add parameter store_packets to send_recv_packets
The function send_recv_packets in tests stored all processed
ExportPackets by default in a list. Memory usage tests were therefore
based on this high amount of stored objects, since no instance of any
ExportPacket was deleted until exit.
With the new parameter store_packets the caller can define how many
packets should be stored during receiving, as to test multiple
scenarios.

Three such scenarios are implemented: don't store any packet, store
maximum of 500 at a time and store all packets. This comes much closer
to the real world scenario of the collector, which uses a "for export in
listener.get" loop, dumping any new ExportPacket to file immediately
and then deleting the object.

Yet, the case where all packets are stored must still be covered as
well, because the collector might not be the only implementation which
uses listener.get, so finding memory leaks should be covered.
2020-04-03 17:28:16 +02:00
Dominik Pataky 53f8ca764e Tests: add memory performance tests
A new test file is added which contains memory and CPU tests. For now,
only the memory usage tests work (threading!). They print out tables of
memory usage based on file path and on function. Additionally, they check
some basic measurements: if all packets were processed and if a
collection of version 9/10 called any functions in 10/9.

Refs #24
2020-04-03 15:36:09 +02:00
Dominik Pataky 258b7c1e0b Tests: move packets into lib again, add packet generator
The static packets in the tests are back in lib.py to avoid circular
imports. A new packet generator function was added.
2020-04-03 15:20:41 +02:00
Dominik Pataky 55272e8a0a Fix analyzer test; IPFIX: change handling of 16 bytes fields
Analyzer test was missing imports.

IPFIX templates with 16-byte fields were processed separately, since
struct does not natively support conversion to int for this length. The
new implementation still handles them separately, but now uses struct's
"s" unpack format descriptor.
2020-04-03 10:29:38 +02:00
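How the "s" format descriptor can carry a 16-byte field is sketched below. The record layout (an IPv6 address followed by a port) is invented for the example and is not the package's parser.

```python
import ipaddress
import struct

# Illustrative record: a 16-byte IPv6 address followed by a 16-bit port.
record = ipaddress.IPv6Address("2001:db8::1").packed + struct.pack("!H", 443)

# struct has no 128-bit integer code, but "16s" returns the raw 16 bytes,
# so the whole record can still be unpacked in a single call.
raw_address, port = struct.unpack("!16sH", record)

# The raw bytes are then converted in a separate step.
address = ipaddress.ip_address(raw_address)
as_int = int.from_bytes(raw_address, "big")
```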
Dominik Pataky 27525887bd Update README to reflect IPFIX implementation; bump version to v0.10.0
Resolves #20
2020-04-01 14:40:21 +02:00
Dominik Pataky 547792c5c2 Tests: move packets into each version test file; add tests for IPFIX
The previously introduced tests/lib.py contained the NetFlow v9 packets
and then the IPFIX packets; those were split and put into their
respective test files again. The lib now contains shared objects only.

Tests for IPFIX were added. Two new packets were added, one with
templates and one without (again, real exports from softflowd).
Different cases are checked: no template, template and later template.
Fields of flows are also checked, especially IPv6 addresses.

Note: exports made with softflowd were created by softflowd 1.0.0,
compiled from https://github.com/irino/softflowd
2020-04-01 14:15:53 +02:00
Dominik Pataky dfe0ffdcc7 IPFIX: adapt templates attribute handling to IPFIX as well 2020-04-01 14:14:47 +02:00
Dominik Pataky 143986c38d Fix multi-exception catch in collector; make templates @property in v9
The collector should catch both v9 and IPFIX template errors - syntax
error corrected. The v9 ExportPacket.templates attribute is now
@property and read-only.
2020-04-01 14:12:27 +02:00
Dominik Pataky 56d443aa2a Refactor tests, moved into tests/
The tests are now located in tests/. They are also split into multiple
files, beginning with test_netflow and test_analyzer. The tests for
IPFIX will be added to test_ipfix.
2020-04-01 11:55:45 +02:00
Dominik Pataky 4b8cbf92bc IPFIX: implement field types of 16 bytes in parser
Python struct does not natively support 16 byte fields. But since IPFIX
uses fields of length 16 bytes for at least IPv6 addresses, they must be
processed in the IPFIX parser. This commit adds support for 16 byte
fields by handling them as special struct.unpack cases.
2020-04-01 11:34:34 +02:00
Dominik Pataky d2e1bc8c83 IPFIX: reformat IANA field types dict (adding the data type) 2020-04-01 09:46:32 +02:00
Dominik Pataky c3da0b2096 Adapt utils, collector, analyzer to IPFIX
At different points in the tool set, NetFlow (v9) is set as the default
case. Now that IPFIX is on its way to be supported as well, adapt all
occurrences where a differentiation must be done.
2020-03-31 22:47:23 +02:00
Dominik Pataky 937e640198 IPFIX: implement data records and template handling; add IANA types
Second half of the IPFIX implementation now adds the support for data
records. The templates are also extracted, allowing the collector to use
them across exports.

The field types were extracted from the IANA assignment list at
https://www.iana.org/assignments/ipfix/ipfix-information-elements.csv

Please note that the IPFIX implementation was made from scratch and
differs from the NetFlow v9 implementation, as there was little
copy/paste.
2020-03-31 22:45:58 +02:00
Dominik Pataky 524e411850 Add first approach of IPFIX implementation
Adds a new module, IPFIX. The collector already recognizes version 10 in
the header, meaning IPFIX. The parser is able to dissect the export
package and all sets with their headers.

Missing is the handling of the templates in the data sets - a feature
needed for the whole parsing process to complete.
2020-03-31 20:58:15 +02:00
Dominik Pataky 0358c3416c Fix logger in collector; fix header dates 2020-03-31 16:28:33 +02:00
Dominik Pataky cd07885d28 Improve handling of mixed template/data exports; add test
The collector is able to parse templates in an export and then use these
templates to parse dataflows inside the same export packet. But the test
implementation was based on the assumption that the templates always
arrive first in the packet. Now, a mixed order is also processed
successfully. Test included.
2020-03-30 16:42:48 +02:00
Dominik Pataky d4d6d59713 Provide parse_packet as API; fix parse_packet input handling; README
To get closer to a stable package, netflow now offers the parse_packet
function in its top-level __init__ file. This function was also enhanced
to handle multiple input formats (str, bytes, hex bytes).

Updated README accordingly.
2020-03-30 13:04:25 +02:00
Dominik Pataky 7ae179cb33 Reformat data flow attributes and unpacking; adapt tests
The V1DataFlow and V5DataFlow classes used a verbose way of unpacking
the hex byte stream to the specific fields until now. With this commit,
both use a list of field names, one struct.unpack call and then a
mapping for-loop for each field.

Additionally the upper boundary of the passed data slice was added.

With the self.__dict__.update() call all fields are now also accessible
as direct attributes of the corresponding instance, e.g. flow.PROTO to
access flow.data["PROTO"]. This works for flows of all three versions.

The tests were adapted to reflect this new implementation.
2020-03-30 12:29:50 +02:00
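The pattern described in this commit (field name list, one struct.unpack call, mapping loop, __dict__ update) might look roughly like this. ExampleFlow and its three fields are a made-up, shortened stand-in for V1DataFlow/V5DataFlow.

```python
import struct


class ExampleFlow:
    # Hypothetical subset of fields; the real classes define many more.
    FIELDS = ["SRC_PORT", "DST_PORT", "PROTO"]
    FORMAT = "!HHB"
    LENGTH = struct.calcsize(FORMAT)

    def __init__(self, data):
        # One unpack call; the upper boundary of the slice is explicit.
        values = struct.unpack(self.FORMAT, data[:self.LENGTH])

        # Mapping loop: pair each field name with its unpacked value.
        self.data = {}
        for name, value in zip(self.FIELDS, values):
            self.data[name] = value

        # Expose every field as a direct attribute as well.
        self.__dict__.update(self.data)


flow = ExampleFlow(struct.pack("!HHB", 51234, 443, 6) + b"following bytes")
```

With this, flow.PROTO and flow.data["PROTO"] refer to the same value.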
Dominik Pataky 8b70fb1058 Fix to_dict() in headers; formatting
The collector uses the .to_dict() function to persist the header in its
gzipped output file. Now all headers implement this function.
2020-03-29 23:17:05 +02:00
Dominik Pataky 4a90e0ce34 Update README, bump minor version to v0.9.0 2020-03-29 22:34:30 +02:00