netflow

Author	SHA1	Message	Date
kudakwashe siziva	59652f7d2f	Update README.md Changed file extension from json to gz	2020-01-17 10:43:21 +02:00
Dominik Pataky	61439ec6ef	Improve analyzer (handling of pairs, dropping noise) Previously, the analyzer assumed that two consecutive flows would be a pair. This proved unreliable, therefore a new comparison algorithm is ussed. It utilizes the IP addresses and the 'first_switched' parameter to identify two flows of the same connection. More improvements can be done, especially filtering and in the identification of the initiating peer. Tests still fail, have to be adapted to the new dicts and gzip.	2019-11-03 15:58:40 +01:00
Dominik Pataky	eff99fc6e3	Add client info to stored data Until now, packets arriving at the collector's interface were stored by timestamp, with the exported flows in the payload. This format is now extended to also store the client's IP address and port, allowing multiple clients to export flows to the same collector instance.	2019-11-03 13:57:06 +01:00
Dominik Pataky	1646a52f17	Store IP addresses (v4 + v6) as strings rather than ints As mentioned by @pR0Ps in `6b9d20c8a6/analyze_json.py (L83)` IP addresses, especially in IPv6, should better be stored as parsed strings instead of their raw integer values. Implemented.	2019-11-03 13:35:32 +01:00
Dominik Pataky	6b9d20c8a6	Refactor storing data and writing to disk - using gzip and lines In previous versions, collected flows (parsed data) were stored in memory by the collector. In regular intervals, or at shutdown, this one single dict was dumped as JSON onto disk. With this commit, the behaviour is changed to line-based JSON dumps for each flow, gzipped onto disk for storage efficiency. The analyze_json is updated as well to handle the new gzipped files in the new format. See the comments in main.py for more details. Fixes #10	2019-11-03 12:02:05 +01:00
Dominik Pataky	3dee135a22	Merge branch 'props-master' Merging pull request #9 by @pR0Ps https://github.com/bitkeks/python-netflow-v9-softflowd/pull/9 Thanks for the contribution! Resolves #9	2019-10-31 18:02:06 +01:00
Dominik Pataky	9f16d246a5	Add v1, v5 to README; change fallback; add timeout parameter Updated the README to reference NetFlow v1 and v5 as well. The fallback(key, dict) method used an exception-based testing of the keys existence. Switched to 'if x in'. The NetFlowListener is based on threading.Thread, which uses the 'timeout' parameter in .join(). Added.	2019-10-31 17:55:48 +01:00
Dominik Pataky	bfec3953e6	Bump version, fix small errors, decrease packet num in tests	2019-10-31 17:35:15 +01:00
Carey Metcalfe	345a5b08ff	Fix setup.py file	2019-10-16 23:46:32 -04:00
Carey Metcalfe	bf92f24669	Add test for invalid packets	2019-10-16 23:46:32 -04:00
Carey Metcalfe	96817f1f8d	Add support for v1 and v5 NetFlow packets Thanks to @alieissa for the initial v1 and v5 code	2019-10-16 23:46:32 -04:00
Carey Metcalfe	186b648c4d	Fix tests Uses the analyzer's new stdin-reading capabilities to test the analysis without having to write temporary files. Also removes most of the delays because the listener can keep up now.	2019-10-16 23:44:28 -04:00
Carey Metcalfe	8e6d0c54e8	Allow analyze_json.py to accept input via stdin This will make testing much cleaner in the future (no temp files needed) Also increase performance by memoizing the hostname lookup	2019-10-16 23:44:19 -04:00
Carey Metcalfe	11dc92269c	Refactor code to make programatic access to flows easier This commit splits the packet collecting and processing out into a thread that provides a queue-like `get(block=True, timeout=None)` function for getting processed `ExportPackets`. This makes it much easier to use rather than starting a generator and sending a value to it when you want to stop. The `get_export_packets` generator is an example of using it - it just starts the thread and yields values from it.	2019-10-16 23:33:22 -04:00
Carey Metcalfe	ef151f8d28	Improve collector script and restructure code - Moved the netflow library out of the src directory - The UDP listener was restructured so that multiple threads can receive packets and push them into a queue. The main thread then pulls the packets off the queue one at a time and processes them. This means that the collector will never drop a packet because it was blocked on processing the previous one. - Adds a property to the ExportPacket class to expose if any new templates are contained in it. - The collector will now only retry parsing past packets when a new template is found. Also refactored the retry logic a bit to remove duplicate code (retrying just pushes the packets back into the main queue to be processed again like all the other packets). - The collector no longer continually reads and writes to/from the disk. It just caches the data in memory until it exits instead.	2019-10-16 23:31:39 -04:00
Dominik Pataky	ce2be709d6	Update README + LICENSE	2019-03-31 21:37:13 +02:00
Dominik Pataky	8de110980c	Add tests for the collector (main.py).	2019-03-31 21:23:24 +02:00
Dominik Pataky	85e6af4bd2	Add buffering of exports with unknown template Until now, exports which were received, but their template was not known, resulted in KeyError exceptions due to a missing key in the template dict. With this release, these exports are buffered until a template export updates this dict, and all buffered exports are again examined. Release v0.7.0 Fixes #4 Fixes #5	2019-03-31 20:51:34 +02:00
Dominik Pataky	5c7ec0aef8	Add additional field types (ASA, PANOS) and set fallback type to 0 refs #4 @ Github	2018-06-15 13:48:17 +02:00
Dominik Pataky	9395aafa71	Fix missing IP_PROTOCOL_VERSION field in analyzer Checks for the key first and handles non-existence. Update to Copyright notices. Fixes #3	2018-02-20 12:09:54 +01:00
Dominik Pataky	691a3480fd	Add duration to Connection	2017-10-29 19:38:33 +01:00
Dominik Pataky	6c267c8c77	Bump to 0.6; expand analyzer	2017-10-29 11:53:32 +01:00
Dominik Pataky	898d220a91	Add JSON export and analyzing example script	2017-10-28 19:00:18 +02:00
Dominik Pataky	92d8e724bf	Fix merge for Python3	2017-10-28 17:34:55 +02:00
cookie	9df5bd426e	Merge pull request #2 from deeso/master Created an installable Python Package	2017-10-28 17:19:29 +02:00
Adam Pridgen	23bc00a316	typo in logging message	2017-09-16 14:15:34 -05:00
Adam Pridgen	e11105e950	added setup main file	2017-09-16 14:11:44 -05:00
Doм	7b24ae51e0	Merge pull request #1 from randerzander/master Thanks for contributing @randerzander !	2016-12-12 18:46:06 +01:00
Randy Gelhausen	bd22551669	converted hardcoded host/port to arg driven, switched int.from_bytes to Python2 friendly routine	2016-11-29 22:50:09 -05:00
Dominik Pataky	8fa999b877	Remove namedtuples import (old version)	2016-08-10 23:10:11 +02:00
Dominik Pataky	aa2a8d8458	Add LICENSE and README.md	2016-08-10 22:47:35 +02:00
Dominik Pataky	546f96122f	Fix datarecord saving bug; cleanup; license	2016-08-10 22:33:57 +02:00
Dominik Pataky	2d7c905d41	Parsing finished, bug in datarecord lists	2016-08-10 20:38:07 +02:00
Dominik Pataky	1be7552e06	Add classes	2016-08-10 18:55:38 +02:00
Dominik Pataky	6cf8356456	Basic implementation of udp socket listener and FlowRecord	2016-08-10 16:28:29 +02:00

35 commits