mentat.stats.idea module¶
Library for calculating various statistics from given list of IDEA messages.
-
mentat.stats.idea.
LIST_AGGREGATIONS
= (['ips', ('Source.IP4', 'Source.IP6'), '__unknown__'], ['analyzers', ('Node[#].SW',), '__unknown__'], ['categories', ('Category',), '__unknown__'], ['detectors', ('Node[#].Name',), '__unknown__'], ['abuses', ('_Mentat.ResolvedAbuses', '_CESNET.ResolvedAbuses'), '__unknown__'], ['asns', ('_Mentat.SourceResolvedASN', '_CESNET.SourceResolvedASN'), '__unknown__'], ['countries', ('_Mentat.SourceResolvedCountry', '_CESNET.SourceResolvedASN'), '__unknown__'], ['classes', ('_Mentat.EventClass', '_CESNET.SourceResolvedASN'), '__unknown__'], ['severities', ('_Mentat.EventSeverity', '_CESNET.SourceResolvedASN'), '__unknown__'])¶ List of statistical aggregations.
-
mentat.stats.idea.
LIST_CALCSTAT_KEYS
= ('ips', 'analyzers', 'categories', 'detectors', 'abuses', 'asns', 'countries', 'classes', 'severities', 'category_sets', 'detectorsws', 'detector_types', 'source_types', 'target_types', 'source_ports', 'target_ports', 'protocols')¶ List of subkey names of all calculated statistics.
-
mentat.stats.idea.
LIST_OPTIMAL_STEPS
= (1, 2, 3, 4, 5, 6, 10, 12, 15, 20, 30, 60, 120, 180, 240, 300, 360, 600, 720, 900, 1200, 1800, 3600, 7200, 10800, 14400, 21600, 28800, 43200, 86400, 172800, 259200, 345600, 432000, 518400, 604800, 864000, 1209600)¶ List of optimal timeline steps. This list is populated with values that round nicely in time calculations.
-
mentat.stats.idea.
LIST_STAT_GROUPS
= ('stats_internal', 'stats_external', 'stats_overall')¶ List of statistical groups. The statistics will be calculated separately for each of them.
-
mentat.stats.idea.
TRUNCATION_THRESHOLD
= 100¶ Threshold for truncated statistics.
-
mentat.stats.idea.
TRUNCATION_WHITELIST
= {'abuses': True, 'analyzers': True, 'categories': True, 'category_sets': True, 'classes': True, 'countries': True, 'detector_types': True, 'detectors': True, 'detectorsws': True, 'protocols': True, 'severities': True, 'source_types': True, 'target_types': True}¶ Whitelist for truncating statistics.
-
mentat.stats.idea.
TRUNCATION_WHITELIST_THRESHOLD
= 1000¶ Threshold for whitelisted truncated statistics.
-
mentat.stats.idea.
aggregate_dbstats_events
(aggr_type, aggr_name, aggr_data, default_val, timeline_cfg, result=None)[source]¶
-
mentat.stats.idea.
aggregate_stat_groups
(stats_list, result=None)[source]¶ Aggregate multiple full statistical records produced by the
mentat.stats.idea.evaluate_event_groups()
function into a single statistical record.
- Parameters
stats_list (list) – List of full statistical records to be aggregated.
result (dict) – Optional dictionary structure to contain the result.
- Returns
Single statistical record structure.
- Return type
dict
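The aggregation described above can be pictured as summing counters key by key. The following is only a minimal sketch over plain dictionaries: the record layout (a scalar "count" plus a nested "ips" counter) and the helper name merge_records are assumptions made for illustration, not the structure actually produced by evaluate_event_groups().

```python
def merge_records(records, result=None):
    """Hypothetical helper: sum scalar counters and nested counters."""
    merged = result if result is not None else {}
    for record in records:
        for key, value in record.items():
            if isinstance(value, dict):
                # Nested counter, e.g. occurrences per IP address.
                bucket = merged.setdefault(key, {})
                for item, count in value.items():
                    bucket[item] = bucket.get(item, 0) + count
            else:
                # Scalar counter, e.g. total number of events.
                merged[key] = merged.get(key, 0) + value
    return merged


# Assumed, simplified record layout used only for this example.
records = [
    {"count": 2, "ips": {"192.0.2.1": 1, "192.0.2.2": 1}},
    {"count": 3, "ips": {"192.0.2.1": 3}},
]
merged = merge_records(records)
```

Passing an existing dictionary as result lets the caller accumulate several batches into the same structure, which mirrors the optional result argument in the signature.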
-
mentat.stats.idea.
aggregate_stats_reports
(report_list, dt_from, dt_to, result=None)[source]¶ Aggregate multiple reporting statistical records.
- Parameters
report_list (list) – List of report objects as retrieved from database.
dt_from (datetime.datetime) – Lower timeline boundary.
dt_to (datetime.datetime) – Upper timeline boundary.
result (dict) – Optional data structure for storing the result.
- Returns
Single aggregated statistical record.
- Return type
dict
-
mentat.stats.idea.
aggregate_timeline_groups
(stats_list, dt_from, dt_to, max_count, min_step=None, result=None)[source]¶ Aggregate multiple full statistical records produced by the
mentat.stats.idea.evaluate_event_groups()
function and later retrieved from database as mentat.datatype.sqldb.EventStatisticsModel
into a single statistical record. The requested timeline interval boundaries will be adjusted as necessary to provide the best result.
- Parameters
stats_list (list) – List of full statistical records to be aggregated.
dt_from (datetime.datetime) – Lower requested timeline time interval boundary.
dt_to (datetime.datetime) – Upper requested timeline time interval boundary.
max_count (int) – Maximal number of steps in timeline.
min_step (int) – Force minimal step size in timeline.
result (dict) – Optional dictionary structure to contain the result.
- Returns
Single statistical record structure.
- Return type
dict
-
mentat.stats.idea.
calculate_timeline_config
(dt_from, dt_to, max_count, min_step=None)[source]¶ Calculate optimal configurations for timeline chart dataset.
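One way to choose a step from LIST_OPTIMAL_STEPS is to take the smallest value that keeps the number of timeline steps within max_count. The sketch below illustrates that idea only; the helper name pick_step, the interpretation of the values as seconds, and the fallback to the largest step are assumptions, and the real function may also adjust the interval boundaries.

```python
import datetime
import math

# Values copied from LIST_OPTIMAL_STEPS; assumed to be seconds.
OPTIMAL_STEPS = (
    1, 2, 3, 4, 5, 6, 10, 12, 15, 20, 30, 60, 120, 180, 240, 300, 360,
    600, 720, 900, 1200, 1800, 3600, 7200, 10800, 14400, 21600, 28800,
    43200, 86400, 172800, 259200, 345600, 432000, 518400, 604800,
    864000, 1209600,
)


def pick_step(dt_from, dt_to, max_count, min_step=None):
    """Hypothetical helper: smallest optimal step with at most max_count steps."""
    span = (dt_to - dt_from).total_seconds()
    for step in OPTIMAL_STEPS:
        if min_step is not None and step < min_step:
            continue  # honour the forced minimal step size
        if math.ceil(span / step) <= max_count:
            return datetime.timedelta(seconds=step)
    # Interval too long for any listed step: fall back to the largest one.
    return datetime.timedelta(seconds=OPTIMAL_STEPS[-1])


day_start = datetime.datetime(2024, 1, 1)
day_end = datetime.datetime(2024, 1, 2)
step = pick_step(day_start, day_end, max_count=100)
```

For a one-day interval and at most 100 steps, this sketch settles on a 15-minute step (96 steps), because the next smaller candidates of 12 and 10 minutes would need 120 and 144 steps.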
-
mentat.stats.idea.
calculate_timeline_config_daily
(dt_from, dt_to)[source]¶ Calculate optimal configurations for timeline chart dataset with steps forced to one day.
-
mentat.stats.idea.
evaluate_event_groups
(events, stats=None)[source]¶ Evaluate full statistics for given list of IDEA events. Events will be grouped using
group_events()
first, and the statistics will be evaluated separately for each of the message groups stats_overall, stats_internal and stats_external.
- Parameters
events (list) – List of IDEA events to be evaluated.
stats (dict) – Optional dictionary structure to populate with statistics.
- Returns
Structure containing evaluated event statistics.
- Return type
dict
-
mentat.stats.idea.
evaluate_events
(events, stats=None)[source]¶ Evaluate statistics for given list of IDEA events.
- Parameters
events (list) – List of IDEA events to be evaluated.
stats (dict) – Optional data structure to which to append the calculated statistics.
- Returns
Structure containing calculated event statistics.
- Return type
dict
-
mentat.stats.idea.
evaluate_singlehost_events
(host, events, dt_from, dt_to, max_count, stats=None)[source]¶ Evaluate statistics for given list of IDEA events and produce statistical record for single host visualisations.
- Parameters
host (str) – Event host.
events (list) – List of IDEA events to be evaluated.
dt_from (datetime.datetime) – Lower timeline boundary.
dt_to (datetime.datetime) – Upper timeline boundary.
max_count (int) – Maximal number of items for generating toplists.
stats (dict) – Data structure to which to append calculated statistics.
- Returns
Structure containing evaluated event timeline statistics.
- Return type
dict
-
mentat.stats.idea.
evaluate_timeline_events
(events, dt_from, dt_to, max_count, stats=None)[source]¶ Evaluate statistics for given list of IDEA events and produce statistical record for timeline visualisations.
- Parameters
events (list) – List of IDEA events to be evaluated.
dt_from (datetime.datetime) – Lower timeline boundary.
dt_to (datetime.datetime) – Upper timeline boundary.
max_count (int) – Maximal number of items for generating toplists.
stats (dict) – Data structure to which to append calculated statistics.
- Returns
Structure containing evaluated event timeline statistics.
- Return type
dict
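A timeline record is essentially a sequence of time buckets between dt_from and dt_to, each holding statistics for the events detected in that bucket. The sketch below shows only the bucketing arithmetic, counting bare timestamps; the helper name bucket_events and the explicit step argument are assumptions, and the real function works on IDEA events and computes richer per-bucket statistics.

```python
import datetime
import math


def bucket_events(timestamps, dt_from, dt_to, step):
    """Hypothetical helper: count timestamps per fixed-size time bucket."""
    n_buckets = math.ceil((dt_to - dt_from) / step)
    buckets = [0] * n_buckets
    for ts in timestamps:
        # Lower boundary is inclusive, upper boundary exclusive.
        if dt_from <= ts < dt_to:
            buckets[(ts - dt_from) // step] += 1
    return buckets


start = datetime.datetime(2024, 1, 1, 0, 0)
end = datetime.datetime(2024, 1, 1, 1, 0)
stamps = [
    start + datetime.timedelta(minutes=5),
    start + datetime.timedelta(minutes=14),
    start + datetime.timedelta(minutes=15),
    start + datetime.timedelta(minutes=59),
    end,  # falls outside the half-open interval
]
counts = bucket_events(stamps, start, end, datetime.timedelta(minutes=15))
```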
-
mentat.stats.idea.
group_events
(events)[source]¶ Group events according to the presence of the
_Mentat.ResolvedAbuses
(or _CESNET.ResolvedAbuses) key. Each event will be added to group overall and then to either internal or external, based on the presence of the key mentioned above.
- Parameters
events (list) – List of IDEA events to be grouped.
- Returns
Structure containing event groups stats_overall, stats_internal and stats_external.
- Return type
dict
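The grouping rule can be sketched with plain dictionaries standing in for IDEA events. Two assumptions are made here for illustration: events are treated as nested dicts (real events are message objects), and an event carrying a non-empty ResolvedAbuses value is treated as internal. The helper name group_events_sketch is hypothetical.

```python
def group_events_sketch(events):
    """Hypothetical re-implementation of the grouping rule on plain dicts."""
    groups = {"stats_overall": [], "stats_internal": [], "stats_external": []}
    for event in events:
        # Every event belongs to the overall group.
        groups["stats_overall"].append(event)
        resolved = (
            event.get("_Mentat", {}).get("ResolvedAbuses")
            or event.get("_CESNET", {}).get("ResolvedAbuses")
        )
        # Assumption: a resolved abuse contact marks the event as internal.
        key = "stats_internal" if resolved else "stats_external"
        groups[key].append(event)
    return groups


events = [
    {"ID": "a", "_Mentat": {"ResolvedAbuses": ["abuse@example.org"]}},
    {"ID": "b", "_CESNET": {"ResolvedAbuses": ["abuse@example.net"]}},
    {"ID": "c"},
]
groups = group_events_sketch(events)
```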
-
mentat.stats.idea.
truncate_evaluations
(stats, top_threshold=100, force=False)[source]¶ Make all statistical groups more brief with truncate_stats().
- Parameters
stats (dict) – Structure containing statistics for all groups.
top_threshold (int) – Toplist threshold size.
force (bool) – Force the toplist threshold even to whitelisted keys.
- Returns
Updated structure containing statistics.
- Return type
dict
-
mentat.stats.idea.
truncate_stats
(stats, top_threshold=100, force=False)[source]¶ Make statistics more brief. For each of the statistical aggregation subkeys generate toplist containing given number of items at most.
- Parameters
stats (dict) – Structure containing statistics.
top_threshold (int) – Toplist threshold size.
force (bool) – Force the toplist threshold even to whitelisted keys.
- Returns
Updated structure containing statistics.
- Return type
dict
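A toplist truncation of this kind can be sketched as follows. Several details are assumptions made for illustration: the "__REST__" bucket name, the choice to keep top_threshold - 1 items plus the rest bucket so that the result never exceeds the threshold, and the reading that whitelisted keys use TRUNCATION_WHITELIST_THRESHOLD unless force is set. The helper names are hypothetical.

```python
# Subset of TRUNCATION_WHITELIST, enough for the example.
WHITELIST = {"categories", "detectors"}
WHITELIST_THRESHOLD = 1000  # TRUNCATION_WHITELIST_THRESHOLD


def truncate_counts(counts, top_threshold=100, rest_key="__REST__"):
    """Hypothetical helper: keep the most frequent items, sum the rest."""
    if top_threshold < 1:
        raise ValueError("top_threshold must be at least 1")
    if len(counts) <= top_threshold:
        return dict(counts)
    # Sort by descending count, then by key for a stable result.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    kept = dict(ranked[: top_threshold - 1])
    kept[rest_key] = sum(value for _, value in ranked[top_threshold - 1:])
    return kept


def truncate_stats_sketch(stats, top_threshold=100, force=False):
    """Apply truncate_counts to every counter in a statistics dict."""
    for key, counts in stats.items():
        if not isinstance(counts, dict):
            continue  # scalar values are left untouched
        limit = top_threshold
        if key in WHITELIST and not force:
            limit = WHITELIST_THRESHOLD
        stats[key] = truncate_counts(counts, limit)
    return stats


top = truncate_counts({"a": 5, "b": 3, "c": 2, "d": 1}, top_threshold=3)
```

Folding the discarded items into a single bucket keeps the totals intact, so a chart built from the truncated data still sums to the original event count.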
-
mentat.stats.idea.
truncate_stats_with_mask
(stats, mask, top_threshold=100, force=False)[source]¶ Make statistics more brief. For each of the statistical aggregation subkeys generate toplist containing at most given number of items, but in this case use given precalculated mask to decide which items should be hidden. The use case for this method is during calculation of timeline statistics. In that case the global toplists must be given to mask out the items in every time interval, otherwise every time interval might have different item toplist and it would not be possible to draw such a chart.
- Parameters
stats (dict) – Structure containing single statistic category.
mask (dict) – Global truncated statistics to serve as a mask.
top_threshold (int) – Toplist threshold size.
force (bool) – Force the toplist threshold even to whitelisted keys.
- Returns
Updated structure containing statistics.
- Return type
dict
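The masking idea can be illustrated with a small sketch: items present in the global toplist are kept in every time interval, and everything else is folded into a single rest bucket, so all intervals share the same set of series. The "__REST__" bucket name and the helper name apply_mask are assumptions, and the sketch ignores the top_threshold and force arguments of the real function.

```python
def apply_mask(counts, mask, rest_key="__REST__"):
    """Hypothetical helper: keep only items named in the global mask."""
    result = {}
    rest = 0
    for item, value in counts.items():
        if item in mask and item != rest_key:
            result[item] = value
        else:
            # Items hidden by the mask, and any existing rest bucket.
            rest += value
    if rest:
        result[rest_key] = rest
    return result


# Global toplist computed over the whole time range.
mask = {"a": 50, "b": 30, "__REST__": 20}
# Counters for one time interval of the timeline.
interval = {"a": 2, "c": 4, "d": 1}
masked = apply_mask(interval, mask)
```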