mentat.stats.idea module

Library for calculating various statistics from a given list of IDEA messages.

mentat.stats.idea.LIST_AGGREGATIONS = (['ips', 'Source.IP4', '__unknown__'], ['analyzers', 'Node[#].SW', '__unknown__'], ['categories', 'Category', '__unknown__'], ['detectors', 'Node[#].Name', '__unknown__'], ['abuses', '_CESNET.ResolvedAbuses', '__unknown__'], ['asns', '_CESNET.SourceResolvedASN', '__unknown__'], ['countries', '_CESNET.SourceResolvedCountry', '__unknown__'], ['classes', '_CESNET.EventClass', '__unknown__'], ['severities', '_CESNET.EventSeverity', '__unknown__'])

List of statistical aggregations.
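Each entry above appears to be a triple of result subkey, path into the IDEA message, and default value used when the path yields nothing. The following is a minimal sketch of how one such triple could be turned into value counts. It assumes events are plain nested dicts and lists, and that `[#]` selects the last element of a list; the helpers `lookup` and `count_aggregation` are hypothetical and are not part of mentat.

```python
def lookup(node, path):
    """Collect all values reachable under a dotted path, descending into lists.

    Assumption: a '[#]' suffix on a path component means "last list element".
    """
    nodes = [node]
    for part in path.split('.'):
        last_only = part.endswith('[#]')
        name = part[:-3] if last_only else part
        found = []
        for item in nodes:
            value = item.get(name) if isinstance(item, dict) else None
            if value is None:
                continue
            if isinstance(value, list):
                found.extend(value[-1:] if last_only else value)
            else:
                found.append(value)
        nodes = found
    return nodes


def count_aggregation(events, path, default):
    """Count occurrences of each value found under `path` across all events."""
    counts = {}
    for event in events:
        # Fall back to the default marker when the path yields nothing.
        values = lookup(event, path) or [default]
        for value in values:
            counts[value] = counts.get(value, 0) + 1
    return counts


events = [
    {'Category': ['Recon.Scanning'],
     'Node': [{'Name': 'a', 'SW': ['x']}, {'Name': 'b', 'SW': ['y']}]},
    {'Node': [{'Name': 'b'}]},
]
categories = count_aggregation(events, 'Category', '__unknown__')
detectors = count_aggregation(events, 'Node[#].Name', '__unknown__')
```

Here the second event has no `Category`, so it is counted under `__unknown__`.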

mentat.stats.idea.LIST_CALCSTAT_KEYS = ('ips', 'analyzers', 'categories', 'detectors', 'abuses', 'asns', 'countries', 'classes', 'severities', 'category_sets', 'detectorsws')

List of subkey names of all calculated statistics.

mentat.stats.idea.LIST_OPTIMAL_STEPS = (1, 2, 3, 4, 5, 6, 10, 12, 15, 20, 30, 60, 120, 180, 240, 300, 360, 600, 720, 900, 1200, 1800, 3600, 7200, 10800, 14400, 21600, 28800, 43200, 86400, 172800, 259200, 345600)

List of optimal timeline steps. This list is populated with values that round nicely in time calculations.
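As an illustration of how such a list can be used, the sketch below picks the smallest listed step that covers a time interval in at most a given number of steps. It assumes the values are in seconds (the documentation does not state the unit); `pick_step` is a hypothetical helper, not the selection logic mentat actually uses.

```python
import datetime
import math

# Copied from LIST_OPTIMAL_STEPS; assumed to be expressed in seconds.
OPTIMAL_STEPS = (
    1, 2, 3, 4, 5, 6, 10, 12, 15, 20, 30, 60, 120, 180, 240, 300, 360,
    600, 720, 900, 1200, 1800, 3600, 7200, 10800, 14400, 21600, 28800,
    43200, 86400, 172800, 259200, 345600,
)


def pick_step(dt_from, dt_to, max_count):
    """Return the smallest listed step that splits the interval into at most
    max_count pieces, or the largest listed step if none is big enough."""
    span = (dt_to - dt_from).total_seconds()
    raw = math.ceil(span / max_count)
    for step in OPTIMAL_STEPS:
        if step >= raw:
            return step
    return OPTIMAL_STEPS[-1]


# One day split into at most 100 steps: 86400 / 100 = 864, rounded up to 900.
step = pick_step(datetime.datetime(2024, 1, 1), datetime.datetime(2024, 1, 2), 100)
```

A 900 second step yields 96 intervals of 15 minutes each, which is easier to label on a chart than 100 intervals of 864 seconds.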

mentat.stats.idea.LIST_STAT_GROUPS = ('stats_internal', 'stats_external', 'stats_overall')

List of statistical groups. The statistics will be calculated separately for each of these.

mentat.stats.idea.aggregate_stat_groups(stats_list, result=None)

Aggregate multiple full statistical records produced by the mentat.stats.idea.evaluate_event_groups() function into a single statistical record.

Parameters

stats_list (list) – List of full statistical records to be aggregated.

Returns

Single statistical record structure.

Return type

dict
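The general idea of merging several records into one can be sketched as follows. The record layout used here (a scalar `count` plus per-key count dictionaries) is a hypothetical simplification, and `merge_records` is not the mentat implementation.

```python
from collections import Counter


def merge_records(stats_list, result=None):
    """Sum scalar counters and per-key count dictionaries across records."""
    if result is None:
        result = {}
    for record in stats_list:
        for key, value in record.items():
            if isinstance(value, dict):
                # Counter.update() adds counts instead of replacing them.
                merged = Counter(result.get(key, {}))
                merged.update(value)
                result[key] = dict(merged)
            else:
                result[key] = result.get(key, 0) + value
    return result


merged = merge_records([
    {'count': 2, 'ips': {'192.0.2.1': 2}},
    {'count': 3, 'ips': {'192.0.2.1': 1, '192.0.2.2': 2}},
])
```

Passing an existing dictionary as `result` continues the aggregation into that dictionary, which mirrors the optional `result` argument in the signature above.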

mentat.stats.idea.aggregate_stats_reports(report_list, dt_from, dt_to, result=None)

Aggregate multiple reporting statistical records.

Parameters
  • report_list (list) – List of report objects as retrieved from database.

  • dt_from (datetime.datetime) – Lower timeline boundary.

  • dt_to (datetime.datetime) – Upper timeline boundary.

  • result (dict) – Optional data structure for storing the result.

Returns

Single aggregated statistical record.

Return type

dict

mentat.stats.idea.aggregate_timeline_groups(stats_list, dt_from, dt_to, max_count, min_step=None, result=None)

Aggregate multiple full statistical records, produced by the mentat.stats.idea.evaluate_event_groups() function and later retrieved from the database as mentat.datatype.sqldb.EventStatisticsModel, into a single statistical record. The requested timeline interval boundaries will be adjusted as necessary to provide the best result.

Parameters
  • stats_list (list) – List of full statistical records to be aggregated.

  • dt_from (datetime.datetime) – Lower requested timeline time interval boundary.

  • dt_to (datetime.datetime) – Upper requested timeline time interval boundary.

  • max_count (int) – Maximal number of steps in timeline.

  • min_step (int) – Force minimal step size in timeline.

  • result (dict) – Optional dictionary structure to contain the result.

Returns

Single statistical record structure.

Return type

dict

mentat.stats.idea.evaluate_event_groups(events, stats=None)

Evaluate full statistics for the given list of IDEA events. Events will first be grouped using group_events(), and the statistics will then be evaluated separately for each of the message groups stats_overall, stats_internal and stats_external.

Parameters
  • events (list) – List of IDEA events to be evaluated.

  • stats (dict) – Optional dictionary structure to populate with statistics.

Returns

Structure containing evaluated event statistics.

Return type

dict

mentat.stats.idea.evaluate_events(events, stats=None)

Evaluate statistics for the given list of IDEA events.

Parameters
  • events (list) – List of IDEA events to be evaluated.

  • stats (dict) – Optional data structure to which to append the calculated statistics.

Returns

Structure containing calculated event statistics.

Return type

dict

mentat.stats.idea.evaluate_timeline_events(events, dt_from, dt_to, max_count, stats=None)

Evaluate statistics for the given list of IDEA events and produce a statistical record for timeline visualisations.

Parameters
  • events (list) – List of IDEA events to be evaluated.

  • dt_from (datetime.datetime) – Lower timeline boundary.

  • dt_to (datetime.datetime) – Upper timeline boundary.

  • max_count (int) – Maximal number of items for generating toplists.

  • stats (dict) – Data structure to which to append calculated statistics.

Returns

Structure containing evaluated event timeline statistics.

Return type

dict
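To illustrate what a timeline record is built from, the sketch below counts timestamps per fixed-size interval between the two boundaries. It works on bare `datetime` values rather than full IDEA events, takes the step size in seconds as an explicit argument, and treats the upper boundary as exclusive; all of these are assumptions made for the example, and `bucket_counts` is a hypothetical helper.

```python
import datetime


def bucket_counts(times, dt_from, dt_to, step):
    """Count timestamps falling into each `step`-second interval of [dt_from, dt_to)."""
    span = int((dt_to - dt_from).total_seconds())
    n_buckets = -(-span // step)  # ceiling division
    buckets = [0] * n_buckets
    for moment in times:
        if dt_from <= moment < dt_to:
            offset = int((moment - dt_from).total_seconds())
            buckets[offset // step] += 1
    return buckets


start = datetime.datetime(2024, 1, 1, 0, 0)
end = datetime.datetime(2024, 1, 1, 1, 0)
times = [
    datetime.datetime(2024, 1, 1, 0, 5),
    datetime.datetime(2024, 1, 1, 0, 20),
    datetime.datetime(2024, 1, 1, 0, 50),
    datetime.datetime(2024, 1, 1, 1, 0),  # equals the upper boundary, skipped
]
timeline = bucket_counts(times, start, end, 900)
```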

mentat.stats.idea.group_events(events)

Group events according to the presence of the _CESNET.ResolvedAbuses key. Each event will be added to the overall group and then to either the internal or the external group, based on the presence of that key.

Parameters

events (list) – List of IDEA events to be grouped.

Returns

Structure containing event groups stats_overall, stats_internal and stats_external.

Return type

dict
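The grouping rule can be sketched as below. The example assumes events are plain nested dicts and that an event with a non-empty `_CESNET.ResolvedAbuses` value counts as internal; the text above only says the decision depends on the presence of the key, so the direction is an assumption. `group_events_sketch` is a hypothetical stand-in, not the library function.

```python
def group_events_sketch(events):
    """Split events into overall, internal and external groups."""
    groups = {'stats_overall': [], 'stats_internal': [], 'stats_external': []}
    for event in events:
        # Every event belongs to the overall group.
        groups['stats_overall'].append(event)
        # Assumption: a resolved abuse contact marks the event as internal.
        abuses = event.get('_CESNET', {}).get('ResolvedAbuses')
        if abuses:
            groups['stats_internal'].append(event)
        else:
            groups['stats_external'].append(event)
    return groups


sample = [
    {'ID': 'e1', '_CESNET': {'ResolvedAbuses': ['abuse@example.org']}},
    {'ID': 'e2'},
]
groups = group_events_sketch(sample)
```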

mentat.stats.idea.truncate_evaluations(stats, top_threshold=20)

Make all statistical groups more brief with truncate_stats().

Parameters
  • stats (dict) – Structure containing statistics for all groups.

  • top_threshold (int) – Toplist threshold size.

Returns

Updated structure containing statistics.

Return type

dict

mentat.stats.idea.truncate_stats(stats, top_threshold=20)

Make statistics more brief. For each of the statistical aggregation subkeys, generate a toplist containing at most the given number of items.

Parameters
  • stats (dict) – Structure containing statistics.

  • top_threshold (int) – Toplist threshold size.

Returns

Updated structure containing statistics.

Return type

dict
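A toplist truncation of a single count dictionary could look like the sketch below. Folding the remaining items into one bucket, and the bucket name `__REST__`, are assumptions made for illustration; `truncate_counts` is a hypothetical helper that operates on one subkey only.

```python
def truncate_counts(counts, top_threshold=20, rest_key='__REST__'):
    """Keep the largest items and fold the remainder into a single bucket,
    so that the result has at most `top_threshold` entries."""
    if len(counts) <= top_threshold:
        return dict(counts)
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    # Reserve one slot for the bucket holding everything that was cut off.
    kept = dict(ranked[:top_threshold - 1])
    kept[rest_key] = sum(value for _, value in ranked[top_threshold - 1:])
    return kept


top = truncate_counts({'a': 5, 'b': 3, 'c': 2, 'd': 1}, top_threshold=3)
```

Keeping the remainder as its own entry preserves the total, so percentages computed from the truncated structure still add up.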

mentat.stats.idea.truncate_stats_with_mask(stats, mask, top_threshold=20)

Make statistics more brief. For each of the statistical aggregation subkeys, generate a toplist containing at most the given number of items, but use the given precalculated mask to decide which items should be hidden. This function is intended for the calculation of timeline statistics. There, the global toplists must be used to mask out items in every time interval; otherwise each interval might end up with a different toplist and it would not be possible to draw a consistent chart.

Parameters
  • stats (dict) – Structure containing single statistic category.

  • mask (dict) – Global truncated statistics to serve as a mask.

  • top_threshold (int) – Toplist threshold size.

Returns

Updated structure containing statistics.

Return type

dict
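The masking idea can be sketched for a single count dictionary as follows: only keys present in the global toplist survive in each time interval, and everything else is folded into one bucket. The bucket name `__REST__` and the helper `truncate_counts_with_mask` are assumptions for illustration, not the library's behaviour.

```python
def truncate_counts_with_mask(counts, mask, rest_key='__REST__'):
    """Keep only keys that appear in the global mask; sum the others."""
    result = {}
    rest = 0
    for key, value in counts.items():
        if key in mask and key != rest_key:
            result[key] = value
        else:
            rest += value
    if rest:
        result[rest_key] = rest
    return result


# Global toplist computed over the whole timeline.
global_top = {'a': 40, 'b': 25, '__REST__': 10}
# Counts within one time interval; 'c' dominates locally but is not in the mask.
interval = {'a': 1, 'c': 4, 'd': 2}
masked = truncate_counts_with_mask(interval, global_top)
```

Because every interval is reduced to the same set of keys, the resulting series can be stacked in one chart.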