We recently released a new version of MSTICPy with a feature called pivot functions.
Pivot functions have three main goals:
- Making it easy to discover and invoke MSTICPy functionality.
- Creating a standardized way to call pivotable functions.
- Letting you assemble multiple functions into re-usable pipelines.
The pivot functionality exposes operations relevant to a particular entity as methods (or functions) of that entity. These operations include: data queries, threat intelligence lookups, other data lookups (such as geo-location, whois and domain resolution) as well as other local functions.
For example, different kinds of enrichment functions can be called directly from an entity such as the IpAddress entity.
What is “pivoting”?
The term comes from the common practice in cybersecurity investigations of navigating from one suspect entity to another. For example, you might start with an alert identifying a potentially malicious IP address. From there you ‘pivot’ to see which hosts or accounts were communicating with that address, and then pivot again to look at processes running on the host or Office activity for the account.
Pivot functions let you do this operation more easily by grouping the data query, processing and enrichment functions together with the entity (Host, IP, URL) that you are focused on.
What was life like before pivot functions?
Previously, to use this functionality you would have to:
- know which modules contained the functions you wanted,
- import the functions,
- maybe do some initialization — such as creating a class,
- probably look up the help strings to check on the arguments.
Once you’d done that, you’d have four types of output, each with its own distinct format, which you might have to wrangle to combine into a nicely presented result.
Pivot functions to the rescue
Pivot functions aim to side-step a lot of that by:
- Being accessible via the entity class relevant to the job in hand (e.g. all URL-related functions are exposed as members of the Url entity class).
- Normalizing input formats: every pivot function can accept input in the form of a string, a list (or other Python “iterable”), or a pandas DataFrame.
- Regularizing parameter signatures: you can usually just pass a single positional parameter to the functions. If you need to use a parameter name, you can use a generic term such as “value” (in the case of DataFrames, you use “data” and “column” to specify the DataFrame and column name respectively).
- Normalizing output: all pivot functions return results as DataFrames. In addition, you can join input to the output using inner, left, right or outer joins.
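To make the input normalization idea concrete, here is a minimal sketch in plain Python. It is not MSTICPy's implementation: the names normalize_input and ip_version are hypothetical, and a list of dicts stands in for a pandas DataFrame so that the example runs without any third-party packages. It only illustrates how one function can accept a string, an iterable, or tabular data plus a column name, and always return the same shape of output.

```python
from typing import Any, Dict, Iterable, List, Optional, Union

# A list of dicts stands in for a pandas DataFrame in this sketch.
Row = Dict[str, Any]


def normalize_input(
    value: Union[str, Iterable[str], None] = None,
    data: Optional[List[Row]] = None,
    column: Optional[str] = None,
) -> List[str]:
    """Reduce the three accepted input shapes to a flat list of strings."""
    if data is not None:
        if column is None:
            raise ValueError("'column' is required when 'data' is supplied")
        return [str(row[column]) for row in data]
    if value is None:
        raise ValueError("supply either 'value' or 'data' and 'column'")
    if isinstance(value, str):
        return [value]
    return [str(item) for item in value]


def ip_version(value=None, data=None, column=None) -> List[Row]:
    """Hypothetical enrichment function: always returns rows."""
    return [
        {"ip": ip, "version": 6 if ":" in ip else 4}
        for ip in normalize_input(value, data, column)
    ]


# The same function called with each of the three input shapes.
single = ip_version("10.0.0.1")
several = ip_version(["10.0.0.1", "::1"])
from_table = ip_version(data=[{"src": "192.168.1.5"}], column="src")
```

Whatever the input shape, the caller gets back rows in one format, which is the property that makes the functions easy to combine later.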
There is a small up-front cost to using pivot functions: you have to import and initialize the Pivot class, but very little else. When the Pivot class is created, it trawls the MSTICPy codebase and converts disparate functions into member functions on the relevant entity. Any queries that you have loaded with a data provider such as Azure Sentinel are also attached to the entities that correspond to their input parameters (e.g. the list_host_logons query becomes a function attached to the Host entity class).
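The general idea of attaching free-standing functions to entity classes can be sketched in a few lines of plain Python. This is an illustration only, not how the Pivot class is actually written: the IpAddress and Url classes here are empty stand-ins, and REGISTRY, attach_pivots and list_pivots are hypothetical names invented for the example.

```python
import ipaddress
from types import SimpleNamespace


class IpAddress:
    """Stand-in entity class (not the MSTICPy class)."""


class Url:
    """Stand-in entity class (not the MSTICPy class)."""


def ip_type(value: str) -> str:
    """Hypothetical local function that lives in some utility module."""
    return "Private" if ipaddress.ip_address(value).is_private else "Public"


def url_scheme(value: str) -> str:
    """Hypothetical local function: return the scheme part of a URL."""
    return value.split("://", 1)[0]


# Hypothetical registry: (entity class, container name, function name, function)
REGISTRY = [
    (IpAddress, "util", "ip_type", ip_type),
    (Url, "util", "scheme", url_scheme),
]


def attach_pivots(registry):
    """Attach each registered function to a named container on its entity."""
    for entity, container, name, func in registry:
        # Create the container (e.g. IpAddress.util) the first time it is needed.
        if container not in vars(entity):
            setattr(entity, container, SimpleNamespace())
        setattr(getattr(entity, container), name, func)


def list_pivots(entity):
    """Return 'container.function' names attached to an entity class."""
    return [
        f"{container}.{name}"
        for container, namespace in vars(entity).items()
        if isinstance(namespace, SimpleNamespace)
        for name in vars(namespace)
    ]


attach_pivots(REGISTRY)
ip_result = IpAddress.util.ip_type("10.0.0.1")
url_result = Url.util.scheme("https://example.com/path")
```

After attach_pivots has run, the functions are discoverable from the entity itself, so tab completion on IpAddress.util would show ip_type without the user knowing which module it came from.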
Getting started
Use the MSTICPy “init_notebook()” function to initialize the notebook environment and load MSTICPy and other common modules (such as pandas). Then load the query provider you use to access your data (in Azure Sentinel, Splunk or another data source). Finally, import and create an instance of the Pivot class.
from msticpy.nbtools.nbinit import init_notebook
init_notebook(namespace=globals())

az_provider = QueryProvider("AzureSentinel")

from msticpy.datamodel.pivot import Pivot
pivot = Pivot(namespace=globals())
Having done that, you can view the pivot functions available on each entity.
>>> IpAddress.get_pivot_list()
['AzureSentinel.SecurityAlert_list_alerts_for_ip',
'AzureSentinel.SigninLogs_list_aad_signins_for_ip',
'AzureSentinel.AzureActivity_list_azure_activity_for_ip',
'AzureSentinel.AzureNetworkAnalytics_CL_list_azure_network_flows_by_ip',
...
'ti.lookup_ip',
'ti.lookup_ipv4',
'ti.lookup_ipv4_OTX',
...
'ti.lookup_ipv6_OTX',
'util.whois',
'util.ip_type',
'util.ip_rev_resolve',
'util.geoloc_mm',
'util.geoloc_ips']
Note: you can get a list of the available entities by typing dir(entities).
You can also use Jupyter notebook tab completion to help you navigate to the desired function.
Query and Processing Pipelines
Because all pivot functions accept DataFrame input and return their results as DataFrames, you can assemble multiple functions into a pipeline.
We’ve added a pandas extension, mp_pivot, to make this a bit easier. In a typical pipeline, an input DataFrame (suspicious_ips_df in our example) is passed in, and each step calls an mp_pivot function. The mp_pivot.run() function lets you invoke a pivot function on the output of the previous step (or on the input DataFrame, for the first step). You can optionally use the join parameter to join the input to the output, which preserves earlier results as the data proceeds through the pipeline.
There are also some utility pipeline functions. mp_pivot.display() can be used to display partial results at any point. mp_pivot.tee() forks off a copy of the DataFrame at that point in the pipeline and saves it to a named variable. At the end of our example pipeline, another MSTICPy pandas “accessor” method, mp_timeline.plot(), plots the events in a timeline.
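The run, tee and display pattern described above can be imitated with a toy pipeline in plain Python. This is a rough sketch of the chaining idea, not the mp_pivot accessor itself: the Table class, its method signatures and the two enrichment functions are hypothetical, a list of dicts stands in for a DataFrame, and only a left join is implemented.

```python
import ipaddress
from typing import Any, Callable, Dict, List, Optional

Row = Dict[str, Any]


class Table:
    """Tiny stand-in for a DataFrame with pipeline-style methods."""

    def __init__(self, rows: List[Row]):
        self.rows = rows

    def run(self, func: Callable[[List[str]], List[Row]], column: str,
            join: Optional[str] = None) -> "Table":
        """Apply func to one column; optionally left-join input to output."""
        results = func([row[column] for row in self.rows])
        if join is None:
            return Table(results)
        if join != "left":
            raise ValueError("this sketch only implements join='left'")
        # Index results by the key column, then merge onto each input row.
        by_key = {res[column]: res for res in results}
        return Table([{**row, **by_key.get(row[column], {})}
                      for row in self.rows])

    def tee(self, var_name: str, namespace: Dict[str, Any]) -> "Table":
        """Save a copy of the current rows under var_name and carry on."""
        namespace[var_name] = Table([dict(row) for row in self.rows])
        return self

    def display(self) -> "Table":
        """Print the current rows without interrupting the chain."""
        for row in self.rows:
            print(row)
        return self


def ip_type(ips: List[str]) -> List[Row]:
    """Hypothetical enrichment: classify each address as private or public."""
    return [{"ip": ip, "ip_type": "Private"
             if ipaddress.ip_address(ip).is_private else "Public"}
            for ip in ips]


def ip_version(ips: List[str]) -> List[Row]:
    """Hypothetical enrichment: report the IP version of each address."""
    return [{"ip": ip, "version": ipaddress.ip_address(ip).version}
            for ip in ips]


saved: Dict[str, Any] = {}
suspicious_ips = Table([{"ip": "10.0.0.1", "alert": "A1"},
                        {"ip": "8.8.8.8", "alert": "A2"}])
result = (
    suspicious_ips
    .run(ip_type, column="ip", join="left")
    .tee("after_type", saved)
    .run(ip_version, column="ip", join="left")
    .display()
)
```

Because every step returns a Table, steps can be added, removed or reordered without changing the others, and the copy saved by tee keeps the intermediate state available for later inspection.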
Read More
The full pivot functions documentation is available on Read the Docs.
There are also a couple of Jupyter notebooks that walk through this in more depth:
- PivotFunctions — Introduction
- PivotFunctions (more in-depth, follows the Read the Docs page)
Feedback
Please send any feedback or suggestions for improvements to msticpy@microsoft.com or create an issue on our GitHub repo.