MSTICPy 0.8.8 Release

Oct 29, 2020

We’re pleased to announce the release of MSTICPy 0.8.8 (which should have been 0.8.5, but a few hiccups caused us to do some hotfixes before announcing).

This release has a few new cool features (plus the usual share of fixes):

VirusTotal API V3 support — with notebook support for viewing and navigating the indicator relationships and the VT Graph.
Mordor data provider and browser — to view and import attack data sets directly into your notebook
Streamlined Azure authentication — you should only need to authenticate once!
Azure Sentinel APIs for retrieving data such as Incidents, Alert rules, bookmarks, etc. from your Azure Sentinel workspace.
Partitioned queries — experimental feature allowing you to split very large/complex queries into sequential chunks

We describe each of these in a bit more detail below and you can also read the official release notes on GitHub.

VirusTotal V3 API Support

Andres Ramirez and Juan Infantes at VirusTotal (VT) contributed a module and a demonstration notebook to MSTICPy. This is notable not just because it brings support for their awesome new V3 API, but also because they are the first people outside our small team to contribute code to our humble package. Thanks Juan and Andres!

One of the key features of the API is the ability to query relationships between indicators such as:

Parent processes of malware files.
Domains related to the malware.

Here’s an example of parent data retrieved for a VT malware ID:

Query from the VT API bringing back parent sample IDs for a piece of malware.

Being able to query the data is great, but the package also includes the ability to navigate these relationships on an interactive graph.

VT Graph showing relationships of files and domains for a piece of malware

You can view the notebook on our GitHub repo: VTLookupV3.

The module makes use of two Python packages published by VirusTotal: vt-py and vt-graph-api. Follow the links to read more about these. Also check out the full V3 API documentation pages at VirusTotal.

Mordor data provider and browser

Mordor is a project to capture host and network log data that illustrates adversarial attack patterns. It is part of the Open Threat Research Forge, created by Roberto Rodriguez and Jose Rodriguez.

The Mordor project provides one of the most comprehensive libraries of full attack logs: the captured logs contain not just the events directly related to the attack but also the set of benign events happening at the time of the attack. Each data set is mapped to MITRE ATT&CK techniques and tactics and includes simulation scripts to allow you to produce the same data in your environment. This makes Mordor very useful for testing detection logic, whether simple rules or more complex machine learning scenarios requiring labelled data.

The MSTICPy Mordor module allows you to browse and search through Mordor data sets and query individual data sets in a similar way to other MSTICPy data providers. Like the other providers, the Mordor provider returns results as a pandas DataFrame, allowing it to be used easily in Jupyter notebooks and other Python code.

You can use the Mordor provider in your code or from an interactive Jupyter or IPython environment. The query names follow the structure of the Mordor data repository, so they are a little long, but tab-completion of the names is fully supported.

You can also search for the data sets you want using simple search syntax to query any of the Mordor metadata.

Querying Mordor data sets from the command line.

Finally, you can use our interactive browser to search and filter on the different data sets. From here you can download the data sets and navigate to any related Jupyter notebooks on the Threat Hunter Playbook site.

Read more about this in our documentation or check out the notebook.

Note: You’ll need to install html5lib for this to work correctly.

pip install html5lib

Streamlined Azure authentication

We’ve revamped the disparate authentication mechanisms that we previously used for different components in MSTICPy. Now, MSTICPy components that access Azure services can share the same chained credential set, meaning that you should need to authenticate less often and typically only once in a notebook.

The credential mechanism can also take advantage of an existing Azure CLI authentication. If you log in to Azure with Azure CLI before starting your notebook session, MSTICPy will automatically try to reuse the Azure CLI credentials to obtain tokens for various services. This is especially helpful for interactive work, since you can do a single az login at the beginning of your session and it will be inherited by all notebooks that you run.

We now also support Managed Service Identity (MSI) authentication. You can assign a managed service identity to a cloud VM and grant access for that identity to Azure Sentinel. MSTICPy will automatically use an MSI credential, if available.
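As a rough illustration of what a chained credential does, the sketch below tries a list of credential sources in order and uses the first one that can supply a token. Everything in it is a hypothetical stand-in written in plain Python: `CredentialUnavailable`, `FakeCliCredential`, `FakeMsiCredential`, `ChainedCredential`, the token strings and the order of the chain are all invented for the example. It is not MSTICPy's code and it makes no calls to Azure.

```python
class CredentialUnavailable(Exception):
    """Raised by a credential source that cannot supply a token."""


class FakeCliCredential:
    """Hypothetical stand-in for a credential backed by an earlier `az login`."""

    def __init__(self, logged_in):
        self.logged_in = logged_in

    def get_token(self, scope):
        if not self.logged_in:
            raise CredentialUnavailable("no Azure CLI session")
        return f"cli-token-for-{scope}"


class FakeMsiCredential:
    """Hypothetical stand-in for a managed identity assigned to a cloud VM."""

    def __init__(self, available):
        self.available = available

    def get_token(self, scope):
        if not self.available:
            raise CredentialUnavailable("no managed identity")
        return f"msi-token-for-{scope}"


class ChainedCredential:
    """Try each source in order; the first one that succeeds wins."""

    def __init__(self, *sources):
        self.sources = sources

    def get_token(self, scope):
        errors = []
        for source in self.sources:
            try:
                return source.get_token(scope)
            except CredentialUnavailable as err:
                errors.append(str(err))
        # Every source failed: report all the reasons together.
        raise CredentialUnavailable("; ".join(errors))


# One shared chain can serve several components in the same session.
# Here there is no CLI login, so the chain falls through to the
# managed identity (the ordering is an assumption for this example).
shared = ChainedCredential(
    FakeCliCredential(logged_in=False),
    FakeMsiCredential(available=True),
)
token = shared.get_token("log-analytics")
print(token)
```

Because every component asks the same chain for its tokens, a single successful source (an earlier az login, or a managed identity on the VM) is enough for the whole notebook session.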

Unfortunately, this all happens behind the scenes, so there are no interesting screenshots to show you for this one!

Azure Sentinel APIs

Azure Sentinel’s log data is, of course, exposed through the Log Analytics API, which the MSTICPy data providers already use to let you query data in Azure Sentinel. Some data, though, is only exposed via its management APIs. For example:

Hunting queries
Security Incidents
Alert rules

You can now access this data from the azure_sentinel module in MSTICPy.

Retrieving hunting queries from the Azure Sentinel API

In the current release we support querying many of these APIs. Update and creation via these APIs are not available yet, but we plan to provide them soon.

Read more about this feature in our documentation page and accompanying notebook.

Partitioned Queries

This is an experimental (meaning not fully supported) feature that we would like people to try out.

Some queries can fail, either because they are complex and have a long execution time, or because they try to retrieve more data than allowed over the Azure Sentinel data API. Partitioned queries let you break up a query into several chunks. These are executed sequentially (so it’s definitely not quicker than a single query, but not hugely slower) and the results are stitched back together into a single DataFrame.

This uses a simple mechanism of dividing the time period (between the start and end values of your original query) into time slices, then executes the query for each time slice. This works well for simple queries but there are some limitations:

if your queries have joins, each sub-query will join only on the subset of data within its own time slice.
if you do explicit manipulation of time boundaries within your query, this may produce unexpected results.
you must be using a pre-defined query (i.e. one of the built-in queries or a query that you’ve created); it does not work for ad hoc queries. (An ad hoc query means sending a query as a string to the query_provider.exec_query() method.)
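The join limitation is easiest to see with a small worked example. The sketch below uses made-up logon and logoff events (the field names and the `in_range` and `join_sessions` helpers are hypothetical, not MSTICPy or Azure Sentinel objects) and pairs them up once over the whole time range and once per daily slice.

```python
from datetime import datetime, timedelta

# Made-up events: session "B" starts on 1 Sep and ends on 2 Sep.
logons = [
    {"session": "A", "time": datetime(2020, 9, 1, 9, 0)},
    {"session": "B", "time": datetime(2020, 9, 1, 23, 50)},
]
logoffs = [
    {"session": "A", "time": datetime(2020, 9, 1, 17, 0)},
    {"session": "B", "time": datetime(2020, 9, 2, 0, 10)},
]


def in_range(events, start, end):
    """Keep only events with start <= time < end."""
    return [e for e in events if start <= e["time"] < end]


def join_sessions(start, end):
    """Inner-join logons to logoffs on session ID within [start, end)."""
    offs = {e["session"]: e["time"] for e in in_range(logoffs, start, end)}
    return [
        (e["session"], e["time"], offs[e["session"]])
        for e in in_range(logons, start, end)
        if e["session"] in offs
    ]


start, end = datetime(2020, 9, 1), datetime(2020, 9, 3)

# Single query over the whole range: both sessions are matched.
whole = join_sessions(start, end)

# Same join run per one-day slice: session "B" straddles midnight,
# so its logon and logoff never appear in the same slice.
sliced = []
day = start
while day < end:
    sliced.extend(join_sessions(day, day + timedelta(days=1)))
    day += timedelta(days=1)

print([s[0] for s in whole])   # ['A', 'B']
print([s[0] for s in sliced])  # ['A']
```

Session B is lost in the sliced run because each slice only sees one half of the pair, which is the kind of result to watch for when partitioning a query that contains a join.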

To use the partitioned query feature, execute the query function as normal but add the parameter split_queries_by. The value of this parameter must be a timespan expressed as {N}{TimeUnit}, e.g. “1d”, “2h”, etc. Anything that is acceptable to a pandas Timedelta object should work (but don’t try to split your query into nanosecond chunks!). The time span for your original query will be split into chunks according to this split_queries_by value.

qry_prov.WindowsSecurity.list_host_logons(
   start="2020-09-01T00:00:00",
   end="2020-09-30T00:00:00",
   host_name="my_host",
   split_queries_by="1d"
)
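To make the mechanics concrete, here is a minimal sketch of the time-slicing idea in plain Python. It is not MSTICPy's implementation: `parse_span`, `time_slices`, `run_partitioned` and the `fake_query` callback are hypothetical names used only for illustration, the parser handles just days, hours and minutes (a simplified stand-in for pandas Timedelta parsing), and plain lists of dicts stand in for DataFrames.

```python
import re
from datetime import datetime, timedelta

# Hypothetical, simplified parser: supports only "{N}d", "{N}h" and "{N}m".
_UNITS = {"d": "days", "h": "hours", "m": "minutes"}


def parse_span(span):
    match = re.fullmatch(r"(\d+)([dhm])", span.strip().lower())
    if not match or int(match.group(1)) == 0:
        raise ValueError(f"unsupported time span: {span!r}")
    return timedelta(**{_UNITS[match.group(2)]: int(match.group(1))})


def time_slices(start, end, span):
    """Yield consecutive (slice_start, slice_end) pairs covering start..end."""
    step = parse_span(span)
    current = start
    while current < end:
        nxt = min(current + step, end)  # the last slice may be shorter
        yield current, nxt
        current = nxt


def run_partitioned(query_func, start, end, span):
    """Run query_func once per slice, in order, and concatenate the rows."""
    rows = []
    for slice_start, slice_end in time_slices(start, end, span):
        rows.extend(query_func(start=slice_start, end=slice_end))
    return rows


def fake_query(start, end):
    """Illustrative stand-in for a real query: returns one row per slice."""
    return [{"slice_start": start, "slice_end": end}]


results = run_partitioned(
    fake_query,
    start=datetime(2020, 9, 1),
    end=datetime(2020, 9, 30),
    span="1d",
)
print(len(results))  # one row per daily slice
```

With the same start, end and "1d" values as the call above, this sketch produces 29 slices, one for each day between 1 and 30 September.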

You can read the documentation here.

Try it out

Install the new version:

pip install msticpy

or upgrade:

pip install --upgrade msticpy

Report any issues on our GitHub repo. Reach out to us on Twitter (@ianhellen, @ashwinpatil, @MSSPete) or email msticpy@microsoft.com.

Happy hunting!