(We skipped a 1.2.0 with a few last-minute fixes)
This release contains some interesting new features, two of which were contributed by community members Ryan Cobb and Julien Touche. These features are: two new data providers for Azure Resource Graph and Sumologic, and a DataViewer control for pandas dataframes.
Azure Resource Graph provider
The Azure Resource Graph provider lets you query Azure resources using KQL queries. It works much like our other query providers, offering both pre-defined queries in YAML and ad-hoc querying.
You can use this data connector to flexibly and quickly get details on deployed Azure resources within a subscription. It allows for bulk queries on various aspects of resources and returns data in a very structured format (the ubiquitous pandas DataFrame). This makes it much more effective and efficient than getting resource-specific details via the resource API.
Although it mostly behaves like other data providers, the authentication process is a little different to some: you can use Azure CLI, MSI, environment variables or interactive device logon.
Running a resource graph query
To run a pre-defined query, execute it with the query name, e.g. QUERY_PROVIDER.ResourceGraph.QUERY_NAME(). You can pass parameters to these queries to customize them; if you do not provide any, they run with default parameters. The query browser shows which parameters are available for each query.
As with other data providers, data is returned to you in a pandas DataFrame.
You can also run ad hoc queries:
query = """
Resources | where type =~ 'Microsoft.Compute/virtualMachines'
"""
QUERY_PROVIDER.exec_query(query)
Many thanks to Ryan Cobb @rcobb-scwx for creating and contributing this.
Sumo Logic is a cloud-based machine data analytics service focusing on security, operations and BI use cases. This provider allows you to connect to and query your data from MSTICPy via their Search API.
You can define your own Sumologic query and run it through the Sumologic provider with exec_query().
For more information, see Running an Ad-hoc Query in the documentation.
sumologic_query = '''
| formatDate(_messageTime,"yyyy/dd/MM HH:mm:ss") as date
| first(date), last(date) by _sourceCategory
| count _sourceCategory,_first,_last
| sort -_count
'''
df = sumologic_prov.exec_query(sumologic_query, days=0.0005, verbosity=3)
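The days=0.0005 value in the call above is easier to read once converted to seconds. The snippet below is a small worked calculation, assuming that days is a look-back window measured in days; it uses only the Python standard library and does not call MSTICPy or Sumo Logic.

```python
from datetime import timedelta

# Assumption: `days` is a look-back window expressed in (fractional) days.
lookback = timedelta(days=0.0005)

# Convert the window to seconds to see how short it really is.
lookback_seconds = lookback.total_seconds()

print(f"days=0.0005 is a window of about {lookback_seconds:.1f} seconds")
```

A window this short keeps the result set small, which is convenient when trying a query for the first time.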
Explore more in the Sumologic Data Connector notebook
Many thanks to Julien (@juju4) for building and contributing this.
The data viewer uses the Bokeh DataTable control to display and browse through data in a pandas DataFrame.
It lets you sort by column, choose which columns to display and filter by multiple columns.
It also keeps a synchronized copy of the DataFrame reflecting the column choice and filtering applied, so that you can always access the data as it appears in the control.
You can also access the filter as a pandas “criteria” object that you apply to the original dataframe.
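To make the idea of a pandas "criteria" object concrete, here is a minimal sketch in plain pandas. It does not use the DataViewer API; the data, column names and filter are invented for illustration, and the assumption is only that a criteria object behaves like a boolean Series aligned with the DataFrame index.

```python
import pandas as pd

# Illustrative data; the column names are made up for this sketch.
df = pd.DataFrame(
    {
        "Account": ["alice", "bob", "alice", "carol"],
        "LogonType": [3, 10, 10, 3],
        "IpAddress": ["10.0.0.1", "10.0.0.2", "192.168.1.5", "10.0.0.9"],
    }
)

# A multi-column filter expressed as a boolean Series (the "criteria").
criteria = (df["Account"] == "alice") & df["IpAddress"].str.startswith("10.")

# Applying the criteria to the original DataFrame gives the filtered rows.
filtered = df[criteria]

# Choosing which columns to show is ordinary column selection.
visible = filtered[["Account", "IpAddress"]]
```

Because the criteria is just a boolean mask, it can be reused on the original DataFrame or combined with further conditions using & and |.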
Two other minor but useful updates in this release are changes to the init_notebook function and the query_providers.
The init_notebook() function is run at the start of most of our notebooks and does some setup work such as importing commonly-used modules. In our Azure ML notebooks environment we supplemented this with a local script, nb_check.py, which did a number of things specific to that environment such as checking versions and configuration. However, this was difficult to update once deployed. We’ve merged most of the functionality of this into the init_notebook function so, when in an Azure ML environment, most of the goodness of nb_check is still available.
One significant change to the behavior is that init_notebook will now create a basic msticpyconfig.yaml file if it cannot find one. This will contain configuration settings for the Azure Sentinel workspace that the notebook was deployed from.
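The "create it only if it is missing" behavior can be pictured with the sketch below. This is not the init_notebook implementation: ensure_config is a hypothetical helper, the workspace and tenant values are placeholders, and the file contents assume the usual AzureSentinel/Workspaces layout of msticpyconfig.yaml.

```python
import tempfile
from pathlib import Path

# Placeholder contents; real values would come from the deployed workspace.
BASIC_CONFIG = """\
AzureSentinel:
  Workspaces:
    Default:
      WorkspaceId: "<workspace-id>"
      TenantId: "<tenant-id>"
"""


def ensure_config(folder: Path) -> bool:
    """Hypothetical helper: write a basic config if none exists.

    Returns True if a file was created, False if one was already there.
    """
    config_path = folder / "msticpyconfig.yaml"
    if config_path.exists():
        return False  # never overwrite an existing configuration
    config_path.write_text(BASIC_CONFIG, encoding="utf-8")
    return True


# Usage: the first call creates the file, the second leaves it alone.
with tempfile.TemporaryDirectory() as tmp:
    created_first = ensure_config(Path(tmp))
    created_second = ensure_config(Path(tmp))
    written = (Path(tmp) / "msticpyconfig.yaml").read_text(encoding="utf-8")
```

Checking for an existing file first means any settings you have already customized are left untouched.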
We’ve also added a QueryTime control to our data providers. Previously, to supply dates to a query you would have to instantiate a QueryTime control and supply it as a parameter to the query. For example:
timespan = nbwidgets.QueryTime(units="day")
And then in a separate cell run:
Now you can just pull up the query provider time control and set the time range for all subsequent queries.
The time range set will be used by default on all queries on that data/query provider. You can, of course, bring up the query_time control at any point and change this time range. If you supply explicit “start” and “end” parameters or pass a QueryTime object as a parameter to a query, these will override the default query_time settings.
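The override rules described above can be sketched as a small resolution function. This is an illustration rather than MSTICPy code: TimeRange and resolve_time_range are hypothetical names, and the ordering between explicit start/end values and a passed time object is an assumption, since the text only says that both override the provider default.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class TimeRange:
    """Hypothetical start/end pair (not the msticpy QueryTime class)."""

    start: datetime
    end: datetime


def resolve_time_range(
    provider_default: TimeRange,
    start: Optional[datetime] = None,
    end: Optional[datetime] = None,
    query_time: Optional[TimeRange] = None,
) -> TimeRange:
    # Explicit start and end win (assumed to rank above a passed time object).
    if start is not None and end is not None:
        return TimeRange(start, end)
    # A time object passed with the query overrides the provider default.
    if query_time is not None:
        return query_time
    # Otherwise fall back to the range set on the provider's time control.
    return provider_default


default = TimeRange(datetime(2021, 6, 1), datetime(2021, 6, 2))
explicit = resolve_time_range(
    default, start=datetime(2021, 5, 1), end=datetime(2021, 5, 3)
)
fallback = resolve_time_range(default)
```

Here fallback is the provider default, while explicit carries the start and end values supplied with the query.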
You can read more details on our documentation pages using the links provided in this article.
You can also read the release notes for details of all of the other fixes and minor changes in this release.
Please let us know about any issues or feature requests on our GitHub repo.