MSTIC Notebooklets release 0.2

Jun 27, 2021

This release includes 3 new notebooklets: AccountSummary, IpAddressSummary and LogonSessionRarity.

It’s been a while since we updated any notebooklets but then three come along at once! For some background read the original announcement of notebooklets.

This release also integrates the notebooklets with MSTICPy’s pivot functions, so that you can call a notebooklet from a single-line pivot function. These functions are available in the Host, Account and IpAddress entities so far.

If you don’t have the notebooklets package (msticnb) installed, install it with pip.

pip install msticnb

MSTICnb depends on MSTICPy and will install this if you don’t have it installed.

The first two of these notebooklets are specific to Azure Sentinel data and queries. The LogonSessionRarity notebooklet should be usable with Windows SecurityEvent data from any source.

Account Summary Notebooklet

This is the most feature-laden notebooklet. The functionality is based on the Account Explorer notebook that we created for Azure Sentinel. Although this had a lot of content, it was difficult to navigate: depending on whether the account was an Azure Active Directory (AAD), Windows or Linux account, you had to skip to the relevant part of the notebook to do further analysis and data queries. This made the notebook long and convoluted.

The AccountSummary notebooklet abstracts a lot of that logic — it has the following simple pattern common to most notebooklets:

you supply an account name — matches are searched for in Windows, Linux and AAD data.
if there are multiple matches you get to choose which account to examine — as you click on each account some relevant summary information (such as last activity time) is shown.
initial data (DataFrames, visualizations and browsers) are returned as a notebooklet Result object.
for the selected account you can run the get_additional_data function which pulls back more data (depending on the account type) from Azure Signin and Activity logs, Office Activity, Windows events or Linux syslog.

Importing notebooklets (msticnb) and initializing the MSTICPy Pivot library

Note: use of the pivot library and pivot functions is optional. You can instantiate a notebooklet and execute its run function as explained in the original article. Using pivot functions streamlines the process a little, though.

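# Import paths are for MSTICPy 1.x and may differ in other versions
from msticpy.datamodel import entities
from msticpy.datamodel.pivot import Pivot
import msticnb as nb

# qry_prov is assumed to be an existing, authenticated MSTICPy QueryProvider
pivot = Pivot(globals())
nb.init(query_provider=qry_prov)
Account = entities.Account
acc_result = Account.nblt.account_summary("UserName")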

Account summary notebooklet searching multiple logs

If a single account matches, the notebooklet shows initial results of account details, most recent activity and any related alerts.

Figure: Initial account details, showing the Account entity (with details of associated IP addresses) and the last sign-in activity record.

If multiple accounts match, you will be prompted to choose an account.

Figure: Account chooser, a list box of matching accounts to choose from.

As you select each account, the summary details described above are retrieved and shown for the selected account.

Viewing and retrieving additional data with result methods

From the returned Result object you can browse any alerts retrieved. These additional methods are actually member functions of the notebooklet class but they are accessible via the result object.

Figure: Browser for alerts related to the account. Selecting an alert shows its details below the list.

Bug and workaround:

There is a bug in this release. If only one account matched you cannot run subsequent methods like get_additional_data(). We will be issuing a fix shortly but you can work around this for the moment by running the following code (after the initial run() call) in a cell.

acc_result.account_selector;

This causes the single matched account to be selected (you’ll get some redundant data displayed if you don’t add the trailing “;”).

Get Additional Data

You can also choose to retrieve further details with the get_additional_data() method. The data sets queried will differ depending on whether this is an Azure, Windows or Linux account. The images below show some of the visualized data from an Azure account. Details are retrieved from the Azure and the Office Activity logs.

Figure: Azure account activity timeline. Additional details about an Azure account are shown on an event timeline, grouped by the type of activity (e.g. sign-in, storage access, Azure portal activity).

Azure activity summary and related details of IP address used.

The result object returned from the initial call to the notebooklet function contains data sets (as pandas DataFrames) and visualizations collected by the notebooklet. Invoking additional data retrieval functions (like get_additional_data) updates this result object. You can run this object in a cell to see the data it contains. If you re-run the notebooklet with a new account (the code snippet at the start of the article) and assign the output to the same variable, the previous result becomes inaccessible. If you want to keep a result object, assign it to a different variable first.
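A minimal sketch of keeping earlier results around, assuming you store each one under its account name. DemoResult and run_demo are hypothetical stand-ins for a notebooklet result and a notebooklet call, so the snippet runs without MSTICPy; the account names are made up.

```python
class DemoResult:
    """Hypothetical stand-in for a notebooklet result object."""

    def __init__(self, account):
        self.account = account


def run_demo(account):
    """Hypothetical stand-in for a notebooklet call; returns a new result."""
    return DemoResult(account)


results = {}

acc_result = run_demo("alice")
results["alice"] = acc_result  # keep a second reference to the first result

acc_result = run_demo("bob")   # rebinds acc_result to the new result
results["bob"] = acc_result

print(results["alice"].account, results["bob"].account)
```

The first result stays reachable through the dictionary even though the acc_result name now points at the second one.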

Saving a notebooklet result

Although you cannot currently serialize the notebooklet result directly (e.g. to a pickle file), you can save the individual DataFrames within the result. You can create a simple loop to do this:

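import pandas as pd

# nb_result is the result object returned by the notebooklet
for attr_name in dir(nb_result):
    attrib = getattr(nb_result, attr_name)
    if isinstance(attrib, pd.DataFrame):
        # use the attribute name, not the DataFrame itself, in the file name
        attrib.to_pickle(f"nb_result_{attr_name}.pkl")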
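The attribute-filtering part of that loop can be pulled into a small helper. This is only a sketch: attrs_of_type is a hypothetical helper name, and FakeResult is a stand-in that uses plain lists in place of DataFrames so the example runs without pandas or a notebooklet.

```python
def attrs_of_type(obj, wanted_type):
    """Return {attribute name: value} for public attributes of the given type."""
    found = {}
    for attr_name in dir(obj):
        if attr_name.startswith("_"):
            continue  # skip private and dunder attributes
        try:
            value = getattr(obj, attr_name)
        except Exception:
            continue  # a property may fail to evaluate; ignore it
        if isinstance(value, wanted_type):
            found[attr_name] = value
    return found


class FakeResult:
    """Stand-in for a notebooklet result; lists play the role of DataFrames."""

    def __init__(self):
        self.alerts = [1, 2, 3]
        self.description = "not a table"
        self.signins = [4, 5]


tables = attrs_of_type(FakeResult(), list)
print(sorted(tables))
```

With a real result you would pass your result object and pd.DataFrame instead, then call to_pickle on each value, using the dictionary key in the file name.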

Making notebooks cleaner and simpler

We will shortly be replacing the existing Account Explorer notebook in Azure Sentinel with a version that uses this notebooklet.

The use of notebooklets dramatically reduces the amount of code and complexity in the notebook. In this case the use of the AccountSummary notebooklet reduced the number of code cells in the notebook from 31 to 15 and the total number of lines of code from 634 to 43.

This is 7% of the original code with no loss of functionality!
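The percentage follows directly from the counts quoted above. A quick check, using only the figures given in this article:

```python
# Figures quoted in the article for the Account Explorer notebook.
cells_before, cells_after = 31, 15
lines_before, lines_after = 634, 43

# Remaining code as a share of the original, in percent.
lines_pct = 100 * lines_after / lines_before
cells_pct = 100 * cells_after / cells_before

print(f"Lines of code remaining: {lines_pct:.1f}%")  # 6.8%
print(f"Code cells remaining: {cells_pct:.1f}%")     # 48.4%
```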

IP Address Summary Notebooklet

Like the previous notebooklet, the IpAddressSummary notebooklet aims to automate multiple common queries and lookups in one or two function calls.

ip_result = IpAddress.nblt.ip_address_summary(
    value="144.91.119.160"
)

Figure: Initial IP address summary, including activity found in common workspace logs, WhoIs information detailing the ownership of the IP address and geolocation data for the IP.

You can browse details of any Threat Intelligence results returned (as shown below) as well as related alerts (using the browse_alerts method).

Figure: Threat intelligence results from the notebooklet, shown in a selectable list. Selecting an item shows its details beneath.

Logon Session Rarity Notebooklet

This notebooklet has a more specialized use case than the previous two. It is designed to take a set of Windows processes and identify the user sessions within that data that ran relatively unusual processes or processes with unusual command lines. It will work on many tens of thousands of process events and reduce the set of interesting/suspicious events that you have to browse through to a more manageable subset.

It does this by clustering common, repetitive system processes. Often each instance of these commands varies a little (e.g. a UUID or timestamp in the path name or command line), so they can be difficult to isolate using standard data grouping.
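To illustrate why small variations defeat a plain group-by, here is a rough sketch that masks GUIDs and digit runs before grouping. This is an assumption made for illustration, not the notebooklet's clustering algorithm (which is not described here). The sample command lines and the normalize_cmdline helper are made up.

```python
import re
from collections import Counter

# Hypothetical sample command lines; not taken from real data.
SAMPLE_CMDLINES = [
    r"C:\Windows\System32\svchost.exe -k netsvcs -p 1024",
    r"C:\Windows\System32\svchost.exe -k netsvcs -p 2048",
    r"C:\Temp\{3F2504E0-4F89-11D3-9A0C-0305E82C3301}\setup.exe /quiet",
    r"C:\Temp\{9B2C1A77-0D4E-4C1B-8F6A-1234567890AB}\setup.exe /quiet",
    r"powershell.exe -enc SQBFAFgA",
]

GUID_RE = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)
DIGITS_RE = re.compile(r"\d+")


def normalize_cmdline(cmdline: str) -> str:
    """Replace GUIDs and digit runs with fixed placeholders."""
    masked = GUID_RE.sub("<GUID>", cmdline)
    return DIGITS_RE.sub("<N>", masked)


# Exact matching treats every variant as its own group.
exact_groups = Counter(SAMPLE_CMDLINES)
# After masking, variants of the same command collapse together.
masked_groups = Counter(normalize_cmdline(c) for c in SAMPLE_CMDLINES)

print(len(exact_groups), "groups with exact matching")
print(len(masked_groups), "groups after masking")
```

On this toy data, exact matching yields five groups while masking yields three. A real clustering approach can tolerate variations that fixed patterns like these would miss.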

The input to the notebooklet is a DataFrame of process create events (Windows Event ID 4688 — this notebooklet currently only works with Windows data). You can see that even though we start with approximately 23,000 events, we’ve been able to cluster these to a few hundred distinct groups.

Figure: Running the logon session rarity notebooklet with the input DataFrame. 23,000 events are grouped into 222 distinct clusters plus 95 unique events.

We use the clustering to assign a rarity score to each event (events in very large clusters will have a very low rarity score and those in small clusters or just unique events have a high rarity score). Now we group the events by the account/logon session that created the processes and show the average rarity score of the processes in each session.
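The exact scoring formula is not given here. One simple way to get the behaviour described is to score each event as the reciprocal of its cluster size and then average per session. The sketch below uses that assumed formula on made-up session and cluster IDs.

```python
from collections import defaultdict
from statistics import mean

# Toy events: (logon session id, cluster id). All values are made up.
EVENTS = [
    ("session_A", "c1"), ("session_A", "c1"), ("session_A", "c1"),
    ("session_B", "c1"), ("session_B", "c2"),
    ("session_C", "c3"),
]

# Count how many events fall into each cluster.
cluster_sizes = defaultdict(int)
for _, cluster in EVENTS:
    cluster_sizes[cluster] += 1

# Assumed scoring: rarity = 1 / cluster size, so a unique event scores 1.0.
session_scores = defaultdict(list)
for session, cluster in EVENTS:
    session_scores[session].append(1 / cluster_sizes[cluster])

# Average the per-event scores within each session.
mean_rarity = {s: mean(scores) for s, scores in session_scores.items()}

# List sessions from most to least unusual.
for session, score in sorted(mean_rarity.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{session}: {score:.3f}")
```

Here session_C, which ran only a unique process, ranks highest, and session_A, which ran only the common one, ranks lowest.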