socstories.blog

Detection methods, KQL & more

Demystifying Sentinel SigninLogs

TL;DR: to analyze the Sentinel SigninLogs table, keep the following in mind:

  • Group by OriginalRequestId or Id and take the entry with the last TimeGenerated value
  • Error code 50074 indicates Strong authentication required, but says nothing about authentication being successful or not
  • Resolve unresolved user identities to their correct names, based on resolved entries (join on UserId)
  • Use the DeviceDetail field to determine if the source device is managed by your organization (or which browser was being used)

Summary

The SigninLogs table in Sentinel provides valuable data that is being used by multiple detection and hunting queries.
However, misinterpreting this data causes these queries to generate excessive False Positive results. This becomes a significant problem in larger organizations, so I decided to dig in and summarize how to interpret this data. As always, for any comments or improvements, feel free to contact me via email or LinkedIn.

Contents of the table

The official documentation for SigninLogs has a table explaining all of the different fields, which I recommend looking into. That table is too large to reproduce in full, so below you can find the most interesting fields that are used (or can be used) in most detection use cases or during the investigation of these detections:

Column | Type | Description
--- | --- | ---
AppDisplayName | string | The application name displayed in the Azure Portal
AuthenticationDetails | string | The result of the authentication attempt and additional details on the authentication method
AuthenticationRequirement | string | This holds the highest level of authentication needed through all the sign-in steps, for sign-in to succeed.
ClientAppUsed | string | The legacy client used for sign-in activity. For example: Browser, Exchange ActiveSync, Modern clients, IMAP, MAPI, SMTP, or POP.
DeviceDetail | dynamic | The device information from where the sign-in occurred. Includes information such as deviceId, OS, and browser.
Id | string | The identifier representing the sign-in activity.
Identity | string | The display name of the actor identified in the signin.
IPAddress | string | The IP address of the client from where the sign-in occurred.
IsInteractive | bool | Indicates whether a user sign in is interactive. In interactive sign in, the user provides an authentication factor to Azure AD. These factors include passwords, responses to MFA challenges, biometric factors, or QR codes that a user provides to Azure AD or an associated app. In non-interactive sign in, the user doesn’t provide an authentication factor. Instead, the client app uses a token or code to authenticate or access a resource on behalf of a user. Non-interactive sign ins are commonly used for a client to sign in on a user’s behalf in a process transparent to the user.
LocationDetails | dynamic | Provides the city, state, country/region and latitude and longitude from where the sign-in happened.
MfaDetail | dynamic | This property is deprecated (but still useful).
NetworkLocationDetails | string | The network location details including the type of network used and its names.
OriginalRequestId | string | The request identifier of the first request in the authentication sequence.
ResultDescription | string | Provides the error message or the reason for failure for the corresponding sign-in activity.
ResultType | string | Provides the 5-6 digit error code that’s generated during a sign-in event. 0 indicates success; other values are failures. You can find more information using the Azure AD Error Codes documentation or https://login.microsoftonline.com/error.
Status | string | The sign-in status. Includes the error code and description of the error (in case of a sign-in failure).
TimeGenerated | datetime |
UserAgent | string | The user agent information related to sign-in.
UserDisplayName | string | The display name of the user.
UserId | string | The identifier of the user.
UserPrincipalName | string | The UPN of the user.

Table 1: extract of most useful fields from the SigninLogs table

Now, those of us who use Sentinel and have implemented detection rules from the official GitHub repository might have encountered rules such as SigninBruteForce-AzurePortal.yaml that generate quite a lot of false positives. While it is true that certain thresholds need to be configured when implementing such rules, I noticed a recurring issue: these detection rules report more failed logins than actually occurred.

This led me to dig deeper into the topic, analyze these detection rules and see where the issue might lie. To do so, I manually triggered some successful and failed logins, as well as some failed MFA attempts, and compared them with the logs available.

50074 – Strong authentication required

The first issue that I encountered was that of the error code 50074. Depending on which official documentation you consult, there is a small but important distinction in the meaning of this code.
According to the documentation of Microsoft Entra authentication and authorization error codes, this error code indicates UserStrongAuthClientAuthNRequiredInterrupt – Strong authentication is required and the user did not pass the MFA challenge.

[Screenshot: explanation of error code 50074 according to Microsoft documentation]


However, if we consult the interactive error lookup page from Microsoft, we see that this error code indicates Strong Authentication is required.

[Screenshot: explanation of error code 50074 according to Microsoft’s interactive error lookup page]


So while both sources show that Strong authentication (MFA) is required, one says that MFA failed (which is an important indicator of a potentially malicious login attempt) while the other simply says that MFA is required.

We managed to verify this manually, and the correct interpretation is that strong authentication is required; the code does not indicate that this authentication failed. This is important to keep in mind, because it means that we can filter out this error code when looking for failed logins. This is something that most detection rules currently don’t do.
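As a minimal sketch of that filter (the one-day window and the grouping columns are my own assumptions, not taken from any specific detection rule), a failed-login count that ignores both successes and 50074 could look like this:

```kql
// Sketch: count failed sign-ins per user, ignoring success ("0") and
// 50074 (strong authentication required, which is not a failure by itself).
// ResultType is a string column, so the codes are compared as strings.
SigninLogs
| where TimeGenerated >= ago(1d)          // assumed lookback window
| where ResultType !in ("0", "50074")
| summarize FailedSignins = count() by UserPrincipalName, ResultType, ResultDescription
| sort by FailedSignins desc
```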

Group logins by ID

Another important cause of excessive False Positive alerts is that detection rules do not group logins by ID.
The SigninLogs table in Sentinel contains a record for each of multiple steps within the authentication process. This means that one login might produce multiple records in the table. The number of extra records is not fixed, however: you cannot assume that each authentication attempt adds, say, four records, as it depends on the individual authentication. In smaller environments this might be less of an issue, but in a large environment it will result in a lot of False Positive alerts.

The solution for this is relatively easy: you can either group all the records from one authentication process by OriginalRequestId or by Id.
The difference is that Id is a value that Microsoft assigns to the different records belonging to one authentication process, while OriginalRequestId refers to the first record of the authentication process. We confirmed manually that both ways give the same result, with an occasional outlier or one record that might be grouped wrong. Seeing how we tested this in a live environment with roughly 100k users, having one or two records wrong is an acceptable margin.
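If you want to check this in your own environment, a rough sketch like the one below may help. The one-day window and the output column names are assumptions of mine. It counts how many authentication processes span more than one record, and how many OriginalRequestId values map to more than one Id (the outliers mentioned above):

```kql
// Sketch: compare grouping by OriginalRequestId with grouping by Id.
SigninLogs
| where TimeGenerated >= ago(1d)          // assumed lookback window
| summarize Records = count(), DistinctIds = dcount(Id) by OriginalRequestId
| summarize
    Requests = count(),                           // number of authentication processes
    MultiRecordRequests = countif(Records > 1),   // processes logged as several records
    MismatchedRequests = countif(DistinctIds > 1) // OriginalRequestId covering more than one Id
```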

Simply said: before running any analysis query, you should first group all logs belonging to one authentication process using

| summarize arg_max(TimeGenerated, *) by Id

or

| summarize arg_max(TimeGenerated, *) by OriginalRequestId
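To illustrate the effect, here is a small self-contained sketch on made-up data (the datatable contents, including the user names and request identifiers, are hypothetical). The first request has two records, a 50074 interrupt followed by a success, and arg_max keeps only the latest one:

```kql
// Hypothetical sample data: request "req-1" is logged twice, "req-2" once.
let SampleSignins = datatable(TimeGenerated: datetime, OriginalRequestId: string, UserPrincipalName: string, ResultType: string)
[
    datetime(2024-01-01 10:00:00), "req-1", "alice@contoso.example", "50074",
    datetime(2024-01-01 10:00:20), "req-1", "alice@contoso.example", "0",
    datetime(2024-01-01 10:05:00), "req-2", "bob@contoso.example",   "50126"
];
SampleSignins
// Keep only the most recent record of each authentication process
| summarize arg_max(TimeGenerated, *) by OriginalRequestId
```

The result contains one row per request: the successful record for req-1 and the single failed record for req-2. Without the grouping, the 50074 record for req-1 would be counted as an additional failure.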

Resolving unresolved users

Sentinel sometimes has the issue where a record in the SigninLogs table does not properly resolve a user’s identity. Some of the detection rules take this into account, but not all of them do. The logic below is included in the detection rule ‘Password spray attack against Azure AD application’ but is worth repeating here.
You can quickly identify these records, as they have the user GUID in the Identity field. In KQL terms, you can identify them with

 | where Identity matches regex "[0-9a-z]{8}-[0-9a-z]{4}-[0-9a-z]{4}-[0-9a-z]{4}-[0-9a-z]{12}"

To resolve these GUIDs to actual identities, which is necessary to get a complete and accurate view of a user’s login history, you have to cross-reference the GUIDs with logins that have resolved identities, and join them based on UserId.
While this sounds complicated, the logic is quite straightforward (the example given here is for sign-in activity to Azure Portal):

  1. create a variable ‘azPortalSignins’ with the logins for the application that you want to monitor (for instance: SigninLogs | where AppDisplayName == "Azure Portal"), and create an additional column ‘Unresolved’ based on the regex mentioned above.
  2. create a second variable ‘identityLookup’ containing each entry where Unresolved == false, and only keep the distinct user fields (UserDisplayName, UserPrincipalName, Identity and UserId)
  3. create a third variable ‘resolved_azPortalSignins’, where you filter on Unresolved == true and join with ‘identityLookup’ on UserId. Overwrite the UserDisplayName, UserPrincipalName and Identity fields with the correct values from ‘identityLookup’
  4. filter azPortalSignins on Unresolved == false and perform a union operator on the resolved_azPortalSignins variable, and continue with whatever detection logic you are using

This logic gives the KQL code below:

// Look back further for identity resolution than for the rule itself
let identityLookback = 7d;
let ruleLookback = 1d;
let isGUID = "[0-9a-z]{8}-[0-9a-z]{4}-[0-9a-z]{4}-[0-9a-z]{4}-[0-9a-z]{12}";
// Step 1: one record per authentication process, flagged as resolved or unresolved
let azPortalSignins = materialize (
    SigninLogs
    | where TimeGenerated >= startofday(ago(identityLookback))
    | where AppDisplayName == "Azure Portal"
    | summarize arg_max(TimeGenerated, *) by OriginalRequestId
    | extend Unresolved = iff(Identity matches regex isGUID, true, false)
);
// Step 2: lookup of known-good user fields, taken from the resolved records
let identityLookup =
    azPortalSignins
    | where Unresolved == false
    | distinct UserId, lu_UserDisplayName = UserDisplayName, lu_UserPrincipalName = UserPrincipalName, lu_Identity = Identity;
// Step 3: overwrite the user fields of unresolved records with the lookup values
let resolved_azPortalSignins =
    azPortalSignins
    | where Unresolved == true
    | join kind = leftouter (identityLookup) on UserId
    | extend UserDisplayName = lu_UserDisplayName, UserPrincipalName = lu_UserPrincipalName, Identity = lu_Identity
    | project-away lu_UserDisplayName, lu_UserPrincipalName, lu_Identity;
// Step 4: recombine both sets and restrict to the rule's own lookback period
azPortalSignins
| where Unresolved == false
| union resolved_azPortalSignins
| where TimeGenerated >= startofday(ago(ruleLookback))
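One thing to be aware of: with a leftouter join, an unresolved record whose UserId has no resolved counterpart in the lookback window ends up with empty user fields. If you would rather keep the original GUID in that case, a possible variation is sketched below on made-up data. This is my own suggestion, not part of the original detection rule, and the sample users are hypothetical:

```kql
let isGUID = "[0-9a-z]{8}-[0-9a-z]{4}-[0-9a-z]{4}-[0-9a-z]{4}-[0-9a-z]{12}";
// Hypothetical sample: the first user has a resolved and an unresolved record,
// the second user only has an unresolved record.
let SampleSignins = datatable(UserId: string, Identity: string, UserPrincipalName: string)
[
    "11111111-aaaa-bbbb-cccc-222222222222", "Alice Example", "alice@contoso.example",
    "11111111-aaaa-bbbb-cccc-222222222222", "11111111-aaaa-bbbb-cccc-222222222222", "11111111-aaaa-bbbb-cccc-222222222222",
    "33333333-dddd-eeee-ffff-444444444444", "33333333-dddd-eeee-ffff-444444444444", "33333333-dddd-eeee-ffff-444444444444"
]
| extend Unresolved = Identity matches regex isGUID;
let identityLookup =
    SampleSignins
    | where Unresolved == false
    | distinct UserId, lu_UserPrincipalName = UserPrincipalName, lu_Identity = Identity;
SampleSignins
| where Unresolved == true
| join kind = leftouter (identityLookup) on UserId
// Only overwrite when the lookup actually returned a value; otherwise keep the GUID
| extend Identity = iff(isnotempty(lu_Identity), lu_Identity, Identity),
         UserPrincipalName = iff(isnotempty(lu_UserPrincipalName), lu_UserPrincipalName, UserPrincipalName)
| project UserId, Identity, UserPrincipalName
```

Here the first user’s unresolved record is rewritten to "Alice Example", while the second user’s record keeps its GUID instead of being blanked.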

Identifying managed devices

Another indicator that might be interesting to take into account is whether the sign-in attempt originated from a managed device. Depending on the defined use case, you might consider login attempts that originate from a device managed by your organization as ‘safe’. This is specifically the case if your use case is trying to detect a brute force from an external actor trying to gain access to your application. This is completely up to the risk appetite and scope of your use case, but I believe it is still worth mentioning here. Even if you do not exclude managed devices from detection, this is important to know when investigating these incidents.

Identifying managed devices is done by parsing information in the DeviceDetail field. Keep in mind that while this is a dynamic field in the SigninLogs table, the AADNonInteractiveUserSignInLogs table defines it as a string field. Because … logic? The same is true for LocationDetails, MfaDetail and Status.
This is important, as most of the current detection rules for sign-in activity query both tables by defining a function and passing the table name as a string, after which the table is invoked with the table() function. You can convert these string fields to dynamic with the todynamic() function.

When doing so, it is important that you parse these fields to dynamic early on in the query. While you’re at it, give them all the same naming convention by making the names either all plural or all singular. By default they are mixed, with DeviceDetail being singular but LocationDetails being plural.
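A minimal sketch of this pattern is shown below. The function name signinsFrom, the one-day window and the new column names are assumptions of mine; the structure mirrors how the existing rules pass the table name as a string:

```kql
// Hypothetical helper: reads one sign-in table by name and normalizes
// the string/dynamic columns to dynamic with consistent plural-style names.
let signinsFrom = (tableName: string) {
    table(tableName)
    | where TimeGenerated >= ago(1d)      // assumed lookback window
    | extend DeviceDetails = todynamic(DeviceDetail),
             LocationDetails = todynamic(LocationDetails),
             MfaDetails = todynamic(MfaDetail),
             StatusDetails = todynamic(Status)
    | project-away DeviceDetail, MfaDetail, Status
};
union isfuzzy=true
    signinsFrom("SigninLogs"),
    signinsFrom("AADNonInteractiveUserSignInLogs")
// From here on, both tables expose the same dynamic columns
| project TimeGenerated, Type, UserPrincipalName, ResultType,
          OperatingSystem = tostring(DeviceDetails.operatingSystem),
          Country = tostring(LocationDetails.countryOrRegion)
```

Because the parsing happens inside the function, every later step can access properties such as DeviceDetails.isManaged without caring which table a record came from.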

The DeviceDetail field contains multiple interesting properties, such as isManaged, displayName and browser. If a device is not managed, this is indicated by isManaged being false and deviceId being empty. While the KQL is pretty straightforward, below is an example snippet (continuing the previous example of azPortalSignins).

azPortalSignins
| where Unresolved == false
| union resolved_azPortalSignins
| where TimeGenerated >= startofday(ago(ruleLookback))
// Parse early and use consistent names; todynamic() also accepts columns that are already dynamic
| extend DeviceDetails = todynamic(DeviceDetail), Status = todynamic(Status), LocationDetails = todynamic(LocationDetails), MfaDetails = todynamic(MfaDetail)
| project-away DeviceDetail, MfaDetail
| extend Browser = tostring(DeviceDetails.browser)
// Only an explicit isManaged == true counts as managed; a missing value is treated as unmanaged
| extend ManagedDevice = iff(tobool(DeviceDetails.isManaged) == true, "Managed device", "Unmanaged device")
| extend DeviceName = tostring(DeviceDetails.displayName)

Depending on how you want to customize your query, you might either define exclusions for managed devices or take this into account during your investigation.
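As a rough sketch of the exclusion variant, the query below combines the points from this post: it groups by OriginalRequestId, ignores 50074, and only counts failed sign-ins from devices that are not explicitly marked as managed. The threshold of 10, the one-day window and the grouping by user and IP address are placeholder assumptions that you would tune to your own use case:

```kql
SigninLogs
| where TimeGenerated >= ago(1d)                      // assumed lookback window
| where AppDisplayName == "Azure Portal"
| summarize arg_max(TimeGenerated, *) by OriginalRequestId
| where ResultType !in ("0", "50074")                 // failures only, 50074 is not a failure
// A missing isManaged value is treated as unmanaged
| extend IsManaged = coalesce(tobool(todynamic(DeviceDetail).isManaged), false)
| where IsManaged == false
| summarize FailedSignins = count(), Browsers = make_set(tostring(todynamic(DeviceDetail).browser))
    by UserPrincipalName, IPAddress
| where FailedSignins >= 10                           // hypothetical threshold
```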

Conclusion

Many detection rules use the SigninLogs (and AADNonInteractiveUserSignInLogs) tables. You can either improve these detection rules with the elements below, or keep these elements in mind when investigating the resulting alerts:

  • Multiple records of the same authentication process can be grouped on either Id or OriginalRequestId
  • Unresolved users, where the Identity field contains a GUID, should be resolved based on resolved entries (join on UserId)
  • Error code 50074 indicates that strong authentication was triggered, but says nothing about whether the authentication passed or failed
  • It is possible to identify if the source device is managed by your organization by looking at the information available in the DeviceDetail field.

Author: