January 27th, 2014

The SSSD Active Directory provider, part 2

The previous post gave a high level overview of what new features are present in the AD provider starting with 1.9 and especially in the latest versions. This post will illustrate these AD specific features in more detail. Because the AD backend is still undergoing rapid development, most features are accompanied with the version it appeared in.

Faster logins

One of very common complaints about using the LDAP provider with SSSD was that logins are too slow. Typically this was the case with very large and nested group memberships the user was a member of, as the SSSD previously crawled the LDAP directory, looking up the groups.

The AD provider is able to take advantage of a special attribute present in Active Directory called tokenGroups to read all the groups is a member of in a single call. This performance enhancement can reduce the number of LDAP calls needed to find the group memberships to a single one, drastically improving the login time.

The tokenGroups attribute is only leveraged if the SSSD maps the ID values from SIDs, not when POSIX attributes are used in the older versions, up to 1.11.3. With 1.11.3 or later, the tokenGroups attribute is leveraged even when POSIX attributes are used instead of automatic mapping.

Dynamic DNS updates

Clients enrolled to an Active Directory domain may be allowed to update their DNS records stored in AD dynamically. At the same time, Active Directory servers support DNS aging and scavenging, which means that stale DNS records might be removed from AD after a period of inactivity.

The AD provider supports both scenarios described above by default. The AD provider will attempt to update the DNS record every time the AD provider goes online (typically after startup) and periodically every 24 hours to keep the records from being scavenged.

Dynamic NetBIOS name discovery

When referring to a user from an Active Directory domain, typically the domain is part of the identifier. In contrast to UNIX users that are normally just referred to with the username:
id root

The AD users can be referred to either by fully qualifiying the name with the AD domain:
id Administrator@ad.example.com

Or just using the NetBIOS name of the AD domain:
id AD\\Administrator

In most deployments, the NetBIOS name is just the first part of the full domain name, but not always. The NetBIOS name can be customized on the AD side by the administrator.The AD provider of SSSD is able to recognize both formats and autodetect the right NetBIOS name as well.

DNS site discovery

Large Active Directory environments often span across multiple locations in multiple geographies. In Active Directory, these physical locations are represented as "sites". Each client belongs to a site based on the network subnet it resides in. For best performance, it is important that the clients are able to find the closest site and use the other domain controllers only as a fallback.

The support for discovering the closes site was added in SSSD 1.10 and is enabled by default.

Support for trusted domains in the same forest

Starting with version 1.10, the SSSD is able to dynamically find the trusted domains in the same forest and provide both authentication and identity information for users coming from the trusted domains. The SSSD retrieves identity information from the Global Catalog, so it's important that the users and all needed attributes are replicated to the Global Catalog. This includes even POSIX attributes such as home directory, login shell and most importantly UIDs and GIDs if not using ID mapping.

Prior to 1.10, it was somewhat possible to configure the SSSD to fetch identity data from trusted domain, but the administrator had to represent each domain with a separate [domain] stanza in the config file. Each domain stanza had to be fully configured as a separate identity source including search bases and host names. Moreover, groups could only contain members from the same domain. The native support also requires no configuration at all, the trusted domains are discovered on the fly.

Support for enterprise logins

Some users in AD might use a different Kerberos Principal suffix than the default one. This feature is on by default in the AD provider and was introduced in SSSD 1.10. This feature is also required to support logins of users from trusted domains.

A simplified LDAP access control mechanism

Starting with upstream version 1.11.2, there is a simplified way to express the access control using an LDAP filter with the AD backend. The administrator can now only specify access_provider=ad and then use the
access filter with an option aptly called ad_access_filter.

The following example illustrated restricting access to users whose name begins with 'jo' and have a valid home directory attribute:
access_provider = ad
ad_access_filter = (&(sAMAccountName=jo*)(unixHomeDirectory=*))

In addition to checking whether the user matches the filter, the AD access filter also checks the account validity. Expired accounts are not permitted even if they matched the filter. Previously, the administrator had to configure the LDAP access filter and specify all the options manually.

The example used above would the look much more involved:
access_provider = ldap
ldap_access_order = filter, expire
ldap_account_expire_policy = ad
ldap_access_filter = (&(sAMAccountName=jo*)(unixHomeDirectory=*))
ldap_sasl_mech = GSSAPI
ldap_schema = ad

Still, for most deployments, the simple access provider is the best choice for the ease of configuration, especially when it comes to group membership.

Why is id so slow with SSSD?

Every once in a while, when debugging an SSSD performance problem for a user or a customer, I see that even experienced users tend to measure login performance by running id(1). That's not really the best thing to do, for one reason - running the plain id command does much more than what happens during login, and of course, at a cost.

Typically, the login program such as ssh or gdm, needs to perform a couple of tasks aside verifying the user's credentials. These include finding out if the user exists, what shell and home directory does he have and what groups the user is a member of, so that the user can access the files he should be allowed to (and vice versa). These tasks boil down to two corresponding glibc calls - getpwnam to find the details about the user, and getgrouplist to retrieve the list of groups he is a member of.

Because these two library functions are used so often, we take special care in the SSSD to make sure that we use any optimization that is available, such as the transitive memberOf attribute when IPA is used or the tokenGroup attribute when the AD provider is configured. In order to measure what the performance of Name Service Switch calls the login program does is like, you can call "id -G $username" from the command line.

Notice the extra "-G" switch. That really makes a bit of a difference, because the getgrouplist operation returns a list of numerical IDs the user is a member of. That's usually good enough for the login programs to set the groups for the user who logs in but not really friendly for the admin inspecting the output of the id command.

In contrast to "id -G $username", "id $username" does one operation that sounds trivial but can be extremely expensive - resolves the group GIDs to group names. While that sounds like a really easy operation, it involves calling getgrgid for each of the GIDs returned by getgroupslist. And there comes the slowdown, the getgrgid operation is kind of an all-or-nothing call. It retrieves not only the information about the group itself, such as its name, but also all information about the members if the group, including all the users. This can get quite expensive, consider a university setup where each student was a member of group "students" that consisted of all students on the university.

A legitimate question might be whether its possible to restrict the amount of information the get-by-GID call retrieves. And the answer is both yes and no - the POSIX interface doesn't allow any such query directly, but the SSSD offers several means to speed up the overall processing. One quite recent addition is the ignore_group_members configuration option that was contributed by Paul Henson. Setting this option to True causes all groups to effectivelly appear empty, avoiding the need to download the members. Keep in mind that with many server implementations, the members might also include other nested groups which causes the whole operation to recurse.