March 30


Data Cleanup Part 2: Other UserIDs

By: Ioana Bazavan Justus

Did last month’s exercise of mapping primary userIDs kill you?

Is it still killing you?

Unless several full-time resources were allocated on a project basis, the cleanup for a large organization can easily take months to complete. If you’re still working on it, don’t worry – you’re not alone!

That said, we need to move on, so join us when you’re ready.

The Purpose of Secondary userIDs

Once the primary userIDs are mapped, it is time to continue on with all of the other userIDs in the organization – the ones for the systems that were identified as “secondary” (Priority 2) in January’s exercise.

Secondary systems are systems that need to be integrated to some degree with identity management, but they were deemed “secondary” because the integration might be complex, the system is important but doesn’t have that many users, or the system may be too old to integrate.

There is also another type of secondary account – one most often associated with mainframe or administrative accounts: additional IDs belonging to the same person on a single system.

There are a variety of reasons for this: in some cases, a user of a system may also be an administrator, and there is a security requirement to keep the permissions separate. In mainframe environments, multiple IDs may be needed either because a user has too many permissions to “fit” on a single ID (there are ways to fix this, but that’s outside of the scope of this discussion), or because users need access to the same data for different regions, and switching “views” within one ID is too cumbersome.

There could be other reasons for having multiple IDs on a single system, but the end result is the same: if any user has more than one ID on any key system, each additional ID needs to be identified and linked to the user’s primary account. Otherwise, there will be gaps in the integrity of the identity data.

The task at hand

Cleaning up and mapping secondary userIDs is similar to cleaning up and mapping primary userIDs. The only difference is the set of target systems. As a result, this effort may be easier… or harder than the previous one.

Here’s why:

Smaller systems might be easier to map

Systems with fewer users are generally easier to keep clean, and they’re maintained by fewer administrators. There is also the possibility that the administrators know the users personally. If the Priority 2 systems on the list fall into this category, expect this effort to go a lot faster than the one for primary userIDs.

More obscure systems may not be as well-maintained

When cleaning up and mapping primary accounts, the email system is generally the best place to start because it tends to be one of the best-maintained, and for good reason(s):

  1. People use their email all the time. If it’s not working correctly or their name isn’t right, they’re very vocal about it, so users’ email data tends to be very clean.
  2. Mailboxes take up precious disk space, and disk space costs money. Email administrators tend to notice and act on inactive accounts in the interest of saving the company some money.

The more obscure systems don’t have these luxuries. They tend to be more loosely maintained. Administrators may not be as rigorous about following up on inactive accounts or configuring the system to auto-disable/auto-delete unused IDs. They may also not follow the company’s naming standard when creating userIDs. The worst part is they likely don’t populate much – or any! – personally identifiable information with the userID.

If the Priority 2 systems on the list fall into this category, expect this task to be as painful as the one for primary userIDs – or worse.

The UNIX environment is a can of worms

(For ease of expression, I’ll use the term UNIX here, but this applies to Linux and really any *NIX environment)

The UNIX environment can be one of the most difficult to clean up – especially at large companies with many UNIX servers – because of the tendency for UNIX environments to lack central user administration facilities. Unlike in an Active Directory or mainframe environment, users are typically added to each UNIX server (or cluster) to which they need access. This causes a user administration nightmare – trying to figure out which users are on which systems – especially when access needs to be identified or terminated. This problem is compounded if there is little or no identifying information with the ID, or if the IDs were created on a first-come, first-served basis.

Here’s a true story to illustrate the point:

I helped a client clean up their UNIX IDs on one of my first identity management projects. At the company, there were (among others) three UNIX developers named Trong Nguyen, Trung Nguyen, and Tran Nguyen. Their IDs were tnguyen, tnguyen1, and tnguyen2. They requested access to different UNIX servers at different times, depending on their project needs. The UNIX administrators were in the habit of assigning the next available userID on each server to users as they requested access. As a result, my mapping matrix looked something like this:

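| Developer    | Server 1 | Server 2 | Server 3 |
|--------------|----------|----------|----------|
| Trong Nguyen | tnguyen  | tnguyen1 | tnguyen2 |
| Trung Nguyen | tnguyen1 | tnguyen2 | tnguyen  |
| Tran Nguyen  | tnguyen2 | tnguyen  | tnguyen1 |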

In reality, each developer had access to over 25 servers, and they themselves didn’t know which ID they were assigned on which system. To make things worse, their names were not registered with the userIDs, so the only way to figure it out was by trial and error.
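For anyone who would rather keep this kind of matrix in a script than in a spreadsheet, here is a minimal Python sketch. It assumes you have already confirmed, server by server, who uses which ID. The tuple layout and the function names `build_matrix` and `ids_with_multiple_owners` are made up for illustration; the sample rows are the three-server example from the story.

```python
from collections import defaultdict

# Hypothetical input: (server, user_id, owner) tuples collected during research.
# "owner" is the person confirmed to be using that ID on that server.
confirmed = [
    ("Server 1", "tnguyen", "Trong Nguyen"),
    ("Server 1", "tnguyen1", "Trung Nguyen"),
    ("Server 1", "tnguyen2", "Tran Nguyen"),
    ("Server 2", "tnguyen1", "Trong Nguyen"),
    ("Server 2", "tnguyen2", "Trung Nguyen"),
    ("Server 2", "tnguyen", "Tran Nguyen"),
    ("Server 3", "tnguyen2", "Trong Nguyen"),
    ("Server 3", "tnguyen", "Trung Nguyen"),
    ("Server 3", "tnguyen1", "Tran Nguyen"),
]


def build_matrix(records):
    """Return {owner: {server: user_id}} from (server, user_id, owner) tuples."""
    matrix = defaultdict(dict)
    for server, user_id, owner in records:
        matrix[owner][server] = user_id
    return {owner: dict(servers) for owner, servers in matrix.items()}


def ids_with_multiple_owners(records):
    """Return {user_id: sorted owners} for IDs that belong to different people on different servers."""
    owners = defaultdict(set)
    for _server, user_id, owner in records:
        owners[user_id].add(owner)
    return {uid: sorted(people) for uid, people in owners.items() if len(people) > 1}


matrix = build_matrix(confirmed)
ambiguous = ids_with_multiple_owners(confirmed)
```

With this data, every one of the three userIDs shows up in `ambiguous`, which is exactly why the ID alone could not be trusted as an identifier.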

UserID correlation is just one problem in the UNIX environment – identifying unused accounts is another. Many n-tiered applications that run on a UNIX infrastructure require the user to have a UNIX account on the underlying servers for the application access to work, but the user only ever logs into the application – not into the server. The UNIX account is therefore never used interactively, and its password is never changed. This necessitates changes to the password expiration configurations on those servers, and it precludes auto-disabling/auto-deleting inactive accounts. The net effect is that old accounts accumulate much more easily, and truly inactive IDs are much harder to identify.
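One way to keep application-backed accounts from being mistaken for dead ones is to sort accounts into three buckets instead of two. The Python sketch below is only an illustration: the record fields (`user_id`, `last_login`, `app_backed`), the 90-day threshold, and the function name `classify_accounts` are all assumptions, and the data would have to come from whatever login records and application inventories you actually have.

```python
from datetime import date, timedelta


def classify_accounts(accounts, today, max_idle_days=90):
    """Sort accounts into active, needs-review, and inactive-candidate buckets."""
    cutoff = today - timedelta(days=max_idle_days)
    buckets = {"active": [], "review": [], "inactive_candidate": []}
    for acct in accounts:
        last_login = acct["last_login"]  # a date, or None if no login is on record
        if last_login is not None and last_login >= cutoff:
            buckets["active"].append(acct["user_id"])
        elif acct["app_backed"]:
            # No recent server login, but an application depends on the account,
            # so the missing login proves nothing. Check with the application owner.
            buckets["review"].append(acct["user_id"])
        else:
            buckets["inactive_candidate"].append(acct["user_id"])
    return buckets


# Hypothetical sample data.
accounts = [
    {"user_id": "jsmith", "last_login": date(2024, 3, 1), "app_backed": False},
    {"user_id": "mlee", "last_login": None, "app_backed": True},
    {"user_id": "old_dev", "last_login": date(2022, 6, 15), "app_backed": False},
]
buckets = classify_accounts(accounts, today=date(2024, 3, 30))
```

Only the "inactive_candidate" bucket should feed the disable process; the "review" bucket needs a human to confirm whether the application access is still in use.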

UNIX also seems to be an environment where developers use their own ID to run batch jobs (instead of requesting a system account for that purpose). The developer leaves the company, but the batch job persists. Disable the ID, break a business function. So then there’s the added work of identifying the job and what permissions it needs to run, creating an appropriate system account, changing the script to reference the new ID, and then finally doing what really needed to be done – cleaning up the userID.
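A quick check before disabling any ID can save a broken business function. The Python sketch below assumes you have already exported a list of scheduled jobs with their owners (how you collect that depends on your schedulers) and that you keep a list of known system accounts. The field names and the helpers `jobs_owned_by` and `personal_job_owners` are hypothetical.

```python
def jobs_owned_by(user_id, scheduled_jobs):
    """Return the scheduled jobs that run under the given userID."""
    return [job for job in scheduled_jobs if job["owner"] == user_id]


def personal_job_owners(scheduled_jobs, service_accounts):
    """Return job owners that are not known system accounts, sorted by name."""
    return sorted({job["owner"] for job in scheduled_jobs} - set(service_accounts))


# Hypothetical inventory of scheduled jobs collected from the servers.
scheduled_jobs = [
    {"server": "Server 1", "owner": "svc_batch", "command": "/opt/app/nightly_load.sh"},
    {"server": "Server 2", "owner": "tnguyen1", "command": "/home/tnguyen1/export.sh"},
    {"server": "Server 3", "owner": "jsmith", "command": "/home/jsmith/report.sh"},
]
service_accounts = ["svc_batch"]

blocking = jobs_owned_by("tnguyen1", scheduled_jobs)
needs_system_account = personal_job_owners(scheduled_jobs, service_accounts)
```

If `blocking` is not empty, the job needs to move to a system account before the userID can be cleaned up.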

In all fairness, this happens in all environments, not just UNIX, but this is where this information fit in the grand scheme of things. 🙂

Did I mention UNIX UIDs?

In addition to an administrator-assigned userID, UNIX systems also automatically generate a numeric UID for each user. What many companies realize too late is that if UIDs aren’t expressly managed, each user will be assigned the next available UID on each server, much like the tnguyen situation described above. Having different UIDs for the same user on different systems significantly complicates the integration between identity management and the UNIX environment. This situation must be rectified before the integration can occur.
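To get a sense of how bad the UID drift is, you can compare the account files from each server. The Python sketch below reads text in the standard passwd format (login name in the first colon-separated field, numeric UID in the third) and reports logins whose UID differs between servers. It assumes the same login name already means the same person on every server, which is only true after the userID mapping is done. The function names and sample data are illustrative.

```python
from collections import defaultdict


def parse_passwd(text):
    """Return {login_name: uid} from passwd-format text (name:passwd:uid:gid:...)."""
    users = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(":")
        if len(fields) < 3:
            continue  # skip malformed lines rather than guess
        users[fields[0]] = int(fields[2])
    return users


def uid_conflicts(passwd_by_server):
    """Return {login: {server: uid}} for logins whose UID is not the same everywhere."""
    seen = defaultdict(dict)
    for server, text in passwd_by_server.items():
        for login, uid in parse_passwd(text).items():
            seen[login][server] = uid
    return {
        login: per_server
        for login, per_server in seen.items()
        if len(set(per_server.values())) > 1
    }


# Hypothetical passwd contents from two servers.
passwd_by_server = {
    "Server 1": "alice:x:1001:1001::/home/alice:/bin/sh\n"
                "bob:x:1002:1002::/home/bob:/bin/sh\n"
                "carol:x:1003:1003::/home/carol:/bin/sh\n",
    "Server 2": "bob:x:1001:1001::/home/bob:/bin/sh\n"
                "alice:x:1002:1002::/home/alice:/bin/sh\n"
                "carol:x:1003:1003::/home/carol:/bin/sh\n",
}
conflicts = uid_conflicts(passwd_by_server)
```

Here alice and bob were created in a different order on each server, so their UIDs are swapped, while carol happens to line up.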

The solution is fairly simple to design but tedious to implement – just like everything else in this process. Basically, you pick a starting UID high enough to leave a gap above every existing UID on every server. Then you assign a new UID to each user and ensure that the UID “sticks” across all servers. You also design a process to ensure that once a UID is assigned to a user, it stays reserved for that user across all servers.

The details of this process need to be discussed with a good UNIX engineer, and the implementation – although it will take time and planning – should be transparent to the end users.
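The bookkeeping side of that design can be sketched in a few lines of Python. This is only an illustration under assumptions: the gap size of 10000 is arbitrary, the registry is a plain dictionary standing in for whatever central record you choose, and the names `pick_start_uid` and `assign_uids` are made up. Applying the new UIDs on the servers (and fixing file ownership) is the part to work out with your UNIX engineer.

```python
def pick_start_uid(existing_uids, gap=10000):
    """Pick a starting UID that sits well above every UID already in use."""
    return max(existing_uids, default=0) + gap


def assign_uids(people, registry, start_uid):
    """Give each person one UID and never hand an assigned UID to anyone else.

    `registry` maps person -> UID and is updated in place, so people who
    already have a UID keep it on later runs.
    """
    next_uid = max(list(registry.values()) + [start_uid - 1]) + 1
    for person in people:
        if person not in registry:
            registry[person] = next_uid
            next_uid += 1
    return registry


# Hypothetical UIDs found across all servers during the cleanup.
existing_uids = {500, 1001, 1002, 2047}
start_uid = pick_start_uid(existing_uids)

registry = {}
assign_uids(["alice", "bob"], registry, start_uid)
# A later run with a new user leaves earlier assignments untouched.
assign_uids(["bob", "carol"], registry, start_uid)
```

Because the registry is the single source of truth, a user who requests access to a new server gets the same UID there that they have everywhere else.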

Another note on UNIX integration

Although it’s entirely possible to integrate identity manager directly with the UNIX farm, it’s not the most efficient or cost-effective way to go about it as it would require a separate integration with each server. There are products out there (the one I’m familiar with is Likewise) that will LDAP- or AD-enable UNIX user management so that the existing integration between LDAP or AD and identity manager can be used. There are also products that allow similar functionality between UNIX and mainframe tools such as RACF.

If UNIX is a large component of your environment, start looking into products that will facilitate the integration with identity manager now.


The approach for cleaning up secondary userIDs is the same as what was outlined last month for primary userIDs. Remember to communicate frequently and clearly with the impacted users and their management, and don’t be afraid to disable IDs (in an organized way, of course) if all other avenues of research have failed.

Parking Lot

There’s a good chance that this second round of cleanups will uncover more interesting issues – as I advised last month, take the time to do something about them.

Updating the requirements list

If a UNIX-identity manager integration is in scope, start planning now. Research integration products and determine if they are appropriate to implement. If not, be sure to update the requirements list to ensure that UNIX integration requirements are captured.

Action Recap

This month’s actions are very similar to last month’s, just on different systems:

  1. Identify the secondary IDs, and determine who owns each ID
  2. Identify and retire obsolete IDs
  3. Connect secondary IDs to the primary IDs
  4. Develop (and use!) a process for keeping the IDs clean until identity management can take over

How can I help?

Do you need some clarification or additional assistance? Do you have an experience to share with others? Leave a comment below so we can all improve together.

