June 03, 2011

NewProcs: A High-level System Activity Indicator

Operating systems are fairly multi-tasking and processes are their major, most visible components. In line with Heisenberg's Uncertainty Principle, which I see everywhere these days, observing processes can be more accurate than observing any smaller components. In my job I mostly observe process-level behaviors; at least I begin there. Recently I started staring from one more level zoomed out -- the lifetime of a process -- and insensitively asking, "Should this process have been created?"

It's a question that is sometimes easy to decide, but not always. It is like an Age of Empires player asking herself, "Should I create a new villager?" It depends on the tasks pending, the resources available, the world and the neighborhood into which he might be born, the greater purpose his long-term life could serve, and even the nature of the creator.

One small step towards deciding this is the knowledge of process lifetimes, their births and deaths, and so a measurement that I now routinely take is "NewProcs". On Windows, Process Monitor through filters "Operation is Process Create" and "Operation is Process Exit" does a beautiful job of this. On Solaris the following DTrace one-liner will do:

    dtrace -qn 'proc:::exec-success, proc:::exit {printf("%d\t%d\t%Y\t%s\n", pid, ppid, walltimestamp, curpsinfo->pr_psargs)}'

On OSs with SystemTap the following one-liner will do (the file should be located in the SystemTap examples directory):

    stap forktracker.stp

On Unix-based OSs, I have a feeling GNU Acct can serve as a workaround.
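As a rough illustration of turning such a trace into lifetimes, here is a minimal Python sketch that reads the tab-separated records printed by the DTrace one-liner above (pid, ppid, timestamp, arguments). It rests on several assumptions of mine, not anything DTrace guarantees: that `%Y` prints timestamps like `2011 Jun  3 10:15:42`, that the first record seen for a pid is its exec-success and the last is its exit, and that pids are not reused during the capture. The function names and sample data are made up for the example.

```python
from collections import Counter
from datetime import datetime


def parse_line(line):
    """Split one tab-separated record into (pid, ppid, timestamp, args)."""
    pid, ppid, stamp, args = line.rstrip("\n").split("\t", 3)
    # Assumed %Y layout: "2011 Jun  3 10:15:42" (English month names).
    when = datetime.strptime(stamp, "%Y %b %d %H:%M:%S")
    return int(pid), int(ppid), when, args


def lifetimes(lines):
    """Return (pid, ppid, seconds_alive, args) for pids seen at least twice.

    Assumption: the first record for a pid is its birth (exec-success)
    and the last record is its death (exit). Pids with only one record
    (still running, or exited without an exec) are skipped.
    """
    first, last, seen = {}, {}, Counter()
    for line in lines:
        if not line.strip():
            continue
        pid, ppid, when, args = parse_line(line)
        first.setdefault(pid, (when, ppid, args))
        last[pid] = when
        seen[pid] += 1
    result = []
    for pid, (born, ppid, args) in first.items():
        if seen[pid] < 2:
            continue
        result.append((pid, ppid, (last[pid] - born).total_seconds(), args))
    return result


# Hypothetical sample records, in the format of the one-liner's output.
sample = [
    "101\t1\t2011 Jun  3 10:15:42\tls -l",
    "102\t1\t2011 Jun  3 10:15:43\tsleep 5",
    "101\t1\t2011 Jun  3 10:15:42\tls -l",
    "102\t1\t2011 Jun  3 10:15:48\tsleep 5",
    "103\t1\t2011 Jun  3 10:15:50\tcat",
]

for pid, ppid, seconds, args in lifetimes(sample):
    print(f"{pid}\t{ppid}\t{seconds:.0f}s\t{args}")
```

Since `%Y` only has one-second resolution, very short-lived processes show up with a lifetime of zero, which is itself a hint worth looking at.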

One major area that can take advantage of this is "scripts" - be they Batch scripts or Shell scripts or Perl scripts. Especially in enterprise software, peripheral operations like installations and configurations use a lot of scripts. I will try to elaborate on that in the near future.

The number of processes being created and killed isn't, by itself, necessarily a good indicator, but the detailed list can occasionally throw a lot of light on the high-level system activity during various operations. Below are a few examples (naïve layman's curiosities):

1. On Windows 7 and Windows Server 2008 (probably Windows Vista as well), Explorer can navigate the calendar, connect to various networks, switch battery power plans, and browse and search for various programs, all without creating any new processes. However, the Speakers/Headphones icon in the system tray needs to create a new process, SndVol, for any basic volume change operation.

2. Windows Media Player Network Sharing Service Configuration Application (wmpnscfg) is one inexplicable troll. It keeps appearing during scenarios where the need for its existence isn't all that clear. For example, whenever one connects to some networks, several short-lived wmpnscfg processes get created. A disconnect is usually associated with another instance.

3. When one looks for the version of Google Chrome (at least with 11.0.696.71), the "About Google Chrome" dialog creates a GoogleUpdateOnDemand process which goes on to create two more processes: GoogleUpdate /ondemand, GoogleUpdate -Embedding. The browser also has a nice feature of Gmail notifications. Unlike notifications from Microsoft Office Outlook, each Gmail notification comes through a separate process (chrome -type=renderer, the tab process). Google Chrome is known to be a "multi-processed" application for reasons of stability and security, but.

Cygwin (and MinGW) earnestly takes up its share of process creation as explained here. I'm sure everyone has their reasons, but high birth rates and infant mortality rates always concern me. Should these processes have been created? What do you think?

2 comments:

  1. The disadvantage with the single-process model (at least in the case of web browsers) is that:
    a. if one tab crashes (due to bad javascript or some other reason), it can bring down all the other well-behaving tabs. This doesn't sound fair.
    b. if one tab is able to overcome browser security protections, it can access data structures (DOM values which can hold important private information) of other tabs, such as banking or personal finance websites. This doesn't sound good at all.
    c. if one tab leaks memory, ditto ditto.

    The above three occur more often than not, due to the nature of web browsers loading code from practically all over the web (where the above things are caused both innocently and maliciously).

    Hence, Chrome came up with this model of one process per tab. Essentially, the operating system offers protection between processes (as a basic tenet), and hence this solves all three of the above problems. There is certainly greater overhead, but it has its advantages (I love that I can start/stop applets of EmergeDesktop anytime I choose to, but it creates multiple processes).

    I know that you are talking about those processes whose birth/death can be avoided, but given that we have pluses and minuses on both sides, IMO it would boil down to a trade-off.

  2. I'm not debating the multi-proc browsers. In fact, I'm liking it more and more as I see some of its advantages (most commonly that an unresponsive tab is not freezing the browser itself).

    In Chrome's case, I'm still questioning whether new mail notifications need a new process each, and whether the About dialog needs to spawn three new processes, and there are a few other things.
