Wednesday, January 3, 2007

Memory Overcommit and the OOM Killer

Linux has a feature called memory overcommit. Put simply, it means the kernel hands out memory even if it doesn't have enough to back it. This matters when a new process is created using fork(). fork() effectively copies the parent's address space, and so could require twice the parent process's memory once the new process (the child) is created. With memory overcommit, fork() returns success even if there is not enough memory to back a full copy for the child!
The idea behind memory overcommit in Linux is that the child process rarely uses all the memory allocated to it. fork() is usually followed by exec(), which overlays the child's address space with some executable. Once the exec'd program finishes, the child process exits and the parent process (which typically goes into wait() after creating the child) resumes.
If the kernel cannot find enough memory when the child actually needs it, something else kicks in: the Out Of Memory (OOM) killer. Its job is to select a process to kill so that the memory requirements can be satisfied. Not a very desirable feature, but it is necessary to keep the memory overcommit feature of Linux. This is what made the OOM killer infamous. How to select a process to kill is tricky. It might happen that an important process (e.g. a database) gets killed by the OOM killer. Analogies like this show how serious the situation is when the killer is invoked.
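The fork(), exec(), wait() pattern described above can be sketched in a few lines of Python. This is only an illustration for a Unix-like system; the helper name run_child is made up for this example and is not part of any library.

```python
import os
import sys

def run_child(argv):
    """Fork, exec argv in the child, wait in the parent; return the exit code."""
    pid = os.fork()                      # child begins as a copy of the parent
    if pid == 0:
        # Child: overlay our address space with the new program.
        try:
            os.execvp(argv[0], argv)
        finally:
            os._exit(127)                # only reached if exec itself failed
    # Parent: block in wait() until the child is done.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)          # child was killed by a signal
```

For example, run_child([sys.executable, "-c", "pass"]) starts a second Python interpreter that exits immediately, so the copied address space is thrown away almost at once. That is exactly the case overcommit is betting on.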
It seems that during the 2.4 days, the OOM killer's favourite process to kill was the Netscape browser. The browser would crash all of a sudden and you'd have no idea why.
Memory overcommit along with the OOM killer is not an example of good design, but it has even made its way into AIX. With 2.6, memory overcommit can be suppressed using some kernel tunables, but by default the feature is present.
Fortunately, it doesn't exist in Solaris. Solaris never used memory overcommit. At first, vfork() was used instead of fork() to prevent process creation from failing. In Solaris 10, posix_spawn() is used instead of vfork(), since vfork() is not MT-safe.
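To see how a given Linux box is configured, one can read the overcommit setting from /proc. The sketch below assumes the tunable lives at /proc/sys/vm/overcommit_memory and that the values 0, 1 and 2 mean what the kernel documentation says; neither the path nor the function names come from this post, so treat them as assumptions.

```python
from pathlib import Path

# Meaning of each value, as I understand the kernel documentation.
OVERCOMMIT_MODES = {
    0: "heuristic overcommit (the default)",
    1: "always overcommit",
    2: "strict accounting, no overcommit beyond the configured limit",
}

def describe_overcommit(raw):
    """Turn the raw file contents (e.g. '0\\n') into a readable description."""
    mode = int(raw.strip())
    return OVERCOMMIT_MODES.get(mode, "unknown mode")

def current_overcommit(path="/proc/sys/vm/overcommit_memory"):
    """Describe the running kernel's setting, or None if the file is absent."""
    p = Path(path)
    if not p.exists():
        return None                      # not Linux, or /proc not mounted
    return describe_overcommit(p.read_text())
```

Calling current_overcommit() on a stock system will most likely report the heuristic mode, which matches the remark above that the feature is present by default.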
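For comparison, here is a rough sketch of the posix_spawn() style of process creation, using Python's os.posix_spawn wrapper (available on Unix-like systems in Python 3.8 and later). The helper name spawn_and_wait is made up for this example. The point is that the new program is started in one call, without first duplicating the parent the way fork() does.

```python
import os
import sys

def spawn_and_wait(argv):
    """Start argv[0] via posix_spawn and return its exit code."""
    # argv[0] must be a full path here; posix_spawn does not search PATH.
    pid = os.posix_spawn(argv[0], argv, os.environ)
    _, status = os.waitpid(pid, 0)       # parent waits for the child to finish
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)          # child was killed by a signal
```

A call such as spawn_and_wait([sys.executable, "-c", "pass"]) behaves like the fork() plus exec() pair from the caller's point of view, but leaves it to the platform how the child gets created.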

Sunday, December 24, 2006

Search from Firefox address bar

I've been using the Google toolbar for Firefox for a long time. There is a Firefox feature which can make the toolbar redundant by doing a Google search straight from the Firefox address bar. For a single-word search, Firefox takes us to the page of the first search result found. It is like the "I'm feeling lucky" feature of Google search, just like what the infamous "failure" or "miserable failure" and "I'm feeling lucky" combination does. But if we enter a string of more than one word in the Firefox address bar, it works like a normal Google search, and makes installing the Google toolbar unnecessary.
A cool feature of Firefox which many are not aware of.

Friday, December 22, 2006

The toilet

1- Say, there is a common toilet with many doors.
2- A person can enter from only one of these doors at any time i.e. the door to enter is fixed for him.
3- A door can be used for entry into the toilet by many persons i.e. it is a common entrance for some persons.
4- Only one person can use the toilet at a time.
5- There is a toilet supervisor whose job is to see that persons behave properly.
6- The supervisor gives each of the persons who use that toilet a number so he can recognize them when they come to use the toilet.
7- The higher the number, the higher the chance of the person getting to use the toilet.
8- If a person is using the toilet and another person comes up whose number is bigger, the supervisor kicks the first person out and lets the second person use the toilet. The kicked one gets to use the toilet once the bigger number person is finished. The same fate awaits the second person if another person with an even bigger number comes. In this case, however, the first kicked person has to wait even longer. If the small number person is an unlucky guy, he will find all persons coming with numbers bigger than his, and he never gets to use the toilet or may have to wait too long.
9- Seeing the plight of persons with small numbers, the supervisor decides to help them out. He now puts a lock on each door. A person with a small number now has the option to lock as many doors from inside as he wants.
10- A door is open unless someone is using the toilet and has locked it from inside.
11- Any person can enter if his door is not locked and no higher number person is using the toilet. If a smaller number person is using the toilet when a higher number person comes and the door is not locked, the supervisor kicks the smaller person out and the higher number person uses it. The smaller person can reenter once the higher number person is done.
Now the supervisor is happy. The persons who got small numbers can happily use the toilet if they lock the doors. But the persons using the toilet often don't lock all the doors, so they still get kicked out by a higher number person whose door they didn't lock.
This also causes another problem. Now sometimes the higher number person can't use the toilet because his door is locked by a smaller number person who got kicked out by another person with a number somewhere in the middle of the other two. The highest number person should be using the toilet but his door is locked, so he can't. Instead the middle person is using it. The supervisor doesn't like it, but he can't do anything with the present rules, so he does something else to help out the highest number person in such cases.
12- Now when a smaller number person goes in the toilet and locks a door, and a higher number person comes to that door, the supervisor exchanges the numbers of the two persons for the time while the small number person is using the toilet.
13- Once the person using the toilet is done, the two get their numbers back. The smaller person opens the door and comes out. And the higher number person enters the toilet.
14- If a person with a number somewhere in the middle comes before the (original) small number person finishes, he sees that a bigger number person is using the toilet, since the smaller and higher guys have exchanged their numbers. After that guy is done, the real higher number guy gets his number back and enters the toilet. So the middle guy has to wait for both the other two persons and gets a chance to use the toilet after them.

Now substitute the toilet with the processor, the supervisor with the scheduler, persons with processes, and numbers with priorities. 1- to 8- describe priority-based scheduling with preemption. The unlucky guy at the end of 8- is what is known as process starvation. 9, 10, and 11- constitute what is known as priority inversion.
12, 13, and 14- are examples of priority inheritance.

The analogy isn't perfect though, and there might be some loopholes.
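For those who prefer code to toilets, here is a toy simulation of the same story. It is a deliberately simplified sketch and not how any real scheduler is written: there is a single lock (one door), a task either holds that lock for its whole run or never touches it, and time moves in whole ticks. The Task fields, the simulate function and the DEMO workload are all made up for this example.

```python
from dataclasses import dataclass, replace

@dataclass(eq=False)
class Task:
    name: str
    priority: int      # bigger number wins, as in the analogy
    arrival: int       # tick at which the task shows up
    work: int          # ticks of CPU time still needed
    needs_lock: bool   # True if the task holds the one shared lock while it runs

def simulate(tasks, inheritance):
    """Return the name of the task that ran at each tick ('-' means idle)."""
    tasks = [replace(t) for t in tasks]          # work on copies
    owner = None                                 # task currently holding the lock
    timeline = []
    tick = 0
    while any(t.work > 0 for t in tasks):
        ready = [t for t in tasks if t.arrival <= tick and t.work > 0]
        # Blocked: wants the lock, but somebody else is holding it.
        blocked = [t for t in ready
                   if t.needs_lock and owner is not None and owner is not t]
        runnable = [t for t in ready if t not in blocked]

        def effective(t):
            # Rule 12: the lock holder borrows the number of whoever it blocks.
            if inheritance and t is owner and blocked:
                return max(t.priority, max(b.priority for b in blocked))
            return t.priority

        if not runnable:
            timeline.append("-")
        else:
            current = max(runnable, key=effective)   # preemptive priority pick
            if current.needs_lock:
                owner = current
            current.work -= 1
            if current.work == 0 and owner is current:
                owner = None                         # done: unlock the door
            timeline.append(current.name)
        tick += 1
    return timeline

# Low takes the lock first, High wants the same lock, Mid needs no lock.
DEMO = [
    Task("L", priority=1, arrival=0, work=3, needs_lock=True),
    Task("H", priority=3, arrival=1, work=2, needs_lock=True),
    Task("M", priority=2, arrival=2, work=3, needs_lock=False),
]
```

With simulate(DEMO, inheritance=False) the ticks come out as L L M M M L H H: the middle task runs ahead of the highest one, which is the priority inversion. With simulate(DEMO, inheritance=True) they come out as L L L H H M M M: the low task finishes quickly on the borrowed number, the high task goes next, and the middle one waits for both.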

Thursday, December 21, 2006

Outstanding OpenSolaris questions by James McGovern

James left a comment on one of my earlier posts and suggested I answer some of the outstanding questions he posted on his blog some time back. Though I am not an authority on this, I will try to answer some of them as per my understanding. I was thinking of replying in the comments section but it became too long, so here is a reply to James's comment.

Hi James,
Unfortunately I have little information related to SPARC chips. It's an open architecture, and anyone can see the specification and is free to implement it.
I don't think Sun produces SPARC chips for appliances, as they are a server-focussed company.
Maybe Fujitsu does it. I have heard of SPARC chips in some cameras, but you'd have to do a Google search to find out more.
Regarding OpenSolaris, I believe the OpenSolaris.org community is much better placed to answer those queries. For example, I searched the Xen community list there and it seems they have some working Xen code for OpenSolaris. Of course, Xen itself is not yet complete, so Xen for OpenSolaris will take time. Looking at the activity there, it seems Xen is the future of OpenSolaris virtualization.
Headless/diskless clients under Solaris have been supported for quite some time.
About legal implications of running OpenSolaris, I know of none. You are free to distribute your product with an OpenSolaris distribution as long as the existing files you have taken from the community and modified are open sourced under CDDL. If you've added any new files, you are free to choose whatever license you want for them, as long as that license permits it. CDDL is less viral in this regard.
That's my understanding. I'd suggest throwing these questions at the OpenSolaris list. They'd surely give you a detailed and authoritative reply.

Wednesday, December 20, 2006

Virtualization is where the action is !

Virtualization is the buzzword in the world of operating systems these days. Recently KVM (Kernel-based Virtual Machine) capability was introduced into Linux. When completed, it will make it possible to run Windows (and maybe other OSs too) as a guest OS on top of Linux on the newer Intel and AMD processors that have support for virtualization.
KVM is virtualization specific to Linux. Other virtualization technologies also exist, some of which, like VMware, are very advanced and allow many more OSs as hosts and guests.
Another virtualization technology under development is Xen, which will be a real competitor to VMware as it will support many OSs just like VMware and can be used with older processors as well. Xen is an open source project, unlike VMware, which is proprietary.
Then there's hardware virtualization, which allows one set of hardware to run many OSs. UltraSPARC T1, aka Niagara, is supposed to get Logical Domain (LDom) support in the near future, which will allow one Niagara processor to run many different versions of Solaris simultaneously.
IBM and Sun have had hardware virtualization in their big iron for a long time, but now even smaller machines can have it. Solaris, for example, allows a form of virtualization with Zones, where a machine with Solaris 10 or some OpenSolaris distro can run dozens of virtualized instances of the OS. Each Zone is a secure virtual OS instance on which applications can run, and which can be compromised without compromising other zones in the same system.
With all these different virtualization techniques in Unix, Linux, Mac OS X and even in Windows, the user today is king! What was unthinkable a few years back is now possible thanks to all the advances in technology, be it open or proprietary.
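A quick way to tell whether a processor is one of those newer ones is to look for the virtualization flags in /proc/cpuinfo on Linux. The sketch below assumes the usual flag names, vmx for Intel and svm for AMD, and a flags line of the form "flags : ..."; the function name is made up for this example.

```python
def virtualization_flags(cpuinfo_text):
    """Return the set of hardware virtualization flags found in cpuinfo text."""
    found = set()
    for line in cpuinfo_text.splitlines():
        # Lines of interest look like: "flags\t\t: fpu vme de pse ... vmx ..."
        if line.startswith("flags") and ":" in line:
            flags = line.split(":", 1)[1].split()
            found.update(f for f in flags if f in ("vmx", "svm"))
    return found
```

On a Linux machine, virtualization_flags(open("/proc/cpuinfo").read()) gives an empty set when the CPU (or the BIOS setting) offers no such support.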

Tuesday, December 19, 2006

Sun to release iPod like player !

Heard a rumour that Sun has finished working on a killer mp3 player. It'll be on offer for a free 60 day trial once the tussle between engineers and marketing is resolved. Engineers are opposing marketing people's move to name it Sun Java Secure Media Pocket Player, but they're willing to accept if the name is shortened to SJSMPP as long as no one knows what it stands for after its release.

Well, that was a joke by my insider friend. I wonder if such a player, if ever released, would be able to run on a minimized version of Solaris.

Sunday, December 17, 2006

GNU/Solaris ?

Some time back, The Register had an article titled "Is 'GNU/Solaris' emerging from Microsoft-Novell deal?"
But GNU/Solaris is already there, even with OpenSolaris under CDDL which is not GPL but another open source license. Maybe the reporter didn't do the homework right! Or perhaps he meant something else when he said GNU/Solaris.

ZFS in Mac OS X ?

Seems it wasn't just a rumour. ZFS is going to be in the upcoming Mac OS X! Very cool to know ZFS is being ported to other OSs. It is already being ported to FreeBSD, along with DTrace. Porting ZFS to other OSs is good for them as well as for Solaris and operating systems in general. It gives such great technologies and innovation the visibility they rightly deserve. It also lets users of other OSs experience and use such powerful stuff. That would definitely attract more users to Solaris too, mainly those who have yet to learn how different Solaris 10 and OpenSolaris are from prior releases, and what they are capable of.
The story and screenshot of ZFS in OS X was first broken here:
http://mac4ever.com/news/27485/zettabyte_sur_leopard/

Some blogs discussing it are at:
http://loop.worldofapple.com/archives/2006/12/17/zfs-file-system-makes-it-to-mac-os-x-leopard/
http://rom.feria.name/blog/2006/12/17/zfs-on-mac-os-x-105/
http://colindw.blogspot.com/2006/12/w00t-zfs-on-leopard.html
http://www.c0t0d0s0.eu/archives/2406-Its-official-ZFS-in-Leopard.html

Thursday, December 14, 2006

Linus Torvalds on GPL kernel modules

It's no news that Linus is a very good programmer. There are other aspects of his character that are admirable. On the Linux kernel mailing list today he stressed why it is not good for developers or open source zealots to force people to use software only the way the developers want.
Responding to a suggestion that a time limit be set (12 months was suggested) after which the kernel won't be allowed to load non-GPL-tagged modules, he said users should be allowed to use software the way they want. He tries to make the difference between use and distribution clear.
Software developers can only require people to distribute software the way the developers want. How they use it should be left to the individuals.
Linus says, "There's a big difference between "copy" and "use". It's exatcly the same issue whether it's music or code. You can't re-distribute other peoples music (becuase it's _their_ copyright), but they shouldn't put limits on how you personally _use_ it (because it's _your_ life)."
He further makes it clear that he hates the idea of forcing the GPL way on people:
"In other words, you guys know my stance. I'll not fight the combined opinion of other kernel developers, but I sure as hell won't be the first to merge this, and I sure as hell won't have _my_ tree be the one that causes this to happen.
So go get it merged in the Ubuntu, (Open)SuSE and RHEL and Fedora trees first. This is not something where we use my tree as a way to get it to other trees. This is something where the push had better come from the other direction.
Because I think it's stupid. So use somebody else than me to push your political agendas, please."
Well said!
List archived at:
http://lkml.org/lkml/2006/12/13/370

Live Upgrade

Solaris has a pretty cool technology for upgrading a computer to a later release of the OS. It is called Live Upgrade. It basically works like this:
When you first install Solaris on your computer, you leave some disk space free for the future. It is not a problem since disks are cheap now. The only thing is to remember to set aside some space during the first installation. When at some later time a new release of the operating system comes out and you want to install it without having to shut down your system, you can use Live Upgrade. It really is a live upgrade. No downtime while upgrading. Now how many OSs have such cool stuff!
OK, so you are ready to upgrade. You just make a copy of your existing operating system boot image. It's just a command away, and the empty disk space has the copy of the existing Solaris. Another command and the copy gets upgraded to whatever newer release you have. Once the upgrade is over, simply set the newly upgraded space as the boot option, and just one reboot after this you are running the latest bits of the OS. See? The downtime is just one reboot. All the time the system was upgrading you were using it while the upgrade was going on in the background. It just made your system a bit slower, that's all!
Though an individual can afford to waste a couple of hours upgrading the system by shutting it down, data centers don't have such luxury. That's why they use Live Upgrade. The downtime when they want to move to the latest OS is just one reboot. It has an additional advantage. If for some reason the upgrade fails and you can't reboot into the newly upgraded partition, just revert to the old working disk partition as your boot OS and it will work fine!
