Showing posts with label Unix. Show all posts

Thursday, July 23, 2009

Redirection and 2>&1

In Unix, one often finds the input and output redirection operators "<", ">" and ">>". These are simple to understand.

Users can also redirect one stream into another. This is done using the standard file descriptors: 2 is the default file descriptor for stderr and 1 for stdout. To send standard error wherever standard output is going, add 2>&1 at the end of the command. If a command redirects stdout to, say, some file, stderr still goes to the screen by default. If you want stderr to be sent to that file as well, use 2>&1.
Similarly, you can redirect stdout to wherever stderr is going with 1>&2.

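Also, remember that "some unix command > some_file 2>&1" is different from "some unix command 2>&1 > some_file". The shell processes redirections from left to right. In the first form stdout is pointed at some_file first, so 2>&1 sends stderr to some_file too. In the second form 2>&1 copies the current destination of stdout (normally the terminal) before stdout is moved, so stderr still goes to the terminal and only stdout lands in some_file.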
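
The difference in ordering can be checked with a small sketch. The emit function and the file names are made up for illustration, and the outer braces in the second case only stand in for the terminal so that the stray stderr line can be captured.

```shell
# Hypothetical helper: writes one line to stdout and one to stderr.
emit() {
    echo "to stdout"
    echo "to stderr" >&2
}

tmp=$(mktemp -d)

# Form 1: stdout is pointed at the file first, then stderr follows it.
emit > "$tmp/form1.txt" 2>&1

# Form 2: stderr is pointed at the current stdout, then stdout moves to the file.
# The braces redirect the "terminal" to a file so we can see what leaked there.
{ emit 2>&1 > "$tmp/form2.txt"; } > "$tmp/terminal.txt"

form1=$(cat "$tmp/form1.txt")
form2=$(cat "$tmp/form2.txt")
terminal=$(cat "$tmp/terminal.txt")
rm -rf "$tmp"

echo "form 1 file contains:   $form1"
echo "form 2 file contains:   $form2"
echo "form 2 terminal showed: $terminal"
```

In the first form both lines end up in the file. In the second form only the stdout line does.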

Saturday, July 11, 2009

A tutorial on setuid and sticky bits in Unix

Unix users are sometimes confused by setuid and sticky bit permissions on files and directories. Below is a small tutorial on the most common uses of the setuid and sticky bits.

There are certain files or devices that are writable only by root. Nevertheless, ordinary users often have to run root-owned programs that write to those files. Since these files are writable only by root, how can a non-root user run a program that writes to them? The solution in Unix is the setuid bit. When this bit is set on a root-owned program, the program gets the effective privileges of root even when run by a non-root user. This happens only for setuid programs, i.e. programs whose setuid bit has been set by the owner or by root. An example is the passwd program, which modifies the password files that are writable only by root. Since passwd is setuid, any non-root user can run it and change his/her own password. Another example is the ping program, which is setuid because ordinary users run it and it needs access to network devices.

In the permission listing printed by "ls -l", setuid programs have an s in place of the owner's x. This means the program is both executable by the owner and setuid. Another possibility is S in place of x, which means the setuid bit is set but the owner has NO execute permission. The program permissions could look like

rwsr-xr-x ( executable by owner and setuid; what matters for the setuid bit is the third letter)

OR

rwSr-xr-x ( setuid but not executable by the owner; hmmm... does it make sense!?)

One can make a program setuid by

chmod u+s prog_name

OR

chmod 4755 prog_name
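
As a rough sketch, the s and S letters can be reproduced on a throwaway file. The script below is only an illustration: the file is a dummy shell script in a temporary directory, not a real setuid program, and it assumes the filesystem lets an ordinary user set the bit on his/her own file.

```shell
tmp=$(mktemp -d)
prog="$tmp/prog_name"
printf '#!/bin/sh\necho hello\n' > "$prog"

# setuid plus rwxr-xr-x: the owner's x is shown as s.
chmod 4755 "$prog"
with_x=$(ls -l "$prog" | cut -c1-10)

# Remove the owner's execute permission: the s becomes S.
chmod u-x "$prog"
without_x=$(ls -l "$prog" | cut -c1-10)

rm -rf "$tmp"

echo "setuid and executable:     $with_x"
echo "setuid but not executable: $without_x"
```

The first ten characters of the listing include the leading file-type character, so the setuid letter appears in the fourth position there.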

Although the explanation above was specific to root-owned files, it applies to any owner. So any user can make a program he/she owns setuid and let others modify some of the owner's files through this program.

Another related special permission for a program is the setgid bit. It works like setuid, except that the program runs with the effective group ID of the file's group rather than the effective user ID of the file's owner. The setuid and setgid bits have different meanings when applied to directories. The setgid bit on a directory "d" means that any file or directory created under it gets the group ID of "d". Remember that normally the group ID of a newly created file or directory is the group ID of the user who created it.

Yet another special permission is the sticky bit. Nowadays it is mainly used for directories. Let us first understand what directory permissions mean. Some directories are writable by all users. That means all users can create as well as delete files inside such directories. Execute permission on a directory means search permission, so it is needed to descend into that directory.

An example of a directory with the sticky bit set is /tmp. This directory stores temporary data created by user programs. Since such directories are writable by all, any user could delete any file in them, even files owned by others! To fix this, such directories have the sticky bit set. Now a file in such a directory can be deleted or renamed only by the file's owner, the directory's owner, or root.

Directories with the sticky bit set have the letter t in place of the execute permission for others, as seen in the output of "ls -l".

rwxrwxrwt ( sticky bit is set and execute permission for others)

OR

rwxrwxrwT ( sticky bit set but no execute permission for others)

Capital T means that others don’t have execute permission for that directory, so they can’t search into it.
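
The t and T letters can be seen on a scratch directory. This is a minimal sketch; the directory name "shared" is made up, and 1777 is the mode commonly seen on /tmp.

```shell
tmp=$(mktemp -d)
dir="$tmp/shared"
mkdir "$dir"

# World-writable plus sticky bit: others' x is shown as t.
chmod 1777 "$dir"
with_x=$(ls -ld "$dir" | cut -c1-10)

# Remove execute (search) permission for others: the t becomes T.
chmod o-x "$dir"
without_x=$(ls -ld "$dir" | cut -c1-10)

rm -rf "$tmp"

echo "sticky, searchable by others:     $with_x"
echo "sticky, not searchable by others: $without_x"
```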

update: Added setgid directory explanation.


Tuesday, May 19, 2009

How Unix shell executes commands

A Unix shell runs every external command in a newly forked child process (builtins such as cd run inside the shell itself). The shell in which the command is invoked becomes the parent and creates a child process, which starts out as a copy of the shell. The child in turn exec's the command, overlaying itself with the command's image. Why is a new process created for executing the command? Can't the shell run the command in the same process? The answer is:
If the shell overlaid itself with the command's process image, there would be nothing to return to when the command finished, because the shell would be gone. Running "exec command_name" does exactly this, i.e. it replaces the shell with the command, so the shell's process ends when the command exits. To avoid losing the invoking shell, the shell runs all commands in a new child process.
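
The effect of exec can be observed without risking the current shell by running the experiment inside a throwaway "sh -c". This is only a sketch; date is an arbitrary external command chosen for illustration.

```shell
# Normal case: date runs in a forked child, so the shell survives
# and goes on to run the echo.
forked=$(sh -c 'date > /dev/null; echo "shell still here"')

# exec case: the shell overlays itself with date, so the echo is never reached.
replaced=$(sh -c 'exec date > /dev/null; echo "shell still here"')

# exec does not create a new process: the PID printed before the exec
# is the same as the PID seen by the program that replaced the shell.
pids=$(sh -c 'echo $$; exec sh -c "echo \$\$"')
pid_before=$(echo "$pids" | sed -n 1p)
pid_after=$(echo "$pids" | sed -n 2p)

echo "after a normal command: '$forked'"
echo "after exec:             '$replaced'"
echo "PID before exec: $pid_before, PID after exec: $pid_after"
```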

We can use the "strace" utility to observe the system calls for child process creation and execution. Note that running strace on the command itself only traces the command after the fork, i.e. the output starts with execve(...), followed by other syscalls, and ends with some exit call. This is because we are tracing only the command, which already runs in the child process. To see how the shell creates the child process with fork() (the clone() family on Linux), attach strace to the current shell from a separate terminal and then run a command in the current shell.

PS: edited off the 'subshell' part as that is a different beast altogether...

Sunday, June 10, 2007

Links and symlinks - Unix and Windows

Hard links in Unix are directory entries that have different names, and possibly live in different directories, but refer to the same inode, i.e. the file is stored in only one place on the hard disk. All the hard links to a file point to that location. One can delete a hard link, but the file itself is not deleted as long as there is any other link to it.

Symbolic links or symlinks, on the other hand, are small files that contain a pointer (a path) to another file. They are distinct from the file they point to, so deleting a symbolic link won't delete the actual file. The implementation of symbolic links in Unix is transparent to the user. If a user opens and edits a symbolic link, he is actually editing the file the symbolic link points to. The symbolic link remains just a pointer to the actual file.
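
A short sketch of both kinds of link, using made-up file names in a temporary directory. It shows a hard link sharing the inode of the original name, an edit through a symlink landing in the target file, and what happens to each link once the original name is removed.

```shell
tmp=$(mktemp -d)
echo "original data" > "$tmp/file.txt"

ln "$tmp/file.txt" "$tmp/hard.txt"      # hard link: another name for the same inode
ln -s "$tmp/file.txt" "$tmp/soft.txt"   # symlink: a small file holding a path

inode_file=$(ls -i "$tmp/file.txt" | awk '{print $1}')
inode_hard=$(ls -i "$tmp/hard.txt" | awk '{print $1}')

# Writing through the symlink changes the file it points to.
echo "edited via symlink" > "$tmp/soft.txt"
after_edit=$(cat "$tmp/file.txt")

# Remove the original name: the data survives through the hard link,
# while the symlink is left pointing at a name that no longer exists.
rm "$tmp/file.txt"
hard_after_rm=$(cat "$tmp/hard.txt")
if [ -e "$tmp/soft.txt" ]; then soft_state="resolves"; else soft_state="dangling"; fi

rm -rf "$tmp"

echo "inode of file.txt: $inode_file, inode of hard.txt: $inode_hard"
echo "file.txt after editing soft.txt: $after_edit"
echo "hard.txt after rm file.txt:      $hard_after_rm"
echo "soft.txt after rm file.txt:      $soft_state"
```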

Windows has shortcuts, which are the nearest thing to symbolic links, but if someone edits a shortcut file, the shortcut itself gets changed, so it is not as transparent to the user as in Unix. I read somewhere that Windows Vista has introduced transparent symbolic links similar to Unix.

Tuesday, June 5, 2007

Creating a dynamic library - example

We all use library functions in the programs we write. An example of a library that is always used in Solaris and Unix-like operating systems is libc.so. But how does one create a library? It is not hard. A dynamic library can be easily created as shown in the following example.

Let's say we want to create a library called libgeek.so. It will contain an example function called my_library_func() that we will use in our program. We will create a simple source file called geek.c that has the function we want. We will compile this as a library and call it libgeek.so (library names begin with lib) :

$ cat geek.c

#include <stdio.h>

/* Example function exported by libgeek.so */
void my_library_func(void)
{
    printf("Inside my library function\n");
}

The above is the library function we wanted to create. We then compile it into a dynamic library by giving the -G option to the compiler (this is the Sun compiler's option; with GCC the usual equivalent is -shared -fPIC) :

$ cc -o libgeek.so -G geek.c

Now, we can use the generated library libgeek.so in our programs like:

$ cat hellolibrary.c

/* Declaration of the function provided by libgeek.so */
void my_library_func(void);

int main(void)
{
    my_library_func();
    return 0;
}

Now, we can compile our program and tell the linker to link to the library we created for my_library_func() :

$ cc hellolibrary.c -L/home/osgeek -R/home/osgeek -lgeek

-L and -R tell the linker which path to search at link time and at run time, respectively, to find the library libgeek.so. The library is named on the command line with the "lib" prefix and the ".so" suffix removed and "-l" prefixed, giving "-lgeek".

When we run this program, the output would look like:

$ ./a.out
Inside my library function

That's it. We created a library and used it in a program.

Sunday, May 27, 2007

No archives (*.a ) in Solaris anymore

While discussing static libraries in one of my previous posts, I commented that libm is provided both as a dynamic library ( libm.so ) and as a static archive ( libm.a ).

Well, that is not true for Solaris anymore. Solaris 10 doesn't ship with a single static library.
Doing
ls -la | grep '\.a$'
in /usr/lib, where libraries are usually present, returned no results. I tried some more directories with the same result.
I don't know when static libraries were dropped from Solaris. My guess is that it was Solaris 10, but any pointers to information would be welcome.
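
The pattern matters when searching like this: unquoted, *.a may be expanded by the shell before grep sees it, and as a regular expression it does not mean "ends in .a" anyway. Here is a minimal sketch of a search that does match archive names, using made-up files in a temporary directory rather than /usr/lib.

```shell
tmp=$(mktemp -d)
touch "$tmp/libfoo.a" "$tmp/libbar.so" "$tmp/data.txt"

# Quoted regular expression: a literal dot followed by "a" at the end of the name.
archives=$(ls "$tmp" | grep '\.a$')

rm -rf "$tmp"

echo "static archives found: $archives"
```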

Saturday, May 19, 2007

Static linking : library options in command line

In my last post I asked why it's advised that library options be the last in the command line in case of static linking.
Here is the explanation:
The files and libraries on the command line are processed from left to right.
Static linking searches a static library only for the symbols that are still "undefined" at the moment the library is processed.
Now in case of

cc -lfoo hello.c

there are no undefined symbols yet when libfoo.a gets processed, so nothing gets extracted from it. When the object file for hello.c is processed afterwards, its references to functions in libfoo remain unresolved, because the linker does not go back to the archive, and it gives an "Undefined symbol" error.
If hello.c is put before -lfoo as in

cc hello.c -lfoo

there are undefined symbols when libfoo gets processed, so the archive members that define them get extracted. This works fine.
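
The left-to-right rule can be mimicked with a toy model. This is not a real linker, just a sketch under the assumption that hello.o needs a single symbol "foo" which libfoo.a can supply. The link function and its messages are invented for illustration.

```shell
# Toy model of left-to-right static linking (not a real linker).
link() {
    undefined=""
    for item in "$@"; do
        case "$item" in
            hello.o)
                # An object file is always included; remember what it needs.
                undefined="foo"
                ;;
            -lfoo)
                # An archive member is extracted only if something
                # seen so far still needs the symbol it defines.
                if [ "$undefined" = "foo" ]; then
                    undefined=""
                fi
                ;;
        esac
    done
    if [ -n "$undefined" ]; then
        echo "Undefined symbol: $undefined"
    else
        echo "linked OK"
    fi
}

library_first=$(link -lfoo hello.o)
library_last=$(link hello.o -lfoo)

echo "library before object: $library_first"
echo "library after object:  $library_last"
```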

Dynamic linking doesn't have this issue as all symbols are available through the virtual address space of the output file.
Static libraries have other issues, like bigger executable size and no benefit from a stable ABI (the application needs to be relinked with each new version of the library).
One advantage of static libraries is that executables linked against them are somewhat faster at runtime, because all the linking occurs before load time. This helps in benchmarking. The math library libm is provided as a shared object (libm.so) as well as a static library (archive libm.a), since benchmarks make heavy use of this library.

Friday, May 18, 2007

quirk of static linking

A question related to linking today.
Why is it advised to put the library options at the end of command line for compilation?

Hint: Suppose we have a static library, say libfoo.a, which we want to link to our program hello.c. We are advised to write

cc hello.c -lfoo
rather than
cc -lfoo hello.c

The -l option tells the compiler to link with the library [lib]foo. Note that "lib" from libfoo is dropped and only the "foo" part is given with -l.

Wednesday, January 3, 2007

Memory Overcommit and the OOM Killer

Linux has a feature called memory overcommit. Put simply, it means the kernel hands out memory even if it doesn't have enough to back it. A typical case is the creation of a new process using fork(). This logically copies the parent's address space, and so in the worst case requires twice the parent process's memory once the new process (child) is created. With memory overcommit, fork() can return success even if there is not enough memory for the new child process!
The idea behind the memory overcommit feature of Linux is that the child process rarely uses all the memory allocated to it. fork() is usually followed by exec(), which overlays the child's address space with some executable. The copied address space is therefore discarded almost immediately, and the parent process (which typically goes into wait() after creating the child) resumes once the child exits.
If the memory cannot be found when the child actually needs it, the kernel invokes the Out Of Memory (OOM) killer. Its job is to select a process to kill so that the memory requirements can be satisfied. Not a very desirable feature, but it is necessary to keep the memory overcommit feature of Linux. This made the OOM killer infamous. How to select a process to kill is tricky. It might happen that some important process (e.g. a database) gets killed by the OOM killer. Analogies like this show how serious the situation is when the killer is invoked.
It seems that during 2.4, the OOM killer's favourite process to kill was the Netscape browser. The browser would crash all of a sudden and you'd have no idea why.
Memory overcommit along with the OOM killer is not an example of good design, but it has even made its way into AIX. With 2.6 the memory overcommit feature can be suppressed using sysctl variables (vm.overcommit_memory), but by default the feature is present.
Fortunately, memory overcommit doesn't exist in Solaris. Solaris never used it. At first it relied on vfork() instead of fork() to prevent the failure of process creation. In Solaris 10, posix_spawn() is used instead of vfork() since vfork() is not MT-safe.
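
On a Linux machine the current overcommit policy can be read from /proc. The sketch below assumes the usual meaning of the three values of vm.overcommit_memory and simply reports "unavailable" on systems without that file.

```shell
f=/proc/sys/vm/overcommit_memory

# The file exists only on Linux; fall back gracefully elsewhere.
if [ -r "$f" ]; then
    mode=$(cat "$f")
else
    mode="unavailable"
fi

case "$mode" in
    0) desc="heuristic overcommit (the default)" ;;
    1) desc="always overcommit" ;;
    2) desc="strict accounting, no overcommit" ;;
    *) desc="no readable $f on this system" ;;
esac

echo "vm.overcommit_memory = $mode: $desc"
```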

Monday, December 11, 2006

Microsoft Unix

It sounds funny now, but Microsoft once actually had the most widely installed Unix base. Its version of Unix was called Xenix and it was distributed in the 80's by many vendors. What happened to it since then? Well, Microsoft sold it to SCO and moved on to develop OS/2 with IBM and then Windows NT.

Wikipedia has some interesting tidbits of information about Xenix here.
What did it look like back in the 80's? Here is a screenshot from Wikipedia.

System V release 4, the standard for Unix today, was formed by merging SunOS, BSD, Xenix, and System V.

Of course, the legacy of Microsoft Xenix is still around. But where should one look to see the history of Unix? All the flavours of Unix are closed source, or are they? Thanks to the open sourcing of Solaris, we can now take a look into all the real Unix code and find some gems of copyright notices that silently narrate the history of Unix development.

For example, to see how the development of Unix has passed on from the University of California at Berkeley to AT&T and Microsoft to Sun Microsystems, have a look at this tar code.

Such is the beauty of Unix. Decades older than any other present-day OS and still holding its own in the modern world. Not only that, it often manages to beat the others at their own game, and at other times comes out with innovations that are the envy of the youngsters. It has even spawned dozens of clones which are cool in their own way. Ubuntu, anyone?

Me? I'm happy with my good ol' Unix. Solaris, that is. For me.
