Tuesday, March 31, 2009

linux: semget across the processes

Recently I wrote a post on whether a semaphore handle obtained with sem_open is valid in child processes.
To complete the picture, I'd like to describe whether a semaphore handle obtained with System V IPC's semget can be passed to child processes.

When you call semget with some key and the call succeeds, you get a semaphore id that is valid not only within the process tree but system-wide.
So there shouldn't be any problem with child processes using a semaphore handle obtained in the parent process. Let's glance at the code below.

#include <unistd.h>
#include <stdio.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>
#include <errno.h>
#include <string.h>

union semun
{
 int              val;    /* Value for SETVAL */
 struct semid_ds *buf;    /* Buffer for IPC_STAT, IPC_SET */
 unsigned short  *array;  /* Array for GETALL, SETALL */
};

int main()
{
 struct sembuf op[1];
 union semun ctrl;
 int sem = semget(0xF00B00, 1, IPC_CREAT|0600);
 if (sem == -1)
 {
  perror("semget");

  return 1;
 }

 op[0].sem_num = 0;
 op[0].sem_flg = 0;

 ctrl.val = 1;
 if (semctl(sem, 0, SETVAL, ctrl) == -1)
 {
  perror("semctl");

  return 1;
 }

 switch(fork())
 {
  case -1:
   perror("fork()");

   return 1;

  case 0:
  {
   printf("Child %u waiting for semaphore(%d)...\n",getpid(), sem);
   op[0].sem_op = 0;
   semop(sem, op, 1);
   printf("Child: Done\n");

   return 0;
  }
 }

 sleep(1);

 printf("Parent %u setting semaphore(%d)...\n",getpid(),sem);
 op[0].sem_op = -1;
 semop(sem, op, 1);
 printf("Parent: Set\n");

 return 0;
}
Here a semaphore is created with an initial value of 1. Next, a child process is created which waits on the semaphore until its value becomes zero (a semop with sem_op set to 0 blocks until then). The child finishes as soon as the parent process decrements the value of the semaphore. The output of this test application would be
$ ./test 
Child 3353 waiting for semaphore(0)...
Parent 3352 setting semaphore(0)...
Child: Done
Parent: Set
With ipcs we can look at the system semaphores:
$ ipcs -s

------ Semaphore Arrays --------
key        semid      owner      perms      nsems     
0x00f00b00 0          niam      600        1 
You can see that a semaphore with key 0x00f00b00 and id 0 is present in the system.
Since the semaphore is system-wide and can be accessed by its id, the call to semget may be omitted if we know that the semaphore already exists and its id is known. The following code snippet is the same as the previous one, except that there's no call to semget and the value of sem is hardcoded to 0 (the id of that semaphore in our case):
int main()
{
 struct sembuf op[1];
 union semun ctrl;
 int sem = 0;

 op[0].sem_num = 0;
 op[0].sem_flg = 0;

 ctrl.val = 1;
 if (semctl(sem, 0, SETVAL, ctrl) == -1)
 {
  perror("semctl");

  return 1;
 }

 switch(fork())
 {
  case -1:
   perror("fork()");

   return 1;

  case 0:
  {
   printf("Child %u waiting for semaphore(%d)...\n",getpid(), sem);
   op[0].sem_op = 0;
   semop(sem, op, 1);
   printf("Child: Done\n");

   return 0;
  }
 }

 sleep(1);

 printf("Parent %u setting semaphore(%d)...\n",getpid(),sem);
 op[0].sem_op = -1;
 semop(sem, op, 1);
 printf("Parent: Set\n");

 return 0;
}
The output is
$ ./test 
Child 3366 waiting for semaphore(0)...
Parent 3365 setting semaphore(0)...
Child: Done
Parent: Set
Here, as before, the child process waits on the semaphore until the parent process decrements its value to 0.
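
One consequence of the id being system-wide is that the semaphore stays around after the test program exits, as the ipcs output showed. Here is a minimal sketch of how the value could be inspected and the set removed from code. The helper names (sem_make, sem_value, sem_add, sem_remove) and the union name sem_arg are made up for this example, and IPC_PRIVATE is used instead of the 0xF00B00 key so that the sketch doesn't touch an existing semaphore.

```c
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/* Same layout as the semun union above; named differently so it
   cannot clash with systems whose headers already define semun. */
union sem_arg {
    int              val;
    struct semid_ds *buf;
    unsigned short  *array;
};

/* Create a private one-element set and give it an initial value.
   Returns the semaphore id, or -1 on error. */
int sem_make(int initial)
{
    union sem_arg arg;
    int id = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);

    if (id == -1)
        return -1;

    arg.val = initial;
    if (semctl(id, 0, SETVAL, arg) == -1) {
        semctl(id, 0, IPC_RMID);   /* don't leave a half-made set behind */
        return -1;
    }
    return id;
}

/* Current value of semaphore 0 in the set, or -1 on error. */
int sem_value(int id)
{
    return semctl(id, 0, GETVAL);
}

/* Add delta (may be negative) to semaphore 0; blocks the way semop does. */
int sem_add(int id, short delta)
{
    struct sembuf op = { .sem_num = 0, .sem_op = delta, .sem_flg = 0 };

    return semop(id, &op, 1);
}

/* Remove the set from the system; otherwise it outlives the process. */
int sem_remove(int id)
{
    return semctl(id, 0, IPC_RMID);
}
```

A process forked after sem_make can pass the same id to sem_add and sem_value. From the shell, a leftover set can also be removed with ipcrm -s followed by the semid that ipcs prints.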

If you look into the code of semget you'll find that it makes the semget syscall. In ipc/util.c in the Linux kernel tree you should be able to find the ipcget function, which calls the ipcget_public routine; that routine checks the key and, under certain circumstances, creates a new semaphore set with a new id. The id is system-wide, so once the semaphore has been created you can access it from any process that has the appropriate permissions.

Friday, March 27, 2009

linux: automatic login

The era of one laptop per person has almost arrived. A laptop is something personal, like a toothbrush: you are the only owner.
I don't really worry about who is using my laptop, because I'm the only one who does. So logging in each time I power it on seemed strange to me, and I switched gdm (the GNOME display manager) to autologin mode. Time passed and I realized that the only purpose gdm served was to log me in to my session automatically. "What the hell?" I said to myself. Why do I have to spend my laptop's resources on something that I don't really need (yep, I prefer to spend memory and CPU on emacs and gcc)?

So I found a nice way to log in to an X session automatically without any display manager.
It's really simple.
Edit the /etc/inittab configuration file and replace the line which starts the display manager with a command that launches an X session for your user. For me it is

x:5:once:/bin/su <user> -l -c "/bin/bash --login -c 'ck-launch-session startx' &>/dev/null"

For distributions that start the DM another way (like Gentoo, which starts it from a startup script) you can replace the line which starts the DM with
/bin/su <user> -l -c "/bin/bash --login -c 'ck-launch-session startx' &>/dev/null"
or disable the startup script and modify inittab as shown above.

Automatic login is also possible on virtual consoles:
c1:2345:respawn:/sbin/mingetty -n -l <user> vc/1 linux
or
c1:2345:respawn:/sbin/agetty -n -l <external program> 38400 vc/1 linux
where the external program performs the login for you. The program can be quite simple: it just execs 'login' for your user.
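
The core of such an external program might look like the sketch below. The function name exec_autologin is made up for this example, and the sketch assumes that your login binary accepts -f to skip authentication for a preauthenticated user; check login(1) on your system before relying on it.

```c
#include <stdio.h>
#include <unistd.h>

/* Replace the current process with "login -f <user>".
   On success this never returns; -1 means the exec itself failed. */
int exec_autologin(const char *login_path, const char *user)
{
    execl(login_path, "login", "-f", user, (char *)NULL);
    perror("execl");   /* reached only when execl fails */
    return -1;
}
```

A real wrapper would do nothing more than call this from main with the path to login (commonly /bin/login) and your user name, and agetty would be pointed at the compiled binary with -l.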

Tuesday, March 24, 2009

*nix: rw-lock with semaphores

The RW-lock is a widely used synchronization technique. It lets multiple readers pass the barrier as long as there are no writers, and only one writer at a time.
This can speed up an application considerably if you expect more readers than writers and the number of readers is high (some heuristic is needed here: if you have few readers and each reader holds the lock only briefly, an rw-lock might just add overhead).

You will likely find the rw-lock mechanism in POSIX threads (pthread_rwlock_init, pthread_rwlock_rdlock, pthread_rwlock_wrlock, pthread_rwlock_unlock). As pthreads is a POSIX interface, it should be available on any POSIX-compliant system.

When we talk about process synchronization, we more likely think of semaphores.
There is no such thing as an rw-lock semaphore (though some systems do have one).
Of course you can still set PTHREAD_PROCESS_SHARED on a pthread rw-lock with pthread_rwlockattr_setpshared. But if you still want to use semaphores ...

Since an rw-lock is operated by two types of tasks, two semaphores are needed: a binary semaphore for the writer and a counting semaphore for the readers.

When a reader tries to get the lock, it checks that there are no writers. If nobody is writing, the reader increments the counting semaphore and proceeds. On 'unlock' the reader just decrements the counting semaphore and that's all.
When a writer tries to 'lock', it takes (decrements) the binary semaphore and waits until all readers finish their work. When there are no readers left, the writer proceeds. On 'unlock' the writer releases (increments) the binary semaphore.

Look at the implementation below:

#include <semaphore.h>
#include <errno.h>
#include <unistd.h>  /* usleep */

struct rwlock {
    sem_t rlock;
    sem_t wlock;
};

void init_rwlock(struct rwlock *rw);
void destroy_rwlock(struct rwlock *rw);
void rlock(struct rwlock *rw);
void runlock(struct rwlock *rw);
void wlock(struct rwlock *rw);
void wunlock(struct rwlock *rw);

void init_rwlock(struct rwlock *rw)
{
    sem_init(&rw->rlock, 1, 0);
    sem_init(&rw->wlock, 1, 1);
}

void destroy_rwlock(struct rwlock *rw)
{
    wlock(rw);

    sem_destroy(&rw->rlock);
    sem_destroy(&rw->wlock);
}

void rlock(struct rwlock *rw)
{
    sem_wait(&rw->wlock);

    sem_post(&rw->rlock);

    sem_post(&rw->wlock);
}
void wlock(struct rwlock *rw)
{
    int readers = 0;

    sem_wait(&rw->wlock);

    do {
        if (sem_getvalue(&rw->rlock, &readers) != -1) {
            if (readers == 0) {
                break;
            }
        }

        usleep(1);
    } while (1);
}

void runlock(struct rwlock *rw)
{
    do {
        if (sem_trywait(&rw->rlock) == -1) {
            if (errno != EAGAIN) {
                continue;
            }
        }
        break;
    } while (1);
}

void wunlock(struct rwlock *rw)
{
    sem_post(&rw->wlock);
}
This interface is used by calling the pairs rlock/runlock and wlock/wunlock. Check out the next piece of code:
#include <stdio.h>
#include <string.h>    /* strerror */
#include <sys/types.h>
#include <sys/wait.h>  /* wait */
#include <unistd.h>
#include <errno.h>
#include <sys/mman.h>
int *counter;
struct rwlock *rw;

void reader(void)
{
    do {
        rlock(rw);
        printf(">>>>reader: counter %d\n", *counter), fflush(NULL);
        if (*counter == 16777216) {
            runlock(rw);
            break;
        }
        usleep(1);
        runlock(rw);
    } while(1);
}

void writer(void)
{
    do {
        wlock(rw);
        if (*counter >= 16777216) {
            /* another writer already reached the limit */
            wunlock(rw);
            break;
        }
        ++*counter;
        printf(">>>>writer: new counter %d\n", *counter), fflush(NULL);
        wunlock(rw);
    } while (1);
}
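int main(int argc, char **argv)
{
    pid_t children[100];
    int i;

    /* anonymous mappings take -1 as the file descriptor */
    counter = mmap(NULL, sizeof(int), PROT_READ|PROT_WRITE, MAP_SHARED|MAP_ANONYMOUS, -1, 0);
    rw = mmap(NULL, sizeof(struct rwlock), PROT_READ|PROT_WRITE, MAP_SHARED|MAP_ANONYMOUS, -1, 0);
    if (counter == MAP_FAILED || rw == MAP_FAILED) {
        perror("mmap");
        return 1;
    }

    *counter = 0;

    init_rwlock(rw);

    for (i=0;i<100;++i) {
        children[i] = fork();
        if (children[i] == -1) {
            printf("Problem with creating %d child: '%s'\n", i, strerror(errno)), fflush(NULL);
        }else if (children[i] == 0) {
            if (i%3 == 0) {
                writer();
            }else {
                reader();
            }
            /* a child must not fall through and fork children of its own */
            return 0;
        }
    }

    /* wait for every child, not just the first one to finish */
    while (wait(NULL) > 0)
        ;

    return 0;
}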
Here 100 processes are started, about a third of which are writers. You can see that any number of readers is permitted here. This could become a bottleneck if a writer has to wait too long. The pthreads rw-lock suffers from the same disease.
To limit the readers, the rlock function should be modified to check the number of readers currently holding the lock. With a reader limit, the rw-lock on semaphores will look something like:
struct rwlock {
    sem_t rlock;
    sem_t wlock;
    int readers;
};

void init_rwlock(struct rwlock *rw, int readers);
void destroy_rwlock(struct rwlock *rw);
void rlock(struct rwlock *rw);
void runlock(struct rwlock *rw);
void wlock(struct rwlock *rw);
void wunlock(struct rwlock *rw);

void init_rwlock(struct rwlock *rw, int readers)
{
    sem_init(&rw->rlock, 1, 0);
    sem_init(&rw->wlock, 1, 1);
    rw->readers = readers;
}

void destroy_rwlock(struct rwlock *rw)
{
    wlock(rw);

    sem_destroy(&rw->rlock);
    sem_destroy(&rw->wlock);
}

void rlock(struct rwlock *rw)
{
    int readers;
    do {
        sem_wait(&rw->wlock);
        if (sem_getvalue(&rw->rlock, &readers) != -1) {
            if (readers < rw->readers) {

                sem_post(&rw->rlock);

                sem_post(&rw->wlock);

                break;
            }
        }
        sem_post(&rw->wlock);

        usleep(1);
    } while (1);
}

void wlock(struct rwlock *rw)
{
    int readers = 0;

    sem_wait(&rw->wlock);

    do {
        if (sem_getvalue(&rw->rlock, &readers) != -1) {
            if (readers == 0) {
                break;
            }
        }

        usleep(1);
    } while (1);
}

void runlock(struct rwlock *rw)
{
    do {
        if (sem_trywait(&rw->rlock) == -1) {
            if (errno != EAGAIN) {
                continue;
            }
        }
        break;
    } while (1);
}

void wunlock(struct rwlock *rw)
{
    sem_post(&rw->wlock);
}
Semaphores are a really powerful tool for building different synchronization schemes.
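
As one more illustration, here is a different scheme from the one above, a sketch that avoids the usleep polling. The counting semaphore starts at the reader limit and counts free reader slots; a reader takes one slot, and a writer takes all of them. The names (slot_rwlock, slot_rlock and so on) and the MAX_READERS value are made up for this example.

```c
#include <errno.h>
#include <semaphore.h>

#define MAX_READERS 8   /* hypothetical reader limit */

struct slot_rwlock {
    sem_t slots;    /* counting: free reader slots */
    sem_t writer;   /* binary: only one writer collects slots at a time */
};

/* sem_wait that retries when interrupted by a signal */
static void wait_sem(sem_t *s)
{
    while (sem_wait(s) == -1 && errno == EINTR)
        ;
}

int slot_rwlock_init(struct slot_rwlock *rw)
{
    if (sem_init(&rw->slots, 1, MAX_READERS) == -1)
        return -1;
    if (sem_init(&rw->writer, 1, 1) == -1) {
        sem_destroy(&rw->slots);
        return -1;
    }
    return 0;
}

/* A reader occupies one slot; it sleeps if all slots are taken. */
void slot_rlock(struct slot_rwlock *rw)
{
    wait_sem(&rw->slots);
}

void slot_runlock(struct slot_rwlock *rw)
{
    sem_post(&rw->slots);
}

/* A writer occupies every slot, so no reader can be inside. */
void slot_wlock(struct slot_rwlock *rw)
{
    int i;

    wait_sem(&rw->writer);        /* keeps two writers from splitting the slots */
    for (i = 0; i < MAX_READERS; ++i)
        wait_sem(&rw->slots);     /* sleeps until a slot is free */
}

void slot_wunlock(struct slot_rwlock *rw)
{
    int i;

    for (i = 0; i < MAX_READERS; ++i)
        sem_post(&rw->slots);
    sem_post(&rw->writer);
}
```

The trade-off is that the writer pays MAX_READERS semaphore operations per lock and unlock, so this only makes sense for a small reader limit. As with the version above, the struct has to live in shared memory if the lock is used across processes.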

Tuesday, March 17, 2009

c++: initialization of the virtual base class

Recently I ran into a problem with virtual base classes that I had never seen before.
I hadn't hit it earlier because I had never used a non-default constructor in a VBC (virtual base class).
According to the standard, the most-derived class is responsible for constructing the VBC.
You might think that this is not a problem, but let's look at the code snippet below:

#include <iostream>
#include <string>

using namespace std;

class VBC
{
  public:
    VBC(const string &i) : i(i) {}
  protected:
    string i;
};

class C : virtual public VBC
{
  public:
    C() : VBC("C") {}
};

class D : public C
{
  public:
    D() : VBC("D") {}
    void print() {cout << i << endl;}
};

int main(int argc, char **argv)
{
    D d;
    d.print();

    return 0;
}
First of all, you might notice that class D must have a user-defined constructor that calls the constructor of VBC, since D is responsible for constructing the VBC instance. That's not a big deal, but it depends on what you expect from the virtual base class and what the class hierarchy is. Let's assume that VBC's purpose is just to hold the name of the current class instance. Class D is here just to print the value of VBC's member i. When d.print() is called you will see 'D' in the output. That's because VBC's constructor was called with the argument "D"; the VBC("C") initializer in C's constructor is ignored when C is not the most-derived class.
Here the problem shows up. It looks like there is no way to make class D print 'C': VBC won't be constructed by class C.
The compiler's behavior becomes clearer when you look at the multiple inheritance model:
class VBC
{
  public:
    VBC(const string &i) : i(i) {}
  protected:
    string i;
};

class C1 : virtual public VBC
{
  public:
    C1() : VBC("C1") {}
};

class C2 : virtual public VBC
{
  public:
    C2() : VBC("C2") {}
};

class D : public C1, public C2
{
  public:
    D() : VBC("D") {}
    void print() {cout << i << endl;}
};
Indeed, here it's clearer that the compiler would have no way to decide which initializer of VBC (C1's or C2's) to use. In fact, for class D neither C1's nor C2's VBC initializer is used, due to the nature of virtual base classes.
For the end user of a class it can be confusing to search the inheritance hierarchy for the virtual base class declaration just to understand how its constructor should be called.
On the Internet I found a 'solution' that might resolve this issue: give VBC a constructor with a default argument, so that it can act as a default constructor, like this:
class VBC
{
  public:
    VBC(const string &i = "VBC") : i(i) {}
  protected:
    string i;
};

class C1 : virtual public VBC
{
  public:
    C1() : VBC("C1") {cout << i << endl;}
};

class C2 : virtual public VBC
{
  public:
    C2() : VBC("C2") {cout << i << endl;}
};

class D : public C1, public C2
{
  public:
    void print() {cout << i << endl;}
};
The biggest flaw of this solution is that the member i that class D inherits from VBC will be initialized with the value "VBC", which is probably the value you would expect least of all the variants.
Of course, virtual inheritance only really makes sense in a multiple inheritance model, and the problem is easier to recognize there.
Probably the best solution is not to define a non-default constructor (even one with default argument values) in a virtual base class, but that is not always easy in the real world.

Tuesday, March 10, 2009

linux: sem_open across the processes

Is it safe to use, in a child process, the address of a semaphore obtained in the parent?
According to the man page it looks like it isn't allowed:

References to copies of the semaphore produce undefined results.

If that is true in general, the code below shouldn't work correctly.
int main()
{
 sem_t *sem = sem_open("key",O_CREAT,S_IRUSR|S_IWUSR,0);
 switch(fork())
 {
 case -1:
  perror("fork()");

  return 1;

 case 0:
 {
  printf("Child %u waiting for semaphore(%p)...\n",getpid(),sem);
  sem_wait(sem);
  printf("Child: Done\n");
  
  return 0;
 }
 }

 sleep(1);

 printf("Parent %u setting semaphore(%p)...\n",getpid(),sem);
 sem_post(sem);
 printf("Parent: Set\n");

 return 0;

}
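But it works, at least with the current implementation of Linux and glibc. (The POSIX description of fork says that semaphores open in the parent are also open in the child, and the man page warning is about copies of the sem_t object itself rather than copies of a pointer to it.)

A sem_t is only ever used through a pointer. You can find out that it's defined as a union, but the union is there only for size and alignment.

Looking into sem_open.c in the glibc sources, you can find that the return value of sem_open is actually the address returned from an mmap call.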
(sem_t *) mmap(NULL, sizeof (sem_t), PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
mmap maps a file in /dev/shm/ (or wherever shmfs is mounted) named "sem." + the name from sem_open's first argument. If the semaphore has already been opened by the process, sem_open looks up its address in the tree of opened semaphores. The search key is the tuple of file name, inode and device.

The tree of opened semaphores is declared in sem_open.c. It's local to the process.

A call to sem_open in the child process will consult a copy of the same tree of opened semaphores (the address space is copied on fork) and will return the address from the previous call to sem_open in the parent process.

Interestingly, the address returned from sem_open is the same even if it is called in the child and in the parent process independently:
#include <stdio.h>
#include <unistd.h>
#include <errno.h>
#include <semaphore.h>
#include <sys/stat.h>
#include <fcntl.h>

int main()
{
 switch(fork())
 {
 case -1:
  perror("fork()");

  return 1;

 case 0:
 {
  sem_t *sem = sem_open("key",O_CREAT,S_IRUSR|S_IWUSR,0);

  printf("Child %u waiting for semaphore(%p)...\n",getpid(),sem);
  sem_wait(sem);
  printf("Child: Done\n");
  
  return 0;
 }
 }

 sleep(1);

 sem_t *sem = sem_open("key",O_CREAT,S_IRUSR|S_IWUSR,0);

 printf("Parent %u setting semaphore(%p)...\n",getpid(),sem);
 sem_post(sem);
 printf("Parent: Set\n");

 return 0;

}
$gcc sem-test.c -lrt -o test
$./test 
Child 16116 waiting for semaphore(0xb8062000)...
Parent 16115 setting semaphore(0xb8062000)...
Parent: Set
Child: Done
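Memory mapped by mmap is preserved across fork with the same set of attributes, which is why the inherited pointer in the first example keeps working. In this second example each process calls mmap on its own, but the child starts with a copy of the parent's address space layout, so mmap most likely picks the same free address in both.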
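
The examples above never check for SEM_FAILED and leave the named semaphore behind in /dev/shm. A sketch of the first example with error handling and cleanup might look like this. The function name named_sem_roundtrip is made up, and the caller supplies the semaphore name; POSIX recommends a name that starts with a slash.

```c
#include <fcntl.h>
#include <semaphore.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Open a named semaphore, let a forked child block on it, post from
   the parent, then clean up.
   Returns 0 if the child was released, -1 otherwise. */
int named_sem_roundtrip(const char *name)
{
    int status;
    pid_t pid;
    sem_t *sem = sem_open(name, O_CREAT, S_IRUSR | S_IWUSR, 0);

    if (sem == SEM_FAILED)
        return -1;

    pid = fork();
    if (pid == -1) {
        sem_close(sem);
        sem_unlink(name);
        return -1;
    }

    if (pid == 0) {
        /* child: uses the pointer inherited from the parent */
        _exit(sem_wait(sem) == 0 ? 0 : 1);
    }

    sem_post(sem);                  /* release the child */
    if (waitpid(pid, &status, 0) == -1)
        status = -1;

    sem_close(sem);                 /* drop this process's mapping */
    sem_unlink(name);               /* remove the name from the system */

    if (status != -1 && WIFEXITED(status) && WEXITSTATUS(status) == 0)
        return 0;
    return -1;
}
```

Calling it with a name such as "/sem-demo" (a placeholder) should return 0, and no sem.* file for that name should be left in /dev/shm afterwards.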

Tuesday, March 3, 2009

c++: lifetime of temporaries

In C++ you can encounter a few kinds of temporary objects.

A temporary object is created implicitly by the compiler when a complex expression is evaluated:

5 + a/4;
Here the result of the subexpression (a/4) has to be stored somewhere for further calculation.

An anonymous temporary object can be created by the programmer when no name is given to an object. An expression such as
4;
is an anonymous temporary object of type int.
When a function is meant to return a value, that value is also stored in a temporary object.

Of course smart compilers will make some optimizations here to avoid creating temporary objects.
The expression
int b = 5 + a/4
could be optimized by storing the result of a/4 in b and then adding 5 to b.
An unused function return value won't be stored anywhere; moreover, the compiler may make the function skip any 'return' work.

The lifetime of a temporary object ends at the last step of evaluating the full-expression that contains the point where it was created. The temporary object for a/4 in the expression 5 + a/4; is destroyed right after the semicolon.

There is an exception in the standard that lets you extend the lifetime of a temporary object to the lifetime of a const reference bound to it:
string get()
{
    return "string";
}
const string &obj = get();
The lifetime of the string object created by get on return is extended to the lifetime of the const reference obj.

Temporaries are rvalues, and a non-const reference can't be bound to an rvalue, so using a const reference is mandatory. Pre-standard compilers allowed non-const references to temporaries, but in ISO C++ it's a compilation error. Changing the value of a const object could lead to unexpected results.

Using a const reference to catch a function's return value can serve as an optimization that avoids creating a temporary object on function return. Consider this code:
string get()
{
    return "string";
}
string obj = get();
A temporary is created and its value is copied into the string object obj. Binding the return value to a const reference can reduce memory usage and avoid extra constructor and destructor calls when you only need to read the result of the function.

However, modern compilers use RVO (Return Value Optimization) or NRVO (Named RVO) to avoid creating temporary objects, so in most cases you don't have to care about such optimizations.