Patch Name:  PHKL_9744

Patch Description: s700 10.01 SystemV semaphores, semop(2) cumulative patch

Creation Date: 97/01/09

Post Date:  97/01/15

Warning: 97/03/27 - This Non-Critical Warning has been issued by HP.

	This patch introduces a problem with counting semaphores.
	Applications can hang when using semop() with non-binary
	semaphores.
	The replacement patch, PHKL_10454, is available now.
	Sites that have installed PHKL_9744 should update to
	PHKL_10454 immediately.

Hardware Platforms - OS Releases:
        s700: 10.01

Products: N/A

Filesets:
        OS-Core.CORE-KRN ProgSupport.C-INC

Automatic Reboot?: Yes

Status: General Superseded With Warnings

Critical: No

Path Name:  /hp-ux_patches/s700/10.X/PHKL_9744

Symptoms:
        PHKL_9744:
        Applications using System V semaphores experienced
        poor performance.  This problem was reported by several
        customers using Oracle or SAP.

        PHKL_9114:
        Two cases of application hangs on a semaphore: 1) when
        using semaphore sets, and 2) wrong semaphore counts
        caused by a signal delivered to a thread.

        PHKL_7053:
        semop() with counting semaphores could leave processes
        hanging. By "counting" semaphores we mean semops that
        use absolute values > 1 in the sem_op field.

        Here is an example situation which causes a hang:

        semval is 0
        Child A is asleep trying to get 2 from the sema.
        Child B is asleep trying to get 1 from the sema.
        Parent adds 1 to semaphore.

        Child A "wakes up" but finds there isn't enough
        and goes back to sleep. Child B, who could
        be satisfied with 1, never wakes up, which
        leads to a hang.
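
        The situation above can be reproduced on paper with a
        small user-space model.  This is only a sketch: the
        structure and function names (model_sem, model_block,
        model_post, model_sleepers) are hypothetical and are
        not taken from the kernel source.  The model assumes
        sleepers are considered in the order they went to
        sleep.

```c
#include <assert.h>

#define MAX_WAITERS 8

/* One sleeping process and the amount it wants to take. */
struct waiter {
    int need;    /* absolute value of the negative sem_op */
    int asleep;  /* 1 while blocked, 0 once the request is granted */
};

/* Hypothetical model of one counting semaphore and its sleepers. */
struct model_sem {
    int semval;
    int nwaiters;
    struct waiter w[MAX_WAITERS];
};

/* Record a request that cannot be satisfied yet. */
static void model_block(struct model_sem *s, int need)
{
    assert(s->nwaiters < MAX_WAITERS);
    s->w[s->nwaiters].need = need;
    s->w[s->nwaiters].asleep = 1;
    s->nwaiters++;
}

/*
 * Add val to the semaphore, then wake sleepers in queue order.
 * wake_all == 0: only the first sleeper is woken; if it cannot
 *                proceed it sleeps again and nobody else is checked.
 * wake_all != 0: every sleeper gets a chance to retry.
 * Returns the number of sleepers that were released.
 */
static int model_post(struct model_sem *s, int val, int wake_all)
{
    int i, released = 0;

    s->semval += val;
    for (i = 0; i < s->nwaiters; i++) {
        if (!s->w[i].asleep)
            continue;
        if (s->w[i].need <= s->semval) {
            s->semval -= s->w[i].need;
            s->w[i].asleep = 0;
            released++;
        }
        if (!wake_all)
            break;  /* the single wakeup went to this sleeper */
    }
    return released;
}

/* Count the sleepers that are still blocked. */
static int model_sleepers(const struct model_sem *s)
{
    int i, n = 0;

    for (i = 0; i < s->nwaiters; i++)
        n += s->w[i].asleep;
    return n;
}
```

        With Child A (needs 2) queued ahead of Child B (needs 1),
        model_post(&s, 1, 0) releases nobody even though semval
        is 1, while model_post(&s, 1, 1) releases Child B.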

        The customer-visible symptom is process hangs
        in code that uses System V semaphores.

        This fix is in addition to an earlier fix
        related to process hangs in semop code. It covers
        some bugs which were not fixed by the earlier
        fix.

        PHKL_5869:
        A process that uses semop(2) may hang indefinitely.
        The bug only shows up if some other process also
        uses semop(2), specifically requesting two or more
        semaphore operations in a single call.

Defect Description:
        PHKL_9744:
        The previous patch (PHKL_9115) fixed a hang situation
        caused by a lost wakeup, but it also introduced a
        performance problem by waking up too many processes.
        The semaphore sleep/wakeup strategy was redesigned
        to minimize the number of wakeups.

        PHKL_9114:
        Two residual 'holes' were found:

        Case 1:
        Thread A holds Sem0 and Sem1 (of a set)
        Thread B attempts Down(0) and sleeps
        Thread C attempts Down(0 and 1) and sleeps
        A does Up(0), causing wakeup_one()
        C awakens, gets Sem0 but sleeps on Sem1
        B remains asleep; C did not call wakeup()

        The solution is to pass the wakeup on to anyone else
        sleeping on the sema that triggered the original wakeup.
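
        Case 1 and its solution can be illustrated with a
        small user-space model.  This is a sketch under stated
        assumptions, not the kernel code: the names (semset,
        sleeper, try_take, set_sleep, set_up) are hypothetical,
        the semaphores are treated as binary, and sleepers are
        woken in array order, so C is queued first to mirror
        the case where wakeup_one() happens to pick C.

```c
#include <assert.h>

#define NSEMS 2
#define MAX_SLEEPERS 4

struct sleeper {
    unsigned want;  /* bitmask of semaphores it must acquire together */
    int on;         /* semaphore index it sleeps on, -1 once it has run */
};

/* Hypothetical model of a semaphore set and its sleepers. */
struct semset {
    int avail[NSEMS];  /* 1 = free, 0 = held */
    int n;             /* number of entries used in s[] */
    struct sleeper s[MAX_SLEEPERS];
};

/*
 * Try to take every semaphore in want, all or nothing.
 * Returns -1 on success, else the index of the first
 * semaphore that is still held by someone else.
 */
static int try_take(struct semset *set, unsigned want)
{
    int i;

    for (i = 0; i < NSEMS; i++)
        if ((want & (1u << i)) && !set->avail[i])
            return i;
    for (i = 0; i < NSEMS; i++)
        if (want & (1u << i))
            set->avail[i] = 0;
    return -1;
}

/* Record a thread that went to sleep on semaphore `on`. */
static void set_sleep(struct semset *set, unsigned want, int on)
{
    assert(set->n < MAX_SLEEPERS);
    set->s[set->n].want = want;
    set->s[set->n].on = on;
    set->n++;
}

/*
 * Release semaphore k and wake one sleeper waiting on it.
 * pass_on == 0: if that sleeper has to sleep again on another
 *               semaphore, the wakeup is lost.
 * pass_on != 0: the wakeup is handed to the next sleeper on k.
 */
static void set_up(struct semset *set, int k, int pass_on)
{
    int i;

    set->avail[k] = 1;
    for (i = 0; i < set->n; i++) {
        if (set->s[i].on != k)
            continue;
        set->s[i].on = try_take(set, set->s[i].want);
        if (set->s[i].on == -1)
            break;  /* wakeup consumed: this sleeper now runs */
        if (!pass_on)
            break;  /* sleeper went back to sleep, wakeup lost */
    }
}
```

        With both semaphores held by A, C queued for Sem0 and
        Sem1, and B queued for Sem0, set_up(&set, 0, 0) leaves
        B asleep although Sem0 is free; set_up(&set, 0, 1)
        lets B run.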

        Case 2:
        Thread A initializes Sem0 to '0'
        Thread B does a Down(0) and blocks
        Thread C does a Down(0) and blocks
        A does an Up(0) to awaken B or C
        ...However, a kill signal hits B
        B awakens with the signal and completes
        A does an Up(0) to wake C
        C does not get the wakeup

        The bug was that the sleeper count was decremented
        when thread A released the sema, and then decremented
        a second time when B awoke with the signal.  The extra
        decrement caused the wakeup for the next Up on the
        sema to be lost.
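
        The sleeper-count accounting in Case 2 can be modeled
        as follows.  This is a minimal sketch, assuming the
        releaser sends a wakeup only when the count is
        nonzero; the names (cnt_sem, cnt_block, cnt_up,
        cnt_signal) are hypothetical and do not come from the
        kernel source.

```c
#include <assert.h>

/* Hypothetical model of one semaphore and its sleeper count. */
struct cnt_sem {
    int semval;    /* current semaphore value */
    int sleepers;  /* how many threads are believed to be asleep */
};

/* A thread blocks in Down(): it is counted as a sleeper. */
static void cnt_block(struct cnt_sem *m)
{
    m->sleepers++;
}

/*
 * Up(): increment the semaphore and wake one sleeper if the
 * count says there is one.  Returns 1 if a wakeup was sent,
 * 0 if the count claimed that nobody was sleeping.
 */
static int cnt_up(struct cnt_sem *m)
{
    m->semval++;
    if (m->sleepers == 0)
        return 0;
    m->sleepers--;  /* the releaser accounts for the thread it wakes */
    return 1;
}

/*
 * A sleeper leaves because of a signal.
 * counted_by_up: nonzero if cnt_up() already decremented the
 *                count for this sleeper.
 * fixed:         zero models the double decrement, nonzero
 *                skips the second decrement.
 */
static void cnt_signal(struct cnt_sem *m, int counted_by_up, int fixed)
{
    if (fixed && counted_by_up)
        return;
    if (m->sleepers > 0)
        m->sleepers--;
}
```

        After B and C block and A does one Up, the buggy path
        cnt_signal(&m, 1, 0) drops the count to 0, so the next
        cnt_up() returns 0 and C is never woken.  With
        cnt_signal(&m, 1, 1) the count stays at 1 and the next
        cnt_up() sends the wakeup.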

        PHKL_7053:
        The defect results in process hangs in semaphore code.

        Here is some sample code which reproduces a hang:

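        {
                ....stuff deleted...
                if (!(pid1 = fork())) {
                        P(2);           /* child 1: wants 2 */
                        exit(0);
                }
                else if (!(pid2 = fork())) {
                        sleep(5);       /* let child 1 block first */
                        P(1);           /* child 2: wants 1 */
                        V(2);           /* lets child 1 proceed */
                        exit(0);
                }
                else {
                        sleep(10);      /* wait for children to block */
                        V(1);           /* enough for child 2 only */
                }

                /* wait for children */
        }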

        P(val)
        int val;
        {
                p_op.sem_op = -val;
                semop(semid, &p_op, 1);
        }

        V(val)
        int val;
        {
                v_op.sem_op = val;
                semop(semid, &v_op, 1);
        }

                semid is the semaphore id obtained through
                semget().

                p_op and v_op are struct sembuf structures
                initialized elsewhere in the program.
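
        For readers who want to experiment, the deleted setup
        could look something like the following.  This is a
        sketch, not the original test program: the helper
        names (sem_create, sem_adjust, sem_value, sem_destroy)
        and the union tag semarg are hypothetical, and P and V
        here take the semaphore id as an argument instead of
        using a global.  Only semget(), semctl() and semop()
        are real interfaces.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/*
 * Argument for semctl().  Many systems require the program to
 * declare this union itself; the tag semarg is chosen here so
 * that it cannot clash with a definition in a system header.
 */
union semarg {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

/* Create a private set holding one semaphore set to `initial`.
 * Returns the semaphore id, or -1 on error. */
static int sem_create(int initial)
{
    union semarg arg;
    int semid = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);

    if (semid == -1)
        return -1;
    arg.val = initial;
    if (semctl(semid, 0, SETVAL, arg) == -1) {
        semctl(semid, 0, IPC_RMID);
        return -1;
    }
    return semid;
}

/* Apply one operation to semaphore 0 of the set. */
static int sem_adjust(int semid, int delta, int flags)
{
    struct sembuf op;

    op.sem_num = 0;
    op.sem_op = (short)delta;
    op.sem_flg = (short)flags;
    return semop(semid, &op, 1);
}

/* P blocks until val can be subtracted; V adds val. */
static int P(int semid, int val) { return sem_adjust(semid, -val, 0); }
static int V(int semid, int val) { return sem_adjust(semid, val, 0); }

/* Read the current value, and remove the set when done. */
static int sem_value(int semid)   { return semctl(semid, 0, GETVAL); }
static int sem_destroy(int semid) { return semctl(semid, 0, IPC_RMID); }
```

        A test program would call sem_create(0) before the
        first fork() and sem_destroy() after waiting for both
        children.  Passing IPC_NOWAIT to sem_adjust() makes a
        request fail with EAGAIN instead of sleeping, which is
        convenient for checking values without risking a hang.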

        PHKL_5869:
        The symptom here is that a process may sleep forever,
        waiting on a semaphore that is available.

        The cause is that semop supports operations on multiple
        semaphores.  If the first semaphore is available, but the
        second one is not, semop must back out by releasing the
        first one before going to sleep on the second one.  The
        back out code calls semundo which does release the
        semaphore; however, it fails to do a wakeup which is
        necessary if some other process is waiting for that
        same semaphore.

        The fix was to modify the back out code so that it
        performs the necessary wakeup.
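
        From the caller's side, a semop() call with two
        operations is applied atomically: either every
        operation in the array is performed or none is.  That
        is why the kernel must back out the first semaphore
        when the second is unavailable.  The sketch below
        shows such a call; the helper names (pair_create,
        pair_give, take_both_nowait) are hypothetical, and
        IPC_NOWAIT is used so that the example returns EAGAIN
        instead of sleeping.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/* Create a private set of two semaphores with the given values.
 * Returns the set id, or -1 on error. */
static int pair_create(unsigned short v0, unsigned short v1)
{
    unsigned short init[2];
    union {
        int val;
        struct semid_ds *buf;
        unsigned short *array;
    } arg;  /* semctl() argument, declared locally */
    int semid = semget(IPC_PRIVATE, 2, IPC_CREAT | 0600);

    if (semid == -1)
        return -1;
    init[0] = v0;
    init[1] = v1;
    arg.array = init;
    if (semctl(semid, 0, SETALL, arg) == -1) {
        semctl(semid, 0, IPC_RMID);
        return -1;
    }
    return semid;
}

/* Add 1 to semaphore `num` of the set. */
static int pair_give(int semid, int num)
{
    struct sembuf op;

    op.sem_num = (unsigned short)num;
    op.sem_op = 1;
    op.sem_flg = 0;
    return semop(semid, &op, 1);
}

/*
 * Try to take both semaphores in a single semop() call without
 * blocking.  Returns 0 if both were decremented, or -1 with
 * errno set to EAGAIN if either one was unavailable, in which
 * case neither value has changed.
 */
static int take_both_nowait(int semid)
{
    struct sembuf ops[2];

    ops[0].sem_num = 0;
    ops[0].sem_op = -1;
    ops[0].sem_flg = IPC_NOWAIT;
    ops[1].sem_num = 1;
    ops[1].sem_op = -1;
    ops[1].sem_flg = IPC_NOWAIT;
    return semop(semid, ops, 2);
}
```

        With the set created as pair_create(1, 0), the call
        fails and semaphore 0 still reads 1 afterwards.  Once
        semaphore 1 is raised with pair_give(), the same call
        succeeds and both values drop to 0.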

        This bug shows up in 10.01 because the semop code was
        redesigned to be more efficient by performing the absolute
        minimum number of wakeups necessary for correct
        functionality.  Obviously, we were too aggressive and
        forgot one of them.

SR:
        1653195545 5003276675 5003306571 5003339747

Patch Files:
        /usr/conf/h/sem.h
        /usr/conf/lib/libhp-ux.a(sysV_sem.o)
        /usr/include/sys/sem.h

what(1) Output:
        /usr/conf/h/sem.h:
                sem.h  $Date: 96/11/15 15:25:14 $ $Revision: 1.20.71.4 $
                        PATCH_10.01 (PHKL_9114)
        /usr/conf/lib/libhp-ux.a(sysV_sem.o):
                sysV_sem.c  $Date: 97/01/09 13:40:03 $ $Revision: 1.27.71.19 $
                        PATCH_10.01 (PHKL_9744)
        /usr/include/sys/sem.h:
                sem.h  $Date: 96/11/15 15:25:14 $ $Revision: 1.20.71.4 $
                        PATCH_10.01 (PHKL_9114)

cksum(1) Output:
        2023641962 5532 /usr/conf/h/sem.h
        231382391 15556 /usr/conf/lib/libhp-ux.a(sysV_sem.o)
        2023641962 5532 /usr/include/sys/sem.h

Patch Conflicts: None

Patch Dependencies:  None

Hardware Dependencies:  None

Other Dependencies:  None

Supersedes:
        PHKL_5869 PHKL_7053 PHKL_9114

Equivalent Patches:
        PHKL_9745:
        s800: 10.01

        PHKL_9746:
        s700: 10.10

        PHKL_9747:
        s800: 10.10

        PHKL_9748:
        s700: 10.20

        PHKL_9749:
        s800: 10.20

Patch Package Size:  90 Kbytes

Installation Instructions:
        Please review all instructions and the Hewlett-Packard
        SupportLine User Guide or your Hewlett-Packard support terms
        and conditions for precautions, scope of license,
        restrictions, and limitation of liability and warranties
        before installing this patch.
        ------------------------------------------------------------
        1. Back up your system before installing a patch.

        2. Log in as root.

        3. Copy the patch to the /tmp directory.

        4. Move to the /tmp directory and unshar the patch:

                cd /tmp
                sh PHKL_9744

        5a. For a standalone system, run swinstall to install the
            patch:

                swinstall -x autoreboot=true -x match_target=true \
                        -s /tmp/PHKL_9744.depot

        5b. For a homogeneous NFS Diskless cluster, run swcluster on the
            server to install the patch on the server and the clients:

                swcluster -i -b

            This invokes swcluster in interactive mode and forces
            all clients to be shut down.

            WARNING: All cluster clients must be shut down prior to the
                     patch installation.  Installing the patch while the
                     clients are booted is unsupported and can lead to
                     serious problems.

            The swcluster command will invoke an swinstall session in which
            you must specify:

                alternate root path  -  default is /export/shared_root/OS_700
                source depot path    -  /tmp/PHKL_9744.depot

            To complete the installation, select the patch by choosing
            "Actions -> Match What Target Has" and then "Actions -> Install"
            from the Menubar.

        5c. For a heterogeneous NFS Diskless cluster:

                - run swinstall on the server as in step 5a to install
                  the patch on the cluster server.

                - run swcluster on the server as in step 5b to install
                  the patch on the cluster clients.

        By default swinstall will archive the original software in
        /var/adm/sw/patch/PHKL_9744.  If you do not wish to retain a
        copy of the original software, you can create an empty file
        named /var/adm/sw/patch/PATCH_NOSAVE.

        Warning: If this file exists when a patch is installed, the
                 patch cannot be deinstalled.  Please be careful
                 when using this feature.

        It is recommended that you move the PHKL_9744.text file to
        /var/adm/sw/patch for future reference.

        To put this patch on a magnetic tape and install from the
        tape drive, use the command:

                dd if=/tmp/PHKL_9744.depot of=/dev/rmt/0m bs=2k

Special Installation Instructions:  None