Thursday, September 13, 2007

MSOSUG Live upgrade slides

Great turnout last night for the inaugural MSOSUG meeting.

Here are the slides I used for my talk on Live Upgrade.

Tuesday, September 11, 2007

Melbourne Solaris and OpenSolaris Users Group tomorrow night

Please join us for our first meeting!

Where?

Level 7, 476 St Kilda Road, Melbourne

When?

September 12th, 2007 at 18:00 for general discussion.
Speakers start at 18:30, and we expect that each speaker will take up to one hour.

Speakers:

  • Nathan Kroenert: Niagara 2 CPU and Architecture exploration
  • Boyd Adamson: Solaris Live Upgrade: Using and abusing

Coffee, Tea and soft drinks provided.

A note on access: The lifts in the building and the entrance to Level 7 lock automatically early in the evening, so access will be provided by Sun folks on-site. A sign with a contact number will also be placed close to the ground floor lifts, should you arrive after the start of the meeting.

The Melbourne Solaris and OpenSolaris Users Group exists to explore Solaris, OpenSolaris and closely related technologies.

For those looking for more information on what this group will do in the future, come along to the first meeting, as we'll be spending some time talking about what it is that everyone is most interested in exploring. It is hoped that over time, everyone participating will have something to share with the group, be it 'their' implementation of something, or someone exploring the deeper aspects of some particular Solaris feature, right through to folks who are interested in directly contributing to the OpenSolaris movement. We'll also discuss the preferred frequency of regular meetings.

If you are interested in Solaris, or in doing interesting things with or on Solaris, we want YOU!
See you there!

Tuesday, August 21, 2007

zsh DTrace Provider

Since there’s an effort underway to instrument various shells with DTrace probes, I thought I’d work on my shell of choice, zsh. Also, Brendan twisted my arm.

We’re trying to keep some uniformity between the shells, so I’m implementing the probes that Alan Hargreaves has documented.

So far, I’ve mostly finished the following probes:

builtin-entry
builtin-return
function-entry
function-return
script-done
script-start

And in simple use they look like this:

$ cat $tests/test_func_simple.zsh
#!./zsh

func()
{
echo "Hello from function"
return 1
}

func

$ dtrace -n 'zsh$target:::' -c $tests/test_func_simple.zsh
dtrace: description 'zsh$target:::' matched 8 probes
Hello from function
CPU ID FUNCTION:NAME
0 51903 zsh_main:script-start
0 51898 runshfunc:function-entry
0 51896 execbuiltin:builtin-entry
0 51897 execbuiltin:builtin-return
0 51896 execbuiltin:builtin-entry
0 51897 execbuiltin:builtin-return
0 51899 runshfunc:function-return
0 51902 zexit:script-done
dtrace: pid 16530 exited with status 1

And for something more elaborate:

$ cat $tests/basic_args.d

#pragma D option quiet

zsh$target:::builtin-entry, zsh$target:::function-entry
{
printf("%15s: %8s line: %d\n", probename, copyinstr(arg1), arg2);
}

zsh$target:::builtin-return, zsh$target:::function-return
{
printf("%15s: %8s ret: %d\n", probename, copyinstr(arg1), arg2);
}

zsh$target:::script-start
{
printf("Script %s starts\n", copyinstr(arg0));
}

zsh$target:::script-done
{
printf("Script %s done, return: %d\n", copyinstr(arg0), arg1);
}

$ dtrace -s $tests/basic_args.d -c $tests/test_func_simple.zsh
Hello from function
Script ../../tests/test_func_simple.zsh starts
function-entry: func line: 0
builtin-entry: echo line: 0
builtin-return: echo ret: 0
builtin-entry: return line: 0
builtin-return: return ret: 1
function-return: func ret: 1
Script ../../tests/test_func_simple.zsh done, return: 1


Note that the line numbers are all zero at the moment since I haven’t provided them yet.
There are a few more probes to come and some cleanup to do, but I’m hoping to have a patch available soon.

Thursday, July 12, 2007

docs.sun.com now usable!

Great news from Michelle Olsen in this thread on the OpenSolaris discussions:

We also had a hardware upgrade last week for docs.sun.com that has made a huge difference in performance, give it a whirl.


I've tried it and it actually feels snappy! The site had gotten so bad lately that I'd taken to downloading PDFs, rather than crawl from page to page.

Hooray!

Wednesday, December 06, 2006

Concatenating DVD iso parts for Solaris downloads

It seems that every time a new Solaris Express release comes out, someone complains about having to unzip, then concatenate, the ISO image file parts for the DVD. They complain either that it needs more disk space or more disk bandwidth.

Leaving aside the reasons for the split in the first place, it's pretty easy to unzip and concatenate the parts in one go. In a Bourne shell derivative, for example:

for file in sol-nv-b53-sparc-dvd-iso-*
do
    echo "Doing $file" >&2
    unzip -p "$file"
done > sol-nv-b53-sparc-dvd.iso

Now, please verify the MD5 sum of the file. Many problems are flushed out by doing that.
On Solaris:
digest -a md5 sol-nv-b53-sparc-dvd.iso

Or more generally:
openssl md5 sol-nv-b53-sparc-dvd.iso
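
If you'd rather script the comparison than eyeball it, here's a minimal sketch. It assumes openssl is on your PATH and that you have the published checksum to hand. verify_md5 is a made-up helper name for illustration, not a Solaris command.

```shell
# verify_md5 FILE EXPECTED
# Hypothetical helper: compares the MD5 of FILE with the EXPECTED hex string.
verify_md5() {
    file=$1
    # Accept the expected sum in either case.
    expected=$(echo "$2" | tr 'A-F' 'a-f')
    # "openssl md5" prints "MD5(file)= <hex>"; keep only the last field.
    actual=$(openssl md5 "$file" | awk '{print $NF}')
    if [ "$actual" = "$expected" ]; then
        echo "OK: $file"
    else
        echo "MISMATCH: $file (got $actual)" >&2
        return 1
    fi
}
```

Call it with the image name and the checksum from the download page; a non-zero exit status means the sums differ.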

Monday, November 20, 2006

Sub-Optimal?

Then they dropped all the Sun-like left-side function keys. This provoked uproar among the potential buyers (who are, of course, all alpha geeks with religious positions on keyboard details), so it seems they may have changed their mind.

I’ve also seen no mention of support for anything other than Windows or Mac for the devices that they have released.

There are some positives:

Optimus 103 keyboard will be a mass storage device. That means, that Optimus
will be the first (to the best of our knowledge) keyboard to appear on a
desktop just like a hard disk or a flash drive. Among the benefits of this
solution is that we won’t have to create any drivers (except for the
OS-dependent Configurator software). Layouts could be put right into the
keyboard’s storage.

Which is kinda cool.

Will it be worth the ~USD400 price tag? We’ll have to wait and see what other compromises have been made.

Making packages

Following Eric Boutilier's latest two posts on packaging and a conversation on #opensolaris, I was interested enough to try pkgbuild for myself. Of course, I'd forgotten about his earlier series of posts on the topic, so I'd forgotten the connection to JDS.

As a result, I started making packages without the JDS CBE (Common Build Environment). But it's worked pretty well. For example, on a standard Solaris 10 installation (03/05 for me, but anything should work):

  • Add the build tools to your path: PATH=$PATH:/usr/sfw/bin:/usr/ccs/bin; export PATH
  • Download the pkgbuild tool from http://pkgbuild.sourceforge.net/.
  • Unpack and install with a standard command ./configure && make && make install
  • Grab this spec file for ruby-1.8.5 that I knocked up with the help of Eric's posts and Red Hat's docs.
  • As a non-root user run pkgtool --download --define="_prefix /opt/mypkgs" build-only ruby.spec
  • Wait :)
It should create a ~/packages directory, download the source file with wget, compile it, build a package and put it in ~/packages/PKGS.

Thursday, September 07, 2006

"Basement" processes in Solaris

All the documentation about the Solaris scheduler says that the highest priority runnable thread is chosen for execution (see, for example, section 3.8.4 of Solaris Internals 2/e). At first glance that might seem to mean that the same thread will always get the CPU, if it is runnable.

That is indeed what would happen if thread priorities were static, but in fact for most threads (those in the TS, IA, and FSS classes) the priority changes based on their usage of the CPU.

On the other hand, the FX (fixed priority) scheduling class does not change the priority of a thread, so that we can use it to experiment with the scheduler's behaviour.

First of all, let's get ourselves some privileges. Note that we don't need this for plain priority 0 processes, but we do need it to use any other priority or quantum later.


$ ppriv $$
449: -zsh
flags = <none>
E: basic
I: basic
P: basic
L: all
$ su root -c "ppriv -s EIP+proc_priocntl $$"
Password:
$ ppriv $$
449: -zsh
flags = <none>
E: basic,proc_priocntl
I: basic,proc_priocntl
P: basic,proc_priocntl
L: all


OK, and we'll need something that will use lots of CPU and not make system calls that cause it to sleep. This will make the behaviour easier to observe.


$ cat spin.c
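#include <stdlib.h>

/* Burn CPU forever without making any system calls. */
int main(void)
{
    /* volatile unsigned: the loop can't be optimised away or overflow */
    volatile unsigned int i = 0;
    for (;;)
        i++;
    exit(0);    /* never reached */
}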
$ gcc -o spin spin.c


Now, let's look at the current processes that we're running.


$ ps -o sid -p $$
SID
449
$ priocntl -d -i sid 449
TIME SHARING PROCESSES:
PID TSUPRILIM TSUPRI
449 0 0
593 0 0


So, only TS processes with no fancy characteristics.

Let's now start our test program. The FX class provides user priorities that range from 0 to 60 (numerically higher is higher priority). We want our test program to be low priority.


$ priocntl -e -c FX -m 0 -p 0 ./spin &
[1] 652
$ priocntl -d -i sid 449
TIME SHARING PROCESSES:
PID TSUPRILIM TSUPRI
449 0 0
653 0 0
FIXED PRIORITY PROCESSES:
PID FXUPRILIM FXUPRI FXTQNTM
652 0 0 200


Good, so it's running at low priority, but on this system it has very little competition. In fact it's using close to 100% of this box's single CPU. Let's allow some time for the stats to catch up.


$ prstat -c -p 652 15 5 | sed -n -e 1p -e /spin/p
PID USERNAME SIZE RSS STATE PRI NICE TIME CPU PROCESS/NLWP
652 boyd 996K 560K run 0 0 0:00:21 63% spin/1
652 boyd 996K 560K run 0 0 0:00:36 81% spin/1
652 boyd 996K 560K run 0 0 0:00:51 91% spin/1
652 boyd 996K 560K run 0 0 0:01:06 95% spin/1
652 boyd 996K 560K run 0 0 0:01:21 97% spin/1


Now, we start another job at the same priority.


$ priocntl -e -c FX -m 0 -p 0 ./spin &
[2] 660
$ priocntl -d -i sid 449
TIME SHARING PROCESSES:
PID TSUPRILIM TSUPRI
449 0 0
661 0 0
FIXED PRIORITY PROCESSES:
PID FXUPRILIM FXUPRI FXTQNTM
652 0 0 200
660 0 0 200
$ prstat -c -p 652,660 60 2 | sed -n -e 1p -e /spin/p -e 's/^Total.*//p'
PID USERNAME SIZE RSS STATE PRI NICE TIME CPU PROCESS/NLWP
652 boyd 996K 560K run 0 0 0:01:45 71% spin/1
660 boyd 996K 560K run 0 0 0:00:08 27% spin/1

652 boyd 996K 560K run 0 0 0:02:15 51% spin/1
660 boyd 996K 560K run 0 0 0:00:37 48% spin/1


And we see that the two jobs are sharing the CPU nearly equally.

Now, let's tweak a little. First, notice that the two jobs have the same quantum, which means that they'll have the CPU for the same amount of time each time they are scheduled (assuming that no higher priority job preempts them).

Let's experiment with that quantum by halving the time for one process.


$ priocntl -s -t 100 -i pid 660
$ priocntl -d -i sid 449
TIME SHARING PROCESSES:
PID TSUPRILIM TSUPRI
449 0 0
669 0 0
FIXED PRIORITY PROCESSES:
PID FXUPRILIM FXUPRI FXTQNTM
652 0 0 200
660 0 0 100
$ prstat -c -p 652,660 60 2 | sed -n -e 1p -e /spin/p -e 's/^Total.*//p'
PID USERNAME SIZE RSS STATE PRI NICE TIME CPU PROCESS/NLWP
652 boyd 996K 560K run 0 0 0:02:36 54% spin/1
660 boyd 996K 560K run 0 0 0:00:55 44% spin/1

652 boyd 996K 560K run 0 0 0:03:15 64% spin/1
660 boyd 996K 560K run 0 0 0:01:16 35% spin/1


As we might expect, the adjusted process now gets roughly half as much CPU time as the other one (35% against 64% in the second sample).
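
A back-of-the-envelope check on those numbers: if two always-runnable processes at the same priority simply take turns and each runs for its full quantum, each one's share is its quantum divided by the sum of the two. That turn-taking model is a simplifying assumption, not something lifted from the scheduler source, and quantum_share is a made-up name. The sketch needs a shell with POSIX arithmetic expansion (zsh, ksh or bash, not the old /bin/sh).

```shell
# quantum_share MINE OTHER
# Expected CPU percentage for a process with quantum MINE competing with one
# process with quantum OTHER, assuming strict turn-taking with full quanta.
# Integer arithmetic, so the result is truncated.
quantum_share() {
    echo $(( 100 * $1 / ($1 + $2) ))
}

quantum_share 100 200   # process 660 after the change: prints 33
quantum_share 200 100   # process 652: prints 66
```

That's close to the 35% and 64% that prstat reported, with the difference down to sampling and whatever else the box was doing.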

Next, let's set the quantum back to its default value and bump the priority up by one.


$ priocntl -s -t 200 -m 1 -p 1 -i pid 660
$ priocntl -d -i sid 449
TIME SHARING PROCESSES:
PID TSUPRILIM TSUPRI
449 0 0
677 0 0
FIXED PRIORITY PROCESSES:
PID FXUPRILIM FXUPRI FXTQNTM
652 0 0 200
660 1 1 200
$ prstat -c -p 652,660 120 2 | sed -n -e 1p -e /spin/p -e 's/^Total.*//p'
PID USERNAME SIZE RSS STATE PRI NICE TIME CPU PROCESS/NLWP
660 boyd 996K 560K run 1 0 0:02:00 67% spin/1
652 boyd 996K 560K run 0 0 0:04:07 30% spin/1

660 boyd 996K 560K run 1 0 0:04:40 99% spin/1
652 boyd 996K 560K run 0 0 0:04:07 0.0% spin/1


Wow! That's really made a difference. Process 660 is getting a lot of CPU. That makes sense, since it has a higher priority and so, based on our initial premise, we'd assume it gets chosen over the lower priority process every time.

Let's see if that's really the case. First we need some extra privileges so that we can use DTrace.


$ su root -c "ppriv -s EIP+dtrace_kernel,dtrace_proc,dtrace_user $$"
Password:
$ ppriv $$
449: -zsh
flags = <none>
E: basic,dtrace_kernel,dtrace_proc,dtrace_user,proc_priocntl
I: basic,dtrace_kernel,dtrace_proc,dtrace_user,proc_priocntl
P: basic,dtrace_kernel,dtrace_proc,dtrace_user,proc_priocntl
L: all
$ dtrace -q -n 'sched:::on-cpu /execname == "spin"/ {@[pid] = count()} tick-5sec { exit(0) }'

660 103


Yep, just as we expected, process 652 has not been scheduled even once in our sampling period of 5 seconds. It's getting absolutely no CPU time at all.

Just to be sure, let's make the two priorities equal again and check again with DTrace to see that they are being scheduled more evenly.


$ priocntl -s -m 0 -p 0 -i pid 660
$ dtrace -q -n 'sched:::on-cpu /execname == "spin"/ {@[pid] = count()} tick-5sec { exit(0) }'

660 50
652 57


So, in summary, processes at the lowest priority level (0 in FX) will be starved of CPU time by anything on the system at a higher priority. Processes at the same priority level can have time apportioned between them using mechanisms such as the quantum.
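
The same conclusion falls out of a toy model. The sketch below is not the Solaris dispatcher: it assumes every process is always runnable, that only the highest priority level ever gets the CPU, and that processes at that level take turns for one full quantum each. The simulate function and its pid:priority:quantum argument format are invented for illustration.

```shell
# simulate TOTAL_MS PID:PRI:QUANTUM_MS ...
# Toy strict-priority scheduler: prints "PID cpu_ms" for each process.
simulate() {
    total_ms=$1
    shift
    printf '%s\n' "$@" | awk -F: -v total="$total_ms" '
        BEGIN { max = -1 }
        { pid[NR] = $1; pri[NR] = $2; q[NR] = $3; if ($2 > max) max = $2 }
        END {
            # Collect the processes at the highest priority level.
            n = 0
            for (i = 1; i <= NR; i++)
                if (pri[i] == max) top[++n] = i
            # Round-robin among them, one quantum at a time.
            t = 0; k = 1
            while (n > 0 && t < total) {
                i = top[k]
                slice = q[i]
                if (t + slice > total) slice = total - t
                if (slice <= 0) break   # guard against a zero quantum
                used[i] += slice
                t += slice
                k = (k % n) + 1
            }
            for (i = 1; i <= NR; i++) printf "%s %d\n", pid[i], used[i]
        }'
}

simulate 6000 652:0:200 660:0:100   # equal priority, different quanta
simulate 5000 652:0:200 660:1:200   # 660 one priority level higher
```

In the first run 652 gets 4000 ms to 660's 2000 ms. In the second, 660 takes all 5000 ms and 652 gets nothing, which is the starvation seen above.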

The interaction between FX and the other scheduling classes is more complicated, because global priorities enter the equation, but that's a subject for another post. :)