USE Method by Brendan Gregg

The simplicity and comprehensiveness of Brendan Gregg’s USE method blows my mind every time I revisit it as part of my Solaris skills refresher.

From the USE method page:

For every resource: check utilization, saturation and errors.

I’m always impressed because, in addition to the method itself, Brendan provided a complete checklist for Solaris performance investigations. So every time there’s an issue, you can use this list to zoom in on one of the physical server components: CPU, memory, storage I/O, storage capacity, network and interconnects (CPU, memory and I/O).
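As a small illustration of the utilization and saturation checks for CPU, here is a sketch that summarises the last sample of vmstat output. It assumes the usual Solaris vmstat layout, where the first column (r) is the run queue length and the last column (id) is the idle CPU percentage. The function name cpu_use is made up for this example, and it does not cover the errors part of the method.

```shell
# cpu_use: read vmstat output on stdin and summarise the LAST sample.
# Assumption: column 1 is the run queue (r) and the last column is idle % (id).
cpu_use() {
    awk '
        # Only data rows start with a number; header rows are skipped.
        $1 ~ /^[0-9]+$/ { runq = $1; idle = $NF; seen = 1 }
        END {
            if (seen)
                printf "util=%d%% runq=%d\n", 100 - idle, runq
        }
    '
}
```

Typical usage would be something like vmstat 1 5 | cpu_use, which skips the since-boot summary on the first line by only reporting the final sample. A run queue that stays above the number of CPUs is a hint of saturation.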

If you’re running any SmartOS environments on Joyent, there’s a separate performance checklist for SmartOS virtualised environments – also very useful.

Oracle Solaris 11.4 Open Beta

We got a real treat this week – Oracle refreshed the Solaris 11.4 Open Beta, meaning that it’s still the same Solaris version but with a few really important improvements:

  • almost 300 additional bug fixes (280+ they say)
  • really cool ZFS improvements (device removal and scheduled scrub)
  • Spectre Variant 1 fixes
  • Software updates for GCC (it’s v7.3 now) and other packages

Speaking of the Continuous Delivery model mentioned a while ago, this update brings a promise that Oracle Solaris 11 will receive update releases on an annual basis.

In addition to update releases, there will be smaller functional, performance and security improvements released as they become ready – so Support Repository Updates (SRUs) and Critical Patch Updates (CPUs) will continue happening between major Solaris releases.

Continuous Delivery for Solaris

Just a few days after the original post with Solaris 11.x plans, Oracle provided an update on the official Solaris blog.

Starting with Solaris 11, OS development and updates will follow a continuous delivery model, which means the next few years will see lots of incremental improvements within 11.x instead of a major release like Solaris 12.

It’s also very good to see that extended support is planned well into the 2030s:

As we will deliver new features and capabilities as part of Oracle Solaris 11, we have extended the Oracle Solaris 11 and Oracle Solaris Cluster 4 Premier and Extended Support lifespans to January 2031 and January 2034, respectively.  Support dates are evaluated for update annually, and will be provided through at least the dates above.

Oracle Roadmap: SPARC/Solaris


Okay, so Oracle just updated their roadmap for SPARC and Solaris.

Seems Solaris 12 will take a couple more years to be released (I seem to remember that in 2014/2015 a very similar roadmap was suggesting Solaris 12 would show up in late 2016), which means for now there will only be Solaris 11.x incremental releases.

There’s also a promise of SPARC Next hardware, which should bring a significant boost to performance.

Recommended Patchset for Solaris 10 – January 2016

Those of you still on Solaris 10 may want to download the latest Recommended Patchset for Solaris 10 which was published just last week, on 28th of January 2016.

There are only four such patchsets a year, which is quite handy for rolling baselines when you plan to patch all of your Solaris 10 servers in a particular quarter.

While this patchset does not include ALL of the available security patches, it contains the most critical ones to date.

From the README:

The Recommended OS Patchset Solaris 10 SPARC provides the minimum set of patches needed to address security and Sun Alert issues, and selected issues identified by Oracle Proactive Services and the Oracle Technical Support Center, for the Solaris 10 Operating System for sparc. The patches contained in this patchset are considered the most important and highly recommended patches for Solaris 10.

Joyent CLI basics

I’ve been trying different SmartOS images with Joyent for some time now, but always did everything from the Dashboard.

Joyent’s wonderful dashboard

Most users will never need anything else. Joyent’s dashboard is incredible: simple yet powerful, providing vital views of your instances and your billing. Wish there was a tool like this for AWS!

Being a developer though, you’ll probably want to give Joyent’s CloudAPI a try. Since I’m not much of a developer, I settled on the NodeJS based CLI tools.

Install Joyent CLI

You need Node.js installed in your environment, and on top of it the Smart DataCenter module from Joyent:

npm install -g smartdc

There are at least three variables to configure before you can use the CLI. SDC_URL reflects the region (DC) you plan on using (Amsterdam for me, in Europe), SDC_ACCOUNT is your username, and SDC_KEY_ID is the key you’ll need to get from your Joyent Dashboard:

export SDC_ACCOUNT=greys
export SDC_KEY_ID="aa:bb:cc:dd:ee:ff:gg:hh:ii:jj:kk:ll:mm:nn:oo:pp"
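Since the CLI depends on all three variables, a quick sanity check before running any sdc-* command can save some head scratching. This is only a sketch: the function name check_sdc_env is made up, and it only checks that the variables are non-empty, not that their values are valid.

```shell
# check_sdc_env: report which of the three SDC_* variables are unset or empty.
# Returns 0 when all three are set, 1 otherwise.
check_sdc_env() {
    missing=0
    for v in SDC_URL SDC_ACCOUNT SDC_KEY_ID; do
        # Indirect lookup of the variable named in $v (portable sh).
        eval "val=\${$v:-}"
        if [ -z "$val" ]; then
            echo "missing: $v" >&2
            missing=1
        fi
    done
    return $missing
}
```

You could call it as check_sdc_env && sdc-listmachines so the listing only runs once the environment looks complete.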


List machines

sdc-listmachines is the command you’ll need to list machines.

And once you’re clear on which machine you’d like to inspect, you can use sdc-getmachine with its UUID to confirm all the necessary details about the machine: IPs and hardware configuration, state and access keys, etc.

Here’s how it should look:

greys$ sdc-getmachine d8288fe0-1d88-ef64-XXXX-YYYYYYYYYYYY
"id": "d8288fe0-1d88-ef64-XXXX-YYYYYYYYYYYY",
"name": "wp",
"type": "smartmachine",
"state": "stopped",
"image": "70f1b13e-0f85-XXXX-a009-YYYYYYYYYYYY",
"ips": [
"memory": 256,
"disk": 6144,
"metadata": {
"tags": {},
"created": "2015-10-09T18:58:18.264Z",
"updated": "2016-01-11T13:27:44.000Z",
"networks": [
"dataset": "sdc:sdc:wordpress:15.1.1",
"primaryIp": "37.153.XXX.YYY",
"firewall_enabled": false,
"compute_node": "44454c4c-4e00-1031-XXXX-YYYYYYYYYYYY",
"package": "t4-standard-256M"

Start and stop a machine with Joyent

sdc-startmachine and sdc-stopmachine both take the machine’s UUID, and you can use sdc-getmachine to track progress.
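Tracking progress by hand gets old quickly, so here is a rough sketch of a polling loop built on sdc-getmachine. It assumes the "state" field appears in the output the way it does in the listing above; the helper names machine_state and wait_for_state, the 5 second interval and the retry limit are all made up for illustration.

```shell
# machine_state: extract the value of the "state" field from
# sdc-getmachine output supplied on stdin.
machine_state() {
    sed -n 's/.*"state": *"\([^"]*\)".*/\1/p'
}

# wait_for_state UUID STATE [TRIES]: poll sdc-getmachine until the machine
# reports STATE, giving up after TRIES attempts (default 60).
wait_for_state() {
    uuid=$1
    want=$2
    tries=${3:-60}
    while [ "$tries" -gt 0 ]; do
        state=$(sdc-getmachine "$uuid" | machine_state)
        [ "$state" = "$want" ] && return 0
        tries=$((tries - 1))
        sleep 5
    done
    return 1
}
```

For example, after sdc-stopmachine you might run wait_for_state followed by the UUID and the word stopped, and only carry on once it returns successfully.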

IMPORTANT: if you just stop your Joyent machine, billing will continue until you destroy the machine like this:

sdc-deletemachine d8288fe0-1d88-ef64-XXXX-YYYYYYYYYYYY


Behaviour of the audit daemon

I always wanted to know how to make a clean start with nightly log rotations in a Solaris audit setup.

Turns out it couldn’t be simpler!

From the audit(1M) man page:

audit - control the behavior of the audit daemon

and a bit further down:

-s Notify the audit daemon to read the audit control file. The audit daemon stores the information internally. If the audit daemon is not running but audit has been enabled by means of bsmconv, the audit daemon is started.
-t Direct the audit daemon to close the current audit trail file and exit. Use -s to restart auditing. To disable auditing, use bsmunconv.

So the sequence should be:
1) Close current audit trail file:
audit -t
2) Do log rotation magic
3) Restart audit trail:
audit -s
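The three steps above can be sketched as a small script. Treat it as a starting point only: the directory names are assumptions (adjust them to wherever your audit trail lives), the "rotation magic" here is just a move into an archive directory, and the AUDIT_CMD variable exists purely so the function can be dry-run with a harmless command.

```shell
#!/bin/sh
# Sketch of a nightly audit trail rotation. All paths are assumptions.
AUDIT_CMD=${AUDIT_CMD:-audit}                   # override for dry runs
AUDIT_DIR=${AUDIT_DIR:-/var/audit}              # assumed audit trail directory
ARCHIVE_DIR=${ARCHIVE_DIR:-/var/audit/archive}  # hypothetical archive location

rotate_audit() {
    # 1) Close the current audit trail file
    "$AUDIT_CMD" -t || return 1

    # 2) "Log rotation magic": move the closed trail files out of the way
    mkdir -p "$ARCHIVE_DIR" || return 1
    for f in "$AUDIT_DIR"/*; do
        # Skip directories (including the archive itself) and empty globs
        if [ -f "$f" ]; then
            mv "$f" "$ARCHIVE_DIR"/ || return 1
        fi
    done

    # 3) Restart the audit trail
    "$AUDIT_CMD" -s
}
```

Since audit -t only directs the daemon to close the file and exit, a production script may want to wait until the daemon has actually gone before moving anything.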

How To: Confirm Link Speed for a Network Interface

Here’s a one-liner that is really useful when you need to quickly confirm the link speed for network interfaces on your system.

The beauty of this command is that you can run it as a regular user:

bash-3.00$ kstat -p | grep link_speed
 e1000g:0:Port Stats:link_speed 1000
 e1000g:1:Port Stats:link_speed 1000
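If you want something tidier than the raw kstat lines, the output can be reshaped with a bit of awk. This is a sketch that assumes the module:instance:name:statistic layout shown above, with the value separated from the statistic name by whitespace; the function name link_speeds is made up.

```shell
# link_speeds: read `kstat -p` output on stdin and print
# "<interface> <speed>" for every link_speed statistic.
link_speeds() {
    awk -F: '
        $4 ~ /^link_speed/ {
            gsub(/^[ \t]+/, "", $1)      # drop any leading whitespace
            split($4, a, /[ \t]+/)       # a[1] = "link_speed", a[2] = value
            print $1 $2, a[2]            # e.g. module e1000g + instance 0
        }
    '
}
```

Usage would be kstat -p | link_speeds, which for the system above should print one line per e1000g interface with its speed next to it.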

Using nohup for existing processes

Most of you are probably aware of the fact that by default any processes you may have running within your session will be killed once you terminate the session. The most common example is logging onto a remote server via SSH, starting some command and then closing the terminal session.

As you know, this happens because when you terminate your shell, it sends the SIGHUP signal to all of its child processes, notifying them about the end of the session and instructing them to wrap up whatever they were doing and terminate.

Many are also aware of the wonderful nohup command, which lets you start any command in a mode where it ignores SIGHUP and therefore stays running (and writing output to log files, for example) even after your terminal session completes.

Basic usage of the nohup command

Here’s how you should use it:

$ nohup ./ &
Sending output to nohup.out
[1] 3763

What this does is put your script into the background while making it ignore any SIGHUP signals sent to it from then on. The background element is not necessary but very common, because if you expect something to run for hours or days, you probably don’t want to watch it in your terminal anyway. So you put the task into the background.

The [1] indicates that this is the first (and only) job you’re running in background so far. The 3763 is the process ID (PID) of the script.

Output redirection into nohup.out

The nohup command redirects the standard output of your command (and standard error, if it was going to the terminal) into the nohup.out file, so if you’re interested in your script’s output you can still keep an eye on it this way:

tail -f nohup.out
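If you would rather keep the output of each job in its own file instead of a shared nohup.out, you can do the redirection yourself. Here is a minimal sketch; the helper name run_detached is made up, and the log path is whatever you pass in.

```shell
# run_detached LOGFILE COMMAND [ARGS...]
# Start COMMAND under nohup in the background, sending stdout and stderr
# to LOGFILE, and print the PID of the background job.
run_detached() {
    log=$1
    shift
    # With output already redirected, nohup has no reason to create nohup.out;
    # stdin is detached too so the job never waits on the terminal.
    nohup "$@" > "$log" 2>&1 < /dev/null &
    echo $!
}
```

For example, run_detached /tmp/myjob.log followed by your script would start it in the background, and tail -f /tmp/myjob.log then works the same way as with nohup.out.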

nohup for an existing process

In recent versions of Solaris 10 the nohup command has seen a really welcome addition: it now allows you to make any process (within your privileges, of course) ignore the SIGHUP signal. This is really convenient because if you started some script in the morning and by late afternoon it’s obvious that it won’t finish by the time you should be heading home, you can now simply use the nohup command to update the process and allow it to finish even when you log out.

Another reason it’s so convenient is because prior to this feature the scenario described above would leave you no option but to keep your session open. If you couldn’t (for example you must disconnect your laptop every evening) – you had to stop the script and restart next morning (with nohup this time).

Here’s how you update an existing process:

-bash-4.1# nohup -p 3468
Sending output to nohup.out

or get a message suggesting you must elevate your privileges before you can do it:

greys@solaris:~$ nohup -p 3468
nohup: cannot examine 3468: permission denied

Using Service Controller to confirm battery status

I’ve been working with a support engineer on replacing an SC battery in one of our T2000 servers recently, and noticed that immediately upon rebooting a server it may not be possible to get battery and fan stats because the prtdiag command wouldn’t work (the picld daemon is not fully operational yet).

Turns out, there’s another way to get this info: simply use #. to get back into ALOM and run the showenvironment command. Among other things, it reports battery status:

sc> showenvironment
Voltage sensors (in Volts):
Sensor          Status      Voltage LowSoft LowWarn HighWarn HighSoft
MB/V_+1V5       OK            1.48    1.36    1.39    1.60     1.63
MB/V_VMEML      OK            1.79    1.63    1.67    1.92     1.98
MB/V_VMEMR      OK            1.79    1.63    1.67    1.92     1.98
MB/V_VTTL       OK            0.89    0.81    0.83    0.96     0.99
MB/V_VTTR       OK            0.87    0.81    0.83    0.96     0.99
MB/V_+3V3STBY   OK            3.33    3.13    3.16    3.53     3.59
MB/V_VCORE      OK            1.31    1.20    1.24    1.36     1.39
IOBD/V_+1V5     OK            1.48    1.36    1.39    1.60     1.63
IOBD/V_+1V8     OK            1.79    1.63    1.67    1.92     1.96
IOBD/V_+3V3MAIN OK            3.36    3.06    3.10    3.49     3.53
IOBD/V_+3V3STBY OK            3.33    3.13    3.16    3.53     3.59
IOBD/V_+1V      OK            1.18    1.09    1.11    1.28     1.30
IOBD/V_+1V2     OK            1.16    1.09    1.11    1.28     1.30
IOBD/V_+5V      OK            5.09    4.55    4.75    5.35     5.45
IOBD/V_-12V     OK          -12.11  -13.08  -12.84  -11.16   -10.92
IOBD/V_+12V     OK           12.06   10.92   11.16   12.84    13.08
SC/BAT/V_BAT    OK            3.37      --    2.25      --       --