2017-07-27

Switching From Nouveau to the Proprietary NVIDIA driver in Debian Jessie

A couple of weeks ago, I started to experience random crashes on my 64-bit jessie desktop machine. Generally, these took the form of either a frozen desktop or a sudden blank screen, along with a complete lack of responsiveness to either mouse or keyboard.

Usually there was no obvious associated entry in any system log, but eventually a series of messages appeared in the syslog file, starting with this one:

Jul 17 13:55:05 homebrew kernel: [24064.296254] nouveau E[ PFIFO][0000:01:00.0] write fault at 0x000029d000 [PTE] from GR/GPC0/GPCCS on channel 0x003fbad000 [Xorg[2071]] 

This was followed by several more messages that appeared to be related, the last of which was:

Jul 17 13:58:23 homebrew kernel: [24262.075187] nouveau E[ DRM] GPU lockup - switching to software fbcon

(On this occasion, although the desktop was non-responsive after the first message, I could still ssh into the machine, and shut it down cleanly from the ssh session, which is why there is a period of several minutes between these two messages.)

This suggested that the cause lay in the nouveau video driver, so I decided to switch to the proprietary NVIDIA driver. This turned out not to be as easy as one might expect, since there didn't seem to be a single place that defined the complete procedure in detail. Hence this post.

Here are the steps that I followed:

1. Install the nvidia-driver package.

2. Install the nvidia-xconfig package.

3. Run nvidia-xconfig.

This complained about the lack of an xorg.conf file, but generated a default one with an nvidia entry for the driver.  There were several other errors, but rebooting at this point resulted in a system that booted and ran X.
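For reference, the whole procedure amounts to something like the following (assuming root privileges, and that the contrib and non-free archive areas are already enabled in /etc/apt/sources.list):

  apt-get install nvidia-driver
  apt-get install nvidia-xconfig
  nvidia-xconfig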

So far so good, but during the boot sequence I noticed that the text on the system console was enormous. Similarly, if I switched to the console once the system had booted, the text appeared to be about 80x24, which is quite obnoxious on a 27-inch monitor.

Following the instructions at:

https://wiki.archlinux.org/index.php/GRUB/Tips_and_tricks#Setting_the_framebuffer_resolution

I added two lines to the file /etc/default/grub:

GRUB_GFXMODE=1280x1024x16,1024x768,auto 
GRUB_GFXPAYLOAD_LINUX=keep

DO NOT DO THIS.

After executing
  grub-mkconfig -o /boot/grub/grub.cfg
and rebooting, although the text on the console looked much better, I no longer had any X-based desktop. Switching to :0 merely gave me a blank screen. So I restored the grub.cfg file to the original version.

The page at the above URL also documents a deprecated mechanism for setting the console resolution, and that is what I ended up using. In particular, I changed one line of the /etc/default/grub file to read:

GRUB_CMDLINE_LINUX_DEFAULT="quiet vga=794"

and executed: 
  grub-mkconfig -o /boot/grub/grub.cfg

According to this documentation, this gives a 1280x1024 16-bit console, which is a somewhat lower resolution than I had with the nouveau driver, but is vastly better than the resolution without this line in the grub configuration file.
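In case the magic number looks opaque: the kernel's legacy vga= parameter is the decimal form of the VESA mode number plus 0x200, and (per the usual VESA mode tables) mode 0x11A is 1280x1024 with 16-bit colour:

$$ \mathtt{0x11A} + \mathtt{0x200} = \mathtt{0x31A} = 282 + 512 = 794 $$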

Now everything is working to my satisfaction. The only quirk I see is that at boot time, there is a LOT of disk activity for about 30 seconds after the desktop starts. I'm not sure what the reason for this might be, but at the end of it I have a fully-functioning system with a KDE desktop on :0 and i3 on :1, and can switch to a reasonable-looking console at will.

The best news is that, at least so far, I have experienced no system crashes since switching to the proprietary driver.



2017-07-24

Most-Logged Stations in CQ WW CW 2016

The public CQ WW CW logs allow us easily to tabulate the stations that appear in the largest number of entrants' logs. For 2016, the ten stations with the largest number of appearances were:

Callsign Appearances % logs
HK1NA 10,277 59
9A1A 9,921 63
TK0C 9,682 61
CN2R 9,675 63
PJ2T 9,569 53
CR3W 9,354 60
LZ9W 9,091 59
CN2AA 8,912 59
P33W 8,661 56
EF8R 8,409 57

The first column in the table is the callsign. The second column is the total number of times that the call appears in other stations' logs. That is, if a station worked HK1NA on six bands, that will increment the value in the second column of the HK1NA row by six. The third column is the percentage of logs that contain the callsign at least once.
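Generating such a table from the cleaned logs (described in the entry below) is straightforward. Here is a minimal sketch in Python; the function name is mine, and it assumes the CQ WW Cabrillo field order, with the logging station's call as the sixth whitespace-separated field of each QSO: line and the worked station's call as the ninth:

  from collections import Counter

  def most_logged(path, top=10):
      # Tally how often each callsign appears in other stations' logs.
      # Working a station on six bands increments its count by six; the
      # percentage column counts each entrant's log at most once per call.
      appearances = Counter()
      logs_with_call = Counter()
      seen = set()
      entrants = set()
      with open(path) as f:
          for line in f:
              fields = line.split()
              if len(fields) < 9 or fields[0] != 'QSO:':
                  continue
              logger, worked = fields[5], fields[8]
              entrants.add(logger)
              appearances[worked] += 1
              if (logger, worked) not in seen:
                  seen.add((logger, worked))
                  logs_with_call[worked] += 1
      for call, n in appearances.most_common(top):
          pct = 100 * logs_with_call[call] / len(entrants)
          print(f'{call:10} {n:7,} {pct:3.0f}')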

Tables for prior years are available here.

We can also list the cumulative data for the ten-year span from 2007 to 2016:

Callsign Appearances % logs
LZ9W 91,382 67
PJ2T 84,001 58
9A1A 81,583 61
DF0HQ 80,298 63
PJ4A 72,262 55
D4C 71,571 49
W3LPL 69,447 52
LX7I 69,254 55
K3LR 68,908 53
CR3L 64,426 48

2017-07-17

Additional Information in Augmented Logs for CQ WW, 2005 to 2016

Now available are new augmented versions of the public logs for CQ WW CW and SSB for the period 2005 to 2016.

The cleaned logs are the result of processing the QSO: lines from the entrants' submitted Cabrillo files to ensure that all fields contain valid values and all the data match the format required in the rules. Any line containing illegal data in a field (for example, a zone number greater than 40, or a date/time stamp that is outside the contest period) has simply been removed. Also, only the QSO: lines are retained, so that each line in the file can be processed easily. The MD5 checksum for the file of cleaned logs is: 1b47059d1f2431b55d89a5eb954a05cc.
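To give a flavour of what "cleaning" means in practice, here is a sketch of the kind of test applied to each line (the field positions assume the CQ WW Cabrillo layout; the actual cleaning code may differ in detail):

  from datetime import datetime

  def line_is_clean(line, contest_start, contest_end):
      # Keep a QSO: line only if both zone fields are legal (1 to 40)
      # and the date/time stamp lies within the contest period.
      fields = line.split()
      if len(fields) < 11 or fields[0] != 'QSO:':
          return False
      try:
          stamp = datetime.strptime(fields[3] + ' ' + fields[4],
                                    '%Y-%m-%d %H%M')
          zones_legal = all(1 <= int(z) <= 40
                            for z in (fields[7], fields[10]))
      except ValueError:
          return False
      return zones_legal and contest_start <= stamp <= contest_end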

The augmented logs contain the same information as the cleaned logs, with the addition of some useful information on each line. The MD5 checksum for the compressed (~800 MB) file of augmented logs is: 7a728987fb8637ab8c156df3fa27d582. The information added to each line now includes two new fields: the callsign copied by the second party in the case that the second party bust the call of the first party; and the correct callsign of the second party in the case that the first party bust the second party's call.

In all, the additional fields in the augmented file comprise:
  1. The letter "A" or "U" indicating "assisted" or "unassisted"
  2. A four-digit number representing the time of the contact in minutes, measured from the start of the contest. (I realise that this can be calculated from the other information on the line, but it saves a lot of time to have the number readily available in the file without having to calculate it each time. A sketch of the calculation appears immediately after this list.)
  3. Band
  4. A set of eleven flags, each -- apart from column k -- encoded as T/F: 
    • a. QSO is confirmed by a log from the second party 
    • b. QSO is a reverse bust (i.e., the second party appears to have bust the call of the first party) 
    • c. QSO is an ordinary bust (i.e., the first party appears to have bust the call of the second party) 
    • d. the call of the second party is unique 
    • e. QSO appears to be a NIL 
    • f. QSO is with a station that did not send in a log, but who did make 20 or more QSOs in the contest 
    • g. QSO appears to be a country mult 
    • h. QSO appears to be a zone mult 
    • i. QSO is a zone bust (i.e., the received zone appears to be a bust)
    • j. QSO is a reverse zone bust (i.e. the second party appears to have bust the zone of the first party)
    • k. This entry has three possible values rather than just T/F:
      • T: QSO appears to be made during a run by the first party
      • F: QSO appears not to be made during a run by the first party
      • U: the run status is unknown because insufficient frequency information is available in the first party's log 
  5. If the QSO is a reverse bust, the call logged by the second party; otherwise, the placeholder "-"
  6. If the QSO is an ordinary bust, the correct call that should have been logged by the first party; otherwise, the placeholder "-"
  7. If the QSO is a reverse zone bust, the zone logged by the second party; otherwise, the placeholder "-"
  8. If the QSO is an ordinary zone bust, the correct zone that should have been logged by the first party; otherwise, the placeholder "-"
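
As promised above, a sketch of the field-2 calculation (the function name and the use of Python's datetime module are my own):

  from datetime import datetime

  def minutes_from_start(date_str, time_str, contest_start):
      # date_str and time_str are the Cabrillo date and time fields,
      # e.g. '2016-11-26' and '0013'; contest_start is a datetime for
      # 0000 UTC on the first day of the contest.
      t = datetime.strptime(date_str + ' ' + time_str, '%Y-%m-%d %H%M')
      return int((t - contest_start).total_seconds()) // 60

  # 13 minutes into the 2016 CW contest:
  print(minutes_from_start('2016-11-26', '0013', datetime(2016, 11, 26)))
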
Notes:
  • The encoding of some of the flags requires subjective decisions to be made as to whether the flag should be true or false; consequently, and because CQ has yet to understand the importance of making their scoring code public, the value of a flag for a specific QSO line in some circumstances might not match the value that CQ would assign. (Also, CQ has more data available in the form of check logs, which are not made public.)
  • I made no attempt to deduce the run status of a QSO in the second party's log (if such exists), regardless of the status in the first party's log. This allows one cleanly to perform correct statistical analyses anent the number of QSOs made by running stations merely by excluding QSOs marked with a U in column k.
  • No attempt is made to detect the case in which both participants of a QSO bust the other station's call. This is a problematic situation because of the relatively high probability of a false positive unless both stations log the frequency as opposed to the band. (Also, on bands on which split-frequency QSOs are common, the absence of both transmit and receive frequency is a problem.) Because of the likelihood of false positives, it seems better, given the presumed rarity of double-bust QSOs, that no attempt be made to mark them.
  • The entries for the zones in the case of zone or reverse zone busts are normalised to two-digit values.

On Reporting Bust Rates in Contests

The usual metric used for comparing the bust rates of different stations is very simple: if a station makes $N$ QSOs, of which $B$ are detected as busts (usually by comparing the station's log with the logs submitted by other competitors), then the bust rate $R$ is given simply by:
$$R = B / N$$
Admirably simple though this is, it suffers from at least two defects that seem to me to be fatal.

1. Allowing for the number of checkable QSOs


Suppose that we have two stations, 1 and 2, and we wish to compare their bust rates. Suppose that both stations make 100 QSOs, and suppose further that they both bust 5 QSOs (as determined by an inspection of the logs of other competitors); then the usual inference is that the bust rate for the two stations is the same.

But one cannot conclude that, because the usual equation does not take into account the fact that, of the 100 QSOs made by each of the two stations, a very different number might be checkable against logs submitted to the contest sponsor in the two cases.

Suppose, for example, that all 100 of the QSOs made by the first station can be cross-checked against submitted logs, but only 50 of the QSOs made by the second station can be so checked. Then the first station bust 5 of 100 checkable QSOs (5%), while the second bust 5 of just 50 checkable QSOs (10%): the second station is likely not to be as good at copying calls as the first. A better measure of the bust rate is therefore obtained by using $N_V$, the number of verifiable QSOs, in place of the raw number $N$:
$$R = B / N_V$$
We can see whether this affects the ordering of, for example, the stations with the most busts in the CQ WW contest.

If we look at the stations with the most busts in the 2016 CQ WW CW contest, using the normal formula for calculating the percentage of busts, we find:

2016 CW -- Most Busts
Position Call QSOs Busts % Busts
1 PV8ADI 2,057 217 10.5
2 TK0C 12,521 171 1.4
3 LU2WA 2,048 169 8.3
4 TM1A 6,255 156 2.5
5 HG5F 3,062 147 4.8
6 HK1NA 14,037 145 1.0
7 CN2R 12,351 137 1.1
8 LZ9W 10,613 136 1.3
9 PI4CC 7,736 119 1.5
10 NP2P 3,842 116 3.0

Using the revised formula, this becomes:

2016 CW -- Most Busts
Position Call Verified QSOs Busts % Busts
1 PV8ADI 1,666 217 13.0
2 TK0C 9,664 171 1.8
3 LU2WA 1,596 169 10.6
4 TM1A 5,307 156 2.9
5 HG5F 2,638 147 5.6
6 HK1NA 10,108 145 1.4
7 CN2R 9,544 137 1.4
8 LZ9W 8,946 136 1.5
9 PI4CC 6,706 119 1.8
10 NP2P 3,162 116 3.7

2. Allowing for sampling distortions


The second issue with the normal calculation is more subtle, and is perhaps most easily seen with an example. In order to make the problem clearer, we will use an extreme example, but it should readily be seen that the problem exists to a lesser degree when the situation is less extreme.

Suppose we have two stations, the first of which makes 100 verified QSOs and busts no calls. Obviously, his verified bust rate is 0%. Now suppose that the second station makes 1,000 verified QSOs and also busts no calls. The problem now should be obvious: both stations have the same bust rate, and yet it is clear that the second operator is almost certainly better at copying than the first. We need a way to deal with this kind of situation.

The solution is to realise what we are trying to calculate, and how it relates to the data available to us. The real goal of calculating the bust percentage is to produce a measurement (or, rather, an estimate) of the rate at which an operator makes mistakes in copying calls.

So we make the simplifying assumption that each operator has a fixed probability $p$ of busting any single call. (If you think about it, this is not quite as silly as it might seem: although the value of $p$ will doubtless vary under different reception conditions even for a single operator, all those variations can be absorbed by treating $p$ as a kind of mean value that reflects the conditions prevailing at the operator's location. This adjustment will vary from location to location, but that does not really matter, since we aren't trying to estimate some kind of ideal bust rate for a perfect location, but, rather, the bust rate that actually prevails for each operator at that operator's location.) Now we can easily analyse how to compare the accuracy of competing operators.

Suppose that an operator has a probability $p_B$ of busting a call (and, hence, a probability $q = (1 - p_B)$ of copying a call correctly). Now if the operator makes $N_V$ verified QSOs, then the probability of there being a particular number of busts $B$ is given by the binomial distribution:

$$ {N_V \choose B} (p_B)^B (q^{(N_V - B)}) $$

where:

$$ {N_V \choose B} = { {N_V}! \over {B!(N_V-B)!} }$$

Call the actual number of busts $B_V$; then we have:

$$ \mathrm{prob}(B_V) = {N_V \choose B_V} (p_B)^{B_V} (q)^{(N_V - B_V)} $$

Now, since we know that $B_V$ busts were measured out of a total of $N_V$ QSOs, we can determine what the distribution of $p_B$ looks like.

(From this point on, we will drop the ${}_V$ suffix, since we will take it as read that we are discussing verified numbers.)
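For concreteness, here is one way to compute that distribution and the confidence limits quoted later in this post (my own sketch, not the code used for the plots): if we adopt a uniform prior for $p$, the normalised curve is the density of a Beta($B + 1$, $N - B + 1$) distribution, and the limits are its quantiles. In Python, with scipy:

  from scipy.stats import beta

  def bust_rate_limits(busts, verified, conf=0.99):
      # Distribution of p after observing `busts` busts in `verified`
      # checkable QSOs, assuming a uniform prior: Beta(B+1, N-B+1).
      tail = (1 - conf) / 2
      posterior = beta(busts + 1, verified - busts + 1)
      return posterior.ppf(tail), posterior.ppf(1 - tail)

  # PV8ADI: 217 busts in 1,666 verified QSOs
  print(bust_rate_limits(217, 1666))   # roughly (0.11, 0.15)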

Looking at the table above, and plotting the relative probability of obtaining the actual number of busts as a function of $p$ for the ten stations listed, we find:



We then normalise each curve so that the area under each is the same:



A couple of things are apparent from this plot:
  1. We can see that there is a non-negligible overlap between the curves for PV8ADI and LU2WA. This tells us that, despite the apparent substantial difference in the measured error rates in the logs of the two stations, in fact there is a not-completely-negligible probability that the base error rate for LU2WA is actually higher than that for PV8ADI. (We could calculate the actual probability, but at this point I just want to point out that it's obviously of the order of a few percent.)
  2. Although the original rates for PI4CC and TK0C are essentially identical (and, hence, the peaks of the two curves occur at the same value of $p$), the curve for PI4CC is slightly broader than the curve for TK0C. This is a reflection of the fact that TK0C had more QSOs, and corresponds to the observation above about the two hypothetical stations that had no errors but who logged different numbers of QSOs.
Let us take a brief diversion prior to taking the next step.

Suppose that we somehow knew that a station $S$ had a probability of 0.1 of busting each QSO. And let us suppose that $S$ makes a total of 1,000 QSOs. Then it is easy to plot the probability distribution of the number of QSOs that $S$ would bust over the course of the contest (this is just the situation in the first equation above):
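Such a plot takes only a few lines to generate; a sketch, assuming scipy and matplotlib are available:

  import numpy as np
  import matplotlib.pyplot as plt
  from scipy.stats import binom

  n, p = 1000, 0.1
  k = np.arange(60, 150)            # plausible numbers of busts
  plt.plot(k, binom.pmf(k, n, p))
  plt.xlabel('number of busts')
  plt.ylabel('probability')
  plt.show()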


Now let us change the situation slightly. Suppose that we know that the probability of $S$ busting a QSO is either 0.09 or 0.11, but we don't know which of these it is, and each is equally likely. We can see that the difference in the probability curve for the expected number of busts is significant (the black curve is for probability = 0.09, the red for probability = 0.11):



With this information to hand, what is our best guess for the actual probability curve -- where, by "best guess" we mean "minimising the error"? Now, we know that the actual curve will be either the black curve or the red one, but since we don't know which it will be, our best guess will be the mean of the two -- i.e., the green curve. (You might remember from secondary school statistics that this is called the "expectation" -- a name that can be a bit confusing, since we expect this curve to be wrong! Similarly, a statistician will tell you that the "expectation" or "expected value" when rolling a fair 6-sided die is 3.5.)
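In code, the green curve is just the pointwise mean of the two candidate curves (continuing the sketch above):

  import numpy as np
  from scipy.stats import binom

  k = np.arange(60, 150)
  black = binom.pmf(k, 1000, 0.09)
  red = binom.pmf(k, 1000, 0.11)
  green = 0.5 * (black + red)       # the expectation over the two cases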

Now suppose that we know that the probability of a bust lies somewhere between 0.09 and 0.11, with a uniform distribution. Then we have:


where the white represents all the curves with a bust probability between 0.09 and 0.11, the black represents the expectation values taking into account just the two extreme curves, for bust probabilities of 0.09 and 0.11, and the green is the expectation curve for the entire range of probabilities between 0.09 and 0.11, uniformly distributed.

This is almost the situation that pertains in the case we are examining, with the exception that the curves we saw before we started this digression show that the values of $p$ are not distributed uniformly over a range. (To a good approximation, they are gaussian, at least for the higher values of the ratio $B / N$, although we won't take advantage of that approximation.)

So, if we take the example of PV8ADI, we can create a plot that shows the relative probability of obtaining a particular number of busts, given the distribution of probabilities $p$ that lead to the observed number of busts, $B$, in a total of $N$ QSOs. (If you think about it, you might be able to see that what we are doing here is to account for a second order perturbation on the first-order results. The size of this perturbation increases as the ratio $B/N$ decreases, as does the asymmetry of the first-order curve. The general effect will be to smear the first-order results to become more spread out.)

For PV8ADI, we find:

where the black line is the basic binomial distribution for PV8ADI, with probability 217 / 1666, normalised to the number of busts for 1,000 QSOs. The green line is a similarly normalised line that takes into account all the binomial distributions for the various values of $p$, weighted by the probability of each value of $p$. As predicted, we see that the green line is more spread-out than the original black line.
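The green curve can be reproduced by weighting a binomial distribution for each candidate value of $p$ by the probability of that value of $p$, as described above; a numerical sketch (again assuming a uniform prior for $p$):

  import numpy as np
  from scipy.stats import beta, binom

  B, N, M = 217, 1666, 1000   # PV8ADI busts, verified QSOs; normalise to 1,000
  k = np.arange(300)

  # First-order (black) curve: binomial with the point estimate B/N.
  black = binom.pmf(k, M, B / N)

  # Second-order (green) curve: average the binomials over the
  # distribution of p, here Beta(B+1, N-B+1) from a uniform prior.
  ps = np.linspace(0.01, 0.30, 1000)
  weights = beta.pdf(ps, B + 1, N - B + 1)
  weights /= weights.sum()
  green = binom.pmf(k[:, None], M, ps[None, :]) @ weights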

We can subject each of the stations to the same treatment, normalising all results to 1,000 QSOs:

So what can we deduce from all this?

To start with, rather than simply quoting some kind of "bust rate", a range should be quoted for each station, representing, say, the 99% confidence limit for the rate.

If we do that, and reorder the stations in order of decreasing upper limit (which seems like the most reasonable ordering: it means that we are 99.5% sure that the actual bust rate is less than this number), then we find:

2016 CW -- 99% confidence limits for $p_B$
Position Call lower limit upper limit
1 PV8ADI 0.110 0.153
2 LU2WA 0.088 0.127
3 HG5F 0.045 0.068
4 NP2P 0.029 0.046
5 TM1A 0.024 0.036
6 PI4CC 0.014 0.022
7 TK0C 0.015 0.022
8 LZ9W 0.012 0.019
9 HK1NA 0.012 0.018
10 CN2R 0.012 0.018

We can also perform more sophisticated comparisons between or among stations. For example, we might ask the question: what is the probability that LU2WA will have more busts than PV8ADI if both make 1,000 QSOs? [FYI, the answer is a little under 9%]
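Such probabilities are easy to estimate by Monte Carlo simulation over the two distributions (a sketch, using the verified totals from the table above):

  import numpy as np

  rng = np.random.default_rng(0)
  trials = 1_000_000
  # Draw plausible bust probabilities from each station's distribution,
  # then simulate 1,000 QSOs at each drawn probability.
  p_pv8adi = rng.beta(217 + 1, 1666 - 217 + 1, trials)
  p_lu2wa = rng.beta(169 + 1, 1596 - 169 + 1, trials)
  busts_pv8adi = rng.binomial(1000, p_pv8adi)
  busts_lu2wa = rng.binomial(1000, p_lu2wa)
  print((busts_lu2wa > busts_pv8adi).mean())   # about 0.09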

The important point here is that the single number that is usually quoted for a station's bust rate leaves much to be desired, and, in particular, may not be useful if one intends to use it to make comparisons to other stations' bust rates, unless both stations have a similar number of (verified) QSOs. A table, graph or chart is generally a much more useful guide when comparing bust rates between or amongst stations.