2016-01-24

Plots from CQ WW logs

The availability of the CQ WW public logs covering the period 2005 to 2015 allows us to look at various trends over that period.

The number of logs shows that for both CW and SSB the total number of entrants has been more or less stable for the most recent years, following a period of increase:


Breaking it down between unassisted and assisted, we see:

The number of assisted entrants shows a remarkably constant increase from year to year, whereas the unassisted numbers are close to constant. It will be interesting to see whether the assisted lines cross the unassisted lines next year.

The geographical distribution of entrants on CW has changed little over the period:


As I have noted before, on SSB the percentage of logs from western EU has decreased somewhat, and the percentage from JA/HL and zone 28 has slightly increased.


Of more interest (well, I think so, anyway) than the numbers of logs is the number of QSOs in the logs:

This shows the perhaps not-unexpected reality that as the number of assisted logs has increased, the number of QSOs in the top 10 percent of logs (as determined by number of QSOs in the log) has steadily declined. The median number of QSOs (represented by the coloured near-horizontal lines) has varied little, with only SSB Assisted showing a clear change in recent years, although both CW and SSB Assisted show a more or less gentle trend downwards in the median number of QSOs in the logs.
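For readers who want to reproduce this kind of summary from their own per-log QSO counts, the following is a minimal sketch. The exact statistic plotted for the top 10 percent is not stated above, so the sketch reports both the mean of the top decile and the count at which a log enters it; the function name and the sample counts are purely illustrative.

```python
from statistics import mean, median


def summarize_qso_counts(qso_counts):
    """Summarize a list of per-log QSO counts for one contest and category.

    Assumption: "top 10 percent" means the logs ranked in the top tenth by
    number of QSOs; the statistic actually plotted may differ.
    """
    ranked = sorted(qso_counts, reverse=True)
    top_n = max(1, len(ranked) // 10)      # always keep at least one log
    top_decile = ranked[:top_n]
    return {
        "median": median(ranked),
        "top_decile_mean": mean(top_decile),
        "top_decile_threshold": top_decile[-1],  # smallest count in the top tenth
    }


# Made-up counts, for illustration only.
counts = [12, 40, 95, 150, 210, 320, 480, 700, 1500, 4200]
summary = summarize_qso_counts(counts)
```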

One unexpected fact emerges: the median log in CQ WW CW has consistently contained about a hundred more QSOs than has the median CQ WW SSB log.  I'm not sure why this should be, but it is a robust result.

Now, what about the claim that the CQ WW contest is actually gaining in popularity? The claim is often made, but (as far as I know) without any supporting evidence. How many people are actually getting on the air and making QSOs in the contest?

Different people may choose different values for the number of times a call has to appear in the logs of other stations before it counts as a participant. For example, if a call appears only once, it might well be a bust rather than a real participant. Even if a call appears, say, twenty times, that might merely reflect a busted spot that was logged by twenty people who did not check the call.

So we can produce a series of nearly-parallel lines, setting the required number of appearances at different values.
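The idea can be sketched as follows. This is only an illustration under stated assumptions: each QSO is reduced to a (logging call, worked call) pair, every appearance of a worked call is counted (rather than, say, the number of distinct logs containing it), and all callsigns other than my own are hypothetical.

```python
from collections import Counter


def count_participants(qsos, thresholds):
    """Count calls that appear at least N times in other stations' logs.

    qsos: iterable of (logging_call, worked_call) pairs.
    thresholds: iterable of minimum-appearance values N.
    Returns {N: number of calls appearing at least N times}.
    """
    appearances = Counter(worked for _logger, worked in qsos)
    return {
        n: sum(1 for count in appearances.values() if count >= n)
        for n in thresholds
    }


# Hypothetical data: G3ABD is probably a bust of G3ABC.
qsos = [
    ("N7DR", "G3ABC"), ("K1XYZ", "G3ABC"), ("W2DEF", "G3ABC"),
    ("N7DR", "G3ABD"),
    ("K1XYZ", "JA1QRS"), ("W2DEF", "JA1QRS"),
]
participants = count_participants(qsos, thresholds=(1, 2, 3))
```

Raising the threshold removes the probable bust first, which is why the curves for different thresholds tend to run nearly parallel.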

On CW, it makes no substantive difference which definition of participation one chooses: the number of participants is essentially unchanged from year to year. Any increase in recent years is marginal at best.


On SSB, there is indication of a slight decline in recent years.


Finally, it's of some interest to plot the year-by-year number of distinct QSOs made on each band. By "distinct" I mean that a QSO both sides of which are logged by entrants counts just once.
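One simple way to count distinct QSOs is sketched below. It assumes that two log entries describe the same QSO when they are on the same band and involve the same unordered pair of calls; it makes no attempt to handle busted calls or dupes, and it is not necessarily the matching used to produce the plots. The callsigns other than my own are hypothetical.

```python
from collections import Counter


def distinct_qsos_per_band(qsos):
    """Count distinct QSOs per band.

    qsos: iterable of (band, logging_call, worked_call) tuples.
    A QSO logged by both parties yields the same key from either log,
    so it is counted once.
    """
    seen = set()
    for band, call_a, call_b in qsos:
        seen.add((band, frozenset((call_a, call_b))))
    return Counter(band for band, _pair in seen)


# Hypothetical data: the first two entries are the two sides of one QSO.
qsos = [
    ("20m", "N7DR", "G3ABC"),
    ("20m", "G3ABC", "N7DR"),
    ("10m", "N7DR", "G3ABC"),
    ("20m", "N7DR", "JA1QRS"),
]
per_band = distinct_qsos_per_band(qsos)
```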

For CW:

And for SSB:



In both cases, one can see the importance of 10m in determining the total number of distinct QSOs. However, while 10m is the dominant band on SSB in years of good propagation, that is not true on CW: on CW, in years of good propagation, 10m, 15m, 20m and 40m provide essentially identical contributions to the total.

We also see that on CW, 20m and 40m provide nearly identical numbers throughout the cycle; on SSB, 20m and 15m have this characteristic.

The relative lack of dominance of 10m on CW is likely due to the timing of the two contests (indeed, I wish that both CQ WPX and CQ WW would swap the SSB and CW events each year; with the current calendar, CW suffers in both types of contest). By the end of November, the number of hours of common daylight between EU and NA is simply too small for 10m to be dominant.


"Nowhere to Run" available on Kindle



My book Nowhere to Run is now available for the Kindle. You may read a sample (from the hardcopy version).


2016-01-23

Obtaining Magnetometer Data

The Problem  

 

Until a few days ago, it was possible to obtain near-real-time ("NRT") magnetometer data files from directories housed below http://magweb.cr.usgs.gov/data/magnetometer/. Attempts to reach that location, and the subordinate files that held the actual NRT data, now return:

The requested service is temporarily unavailable. Please try later.

After waiting for several days for something more useful to be returned, I finally realised that "temporarily" here seems to be government-speak for "permanently".

There is still a way to obtain NRT magnetometer data, but it's vastly more complicated than simply downloading data from a URL. I do not know why the sites that make the data available do not hide all the complexity from users and simply provide a URL as an interface, but the fact is that, at least for now, they do not, and we now have to go through a rather tortuous process to obtain data that was, until a few days ago, essentially trivial to obtain.

Obtaining the data in the absence of a URL


These are the steps that I had to take on debian stable; other Linux distributions and other operating systems will likely differ in details, but the basic idea should be the same.

First, one needs to install python, since the data are accessed via a python script. Debian, like most Linux distributions, installs python by default.
Next, install the packages python-numpy, python-scipy and python-flake8 from the distribution repositories.

Now, install obspy by following the instructions at: https://github.com/obspy/obspy/wiki/Installation-on-Linux-via-Apt-Repository. If using a different distribution or operating system, follow the pertinent instructions at https://github.com/obspy/obspy/wiki.

Now that python and the necessary libraries are available, we need to download the actual software package that does the work; we get this from: https://github.com/usgs/geomag-algorithms.

I note that the installation instructions on the site read, in their entirety:
First time install. Walk through dependencies and other considerations.

If one looks inside the downloaded package, there is no sign of the standard INSTALL file. Neither is there a README or README.FIRST file. There is a README.md file, but its contents merely repeat the above "instructions" from the site. So I simply unpacked the package in a reasonable location and used it as-is. I am sure that that is not what one is supposed to do, but in the absence of explicit, detailed installation instructions, it will have to do.

After unpacking that software somewhere suitable, we have access to the script we need, which is geomag.py in the .../bin directory.

The documentation is... well, the politest word I can think of is "sparse". As unfortunately seems to be so often the case when trying to use government data, it is hard to escape the conclusion that they don't really want it to be used. They certainly don't seem to go to any effort to explain the details of the package in a way that is useful to users. So we're pretty much on our own. 

The command geomag.py -h offers a lot of output, most of it incomprehensible without further documentation. However, the following command (typed on one line) produces output in a format very similar to that stored at the URL that in the past provided one-minute Boulder magnetometer data for a particular day:

geomag.py --inchannels H D Z F --observatory BOU --starttime 2016-01-23T00:00:00Z --endtime 2016-01-23T23:59:00Z --input-edge cwbpub.cr.usgs.gov --output-iaga-file output-file-name
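If the command is to be run repeatedly (for different days or observatories), it can be convenient to build it programmatically. The sketch below uses only the options that appear in the command above; the function names are my own, and it assumes that geomag.py is on the PATH (otherwise pass the full path as `script`).

```python
import subprocess
from datetime import date


def build_geomag_command(observatory, day, output_path, script="geomag.py"):
    """Build the argument list for one day of one-minute H, D, Z, F data."""
    start = f"{day.isoformat()}T00:00:00Z"
    end = f"{day.isoformat()}T23:59:00Z"
    return [
        script,
        "--inchannels", "H", "D", "Z", "F",
        "--observatory", observatory,
        "--starttime", start,
        "--endtime", end,
        "--input-edge", "cwbpub.cr.usgs.gov",
        "--output-iaga-file", str(output_path),
    ]


def fetch_day(observatory, day, output_path, script="geomag.py"):
    """Run geomag.py; raises CalledProcessError if the script fails."""
    subprocess.run(
        build_geomag_command(observatory, day, output_path, script),
        check=True,
    )


# Reproduces the command shown above (without running it).
command = build_geomag_command("BOU", date(2016, 1, 23), "output-file-name")
```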

From this point one can process the file output-file-name appropriately, in a way similar to the manner in which the old URL-based data could be processed. For example, I have a widget on my desktop that displays the last couple of hours of data from the Boulder magnetometer, using the above method to obtain data every five minutes.
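As a rough illustration of such processing, the following sketch extracts the data rows from the output file. It assumes the file follows the usual IAGA-2002 text layout (header lines, then rows beginning with a date, a time and a day-of-year, followed by one value per channel), and that values of 88888 or above are missing-data markers; neither assumption has been checked against the actual output of geomag.py, and the sample rows are invented.

```python
import re

# Data rows are assumed to start with a date such as "2016-01-23 ".
DATA_LINE = re.compile(r"^\d{4}-\d{2}-\d{2} ")
MISSING_THRESHOLD = 88888.0  # assumed missing-value marker(s): 88888, 99999


def parse_iaga_lines(lines):
    """Return a list of (timestamp, [values]) from IAGA-2002-style text."""
    samples = []
    for line in lines:
        if not DATA_LINE.match(line):
            continue  # skip header and comment lines
        fields = line.split()
        timestamp = f"{fields[0]}T{fields[1]}"
        values = [
            None if float(text) >= MISSING_THRESHOLD else float(text)
            for text in fields[3:]  # fields[2] is the day of year
        ]
        samples.append((timestamp, values))
    return samples


def last_minutes(samples, minutes=120):
    """Keep the most recent samples (one-minute data assumed)."""
    return samples[-minutes:]


# Invented sample text, for illustration only.
sample_text = [
    " Format                 IAGA-2002                                    |",
    "DATE       TIME         DOY     BOUH      BOUD      BOUZ      BOUF   |",
    "2016-01-23 00:00:00.000 023     20795.10    -12.30  47650.20  52010.40",
    "2016-01-23 00:01:00.000 023     99999.00  99999.00  99999.00  99999.00",
]
samples = parse_iaga_lines(sample_text)
```

For a real file, one would pass the open file object instead of `sample_text`.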

 

CQ WW logs, 2005 - 2015

The logs for the 2015 running of the CQ WW CW contest are now available from CQ. This means that logs from the period 2005 to 2015 are now available for both SSB and CW.

As before, I have created a compressed file that contains all the cleaned QSO lines from the Cabrillo files from all the logs for all the years for which data are available. The MD5 checksum of this file is: 84864c3d7a7f80653243a35ae014f0dc.
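A downloaded copy can be checked against this value with a few lines of python; the sketch below reads the file in chunks so that a large file need not fit in memory. The function names and the file name in the comment are hypothetical.

```python
import hashlib

EXPECTED_MD5 = "84864c3d7a7f80653243a35ae014f0dc"  # value quoted above


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of the file at path."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected=EXPECTED_MD5):
    """True if the file's MD5 digest matches the expected value."""
    return md5_of_file(path) == expected.lower()


# Usage (hypothetical file name):
#   verify("cqww-qsos.xz")
```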

I have also created an augmented file, in compressed format, that adds useful data to each QSO. Each QSO line in the augmented file includes an additional four columns, with the following meanings:

  1. The letter "A" or "U" indicating "assisted" or "unassisted"
  2. A four-digit number representing the time of the contact in minutes measured from the start of the contest. (I realise that this can be calculated from the other information on the line, but it saves a lot of time to have the number readily available in the file without having to calculate it each time.)
  3. Band
  4. A set of eight flags, each encoded as T/F: 
    • a. QSO is confirmed by a log from the second party 
    • b. QSO is a reverse bust (i.e., the second party appears to have bust the call of the first party) 
    • c. QSO is an ordinary bust (i.e., the first party appears to have bust the call of the second party) 
    • d. the call of the second party is unique 
    • e. QSO appears to be a NIL 
    • f. QSO is with a station that did not send in a log, but who did make 20 or more QSOs in the contest 
    • g. QSO appears to be a country mult 
    • h. QSO appears to be a zone mult
The MD5 checksum of this file is: f2ab0dc3af3a4b5ed3f1846953c4d03b.
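To illustrate how the four additional columns might be read, here is a sketch that looks only at the last four whitespace-separated fields of a line. It assumes that the eight flags are written as a single eight-character token such as TFFFFFTF, in the order a to h; that layout, the flag names, and the sample line are assumptions for illustration and should be checked against the file itself.

```python
from typing import NamedTuple

# Names are illustrative; the order follows flags a to h in the list above.
FLAG_NAMES = (
    "confirmed", "reverse_bust", "ordinary_bust", "unique",
    "nil", "active_non_entrant", "country_mult", "zone_mult",
)


class Augmentation(NamedTuple):
    assisted: bool
    minutes_from_start: int
    band: str
    flags: dict


def parse_augmentation(line):
    """Parse the four trailing columns of an augmented QSO line."""
    status, minutes, band, flag_text = line.split()[-4:]
    if status not in ("A", "U"):
        raise ValueError(f"unexpected assisted/unassisted marker: {status!r}")
    if len(flag_text) != len(FLAG_NAMES) or set(flag_text) - set("TF"):
        raise ValueError(f"unexpected flag field: {flag_text!r}")
    flags = {name: char == "T" for name, char in zip(FLAG_NAMES, flag_text)}
    return Augmentation(status == "A", int(minutes), band, flags)


# Hypothetical line; the real column layout may differ.
sample = "QSO: 14023 CW 2015-11-28 0012 N7DR 599 04 G3ABC 599 14 U 0012 20 TFFFFFTF"
parsed = parse_augmentation(sample)
```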
  
Note that the flags in the augmented data are calculated from the raw data independently of CQ. This is because:
  1. for reasons I cannot guess, CQ does not make the actual scoring code available;
  2. the checklogs are not public, and hence represent additional data that CQ can use in determining the values of the flags. 
 

CQ WW Videos Updated

I have updated the set of CQ WW video maps on my YouTube channel (channel N7DR). These video maps cover all the years for which public logs are currently available.

To access individual videos directly:
   

 

2016-01-07

CQ WW SSB 2015 logs

The logs for the 2015 running of the CQ WW SSB contest are now available from CQ.

As usual, I have created a compressed file that contains all the cleaned QSO lines from the Cabrillo files from all the logs for all the years for which data are available. As the 2015 data for the CW running of the contest are not yet available, this file contains only the SSB data. Currently, the file covers all the QSOs in the years from 2005 to 2015. The MD5 checksum of this file is: b532ba3307583c379c826314b81a5d04.

I have also created an augmented file, in compressed format, that adds useful data to each QSO. Each QSO line in the augmented file includes an additional four columns, with the following meanings:

  1. The letter "A" or "U" indicating "assisted" or "unassisted"
  2. A four-digit number representing the time of the contact in minutes measured from the start of the contest. (I realise that this can be calculated from the other information on the line, but it saves a lot of time to have the number readily available in the file without having to calculate it each time.)
  3. Band
  4. A set of eight flags, each encoded as T/F: 
    • a. QSO is confirmed by a log from the second party 
    • b. QSO is a reverse bust (i.e., the second party appears to have bust the call of the first party) 
    • c. QSO is an ordinary bust (i.e., the first party appears to have bust the call of the second party) 
    • d. the call of the second party is unique 
    • e. QSO appears to be a NIL 
    • f. QSO is with a station that did not send in a log, but who did make 20 or more QSOs in the contest 
    • g. QSO appears to be a country mult 
    • h. QSO appears to be a zone mult
The MD5 checksum of this file is: 1e3f1862d4c8126a5e85639368941d3a.
  
Note that the flags in the augmented data are calculated from the raw data independently of CQ. This is because:
  1. for reasons I cannot guess, CQ does not make the actual scoring code available;
  2. the checklogs are not public, and hence represent additional data that CQ can use in determining the values of the flags.
Obviously, a huge number of analyses can be performed with these various files. There follow a few that interested me. 

Geographical Participation

How has the geographical distribution of entries changed over time? 

So there seem to be no major variations over time: the percentage of logs from western EU has decreased somewhat, and the percentage from JA/HL and zone 28 has slightly increased.

In terms of raw numbers of logs, almost everywhere with substantive participation has shown a reasonably consistent increase:

Popularity

I not-infrequently come across statements to the effect that contesting in general, and CQ WW in particular, are increasingly popular. Usually, no evidence for the statement is provided, as if it were self-evident; the only purported evidence I have seen is reference to the fact that the number of entries is increasing -- which is manifestly not the same as an increase in popularity.

By definition, popularity demands some measure of people (or, in our case, the simple proxy of callsigns). So we can look at the number of calls in the logs as a function of time:

I find this graph particularly interesting, not just because it shows that the popularity of CQ WW SSB appears to have peaked a few years ago, but also because it shows that the result is quite robust regardless of how many times one deems it necessary for a call to appear before that call is deemed a participant.

Activity

We can also look at the change in activity as a function of year. Activity depends on the number of people participating, and on how many QSOs those people make:

Here the phrase "distinct QSOs" is intended to convey that a QSO is counted once, even if both participants have contributed a log.

Not unexpectedly, this plot shows that the total number of QSOs is dominated by conditions, and, in particular, the state of 10m during the contest.

 













2016-01-02

2015 RBN data

All the postings to the Reverse Beacon Network in 2015, along with the postings from prior years, are now available in the directory https://copy.com/smnSEO4RoQuTj4mC.

Some simple annual statistics for the period 2009 to 2015 follow (the 2009 numbers cover only part of that year, as the RBN was instantiated partway through the year).

Total posts:
2009:   5,007,040
2010:  25,116,810
2011:  49,705,539
2012:  71,584,195
2013:  92,875,152
2014: 108,862,505
2015: 116,754,643

Total posting stations:
2009: 151
2010: 265
2011: 320
2012: 420
2013: 473
2014: 515
2015: 512

Total posted callsigns:
2009: 143,724
2010: 266,189
2011: 271,133
2012: 308,010
2013: 353,952
2014: 398,293
2015: 433,812

Obviously, much more comprehensive statistics may be derived rather easily from the files in the directory.
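Even the annual totals above support some simple derived figures. The sketch below computes year-over-year growth and the average number of posts per posting station, using only the numbers listed above; the function names are illustrative, and the 2009 figures should be treated with caution because they cover only part of that year.

```python
TOTAL_POSTS = {
    2009: 5_007_040, 2010: 25_116_810, 2011: 49_705_539, 2012: 71_584_195,
    2013: 92_875_152, 2014: 108_862_505, 2015: 116_754_643,
}
POSTING_STATIONS = {
    2009: 151, 2010: 265, 2011: 320, 2012: 420,
    2013: 473, 2014: 515, 2015: 512,
}


def year_over_year_growth(series):
    """Fractional change from each year to the next, keyed by the later year."""
    years = sorted(series)
    return {
        year: (series[year] - series[previous]) / series[previous]
        for previous, year in zip(years, years[1:])
    }


def posts_per_station(posts, stations):
    """Average number of posts per posting station in each year."""
    return {year: posts[year] / stations[year] for year in posts}


post_growth = year_over_year_growth(TOTAL_POSTS)
station_growth = year_over_year_growth(POSTING_STATIONS)
per_station = posts_per_station(TOTAL_POSTS, POSTING_STATIONS)

for year in sorted(post_growth):
    print(f"{year}: posts {post_growth[year]:+.1%}, "
          f"stations {station_growth[year]:+.1%}")
```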