D. R. Evans (N7DR): January 2021

2021-01-29

Most-Logged Stations in CQ WW CW and SSB Contests: 2020, and the decade from 2011 to 2020

The public CQ WW CW and SSB logs allow us easily to tabulate the stations that appear in the largest number of entrants' logs. For 2020, the ten stations with the largest number of appearances in CQ WW SSB logs were:

Callsign	Appearances	% logs
LZ9W	9,916	53
YT5A	9,459	55
DF0HQ	9,248	53
II2S	7,000	46
ES9C	6,951	45
RL3A	6,890	46
EA8RM	6,795	43
LZ5R	6,591	44
UB7K	6,565	43
E7DX	6,459	41

The first column in the table is the callsign. The second column is the total number of times that the call appears in logs. That is, for example, if a station worked LZ9W on six bands, that will increment the value in the second column of the LZ9W row by six. The third column is the percentage of logs that contain the callsign at least once.
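
As a rough illustration of how the two numeric columns differ, here is a minimal Python sketch. It assumes the logs have already been reduced to a mapping from each entrant to the list of calls worked (one entry per QSO line). The function name `tabulate` and the entrant names are hypothetical, and this is not necessarily how the tables above were produced.

```python
from collections import Counter

def tabulate(logs):
    """logs: dict mapping entrant -> list of worked callsigns, one per QSO line.

    Returns a dict mapping callsign -> (appearances, percentage of logs).
    """
    appearances = Counter()      # every logged QSO counts
    logs_containing = Counter()  # each log counts at most once per callsign
    for worked in logs.values():
        appearances.update(worked)
        logs_containing.update(set(worked))
    n_logs = len(logs)
    return {call: (appearances[call], 100.0 * logs_containing[call] / n_logs)
            for call in appearances}

# Hypothetical data: ENTRANT1 worked LZ9W on two bands.
example = {
    "ENTRANT1": ["LZ9W", "LZ9W", "YT5A"],
    "ENTRANT2": ["LZ9W"],
    "ENTRANT3": ["YT5A"],
}
table = tabulate(example)
# LZ9W: 3 appearances, present in 2 of the 3 logs
```

Note that a sketch this simple counts every logged line, so a dupe in an entrant's log would increment the appearance count.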

Similarly, the ten stations with the largest number of appearances in CQ WW CW 2020 were:

Callsign	Appearances	% logs
CR3W	14,512	69
LZ9W	11,614	67
YT5A	11,190	65
TI7W	9,471	48
ZF1A	9,157	47
LN8W	8,993	55
CR6K	8,563	49
RM9A	8,476	51
LZ5R	8,332	54
OM7M	8,251	55

Note the substantial difference between the SSB and CW tables.

I find it interesting to see which stations have had the most long-term activity in the contests. For the ten years from 2011 to 2020 on SSB we find:

Callsign	Appearances	% logs
LZ9W	89,249	55
CN3A	81,518	53
DF0HQ	80,576	53
PJ2T	66,941	41
K3LR	66,338	46
OT5A	64,399	43
A73A	62,605	43
P33W	62,540	43
HG7T	60,871	44
TM6M	58,211	43

And for the same years on CW:

Callsign	Appearances	% logs
LZ9W	102,880	66
9A1A	97,305	61
PJ2T	88,148	52
P33W	78,611	52
DF0HQ	78,010	53
W3LPL	74,284	49
K3LR	70,112	48
LZ5R	69,355	53
PJ4A	67,404	48
ES9C	67,220	45

2021-01-16

New CQ WW Video Maps

I have updated the set of CQ WW video maps on my YouTube channel (channel N7DR). These video maps cover all the years for which public CQ WW logs are currently available (2005 to 2020).

To access individual videos directly:

The videos are created with time steps of ten minutes; when playing the video, each time step is displayed for five seconds. The videos are presented as animated GIF files, so they should display correctly without any specialised video software installed on your computer.

The videos assume that all communication is via the great-circle short path route, and include only inter-zone contacts. The width of the arcs is an absolute measure of the number of QSOs taking place over that path in the particular 10-minute segment. The colour of the arc reflects the relative number of QSOs taking place over the path. Each separate image (i.e., 10-minute segment) is normalized so that the path with the greatest number of QSOs is rendered in white. Paths with fewer QSOs are in progressively darker colours. Thus, arc colour should not be compared from one still image to another; arc width, however, is meaningful. The width of an arc in pixels is one plus the natural logarithm of the number of QSOs represented by the arc.
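
The width and colour rules can be sketched in Python as follows. The width formula is the one stated above. The linear brightness scale is only an assumption, since the text says no more than that the busiest path is white and less busy paths are progressively darker. The function names and the sample counts are hypothetical.

```python
import math

def arc_width(n_qsos):
    """Width in pixels: one plus the natural logarithm of the QSO count."""
    return 1 + math.log(n_qsos)

def relative_brightness(counts):
    """counts: dict mapping a path to its QSO count within one 10-minute frame.

    Returns a dict mapping each path to a value in (0, 1], where 1.0 is white.
    The linear scaling is an assumption; only the per-frame normalisation
    comes from the description above.
    """
    peak = max(counts.values())
    return {path: n / peak for path, n in counts.items()}

# Hypothetical frame: paths are (zone, zone) pairs.
frame = {(14, 5): 20, (3, 25): 5}
brightness = relative_brightness(frame)
```

Because `relative_brightness` is normalised frame by frame while `arc_width` is not, only the widths can be compared between frames.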

2021-01-15

Cleaned and Augmented Logs (including RBN data) for CQ WW CW and SSB Contests, 2005 to 2020

Cleaned and augmented versions of the logs for the CQ WW CW and SSB contests are now available for the period 2005 to 2020.

Links to the cleaned and augmented logs may be followed here.

The cleaned logs are the result of processing the QSO: lines from the entrants' submitted Cabrillo files to ensure that all fields contain valid values and all the data match the format required in the rules. Any line containing illegal data in a field (for example, a zone number greater than 40, or a date/time stamp that is outside the contest period) has simply been removed. Also, only the QSO: lines are retained, so that each line in the file can be processed easily. All zones are rendered with two digits, so as to further simplify processing by scripts or programs.
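
A minimal Python sketch of this kind of cleaning follows. It assumes the usual Cabrillo layout for CQ WW QSO: lines (frequency, mode, date, time, sent call, sent RST, sent zone, received call, received RST, received zone) and a 48-hour contest period. It checks only the zones and the timestamp, whereas the real cleaning validates every field. The function name `clean_qso_line` and the single-space output format are assumptions.

```python
from datetime import datetime, timedelta

CONTEST_LENGTH = timedelta(hours=48)  # assumed length of the contest period

def clean_qso_line(line, contest_start):
    """Return the line with two-digit zones, or None if it should be removed."""
    fields = line.split()
    if len(fields) < 11 or fields[0] != "QSO:":
        return None
    try:
        when = datetime.strptime(fields[3] + " " + fields[4], "%Y-%m-%d %H%M")
        zones = [int(fields[7]), int(fields[10])]
    except ValueError:
        return None  # malformed date, time, or zone
    if not (contest_start <= when < contest_start + CONTEST_LENGTH):
        return None  # timestamp outside the contest period
    if any(zone < 1 or zone > 40 for zone in zones):
        return None  # illegal zone number
    fields[7] = f"{zones[0]:02d}"
    fields[10] = f"{zones[1]:02d}"
    return " ".join(fields)

# Hypothetical line; the start time is an input, not something the sketch infers.
start = datetime(2020, 11, 28, 0, 0)
cleaned = clean_qso_line("QSO: 14000 CW 2020-11-28 0000 N7DR 599 4 LZ9W 599 20", start)
```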

The augmented logs contain the same information as the cleaned logs, but with the addition of some useful (derived) information on each line. In addition to the actual logs, two additional sources of information are used when appropriate:

AD1C has recently made accessible historical cty.dat and associated files. A copy of the cty.dat files is here. These allow us to use callsign-based multiplier lists as they would have existed at the time of each contest.
From 2009 onwards, the Reverse Beacon Network (RBN) has been available for the CW contests. This allows us to include the time since a station was last posted by the RBN (see below for details).

The information added to each line of the augmented logs comprises:

A sequence of four characters that are the same for each entry in a particular log:

- a. letter "A" or "U" indicating "assisted" or "unassisted"
- b. letter "Q", "L", "H" or "U", indicating respectively QRP, low power, high power or unknown power level
- c. letter "S", "M", "C" or "U", indicating respectively a single-operator, multi-operator, checklog or unknown operator category [ the contest organisers have stated that checklogs are not made public, but in fact at least some of them from the early years have been, hence the need for the "C" category ]
- d. character "1", "2", "+" or "U", indicating respectively that the number of transmitters is one, two, unlimited or unknown

A four-digit number representing the time of the contact in minutes measured from the start of the contest. (I realise that this can be calculated from the other information on the line, but it saves subsequent script-based processors of the file considerable time to have the number readily available in the file without having to calculate it for each QSO.)
Band
A set of fourteen flags, each -- apart from column k and column n -- encoded as T/F:
- a. QSO is confirmed by a log from the second party
- b. QSO is a reverse bust (i.e., the second party appears to have bust the call of the first party)
- c. QSO is an ordinary bust (i.e., the first party appears to have bust the call of the second party)
- d. the call of the second party is unique
- e. QSO appears to be a NIL
- f. QSO is with a station that did not send in a log, but who did make 20 or more QSOs in the contest
- g. QSO appears to be a country mult
- h. QSO appears to be a zone mult
- i. QSO is a zone bust (i.e., the received zone appears to be a bust)
- j. QSO is a reverse zone bust (i.e. the second party appears to have bust the zone of the first party)
- k. This entry has three possible values rather than just T/F:
  - T: QSO appears to be made during a run by the first party
  - F: QSO appears not to be made during a run by the first party
  - U: the run status is unknown because insufficient frequency information is available in the first party's log
- l. QSO is a dupe
- m. QSO is a dupe in the second party's log
- n. RBN information (see below)
If the QSO is a reverse bust, the call logged by the second party; otherwise, the placeholder "-"
If the QSO is an ordinary bust, the correct call that should have been logged by the first party; otherwise, the placeholder "-"
If the QSO is a reverse zone bust, the zone logged by the second party; otherwise, the placeholder "-"
If the QSO is an ordinary zone bust, the correct zone that should have been logged by the first party; otherwise, the placeholder "-"
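
Two of the added fields can be illustrated with a short Python sketch. It assumes the four category characters appear in the order a, b, c, d as listed above, and that the minutes field is zero-padded to four digits. The function names are hypothetical, and the exact column positions within an augmented line are not assumed here.

```python
from datetime import datetime

ASSISTANCE = {"A": "assisted", "U": "unassisted"}
POWER = {"Q": "QRP", "L": "low power", "H": "high power", "U": "unknown"}
OPERATOR = {"S": "single-operator", "M": "multi-operator",
            "C": "checklog", "U": "unknown"}
TRANSMITTERS = {"1": "one", "2": "two", "+": "unlimited", "U": "unknown"}

def decode_category(code):
    """Decode the four-character per-log category sequence."""
    if len(code) != 4:
        raise ValueError("category code must be exactly four characters")
    return {
        "assistance": ASSISTANCE[code[0]],
        "power": POWER[code[1]],
        "operator": OPERATOR[code[2]],
        "transmitters": TRANSMITTERS[code[3]],
    }

def minutes_from_start(qso_time, contest_start):
    """Minutes since the start of the contest, as a four-digit string."""
    minutes = int((qso_time - contest_start).total_seconds() // 60)
    return f"{minutes:04d}"

category = decode_category("UHS1")
offset = minutes_from_start(datetime(2020, 11, 29, 1, 30),
                            datetime(2020, 11, 28, 0, 0))
```

Here `category` describes an unassisted, high-power, single-operator entry with one transmitter, and `offset` is the string "1530" (24 hours plus 90 minutes).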

RBN Information

In the CW contests from 2009 onwards, the RBN was active, automatically spotting the frequency at which any station calling CQ was transmitting. To reflect possible use of RBN information, the augmented files now include a fourteenth flag. For the sake of uniformity, this column is present in all the augmented files, regardless of whether the RBN actually contributed useful information to a particular contest.

Each QSO has one of several characters in the fourteenth column of flags. These characters should be interpreted as follows:

'-'
No useful RBN-derived information is available for this QSO.

'0'
The worked station (i.e., the second call on the log line) appears to have begun to CQ on this frequency within (roughly) 60 seconds prior to the QSO.

'A' to 'Z'
For the nth letter of the alphabet: the worked station appears to have been CQing on this frequency for (roughly) n minutes prior to the QSO.

'+'
The worked station appears to have been CQing for more than 26 minutes on this frequency.

'<'
Because the RBN is distributed, and because each contest entrant station has its own clock, there is generally a skew between the reading of the clock of the station making the QSO and the timestamp from the RBN at which it believes a posting was made (indeed, it's unclear from the RBN's [lack of] documentation exactly how the timestamp on an individual RBN posting is to be interpreted). If the character '<' appears in the RBN column, it indicates that the raw values of the clocks suggest that the QSO took place up to two minutes before the RBN reported the worked station commencing to CQ at this frequency. When this occurs, the most likely interpretation is that there is non-negligible skew between the two clocks, and the station was actually worked almost as soon as a CQ was posted by the RBN. This character also appears if the RBN erroneously posts the worked station as CQing at this frequency shortly after the QSO. But it might also mean that the entrant was simply lucky and found the CQing station just as it fired up on a new frequency.
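
The encoding of the RBN column can be decoded with a few lines of Python. This is a minimal sketch; the function name `decode_rbn_flag` and the returned labels are hypothetical choices made for illustration.

```python
def decode_rbn_flag(flag):
    """Decode the character in the RBN column of the flags.

    Returns (kind, minutes), where minutes is the approximate time for which
    the worked station had been CQing on the frequency, or None if no single
    value applies.
    """
    if flag == "-":
        return ("none", None)         # no useful RBN-derived information
    if flag == "0":
        return ("cq", 0)              # began to CQ within roughly 60 seconds
    if len(flag) == 1 and "A" <= flag <= "Z":
        return ("cq", ord(flag) - ord("A") + 1)  # nth letter: roughly n minutes
    if flag == "+":
        return ("cq_over_26", None)   # CQing for more than 26 minutes
    if flag == "<":
        return ("before_post", None)  # QSO up to two minutes before the posting
    raise ValueError(f"unexpected RBN flag: {flag!r}")

# 'C' is the third letter, so roughly three minutes of CQing.
kind, minutes = decode_rbn_flag("C")
```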

Notes:

The encoding of some of the flags requires subjective decisions to be made as to whether the flag should be true or false; consequently, and because CQ has yet to understand the importance of making their scoring code public, the value of a flag for a specific QSO line in some circumstances might not match the value that CQ would assign. (Also, CQ has more data available in the form of check logs, which are generally not made public.)
I made no attempt to deduce or infer the run status of a QSO in the second party's log (if such exists), regardless of the status in the first party's log. This allows one cleanly to perform correct statistical analyses anent the number of QSOs made by running stations merely by excluding QSOs marked with a U in column k.
No attempt is made to detect the case in which both participants of a QSO bust the other station's call. This is a problematic situation because of the relatively high probability of a false positive unless both stations accurately log the frequency as opposed to merely the band. (Also, on bands on which split-frequency QSOs are common, the absence of both transmit and receive frequency is a problem; I confess that I have never understood why Cabrillo was not designed to report both transmit and receive frequencies -- or even to define clearly which frequency is to be reported. I digress.) Because of the likelihood of false positives, it seems better, given the presumed rarity of double-bust QSOs, that no attempt be made to mark them.
The entries for the zones in the case of zone or reverse zone busts are normalised to two-digit values.

2021-01-02

Reverse Beacon Network Activity: 2009-2020

I here show various plots of the grid-based scatter metric, G(15, 100), for the Reverse Beacon Network (RBN), using data from the inception of the RBN up to the end of 2020.

As in the past I note that a reasonable a priori case can be made on the basis of propagation characteristics that somewhat different metrics in the G(Δ, n) series might be better representations of RBN coverage on some of the bands. However, rather than make this into a full-scale research project, I shall here simply continue to use the G(15, 100) metric on the basis that it seems "good enough" on all bands.

RBN Posting Stations as a Function of Time

We begin by looking simply at how the number of per-band posters to the RBN has varied since the RBN's inception. (NB Throughout this post, we ignore posters for which the location is not recorded by the RBN; plots for which the abscissa is time show one datum per month.)

First, a plot of the total number of posters as a function of time:

This can be more compactly represented, along with similar per-band data for 160m through 10m (excluding 60m):

G(15, 100) as a Function of Time

Turning now to the geographical distribution of the posting stations, we can display the mensal values of G(15, 100) in a similar manner:

These figures seem to make quite clearly the rather depressing point that, with the exception of 2020 (an exceptional year because of the prevalence of COVID-19), there has been no substantive or sustained increase since early 2017 in either the number or the geographical distribution of the stations posting to the RBN. It will be interesting to see what happens in 2021.

G(15, 100) as a Function of the Number of Posters

Finally, we can combine the mensal values of G(15, 100) and the number of posters. Firstly, including all bands:

The summary plot for these data is slightly different, as the ordinate is multi-valued for some values of the abscissa. So, in this summary plot, we take the mean value of G(15, 100) in bins of width equivalent to ten posters, and plot rectangles in the equivalent colours:
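
The binning step can be sketched in Python as follows. The sketch assumes that bins start at multiples of the bin width (0 to 9, 10 to 19, and so on), which the text does not specify. The function name `binned_means` and the sample values are hypothetical, not real RBN data.

```python
from collections import defaultdict

def binned_means(points, bin_width=10):
    """points: iterable of (number_of_posters, g_value) pairs, one per month.

    Returns a dict mapping the lower edge of each bin to the mean G value
    of the months that fall into that bin.
    """
    totals = defaultdict(lambda: [0.0, 0])  # bin edge -> [sum of G, count]
    for posters, g_value in points:
        edge = (posters // bin_width) * bin_width
        totals[edge][0] += g_value
        totals[edge][1] += 1
    return {edge: total / count
            for edge, (total, count) in sorted(totals.items())}

# Hypothetical monthly values: two months fall in the 100-109 bin.
means = binned_means([(101, 2.0), (108, 4.0), (115, 5.0)])
```

Averaging within each bin is what gives the summary plot a single ordinate value per bin, even where several months share nearly the same number of posters.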

All in all, a rather unhappy picture emerges, in which the RBN, after expanding and increasing coverage rather nicely for the better part of a decade, became essentially static in early 2017 and has effectively failed to expand numerically or in geographical coverage until the emergence of the pandemic in 2020. It will be interesting to see whether 2021 brings a return to stasis or whether the renewed slight improvement in coverage will continue.

2021-01-01

2020 RBN Data

All the postings to the Reverse Beacon Network in 2020, along with the postings from prior years, are now available in this directory.

Some simple annual statistics for the period 2009 to 2020 follow (the 2009 numbers cover only part of that year, as the RBN was instantiated partway through that year).

Total posts:

2009: 5,007,040
2010: 25,116,810
2011: 49,705,539
2012: 71,584,195
2013: 92,875,152
2014: 108,862,505
2015: 116,385,762
2016: 111,027,068
2017: 117,973,111
2018: 131,930,432
2019: 135,558,461
2020: 173,655,453

Total posting stations:

2009: 151
2010: 265
2011: 320
2012: 420
2013: 473
2014: 515
2015: 511
2016: 590
2017: 625
2018: 550
2019: 583
2020: 616

Total posted distinct callsigns:

2009: 143,724
2010: 266,189
2011: 271,133
2012: 308,010
2013: 353,952
2014: 398,293
2015: 433,197
2016: 375,613
2017: 356,461
2018: 361,058
2019: 337,246
2020: 369,580

Obviously, statistics that are considerably more comprehensive may be derived rather easily from the files in the directory.

Note that if you intend to use the database's reported signal strengths in an analysis, you should be sure that you understand the ramifications of what the RBN means by SNR.