2024-01-24

Statistics from 2023 CQ WW SSB and CQ WW CW logs

A huge number of analyses can be performed with the various public CQ WW logs (cq-ww-2005--2023-augmented.xz; see here for details of the augmented format) for the period from 2005 to 2023.

As in prior years, there follow a few basic analyses that interest me. There is, of course, plenty of scope to use the log files for further analyses, some of which are suggested by the figures below.

Below are some simple analyses of basic statistics from the logs. The 2023 versions of the contests showed a more-or-less full return to normal operation, following several years disrupted by COVID and the invasion of Ukraine by Russia. The latter, of course, is still under way, but its effect on the contest seems to be decreasing. And we finally had sunspots (Well, I assume we did for the SSB leg -- I was away for that contest; conditions in the CW leg were certainly an improvement). 

Number of Logs

Until 2020, the raw number of submitted logs for SSB had been relatively flat for several years; the logs submitted for CW showed a fairly steady annual increase. In 2020, unsurprisingly, the number of logs in both modes increased to a new record, probably because of the pandemic; CQ WW SSB 2021 set another record; on CW, the number of logs decreased slightly, but would still have been a record were it not for 2020. 2022 was another year of unusual circumstances: not only was the pandemic still in evidence in much of the world, but the Russian invasion of Ukraine, along with the CQ WW committee's vacillation on how to proceed in light of that invasion -- and then the protest against the committee's position as of the contest dates -- was always going to lead to a reduction in the number of submitted logs. In 2023, the numbers for both modes bounced back up somewhat, but they both still fell short of being a record, especially on CW.

One not infrequently reads statements to the effect that the popularity of contests such as CQ WW has long been increasing. This plot suggests that this claim had not been true for a number of years prior to 2020 (and even when it was true, there are alternative explanations for the year-on-year increase, such as increasing ease of electronic log submission). The circumstances for 2020, 2021 and 2022 have been so unusual that it would seem to be an error to regard them as in any way indicative of a trend. But 2023 does give some cause for hope that on SSB numbers have reached a new, higher plateau; CW seems to have reverted to the level of just before 2020.


Popularity

By definition, popularity requires some measure of people (or, in our case, the simple proxy of callsigns) -- there is no reason to believe, a priori, that the number of received logs as shown above is related in any particular way to the popularity of a contest, despite rather frequent conclusory statements to the contrary.

So we look at the number of calls in the logs as a function of time, rather than positing any kind of well-defined positively correlated relationship between log submission and popularity (actually, the posts I have seen don't even bother to posit such a relationship: they are silent on the matter, thereby simply seeming to presume that the reader will assume one). 

However, the situation isn't as simple as it might be, because of the presence of busted calls in logs. If a call appears in the logs just once (or some small number of times), it is more likely to be a bust than an actual participant. Where to set a cut-off a priori in order to discriminate between busts and actual calls is unclear; but we can plot the results of choosing several such values.
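As a rough illustration of the cut-off idea, the following Python sketch counts how many callsigns appear in at least n distinct logs, for several values of n. The function names, the input shape (pairs of logging call and worked call) and the sample callsigns are all assumptions made for the example; this is not the code used to produce the plots.

```python
from collections import defaultdict

def calls_by_log_count(qsos):
    """Map each worked callsign to the number of distinct logs it appears in.

    `qsos` is an iterable of (logging_call, worked_call) pairs; extracting
    those pairs from the log files is outside the scope of this sketch.
    """
    logs_containing = defaultdict(set)
    for logging_call, worked_call in qsos:
        logs_containing[worked_call].add(logging_call)
    return {call: len(logs) for call, logs in logs_containing.items()}

def active_calls(qsos, thresholds=(1, 2, 5, 10)):
    """For each threshold n, count the callsigns that appear in at least n logs."""
    counts = calls_by_log_count(qsos)
    return {n: sum(1 for c in counts.values() if c >= n) for n in thresholds}

# Invented data: K1ABC is logged by three entrants, K1ABD (a likely bust) by one.
sample = [
    ("W1AW", "K1ABC"),
    ("N7DR", "K1ABC"),
    ("G3XYZ", "K1ABC"),
    ("G3XYZ", "K1ABD"),
]
print(active_calls(sample, thresholds=(1, 2, 3)))
```

With a threshold of 1 both calls are counted; raising the threshold to 2 or more discards the probable bust.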

First, for SSB:

Regardless of how many logs a call has to appear in before we regard it as a legitimate callsign, the popularity of CQ WW SSB during the pandemic surely increased from the doldrums of the prior few years. Complicating the picture in the past couple of years is, of course, the reduction in participation that is (presumably) due to the Russian invasion of Ukraine. Whatever the cause, the number of calls certainly seems to be well down on the number at a similar point in the last solar cycle.


[I note that a plausible argument can be made that the number of uniques will be more or less proportional to the number of QSOs made (I have not tested that hypothesis; I leave it as an exercise for the interested reader to determine whether it is true), but there is no obvious reason why the same would be true for, for example, callsigns that appear in, say, ten or more logs. The interested reader might also consider basing a similar analysis on eXtended Super Check Partial files as created by the drscp program.]

Moving to CW:

On CW, we see that in 2022 the reduction due (presumably) to the Russian invasion of Ukraine has led to the number of active calls being the lowest of all the years for which data are available. In 2023 there was a slight correction, but the numbers are still well short of the numbers during the high-sunspot years of the last solar cycle.

 

Geographical Participation


How has the geographical distribution of entries changed over time?

Again looking at SSB first:


The number of entrants from zone 16 has increased since last year, but is still well down from historical levels. The number of logs submitted from zone 28 continues to show an increased level. Still, the number of logs from zones outside EU or the US continues to be very small. This can be seen more clearly if we plot the percentage of logs received from each zone as a function of time:


In 2022, entrants from zone 16 dropped from the historical value of around 10% to close to zero. The slack was taken up principally by the US zones and zones 11 and 25. In 2023, the percentages edged closer to historical norms.

On CW, most zones evidence a sustained long-term increase:


Again we see the expected drop in entries from zone 16 in the past couple of years, but other than that trends continue more or less as before, with the relative increase spread more or less evenly across all zones, with the percentages of logs from each zone barely changing except for the pandemic years:



It is, I think, of some interest that the change in participation in zone 28 that is obvious on SSB is only gradually making itself felt on CW. Zone 24 is gradually becoming more common, although it remains far behind the powerhouse that is zone 25.


Activity


Total activity in a contest depends both on the number of people who participate and on how many QSOs each of those people makes. We can use the public logs to count the total number of distinct QSOs in the logs (that is, each QSO is counted only once, even if both participants have submitted a log).
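One simple way to count each QSO only once is to key it on the unordered pair of callsigns plus the band, as in the Python sketch below. This is an assumption about how the de-duplication might be done, not a description of the actual analysis: it ignores busted calls and clock differences, and it also collapses dupes on the same band into a single QSO. The tuple layout and sample data are invented.

```python
def count_distinct_qsos(qsos):
    """Count QSOs, treating A-works-B and B-works-A on one band as one QSO.

    `qsos` is an iterable of (logging_call, worked_call, band) tuples.
    """
    seen = set()
    for logging_call, worked_call, band in qsos:
        # frozenset makes the pair order-independent
        seen.add((frozenset((logging_call, worked_call)), band))
    return len(seen)

# Invented data: the first two lines are the same QSO seen from both logs.
sample_qsos = [
    ("W1AW", "G3XYZ", "20m"),
    ("G3XYZ", "W1AW", "20m"),
    ("W1AW", "G3XYZ", "15m"),
    ("W1AW", "JA1ZZZ", "20m"),
]
print(count_distinct_qsos(sample_qsos))
```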

For SSB:

 

There were five years in the most recent solar cycle in which more QSOs were made than in 2023. It doesn't seem likely that the next five years will all result in more QSOs than 2023, but I guess we'll see.

And for CW:


 

Pretty much the same situation holds as on SSB. Possibly the greying of the amateur radio community is finally taking its toll. I hope not, but I can't say that these graphs bode well.

 

Running and Calling


On SSB, the ongoing gradual shift towards stations strongly favouring either running or calling, rather than splitting their effort between the two types of operation, finally appears to have reached some kind of equilibrium. There was essentially no change between 2018 and 2019, and even a (very) slight reversal of the trend in 2020 and 2021. 2022, however, for the first time saw more than 30% of entrants making no run QSOs at all, a situation that continued in 2023. In 2023, the percentage of stations making fewer than 10% of their QSOs in a run exceeded 60%.
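The percentages quoted here can be derived from per-QSO run flags (T, F or U for unknown). The Python sketch below is a minimal illustration under the assumption that the flags for each log have already been gathered into a sequence; the function names and the sample logs are hypothetical. QSOs with unknown run status are excluded before the fraction is calculated.

```python
def run_fraction(flags):
    """Fraction of QSOs made while running, ignoring QSOs of unknown status.

    `flags` is a sequence of 'T', 'F' or 'U' values, one per QSO in a log.
    Returns None if no QSO in the log has a known run status.
    """
    known = [f for f in flags if f in ("T", "F")]
    if not known:
        return None
    return known.count("T") / len(known)

def percentage_of_logs(logs, predicate):
    """Percentage of logs (with a known run fraction) that satisfy `predicate`."""
    fractions = [run_fraction(flags) for flags in logs.values()]
    fractions = [f for f in fractions if f is not None]
    if not fractions:
        return None
    return 100 * sum(1 for f in fractions if predicate(f)) / len(fractions)

# Invented logs: each string holds one run flag per QSO.
logs = {
    "LOG1": "FFFF",            # never runs
    "LOG2": "TTTF",            # mostly runs
    "LOG3": "F" * 10 + "T",    # one run QSO in eleven
    "LOG4": "UU",              # no frequency information, so excluded
}
no_run = percentage_of_logs(logs, lambda f: f == 0)
mostly_calling = percentage_of_logs(logs, lambda f: f < 0.10)
print(no_run, mostly_calling)
```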



I have not investigated the cause of the decrease in the percentage of stations strongly favouring running, although the public logs could readily be used to distinguish possibilities that spring to mind, such as more SO2R operation, more multi-operator stations, and/or a reluctance of stations to forgo the perceived advantages of spots from cluster networks. In any case, it certainly seems that SSB operators fall decisively into one of two camps: runners and callers (look at the quite astonishing bimodal distribution in the first of the two graphs above, with the vast majority nearly always calling other stations).

On CW, the split between callers and runners continues to be much less bimodal than on SSB (as mentioned above, on SSB, fully 30% of entrants have no run QSOs; on CW, the equivalent number is below 10%). Indeed, the difference in call/run behaviour on the two modes (and the difference in the way that the behaviour has changed over time) is profound, and probably worthy of further investigation. CW continues to appear to exhibit what would seem to be a much healthier split between the two operating styles:


 


Assisted and Unassisted


We can see how the relative popularity of the assisted and unassisted categories has changed since they were introduced:


 

On CW, there continue to be more or less equal numbers of assisted and unassisted logs, although a gap in favour of assisted operation slowly seems to be opening. On SSB the number of unassisted logs handily exceeds the number of assisted logs. My guess, for what it's worth, is that CW assistance is more widespread partly because it (partially) absolves stations from actually being able to copy at high speed, and partly because the RBN is so effective that essentially all CQing stations are spotted.

I find it particularly interesting that the number of CWU logs has remained essentially unchanged ever since the unassisted category was created.

Looking at the number of QSOs appearing in the unassisted and assisted logs:


 

(The lines are for the median number of QSOs per log; the vertical bars run from 10% to 90%, 20% to 80%, 30% to 70%, 40% to 60%, with opacity increasing in that order.)
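For readers who want to reproduce this kind of summary, the Python sketch below computes a median and the four nested percentile bands from a list of per-log QSO counts. It uses the standard library's `statistics.quantiles` with its default interpolation method, which is an assumption; the plots may have been produced with a different percentile definition. The function name and the sample data are invented.

```python
import statistics

def percentile_bands(qso_counts):
    """Return the median and the 10-90, 20-80, 30-70 and 40-60 percentile bands."""
    # n=10 yields nine cut points: the 10th, 20th, ..., 90th percentiles
    deciles = statistics.quantiles(qso_counts, n=10)
    bands = [(deciles[i], deciles[8 - i]) for i in range(4)]
    return statistics.median(qso_counts), bands

# Invented data: QSO counts of 1 to 99, one log each.
median, bands = percentile_bands(list(range(1, 100)))
print(median, bands)
```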


A long-term downward trend in the numbers of QSOs in the assisted logs ceased in 2016, and since then the median number of QSOs in the assisted logs has remained essentially unchanged. A more or less constant difference of roughly one hundred QSOs between the median CW and SSB logs (in favour of CW) continues.

Inter-Zone QSOs


We can show the number of inter-zone QSOs, both band-by-band and in total. In these plots, the number of QSOs is accumulated every ten minutes, so there are six points per hour.
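The ten-minute accumulation can be sketched as follows in Python. The input is assumed to be a de-duplicated sequence of (minutes from the start of the contest, zone of the first station, zone of the second station); that layout, and the names used, are assumptions for the example. A 48-hour contest yields 288 bins.

```python
CONTEST_MINUTES = 48 * 60   # CQ WW runs for 48 hours
BIN_MINUTES = 10            # six points per hour

def inter_zone_counts(qsos):
    """Count inter-zone QSOs in each ten-minute bin of the contest.

    `qsos` is an iterable of (minute, zone_a, zone_b) tuples, where `minute`
    is measured from the start of the contest.
    """
    bins = [0] * (CONTEST_MINUTES // BIN_MINUTES)
    for minute, zone_a, zone_b in qsos:
        # skip intra-zone QSOs and anything outside the contest period
        if zone_a != zone_b and 0 <= minute < CONTEST_MINUTES:
            bins[minute // BIN_MINUTES] += 1
    return bins

# Invented data: the third QSO is within zone 14, so it is not counted.
sample_zone_qsos = [(0, 5, 14), (9, 5, 14), (10, 14, 14), (12, 14, 25), (2879, 3, 25)]
counts = inter_zone_counts(sample_zone_qsos)
print(len(counts), counts[0], counts[1], counts[-1])
```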

The new cycle has definitely started. Unfortunately, the CW event suffers by being a month later in the year than the SSB event. [I do not understand why the CQ WW committee do not alternate the weekends of the SSB and CW modes; but then, I don't understand a lot of what they do or don't do.]

Like 2022, 2023 saw fairly ordinary 15m participation on SSB, probably because of signs of activity on 10m. CW saw more activity, presumably because, the CW event being later in the year, 10m did not cooperate to the same extent as it did on SSB.

Much less activity on 20m, in both modes. Partly because of better conditions on 10m and 15m, but also because of the decreased activity in general, perhaps caused by the Russian invasion of Ukraine.

As always, CW dominates on 40m; and, within that mode, intra-EU QSOs further dominate. After the first few hours of the contest, very little DX was worked in either of the past couple of years.

80m is always dominated by CW; but the past couple of years have seen what appears to be a record low level of activity, perhaps because of the invasion of Ukraine.

160m tells a similar story to 80m, although the raw QSO counts are much lower, and appear to have sunk to a record low -- certainly much lower than at the equivalent point of the last cycle.

The overall picture shows the influence of the new solar cycle; but it seems clear that the ramp-up is much slower in this cycle, perhaps due to the invasion of Ukraine.

2024-01-12

Most-Logged Stations in CQ WW CW and SSB Contests: 2023, and the decade from 2014 to 2023

The public CQ WW CW and SSB logs allow us easily to tabulate the stations that appear in the largest number of entrants' logs. For 2023, the ten stations with the largest number of appearances in CQ WW SSB logs were:

Callsign  Appearances  % logs
CN3A           16,064      73
YT5A           11,799      58
LZ9W           11,419      57
PJ4K           11,268      57
9A1A           11,106      56
M6T            10,812      56
V26B           10,091      53
CR6K            9,983      54
ZF1A            9,918      50
DF0HQ           9,749      54


The first column in the table is the callsign. The second column is the total number of times that the call appears in logs. That is, for example, if a station worked ZF1A on six bands, that will increment the value in the second column of the ZF1A row by six. The third column is the percentage of logs that contain the callsign at least once.
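The two numeric columns can be computed along the following lines. This Python sketch assumes each log has been reduced to a list of worked callsigns, one entry per QSO, so that a station worked on six bands appears six times; the function name and sample data are hypothetical, and rounding the percentage to a whole number is an assumption based on how the tables are printed.

```python
from collections import Counter

def most_logged(logs, top=10):
    """Return (callsign, appearances, % of logs) for the most-logged stations.

    `logs` maps each entrant's callsign to the list of calls in that log.
    """
    appearances = Counter()      # every QSO line counts
    logs_with_call = Counter()   # each log counts at most once per callsign
    for worked_calls in logs.values():
        appearances.update(worked_calls)
        logs_with_call.update(set(worked_calls))
    n_logs = len(logs)
    return [
        (call, count, round(100 * logs_with_call[call] / n_logs))
        for call, count in appearances.most_common(top)
    ]

# Invented logs from four entrants.
sample_logs = {
    "W1AW": ["ZF1A", "ZF1A", "CN3A"],
    "G3XYZ": ["CN3A", "CN3A", "CN3A"],
    "JA1ZZZ": ["CN3A"],
    "VK2ZZZ": ["ZF1A"],
}
print(most_logged(sample_logs, top=2))
```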

Similarly, the ten stations with the largest number of appearances in CQ WW CW 2023 were:

Callsign  Appearances  % logs
D4C            16,225      77
CN3A           14,690      75
CR3W           13,164      69
CR3A           12,924      72
M6T            12,301      68
9A1A           12,185      69
PJ2T           11,636      63
PJ4K           11,527      66
YT5A           11,428      67
LZ9W           11,416      66


Note the substantial difference between the SSB and CW tables.

I find it interesting to see which stations have had the most long-term activity on the contests. For the ten years from 2014 to 2023 on SSB we find:

Callsign  Appearances  % logs
LZ9W           94,304      55
CN3A           87,218      50
DF0HQ          83,779      53
YT5A           73,470      47
PJ2T           71,101      40
M6T            70,468      46
P33W           66,574      43
K3LR           65,996      44
LZ5R           59,992      43
HG7T           59,939      42


And for the same years on CW: 

Callsign  Appearances  % logs
LZ9W          106,792      66
PJ2T           93,172      52
CR3W           91,010      53
YT5A           90,359      60
9A1A           88,575      54
P33W           80,240      52
DF0HQ          78,857      53
M6T            77,761      51
TK0C           73,181      44
RM9A           72,315      48

Similar tables from last year may be found here.

2024-01-05

Cleaned and Augmented Logs (including RBN data) for CQ WW CW and SSB Contests, 2005 to 2023

Cleaned and augmented versions of the logs for the CQ WW CW and SSB contests are now available for the period 2005 to 2023.

Links to the cleaned and augmented logs may be found in this directory (look for "clean" or "augmented" in the filenames).

The cleaned logs are the result of processing the QSO: lines from the entrants' submitted Cabrillo files to ensure that all fields contain valid values and all the data match the format required in the rules. Any line containing illegal data in a field (for example, a zone number greater than 40, or a date/time stamp that is outside the contest period) has simply been removed. Also, only the QSO: lines are retained, so that each line in the file can be processed easily. All zones are rendered with two digits, so as to further simplify processing by scripts or programs.
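Two of the checks mentioned above, the zone range and the contest period, might look something like the following Python sketch. It is illustrative only and is not the cleaning code itself: the function names are hypothetical, the date and time are assumed to arrive as the Cabrillo-style strings YYYY-MM-DD and HHMM, and the contest is assumed to last 48 hours from the supplied start time.

```python
from datetime import datetime, timedelta

def normalise_zone(text):
    """Return the zone as a two-digit string, or None if it is not in 1..40."""
    if not (text.isascii() and text.isdigit()):
        return None
    zone = int(text)
    return f"{zone:02d}" if 1 <= zone <= 40 else None

def in_contest(date_text, time_text, start):
    """True if the date and time fall within the 48 hours beginning at `start`."""
    try:
        stamp = datetime.strptime(date_text + " " + time_text, "%Y-%m-%d %H%M")
    except ValueError:
        return False   # unparseable date or time: treat the line as illegal
    return start <= stamp < start + timedelta(hours=48)

# CQ WW CW 2023 began at 0000 UTC on 25 November 2023.
cw_2023_start = datetime(2023, 11, 25)
print(normalise_zone("4"), normalise_zone("41"))
print(in_contest("2023-11-26", "2359", cw_2023_start))
print(in_contest("2023-11-27", "0000", cw_2023_start))
```

A line for which either check fails would simply be dropped, in keeping with the approach described above.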

The augmented logs contain the same information as the cleaned logs, but with the addition of some useful (derived) information on each line. In addition to the actual logs, two additional sources of information are used when appropriate:

  1. AD1C has made accessible historical cty.dat and associated files. These allow us to use callsign-based multiplier lists as they would have existed at the time of each contest.

  2. From 2009 onwards, the Reverse Beacon Network (RBN) has been available for the CW contests. This allows us to include the time since a station was last posted by the RBN (see below for details).

The information added to each line of the augmented logs comprises:
  1. A sequence of four characters that are the same for each entry in a particular log:
    •  a. letter "A" or "U" indicating "assisted" or "unassisted"
    •  b. letter "Q", "L", "H" or "U", indicating respectively QRP, low power, high power or unknown power level
    •  c. letter "S", "M", "C" or "U", indicating respectively a single-operator, multi-operator, checklog or unknown operator category [ the contest organisers have stated that checklogs are not made public, but in fact at least some of them from the early years have been, hence the need for the "C" category ]
    •  d. character "1", "2", "+" or "U", indicating respectively that the number of transmitters is one, two, unlimited or unknown
  2. A four-digit number representing the time of the contact in minutes measured from the start of the contest. (I realise that this can be calculated from the other information on the line, but it saves subsequent script-based processors of the file considerable time to have the number readily available in the file without having to calculate it for each QSO.)
  3. Band
  4. A set of fourteen flags, each -- apart from column k and column n -- encoded as T/F: 
    • a. QSO is confirmed by a log from the second party 
    • b. QSO is a reverse bust (i.e., the second party appears to have bust the call of the first party) 
    • c. QSO is an ordinary bust (i.e., the first party appears to have bust the call of the second party) 
    • d. the call of the second party is unique 
    • e. QSO appears to be a NIL 
    • f. QSO is with a station that did not send in a log, but who did make 20 or more QSOs in the contest 
    • g. QSO appears to be a country mult 
    • h. QSO appears to be a zone mult 
    • i. QSO is a zone bust (i.e., the received zone appears to be a bust)
    • j. QSO is a reverse zone bust (i.e. the second party appears to have bust the zone of the first party)
    • k. This entry has three possible values rather than just T/F:
      • T: QSO appears to be made during a run by the first party
      • F: QSO appears not to be made during a run by the first party
      • U: the run status is unknown because insufficient frequency information is available in the first party's log
    • l. QSO is a dupe
    • m. QSO is a dupe in the second party's log
    • n. RBN information (see below)
  5. If the QSO is a reverse bust, the call logged by the second party; otherwise, the placeholder "-"
  6. If the QSO is an ordinary bust, the correct call that should have been logged by the first party; otherwise, the placeholder "-"
  7. If the QSO is a reverse zone bust, the zone logged by the second party; otherwise, the placeholder "-"
  8.  If the QSO is an ordinary zone bust, the correct zone that should have been logged by the first party; otherwise, the placeholder "-" 
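The four-character category code and the set of flags described above could be decoded along these lines. This Python sketch makes assumptions that the text does not confirm: that the fourteen flags are stored as a single 14-character string in the order a to n, and that the caller has already isolated the two fields from the line. The dictionary keys and function names are invented for the example.

```python
ASSISTANCE = {"A": "assisted", "U": "unassisted"}
POWER = {"Q": "QRP", "L": "low power", "H": "high power", "U": "unknown"}
OPERATOR = {"S": "single-operator", "M": "multi-operator", "C": "checklog", "U": "unknown"}
TRANSMITTERS = {"1": "one", "2": "two", "+": "unlimited", "U": "unknown"}

# Hypothetical names for flags a to n, in the order listed above.
FLAG_NAMES = [
    "confirmed", "reverse_bust", "bust", "unique", "nil", "no_log_20_plus",
    "country_mult", "zone_mult", "zone_bust", "reverse_zone_bust",
    "run", "dupe", "dupe_in_second_log", "rbn",
]

def decode_category(code):
    """Decode the four-character category code, e.g. 'AHS1'."""
    if len(code) != 4:
        raise ValueError(f"expected four characters, got {code!r}")
    assistance, power, operator, transmitters = code
    return {
        "assistance": ASSISTANCE[assistance],
        "power": POWER[power],
        "operator": OPERATOR[operator],
        "transmitters": TRANSMITTERS[transmitters],
    }

def decode_flags(flags):
    """Decode a 14-character flag string into a dictionary.

    T/F flags become booleans; the run flag (T/F/U) and the RBN character
    are kept as single characters.
    """
    if len(flags) != len(FLAG_NAMES):
        raise ValueError(f"expected {len(FLAG_NAMES)} flags, got {flags!r}")
    decoded = {}
    for name, value in zip(FLAG_NAMES, flags):
        decoded[name] = value if name in ("run", "rbn") else value == "T"
    return decoded

print(decode_category("AHS1"))
print(decode_flags("TFFFFFTFFFTFF-"))
```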

RBN Information


In the CW contests from 2009 onwards, the RBN was active, automatically spotting the frequency at which any station calling CQ was transmitting. To reflect possible use of RBN information, the augmented files now include a fourteenth flag. For the sake of uniformity, this column is present in all the augmented files, regardless of whether the RBN actually contributed useful information to a particular contest.

Each QSO has one of several characters in the fourteenth column of flags. These characters should be interpreted as follows:

'-'
  No useful RBN-derived information is available for this QSO.

'0'
  The worked station (i.e., the second call on the log line) appears to have begun to CQ on this frequency within (roughly) 60 seconds prior to the QSO.

'A' to 'Z'
  For the nth letter of the alphabet: the worked station appears to have been CQing on this frequency for (roughly) n minutes prior to the QSO.

'+'
  The worked station appears to have been CQing for more than 26 minutes on this frequency.

'<'
  Because the RBN is distributed, and because each contest entrant station has its own clock, there is generally a skew between the reading of the clock of the station making the QSO and the timestamp from the RBN at which it believes a posting was made (indeed, it's unclear from the RBN's [lack of] documentation exactly how the timestamp on an individual RBN posting is to be interpreted). If the character '<' appears in the RBN column, it indicates that the raw values of the clocks suggest that the QSO took place up to two minutes before the RBN reported the worked station commencing to CQ at this frequency. When this occurs, the most likely interpretation is that there is non-negligible skew between the two clocks, and the station was actually worked almost as soon as a CQ was posted by the RBN. This character also appears if the RBN erroneously posts the worked station as CQing at this frequency shortly after the QSO. But it might also mean that the entrant was simply lucky and found the CQing station just as it fired up on a new frequency.
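The encoding above is straightforward to decode in a script. The Python sketch below turns the RBN character into a short description; the function name and the wording of the returned strings are my own, chosen only for illustration.

```python
def describe_rbn_flag(flag):
    """Translate the RBN character from the fourteenth flag column into words."""
    if flag == "-":
        return "no useful RBN information"
    if flag == "0":
        return "began CQing within roughly 60 seconds before the QSO"
    if flag == "+":
        return "CQing for more than 26 minutes"
    if flag == "<":
        return "QSO logged up to two minutes before the RBN posting"
    if len(flag) == 1 and "A" <= flag <= "Z":
        # the nth letter of the alphabet means roughly n minutes
        minutes = ord(flag) - ord("A") + 1
        unit = "minute" if minutes == 1 else "minutes"
        return f"CQing for roughly {minutes} {unit}"
    raise ValueError(f"unexpected RBN flag: {flag!r}")

for character in "-0AC+<":
    print(character, describe_rbn_flag(character))
```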

Notes:
  • The encoding of some of the flags requires subjective decisions to be made as to whether the flag should be true or false; consequently, and because CQ has yet to understand the importance of making their scoring code public, the value of a flag for a specific QSO line in some circumstances might not match the value that CQ would assign. (Also, CQ has more data available in the form of check logs, which are generally not made public.)
  • I made no attempt to deduce or infer the run status of a QSO in the second party's log (if such exists), regardless of the status in the first party's log. This allows one cleanly to perform correct statistical analyses anent the number of QSOs made by running stations merely by excluding QSOs marked with a U in column k.
  • No attempt is made to detect the case in which both participants of a QSO bust the other station's call. This is a problematic situation because of the relatively high probability of a false positive unless both stations accurately log the frequency as opposed to merely the band. (Also, on bands on which split-frequency QSOs are common, the absence of both transmit and receive frequency is a problem; I confess that I have never understood why Cabrillo was not designed to report both transmit and receive frequencies -- or even to define clearly which frequency is to be reported. I digress.) Because of the likelihood of false positives, it seems better, given the presumed rarity of double-bust QSOs, that no attempt be made to mark them.
  • The entries for the zones in the case of zone or reverse zone busts are normalised to two-digit values.

2024-01-03

CQ WW video maps, 2005 to 2023

 

I have updated the complete set of CQ WW video maps on my YouTube channel (channel N7DR).


The videos are created with time steps of ten minutes; when playing the video, each time step is displayed for five seconds. The videos are presented as animated GIF files, so they should display correctly without any specialised video software installed on your computer.

The videos assume that all communication is via the great-circle short path route, and include only inter-zone contacts. The width of the arcs is an absolute measure of the number of QSOs taking place over that path in the particular 10-minute segment. The colour of the arc reflects the relative number of QSOs taking place over the path. Each separate image (i.e., 10-minute segment) is normalized so that the path with the greatest number of QSOs is rendered in white. Paths with fewer QSOs are in progressively darker colours. Thus, arc colour should not be compared from one still image to another; arc width, however, is meaningful. The width of an arc in pixels is one plus the natural logarithm of the number of QSOs represented by the arc.
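The width rule and the per-image normalisation can be expressed in a few lines of Python. In this sketch the width follows the stated formula (one plus the natural logarithm of the QSO count), while the brightness is scaled linearly against the busiest path; that linear scaling is an assumption, since the text says only that paths with fewer QSOs are rendered in progressively darker colours. The names and sample counts are invented.

```python
import math

def arc_width(n_qsos):
    """Width of an arc in pixels: one plus the natural log of the QSO count."""
    return 1 + math.log(n_qsos)

def relative_brightness(path_counts):
    """Scale each path against the busiest path in the same 10-minute image.

    Returns values in (0, 1], with 1.0 (white) for the busiest path.
    The linear scaling is an assumption made for this sketch.
    """
    busiest = max(path_counts.values())
    return {path: n / busiest for path, n in path_counts.items()}

# Invented counts for one 10-minute segment, keyed by (zone, zone).
segment = {("14", "05"): 20, ("14", "25"): 5}
print(arc_width(1), arc_width(20))
print(relative_brightness(segment))
```

Because the brightness is normalised separately for each image, only the widths can be compared between frames.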

CQ WW SSB and CW logs for 2023

CQ WW SSB and CW logs for 2023 have been added to this repository of public CQ WW logs.