Cleaned and augmented versions of the logs for the
CQ WW CW and SSB contests are now available for the period 2005 to 2020.
Links to the cleaned and augmented logs may be followed
here.
The cleaned logs are the result of processing the QSO: lines from the
entrants' submitted Cabrillo files to ensure that all
fields contain valid values and all the data match the format required
in the rules. Any line containing illegal data in a field (for example, a
zone number greater than 40, or a date/time stamp that is outside the
contest period) has simply been removed. Also, only the QSO: lines are
retained, so that each line in the file can be processed
easily. All zones are rendered with two digits, so as to further
simplify processing by scripts or programs.
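The two example checks mentioned above (zone range and contest period) can be sketched as follows. This is only an illustration, not the code used to produce the cleaned logs: it assumes the usual CQ WW Cabrillo field order (frequency, mode, date, time, sent call, sent RST, sent zone, received call, received RST, received zone), assumes a 48-hour contest period, and the function name, its `contest_start` parameter and the sample calls are hypothetical.

```python
from datetime import datetime, timedelta

def is_valid_qso_line(line, contest_start):
    """Return True if a QSO: line passes the zone and date/time checks."""
    fields = line.split()
    # Assumed layout: QSO: freq mode date time call rst zone call rst zone
    if len(fields) < 11 or fields[0] != "QSO:":
        return False
    try:
        when = datetime.strptime(fields[3] + " " + fields[4], "%Y-%m-%d %H%M")
        zones = (int(fields[7]), int(fields[10]))
    except ValueError:
        return False  # malformed date, time or zone
    # Assumption: the contest runs for 48 hours from contest_start.
    in_contest = contest_start <= when < contest_start + timedelta(hours=48)
    zones_ok = all(1 <= zone <= 40 for zone in zones)
    return in_contest and zones_ok

def two_digit_zone(zone):
    """Render a zone with two digits, e.g. "4" becomes "04"."""
    return f"{int(zone):02d}"

start = datetime(2005, 11, 26)  # 0000z on the Saturday of a sample contest
good = "QSO: 14000 CW 2005-11-26 0001 N7DR 599 04 K1ABC 599 05"
bad = "QSO: 14000 CW 2005-11-26 0001 N7DR 599 04 K1ABC 599 41"
print(is_valid_qso_line(good, start), is_valid_qso_line(bad, start))
```

A line that fails either check would simply be dropped, as described above.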
The augmented logs contain the same information as the cleaned logs, but
with the addition of some useful (derived) information on each line. In
addition to the
actual logs, two additional sources of information are used when
appropriate:
- AD1C has recently made accessible historical cty.dat and associated files. A copy of the cty.dat files is here. These allow us to use callsign-based multiplier lists as they would have existed at the time of each contest.
- From 2009 onwards, the Reverse Beacon Network
(RBN) has been available for the CW contests. This allows us to include
the time since a station was last posted by the RBN (see below for
details).
The information added to each line of the augmented logs comprises:
- A sequence of four characters that are the same for each entry in a particular log:
- a. letter "A" or "U" indicating "assisted" or "unassisted"
- b. letter "Q", "L", "H" or "U", indicating respectively QRP, low power, high power or unknown power level
- c. letter "S", "M", "C" or "U", indicating respectively a single-operator,
multi-operator, checklog or unknown operator category [ the contest
organisers have stated that checklogs are not made public, but in fact
at least some of them from the early years have been, hence the need for
the "C" category ]
- d. character "1", "2", "+" or "U", indicating respectively that the number of transmitters is one, two, unlimited or unknown
- A four-digit number representing
the time of the contact in minutes
measured from the start of the contest. (I realise that this can be
calculated from the other information on the line, but it saves subsequent
script-based processors of the file considerable time to have the
number readily available in the file without having to calculate it for
each QSO.)
- Band
- A set of fourteen flags, each -- apart from column k and column n -- encoded as T/F:
- a. QSO is confirmed by a log from the second party
- b. QSO is a reverse bust (i.e., the second party appears to have bust the
call of the first party)
- c. QSO is an ordinary bust (i.e., the first party appears to have bust the
call of the second party)
- d. the call of the second party is unique
- e. QSO appears to be a NIL
- f. QSO is with a station that did not send in a log, but who did make 20
or more QSOs in the contest
- g. QSO appears to be a country mult
- h. QSO appears to be a zone mult
- i. QSO is a zone bust (i.e., the received zone appears to be a bust)
- j. QSO is a reverse zone bust (i.e., the second party appears to have bust the zone of the first party)
- k. This entry has three possible values rather than just T/F:
- T: QSO appears to be made during a run by the first party
- F: QSO appears not to be made during a run by the first party
- U: the run status is unknown because insufficient frequency information is available in the first party's log
- l. QSO is a dupe
- m. QSO is a dupe in the second party's log
- n. RBN information (see below)
- If the QSO is a reverse bust, the call logged by the second party; otherwise, the placeholder "-"
- If the QSO is an ordinary bust, the correct call that should have been logged by the first party; otherwise, the placeholder "-"
- If the QSO is a reverse zone bust, the zone logged by the second party; otherwise, the placeholder "-"
- If the QSO is an ordinary zone bust, the correct zone that should
have been logged by the first party; otherwise, the placeholder "-"
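The added fields listed above can be decoded along the following lines. This is a minimal sketch under stated assumptions: it assumes that the four category characters and the fourteen flags each arrive as a single contiguous string (the exact column layout of the augmented lines is not specified here), and that the minutes field is zero-padded. All function and dictionary names are hypothetical.

```python
from datetime import datetime

ASSISTANCE = {"A": "assisted", "U": "unassisted"}
POWER = {"Q": "QRP", "L": "low power", "H": "high power", "U": "unknown"}
OPERATOR = {"S": "single-operator", "M": "multi-operator",
            "C": "checklog", "U": "unknown"}
TRANSMITTERS = {"1": "one", "2": "two", "+": "unlimited", "U": "unknown"}

# Flags a to n, in the order listed above (the names are hypothetical).
FLAG_NAMES = [
    "confirmed", "reverse_bust", "bust", "unique", "nil",
    "no_log_20_plus_qsos", "country_mult", "zone_mult", "zone_bust",
    "reverse_zone_bust", "run", "dupe", "dupe_in_second_log", "rbn",
]

def decode_category(code):
    """Decode the four category characters, e.g. "ULS1"."""
    if len(code) != 4:
        raise ValueError("expected four category characters")
    a, p, o, t = code
    return {
        "assistance": ASSISTANCE[a],
        "power": POWER[p],
        "operator": OPERATOR[o],
        "transmitters": TRANSMITTERS[t],
    }

def decode_flags(flags):
    """Decode the fourteen flag characters into a dictionary."""
    if len(flags) != len(FLAG_NAMES):
        raise ValueError("expected fourteen flag characters")
    decoded = {}
    for name, ch in zip(FLAG_NAMES, flags):
        if name == "rbn":
            decoded[name] = ch            # kept as the raw RBN character
        elif name == "run" and ch == "U":
            decoded[name] = None          # run status unknown
        else:
            decoded[name] = (ch == "T")
    return decoded

def minutes_from_start(date, time, contest_start):
    """Recompute the minutes field from the date and time on the line."""
    when = datetime.strptime(date + " " + time, "%Y-%m-%d %H%M")
    return int((when - contest_start).total_seconds() // 60)

print(decode_category("ULS1"))
print(decode_flags("TFFFFFTFFFUFF-")["country_mult"])
print(f"{minutes_from_start('2005-11-27', '0130', datetime(2005, 11, 26)):04d}")
```

Representing an unknown run status as `None` rather than `False` keeps the three-valued nature of column k visible to later processing.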
RBN Information
In the CW contests from 2009 onwards,
the RBN was active, automatically spotting the frequency at which any
station calling CQ was transmitting. To reflect possible use of RBN
information, the augmented files now include a fourteenth flag. For
the sake of uniformity, this column is present in all the augmented
files, regardless of whether the RBN actually contributed useful
information to a particular contest.
Each QSO has one of several characters in the fourteenth column of flags. These characters should be interpreted as follows:
'-'
No useful RBN-derived information is available for this QSO.
'0'
The worked station (i.e., the second call on the log line) appears to have begun to CQ on this frequency within (roughly) 60 seconds prior to the QSO.
'A' to 'Z'
For the nth letter of the alphabet: the worked station appears to have been CQing on this frequency for (roughly) n minutes prior to the QSO.
'+'
The worked station appears to have been CQing for more than 26 minutes on this frequency.
'<'
Because the RBN is distributed, and because each contest entrant
station has its own clock, there is generally a skew between the reading
of the clock of the station making the QSO and the timestamp from the
RBN at which it believes a posting was made (indeed, it's unclear from
the RBN's [lack of] documentation exactly how the timestamp on an
individual RBN posting is to be interpreted). If the character '<'
appears in the RBN column, it indicates that the raw values of the
clocks suggest that the QSO took place up to two minutes before
the RBN reported the worked station commencing to CQ at this frequency.
When this occurs, the most likely interpretation is that there is
non-negligible skew between the two clocks, and the station was actually
worked almost as soon as a CQ was posted by the RBN. This character also appears if the RBN erroneously posts the
worked station as CQing at this frequency shortly after the QSO. But it might also
mean that the entrant was simply lucky and found the CQing station just
as it fired up on a new frequency.
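The mapping from the RBN character to an approximate CQing time can be sketched as follows. The function names and the wording of the descriptions are hypothetical; only the character meanings come from the list above.

```python
def rbn_minutes(flag):
    """Return the approximate minutes of CQing for '0' and 'A' to 'Z', else None."""
    if flag == "0":
        return 0
    if len(flag) == 1 and "A" <= flag <= "Z":
        return ord(flag) - ord("A") + 1   # 'A' is 1 minute, 'Z' is 26
    return None                           # '-', '+' and '<' carry no number

def describe_rbn(flag):
    """Return a short human-readable interpretation of the RBN character."""
    if flag == "-":
        return "no useful RBN information"
    if flag == "+":
        return "CQing for more than 26 minutes"
    if flag == "<":
        return "QSO logged up to two minutes before the RBN posting"
    minutes = rbn_minutes(flag)
    if minutes is None:
        raise ValueError(f"unexpected RBN character: {flag!r}")
    if minutes == 0:
        return "began CQing within roughly 60 seconds before the QSO"
    unit = "minute" if minutes == 1 else "minutes"
    return f"CQing for roughly {minutes} {unit}"

for ch in "-0CZ+<":
    print(ch, describe_rbn(ch))
```

Keeping '+' and '<' out of the numeric function avoids treating "more than 26 minutes" or a probable clock skew as if it were a measured duration.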
Notes:
- The encoding of some of the flags requires subjective
decisions to be made as to whether the flag should be true or false;
consequently, and because CQ has yet to understand the importance of
making their scoring code public, the value of a flag for a specific QSO
line in some circumstances might not match the value that CQ would
assign. (Also, CQ has more data available in the form of check logs,
which are generally not made public.)
- I made no attempt to deduce or infer the run status of a QSO in
the second party's log (if such exists), regardless of the status in the
first party's log. This allows one cleanly to perform correct
statistical analyses anent the number of QSOs made by running stations
merely by excluding QSOs marked with a U in column k.
- No attempt is made to detect the case in which both participants of a
QSO bust the other station's call. This is a problematic situation
because of the relatively high probability of a false positive unless
both stations accurately log the frequency as opposed to merely the
band. (Also, on bands on which split-frequency QSOs are common, the
absence of both transmit
and receive frequency is a problem; I confess that I have never
understood why Cabrillo was not designed to report both transmit and
receive frequencies -- or even to define clearly which frequency is to be reported. I digress.) Because of the likelihood of false
positives, it seems better, given the presumed rarity of double-bust
QSOs, that no attempt be made to mark them.
- The entries for the zones in the case of zone or reverse zone busts are normalised to two-digit values.
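As an illustration of the note above about excluding QSOs marked with a U in column k, the fraction of QSOs made while running could be computed roughly as follows. The sketch assumes that the column-k character has already been extracted for each QSO in a log; the function name is hypothetical.

```python
from collections import Counter

def run_fraction(run_flags):
    """Fraction of QSOs made while running, ignoring QSOs of unknown status."""
    counts = Counter(run_flags)
    known = counts["T"] + counts["F"]     # QSOs marked U are excluded
    if known == 0:
        return None                       # no QSO has a known run status
    return counts["T"] / known

# Column-k values for five QSOs from one (made-up) log.
print(run_fraction(["T", "T", "F", "U", "T"]))
```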