2019-12-20

Cleaned and Augmented Logs for ARRL DX CW and SSB contests, 2018 to 2019

Cleaned Logs


Cleaned versions of the logs for the ARRL DX CW and SSB contests are now available for 2018 and 2019.

Links to the cleaned logs may be followed here.

The cleaned logs are the result of processing the QSO: lines from the entrants' submitted Cabrillo files (as [gratuitously] modified by the ARRL) to ensure that all fields contain valid values and all the data match the column-specific standard format for this contest.

Any line containing illegal data in a field has simply been removed. Also, only the QSO: lines are retained, so that each line in the file can be processed easily. All QTH multipliers are rendered as two letters, and the power is rendered as four digits, regardless of how the submitted log recorded these two fields; this should simplify processing the logs by scripts or programs, as should the use of fixed-length records in these cleaned files.

Augmented Logs


Links to the augmented logs may be followed here.

The augmented logs for the ARRL DX contests contain the same information as the cleaned logs, but with the addition of some useful (derived) information on each line. The information added to each line comprises:
  1. The sequence of four characters that are the same for each entry in a particular log:
    •  a. letter "A" or "U" indicating "assisted" or "unassisted"
    •  b. letter "Q", "L", "H" or "U", indicating respectively QRP, low power, high power or unknown power level
    •  c. letter "S", "M", "C" or "U", indicating respectively a single-operator, multi-operator, checklog or unknown operator category 
    •  d. character "1", "2", "+" or "U", indicating respectively that the number of transmitters is one, two, unlimited or unknown
  2. A four-digit number representing the time of the contact in minutes measured from the start of the contest. (I realise that this can be calculated from the other information on the line, but it saves subsequent processors of the file considerable time to have the number readily available in the file without having to calculate it each time.)
  3. Band
  4. A set of fourteen flags, each -- apart from column k and column n -- encoded as T/F: 
    • a. QSO is confirmed by a log from the second party 
    • b. QSO is a reverse bust (i.e., the second party appears to have bust the call of the first party) 
    • c. QSO is an ordinary bust (i.e., the first party appears to have bust the call of the second party) 
    • d. the call of the second party is unique 
    • e. QSO appears to be a NIL 
    • f. QSO is with a station that did not send in a log, but who did make 20 or more QSOs in the contest 
    • g. QSO appears to be a country mult (may be T for W/VE stations only)
    • h. QSO appears to be a state/province mult (may be T for DX stations only)
    • i. QSO is an exchange bust (i.e., the received exchange appears to be a bust)
    • j. QSO is a reverse exchange bust (i.e. the second party appears to have bust the exchange of the first party)
    • k. This entry has three possible values rather than just T/F:
      • T: QSO appears to be made during a run by the first party
      • F: QSO appears not to be made during a run by the first party
      • U: the run status is unknown because insufficient frequency information is available in the first party's log
    • l. QSO is a dupe
    • m. QSO is a dupe in the second party's log
    • n. RBN information (see below)
  5. If the QSO is a reverse bust, the call logged by the second party; otherwise, the placeholder "-"
  6. If the QSO is an ordinary bust, the correct call that should have been logged by the first party; otherwise, the placeholder "-"
  7. If the QSO is a reverse exchange bust, the exchange logged by the second party; otherwise, the placeholder "-"
  8.  If the QSO is an ordinary exchange bust, the correct exchange that should have been logged by the first party; otherwise, the placeholder "-"
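
As a rough illustration of how the added fields might be consumed, here is a minimal Python sketch. It assumes (the text above does not state this) that the fields on each line are separated by white space, that the eight added items are the last eight fields on the line, and that the fourteen flags form a single field. The names `AugmentedInfo` and `parse_augmented` are invented for the example, and the sample line is synthetic.

```python
from dataclasses import dataclass
from typing import Dict

FLAG_LETTERS = "abcdefghijklmn"  # the fourteen flag columns, a to n


@dataclass
class AugmentedInfo:  # hypothetical container, not part of any published tool
    category: str               # item 1: four characters, e.g. "ULS1"
    minutes: int                # item 2: minutes from the start of the contest
    band: str                   # item 3
    flags: Dict[str, str]       # item 4: flag letter -> character on the line
    reverse_bust_call: str      # item 5, or "-"
    correct_call: str           # item 6, or "-"
    reverse_bust_exchange: str  # item 7, or "-"
    correct_exchange: str       # item 8, or "-"


def parse_augmented(line: str) -> AugmentedInfo:
    """Extract the added information, ASSUMING it is the last eight fields."""
    fields = line.split()
    if len(fields) < 8:
        raise ValueError("too few fields on line")
    category, minutes, band, flags, rb_call, ok_call, rb_exch, ok_exch = fields[-8:]
    if len(category) != 4 or len(flags) != len(FLAG_LETTERS):
        raise ValueError("line does not match the assumed layout")
    return AugmentedInfo(category, int(minutes), band,
                         dict(zip(FLAG_LETTERS, flags)),
                         rb_call, ok_call, rb_exch, ok_exch)


# Synthetic example containing only the added fields
sample = "ULS1 0123 20 TFFFFFFTFFTFF- - - - -"
info = parse_augmented(sample)
print(info.minutes, info.flags["k"], info.flags["n"])
```

If the real files use fixed columns rather than white-space separation, the same structure applies, with slicing by column position in place of `split()`.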

RBN Information


In CW contests from 2009 onwards, the RBN has been active, automatically spotting the frequency at which any station calling CQ was transmitting. To reflect possible use of RBN information, the augmented files include a fourteenth column. For the sake of uniformity, this column is present in all the augmented files, regardless of whether the RBN actually contributed useful information to a particular contest.

Each QSO has one of several characters in the fourteenth column of flags. These characters should be interpreted as follows:

'-'
  No useful RBN-derived information is available for this QSO.

'0'
  The worked station (i.e., the second call on the log line) appears to have begun to CQ on this frequency within (roughly) 60 seconds prior to the QSO.

'A' to 'Z'
  For the nth letter of the alphabet: the worked station appears to have been CQing on this frequency for (roughly) n minutes prior to the QSO.

'+'
  The worked station appears to have been CQing for more than 26 minutes on this frequency.

'<'
  Because the RBN is distributed, and because each contest entrant station has its own clock, there is generally a skew between the reading of the clock of the station making the QSO and the timestamp from the RBN at which it believes a posting was made (indeed, it's unclear from the RBN's [lack of] documentation exactly how the timestamp on an individual RBN posting is to be interpreted). If the character '<' appears in the RBN column, it indicates that the raw values of the clocks suggest that the QSO took place up to two minutes before the RBN reported the worked station commencing to CQ at this frequency. When this occurs, the most likely interpretation is that there is non-negligible skew between the two clocks, and the station was actually worked almost as soon as a CQ was posted by the RBN. But it might also mean that the entrant was simply lucky and found the CQing station just as it fired up on a new frequency.
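
The interpretation of the RBN column can be sketched as a small decoding function. This is only an illustration of the encoding described above; the function name `decode_rbn` and the labels it returns are invented for the example.

```python
from typing import Optional, Tuple


def decode_rbn(ch: str) -> Tuple[str, Optional[int]]:
    """Map the character in flag column n to (label, minutes).

    The labels are hypothetical names chosen for this sketch:
      "none"      no useful RBN-derived information
      "early"     QSO logged up to two minutes before the RBN posting
      "cq"        station had been CQing for (roughly) `minutes` minutes
      "cq-long"   station had been CQing for more than 26 minutes
    """
    if ch == "-":
        return ("none", None)
    if ch == "<":
        return ("early", None)
    if ch == "0":
        return ("cq", 0)  # began to CQ within roughly 60 seconds of the QSO
    if ch == "+":
        return ("cq-long", 26)  # 26 is a lower bound here, not a measurement
    if len(ch) == 1 and "A" <= ch <= "Z":
        return ("cq", ord(ch) - ord("A") + 1)  # nth letter -> n minutes
    raise ValueError(f"unexpected RBN character: {ch!r}")


print(decode_rbn("C"))
```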

Notes:
  • The encoding of some of the flags requires subjective decisions to be made as to whether the flag should be true or false; consequently, and because the ARRL has yet to understand the importance of making the scoring code public, the value of a flag for a specific QSO line in some circumstances might not match the value that the ARRL has assigned. (Also, the ARRL has additional, non-public, data available.)
  • I made no attempt to deduce or infer the run status of a QSO in the second party's log (if such exists), regardless of the status in the first party's log. This allows one cleanly to perform correct statistical analyses anent the number of QSOs made by running stations merely by excluding QSOs marked with a U in column k.
  • No attempt is made to detect the case in which both participants of a QSO bust the other station's call. This is a problematic situation because of the relatively high probability of a false positive unless both stations log the frequency as opposed to the band. (Also, on bands on which split-frequency QSOs are common, the absence of both transmit and receive frequency is a problem.) Because of the likelihood of false positives, it seems better, given the presumed rarity of double-bust QSOs, that no attempt be made to mark them.
  • The entries for the exchanges in the case of exchange or reverse exchange busts are normalised to two-letter or four-digit values in the same manner as described above for the exchanges in the cleaned logs.
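
To illustrate the point about column k: a minimal Python sketch that computes the fraction of QSOs made while running, excluding those whose run status is unknown. It assumes the fourteen flags for each QSO are available as a 14-character string (a to n, in order); the function name and the sample flag strings are invented.

```python
from typing import Iterable, Optional

RUN_COLUMN = ord("k") - ord("a")  # zero-based index of column k


def run_fraction(flag_strings: Iterable[str]) -> Optional[float]:
    """Fraction of QSOs flagged T in column k, ignoring those flagged U."""
    known = [f[RUN_COLUMN] for f in flag_strings if f[RUN_COLUMN] in "TF"]
    if not known:
        return None  # no QSO has a known run status
    return known.count("T") / len(known)


# Synthetic flag strings: one run QSO, one non-run QSO, one unknown
flags = [
    "FFFFFFFFFF" + "T" + "FF-",
    "FFFFFFFFFF" + "F" + "FF-",
    "FFFFFFFFFF" + "U" + "FF-",
]
print(run_fraction(flags))
```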

2019-12-13

Revised Augmented Logs for CQ WW CW and SSB Contests, 2005 to 2018

AD1C has recently once more made accessible historical cty.dat and associated files. A copy of the cty.dat files is here.

This allows one to regenerate the augmented contest files, but using data that were current at the time of the contest. A pointer to these revised augmented files is here.

The augmented logs contain the same information as cleaned logs, but with the addition of some useful (derived) information on each line. The information added to each line comprises:
  1. The sequence of four characters that are the same for each entry in a particular log:
    •  a. letter "A" or "U" indicating "assisted" or "unassisted"
    •  b. letter "Q", "L", "H" or "U", indicating respectively QRP, low power, high power or unknown power level
    •  c. letter "S", "M", "C" or "U", indicating respectively a single-operator, multi-operator, checklog or unknown operator category [ the contest organisers have stated that checklogs are not made public, but in fact at least some of them from the early years have been, hence the need for the "C" category ]
    •  d. character "1", "2", "+" or "U", indicating respectively that the number of transmitters is one, two, unlimited or unknown
  2. A four-digit number representing the time of the contact in minutes measured from the start of the contest. (I realise that this can be calculated from the other information on the line, but it saves subsequent processors of the file considerable time to have the number readily available in the file without having to calculate it each time.)
  3. Band
  4. A set of fourteen flags, each -- apart from column k and column n -- encoded as T/F: 
    • a. QSO is confirmed by a log from the second party 
    • b. QSO is a reverse bust (i.e., the second party appears to have bust the call of the first party) 
    • c. QSO is an ordinary bust (i.e., the first party appears to have bust the call of the second party) 
    • d. the call of the second party is unique 
    • e. QSO appears to be a NIL 
    • f. QSO is with a station that did not send in a log, but who did make 20 or more QSOs in the contest 
    • g. QSO appears to be a country mult 
    • h. QSO appears to be a zone mult 
    • i. QSO is a zone bust (i.e., the received zone appears to be a bust)
    • j. QSO is a reverse zone bust (i.e. the second party appears to have bust the zone of the first party)
    • k. This entry has three possible values rather than just T/F:
      • T: QSO appears to be made during a run by the first party
      • F: QSO appears not to be made during a run by the first party
      • U: the run status is unknown because insufficient frequency information is available in the first party's log
    • l. QSO is a dupe
    • m. QSO is a dupe in the second party's log
    • n. RBN information (see below)
  5. If the QSO is a reverse bust, the call logged by the second party; otherwise, the placeholder "-"
  6. If the QSO is an ordinary bust, the correct call that should have been logged by the first party; otherwise, the placeholder "-"
  7. If the QSO is a reverse zone bust, the zone logged by the second party; otherwise, the placeholder "-"
  8.  If the QSO is an ordinary zone bust, the correct zone that should have been logged by the first party; otherwise, the placeholder "-"
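
The four-digit time in item 2 is straightforward to derive. The following Python sketch shows one way to do it, assuming the date and time are in the usual Cabrillo QSO-line form (yyyy-mm-dd and hhmm, UTC) and that the caller supplies the start of the contest. The function name and the example start time are illustrative only.

```python
from datetime import datetime


def contest_minutes(start: datetime, date: str, time: str) -> str:
    """Return minutes from the start of the contest as a four-digit string."""
    when = datetime.strptime(f"{date} {time}", "%Y-%m-%d %H%M")
    minutes = int((when - start).total_seconds() // 60)
    if minutes < 0:
        raise ValueError("QSO precedes the start of the contest")
    return f"{minutes:04d}"


# Illustrative start time; all times are treated as UTC
start = datetime(2018, 11, 24, 0, 0)
print(contest_minutes(start, "2018-11-25", "1234"))  # one day + 12 h 34 min
```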

RBN Information


In the CW contests from 2009 onwards, the RBN was active, automatically spotting the frequency at which any station calling CQ was transmitting. To reflect possible use of RBN information, the augmented files now include a fourteenth column. For the sake of uniformity, this column is present in all the augmented files, regardless of whether the RBN actually contributed useful information to a particular contest.

Each QSO has one of several characters in the fourteenth column of flags. These characters should be interpreted as follows:

'-'
  No useful RBN-derived information is available for this QSO.

'0'
  The worked station (i.e., the second call on the log line) appears to have begun to CQ on this frequency within (roughly) 60 seconds prior to the QSO.

'A' to 'Z'
  For the nth letter of the alphabet: the worked station appears to have been CQing on this frequency for (roughly) n minutes prior to the QSO.

'+'
  The worked station appears to have been CQing for more than 26 minutes on this frequency.

'<'
  Because the RBN is distributed, and because each contest entrant station has its own clock, there is generally a skew between the reading of the clock of the station making the QSO and the timestamp from the RBN at which it believes a posting was made (indeed, it's unclear from the RBN's [lack of] documentation exactly how the timestamp on an individual RBN posting is to be interpreted). If the character '<' appears in the RBN column, it indicates that the raw values of the clocks suggest that the QSO took place up to two minutes before the RBN reported the worked station commencing to CQ at this frequency. When this occurs, the most likely interpretation is that there is non-negligible skew between the two clocks, and the station was actually worked almost as soon as a CQ was posted by the RBN. But it might also mean that the entrant was simply lucky and found the CQing station just as it fired up on a new frequency.

Notes:
  • The encoding of some of the flags requires subjective decisions to be made as to whether the flag should be true or false; consequently, and because CQ has yet to understand the importance of making their scoring code public, the value of a flag for a specific QSO line in some circumstances might not match the value that CQ would assign. (Also, CQ has more data available in the form of check logs, which are generally not made public.)
  • I made no attempt to deduce or infer the run status of a QSO in the second party's log (if such exists), regardless of the status in the first party's log. This allows one cleanly to perform correct statistical analyses anent the number of QSOs made by running stations merely by excluding QSOs marked with a U in column k.
  • No attempt is made to detect the case in which both participants of a QSO bust the other station's call. This is a problematic situation because of the relatively high probability of a false positive unless both stations log the frequency as opposed to the band. (Also, on bands on which split-frequency QSOs are common, the absence of both transmit and receive frequency is a problem.) Because of the likelihood of false positives, it seems better, given the presumed rarity of double-bust QSOs, that no attempt be made to mark them.
  • The entries for the zones in the case of zone or reverse zone busts are normalised to two-digit values.

2019-11-03

drmap

drmap is a program for generating amateur-radio-related maps from USGS National Map data. From the description in the source code, with graphics added inline:

    drmap
      -ant  <antenna height>
     
        The height of the antenna. If -imperial is present, the height is in feet, otherwise it is in metres.
     
      -call <callsign>
     
        The callsign associated with the plot. Must be present.
       
      -cells <number of cells>
     
        The number of cells from the centre of the plot to the edges. The default is 3/8 of the width of the plot, in pixels. For the default width of 800, the value is therefore 300.
       
      -datadir <directory>
     
        The directory that contains USGS GridFloat tiles
       
      -elev
     
        Create an elevation plot: the plotted values are the elevation of each cell as seen from the antenna. Most are therefore negative.
       
      -grad
     
        Create a gradient plot: the plotted values are the gradient of the terrain in the direction from the QTH.
       
      -hzn [distance limit]
     
        Plot the elevation of the horizon around the periphery of the figure. Eye-level is set in the same way as eye-level for the -los option. Only distances out to the distance limit are used in this calculation. If the distance limit value is not present, it is assumed to be the same as the radius of the figure.
       
      -imperial
     
        Use imperial units instead of metric. That is, miles instead of kilometres and feet instead of metres. Applies both to values on the command line and to values on the output plot(s).
       
      -lat <latitude>
     
        Latitude in degrees north. If present, -long should also be present.
       
      -long <longitude>
     
        Longitude in degrees east. If present, -lat should also be present. Note that because the USGS data covers only the US, longitude should be negative; but if it is positive, the program will negate the value before use.
       
      -los
     
        Create a line-of-sight plot in addition to the standard height-field plot. Eye-level is assumed to be 1.5m or 5 feet, unless the -ant option is present, in which case eye-level is the same as the height of the antenna.

      -outdir <directory>
     
        The directory into which the output maps should be written
       
      -radius <distance1[,distance2[,distance3...]]>
     
        One or more radii for the plot(s), in units of km unless -imperial is present, in which case the units are miles.
       
      -qthdb <QTH database filename>
     
        A file linking QTH information to callsigns. Each line of the file should contain three entries pertaining to a station, separated by white space: the callsign, the latitude and the longitude. This database will be used only if one or both of the -lat and -long parameters is missing from the command line.
       
      -sm
     
        USGS tiles are each about 450MB in size. This parameter ("small memory") tells drmap to use the disk files that contain the tiles as-is, rather than moving them into RAM where their contents can be accessed much more quickly. Using this parameter therefore slows access, but means that there is essentially no limit to the number of tiles that may be used to build a plot. drmap automatically stops loading tiles into RAM when there is less than about 500MB of free RAM and switches to using the tiles on disk, so ordinarily there is no need to worry about whether to use the "-sm" parameter. This parameter will be removed in future versions of drmap if it seems to be unneeded in practice.
       
      -width <pixels>
     
        width, in pixels, of the plot(s). The default is 800. The height is automatically set to be three quarters of this value.
       
    Examples:
      drmap -call n7dr -datadir /zfs1/data/usgs/drmap -outdir /tmp/drmap -qthdb ~/radio/qthdb -imperial -ant 50 -radius 2 -los -hzn 5
     
        Look up the call "n7dr" (case is ignored) in the file "~/radio/qthdb". Each line in that file is of the form:


        <callsign>     <latitude>     <longitude>
       
        In particular the following line appears in that file on my system:


        N7DR        40.108016   -105.051700
       
        The latitude and longitude information for N7DR are extracted from that file.
       
        The program will look for relevant USGS files in the directory "/zfs1/data/usgs/drmap". If it fails to find any needed files, it will download them from the USGS and place them in that directory prior to using them.
       
        The program will write output plots in the directory /tmp/drmap.
       
        The program will use imperial units (miles and feet), and assume an antenna 50 feet above ground.
       
        It will generate a height plot displaying a radius of 2 miles around the N7DR QTH:


       
        It will also create a line-of-sight plot (showing the terrain visible from the putative antenna, 50 feet above ground level):


        
        In both plots, the elevation of the horizon as seen from 50 feet above ground level is drawn around the periphery. The program assumes that no contribution to the horizon is more than five miles from the QTH. It also (always) calculates and displays the M[ean] H[eight] A[bove] T[errain] of the antenna for the area inside the provided radius.


      drmap -call RMNP -datadir /zfs1/data/usgs/drmap -outdir /tmp/drmap -lat 40.441358 -long -105.753685 -radius 1
     
        Create a plot for the point 40.441358°N, 105.753685°W (which is in Rocky Mountain National Park).
       
        The program will look for relevant USGS files in the directory "/zfs1/data/usgs/drmap". If it fails to find any needed files, it will download them from the USGS and place them in that directory prior to using them.

        The program will write output plots in the directory /tmp/drmap.
       
        The program will use metric units (kilometres and metres).
       
        It will generate a height plot displaying a radius of 1 kilometre around the designated location:




      drmap -call RMNP -datadir /zfs1/data/usgs/drmap -outdir /tmp/drmap -lat 40.441358 -long -105.753685 -radius 10 -los -hzn 5 -grad -elev
   
        Create a plot for the point 40.441358°N, 105.753685°W (which is in Rocky Mountain National Park).
       
        The program will look for relevant USGS files in the directory "/zfs1/data/usgs/drmap". If it fails to find any needed files, it will download them from the USGS and place them in that directory prior to using them.

        The program will write output plots in the directory /tmp/drmap.
       
        The program will use metric units (kilometres and metres).
       
        It will generate a height plot displaying a radius of 10 kilometres around the designated location:



        It will also generate a line-of-sight plot, assuming the default eye level of 1.5m:

       
        It will also generate a gradient plot, showing at each point the gradient measured in a direction from the centre of the plot to that point:



        It will also generate an elevation plot (reminiscent of 1960s tie-dye), showing at each point the elevation angle measured in a direction from the centre of the plot to that point:


        In all these plots, the elevation of the horizon as seen from eye level is drawn around the periphery. The program assumes that no contribution to the horizon is more than five kilometres from the QTH.
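
For anyone wishing to prepare or check a QTH database file before handing it to drmap, the lookup described for -qthdb can be sketched in a few lines of Python. This is not drmap's own code; it merely mirrors the behaviour described above (three white-space-separated entries per line, callsign case ignored, positive longitudes negated for the -long parameter). The function names are invented for the example.

```python
from typing import Optional, Tuple


def lookup_qth(db_text: str, call: str) -> Optional[Tuple[float, float]]:
    """Return (latitude, longitude) for a callsign, ignoring case."""
    target = call.upper()
    for line in db_text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        if fields[0].upper() == target:
            return float(fields[1]), float(fields[2])
    return None


def west_longitude(lon: float) -> float:
    """Mirror the described handling of -long: a positive value is negated."""
    return -lon if lon > 0 else lon


# The line quoted in the first example above
db = "N7DR        40.108016   -105.051700\n"
print(lookup_qth(db, "n7dr"))
print(west_longitude(105.753685))
```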

2019-09-03

drlog moving from Eclipse to Codelite

The Eclipse IDE has been removed from the newest version of Debian stable. This is somewhat of a case of déjà vu, as a few years ago KDevelop3 was similarly removed from Kubuntu, at a time when that was the development environment for drlog, even though, at the time, KDevelop3 was stable and functional, whereas its replacement, KDevelop4, was... well, to be polite, completely broken. That was when I switched to Eclipse.

One can, in theory, still download a functional Eclipse from eclipse.org to use on Debian, but when I did that the result was a completely useless program that spewed Java errors and would not display dialog boxes.

So it is, again, time to go looking for an IDE for developing drlog.

I expect to be moving to Codelite, although it does seem to have some issues for serious development work, in particular a rather poor editor that doesn't, as far as I can tell, allow multiple views on different parts of a file. On the other hand, its use of virtual directories makes it easy to move projects from Eclipse to Codelite. Anyway, I shall try it for a while and see what happens.

In any case, rather than trying to inflict my choice on others, I shall be keeping the raw drlog makefile current, so that it can be used directly to build the program regardless of a user's choice of development environment. Sometime shortly I shall remove the Eclipse-related files from the repository.

2019-08-27

Stations With Lowest Probability of Busting a Call: CQ WW 2017 and 2018

Prior posts in this series:
Throughout this post, I apply the procedures developed in the second post above.

I begin with an ordered list of the stations with the lowest probabilities of busting a call in 2017 CQ WW SSB.

2017 CQ WW SSB -- weighted mean values of $p_{bust}$
Position Call weighted mean $Q_v$ $B$
1 OH0X 0.0010 978 0
2 C6ARW 0.0010 2,058 1
3 K1ZZ 0.0012 1,715 1
4 F4FTA 0.0012 800 0
5 DF2RG 0.0014 774 0
6 DJ8OG 0.0015 2,169 2
7 EW1P 0.0015 1,355 1
8 NH6V 0.0015 1,299 1
9 K2EP 0.0016 627 0
10 UW7W 0.0016 1,266 1

It is also interesting to plot the aggregated probability function for $p_{bust}$, weighted by the number of verified QSOs, $Q_v$, for all stations:


The location of the vertical line represents the weighted mean of the probability curve.

For 2018 CQ WW SSB:

2018 CQ WW SSB -- weighted mean values of $p_{bust}$
Position Call weighted mean $Q_v$ $B$
1 DF2RG 0.0011 887 0
2 N4PQX 0.0012 768 0
3 N9NB 0.0013 758 0
4 OM0WR 0.0013 732 0
5 KA1ZD 0.0013 1,500 1
6 YT5IVN 0.0014 1,402 1
7 WA2FZB 0.0015 628 0
8 JM1NKT 0.0016 602 0
9 KV1J 0.0017 582 0
10 W1CU 0.0017 579 0


For 2017 CQ WW CW:

2017 CQ WW CW -- weighted mean values of $p_{bust}$
Position Call weighted mean $Q_v$ $B$
1 K0KX 0.0005 1,682 0
2 DM5EE 0.0006 1,430 0
3 K1ZZ 0.0007 3,028 1
4 YO5OHO 0.0007 1,376 0
5 DP4X 0.0007 1,315 0
6 WE9V 0.0007 1,256 0
7 UT4U 0.0008 1,244 0
8 JH8SLS 0.0008 1,206 0
9 SP6MLX 0.0009 1,077 0
10 K2XR 0.0009 1,069 0


For 2018 CQ WW CW:

2018 CQ WW CW -- weighted mean values of $p_{bust}$
Position Call weighted mean $Q_v$ $B$
1 HB9ARF 0.0006 1,432 0
2 K6LL 0.0007 1,371 0
3 OM6RM 0.0007 1,286 0
4 UW1WU 0.0008 1,218 0
5 KG4V 0.0008 1,206 0
6 W3KB 0.0008 1,153 0
7 EW1P 0.0008 1,116 0
8 EU4E 0.0009 1,067 0
9 W2CDO 0.0009 1,011 0
10 DF2RG 0.0010 997 0


We can limit the analysis to calling stations (i.e., not the running station).

2017 CQ WW SSB and CW:

2017 CQ WW SSB -- weighted mean values of $p_{bust}$ (non-run)
Position Call weighted mean $Q_v$ $B$
1 EW1P 0.0008 1,132 0
2 DL6NDW 0.0012 829 0
3 DJ8OG 0.0013 762 0
4 DF2RG 0.0013 761 0
5 F4FTA 0.0015 633 0
6 K1ZZ 0.0015 1,304 1
7 SK6AW 0.0015 628 0
8 K2EP 0.0016 614 0
9 UW7W 0.0016 1,234 1
10 IZ4JMA 0.0016 592 0

2017 CQ WW CW -- weighted mean values of $p_{bust}$ (non-run)
Position Call weighted mean $Q_v$ $B$
1 RM9A 0.0005 1,834 0
2 ES9C 0.0006 1,662 0
3 K0KX 0.0007 1,325 0
4 SP2LNW 0.0008 1,241 0
5 R7MM 0.0008 1,239 0
6 DM5EE 0.0008 1,160 0
7 ED2C 0.0008 1,122 0
8 RF9C 0.0009 1,096 0
9 LY3B 0.0009 1,068 0
10 UZ3A 0.0010 991 0


2018 CQ WW SSB and CW:

2018 CQ WW SSB -- weighted mean values of $p_{bust}$ (non-run)
Position Call weighted mean $Q_v$ $B$
1 9A2EU 0.0010 976 0
2 YT5IVN 0.0011 880 0
3 DF2RG 0.0011 868 0
4 K5ZD 0.0013 749 0
5 K3PP 0.0013 744 0
6 N4PQX 0.0013 736 3
7 LX7I 0.0015 632 0
8 N9NB 0.0015 628 0
9 WA2FZB 0.0016 589 0
10 K3OO 0.0017 575 0

2018 CQ WW CW -- weighted mean values of $p_{bust}$ (non-run)
Position Call weighted mean $Q_v$ $B$
1 HG6N 0.0006 1,619 0
2 UX1VT 0.0007 1,391 0
3 S54X 0.0007 1,374 0
4 DL1WA 0.0007 1,307 0
5 K1AR 0.0008 1,209 0
6 NR4M 0.0008 1,148 0
7 RT4M 0.0008 1,138 0
8 LY5W 0.0008 1,111 0
9 K3PH 0.0009 1,100 0
10 W8FJ 0.0009 1,087 0


And similarly for running stations.

2017 CQ WW SSB and CW:

2017 CQ WW SSB -- weighted mean values of $p_{bust}$ (run)
Position Call weighted mean $Q_v$ $B$
1 C6ARW 0.0010 1,925 1
2 CF7RR 0.0013 1,557 1
3 OH0X 0.0014 694 0
4 HP1XT 0.0016 621 0
5 OH0V 0.0017 2,337 3
6 TK9R 0.0018 3,982 6
7 NH6V 0.0019 1,035 1
8 PY2KJ 0.0021 1,456 2
9 DJ8OG 0.0021 1,407 2
10 VA2WA 0.0022 1,853 3

2017 CQ WW CW -- weighted mean values of $p_{bust}$ (run)
Position Call weighted mean $Q_v$ $B$
1 K1ZZ 0.0004 2,294 0
2 W1GD 0.0011 888 0
3 8S0DX 0.0011 873 0
4 SN1Y 0.0012 1,714 1
5 N3ER 0.0013 749 0
6 EU4E 0.0013 747 0
7 UT4U 0.0013 731 0
8 OM0WR 0.0013 1,517 1
9 JH8SLS 0.0014 710 0
10 S59ABC 0.0014 2,898 3


2018 CQ WW SSB and CW:

2018 CQ WW SSB -- weighted mean values of $p_{bust}$ (run)
Position Call weighted mean $Q_v$ $B$
1 OM0WR 0.0014 697 0
2 MI0I 0.0023 423 0
3 PY2KJ 0.0025 1,207 2
4 ZF9CW 0.0025 3,973 9
5 OQ5M 0.0025 1,190 2
6 KA1ZD 0.0026 381 0
7 PY2UD 0.0028 715 1
8 LY9A 0.0028 711 1
9 UW2M 0.0029 1,050 2
10 M0MCV 0.0029 1,049 2

2018 CQ WW CW -- weighted mean values of $p_{bust}$ (run)
Position Call weighted mean $Q_v$ $B$
1 HB9ARF 0.0008 1,139 0
2 K8CX 0.0012 800 0
3 EU4E 0.0012 772 0
4 YU0A 0.0012 770 0
5 OH8X 0.0013 2,322 2
6 ZR2A 0.0013 1,485 1
7 R7MM 0.0014 691 0
8 UA9LAO 0.0014 682 0
9 DF3AX 0.0015 629 0
10 DK2GZ 0.0017 573 0



We can also look at the changes over the period from 2005 to 2018.

First for all QSOs:


For non-run QSOs:


And for run QSOs:


Most-Logged Stations in CQ WW CW and SSB Contests, 2018

The public CQ WW CW and SSB logs allow us easily to tabulate the stations that appear in the largest number of entrants' logs. For 2018, the ten stations with the largest number of appearances in (augmented) CQ WW SSB logs were:

Callsign Appearances % logs
CN3A 9,585 61
D4C 8,440 55
EF8R 8,369 56
ES9C 7,649 50
LZ9W 7,643 48
M6T 7,249 49
FY5KE 7,158 46
PZ5K 7,089 46
PJ4G 6,898 43
DF0HQ 6,746 46

The first column in the table is the callsign. The second column is the total number of times that the call appears in logs. That is, if a station worked CN3A on six bands, that will increment the value in the second column of the CN3A row by six. The third column is the percentage of logs that contain the callsign at least once.
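
The tabulation is simple enough to sketch in Python. The sketch assumes the logs have already been reduced to a mapping from each entrant to the list of calls worked, one entry per QSO; the function name, the log identifiers and the sample data are invented, and rounding the percentage to the nearest integer is an assumption.

```python
from collections import Counter
from typing import Dict, List, Tuple


def most_logged(logs: Dict[str, List[str]], top: int = 10) -> List[Tuple[str, int, int]]:
    """Return (callsign, appearances, % of logs containing it) rows."""
    appearances = Counter()      # every QSO counts (e.g. six bands -> six)
    logs_containing = Counter()  # each log counts at most once per callsign
    for worked in logs.values():
        appearances.update(worked)
        logs_containing.update(set(worked))
    n_logs = len(logs)
    return [(call, n, round(100 * logs_containing[call] / n_logs))
            for call, n in appearances.most_common(top)]


# Synthetic data: four logs, identified by made-up names
logs = {
    "LOG1": ["CN3A", "CN3A", "D4C"],
    "LOG2": ["CN3A"],
    "LOG3": ["D4C"],
    "LOG4": ["EF8R"],
}
for row in most_logged(logs):
    print(*row)
```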

For comparison, here is the equivalent table for 2017:

Callsign Appearances % logs
CN3A 10,347 62
EF8R 9,099 56
CN2R 8,649 55
LZ9W 8,251 53
CN2AA 8,090 54
ES9C 7,941 53
M6T 7,856 52
A73A 7,501 47
PZ5K 7,481 45
DF0HQ 7,372 50

Similarly, the ten stations with the largest number of appearances in CQ WW CW 2018 were:

Callsign Appearances % logs
EF8R 12,063 72
CN3A 11,603 70
P33W 10,804 67
CR3DX 10,546 67
9A1A 10,518 69
TK0C 10,318 65
LZ9W 10,026 67
M6T 9,785 65
ES9C 9,521 62
PJ2T 9,455 57

And the equivalent table for 2017:

Callsign Appearances % logs
TK0C 10,719 63
9A1A 10,594 65
M6T 9,884 62
CR3W 9,783 61
YT5A 9,692 63
PJ2T 9,661 53
EF8R 9,538 60
LZ9W 9,257 60
V47T 9,128 55
CN2AA 9,092 58

We can also perform the same analysis for, say, a ten-year span, to show which stations have most consistently appeared in other stations' logs. So, for CQ WW SSB for the period 2009 to 2018, we find:

Callsign Appearances % logs
LZ9W 84,898 56
CN3A 84,806 59
DF0HQ 78,670 54
PJ2T 71,826 47
OT5A 71,036 51
K3LR 69,954 49
P33W 66,944 49
A73A 63,280 46
CN2R 59,700 45
V26B 57,338 41

And for CW over the same span:

Callsign Appearances % logs
9A1A 99,223 67
LZ9W 96,166 66
PJ2T 91,295 57
DF0HQ 84,414 61
P33W 83,190 59
W3LPL 74,354 52
K3LR 73,820 52
PJ4A 69,920 51
LX7I 66,448 50
CR3L 64,432 44

2019-08-25

Call Busts and Reverse Busts: CQ WW 2018

This is the latest in a series of posts on (callsign) busts and reverse busts in the CQ WW contests. These posts are based on the various public CQ WW logs (cq-ww-2005--2018-augmented.xz; see here for details of the augmented format).

Prior posts in the series on busts and reverse busts in CQ WW:
In this post, we include only verified QSOs; that is, QSOs for which both parties submitted a log.

First, the tables for the CQ WW SSB 2018:

2018 SSB -- Most Busts
Position Call QSOs Busts % Busts
1 V26B 6,160 166 2.7
2 CN3A 9,457 159 1.7
3 D4C 8,295 155 1.9
4 LZ9W 7,558 142 1.9
5 TM0T 4,189 136 3.2
6 A44A 3,944 135 3.4
7 ZF1A 6,423 115 1.8
8 OK5Z 3,634 112 3.1
9 EF8R 8,357 107 1.3
10 OT5A 5,021 106 2.1

2018 SSB -- Most Reverse Busts
Position Call QSOs Reverse Busts % Reverse Busts
1 OG60F 4,258 486 11.4
2 HC0E 1,873 274 14.6
3 TM3R 3,605 233 6.5
4 DF0HQ 6,857 195 2.8
5 KL7RA 4,622 157 3.4
6 IK2YCW 4,021 150 3.7
7 EF8R 8,357 148 1.8
8 JA3YBK 3,509 140 4.0
9 JE2YRB 1,362 140 10.3
10 ED3M 3,152 131 4.2

2018 SSB -- Highest Percentage of Busts (≥100 QSOs)
Position Call QSOs % Busts
1 YB2BNN 136 16.9
2 SP9KB 189 14.8
3 UB3AQA 129 13.2
4 IU8GYT 210 12.4
5 YC1HLT 108 12.0
6 KE0KOT 101 11.9
7 R7AC 112 11.6
8 W7UPF 137 10.9
9 TA3PZ 152 10.5
10 9W6VAT 143 10.5

2018 SSB -- Highest Percentage of Reverse Busts (≥100 QSOs)
Position Call QSOs % Reverse Busts
1 HC0E 1,873 14.6
2 OG60F 4,258 11.4
3 BY4QA 437 11.2
4 9M2SDX 135 11.1
5 BV2A/3 443 11.1
6 W6MOB 101 10.9
7 JE2YRB 1,362 10.3
8 IK3SSJ 207 9.7
9 9M6GOH 288 9.0
10 IK3XTY 123 8.9
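
The percentage columns in the tables above are the ratio of busts (or reverse busts) to verified QSOs, and the "highest percentage" tables apply a minimum number of QSOs before ranking. A minimal Python sketch of that arithmetic follows, using three rows from the "Most Busts" table plus one invented low-volume station to show the threshold; the function name and the one-decimal rounding are assumptions.

```python
from typing import Dict, List, Tuple


def bust_percentages(stats: Dict[str, Tuple[int, int]],
                     min_qsos: int = 100,
                     top: int = 10) -> List[Tuple[str, int, float]]:
    """Rank stations by percentage of busts, given (QSOs, busts) per call."""
    rows = [(call, qsos, round(100 * busts / qsos, 1))
            for call, (qsos, busts) in stats.items()
            if qsos >= min_qsos]
    return sorted(rows, key=lambda row: row[2], reverse=True)[:top]


stats = {
    "V26B": (6160, 166),
    "CN3A": (9457, 159),
    "TM0T": (4189, 136),
    "EXAMPLE": (50, 10),  # invented station, excluded by the threshold
}
for call, qsos, pct in bust_percentages(stats):
    print(call, qsos, pct)
```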

Now the tables for the 2018 CW data:

2018 CW -- Most Busts
Position Call QSOs Busts % Busts
1 TK0C 10,328 175 1.7
2 OZ5E 6,144 149 2.4
3 D41CV 7,456 144 1.9
4 CN3A 11,670 141 1.2
5 LZ9W 9,958 132 1.3
6 F6KOP 3,738 129 3.5
7 NP2P 3,369 120 3.6
8 RW0A 6,524 116 1.8
9 G3V 6,593 113 1.7
10 RM9A 8,834 112 1.3

2018 CW -- Most Reverse Busts
Position Call QSOs Reverse Busts % Reverse Busts
1 PE75W 1,408 412 29.3
2 ES9C 9,696 322 3.3
3 JS3CTQ 2,455 320 13.0
4 CN3A 11,670 278 2.4
5 DF0HQ 8,658 274 3.2
6 EF8R 12,115 271 2.2
7 OG60F 5,561 232 4.2
8 RM9A 8,834 213 2.4
9 UA4S 5,283 208 3.9
10 CR3W 8,586 173 2.0

2018 CW -- Highest Percentage of Busts (≥100 QSOs)
Position Call QSOs % Busts
1 KD5QHV 116 23.3
2 YO7LYM 250 18.8
3 W3ICM 107 18.7
4 OH1LAR 156 18.6
5 PA0SKP 204 18.1
6 K4KAY 235 17.9
7 W4PH 198 17.7
8 LA6M 225 17.0
9 GD4EIP 453 16.8
10 VE5WI 180 16.7

2018 CW -- Highest Percentage of Reverse Busts (≥100 QSOs)
Position Call QSOs % Reverse Busts
1 PE75W 1,408 29.3
2 YU1LG 167 19.2
3 DK3HM 294 15.6
4 K3HW 414 14.7
5 PY2LPM 103 14.6
6 OG55W 463 14.0
7 JS3CTQ 2,455 13.0
8 SA0BXV 157 12.7
9 JA8HBO 103 12.6
10 G0AZS 188 12.2

Now we look at the tables that integrate ten years' data.

For SSB:

2009 to 2018 SSB -- Most Busts
Position Call QSOs Busts % Busts
1 LZ9W 83,740 1,456 1.7
2 OT5A 69,414 1,363 2.0
3 PJ2T 70,851 1,324 1.9
4 CN3A 83,700 1,316 1.6
5 A73A 62,374 1,260 2.0
6 HG1S 42,980 868 2.0
7 EF8R 50,246 850 1.7
8 V26B 56,636 843 1.5
9 JA7YRR 28,832 821 2.8
10 HG7T 54,081 785 1.5

2009 to 2018 SSB -- Most Reverse Busts
Position Call QSOs Reverse Busts % Reverse Busts
1 DF0HQ 79,704 2,301 2.9
2 JA3YBK 41,284 1,311 3.2
3 K3LR 69,728 1,078 1.5
4 CN2R 59,585 1,027 1.7
5 WE3C 36,245 976 2.7
6 CN3A 83,700 891 1.1
7 HG1S 42,980 884 2.1
8 S52ZW 32,848 847 2.6
9 W3LPL 53,037 800 1.5
10 HK1NA 40,363 788 2.0

2009 to 2018 SSB -- Highest Percentage of Busts (≥500 QSOs)
Position Call QSOs % Busts
1 PV8ADI 858 16.2
2 K2JMY 2,062 14.8
3 EA7JQT 572 12.4
4 EA1HTF 1,244 11.9
5 PU1MMZ 544 11.4
6 K8TS 679 10.9
7 PU2TRX 868 10.8
8 YB9KA 557 10.1
9 HC2GF 936 10.0
10 E20WXA 594 9.9

2009 to 2018 SSB -- Highest Percentage of Reverse Busts (≥500 QSOs)
Position Call QSOs % Reverse Busts
1 CW90A 1,370 30.9
2 BA8AG 752 16.4
3 BW2/KU1CW 940 12.3
4 OG60F 4,258 11.4
5 ZP6DYA 1,226 11.3
6 V84SCQ 806 10.5
7 LU9DDJ 701 10.4
8 JE2YRB 1,362 10.3
9 BV55D 919 10.2
10 JG3SVP 1,270 10.1

And for CW:

2009 to 2018 CW -- Most Busts
Position Call QSOs Busts % Busts
1 PJ2T 90,264 1,260 1.4
2 LZ9W 95,211 1,193 1.3
3 PV8ADI 8,127 1,177 14.5
4 PI4CC 53,794 1,056 2.0
5 OZ5E 30,550 824 2.7
6 9A1A 98,218 809 0.8
7 PJ4A 69,325 806 1.2
8 HG1S 35,367 801 2.3
9 D4C 56,710 788 1.4
10 RW0A 47,270 768 1.6

2009 to 2018 CW -- Most Reverse Busts
Position Call QSOs Reverse Busts % Reverse Busts
1 JS3CTQ 24,173 3,139 13.0
2 DF0HQ 86,422 3,073 3.6
3 ES9C 59,162 1,834 3.1
4 W2FU 56,827 1,456 2.6
5 K3LR 74,177 1,408 1.9
6 DR1A 49,284 1,290 2.6
7 IR4X 44,952 1,274 2.8
8 NR4M 54,508 1,184 2.2
9 HG7T 58,224 1,178 2.0
10 W0AIH 29,076 1,151 4.0

2009 to 2018 CW -- Highest Percentage of Busts (≥500 QSOs)
Position Call QSOs % Busts
1 W2UDT 911 19.8
2 BD3MV 1,005 19.2
3 YO7LYM 1,711 17.0
4 JA3AHY 554 16.6
5 DJ5UZ 686 16.5
6 WP3Y 570 16.3
7 AE3D 1,016 15.8
8 YU1NIM 619 15.8
9 AD7XG 1,178 15.4
10 SM5BJT 615 14.6

2009 to 2018 CW -- Highest Percentage of Reverse Busts (≥500 QSOs)
Position Call QSOs % Reverse Busts
1 G3RWF 943 99.9
2 RZ3VO 1,792 49.9
3 YT65A 1,149 37.2
4 5K0A 1,853 32.1
5 OG55W 3,032 29.4
6 PE75W 1,408 29.3
7 DP65HSC 516 16.9
8 5J1E 1,523 15.6
9 SB0A 1,102 15.5
10 YP0HQ 1,787 14.0

NB: In tables of reverse busts, one sometimes finds what seems like an unreasonable number of reverse busts (as, in this table, for G3RWF). This is generally caused by a discrepancy between the call actually sent by the listed station and the one recorded as being sent in at least some QSOs in the log, although it can, of course, be due to an unusual callsign structure or poor signal quality.

2019-08-24

Augmented Logs for ARRL DX CW and SSB Contests, 2018

Links to the augmented logs may be followed here.

The augmented logs for the ARRL DX contests contain the same information as the cleaned logs, but with the addition of some useful (derived) information on each line. The information added to each line comprises:
  1. The sequence of four characters that are the same for each entry in a particular log:
    •  a. letter "A" or "U" indicating "assisted" or "unassisted"
    •  b. letter "Q", "L", "H" or "U", indicating respectively QRP, low power, high power or unknown power level
    •  c. letter "S", "M", "C" or "U", indicating respectively a single-operator, multi-operator, checklog or unknown operator category 
    •  d. character "1", "2", "+" or "U", indicating respectively that the number of transmitters is one, two, unlimited or unknown
  2. A four-digit number representing the time of the contact in minutes measured from the start of the contest. (I realise that this can be calculated from the other information on the line, but it saves subsequent processors of the file considerable time to have the number readily available in the file without having to calculate it each time.)
  3. Band
  4. A set of fourteen flags, each -- apart from column k and column n -- encoded as T/F: 
    • a. QSO is confirmed by a log from the second party 
    • b. QSO is a reverse bust (i.e., the second party appears to have bust the call of the first party) 
    • c. QSO is an ordinary bust (i.e., the first party appears to have bust the call of the second party) 
    • d. the call of the second party is unique 
    • e. QSO appears to be a NIL 
    • f. QSO is with a station that did not send in a log, but who did make 20 or more QSOs in the contest 
    • g. QSO appears to be a country mult (may be T for W/VE stations only)
    • h. QSO appears to be a state/province mult (may be T for DX stations only)
    • i. QSO is a zone bust (i.e., the received zone appears to be a bust)
    • j. QSO is a reverse zone bust (i.e. the second party appears to have bust the zone of the first party)
    • k. This entry has three possible values rather than just T/F:
      • T: QSO appears to be made during a run by the first party
      • F: QSO appears not to be made during a run by the first party
      • U: the run status is unknown because insufficient frequency information is available in the first party's log
    • l. QSO is a dupe
    • m. QSO is a dupe in the second party's log
    • n. RBN information (see below)
  5. If the QSO is a reverse bust, the call logged by the second party; otherwise, the placeholder "-"
  6. If the QSO is an ordinary bust, the correct call that should have been logged by the first party; otherwise, the placeholder "-"
  7. If the QSO is a reverse exchange bust, the exchange logged by the second party; otherwise, the placeholder "-"
  8. If the QSO is an ordinary exchange bust, the correct exchange that should have been logged by the first party; otherwise, the placeholder "-"
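
As an illustration of how the added fields listed above might be decoded, here is a minimal Python sketch. It assumes that the caller has already extracted the four category characters and the fourteen flag characters from a line (the exact column positions are not described in this post, so the sketch does not guess at them). All function and key names are hypothetical, and the contest start time in the usage example is only an illustrative value.

```python
from datetime import datetime, timezone

# Meanings of the four category characters (item 1, a to d)
ASSISTED = {"A": "assisted", "U": "unassisted"}
POWER = {"Q": "QRP", "L": "low power", "H": "high power", "U": "unknown"}
OPERATOR = {"S": "single-operator", "M": "multi-operator",
            "C": "checklog", "U": "unknown"}
TRANSMITTERS = {"1": "one", "2": "two", "+": "unlimited", "U": "unknown"}

# Hypothetical names for the fourteen flags (item 4, a to n), in order
FLAG_NAMES = [
    "confirmed", "reverse_bust", "bust", "unique", "nil", "no_log_20_plus",
    "country_mult", "state_province_mult", "zone_bust", "reverse_zone_bust",
    "run", "dupe", "dupe_in_second_log", "rbn",
]


def decode_category(cat: str) -> dict:
    """Decode the four-character category sequence into readable values."""
    if len(cat) != 4:
        raise ValueError("category must be exactly four characters")
    return {
        "assisted": ASSISTED[cat[0]],
        "power": POWER[cat[1]],
        "operator": OPERATOR[cat[2]],
        "transmitters": TRANSMITTERS[cat[3]],
    }


def minutes_from_start(start: datetime, qso_time: datetime) -> int:
    """Whole minutes elapsed between the contest start and the QSO."""
    return int((qso_time - start).total_seconds() // 60)


def decode_flags(flags: str) -> dict:
    """Decode fourteen flag characters; column k may be U, column n is raw."""
    if len(flags) != 14:
        raise ValueError("expected exactly fourteen flag characters")
    decoded = {}
    for name, ch in zip(FLAG_NAMES, flags):
        if name == "rbn":
            decoded[name] = ch  # RBN character is kept as-is
        elif name == "run":
            decoded[name] = {"T": True, "F": False, "U": None}[ch]
        else:
            decoded[name] = {"T": True, "F": False}[ch]
    return decoded


print(decode_category("UHS1"))
start = datetime(2018, 2, 17, 0, 0, tzinfo=timezone.utc)  # illustrative start
qso = datetime(2018, 2, 17, 1, 30, tzinfo=timezone.utc)
print(f"{minutes_from_start(start, qso):04d}")  # four digits, as in the file
print(decode_flags("TFFFFFFTFFTFF-")["run"])
```

Representing an unknown run status as `None` rather than `False` keeps the three-valued nature of column k visible to later processing.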

RBN Information


In the CW contests from 2009 onwards, the RBN has been active, automatically spotting the frequency at which any station calling CQ was transmitting. To reflect possible use of RBN information, the augmented files include a fourteenth column. For the sake of uniformity, this column is present in all the augmented files, regardless of whether the RBN actually contributed useful information to a particular contest.

Each QSO has one of several characters in the fourteenth column of flags. These characters should be interpreted as follows:

'-'
  No useful RBN-derived information is available for this QSO.

'0'
  The worked station (i.e., the second call on the log line) appears to have begun to CQ on this frequency within (roughly) 60 seconds prior to the QSO.

'A' to 'Z'
  For the nth letter of the alphabet: the worked station appears to have been CQing on this frequency for (roughly) n minutes prior to the QSO.

'+'
  The worked station appears to have been CQing for more than 26 minutes on this frequency.

'<'
  Because the RBN is distributed, and because each contest entrant station has its own clock, there is generally a skew between the reading of the clock of the station making the QSO and the timestamp from the RBN at which it believes a posting was made (indeed, it's unclear from the RBN's [lack of] documentation exactly how the timestamp on an individual RBN posting is to be interpreted). If the character '<' appears in the RBN column, it indicates that the raw values of the clocks suggest that the QSO took place up to two minutes before the RBN reported the worked station commencing to CQ at this frequency. When this occurs, the most likely interpretation is that there is non-negligible skew between the two clocks, and the station was actually worked almost as soon as a CQ was posted by the RBN. But it might also mean that the entrant was simply lucky and found the CQing station just as it fired up on a new frequency.
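
The character encoding described above can be turned into something easier to work with in a script. The following Python sketch is one possible decoding; the function name `decode_rbn` and the labels it returns are hypothetical, chosen only for illustration.

```python
def decode_rbn(ch: str):
    """Map one RBN flag character to a (kind, minutes) pair.

    minutes is the approximate time the worked station had been CQing on
    the frequency, or None where the encoding gives no single value.
    """
    if len(ch) != 1:
        raise ValueError("expected a single character")
    if ch == "-":
        return ("no_information", None)
    if ch == "0":
        return ("cq_minutes", 0)          # began to CQ within roughly 60 seconds
    if "A" <= ch <= "Z":
        return ("cq_minutes", ord(ch) - ord("A") + 1)  # nth letter: n minutes
    if ch == "+":
        return ("more_than_26_minutes", None)
    if ch == "<":
        return ("qso_before_first_spot", None)  # probable clock skew
    raise ValueError(f"unrecognised RBN character: {ch!r}")


print(decode_rbn("C"))  # ('cq_minutes', 3)
print(decode_rbn("<"))  # ('qso_before_first_spot', None)
```

Keeping '+' and '<' as distinct kinds, rather than forcing them onto a numeric scale, avoids implying more precision than the encoding actually carries.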

Notes:
  • The encoding of some of the flags requires subjective decisions to be made as to whether the flag should be true or false; consequently, and because the ARRL has yet to understand the importance of making the scoring code public, the value of a flag for a specific QSO line in some circumstances might not match the value that the ARRL has assigned. (Also, the ARRL has more, non-public, data available.)
  • I made no attempt to deduce or infer the run status of a QSO in the second party's log (if such exists), regardless of the status in the first party's log. This allows one cleanly to perform correct statistical analyses anent the number of QSOs made by running stations merely by excluding QSOs marked with a U in column k.
  • No attempt is made to detect the case in which both participants of a QSO bust the other station's call. This is a problematic situation because of the relatively high probability of a false positive unless both stations log the frequency as opposed to the band. (Also, on bands on which split-frequency QSOs are common, the absence of both transmit and receive frequency is a problem.) Because of the likelihood of false positives, it seems better, given the presumed rarity of double-bust QSOs, that no attempt be made to mark them.
  • The entries for the exchanges in the case of exchange or reverse exchange busts are normalised to two-digit values in the same manner as the exchanges in the cleaned logs.
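
The note above about excluding QSOs marked with a U in column k can be sketched in Python as follows. The sketch assumes the column-k characters have already been extracted from the lines of interest; the function name `run_fraction` is hypothetical.

```python
def run_fraction(run_flags):
    """Fraction of QSOs made while running, ignoring unknown (U) entries.

    run_flags is any iterable of the column-k characters T, F or U.
    Returns None if no QSO has a known run status.
    """
    known = [flag for flag in run_flags if flag in ("T", "F")]
    if not known:
        return None
    return known.count("T") / len(known)


# Five QSOs, one of which has unknown run status and is therefore excluded
print(run_fraction("TTFUT"))  # 0.75
```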