D. R. Evans (N7DR): 2019

2019-12-20

Cleaned and Augmented Logs for ARRL DX CW and SSB contests, 2018 to 2019

Cleaned Logs

Cleaned versions of the logs for the ARRL DX CW and SSB contests are now available for 2018 and 2019.

Links to the cleaned logs may be followed here.

The cleaned logs are the result of processing the QSO: lines from the entrants' submitted Cabrillo files (as [gratuitously] modified by the ARRL) to ensure that all fields contain valid values and all the data match the column-specific standard format for this contest.

Any line containing illegal data in a field has simply been removed. Also, only the QSO: lines are retained, so that each line in the file can be processed easily. All QTH multipliers are rendered as two letters, and the power is rendered as four digits, regardless of how the submitted log recorded these two fields; this should simplify processing the logs by scripts or programs, as should the use of fixed-length records in these cleaned files.

Augmented Logs

Links to the augmented logs may be followed here.

The augmented logs for the ARRL DX contests contain the same information as the cleaned logs, but with the addition of some useful (derived) information on each line. The information added to each line comprises:

The sequence of four characters that are the same for each entry in a particular log:

a. letter "A" or "U" indicating "assisted" or "unassisted"
b. letter "Q", "L", "H" or "U", indicating respectively QRP, low power, high power or unknown power level
c. letter "S", "M", "C" or "U", indicating respectively a single-operator, multi-operator, checklog or unknown operator category
d. character "1", "2", "+" or "U", indicating respectively that the number of transmitters is one, two, unlimited or unknown

A four-digit number representing the time of the contact in minutes measured from the start of the contest. (I realise that this can be calculated from the other information on the line, but it saves subsequent processors of the file considerable time to have the number readily available in the file without having to calculate it each time.)
Band
A set of fourteen flags, each -- apart from column k and column n -- encoded as T/F:
- a. QSO is confirmed by a log from the second party
- b. QSO is a reverse bust (i.e., the second party appears to have bust the call of the first party)
- c. QSO is an ordinary bust (i.e., the first party appears to have bust the call of the second party)
- d. the call of the second party is unique
- e. QSO appears to be a NIL
- f. QSO is with a station that did not send in a log, but who did make 20 or more QSOs in the contest
- g. QSO appears to be a country mult (may be T for W/VE stations only)
- h. QSO appears to be a state/province mult (may be T for DX stations only)
- i. QSO is an exchange bust (i.e., the received exchange appears to be a bust)
- j. QSO is a reverse exchange bust (i.e., the second party appears to have bust the exchange of the first party)
- k. This entry has three possible values rather than just T/F:
  - T: QSO appears to be made during a run by the first party
  - F: QSO appears not to be made during a run by the first party
  - U: the run status is unknown because insufficient frequency information is available in the first party's log
- l. QSO is a dupe
- m. QSO is a dupe in the second party's log
- n. RBN information (see below)
If the QSO is a reverse bust, the call logged by the second party; otherwise, the placeholder "-"
If the QSO is an ordinary bust, the correct call that should have been logged by the first party; otherwise, the placeholder "-"
If the QSO is a reverse exchange bust, the exchange logged by the second party; otherwise, the placeholder "-"
If the QSO is an ordinary exchange bust, the correct exchange that should have been logged by the first party; otherwise, the placeholder "-"
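The category characters and flags described above can be decoded with a few lookup tables. The following is a minimal sketch, not the code used to generate the files: it assumes that the four category characters and the fourteen flags have already been extracted from the line as two strings (the position of those fields within the line is not specified here), and all the names in it are hypothetical.

```python
# Lookup tables taken from the field descriptions above.
ASSISTED = {"A": "assisted", "U": "unassisted"}
POWER = {"Q": "QRP", "L": "low", "H": "high", "U": "unknown"}
OPERATOR = {"S": "single-op", "M": "multi-op", "C": "checklog", "U": "unknown"}
TRANSMITTERS = {"1": "one", "2": "two", "+": "unlimited", "U": "unknown"}

# Hypothetical names for flags a to n, in order.
FLAG_NAMES = [
    "confirmed", "reverse_bust", "bust", "unique", "nil", "no_log_20_plus",
    "country_mult", "state_province_mult", "exchange_bust",
    "reverse_exchange_bust", "run", "dupe", "dupe_in_second_log", "rbn",
]


def decode_category(cat):
    """Decode the four-character category sequence (assumed already extracted)."""
    if len(cat) != 4:
        raise ValueError("expected four category characters")
    return {
        "assisted": ASSISTED[cat[0]],
        "power": POWER[cat[1]],
        "operator": OPERATOR[cat[2]],
        "transmitters": TRANSMITTERS[cat[3]],
    }


def decode_flags(flags):
    """Decode the fourteen flags (assumed to be supplied as one 14-character string)."""
    if len(flags) != 14:
        raise ValueError("expected fourteen flag characters")
    decoded = {}
    for name, ch in zip(FLAG_NAMES, flags):
        if name == "rbn":
            decoded[name] = ch  # column n: keep the raw RBN character
        elif name == "run":
            decoded[name] = {"T": True, "F": False, "U": None}[ch]  # column k
        else:
            decoded[name] = {"T": True, "F": False}[ch]
    return decoded
```

For example, `decode_category("ULS1")` describes an unassisted, low-power, single-operator, single-transmitter entry.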

RBN Information

In CW contests from 2009 onwards, the RBN has been active, automatically spotting the frequency at which any station calling CQ was transmitting. To reflect possible use of RBN information, the augmented files include a fourteenth column. For the sake of uniformity, this column is present in all the augmented files, regardless of whether the RBN actually contributed useful information to a particular contest.

Each QSO has one of several characters in the fourteenth column of flags. These characters should be interpreted as follows:

'-'
No useful RBN-derived information is available for this QSO.

'0'
The worked station (i.e., the second call on the log line) appears to have begun to CQ on this frequency within (roughly) 60 seconds prior to the QSO.

'A' to 'Z'
For the nth letter of the alphabet: the worked station appears to have been CQing on this frequency for (roughly) n minutes prior to the QSO.

'+'
The worked station appears to have been CQing for more than 26 minutes on this frequency.

'<'
Because the RBN is distributed, and because each contest entrant station has its own clock, there is generally a skew between the reading of the clock of the station making the QSO and the timestamp from the RBN at which it believes a posting was made (indeed, it's unclear from the RBN's [lack of] documentation exactly how the timestamp on an individual RBN posting is to be interpreted). If the character '<' appears in the RBN column, it indicates that the raw values of the clocks suggest that the QSO took place up to two minutes before the RBN reported the worked station commencing to CQ at this frequency. When this occurs, the most likely interpretation is that there is non-negligible skew between the two clocks, and the station was actually worked almost as soon as a CQ was posted by the RBN. But it might also mean that the entrant was simply lucky and found the CQing station just as it fired up on a new frequency.
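The interpretation of the RBN character can be expressed as a small function. This is only an illustrative sketch; the function name and the labels it returns are my own inventions for the example and do not appear in the files.

```python
def decode_rbn(ch):
    """Return (kind, minutes) for the character in the fourteenth flag column.

    kind is a hypothetical label; minutes is None when no number applies.
    """
    if len(ch) != 1:
        raise ValueError("expected a single character")
    if ch == "-":
        return ("none", None)      # no useful RBN-derived information
    if ch == "0":
        return ("fresh", 0)        # began to CQ within roughly 60 seconds
    if "A" <= ch <= "Z":
        # nth letter of the alphabet: CQing for roughly n minutes
        return ("minutes", ord(ch) - ord("A") + 1)
    if ch == "+":
        return ("long", None)      # CQing for more than 26 minutes
    if ch == "<":
        return ("early", None)     # QSO logged up to two minutes before the RBN post
    raise ValueError(f"unexpected RBN character: {ch!r}")
```

So `decode_rbn("C")` indicates roughly three minutes of CQing on the frequency before the QSO.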

Notes:

The encoding of some of the flags requires subjective decisions to be made as to whether the flag should be true or false; consequently, and because the ARRL has yet to understand the importance of making the scoring code public, the value of a flag for a specific QSO line in some circumstances might not match the value that the ARRL has assigned. (Also, the ARRL has additional, non-public, data available.)
I made no attempt to deduce or infer the run status of a QSO in the second party's log (if such exists), regardless of the status in the first party's log. This allows one cleanly to perform correct statistical analyses anent the number of QSOs made by running stations merely by excluding QSOs marked with a U in column k.
No attempt is made to detect the case in which both participants of a QSO bust the other station's call. This is a problematic situation because of the relatively high probability of a false positive unless both stations log the frequency as opposed to the band. (Also, on bands on which split-frequency QSOs are common, the absence of both transmit and receive frequency is a problem.) Because of the likelihood of false positives, it seems better, given the presumed rarity of double-bust QSOs, that no attempt be made to mark them.
The entries for the exchanges in the case of exchange or reverse exchange busts are normalised to two-letter or four-digit values in the same manner as described above for the exchanges in the cleaned logs.
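As an illustration of the note above about excluding QSOs marked with a U in column k, here is a minimal sketch that calculates the fraction of QSOs made while running. It assumes that the fourteen flags for each QSO are available as a 14-character string, so that column k is at index 10; the function name is hypothetical.

```python
from collections import Counter

RUN_COLUMN = 10  # column k is the eleventh flag (index 10), assuming a 14-character flag string


def run_fraction(flag_strings):
    """Fraction of QSOs made during a run, ignoring QSOs whose run status is U."""
    counts = Counter(flags[RUN_COLUMN] for flags in flag_strings)
    known = counts["T"] + counts["F"]
    if known == 0:
        return None  # no QSO has a known run status
    return counts["T"] / known
```

With two run QSOs, one non-run QSO and one QSO of unknown status, the result is 2/3: the unknown QSO contributes to neither the numerator nor the denominator.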

2019-12-13

Revised Augmented Logs for CQ WW CW and SSB Contests, 2005 to 2018

AD1C has recently once more made accessible historical cty.dat and associated files. A copy of the cty.dat files is here.

This allows one to regenerate the augmented contest files, but using data that were current at the time of the contest. A pointer to these revised augmented files is here.

The augmented logs contain the same information as cleaned logs, but with the addition of some useful (derived) information on each line. The information added to each line comprises:

The sequence of four characters that are the same for each entry in a particular log:

a. letter "A" or "U" indicating "assisted" or "unassisted"
b. letter "Q", "L", "H" or "U", indicating respectively QRP, low power, high power or unknown power level
c. letter "S", "M", "C" or "U", indicating respectively a single-operator, multi-operator, checklog or unknown operator category [ the contest organisers have stated that checklogs are not made public, but in fact at least some of them from the early years have been, hence the need for the "C" category ]
d. character "1", "2", "+" or "U", indicating respectively that the number of transmitters is one, two, unlimited or unknown

A four-digit number representing the time of the contact in minutes measured from the start of the contest. (I realise that this can be calculated from the other information on the line, but it saves subsequent processors of the file considerable time to have the number readily available in the file without having to calculate it each time.)
Band
A set of fourteen flags, each -- apart from column k and column n -- encoded as T/F:
- a. QSO is confirmed by a log from the second party
- b. QSO is a reverse bust (i.e., the second party appears to have bust the call of the first party)
- c. QSO is an ordinary bust (i.e., the first party appears to have bust the call of the second party)
- d. the call of the second party is unique
- e. QSO appears to be a NIL
- f. QSO is with a station that did not send in a log, but who did make 20 or more QSOs in the contest
- g. QSO appears to be a country mult
- h. QSO appears to be a zone mult
- i. QSO is a zone bust (i.e., the received zone appears to be a bust)
- j. QSO is a reverse zone bust (i.e., the second party appears to have bust the zone of the first party)
- k. This entry has three possible values rather than just T/F:
  - T: QSO appears to be made during a run by the first party
  - F: QSO appears not to be made during a run by the first party
  - U: the run status is unknown because insufficient frequency information is available in the first party's log
- l. QSO is a dupe
- m. QSO is a dupe in the second party's log
- n. RBN information (see below)
If the QSO is a reverse bust, the call logged by the second party; otherwise, the placeholder "-"
If the QSO is an ordinary bust, the correct call that should have been logged by the first party; otherwise, the placeholder "-"
If the QSO is a reverse zone bust, the zone logged by the second party; otherwise, the placeholder "-"
If the QSO is an ordinary zone bust, the correct zone that should have been logged by the first party; otherwise, the placeholder "-"
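The four-digit time field in the list above can be derived from the date and time on a Cabrillo QSO: line. The sketch below is an assumption-laden illustration rather than the code that produced the files: the function name is hypothetical, the start of the contest is passed in explicitly, and the start value used in the example (00:00 UTC on 2018-11-24) is merely an assumed illustrative value.

```python
from datetime import datetime


def minutes_from_start(date_str, time_str, contest_start):
    """Return the minutes elapsed since contest_start as a four-digit string.

    date_str is in the Cabrillo form yyyy-mm-dd and time_str in the form hhmm.
    """
    qso_time = datetime.strptime(f"{date_str} {time_str}", "%Y-%m-%d %H%M")
    minutes = int((qso_time - contest_start).total_seconds() // 60)
    return f"{minutes:04d}"


# Assumed illustrative start time; substitute the real start of the contest of interest.
start = datetime(2018, 11, 24, 0, 0)
print(minutes_from_start("2018-11-25", "1234", start))
```

A QSO at 1234 on the second day lies 1440 + 754 = 2194 minutes after that assumed start.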

RBN Information

In the CW contests from 2009 onwards, the RBN was active, automatically spotting the frequency at which any station calling CQ was transmitting. To reflect possible use of RBN information, the augmented files now include a fourteenth column. For the sake of uniformity, this column is present in all the augmented files, regardless of whether the RBN actually contributed useful information to a particular contest.

Each QSO has one of several characters in the fourteenth column of flags. These characters should be interpreted as follows:

'-'
No useful RBN-derived information is available for this QSO.

'0'
The worked station (i.e., the second call on the log line) appears to have begun to CQ on this frequency within (roughly) 60 seconds prior to the QSO.

'A' to 'Z'
For the nth letter of the alphabet: the worked station appears to have been CQing on this frequency for (roughly) n minutes prior to the QSO.

'+'
The worked station appears to have been CQing for more than 26 minutes on this frequency.

'<'
Because the RBN is distributed, and because each contest entrant station has its own clock, there is generally a skew between the reading of the clock of the station making the QSO and the timestamp from the RBN at which it believes a posting was made (indeed, it's unclear from the RBN's [lack of] documentation exactly how the timestamp on an individual RBN posting is to be interpreted). If the character '<' appears in the RBN column, it indicates that the raw values of the clocks suggest that the QSO took place up to two minutes before the RBN reported the worked station commencing to CQ at this frequency. When this occurs, the most likely interpretation is that there is non-negligible skew between the two clocks, and the station was actually worked almost as soon as a CQ was posted by the RBN. But it might also mean that the entrant was simply lucky and found the CQing station just as it fired up on a new frequency.

Notes:

The encoding of some of the flags requires subjective decisions to be made as to whether the flag should be true or false; consequently, and because CQ has yet to understand the importance of making their scoring code public, the value of a flag for a specific QSO line in some circumstances might not match the value that CQ would assign. (Also, CQ has more data available in the form of check logs, which are generally not made public.)
I made no attempt to deduce or infer the run status of a QSO in the second party's log (if such exists), regardless of the status in the first party's log. This allows one cleanly to perform correct statistical analyses anent the number of QSOs made by running stations merely by excluding QSOs marked with a U in column k.
No attempt is made to detect the case in which both participants of a QSO bust the other station's call. This is a problematic situation because of the relatively high probability of a false positive unless both stations log the frequency as opposed to the band. (Also, on bands on which split-frequency QSOs are common, the absence of both transmit and receive frequency is a problem.) Because of the likelihood of false positives, it seems better, given the presumed rarity of double-bust QSOs, that no attempt be made to mark them.
The entries for the zones in the case of zone or reverse zone busts are normalised to two-digit values.

2019-11-03

drmap

drmap is a program for generating amateur-radio-related maps from USGS National Map data. From the description in the source code, with graphics added inline:

    drmap
      -ant <antenna height>

        The height of the antenna. If -imperial is present, the height is in feet, otherwise it is in metres.

      -call <callsign>

        The callsign associated with the plot. Must be present.

      -cells <number of cells>

        The number of cells from the centre of the plot to the edges. The default is 3/8 of the width of the plot, in pixels. For the default width of 800, the value is therefore 300.

      -datadir <directory>

        The directory that contains USGS GridFloat tiles

      -elev

        Create an elevation plot: the plotted values are the elevation of each cell as seen from the antenna. Most are therefore negative.

      -grad

        Create a gradient plot: the plotted values are the gradient of the terrain in the direction from the QTH.

      -hzn [distance limit]

        Plot the elevation of the horizon around the periphery of the figure. Eye-level is set in the same way as eye-level for the -los option. Only distances out to the distance limit are used in this calculation. If the distance limit value is not present, it is assumed to be the same as the radius of the figure.

      -imperial

        Use imperial units instead of metric. That is, miles instead of kilometres and feet instead of metres. Applies both to values on the command line and to values on the output plot(s).

      -lat <latitude>

        Latitude in degrees north. If present, -long should also be present.

      -long <longitude>

        Longitude in degrees east. If present, -lat should also be present. Note that because the USGS data covers only the US, longitude should be negative; but if it is positive, the program will negate the value before use.

      -los

        Create a line-of-sight plot in addition to the standard height-field plot. Eye-level is assumed to be 1.5m or 5 feet, unless the -ant option is present, in which case eye-level is the same as the height of the antenna.

      -outdir <directory>

        The directory into which the output maps should be written

      -radius <distance1[,distance2[,distance3...]]>

        One or more radii for the plot(s), in units of km unless -imperial is present, in which case the units are miles.

      -qthdb <QTH database filename>

        A file linking QTH information to callsigns. Each line of the file should contain three entries pertaining to a station, separated by white space: the callsign, the latitude and the longitude. This database will be used only if one or both of the -lat and -long parameters is missing from the command line.

      -sm

        USGS tiles are each about 450MB in size. This parameter ("small memory") tells drmap to use the disk files that contain the tiles as-is, rather than moving them into RAM where their contents can be accessed much more quickly. Using this parameter therefore slows access, but means that there is essentially no limit to the number of tiles that may be used to build a plot. drmap automatically stops loading tiles into RAM when there is less than about 500MB of free RAM and switches to using the tiles on disk, so ordinarily there is no need to worry about whether to use the "-sm" parameter. This parameter will be removed in future versions of drmap if it seems to be unneeded in practice.

      -width <pixels>

        Width, in pixels, of the plot(s). The default is 800. The height is automatically set to be three quarters of this value.

    Examples:
      drmap -call n7dr -datadir /zfs1/data/usgs/drmap -outdir /tmp/drmap -qthdb ~/radio/qthdb -imperial -ant 50 -radius 2 -los -hzn 5

        Look up the call "n7dr" (case is ignored) in the file "~/radio/qthdb". Each line in that file is of the form:

        <callsign>     <latitude>     <longitude>

        In particular the following line appears in that file on my system:

        N7DR        40.108016   -105.051700

        The latitude and longitude information for N7DR are extracted from that file.

        The program will look for relevant USGS files in the directory "/zfs1/data/usgs/drmap". If it fails to find any needed files, it will download them from the USGS and place them in that directory prior to using them.

        The program will write output plots in the directory /tmp/drmap.

        The program will use imperial units (miles and feet), and assume an antenna 50 feet above ground.

        It will generate a height plot displaying a radius of 2 miles around the N7DR QTH:

        It will also create a line-of-sight plot (showing the terrain visible from the putative antenna, 50 feet above ground level):

        In both plots, the elevation of the horizon as seen from 50 feet above ground level is drawn around the periphery. The program assumes that no contribution to the horizon is more than five miles from the QTH. It also (always) calculates and displays the M[ean] H[eight] A[bove] T[errain] of the antenna for the area inside the provided radius.

      drmap -call RMNP -datadir /zfs1/data/usgs/drmap -outdir /tmp/drmap -lat 40.441358 -long -105.753685 -radius 1

        Create a plot for the point 40°.441358N, 105°.753685W (which is in Rocky Mountain National Park).

        The program will look for relevant USGS files in the directory "/zfs1/data/usgs/drmap". If it fails to find any needed files, it will download them from the USGS and place them in that directory prior to using them.

        The program will write output plots in the directory /tmp/drmap.

        The program will use metric units (kilometres and metres).

        It will generate a height plot displaying a radius of 1 kilometre around the designated location:

      drmap -call RMNP -datadir /zfs1/data/usgs/drmap -outdir /tmp/drmap -lat 40.441358 -long -105.753685 -radius 10 -los -hzn 5 -grad -elev

        Create a plot for the point 40°.441358N, 105°.753685W (which is in Rocky Mountain National Park).

        The program will look for relevant USGS files in the directory "/zfs1/data/usgs/drmap". If it fails to find any needed files, it will download them from the USGS and place them in that directory prior to using them.

        The program will write output plots in the directory /tmp/drmap.

        The program will use metric units (kilometres and metres).

        It will generate a height plot displaying a radius of 10 kilometres around the designated location:

        It will also generate a line-of-sight plot, assuming the default eye level of 1.5m:

        It will also generate a gradient plot, showing at each point the gradient measured in a direction from the centre of the plot to that point:

        It will also generate an elevation plot (reminiscent of 1960s tie-dye), showing at each point the elevation angle measured in a direction from the centre of the plot to that point:

        In all these plots, the elevation of the horizon as seen from eye level is drawn around the periphery. The program assumes that no contribution to the horizon is more than five kilometres from the QTH.
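The -qthdb lookup and the treatment of positive longitudes described above can be sketched as follows. This is not drmap's actual code; it is a minimal Python illustration, with hypothetical function names, of the three-column file format and the case-insensitive match on the callsign.

```python
def lookup_qth(lines, callsign):
    """Return (latitude, longitude) for callsign, or None if it is not found.

    Each line is expected to hold a callsign, a latitude and a longitude,
    separated by white space; the case of the callsign is ignored.
    """
    wanted = callsign.upper()
    for line in lines:
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        call, lat, lon = fields
        if call.upper() == wanted:
            return float(lat), float(lon)
    return None


def normalise_longitude(longitude):
    """Negate a positive longitude, since the USGS data covers only the US."""
    return -longitude if longitude > 0 else longitude


db = ["N7DR        40.108016   -105.051700"]
print(lookup_qth(db, "n7dr"))
```

In practice the lines would come from reading the file named by -qthdb; a list is used here only to keep the example self-contained.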

2019-09-03

drlog moving from Eclipse to Codelite

The Eclipse IDE has been removed from the newest version of Debian stable. This is somewhat of a case of déjà vu, as a few years ago KDevelop3 was similarly removed from Kubuntu, at a time when that was the development environment for drlog, even though, at the time, KDevelop3 was stable and functional, whereas its replacement, KDevelop4, was... well, to be polite, completely broken. That was when I switched to Eclipse.

One can, in theory, still download a functional Eclipse from eclipse.org to use on Debian, but when I did that the result was a completely useless program that spewed Java errors and would not display dialog boxes.

So it is, again, time to go looking for an IDE for developing drlog.

I expect to be moving to Codelite, although it does seem to have some issues for serious development work, in particular a rather poor editor that doesn't, as far as I can tell, allow multiple views on different parts of a file. On the other hand, its use of virtual directories makes it easy to move projects from Eclipse to Codelite. Anyway, I shall try it for a while and see what happens.

In any case, rather than trying to inflict my choice on others, I shall be keeping the raw drlog makefile current, so that it can be used directly to build the program regardless of a user's choice of development environment. Sometime shortly I shall remove the Eclipse-related files from the repository.

2019-08-27

Stations With Lowest Probability of Busting a Call: CQ WW 2017 and 2018

Prior posts in this series:

Throughout this post, I apply the procedures developed in the second post above.

I begin with an ordered list of the stations with the lowest probabilities of busting a call in 2017 CQ WW SSB.

2017 CQ WW SSB -- weighted mean values of $p_{bust}$
Position	Call	weighted mean	$Q_v$	$B$
1	OH0X	0.0010	978	0
2	C6ARW	0.0010	2,058	1
3	K1ZZ	0.0012	1,715	1
4	F4FTA	0.0012	800	0
5	DF2RG	0.0014	774	0
6	DJ8OG	0.0015	2,169	2
7	EW1P	0.0015	1,355	1
8	NH6V	0.0015	1,299	1
9	K2EP	0.0016	627	0
10	UW7W	0.0016	1,266	1

It is also interesting to plot the aggregated probability function for $p_{bust}$, weighted by the number of verified QSOs, $Q_v$, for all stations:

The location of the vertical line represents the weighted mean of the probability curve.

For 2018 CQ WW SSB:

2018 CQ WW SSB -- weighted mean values of $p_{bust}$
Position	Call	weighted mean	$Q_v$	$B$
1	DF2RG	0.0011	887	0
2	N4PQX	0.0012	768	0
3	N9NB	0.0013	758	0
4	OM0WR	0.0013	732	0
5	KA1ZD	0.0013	1,500	1
6	YT5IVN	0.0014	1,402	1
7	WA2FZB	0.0015	628	0
8	JM1NKT	0.0016	602	0
9	KV1J	0.0017	582	0
10	W1CU	0.0017	579	0

For 2017 CQ WW CW:

2017 CQ WW CW -- weighted mean values of $p_{bust}$
Position	Call	weighted mean	$Q_v$	$B$
1	K0KX	0.0005	1,682	0
2	DM5EE	0.0006	1,430	0
3	K1ZZ	0.0007	3,028	1
4	YO5OHO	0.0007	1,376	0
5	DP4X	0.0007	1,315	0
6	WE9V	0.0007	1,256	0
7	UT4U	0.0008	1,244	0
8	JH8SLS	0.0008	1,206	0
9	SP6MLX	0.0009	1,077	0
10	K2XR	0.0009	1,069	0

For 2018 CQ WW CW:

2018 CQ WW CW -- weighted mean values of $p_{bust}$
Position	Call	weighted mean	$Q_v$	$B$
1	HB9ARF	0.0006	1,432	0
2	K6LL	0.0007	1,371	0
3	OM6RM	0.0007	1,286	0
4	UW1WU	0.0008	1,218	0
5	KG4V	0.0008	1,206	0
6	W3KB	0.0008	1,153	0
7	EW1P	0.0008	1,116	0
8	EU4E	0.0009	1,067	0
9	W2CDO	0.0009	1,011	0
10	DF2RG	0.0010	997	0

We can limit the analysis to calling stations (i.e., not the running station).

2017 CQ WW SSB and CW:

2017 CQ WW SSB -- weighted mean values of $p_{bust}$ (non-run)
Position	Call	weighted mean	$Q_v$	$B$
1	EW1P	0.0008	1,132	0
2	DL6NDW	0.0012	829	0
3	DJ8OG	0.0013	762	0
4	DF2RG	0.0013	761	0
5	F4FTA	0.0015	633	0
6	K1ZZ	0.0015	1,304	1
7	SK6AW	0.0015	628	0
8	K2EP	0.0016	614	0
9	UW7W	0.0016	1,234	1
10	IZ4JMA	0.0016	592	0

2017 CQ WW CW -- weighted mean values of $p_{bust}$ (non-run)
Position	Call	weighted mean	$Q_v$	$B$
1	RM9A	0.0005	1,834	0
2	ES9C	0.0006	1,662	0
3	K0KX	0.0007	1,325	0
4	SP2LNW	0.0008	1,241	0
5	R7MM	0.0008	1,239	0
6	DM5EE	0.0008	1,160	0
7	ED2C	0.0008	1,122	0
8	RF9C	0.0009	1,096	0
9	LY3B	0.0009	1,068	0
10	UZ3A	0.0010	991	0

2018 CQ WW SSB and CW:

2018 CQ WW SSB -- weighted mean values of $p_{bust}$ (non-run)
Position	Call	weighted mean	$Q_v$	$B$
1	9A2EU	0.0010	976	0
2	YT5IVN	0.0011	880	0
3	DF2RG	0.0011	868	0
4	K5ZD	0.0013	749	0
5	K3PP	0.0013	744	0
6	N4PQX	0.0013	736	3
7	LX7I	0.0015	632	0
8	N9NB	0.0015	628	0
9	WA2FZB	0.0016	589	0
10	K3OO	0.0017	575	0

2018 CQ WW CW -- weighted mean values of $p_{bust}$ (non-run)
Position	Call	weighted mean	$Q_v$	$B$
1	HG6N	0.0006	1,619	0
2	UX1VT	0.0007	1,391	0
3	S54X	0.0007	1,374	0
4	DL1WA	0.0007	1,307	0
5	K1AR	0.0008	1,209	0
6	NR4M	0.0008	1,148	0
7	RT4M	0.0008	1,138	0
8	LY5W	0.0008	1,111	0
9	K3PH	0.0009	1,100	0
10	W8FJ	0.0009	1,087	0

And similarly for running stations.

2017 CQ WW SSB and CW:

2017 CQ WW SSB -- weighted mean values of $p_{bust}$ (run)
Position	Call	weighted mean	$Q_v$	$B$
1	C6ARW	0.0010	1,925	1
2	CF7RR	0.0013	1,557	1
3	OH0X	0.0014	694	0
4	HP1XT	0.0016	621	0
5	OH0V	0.0017	2,337	3
6	TK9R	0.0018	3,982	6
7	NH6V	0.0019	1,035	1
8	PY2KJ	0.0021	1,456	2
9	DJ8OG	0.0021	1,407	2
10	VA2WA	0.0022	1,853	3

2017 CQ WW CW -- weighted mean values of $p_{bust}$ (run)
Position	Call	weighted mean	$Q_v$	$B$
1	K1ZZ	0.0004	2,294	0
2	W1GD	0.0011	888	0
3	8S0DX	0.0011	873	0
4	SN1Y	0.0012	1,714	1
5	N3ER	0.0013	749	0
6	EU4E	0.0013	747	0
7	UT4U	0.0013	731	0
8	OM0WR	0.0013	1,517	1
9	JH8SLS	0.0014	710	0
10	S59ABC	0.0014	2,898	3

2018 CQ WW SSB and CW:

2018 CQ WW SSB -- weighted mean values of $p_{bust}$ (run)
Position	Call	weighted mean	$Q_v$	$B$
1	OM0WR	0.0014	697	0
2	MI0I	0.0023	423	0
3	PY2KJ	0.0025	1,207	2
4	ZF9CW	0.0025	3,973	9
5	OQ5M	0.0025	1,190	2
6	KA1ZD	0.0026	381	0
7	PY2UD	0.0028	715	1
8	LY9A	0.0028	711	1
9	UW2M	0.0029	1,050	2
10	M0MCV	0.0029	1,049	2

2018 CQ WW CW -- weighted mean values of $p_{bust}$ (run)
Position	Call	weighted mean	$Q_v$	$B$
1	HB9ARF	0.0008	1,139	0
2	K8CX	0.0012	800	0
3	EU4E	0.0012	772	0
4	YU0A	0.0012	770	0
5	OH8X	0.0013	2,322	2
6	ZR2A	0.0013	1,485	1
7	R7MM	0.0014	691	0
8	UA9LAO	0.0014	682	0
9	DF3AX	0.0015	629	0
10	DK2GZ	0.0017	573	0

We can also look at the changes over the period from 2005 to 2018.

First for all QSOs:

For non-run QSOs:

And for run QSOs:

Most-Logged Stations in CQ WW CW and SSB Contests, 2018

The public CQ WW CW and SSB logs allow us easily to tabulate the stations that appear in the largest number of entrants' logs. For 2018, the ten stations with the largest number of appearances in (augmented) CQ WW SSB logs were:

Callsign	Appearances	% logs
CN3A	9,585	61
D4C	8,440	55
EF8R	8,369	56
ES9C	7,649	50
LZ9W	7,643	48
M6T	7,249	49
FY5KE	7,158	46
PZ5K	7,089	46
PJ4G	6,898	43
DF0HQ	6,746	46

The first column in the table is the callsign. The second column is the total number of times that the call appears in logs. That is, if a station worked CN3A on six bands, that will increment the value in the second column of the CN3A row by six. The third column is the percentage of logs that contain the callsign at least once.
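The two numeric columns described above can be computed in a single pass over the logs. The following sketch assumes that each log has already been reduced to a list of the callsigns worked, one entry per QSO line; that input structure and the function name are assumptions made for the example, and the sketch takes no position on how dupes or busts should be treated.

```python
from collections import Counter


def most_logged(logs):
    """Map each worked call to (appearances, percentage of logs containing it).

    logs maps the callsign of each entrant to the list of calls in that log.
    """
    appearances = Counter()      # every QSO line counts, so six bands count six times
    logs_containing = Counter()  # each log counts at most once per worked call
    for worked_calls in logs.values():
        appearances.update(worked_calls)
        logs_containing.update(set(worked_calls))
    n_logs = len(logs)
    return {
        call: (appearances[call], 100 * logs_containing[call] / n_logs)
        for call in appearances
    }
```

Sorting the result by the first element of each value, in descending order, gives the ordering used in the tables.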

For comparison, here is the equivalent table for 2017:

Callsign	Appearances	% logs
CN3A	10,347	62
EF8R	9,099	56
CN2R	8,649	55
LZ9W	8,251	53
CN2AA	8,090	54
ES9C	7,941	53
M6T	7,856	52
A73A	7,501	47
PZ5K	7,481	45
DF0HQ	7,372	50

Similarly, the ten stations with the largest number of appearances in CQ WW CW 2018 were:

Callsign	Appearances	% logs
EF8R	12,063	72
CN3A	11,603	70
P33W	10,804	67
CR3DX	10,546	67
9A1A	10,518	69
TK0C	10,318	65
LZ9W	10,026	67
M6T	9,785	65
ES9C	9,521	62
PJ2T	9,455	57

And the equivalent table for 2017:

Callsign	Appearances	% logs
TK0C	10,719	63
9A1A	10,594	65
M6T	9,884	62
CR3W	9,783	61
YT5A	9,692	63
PJ2T	9,661	53
EF8R	9,538	60
LZ9W	9,257	60
V47T	9,128	55
CN2AA	9,092	58

We can also perform the same analysis for, say, a ten-year span, to show which stations have most consistently appeared in other stations' logs. So, for CQ WW SSB for the period 2009 to 2018, we find:

Callsign	Appearances	% logs
LZ9W	84,898	56
CN3A	84,806	59
DF0HQ	78,670	54
PJ2T	71,826	47
OT5A	71,036	51
K3LR	69,954	49
P33W	66,944	49
A73A	63,280	46
CN2R	59,700	45
V26B	57,338	41

And for CW over the same span:

Callsign	Appearances	% logs
9A1A	99,223	67
LZ9W	96,166	66
PJ2T	91,295	57
DF0HQ	84,414	61
P33W	83,190	59
W3LPL	74,354	52
K3LR	73,820	52
PJ4A	69,920	51
LX7I	66,448	50
CR3L	64,432	44

2019-08-25

Call Busts and Reverse Busts: CQ WW 2018

This is the latest in a series of posts on (callsign) busts and reverse busts in the CQ WW contests. These posts are based on the various public CQ WW logs (cq-ww-2005--2018-augmented.xz; see here for details of the augmented format).

Prior posts in the series on busts and reverse busts in CQ WW:

In this post, we include only verified QSOs; that is, QSOs for which both parties submitted a log.

First, the tables for the CQ WW SSB 2018:

2018 SSB -- Most Busts
Position	Call	QSOs	Busts	% Busts
1	V26B	6,160	166	2.7
2	CN3A	9,457	159	1.7
3	D4C	8,295	155	1.9
4	LZ9W	7,558	142	1.9
5	TM0T	4,189	136	3.2
6	A44A	3,944	135	3.4
7	ZF1A	6,423	115	1.8
8	OK5Z	3,634	112	3.1
9	EF8R	8,357	107	1.3
10	OT5A	5,021	106	2.1
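The percentage column in the table above appears to be simply the number of busts divided by the number of QSOs, expressed as a percentage and rounded to one decimal place. A minimal sketch under that assumption (the function name is hypothetical):

```python
def bust_percentage(qsos, busts):
    """Busts as a percentage of QSOs, rounded to one decimal place."""
    return round(100 * busts / qsos, 1)


# Values taken from the first two rows of the table above.
print(bust_percentage(6160, 166))
print(bust_percentage(9457, 159))
```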

2018 SSB -- Most Reverse Busts
Position	Call	QSOs	Reverse Busts	% Reverse Busts
1	OG60F	4,258	486	11.4
2	HC0E	1,873	274	14.6
3	TM3R	3,605	233	6.5
4	DF0HQ	6,857	195	2.8
5	KL7RA	4,622	157	3.4
6	IK2YCW	4,021	150	3.7
7	EF8R	8,357	148	1.8
8	JA3YBK	3,509	140	4.0
9	JE2YRB	1,362	140	10.3
10	ED3M	3,152	131	4.2

2018 SSB -- Highest Percentage of Busts (≥100 QSOs)
Position	Call	QSOs	% Busts
1	YB2BNN	136	16.9
2	SP9KB	189	14.8
3	UB3AQA	129	13.2
4	IU8GYT	210	12.4
5	YC1HLT	108	12.0
6	KE0KOT	101	11.9
7	R7AC	112	11.6
8	W7UPF	137	10.9
9	TA3PZ	152	10.5
10	9W6VAT	143	10.5

2018 SSB -- Highest Percentage of Reverse Busts (≥100 QSOs)
Position	Call	QSOs	% Reverse Busts
1	HC0E	1,873	14.6
2	OG60F	4,258	11.4
3	BY4QA	437	11.2
4	9M2SDX	135	11.1
5	BV2A/3	443	11.1
6	W6MOB	101	10.9
7	JE2YRB	1,362	10.3
8	IK3SSJ	207	9.7
9	9M6GOH	288	9.0
10	IK3XTY	123	8.9

Now the tables for the 2018 CW data:

2018 CW -- Most Busts
Position	Call	QSOs	Busts	% Busts
1	TK0C	10,328	175	1.7
2	OZ5E	6,144	149	2.4
3	D41CV	7,456	144	1.9
4	CN3A	11,670	141	1.2
5	LZ9W	9,958	132	1.3
6	F6KOP	3,738	129	3.5
7	NP2P	3,369	120	3.6
8	RW0A	6,524	116	1.8
9	G3V	6,593	113	1.7
10	RM9A	8,834	112	1.3

2018 CW -- Most Reverse Busts
Position	Call	QSOs	Reverse Busts	% Reverse Busts
1	PE75W	1,408	412	29.3
2	ES9C	9,696	322	3.3
3	JS3CTQ	2,455	320	13.0
4	CN3A	11,670	278	2.4
5	DF0HQ	8,658	274	3.2
6	EF8R	12,115	271	2.2
7	OG60F	5,561	232	4.2
8	RM9A	8,834	213	2.4
9	UA4S	5,283	208	3.9
10	CR3W	8,586	173	2.0

2018 CW -- Highest Percentage of Busts (≥100 QSOs)
Position	Call	QSOs	% Busts
1	KD5QHV	116	23.3
2	YO7LYM	250	18.8
3	W3ICM	107	18.7
4	OH1LAR	156	18.6
5	PA0SKP	204	18.1
6	K4KAY	235	17.9
7	W4PH	198	17.7
8	LA6M	225	17.0
9	GD4EIP	453	16.8
10	VE5WI	180	16.7

2018 CW -- Highest Percentage of Reverse Busts (≥100 QSOs)
Position	Call	QSOs	% Reverse Busts
1	PE75W	1,408	29.3
2	YU1LG	167	19.2
3	DK3HM	294	15.6
4	K3HW	414	14.7
5	PY2LPM	103	14.6
6	OG55W	463	14.0
7	JS3CTQ	2,455	13.0
8	SA0BXV	157	12.7
9	JA8HBO	103	12.6
10	G0AZS	188	12.2

Now we look at the tables that integrate ten years' data.

For SSB:

2009 to 2018 SSB -- Most Busts
Position	Call	QSOs	Busts	% Busts
1	LZ9W	83,740	1,456	1.7
2	OT5A	69,414	1,363	2.0
3	PJ2T	70,851	1,324	1.9
4	CN3A	83,700	1,316	1.6
5	A73A	62,374	1,260	2.0
6	HG1S	42,980	868	2.0
7	EF8R	50,246	850	1.7
8	V26B	56,636	843	1.5
9	JA7YRR	28,832	821	2.8
10	HG7T	54,081	785	1.5

2009 to 2018 SSB -- Most Reverse Busts
Position	Call	QSOs	Reverse Busts	% Reverse Busts
1	DF0HQ	79,704	2,301	2.9
2	JA3YBK	41,284	1,311	3.2
3	K3LR	69,728	1,078	1.5
4	CN2R	59,585	1,027	1.7
5	WE3C	36,245	976	2.7
6	CN3A	83,700	891	1.1
7	HG1S	42,980	884	2.1
8	S52ZW	32,848	847	2.6
9	W3LPL	53,037	800	1.5
10	HK1NA	40,363	788	2.0

2009 to 2018 SSB -- Highest Percentage of Busts (≥500 QSOs)
Position	Call	QSOs	% Busts
1	PV8ADI	858	16.2
2	K2JMY	2,062	14.8
3	EA7JQT	572	12.4
4	EA1HTF	1,244	11.9
5	PU1MMZ	544	11.4
6	K8TS	679	10.9
7	PU2TRX	868	10.8
8	YB9KA	557	10.1
9	HC2GF	936	10.0
10	E20WXA	594	9.9

2009 to 2018 SSB -- Highest Percentage of Reverse Busts (≥500 QSOs)
Position	Call	QSOs	% Reverse Busts
1	CW90A	1,370	30.9
2	BA8AG	752	16.4
3	BW2/KU1CW	940	12.3
4	OG60F	4,258	11.4
5	ZP6DYA	1,226	11.3
6	V84SCQ	806	10.5
7	LU9DDJ	701	10.4
8	JE2YRB	1,362	10.3
9	BV55D	919	10.2
10	JG3SVP	1,270	10.1

And for CW:

2009 to 2018 CW -- Most Busts
Position	Call	QSOs	Busts	% Busts
1	PJ2T	90,264	1,260	1.4
2	LZ9W	95,211	1,193	1.3
3	PV8ADI	8,127	1,177	14.5
4	PI4CC	53,794	1,056	2.0
5	OZ5E	30,550	824	2.7
6	9A1A	98,218	809	0.8
7	PJ4A	69,325	806	1.2
8	HG1S	35,367	801	2.3
9	D4C	56,710	788	1.4
10	RW0A	47,270	768	1.6

2009 to 2018 CW -- Most Reverse Busts
Position	Call	QSOs	Reverse Busts	% Reverse Busts
1	JS3CTQ	24,173	3,139	13.0
2	DF0HQ	86,422	3,073	3.6
3	ES9C	59,162	1,834	3.1
4	W2FU	56,827	1,456	2.6
5	K3LR	74,177	1,408	1.9
6	DR1A	49,284	1,290	2.6
7	IR4X	44,952	1,274	2.8
8	NR4M	54,508	1,184	2.2
9	HG7T	58,224	1,178	2.0
10	W0AIH	29,076	1,151	4.0

2009 to 2018 CW -- Highest Percentage of Busts (≥500 QSOs)
Position	Call	QSOs	% Busts
1	W2UDT	911	19.8
2	BD3MV	1,005	19.2
3	YO7LYM	1,711	17.0
4	JA3AHY	554	16.6
5	DJ5UZ	686	16.5
6	WP3Y	570	16.3
7	AE3D	1,016	15.8
8	YU1NIM	619	15.8
9	AD7XG	1,178	15.4
10	SM5BJT	615	14.6

2009 to 2018 CW -- Highest Percentage of Reverse Busts (≥500 QSOs)
Position	Call	QSOs	% Reverse Busts
1	G3RWF	943	99.9
2	RZ3VO	1,792	49.9
3	YT65A	1,149	37.2
4	5K0A	1,853	32.1
5	OG55W	3,032	29.4
6	PE75W	1,408	29.3
7	DP65HSC	516	16.9
8	5J1E	1,523	15.6
9	SB0A	1,102	15.5
10	YP0HQ	1,787	14.0

NB: In tables of reverse busts, one sometimes finds what seems like an unreasonable number of reverse busts (as, in this table, for G3RWF). This is generally caused by a discrepancy between the call actually sent by the listed station and the one recorded as being sent in at least some QSOs in the log, although it can, of course, be due to an unusual callsign structure or poor signal quality.
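
The percentage columns in the tables above appear to be simply the number of busts (or reverse busts) divided by the number of QSOs, expressed to one decimal place. A minimal sketch of that arithmetic follows; the function name `percent` is purely illustrative and is not taken from the code used to generate the tables.

```python
def percent(count: int, qsos: int) -> float:
    """Return count as a percentage of qsos, rounded to one decimal place."""
    if qsos <= 0:
        raise ValueError("qsos must be positive")
    return round(100.0 * count / qsos, 1)


# Two rows copied from the 2018 tables above: (call, QSOs, busts or reverse busts)
rows = [("TK0C", 10328, 175), ("OG60F", 4258, 486)]
for call, qsos, count in rows:
    print(call, percent(count, qsos))
```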

2019-08-24

Augmented Logs for ARRL DX CW and SSB Contests, 2018

The sequence of four characters that are the same for each entry in a particular log:

a. letter "A" or "U" indicating "assisted" or "unassisted"
b. letter "Q", "L", "H" or "U", indicating respectively QRP, low power, high power or unknown power level
c. letter "S", "M", "C" or "U", indicating respectively a single-operator, multi-operator, checklog or unknown operator category
d. character "1", "2", "+" or "U", indicating respectively that the number of transmitters is one, two, unlimited or unknown

A four-digit number representing the time of the contact in minutes measured from the start of the contest. (I realise that this can be calculated from the other information on the line, but it saves subsequent processors of the file considerable time to have the number readily available in the file without having to calculate it each time.)
Band
A set of fourteen flags, each -- apart from column k and column n -- encoded as T/F:
- a. QSO is confirmed by a log from the second party
- b. QSO is a reverse bust (i.e., the second party appears to have bust the call of the first party)
- c. QSO is an ordinary bust (i.e., the first party appears to have bust the call of the second party)
- d. the call of the second party is unique
- e. QSO appears to be a NIL
- f. QSO is with a station that did not send in a log, but who did make 20 or more QSOs in the contest
- g. QSO appears to be a country mult (may be T for W/VE stations only)
- h. QSO appears to be a state/province mult (may be T for DX stations only)
- i. QSO is a zone bust (i.e., the received zone appears to be a bust)
- j. QSO is a reverse zone bust (i.e., the second party appears to have bust the zone of the first party)
- k. This entry has three possible values rather than just T/F:
  - T: QSO appears to be made during a run by the first party
  - F: QSO appears not to be made during a run by the first party
  - U: the run status is unknown because insufficient frequency information is available in the first party's log
- l. QSO is a dupe
- m. QSO is a dupe in the second party's log
- n. RBN information (see below)
If the QSO is a reverse bust, the call logged by the second party; otherwise, the placeholder "-"
If the QSO is an ordinary bust, the correct call that should have been logged by the first party; otherwise, the placeholder "-"
If the QSO is a reverse exchange bust, the exchange logged by the second party; otherwise, the placeholder "-"
If the QSO is an ordinary exchange bust, the correct exchange that should have been logged by the first party; otherwise, the placeholder "-"
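
As an illustration of how the set of flags might be consumed by a script, here is a minimal sketch that decodes the fourteen flags into named values. It assumes that the flags have already been extracted from the line as a single string of fourteen characters in the order a to n; the names in `FLAG_NAMES` and the function `decode_flags` are hypothetical and are not part of the file format.

```python
# Illustrative names for flags a..n, in order; these names are assumptions.
FLAG_NAMES = [
    "confirmed",            # a
    "reverse_bust",         # b
    "bust",                 # c
    "unique",               # d
    "nil",                  # e
    "no_log_20_plus_qsos",  # f
    "country_mult",         # g
    "state_province_mult",  # h
    "zone_bust",            # i
    "reverse_zone_bust",    # j
    "run",                  # k: T, F or U
    "dupe",                 # l
    "reverse_dupe",         # m
    "rbn",                  # n: see the RBN section
]


def decode_flags(flags: str) -> dict:
    """Decode a 14-character flag string into a dict keyed by flag name.

    T/F flags become True/False; the run flag becomes True, False or
    None (unknown); the RBN flag is returned as the raw character.
    """
    if len(flags) != len(FLAG_NAMES):
        raise ValueError(f"expected {len(FLAG_NAMES)} flags, got {len(flags)}")
    decoded = {}
    for name, ch in zip(FLAG_NAMES, flags):
        if name == "rbn":
            decoded[name] = ch
        elif name == "run":
            if ch not in "TFU":
                raise ValueError(f"bad run flag: {ch!r}")
            decoded[name] = {"T": True, "F": False, "U": None}[ch]
        else:
            if ch not in "TF":
                raise ValueError(f"bad value for {name}: {ch!r}")
            decoded[name] = (ch == "T")
    return decoded


example = decode_flags("TFFFFFFTFFUFF-")
print(example["confirmed"], example["run"], example["rbn"])
```

Mapping the U in column k to `None` rather than to `False` keeps "unknown" distinct from "not running", which matters when computing run statistics.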

RBN Information

In the CW contests from 2009 onwards, the RBN has been active, automatically spotting the frequency at which any station calling CQ was transmitting. To reflect possible use of RBN information, the augmented files include a fourteenth column. For the sake of uniformity, this column is present in all the augmented files, regardless of whether the RBN actually contributed useful information to a particular contest.

Each QSO has one of several characters in the fourteenth column of flags. These characters should be interpreted as follows:

'-'
No useful RBN-derived information is available for this QSO.

'0'
The worked station (i.e., the second call on the log line) appears to have begun to CQ on this frequency within (roughly) 60 seconds prior to the QSO.

'A' to 'Z'
For the nth letter of the alphabet: the worked station appears to have been CQing on this frequency for (roughly) n minutes prior to the QSO.

'+'
The worked station appears to have been CQing for more than 26 minutes on this frequency.

'<'
Because the RBN is distributed, and because each contest entrant station has its own clock, there is generally a skew between the reading of the clock of the station making the QSO and the timestamp from the RBN at which it believes a posting was made (indeed, it's unclear from the RBN's [lack of] documentation exactly how the timestamp on an individual RBN posting is to be interpreted). If the character '<' appears in the RBN column, it indicates that the raw values of the clocks suggest that the QSO took place up to two minutes before the RBN reported the worked station commencing to CQ at this frequency. When this occurs, the most likely interpretation is that there is non-negligible skew between the two clocks, and the station was actually worked almost as soon as a CQ was posted by the RBN. But it might also mean that the entrant was simply lucky and found the CQing station just as it fired up on a new frequency.
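
The interpretation of the RBN character described above can be sketched as a small function. This is only an illustration: the function name `rbn_status` and the labels it returns ("none", "cq", "long", "skew") are my own hypothetical choices, not part of the augmented format.

```python
def rbn_status(ch: str):
    """Interpret the character in the RBN column.

    Returns a (label, minutes) pair, where minutes is the approximate
    number of minutes the worked station had been CQing on the
    frequency, or None when no number applies.
    """
    if len(ch) != 1:
        raise ValueError("expected exactly one character")
    if ch == "-":
        return ("none", None)   # no useful RBN-derived information
    if ch == "0":
        return ("cq", 0)        # began to CQ within roughly 60 seconds
    if "A" <= ch <= "Z":
        return ("cq", ord(ch) - ord("A") + 1)  # nth letter: roughly n minutes
    if ch == "+":
        return ("long", None)   # CQing for more than 26 minutes
    if ch == "<":
        return ("skew", None)   # QSO logged up to two minutes before the posting
    raise ValueError(f"unrecognised RBN character: {ch!r}")


for ch in "-0CZ+<":
    print(ch, rbn_status(ch))
```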

Notes:

The encoding of some of the flags requires subjective decisions to be made as to whether the flag should be true or false; consequently, and because the ARRL has yet to understand the importance of making the scoring code public, the value of a flag for a specific QSO line in some circumstances might not match the value that the ARRL has assigned. (Also, the ARRL has more, non-public, data available.)
I made no attempt to deduce or infer the run status of a QSO in the second party's log (if such exists), regardless of the status in the first party's log. This allows one cleanly to perform correct statistical analyses anent the number of QSOs made by running stations merely by excluding QSOs marked with a U in column k.
No attempt is made to detect the case in which both participants of a QSO bust the other station's call. This is a problematic situation because of the relatively high probability of a false positive unless both stations log the frequency as opposed to the band. (Also, on bands on which split-frequency QSOs are common, the absence of both transmit and receive frequency is a problem.) Because of the likelihood of false positives, it seems better, given the presumed rarity of double-bust QSOs, that no attempt be made to mark them.
The entries for the exchanges in the case of exchange or reverse exchange busts are normalised to two-digit values in the same manner as the exchanges in the cleaned logs.
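
As an example of the kind of analysis mentioned in the note about run status, the following sketch computes the fraction of QSOs made while running, after excluding those marked U in column k. It assumes that the fourteen flags for each QSO have already been extracted as a single string in the order a to n, so that column k is the eleventh character; the function name `run_fraction` is hypothetical.

```python
RUN_INDEX = 10  # column k is the eleventh flag (zero-based index 10); an assumption


def run_fraction(flag_strings):
    """Return the fraction of QSOs made while running.

    QSOs whose run status is unknown (U) are excluded from both the
    numerator and the denominator. Returns None if no QSO has a known
    run status.
    """
    known = [f[RUN_INDEX] for f in flag_strings if f[RUN_INDEX] != "U"]
    if not known:
        return None
    return sum(1 for k in known if k == "T") / len(known)


# Three made-up flag strings: one run QSO, one non-run QSO, one unknown
sample = [
    "FFFFFFFFFFTFF-",
    "FFFFFFFFFFFFF-",
    "FFFFFFFFFFUFF-",
]
print(run_fraction(sample))
```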