2020-05-28

Creating a Local Database From FCC Public Files

The FCC don't make it particularly easy to query their database(s) to find useful information related to a licensee. Yes, there is the web-based "License Search" page, but that's not useful for broader questions (to create a random example: how many Advanced class licensees are there in Texas?).

Fortunately, it's fairly easy to generate a local version of the FCC database, thereby allowing one to use classical tools (such as awk) to obtain rapid answers to queries that one deems interesting.

The FCC generates a series of data files weekly (on Sunday), and makes those available over the Internet. (They also generate daily files of changes, as described here.) This allows interested parties to download the data, merge and simplify them, and generate a local file that contains the interesting/useful information.

In total there are eight data files, typically identified by a two-character tag: AM, CO, EN, HD, HS, LA, SC or SF. Each file contains a number of records, and the contents of each record are documented (lightly) in this document. Unfortunately, although that document gives the name of each field, it's not always obvious from the name what the field actually contains. Many (but far from all) fields are documented here; other fields are a matter for guesswork, or even a shrug of the shoulders.

Anyway, for the most part it's easy to combine all the useful fields from the eight data files into a single file in which each record is identified by a (unique) callsign. In practice, it turns out that, at least to my eyes, only the first four of the named data files contain information of general interest.

Consequently, I have created a file, updated weekly (on a Monday), that combines the more interesting data from the eight published data files. The records in this file are delimited by linefeeds, and fields within each record, as in the original data files, are separated by the standard UNIX pipe character, "|". Each output record contains 48 fields.

The contents of the eight original files are described in this document as follows:

Amateur
Position Data Element Definition
[AM]
1   Record Type [AM]            char(2)
2   Unique System Identifier    numeric(9,0)
3   ULS File Number             char(14)
4   EBF Number                  varchar(30)
5   Call Sign                   char(10)
6   Operator Class              char(1)
7   Group Code                  char(1)
8   Region Code                 tinyint
9   Trustee Call Sign           char(10)
10  Trustee Indicator           char(1)
11  Physician Certification     char(1)
12  VE Signature                char(1)
13  Systematic Call Sign Change char(1)
14  Vanity Call Sign Change     char(1)
15  Vanity Relationship         char(12)
16  Previous Call Sign          char(10)
17  Previous Operator Class     char(1)
18  Trustee Name                varchar(50)

Comments
Position Data Element Definition
[CO]
1   Record Type [CO]            char(2)
2   Unique System Identifier    numeric(9,0)
3   ULS File Number             char(14)
4   Call Sign                   char(10)
5   Comment Date                mm/dd/yyyy
6   Description                 varchar(255)
7   Status Code                 char(1)
8   Status Date                 mm/dd/yyyy

Entity
Position Data Element Definition
[EN]
1   Record Type [EN]                char(2)
2   Unique System Identifier        numeric(9,0)
3   ULS File Number                 char(14)
4   EBF Number                      varchar(30)
5   Call Sign                       char(10)
6   Entity Type                     char(2)
7   Licensee ID                     char(9)
8   Entity Name                     varchar(200)
9   First Name                      varchar(20)
10  MI                              char(1)
11  Last Name                       varchar(20)
12  Suffix                          char(3)
13  Phone                           char(10)
14  Fax                             char(10)
15  Email                           varchar(50)
16  Street Address                  varchar(60)
17  City                            varchar(20)
18  State                           char(2)
19  Zip Code                        char(9)
20  PO Box                          varchar(20)
21  Attention Line                  varchar(35)
22  SGIN                            char(3)
23  FCC Registration Number (FRN)   char(10)
24  Applicant Type Code             char(1)
25  Applicant Type Code Other       char(40)
26  Status Code                     char(1)
27  Status Date                     mm/dd/yyyy

Application / License Header
Position Data Element Definition
[HD]
1   Record Type [HD]                            char(2)
2   Unique System Identifier                    numeric(9,0)
3   ULS File Number                             char(14)
4   EBF Number                                  varchar(30)
5   Call Sign                                   char(10)
6   License Status                              char(1)
7   Radio Service Code                          char(2)
8   Grant Date                                  mm/dd/yyyy
9   Expired Date                                mm/dd/yyyy
10  Cancellation Date                           mm/dd/yyyy
11  Eligibility Rule Num                        char(10)
12  Reserved                                    char(1)
13  Alien                                       char(1)
14  Alien Government                            char(1)
15  Alien Corporation                           char(1)
16  Alien Officer                               char(1)
17  Alien Control                               char(1)
18  Revoked                                     char(1)
19  Convicted                                   char(1)
20  Adjudged                                    char(1)
21  Reserved                                    char(1)
22  Common Carrier                              char(1)
23  Non Common Carrier                          char(1)
24  Private Comm                                char(1)
25  Fixed                                       char(1)
26  Mobile                                      char(1)
27  Radiolocation                               char(1)
28  Satellite                                   char(1)
29  Developmental or STA or Demonstration       char(1)
30  InterconnectedService                       char(1)
31  Certifier First Name                        varchar(20)
32  Certifier MI                                char(1)
33  Certifier Last Name                         varchar(20)
34  Certifier Suffix                            char(3)
35  Certifier Title                             char(40)
36  Female                                      char(1)
37  Black or African-American                   char(1)
38  Native American                             char(1)
39  Hawaiian                                    char(1)
40  Asian                                       char(1)
41  White                                       char(1)
42  Hispanic                                    char(1)
43  Effective Date                              mm/dd/yyyy
44  Last Action Date                            mm/dd/yyyy
45  Auction ID                                  integer
46  Broadcast Services - Regulatory Status      char(1)
47  Band Manager - Regulatory Status            char(1)
48  Broadcast Services - Type of Radio Service  char(1)
49  Alien Ruling                                char(1)
50  Licensee Name Change                        char(1)
51  Whitespace Indicator                        char(1)

History
Position Data Element Definition
[HS]
1   Record Type [HS]            char(2)
2   Unique System Identifier    numeric(9,0)
3   ULS File Number             char(14)
4   Call Sign                   char(10)
5   Log Date                    mm/dd/yyyy
6   Code                        char(6)

License Attachment
Position Data Element Definition
[LA]
1   Record Type [LA]            char(2)
2   Unique System Identifier    numeric(9,0)
3   Call Sign                   char(10)
4   Attachment Code             char(1)
5   Attachment Description      varchar(60)
6   Attachment Date             mm/dd/yyyy
7   Attachment File Name        varchar(60)
8   Action Performed            char(1)

Special Condition
Position Data Element Definition
[SC]
1   Record Type [SC]            char(2)
2   Unique System Identifier    numeric(9,0)
3   ULS File Number             char(14)
4   EBF Number                  varchar(30)
5   Call Sign                   char(10)
6   Special Condition Type      char(1)
7   Special Condition Code      int
8   Status Code                 char(1)
9   Status Date                 mm/dd/yyyy

License Free Form Special Condition
Position Data Element Definition
[SF]
1   Record Type [SF]                    char(2)
2   Unique System Identifier            numeric(9,0)
3   ULS File Number                     char(14)
4   EBF Number                          varchar(30)
5   Call Sign                           char(10)
6   License Free Form Type              char(1)
7   Unique License Free Form Identifier numeric(9,0)
8   Sequence Number                     integer
9   License Free Form Condition         varchar(255)
10  Status Code                         char(1)
11  Status Date                         mm/dd/yyyy

The following extract from the code that creates the output database maps these on a one-to-one basis to internal identifiers:

[AM]
RECORD_TYPE,
ID,
ULS_NUMBER,
EBF_NUMBER,
CALLSIGN,
OPERATOR_CLASS,
GROUP_CODE,
REGION_CODE,
TRUSTEE_CALLSIGN,
TRUSTEE_INDICATOR,
PHYSICIAN_CERTIFICATION,
VE_SIGNATURE,
SYSTEMATIC_CALLSIGN_CHANGE,
VANITY_CALLSIGN_CHANGE,
VANITY_RELATIONSHIP,
PREVIOUS_CALLSIGN,
PREVIOUS_OPERATOR_CLASS,
TRUSTEE_NAME
[CO]
RECORD_TYPE,
ID,
ULS_NUMBER,
CALLSIGN,
COMMENT_DATE,
DESCRIPTION,
STATUS_CODE,
STATUS_DATE
[EN]
RECORD_TYPE,
ID,
ULS_NUMBER,
EBF_NUMBER,
CALLSIGN,
ENTITY_TYPE,
LICENSE_ID,
ENTITY_NAME,
FIRST_NAME,
MIDDLE_INITIAL,
LAST_NAME,
SUFFIX,
PHONE,
FAX,
EMAIL,
STREET_ADDRESS,
CITY,
STATE,
ZIP_CODE,
PO_BOX,
ATTENTION_LINE,
SGIN,
FRN,
APPLICANT_TYPE_CODE,
APPLICANT_TYPE_CODE_OTHER,
STATUS_CODE,
STATUS_DATE
[HD]
RECORD_TYPE,
ID,
ULS_NUMBER,
EBF_NUMBER,
CALLSIGN,
LICENSE_STATUS,
RADIO_SERVICE_CODE,
GRANT_DATE,
EXPIRED_DATE,
CANCELLATION_DATE,
ELIGIBILITY_RULE_NUM,
RESERVED_1,
ALIEN,
ALIEN_GOVERNMENT,
ALIEN_CORPORATION,
ALIEN_OFFICER,
ALIEN_CONTROL,
REVOKED,
CONVICTED,
ADJUDGED,
RESERVED_2,
COMMON_CARRIER,
NON_COMMON_CARRIER,
PRIVATE_COMM,
FIXED,
MOBILE,
RADIOLOCATION,
SATELLITE,
DEVELOPMENTAL_STA_DEMONSTRATION,
INTERCONNECTED_SERVICE,
CERTIFIER_FIRST_NAME,
CERTIFIER_MIDDLE_INITIAL,
CERTIFIER_LAST_NAME,
CERTIFIER_SUFFIX,
CERTIFIER_TITLE,
FEMALE,
BLACK_AFRICAN_AMERICAN,
NATIVE_AMERICAN,
HAWAIIAN,
ASIAN,
WHITE,
HISPANIC,
EFFECTIVE_DATE,
LAST_ACTION_DATE,
AUCTION_ID,
BROADCAST_SERVICES_REGULATORY_STATUS,
BAND_MANAGER_REGULATORY_STATUS,
BROADCAST_SERVICES_SERVICE_TYPE,
ALIEN_RULING,
LICENSEE_NAME_CHANGE,
WHITESPACE_INDICATOR
[HS]
RECORD_TYPE,
ID,
ULS_NUMBER,
CALLSIGN,
LOG_DATE,
CODE
[LA]
RECORD_TYPE,
ID,
CALLSIGN,
ATTACHMENT_CODE,
ATTACHMENT_DESCRIPTION,
ATTACHMENT_DATE,
ATTACHMENT_FILENAME,
ACTION_PERFORMED
[SC]
RECORD_TYPE,
ID,
ULS_NUMBER,
EBF_NUMBER,
CALLSIGN,
SPECIAL_CONDITION_TYPE,
SPECIAL_CONDITION_CODE,
STATUS_CODE,
STATUS_DATE
[SF]
RECORD_TYPE,
ID,
ULS_NUMBER,
EBF_NUMBER,
CALLSIGN,
LICENSE_FREEFORM_TYPE,
UNIQUE_LICENSE_FREEFORM_ID,
SEQUENCE_NUMBER,
LICENSE_FREEFORM_CONDITION,
STATUS_CODE,
STATUS_DATE
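
As a rough illustration of how such a list of identifiers can be used, here is a minimal Python sketch that pairs the values in one pipe-delimited record with their names. It is not taken from the program itself; the function name `parse_record` and the sample line are invented for the example, and it assumes that a record holds exactly one value per identifier and contains no embedded line feeds.

```python
# Identifiers for [HS] records, copied from the listing above.
HS_FIELDS = ["RECORD_TYPE", "ID", "ULS_NUMBER", "CALLSIGN", "LOG_DATE", "CODE"]


def parse_record(line, field_names):
    """Split one pipe-delimited record and pair each value with its identifier."""
    values = line.rstrip("\n").split("|")
    if len(values) != len(field_names):
        raise ValueError(f"expected {len(field_names)} fields, found {len(values)}")
    return dict(zip(field_names, values))


# Invented sample line, for illustration only (not real FCC data).
sample = "HS|123456789||X0XXX|05/28/2020|ABCDE"
record = parse_record(sample, HS_FIELDS)
print(record["CALLSIGN"], record["LOG_DATE"])
```

Real records may need more care than this (embedded line feeds in particular), so treat it only as a starting point.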

The 48 output fields selected from the above lists are (arranged in groups of ten for easy counting):

ID,
CALLSIGN,
OPERATOR_CLASS,
GROUP_CODE,
REGION_CODE,
TRUSTEE_CALLSIGN,
TRUSTEE_INDICATOR,
SYSTEMATIC_CALLSIGN_CHANGE,
VANITY_CALLSIGN_CHANGE,
VANITY_RELATIONSHIP,

PREVIOUS_CALLSIGN,
PREVIOUS_OPERATOR_CLASS,
TRUSTEE_NAME,
COMMENT_DATE,
DESCRIPTION,
CO_STATUS_CODE, (i.e., STATUS_CODE from [CO])
CO_STATUS_DATE, (i.e., STATUS_DATE from [CO])
ENTITY_NAME,
FIRST_NAME,
MIDDLE_INITIAL,

LAST_NAME,
SUFFIX,
PHONE,
FAX,
EMAIL,
STREET_ADDRESS,
CITY,
STATE,
ZIP_CODE,
PO_BOX,

ATTENTION_LINE,
FRN,
APPLICANT_TYPE_CODE,
APPLICANT_TYPE_CODE_OTHER,
EN_STATUS_CODE, (i.e., STATUS_CODE from [EN])
EN_STATUS_DATE, (i.e., STATUS_DATE from [EN])
LICENSE_STATUS,
RADIO_SERVICE_CODE,
GRANT_DATE,
EXPIRED_DATE,

CANCELLATION_DATE,
ELIGIBILITY_RULE_NUM,
REVOKED,
CONVICTED,
ADJUDGED,
EFFECTIVE_DATE,
LAST_ACTION_DATE,
LICENSEE_NAME_CHANGE

The contents of these fields are taken from the equivalent entries in the original data files. Each entry is subject to the following transformations before being written to the output file:
  • The entry is converted to upper case;
  • Any line feeds (yes, the FCC allows line feeds within a field) are converted to the four-character sequence: <LF>;
  • Leading and trailing spaces are removed;
  • If the field is a date, it is converted from FCC format (mm/dd/yyyy) to ISO 8601 extended format: YYYY-MM-DD.
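
The four transformations above can be sketched in Python as follows. This is not the code used to build the file; the function name `transform_field` is invented, and applying the steps in the order listed is an assumption (the order matters slightly: a trailing line feed becomes `<LF>` before spaces are trimmed).

```python
import re

# Matches the FCC date layout mm/dd/yyyy and nothing else.
FCC_DATE = re.compile(r"(\d{2})/(\d{2})/(\d{4})")


def transform_field(value, is_date=False):
    """Apply the four listed transformations to a single field value."""
    value = value.upper()                 # convert to upper case
    value = value.replace("\n", "<LF>")   # line feed -> four-character marker
    value = value.strip(" ")              # remove leading and trailing spaces
    if is_date:
        match = FCC_DATE.fullmatch(value)
        if match:                         # empty or malformed dates are left alone
            month, day, year = match.groups()
            value = f"{year}-{month}-{day}"
    return value


print(transform_field("  Main St\nApt 2 "))
print(transform_field("05/28/2020", is_date=True))
```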
The latest output file created in this manner (and its MD5 checksum) may be downloaded from this directory.

The full source code to generate the output file may be downloaded here.

To create the binary from the source code, go to the directory that contains the makefile and type:
make fcc-db
This should generate the executable program as: bin/fcc-db. The program may be executed from within the bin directory as:
fcc-db [directory]
where [directory] is the name of the directory that contains the input FCC AM.dat, CO.dat, EN.dat and HD.dat files. The program processes those files and writes its output to stdout.

For what it's worth, it takes somewhat less than 15 seconds for the program to execute to completion on my desktop computer if stdout is redirected to an output file.
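
With the output file in hand, the sort of question posed at the start of this post becomes a short script. The following Python sketch counts records whose named fields have given values, using the order of the 48 output fields listed above. The helper names (`count_matching`, `make_record`) and the demonstration records are invented, and the use of "A" as the operator-class code for Advanced is an assumption that should be checked against the FCC documentation.

```python
# The 48 output fields, in the order listed above.
OUTPUT_FIELDS = """
ID CALLSIGN OPERATOR_CLASS GROUP_CODE REGION_CODE
TRUSTEE_CALLSIGN TRUSTEE_INDICATOR SYSTEMATIC_CALLSIGN_CHANGE
VANITY_CALLSIGN_CHANGE VANITY_RELATIONSHIP
PREVIOUS_CALLSIGN PREVIOUS_OPERATOR_CLASS TRUSTEE_NAME COMMENT_DATE DESCRIPTION
CO_STATUS_CODE CO_STATUS_DATE ENTITY_NAME FIRST_NAME MIDDLE_INITIAL
LAST_NAME SUFFIX PHONE FAX EMAIL
STREET_ADDRESS CITY STATE ZIP_CODE PO_BOX
ATTENTION_LINE FRN APPLICANT_TYPE_CODE APPLICANT_TYPE_CODE_OTHER EN_STATUS_CODE
EN_STATUS_DATE LICENSE_STATUS RADIO_SERVICE_CODE GRANT_DATE EXPIRED_DATE
CANCELLATION_DATE ELIGIBILITY_RULE_NUM REVOKED CONVICTED ADJUDGED
EFFECTIVE_DATE LAST_ACTION_DATE LICENSEE_NAME_CHANGE
""".split()
INDEX = {name: position for position, name in enumerate(OUTPUT_FIELDS)}


def count_matching(lines, **wanted):
    """Count records whose named fields equal the requested values."""
    count = 0
    for line in lines:
        fields = line.rstrip("\n").split("|")
        if len(fields) != len(OUTPUT_FIELDS):
            continue  # skip anything that is not a 48-field record
        if all(fields[INDEX[name]] == value for name, value in wanted.items()):
            count += 1
    return count


def make_record(**values):
    """Build an invented 48-field record for demonstration purposes."""
    fields = [""] * len(OUTPUT_FIELDS)
    for name, value in values.items():
        fields[INDEX[name]] = value
    return "|".join(fields)


demo = [
    make_record(CALLSIGN="X0AAA", OPERATOR_CLASS="A", STATE="TX"),
    make_record(CALLSIGN="X0BBB", OPERATOR_CLASS="E", STATE="TX"),
    make_record(CALLSIGN="X0CCC", OPERATOR_CLASS="A", STATE="CO"),
]
print(count_matching(demo, OPERATOR_CLASS="A", STATE="TX"))  # 1
```

For real use, one would pass an open file object for the downloaded output file in place of `demo`, and probably add a condition on LICENSE_STATUS so that only current licences are counted.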

2020-05-26

RBN Posts From Current Year Added

The RBN data directory on adrive now contains a file of the RBN posts from the current year. This file is updated once per week, on Monday. As before, files containing all the posts from prior years are available in the same directory.

2020-03-16

RBN-based Proxy Metric for Relative CW Activity

The Reverse Beacon Network (RBN) makes available an Internet-based historical record of all CW stations on the HF amateur bands detected, decoded and posted by the stations that participate in the RBN. This suggests the possibility of using those data in some form to provide a more or less reliable indicator of the number of CW-active stations since early 2009, when the RBN came into being.

Performing such an analysis, though, is not a trivial matter because of several characteristics of the RBN. In particular, one cannot simply count the number of callsigns detected over a period and assume that the count can reasonably be compared with the count from any other period of similar duration. The following characteristics (at least) argue against such a naïve comparison:
  1. The RBN comprises, at any one moment, a number of receiving stations that post detected CW callsigns to the Internet. However, no record is kept of the number of stations actively recording and posting at any given moment; that number varies widely over time.
  2. The geographical distribution of posting stations varies from moment to moment.
  3. Even at the best of times, large swathes of the Earth's surface contain no posting stations.
  4. Different posting stations listen to different sets of HF bands, and a single posting station may monitor different bands at different times.
  5. The software that drives the RBN is tuned with the intent of eliminating callsigns of stations that are calling other stations. That is, the RBN is intended to post only the callsigns of stations seeking contacts (generally by calling CQ).
  6. The software may decode callsigns incorrectly, and thus post a callsign that is not in fact present on the air.
  7. Hardly the RBN's fault, but some stations send so poorly that several variants of their call may be decoded and posted.
The effect of these characteristics is to pollute any list of posted callsigns, and also to cause some callsigns that were actually used on the air not to appear in any such list. The question is, then: is it possible still to produce a useful metric from the RBN data?

First, let's look at issue number 5 above: namely that callsigns only of CQing stations are posted. In the past this might have been regarded as a serious shortcoming; it is less clear that the same is true today, especially if we are attempting to measure (in some sense) those stations whose operators actively prefer and enjoy CW. With the modern proliferation of huge CW pile-ups on major DXpeditions there are now many stations whose goal is simply to work the DXpedition (perhaps for a "slot"; perhaps for a new one) and operators of such stations may well have next to no knowledge of CW at all, using pre-recorded macros for transmission, and either code readers or simple aural recognition of the pattern of their callsign to decode the high-speed CW from the DX station. It would not seem right to include such stations in any listing of CW activity, since their use of CW is mere happenstance. So issue number 5 merely serves to cause us to define "CW activity" in a particular manner that might not have been appropriate in the past but nowadays, in the current milieu, seems quite defensible, and perhaps even preferable to merely listing all callsigns heard on the air. Thus, "CW activity" herein means "CW activity by stations calling CQ".

Having finessed issue #5, the others remain; to some extent they must all affect the calls posted by the RBN. Any one of them might be amenable to reasonable analysis or at least intelligent guesswork as to its effect; taken together, it seems prudent to simply say that in any group of calls posted over a particular time, some calls are likely to be in error, and some genuine calls that were active may well have been missed. That much is obvious, but it does not necessarily mean that a reasonable metric cannot be derived with pragmatic simplicity from the actual postings made by the RBN.

We can begin by restricting the domain of the problem, for now, to a single band. The obvious choice is 20m, as it has a reasonably consistent high level of activity throughout a solar cycle. So let's look at some numbers and graphs created from the raw RBN postings over the period from its inception in early 2009 to the end of 2019, using the calendar year as the basic element of time (thus, the values for 2009 will be affected by the shorter period for which samples are available, in addition to any effects from the considerations listed above).

To get some feel for the data, we begin by defining a value $V(n)$ that is the number of callsigns that appear exactly $n$ times within a particular year. We can then plot the values of $V(1)$ to $V(100)$ for each of the years for which we have data:


Despite the lack of detail at this scale, a couple of interesting things are immediately apparent:
  1. The plots of $V(n)$ are remarkably consistent from year to year;
  2. The transition between large negative gradient and shallow negative gradient occurs over a relatively short range of $n$.
It is perhaps useful to put into words the most obvious feature of the plot: a large percentage of the calls posted by the RBN occur only a very few times. If a call appears only once (or just a small number of times) in the course of a year, it is overwhelmingly likely to be a bust of some kind: it is hard to imagine circumstances in which a valid call would be correctly decoded and posted only once (or a small number of times) in an entire year. Similarly, if a call appears a relatively large number of times, it is highly unlikely to be a bust. We shall return to this simple fact below.
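
For concreteness, the quantity $V(n)$ can be computed from a list of posted callsigns (for one band and one year) with a couple of lines of Python. This is only a sketch: the function name `occurrence_histogram` and the callsigns are invented for the example.

```python
from collections import Counter


def occurrence_histogram(callsigns):
    """Return V as a Counter: V[n] is the number of callsigns posted exactly n times."""
    posts_per_call = Counter(callsigns)       # callsign -> number of postings
    return Counter(posts_per_call.values())   # n -> number of callsigns with n postings


# Invented postings: X1A three times, X3C twice, X2B and X4D once each.
posts = ["X1A", "X1A", "X1A", "X2B", "X3C", "X3C", "X4D"]
V = occurrence_histogram(posts)
print([V[n] for n in range(1, 4)])  # [2, 1, 1]
```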

By switching to a logarithmic scale, we can see more detail in the data:


Apart from 2009, when, as mentioned above, there are fewer data, the plots for each year essentially lie on top of each other; further, there are no obvious characteristics other than a more or less monotonic decrease in $V(n)$ as $n$ increases. This is good, as it suggests that there is some robustness to this kind of analysis, and therefore it may be possible to use it as a basis for calculation of an activity metric.

What happens if we look at a band with great variation over the course of a solar cycle; 10m, for instance?

No new features appear, further suggesting that this is a robust approach.

The graphs suggest that an activity metric $M_{CW}$ may be defined over an interval of time by:
$$
M_{CW} = \sum_{n=1}^{\infty} \alpha(n) V(n)
$$
where the form of $\alpha(n)$ is to be determined.

For small values of $n$, all or almost all calls will be busts -- that is, $\alpha(\mathrm{small}) \approx 0$; for large values of $n$ no or almost no calls will be busts -- that is, $\alpha(\mathrm{large}) \approx 1$.

We can reasonably define "large" to be the lowest value $n$ for which $V(n)$ is statistically indistinguishable from $V(n+1)$ and (to be safe) for which $V(n+1)$ is statistically indistinguishable from $V(n+2)$. And we can reasonably define "statistically indistinguishable" as meaning that $V(m+1)$ lies in the range:
$$
V(m) \pm 2\sqrt{V(m)}
$$

We can then define "small" simply as unity, and $\alpha(n)$ as a function that rises linearly from zero at $n=1$ to unity at $n=L$ and remains at unity for $n \ge L$, where $L$ is the smallest "large" number defined by the procedure described above. [NB there is nothing magic about these precise definitions; reasonable variations on this theme provide essentially identical relative results from year to year -- another indication of robustness. One simply needs to be careful to maintain the same definition of $\alpha(n)$ for all the periods under examination. (Actually, it works even with the same meta-definition, but let's not bother with that.)] By this definition, "large" will vary from year to year and from band to band, meaning that $\alpha(n)$ will similarly vary. That is one way to proceed, but it means calculating the meaning of "large" for each year and each band. A simpler way to analyze the data is to choose a universal value for "large" that is large enough to encompass all years and bands being analyzed.
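
The search for the smallest "large" value can be sketched in Python as below. This is an illustration of the definition as stated, not the code used for the analysis; the function names, the upper search limit `n_max`, and the sample values of $V(n)$ are all invented.

```python
import math


def indistinguishable(V, m):
    """True if V(m+1) lies within V(m) +/- 2*sqrt(V(m))."""
    v_m = V.get(m, 0)
    v_next = V.get(m + 1, 0)
    return abs(v_next - v_m) <= 2 * math.sqrt(v_m)


def smallest_large(V, n_max=1000):
    """Lowest n with V(n) ~ V(n+1) and V(n+1) ~ V(n+2); None if none up to n_max."""
    for n in range(1, n_max + 1):
        if indistinguishable(V, n) and indistinguishable(V, n + 1):
            return n
    return None


# Invented values of V(n), falling steeply and then levelling off.
V = {1: 1000, 2: 400, 3: 200, 4: 120, 5: 100, 6: 95, 7: 90}
print(smallest_large(V))  # 4
```

In this invented example the first three steps fall well outside the two-sigma range, while the steps from $V(4)$ to $V(5)$ and from $V(5)$ to $V(6)$ both fall inside it, so "large" is 4.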

The actual values of "large" as defined above, for all the years and bands for which we have annual data, are given in the following table:

Band  2009  2010  2011  2012  2013  2014  2015  2016  2017  2018  2019
10m      7    11    12    13    13    14    14    14     9     9     9
12m      4     8    16    10    12    10    12     8     8     7    10
15m     10    11    16    17    17    17    12    13    16    10    13
17m      7    13     9    11    21    10    12     9    13    18    11
20m     21    16    16    18    20    21    19    27    24    21    20
30m     12    17    14    15    18    13    13    18    12    15    14
40m     14    18    17    22    24    21    25    23    20    20    15
80m     16    22    12    18    15    13    14    14    19    22    22
160m     9    12    11    10    11    11    12    18    12    14    14
HF      18    19    27    23    24    23    32    23    20    22    22

The last line, marked "HF" is derived from all the postings for the individual bands for a given year. [Note that this bears no simple relationship to the other lines; if this isn't obvious, just think of a call that is posted once on 20m and once on 160m in the course of a year: this will contribute to $V(1)$ on 20m and 160m, but to $V(2)$ on HF.]
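
The point in the bracketed note is easy to check numerically. Here is a small Python sketch using a single invented callsign posted once on each of two bands; the function name `occurrence_histogram` is also invented.

```python
from collections import Counter


def occurrence_histogram(callsigns):
    """V[n] is the number of callsigns that appear exactly n times."""
    return Counter(Counter(callsigns).values())


# Invented postings as (band, callsign) pairs.
posts = [("20m", "X1A"), ("160m", "X1A")]

per_band = {
    band: occurrence_histogram(call for b, call in posts if b == band)
    for band in ("20m", "160m")
}
hf = occurrence_histogram(call for _, call in posts)

# V(1) on 20m, V(1) on 160m, then V(1) and V(2) for HF as a whole.
print(per_band["20m"][1], per_band["160m"][1], hf[1], hf[2])  # 1 1 0 1
```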

The largest value in the above table is 32 (HF, 2015). Therefore one plausible $\alpha(n)$ function would go linearly from a value of zero at $n=1$ to a value of 1 at $n=35$, and have the value 1 at all values >35.
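
Expressed as code, this choice of $\alpha(n)$ and the resulting metric look something like the following Python sketch. The function names and the sample values of $V(n)$ are invented; only the shape of $\alpha(n)$ (zero at $n=1$, rising linearly to 1 at $n=35$) comes from the text.

```python
def alpha(n, large=35):
    """Weight rising linearly from 0 at n=1 to 1 at n=large, and 1 beyond that."""
    if n >= large:
        return 1.0
    return (n - 1) / (large - 1)


def cw_activity_metric(V, large=35):
    """Sum of alpha(n) * V(n) over every n present in V (a dict of n -> count)."""
    return sum(alpha(n, large) * count for n, count in V.items())


# Invented values: many one-off calls, fewer calls posted often.
V = {1: 5000, 18: 200, 35: 100, 60: 40}
print(cw_activity_metric(V))  # 0*5000 + 0.5*200 + 1*100 + 1*40 = 240.0
```

Note how the 5,000 calls that appear only once contribute nothing at all to the metric, which is the intent.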

Applying this procedure, we finally arrive at the following figure:


The CW Activity Metric as defined above is arguably a poor choice for showing small changes in activity over time (although it is not obvious how to create a better metric for such purposes given the limitations of the RBN listed above); but if the metric shows little variation over a period, it is hard to see how the underlying CW activity can have varied much either, absent a rather precisely anti-correlated change in the factors that lead to busts in the RBN postings.

The effect of the solar cycle is readily apparent in the figure (see the lines for the higher frequency bands, where activity does indeed change as a function of the cycle, and the metric reflects this change); however, taking HF activity as a whole, it is remarkable how consistent activity has been for many years. There is certainly no evidence from this analysis of the RBN data that CW activity taken as a whole has shown any consistent wane over the past decade. 

2020-03-09

Signals from the VP8PJ and VP8ORK DXpeditions to South Orkney

A few interesting (I think, anyway) graphs of CW signals from the most recent two DXpeditions to South Orkney as reported by the Reverse Beacon Network:





Quite a difference.

Worth noting is that:
  1. VP8ORK was operational in a period with the solar flux index in the high 70s, whereas VP8PJ was operational close to the cycle minimum, with a solar flux index in the high 60s.

  2. During the VP8ORK DXpedition a total of 100 RBN stations were active; during the VP8PJ DXpedition, a total of 234 RBN stations were active.

2020-02-05

Busting Calls: CQ WW 2019

Prior posts in this series:
Throughout this post, I apply the procedures developed in the second post above.

For the purpose of this post, only verified QSOs are counted.

Lowest Probability


I begin with an ordered list of the stations with the lowest probabilities of busting a call in 2019 CQ WW SSB.

2019 CQ WW SSB -- weighted mean values of $p_{bust}$
Position Call weighted mean $Q_v$ $B$
1 DL7URH 0.0008 1,125 0
2 DF2RG 0.0009 1,037 0
3 K0KX 0.0011 840 0
4 G4PVM 0.0012 819 0
5 N4PQX 0.0012 779 0
6 WA2FZB 0.0012 774 0
7 K3WJV 0.0013 741 0
8 N3ZA 0.0014 703 0
9 DL1MHJ 0.0014 682 0
10 G4IIY 0.0017 579 0

And for 2019 CQ WW CW:

2019 CQ WW CW -- weighted mean values of $p_{bust}$ (all)
Position Call weighted mean $Q_v$ $B$
1 YU1RA 0.0006 1,515 0
2 K0KX 0.0007 1,387 0
3 RT3N 0.0008 1,236 0
4 UA3QPA 0.0008 1,204 0
5 W3KB 0.0008 1,179 0
6 N2GC 0.0008 1,168 0
7 VO2AC 0.0008 1,154 0
8 SQ8N 0.0008 1,118 0
9 YO8DOH 0.0009 1,104 0
10 OM6RM 0.0009 1,064 0

Well done to K0KX for appearing in (and near the top of) both tables.

It is interesting to plot the aggregated probability function for $p_{bust}$, weighted by the number of verified QSOs, $Q_v$, for all stations:


In case it isn't clear, the locations of the solid vertical lines represent the weighted means of the probability curves.

We can limit the analysis to calling stations (i.e., not the running station).

2019 CQ WW SSB -- weighted mean values of $p_{bust}$ (no-run)
Position Call weighted mean $Q_v$ $B$
1 DL7URH 0.0009 1,041 0
2 DF2RG 0.0009 1,017 0
3 K0KX 0.0012 819 0
4 G4PVM 0.0012 814 0
5 G3TXF 0.0012 782 0
6 N4PQX 0.0013 745 0
7 K3WJV 0.0013 735 0
8 I2WIJ 0.0013 715 0
9 WA2FZB 0.0013 713 0
10 OK6Y 0.0014 709 0

2019 CQ WW CW -- weighted mean values of $p_{bust}$ (no-run)
Position Call weighted mean $Q_v$ $B$
1 IR4X 0.0005 1,762 0
2 UT4U 0.0006 1,612 0
3 RT4M 0.0006 1,452 0
4 K0KX 0.0009 1,103 0
5 K3PH 0.0009 1,094 0
6 K1AR 0.0009 1,075 0
7 YO4NF 0.0009 1,075 0
8 LA4C 0.0009 1,074 0
9 RG5A 0.0009 1,062 0
10 AA9A 0.0009 1,058 0


And similarly for running stations.

2019 CQ WW SSB -- weighted mean values of $p_{bust}$ (run)
Position Call weighted mean $Q_v$ $B$
1 DG5MLA 0.0020 482 0
2 K6XX 0.0022 439 0
3 OG7A 0.0023 1,759 3
4 S58Y 0.0024 847 1
5 M0BJL 0.0024 409 0
6 9A2EU 0.0026 374 0
7 DM4X 0.0027 2,183 5
8 PT2AW 0.0029 333 0
9 N4BP 0.0030 330 0
10 IR4M 0.0030 4,042 11

2019 CQ WW CW -- weighted mean values of $p_{bust}$ (run)
Position Call weighted mean $Q_v$ $B$
1 G4IIY 0.0007 1,369 0
2 YU1RA 0.0007 1,267 0
3 VO2AC 0.0012 825 0
4 LY5W 0.0014 697 0
5 OQ5M 0.0014 2,129 2
6 BY8MA 0.0016 596 0
7 DL6KWN 0.0016 589 0
8 N3RS 0.0017 1,815 2
9 V55A 0.0017 4,184 6
10 K3PH 0.0018 1,113 1


We can also look at the changes over the period from 2005 to 2019.

First for all QSOs:


Then for calling stations:


And for running stations:


I think it's also interesting to see who appears to have the lowest probability of busting a call over an extended period. So, for the last ten years:

2010--2019 CQ WW SSB -- weighted mean values of $p_{bust}$ (all)
Position Call weighted mean $Q_v$ $B$
1 JM1NKT 0.0007 3,040 1
2 NW0M 0.0007 2,875 1
3 KV1J 0.0008 3,851 2
4 WA1ZYX 0.0008 1,196 0
5 NB3C 0.0009 1,066 0
6 IK4OMU 0.0009 1,039 0
7 OK2VWB 0.0009 1,023 0
8 AA1BU 0.0009 1,017 0
9 LY3CY 0.0009 5,333 4
10 ES2MC 0.0009 4,223 3

2010--2019 CQ WW CW -- weighted mean values of $p_{bust}$ (all)
Position Call weighted mean $Q_v$ $B$
1 NW0M 0.0003 2,661 0
2 DJ1YFK 0.0004 2,150 0
3 JA1QOW 0.0004 4,774 1
4 HL1VAU 0.0004 4,718 1
5 KM3T 0.0005 1,826 0
6 EU4E 0.0005 3,715 1
7 WB4TDH 0.0006 5,230 2
8 R7MY 0.0006 3,266 1
9 VX7SZ 0.0006 1,491 0
10 UA3QPA 0.0007 3,095 1

Congratulations are definitely due to NW0M for that effort.

Highest Probability


We can also look at the calls associated with the highest probability of busting calls in either the forward or the reverse direction:

2019 SSB -- Most Busts
Position Call QSOs Busts % Busts
1 CN3A 9,554 195 2.0
2 EF8R 11,853 195 1.6
3 V26B 6,928 160 2.3
4 HI3LT 4,756 159 3.3
5 IR7T 5,878 152 2.6
6 PR4T 4,447 146 3.3
7 ED1R 6,409 132 2.1
8 LZ9W 8,457 130 1.5
9 ZF1A 5,986 128 2.1
10 OT5A 5,595 121 2.2

2019 CW -- Most Busts
Position Call QSOs Busts % Busts
1 9A/AI6V 1,833 263 14.3
2 EF8R 14,040 212 1.5
3 CN3A 11,458 168 1.5
4 JT5DX 6,338 161 2.5
5 NP2P 3,908 147 3.8
6 PZ5W 7,468 140 1.9
7 E7DX 6,794 129 1.9
8 ZW8T 1,276 127 10.0
9 F6KOP 4,745 125 2.6
10 TK0C 8,221 124 1.5


2019 SSB -- Highest Percentage of Busts (≥100 QSOs)
Position Call QSOs Busts % Busts
1 RQ2K 177 28 15.8
2 YB2BNN 257 39 15.2
3 OH1POR 140 20 14.3
4 EA4GWL 296 39 13.2
5 DO1GMW 100 13 13.0
6 OZ0QF 138 17 12.3
7 K6DBS 140 16 11.4
8 PR7AD 106 12 11.3
9 YB7WW 151 17 11.3
10 OK2BOZ 117 13 11.1

2019 CW -- Highest Percentage of Busts (≥100 QSOs)
Position Call QSOs Busts % Busts
1 LZ1BY 203 41 20.2
2 W4IT 302 57 18.9
3 K1EEE 132 24 18.2
4 K4KAY 143 26 18.2
5 TF3EO 151 27 17.9
6 W6SFG 175 30 17.1
7 K7TRF 133 22 16.5
8 VA3ZNW 110 18 16.4
9 OZ5KU 110 18 16.4
10 UV2IM 123 20 16.3


2019 SSB -- Most Reverse Busts
Position Call QSOs Reverse Busts % Reverse Busts
1 PR4T 4,447 336 7.6
2 EF8R 11,853 190 1.6
3 DF0HQ 7,267 186 2.6
4 HC0E 2,413 145 6.0
5 FY5KE 7,209 130 1.8
6 TM3R 4,494 118 2.6
7 ED7W 2,747 118 4.3
8 RO2E 3,262 117 3.6
9 HI3LT 4,756 112 2.4
10 IO5O 4,592 110 2.4

2019 CW -- Most Reverse Busts
Position Call QSOs Reverse Busts % Reverse Busts
1 HG5A 3,914 503 12.9
2 JS3CTQ 2,390 296 12.4
3 DF0HQ 8,553 281 3.3
4 IS0SWW 5,141 273 5.3
5 EF8R 14,040 263 1.9
6 4U25B 2,599 245 9.4
7 DR4A 6,589 220 3.3
8 RM9A 8,611 191 2.2
9 CN3A 11,458 190 1.7
10 ES9C 6,858 186 2.7


2010--2019 SSB -- Most Busts
Position Call QSOs Busts % Busts
1 LZ9W 85,297 1,486 1.7
2 CN3A 88,174 1,460 1.7
3 OT5A 69,012 1,350 2.0
4 PJ2T 71,872 1,339 1.9
5 A73A 65,680 1,281 2.0
6 EF8R 62,099 1,045 1.7
7 V26B 60,324 964 1.6
8 HG7T 57,597 831 1.4
9 D4C 58,631 806 1.4
10 RT6A 44,900 783 1.7

2010--2019 CW -- Most Busts
Position Call QSOs Busts % Busts
1 PJ2T 94,865 1,311 1.4
2 LZ9W 99,126 1,226 1.2
3 PV8ADI 8,127 1,177 14.5
4 PI4CC 50,515 977 1.9
5 D4C 64,411 860 1.3
6 9A1A 104,579 832 0.8
7 ZW8T 11,510 822 7.1
8 RW0A 50,926 808 1.6
9 PJ4A 71,047 803 1.1
10 JA3YBK 55,320 765 1.4


2010--2019 SSB -- Highest Percentage of Busts (≥500 QSOs)
Position Call QSOs Busts % Busts
1 PV8ADI 858 139 16.2
2 K2JMY 2,062 305 14.8
3 EA7JQT 572 71 12.4
4 PU1MMZ 544 62 11.4
5 EA1HTF 1,389 153 11.0
6 PU2TRX 868 94 10.8
7 EA4GWL 686 71 10.3
8 YB9KA 557 56 10.1
9 LU6DCN 604 57 9.4
10 E20WXA 650 61 9.4

2010--2019 CW -- Highest Percentage of Busts (≥500 QSOs)
Position Call QSOs Busts % Busts
1 W2UDT 811 167 20.6
2 4Z5FW 527 108 20.5
3 BD3MV 1,091 217 19.9
4 DJ5UZ 766 135 17.6
5 YO7LYM 1,711 291 17.0
6 JA3AHY 554 92 16.6
7 WP3Y 570 93 16.3
8 AE3D 1,016 161 15.8
9 YU1NIM 619 98 15.8
10 IK0YUO 558 83 14.9


2010--2019 SSB -- Most Reverse Busts
Position Call QSOs Reverse Busts % Reverse Busts
1 DF0HQ 79,637 2,309 2.9
2 JA3YBK 40,441 1,286 3.2
3 K3LR 69,289 1,062 1.5
4 CN2R 55,540 956 1.7
5 CN3A 88,174 921 1.0
6 WE3C 32,201 876 2.7
7 IK2YCW 26,491 857 3.2
8 EF8R 62,099 823 1.3
9 W3LPL 53,188 814 1.5
10 GM2T 38,631 802 2.1

2010--2019 CW -- Most Reverse Busts
Position Call QSOs Reverse Busts % Reverse Busts
1 JS3CTQ 25,282 3,215 12.7
2 DF0HQ 88,116 3,100 3.5
3 ES9C 66,020 2,020 3.1
4 W2FU 56,534 1,489 2.6
5 K3LR 75,096 1,451 1.9
6 DR1A 47,148 1,280 2.7
7 HG7T 62,301 1,270 2.0
8 IR4X 45,823 1,255 2.7
9 NR4M 55,964 1,253 2.2
10 DR4A 50,977 1,240 2.4


2010--2019 SSB -- Highest Percentage of Reverse Busts (≥500 QSOs)
Position Call QSOs % Reverse Busts
1 CW90A 1,370 30.9
2 BA8AG 752 16.4
3 BW2/KU1CW 940 12.3
4 OG60F 4,258 11.4
5 ZP6DYA 1,226 11.3
6 V84SCQ 806 10.5
7 LU9DDJ 701 10.4
8 BV55D 919 10.2
9 JG3SVP 1,270 10.1
10 PP5BS 895 9.5

2010--2019 CW -- Highest Percentage of Reverse Busts (≥500 QSOs)
Position Call QSOs % Reverse Busts
1 RZ3VO 892 100.0
2 G3RWF 943 99.9
3 YT65A 1,149 37.2
4 5K0A 1,853 32.1
5 PE75W 1,408 29.3
6 OG55W 3,265 28.6
7 DP65HSC 516 16.9
8 5J1E 1,523 15.6
9 SB0A 1,102 15.5
10 K3HW 1,044 14.9

In tables of reverse busts, one sometimes finds what seems like an unreasonable number of reverse busts (as, in the last table, for RZ3VO and G3RWF). This is generally caused by a discrepancy between the call actually sent by the listed station and the one recorded as being sent in at least some QSOs in the station's log.

2020-01-30

Continent-Based Analyses from 2019 CQ WW SSB and CQ WW CW logs

In addition to zone-based analyses, we can use the public CQ WW logs for the period from 2005 to 2019 (cq-ww-2005--2019-augmented.xz; see here for details of the augmented format) to perform similar analyses based on continent.

Continent Pairs


We start by looking at the number of QSOs for pairs of continents from the contests for 2019.

The procedure is simple. We consider only QSOs that meet the following criteria:
  1. marked as "two-way" QSOs (i.e., both parties submitted a log containing the QSO);
  2. no callsign or zone is bust by either party.
A counter is maintained for every possible pair of continents and the pertinent counter is incremented once for each distinct QSO between stations in those continents.
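
The counting procedure above can be sketched in a few lines of Python. This is only an illustration: the Qso record, its field names and the sample callsigns are all hypothetical, and do not reflect the layout of the augmented file. The sketch also assumes that both logs record the same band and minute for a QSO, which is how a QSO that appears in two logs is recognised and counted only once.

```python
from collections import Counter
from typing import NamedTuple

class Qso(NamedTuple):
    # Hypothetical record layout, NOT the real augmented format
    tcall: str     # call of the station that submitted the log
    tcont: str     # continent of that station
    rcall: str     # call of the station worked
    rcont: str     # continent of the station worked
    band: str
    minute: int    # time of the QSO, in minutes from the start of the contest
    two_way: bool  # True if both parties submitted a log containing the QSO
    bust: bool     # True if either party bust a callsign or zone

def count_continent_pairs(qsos):
    """Count distinct, clean, two-way QSOs for each unordered pair of continents."""
    counts = Counter()
    seen = set()
    for q in qsos:
        if not q.two_way or q.bust:
            continue
        # Assumption: both logs agree on band and minute, so this key
        # identifies the same QSO regardless of whose log it came from
        key = (frozenset((q.tcall, q.rcall)), q.band, q.minute)
        if key in seen:
            continue
        seen.add(key)
        counts[tuple(sorted((q.tcont, q.rcont)))] += 1
    return counts

# Made-up sample data
sample = [
    Qso("G4AAA", "EU", "K1BBB", "NA", "20m", 10, True, False),
    Qso("K1BBB", "NA", "G4AAA", "EU", "20m", 10, True, False),   # same QSO, other log
    Qso("G4AAA", "EU", "DL1CCC", "EU", "40m", 25, True, False),
    Qso("G4AAA", "EU", "JA1DDD", "AS", "15m", 40, False, False), # not two-way
    Qso("K1BBB", "NA", "PY2EEE", "SA", "10m", 55, True, True),   # bust
]
pairs = count_continent_pairs(sample)
```

Sorting the two continents before counting means that EU/NA and NA/EU share a single counter.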

Separate figures are provided below for each band, led by a figure integrating QSOs on all bands. The figures are constructed in such a way as to show the results for both the SSB and CW contests on a single figure. (Any pair of continents with no QSOs that meet the above criteria appears in black on the figures.)








Continents and Distance


Below is a series of figures showing the distribution of distance for QSOs as a function of continent.

Each plot shows a colour-coded distribution of the distance of QSOs for each continent, with the data for SSB appearing above the data for CW within each continent.

For every half-QSO in a given continent, the distance of the QSO is calculated; in this way, the total number of half-QSOs in bins of width 500 km is accumulated. Once all the QSOs for a particular contest have been binned in this manner, the distribution for each continent is normalised to total 100% and the result coded by colour and plotted. The mean distance for each continent and mode is denoted by a small white rectangle added to the underlying distance distribution. The 99% confidence range of the value of the mean is marked by a small blue rectangle (typically entirely subsumed by the white rectangle). The median is marked with a vertical brown rectangle.
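
As a rough sketch of the binning and summary statistics just described, the following Python takes a list of distances (in km) for the half-QSOs in one continent and mode. The function names are hypothetical. The 99% range for the mean uses a normal approximation (mean ± 2.576 standard errors), which is an assumption on my part and not necessarily how the plotted range was derived.

```python
import math
import statistics
from collections import Counter

BIN_WIDTH_KM = 500

def distance_distribution(distances_km):
    """Return {lower edge of bin in km: percentage of half-QSOs in that bin}."""
    bins = Counter(int(d // BIN_WIDTH_KM) for d in distances_km)
    total = sum(bins.values())
    return {b * BIN_WIDTH_KM: 100.0 * n / total for b, n in sorted(bins.items())}

def mean_with_range99(distances_km):
    """Return (mean, lower, upper), using a normal approximation for the 99% range."""
    mean = statistics.fmean(distances_km)
    std_error = statistics.stdev(distances_km) / math.sqrt(len(distances_km))
    half_width = 2.576 * std_error   # assumption: normal approximation
    return mean, mean - half_width, mean + half_width

# Made-up distances, for illustration only
distances = [100, 400, 600, 1200]
distribution = distance_distribution(distances)
mean, lower, upper = mean_with_range99(distances)
median = statistics.median(distances)
```

Bins that contain no half-QSOs are simply absent from the returned dictionary; those correspond to the bins coloured black in the plots.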

As usual, only QSOs for which logs have been provided by both parties, and which show no bust of either callsign or zone number, are included. Bins coloured black are those for which no QSOs are present at the relevant distance.

The resulting plots are reproduced below.









Half-QSOs Per Continent, 2005 to 2019


A simple way to display the activity in the CQ WW contests is to count the number of half-QSOs in each continent (a single QSO contains two half-QSOs, so a single QSO may contain two different continents or the same continent twice). We count half-QSOs, making sure to include each valid QSO only once (that is, if the same QSO appears in two submitted logs, it is counted only once).
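
A minimal sketch of the half-QSO count, assuming the QSOs have already been reduced to a list of distinct, valid QSOs, each represented simply as a pair of continents (a hypothetical representation chosen for the illustration):

```python
from collections import Counter

def half_qsos_per_continent(distinct_qsos):
    """Each QSO contributes one half-QSO to the continent at each end."""
    counts = Counter()
    for cont_a, cont_b in distinct_qsos:
        counts[cont_a] += 1
        counts[cont_b] += 1   # same continent twice for an intra-continental QSO
    return counts

# Made-up sample: one EU/EU QSO, one EU/NA QSO, one AS/EU QSO
half_qsos = half_qsos_per_continent([("EU", "EU"), ("EU", "NA"), ("AS", "EU")])
```

The total number of half-QSOs is always exactly twice the number of QSOs counted.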

If we do this for the entire contest without taking the individual bands into account, we obtain this figure:


The plot shows data for both SSB and CW contests over the period from 2005 to 2019. I include only QSOs for which both parties submitted a log and neither party bust either the zone or the call of the other party. The black triangles represent contests in which no half-QSOs were made from (or to) a particular continent. We can, of course, generate equivalent plots on a band-by-band basis:







As in prior years, the activity from EU so overwhelms these figures that in order to get a feel for the activity elsewhere, we need to move to a logarithmic scale:








Intra-Continental QSOs


We can also easily look at the percentage of QSOs that are between two stations on the same continent, and in particular between two EU stations:


So, for example, in CQ WW CW in 2019, a little over 30% of QSOs were within the same continent; about 24% of QSOs -- nearly a quarter -- were between two European stations. Yet the name, "CQ World Wide DX Contest", suggests that this is supposed to be a world-wide DX contest, not a European QSO party.
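
Percentages of this kind follow directly from a table of counts per continent pair. The sketch below assumes such counts are held in a dictionary keyed by an alphabetically sorted pair of continents; the function name and the counts are invented purely for illustration and are not contest data.

```python
def intra_continental_percentages(pair_counts):
    """Return (% of QSOs within one continent, % of QSOs between two EU stations)."""
    total = sum(pair_counts.values())
    same = sum(n for (a, b), n in pair_counts.items() if a == b)
    eu_eu = pair_counts.get(("EU", "EU"), 0)
    return 100.0 * same / total, 100.0 * eu_eu / total

# Invented counts, NOT real contest data
toy_counts = {("EU", "EU"): 12, ("NA", "NA"): 3, ("EU", "NA"): 35}
pct_same, pct_eu = intra_continental_percentages(toy_counts)
```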







Flogging a dead horse, on 160m fully 50% of QSOs were between two European stations.