2019-08-05

Evaluating Station Contributions to the Reverse Beacon Network

The Reverse Beacon Network (RBN) [which is more akin to a television "network" than a classical computer network] comprises a number of stations that forward callsigns received over the air. The typical number of such forwarding stations [to which I often refer as "posters", as the word "node" has connotations that do not apply to these posting stations] is in the hundreds over the course of a year.

It is reasonable and illuminating to ask how one can evaluate the contribution of the posting stations to the value of the RBN as a whole. How best to do this is non-obvious, as the RBN has some interesting characteristics. In addition to issues associated with the vagaries of HF propagation between any two points at a given moment, particular difficulties are that: (i) there is no way to know the size of the network at any particular time, and (ii) callsigns posted by posting stations may contain errors.

Several defensible approaches can be taken to evaluating the contribution of individual posters; here I describe one that, following experimentation, I have tentatively selected as both useful and computationally efficient.

The RBN data for 2018, for example, comprise a file roughly 11.4 GB in size, containing some 132,000,000 individual posts; the algorithm below can completely process this file in about eleven minutes on a desktop PC (~6,800 bogomips; 16 GB total physical RAM), producing results for each individual HF band and for all HF bands as a whole.

The Algorithm


Define the following terms:

  • $B$ is a band; one of the HF bands 160 m ... 10 m
  • $Y$ is a year; one of 2009 ... 2018
  • ${}_{B}^{Y}P$ is the set of posts for band $B$ in year $Y$
  • ${}_{B}^{Y}S$ is the set of posters in ${}_{B}^{Y}P$
  • ${}_{B}^{Y}P_{i}$ is the set of posts by the $i$th member of ${}_{B}^{Y}S$
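
The sets defined above can be sketched as a simple grouping of posts, first by band and year and then by poster. This is only an illustration: the `Post` record, its field names, the `partition` helper and the callsigns in the sample data are all assumptions made for the example, not the layout of the actual RBN data file.

```python
from collections import defaultdict
from typing import NamedTuple


class Post(NamedTuple):
    """Hypothetical record for one RBN post; field names are assumptions."""
    poster: str      # callsign of the posting station
    dx: str          # callsign that was posted
    freq_khz: float  # posted frequency in kHz
    time_s: int      # time of the post, in seconds from some fixed origin
    band: int        # band in metres (160 ... 10)
    year: int        # year of the post


def partition(posts):
    """Return a mapping (band, year) -> poster -> list of that poster's posts.

    The keys of the inner mapping play the role of the set of posters S,
    and each inner list plays the role of P_i.
    """
    groups = defaultdict(lambda: defaultdict(list))
    for p in posts:
        groups[(p.band, p.year)][p.poster].append(p)
    return groups


# Invented sample data, for illustration only.
sample = [
    Post("N0AAA", "G0XXX", 14025.0, 100, 20, 2018),
    Post("N0BBB", "G0XXX", 14025.1, 110, 20, 2018),
    Post("N0AAA", "G0YYY", 7010.0, 200, 40, 2018),
    Post("N0AAA", "G0ZZZ", 14030.0, 500, 20, 2018),
]

groups = partition(sample)
print(sorted(groups[(20, 2018)]))          # posters seen on 20 m in 2018
print(len(groups[(20, 2018)]["N0AAA"]))    # size of P_i for one poster
```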

For the $i$th member of ${}_{B}^{Y}S$:

  •   set $N\_posts$ equal to the size of ${}_{B}^{Y}P_{i}$
  •   set $N\_empty$ to zero
  •   set $N\_corroborated$ to zero
  •   set $N\_same\_total$ to zero
  •   set $pvalue$ to zero

For each element $E$ of ${}_{B}^{Y}P_{i}$, construct a box $X$ containing the posts whose frequency lies within $\pm 1$ kHz of the frequency of $E$ and whose time lies within $\pm 60$ seconds of the time of $E$. Eliminate all posts by the $i$th member of ${}_{B}^{Y}S$ from $X$.

  • If no posts remain in $X$, increment the value of $N\_empty$.
  • Set $N\_same$ to the number of posts remaining in $X$ for which the call of the posted station exactly matches the call posted in $E$.
 
  If $N\_same > 0$:
 
  •   increment the value of $N\_corroborated$
  •   add $N\_same$ to $N\_same\_total$
  •   add $(1 / (N\_same + 1))$ to $pvalue$
 
After doing this for all the elements of ${}_{B}^{Y}P_{i}$, calculate:

  • $non\_empty\_mean = N\_corroborated / \max(N\_posts - N\_empty, 1)$
  • $V = pvalue + (N\_empty \times non\_empty\_mean)$

$V$ is then the value of the $i$th member of ${}_{B}^{Y}S$.
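
The steps above might be sketched in Python roughly as follows. This is a minimal illustration under stated assumptions, not the code used to produce the timings quoted earlier: the `Post` record and `poster_value` function are hypothetical names, the box boundaries are taken to be inclusive (the description does not say either way), and the sample callsigns are invented. Sorting the posts by time and locating the time window by bisection is simply one way to avoid scanning every post for every box.

```python
from bisect import bisect_left, bisect_right
from typing import NamedTuple


class Post(NamedTuple):
    """Hypothetical record for one post; field names are assumptions."""
    poster: str      # callsign of the posting station
    dx: str          # callsign that was posted
    freq_khz: float  # posted frequency in kHz
    time_s: int      # time of the post, in seconds from some fixed origin


def poster_value(posts, poster, df_khz=1.0, dt_s=60):
    """Return V for `poster`, given all posts for one band and one year."""
    ordered = sorted(posts, key=lambda p: p.time_s)
    times = [p.time_s for p in ordered]
    own = [p for p in ordered if p.poster == poster]

    n_posts = len(own)
    n_empty = 0
    n_corroborated = 0
    n_same_total = 0   # accumulated as described, but not used in V
    pvalue = 0.0

    for e in own:
        # Posts inside the time window (inclusive bounds are an assumption).
        lo = bisect_left(times, e.time_s - dt_s)
        hi = bisect_right(times, e.time_s + dt_s)
        # Keep those inside the frequency window, minus the poster's own.
        box = [p for p in ordered[lo:hi]
               if p.poster != poster
               and abs(p.freq_khz - e.freq_khz) <= df_khz]
        if not box:
            n_empty += 1
        n_same = sum(1 for p in box if p.dx == e.dx)
        if n_same > 0:
            n_corroborated += 1
            n_same_total += n_same
            pvalue += 1.0 / (n_same + 1)

    non_empty_mean = n_corroborated / max(n_posts - n_empty, 1)
    return pvalue + n_empty * non_empty_mean


# Invented sample data, for illustration only.
sample = [
    Post("N0AAA", "G0XXX", 14025.0, 100),  # corroborated by two others
    Post("N0BBB", "G0XXX", 14025.1, 110),
    Post("N0CCC", "G0XXX", 14024.9, 130),
    Post("N0AAA", "G0YYY", 14030.0, 500),  # box is empty
    Post("N0AAA", "G0ZZZ", 14040.0, 900),  # box non-empty, call differs
    Post("N0BBB", "G0QQQ", 14040.0, 905),
]

print(f"{poster_value(sample, 'N0AAA'):.4f}")  # prints 0.8333
```

In this sample the poster has three posts. One is corroborated by two other stations and contributes $1/3$ to $pvalue$, one falls in an empty box, and one has a non-empty box with no matching call. Hence $non\_empty\_mean = 1/2$ and $V = 1/3 + (1 \times 1/2) \approx 0.83$.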
