The summary file, after being uncompressed, comprises a single large table of values separated by white space. The name of each column (there are nine columns in all) is on the first row. The columns are:
- band: a string that identifies the band pertaining to this row. Typical values are "15m" or "160m"; if a row contains data that are not distinguished by band, then the characters "NA" are used.
- mode: a string that identifies the mode pertaining to this row. Typical values are "CW" or "RTTY"; if a row contains data that are not distinguished by mode, then the characters "NA" are used.
- type: a single character that identifies whether the data on this row are for a period of a year ("A"), a month ("M") or a day ("D").
- year: the numeric four-digit value of the year to which the current row pertains.
- month: the numeric value of the month (January = 1, etc.) of the data in this row. If the data are of type A or D, then this element has the value "NA".
- doy: the numeric value of the day number of the year (January 1st = 1, etc.). The maximum value in each year is 366 (even if the year is not a leap year). In the event that the year is not a leap year, the data in columns 7, 8 and 9 will be set to 0 when doy is 366. If the data are of type A or M, then this element has the value "NA".
- posts: the total number of posts recorded by the RBN for the band, mode and period identified by the first six columns.
- calls: the total number of distinguishable calls recorded by the RBN for the band, mode and period identified by the first six columns.
- posters: the total number of distinguishable posters recorded by the RBN for the band, mode and period identified by the first six columns.
The first two lines of the file are:
band mode type year month doy posts calls posters
NA NA A 2009 NA NA 5007040 143724 151
This tells us that the first line of actual data comprises annual data for the year 2009, with no separation by band or mode. In 2009, there were 5,007,040 posts of 143,724 callsigns by 151 posters.
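As a rough illustration of the layout described above, here is a minimal Python sketch that reads the uncompressed, whitespace-separated text and turns each row into a dictionary. The function name `parse_summary` and the choice to map "NA" to `None` are assumptions made for this example, not part of the summary file or of any RBN tooling.

```python
import io

# Column names as given on the first row of the summary file.
COLUMNS = ["band", "mode", "type", "year", "month", "doy",
           "posts", "calls", "posters"]
# Columns that hold numbers whenever they are not "NA".
NUMERIC = {"year", "month", "doy", "posts", "calls", "posters"}


def parse_summary(lines):
    """Yield one dict per data row (hypothetical helper).

    "NA" becomes None; numeric columns become int; band, mode and
    type stay as strings.
    """
    rows = iter(lines)
    header = next(rows).split()
    if header != COLUMNS:
        raise ValueError(f"unexpected header: {header}")
    for line in rows:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        record = {}
        for name, value in zip(header, fields):
            if value == "NA":
                record[name] = None
            elif name in NUMERIC:
                record[name] = int(value)
            else:
                record[name] = value
        yield record


# The two lines quoted in the text, standing in for the real file.
sample = io.StringIO(
    "band mode type year month doy posts calls posters\n"
    "NA NA A 2009 NA NA 5007040 143724 151\n"
)
first = next(parse_summary(sample))
print(first["year"], first["posts"], first["calls"], first["posters"])
# prints: 2009 5007040 143724 151
```

For the real file, an open file object can be passed in place of `sample`, since the function only needs an iterable of text lines.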
The summary file allows rapid analysis of the RBN data. For example, a plot from an earlier post was originally generated rather tediously from the entire 50GB dataset. Using the summary file, the same plot can be generated in less than ten seconds.
The summary file can be used to generate a simple plot of the growth of the RBN since its inception just as quickly:
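The data behind a growth plot of this kind might be extracted along the following lines. This is only a sketch under stated assumptions: the function name `annual_posts` is hypothetical, the growth measure is taken to be the annual number of posts with no separation by band or mode, and the second and third data rows in the sample are made up purely to show the filtering.

```python
def annual_posts(lines):
    """Return {year: posts} for annual rows covering all bands and modes.

    Hypothetical helper: keeps only rows whose type is "A" and whose
    band and mode are both "NA".
    """
    totals = {}
    rows = iter(lines)
    next(rows, None)  # skip the header row
    for line in rows:
        fields = line.split()
        if len(fields) != 9:
            continue  # ignore blank or malformed lines
        band, mode, kind, year, _month, _doy, posts, _calls, _posters = fields
        if kind == "A" and band == "NA" and mode == "NA":
            totals[int(year)] = int(posts)
    return totals


sample = [
    "band mode type year month doy posts calls posters",
    "NA NA A 2009 NA NA 5007040 143724 151",  # row quoted in the text
    "20m CW A 2009 NA NA 1000 10 5",          # made-up row, illustration only
    "NA NA M 2009 3 NA 2000 20 6",            # made-up row, illustration only
]
print(annual_posts(sample))
# prints: {2009: 5007040}
```

The resulting year-to-posts mapping can then be handed to whatever plotting tool is preferred.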