## 2017-08-09

### Further Sanity Checking of the LANL GPS Charged-Particle Dataset

Prior posts in this series may be found at:
In this post I continue looking at the values of fields for the CXD experiment carried on ns53 through ns73.

#### Sanity check for collection_interval

The values for collection_interval seem reasonable. (Recall that we processed these values earlier, at stage 4 and stage 10.)

#### Sanity check for year

The values for year seem reasonable.

#### Sanity check for decimal_year

The documentation for decimal_year states:

 Column Variable name type Dim. description 24 decimal_year double 1 decimal year = year + (decimal_day-1.0)/(days in year)

But if we look at the values in the files, we see an immediate problem. The values are formatted as, for example:
2.007879e+03
That is, the smallest increment is 0.000001e+03 years, which is 0.001 years, or somewhat less than nine hours. This is obviously useless.

The decimal_year field is redundant, as it can be derived, as stated in the documentation, from the year field and the decimal_day field. But if the decimal_year field is going to be present, it should at least have the same value as the result of the calculation in the documentation. There is obvious use for this field if it has an accurate value, so that brings us to define stage 26:

#### Stage 26: Recalculate decimal_year

The values of decimal_day are recorded to 10-6 day. So decimal_year should be recorded to (roughly) 10-9 year. We can accomplish the reformatting easily, by applying the following script, do-stage-26.py,  to each file:

#!/usr/bin/env python
# -*- coding: utf8 -*-

import re
import sys

filename = sys.argv[1]

with open(filename) as records:
for line in records:
fields = line.split()
decimal_day_str = fields[0]
year_str = fields[22]
decimal_day_value = float(decimal_day_str)
year_value = int(year_str)

days_in_year = 365

if ( (year_value == 2004) or (year_value == 2008) or (year_value == 2012) or (year_value == 2016)):
days_in_year = 366

decimal_year_value = year_value + (decimal_day_value - 1.0) / days_in_year

decimal_year_str = '{:14.9f}'.format(decimal_year_value)

fields[23] = decimal_year_str
newline = " ".join(fields)

print newline

Applying this to each file:

for file in ns[567]*
do
./do-stage-26.py $file | tr -s " " | sed 's/$//' > ../gps-stage-26/$file done The resulting maximum and minimum values of decimal_year look like this: decimal_year Satellite Minimum Maximum 53 2005.769864123 2016.999996964 54 2001.169868310 2016.999995825 55 2007.879454718 2017.000000000 56 2003.145206589 2016.999995066 57 2008.051914464 2017.000000000 58 2006.939732433 2016.999995825 59 2004.256832087 2016.999999620 60 2004.562844536 2016.999993169 61 2004.887981525 2016.999998104 62 2010.465755041 2016.999998325 63 2011.578088408 2016.999999653 64 2014.164385844 2016.923493000 65 2012.784155790 2016.999996205 66 2013.435620655 2016.999996964 67 2014.394527493 2016.999996396 68 2014.624659342 2016.999995003 69 2014.912332318 2016.999999716 70 2016.120224178 2016.999992601 71 2015.276714230 2016.999999874 72 2015.583565671 2016.999995825 72 2015.871238110 2016.999997628 #### Sanity check for SVN_number The documentation for SVN_number states:  Column Variable name type Dim. description 25 SVN_number int 1 SVN number of satellite The values for SVN_number seem to be correct, except for the obvious silliness that the field name, when expanded, is Satellite_Vehicle_Number_number, and the description includes the nonsensical term SVN number. Consequently I simply redefine this field as:  Column Variable name type Dim. description 25 SVN int 1 Satellite Vehicle Number #### Sanity check for b_coord_radius The maximum and minimum values for b_coord_radius are: b_coord_radius Satellite Minimum Maximum 53 6.828054e-01 4.200857e+00 54 8.312735e-01 4.211072e+00 55 8.557934e-01 4.198010e+00 56 7.165368e-01 4.202026e+00 57 7.242072e-01 4.171993e+00 58 7.577640e-01 4.188587e+00 59NA 4.214008e+00 60 NA 4.205339e+00 61 NA 4.223202e+00 62 1.304388e+00 4.193189e+00 63 6.851737e-01 4.209296e+00 64 1.461084e+00 4.209824e+00 65 1.361361e+00 4.202571e+00 66 1.049070e+00 4.182467e+00 67 7.584448e-01 4.211562e+00 68 7.835127e-01 4.211309e+00 69 7.074734e-01 4.169802e+00 70 8.317451e-01 4.207506e+00 71 1.880229e+00 4.171600e+00 72 1.366415e+00 4.170103e+00 72 7.233613e-01 4.213620e+00 There are two obvious problems with these values: some satellites contain invalid (NA) values; and the precision of the field varies as a function of the value of the field. We can remove all the records that contain the an invalid value easily enough, which gives us stage 27. #### Stage 27: Remove records with invalid values of b_coord_radius for file in ns[567]* do awk '$26!="NA" { print $0 }'$file > ../gps-stage-27/$file done This gives us: b_coord_radius Satellite Minimum Maximum 53 6.828054e-01 4.200857e+00 54 8.312735e-01 4.211072e+00 55 8.557934e-01 4.198010e+00 56 7.165368e-01 4.202026e+00 57 7.242072e-01 4.171993e+00 58 7.577640e-01 4.188587e+00 59 6.926215e-01 4.214008e+00 60 8.069324e-01 4.205339e+00 61 7.764630e-01 4.223202e+00 62 1.304388e+00 4.193189e+00 63 6.851737e-01 4.209296e+00 64 1.461084e+00 4.209824e+00 65 1.361361e+00 4.202571e+00 66 1.049070e+00 4.182467e+00 67 7.584448e-01 4.211562e+00 68 7.835127e-01 4.211309e+00 69 7.074734e-01 4.169802e+00 70 8.317451e-01 4.207506e+00 71 1.880229e+00 4.171600e+00 72 1.366415e+00 4.170103e+00 72 7.233613e-01 4.213620e+00 #### Stage 28: Reformat values of b_coord_radius As noted above, the accuracy of the numbers reported varies as a function of the value of the field. This is not a desirable situation. Since values greater than unity are reported with a precision of a millionth of a terrestrial radius, there seems no point in reporting values at greater precision (i.e., less than about 6m), especially since such a precision surely vastly exceeds the accuracy of the magnetic field model on which the numbers are based. Consequently, we can reformat the values so that they have a consistent precision of one millionth of the terrestrial radius. We define a script called do-stage-28.py: #!/usr/bin/env python # -*- coding: utf8 -*- import re import sys filename = sys.argv[1] with open(filename) as records: for line in records: fields = line.split() value_str = fields[25] # b_coord_radius value = float(value_str) output_format = '{:8.6f}' new_value_str = output_format.format(value) fields[25] = new_value_str newline = " ".join(fields) print newline And operate on the files: for file in ns[567]* do ./do-stage-28.py$file | tr -s " " | sed 's/ $//' > ../gps-stage-28/$file
done

Now we have:

Satellite Minimum Maximum
53 0.682805 4.200857
54 0.831273 4.211072
55 0.855793 4.198010
56 0.716537 4.202026
57 0.724207 4.171993
58 0.757764 4.188587
59 0.692622 4.214008
60 0.806932 4.205339
61 0.776463 4.223202
62 1.304388 4.193189
63 0.685174 4.209296
64 1.461084 4.209824
65 1.361361 4.202571
66 1.049070 4.182467
67 0.758445 4.211562
68 0.783513 4.211309
69 0.707473 4.169802
70 0.831745 4.207506
71 1.880229 4.171600
72 1.366415 4.170103
72 0.723361 4.213620

#### Sanity check for b_coord_height

The maximum and minimum values for b_coord_height are:

b_coord_height
Satellite Minimum Maximum
53 -4.105137e+00 4.082754e+00
54 -4.011254e+00 4.119900e+00
55 -4.097133e+00 3.981823e+00
56 -4.091484e+00 4.110250e+00
57 -4.076309e+00 4.113386e+00
58 -4.051769e+00 4.112167e+00
59 -4.092374e+00 4.107748e+00
60 -3.965876e+00 4.114727e+00
61 -4.095934e+00 4.074463e+00
62 -3.918038e+00 3.950894e+00
63 -4.087862e+00 4.103668e+00
64 -3.801516e+00 3.899932e+00
65 -3.946078e+00 3.755339e+00
66 -3.898498e+00 4.027861e+00
67 -4.095983e+00 4.105099e+00
68 -4.039725e+00 4.085420e+00
69 -4.077435e+00 4.103233e+00
70 -4.077986e+00 4.069141e+00
71 -3.667803e+00 3.713735e+00
72 -3.727146e+00 3.938838e+00
72 -4.134635e+00 4.132903e+00

These require the same kind of massaging as b_coord_radius.

#### Stage 29: Reformat values of b_coord_height

We define do-stage-29.py:

#!/usr/bin/env python
# -*- coding: utf8 -*-

import re
import sys

filename = sys.argv[1]

with open(filename) as records:
for line in records:
fields = line.split()
value_str = fields[26]      # b_coord_height
value = float(value_str)

output_format = '{:8.6f}'

new_value_str = output_format.format(value)

fields[26] = new_value_str
newline = " ".join(fields)

print newline

and apply it to the files:

for file in ns[567]*
do
./do-stage-29.py $file | tr -s " " | sed 's/$//' > ../gps-stage-29/$file done with the result: b_coord_height Satellite Minimum Maximum 53 -4.105137 4.082754 54 -4.011254 4.119900 55 -4.097133 3.981823 56 -4.091484 4.110250 57 -4.076309 4.113386 58 -4.051769 4.112167 59 -4.092374 4.107748 60 -3.965876 4.114727 61 -4.095934 4.074463 62 -3.918038 3.950894 63 -4.087862 4.103668 64 -3.801516 3.899932 65 -3.946078 3.755339 66 -3.898498 4.027861 67 -4.095983 4.105099 68 -4.039725 4.085420 69 -4.077435 4.103233 70 -4.077986 4.069141 71 -3.667803 3.713735 72 -3.727146 3.938838 72 -4.134635 4.132903 #### Sanity check for magnetic_longitude The maximum and minimum values for magnetic_longitude are: magnetic_longitude Satellite Minimum Maximum 53 4.185175e-03 3.599997e+02 54 1.637927e-04 3.599992e+02 55 3.244386e-05 3.600000e+02 56 3.396840e-04 3.599996e+02 57 5.629913e-04 3.599997e+02 58 1.443975e-03 3.599997e+02 59 2.364732e-05 3.600000e+02 60 3.059084e-04 3.599989e+02 61 6.888084e-04 3.599998e+02 62 9.731220e-04 3.599994e+02 63 1.367992e-03 3.599995e+02 64 3.209574e-04 3.599982e+02 65 1.819354e-03 3.599992e+02 66 8.708465e-04 3.599990e+02 67 1.153824e-05 3.599956e+02 68 6.599287e-04 3.599986e+02 69 5.731681e-03 3.599997e+02 70 3.041434e-02 3.599747e+02 71 5.421115e-04 3.599966e+02 72 1.369661e-03 3.599986e+02 72 6.566200e-04 3.599996e+02 We apply changes similar to stage 23, as follows. #### Stage 30: reformat values of magnetic_longitude We define do-stage-30.py: #!/usr/bin/env python # -*- coding: utf8 -*- import re import sys filename = sys.argv[1] field_nr = 27 # magnetic_longitude with open(filename) as records: for line in records: fields = line.split() value_str = fields[field_nr] # longitude value = float(value_str) if value < 0: value += 360 if value >= 360: value -= 360 new_value_str = '{:9.4f}'.format(value) fields[field_nr] = new_value_str newline = " ".join(fields) print newline And apply it: for file in ns[567]* do ./do-stage-30.py$file | tr -s " " | sed 's/ $//' > ../gps-stage-30/$file
done

With the result:

magnetic_longitude
Satellite Minimum Maximum
53 0.0042 359.9997
54 0.0002 359.9992
55 0.0000 359.9998
56 0.0003 359.9996
57 0.0006 359.9997
58 0.0014 359.9997
59 0.0000 359.9999
60 0.0003 359.9989
61 0.0007 359.9998
62 0.0010 359.9994
63 0.0014 359.9995
64 0.0003 359.9982
65 0.0018 359.9992
66 0.0009 359.9990
67 0.0000 359.9956
68 0.0007 359.9986
69 0.0057 359.9997
70 0.0304 359.9747
71 0.0005 359.9966
72 0.0014 359.9986
72 0.0007 359.9996

We  checkpoint the stage 30 dataset. Th file has the MD5 checksum: eba369c61f7974a636358758e1035ae9.

The data table for ns41 and ns48 still looks like this:

Column Variable name type Dim. Description
1 decimal_day double 1 GPS time -- a number from 1 (1-Jan 00:00) to 366 (31-Dec 24:00) or 367 in leap years
2 Geographic_Latitude double 1 Latitude of satellite (deg)
3 Geographic_Longitude double 1 Longitude of satellite (deg)
5-12 rate_electron_measured double 8 Measured rate (Hz) in each of the 8 BDD electron channels (E1-E8)
13-20 rate_proton_measured double 8 Measured rate (Hz) in each of the 8 BDD proton channels (P1-P8)
21 collection_interval int 1 dosimeter collection period (seconds)
22 year int 1 year (e.g. 2015)
23 decimal_year double 1 decimal year = year + (decimal_day-1.0)/(days in year)
24 svn_number int 1 SVN number of satellite
26 b_coord_height double 1 height above the earth's dipole equatorial plane (earth radii)
27 magnetic_longitude double 1 Magnetic longitude (degrees)
28 L_shell double 1 L_shell (earth radii) -- I do not clearly understand the origin of the calculation, but it seems to be a dipole field/T-89
29 bfield_ratio double 1 Bsatellite/Bequator
30 local_time double 1 magnetic local time (0-24 hours)
31 b_sattelite double 1 B field at satellite (gauss)
32 b_equator double 1 B field at equator (on this field line I think) (gauss)
33-40 electron_background double 8 estimated background in electron channels E1-E8 (Hz)
41-48 proton_background double 8 estimated background in proton channels P1-P8 (Hz)
49 proton_activity int 1 =1 if there is significant proton activity
50 electron_temperature double 1 electron temperature from a one Maxwellian fit (MeV)
51 electron_density_fit double 1 electron number density from a one Maxwellian fit (cm-3)
52-59 model_counts_electron_fit double 8 E1-E8 rates from the 2-parameter Maxwellian fit to the electron data
60-67 dtc_counts_electron double 8 Dead time corrected electron rates (from data, not fit)
68-97 integral_flux_instrument double 30 (based on 2 parameter Maxwellian fit) integral of electron flux above integral_flux_energy[i] particles/(cm2sec)
98-127 integral_flux_energy double 30 energies for the integral of integral_flux_instrument (MeV)
128-142 electron_diff_flux_energy double 15 energies for the fluxes in electron_diff_flux_energy (MeV)
143-157 electron_diff_flux double 15 (based on 2 parameter Maxwellian fit) electron flux at energies electron_diff_flux[i] (particle/(cm2 sr MeV sec))

And for the remaining satellites we now have (I have clarified the descriptions of the b_coord_radius and b_coord_height fields):

Column Variable name type Dim. description
1 decimal_day double 1 GPS time: a decimal number in the range [1, 367) in leap years or [1, 366) otherwise, representing the day of the year (1-Jan 00:00 to 31-Dec 24:00).
2 Geographic_Latitude double 1 Latitude of satellite (°, N +ve)
3 Geographic_Longitude double 1 Longitude of satellite (°, E +ve, measured from Greenwich meridian)
4 Rad_Re double 1 Distance from centre of Earth, in units of Earth radii.
5-15 rate_electron_measured double 11 Measured rate (Hz) in each of the 11 CXD electron channels
16-20 rate_proton_measured double 5 Measured rate (Hz) in each of the 5 CXD proton channels (P1-P5)
21 LEP_thresh double 1 LEP threshold in E1 channels, in keV
22 collection_interval int 1 dosimeter collection period (seconds)
23 year int 1 year (e.g. 2015)
24 decimal_year double 1 decimal year = year + (decimal_day-1.0) / (days in year)
25 SVN int 1 Satellite Vehicle Number
26 b_coord_radius double 1 Distance from dipole axis, in units of Earth radii.
27 b_coord_height double 1 Distance from dipole equatorial plane, in units of Earth radii (N +ve).
28 magnetic_longitude double 1 Magnetic longitude (degrees)
29 L_shell double 1 L shell: McIlwain calculation according to model with T89 External Field, IGRF Internal Field.
30 L_LGM_TS04IGRF double 1 LanlGeoMag L-shell McIlwain calculation, TS04 External Field, IGRF Internal Field.
31 L_LGM_OP77IGRF double 1 LanlGeoMag L-shell McIlwain calculation, OP77 External Field, IGRF Internal Field (not currently filled)
32 L_LGM_T89CDIP double 1 LanlGeoMag L-shell McIlwain calculation, T89 External Field, Centered Dipole Internal Field
33 bfield_ratio double 1 Bsatellite/Bequator
34 local_time double 1 magnetic local time (0-24 hours)
35 utc_lgm double 1 UTC (0-24 hours)
36 b_sattelite double 1 B field at satellite (gauss)
37 b_equator double 1 B field at equator (on this field line I think) (gauss)
38-48 electron_background double 11 estimated background in electron channels E1-E11 (Hz)
49-53 proton_background double 5 estimated background in proton channels P1-P5 (Hz)
54 proton_activity int 1 =1 if there is significant proton activity
55 proton_temperature_fit double 1 characteristic momentum -- R0 in the expression given above (MeV/c)
56 proton_density_fit double 1 N0 parameter in fit to proton flux ((protons/(cm2 sec sr MeV))
57 electron_temperature_fit double 1 electron temperature from a one Maxwellian fit (MeV)
58 electron_density_fit double 1 electron number density from a one Maxwellian fit (cm-3)
59-63 model_counts_proton_fit_pf double 5 P1-P5 rate from proton fit (using proton_temperature_fit, proton_density_fit)
64-74 model_counts_electron_fit double 11 E1-E11 rates from the 9-parameter electron flux model
75-80 proton_integrated_flux_fit double 6 integral of proton flux (based on fit) above 10, 15.85, 25.11, 30, 40, 79.43 MeV (proton kinetic energy)
81-110 integral_flux_instrument double 30 (based on 9 parameter fit) integral of electron flux above integral_flux_energy[i] particles/(cm2 sec)
111-140 integral_flux_energy double 30 energies for the integral of integral_flux_instrument (MeV)
141-155 electron_diff_flux_energy double 15 energies for the fluxes in electron_diff_flux_energy (MeV)
156-170 electron_diff_flux double 15 (based on 9 parameter fit) electron flux at energies electron_diff_flux[i] (particle/(cm2 sr MeV sec))
171-179 Efitpars double 9 fit parameters for 9 parameter electron fit

Finally, here are the number of records for each satellite at the end of the processing for stage 30:

Satellite Stage 30 Records
ns41 1,990,340
ns48 1,105,300
ns53 1,331,017
ns54 1,938,603
ns55 1,054,569
ns56 1,680,369
ns57 1,082,028
ns58 1,174,629
ns59 1,513,323
ns60 1,492,497
ns61 1,467,283
ns62 774,976
ns63 651,078
ns64 343,643
ns65 480,017
ns66 446,139
ns67 327,174
ns68 304,964
ns69 262,021
ns70 110,260
ns71 220,694
ns72 181,519
ns73 145,292