Extracting light curves from a ZTF JSON file

Lynne Hillenbrand posted this problem: 
   "Given file ZTFS2020j.json how do I extract light curves?"  
[The file in question can be found on this website]

We use the Unix tool "jq" to solve this problem. This tool is widely
described as "the sed for JSON files".


jq is designed to filter JSON data sequentially. An internal pipe,
"|", moves the data to the next stage. The "." is the simplest
filter. It reproduces the input, and by default the output is
"pretty printed".

$ jq . ZTF2020j.json > a
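
As a minimal sketch of the pipe mechanism, the commands below run jq on
a tiny stand-in file. The file name sample.json and its contents are
assumptions made up for illustration; they are not part of the ZTF file.

```shell
# Tiny stand-in document (hypothetical; NOT the ZTF file)
echo '{"lc":[{"filter":2},{"filter":1}]}' > sample.json

# "." reproduces the input; -c prints it on one line instead of pretty-printing
identity=$(jq -c '.' sample.json)
echo "$identity"

# Each "|" hands the output of one filter to the next stage:
# select .lc, iterate over its elements, then pick .filter from each
filters=$(jq '.lc | .[] | .filter' sample.json)
echo "$filters"
```

The second command prints one filter value per line; '.lc[].filter' is
an equivalent shorthand for the same three stages.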

Review file "a" in "vi" or equivalent.  Alternatively, or in addition,
open this file in Firefox. You will get an excellent overview
of the file structure (use "collapse" and "expand", as needed).

Pedagogically, the correct command to determine the overall structure
of the data file is

$ jq '. | length,type' a          #determine the number of keywords

$ jq 'length,type' a              #but, thanks to defaults, can be simplified
18                                #18 keywords
"object"                          #not an array

$ jq 'keys_unsorted' a            #let us get a summary of keywords
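
Going one step further than the list of keywords, the sketch below
reports the type of the value stored under each keyword. It runs on a
made-up stand-in file (sample.json and its three keywords are
assumptions, not the contents of the ZTF file); to_entries and string
interpolation are standard jq features.

```shell
# Hypothetical stand-in with three keywords (NOT the ZTF file)
echo '{"lc":[{"filter":2}],"name":"demo","ra":314.5451081}' > sample.json

# to_entries turns the object into {key, value} pairs;
# "\(...)" interpolates each key and the type of its value into a string
summary=$(jq -r 'to_entries[] | "\(.key): \(.value | type)"' sample.json)
echo "$summary"
```

Replacing sample.json with "a" should give one such line per keyword of
the real file.
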

The photometric data

The light curve data is an array of objects associated with "lc". 
Let us get the vital statistics of "lc"

$ jq '.lc | type, length' a      #so lc is an array with three elements

$ jq '.lc[0]' a                  #let us review the first element

  {
    "_id": "jo1tkiex4sr41nr1t9cu2q5u",
    "telescope": "PO:1.2m",
    "instrument": "ZTF",
    "release": "ZTF_sources_20191101",
    "id": 11768202002814,
    "filter": 2,
    "lc_type": "temporal",
    "data": [
      {
        "catflags": 0,
        "chi": 0.779,
        "dec": 43.8856188,
        "expid": 58136421,
        "hjd": 2458335.86724,
        "mag": 15.664,
        "magerr": 0.015,
        "programid": 2,
        "ra": 314.5451081,
        "sharp": -0.032,
        "uexpid": 11768202058136420
      },
      {
        "catflags": 0,
        "chi": 1.419,
        "dec": 43.8856296,
        "expid": 58040686,
        "hjd": 2458334.90986,
        "mag": 15.701,
        "magerr": 0.015,
        "programid": 2,
        "ra": 314.5450891,
        "sharp": -0.035,
        "uexpid": 11768202058040686
      },
      ...

From an inspection of the above we conclude that the photometric
data are in an array called "data".
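
The same conclusion can be checked with jq itself. The sketch below
uses a stand-in file (sample.json is an assumed name) that holds only
the "hjd", "mag" and "magerr" values of the two epochs printed above.

```shell
# Stand-in file: one light curve with two epochs, trimmed to three fields
cat > sample.json <<'EOF'
{"lc":[{"filter":2,"data":[
  {"hjd":2458335.86724,"mag":15.664,"magerr":0.015},
  {"hjd":2458334.90986,"mag":15.701,"magerr":0.015}]}]}
EOF

# "data" should be an array ...
data_type=$(jq -r '.lc[0].data | type' sample.json)
# ... whose elements are objects; keys lists the fields of the first epoch
fields=$(jq -c '.lc[0].data[0] | keys' sample.json)
echo "$data_type $fields"
```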

$ jq '.lc[].data | length' a     #there are 77, 36 and 218 epochs
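
A compact table pairing each epoch count with the "id" and "filter" of
its dataset can be produced by jq alone. This is a sketch on a stand-in
file: sample.json is an assumed name, the "id" and "filter" values are
copied from the ZTF file, and the "data" arrays are shortened
placeholders of made-up length.

```shell
# Stand-in file: real id/filter values, placeholder "data" arrays (made-up lengths)
cat > sample.json <<'EOF'
{"lc":[
  {"id":11768202002814,"filter":2,"data":[{},{}]},
  {"id":11768201001469,"filter":1,"data":[{}]},
  {"id":10730521008859,"filter":1,"data":[{},{},{}]}]}
EOF

# One tab-separated row per dataset: id, filter, number of epochs
table=$(jq -r '.lc[] | [.id, .filter, (.data | length)] | @tsv' sample.json)
echo "$table"
```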

Using grep I extracted relevant summary lines for each dataset.
Clearly, filters and ownership lead to three datasets.

$ jq '.lc' a | grep -B5 "filter"
    "_id": "jo1tkiex4sr41nr1t9cu2q5u",       #first data set (index=0)
    "telescope": "PO:1.2m",
    "instrument": "ZTF",
    "release": "ZTF_sources_20191101",
    "id": 11768202002814,
    "filter": 2,
    "_id": "kjq6u5ch9sxv9rrdntwiato2",	     #second data set
    "telescope": "PO:1.2m",
    "instrument": "ZTF",
    "release": "ZTF_sources_20191101",
    "id": 11768201001469,
    "filter": 1,
    "_id": "w9614x1x03z8z7y67yr25f2g",      #third data set
    "telescope": "PO:1.2m",
    "instrument": "ZTF",
    "release": "ZTF_sources_20191101",
    "id": 10730521008859,
    "filter": 1,

Extracting the light curves

Let us agree to extract only "hjd", "mag" and "magerr".

$ jq '.lc[0] | .data[] | .hjd,.mag,.magerr' a | xargs -n 3 > 0.dat
$ jq '.lc[1] | .data[] | .hjd,.mag,.magerr' a | xargs -n 3 > 1.dat
$ jq '.lc[2] | .data[] | .hjd,.mag,.magerr' a | xargs -n 3 > 2.dat

On the other hand, an "all jq" solution (shown for the first dataset):

$ jq -r '.lc[0] | .data[]|[.hjd,.mag,.magerr] | join(" ")' a > 0.dat
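
A variant of the "all jq" approach writes tab-separated columns with a
header line, using jq's @tsv formatter. This is a sketch on a stand-in
file (sample.json and the output name 0.tsv are assumptions) containing
the two epochs listed earlier.

```shell
# Stand-in file: one light curve with two epochs, trimmed to three fields
cat > sample.json <<'EOF'
{"lc":[{"filter":2,"data":[
  {"hjd":2458335.86724,"mag":15.664,"magerr":0.015},
  {"hjd":2458334.90986,"mag":15.701,"magerr":0.015}]}]}
EOF

# The first array is the header row; the parenthesised stream yields one row per epoch.
# -r makes jq print the @tsv strings raw, without JSON quoting.
jq -r '["hjd","mag","magerr"], (.lc[0].data[] | [.hjd, .mag, .magerr]) | @tsv' sample.json > 0.tsv
cat 0.tsv
```

Tab-separated output avoids relying on xargs to regroup the values and
keeps each epoch on one line even if a field is missing.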

You can save typing with this one-liner

$ for ((i=0;i<$(jq ".lc|length" a);i++));do jq ".lc[$i]|.data[]|.hjd,.mag,.magerr" a|xargs -n 3 >$i.dat;done
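
The loop above re-reads the file once per dataset. A possible
single-pass alternative, sketched here under assumptions (the stand-in
file sample.json, and a second dataset whose single epoch is made up),
tags every row with its dataset index inside jq and lets awk route the
rows to the matching file.

```shell
# Stand-in file: dataset 0 holds the two epochs shown earlier,
# dataset 1 holds one made-up epoch
cat > sample.json <<'EOF'
{"lc":[
  {"data":[{"hjd":2458335.86724,"mag":15.664,"magerr":0.015},
           {"hjd":2458334.90986,"mag":15.701,"magerr":0.015}]},
  {"data":[{"hjd":2458336.5,"mag":16.1,"magerr":0.02}]}]}
EOF

# range(...) as $i walks the dataset indices; each row is prefixed with $i.
# awk then writes columns 2-4 to <index>.dat (0.dat, 1.dat, ...).
jq -r 'range(.lc | length) as $i | .lc[$i].data[] | [$i, .hjd, .mag, .magerr] | @tsv' sample.json |
  awk '{ print $2, $3, $4 > ($1 ".dat") }'
```

Replacing sample.json with "a" should produce the same 0.dat, 1.dat and
2.dat files as the loop.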