Converting data from PhysioNet JSON to CSV

PhysioNet data is available in binary (.dat) form, but the PhysioNet website also provides records as JSON.

These records contain the samples (measurements) from 12-lead ECGs recorded at 1 kS/s, i.e. 1,000 samples per second.
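The jq filters in this post read each signal's name and samp array from under .fetch.signal. As a rough sketch of that shape, the snippet below writes a tiny stand-in file for trying the commands without downloading a record. The file name sample.json is hypothetical, the layout is inferred from those filters rather than from any PhysioNet specification, and a real record will likely carry more fields and all twelve leads.

```shell
# Write a tiny, hypothetical stand-in record (two leads, three samples each).
# The layout is an assumption inferred from the jq filters used in this post.
cat > sample.json <<'EOF'
{
  "fetch": {
    "signal": [
      { "name": "i",  "samp": [-1458, 20, -3] },
      { "name": "ii", "samp": [-387, -18, -29] }
    ]
  }
}
EOF
```

Any of the jq commands in this post can then be pointed at sample.json instead of input.json.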

To convert these to CSV, we can use good ol’ jq:

# create the header row: a "sample" index column, then one column per signal name
jq -r '["sample"] + [.fetch.signal[].name] | @csv' input.json > output.csv

# add data: tag every value with its sample index and signal name,
# then merge all values that share an index into a single CSV row
jq -r '.fetch.signal |
    [.[] as $item | $item.samp | to_entries
        | .[] | {"sample": .key, ($item.name): .value}]
    | group_by(.sample) | .[] | add | [.[]] | @csv' input.json >> output.csv
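The group_by step above builds one small object per value before merging them, which can get slow on long recordings. As an alternative sketch (not the method used above), jq's built-in transpose can turn the per-signal arrays directly into per-sample rows. The function name json_to_csv is hypothetical, and the sketch assumes every samp array has the same length; transpose pads shorter arrays with null, which @csv prints as an empty field.

```shell
# Hypothetical helper: print a PhysioNet-style JSON record as CSV on stdout.
# Assumes each object in .fetch.signal has a "name" and a "samp" array,
# and that all samp arrays are the same length.
json_to_csv() {
    jq -r '
        .fetch.signal as $signals
        # header row: "sample" plus one column per signal name
        | (["sample"] + [$signals[].name]),
          # data rows: transpose turns per-signal arrays into per-sample rows,
          # and to_entries supplies the sample index as .key
          ([$signals[].samp] | transpose | to_entries[] | [.key] + .value)
        | @csv
    ' "$1"
}
```

Usage would be: json_to_csv input.json > output.csv. This writes the header and the data in a single pass, so there is no separate append step.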

This produces a file in a format like the following, where the first column is the zero-based sample index:

"sample","i","ii","iii","avr","avl","avf"
0,-1458,-387,1073,923,-1266,342
1,20,-18,-39,-2,30,-29
2,-3,-29,-26,17,12,-27
3,-13,-6,7,9,-11,1
4,7,30,23,-18,-7,26
...