Some notes on EN4 ocean data

EN4 data is available here in netCDF format. It’s a data set of salinity and temperature readings gathered from beacons across the planet, recorded on a grid with a resolution of 1°.



Null fields are plentiful. You’re not about to find a beacon in the middle of the Sahara. Nulls are recorded as -32768 (bottom end of a signed 16-bit int). The _FillValue attribute on each field records the fill value in use.

Note that valid_min and valid_max are available on field attributes. A sensible programmer would use these to filter out the bad data (NB to self).

Sample of the top of the ‘salinity’ section:

"salinity": {
  "shape": ["time", "depth", "lat", "lon"],
  "type": "float",
  "attributes": {
    "long_name": "salinity",
    "units": "psu",
    "standard_name": "sea_water_salinity",
    "_FillValue": -32768,
    "add_offset": 0,
    "scale_factor": 1,
    "valid_min": -5,
    "valid_max": 48
  },
  "data": [[[[-32768, -32768, -32768, -32768, -32768, -32768, -32768 ...
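
Following up on the valid_min / valid_max note above, here is a rough Python sketch of what that filtering might look like once the data has been parsed from JSON into nested lists. The function name mask_invalid and the tiny sample values are made up for illustration; only the attribute names and their values come from the salinity block.

```python
from typing import Any

# Attribute values copied from the salinity sample in these notes.
SALINITY_ATTRS = {"_FillValue": -32768, "valid_min": -5, "valid_max": 48}


def mask_invalid(data: Any, attrs: dict) -> Any:
    """Recursively replace fill values and out-of-range readings with None."""
    if isinstance(data, list):
        return [mask_invalid(item, attrs) for item in data]
    if data is None or data == attrs["_FillValue"]:
        return None
    if data < attrs["valid_min"] or data > attrs["valid_max"]:
        return None
    return data


# Hypothetical readings, not real EN4 values.
sample = [[-32768, 35.1], [50.0, -5]]
print(mask_invalid(sample, SALINITY_ATTRS))  # [[None, 35.1], [None, -5]]
```

The recursion means the same function works on the full four-level (time, depth, lat, lon) nesting as well as on a single row.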

time_bnds indicates the bounds of the data (date from <-> date to); in this case the ‘time’ is the number of days since 1800-01-01. For example, 80353 is 2020-01-01 and 80384 is 2020-02-01, so a time_bnds of [80353, 80384] corresponds to the month of January 2020.

(Assuming the reason the field is being stored as a float is to allow for multiple measurements to be taken in a day?)
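
To sanity-check the day counts, a small Python sketch can convert them back to dates. This assumes the ordinary Gregorian calendar (the file's calendar attribute isn't shown in these notes), and en4_time_to_datetime is just an illustrative name. Because timedelta accepts floats, fractional days work too.

```python
from datetime import datetime, timedelta

# 'days since 1800-01-01', per the notes above.
EPOCH = datetime(1800, 1, 1)


def en4_time_to_datetime(days: float) -> datetime:
    """Convert a (possibly fractional) day count since 1800-01-01 to a datetime."""
    return EPOCH + timedelta(days=days)


start, end = (en4_time_to_datetime(d) for d in (80353, 80384))
print(start.date(), end.date())  # 2020-01-01 2020-02-01
```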

depth refers to depth of water below the buoy; it seems as though, when the water is shallow, measurements are sometimes disregarded (too variable). In this data set there are 42 depth levels, each with its own bounds.

Loading data

Using ncks (the netCDF ‘kitchen sink’, part of NCO), blocks of data can be pulled out of the data file easily. The --json switch lets us feed the data through something ‘friendlier’ like jq.

# never mind the hacky sed, just give us some basic
# stuff to play with for now (in.nc is a placeholder
# for the downloaded EN4 file):
ncks --json -v salinity in.nc | \
	sed 's/-32768/null/g' > output
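
Once the output file exists, it can be read back with nothing more than Python's json module. The sketch below is a guess at the layout: it assumes the salinity block sits under a top-level "variables" key and falls back to the top level if that key is missing. The helper name load_variable and the inline sample text are invented stand-ins, not real EN4 values.

```python
import json


def load_variable(text: str, name: str) -> dict:
    """Pull one variable's block (shape, type, attributes, data) out of the JSON."""
    doc = json.loads(text)
    # Assumption: variables live under "variables"; otherwise look at the top level.
    variables = doc.get("variables", doc)
    return variables[name]


# Tiny synthetic stand-in for the real file contents (nulls already substituted).
sample_text = """
{"variables": {"salinity": {
    "shape": ["time", "depth", "lat", "lon"],
    "type": "float",
    "attributes": {"units": "psu"},
    "data": [[[[null, 34.7]]]]
}}}
"""

salinity = load_variable(sample_text, "salinity")
print(salinity["shape"], salinity["data"][0][0][0])
```

For the real thing, replace sample_text with the contents of the output file, e.g. open("output").read(). JSON null comes back as None, which is why the sed substitution is enough to get started.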
Software - list of tools for data extraction - list of all freely available tools for netCDF data extraction