GHRSST is a high resolution globally-complete oceanographic dataset of sea surface temperature. It is strictly a 2d product, published daily since July 2002 and records temperature in Kelvin at the ocean surface, a blended product of observations, satellite data, and models. The dimension of the grid in pixels is 36000x17999. The storage format is NetCDF, with ‘time’ as an unlimited dimension recorded in the file - each file has a single 3D time step in the typical way that NetCDF includes and “UNLIMITED” dimension that will be appended to continually in future.
GHRSST has a bit of a broken grid, the coordinates are degenerate rectilinear (i.e. completely described by the grid in xmin,xmax,ymin,ymax and the dimensions 36000x17999 but they store the longitude and latitude 1D coordinate arrays materalized as 32-bit floating point with resultant noise). It’s not clear how the grid should be aligned exactly, and the metadata attributes list -180,180, -90,90 as the valid range but this can’t be correct at 0.01 degree resolution for that size grid).
See the (ncdump -h listed below).
## replace this local file with the one at the new earthdata store commented out (but you need your earthdata creds, or Authorization Bearer token)ncsrc <-"/rdsi/PUBLIC/raad/data/podaac-opendap.jpl.nasa.gov/opendap/allData/ghrsst/data/GDS2/L4/GLOB/JPL/MUR/v4.1/2002/152/20020601090000-JPL-L4_GHRSST-SSTfnd-MUR-GLOB-v02.0-fv04.1.nc"## https://cmr.earthdata.nasa.gov/virtual-directory/collections/C1996881146-POCLOUD/temporal/2002/05/31## https://archive.podaac.earthdata.nasa.gov/podaac-ops-cumulus-protected/MUR-JPL-L4-GLOB-v4.1/20020601090000-JPL-L4_GHRSST-SSTfnd-MUR-GLOB-v02.0-fv04.1.nc## read the lon,lat 1D arrays from the file (every file stores these 36000 lons and 17999 lats with ostensibly 0.01 spacinglibrary(RNetCDF)con <-open.nc(ncsrc)
open the file
expand file name
DONE: expand file name
lat <-var.get.nc(con, "lat")lon <-var.get.nc(con, "lon")close.nc(con)## weirdly right-edge alignedprint(range(lon))
[1] -179.99 180.00
## these are ok, but we miss half a cell from -90/90 (hence the 17999 thing, but why did they do that?)print(range(lat))
[1] -89.99 89.99
## there's noise in the valuesplot(diff(lon), pch =".")format(range(diff(lon)), digits =16)
[1] "0.0099945068359375" "0.0100097656250000"
## and it's way more noise than in Float64points(diff(fixlon <-seq(-179.995, 179.995, by =0.01)), pch =".", col ="red")
These points are the implicit pairs by expanding lon,lat from the file and compare what GDAL derives as the extent (red), and the vertical black lines is -180,180,-90,90. The hatched line is what we think should be assigned (an implicit missing row, split half at the top and bottom) - close enough to GDAL but with the longitude fixed and the extent made tidy (resolution 0.01, aligned to 0).
the a_scale/offset mean we get Celsius from the Int16 rather than Fahrenheit, the a_ullr fixes the regular extent and avoids all the numeric fuzz and redundancy of the lon,lat arrays, sd_name means we don’t need the ‘NETCDF:dsn:variable’ prefix:suffix stuff for the GDAL subdataset (but only since GDAL 3.9.0), and a_srs means we can churn through the warper for lovely new grids of any extent and dimension and crs we like.
Another option would be to specify a target grid and let GDAL warp from the geolocation arrays.
3.2 Geolocation arrays
gdalinfo on the file reports
system(sprintf("gdalinfo %s", ncsrc))
The the GEOLOCATION section, it knows about those 1D arrays.
So then we can do
subdataset modify metadata specify grid and resample method
3.3 Demonstration of the Float32 precision problem
Essentially floating point numbers struggle with precision for longitude and latitude (we might rescale closer to integers, but that’s tricky for metadata).
to bring home the point about the single precision (I think this is right, appreciate if anyone points out a mistake or misunderstanding)
Generate the coord values first in double precision, then in single (in R we have to return doubles, but the calc is done in float for the second one).
library(cpp11)cpp_function('doubles coord64(double start, int len, double step) { writable::doubles out = writable::doubles(len); for (int i = 0; i < len; i++) { out[i] = start + i * step; } return out ;} ')range(lon64 <-coord64(-179.99, 36000, 0.01))
[1] -179.99 180.00
format(range(diff(lon64)), digits =16)
[1] "0.009999999999990905" "0.010000000000047748"
range(lat64 <-coord64(-89.99, 17999, 0.01))
[1] -89.99 89.99
format(range(diff(lat64)), digits =16)
[1] "0.009999999999990905" "0.010000000000019327"
cpp_function('doubles coord32(float start, int len, float step) { writable::doubles out = writable::doubles(len); for (int i = 0; i < len; i++) { out[i] = start + i * step; } return out ;}')range(lon32 <-coord32(-179.99, 36000, 0.01))