Parallel computation with Dask for InSAR applications¶
A practical introduction to parallel computation for DePSI developers.
Why Dask?¶
Disclaimer:
- It is not always faster than numpy;
- It is not always more efficient than numpy;
Dask is a solution for scaling up your computation with less pain:
- It operates on larger-than-memory data;
- It uses HPC resources efficiently;
- It requires only minimal changes to your numpy code.
Dask in a nutshell¶
Configure scheduler¶
This is to tell Dask how to utilize the available resources.
Check the Dask documentation on scheduling for more details.
If you execute this notebook on your local machine, choose one configuration from Single-machine scheduler or Distributed scheduler; do not execute all of them, since the last configuration will overwrite the previous ones.
Single-machine scheduler¶
import dask
# threaded scheduler, usually the default option
dask.config.set(scheduler="threads")
# process scheduler, not recommended by dask. Use local cluster instead.
dask.config.set(scheduler="processes")
# force single-threaded execution, very useful for debugging
dask.config.set(scheduler="synchronous")
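dask.config.set can also be used as a context manager, which is convenient when a particular scheduler is only needed for a single call, e.g. synchronous execution while debugging. A minimal sketch with a toy array (not part of the original example):
import dask.array as da
# Temporarily force single-threaded execution for one computation only
x = da.ones((1000, 1000), chunks=(100, 100))
with dask.config.set(scheduler="synchronous"):
    total = x.sum().compute()  # runs in the main thread, easy to step through with a debugger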
Distributed scheduler¶
One can also use the distributed scheduler. This is especially useful when working in an HPC environment.
# Initiate a local cluster
from dask.distributed import Client, LocalCluster
cluster = LocalCluster(
    n_workers=4, threads_per_worker=2
)  # limit concurrency to avoid memory issues
# connect to cluster
client = Client(cluster)
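The distributed client also provides a diagnostics dashboard, which is handy for following task progress and memory use. A minimal sketch to print its address:
# Print the address of the Dask dashboard (open it in a web browser)
print(client.dashboard_link)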
# Initiate a SLURM dask cluster
# We use this on an HPC system with the SLURM workload manager
# ONLY FOR DEMONSTRATION, NOT EXECUTABLE ON A LOCAL MACHINE
from dask.distributed import Client
from dask_jobqueue import SLURMCluster
cluster = SLURMCluster(
    name="dask-worker",  # Name of the SLURM job
    queue="normal",      # Name of the node partition on your SLURM system
    cores=4,             # Number of cores per worker
    memory="32 GB",      # Total amount of memory per worker
    processes=1,         # Number of Python processes per worker
    walltime="3:00:00",  # Reserve each worker for 3 hours
)
# connect to cluster
client = Client(cluster)
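Note that a SLURMCluster starts without any workers; worker jobs have to be requested explicitly (or adaptively). A minimal sketch, assuming the cluster object above:
# Request 4 worker jobs from SLURM; each job gets the resources specified above
cluster.scale(jobs=4)
# Or let the cluster grow and shrink with the workload
# cluster.adapt(minimum=1, maximum=8)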
Examples¶
Below we will show three examples of how Dask has been used for parallel computation in InSAR.
import xarray as xr
import numpy as np
import dask
import dask.array as da
stmat = xr.open_zarr("data/stm.zarr")
stmat
<xarray.Dataset> Size: 14MB
Dimensions:    (space: 78582, time: 10)
Coordinates:
    azimuth    (space) int64 629kB dask.array<chunksize=(10000,), meta=np.ndarray>
    lat        (space) float32 314kB dask.array<chunksize=(10000,), meta=np.ndarray>
    lon        (space) float32 314kB dask.array<chunksize=(10000,), meta=np.ndarray>
    range      (space) int64 629kB dask.array<chunksize=(10000,), meta=np.ndarray>
  * time       (time) int64 80B 0 1 2 3 4 5 6 7 8 9
Dimensions without coordinates: space
Data variables:
    amplitude  (space, time) float32 3MB dask.array<chunksize=(10000, 10), meta=np.ndarray>
    complex    (space, time) complex64 6MB dask.array<chunksize=(10000, 10), meta=np.ndarray>
    phase      (space, time) float32 3MB dask.array<chunksize=(10000, 10), meta=np.ndarray>
Attributes:
    multi-look:  coarsen-mean
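Since every chunk becomes one task in the graph, it is worth checking how the data are chunked before parallelizing anything. A minimal sketch:
# Inspect the chunking of the dataset and of a single variable
print(stmat.chunks)
print(stmat["complex"].data.chunksize)
print(stmat["complex"].data.npartitions)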
Example 1: da.apply_gufunc¶
In this example, we consider parallelizing a function that only involves numpy arrays. We will extract the real part of a complex array.
A real-life example can be found in DePSI/slc.py: reconstructing the SLC complex data from the interferogram (ifg) and the mother complex data.
Use a numpy function on a dask array directly¶
# Directly get the real part of the complex data
# This function is supported by numpy so can be parallelized directly
real_1 = stmat["complex"].data.real
dask.visualize(real_1)
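Most numpy ufuncs dispatch to dask in the same way: applied to a dask array they stay lazy and only extend the task graph. A minimal sketch with np.abs (not part of the original example):
# np.abs on a dask array returns a lazy dask array, one task per chunk
amplitude_from_complex = np.abs(stmat["complex"].data)
dask.visualize(amplitude_from_complex)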
Write a function that takes numpy arrays as input, then vectorize it per block¶
# A function which takes an array
def get_real_array(complex):
    return complex.real
meta_arr = np.array((), dtype=np.float32)
real_2 = da.apply_gufunc(
    get_real_array,
    "()->()",
    stmat["complex"].data,
    meta=meta_arr,
)
dask.visualize(real_2)
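apply_gufunc also supports core dimensions, which is useful when the function consumes a whole axis per point. As a hedged sketch (not part of the original example), a mean over the time axis could be expressed with the signature "(t)->()", assuming the core dimension is the last axis and is not split across chunks (here time is a single chunk of 10):
# Reduce the time axis of each point; "t" is a core dimension and must fit in one chunk
def mean_over_time(arr):
    return arr.mean(axis=-1)

mean_amp = da.apply_gufunc(
    mean_over_time,
    "(t)->()",
    stmat["amplitude"].data,
    meta=np.array((), dtype=np.float32),
)
dask.visualize(mean_amp)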
Write a function that takes a single element as input¶
# A function which takes a single value
@np.vectorize
def get_real_number(complex):
    if ~np.isnan(complex):
        return complex.real
    else:
        return np.nan
meta_arr = np.array((), dtype=np.float32)
real_3 = da.apply_gufunc(
    get_real_number,
    "()->()",
    stmat["complex"].data,
    meta=meta_arr,
)
dask.visualize(real_3)
After building the task graph¶
# Re-assign to xarray dataset
stmat["real"] = (("space", "time"), real_3)
stmat
<xarray.Dataset> Size: 18MB
Dimensions:    (space: 78582, time: 10)
Coordinates:
    azimuth    (space) int64 629kB dask.array<chunksize=(10000,), meta=np.ndarray>
    lat        (space) float32 314kB dask.array<chunksize=(10000,), meta=np.ndarray>
    lon        (space) float32 314kB dask.array<chunksize=(10000,), meta=np.ndarray>
    range      (space) int64 629kB dask.array<chunksize=(10000,), meta=np.ndarray>
  * time       (time) int64 80B 0 1 2 3 4 5 6 7 8 9
Dimensions without coordinates: space
Data variables:
    amplitude  (space, time) float32 3MB dask.array<chunksize=(10000, 10), meta=np.ndarray>
    complex    (space, time) complex64 6MB dask.array<chunksize=(10000, 10), meta=np.ndarray>
    phase      (space, time) float32 3MB dask.array<chunksize=(10000, 10), meta=np.ndarray>
    real       (space, time) float32 3MB dask.array<chunksize=(10000, 10), meta=np.ndarray>
Attributes:
    multi-look:  coarsen-mean
# Example of triggering the computation
real_3_comp = real_3.compute()
real_3_comp
array([[   514.93115,   -296.68585,    233.41502, ...,   -375.02396,   -468.7583 ,    428.09406],
       [ -2632.2842 ,    776.7178 ,   2548.0337 , ...,   3019.0337 ,  -3931.0288 ,   3837.6934 ],
       [     0.     ,      0.     ,      0.     , ...,      0.     ,      0.     ,      0.     ],
       ...,
       [ 34810.37   , -15664.879  ,  66410.12   , ...,  45431.094  ,  20597.82   , -53849.72   ],
       [ -5483.4355 ,  -6193.3354 ,    376.4837 , ...,   2700.0833 ,   -306.2439 ,   4781.2935 ],
       [ -2147.1392 ,  -3128.516  ,   2229.8782 , ...,  -1608.2722 ,   2551.9136 ,  -2731.8252 ]], dtype=float32)
Question¶
What is the difference between real_3.compute() and stmat['real'].compute()?
Example 2: xr.map_blocks¶
In this example, we will parallelize a function that operates on an Xarray dataset, instead of a dask array. This is useful if the function you write requires multiple data variables from an Xarray dataset.
In this example, we will write a dummy function that selects points within a bounding box; this could potentially be extended to a large volume of background polygons. It provides a simple demonstration of how to use xr.map_blocks.
This is adapted from a real-life example in stmtools/stm.py ("enrich an STM from Polygons"): enriching an STM from a large cadastral dataset.
stmat = xr.open_zarr("data/stm.zarr")
from shapely.geometry import box
# A bounding box in the region
bbox = [4.85, 52.375, 4.875, 52.425] # [min_lon, min_lat, max_lon, max_lat]
# For a single bounding box, the following code is efficient
# mask = (
# (stmat["lon"] >= bbox[0])
# & (stmat["lon"] <= bbox[2])
# & (stmat["lat"] >= bbox[1])
# & (stmat["lat"] <= bbox[3])
# )
# mask = mask.compute()
# stmat_sel = stmat.where(mask, drop=True)
# When writing a block function
# One can check if the background bounding box overlaps with the block
def in_box_block(stmat, bbox):
    # Get the bounding box of the block
    block_bbox = [
        stmat["lon"].min(),
        stmat["lat"].min(),
        stmat["lon"].max(),
        stmat["lat"].max(),
    ]
    # Check if the block bbox overlaps with the bounding box
    if box(*block_bbox).intersects(box(*bbox)):
        mask = (
            (stmat["lon"] >= bbox[0])
            & (stmat["lon"] <= bbox[2])
            & (stmat["lat"] >= bbox[1])
            & (stmat["lat"] <= bbox[3])
        )
    else:
        mask = xr.DataArray(np.zeros(stmat["lon"].shape, dtype=bool), dims=("space"))
    return stmat.where(mask)
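Before wiring in_box_block into xr.map_blocks, it can be useful to run it eagerly on a small in-memory subset to check the masking logic. A minimal sketch (the subset size of 1000 points is an arbitrary choice):
# Hypothetical sanity check: run the block function on the first 1000 points in memory
subset = stmat.isel(space=slice(0, 1000)).compute()
checked = in_box_block(subset, bbox)
print(int(checked["lon"].notnull().sum()), "of", subset.sizes["space"], "points fall inside the bbox")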
stmat_sel = xr.map_blocks(
    in_box_block,
    stmat,
    args=[bbox],
    template=stmat,
)
stmat_sel
<xarray.Dataset> Size: 14MB
Dimensions:    (space: 78582, time: 10)
Coordinates:
    azimuth    (space) int64 629kB dask.array<chunksize=(10000,), meta=np.ndarray>
    lat        (space) float32 314kB dask.array<chunksize=(10000,), meta=np.ndarray>
    lon        (space) float32 314kB dask.array<chunksize=(10000,), meta=np.ndarray>
    range      (space) int64 629kB dask.array<chunksize=(10000,), meta=np.ndarray>
  * time       (time) int64 80B 0 1 2 3 4 5 6 7 8 9
Dimensions without coordinates: space
Data variables:
    amplitude  (space, time) float32 3MB dask.array<chunksize=(10000, 10), meta=np.ndarray>
    complex    (space, time) complex64 6MB dask.array<chunksize=(10000, 10), meta=np.ndarray>
    phase      (space, time) float32 3MB dask.array<chunksize=(10000, 10), meta=np.ndarray>
Attributes:
    multi-look:  coarsen-mean
dask.visualize(stmat_sel["amplitude"].data)
stmat_sel = stmat_sel.dropna("space")
stmat_sel
<xarray.Dataset> Size: 953kB
Dimensions:    (space: 5179, time: 10)
Coordinates:
    azimuth    (space) int64 41kB dask.array<chunksize=(5179,), meta=np.ndarray>
    lat        (space) float32 21kB dask.array<chunksize=(5179,), meta=np.ndarray>
    lon        (space) float32 21kB dask.array<chunksize=(5179,), meta=np.ndarray>
    range      (space) int64 41kB dask.array<chunksize=(5179,), meta=np.ndarray>
  * time       (time) int64 80B 0 1 2 3 4 5 6 7 8 9
Dimensions without coordinates: space
Data variables:
    amplitude  (space, time) float32 207kB dask.array<chunksize=(5179, 10), meta=np.ndarray>
    complex    (space, time) complex64 414kB dask.array<chunksize=(5179, 10), meta=np.ndarray>
    phase      (space, time) float32 207kB dask.array<chunksize=(5179, 10), meta=np.ndarray>
Attributes:
    multi-look:  coarsen-mean
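Up to this point everything is still lazy. In a real workflow, the selection would typically be computed or written to disk at this stage. A minimal sketch with a hypothetical output path:
# Writing to Zarr triggers the whole task graph (hypothetical output path)
stmat_sel.to_zarr("data/stm_selected.zarr", mode="w")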
Example 3: groupby + map + map_blocks¶
Sometimes, the function you write only operates on a single point. Directly using xr.map_blocks would then require chunking the STM into individual points, which results in many small chunks and huge overhead. If the function is not easy to translate into an array function, an easy trick is to call groupby and map inside the block function.
In this example, we will write a dummy function that calculates the average amplitude of each point. If the average amplitude is smaller than a threshold, we set it to zero.
This is adapted from a real-life example, an algorithm proposed by Y. Wang.
stmat = xr.open_zarr("data/stm.zarr")
h = stmat["amplitude"].mean(dim="time").plot.hist(bins=100, range=(2000, 20000))
# Point function
# In real life, this function will be more complex
def avg_amp_pnt(stmat_pnt, threshold):
    stmat_pnt["average_amplitude"] = stmat_pnt["amplitude"].mean(dim="time")
    if stmat_pnt["average_amplitude"] < threshold:
        stmat_pnt["average_amplitude"] = 0
    return stmat_pnt
# A block function which calls the point function by groupby
def avg_amp_block(stmat, threshold):
    groups = stmat.groupby("space")
    stmat_out = groups.map(
        avg_amp_pnt,
        threshold=threshold,
    )
    return stmat_out
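Note that for this dummy example the same result can be obtained with plain vectorized xarray operations, which avoids the per-point overhead entirely; the groupby/map trick only pays off when the per-point logic cannot be expressed as array operations. A minimal vectorized sketch (not part of the original workflow):
# Vectorized equivalent: compute the mean once, then zero out values below the threshold
mean_amplitude = stmat["amplitude"].mean(dim="time")
stmat_vectorized = stmat.assign(average_amplitude=xr.where(mean_amplitude < 1e4, 0, mean_amplitude))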
# Since a new variable is added
# We create a template with the new data variable
stmat_template = stmat.copy()
stmat_template["average_amplitude"] = xr.DataArray(
    np.zeros(stmat.sizes["space"]), dims=("space")
)
stmat_template
<xarray.Dataset> Size: 15MB
Dimensions:            (space: 78582, time: 10)
Coordinates:
    azimuth            (space) int64 629kB dask.array<chunksize=(10000,), meta=np.ndarray>
    lat                (space) float32 314kB dask.array<chunksize=(10000,), meta=np.ndarray>
    lon                (space) float32 314kB dask.array<chunksize=(10000,), meta=np.ndarray>
    range              (space) int64 629kB dask.array<chunksize=(10000,), meta=np.ndarray>
  * time               (time) int64 80B 0 1 2 3 4 5 6 7 8 9
Dimensions without coordinates: space
Data variables:
    amplitude          (space, time) float32 3MB dask.array<chunksize=(10000, 10), meta=np.ndarray>
    complex            (space, time) complex64 6MB dask.array<chunksize=(10000, 10), meta=np.ndarray>
    phase              (space, time) float32 3MB dask.array<chunksize=(10000, 10), meta=np.ndarray>
    average_amplitude  (space) float64 629kB 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0
Attributes:
    multi-look:  coarsen-mean
stmat2 = xr.map_blocks(
    avg_amp_block, stmat, kwargs={"threshold": 1e4}, template=stmat_template
)
stmat2
<xarray.Dataset> Size: 15MB
Dimensions:            (space: 78582, time: 10)
Coordinates:
    azimuth            (space) int64 629kB dask.array<chunksize=(10000,), meta=np.ndarray>
    lat                (space) float32 314kB dask.array<chunksize=(10000,), meta=np.ndarray>
    lon                (space) float32 314kB dask.array<chunksize=(10000,), meta=np.ndarray>
    range              (space) int64 629kB dask.array<chunksize=(10000,), meta=np.ndarray>
  * time               (time) int64 80B 0 1 2 3 4 5 6 7 8 9
Dimensions without coordinates: space
Data variables:
    amplitude          (space, time) float32 3MB dask.array<chunksize=(10000, 10), meta=np.ndarray>
    average_amplitude  (space) float64 629kB dask.array<chunksize=(10000,), meta=np.ndarray>
    complex            (space, time) complex64 6MB dask.array<chunksize=(10000, 10), meta=np.ndarray>
    phase              (space, time) float32 3MB dask.array<chunksize=(10000, 10), meta=np.ndarray>
Attributes:
    multi-look:  coarsen-mean
# Trigger the computation
avg_amp = stmat2["average_amplitude"].compute()
h = avg_amp.plot.hist(bins=100, range=(2000, 20000))
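Finally, if the distributed scheduler was used, it is good practice to release the resources once all computations are finished. A minimal sketch, assuming the client and cluster objects from the configuration section:
# Shut down the workers and the scheduler
client.close()
cluster.close()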