Tutorial¶
Basics¶
impy’s concept is to make image analysis on Python much more user friendly. The core array object,
ImgArray, retains all the features of numpy.ndarray while implemented with variety of function
that minimize your effort of image processing.
Let’s start with a simple example.
import impy as ip
img = ip.imread("path/to/image.tif") # read
out = img.gaussian_filter(sigma=1) # process
out
ImgArray of
name : image.tif
shape : 10(t), 20(z), 256(y), 256(x)
label shape : No label
dtype : uint16
source : path/to/image.tif
scale : ScaleView(t=1.0, z=0.217, y=0.217, x=0.217)
Here, you can see
each dimension is labeled with a symbol (t, z, y and x).
Physical scale is tagged to the object as a
ScaleView, a subclass of Pythondict.the source file is recorded.
impy’s imread function can extract metadata (such as axes) from image files and “memorize” them
as additional properties of array objects.
You can refer to the properties by:
img.axes # Out: Axes['t', 'z', 'y', 'x']
img.source # Out: such as WindowsPath("path/to/image.tif")
img.name # Out: "image.tif"
img.scale # Out: ScaleDict(t=1.0px, z=0.217μm, y=0.217μm, x=0.217μm)
After you finished all the process you want, you can view the results in napari viewer. Just pass the
image object to GUI handler object gui in impy.
ip.gui.add(out)
and wait until viewer window opens.
Axes Targeted Slicing¶
You may want to get slice(s) of an image with a format like “t=3:10” but generally you always have to
care about which is t-axis. Most arrays in impy have extended numpy.ndarray to enable the
“axis-targeted slicing”. You can use following grammar:
Single slice, like
img["t=1"].String that follows Python slicing rules, such as
img["t=5:10"]orimg["t=-1"]Fancy slicing, like
img["t=1,3,5"]Conbination of them with splitter “;”, like
img["t=3;z=5:7"]
Similarly, image shape is also extended to support axis-based access. In the example above, the image of
interest has (t, z, y, x) axes and the shape is (10, 20, 256, 256). Similar to numpy.ndarray, you can
know its shape using img.shape but in impy an object of AxesShape will be returned instead of
Python tuple, as long as axes are well-defined.
img.shape # Out: AxesShape(t=10, z=20, y=256, z=256)
Here, you can get the size of z-axis by img.shape.z instead of img.shape[1].
Note
You can also use dict for slicing.
img[{"y": 3, "x": slice(4, 10)}] # identical to img["y=3;x=4:10"]
Batch Processing¶
Image Stacks¶
Owing to the axes information, impy can automatically execute functions for every image slice properly. As in the first example, with a tzyx image, instead of running
out = np.empty_like(img)
for t in range(10):
out[t] = img[t].gaussian_filter(sigma=1)
you just need to run a single code
out = img.gaussian_filter(sigma=1)
and the function “knows” zyx or (1,2,3) axes are spatial dimensions and filtering should be iterated along t axis.
If you want yx axes be the spatial dimensions, i.e., iterate over t and z axes, explicitly specify it with dims
keyword argument:
out = img.gaussian_filter(sigma=1, dims="yx")
out = img.gaussian_filter(sigma=1, dims=2) # this is fine
Running Function with Different Parameters¶
Apply a function to whole image with different parameters
out = img.for_params("log_filter", var={"sigma": [1, 2, 3, 4]})
out = img.for_params("log_filter", sigma=[1, 2, 3, 4]) # This is also supported.
Apply a function along an axis with different parameters
You usually want to apply same function to each channel but with different parameters.
out = img.for_each_channel("hessian_eigval", sigma=[1, 2])
Images with Different Shapes¶
For images with different shapes, they cannot be stacked into a single array. In this case, you can use DataList, an
extension of Python list. DataList recognizes any member functions of its components and call the function for all
the components. Here’s an example:
imglist = ip.DataList([img1, img2, img3])
outputs = imglist.gaussian_filter(sigma=3)
gaussian_filter is a member function of img1, img2 and img3, so that inside imglist, gaussian_filter
is called three times. Following code is essentially same as what is going on inside DataList:
outputs = []
for img in imglist:
out = img.gaussian_filter(sigma=3)
outputs.append(out)
outputs = ip.DataList(outputs)
impy also provides DataDict, an extension of Python dict, which works similarly to DataList. Aside from
the feature of iterative function call, you can give names for each image as dictionary keys, and get the value from
attribution, imgdict.name instead of imgdict["name"].
imglist = ip.DataDict(first=img1, second=img2, third=img3)
outputs = imglist.gaussian_filter(sigma=3)
outputs.first
Extended Numpy functions¶
In almost all the numpy functions, the keyword argument axis can be given as the symbol of axis if the argument(s) are ImgArray
or other arrays that belong to subclass of MetaArray.
np.mean(img, axis="z") # Z-projection, although ImgArray provides more flexible function "proj()"
np.stack([img1, img2], axis="c") # Merging colors
This is achieved by defining __array_function__ method. See Numpy’s documentation
for details.
You can also make an ImgArray in a way similar to numpy:
ip.array([2, 4, 6], dtype="uint16")
ip.zeros((100, 100), dtype=np.float32)
ip.random.normal(size=(100, 100))
Use GPU¶
impy can automatically switch between numpy and cupy. Using GPU can largely boost
your image analysis especially when it relies on Fourier transformation or linear algebra.
You can setup GPU calculation within a context using
with ip.use("cupy"):
img_deconv = img.lucy(psf_image)
or globally
ip.Const["RESOURCE"] = "cupy"
Advanced Reading Options¶
Read Separate Images as an Image Stack¶
If images are saved as separate tif files in a directory, you can read them as an image stack by:
img = ip.imread("path/to/image/*.tif")
Read Separate Images as an DataList¶
img = ip.imread_collection("path/to/image/*.tif")
Large Images¶
There are two ways to handle large images.
LazyImgArray¶
If you deal with very large images that exceeds PC memory, you can use LazyImgArray. This object retains
memory map of the image file that is split into smaller chunks, and passes it to dask array as “ready to
read” state. The image data is therefore loaded only when it is needed. Many useful functions in ImgArray
are also implemented in LazyImgArray so that you can easily handle large datasets.
To read large images as LazyImgArray, call impy.lazy.imread instead. You can specify its chunk size using
chunks parameter.
img = ip.lazy.imread("path/to/image.tif", chunks=(1, "auto", "auto", "auto"))
img
LazyImgArray of
name : image.tif
shape : 300(t), 25(z), 1024(y), 1024(x)
chunk sizes : 1(t), 25(z), 1024(y), 1024(x)
dtype : uint16
source : path/to/image.tif
scale : ScaleView(t=1.0px, z=0.217μm, y=0.217μm, x=0.217μm)
You can check its size in GB:
img.GB
15.72864
When you have to convert it to ImgArray, use compute function:
img.compute() # dask's compute() function will be called inside
BigImgArray¶
LazyImgArray is useful to process large images. However, it is not suitable for interactive analysis
because calculation starts from the beginning for every operation. BigImgArray is a subclass of
LazyImgArray but it stores the cashed data in a temporary file.
You can use big_imread() function to open an image file as a BigImgArray object.
img = ip.big_imread("path/to/image.tif", chunks=(1, "auto", "auto", "auto"))
img
BigImgArray of
name : image.tif
shape : 300(t), 25(z), 1024(y), 1024(x)
chunk sizes : 1(t), 25(z), 1024(y), 1024(x)
dtype : uint16
source : path/to/image.tif
scale : ScaleView(t=1.0px, z=0.217μm, y=0.217μm, x=0.217μm)
A BigImgArray processes an image out-of-core, but store the result in a temporary
file. As LazyImgArray, you can convert it into ImgArray using compute function.
img1 = img.gaussian_filter() # computed and cached here
img2 = img1.threshold(img1.mean()) # computed and cached here
out = img2.compute() # convert to ImgArray