bfio.BioReader

class BioReader(file_path, max_workers=None, backend=None, clean_metadata=True, level=None)

Bases: BioBase

Read supported image formats using Bio-Formats.

This class handles file reading of multiple formats. It can read files from any Bio-Formats supported file format, but is specially optimized for handling the OME tiled tiff format.

There are three backends: bioformats, python, and zarr. The bioformats backend uses Bio-Formats directly for file reading and can read any format that Bio-Formats supports. The python backend only reads images in OME TIFF format with tile tags set to 1024x1024, and is significantly faster than the bioformats backend for these files. The zarr backend only reads OME Zarr files.

File reading and writing are multi-threaded by default, except for the bioformats backend which does not currently support threading. Half of the available CPUs detected by multiprocessing.cpu_count() are used to read an image.
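As a rough illustration of the default thread count described above, the sketch below halves the value reported by multiprocessing.cpu_count(). The helper name default_max_workers and the floor of one worker are assumptions made for this example, not bfio's actual code.

```python
import multiprocessing


def default_max_workers():
    """Hypothetical helper: half of the detected CPUs, never fewer than one."""
    return max(1, multiprocessing.cpu_count() // 2)


# The printed value depends on the machine running the code
print(default_max_workers())
```

Passing an explicit max_workers value to BioReader overrides this default.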

For more information, visit the Bio-Formats page: https://www.openmicroscopy.org/bio-formats/

Note

In order to use the bioformats backend, jpype must be installed.

Initialize the BioReader.

Parameters

file_path (Union[str, Path]) – Path to file to read
max_workers (Optional[int]) – Number of threads used to read an image. Default is half the number of detected cores.
backend (Optional[str]) – Can be python, bioformats, or zarr. If None, then BioReader will try to autodetect the proper backend. Default is python.
clean_metadata (bool) – Will try to reformat poorly formed OME XML metadata if True. If False, will throw an error if the metadata is poorly formed. Default is True.
level (Optional[int]) – For multi-resolution images, the resolution level to read. Ignored for all other image types.
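To make the relationship between file type and backend more concrete, here is a minimal sketch that picks a backend name from the file extension. The function guess_backend and its extension rules are hypothetical and only mirror the backend descriptions above; bfio's own autodetection may use different checks (for example, an extension alone cannot confirm the 1024x1024 tile tags the python backend requires).

```python
from pathlib import Path


def guess_backend(file_path):
    """Hypothetical heuristic based on file extension, not bfio's logic."""
    name = Path(file_path).name.lower()
    if name.endswith(".zarr"):
        return "zarr"  # OME Zarr files
    if name.endswith((".ome.tif", ".ome.tiff")):
        return "python"  # OME TIFF (tile size is not verified here)
    return "bioformats"  # fall back to the most general backend


print(guess_backend("data/image.ome.tif"))
print(guess_backend("data/image.ome.zarr"))
print(guess_backend("data/scan.czi"))
```

A name chosen this way could then be passed as the backend argument instead of relying on None.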

X()

Setter: X is read_only in BioReader
Getter: Number of pixels in the x-dimension (width)
Type: int

Y()

Setter: Y is read_only in BioReader
Getter: Number of pixels in the y-dimension (height)
Type: int

Z()

Setter: Z is read_only in BioReader
Getter: Number of pixels in the z-dimension (depth)
Type: int

C()

Setter: C is read_only in BioReader
Getter: Number of channels (size of the c-dimension)
Type: int

T()

Setter: T is read_only in BioReader
Getter: Number of timepoints (size of the t-dimension)
Type: int

__getitem__(keys)

Image loading using numpy-like indexing.

This is an abbreviated method of accessing the read method, where a portion of the image will be loaded using numpy-like slicing syntax. Up to 5 dimensions can be designated depending on the number of available dimensions in the image array (Y, X, Z, C, T).

Note

Not all methods of indexing can be used, and some indexing will lead to unexpected results. For example, logical indexing cannot be used, and step sizes in slice objects are ignored for the first three indices. This means an index such as [0:100:2,0:100:2,0,0,0] will return a 100x100x1x1x1 numpy array.

Parameters: keys (Union[tuple, slice]) – numpy-like slicing used to load a section of an image.
Return type: ndarray
Returns: A numpy.ndarray where trailing empty dimensions are removed.

Example

import bfio

# Initialize the bioreader
br = bfio.BioReader('Path/To/File.ome.tif')

# Load a 100x100 array of pixels
a = br[:100,:100,:1,0,0]

# Slice step sizes are ignored for the first 3 indices, so this
# returns the same as above
a = br[0:100:2,0:100:2,0:1,0,0]

# The last two dimensions can receive a tuple or list as input
# Load the first and third channel
a = br[:100,:100,0:1,(0,2),0]

# If the file is 3d, load the first 10 z-slices
b = br[...,:10,0,0]
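The note about ignored step sizes can be pictured with a small stand-alone sketch. It converts a slice into a (start, stop) pair and discards the step, which is why 0:100:2 and :100 select the same 100 pixels. The helper slice_to_range is hypothetical and only models the documented behavior for positive steps; it is not taken from bfio.

```python
def slice_to_range(key, size):
    """Map a slice to a (start, stop) pair, discarding any step."""
    start, stop, _step = key.indices(size)  # the step is deliberately unused
    return start, stop


# Both slices cover the same range along an axis of 512 pixels
print(slice_to_range(slice(0, 100, 2), 512))  # (0, 100)
print(slice_to_range(slice(None, 100), 512))  # (0, 100)
```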

read(X=None, Y=None, Z=None, C=None, T=None)

Read the image.

Read all or part of the image. An n-dimensional numpy.ndarray is returned with all trailing empty dimensions removed.

For example, if an image is read and it represents an xz plane, then the shape will be [1,m,n].

Parameters

X (Union[list, tuple, None]) – The (min,max) range of pixels to load along the x-axis (columns). If None, loads the full range. Defaults to None.
Y (Union[list, tuple, None]) – The (min,max) range of pixels to load along the y-axis (rows). If None, loads the full range. Defaults to None.
Z (Union[list, tuple, int, None]) – The (min,max) range of pixels to load along the z-axis (depth). Alternatively, an integer can be passed to select a single z-plane. If None, loads the full range. Defaults to None.
C (Union[list, tuple, int, None]) – Values indicating channel indices to load. If None, loads the full range. Defaults to None.
T (Union[list, tuple, int, None]) – Values indicating timepoints to load. If None, loads the full range. Defaults to None.

Return type

ndarray

Returns

A 5-dimensional numpy array.
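The removal of trailing empty dimensions can be illustrated on shape tuples alone. In the sketch below, an xz plane has shape (1, m, n, 1, 1) in (Y, X, Z, C, T) order and reduces to (1, m, n), matching the example above. The helper drop_trailing_singletons is hypothetical, and keeping at least two dimensions is an assumption of this sketch rather than documented bfio behavior.

```python
def drop_trailing_singletons(shape):
    """Remove size-1 dimensions from the end of a (Y, X, Z, C, T) shape."""
    shape = list(shape)
    # Assumption: never reduce below two dimensions
    while len(shape) > 2 and shape[-1] == 1:
        shape.pop()
    return tuple(shape)


# An xz plane: one row, 512 columns, 30 z-slices, one channel, one timepoint
print(drop_trailing_singletons((1, 512, 30, 1, 1)))  # (1, 512, 30)
# A single 2-D plane
print(drop_trailing_singletons((100, 100, 1, 1, 1)))  # (100, 100)
```

Only trailing dimensions are dropped, so the leading Y dimension of size 1 is preserved in the first case.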

__call__(tile_size, tile_stride=None, batch_size=None, channels=[0])

Iterate through tiles of an image.

The BioReader object can be called, and will act as an iterator to load tiles of an image. The iterator buffers the loading of pixels asynchronously to quickly deliver images of the appropriate size.

Parameters

tile_size (Union[list, tuple]) – A list/tuple of length 2, indicating the height and width of the tiles to return.
tile_stride (Union[list, tuple, None]) – A list/tuple of length 2, indicating the row and column stride size. If None, then tile_stride = tile_size. Defaults to None.
batch_size (Optional[int]) – Number of tiles to return on each iteration. Defaults to None, which selects the smaller of 32 and the value returned by maximum_batch_size.
channels (List[int]) – A placeholder. Only the first channel is ever loaded. Defaults to [0].

Return type

Iterable[Tuple[ndarray, tuple]]

Returns

A tuple containing a 4-d numpy array and a tuple containing a list of X,Y,Z,C,T indices. The numpy array has dimensions [tile_num,tile_size[0],tile_size[1],channels]

Example

from bfio import BioReader
import matplotlib.pyplot as plt

br = BioReader('/path/to/file')

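# Each iteration yields a batch of tiles and the indices of each tile
for tiles,ind in br(tile_size=[256,256],tile_stride=[200,200]):
    # Loop over the tiles in the current batch
    for i in range(tiles.shape[0]):
        print(
            'Displaying tile with X,Y coords: {},{}'.format(
                ind[i][0],ind[i][1]
            )
        )
        plt.figure()
        plt.imshow(tiles[i,:,:,0].squeeze())
        plt.show()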
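To show how tile_size and tile_stride interact, the following sketch lists the pixel bounds of each tile for a given image size. The function tile_bounds is hypothetical: it clips tiles at the image edge, whereas bfio may handle edges differently (for example by padding), and it does not model batching or the asynchronous buffer.

```python
def tile_bounds(height, width, tile_size, tile_stride=None):
    """List (y_min, y_max, x_min, x_max) for each tile, clipped to the image."""
    if tile_stride is None:
        tile_stride = tile_size  # same default as documented above
    bounds = []
    for y in range(0, height, tile_stride[0]):
        for x in range(0, width, tile_stride[1]):
            y_max = min(y + tile_size[0], height)
            x_max = min(x + tile_size[1], width)
            bounds.append((y, y_max, x, x_max))
    return bounds


# 256x256 tiles with a stride of 200 overlap by 56 pixels
for b in tile_bounds(400, 400, (256, 256), (200, 200)):
    print(b)
```

With a stride smaller than the tile size, neighboring tiles overlap; with tile_stride left as None, the tiles are adjacent and do not overlap.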

classmethod image_size(filepath)

Read image width and height from header.

This class method reads only the header of TIFF files, or the zarr array JSON, to identify the image width and height. Sometimes the image dimensions are needed without loading the image itself, and reading only the header is considerably faster than loading Bio-Formats just to read simple metadata.

If the file is not a TIFF or OME Zarr, returns width = height = -1.

This code was adapted to operate only on TIFF images and includes additional code to read the header of little-endian BigTIFF files. The original code can be found at: https://github.com/shibukawa/imagesize_py

Parameters: filepath (Path) – Path to tiff file
Returns: Tuple of ints indicating width and height.
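The idea of reading only the header can be sketched with the standard library. The example below parses the first image file directory of a classic TIFF and returns the ImageWidth (tag 256) and ImageLength (tag 257) values, falling back to (-1, -1) for anything else. The function tiff_size is a simplified stand-in written for this example: it does not handle BigTIFF or Zarr, and it is not the code used by image_size.

```python
import io
import struct


def tiff_size(stream):
    """Return (width, height) from a classic TIFF header, or (-1, -1)."""
    header = stream.read(8)
    if len(header) < 8:
        return -1, -1
    if header[:2] == b"II":
        endian = "<"  # little endian
    elif header[:2] == b"MM":
        endian = ">"  # big endian
    else:
        return -1, -1
    magic, ifd_offset = struct.unpack(endian + "HI", header[2:8])
    if magic != 42:  # 43 marks BigTIFF, which this sketch skips
        return -1, -1
    stream.seek(ifd_offset)
    raw = stream.read(2)
    if len(raw) < 2:
        return -1, -1
    (n_entries,) = struct.unpack(endian + "H", raw)
    width = height = -1
    for _ in range(n_entries):
        entry = stream.read(12)
        if len(entry) < 12:
            break
        tag, ftype, _count = struct.unpack(endian + "HHI", entry[:8])
        if ftype == 3:  # SHORT: value sits in the first two value bytes
            (value,) = struct.unpack(endian + "H", entry[8:10])
        elif ftype == 4:  # LONG: value uses all four value bytes
            (value,) = struct.unpack(endian + "I", entry[8:12])
        else:
            continue
        if tag == 256:
            width = value
        elif tag == 257:
            height = value
    return width, height


# Build a minimal in-memory TIFF header with two IFD entries
data = (
    b"II" + struct.pack("<HI", 42, 8)
    + struct.pack("<H", 2)
    + struct.pack("<HHIHH", 256, 3, 1, 640, 0)
    + struct.pack("<HHII", 257, 4, 1, 480)
    + struct.pack("<I", 0)
)
print(tiff_size(io.BytesIO(data)))  # (640, 480)
print(tiff_size(io.BytesIO(b"not a tiff file")))  # (-1, -1)
```

No pixel data is touched at any point, which is what makes a header-only read cheap.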

property bpp: Same as bytes_per_pixel.

property bytes_per_pixel: int: Number of bytes per pixel.

property channel_names: List[str]: Get the channel names for the image.

close(): Close the image.

property cnames: List[str]: Same as channel_names.

property dtype: dtype: The numpy pixel type of the data.

maximum_batch_size(tile_size, tile_stride=None)

Maximum allowable batch size for tiling.

The pixel buffer only loads at most two supertiles at a time. If the batch size is too large, then the tiling function will attempt to create more tiles than what the buffer holds. To prevent the tiling function from doing this, there is a limit on the number of tiles that can be retrieved in a single call. This function determines what the largest number of retrievable batches is.

Parameters

tile_size (List[int]) – The height and width of the tiles to retrieve
tile_stride (Optional[List[int]]) – If None, defaults to tile_size. Defaults to None.

Return type

int

Returns

Maximum allowed number of batches that can be retrieved by the iterate method.

property metadata: OME

Get the metadata for the image.

This function calls the Bio-Formats metadata parser, which extracts metadata from an image. This returns a reference to an OMEXML class, which is a convenient handler for the complex xml metadata created by Bio-Formats.

Most basic metadata, such as the image dimensions (i.e. x, y, etc.), has its own BioReader property. However, in some cases it may be necessary to access the underlying metadata class.

Minor changes have been made to the original OMEXML class created for python-bioformats, so the original OMEXML documentation should assist those interested in directly accessing the metadata. In general, it is best to assign data using the object properties to ensure the metadata stays in sync with the file.

For information on the OMEXML class: https://github.com/CellProfiler/python-bioformats/blob/master/bioformats/omexml.py

Returns: OMEXML object for the image

property physical_size_x: Tuple[float, str]

Physical size of pixels in x-dimension.

Returns: Units per pixel, Units (e.g. “cm” or “mm”)

property physical_size_y: Tuple[float, str]

Physical size of pixels in y-dimension.

Returns: Units per pixel, Units (e.g. “cm” or “mm”)

property physical_size_z: Tuple[float, str]

Physical size of pixels in z-dimension.

Returns: Units per pixel, Units (e.g. “cm” or “mm”)
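Because each physical size property is a (value, unit) tuple, converting a pixel count to a physical length is a single multiplication. The helper physical_extent below is hypothetical and takes plain values, so it runs without an image; with a reader, the inputs would come from properties such as X and physical_size_x.

```python
def physical_extent(n_pixels, physical_size):
    """Multiply a pixel count by a (units_per_pixel, unit) tuple."""
    units_per_pixel, unit = physical_size
    return n_pixels * units_per_pixel, unit


# 1024 pixels at 0.5 mm per pixel
print(physical_extent(1024, (0.5, "mm")))  # (512.0, 'mm')
```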

property ps_x: Tuple[float, str]: Same as physical_size_x.

property ps_y: Same as physical_size_y.

property ps_z: Same as physical_size_z.

property read_only: bool: Returns true if object is read only.

property samples_per_pixel: int: Number of samples per pixel.

property shape: Tuple[int, int, int, int, int]

The 5-dimensional shape of the image.

Returns: (Y, X, Z, C, T) shape of the image
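One practical use of the shape is estimating how much memory a full read would need. The sketch below multiplies the five dimensions by the number of bytes per pixel. The helper image_nbytes is hypothetical, and it assumes bytes_per_pixel refers to a single sample of a single channel.

```python
import math


def image_nbytes(shape, bytes_per_pixel):
    """Estimate uncompressed size in bytes for a (Y, X, Z, C, T) shape."""
    return math.prod(shape) * bytes_per_pixel


# A 1024x1024 image with 3 channels stored as 16-bit integers
print(image_nbytes((1024, 1024, 1, 3, 1), 2))  # 6291456
```

With a reader, the inputs would come from the shape and bytes_per_pixel properties.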

property spp: Same as samples_per_pixel.