Package dvt

Distant Viewing Toolkit for the Analysis of Visual Culture

The Distant Viewing Toolkit is a Python package designed to facilitate the computational analysis of visual culture. It contains low-level architecture for applying state-of-the-art computer vision algorithms to still and moving images. The higher-level functionality of the toolkit allows users to quickly extract semantic metadata from digitized collections. Extracted information can be visualized for search and discovery or aggregated and analyzed to find patterns across a corpus.

# -*- coding: utf-8 -*-
"""Distant Viewing Toolkit for the Analysis of Visual Culture

The Distant Viewing Toolkit is a Python package designed to facilitate the
computational analysis of visual culture. It contains low-level architecture
for applying state-of-the-art computer vision algorithms to still and moving
images. The higher-level functionality of the toolkit allows users to quickly
extract semantic metadata from digitized collections. Extracted information
can be visualized for search and discovery or aggregated and analyzed to find
patterns across a corpus.
"""

from cv2 import imread, imwrite, imshow

from .abstract import ImageAnnotator, BatchAnnotator
from .annotate import SizeAnnotator, ImwriteAnnotator, AverageAnnotator
from .aggregate import CutAggregator
from .batch import DiffAnnotator
from .detectron import (
    InstanceAnnotator,
    LVISAnnotator,
    CityscapesAnnotator,
    KeypointsAnnotator,
    PanopticAnnotator
)
from .keras import (
    FaceAnnotator,
    FaceDetectMtcnn,
    FaceEmbedVgg2,
    EmbedAnnotator,
    EmbedImageKeras,
    EmbedImageKerasResNet50,
)
from .output import DVTOutput
from .video import VideoFrameInput, VideoBatchInput, FrameBatch

__version__ = "0.4.0"

Sub-modules

dvt.abstract

Abstract classes for running the Distant Viewing Toolkit.

dvt.aggregate

Aggregators.

dvt.annotate

Annotators for extracting high-level metadata about the images in the input.

dvt.batch

Annotators for extracting high-level metadata about the images in the input.

dvt.detectron

Annotators that require installing detectron2.

dvt.keras

Annotators that require installing keras and tensorflow.

dvt.output

Video files.

dvt.utils

Utility functions used across the toolkit …

dvt.video

Video files.

Functions

def imread(filename, flags)

imread(filename[, flags]) -> retval

Loads an image from a file.

The function imread loads an image from the specified file and returns it. If the image cannot be read (because of a missing file, improper permissions, or an unsupported or invalid format), the function returns an empty matrix (Mat::data==NULL).

Currently, the following file formats are supported:

- Windows bitmaps - *.bmp, *.dib (always supported)
- JPEG files - *.jpeg, *.jpg, *.jpe (see the Notes section)
- JPEG 2000 files - *.jp2 (see the Notes section)
- Portable Network Graphics - *.png (see the Notes section)
- WebP - *.webp (see the Notes section)
- Portable image format - *.pbm, *.pgm, *.ppm, *.pxm, *.pnm (always supported)
- PFM files - *.pfm (see the Notes section)
- Sun rasters - *.sr, *.ras (always supported)
- TIFF files - *.tiff, *.tif (see the Notes section)
- OpenEXR Image files - *.exr (see the Notes section)
- Radiance HDR - *.hdr, *.pic (always supported)
- Raster and Vector geospatial data supported by GDAL (see the Notes section)

Notes:

- The function determines the type of an image by the content, not by the file extension.
- In the case of color images, the decoded images will have the channels stored in B G R order.
- When using IMREAD_GRAYSCALE, the codec's internal grayscale conversion will be used, if available. Results may differ from the output of cvtColor().
- On Microsoft Windows* OS and MacOSX*, the codecs shipped with an OpenCV image (libjpeg, libpng, libtiff, and libjasper) are used by default. So, OpenCV can always read JPEGs, PNGs, and TIFFs. On MacOSX, there is also an option to use native MacOSX image readers. But beware that currently these native image loaders give images with different pixel values because of the color management embedded into MacOSX.
- On Linux*, BSD flavors and other Unix-like open-source operating systems, OpenCV looks for codecs supplied with an OS image. Install the relevant packages (do not forget the development files, for example, "libjpeg-dev", in Debian* and Ubuntu*) to get the codec support or turn on the OPENCV_BUILD_3RDPARTY_LIBS flag in CMake.
- If you set the WITH_GDAL flag to true in CMake and use IMREAD_LOAD_GDAL to load the image, then the GDAL driver will be used to decode the image, supporting the following formats: Raster, Vector.
- If EXIF information is embedded in the image file, the EXIF orientation will be taken into account and thus the image will be rotated accordingly, except if the flags IMREAD_IGNORE_ORIENTATION or IMREAD_UNCHANGED are passed.
- Use the IMREAD_UNCHANGED flag to keep the floating point values from a PFM image.
- By default the number of pixels must be less than 2^30. The limit can be set using the system variable OPENCV_IO_MAX_IMAGE_PIXELS.

Parameters:

- filename: Name of file to be loaded.
- flags: Flag that can take values of cv::ImreadModes.

def imshow(winname, mat)

imshow(winname, mat) -> None

Displays an image in the specified window.

The function imshow displays an image in the specified window. If the window was created with the cv::WINDOW_AUTOSIZE flag, the image is shown at its original size, although it is still limited by the screen resolution. Otherwise, the image is scaled to fit the window. The function may scale the image, depending on its depth:

- If the image is 8-bit unsigned, it is displayed as is.
- If the image is 16-bit unsigned or 32-bit integer, the pixels are divided by 256. That is, the value range [0,255*256] is mapped to [0,255].
- If the image is 32-bit or 64-bit floating-point, the pixel values are multiplied by 255. That is, the value range [0,1] is mapped to [0,255].

If the window was created with OpenGL support, cv::imshow also supports ogl::Buffer, ogl::Texture2D and cuda::GpuMat as input.

If the window was not created before this function is called, a window is created with cv::WINDOW_AUTOSIZE.

If you need to show an image that is bigger than the screen resolution, you will need to call namedWindow("", WINDOW_NORMAL) before imshow.

Notes:

- This function should be followed by a call to cv::waitKey, which displays the image for the specified number of milliseconds. Otherwise, the image will not be displayed. For example, waitKey(0) will display the window indefinitely until any keypress (suitable for image display). waitKey(25) will display a frame for 25 ms, after which the display will be automatically closed. (If you put it in a loop to read videos, it will display the video frame by frame.)
- [Windows Backend Only] Pressing Ctrl+C will copy the image to the clipboard.
- [Windows Backend Only] Pressing Ctrl+S will show a dialog to save the image.

Parameters:

- winname: Name of the window.
- mat: Image to be shown.
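The depth-dependent scaling listed for imshow can be approximated with NumPy alone, which makes it easy to predict how an array will look before opening a window. The function to_display_range below is a hypothetical illustration of the documented mapping, not OpenCV's own code; clipping to [0, 255] and truncating to integers are assumptions made for the sketch.

```python
import numpy as np


def to_display_range(img):
    """Approximate the documented imshow depth scaling, returning uint8.

    Hypothetical helper: rounding and saturation are assumptions and may
    differ from what OpenCV does internally.
    """
    if img.dtype == np.uint8:
        return img                      # 8-bit unsigned: shown as is
    if img.dtype in (np.uint16, np.int32):
        scaled = img / 256              # [0, 255*256] maps to [0, 255]
    elif img.dtype in (np.float32, np.float64):
        scaled = img * 255              # [0, 1] maps to [0, 255]
    else:
        raise TypeError(f"depth not covered by this sketch: {img.dtype}")
    return np.clip(scaled, 0, 255).astype(np.uint8)


sixteen_bit = np.array([[0, 256, 65535]], dtype=np.uint16)
unit_float = np.array([[0.0, 0.5, 1.0]], dtype=np.float32)

preview_16 = to_display_range(sixteen_bit)
preview_float = to_display_range(unit_float)
```

A practical consequence is that a floating-point image with values in [0, 255] rather than [0, 1] will appear almost entirely white in imshow, so it is usually rescaled or converted to 8-bit first. When actually displaying a result, the imshow call still needs to be followed by waitKey, as noted above.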

def imwrite(filename, img, params)

imwrite(filename, img[, params]) -> retval

Saves an image to a specified file.

The function imwrite saves the image to the specified file. The image format is chosen based on the filename extension (see cv::imread for the list of extensions). In general, only 8-bit single-channel or 3-channel (with 'BGR' channel order) images can be saved using this function, with these exceptions:

- 16-bit unsigned (CV_16U) images can be saved in the case of PNG, JPEG 2000, and TIFF formats.
- 32-bit float (CV_32F) images can be saved in PFM, TIFF, OpenEXR, and Radiance HDR formats; 3-channel (CV_32FC3) TIFF images will be saved using the LogLuv high dynamic range encoding (4 bytes per pixel).
- PNG images with an alpha channel can be saved using this function. To do this, create an 8-bit (or 16-bit) 4-channel BGRA image, where the alpha channel goes last. Fully transparent pixels should have alpha set to 0, fully opaque pixels should have alpha set to 255/65535 (see the sample referenced below).
- Multiple images (vector of Mat) can be saved in TIFF format (see the sample referenced below).

If the format, depth or channel order is different, use Mat::convertTo and cv::cvtColor to convert it before saving. Or, use the universal FileStorage I/O functions to save the image to XML or YAML format.

The OpenCV sample snippets/imgcodecs_imwrite.cpp shows how to create a BGRA image, set custom compression parameters, and save it to a PNG file. It also demonstrates how to save multiple images in a TIFF file.

Parameters:

- filename: Name of the file.
- img: (Mat or vector of Mat) Image or images to be saved.
- params: Format-specific parameters encoded as pairs (paramId_1, paramValue_1, paramId_2, paramValue_2, …); see cv::ImwriteFlags.