In the video below you can see the sequence of a car driving in a city scene and braking. The layers I rendered out for groundtruth data are the rendered image with the boundingbox of the car (top left), the emission layer ( shows the brakelights when they start to emit light, top right ), the optical flow (lower left), and the depth of each pixel in the world scene ( lower right).
Render-time was about 10h on a Nvidia GeForce GTX 680, tilesize 256×256, total image-size: 960×720.
Setting up groundtruth rendering, saving and reading it again
After you have composed your scene, switch to the Cycles rendering engine, and enable the render-passes in the properties-view (under render-layers) you plan on saving as ground-truth information. In this case, I selected the Z-layer, which represents the depth of the scene:
Now open the Node-editor-view, display the Compositing-Nodetree and put a checkmark at „Use Nodes“, it should look like this:
Hit „T“, and in the „output“-tab, select „File Output“. Connect it to the Z-output of the render-layers-node. Hit „N“ while the File-Output-node is selected, enter an appropriate File-subpath as name for the data (I chose „Depth“ here), and depending on your usecase, change the output format. As I am expecting float-values that are not constrained to 0-255, I will save the data in the OpenEXR-format:
After rendering ([F12]), we have a file called /tmp/Depth0001.exr. It can be read and displayed using python:
#!/usr/bin/python ''' author: Tobias Weis ''' import OpenEXR import Imath import array import numpy as np import csv import time import datetime import h5py import matplotlib.pyplot as plt def exr2numpy(exr, maxvalue=1.,normalize=True): """ converts 1-channel exr-data to 2D numpy arrays """ file = OpenEXR.InputFile(exr) # Compute the size dw = file.header()['dataWindow'] sz = (dw.max.x - dw.min.x + 1, dw.max.y - dw.min.y + 1) # Read the three color channels as 32-bit floats FLOAT = Imath.PixelType(Imath.PixelType.FLOAT) (R) = [array.array('f', file.channel(Chan, FLOAT)).tolist() for Chan in ("R") ] # create numpy 2D-array img = np.zeros((sz,sz,3), np.float64) # normalize data = np.array(R) data[data > maxvalue] = maxvalue if normalize: data /= np.max(data) img = np.array(data).reshape(img.shape,-1) return img depth_data = exr2numpy("Depth0001.exr", maxvalue=15, normalize=False) fig = plt.figure() plt.imshow(depth_data) plt.colorbar() plt.show()
Further layers to generate groundtruth data
Besides depth information, we can also generate groundtruth data for other modalities:
- Object/Material index: Can be used to generate pixel-accurate annotations of objects for semantic segmentation tasks, or simply computing bounding boxes for IOU measures
- Normal: Can be used to generate pixel-accurate groundtruth data of surface normals
- Vector: Will show up as „Speed“ in the node editor, contains information for optical flow between two frames in a sequence
- Emit: Pixel-accurate information of emssion-properties.