I spent some time training several cascades to detect cars in ego-view automotive videos,
and will now document what I’ve learned.
I will use the existing OpenCV tools.
-> pos/ 1000 images containing the desired object (cars)
-> pos.info text file containing, for each positive image, the filename, the number of objects in the frame, and one bounding box per object in the format x, y, width, height
-> neg/ 2000 images that do not contain any cars
-> negs.txt text file containing the filenames of all negative images
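The two list files follow the format the OpenCV training tools expect. A minimal sketch of generating them (the filenames and box coordinates here are hypothetical placeholders, not my actual data):

```python
# Hypothetical ground-truth annotations: image filename -> list of
# (x, y, width, height) bounding boxes for the cars in that image.
annotations = {
    "pos/car_0001.png": [(140, 100, 45, 45)],
    "pos/car_0002.png": [(100, 200, 50, 50), (300, 220, 60, 60)],
}

# pos.info: one line per image -- filename, object count, then
# x y w h for each bounding box.
with open("pos.info", "w") as f:
    for filename, boxes in annotations.items():
        coords = " ".join(f"{x} {y} {w} {h}" for (x, y, w, h) in boxes)
        f.write(f"{filename} {len(boxes)} {coords}\n")

# negs.txt: simply one negative image filename per line.
negatives = ["neg/street_0001.png", "neg/street_0002.png"]
with open("negs.txt", "w") as f:
    f.write("\n".join(negatives) + "\n")
```

These files then feed the standard tools: `opencv_createsamples` reads `pos.info` (via `-info`) to build the positive sample `.vec` file, and `opencv_traincascade` reads `negs.txt` as its background file (via `-bg`).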
For the positive images I used tight bounding boxes. You do not actually need as many negative images as the number of negative samples you want to use later on: the training script samples patches from the negative images it is given, so the number of images can be smaller than the number of negative samples.
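To see why fewer negative images than negative samples suffice, consider how many window-sized patches a single image contains. A back-of-the-envelope sketch, where all sizes are illustrative assumptions rather than values from my setup:

```python
# Rough count of how many training patches one negative image can supply
# via a sliding window. Assumed sizes (not from the actual dataset):
# a 640x480 negative image, a 24x24 detection window, stride of 12 px.
img_w, img_h = 640, 480
win, stride = 24, 12

patches_x = (img_w - win) // stride + 1
patches_y = (img_h - win) // stride + 1
print(patches_x * patches_y)  # on the order of 2000 patches from one image
```

So a few hundred negative images can already supply far more patches than the `-numNeg` samples a typical training run consumes.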
For some of the positive images, the bounding boxes were annotated by hand (ground-truth data):