YOLO-G

 

Problem

Geospatial and aerial imagery pose unique challenges for computer vision. This kind of data can break many of the conventional architectures and training techniques used in machine learning computer vision (ML-CV). In Ampsight’s experience, existing ML-CV architectures generally fail to perform satisfactorily on this type of imagery, especially on unseen data.

To improve the performance and generalization of ML-CV algorithms in this context, we introduce a new ML-CV architecture based on the YOLO series of single-shot object detection models, which we call YOLO-G (YOLO Geospatial). YOLO-G builds on research by Zheng Ge et al. on another YOLO architecture called YOLOX.

Solution

The following sections highlight some of YOLO-G’s advances.

Constant GSD Chipping

One of the major problems with geospatial imagery is the inconsistency in the size of objects due to the substantial variance of resolution in this type of imagery. For example, a car may be only a few dozen pixels in one image, and tens of thousands in another.
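
To make the size difference concrete, the short sketch below estimates how many pixels a rectangular object covers at a given GSD. The car dimensions (4.5 m by 1.8 m) and the 0.02 m GSD are assumed values chosen for illustration; 0.60 m and 0.076 m are the GSDs of the example imagery in Figure 1.

```python
def footprint_pixels(length_m: float, width_m: float, gsd_m: float) -> float:
    """Approximate number of pixels covered by a rectangular object.

    gsd_m is the ground sample distance: meters of ground per pixel side.
    """
    if gsd_m <= 0:
        raise ValueError("gsd_m must be positive")
    return (length_m / gsd_m) * (width_m / gsd_m)


# Assumed car footprint, for illustration only.
CAR_LENGTH_M, CAR_WIDTH_M = 4.5, 1.8

for gsd in (0.60, 0.076, 0.02):
    pixels = footprint_pixels(CAR_LENGTH_M, CAR_WIDTH_M, gsd)
    print(f"GSD {gsd} m/px: about {pixels:.0f} pixels")
```

Under these assumptions the same car covers roughly two dozen pixels at 0.60 m and more than twenty thousand at 0.02 m.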

To improve the performance of models on imagery of varying resolutions, we modify the chipping process, which takes in large images and breaks them into smaller parts, so that it accounts for the Ground Sample Distance (GSD) of the imagery being chipped. This ensures that objects have a consistent size when they are seen by the model, even if their sharpness may change.
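
One possible way to sketch GSD-aware chipping is shown below. It is a minimal illustration, not YOLO-G's actual implementation: the function names, the target GSD of 0.3 m, the 64-pixel chip size, and the nearest-neighbor resampling are all assumptions. The idea is to cut windows that cover a fixed ground extent and then resample each window to a fixed pixel size.

```python
import numpy as np


def source_window_px(chip_px: int, source_gsd: float, target_gsd: float) -> int:
    """Side length (in source pixels) of a window covering the same ground
    extent as a chip of chip_px pixels at target_gsd."""
    return max(1, round(chip_px * target_gsd / source_gsd))


def resize_nearest(window: np.ndarray, out_px: int) -> np.ndarray:
    """Nearest-neighbor resize of a square window to out_px x out_px."""
    rows = np.arange(out_px) * window.shape[0] // out_px
    cols = np.arange(out_px) * window.shape[1] // out_px
    return window[rows][:, cols]


def constant_gsd_chips(image, source_gsd, target_gsd=0.3, chip_px=64):
    """Yield ((top, left), chip) pairs where every chip has the same pixel
    size and the same ground extent. Partial windows at the image edges
    are skipped in this sketch."""
    win = source_window_px(chip_px, source_gsd, target_gsd)
    height, width = image.shape[:2]
    for top in range(0, height - win + 1, win):
        for left in range(0, width - win + 1, win):
            window = image[top:top + win, left:left + win]
            yield (top, left), resize_nearest(window, chip_px)


# Usage: a sharp image (0.15 m GSD) is cut into larger source windows
# than a coarse one (0.60 m GSD), but all chips come out 64 x 64.
image = np.zeros((256, 256, 3), dtype=np.uint8)
sharp_chips = list(constant_gsd_chips(image, source_gsd=0.15))
coarse_chips = list(constant_gsd_chips(image, source_gsd=0.60))
```

With this approach, high-resolution imagery is downsampled and low-resolution imagery is upsampled, which is why object size stays consistent while sharpness can vary.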

Figure 1: Aerial imagery (left) with a GSD of 0.076 meters. Aerial imagery (right) with a GSD of 0.60 meters.

Class Masking

As the number of classes a model is expected to detect grows, so does the cross-class labeling burden. In traditional imagery, the objects of interest are often the only objects in a given image. In overhead imagery, this is not the case at all: many different objects often co-occur in each image. This becomes a problem if each image requires annotations for every single class the model is tasked with detecting.

To resolve this issue, we introduced modified learning and evaluation objectives that take the contents of the imagery into account and prevent the model from being trained to erroneously treat unlabeled objects as background. For example, in the cases on the right, only semi-trucks are labeled, yet they co-occur with unlabeled cars. Conventionally, those cars would be used as background data, effectively causing the model to incorrectly learn that some cars are not cars at all, and potentially hurting performance.
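
As a rough illustration of the idea, the sketch below masks a per-class binary cross-entropy loss so that classes not annotated in an image contribute nothing to training. This is a generic technique shown under stated assumptions, not necessarily the objective YOLO-G uses; the function name `masked_bce` and the `labeled_classes` vector are hypothetical.

```python
import numpy as np


def masked_bce(probs, targets, labeled_classes):
    """Binary cross-entropy averaged only over classes annotated in the image.

    probs:           (N, C) predicted class probabilities for N candidate boxes
    targets:         (N, C) 0/1 targets built from the available labels
    labeled_classes: (C,)   1 where the class was annotated in this image, else 0
    """
    probs = np.clip(probs, 1e-7, 1 - 1e-7)
    bce = -(targets * np.log(probs) + (1 - targets) * np.log(1 - probs))
    mask = np.broadcast_to(labeled_classes, bce.shape)
    return float((bce * mask).sum() / max(mask.sum(), 1))


# Classes: [car, semi-truck]. Only semi-trucks were labeled in this image,
# and the model sees an unlabeled car with high confidence.
probs = np.array([[0.9, 0.1]])
targets = np.array([[0.0, 0.0]])

loss_masked = masked_bce(probs, targets, np.array([0, 1]))    # car column ignored
loss_unmasked = masked_bce(probs, targets, np.array([1, 1]))  # car penalized as background
```

In this example the unmasked loss penalizes the confident car prediction, while the masked loss does not, which is the behavior the paragraph above describes.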

Figure 2: On the left, an example of in-class label incompleteness, where not all instances of an object (cars in this case) are labeled. In the middle and on the right are examples of cross-class label incompleteness, where objects of one class are labeled but objects of other classes are not.

Outlier Modeling

At Ampsight, we have extensive experience running models at large scale. One of the most common issues we have encountered is that models can perform well on training data and achieve compelling metrics on test sets, yet still struggle with the complex and unpredictable backgrounds they are exposed to at scale. This can result in thousands or tens of thousands of false positives, regardless of the model's F1 score on test data.
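
A back-of-the-envelope calculation shows how a small per-chip error rate adds up over a large area. All numbers below are hypothetical inputs chosen for illustration, not measurements from any Ampsight model.

```python
def expected_false_positives(area_km2, gsd_m, chip_px, fp_per_chip):
    """Expected false positives when tiling an area with square chips.

    area_km2:    ground area processed, in square kilometers
    gsd_m:       ground sample distance, in meters per pixel
    chip_px:     chip side length, in pixels
    fp_per_chip: average false positives per chip (assumed constant)
    """
    chip_side_m = chip_px * gsd_m
    chips = area_km2 * 1_000_000 / (chip_side_m ** 2)
    return chips * fp_per_chip


# Hypothetical run: 1,000 km^2 at 0.3 m GSD with 640-pixel chips and
# one false positive for every two chips on average.
total = expected_false_positives(area_km2=1000, gsd_m=0.3, chip_px=640, fp_per_chip=0.5)
print(f"Expected false positives: about {total:,.0f}")
```

Even at this modest assumed rate, the total lands above ten thousand false positives, because the number of chips grows with the area covered rather than with the size of the test set.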

Instead of requiring a prohibitive number of negative samples, YOLO-G internally and automatically extends background detection by modeling outliers as it trains, without the need for any additional data. This allows it to produce significantly more robust detections with far fewer false positives.

[Graph: false positives vs. approximate F1 score]

Figure 3: On the left, a YOLOX model produces many false positives, while on the right a YOLO-G model trained on the same data does not.


Conclusion

Our initial evaluations of YOLO-G and YOLOX models, trained on the same dataset and tested on unseen data from a different area with a different GSD, have shown compelling improvements in performance.

 

 

Figure 4: Evaluations for YOLO-G and YOLOX models trained on the same data against unseen data of a different location and GSD.

 
 