Abstract:
In this paper, we tackle semantic segmentation of nighttime images, which is as important for autonomous driving as its daytime counterpart but far more challenging due to poor illumination and the scarcity of annotated datasets. The task can be treated as an unsupervised domain adaptation (UDA) problem, i.e., using a labeled dataset captured in the daytime to guide network training while reducing the domain shift, so that the trained model generalizes well to the target domain of nighttime images. However, current general-purpose UDA approaches are insufficient to bridge the significant appearance difference between the day and night domains. To overcome this large domain gap, we propose a novel domain adaptation network, DANIA, for nighttime semantic image segmentation that leverages a labeled daytime dataset (the source domain) and an unlabeled dataset of coarsely aligned day-night image pairs (the target daytime and nighttime domains). These three domains are used to perform multi-target adaptation via adversarial training in the network. Specifically, for the unlabeled day-night image pairs, we use the pixel-level predictions of static object categories on a daytime image as pseudo supervision to segment its nighttime counterpart. We further include an image alignment step that estimates a flow to refine the pseudo supervision produced from daytime images, alleviating the inaccuracy caused by the misalignment between day-night image pairs. Finally, a re-weighting strategy is applied to further improve the predictions, especially boosting the prediction accuracy on small objects. The proposed DANIA is a one-stage adaptation framework for nighttime semantic segmentation that does not require training additional day-night image transfer models as a separate pre-processing stage.
Extensive experiments on the Dark Zurich and Nighttime Driving datasets show that DANIA achieves state-of-the-art performance for nighttime semantic segmentation.