University of Sussex
Scene segmentation using similarity, motion and depth based cues

posted on 2023-06-07, 15:33 authored by Bhargav Kumar Mitra
Segmentation of complex scenes to aid surveillance is still considered an open research problem. In this thesis a computational model (CM) has been developed to classify a scene into foreground, moving-shadow and background regions. It is demonstrated how the CM, with the optional use of a channel ratio test, can be applied to demarcate foreground shadow regions in indoor scenes illuminated by a fixed incandescent light source. A combined approach, in which the CM works in tandem with a traditional motion-cue-based segmentation method, has also been constructed. In the combined approach, the CM is applied to segregate the foreground shaded regions in the current frame based on a binary mask generated using a standard background subtraction process (BSP). Various popular outlier detection strategies have been investigated to assess their suitability for automatically generating the threshold required to derive a binary mask from a difference frame, the outcome of the BSP. To evaluate the full scope of the pixel-labeling capabilities of the CM and to estimate the associated time constraints, the model is deployed for foreground scene segmentation in recorded real-life video streams. The observations made confirm that the model performs satisfactorily in most cases.
In the second part of the thesis, depth-based cues are exploited to perform foreground scene segmentation. An active, structured-light-based depth-estimating arrangement has been modeled; an active system was chosen over a passive stereo-vision one to alleviate some of the difficulties associated with the classical correspondence problem. The model developed not only facilitates use of the set-up but also makes possible a method to increase the working volume of the system without explicitly encoding the projected structured pattern. Finally, it is explained how scene segmentation can be accomplished based solely on the structured pattern disparity information, without generating explicit depth maps. To de-noise the difference frames generated using the developed method, two median filtering schemes have been implemented. One of the schemes is advocated for practical use and is described in terms of discrete morphological operators, thus facilitating hardware realisation of the method to speed up the de-noising process.
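As a rough illustration of the background subtraction step described in the abstract, the following Python sketch builds a difference frame, derives a threshold automatically with one common outlier rule, and de-noises the resulting binary mask with a 3x3 median filter. It is not the thesis's implementation: the abstract does not say which outlier detection strategy or median filtering scheme was adopted, so the median-plus-scaled-MAD rule, the majority-vote filter, the parameter `k`, and all function names here are assumptions chosen for illustration only.

```python
import numpy as np


def difference_frame(frame, background):
    """Absolute per-pixel difference between a grayscale frame and a background image."""
    # Cast to float first so that unsigned integer frames do not wrap around.
    return np.abs(frame.astype(np.float64) - background.astype(np.float64))


def mad_threshold(diff, k=3.0):
    """Automatic threshold: median + k * scaled MAD (an assumed outlier rule)."""
    med = np.median(diff)
    mad = np.median(np.abs(diff - med))
    # 1.4826 scales the MAD to match the standard deviation of Gaussian data.
    return med + k * 1.4826 * mad


def binary_median3(mask):
    """3x3 median filter on a boolean mask.

    For binary data the median of nine values is a majority vote,
    so a pixel stays set only if at least 5 of its 3x3 neighbours are set.
    """
    padded = np.pad(mask.astype(np.uint8), 1, mode="edge")
    h, w = mask.shape
    votes = np.zeros((h, w), dtype=np.uint8)
    for dy in range(3):
        for dx in range(3):
            votes += padded[dy:dy + h, dx:dx + w]
    return votes >= 5


def foreground_mask(frame, background, k=3.0):
    """Difference frame -> automatic threshold -> de-noised binary mask."""
    diff = difference_frame(frame, background)
    return binary_median3(diff > mad_threshold(diff, k))


# Synthetic example: flat background, one 6x6 object, one isolated noisy pixel.
background = np.full((20, 20), 10, dtype=np.uint8)
frame = background.copy()
frame[5:11, 5:11] = 110   # object region
frame[16, 16] = 110       # single-pixel noise
mask = foreground_mask(frame, background)
```

In this synthetic case the isolated pixel is voted out by the median filter while the interior of the object survives. A median over a binary neighbourhood is a rank-order (majority) operation, which is one reason such filters lend themselves to formulations built from simple logical or morphological primitives; the specific formulation used in the thesis is not reproduced here.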


File Version

  • Published version



Department affiliated with

  • Engineering and Design Theses

Qualification level

  • doctoral

Qualification name

  • dphil


  • eng


University of Sussex

Full text available

  • Yes
