‎"Behind every stack of books there is a flood of knowledge."

RGB-D Object Dataset



The RGB-D Object Dataset is a large dataset of 300 common household objects. The objects are organized into 51 categories arranged using WordNet hypernym-hyponym relationships (similar to ImageNet). The dataset was recorded using a Kinect-style 3D camera that records synchronized and aligned 640×480 RGB and depth images at 30 Hz. Each object was placed on a turntable, and video sequences were captured for one whole rotation. For each object, there are 3 video sequences, each recorded with the camera mounted at a different height so that the object is viewed from different angles relative to the horizon.
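Because the RGB and depth images are aligned, each depth pixel can be turned into a 3D point with a standard pinhole camera model. The following is a minimal sketch of that back-projection, not code from the dataset itself. The intrinsics (FX, FY, CX, CY) and the depth scale are assumptions typical of Kinect-style cameras at 640×480; check the dataset's own documentation for the real values.

```python
from typing import Tuple

# Assumed intrinsics for a 640x480 Kinect-style camera.
# These values are illustrative and NOT taken from the dataset documentation.
FX = 525.0   # focal length along x, in pixels (assumption)
FY = 525.0   # focal length along y, in pixels (assumption)
CX = 319.5   # principal point x, in pixels (assumption)
CY = 239.5   # principal point y, in pixels (assumption)
DEPTH_SCALE = 1000.0  # raw depth units per meter, i.e. millimeters (assumption)


def backproject(u: int, v: int, raw_depth: int) -> Tuple[float, float, float]:
    """Map pixel (u, v) and its raw depth value to a 3D point in meters.

    The point is expressed in the camera frame: x to the right,
    y downward, z along the viewing direction.
    """
    if raw_depth <= 0:
        # Depth sensors commonly report 0 where no measurement is available.
        raise ValueError("no depth measurement at this pixel")
    z = raw_depth / DEPTH_SCALE
    x = (u - CX) * z / FX
    y = (v - CY) * z / FY
    return (x, y, z)
```

Applying `backproject` to every valid pixel of a depth frame, and attaching the color of the same pixel in the aligned RGB frame, would give a colored point cloud for that frame.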

Unlike many existing datasets, such as Caltech 101 and ImageNet, objects in this dataset are organized into both categories and instances. In those datasets, the class dog contains images of many different dogs, and there is no way to tell whether two images show the same dog. In the RGB-D Object Dataset, by contrast, the category soda can is divided into physically unique instances such as Pepsi Can and Mountain Dew Can. The dataset also provides ground truth pose information for all 300 objects.
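The two-level labeling can be pictured as a pair of names attached to every image. The sketch below is illustrative only: the `ObjectLabel` class and the label strings are hypothetical and do not reflect the dataset's actual file or naming scheme.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ObjectLabel:
    """A two-level label: the category and the physical instance within it."""
    category: str  # e.g. "soda_can" (hypothetical spelling)
    instance: str  # e.g. "pepsi_can" (hypothetical spelling)


def same_category(a: ObjectLabel, b: ObjectLabel) -> bool:
    """Category-level match: both images show the same kind of object."""
    return a.category == b.category


def same_instance(a: ObjectLabel, b: ObjectLabel) -> bool:
    """Instance-level match: both images show the same physical object."""
    return a == b  # dataclass equality compares category and instance


pepsi = ObjectLabel("soda_can", "pepsi_can")
dew = ObjectLabel("soda_can", "mountain_dew_can")
```

With these labels, `same_category(pepsi, dew)` holds while `same_instance(pepsi, dew)` does not, which is the distinction a category-only dataset cannot express.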

Here are some example objects that have been segmented from the background.

RGB-D Scenes Dataset

Aside from isolated views of the 300 objects, the RGB-D Object Dataset also includes 8 annotated video sequences of natural scenes containing objects from the dataset. The scenes cover common indoor environments, including office workspaces, meeting rooms, and kitchen areas. The objects are visible from different viewpoints and distances and may be partially or completely occluded in some frames.


Object Labeling in 3D Scenes

In this video we demonstrate a view-based approach for labeling objects in 3D scenes reconstructed from RGB-D (Kinect) videos. The top row shows the original RGB and depth video frames, with high-scoring bounding box object detections plotted on the RGB image. The 3D scene labeling is shown at the bottom, with objects color-coded by category (bowl=red, cap=green, cereal=blue, mug=yellow, soda=cyan).
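To make the idea of combining per-view detections into one 3D labeling concrete, here is a deliberately simplified sketch. It averages detection scores per 3D point across views and picks the best category. This averaging is a stand-in chosen for illustration and is not the method of the paper referenced below. The function names, the `min_score` threshold, and the gray background color are assumptions; only the category colors come from the description above.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable, Tuple

# Category colors as listed in the video description (RGB, 0-255).
CATEGORY_COLORS: Dict[str, Tuple[int, int, int]] = {
    "bowl": (255, 0, 0),      # red
    "cap": (0, 255, 0),       # green
    "cereal": (0, 0, 255),    # blue
    "mug": (255, 255, 0),     # yellow
    "soda": (0, 255, 255),    # cyan
}
BACKGROUND_COLOR = (128, 128, 128)  # assumed color for unlabeled points


def label_points(
    observations: Iterable[Tuple[Hashable, str, float]],
    min_score: float = 0.5,  # assumed threshold, for illustration only
) -> Dict[Hashable, str]:
    """Assign one category per 3D point from per-view detection scores.

    Each observation is (point_id, category, score): in one view, a
    detection of `category` with confidence `score` covered that point.
    A point receives the category with the highest mean score, or
    "background" if that mean is below `min_score`.
    """
    totals = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(lambda: defaultdict(int))
    for point_id, category, score in observations:
        totals[point_id][category] += score
        counts[point_id][category] += 1

    labels: Dict[Hashable, str] = {}
    for point_id, category_totals in totals.items():
        means = {
            category: category_totals[category] / counts[point_id][category]
            for category in category_totals
        }
        best = max(means, key=means.get)
        labels[point_id] = best if means[best] >= min_score else "background"
    return labels


def color_for(label: str) -> Tuple[int, int, int]:
    """Look up the display color for a label, defaulting to background."""
    return CATEGORY_COLORS.get(label, BACKGROUND_COLOR)
```

A point seen as a mug in two views with scores 0.9 and 0.7, and as a bowl in one view with score 0.4, would be labeled mug and drawn in yellow.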

For technical details and more results, see the paper Detection-based Object Labeling in 3D Scenes.

OASIS (Object-Aware Situated Interactive System)

OASIS is a software architecture for prototyping applications that combine RGB-D cameras, computer vision algorithms for recognizing and tracking objects and gestures, and interactive projection. Object recognition is an important component of OASIS. The system recognizes objects that are placed within the interactive projection area so that the appropriate animations and augmented reality scenarios can be created. Our approach uses both depth and color information from the RGB-D camera to recognize different objects. The system can be trained on novel objects on the fly and will recognize them in the future.

One example application of OASIS is the following interactive LEGO playing scenario that was shown at the Consumer Electronics Show (CES) 2011.




