Visual human tracking is often disturbed by occlusion, similar appearance among targets, and cluttered backgrounds. Using multiple cameras may partially resolve these problems, but it remains difficult to estimate the accurate 3D locations of the targets. The recent availability of RGB-D cameras, which provide depth information for visual tracking, improves this situation; however, tracking with a single RGB-D camera is still afflicted by occlusion. To overcome the occlusion problem and achieve robust people tracking, we study a collaborative tracking method using multiple RGB-D cameras. In this method, an automatic registration procedure is presented to construct the transformation model between the different cameras. The targets are represented by a fusion of visual and depth features and are tracked with an integrated particle filtering method in the multi-camera system. Experiments on real scenarios show that the proposed method is robust to occlusion and similar appearance. Moreover, the 3D locations of the targets are also available during the tracking procedure.
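The abstract describes the "integrated particle filtering method" only at a high level. As an illustrative sketch — not the authors' algorithm — the following shows one predict-update-resample cycle of a generic bootstrap particle filter driven by a single fused position measurement (such as one derived upstream from combined colour and depth cues). All function names, parameters, and noise models here are assumptions for illustration.

```python
import math
import random

def particle_filter_step(particles, weights, measurement,
                         motion_std=0.1, meas_std=0.2):
    """One predict-update-resample cycle of a bootstrap particle filter.

    particles:   list of (x, y) hypotheses of the target position.
    weights:     current normalized importance weights.
    measurement: fused (x, y) observation of the target, e.g. obtained
                 upstream from combined visual and depth cues.
    """
    # Predict: diffuse each particle with Gaussian motion noise.
    predicted = [(x + random.gauss(0, motion_std),
                  y + random.gauss(0, motion_std)) for x, y in particles]

    # Update: reweight by a Gaussian likelihood of the measurement.
    mx, my = measurement
    new_w = []
    for (x, y), w in zip(predicted, weights):
        d2 = (x - mx) ** 2 + (y - my) ** 2
        new_w.append(w * math.exp(-d2 / (2 * meas_std ** 2)))
    total = sum(new_w) or 1.0
    new_w = [w / total for w in new_w]

    # Resample (stratified) to counter weight degeneracy.
    n = len(predicted)
    cum, acc = [], 0.0
    for w in new_w:
        acc += w
        cum.append(acc)
    resampled, j = [], 0
    for i in range(n):
        p = (random.random() + i) / n  # one draw per stratum
        while j < n - 1 and cum[j] < p:
            j += 1
        resampled.append(predicted[j])
    return resampled, [1.0 / n] * n

def estimate(particles, weights):
    """Weighted mean of the particle set as the state estimate."""
    ex = sum(w * px for (px, _), w in zip(particles, weights))
    ey = sum(w * py for (_, py), w in zip(particles, weights))
    return ex, ey
```

In a multi-camera setting such as the one the abstract describes, one would run the update step once per registered camera view (mapping each observation through the inter-camera transformation model) before resampling; the sketch above shows only the single-observation case.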