Exporting camera values for use in MVSNet

I’ve been struggling to adapt camera extrinsics and intrinsics for use in MVSNet.

For extrinsics, I export E = [R|t] (via Colmap) as a 3x4 matrix composed of a 3x3 rotation matrix and a 3x1 translation vector, as shown below:

r1,1 r1,2 r1,3 | t1
r2,1 r2,2 r2,3 | t2
r3,1 r3,2 r3,3 | t3

For example, here’s the working ‘00000036_cam.txt’ file:

extrinsic
-0.304163 -0.87333 0.380498 -236.771
0.244298 0.314555 0.917264 -567.94
-0.920762 0.371953 0.117677 583.523
0.0 0.0 0.0 1.0

intrinsic
2892.33 0 823.205
0 2883.18 619.072
0 0 1

425 2.5

When run, it yields the resulting depth map:

[image: 36-ref]

My output for the same source image, using the code snippet shown below, is:

extrinsic
0.998263 0.00300635 0.0588315 -0.453214
0.00892024 0.979466 -0.201412 -0.553871
-0.058229 0.201587 0.977738 1.11213
0.0 0.0 0.0 1.0

intrinsic
2889.61 0 800
0 2889.61 600
0 0 1

425 1.0

The depth map that results is shown here:

[image: 36 2 5]

Looking at the intrinsics, you will immediately note that I’m using a single focal length value and that I haven’t tuned the principal point; however, that seems not to be a problem as I’m able to compute results for your provided data by substituting my intrinsics values.

I thought perhaps the data was fine but I needed to adjust the DEPTH_MIN and DEPTH_INTERVAL values in order to frame the depth values, but changing those values yields highly similar results.

Therefore, the problem seems to be my construction of the extrinsics matrix. Any pointers would be very welcome.
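One quick sanity check on an exported extrinsic matrix is to recover the camera center and look at its distance from the world origin. The sketch below assumes [R|t] is a world-to-camera pose, so the center is C = -R^T t; the names `Mat3`, `Vec3`, `CameraCenter` and `Norm` are illustrative only and are not part of Colmap or MVSNet.

```cpp
#include <array>
#include <cmath>

using Mat3 = std::array<std::array<double, 3>, 3>;
using Vec3 = std::array<double, 3>;

// Camera center in world coordinates for a world-to-camera pose [R|t]:
// C = -R^T * t. (Assumption: R is a proper rotation matrix.)
Vec3 CameraCenter(const Mat3& R, const Vec3& t) {
  Vec3 c{{0.0, 0.0, 0.0}};
  for (int i = 0; i < 3; ++i) {
    for (int j = 0; j < 3; ++j) {
      c[i] -= R[j][i] * t[j];  // R^T(i, j) == R(j, i)
    }
  }
  return c;
}

// Euclidean length of a 3-vector.
double Norm(const Vec3& v) {
  return std::sqrt(v[0] * v[0] + v[1] * v[1] + v[2] * v[2]);
}
```

Because R is orthonormal, the length of C equals the length of t. For the two files above, that length is roughly 848 for the working camera and roughly 1.3 for my export, so the two reconstructions appear to use very different world units. That is only an observation from the numbers, not a confirmed diagnosis, but it suggests checking whether a DEPTH_MIN of 425 makes sense at the scale of my scene.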

Could you share the specification or code you use to export the camera values prior to MVSNet reconstruction?

My C++ output code using the Colmap library is below:

    file << "extrinsic" << std::endl;

    Eigen::Matrix3d R;
    R = image.second.RotationMatrix();
    
    // Write camera rotation matrix and translation vector
    file << R(0,0) << " " << R(0,1) << " " << R(0,2) << " " << image.second.Tvec(0) << std::endl;
    file << R(1,0) << " " << R(1,1) << " " << R(1,2) << " " << image.second.Tvec(1) << std::endl;
    file << R(2,0) << " " << R(2,1) << " " << R(2,2) << " " << image.second.Tvec(2) << std::endl;
    file << "0.0 0.0 0.0 1.0" << std::endl;

    // Write camera intrinsics
    file << std::endl;
    file << "intrinsic" << std::endl;

    // Reference to current image's camera
    auto& camera = cameras_.at(image.second.CameraId());
    
    // Note hard-coded zero value for skew
    file << camera.FocalLength() << " 0 " << camera.PrincipalPointX() << std::endl;
    file << "0 " << camera.FocalLength() << " " << camera.PrincipalPointY() << std::endl;
    file << "0 0 1" << std::endl;
    file << std::endl;

KevinCain commented, Nov 2, 2018

Hello, @BrianPugh,

I create the ‘pairs.txt’ input by exporting visibility information from my own system, which essentially does the following:

Iterate through the image set
If a 2D point in the current image has a 3D point, get the images in the track
If the track image ID is anything other than our current image, add it to a list of visible images
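The three steps above can be sketched roughly as follows. This is not my actual export code: `Image`, `kNoPoint3D` and `ComputeVisibility` are hypothetical, simplified stand-ins for a sparse reconstruction, and I use a `std::set` rather than a plain list so each visible image is recorded only once.

```cpp
#include <cstdint>
#include <map>
#include <set>
#include <vector>

// Hypothetical, simplified stand-ins for a sparse reconstruction.
using ImageId = std::uint32_t;
using Point3DId = std::int64_t;
constexpr Point3DId kNoPoint3D = -1;

struct Image {
  // One entry per 2D point: the 3D point it observes, or kNoPoint3D.
  std::vector<Point3DId> point3d_ids;
};

// For every image, collect the IDs of the other images that appear in the
// track of at least one 3D point seen by that image.
std::map<ImageId, std::set<ImageId>> ComputeVisibility(
    const std::map<ImageId, Image>& images,
    const std::map<Point3DId, std::vector<ImageId>>& tracks) {
  std::map<ImageId, std::set<ImageId>> visible;
  for (const auto& entry : images) {
    const ImageId image_id = entry.first;
    std::set<ImageId>& neighbors = visible[image_id];
    for (const Point3DId point_id : entry.second.point3d_ids) {
      if (point_id == kNoPoint3D) continue;  // 2D point without a 3D point
      const auto track = tracks.find(point_id);
      if (track == tracks.end()) continue;
      for (const ImageId other_id : track->second) {
        if (other_id != image_id) neighbors.insert(other_id);
      }
    }
  }
  return visible;
}
```

The resulting map has one entry per image, including images with no visible neighbors, which makes it straightforward to write one line per image afterwards.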

Since it sounds like you’d like to use the Colmap CLI, you can get this visibility information by exporting a sparse reconstruction in PMVS format, which writes a text file listing the images, each with its own visibility list:

pmvs/vis.dat

The CLI call should look like this:

$ colmap image_undistorter \
    --image_path $DATASET_PATH/images \
    --input_path $DATASET_PATH/sparse/0 \
    --output_path $DATASET_PATH/dense \
    --output_type PMVS \
    --max_image_size 2000
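Once the export has run, the visibility file can be read back with a few lines of C++. The sketch below assumes the usual PMVS layout for vis.dat: a `VISDATA` header, the number of images, then one record per image holding its index, the count of visible images, and their indices. `ReadVisData` is a hypothetical helper, not a Colmap function, and the indices refer to the exported image order rather than Colmap image IDs as far as I can tell.

```cpp
#include <istream>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Parses PMVS-style visibility data with the assumed layout:
//   VISDATA
//   <number of images>
//   <image index> <number of visible images> <visible index> ...
// Returns false if the header is missing or the stream ends early.
bool ReadVisData(std::istream& in,
                 std::map<int, std::vector<int>>* visibility) {
  std::string header;
  int num_images = 0;
  if (!(in >> header) || header != "VISDATA" || !(in >> num_images)) {
    return false;
  }
  for (int i = 0; i < num_images; ++i) {
    int image_idx = 0;
    int num_visible = 0;
    if (!(in >> image_idx >> num_visible)) return false;
    std::vector<int>& neighbors = (*visibility)[image_idx];
    neighbors.clear();
    for (int j = 0; j < num_visible; ++j) {
      int other_idx = 0;
      if (!(in >> other_idx)) return false;
      neighbors.push_back(other_idx);
    }
  }
  return true;
}
```

To use it on the real file, open `pmvs/vis.dat` under the output path with a `std::ifstream` and pass that stream in; a `std::istringstream` works the same way for quick experiments.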

KevinCain commented, Oct 24, 2018

Yes, of course, @BrianPugh, you can output the extrinsics and intrinsics based on my C++ code above in this thread; just add the following line to export DEPTH_MIN and DEPTH_INTERVAL.

file << depth_ranges[currentCamera].first << " " << ((depth_ranges[currentCamera].second - depth_ranges[currentCamera].first) / max_d) << std::endl;

As I noted above, I compute the global depth range per image as follows:

DEPTH_MIN = depth_range_min
DEPTH_INTERVAL = (depth_range_max - depth_range_min) / max_d

Above, depth_ranges is computed from Colmap’s ComputeDepthRanges(). That should be all you need!
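For anyone who wants the depth line in isolation, here is a minimal standalone sketch of the same arithmetic. It assumes `max_d` is the number of depth samples and takes the range as two plain doubles; `DepthSampling`, `ComputeDepthSampling` and `WriteDepthLine` are illustrative names, not Colmap or MVSNet functions.

```cpp
#include <ostream>
#include <sstream>
#include <stdexcept>

// DEPTH_MIN and DEPTH_INTERVAL as written on the last line of a cam file.
struct DepthSampling {
  double depth_min;
  double depth_interval;
};

// DEPTH_MIN = range_min
// DEPTH_INTERVAL = (range_max - range_min) / max_d
// Assumption: max_d is the number of depth samples and must be positive.
DepthSampling ComputeDepthSampling(double range_min, double range_max,
                                   int max_d) {
  if (max_d <= 0) {
    throw std::invalid_argument("max_d must be positive");
  }
  return DepthSampling{range_min, (range_max - range_min) / max_d};
}

// Writes "DEPTH_MIN DEPTH_INTERVAL" followed by a newline.
void WriteDepthLine(std::ostream& out, const DepthSampling& sampling) {
  out << sampling.depth_min << " " << sampling.depth_interval << "\n";
}
```

As an illustration with made-up inputs, a range of 425 to 905 with `max_d` set to 192 gives an interval of 2.5, which reproduces a line of the form `425 2.5`.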
