School of Automation and Electrical Engineering, Shenyang Ligong University
Multi-view stereo (MVS) is a fundamental task in computer vision that aims to recover the structure of a scene from images captured at multiple viewpoints. However, cost-volume aggregation suffers from severe local inconsistency, so directly aggregating the costs of geometrically adjacent pixels can be seriously misleading. Existing methods either seek an optimal selective aggregation in 2D space or add further aggregation operations, but neither effectively resolves the geometric inconsistency of the cost volume, which degrades the accuracy and robustness of depth estimation. To address this problem, a collaborative representation for multi-view stereo (CRMVS) is proposed, which coordinates multiple modules to integrate geometric consistency information and thereby improve the accuracy and robustness of depth estimation. First, an improved feature pyramid network (FPN) strengthens the network's feature extraction. Second, a progressive weight network (PWN) module is designed to construct the cost volume. Finally, a geometric cost aggregation and refinement (GCR) module is designed to aggregate the cost volume precisely. Experimental results show state-of-the-art performance on both the DTU and Tanks & Temples datasets.
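For illustration, below is a minimal PyTorch-style sketch of the coarse pipeline summarized above: reference and warped source features are fused into a cost volume with learned per-pixel weights, the volume is regularized by a small 3D network, and depth is regressed by a soft-argmin over the depth hypotheses. The module names (ProgressiveWeightNet, GeoCostRegNet) and all architectural details are hypothetical placeholders inferred from the abstract, not the authors' released implementation; the improved FPN backbone and the differentiable homography warping are assumed to have been applied beforehand.

```python
# Hypothetical sketch of a weighted cost-volume MVS stage, assuming feature
# extraction and homography warping to the reference view are already done.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ProgressiveWeightNet(nn.Module):
    """Predicts a per-pixel, per-depth weight for each source view so that
    unreliable matches contribute less to the fused cost volume."""
    def __init__(self, feat_ch: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv3d(feat_ch, 8, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv3d(8, 1, 3, padding=1), nn.Sigmoid(),
        )

    def forward(self, similarity):            # (B, C, D, H, W)
        return self.net(similarity)           # (B, 1, D, H, W), values in [0, 1]


class GeoCostRegNet(nn.Module):
    """Small encoder-decoder 3D regularizer that aggregates cost along the
    spatial and depth dimensions. D, H, W are assumed even so the stride-2
    encoder/decoder shapes match."""
    def __init__(self, in_ch: int):
        super().__init__()
        self.conv0 = nn.Conv3d(in_ch, 8, 3, padding=1)
        self.conv1 = nn.Conv3d(8, 16, 3, stride=2, padding=1)
        self.conv2 = nn.Conv3d(16, 16, 3, padding=1)
        self.up = nn.ConvTranspose3d(16, 8, 3, stride=2, padding=1,
                                     output_padding=1)
        self.prob = nn.Conv3d(8, 1, 3, padding=1)

    def forward(self, volume):                # (B, C, D, H, W)
        x0 = F.relu(self.conv0(volume))
        x1 = F.relu(self.conv2(F.relu(self.conv1(x0))))
        x = F.relu(self.up(x1) + x0)
        return self.prob(x).squeeze(1)        # (B, D, H, W) raw matching cost


def fuse_and_regress(ref_feat, warped_src_feats, depth_values,
                     weight_net, reg_net):
    """ref_feat: (B, C, H, W); warped_src_feats: list of (B, C, D, H, W)
    source features pre-warped to the reference view at each depth
    hypothesis; depth_values: (B, D) depth hypotheses."""
    B, C, H, W = ref_feat.shape
    D = depth_values.shape[1]
    ref_vol = ref_feat.unsqueeze(2).expand(B, C, D, H, W)

    fused, weight_sum = 0.0, 0.0
    for src_vol in warped_src_feats:
        sim = ref_vol * src_vol               # element-wise feature correlation
        w = weight_net(sim)                   # visibility-style weight per view
        fused = fused + w * sim
        weight_sum = weight_sum + w
    cost_volume = fused / (weight_sum + 1e-6)

    prob = F.softmax(-reg_net(cost_volume), dim=1)   # probability over depths
    depth = torch.sum(prob * depth_values.view(B, D, 1, 1), dim=1)
    return depth                              # (B, H, W) regressed depth map


if __name__ == "__main__":
    # Toy example: one reference view, three warped source views.
    B, C, D, H, W = 1, 8, 16, 32, 32
    ref = torch.randn(B, C, H, W)
    srcs = [torch.randn(B, C, D, H, W) for _ in range(3)]
    depths = torch.linspace(400.0, 900.0, D).repeat(B, 1)
    depth_map = fuse_and_regress(ref, srcs, depths,
                                 ProgressiveWeightNet(C), GeoCostRegNet(C))
    print(depth_map.shape)                    # torch.Size([1, 32, 32])
```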
Downloads: 44 | Citations: 0 | Reads: 24
Basic information:
CLC number: TP391.41
Citation:
Zhu Z, Liu Y, Xiao P, et al. Research on a multi-stage collaborative multi-view stereo algorithm[J]. Communication and Information Technology, 2025, No.274(02): 28-32.