15.3.1.13 Vision-Language Navigation

Navigation. Vision-Language.

Bajcsy, R., and Nagel, H.H.,
Descriptive and Prescriptive Languages for Mobility Tasks: Are They Different?,
AIU96(280-300). BibRef 9600

Zhu, M., Chen, W., Xia, J., Ma, Y., Zhang, Y., Luo, Y., Huang, Z., Liu, L.,
Location2Vec: A Situation-Aware Representation for Visual Exploration of Urban Locations,
ITS(20), No. 10, October 2019, pp. 3981-3990.
IEEE DOI 1910
Trajectory, Visualization, Sociology, Statistics, Vehicle dynamics, Mobile handsets, Natural language processing, Human mobility, visual exploration BibRef

Li, P.[Pei], Li, X.[Xinde], Li, X.H.[Xiang-Hui], Pan, H.[Hong], Khyam, M.O., Noor-A-Rahim, M., Ge, S.S.[Shuzhi Sam],
Place perception from the fusion of different image representation,
PR(110), 2021, pp. 107680.
Elsevier DOI 2011
Indoor place perception, CNN, LSTM, Convolutional auto-encoder, Natural language BibRef

Wu, Z.K.[Zong-Kai], Liu, Z.[Zihan], Wang, T.[Ting], Wang, D.L.[Dong-Lin],
Improved Speaker and Navigator for Vision-and-Language Navigation,
MultMedMag(28), No. 4, October 2021, pp. 55-63.
IEEE DOI 2112
Navigation, Visualization, Decoding, Trajectory, Task analysis, Feature extraction, Head BibRef

Wang, X.[Xin], Huang, Q.Y.[Qiu-Yuan], Celikyilmaz, A.[Asli], Gao, J.F.[Jian-Feng], Shen, D.[Dinghan], Wang, Y.F.[Yuan-Fang], Wang, W.Y.[William Yang], Zhang, L.[Lei],
Vision-Language Navigation Policy Learning and Adaptation,
PAMI(43), No. 12, December 2021, pp. 4205-4216.
IEEE DOI 2112
BibRef
Earlier:
Reinforced Cross-Modal Matching and Self-Supervised Imitation Learning for Vision-Language Navigation,
CVPR19(6622-6631).
IEEE DOI 2002
Award, CVPR, Student.
Navigation, Visualization, Cognition, Reinforcement learning, Natural languages, Benchmark testing, multimodal machine learning BibRef

Lin, B.Q.[Bing-Qian], Zhu, Y.[Yi], Long, Y.X.[Yan-Xin], Liang, X.D.[Xiao-Dan], Ye, Q.X.[Qi-Xiang], Lin, L.[Liang],
Adversarial Reinforced Instruction Attacker for Robust Vision-Language Navigation,
PAMI(44), No. 10, October 2022, pp. 7175-7189.
IEEE DOI 2209
Navigation from language. Navigation, Task analysis, Visualization, Robustness, Perturbation methods, Stairs, Natural languages, self-supervised learning BibRef

Wang, H.Q.[Han-Qing], Wang, W.G.[Wen-Guan], Liang, W.[Wei], Hoi, S.C.H.[Steven C. H.], Shen, J.B.[Jian-Bing], Van Gool, L.J.[Luc J.],
Active Perception for Visual-Language Navigation,
IJCV(131), No. 3, March 2023, pp. 607-625.
Springer DOI 2302
BibRef

Zhao, H.J.[Hao-Jie], Wang, X.[Xiao], Wang, D.[Dong], Lu, H.C.[Hu-Chuan], Ruan, X.[Xiang],
Transformer vision-language tracking via proxy token guided cross-modal fusion,
PRL(168), 2023, pp. 10-16.
Elsevier DOI 2304
Visual object tracking, Transformer, Vision-language BibRef

Ding, H.H.[Heng-Hui], Liu, C.[Chang], Wang, S.[Suchen], Jiang, X.D.[Xu-Dong],
VLT: Vision-Language Transformer and Query Generation for Referring Segmentation,
PAMI(45), No. 6, June 2023, pp. 7900-7916.
IEEE DOI 2305
BibRef
Earlier:
Vision-Language Transformer and Query Generation for Referring Segmentation,
ICCV21(16301-16310)
IEEE DOI 2203
Image segmentation, Transformers, Feature extraction, Task analysis, Representation learning, Fuses, Decoding, vision-language transformer. Convolutional codes, Visualization, Computational modeling, Vision+language BibRef

Zeng, H.T.[Hai-Tao], Song, X.H.[Xin-Hang], Jiang, S.Q.[Shu-Qiang],
Multi-Object Navigation Using Potential Target Position Policy Function,
IP(32), 2023, pp. 2608-2619.
IEEE DOI 2305
Navigation, Task analysis, Semantics, Visualization, Reinforcement learning, Trajectory, Multi-object navigation, embodied AI BibRef

Qiao, Y.Y.[Yan-Yuan], Qi, Y.K.[Yuan-Kai], Hong, Y.C.[Yi-Cong], Yu, Z.[Zheng], Wang, P.[Peng], Wu, Q.[Qi],
HOP+: History-Enhanced and Order-Aware Pre-Training for Vision-and-Language Navigation,
PAMI(45), No. 7, July 2023, pp. 8524-8537.
IEEE DOI 2306
BibRef
Earlier:
HOP: History-and-Order Aware Pretraining for Vision-and-Language Navigation,
CVPR22(15397-15406)
IEEE DOI 2210
Task analysis, Trajectory, History, Navigation, Predictive models, Transformers, Atmospheric modeling, memory networks. Visualization, Computational modeling, Decision making, Vision+language, Navigation and autonomous driving BibRef

Wang, L.[Liuyi], He, Z.T.[Zong-Tao], Dang, R.[Ronghao], Chen, H.[Huiyi], Liu, C.J.[Cheng-Ju], Chen, Q.J.[Qi-Jun],
RES-StS: Referring Expression Speaker via Self-Training With Scorer for Goal-Oriented Vision-Language Navigation,
CirSysVideo(33), No. 7, July 2023, pp. 3441-3454.
IEEE DOI 2307
Task analysis, Navigation, Training, Trajectory, Data models, Visualization, Feature extraction, contrastive learning BibRef

Lin, B.Q.[Bing-Qian], Long, Y.X.[Yan-Xin], Zhu, Y.[Yi], Zhu, F.D.[Feng-Da], Liang, X.D.[Xiao-Dan], Ye, Q.X.[Qi-Xiang], Lin, L.[Liang],
Towards Deviation-Robust Agent Navigation via Perturbation-Aware Contrastive Learning,
PAMI(45), No. 10, October 2023, pp. 12535-12549.
IEEE DOI 2310
BibRef

Krantz, J.[Jacob], Gervet, T.[Theophile], Yadav, K.[Karmesh], Wang, A.[Austin], Paxton, C.[Chris], Mottaghi, R.[Roozbeh], Batra, D.[Dhruv], Malik, J.[Jitendra], Lee, S.[Stefan], Chaplot, D.S.[Devendra Singh],
Navigating to Objects Specified by Images,
ICCV23(10882-10891)
IEEE DOI 2401
BibRef

Yang, X.[Xu], Li, Z.Z.[Zhang-Zikang], Xu, H.Y.[Hai-Yang], Zhang, H.W.[Han-Wang], Ye, Q.[Qinghao], Li, C.L.[Chen-Liang], Yan, M.[Ming], Zhang, Y.[Yu], Huang, F.[Fei], Huang, S.F.[Song-Fang],
Learning Trajectory-Word Alignments for Video-Language Tasks,
ICCV23(2504-2514)
IEEE DOI 2401
BibRef

Lin, K.[Kunyang], Chen, P.H.[Pei-Hao], Huang, D.[Diwei], Li, T.H.[Thomas H.], Tan, M.K.[Ming-Kui], Gan, C.[Chuang],
Learning Vision-and-Language Navigation from YouTube Videos,
ICCV23(8283-8292)
IEEE DOI 2401
Code: WWW Link.
BibRef

Wang, Z.[Zun], Li, J.[Jialu], Hong, Y.C.[Yi-Cong], Wang, Y.[Yi], Wu, Q.[Qi], Bansal, M.[Mohit], Gould, S.[Stephen], Tan, H.[Hao], Qiao, Y.[Yu],
Scaling Data Generation in Vision-and-Language Navigation,
ICCV23(11975-11986)
IEEE DOI 2401
BibRef

Cui, Y.[Yibo], Xie, L.[Liang], Zhang, Y.K.[Ya-Kun], Zhang, M.[Meishan], Yan, Y.[Ye], Yin, E.[Erwei],
Grounded Entity-Landmark Adaptive Pre-training for Vision-and-Language Navigation,
ICCV23(12009-12019)
IEEE DOI 2401
BibRef

Patel, D.[Dhruvesh], Eghbalzadeh, H.[Hamid], Kamra, N.[Nitin], Iuzzolino, M.L.[Michael Louis], Jain, U.[Unnat], Desai, R.[Ruta],
Pretrained Language Models as Visual Planners for Human Assistance,
ICCV23(15256-15268)
IEEE DOI 2401
BibRef

Liu, S.[Shubo], Zhang, H.S.[Hong-Sheng], Qi, Y.[Yuankai], Wang, P.[Peng], Zhang, Y.N.[Yan-Ning], Wu, Q.[Qi],
AerialVLN: Vision-and-Language Navigation for UAVs,
ICCV23(15338-15348)
IEEE DOI 2401
Code: WWW Link.
BibRef

Wang, Z.[Zihan], Li, X.Y.[Xiang-Yang], Yang, J.H.[Jia-Hao], Liu, Y.Q.[Ye-Qi], Jiang, S.Q.[Shu-Qiang],
GridMM: Grid Memory Map for Vision-and-Language Navigation,
ICCV23(15579-15590)
IEEE DOI 2401
Code: WWW Link.
BibRef

Wang, H.Q.[Han-Qing], Liang, W.[Wei], Van Gool, L.J.[Luc J.], Wang, W.G.[Wen-Guan],
Dreamwalker: Mental Planning for Continuous Vision-Language Navigation,
ICCV23(10839-10849)
IEEE DOI 2401
BibRef

Liu, R.[Rui], Wang, X.H.[Xiao-Han], Wang, W.G.[Wen-Guan], Yang, Y.[Yi],
Bird's-Eye-View Scene Graph for Vision-Language Navigation,
ICCV23(10934-10946)
IEEE DOI 2401
BibRef

Qiao, Y.[Yanyuan], Yu, Z.[Zheng], Wu, Q.[Qi],
VLN-PETL: Parameter-Efficient Transfer Learning for Vision-and-Language Navigation,
ICCV23(15397-15406)
IEEE DOI 2401
Code: WWW Link.
BibRef

Huo, J.Y.[Jing-Yang], Sun, Q.[Qiang], Jiang, B.[Boyan], Lin, H.T.[Hai-Tao], Fu, Y.W.[Yan-Wei],
GeoVLN: Learning Geometry-Enhanced Visual Representation with Slot Attention for Vision-and-Language Navigation,
CVPR23(23212-23221)
IEEE DOI 2309
BibRef

Mu, Y.[Yao], Yao, S.[Shunyu], Ding, M.Y.[Ming-Yu], Luo, P.[Ping], Gan, C.[Chuang],
EC2: Emergent Communication for Embodied Control,
CVPR23(6704-6714)
IEEE DOI 2309
BibRef

Zhang, J.Z.[Jia-Zhao], Dai, L.[Liu], Meng, F.[Fanpeng], Fan, Q.N.[Qing-Nan], Chen, X.[Xuelin], Xu, K.[Kai], Wang, H.[He],
3D-Aware Object Goal Navigation via Simultaneous Exploration and Identification,
CVPR23(6672-6682)
IEEE DOI 2309
BibRef

Wang, X.H.[Xiao-Han], Wang, W.G.[Wen-Guan], Shao, J.Y.[Jia-Yi], Yang, Y.[Yi],
LANA: A Language-Capable Navigator for Instruction Following and Generation,
CVPR23(19048-19058)
IEEE DOI 2309
BibRef

Krantz, J.[Jacob], Banerjee, S.[Shurjo], Zhu, W.[Wang], Corso, J.[Jason], Anderson, P.[Peter], Lee, S.[Stefan], Thomason, J.[Jesse],
Iterative Vision-and-Language Navigation,
CVPR23(14921-14930)
IEEE DOI 2309
BibRef

Li, X.Y.[Xiang-Yang], Wang, Z.[Zihan], Yang, J.H.[Jia-Hao], Wang, Y.[Yaowei], Jiang, S.Q.[Shu-Qiang],
KERM: Knowledge Enhanced Reasoning for Vision-and-Language Navigation,
CVPR23(2583-2592)
IEEE DOI 2309
BibRef

Yang, Z.[Zijiao], Majumdar, A.[Arjun], Lee, S.[Stefan],
Behavioral Analysis of Vision-and-Language Navigation Agents,
CVPR23(2574-2582)
IEEE DOI 2309
BibRef

Gadre, S.Y.[Samir Yitzhak], Wortsman, M.[Mitchell], Ilharco, G.[Gabriel], Schmidt, L.[Ludwig], Song, S.[Shuran],
CoWs on Pasture: Baselines and Benchmarks for Language-Driven Zero-Shot Object Navigation,
CVPR23(23171-23181)
IEEE DOI 2309
BibRef

Li, J.[Jialu], Bansal, M.[Mohit],
Improving Vision-and-Language Navigation by Generating Future-View Image Semantics,
CVPR23(10803-10812)
IEEE DOI 2309
BibRef

Kamath, A.[Aishwarya], Anderson, P.[Peter], Wang, S.[Su], Koh, J.Y.[Jing Yu], Ku, A.[Alexander], Waters, A.[Austin], Yang, Y.F.[Yin-Fei], Baldridge, J.[Jason], Parekh, Z.[Zarana],
A New Path: Scaling Vision-and-Language Navigation with Synthetic Instructions and Imitation Learning,
CVPR23(10813-10823)
IEEE DOI 2309
BibRef

Tang, S.K.[Sheng-Kun], Wang, Y.Q.[Ya-Qing], Kong, Z.[Zhenglun], Zhang, T.C.[Tian-Chi], Li, Y.[Yao], Ding, C.[Caiwen], Wang, Y.Z.[Yan-Zhi], Liang, Y.[Yi], Xu, D.[Dongkuan],
You Need Multiple Exiting: Dynamic Early Exiting for Accelerating Unified Vision Language Model,
CVPR23(10781-10791)
IEEE DOI 2309
BibRef

Armitage, J.[Jason], Impett, L.[Leonardo], Sennrich, R.[Rico],
A Priority Map for Vision-and-Language Navigation with Trajectory Plans and Feature-Location Cues,
WACV23(1094-1103)
IEEE DOI 2302
Training, Visualization, Navigation, Trajectory planning, Filtering, Urban areas, Transformer cores BibRef

Kuo, C.W.[Chia-Wen], Ma, C.Y.[Chih-Yao], Hoffman, J.[Judy], Kira, Z.[Zsolt],
Structure-Encoding Auxiliary Tasks for Improved Visual Representation in Vision-and-Language Navigation,
WACV23(1104-1113)
IEEE DOI 2302
Training, Visualization, Navigation, Source coding, Task analysis, Algorithms: Vision + language and/or other modalities BibRef

Irshad, M.Z.[Muhammad Zubair], Mithun, N.C.[Niluthpol Chowdhury], Seymour, Z.[Zachary], Chiu, H.P.[Han-Pang], Samarasekera, S.[Supun], Kumar, R.[Rakesh],
Semantically-aware Spatio-temporal Reasoning Agent for Vision-and-Language Navigation in Continuous Environments,
ICPR22(4065-4071)
IEEE DOI 2212
Visualization, Navigation, Semantics, Natural languages, Transformers, Feature extraction BibRef

Ossandón, J.[Joaquín], Earle, B.[Benjamín], Soto, Á.[Álvaro],
Bridging the Visual Semantic Gap in VLN via Semantically Richer Instructions,
ECCV22(XXXVII:54-69).
Springer DOI 2211
Visual-and-Language Navigation BibRef

Burns, A.[Andrea], Arsan, D.[Deniz], Agrawal, S.[Sanjna], Kumar, R.[Ranjitha], Saenko, K.[Kate], Plummer, B.A.[Bryan A.],
A Dataset for Interactive Vision-Language Navigation with Unknown Command Feasibility,
ECCV22(VIII:312-328).
Springer DOI 2211
BibRef

Huang, Z.M.[Zan-Ming], Shangguan, Z.K.[Zhong-Kai], Zhang, J.Y.[Jimu-Yang], Bar, G.[Gilad], Boyd, M.[Matthew], Ohn-Bar, E.[Eshed],
ASSISTER: Assistive Navigation via Conditional Instruction Generation,
ECCV22(XXXVI:271-289).
Springer DOI 2211
BibRef

Zhou, K.W.[Kai-Wen], Wang, X.E.[Xin Eric],
FedVLN: Privacy-Preserving Federated Vision-and-Language Navigation,
ECCV22(XXXVI:682-699).
Springer DOI 2211
BibRef

Chen, S.Z.[Shi-Zhe], Guhur, P.L.[Pierre-Louis], Tapaswi, M.[Makarand], Schmid, C.[Cordelia], Laptev, I.[Ivan],
Learning from Unlabeled 3D Environments for Vision-and-Language Navigation,
ECCV22(XXIX:638-655).
Springer DOI 2211
BibRef

Krantz, J.[Jacob], Lee, S.[Stefan],
Sim-2-Sim Transfer for Vision-and-Language Navigation in Continuous Environments,
ECCV22(XXIX:588-603).
Springer DOI 2211
BibRef

Lin, C.[Chuang], Jiang, Y.[Yi], Cai, J.F.[Jian-Fei], Qu, L.Z.[Li-Zhen], Haffari, G.[Gholamreza], Yuan, Z.H.[Ze-Huan],
Multimodal Transformer with Variable-Length Memory for Vision-and-Language Navigation,
ECCV22(XXXVI:380-397).
Springer DOI 2211
BibRef

Cheng, W.H.[Wen-Hao], Dong, X.P.[Xing-Ping], Khan, S.[Salman], Shen, J.B.[Jian-Bing],
Learning Disentanglement with Decoupled Labels for Vision-Language Navigation,
ECCV22(XXXVI:309-329).
Springer DOI 2211
BibRef

Kolmet, M.[Manuel], Zhou, Q.[Qunjie], Ošep, A.[Aljoša], Leal-Taixé, L.[Laura],
Text2Pos: Text-to-Point-Cloud Cross-Modal Localization,
CVPR22(6677-6686)
IEEE DOI 2210
To specify a location. Location awareness, Point cloud compression, Visualization, Navigation, Mobile handsets, Pattern recognition, Navigation and autonomous driving BibRef

Partsey, R.[Ruslan], Wijmans, E.[Erik], Yokoyama, N.[Naoki], Dobosevych, O.[Oles], Batra, D.[Dhruv], Maksymets, O.[Oleksandr],
Is Mapping Necessary for Realistic PointGoal Navigation?,
CVPR22(17211-17220)
IEEE DOI 2210
Recurrent neural networks, Navigation, Robot vision systems, Reinforcement learning, Benchmark testing, Sensors, Robot vision BibRef

Ramakrishnan, S.K.[Santhosh Kumar], Chaplot, D.S.[Devendra Singh], Al-Halah, Z.[Ziad], Malik, J.[Jitendra], Grauman, K.[Kristen],
PONI: Potential Functions for ObjectGoal Navigation with Interaction-free Learning,
CVPR22(18868-18878)
IEEE DOI 2210
Training, Solid modeling, Navigation, Design methodology, Semantics, Supervised learning, Scene analysis and understanding, Robot vision BibRef

Chen, S.Z.[Shi-Zhe], Guhur, P.L.[Pierre-Louis], Tapaswi, M.[Makarand], Schmid, C.[Cordelia], Laptev, I.[Ivan],
Think Global, Act Local: Dual-scale Graph Transformer for Vision-and-Language Navigation,
CVPR22(16516-16526)
IEEE DOI 2210
Visualization, Navigation, Grounding, Benchmark testing, Transformers, Encoding, Vision+language, Navigation and autonomous driving BibRef

Zhou, M.Y.[Ming-Yang], Yu, L.C.[Li-Cheng], Singh, A.[Amanpreet], Wang, M.J.[Meng-Jiao], Yu, Z.[Zhou], Zhang, N.[Ning],
Unsupervised Vision-and-Language Pretraining via Retrieval-based Multi-Granular Alignment,
CVPR22(16464-16473)
IEEE DOI 2210
Adaptation models, Visualization, Computational modeling, Benchmark testing, Data models, Pattern recognition, Self-, semi-, meta-, and unsupervised learning BibRef

Wang, S.[Su], Montgomery, C.[Ceslee], Orbay, J.[Jordi], Birodkar, V.[Vighnesh], Faust, A.[Aleksandra], Gur, I.[Izzeddin], Jaques, N.[Natasha], Waters, A.[Austin], Baldridge, J.[Jason], Anderson, P.[Peter],
Less is More: Generating Grounded Navigation Instructions from Landmarks,
CVPR22(15407-15417)
IEEE DOI 2210
Training, Visualization, Navigation, Grounding, Focusing, Detectors, Multitasking, Vision+language BibRef

Hong, Y.C.[Yi-Cong], Wang, Z.[Zun], Wu, Q.[Qi], Gould, S.[Stephen],
Bridging the Gap Between Learning in Discrete and Continuous Environments for Vision-and-Language Navigation,
CVPR22(15418-15428)
IEEE DOI 2210
Training, Bridges, Navigation, Grounding, Pattern recognition, Task analysis, Vision+language BibRef

Chen, J.[Jinyu], Gao, C.[Chen], Meng, E.[Erli], Zhang, Q.[Qiong], Liu, S.[Si],
Reinforced Structured State-Evolution for Vision-Language Navigation,
CVPR22(15429-15438)
IEEE DOI 2210
Navigation, Computational modeling, Layout, Natural languages, Reinforcement learning, Predictive models, Vision+language BibRef

Georgakis, G.[Georgios], Schmeckpeper, K.[Karl], Wanchoo, K.[Karan], Dan, S.[Soham], Miltsakaki, E.[Eleni], Roth, D.[Dan], Daniilidis, K.[Kostas],
Cross-modal Map Learning for Vision and Language Navigation,
CVPR22(15439-15449)
IEEE DOI 2210
Navigation, Grounding, Semantics, Natural languages, Predictive models, Benchmark testing, Vision+language, Navigation and autonomous driving BibRef

Wang, H.Q.[Han-Qing], Liang, W.[Wei], Shen, J.B.[Jian-Bing], Van Gool, L.J.[Luc J.], Wang, W.G.[Wen-Guan],
Counterfactual Cycle-Consistent Learning for Instruction Following and Generation in Vision-Language Navigation,
CVPR22(15450-15460)
IEEE DOI 2210
Training, Learning systems, Correlation, Navigation, Computational modeling, Buildings, Vision+language BibRef

Song, C.H.[Chan Hee], Kil, J.[Jihyung], Pan, T.Y.[Tai-Yu], Sadler, B.M.[Brian M.], Chao, W.L.[Wei-Lun], Su, Y.[Yu],
One Step at a Time: Long-Horizon Vision-and-Language Navigation with Milestones,
CVPR22(15461-15470)
IEEE DOI 2210
Navigation, Computational modeling, Robot vision systems, Machine learning, Autonomous agents, Pattern recognition, Robot vision BibRef

Guhur, P.L.[Pierre-Louis], Tapaswi, M.[Makarand], Chen, S.Z.[Shi-Zhe], Laptev, I.[Ivan], Schmid, C.[Cordelia],
Airbert: In-Domain Pretraining for Vision-and-Language Navigation,
ICCV21(1614-1623)
IEEE DOI 2203
Adaptation models, Navigation, Atmospheric modeling, Computational modeling, Natural languages, Training data, Vision for robotics and autonomous vehicles BibRef

Liu, C.[Chong], Zhu, F.[Fengda], Chang, X.J.[Xiao-Jun], Liang, X.D.[Xiao-Dan], Ge, Z.[Zongyuan], Shen, Y.D.[Yi-Dong],
Vision-Language Navigation with Random Environmental Mixup,
ICCV21(1624-1634)
IEEE DOI 2203
Visualization, Navigation, Natural languages, Benchmark testing, Data models, Task analysis, Vision+language BibRef

Qi, Y.[Yuankai], Pan, Z.Z.[Zi-Zheng], Hong, Y.C.[Yi-Cong], Yang, M.H.[Ming-Hsuan], van den Hengel, A.J.[Anton J.], Wu, Q.[Qi],
The Road to Know-Where: An Object-and-Room Informed Sequential BERT for Indoor Vision-Language Navigation,
ICCV21(1635-1644)
IEEE DOI 2203
Visualization, TV, Navigation, Roads, Bit error rate, Predictive models, Linguistics, Vision+language BibRef

Liu, Z.Y.[Zhe-Yuan], Rodriguez-Opazo, C.[Cristian], Teney, D.[Damien], Gould, S.[Stephen],
Image Retrieval on Real-life Images with Pre-trained Vision-and-Language Models,
ICCV21(2105-2114)
IEEE DOI 2203
Visualization, Limiting, Codes, Image retrieval, Natural languages, Computer architecture, Vision+language, Representation learning BibRef

Pashevich, A.[Alexander], Schmid, C.[Cordelia], Sun, C.[Chen],
Episodic Transformer for Vision-and-Language Navigation,
ICCV21(15922-15932)
IEEE DOI 2203
Training, Visualization, Navigation, Natural languages, Detectors, Benchmark testing, Transformers, Vision+language BibRef

Chen, K.[Kevin], Chen, J.K.[Junshen K.], Chuang, J.[Jo], Vázquez, M.[Marynel], Savarese, S.[Silvio],
Topological Planning with Transformers for Vision-and-Language Navigation,
CVPR21(11271-11281)
IEEE DOI 2111
Backtracking, Navigation, Natural languages, Buildings, Transformers, Planning BibRef

Badki, A.[Abhishek], Gallo, O.[Orazio], Kautz, J.[Jan], Sen, P.[Pradeep],
Binary TTC: A Temporal Geofence for Autonomous Navigation,
CVPR21(12941-12950)
IEEE DOI 2111
Quantization (signal), Estimation, Tools, Observers, Cameras, Real-time systems BibRef

Wang, H.Q.[Han-Qing], Wang, W.G.[Wen-Guan], Liang, W.[Wei], Xiong, C.M.[Cai-Ming], Shen, J.B.[Jian-Bing],
Structured Scene Memory for Vision-Language Navigation,
CVPR21(8451-8460)
IEEE DOI 2111
Visualization, Recurrent neural networks, Navigation, Decision making, Layout, Memory architecture BibRef

Wang, H.Q.[Han-Qing], Wang, W.G.[Wen-Guan], Shu, T.M.[Tian-Min], Liang, W.[Wei], Shen, J.B.[Jian-Bing],
Active Visual Information Gathering for Vision-language Navigation,
ECCV20(XXII:307-322).
Springer DOI 2011
BibRef

Cao, J.[Jize], Gan, Z.[Zhe], Cheng, Y.[Yu], Yu, L.C.[Li-Cheng], Chen, Y.C.[Yen-Chun], Liu, J.J.[Jing-Jing],
Behind the Scene: Revealing the Secrets of Pre-trained Vision-and-language Models,
ECCV20(VI:565-580).
Springer DOI 2011
BibRef

Moghaddam, M.K.[Mahdi Kazemi], Abbasnejad, E.[Ehsan], Wu, Q.[Qi], Shi, J.Q.F.[Javen Qin-Feng], van den Hengel, A.J.[Anton J.],
ForeSI: Success-Aware Visual Navigation Agent,
WACV22(3401-3410)
IEEE DOI 2202
Training, Visualization, Navigation, Detectors, Reinforcement learning, Predictive models, Analysis and Understanding BibRef

Qi, Y., Wu, Q., Anderson, P., Wang, X., Wang, W.Y., Shen, C., van den Hengel, A.J.,
REVERIE: Remote Embodied Visual Referring Expression in Real Indoor Environments,
CVPR20(9979-9988)
IEEE DOI 2008
Task analysis, Navigation, Robots, Natural languages, Visualization, Object recognition, Indoor environments BibRef

Qi, Y.K.[Yuan-Kai], Pan, Z.Z.[Zi-Zheng], Zhang, S.P.[Sheng-Ping], van den Hengel, A.J.[Anton J.], Wu, Q.[Qi],
Object-and-action Aware Model for Visual Language Navigation,
ECCV20(X:303-317).
Springer DOI 2011
BibRef

Krantz, J.[Jacob], Wijmans, E.[Erik], Majumdar, A.[Arjun], Batra, D.[Dhruv], Lee, S.[Stefan],
Beyond the Nav-Graph: Vision-and-Language Navigation in Continuous Environments,
ECCV20(XXVIII:104-120).
Springer DOI 2011
Agents must execute low-level actions to follow natural language navigation directions. BibRef

Wang, H.[Hu], Wu, Q.[Qi], Shen, C.H.[Chun-Hua],
Soft Expert Reward Learning for Vision-and-Language Navigation,
ECCV20(IX:126-141).
Springer DOI 2011
BibRef

Kim, J., Moon, S., Rohrbach, A., Darrell, T.J., Canny, J.,
Advisable Learning for Self-Driving Vehicles by Internalizing Observation-to-Action Rules,
CVPR20(9658-9667)
IEEE DOI 2008
Visualization, Semantics, Natural languages, Image segmentation, Generators, Training, Roads BibRef

Fu, T.J.[Tsu-Jui], Wang, X.E.[Xin Eric], Peterson, M.F.[Matthew F.], Grafton, S.T.[Scott T.], Eckstein, M.P.[Miguel P.], Wang, W.Y.[William Yang],
Counterfactual Vision-and-Language Navigation via Adversarial Path Sampler,
ECCV20(VI:71-86).
Springer DOI 2011
Based on language descriptions, relate them to the environment. BibRef

Majumdar, A.[Arjun], Shrivastava, A.[Ayush], Lee, S.[Stefan], Anderson, P.[Peter], Parikh, D.[Devi], Batra, D.[Dhruv],
Improving Vision-and-language Navigation with Image-text Pairs from the Web,
ECCV20(VI:259-274).
Springer DOI 2011
BibRef

Zhu, F.D.[Feng-Da], Zhu, Y.[Yi], Chang, X.J.[Xiao-Jun], Liang, X.D.[Xiao-Dan],
Vision-Language Navigation With Self-Supervised Auxiliary Reasoning Tasks,
CVPR20(10009-10019)
IEEE DOI 2008
Task analysis, Navigation, Cognition, Trajectory, Semantics, Training, Natural languages BibRef

Hao, W., Li, C., Li, X., Carin, L., Gao, J.,
Towards Learning a Generic Agent for Vision-and-Language Navigation via Pre-Training,
CVPR20(13134-13143)
IEEE DOI 2008
Task analysis, Navigation, Visualization, Trajectory, Presses, Head, Predictive models BibRef

Yu, F., Deng, Z., Narasimhan, K., Russakovsky, O.,
Take the Scenic Route: Improving Generalization in Vision-and-Language Navigation,
VL3W20(4000-4004)
IEEE DOI 2008
Navigation, Benchmark testing, Task analysis, Natural languages, Visualization, Training data, Markov processes BibRef

Ma, C.Y.[Chih-Yao], Wu, Z.X.[Zu-Xuan], Al Regib, G.[Ghassan], Xiong, C.M.[Cai-Ming], Kira, Z.[Zsolt],
The Regretful Agent: Heuristic-Aided Navigation Through Progress Estimation,
CVPR19(6725-6733).
IEEE DOI 2002
Navigating to a goal purely from language instructions and visual information. BibRef

Ke, L.Y.M.[Li-Yi-Ming], Li, X.J.[Xiu-Jun], Bisk, Y.[Yonatan], Holtzman, A.[Ari], Gan, Z.[Zhe], Liu, J.J.[Jing-Jing], Gao, J.F.[Jian-Feng], Choi, Y.J.[Ye-Jin], Srinivasa, S.[Siddhartha],
Tactical Rewind: Self-Correction via Backtracking in Vision-And-Language Navigation,
CVPR19(6734-6742).
IEEE DOI 2002
BibRef

Wang, X.[Xin], Xiong, W.H.[Wen-Han], Wang, H.M.[Hong-Min], Wang, W.Y.[William Yang],
Look Before You Leap: Bridging Model-Free and Model-Based Reinforcement Learning for Planned-Ahead Vision-and-Language Navigation,
ECCV18(XVI:38-55).
Springer DOI 1810
BibRef

Anderson, P.[Peter], Wu, Q.[Qi], Teney, D.[Damien], Bruce, J.[Jake], Johnson, M.[Mark], Sünderhauf, N.[Niko], Reid, I.D.[Ian D.], Gould, S.[Stephen], van den Hengel, A.J.[Anton J.],
Vision-and-Language Navigation: Interpreting Visually-Grounded Navigation Instructions in Real Environments,
CVPR18(3674-3683)
IEEE DOI 1812
Navigation, Task analysis, Robots, Visualization, Cameras, Natural languages BibRef

Chen, H.[Howard], Suhr, A.[Alane], Misra, D.[Dipendra], Snavely, N.[Noah], Artzi, Y.[Yoav],
TOUCHDOWN: Natural Language Navigation and Spatial Reasoning in Visual Street Environments,
CVPR19(12530-12539).
IEEE DOI 2002
BibRef

Khoshelham, K., Díaz-Vilariño, L.,
3D Modelling of Interior Spaces: Learning the Language of Indoor Architecture,
CloseRange14(321-326).
DOI Link 1411
BibRef

van Laere, O.[Olivier], Schockaert, S.[Steven], Dhoedt, B.[Bart],
Finding locations of Flickr resources using language models and similarity search,
ICMR11(48).
DOI Link 1301
Estimate where a given photo or video was taken, using only the tags that a user has assigned. BibRef



Last update: Apr 10, 2024 at 09:54:40