Text input in virtual reality (VR) is a common technique in interactive systems, but it faces challenges in input efficiency and task load when users interact with VR hardware and ...
Abstract: As a fundamental and challenging task in bridging the language and vision domains, Image-Text Retrieval (ITR) aims to retrieve the target instances that are semantically relevant to the ...