Abstract: While convolutional neural networks (CNNs) and vision transformers (ViTs) dominate visual representation learning, their growing depth makes them harder to interpret. Although ...
Abstract: To address problems in UAV visual object tracking, such as environmental interference from occlusion and similar objects, inaccurate estimation of the object's spatial position, and ...