Abstract: Spatio-temporal action detection networks must simultaneously extract and fuse spatial and temporal features, which often leaves existing models bloated and difficult to run ...