RS5M and GeoRSCLIP A Large Scale Vision-Language Dataset and A Large Vision-Language Model for Remote Sensing

Posted Jan 3, 2023

By 1 min read

论文名称: RS5M and GeoRSCLIP: A Large Scale Vision-Language Dataset and A Large Vision-Language Model for Remote Sensing
模型架构: VLM
Visual Encoder: Transformer
Text Encoder: Transformer
Model Details: (CLIP)
Task: Scene Classification, Image-text Retrieval, Semantic Localization
Link: https://arxiv.org/abs/2306.11300
Code/Project: -
Survey Inclusion: awesome-remote-sensing-vision-language-models
Published in: Arxiv 2023

This post is licensed under CC BY 4.0 by the author.

Trending Tags