Edit model card

Model Card for Thesis-CLIP-geoloc-continent

CLIP-ViT model fine-tuned for image geolocation. Optimized for queries at continent-level.

Model Details

Model Description

Model Sources

How to Get Started with the Model

from PIL import Image
import requests
from transformers import CLIPProcessor, CLIPModel

model = CLIPModel.from_pretrained("jrheiner/thesis-clip-geoloc-continent")
processor = CLIPProcessor.from_pretrained("jrheiner/thesis-clip-geoloc-continent")

url = "https://huggingface.co/spaces/jrheiner/thesis-demo/resolve/main/kerger-test-images/Oceania_Australia_-32.947127313081_151.47903359833_kerger.jpg"
image = Image.open(requests.get(url, stream=True).raw)
choices = ["North America", "Africa", "Asia", "Oceania", "South America", "Europe"]
inputs = processor(text=choices, images=image, return_tensors="pt", padding=True)
outputs = model(**inputs)
logits_per_image = outputs.logits_per_image # this is the image-text similarity score
probs = logits_per_image.softmax(dim=1) # we can take the softmax to get the label probabilities

Training Details

The model was fine-tuned on 177 270 images (29 545 per continent) sourced from Mapillary.

Downloads last month
157
Safetensors
Model size
428M params
Tensor type
F32
ยท
Inference Examples
Inference API (serverless) is not available, repository is disabled.

Model tree for jrheiner/thesis-clip-geoloc-continent

Finetuned
this model

Space using jrheiner/thesis-clip-geoloc-continent 1