Run a Model on an Image

Run inference on an image with the Python SDK, across all task types.

Once a Version is trained, its .model property returns a task-specific model object you can call .predict() on. The SDK handles authentication, image preprocessing, and JSON response parsing.

The model class returned depends on the project type: object detection, classification, instance or semantic segmentation, keypoint detection, or VLM. The .predict() signature is similar across all of them; differences are noted below.

Object detection

import roboflow

rf = roboflow.Roboflow(api_key="YOUR_API_KEY")
project = rf.workspace().project("my-detector")
version = project.version(3)
model = version.model

predictions = model.predict(
    "photo.jpg",
    confidence=40,   # minimum confidence to keep a detection, 0–100
    overlap=30,      # NMS IoU threshold, 0–100
).json()

for p in predictions["predictions"]:
    print(p["class"], p["confidence"], p["x"], p["y"], p["width"], p["height"])
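
Drawing or cropping usually needs corner coordinates rather than the x, y, width, height values printed above. The sketch below converts one prediction, assuming x and y are the box center in pixels (which is how the hosted API describes them). The helper name to_corners and the sample values are hypothetical and not part of the SDK.

```python
def to_corners(p):
    """Convert a center-based prediction dict to (x_min, y_min, x_max, y_max)."""
    half_w = p["width"] / 2
    half_h = p["height"] / 2
    return (p["x"] - half_w, p["y"] - half_h, p["x"] + half_w, p["y"] + half_h)


# Hand-written sample shaped like one entry of predictions["predictions"].
sample = {"class": "dog", "confidence": 0.91, "x": 100, "y": 50, "width": 40, "height": 20}
print(to_corners(sample))  # (80.0, 40.0, 120.0, 60.0)
```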

Save a visualization (requires the full roboflow package, not roboflow-slim):

model.predict("photo.jpg", confidence=40, overlap=30).save("prediction.jpg")

Classification

project = rf.workspace().project("my-classifier")
model = project.version(2).model

predictions = model.predict("photo.jpg").json()
print(predictions["top"], predictions["confidence"])

Classification's predict() doesn't take overlap. Pass hosted=True if you're providing a URL rather than a local path.
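
Since the classification response carries one top label and its score, a caller may want to reject low-confidence results. This is a minimal sketch, assuming the confidence value in the response is a float between 0 and 1. The helper accept_label, the 0.6 default, and the "unknown" fallback are illustrative choices, not SDK features.

```python
def accept_label(result, min_confidence=0.6, fallback="unknown"):
    """Return the top label, or `fallback` when its score is below the threshold."""
    if result["confidence"] >= min_confidence:
        return result["top"]
    return fallback


# Hand-written sample shaped like the fields printed above.
sample = {"top": "cat", "confidence": 0.42}
print(accept_label(sample))        # score is below the 0.6 default, so the fallback is used
print(accept_label(sample, 0.4))   # score clears the lower threshold, so the top label is used
```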

Instance segmentation

project = rf.workspace().project("my-instance-seg")
model = project.version(1).model

predictions = model.predict("photo.jpg", confidence=40).json()
for p in predictions["predictions"]:
    print(p["class"], len(p["points"]))   # polygon vertices
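
The polygon vertices can be turned into a pixel area with the shoelace formula. The sketch assumes each entry of points is a dict with x and y keys, and that the polygon does not self-intersect. polygon_area is a hypothetical helper, not part of the SDK.

```python
def polygon_area(points):
    """Area of a simple polygon given as a list of {"x": ..., "y": ...} vertices."""
    total = 0.0
    n = len(points)
    for i in range(n):
        a = points[i]
        b = points[(i + 1) % n]  # wrap around to close the polygon
        total += a["x"] * b["y"] - b["x"] * a["y"]
    return abs(total) / 2


# A 10 x 10 square, written by hand in the assumed vertex format.
square = [{"x": 0, "y": 0}, {"x": 10, "y": 0}, {"x": 10, "y": 10}, {"x": 0, "y": 10}]
print(polygon_area(square))  # 100.0
```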

Semantic segmentation

project = rf.workspace().project("my-semantic-seg")
model = project.version(1).model

predictions = model.predict("photo.jpg").json()
# Returns per-pixel class predictions (run-length encoded).

Keypoint detection

project = rf.workspace().project("my-keypoint")
model = project.version(1).model

predictions = model.predict("photo.jpg", confidence=40).json()
for p in predictions["predictions"]:
    for kp in p["keypoints"]:
        print(kp["class"], kp["x"], kp["y"], kp["confidence"])
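
Downstream code often wants keypoints looked up by name, with low-confidence points dropped. This sketch does that for one prediction, assuming the keypoint fields printed above and a 0 to 1 confidence scale. visible_keypoints and the 0.5 default are illustrative.

```python
def visible_keypoints(prediction, min_confidence=0.5):
    """Map keypoint name to (x, y), keeping only points at or above the threshold."""
    return {
        kp["class"]: (kp["x"], kp["y"])
        for kp in prediction["keypoints"]
        if kp["confidence"] >= min_confidence
    }


# Hand-written sample shaped like one entry of predictions["predictions"].
sample = {
    "class": "person",
    "keypoints": [
        {"class": "nose", "x": 120, "y": 80, "confidence": 0.9},
        {"class": "left_eye", "x": 110, "y": 70, "confidence": 0.2},
    ],
}
print(visible_keypoints(sample))  # {'nose': (120, 80)}
```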

Vision-language (VLM)

project = rf.workspace().project("my-vlm-project")
model = project.version(1).model

predictions = model.predict("photo.jpg", text="What is in this image?").json()
print(predictions)

VLMs accept a text prompt in addition to (or sometimes instead of) the image, depending on the underlying base model.

Hosted images

Pass hosted=True and a URL to skip the local upload; the inference server fetches the image directly:

model.predict("https://example.com/photo.jpg", hosted=True)
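
When the same code path handles both local files and URLs, the hosted flag can be derived from the input string. This is a minimal sketch using only the standard library. is_remote is a hypothetical helper, and treating only http and https as remote is an assumption.

```python
from urllib.parse import urlparse


def is_remote(image):
    """True when `image` looks like an http(s) URL rather than a local path."""
    return urlparse(image).scheme in ("http", "https")


for image in ("https://example.com/photo.jpg", "photo.jpg"):
    print(image, is_remote(image))
    # With a trained model in hand, the call would then be:
    # model.predict(image, hosted=is_remote(image))
```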

Where inference runs

By default, model.predict() calls Roboflow's hosted Serverless v2 inference at serverless.roboflow.com. For higher throughput or on-prem use cases:

Self-hosted Inference: install Roboflow Inference and point the SDK at it via version.model = version.model_for(local="http://localhost:9001"). For streaming and batched calls, use inference-sdk directly.
Dedicated Deployments: see Manage Dedicated Deployments and the product docs deployment overview.
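
For the inference-sdk route, a call against a local server might look like the sketch below. It assumes inference-sdk is installed, an Inference server is listening on http://localhost:9001, and models are addressed as "project-slug/version". The helpers model_id and infer_locally are hypothetical wrappers, and the project name and version are placeholders.

```python
def model_id(project_slug, version):
    """Build the 'project/version' identifier assumed by this sketch."""
    return f"{project_slug}/{version}"


def infer_locally(image_path, project_slug, version, api_key,
                  api_url="http://localhost:9001"):
    """Send one image to a self-hosted Inference server and return its response."""
    # Imported lazily so the helper above works without inference-sdk installed.
    from inference_sdk import InferenceHTTPClient

    client = InferenceHTTPClient(api_url=api_url, api_key=api_key)
    return client.infer(image_path, model_id=model_id(project_slug, version))


print(model_id("my-detector", 3))  # my-detector/3
# With a server running:
# result = infer_locally("photo.jpg", "my-detector", 3, api_key="YOUR_API_KEY")
```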

The legacy detect.roboflow.com and other task-specific Serverless v1 endpoints are deprecated; new code should use Serverless v2.

REST and CLI equivalents

REST: see Run a Model on an Image (REST).
CLI: see Run a Model on an Image (CLI).
