When we designed Sightova, we made a deliberate decision: a single scan should give you everything you need to assess an image. Not just whether it's real or fake — but what generated it, whether it contains harmful content, and which specific pixels were manipulated.
Today, every image uploaded to Sightova is processed by four specialized models in a single pass.
Four Models, One Scan
Authenticity Detection — Our primary Vision Transformer model analyzes the image's pixel structure to determine whether it was generated by AI or captured by a camera. This produces the core probability score and confidence level that form the foundation of every analysis.
Generator Identification — When an image is flagged as AI-generated, a second specialized classifier identifies which generative model created it. This model distinguishes between 14 different architectures, from Stable Diffusion and Midjourney to GPT Image, Flux, and others. Knowing which tool created an image is often as important as knowing it's synthetic.
Content Safety Classification — Every image is simultaneously evaluated for NSFW content. This isn't a simple filter — it's a dedicated binary classifier trained to identify explicit material with high precision. The NSFW probability is included in every scan result, enabling platforms to combine authenticity and safety decisions in a single workflow.
Pixel Heatmap Analysis — Our segmentation model produces a spatial map showing exactly which regions of the image exhibit signs of AI manipulation. This transforms detection from a binary answer into a forensic visualization.
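To make the four analyses concrete, here is a sketch of what a unified scan result could look like. The field names and values below are illustrative assumptions, not Sightova's published API schema:

```python
# Hypothetical shape of a single-scan payload. Every key name and
# value here is invented for illustration; the real schema may differ.
scan_result = {
    "authenticity": {              # model 1: real vs. AI-generated
        "ai_probability": 0.73,
        "confidence": "high",
    },
    "generator": {                 # model 2: which architecture (if AI)
        "label": "stable-diffusion",
    },
    "safety": {                    # model 3: NSFW classification
        "nsfw_probability": 0.02,
    },
    "heatmap": {                   # model 4: per-region manipulation map
        "manipulated_fraction": 0.18,
    },
}

# All four analyses arrive together, from one upload.
assert {"authenticity", "generator", "safety", "heatmap"} <= scan_result.keys()
```

The point of the shape is that a consumer never has to join results from separate services: one object carries every signal.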
Why Multi-Model Matters
Each of these capabilities exists in isolation elsewhere. You can find NSFW classifiers; you can find AI detectors; you can find generator identifiers. But using separate tools means separate uploads, separate API calls, separate latency, and separate billing.
More importantly, isolated tools miss the correlations. When our authenticity model flags an image at 73% AI probability, the heatmap can reveal that only a specific region — a face, a background — was manipulated. The generator identifier might show that the editing was performed with a specific inpainting model. The NSFW classifier might flag content that was synthetically generated rather than photographed. Together, these signals paint a complete picture that no single model can provide alone.
For Platforms at Scale
For content moderation teams, this multi-signal approach is particularly powerful. A single API call returns a structured JSON payload containing all four analyses. Your automated pipeline can make nuanced decisions: quarantine images that are both AI-generated and NSFW, escalate partially manipulated photos for human review, or automatically tag content with its identified generator for transparency labeling.
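As a minimal sketch of such a pipeline, the routing rules above could look like the function below. The payload keys, thresholds, and action names are all assumptions made for illustration, not part of Sightova's actual API:

```python
# Hypothetical moderation routing over a flattened scan payload.
# Keys ("ai_probability", etc.), thresholds, and action strings are
# invented for this sketch.

def route(scan: dict) -> str:
    """Map one scan payload to a moderation action."""
    is_ai = scan["ai_probability"] >= 0.5
    is_nsfw = scan["nsfw_probability"] >= 0.8
    # A nonzero but localized manipulated region suggests inpainting/editing.
    partially_edited = 0.0 < scan["manipulated_fraction"] < 0.5

    if is_ai and is_nsfw:
        return "quarantine"                   # synthetic and explicit
    if is_ai and partially_edited:
        return "human_review"                 # localized edits: escalate
    if is_ai:
        return f"label:{scan['generator']}"   # transparency labeling
    return "allow"

# Invented example payloads exercising each branch.
assert route({"ai_probability": 0.90, "nsfw_probability": 0.95,
              "manipulated_fraction": 1.0, "generator": "flux"}) == "quarantine"
assert route({"ai_probability": 0.73, "nsfw_probability": 0.10,
              "manipulated_fraction": 0.2, "generator": "flux"}) == "human_review"
assert route({"ai_probability": 0.90, "nsfw_probability": 0.10,
              "manipulated_fraction": 0.9, "generator": "flux"}) == "label:flux"
```

Because every signal arrives in one payload, rules like these compose freely without cross-service joins or extra round trips.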
The entire pipeline runs in seconds on standard hardware. There are no separate endpoints to call, no orchestration to manage, and no additional cost per model.
Unified Dashboard Experience
For individual users and smaller teams working through the web dashboard, all four analyses appear in a single results view. The Analysis Results tab shows authenticity scores and generator identification. The Heatmap tab provides the pixel-level visualization. The Metadata & Provenance tab surfaces C2PA content credentials, EXIF data, and geolocation when available.
Everything you need to make a trust decision about an image is available from a single upload.
The Road Ahead
We're continuing to expand each of these models — training on emerging generators, improving heatmap resolution, and adding new safety classifications. The architecture is designed so that adding new analysis capabilities requires no changes to the API contract or dashboard workflow.
Our goal is straightforward: upload one image, understand everything about it.