Semantic search is coming to Granit

Imagine typing "person on the beach" into your catalog and seeing the right photo appear — without ever having tagged it. That's what the Granit team is working on right now.

What we're building

Multimodal semantic search lets you find a media file based on what it actually contains visually, not what you've written about it. Concretely, in Granit, that means being able to type:

  • "Sunset over the sea"
  • "Black and white portrait"
  • "Outdoor wedding atmosphere"
  • "Red car against an urban backdrop"

...and see the right media surface in your catalog, even without tags, without a dedicated folder, without a text description.

For a photographer delivering a selection to a client, or a studio juggling tens of thousands of assets, this is a category shift: you move from manual organization (folders + tags + memory) to a search engine that actually understands content.

Where we are

We've explored several paths in parallel: multiple AI models (both self-hosted open-source models and managed models from major providers) and several deployment strategies, from dedicated serverless to third-party APIs. Each approach has its trade-offs in result quality, latency, cost, and operational complexity.

Our internal tests are very promising. Natural-language queries surface the right media, even for fine-grained combinations (object + mood + style). We're currently tuning the relevance threshold and the search experience so that search stays smooth, even on large catalogs.
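
To make the idea of a relevance threshold concrete, here is a minimal sketch of one common way this kind of search can work. It is an illustration under assumptions, not a description of Granit's implementation: it assumes some model has already turned each media file and each text query into an embedding vector, and it ranks media by cosine similarity to the query. The function names, file names, and the tiny 3-dimensional vectors are all made up for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query_vec, catalog, threshold=0.5, limit=10):
    """Return media ids scoring at or above the threshold, best match first."""
    scored = [
        (cosine_similarity(query_vec, vec), media_id)
        for media_id, vec in catalog.items()
    ]
    hits = [pair for pair in scored if pair[0] >= threshold]
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return [media_id for _, media_id in hits[:limit]]


# Hypothetical embeddings; a real model would produce hundreds of dimensions.
catalog = {
    "beach_sunset.jpg": [0.9, 0.1, 0.0],
    "studio_portrait.jpg": [0.0, 0.2, 0.9],
    "harbor_dusk.jpg": [0.7, 0.6, 0.1],
}

# Stand-in for the embedding of the query "Sunset over the sea".
query = [1.0, 0.2, 0.0]

print(search(query, catalog, threshold=0.8))
```

In this sketch, raising the threshold returns fewer but closer matches, and lowering it returns more results at the risk of including unrelated media. The same similarity function could also serve a "photos similar to this one" lookup by using an existing media file's vector as the query.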

What this will unlock

In Granit, eventually:

  • Find any media in a few words, without ever having tagged it.
  • Navigate your portfolio by theme or mood rather than by folder.
  • Find "photos similar to this one" in one click from any media detail page.
  • Automatically group visually coherent assets to build moodboards or client selections.
  • Extend search to videos and other formats over time — the tech we're building is multimodal by design.

We don't have a public release date yet, but it's getting closer. We'll keep you posted.