Semantic search is coming to Granit
Imagine typing "person on the beach" into your catalog and seeing the right photo appear — without ever having tagged it. That's what we're working on at Granit right now.
What we're building
Multimodal semantic search lets you find a media file based on what it actually contains visually, not what you've written about it. Concretely, in Granit, that means being able to type:
- "Sunset over the sea"
- "Black and white portrait"
- "Outdoor wedding atmosphere"
- "Red car against an urban backdrop"
...and see the right media surface in your catalog, even without tags, without a dedicated folder, without a text description.
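To make the idea concrete, here is a minimal sketch of one common way such a search can work. It assumes an embedding approach: a model turns both the text query and each image into a vector, and results are ranked by cosine similarity. This illustrates the general technique and does not describe Granit's actual implementation. The file names, the three-dimensional vectors, and the `rank_media` helper are all hypothetical; a real model would produce vectors with hundreds of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction, so treat it as unrelated
    return dot / (norm_a * norm_b)


def rank_media(query_vector, catalog):
    """Score every media file against the query and sort best-first."""
    scored = [
        (name, cosine_similarity(query_vector, vector))
        for name, vector in catalog.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hand-made vectors standing in for real model output (assumption).
catalog = {
    "IMG_0001.jpg": [0.9, 0.1, 0.0],
    "IMG_0002.jpg": [0.1, 0.9, 0.1],
    "IMG_0003.jpg": [0.0, 0.2, 0.9],
}

# Pretend this vector is the embedding of the text "sunset over the sea".
query = [0.8, 0.2, 0.1]

results = rank_media(query, catalog)
for name, score in results:
    print(f"{name}: {score:.3f}")
```

No tag or description is involved at any point: the ranking depends only on how close each image vector is to the query vector.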
For a photographer delivering a selection to a client, or a studio juggling tens of thousands of assets, this is a category shift: you move from manual organization (folders + tags + memory) to a search engine that actually understands content.
Where we are
We've explored several paths in parallel: multiple AI models (self-hosted open-source models as well as managed models from major providers) and several deployment strategies, from dedicated serverless to third-party APIs. Each approach has its trade-offs in result quality, latency, cost, and operational complexity.
Our internal tests are very promising. Natural-language queries surface the right media, even on fine-grained combinations (object + mood + style). We're currently tuning the relevance threshold and the search experience to keep things fluid, even on large catalogs.
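As a rough illustration of what a relevance threshold does, the sketch below assumes each candidate already carries a similarity score between 0 and 1. The `filter_results` helper, the scores, and the default values are hypothetical and chosen only for the example; they are not Granit's settings.

```python
def filter_results(scored, threshold=0.3, limit=50):
    """Keep results scoring at or above the threshold, best-first, capped at limit."""
    kept = [(name, score) for name, score in scored if score >= threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:limit]


# Made-up scores for three files (assumption).
scored = [("a.jpg", 0.98), ("b.jpg", 0.36), ("c.jpg", 0.17)]

strict = filter_results(scored, threshold=0.5)  # fewer, more precise results
loose = filter_results(scored, threshold=0.1)   # more results, more noise

print(strict)
print(loose)
```

A higher threshold hides borderline matches at the risk of missing relevant media, while a lower one shows more at the cost of noise. That balance is the trade-off being tuned.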
What this will unlock
In Granit, eventually:
- Find any media in a few words, without ever having tagged it.
- Navigate your portfolio by theme or mood rather than by folder.
- Find "photos similar to this one" in one click from any media detail page.
- Automatically group visually coherent assets to build moodboards or client selections.
- Extend search to videos and other formats over time — the tech we're building is multimodal by design.
We don't have a public release date yet, but it's getting closer. We'll keep you posted.