Saturday, July 24, 2010


SauceNAO is a reverse image search engine focusing on anime- and manga-related images.

The SauceNAO site uses a modified version of the Image Query Database server code, which is at the heart of the IQDB reverse image search service.
The IQDB server works by taking an image supplied by the user, separating its color channels, and running a wavelet transform on each channel.
The output of the wavelet transform is weighted using a little bit of magic, and the transformed image is compared to the already processed images in the index to determine the closest match.
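To make those steps a bit more concrete, here is a rough sketch in plain Python. It is not the IQDB code. It assumes a Haar wavelet, small square images whose side is a power of two, and a single flat `weight` standing in for the "magic" weighting. The names `signature`, `score` and `best_match` are made up for illustration.

```python
from typing import Dict, List, Sequence, Set, Tuple

Pixel = Tuple[int, int, int]            # (R, G, B)
Image = List[List[Pixel]]               # assumed square, side a power of two
Signature = List[Tuple[float, Set[Tuple[int, int, bool]]]]


def split_channels(img: Image) -> List[List[List[float]]]:
    """Return three 2-D grids of floats, one per color channel."""
    return [[[float(px[c]) for px in row] for row in img] for c in range(3)]


def haar_1d(values: Sequence[float]) -> List[float]:
    """Full 1-D Haar decomposition using pairwise averages and differences."""
    out = list(values)
    n = len(out)
    while n > 1:
        half = n // 2
        avg = [(out[2 * i] + out[2 * i + 1]) / 2 for i in range(half)]
        diff = [(out[2 * i] - out[2 * i + 1]) / 2 for i in range(half)]
        out[:n] = avg + diff
        n = half
    return out


def haar_2d(grid: List[List[float]]) -> List[List[float]]:
    """Transform every row, then every column of the result."""
    rows = [haar_1d(r) for r in grid]
    cols = [haar_1d(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]


def signature(img: Image, keep: int = 8) -> Signature:
    """Per channel: the overall average plus the positions and signs of
    the `keep` largest-magnitude wavelet coefficients."""
    sig: Signature = []
    for chan in split_channels(img):
        coef = haar_2d(chan)
        n = len(coef)
        cells = [(abs(coef[y][x]), y, x)
                 for y in range(n) for x in range(n) if (y, x) != (0, 0)]
        cells.sort(reverse=True)
        top = {(y, x, coef[y][x] > 0) for mag, y, x in cells[:keep] if mag > 0}
        sig.append((coef[0][0], top))
    return sig


def score(a: Signature, b: Signature, weight: float = 10.0) -> float:
    """Lower is more similar. `weight` is a placeholder, not IQDB's weighting."""
    total = 0.0
    for (avg_a, top_a), (avg_b, top_b) in zip(a, b):
        total += abs(avg_a - avg_b) - weight * len(top_a & top_b)
    return total


def best_match(query: Signature, index: Dict[str, Signature]) -> str:
    """Return the key of the indexed signature closest to the query."""
    return min(index, key=lambda name: score(query, index[name]))
```

The `index` dictionary plays the part of the pre-processed images: signatures get computed once up front, so a search only has to transform the query image and compare small signatures instead of full images.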

It takes some serious power and a helluva lot of storage space to sort and pre-process a few hundred million image files, but having a large part of the work done already is what keeps the searches as fast as possible.
As an example, the pixiv index, at a mere 10 million images, takes a good 24 hours to generate from scratch, and pixiv is growing by leaps and bounds (~200k new images this week alone).

But before pre-processing can even begin, the content must first be gathered. Deciding what to index, and then finding a way to get it, is where the real challenge lies.
Guessing what people _might_ search for, or only indexing big-name titles, really doesn't cut it. Any random person can source the popular stuff...
The goal of SauceNAO is to gather both the popular things and the most ridiculously obscure things that neither I nor seemingly anyone else has heard of.

In that respect, only two of the indexes really pass muster: the H-Anime index and the pixiv index. Unfortunately, the former hasn't been updated in nearly 8 months, and is horribly out of date...
I'll get to that soon. >_>;

Next Updates:
pixiv - next couple days
H-Anime - few weeks
E-Anime (first release) - month or two

The E-Anime index will contain Ecchi Anime. Just a few hundred series to start with, but it'll keep expanding until everything is covered, or I run out of room... ;P

