We've had several problems pop up this year that have called for comparing a bunch of documents to a bunch of other documents, typically to find which ones are similar.
It's a simple problem on its face, but a difficult one to scale, because the number of pairs grows roughly with the square of the number of documents. Comparing several thousand documents to one another can call for tens of millions of individual comparisons. Tens of thousands of documents can mean hundreds of millions or even billions of comparisons, assuming you want to compare everything to everything else.
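
To put rough numbers on that, here is a minimal sketch of the arithmetic. It assumes each unordered pair of documents is compared exactly once, so n documents give n(n-1)/2 pairs; the function name `pairwise_comparisons` is just for illustration.

```python
from math import comb


def pairwise_comparisons(n_docs: int) -> int:
    """Count unordered pairs among n_docs documents: n * (n - 1) / 2."""
    # Assumption: comparing A to B is the same work as comparing B to A,
    # and no document is compared to itself.
    return comb(n_docs, 2)


if __name__ == "__main__":
    for n in (1_000, 5_000, 10_000, 50_000):
        print(f"{n:>6,} documents -> {pairwise_comparisons(n):>13,} comparisons")
```

Under that assumption, 5,000 documents come to about 12.5 million comparisons, 10,000 to about 50 million, and 50,000 to about 1.25 billion. If order matters (A against B is different from B against A), the counts roughly double.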