EMBench supports the automatic evaluation of matching algorithms. Unfortunately, the server does not currently accept uploads of binary files, so the online evaluation instead uses algorithms from the SecondString library, specifically JaroWinkler, Jaro, MongeElkan, and TFIDF. To evaluate your own matching algorithm, you need to download EMBench and run it locally (more information is here).
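For readers unfamiliar with the string measures named above, the following is a minimal sketch of the textbook Jaro and Jaro-Winkler similarities. It is not the SecondString implementation, which is written in Java and may differ in details such as the match window; the function names and the `prefix_scale` and `max_prefix` parameters are assumptions chosen for illustration.

```python
def jaro(s: str, t: str) -> float:
    """Textbook Jaro similarity in [0, 1]; an illustrative sketch only."""
    if s == t:
        return 1.0
    if not s or not t:
        return 0.0
    # Characters count as matching only if they lie within this distance.
    window = max(max(len(s), len(t)) // 2 - 1, 0)
    s_matched = [False] * len(s)
    t_matched = [False] * len(t)
    matches = 0
    for i, ch in enumerate(s):
        lo = max(0, i - window)
        hi = min(len(t), i + window + 1)
        for j in range(lo, hi):
            if not t_matched[j] and t[j] == ch:
                s_matched[i] = True
                t_matched[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Matched characters that appear in a different order count as
    # half-transpositions.
    s_seq = [c for c, m in zip(s, s_matched) if m]
    t_seq = [c for c, m in zip(t, t_matched) if m]
    transpositions = sum(a != b for a, b in zip(s_seq, t_seq)) // 2
    return (
        matches / len(s)
        + matches / len(t)
        + (matches - transpositions) / matches
    ) / 3.0


def jaro_winkler(s: str, t: str, prefix_scale: float = 0.1,
                 max_prefix: int = 4) -> float:
    """Jaro similarity boosted for strings sharing a common prefix."""
    base = jaro(s, t)
    prefix = 0
    for a, b in zip(s, t):
        if a != b or prefix == max_prefix:
            break
        prefix += 1
    return base + prefix * prefix_scale * (1.0 - base)
```

For the commonly used pair "MARTHA" and "MARHTA", this sketch gives a Jaro score of about 0.944 and a Jaro-Winkler score of about 0.961, since the two strings share the three-character prefix "MAR".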
The following evaluation is performed over collections containing 2500 entities, with 2000 entities destroyed using misspelling, abbreviation, and permutation modifiers. Note that these collection sizes were selected for demonstration purposes; EMBench can be used with generated collections of arbitrary size.
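To give a rough idea of what the three kinds of modifier mentioned above might do to an attribute value, here is a small sketch. The function names and their exact behavior are hypothetical and are not taken from EMBench, whose modifiers may work differently; the sketch is deterministic so that its effect is easy to follow.

```python
def misspell(value: str, position: int = 1) -> str:
    """Hypothetical misspelling: swap two adjacent characters."""
    if len(value) < 2:
        return value
    i = max(0, min(position, len(value) - 2))
    chars = list(value)
    chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)


def abbreviate(value: str) -> str:
    """Hypothetical abbreviation: shorten all but the last token to an initial."""
    tokens = value.split()
    if len(tokens) < 2:
        return value
    initials = [token[0] + "." for token in tokens[:-1]]
    return " ".join(initials + [tokens[-1]])


def permute(value: str) -> str:
    """Hypothetical permutation: rotate the tokens by one position."""
    tokens = value.split()
    return " ".join(tokens[1:] + tokens[:1])
```

With these definitions, `misspell("Smith")` yields "Simth", `abbreviate("John Ronald Smith")` yields "J. R. Smith", and `permute("John Smith")` yields "Smith John". A matching algorithm is then expected to recognise each modified value as referring to the same entity as the original.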
Last modified: July 2014. Page maintained by: Ekaterini Ioannou, Yannis Velegrakis