Machine Learning Problems: The Easy Parts

Contify

So, we implemented de-duplication algorithms to significantly reduce the resources required to process the information in these documents. We therefore had to write our own version of hierarchical clustering that returns a large number of tight clusters or duplicates.

Machine Learning Problems: The Easy Parts

Contify

So, we implemented de-duplication algorithms to significantly reduce the resources required to process the information in these documents. We therefore had to write our own version of hierarchical clustering that returns a large number of tight clusters or duplicates.