Machine Learning Problems: The Easy Parts

Contify

So, we implemented de-duplication algorithms to significantly reduce the resources required to process the information in these documents. The features are based on the frequency and importance of entities among other things (discussed at a later point in this post). Inverse document frequency. Entity Identification.
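As a rough illustration of how inverse document frequency can feed a de-duplication feature, the sketch below weights each entity by its rarity across the collection and flags document pairs whose weighted entity overlap is high. This is not Contify's actual method: the function names, the weighted Jaccard measure, and the 0.8 threshold are all assumptions made for the example, and entity identification is assumed to have already produced a list of entity strings per document.

```python
import math
from collections import Counter

def idf_weights(docs):
    """Map each entity to log(N / df), where df counts the documents containing it."""
    n = len(docs)
    df = Counter(entity for doc in docs for entity in set(doc))
    return {entity: math.log(n / count) for entity, count in df.items()}

def weighted_overlap(a, b, idf):
    """IDF-weighted Jaccard similarity between two entity collections."""
    a, b = set(a), set(b)
    union = sum(idf.get(e, 0.0) for e in a | b)
    if union == 0:
        # Only entities that occur in every document: no signal either way.
        return 0.0
    return sum(idf.get(e, 0.0) for e in a & b) / union

def find_duplicates(docs, threshold=0.8):
    """Return index pairs whose weighted overlap meets the (assumed) threshold."""
    idf = idf_weights(docs)
    return [
        (i, j)
        for i in range(len(docs))
        for j in range(i + 1, len(docs))
        if weighted_overlap(docs[i], docs[j], idf) >= threshold
    ]

# Hypothetical entity lists, one per document.
docs = [
    ["Acme", "Globex", "merger"],
    ["Acme", "Globex", "merger"],
    ["Initech", "layoffs"],
    ["Acme", "earnings"],
]
duplicate_pairs = find_duplicates(docs)
```

Rare entities dominate the score, so two documents that share only a very common entity are not treated as duplicates. The pairwise comparison is quadratic in the number of documents; a real pipeline would likely need blocking or hashing to scale.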

14 Quick Tips for Kick-Ass Lead Management

Hubspot

Determine the validity of a lead. We recommend de-duplicating leads by email address at the very least, and you should also verify details such as zip code, phone number, and email address when possible so that lead records stay up to date and usable. What constitutes a junk lead? Score your leads.
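De-duplicating leads by email address can be sketched in a few lines. The example below is a minimal illustration, not HubSpot's implementation: the lead records are plain dictionaries, the "email" field name is hypothetical, and treating addresses as case-insensitive is an assumption that holds for most mail providers but is not guaranteed by the email standards.

```python
def normalize_email(email):
    """Trim whitespace and lowercase the address (assumed to be case-insensitive)."""
    return email.strip().lower()

def dedupe_leads(leads):
    """Keep the first lead seen for each normalized email address."""
    seen = {}
    for lead in leads:
        key = normalize_email(lead["email"])
        if key not in seen:
            seen[key] = lead
    return list(seen.values())

# Hypothetical lead records.
leads = [
    {"email": "Ann@Example.com", "zip": "02139"},
    {"email": " ann@example.com ", "zip": "02139"},
    {"email": "bob@example.com", "zip": "10001"},
]
unique_leads = dedupe_leads(leads)
```

Keeping the first record is the simplest policy. Merging fields from later duplicates, such as a newer phone number, would be a reasonable alternative when the goal is to keep records up to date.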