Updated on January 3, 2018
I get a lot of questions about what the confidence score is. This is hard to answer because, to be frank, the way I calculate it keeps evolving. Every time I identify a new source, find a new way to interpret an existing source, or add a new database of patterns, the scoring changes.
What I can say about it is this:
The confidence score is a summary of the strength of multiple sources I use to determine if an email address exists.
The sources include social media databases, our own email and pattern databases, and mail server testing resources. How those sources convert to a score is determined by the constantly evolving Toofr algorithm, which I adjust based on source performance, feedback from our customers, and our own internal testing.
Importantly, I also re-compute the confidence score of an email address every 30 days. This is why you may see the score for a given email address change over time.
As of this writing, a rough breakdown of our confidence scoring looks like this:
- Low: Only a pattern match. This is our weakest signal since it is based only on the most popular pattern for a given domain. Emails in this score range will have a ~30% bounce rate.
- Medium: Only a social media match or a highly skewed pattern match. This is a "decent" score in that some of our sources identified that the email existed at some point. It does not confirm that it exists today. Emails in this score range will have a ~20% bounce rate.
- High: Mail server testing is positive. This is our best score and it should yield a bounce rate under 5%. It means there is a positive mail server test plus either a social media match or high pattern skew, and that all but guarantees the email won't bounce. It also means that the domain is not a "catchall", which often trips up our competitors.
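To make the breakdown above concrete, here is a minimal Python sketch of how the three tiers could be derived from the signals described. It is an illustration only, not the actual Toofr algorithm: the `Signals` fields, the `classify` and `expected_bounces` functions, and the fallback behavior for cases the list does not cover (for example, a positive mail server test on a catchall domain) are all assumptions. The bounce rates are the rough figures quoted above, with 5% used as the upper bound for the High tier.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional


class Tier(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


@dataclass(frozen=True)
class Signals:
    # Hypothetical field names; the article only names the kinds of sources.
    pattern_match: bool = False         # matches the domain's most popular pattern
    high_pattern_skew: bool = False     # one pattern strongly dominates the domain
    social_match: bool = False          # address found in a social media database
    mail_server_positive: bool = False  # mail server test came back positive
    catchall_domain: bool = False       # domain accepts mail for any address


# Rough bounce rates from the breakdown above (High uses its 5% upper bound).
APPROX_BOUNCE_RATE: Dict[Tier, float] = {
    Tier.LOW: 0.30,
    Tier.MEDIUM: 0.20,
    Tier.HIGH: 0.05,
}


def classify(s: Signals) -> Optional[Tier]:
    """Map a set of signals to a tier, or None if no source matched."""
    corroborated = s.social_match or s.high_pattern_skew
    # High: positive mail server test, corroborated, and not a catchall domain.
    if s.mail_server_positive and corroborated and not s.catchall_domain:
        return Tier.HIGH
    # Medium: social media match or highly skewed pattern match.
    # Assumption: uncovered cases (e.g. catchall domains) fall through to here.
    if corroborated:
        return Tier.MEDIUM
    # Low: only the most popular pattern for the domain matched.
    if s.pattern_match:
        return Tier.LOW
    return None


def expected_bounces(counts: Dict[Tier, int]) -> float:
    """Estimate how many emails would bounce, given a count of emails per tier."""
    return sum(n * APPROX_BOUNCE_RATE[tier] for tier, n in counts.items())
```

For example, `classify(Signals(mail_server_positive=True, social_match=True))` returns `Tier.HIGH`, while the same signals with `catchall_domain=True` fall back to `Tier.MEDIUM` under the assumption noted above. Calling `expected_bounces` with a per-tier count gives a rough estimate of how many bounces to plan for on a list.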