Extrapolating accuracy indirectly

January 14, 2018

Say I have a list of 1,000,000 random numbers (the "master" list). I then compute 10 other lists that should, ideally, be exactly like the master list. Comparing every list to the master list is too slow/expensive, so I compare only the first one, list A, and find that it is 86% identical to the master list.

Now suppose I compare the remaining 9 lists to list A instead, which is presumably a much faster/cheaper process. Can I assume that a list which is close to 100% identical to list A will be close to 86% identical to the master list (maybe a bit better or a bit worse than list A)? And would it be safe to conclude that a list that is < 86% identical to list A is also < 86% identical to the master list?
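
To make the two questions concrete, here is a rough sketch of the bounds involved. It assumes "identical" means the lists are compared position by position, so that the number of mismatches behaves like a Hamming distance and obeys the triangle inequality. The function name `match_bounds` and its parameters are made up for illustration, and a different similarity measure could give different answers.

```python
def match_bounds(n, a_vs_master, b_vs_a):
    """Range of possible position-wise matches between list B and the master.

    n           -- length of every list (assumed equal)
    a_vs_master -- positions where list A matches the master list
    b_vs_a      -- positions where list B matches list A

    Assumes position-by-position comparison, so mismatch counts obey
    the triangle inequality.
    """
    # Worst case: every position where B differs from A is one where A was right.
    lower = max(0, a_vs_master + b_vs_a - n)
    # Best case: B's differences from A overlap A's errors as much as possible
    # and B happens to be right at each of those positions.
    upper = n - abs(a_vs_master - b_vs_a)
    return lower, upper


n = 1_000_000
a_vs_master = 860_000  # list A is 86% identical to the master

for b_vs_a in (1_000_000, 990_000, 800_000):
    lower, upper = match_bounds(n, a_vs_master, b_vs_a)
    print(f"B vs A: {b_vs_a / n:.0%} -> B vs master: {lower / n:.0%} to {upper / n:.0%}")
```

Under that assumption, a list that is 99% identical to list A is somewhere between 85% and 87% identical to the master, which supports the first guess. A list that is only 80% identical to list A could be anywhere from 66% to 94% identical to the master, so the second conclusion does not follow from the comparison alone: such a list might differ from A precisely where A is wrong.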

