Channel: Dino's Anabasis

Levenshtein distance between 10 million usernames and their passwords


Mark Burnett, a security researcher, recently released a collection of 10 million passwords along with their usernames. My question was: how different are those usernames from their passwords? I spent a little time on a simple analysis of the Levenshtein distance between each username and its password, and plotted the results in the graph below.

What this means is: if people in this dataset used their username as their password (e.g., username dino, password dino) but then changed it a little (password dino1), how many insertions, deletions, or substitutions did they make to get from the username to the password? See for yourself.

A distance of 0 means the username and password are exactly identical (in the graph below, 213,133 passwords are the same as their usernames). A distance of 1 means one character was added, deleted, or changed. And so on.
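
For readers who want to try something similar, here is a minimal sketch of how such a count could be produced. It is not the code behind the graph. The function names `levenshtein` and `distance_histogram` and the three sample pairs are made up for illustration, and the sketch assumes the dataset has already been parsed into (username, password) tuples.

```python
from collections import Counter


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn one string into the other."""
    if len(a) < len(b):
        a, b = b, a  # the distance is symmetric; keep the inner row short

    # previous[j] holds the distance between the first i-1 characters
    # of a and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # drop a character
                current[j - 1] + 1,            # add a character
                previous[j - 1] + (ca != cb),  # change a character (free if equal)
            ))
        previous = current
    return previous[-1]


def distance_histogram(pairs):
    """Count how many (username, password) pairs fall at each distance."""
    return Counter(levenshtein(user, password) for user, password in pairs)


# Hypothetical sample rows, not taken from the real dataset.
sample = [("dino", "dino"), ("dino", "dino1"), ("alice", "hunter2")]
histogram = distance_histogram(sample)
for distance in sorted(histogram):
    print(distance, histogram[distance])
```

On the full 10 million pairs, a pure-Python loop like this would likely be slow, so a compiled Levenshtein implementation may be a better fit. The counting step would stay the same.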

The post Levenshtein distance between 10 million usernames and their passwords appeared first on Dino's Anabasis.

