Text-based passwords are the most widely used authentication mechanism. However, passwords suffer from well-known drawbacks and vulnerabilities, mainly due to the limited complexity and inherent structure of human-generated passwords, which heavily restrict the regions of the password space in which such passwords reside. Traditional password guessing tools exploit this markedly uneven distribution to generate high-probability guesses that fall in the dense areas of the space where human-like passwords reside. These tools approximate the distribution of human-like passwords primarily through rules handcrafted by human experts, a laborious task that requires a high level of domain-specific expertise.
To overcome the limitations of traditional tools, several unsupervised learning approaches to password guessing based on generative models have recently been proposed. These generative models are designed to autonomously learn the structure and patterns that are characteristic of human-generated passwords, with the goal of improving password guessing performance and removing the need for domain-specific expertise. In this project we aim to create a novel generative model architecture based on generative flows and study its applicability and performance in the password guessing scenario, comparing our model to other state-of-the-art techniques.
To the best of our knowledge, we are the first to adopt flow architectures in this domain and demonstrate their effectiveness, and among the first to explore the applicability of flow networks outside the continuous domain of image generation.
We would like to highlight the interesting features that generative flows offer thanks to the structure of their latent space, showing that it allows operations such as interpolation between different password samples and exploration of dense areas of the space.
These properties could lead to a family of models that allow for a smarter latent space exploration.
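To make the interpolation idea concrete, the following is a minimal sketch and not the model proposed here. It assumes a hypothetical fixed-length character encoding (`encode`/`decode`, `MAX_LEN`, `CHARS`) and uses a single affine coupling layer with fixed random weights (`ToyCoupling`) as a stand-in for a trained flow. Because the map is exactly invertible, two passwords can be mapped to latent points, the points interpolated linearly, and each intermediate point mapped back to a candidate password.

```python
import numpy as np

PAD = "_"
CHARS = PAD + "abcdefghijklmnopqrstuvwxyz0123456789"  # hypothetical alphabet
MAX_LEN = 8  # hypothetical fixed length; shorter passwords are padded

def encode(password):
    """Map a password to a vector of character indices scaled to [0, 1]."""
    padded = password.ljust(MAX_LEN, PAD)
    idx = np.array([CHARS.index(c) for c in padded], dtype=np.float64)
    return idx / (len(CHARS) - 1)

def decode(x):
    """Round each coordinate to the nearest character index and strip padding."""
    idx = np.clip(np.rint(x * (len(CHARS) - 1)), 0, len(CHARS) - 1).astype(int)
    return "".join(CHARS[i] for i in idx).rstrip(PAD)

class ToyCoupling:
    """One affine coupling layer with fixed random weights (untrained stand-in)."""

    def __init__(self, dim, seed=0):
        rng = np.random.default_rng(seed)
        self.half = dim // 2
        self.w_s = rng.normal(scale=0.1, size=(self.half, dim - self.half))
        self.w_t = rng.normal(scale=0.1, size=(self.half, dim - self.half))

    def forward(self, x):
        # First half passes through unchanged; second half is scaled and shifted.
        a, b = x[:self.half], x[self.half:]
        s, t = np.tanh(a @ self.w_s), a @ self.w_t
        return np.concatenate([a, b * np.exp(s) + t])

    def inverse(self, z):
        # Exact inverse of forward: recompute s and t from the untouched half.
        a, b = z[:self.half], z[self.half:]
        s, t = np.tanh(a @ self.w_s), a @ self.w_t
        return np.concatenate([a, (b - t) * np.exp(-s)])

def interpolate(flow, p1, p2, steps=5):
    """Decode evenly spaced points on the latent segment between p1 and p2."""
    z1, z2 = flow.forward(encode(p1)), flow.forward(encode(p2))
    alphas = np.linspace(0.0, 1.0, steps)
    return [decode(flow.inverse((1 - a) * z1 + a * z2)) for a in alphas]

flow = ToyCoupling(MAX_LEN)
path = interpolate(flow, "jimmy91", "123456")
print(path)
```

With an untrained layer the intermediate strings carry no meaning; only the endpoints are guaranteed to decode back to the original passwords. In a trained flow, the intermediate points would be expected to decode to plausible passwords that blend the structure of the two endpoints.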
We believe that the locality property of the latent space, first described in [1], implies that similar classes of passwords (i.e., passwords with similar structures and patterns) are organized in related areas of the latent space. We want to demonstrate that this property can be used to introduce biases in the password generation process, forcing the model to explore regions of interest in the space. We aim to analyze the geometry of the latent space learned by our model and demonstrate the above-mentioned smoothness and locality properties.
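One simple way to bias generation toward a region of interest, sketched here under stated assumptions, is to sample latent points from an isotropic Gaussian centred on the latent code of a chosen pivot password. The function `sample_around`, the pivot vector `z_pivot`, and the `sigma` values are hypothetical and purely illustrative; this is not necessarily the sampling strategy used in this project. Each sampled point would then be passed through the inverse of the trained flow to obtain a candidate password, a step omitted here because it depends on the trained model.

```python
import numpy as np

def sample_around(z_pivot, sigma, n, seed=0):
    """Draw n latent points from an isotropic Gaussian centred on z_pivot."""
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal((n, z_pivot.shape[0]))
    return z_pivot + sigma * noise

# Hypothetical latent code of a pivot password (illustrative values only).
z_pivot = np.array([0.3, -1.2, 0.8, 0.1])

# A small sigma keeps samples close to the pivot; a larger one explores further.
near = sample_around(z_pivot, sigma=0.05, n=1000)
far = sample_around(z_pivot, sigma=0.5, n=1000)

mean_dist_near = np.linalg.norm(near - z_pivot, axis=1).mean()
mean_dist_far = np.linalg.norm(far - z_pivot, axis=1).mean()
print(mean_dist_near, mean_dist_far)
```

If the locality property holds, a small `sigma` should yield passwords structurally close to the pivot, while a larger `sigma` trades that similarity for broader coverage of the space.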
[1] Pasquini, D., Gangwal, A., Ateniese, G., Bernaschi, M., Conti, M.: Improving password guessing via representation learning. In: 2021 IEEE Symposium on Security and Privacy (SP). pp. 265–282. IEEE Computer Society, Los Alamitos, CA, USA (May 2021).