Year: 
2017
Name and position of the project proposer: 
sb_p_624381
Abstract: 

In the social sciences, the analysis of behaviors or choices often involves statistical models in which the response variable is observed only if a particular (selection) condition is met. The selection mechanism is therefore not random and sample selection problems arise; the methodology introduced by Heckman (1978, 1979) addresses this problem.
A more complex situation arises when the selection mechanism leads to a vast number of censored observations, so that the amount of data available for estimation may become very small. In this situation, random sampling is either inefficient, because it is very costly, or not feasible. To reduce the cost of collecting data on choice behavior, choices rather than decision makers are often sampled, yielding a more balanced sample than random sampling would produce (hence the name response-based, or choice-based, sampling). In this context, Greene (1992) used the Weighted Endogenous Sampling Maximum Likelihood (WESML) estimator proposed by Manski and Lerman (1977), which, however, requires the true population proportions of cases to be known.
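As a rough illustration of the WESML idea mentioned above, the sketch below weights each observation by the ratio of the population share to the sample share of its outcome and plugs those weights into a binary-choice log-likelihood. The logit link, the function names `wesml_weights` and `weighted_logit_loglik`, and the 10% population share are assumptions made only for this example; they are not taken from the project.

```python
import numpy as np

def wesml_weights(y, pop_share_1):
    """Manski-Lerman style weights: population share / sample share per outcome."""
    y = np.asarray(y)
    sample_share_1 = y.mean()
    w1 = pop_share_1 / sample_share_1
    w0 = (1.0 - pop_share_1) / (1.0 - sample_share_1)
    return np.where(y == 1, w1, w0)

def weighted_logit_loglik(beta, X, y, w):
    """Weighted log-likelihood of a binary logit (the link is an assumption here)."""
    xb = X @ beta
    # log F(xb) and log(1 - F(xb)), computed stably with logaddexp
    log_p1 = -np.logaddexp(0.0, -xb)
    log_p0 = -np.logaddexp(0.0, xb)
    return np.sum(w * (y * log_p1 + (1 - y) * log_p0))

# Toy choice-based sample: half of the observations have y = 1,
# while the assumed (known) population share of y = 1 is 10%.
y = np.array([1, 1, 0, 0])
w = wesml_weights(y, pop_share_1=0.10)
```

Maximizing `weighted_logit_loglik` over `beta` (for instance with a general-purpose optimizer) would give the weighted estimate. Note that the weights need `pop_share_1` as an input, which is exactly the requirement the project aims to relax.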
Our research aims at deriving an alternative estimator for a binary choice model with sample selection problems and choice-based sampling, which generalizes Greene's proposal.
A subsequent goal is to handle the possible misclassification of the response variable; for example, in fraud detection based on claims data, some claims classified as honest might actually be fraudulent (and vice versa). In particular, we will try to estimate the two misclassification probabilities simultaneously with the parameters of the binary choice model, while still accounting for both kinds of sample bias.
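One common way to write a binary choice model with a misclassified response, shown here only as a sketch and ignoring the sample selection and choice-based sampling corrections, is the following. The symbols $\alpha_0$, $\alpha_1$, $F$, $\beta$, $y$ and $y^{\text{obs}}$ are notation assumed for this illustration, with $y$ the true response and $y^{\text{obs}}$ the recorded one.

```latex
\alpha_0 = \Pr(y^{\text{obs}} = 1 \mid y = 0), \qquad
\alpha_1 = \Pr(y^{\text{obs}} = 0 \mid y = 1),
\]
\[
\Pr(y^{\text{obs}} = 1 \mid x)
  = \alpha_0 \bigl[1 - F(x'\beta)\bigr] + (1 - \alpha_1)\, F(x'\beta)
  = \alpha_0 + (1 - \alpha_0 - \alpha_1)\, F(x'\beta).
```

Under this formulation, $\alpha_0$ and $\alpha_1$ enter the likelihood alongside $\beta$, which is what allows them to be estimated jointly with the model parameters.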
We will assess the performance of our proposal through appropriate simulation studies and, finally, apply the estimation procedure to real data; in particular, we will analyze the data on consumer loan default and credit card expenditure used by Greene (1992), so that our results can be compared with his.

Research group members: 
sb_cp_is_933856
Innovativeness: 

The first goal of the research constitutes a development of Greene's proposal. In fact, our procedure corrects for the two forms of sample bias without assuming that the true population proportions of cases for the response variable are known; on the contrary, it allows these proportions to be estimated. This feature makes our proposal appealing for all analyses concerning elusive populations or phenomena, such as child labor, undeclared or irregular work, and fraud detection.
The third aim of the research also represents an advance on the literature, where the problem of a misclassified dependent variable is addressed in isolation, without connection to other sources of bias and inconsistency.

Call code: 
624381
Keywords: 

© Università degli Studi di Roma "La Sapienza" - Piazzale Aldo Moro 5, 00185 Roma