Algorithmic methods are the driving force behind the mechanisms of interaction between people and online markets. Examples include hiring workers over the Internet, algorithmic pricing of goods, matching customers to service providers in the sharing economy, and recommending content and products. While these tasks have a deep impact on people's everyday lives, it is clear that some of these decisions can be biased by the algorithmic methods adopted or by the data used to train the machine learning algorithms, and this is one of the main reasons behind the current daily debate on their use.
The main research problem we face is the conflict between the algorithm designer's objective of optimizing economic benefit and the importance of providing solutions that ensure fairness. Introducing fairness as a constraint opens a whole new set of challenging problems and requires the development of new algorithmic solutions. Fairness can make algorithmic solutions considerably harder to compute, and the optimal fair solution can achieve a worse objective value than the optimal unconstrained solution.
Our approach is to use algorithmic approximation and regret-minimization analysis of online learning methods in order to ensure that the quality of the fair solutions produced by our algorithms and market mechanisms is provably close to the quality of the optimal unconstrained solution whenever possible. We will also investigate methods that remove algorithmic discrimination by design, by executing a preprocessing phase before the optimization algorithm is applied. We will focus our investigation on the design of fair algorithms and mechanisms for Internet advertising, recommender systems, online labor marketplaces, and fair clustering and classification.
i) Fairness in auction design for Internet advertising and two-sided markets.
We intend to investigate fairness in Internet advertising, starting from the position auctions used for sponsored search. The generalized second-price (GSP) mechanism for Internet advertising offers a good compromise between optimizing the efficiency of advertisers and the revenue of the auctioneer. Unfortunately, the GSP mechanism is neither incentive compatible nor envy-free. However, it has been proved [11] that the Nash equilibria of the GSP mechanism provide a close approximation of the optimal efficiency.
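For concreteness, the following minimal sketch (in Python, with made-up bids and click-through rates, and omitting quality scores and reserve prices) illustrates the standard GSP allocation and per-click payment rule referred to above.

    def gsp_outcome(bids, ctrs):
        """bids: advertiser -> bid per click; ctrs: click-through rates of the
        slots, sorted from best to worst. Returns (slot allocation, per-click payments)."""
        ranking = sorted(bids, key=bids.get, reverse=True)  # rank advertisers by bid
        allocation, payments = {}, {}
        for slot, advertiser in enumerate(ranking[:len(ctrs)]):
            allocation[advertiser] = slot
            # GSP rule: each winner pays the bid of the advertiser ranked just below
            next_bid = bids[ranking[slot + 1]] if slot + 1 < len(ranking) else 0.0
            payments[advertiser] = next_bid
        return allocation, payments

    alloc, pay = gsp_outcome({"a": 5.0, "b": 3.0, "c": 1.0}, ctrs=[0.30, 0.15])
    # "a" wins the top slot and pays 3.0 per click; "b" wins the second slot and pays 1.0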
One question that we plan to address is the integration of GSP with fair division schemes in order to achieve a desired level of fairness between several categories of users/advertisers. More specifically, we aim to integrate GSP with the envy-free up to one item (EF1) and envy-free up to any item (EFX) fair division schemes [17] to obtain sponsored search mechanisms that are both efficient and fair. We plan to study the problem of learning the fair equilibria through a series of repeated games, and to compensate, through payments, the efficiency loss incurred by the majority group due to the fair division scheme.
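As an illustration of the fairness criterion involved, the sketch below checks the EF1 condition for an allocation of indivisible items under additive valuations; the function name and the encoding of valuations are illustrative choices, not part of the mechanisms we plan to design.

    def is_ef1(valuations, allocation):
        """valuations[i][g]: value of agent i for item g;
        allocation[j]: set of items assigned to agent j.
        EF1 holds iff no agent envies another agent's bundle once some
        single item is removed from that bundle."""
        for i, v in valuations.items():
            own = sum(v[g] for g in allocation[i])
            for j, bundle in allocation.items():
                if i == j or not bundle:
                    continue
                other = sum(v[g] for g in bundle)
                # under additive valuations it suffices to drop the item of the
                # bundle that agent i values the most
                if own < other - max(v[g] for g in bundle):
                    return False
        return True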
A further research question that we will address is the design of non-discriminatory pricing in two-sided markets [12,13], which model several online markets of the sharing economy (Airbnb, Uber, Lyft, Amazon MTurk) in which both service providers and customers are strategic agents. We plan to investigate the efficiency of non-discriminatory compensation schemes for sellers and non-discriminatory allocations to customers.
ii) Fairness and non-disparate impact in online labor markets.
In this second objective we address fairness issues in algorithmic and mechanism design for online labor marketplaces. Private attributes such as gender and ethnicity should not affect hiring, workers' compensation, or the role assigned within a team. We plan to develop optimization algorithms and mechanisms for the two-sided matching problems arising in task allocation to online workers that ensure diversification between different groups. Our approach is to design fair online competitive and approximation algorithms that effectively process the tasks at low cost [6,7].
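One simple way to make the diversification requirement concrete is sketched below, under simplifying assumptions of our own rather than the model we will necessarily adopt: cap the share of tasks that any single group of workers may receive and assign greedily within that cap.

    def fair_greedy_assignment(values, group, n_tasks, max_share):
        """values[(task, worker)]: value of assigning the task to the worker;
        group[worker]: protected attribute of the worker;
        max_share: maximum fraction of tasks any single group may receive.
        Assumes the caps leave at least one eligible worker for every task."""
        cap = int(max_share * n_tasks)
        load = {}          # number of tasks assigned so far to each group
        assignment = {}
        for task in range(n_tasks):
            eligible = [w for w in group if load.get(group[w], 0) < cap]
            best = max(eligible, key=lambda w: values[(task, w)])
            assignment[task] = best
            load[group[best]] = load.get(group[best], 0) + 1
        return assignment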
Preliminary investigation by the PI has established that this class of problems becomes much harder once the fairness constraint is imposed. Given this difficulty, we also plan to study algorithms tailored to realistic instances. We plan to consider input instances obtained by assigning each worker to a random category, or models with a planted fair solution that the algorithm must discover. We plan to evaluate the algorithmic solutions both theoretically and on real-life datasets obtained from labor marketplaces.
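The sketch below is one possible instantiation of such a planted model, with illustrative cost values: each worker receives a uniformly random category, and a hidden matching is given low cost, so that an algorithm can be tested on its ability to recover it.

    import random

    def planted_fair_instance(n_tasks, n_workers, n_categories, low=1.0, high=10.0):
        """Each worker gets a uniformly random category; a hidden assignment pairs
        task i with worker i at low cost, while all other pairs receive higher
        random costs. Assumes n_workers >= n_tasks."""
        category = {w: random.randrange(n_categories) for w in range(n_workers)}
        planted = {t: t for t in range(n_tasks)}      # hidden task -> worker matching
        cost = {(t, w): (low if planted[t] == w else random.uniform(low + 1, high))
                for t in range(n_tasks) for w in range(n_workers)}
        return category, cost, planted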
In a closely related research direction, we plan to design mechanisms that provide the right incentives to workers in terms of economic rewards without discriminating between different categories of workers. This type of constraint on the payments rules out classical VCG solutions, and therefore new ideas are needed. Our problem is strictly harder than the classical budget-feasible mechanism design problem [14], since the bound on the total payment for each category of workers is not defined a priori.
iii) Fairness in large-scale data analysis.
Algorithm design for online markets requires working on problems with millions of users and billions of data points. We therefore consider fairness in algorithmic design problems that are crucial for clustering and classification, since many decisions in online markets are based on the application of algorithms for these tasks.
Following the notion of disparate impact, we define the output of a network data analysis algorithm, e.g. clustering, to be fair if it is uncorrelated with the coloring of the vertices of the graph [8]. Similarly, we define the output of a classification algorithm to be fair if it is uncorrelated with the private attribute of the elements to be classified. We plan to apply this notion of fairness to density-based problems on graphs and metric spaces, such as sparsest cut, densest subgraph, k-means, k-center, and principal component analysis (PCA).
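To illustrate how such de-correlation can be measured in practice, the following sketch computes the largest deviation of any cluster's color proportions from the global proportions; the metric and its name are illustrative, and a value of 0 corresponds to clusters that perfectly mirror the population.

    from collections import Counter

    def max_proportion_gap(labels, colors):
        """labels[i]: cluster of point i; colors[i]: color (protected attribute) of point i.
        Returns the largest deviation of a cluster's color fraction from the
        corresponding global fraction; 0 means every cluster mirrors the population."""
        n = len(labels)
        global_frac = {c: k / n for c, k in Counter(colors).items()}
        gap = 0.0
        for cl in set(labels):
            members = [colors[i] for i in range(n) if labels[i] == cl]
            local = Counter(members)
            for c, gf in global_frac.items():
                gap = max(gap, abs(local.get(c, 0) / len(members) - gf))
        return gap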
Preliminary investigation by the PI shows that for some fair versions of these data analysis problems, computing any constant factor approximation is NP-hard, assuming the Small Set Expansion Hypothesis. For the problems that do not admit good fair approximations, we plan to investigate color-constrained generalizations of planted models and show that, using a de-correlation preprocessing step, it is possible to uncover the fair ground truth. In order to overcome the hardness results in practice, we also plan to investigate the application of orthogonal transformations as a preprocessing step for achieving fairness.
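As a sketch of this kind of preprocessing, one simple linear variant, assuming a single numerically encoded protected attribute and not necessarily the transformation we will adopt, projects the data onto the subspace uncorrelated with the protected attribute before the unconstrained algorithm is run.

    import numpy as np

    def decorrelate(X, s):
        """X: n x d data matrix; s: length-n vector encoding the protected attribute.
        Returns a transformed data matrix whose columns are uncorrelated with s."""
        Xc = X - X.mean(axis=0)
        s = np.asarray(s, dtype=float)
        sc = (s - s.mean()).reshape(-1, 1)
        # feature-space direction explained by the protected attribute (least squares)
        w = Xc.T @ sc / (sc.T @ sc)        # d x 1 vector of regression coefficients
        return Xc - sc @ w.T               # remove the component correlated with s

    # PCA or k-means can then be applied to the transformed matrix.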