TY - JOUR
T1 - Unsupervised classification for uncertain varying responses
T2 - The wisdom-in-the-crowd (WICRO) algorithm
AU - Ratner, Nir
AU - Kagan, Eugene
AU - Kumar, Parteek
AU - Ben-Gal, Irad
N1 - Publisher Copyright: © 2023
PY - 2023/7/19
Y1 - 2023/7/19
N2 - This paper addresses the problem of classifying instances/questions based on the opinions (classes) provided by anonymous agents. The solution aggregates the agents’ classifications, aiming to come as close as possible to an unknown correct classification. However, the agents’ fields or domains of competence and their levels of expertise are unknown and can vary extensively. Many popular classification algorithms address this problem by following a “wisdom-of-the-crowd” approach, using different voting methods and expectation–maximization techniques. These algorithms yield correct classifications when the majority of the agents are experts who classify the instances correctly. However, they often produce erroneous classifications when only a small subset of the agents is correct. Moreover, these algorithms often assume a fixed set of classes for all instances. This study presents a fast (one-pass) classification algorithm that estimates the agents’ unknown expertise levels and aggregates their classifications accordingly, even when these are obtained from different questionnaires, that is, when the instances are not necessarily classified into a fixed set of classes. The proposed algorithm distinguishes the expert from the nonexpert agents for each question by analyzing the distance between them. It identifies the expert agents for each instance and then classifies the instance accordingly. The suggested algorithm is validated and compared against known methods using both simulated datasets and real-world datasets collected from various sources. The results demonstrate the effectiveness and advantages of the proposed method.
AB - This paper addresses the problem of classifying instances/questions based on the opinions (classes) provided by anonymous agents. The solution aggregates the agents’ classifications, aiming to come as close as possible to an unknown correct classification. However, the agents’ fields or domains of competence and their levels of expertise are unknown and can vary extensively. Many popular classification algorithms address this problem by following a “wisdom-of-the-crowd” approach, using different voting methods and expectation–maximization techniques. These algorithms yield correct classifications when the majority of the agents are experts who classify the instances correctly. However, they often produce erroneous classifications when only a small subset of the agents is correct. Moreover, these algorithms often assume a fixed set of classes for all instances. This study presents a fast (one-pass) classification algorithm that estimates the agents’ unknown expertise levels and aggregates their classifications accordingly, even when these are obtained from different questionnaires, that is, when the instances are not necessarily classified into a fixed set of classes. The proposed algorithm distinguishes the expert from the nonexpert agents for each question by analyzing the distance between them. It identifies the expert agents for each instance and then classifies the instance accordingly. The suggested algorithm is validated and compared against known methods using both simulated datasets and real-world datasets collected from various sources. The results demonstrate the effectiveness and advantages of the proposed method.
KW - Data analysis
KW - Expert voting
KW - Online decision-making
KW - Unsupervised classification
KW - Wisdom-of-the-crowd
UR - http://www.scopus.com/inward/record.url?scp=85156235381&partnerID=8YFLogxK
U2 - 10.1016/j.knosys.2023.110551
DO - 10.1016/j.knosys.2023.110551
M3 - Article
AN - SCOPUS:85156235381
SN - 0950-7051
VL - 272
JO - Knowledge-Based Systems
JF - Knowledge-Based Systems
M1 - 110551
ER -