Name disambiguation in AMiner
Zhang, Jing; Tang, Jie
Sci China Inf Sci, 2021, 64(4): 144101
Name disambiguation, which aims to determine who is who, is one of the fundamental problems of online academic network platforms such as Google Scholar, Microsoft Academic, and AMiner. This study takes AMiner, a free online academic search and mining system, as an example to explain how we deal with the name ambiguity problem under three different scenarios. AMiner has already extracted 13 million researchers' profiles from the Web and integrated them with 20 million papers from heterogeneous publication databases, with a growth rate of over 500,000 per month. From the time the system is first built through its running and updating phases, the problem of name disambiguation requires continuous attention. In the following parts, we discuss the problem in three scenarios across the whole life cycle of AMiner: name disambiguation when the system is built from scratch (full ND), name disambiguation when persons' profiles are continuously updated (continuous ND), and error detection on existing persons' profiles (error detection).