CPFinder: Finding an unknown Caller's profession from anonymized mobile phone data

Title: CPFinder: Finding an unknown Caller's profession from anonymized mobile phone data
Publication Type: Journal Article
Year of Publication: 2022
Authors: Zhang, J., H. Chen, X. Yao, and X. Fu
Journal: Digital Communications and Networks
Date Published: 08/2022

Identifying an unfamiliar caller's profession is important for protecting citizens' personal safety and property. Owing to the limited data protection of various popular online services in some countries, such as taxi hailing and takeout ordering, many users now receive an increasing number of phone calls from strangers. The situation may be aggravated when criminals pose as service delivery staff, threatening individual users as well as society. In addition, many people experience excessive digital marketing and fraudulent phone calls because of personal information leakage. However, previous work on malicious call detection focused only on binary classification, which does not extend to identifying multiple professions. We observed that web service requests issued from users' mobile phones may reveal their application preferences, spatial and temporal patterns, and other profession-related information. This offers researchers and engineers a hint for identifying unfamiliar callers. In fact, some previous works have already leveraged raw data from mobile phones (which includes sensitive information) for personality studies. However, accessing users' raw mobile phone data may violate increasingly strict private data protection policies and regulations (e.g., the General Data Protection Regulation). We observe that appropriate statistical methods can offer an effective means of eliminating private information while preserving personal characteristics, thus enabling the identification of mobile phone caller types without privacy concerns. In this paper, we develop CPFinder, a system that exploits privacy-preserving mobile data to automatically classify callers into four categories: taxi drivers, delivery and takeout staff, telemarketers and fraudsters, and normal users (other professions).
Our evaluation on an anonymized dataset of 1,282 users, collected over a period of 3 months in Shanghai, shows that CPFinder can achieve accuracies of more than 75.0% and 92.4% for multiclass and binary classification, respectively.