A crowdsourcing method for online social networks security assessment based on human‑centric computing

Abstract

the computer cluster) on the Internet, to handle complex tasks that are difficult to accomplish with existing computing technologies. In addition, the human-centric computational abstraction called situation can be applied to Human-Embedded Computing or End-User Embedded Computing [6].
To maximize individual or collective human intelligence in crowd computing, researchers have studied task assignment, which assigns real-time crowd tasks to an appropriate user crowd for execution [7][8][9]. Most existing task assignment approaches in the crowd computing literature treat user behaviors as homogeneous [10], thereby completely ignoring the influence of such behaviors on the returned results. Existing approaches also often assume static scenes [11], although in practical applications tasks arrive dynamically and results must be returned within a specific time. Therefore, such approaches are generally not practicable in real-world applications.
In this paper, we focus on the task itself. This is because in crowd computing, user crowds are often passive, and users' multi-dimensional information, functional experience, and integrity are minimally considered. The SocialSitu theory [12], for example, analyzes the SocialSitu(t) sequence of users' access behaviors and extracts the patterns of such behaviors under different intentions (i.e., frequent functional experience). Hence, in this paper we propose a task assignment algorithm that determines user task suitability based on SocialSitu. Once the user crowds who frequently use a certain function have been found, calculating task suitability and assigning the related crowd tasks based on the user crowds' multi-dimensional information becomes considerably more accurate.
The contributions of this paper are twofold. (1) We design a task assignment method that can be used to perform crowd assessment for the security and privacy of online social networks. (2) Moreover, we propose a task assignment algorithm to determine user task suitability based on SocialSitu, in order to achieve efficient and accurate assignment of crowd tasks about security and privacy.
The remainder of this paper is structured as follows. "Related work" section briefly introduces the extant literature. "Task assign method of crowd assessment for online social networks" and "Task assignment algorithm of user task suitability based on SocialSitu" sections present our proposed task assignment method and algorithm, respectively. In "Experimental setup and analysis" section, we describe the crowd assessment experiments and discuss the findings which show that the proposed approach facilitates efficient task assignment and improves the task precision and recall. The last section concludes this paper.

Related work
Context-Aware (CA) computing [13][14][15] was first proposed by Schilit et al. (1994), who defined a scene as a location, a collection of people and objects in the vicinity, and changes in these objects. Chang et al. [16,17] described situation analysis theory and the significance and influence of the Situ architecture in software engineering. They provided a detailed description of the Situ architecture, which updates services in real time by identifying users' new intentions, thereby providing users with truly personalized services. To discover users' intentions in social media in a timely manner and provide additional personalized services, Zhang et al. [12] developed the SocialSitu theory based on the Situ theory of Chang et al. [16,17], and designed a discovery method for user behavior patterns in multimedia social networks. This method analyzes users' SocialSitu(t) sequences and extracts their behavior patterns under different intentions. Consequently, users' current intentions can be predicted by comparing their current behavior sequences with those in the database.
In crowd computing, existing task assignment structures remain relatively simple, and in a few cases tasks are assigned to user crowds at random. Therefore, most tasks cannot be handled by suitably qualified individuals, and the advantages of human-computer collaboration in crowd computing cannot be fully realized. An et al. [18] proposed an algorithm for the discovery and selection of service nodes based on the analytic hierarchy process, which takes into consideration users' mobility, complexity, real-time performance, and other characteristics. In [19], Zhang proposed a simple and effective framework, MacroWiz, to manage the wisdom of crowds on mobile social networks. MacroWiz encourages online users to contribute their own knowledge or opinions through an incentive mechanism. This framework also assists the task requester in collecting answers, choosing the reliable answers, and making final decisions. In spatial crowdsourcing, Song et al. [20] presented the Online-Exact and Online-Greedy algorithms to address multi-skill aware task assignment. For the uncertain mobile crowdsourcing scenario, Guo et al. [21] found that the results of mobile crowdsourcing largely depend on the quality of location-related users. Subsequently, Sun et al. [22] formulated mobile crowdsourcing task allocation as an optimization problem using the trustworthiness of workers and movement distance costs, and then presented a Markov decision process based on the mobile crowdsourcing model to solve the problem of dynamic trust-aware task allocation. In [23], Mao et al. proposed an optimal user crowd algorithm, which assigns tasks to the smallest number of users based on users' historical task completion status, thereby minimizing the total cost while ensuring the completion of a task. Zhang et al. [24] designed a task assignment algorithm with user theme awareness. The algorithm addresses the blind randomness of random task assignment by obtaining users' themes (i.e., their fields of specialization) from their historical task information. In this manner, the completeness of the overall tasks is substantially improved. However, users' themes change constantly, so themes derived solely from historical data are neither accurate nor up to date. In [25], Kim et al. presented a multi-layered information analysis approach based on crowdsourcing theory that effectively uses topic analysis to track scientific issues.
With regard to privacy preservation and trust in social networks, Yuan et al. [26] investigated privacy protection in spatial crowdsourcing and presented a privacy-preserving framework. Specifically, they proposed a grid-based location protection method to protect the locations of workers and tasks. Moreover, Li et al. [27] exploited a secure grid-based index method to solve the problem of privacy-aware spatial crowdsourcing. This method not only protects workers' location privacy but also improves the spatial task processing time. In [28], Sharma et al. presented a novel trust relaying and privacy preservation architecture, which includes a distributed query system for the social Internet of Things based on edge-crowdsourcing techniques. Ma et al. [29] used advanced blockchain technologies to study security, privacy, and trust in crowdsourcing services. Wang et al. [30] combined an incentive mechanism with location privacy-preserving techniques to enhance the validity of mobile crowdsourcing systems. In [31], Chi et al. developed an effective location privacy protection method that addresses the trade-off between location privacy protection and service quality. To better assess the reliability of workers for task allocation, Jiang et al. [32] developed a context-aware, reliable, and efficient crowdsourcing technique for simple social networks and multiple social networks. In [33], Huang et al. put forward an efficient reputation evaluation technique for crowdsourcing participants, which is based on machine learning methods and a multidimensional evaluation index mechanism.
In summary, most current assignment methods are unable to find suitable crowds for certain tasks, consequently leading to low completeness, precision, and recall. To solve these problems, this study defines the decision factors that affect task assignment and the suitability of user tasks in online social networks. The study fully considers users' functional experience and historical information, which are used as the basis for designing a task assign method of crowd assessment for the security of online social networks. This study also proposes a task assign algorithm based on SocialSitu user task suitability, which can assign crowd tasks efficiently and accurately in real time.

Task assign method of crowd assessment for online social networks
Crowd assessment system architecture
Crowd computing has five basic elements: user crowds C, man-machine interaction M, purposeful crowd activity A, tasks T, and collective intelligence I. Based on these elements, we propose a crowd computing application scene (i.e., crowd assessment) and discuss the function and architecture of the different roles within the scene (see Fig. 1). Crowd assessment has three steps, each based on a crowd computing system. The first step is input. Many users with certain purposes gather to form a user crowd, and interactive devices serve as the medium between user crowds and the system. Task publishers have many uncompleted tasks that should be solved by users with different backgrounds. In this way, the publication -> allocation -> execution relationship is established. The second step is processing. The crowd computing system analyzes the information of task collection T and user collection U and employs the task assign algorithm to select the suitable user crowd u_1 for task t_1. The third step is output. After processing by the crowd computing system, the corresponding results or strategies are obtained and the data are stored.
Three roles are mainly involved in crowd assessment: user collection U = (id, ο, χ, δ, λ), crowd computing system platform P, and task collection T = (id, ω, φ, θ, γ). Task publishers publish task t_1 (t_1 ∈ T) to system P. Users log into the system to query and receive task t_1. The system finds the user crowd u_1 (u_1 ∈ U) suitable for t_1 by analyzing tasks and users. Here, id is the unique identifier of a user or a task, and ω is the description of t_1, that is, its general basic information. φ is the category of t_1, and θ and γ are the number of users required for t_1 and its deadline, respectively. ο denotes the devices of user crowd u_1, including mobile devices m and fixed devices w. χ is the users' field of specialization, δ is the historical information, such as degree of completion and degree of correlation, and λ is the users' situation information (i.e., the SocialSitu(t) sequence).
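The two tuples can be pictured as simple records. The following is a minimal sketch only: the field names, the Python types, and the sample values are assumptions made for illustration, and only the Greek symbols in the comments come from the definitions above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Task:
    """Task tuple T = (id, ω, φ, θ, γ)."""
    id: str
    description: str      # ω: general basic information about the task
    category: str         # φ: category of the task
    required_users: int   # θ: number of users required
    deadline: float       # γ: deadline (assumed here to be a Unix timestamp)


@dataclass
class User:
    """User tuple U = (id, ο, χ, δ, λ)."""
    id: str
    devices: List[str]                                       # ο: "m" (mobile) or "w" (fixed)
    specialization: str                                      # χ: field of specialization
    history: dict = field(default_factory=dict)              # δ: historical information
    situ_sequence: List[str] = field(default_factory=list)   # λ: SocialSitu(t) sequence


# Hypothetical sample records.
t1 = Task("t1", "Assess login security", "security", 3, 1_700_000_000.0)
u1 = User("u1", ["m"], "security")
```

Keeping δ and λ as separate fields mirrors the text, where historical information and situation information feed different task assign factors.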
The purpose of a crowd assessment system is to assign appropriate assessment tasks to user crowds. Accordingly, the system architecture must meet three basic requirements: (1) the user crowd best suited to a given task can be found; (2) the server can independently design and publish tasks and analyze and process the result data; and (3) an interface is provided for users to perform tasks. A hierarchical architecture is therefore designed to meet these requirements. Figure 2 shows the crowd assessment system architecture. The layers are designed as follows.

Social platform facilities layer
The first layer is the social platform facilities layer, where social media platform server designs and publishes crowd assessment tasks. The main function of this layer is to process data returned to it from users who have completed tasks. In addition, the user crowds and task information are simultaneously updated.

Data processing middle layer
The second layer is the data processing middle layer. The data of crowd users are transmitted from the social platform facilities layer to this layer, which assigns appropriate tasks to crowd users based on the task assign method of crowd assessment proposed in this study. Moreover, this layer organizes, analyzes, and stores the result data from the crowd user application layer.

Crowd user application layer
The third layer is the crowd user application layer, which provides the corresponding user interfaces. After users accept the crowd assessment tasks, they can forward these tasks to friends who have common interests. After the tasks are executed, the result data will be returned to the data processing middle layer for processing.

Hierarchical assign structure
In online social networks, crowd selection for task assignment admits many candidate schemes, and the attributes of each scheme (degree of completion, service response time, degree of correlation, operation behaviors, and coincidence) are strongly fuzzy. Hence, it is difficult to determine the importance of the attributes that affect decision-making. To address the complexity and fuzziness of crowd selection, a fuzzy analytic hierarchy process (FAHP) [34] is adopted for decision-making. FAHP decomposes the elements related to decision-making into target, criteria, attribute, and other layers, and performs qualitative and quantitative analysis based on a comprehensive consideration of the various factors involved. Accordingly, this process is a scientific assessment and decision-making method. This study defines the task assignment factors and their attributes by analyzing what affects task assignment, and builds a hierarchical assign structure on this basis.

Task assign factor
In online social networks (OSNs), users' intentions may change constantly, and these changes are ultimately reflected in users' behaviors. Moreover, changes in behaviors directly reflect users' experience of certain functions. Therefore, task assignment patterns in online social networks change in response to such changes. Task assignment should consider users' service capability and focus on their functional experience. This study analyzes the operation behaviors and historical information of online social users and defines three task assign factors (TAF) that affect task assignment: service feedback quality (SFQ), operation behavior (OB), and coincidence degree (CD). User-task-suitability (UTS) is defined and calculated from these three factors.

Service feedback quality

Definition 1
Service feedback quality is the platform's objective assessment of how well social users complete tasks, from assignment to completion. The SFQ of user U_i is related to the historical information δ of the user. The historical service parameters of the user include degree of completion S_succ, average delay S_del, and degree of correlation S_rel.

Definition 2
Degree of completion S_succ is the proportion of historical tasks that user U_i completed successfully, as given in Eq. (1), where N_succ is the number of successful historical services and N_total is the total number of historical services.

$$S_{succ}(U_i) = \frac{N_{succ}}{N_{total}} \qquad (1)$$

Definition 3
Average delay S_del is the time interval between the system's assignment of a task and the user's acceptance of it, as given in Eq. (2), where d(h_j) is the duration of historical task h_j (h_j ∈ H), S_succ(U_i) is the task's assignment time, and e(U_i) is the time when user U_i accepted the task.

Definition 4
Degree of correlation S_rel measures how closely a user's historical tasks relate to the user's field of specialization, as given in Eq. (3), where s(h_j) is the user's degree of correlation for historical task h_j (h_j ∈ H) and ρ (Eq. (4)) is the attenuation factor. The attenuation factor decays dynamically over time: the longer the interval, the smaller the influence of the user's degree of correlation on service decisions.
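The three historical service parameters can be computed from a user's task history roughly as follows. This is a hedged sketch, not the paper's formulas: the `HistoricalTask` record and its fields are hypothetical, the average delay is taken as the mean of acceptance time minus assignment time, and the attenuation factor is assumed to be an exponential decay with a made-up `decay_rate`, since the exact forms of Eqs. (2) to (4) are not fixed by the text.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class HistoricalTask:
    """One historical task h_j of a user (hypothetical record layout)."""
    succeeded: bool      # whether the task was completed successfully
    assigned_at: float   # time the system assigned the task (seconds)
    accepted_at: float   # time the user accepted the task (seconds)
    correlation: float   # s(h_j), assumed to lie in [0, 1]
    age_days: float      # time elapsed since the task, used for attenuation


def degree_of_completion(history: List[HistoricalTask]) -> float:
    """S_succ: share of historical tasks completed successfully."""
    if not history:
        return 0.0
    return sum(h.succeeded for h in history) / len(history)


def average_delay(history: List[HistoricalTask]) -> float:
    """S_del: mean interval between assignment and acceptance (assumed form)."""
    if not history:
        return 0.0
    return sum(h.accepted_at - h.assigned_at for h in history) / len(history)


def degree_of_correlation(history: List[HistoricalTask], decay_rate: float = 0.1) -> float:
    """S_rel: correlation averaged with time-decayed weights.

    The attenuation factor is ASSUMED to be exp(-decay_rate * age_days);
    older tasks therefore influence the result less.
    """
    weights = [math.exp(-decay_rate * h.age_days) for h in history]
    total = sum(weights)
    if total == 0:
        return 0.0
    return sum(w * h.correlation for w, h in zip(weights, history)) / total
```

Any monotonically decreasing attenuation would preserve the stated property that longer intervals carry less influence; the exponential form is only one convenient choice.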

Operation behavior

Definition 5
Operation behaviors are users' operation processes in social networks. This study uses the user-intentional serialization algorithm based on situational analysis proposed in [12] to analyze the SocialSitu(t) sequence of users' operation behaviors, which yields their behavior patterns under different intentions (i.e., frequent functional experience).

SocialSitu(t):
the environmental information, including terminal devices and location information, and ID represents users' identity information, including users' crowds and their roles played in the crowds.

Coincidence degree

Definition 6
Coincidence degree evaluates the consistency between the results returned by an individual user and those returned by users overall, as shown in Eq. (5), where R_it is the rating of task t by user i and R_t is the average rating of task t by all users. The higher the coincidence degree, the more consistent the user's rating is with that of users overall.

User-task-suitability

Definition 7
User-task-suitability measures how well a user is suited to performing a task. Suitability is calculated from the attributes of the three task assign factors (degree of completion, service response time, degree of correlation, operation behaviors, and coincidence) and their corresponding weights. According to Eq. (6), the greater the suitability, the more suitable the user is for the task. Here, w_ij is the jth attribute of user i, α_it is the weight of the jth attribute of user i for task t, and p is the number of attributes. The specific weights of the attributes are detailed in the next section.
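As an illustration, suitability can be sketched as a weighted sum over the p attributes. This assumes that Eq. (6) is a plain weighted sum and that every attribute has already been normalized so that larger values are better (a delay, for example, would need to be inverted first); both assumptions, the function name, and the sample numbers are illustrative only.

```python
from typing import Sequence


def user_task_suitability(attributes: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of a user's attribute values (assumed form of Eq. (6)).

    attributes: the p attribute values of one user, normalized so higher is better.
    weights:    the p attribute weights for the task at hand.
    """
    if len(attributes) != len(weights):
        raise ValueError("attributes and weights must have the same length")
    return sum(a * w for a, w in zip(attributes, weights))


# Hypothetical user with five normalized attributes and made-up weights.
score = user_task_suitability([0.9, 0.7, 0.8, 1.0, 0.6], [0.3, 0.1, 0.2, 0.3, 0.1])
```

Because the weights come from the hierarchy analysis, the same user can score differently for tasks whose weights differ.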

Establishment of task assign hierarchy
Hierarchy structure
To determine the appropriate user for a task, the user's various attributes should be considered. First, the problem is organized and the corresponding hierarchical allocation system is established, comprising the target, criterion, and attribute layers. Thereafter, the attributes of each layer are further analyzed. The target layer aims to determine the crowd that is most suitable for a certain task. The criterion layer contains the criteria for finding crowd users (i.e., the factors that affect task assignment). The attribute layer contains the attributes that make up each criterion layer factor. Figure 3 shows the hierarchical assignment structure.
Constructing judgment matrix at each layer
Before FAHP is used to determine the weight of each attribute, the relative importance of the attributes in each layer is expressed by triangular fuzzy numbers ã_ij = (l_ij, m_ij, u_ij), and a fuzzy reciprocal judgment matrix is constructed in which ã_ji = 1/ã_ij. For ease of operation, the values 1-9 and their reciprocals are used as the scale for l_ij, m_ij, and u_ij, which are the lower, mean, and upper bounds of the triple (l_ij, m_ij, u_ij), respectively. Hence, a seventeen-level relative importance scale is used when comparing the importance of attributes with FAHP (Table 1).
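The construction of a fuzzy reciprocal judgment matrix from its upper-right comparisons can be sketched as below. The reciprocal of a triangular fuzzy number (l, m, u) is taken as (1/u, 1/m, 1/l), which is the usual convention; the function names and the sample comparison values are assumptions for illustration and are not taken from the paper's tables.

```python
from fractions import Fraction
from typing import Dict, List, Tuple

TFN = Tuple[Fraction, Fraction, Fraction]  # triangular fuzzy number (l, m, u)


def reciprocal(tfn: TFN) -> TFN:
    """Reciprocal of (l, m, u) is (1/u, 1/m, 1/l), so the bounds stay ordered."""
    l, m, u = tfn
    return (1 / u, 1 / m, 1 / l)


def build_judgment_matrix(n: int, upper: Dict[Tuple[int, int], tuple]) -> List[List[TFN]]:
    """Fill an n x n fuzzy reciprocal matrix from comparisons (i, j) with i < j."""
    one = (Fraction(1), Fraction(1), Fraction(1))
    matrix = [[one for _ in range(n)] for _ in range(n)]
    for (i, j), raw in upper.items():
        if not (0 <= i < j < n):
            raise ValueError("comparisons must lie in the upper-right part")
        l, m, u = (Fraction(x) for x in raw)
        if not (0 < l <= m <= u):
            raise ValueError("a triangular fuzzy number needs 0 < l <= m <= u")
        matrix[i][j] = (l, m, u)
        matrix[j][i] = reciprocal((l, m, u))
    return matrix


# Hypothetical comparisons among three criteria.
A = build_judgment_matrix(3, {(0, 1): (2, 3, 4), (0, 2): (4, 5, 6), (1, 2): (1, 2, 3)})
```

Exact fractions are used so that reciprocals such as 1/3 are stored without rounding error.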

Table 1 Relative importance scale
ã_ij = 1: Elements i and j are equally important
ã_ij = 3: Element i is slightly more important than element j
ã_ij = 5: Element i is more important than element j
ã_ij = 7: Element i is considerably more important than element j
ã_ij = 9: Element i is extremely more important than element j
ã_ij = 2n, n = 1, 2, 3, 4: Importance of elements i and j is between ã_ij = 2n − 1 and ã_ij = 2n + 1
ã_ji = 1/ã_ij: Importance of element j relative to element i is the reciprocal of ã_ij

Determination of weights and consistency check
Given that the fuzzy judgment matrix A-B is a reciprocal one, the non-linear programming modification of the Fuzzy Preference Programming (FPP) method, which relies only on the elements in the upper-right part of matrix A-B, is used to estimate the weights (Formula (7)). In formula (7), the optimized scalar denotes the value of the consistency index and each w_i denotes the weight of an attribute. The detailed steps in obtaining the weights of attributes are as follows.
1. The non-linear program in formula (7), which includes one equality and six inequality constraints, is solved to obtain the optimal solution (the optimal consistency index and the weight vector w*).
2. ... agreeable. If not, the fuzzy judgment is unreasonable.
3. The consistency of the fuzzy judgments is validated. If the optimal value of the consistency index is positive, the consistency of the fuzzy judgments is acceptable. Otherwise, the judgments are strongly inconsistent and the fuzzy judgment matrix should be modified appropriately. Consistency detection prevents inaccurate weights caused by subjective factors.
4. Lastly, the weight of each attribute in attribute layer C relative to target layer A is calculated, after which Suitability can be calculated and the task can be assigned. Table 2 shows the weight values.
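For reference, the non-linear FPP program commonly used in the FAHP literature (Mikhailov's formulation) can be written as follows. It matches the description above, with one equality constraint and two inequalities per upper-right element, hence six inequalities for a 3 × 3 matrix. Whether formula (7) uses exactly this notation is an assumption; here λ denotes the consistency index and is unrelated to the situation information λ of a user.

```latex
\begin{aligned}
\max\ & \lambda \\
\text{s.t. } & (m_{ij} - l_{ij})\,\lambda\, w_j - w_i + l_{ij}\, w_j \le 0, \\
& (u_{ij} - m_{ij})\,\lambda\, w_j + w_i - u_{ij}\, w_j \le 0, \\
& \sum_{k=1}^{n} w_k = 1, \quad w_k > 0, \quad k = 1, \dots, n, \\
& i = 1, \dots, n-1, \quad j = 2, \dots, n, \quad j > i .
\end{aligned}
```

In this formulation a positive optimum λ* means every ratio w_i*/w_j* lies within its fuzzy bounds (l_ij, u_ij), which is why a positive value is read as acceptable consistency.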

Data preprocessing
When a task publisher submits crowd tasks to a crowd computing platform, the tasks may be numerous and varied. Therefore, tasks submitted by the publisher should be processed into task types that the platform can handle. The daily user log data in online social networks are irregular and incomplete, so data preprocessing must first convert these irregular data into formats that the proposed algorithm can recognize. The possible steps are as follows.
2. Data identification: This step comprises two parts, user identification and full session identification (i.e., grouping the data and selecting each user's completed session records).
3. Data conversion: Data are converted into formats that can be recognized by the proposed algorithm.

Specific steps of algorithm
The key issue in this study is how to accurately and efficiently assign suitable tasks to online social users. The main ideas are as follows: (1) use the SocialSitu(t) theory to obtain behavioral patterns under different intentions by analyzing the SocialSitu(t) sequence of users' behaviors, (2) find the user crowd L_1 with relevant functional experience and calculate user task suitability, and (3) sort the users by suitability and select the required number of users to form the assignment collection L_2. In the assignment algorithm, the SituBehaviorAnalytics(DS, Min_Support, G) function analyzes user behavior patterns under different intentions, where DS is the user operation behavior data set, Min_Support is the minimum support, and G is the users' intentions. Figure 4 shows the algorithm process. The specific steps of the algorithm are as follows.
1. Input task collection T and user collection U. Traverse the unaccomplished task list T = {t_1, t_2, …, t_i} and the user list U = {u_1, u_2, …, u_j}, and obtain task t_i and the information of user u_j.
2. The situation information λ of u_j is analyzed and processed, and the behavior sequence patterns of user u_j are obtained through the user behavior pattern discovery algorithm SituBehaviorAnalytics(DS, Min_Support, G).
3. If the behavior sequence patterns of user u_j match the category φ of uncompleted task t_i, the id of user u_j is stored in the user collection L_1 of the unallocated task.
4. The suitability S(L_m, t_i) of each user in collection L_1 for task t_i is calculated, and the users are sorted from high to low.
5. The top θ users are selected and stored in user collection L_2.
6. The task is assigned to the users in collection L_2.
7. Steps 1 to 6 are repeated until there are no more unassigned tasks.
The detailed algorithm process is shown in Algorithm 1 and Fig. 4.
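The steps above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not Algorithm 1 itself: the pattern mining done by SituBehaviorAnalytics is replaced by a caller-supplied `matches_category` predicate, suitability is a caller-supplied function, and the dictionary keys are hypothetical.

```python
from typing import Callable, Dict, List


def assign_tasks(
    tasks: List[dict],
    users: List[dict],
    matches_category: Callable[[dict, dict], bool],
    suitability: Callable[[dict, dict], float],
) -> Dict[str, List[dict]]:
    """Return, for each task id, the top-θ users ranked by suitability."""
    assignments: Dict[str, List[dict]] = {}
    for task in tasks:  # steps 1 and 7: traverse the unassigned tasks
        # Steps 2-3: keep users whose behavior patterns match the task category (L_1).
        candidates = [u for u in users if matches_category(u, task)]
        # Step 4: sort the candidates by suitability, from high to low.
        ranked = sorted(candidates, key=lambda u: suitability(u, task), reverse=True)
        # Steps 5-6: the top θ users form L_2 and receive the task.
        assignments[task["id"]] = ranked[: task["required_users"]]
    return assignments


# Hypothetical toy data: "patterns" stands in for mined behavior patterns.
demo_tasks = [{"id": "t1", "category": "security", "required_users": 2}]
demo_users = [
    {"id": "u1", "patterns": {"security"}, "score": 0.5},
    {"id": "u2", "patterns": {"privacy"}, "score": 0.9},
    {"id": "u3", "patterns": {"security"}, "score": 0.8},
    {"id": "u4", "patterns": {"security"}, "score": 0.1},
]
demo_result = assign_tasks(
    demo_tasks,
    demo_users,
    matches_category=lambda u, t: t["category"] in u["patterns"],
    suitability=lambda u, t: u["score"],
)
```

Filtering by functional experience before ranking means suitability is computed only for the users in L_1, which keeps the per-task cost proportional to the number of matching users rather than to the whole user collection.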

Experimental setup and analysis
The purpose of crowd task assignment is to obtain large-scale user computing resources in an online social environment. To verify the validity and correctness of the proposed algorithm, it is compared with two typical and popular algorithms, namely, random algorithm and algorithm based on user theme awareness. Random algorithm is currently the most popular allocation algorithm, which is easy to operate, has short time consumption, and

Experimental environment
To analyze the correctness and effectiveness of the proposed algorithm, the self-developed technology social platform Shareteches (formerly CyVOD) [35] (http://www.shareteches.com) and its mobile applications are used as experimental platforms to conduct experiments and data analysis. The web server acts as the task publisher and the mobile application acts as the client.

Experimental design
Social networks have gained widespread attention, and have been extensively applied as platforms for people to spread information on the Internet and conduct social exchange activities.
At present, the security and privacy issues of social network platforms highlight the urgent need for social network users to assess platform functionality, security precautions, privacy protection, and other features [36]. Based on the prevailing security and privacy issues in current social networks [37][38][39], this study designs seven assessment tasks that evaluate the security, trust, functionality, and other aspects of social networking platforms. The details of the crowd assessment information are shown in Table 3. In the experiment, all participants come from Shareteches and include both ordinary users and expert users. Moreover, these users have experience with this social network platform and are therefore better able to carry out the assessment tasks.

Analysis of experimental results
Precision, recall, F-measure, and degree of completion are used as evaluation indicators of the algorithm. Precision and recall are two extensively used measures of the quality of returned results. Precision is the ratio of correct returned results to all returned results and measures the accuracy of task assignment. Recall is the ratio of correct returned results to the total number of assignments and measures the coverage of task assignment. F-measure is the harmonic mean of precision and recall. Degree of completion is the ratio of the number of returned results to the total number of assignments. Here, R_u denotes the total number of results for user u and T_u denotes the total number of task assignments for user u. The formulas for precision, recall, and F-measure are expressed in terms of these quantities.
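The verbal definitions above translate directly into code. The sketch below follows those definitions literally; the function name, the argument names, and the sample counts are hypothetical, and the aggregation over users (summing counts before dividing) is an assumption.

```python
def evaluate_assignment(correct_returned: int, returned: int, assigned: int) -> dict:
    """Compute the four indicators from aggregate counts.

    correct_returned: number of returned results that are correct
    returned:         number of results returned by users
    assigned:         total number of task assignments
    """
    precision = correct_returned / returned if returned else 0.0
    recall = correct_returned / assigned if assigned else 0.0
    # F-measure is the harmonic mean of precision and recall.
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    completion = returned / assigned if assigned else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "completion": completion,
    }


# Made-up counts for illustration only.
metrics = evaluate_assignment(correct_returned=6, returned=8, assigned=10)
```

Guarding each division keeps the function defined when no results have been returned yet, which can happen early in an experiment.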

Algorithm correctness analysis
To verify the correctness of the proposed algorithm, it is used to assign 7 types of security and privacy tasks to 300 users of the social platform Shareteches. The precision and recall are shown in Fig. 5a, b. The precision is mainly stable between 0.60 and 0.70, and the recall is stable between 0.80 and 0.90. The F-measure, shown in Fig. 5c, is mainly stable between 0.70 and 0.75.

Algorithm validity analysis
Task assignment was performed using the random assignment, theme-aware assignment, and proposed algorithms. Precision, recall, and F-measure comparisons are shown in Fig. 6a-c. As the number of users increases, the precision and recall of the proposed and random assignment algorithms gradually increase and tend to stabilize. The precision and recall of the theme-aware algorithm decrease slightly and eventually tend to stabilize. The precision of the proposed algorithm is mainly stable at approximately 0.6, while its recall is stable between 0.8 and 0.9. Both are higher than those of the other two algorithms. When the number of assigned users reaches 300, the F-measure of the proposed algorithm is approximately 0.3 and 0.5 higher than those of the other two algorithms, respectively. Moreover, the proposed algorithm has evident advantages in terms of timeliness. Figure 6d shows that the completeness of the three algorithms increases as the delay increases. For the same delay, the degree of completion of the proposed algorithm is higher than that of the other two algorithms.

Conclusion
To accurately and efficiently assign assessment tasks about the security and trustworthiness of online social networks to social users, this study defined task assign factors, their attributes, and user task suitability by analyzing users' situational information and historical records. Accordingly, this research designed a task assign method of crowd assessment for online social networks and proposed a task assign algorithm based on the human-centric computational abstraction of the SocialSitu theory. Crowd assessment experiments were conducted on a real-world social network, Shareteches. The experimental results showed that the proposed method not only achieves both validity and effectiveness, but also further improves the security and trustworthiness of online social networks. In the future, we will first mine more effective task allocation factors based on users' social behavior characteristics and content characteristics. Then, we will combine machine learning methods with crowdsourcing theory to complete the security and trustworthiness assessment of social network platforms.