Cyclist detection and tracking based on multi-layer laser scanner

Abstract

Artificial Intelligence (AI) opens up tremendous possibilities for autonomous vehicle applications. One of the essential tasks of an autonomous vehicle is environment perception using machine learning algorithms. Because cyclists are vulnerable road users, cyclist detection and tracking are important perception sub-tasks for avoiding vehicle-cyclist collisions. In this paper, a robust method for cyclist detection and tracking is presented based on a multi-layer laser scanner, the IBEO LUX 4L, which acquires a four-layer point cloud of the local environment. First, the laser points are partitioned into individual clusters using a subarea-based Density-Based Spatial Clustering of Applications with Noise (DBSCAN) method. Then, a 37-dimensional feature set is optimized by the Relief algorithm and by Principal Component Analysis (PCA) to produce two new feature sets. Support Vector Machine (SVM) and Decision Tree (DT) classifiers are each combined with the three feature sets. Moreover, the Multiple Hypothesis Tracking (MHT) algorithm and a Kalman filter based on the Current Statistical (CS) model are applied to track moving cyclists and estimate their motion state. The performance of the proposed cyclist detection and tracking method is validated in a real road environment.

Introduction

Autonomous vehicles have attracted great interest in ongoing research [1, 2], and environment perception is an essential step toward trajectory planning for autonomous vehicles [3,4,5]. The perception task is usually divided into several sub-tasks, including object detection and tracking, object localization and behavior prediction. In recent years, cycling has been gaining popularity, and a 43% growth was reported in Ireland between 2011 and 2016 [6]. Because they have no special protective equipment, cyclists are vulnerable road users. Moving cyclists inevitably interact with other traffic participants in the local environment, and vehicle-cyclist collisions account for a large proportion of traffic accidents every year [7]. Therefore, to enhance cycling safety, autonomous vehicle systems must pay close attention to cyclist detection and tracking.

Existing cyclist detection methods based on laser scanners mostly aim at counting cyclists in the absence of interference from non-cyclist road users [8, 9]. Motivated by the analysis of object detection studies, this paper proposes a novel method for cyclist detection and tracking using a multi-layer laser scanner. The IBEO LUX 4L is adopted as the main sensor and mounted on the front bumper of the autonomous vehicle, as shown in Fig. 1. First, a subarea-based DBSCAN method is developed to segment the point cloud into clusters. Second, three feature sets are each combined with SVM and DT classifiers, giving six classifiers in total. Then, the MHT algorithm based on a Kalman filter is used to track multiple detected cyclists. The proposed cyclist detection and tracking method is validated in a real road scene. The contributions of this paper are twofold: (1) it is the first attempt to separate the raw point cloud into several subareas based on the density distribution; (2) the CS model is selected as the motion model of the cyclist, and the MHT algorithm is used to track multiple cyclists.

Fig. 1

IBEO LUX 4L. a the laser beams at four layers. b the scanner is mounted on the autonomous vehicle

The remainder of this paper is organized as follows: Section “Related works” reviews related work. Section “Point cloud clustering” describes the point cloud clustering process. Section “Feature extraction” presents the feature extraction. Section “Classification” introduces the cyclist classifiers. Section “Tracking” presents the cyclist tracking algorithm. Experimental results are given in Section “Experiments”. Section “Conclusion” concludes this study.

Related works

In the field of cyclist detection and tracking, most state-of-the-art methods rely on machine vision sensors [10,11,12], which provide high-resolution images with color and texture information. Zangenehpour [10] presented a cyclist detection method based on histogram of oriented gradients features for video datasets from crowded traffic scenes. Li [11] described a unified framework for cyclist detection, which included a novel detection proposal method and a discriminative deep model based on Fast R-CNN, using over 50000 images. Tian [12] explored geometric constraints with various camera setups and built cascaded detectors with multiple classifiers to detect cyclists from multiple viewpoints. Bieshaar [13] used a 3D convolutional neural network to detect the motion of cyclists in image sequences with spatio-temporal features. Liu [14] utilized a region proposal method based on aggregated channel features to solve the cyclist detection problem for high-resolution images. However, cyclist detection and tracking based on vision sensors remain challenging, since cameras are sensitive to illumination and cyclists vary widely in appearance, pose and occlusion. Compared with vision sensors, a laser scanner can collect the spatial and motion information of the detected objects [15, 16]. According to the number of laser layers, laser scanners are divided into three types, namely 2D, 2.5D and 3D. A 2D laser scanner generates a sparse single-layer point cloud, which is inadequate for object classification in a real driving environment. 3D laser scanners are usually installed on top of autonomous vehicles and produce a dense 3D point cloud that covers the full field of view for global environment construction. However, their high cost limits wide application. Compared with the other kinds of laser scanners, a 2.5D laser scanner, e.g., the IBEO LUX 4L, is a more practical option for object classification.

Many algorithms have been presented for object classification using 2.5D multi-layer laser scanners [17,18,19,20,21,22,23,24]. Wang et al. [19] built SVM classifiers based on a 120-dimensional point cloud feature set for object recognition. Huang et al. [20] used a distance threshold to capture discontinuities in the point cloud and segment the clusters, and trained three categories of classifiers to recognize dynamic obstacles in the driving environment. Magnier et al. [21] employed belief theory to detect and track moving objects using a multi-layer laser scanner. Kim et al. [22] extracted geometrical features from four-layer point cloud data and detected pedestrians with an RBFAK classifier. Carballo et al. [23] used several intensity-based features for pedestrian detection. Arras et al. [24] conducted pedestrian classification with 14 static features of human legs in an indoor environment. Compared with the large body of research on detecting other road users, e.g., vehicles and pedestrians, studies on cyclist detection using laser scanners are limited. Subirats et al. [8] employed a single-layer laser scanner to monitor the road surface and count cyclists in real traffic flow, where the measured length of a detected cyclist depends on the cyclist's speed. Prabhakar et al. [9] presented the last line check method to detect and count Powered Two Wheelers (PTWs) with a laser scanner at a fixed angle. In general, existing studies on cyclist detection using laser scanners mainly count cyclists and ignore the variation of the cyclists’ motion state. In this paper, the proposed method for cyclist detection and tracking using a multi-layer laser scanner aims at capturing moving cyclists with high accuracy.

Point cloud clustering

The range of the scanned obstacles directly affects the density of the returned point cloud. In a real driving environment, the point cloud varies frame by frame, and the number of surrounding clusters is not known in advance. The DBSCAN clustering method is often used for such uneven point clouds because it does not require the number of clusters to be set in advance; its input parameters are the minimum number of laser points in a cluster and the neighbor radius [25].

With a single global neighbor radius, the traditional DBSCAN method may merge multiple obstacles at close range into one cluster, while splitting a single obstacle at long range into several clusters. Therefore, a subarea-based DBSCAN method is proposed in this paper. First, the density distribution curve is generated for the uneven point cloud, and the subareas are divided based on the characteristics of the density distribution. Then, the optimal neighbor radius is calculated for each subarea, and the DBSCAN method is applied to the point cloud in each subarea. The point cloud of one frame in a campus environment is taken as an example. First, the point cloud in the region of interest is divided into several subareas along the motion direction of the ego-vehicle carrying the laser scanner. Since too many subareas increase the computation cost and leave some subareas empty, while too few subareas cannot resolve the density variation, the number of subareas needs to be selected properly. According to statistical analysis, the relation between the number of subareas m and the number of points n is given as follows:

$$ m = \,0.187\, \times \,(n - 1)^{2/3} $$
(1)

The ratio between the number of points in each subarea and the total number of points is defined as the density of that subarea. The density histogram is used to describe the distribution of the point cloud. For the example frame, there are 424 points in the region of interest, and the number of subareas equals 10.55 according to formula (1). Thus, the number of subareas is set to the integer 10. The density distribution curve is obtained from the density histogram, as shown in Fig. 2. A wave crest of the density distribution curve corresponds to a dense distribution, while a wave valley corresponds to a sparse distribution. A wave valley of this curve is an inflection point of the point cloud density variation, and it is a candidate partition point for the subareas. Note that the valley between two wave crests is not always an appropriate partition point, since partitioning there may be redundant. If the density difference between two adjacent subareas is not too large, there is no need to separate them, and they can be regarded as one subarea with uniform density. Therefore, the density difference threshold is set as a constant value T, and the subareas are divided only when the difference between two adjacent wave peaks is greater than T. According to the attributes of the point cloud from the LUX 4L laser scanner, the density difference threshold T is set as 50. As shown in Fig. 2, the density differences among the three adjacent peaks XP1, XP2 and XP3 of the density distribution curve are greater than the threshold T. Thus, the valleys X1 and X2 between these three peaks are taken as two partition points, located at X1 = 12 m and X2 = 20 m. Finally, three subareas with different densities are obtained, as shown in Fig. 3.

Fig. 2

The density distribution

Fig. 3

Three subareas with different densities
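The subarea count of formula (1) and the density histogram described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names and the sample coordinates are hypothetical, and truncating the result of formula (1) to an integer is an assumption chosen to agree with the example frame, where the count is set to 10.

```python
import math

def subarea_count(n_points):
    """Formula (1): m = 0.187 * (n - 1)^(2/3); returns the raw value and an integer count."""
    raw = 0.187 * (n_points - 1) ** (2.0 / 3.0)
    # Assumption: truncate to an integer, as the example frame uses 10 subareas.
    return raw, max(1, math.floor(raw))

def density_histogram(xs, x_min, x_max, m):
    """Share of points falling into each of m equal-width subareas along x."""
    counts = [0] * m
    width = (x_max - x_min) / m
    for x in xs:
        if x_min <= x <= x_max:
            idx = min(int((x - x_min) / width), m - 1)  # x == x_max goes to the last bin
            counts[idx] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * m

raw, m = subarea_count(424)
# Hypothetical longitudinal coordinates (metres), denser close to the sensor.
xs = [0.5 * i for i in range(40)] + [20.0 + 2.0 * i for i in range(10)]
density = density_histogram(xs, 0.0, 40.0, m)
print(m, [round(d, 2) for d in density])
```

In this synthetic frame the near subareas hold a larger share of the points than the far ones, which is the kind of density drop the partition step looks for.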

The original DBSCAN algorithm [25] and the proposed subarea-based DBSCAN algorithm are used to cluster the point cloud at this example frame, respectively. Figure 4 shows the clustering results from these two DBSCAN algorithms, and each color of the point cloud denotes an individual cluster. We can see that the subarea-based DBSCAN algorithm separates two neighboring obstacles successfully while the original DBSCAN algorithm clusters two neighboring obstacles into one obstacle using the global neighbor radius. Thus, the subarea-based DBSCAN algorithm achieves better clustering performance than the original DBSCAN algorithm.

Fig. 4

The clustering results from two DBSCAN algorithms. a Real scene, b Traditional DBSCAN, c Subarea-based DBSCAN
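The difference between a global neighbor radius and per-subarea radii can be illustrated with a small pure-Python DBSCAN. The sketch below is an assumption-laden illustration, not the authors' code: the paper does not state how the optimal radius of each subarea is computed, so the radii, the minimum point count and the sample points are hypothetical inputs; only the partition locations 12 m and 20 m are taken from the example frame.

```python
import math

def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    px, py = points[i]
    return [j for j, (qx, qy) in enumerate(points)
            if math.hypot(px - qx, py - qy) <= eps]

def dbscan(points, eps, min_pts):
    """Plain DBSCAN; returns one label per point, -1 marks noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster           # border point reached from a core point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:   # j is a core point: keep expanding
                queue.extend(j_neighbors)
    return labels

def subarea_dbscan(points, boundaries, eps_per_subarea, min_pts):
    """Run DBSCAN separately in each subarea, split along x at the given boundaries."""
    labels = [-1] * len(points)
    next_id = 0
    edges = [-math.inf] + list(boundaries) + [math.inf]
    for k, eps in enumerate(eps_per_subarea):
        idx = [i for i, (x, _) in enumerate(points) if edges[k] <= x < edges[k + 1]]
        local = dbscan([points[i] for i in idx], eps, min_pts)
        for i, lab in zip(idx, local):
            if lab >= 0:
                labels[i] = next_id + lab
        next_id += max(local, default=-1) + 1
    return labels

# Two close obstacles 0.5 m apart at short range, one sparsely sampled obstacle at long range.
points = [(5.0, 0.0), (5.0, 0.2), (5.0, 0.4), (5.0, 0.9), (5.0, 1.1), (5.0, 1.3),
          (25.0, 0.0), (25.0, 1.0), (25.0, 2.0)]
labels_global = dbscan(points, eps=1.2, min_pts=2)                            # one global radius
labels_sub = subarea_dbscan(points, [12.0, 20.0], [0.3, 0.6, 1.2], min_pts=2)  # hypothetical radii
print(labels_global, labels_sub)
```

With the single large radius the two near obstacles merge into one cluster, whereas the smaller radius in the near subarea keeps them apart and the larger radius in the far subarea still holds the sparse obstacle together.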

Feature extraction

A 37-dimensional feature set is proposed for cyclist detection and tracking based on the point cloud characteristics of the cyclist. This feature set includes 11 number-of-points-based features, 16 geometric features and 10 statistical features, as listed in Tables 1, 2 and 3. Feature 23 denotes the length of the polyline that connects the horizontal projection points in ascending order of horizontal coordinate. Features 28–31 denote the convexity of the scan points at each layer, where the convexity equals the distance between the center of the fitted circle and the origin minus the distance between the geometric center of all the scan points and the origin. Feature 37 denotes the residual ratio of the bounding areas for the two edges of the laser points vs. the middle points; this feature captures the fact that the middle of the point cloud cluster is denser than the edges.

Table 1 Number-of-points-based features
Table 2 Geometric features
Table 3 Statistical features

An effective feature set is important for classifier performance. However, combining multiple independent features does not always yield better classification ability than a single feature. Therefore, the feature set needs to be optimized to reduce the redundancy among features. The Relief algorithm [26] and the PCA method [27] are employed in the optimization process.

Relief algorithm

The Relief algorithm is a feature selection method based on multi-variable filtering, and it learns the weight of each feature from samples. Each feature is evaluated by how much its value differs between samples of different categories compared with samples of the same category. For the 37-dimensional feature set F, the classification weight of each feature is calculated, and the weight proportion histogram is obtained, as shown in Figs. 5 and 6. According to the classification weights from Relief, the weight threshold for a single feature is set as 3000. If the feature weight exceeds 3000, the feature is retained; otherwise, it is discarded. After a further redundancy calculation, the 20-dimensional feature set FR is obtained: Feature 2, Feature 6, Feature 7, Feature 8, Feature 9, Feature 11, Feature 12, Feature 13, Feature 14, Feature 19, Feature 20, Feature 22, Feature 23, Feature 26, Feature 27, Feature 29, Feature 30, Feature 31, Feature 33 and Feature 37.

Fig. 5

The weights of 37-dimensional features

Fig. 6

The weight proportion histogram
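The weighting idea behind Relief can be sketched as follows. This is a minimal two-class version of the basic Relief update, assuming range-normalized feature differences and unnormalized weight sums; the function name and the toy samples are hypothetical, and the weight scale of this sketch is not comparable with the threshold of 3000 used above.

```python
import random

def relief_weights(samples, labels, n_iter=None, seed=0):
    """Basic two-class Relief: reward features that differ on the nearest miss
    and penalise features that differ on the nearest hit."""
    rng = random.Random(seed)
    n, d = len(samples), len(samples[0])
    # Per-feature value range, used to scale differences into [0, 1].
    span = [(max(s[f] for s in samples) - min(s[f] for s in samples)) or 1.0
            for f in range(d)]

    def diff(f, a, b):
        return abs(a[f] - b[f]) / span[f]

    def dist(a, b):
        return sum(diff(f, a, b) for f in range(d))

    # Visit every sample once by default, or draw n_iter random samples.
    picks = range(n) if n_iter is None else [rng.randrange(n) for _ in range(n_iter)]
    weights = [0.0] * d
    for i in picks:
        hits = [j for j in range(n) if j != i and labels[j] == labels[i]]
        misses = [j for j in range(n) if labels[j] != labels[i]]
        hit = min(hits, key=lambda j: dist(samples[i], samples[j]))
        miss = min(misses, key=lambda j: dist(samples[i], samples[j]))
        for f in range(d):
            weights[f] += diff(f, samples[i], samples[miss]) - diff(f, samples[i], samples[hit])
    return weights

# Toy data: feature 0 separates the classes, feature 1 is uninformative.
samples = [(0.10, 0.50), (0.20, 0.10), (0.15, 0.90),
           (0.90, 0.40), (0.80, 0.95), (0.85, 0.05)]
labels = [1, 1, 1, 0, 0, 0]
weights = relief_weights(samples, labels)
print([round(w, 3) for w in weights])
```

Features whose weight exceeds a chosen threshold would be retained; here the discriminative feature ends up with the larger weight.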

PCA algorithm

PCA is a statistical method that converts a set of features into new, uncorrelated feature components. It reduces the dimension of the original features while keeping the loss of feature information to a minimum. PCA is applied to the normalized features to obtain the eigenvalue, the variance and the cumulative variance of each principal component, as shown in Table 4. The cumulative variance of the first and second principal components reaches 91.0%; thus, the first and second principal components are taken as the new feature set: FP = [FP1, FP2].

Table 4 PCA results
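Selecting the number of principal components from the cumulative variance can be sketched as below. The eigenvalues are hypothetical placeholders rather than the values of Table 4, and the 90% target is an assumed cut-off; the text only reports that the first two components reach 91.0%.

```python
def components_for_variance(eigenvalues, target=0.9):
    """Smallest number of leading principal components whose cumulative
    explained-variance ratio reaches the target."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    cumulative = 0.0
    for k, value in enumerate(ordered, start=1):
        cumulative += value / total
        if cumulative >= target:
            return k, cumulative
    return len(ordered), cumulative

# Hypothetical eigenvalues of the covariance matrix of normalized features.
eigenvalues = [6.0, 3.1, 0.5, 0.3, 0.1]
k, ratio = components_for_variance(eigenvalues, target=0.9)
print(k, round(ratio, 3))
```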

Classification

SVM classifier

Owing to its strong generalization capability, SVM is used as an independent classifier. First, each feature is normalized as follows:

$$ scale(f) = \begin{cases} -1, & f \le f_{\min} \\ -1 + \dfrac{2(f - f_{\min})}{f_{\max} - f_{\min}}, & f_{\min} < f < f_{\max} \\ 1, & f \ge f_{\max} \end{cases} $$
(2)

where f is the feature value to be normalized, and fmax and fmin are the maximum and minimum feature values, respectively. Since the Radial Basis Function (RBF) achieves nonlinear mapping with few parameters, RBF is selected as the kernel function:

$$ K(x,y) = \exp \left( - \gamma \left\| x - y \right\|^{2} \right) $$
(3)

where the parameter γ and the penalty factor C are selected by grid search with cross validation using the LIBSVM toolbox [28]. In our test, the optimal parameters are: C = 2068, γ = 0.006821.
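Formulas (2) and (3) translate directly into code. The sketch below is illustrative only: the function names, the feature range 0 to 10 and the sample values are assumptions, while the value of γ is the one reported above.

```python
import math

def scale(f, f_min, f_max):
    """Formula (2): map a feature value linearly into [-1, 1], clamping outside the range."""
    if f <= f_min:
        return -1.0
    if f >= f_max:
        return 1.0
    return -1.0 + 2.0 * (f - f_min) / (f_max - f_min)

def rbf_kernel(x, y, gamma):
    """Formula (3): K(x, y) = exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

GAMMA = 0.006821  # value reported in the text
# Hypothetical raw feature values, normalized with an assumed range [0, 10].
x = [scale(v, 0.0, 10.0) for v in (2.5, 5.0, 12.0)]
y = [scale(v, 0.0, 10.0) for v in (2.5, 7.5, -1.0)]
similarity = rbf_kernel(x, y, GAMMA)
print(x, y, round(similarity, 4))
```

Values outside the training range are clamped to the interval limits, so the kernel always operates on features in [-1, 1].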

DT classifier

Since the features are discrete variables, the ID3 algorithm is selected for the DT classifier. The main scheme is as follows. The feature with the largest information gain is selected as the splitting criterion. This process is repeated until a decision tree with complete classification ability is constructed. The entropy of a random variable X is:

$$ H(X) = - \sum\limits_{i = 1}^{n} p_{i} \times \log p_{i} $$
(4)

The entropy of each attribute of the dataset D is:

$$ H(D_{i}) = - \sum\limits_{i = 1}^{n} p_{i} \times \log_{2} p_{i} $$
(5)

where pi represents the ratio of the number of samples for the ith-dimensional feature and the number of total features, and n represents the number of total features. For the given feature Fi, the conditional entropy of the dataset D is:

$$ H(D \mid F_{i}) = \sum\limits_{i = 1}^{n} \frac{|D_{i}|}{|D|} \times H(D_{i}) $$
(6)

where |Di| represents the number of samples in the ith subset, and |D| represents the number of all samples in the dataset D. On the basis of the entropy \( H(D) \) of the dataset and the conditional entropy \( H(D \mid F_{i}) \) for the feature Fi, the information gain of the feature Fi is calculated by:

$$ g(D,F_{i}) = H(D) - H(D \mid F_{i}) $$
(7)
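The information gain used by ID3 to choose a splitting feature can be computed as in the following sketch. It takes the conditional entropy as the size-weighted sum of the subset entropies and uses base-2 logarithms; the feature names and the four toy samples are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def information_gain(feature_values, labels):
    """g(D, F) = H(D) - H(D | F) for one discrete feature F."""
    total = len(labels)
    subsets = {}
    for value, label in zip(feature_values, labels):
        subsets.setdefault(value, []).append(label)
    conditional = sum(len(s) / total * entropy(s) for s in subsets.values())
    return entropy(labels) - conditional

# Toy dataset with two hypothetical discretized features.
labels = ["cyclist", "cyclist", "other", "other"]
features = {
    "width_class": ["narrow", "narrow", "wide", "wide"],  # separates the classes
    "layer_class": ["low", "high", "low", "high"],        # carries no class information
}
gains = {name: information_gain(values, labels) for name, values in features.items()}
best_feature = max(gains, key=gains.get)
print(gains, best_feature)
```

ID3 would split on the feature with the largest gain and then repeat the same computation inside each resulting subset.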

Tracking

Object tracking predicts the future motion state and helps avoid missed detections caused by temporary occlusion. Moving object tracking consists of data association and motion estimation. The data association procedure associates the same objects in successive frames, and the motion estimation procedure uses a filter to estimate the position and speed of the associated objects. In this section, the MHT algorithm is combined with a Kalman filter based on the CS model to track the cyclist.

Data association

The MHT algorithm selects the optimal association hypothesis for the same object in two consecutive frames, so the calculation of the hypothesis probability is critical. From the tracking start time to the kth time step, all the measurements are recorded as Zk = {Z(1), Z(2),…, Z(k)}, and all the hypotheses obtained by the MHT algorithm at the kth time step are recorded as \( \varOmega^{k} = \left\{ \varOmega_{i}^{k}, i = 1,2, \ldots ,I_{k} \right\} \). The probability \( P_{i}^{k} \) of the hypothesis \( \varOmega_{i}^{k} \) at the kth time step is defined as follows:

$$ P_{i}^{k} = P\left( \varOmega_{i}^{k} \mid Z^{k} \right) $$
(8)

Assume that \( \varOmega_{i}^{k} \) is obtained from the association hypothesis φk between the hypothesis \( \varOmega_{g}^{k - 1} \) at the previous frame and the measurements Z(k) at the current frame. Bayes' theorem is used to compute the hypothesis probability:

$$ P\left( \varOmega_{g}^{k - 1}, \varphi_{k} \mid Z(k) \right) = \frac{1}{c} P\left( Z(k) \mid \varOmega_{g}^{k - 1}, \varphi_{k} \right) \times P\left( \varphi_{k} \mid \varOmega_{g}^{k - 1} \right) \times P\left( \varOmega_{g}^{k - 1} \right) $$
(9)

where c is the normalization constant. For the association hypothesis φk, the number of measurements associated with new objects at the current frame is denoted NNT, the number of measurements associated with false objects is denoted NFT, the number of measurements associated with previously detected objects is denoted NDT, and the total number of measurements at the current frame is Mk. Assuming that the number of detected existing objects follows a binomial distribution, and that the number of new objects and the number of false objects each follow a Poisson distribution, we obtain:

$$ P\left( N_{DT}, N_{FT}, N_{NT} \mid \varOmega_{g}^{k - 1} \right) = \binom{N_{TGT}}{N_{DT}} P_{D}^{N_{DT}} (1 - P_{D})^{(N_{TGT} - N_{DT})} \times F_{N_{FT}}(\beta_{FT} V) F_{N_{NT}}(\beta_{NT} V) $$
(10)

where PD is the detection probability, NTGT is the number of previously known objects, βFT is the probability density of false alarms, βNT is the probability density of new objects, V is the volume of the region covered by the sensor, and Fn(λ) is the Poisson probability of n events with the rate parameter \( \lambda \). Then we get:

$$ P\left( \varphi_{k} \mid \varOmega_{g}^{k - 1} \right) = \frac{N_{FT}! N_{NT}!}{M_{k}!} P_{D}^{N_{DT}} (1 - P_{D})^{(N_{TGT} - N_{DT})} \times F_{N_{FT}}(\beta_{FT} V) F_{N_{NT}}(\beta_{NT} V) $$
(11)

The hypothesis probability is provided by:

$$ P_{i}^{k} = \frac{1}{c} P_{D}^{N_{DT}} (1 - P_{D})^{(N_{TGT} - N_{DT})} \times \beta_{FT}^{N_{FT}} \beta_{NT}^{N_{NT}} \times \left[ \prod\limits_{m = 1}^{N_{DT}} N(Z_{m} - H\tilde{X}, S) \right] P_{g}^{k - 1} $$
(12)

After the probability of each possible association hypothesis is obtained, all the association probabilities are collected in one correlation matrix. The hypothesis with the maximum association probability is selected as the optimal hypothesis.
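Formula (12) can be evaluated hypothesis by hypothesis and then normalized, as in the sketch below. It assumes two-dimensional position residuals and a shared innovation covariance S; the function names, the densities βFT and βNT, the covariance and the residual are hypothetical, and only PD = 0.9 is taken from the experiments.

```python
import math

def gaussian_2d(residual, S):
    """Density N(residual; 0, S) for a 2-D residual and a 2x2 covariance S."""
    (a, b), (c, d) = S
    det = a * d - b * c
    rx, ry = residual
    # Mahalanobis distance via the closed-form inverse of a 2x2 matrix.
    m = (d * rx * rx - (b + c) * rx * ry + a * ry * ry) / det
    return math.exp(-0.5 * m) / (2.0 * math.pi * math.sqrt(det))

def hypothesis_score(parent_prob, residuals, S, n_tgt, n_ft, n_nt, p_d, beta_ft, beta_nt):
    """Unnormalized hypothesis probability following formula (12);
    one residual per measurement assigned to an existing track."""
    n_dt = len(residuals)
    score = p_d ** n_dt * (1.0 - p_d) ** (n_tgt - n_dt)
    score *= beta_ft ** n_ft * beta_nt ** n_nt
    for r in residuals:
        score *= gaussian_2d(r, S)
    return score * parent_prob

def normalise(scores):
    """Apply the 1/c factor so that the hypothesis probabilities sum to one."""
    total = sum(scores)
    return [s / total for s in scores]

P_D = 0.9                        # detection probability used in the experiments
BETA_FT, BETA_NT = 0.01, 0.005   # hypothetical false-alarm and new-object densities
S = [[0.25, 0.0], [0.0, 0.25]]   # hypothetical innovation covariance (m^2)

# One existing track and one new measurement give three competing hypotheses.
scores = [
    hypothesis_score(1.0, [(0.2, -0.1)], S, 1, 0, 0, P_D, BETA_FT, BETA_NT),  # continues the track
    hypothesis_score(1.0, [], S, 1, 1, 0, P_D, BETA_FT, BETA_NT),             # false alarm
    hypothesis_score(1.0, [], S, 1, 0, 1, P_D, BETA_FT, BETA_NT),             # new object
]
probs = normalise(scores)
best = max(range(len(probs)), key=probs.__getitem__)
print([round(p, 4) for p in probs], best)
```

A measurement close to the predicted position makes the track-continuation hypothesis dominate; a full MHT implementation would additionally prune and merge hypotheses across frames, which this sketch omits.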

Motion model

The motion state of an object is usually described by one of several common models, including the constant velocity model, the constant acceleration model and the CS model. The motion state of a cyclist varies frame by frame, and sudden acceleration or deceleration may happen at any time. To avoid a large cumulative error, the CS model is used to estimate the motion state of the cyclist. The CS model is a time-correlated acceleration model whose mean is the acceleration estimate at the current time step, and the probability distribution of the acceleration is represented by the revised Rayleigh distribution [29]. In the CS model, the acceleration of the cyclist is modeled as follows:

$$ \ddot{x}(t) = \bar{a}(t) + a(t) $$
(13)
$$ \dot{a}(t) = - \alpha a(t) + w(t) $$
(14)

where \( \bar{a} \) is the mean value of acceleration, α is the reciprocal of the time constant, \( w(t) \) is the white noise with the mean value 0, the variance \( \sigma_{w}^{2} = 2\alpha \sigma_{a}^{2} \), \( \sigma_{a}^{2} \) is the acceleration variance. Thus, CS model for the cyclist is established by:

$$ \left[ \begin{array}{c} \dot{x}(t) \\ \ddot{x}(t) \\ \dddot{x}(t) \end{array} \right] = \left[ \begin{array}{ccc} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & - \alpha \end{array} \right] \left[ \begin{array}{c} x(t) \\ \dot{x}(t) \\ \ddot{x}(t) \end{array} \right] + \left[ \begin{array}{c} 0 \\ 0 \\ \alpha \end{array} \right] \bar{a}(t) + \left[ \begin{array}{c} 0 \\ 0 \\ 1 \end{array} \right] w(t) $$
(15)
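For use in a discrete-time filter, the continuous model of formula (15) has to be discretized over the scan period T. The sketch below uses the closed-form matrix exponential of the system matrix for a single axis, with the mean acceleration held constant over one period and the process noise omitted; the function names and the values of α and T are hypothetical, since the text does not report them.

```python
import math

def cs_transition(alpha, T):
    """Discrete transition matrix Phi and input vector U of the single-axis CS model,
    so that X(k+1) = Phi X(k) + U * a_mean (process noise omitted)."""
    e = math.exp(-alpha * T)
    phi = [
        [1.0, T, (alpha * T - 1.0 + e) / alpha ** 2],
        [0.0, 1.0, (1.0 - e) / alpha],
        [0.0, 0.0, e],
    ]
    u = [
        (-T + alpha * T ** 2 / 2.0 + (1.0 - e) / alpha) / alpha,
        T - (1.0 - e) / alpha,
        1.0 - e,
    ]
    return phi, u

def predict(state, a_mean, alpha, T):
    """One-step prediction of [position, velocity, acceleration]."""
    phi, u = cs_transition(alpha, T)
    return [sum(phi[r][c] * state[c] for c in range(3)) + u[r] * a_mean
            for r in range(3)]

# Hypothetical values: alpha = 0.5 1/s, T = 0.1 s, cyclist at 2 m/s accelerating at 1 m/s^2.
state = [0.0, 2.0, 1.0]
predicted = predict(state, a_mean=1.0, alpha=0.5, T=0.1)
print([round(v, 4) for v in predicted])
```

When the mean acceleration equals the current acceleration, the prediction reduces to constant-acceleration kinematics, which gives a quick sanity check of the matrices.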

Filtering algorithm

The state vector of the cyclist is:

$$ X(t) = \left[ x(t), \dot{x}(t), \ddot{x}(t), y(t), \dot{y}(t), \ddot{y}(t) \right]^{T} $$

and the state equation is as follows:

$$ \left[ \begin{array}{c} \dot{x}(t) \\ \ddot{x}(t) \\ \dddot{x}(t) \\ \dot{y}(t) \\ \ddot{y}(t) \\ \dddot{y}(t) \end{array} \right] = \left[ \begin{array}{cccccc} 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & - \alpha & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 & - \alpha \end{array} \right] \left[ \begin{array}{c} x(t) \\ \dot{x}(t) \\ \ddot{x}(t) \\ y(t) \\ \dot{y}(t) \\ \ddot{y}(t) \end{array} \right] + \left[ \begin{array}{c} 0 \\ 0 \\ \alpha \bar{a}_{x} \\ 0 \\ 0 \\ \alpha \bar{a}_{y} \end{array} \right] + \left[ \begin{array}{c} 0 \\ 0 \\ w_{x}(t) \\ 0 \\ 0 \\ w_{y}(t) \end{array} \right] $$
(16)

The above formula can be written compactly as:

$$ \dot{X}(t) = AX(t) + B + C(t) $$
(17)

where A is the system matrix, B is the mean-acceleration input term and C(t) is the noise term of formula (16). \( x(t) \), \( \dot{x}(t) \), \( \ddot{x}(t) \) and \( y(t) \), \( \dot{y}(t) \), \( \ddot{y}(t) \) represent the position, speed and acceleration of the cyclist along the X and Y directions, respectively. \( \bar{a}_{x} \), \( \bar{a}_{y} \) represent the mean acceleration along the X and Y directions. wx(t), wy(t) denote zero-mean white noise in the X and Y directions with the variances \( \sigma_{wx}^{2} = 2\alpha \sigma_{ax}^{2} \) and \( \sigma_{wy}^{2} = 2\alpha \sigma_{ay}^{2} \), where \( \sigma_{ax}^{2} \), \( \sigma_{ay}^{2} \) are the acceleration variances along the X and Y directions.

The observation equation is:

$$ Z(k) = H(k)X(k) + V(k) $$
(18)

where \( H = \left[ \begin{array}{cccccc} 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 \end{array} \right] \), \( Z(k) = \left[ \begin{array}{c} x(k) \\ y(k) \end{array} \right] \), and \( V(k) \) represents zero-mean Gaussian noise with the covariance R(k).

The cyclist motion system is modeled based on Kalman filter and CS model as follows:

$$ \hat{x}(k \mid k - 1) = \phi(k \mid k - 1)\hat{x}(k - 1 \mid k - 1) $$
(19)
$$ P(k \mid k - 1) = \phi(k \mid k - 1)P(k - 1 \mid k - 1)\phi^{T}(k \mid k - 1) + Q(k - 1) $$
(20)
$$ K(k) = P(k \mid k - 1)H^{T}(k)\left[ H(k)P(k \mid k - 1)H^{T}(k) + R(k) \right]^{-1} $$
(21)
$$ P(k \mid k) = \left[ I - K(k)H(k) \right]P(k \mid k - 1) $$
(22)
$$ \hat{x}(k \mid k) = \hat{x}(k \mid k - 1) + K(k)\left[ Z(k) - H(k)\hat{x}(k \mid k - 1) \right] $$
(23)
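The recursion of formulas (19)–(23) can be sketched for a single axis, where the state is [position, velocity, acceleration] and only the position is measured, so that H = [1, 0, 0] and the matrix inverse in formula (21) becomes a scalar division. This is a simplified illustration rather than the six-state filter described above; the function names, α, T, the noise levels and the synthetic measurements are hypothetical.

```python
import math

def mat_mul(a, b):
    """Product of two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def kalman_step(x_est, P, z, phi, Q, r):
    """One predict/update cycle for a 3-state axis with H = [1, 0, 0]."""
    # Prediction, formulas (19) and (20).
    x_pred = [sum(phi[i][j] * x_est[j] for j in range(3)) for i in range(3)]
    P_pred = mat_mul(mat_mul(phi, P), transpose(phi))
    P_pred = [[P_pred[i][j] + Q[i][j] for j in range(3)] for i in range(3)]
    # Gain, formula (21): the innovation covariance is the scalar P_pred[0][0] + r.
    s = P_pred[0][0] + r
    K = [P_pred[i][0] / s for i in range(3)]
    # Covariance and state update, formulas (22) and (23).
    P_new = [[P_pred[i][j] - K[i] * P_pred[0][j] for j in range(3)] for i in range(3)]
    innovation = z - x_pred[0]
    x_new = [x_pred[i] + K[i] * innovation for i in range(3)]
    return x_new, P_new

T, ALPHA = 0.1, 0.5   # hypothetical scan period (s) and maneuver frequency (1/s)
e = math.exp(-ALPHA * T)
PHI = [[1.0, T, (ALPHA * T - 1.0 + e) / ALPHA ** 2],
       [0.0, 1.0, (1.0 - e) / ALPHA],
       [0.0, 0.0, e]]
Q = [[1e-4, 0.0, 0.0], [0.0, 1e-3, 0.0], [0.0, 0.0, 1e-2]]  # hypothetical process noise
R = 0.04                                                      # hypothetical measurement variance (m^2)

x = [0.0, 0.0, 0.0]
P = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
for k in range(1, 31):
    # Synthetic target moving at 2 m/s, measured with an alternating error of 0.1 m.
    z = 2.0 * T * k + (0.1 if k % 2 else -0.1)
    x, P = kalman_step(x, P, z, PHI, Q, R)
print([round(v, 2) for v in x])
```

After each update the position variance is smaller than the measurement variance, which is the smoothing effect the filter is used for.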

Experiments

Datasets

The independent clusters of the point cloud samples are segmented by the subarea-based DBSCAN algorithm at each frame. In the experiments, a camera synchronously recorded the scene images, which were used to manually label the positive and negative point cloud samples of the cyclists. Figure 7 shows the real scene at one frame and the classification result in the point cloud scene. In this figure, each color of the point cloud denotes one scan layer of the laser scanner. The red and black rectangles denote cyclists and non-cyclists, respectively. The positive samples include cyclists with various poses, and the negative samples include pedestrians, vehicles, lamp posts, trees and other non-cyclist objects. Some positive and negative samples are shown in Fig. 8. In total, 3500 samples were extracted, including 1500 positive samples and 2000 negative samples. Five-fold cross-validation is used to improve the generalization of the classifiers.

Fig. 7

The classification result for real scene at one frame

Fig. 8

The samples. a Positive samples, b Negative samples

Detection algorithm validation

To compare the performance of multiple classifiers for cyclist detection, six classifiers are constructed: (1) SVM + F, (2) SVM + FP, (3) SVM + FR, (4) DT + F, (5) DT + FP and (6) DT + FR. The ROC curve is used to evaluate the performance of these classifiers, as shown in Fig. 9. The AUC (area under the ROC curve) is the key measure of cyclist detection performance: a classifier with a larger AUC is superior. Table 5 lists the results of each classifier. They indicate that the SVM classifier outperforms the DT classifier on the same feature set. For the SVM classifier, the features extracted by PCA give the best recognition result of the three feature sets, while the features selected by the Relief algorithm perform best for the DT classifier. For a given classifier, the original 37-dimensional feature set performs worst, which indicates that the original feature set contains redundancy and that the proposed feature optimization is essential. Overall, the AUC values of all six classifiers exceed 0.93, demonstrating that each classifier performs well for cyclist detection.

Fig. 9

ROC curves for six classifiers

Table 5 The results of each classifier
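The AUC used to rank the classifiers can be computed without plotting the ROC curve, using the equivalent rank-based formulation: it is the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one. The sketch below is illustrative; the scores and labels are hypothetical and do not reproduce Table 5.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positive and
    negative scores (ties count as half a win)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical classifier scores; label 1 = cyclist, 0 = non-cyclist.
scores = [0.95, 0.80, 0.70, 0.55, 0.40, 0.20]
labels = [1, 1, 0, 1, 0, 0]
auc = roc_auc(scores, labels)
print(round(auc, 3))
```

The pairwise loop is quadratic in the number of samples, which is acceptable for a few thousand samples; a sort-based rank sum gives the same value faster.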

Tracking algorithm validation

To evaluate the proposed cyclist tracking algorithm, the point cloud of cyclists moving on campus is collected using the LUX 4L mounted on a stationary vehicle. As shown in Fig. 10, three cyclists move away from the sensor frame by frame. We set the detection probability Pd = 0.9, and the probability that the correct measurement falls into the tracking gate Pg = 0.99.

Fig. 10

The tracking results. a Frame 80, b Frame 160, c Frame 240

The tracking results for Frame 80, Frame 160 and Frame 240 are shown in Fig. 10. The black rectangle denotes a detected cyclist, and the speed is labelled in kilometers per hour. The tails behind the three cyclists are their historical tracks. The results indicate that the combination of the MHT algorithm with the Kalman filter based on the CS model can accurately track multiple cyclists.

To further verify the motion estimation performance of the proposed tracking method, the estimated motion parameters of Cyclist 2 are compared with on-board GPS data. The lateral and longitudinal positions in the tracking process are shown in Fig. 11, and the velocities along the lateral and longitudinal directions are shown in Fig. 12. The point cloud of the cyclist varies with the range frame by frame, and this variation makes the measured cyclist center unstable. As a result, the measured position and speed of the cyclist fluctuate noticeably and do not match the real motion of the cyclist. In contrast, the proposed Kalman filter based on the CS model yields stable position and speed estimates for moving cyclists. Thus, the proposed tracking method has good applicability.

Fig. 11

Lateral and longitudinal position curves for the tracking of Cyclist 2

Fig. 12

Lateral and longitudinal velocity curves for the tracking of Cyclist 2

Conclusion

In this paper, a cyclist detection and tracking method is presented based on a multi-layer laser scanner. First, a subarea-based DBSCAN algorithm is developed to segment the uneven point cloud based on the density distribution. Second, considering the point cloud characteristics of cyclists, we construct a 37-dimensional feature set including number-of-points-based, geometric and statistical features. The Relief algorithm and PCA are each used to optimize the feature set. Then, the three feature sets are combined with SVM and DT classifiers to generate six cyclist classifiers, and the detected cyclists are tracked using the MHT algorithm and a Kalman filter based on the CS motion model. Experimental results show that the subarea-based clustering algorithm can effectively segment the laser points into independent clusters. For the SVM classifier, the feature set extracted by PCA gives a better classification result than the other feature sets, while the feature set selected by the Relief algorithm performs best for the DT classifier. The proposed cyclist tracking method can robustly obtain the position and speed of moving cyclists. Because laser points become sparse at long range, future work will combine various sensors to achieve accurate detection and tracking of long-range cyclists.

Availability of data and materials

We declare that the materials described in the manuscript will be freely available to any scientist wishing to use them for non-commercial purposes.

References

  1. Sheehan B, Murphy F, Mullins M, Ryan C (2019) Connected and autonomous vehicles: a cyber-risk classification framework. Transp Res Pt A Policy Pract 124:523–536
  2. Chen M, Tian Y, Fortino G, Zhang J, Humar I (2018) Cognitive internet of vehicles. Comput Commun 120:58–70
  3. You C, Lu J, Filev D, Tsiotras P (2019) Advanced planning for autonomous vehicles using reinforcement learning and deep inverse reinforcement learning. Robot Auton Syst 114:1–18
  4. Berntorp K, Tru H, Stefano C (2019) Motion planning of autonomous road vehicles by particle filtering. IEEE Trans Intell Veh 4(2):197–210
  5. Noh S (2018) Decision-making framework for autonomous driving at road intersections: safeguarding against collision, overly conservative behavior, and violation vehicles. IEEE Trans Ind Electron 66(4):3275–3286
  6. Kearns M, Ledsham T, Savan B, Scott J (2019) Increasing cycling for transportation through mentorship programs. Transp Res Pt A Policy Pract 128:34–45
  7. World Health Organization (2011) WHO global status report on road safety 2011: supporting a decade of action. Statistical Center of Iran. General Statistical Yearbook 1389-1390
  8. Subirats P, Dupuis Y (2015) Overhead LIDAR-based motorcycle counting. Trans Lett 7(2):114–117
  9. Prabhakar Y, Subirats P, Lecomte C, Vivet D, Violette E, Bensrhair A (2013) Detection and counting of powered two wheelers by laser scanner in real time on urban expressway. In: IEEE Conference on Intelligent Transportation Systems. pp 1149–1154
  10. Zangenehpour S, Luis F, Nicolas S (2015) Automated classification based on video data at intersections with heavy pedestrian and bicycle traffic: methodology and application. Transp Res Pt C Emerg Technol 56:161–176
  11. Li X, Li L, Flohr F, Wang J, Li K (2016) A unified framework for concurrent pedestrian and cyclist detection. IEEE Trans Intell Transp 18(2):269–281
  12. Tian W, Lauer M (2015) Fast cyclist detection by cascaded detector and geometric constraint. In: 2015 IEEE International Conference on Intelligent Transportation Systems. pp 1286–1291
  13. Bieshaar M, Zernetsch S, Hubert A, Sick B, Doll K (2018) Cooperative starting movement detection of cyclists using convolutional neural networks and a boosted stacking ensemble. IEEE Trans Intell Veh 3(4):534–544
  14. Liu C, Guo Y, Li S, Chang F (2019) ACF based region proposal extraction for YOLOv3 network towards high-performance cyclist detection in high resolution images. Sensors 19(12):2671–2689
  15. Song W, Zou S, Tian Y (2018) Classifying 3D objects in LiDAR point clouds with a back-propagation neural network. Human-centric Comput Inf Sci 8:29
  16. Chu PM, Cho S, Park J (2019) Enhanced ground segmentation method for Lidar point clouds in human-centric autonomous robot systems. Human-centric Comput Inf Sci 9:17
  17. Muhammad A, Pu Y, Rahman Z, Abro W, Naeem H, Ullah F, Badr A (2018) A hybrid proposed framework for object detection and classification. J Inf Process Syst 14(5):1176–1194
  18. Truong M, Kim S (2019) A tracking-by-detection system for pedestrian tracking using deep learning technique and color information. J Inf Process Syst 15(4):1017–1028
  19. Wang D, Posner I, Newman P (2012) What could move? Finding cars, pedestrians and bicyclists in 3D laser data. In: IEEE International Conference on Robotics and Automation. pp 4038–4044
  20. Huang R, Liang H, Chen J, Zhao P, Du M (2016) Lidar based dynamic obstacle detection, tracking and recognition method for driverless cars. Robot 38(4):437–443
  21. Magnier V, Gruyer D, Godelle J (2017) Automotive LIDAR objects detection and classification algorithm using the belief theory. In: IEEE Intelligent Vehicles Symposium (IV). pp 746–751
  22. Kim B, Choi B, Park S, Kim H, Kim E (2015) Pedestrian/Vehicle detection using a 2.5-D multi-layer laser scanner. IEEE Sens J 16(2):400–408
  23. Carballo A, Ohya A, Yuta S (2015) People detection using range and intensity data from multi-layered Laser Range Finders. In: IEEE/RSJ International Conference on Intelligent Robots & Systems. pp 5849–5854
  24. Arras K, Mozos Ó, Burgard W (2007) Using boosted features for the detection of people in 2D range data. In: Proceedings of IEEE International Conference on Robotics and Automation. pp 3402–3407
  25. Mahesh K, Rama M (2016) A fast DBSCAN clustering algorithm by accelerating neighbor searching using Groups method. Pattern Recogn 58:39–48
  26. Saeys Y, Abeel T, Peer Y (2008) Robust feature selection using ensemble feature selection techniques. In: Joint European Conference on Machine Learning and Knowledge Discovery in Databases. pp 313–325
  27. Nishino K, Nayar SK, Jebara T (2005) Clustered blockwise PCA for representing visual data. IEEE Trans Pattern Anal Mach Intell 27(10):1675
  28. Chang C, Lin C (2011) LIBSVM: a library for support vector machines. ACM Trans Intell Syst Technol 2(3):27
  29. Sohn J, Kim NS, Sung W (1999) A statistical model-based voice activity detection. IEEE Signal Process Lett 6(1):1–3

Acknowledgements

We would like to thank the reviewers for their valuable comments.

Funding

This work is supported in part by the National Key R&D Program of China under Grant No. 2018YFB1600500, the National Natural Science Foundation of China under Grant Nos. 51905007, 51775053, and 61603004, the Great Wall Scholar Program under Grant CIT&TCD20190304, Ability Construction of Science, the Beijing Key Lab Construction Fund under Grant PXM2017-014212-000033, and the NCUT start-up fund.

Author information

Contributions

The authors have contributed significantly to the research work presented in this manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Mingfang Zhang.

Ethics declarations

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Zhang, M., Fu, R., Guo, Y. et al. Cyclist detection and tracking based on multi-layer laser scanner. Hum. Cent. Comput. Inf. Sci. 10, 20 (2020). https://doi.org/10.1186/s13673-020-00225-x

Keywords

  • Cyclist detection
  • Feature extraction
  • Multi-layer laser scanner
  • Current statistical model
  • Multiple hypothesis tracking