Objective function
Assume that \({\varvec{X}}\in {{\varvec{R}}}^{n\times d}\) represents the training sample set, where \(n\) is the total number of training samples, \(d\) is the number of dimensions, and \(c\) is the number of clusters. Let \({\varvec{S}}\in {{\varvec{R}}}^{n\times n}\) denote the similarity matrix, where each element \({s}_{ij}\;(i,j=1,2,\ldots ,n)\) represents the similarity between the \(i\)-th and \(j\)-th samples. Similarly, \({\varvec{D}}\in {{\varvec{R}}}^{n\times n}\) is the distance matrix of \({\varvec{X}}\), where \({d}_{ij}\) is the Euclidean distance between the \(i\)-th and \(j\)-th samples, \({d}_{ij}={\Vert {{\varvec{x}}}_{i}-{{\varvec{x}}}_{j}\Vert }_{2}\). By incorporating the fuzzy index \(m\) into the local adaptive spectral clustering framework and referencing the objective function of FCM, we define the objective function of FSC as follows:
$$\underset{{\varvec{S}}}{min}\hspace{0.33em}J={\sum }_{i=1}^{n}{\sum }_{j=1}^{n}{s}_{ij}^{m}{d}_{ij}^{2}\quad s.t.\;\forall i,\;{\sum }_{j=1}^{n}{s}_{ij}=1,\;0<{s}_{ij}<1,\;m>0,\;m\ne 1$$
(10)
Obviously, compared with the classical spectral clustering method, the optimized similarity matrix \({\varvec{S}}\) becomes sparser, so it can be used to partition the data directly, without a subsequent K-means step. Besides, by means of the fuzzy index, the method reduces the sensitivity of the clustering results to the similarity matrix \({\varvec{S}}\) and improves the adaptability of spectral clustering.
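To make the objective concrete, the following is a minimal sketch (not the authors' implementation) that evaluates Eq. (10) for a given data matrix and similarity matrix, assuming squared Euclidean distances:

```python
import numpy as np

def fsc_objective(X, S, m):
    """Evaluate J = sum_{i,j} s_ij^m * d_ij^2 from Eq. (10)."""
    # Pairwise squared Euclidean distances d_ij^2.
    sq = np.sum(X ** 2, axis=1)
    D2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    np.maximum(D2, 0.0, out=D2)  # guard against tiny negative round-off
    return np.sum((S ** m) * D2)
```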
Although the sparsity of the similarity matrix \({\varvec{S}}\) has been improved by introducing the fuzzy index, this alone cannot guarantee that the dataset is partitioned into exactly \(c\) clusters, because the number of connected components of \({\varvec{S}}\) is not necessarily equal to \(c\). To ensure that the optimal similarity matrix \({\varvec{S}}\) can be divided into exactly \(c\) clusters, following14, we add a rank constraint \(rank({{\varvec{L}}}_{{\varvec{S}}})=n-c\) on the Laplacian matrix of \({\varvec{S}}\). The new objective function is as follows:
$$\underset{{\varvec{S}}}{min}\hspace{0.33em}J={\sum }_{i=1}^{n}{\sum }_{j=1}^{n}{s}_{ij}^{m}{d}_{ij}^{2}\quad s.t.\;\forall i,\;{\sum }_{j=1}^{n}{s}_{ij}=1,\;0<{s}_{ij}<1,\;m>0,\;m\ne 1,\;rank\left({{\varvec{L}}}_{{\varvec{S}}}\right)=n-c$$
(11)
According to14, adding the rank constraint on the Laplacian matrix of \({\varvec{S}}\) ensures that the optimized \({\varvec{S}}\) has exactly \(c\) connected components, which equals the number of clusters. Compared with the objective function in14, the fuzzy index further improves the sparsity of the similarity matrix and reduces the sensitivity of spectral clustering, thereby improving the adaptivity of FSC.
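The rank constraint can be verified numerically: the multiplicity of the zero eigenvalue of a graph Laplacian equals the number of connected components. A small sketch, where the symmetrization and tolerance are our own assumptions:

```python
import numpy as np

def num_connected_components(S, tol=1e-10):
    W = (S + S.T) / 2.0              # symmetrize the similarity graph
    L = np.diag(W.sum(axis=1)) - W   # unnormalized Laplacian L_S
    vals = np.linalg.eigvalsh(L)     # ascending; L_S is positive semi-definite
    return int(np.sum(vals < tol))   # multiplicity of the zero eigenvalue
```

If this returns \(c\), then \(rank({{\varvec{L}}}_{{\varvec{S}}})=n-c\) holds.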
Optimizing the objective function
To optimize Eq. (11), we first need to transform it. Suppose \({\sigma }_{i}({{\varvec{L}}}_{{\varvec{S}}})\) denotes the i-th smallest eigenvalue of \({{\varvec{L}}}_{{\varvec{S}}}\); since \({{\varvec{L}}}_{{\varvec{S}}}\) is a positive semi-definite matrix, we have \({\sigma }_{i}({{\varvec{L}}}_{{\varvec{S}}})\ge 0\). For a sufficiently large \(\beta\), Eq. (11) can be transformed as follows:
$$\underset{{\varvec{S}}}{min}\hspace{0.33em}J={\sum }_{i=1}^{n}{\sum }_{j=1}^{n}{s}_{ij}^{m}{d}_{ij}^{2}+2\beta {\sum }_{i=1}^{c}{\sigma }_{i}\left({{\varvec{L}}}_{{\varvec{S}}}\right)\quad s.t.\;\forall i,\;{\sum }_{j=1}^{n}{s}_{ij}=1,\;0<{s}_{ij}<1,\;m>0,\;m\ne 1$$
(12)
When \(\beta\) is sufficiently large, the second term of Eq. (12) tends to zero during the optimization of Eq. (12), so the constraint \(rank({{\varvec{L}}}_{{\varvec{S}}})=n-c\) is satisfied. According to Ky Fan's theorem26, we have
$${\sum }_{i=1}^{c}{\sigma }_{i}\left({{\varvec{L}}}_{{\varvec{S}}}\right)=\underset{{\varvec{F}}\in {{\varvec{R}}}^{n\times c},{{\varvec{F}}}^{T}{\varvec{F}}={\varvec{I}}}{min}Tr\left({{\varvec{F}}}^{T}{{\varvec{L}}}_{{\varvec{S}}}{\varvec{F}}\right)$$
(13)
where \({\varvec{F}}\) is formed by the eigenvectors corresponding to the \(c\) smallest eigenvalues of \({{\varvec{L}}}_{{\varvec{S}}}\). Eq. (12) can then be further transformed into:
$$\underset{{\varvec{S}},{\varvec{F}}}{min}\hspace{0.33em}J={\sum }_{i,j=1}^{n}{s}_{ij}^{m}{d}_{ij}^{2}+2\beta Tr\left({{\varvec{F}}}^{T}{{\varvec{L}}}_{{\varvec{S}}}{\varvec{F}}\right)\quad s.t.\;\forall i,\;{\sum }_{j=1}^{n}{s}_{ij}=1,\;0\le {s}_{ij}\le 1,\;m>0,\;m\ne 1,\;{\varvec{F}}\in {{\varvec{R}}}^{n\times c},\;{{\varvec{F}}}^{T}{\varvec{F}}={\varvec{I}}$$
(14)
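For monitoring the alternating scheme below, the relaxed objective of Eq. (14) can be evaluated directly; a minimal sketch, where D2 denotes the matrix of squared pairwise distances:

```python
import numpy as np

def relaxed_objective(S, D2, F, beta, m):
    W = (S + S.T) / 2.0
    L = np.diag(W.sum(axis=1)) - W   # Laplacian of S
    return np.sum((S ** m) * D2) + 2.0 * beta * np.trace(F.T @ L @ F)
```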
Obviously, compared to Eq. (12), Eq. (14) is easier to optimize. We therefore propose an alternating optimization method for Eq. (14); the main steps are summarized as follows:
When \({\varvec{S}}\) is fixed, we optimize \({\varvec{F}}\). Equation (14) becomes
$$\underset{{\varvec{F}}\in {{\varvec{R}}}^{n\times c},{{\varvec{F}}}^{T}{\varvec{F}}={\varvec{I}}}{min}J=Tr\left({{\varvec{F}}}^{T}{{\varvec{L}}}_{{\varvec{S}}}{\varvec{F}}\right)$$
(15)
The optimal solution of Eq. (15) is formed by the eigenvectors corresponding to the \(c\) smallest eigenvalues of \({{\varvec{L}}}_{{\varvec{S}}}\).
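A sketch of this F-update, using a dense symmetric eigensolver for clarity (a sparse or truncated solver would be preferable for large \(n\)):

```python
import numpy as np

def update_F(S, c):
    W = (S + S.T) / 2.0
    L = np.diag(W.sum(axis=1)) - W
    _, vecs = np.linalg.eigh(L)   # eigenvalues in ascending order
    return vecs[:, :c]            # F in R^{n x c}, satisfying F^T F = I
```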
When \({\varvec{F}}\) is fixed, we optimize \({\varvec{S}}\). Equation (14) becomes
$$\underset{{\varvec{S}}}{min}\hspace{0.33em}J={\sum }_{i,j=1}^{n}{s}_{ij}^{m}{d}_{ij}^{2}+2\beta Tr\left({{\varvec{F}}}^{T}{{\varvec{L}}}_{{\varvec{S}}}{\varvec{F}}\right)\quad s.t.\;\forall i,\;{\sum }_{j=1}^{n}{s}_{ij}=1,\;0\le {s}_{ij}\le 1,\;m>0,\;m\ne 1,\;{\varvec{F}}\in {{\varvec{R}}}^{n\times c},\;{{\varvec{F}}}^{T}{\varvec{F}}={\varvec{I}}$$
(16)
Considering that \({\sum }_{i,j=1}^{n}{\Vert {{\varvec{f}}}_{i}-{{\varvec{f}}}_{j}\Vert }_{2}^{2}{s}_{ij}=2Tr({{\varvec{F}}}^{T}{{\varvec{L}}}_{{\varvec{S}}}{\varvec{F}})\), where \({{\varvec{f}}}_{i}\) denotes the i-th row of \({\varvec{F}}\in {{\varvec{R}}}^{n\times c}\), Eq. (16) becomes:
$$\underset{{\varvec{S}}}{min}\hspace{0.33em}J={\sum }_{i,j=1}^{n}\left({s}_{ij}^{m}{d}_{ij}^{2}+\beta {\Vert {{\varvec{f}}}_{i}-{{\varvec{f}}}_{j}\Vert }_{2}^{2}{s}_{ij}\right)\quad s.t.\;\forall i,\;{\sum }_{j=1}^{n}{s}_{ij}=1,\;0\le {s}_{ij}\le 1,\;m>0,\;m\ne 1$$
(17)
To optimize Eq. (17), we adopt the Lagrange multiplier method; the Lagrangian function is as follows:
$$J={\sum }_{i,j=1}^{n}\left({s}_{ij}^{m}{d}_{ij}^{2}+\beta {\Vert {{\varvec{f}}}_{i}-{{\varvec{f}}}_{j}\Vert }_{2}^{2}{s}_{ij}\right)+{\sum }_{i=1}^{n}{\alpha }_{i}\left(1-{\sum }_{j=1}^{n}{s}_{ij}\right)$$
(18)
where \({\alpha }_{i}\) is the Lagrange multiplier. Setting \(\frac{\partial J}{\partial {s}_{ij}}=0\), we have
$$\frac{\partial J}{\partial {s}_{ij}}=m{s}_{ij}^{m-1}{d}_{ij}^{2}+\beta {\Vert {{\varvec{f}}}_{i}-{{\varvec{f}}}_{j}\Vert }_{2}^{2}-{\alpha }_{i}=0$$
(19)
Then we have:
$${s}_{ij}={\left(\frac{{\alpha }_{i}-\beta {\Vert {{\varvec{f}}}_{i}-{{\varvec{f}}}_{j}\Vert }_{2}^{2}}{m{d}_{ij}^{2}}\right)}^{\frac{1}{m-1}}$$
(20)
Substituting Eq. (20) into the constraint \({\sum }_{j=1}^{n}{s}_{ij}=1\), we can solve for \({\alpha }_{i}\); substituting \({\alpha }_{i}\) back into Eq. (20), we obtain:
$${s}_{ij}={\left(\frac{{\left({\sum }_{k=1,k\ne i}^{n}\left(m{d}_{ik}^{2}+\beta {\Vert {{\varvec{f}}}_{i}-{{\varvec{f}}}_{k}\Vert }_{2}^{2}\right)\right)}^{\frac{1}{m-1}}-\beta {\Vert {{\varvec{f}}}_{i}-{{\varvec{f}}}_{j}\Vert }_{2}^{2}}{m{d}_{ij}^{2}}\right)}^{\frac{1}{m-1}}$$
(21)
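Below is a row-wise transcription of the update in Eq. (21), assuming \(m>1\); the clipping of negative numerators and the final row renormalization (to enforce \({\sum }_{j}{s}_{ij}=1\) exactly in floating point) are our own safeguards rather than part of the derivation:

```python
import numpy as np

def update_S(D2, F, beta, m, eps=1e-12):
    """One S-update per Eq. (21); D2[i, j] = d_ij^2, F has shape (n, c)."""
    n = D2.shape[0]
    # E[i, j] = ||f_i - f_j||_2^2 between rows of F.
    E = np.sum((F[:, None, :] - F[None, :, :]) ** 2, axis=2)
    S = np.zeros((n, n))
    for i in range(n):
        mask = np.arange(n) != i
        alpha = np.sum(m * D2[i, mask] + beta * E[i, mask]) ** (1.0 / (m - 1.0))
        num = np.maximum(alpha - beta * E[i], 0.0)       # clip negative values
        S[i] = (num / (m * D2[i] + eps)) ** (1.0 / (m - 1.0))
        S[i, i] = 0.0                                    # no self-similarity
        S[i] /= max(S[i].sum(), eps)                     # renormalize the row
    return S
```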
Time complexity
The main time complexity of the proposed method FSC is concentrated in Step 1, Step 2.1 and Step 2.2 of Algorithm 1. Step 1 initializes the similarity matrix \({\mathbf{S}}\), which requires \({\rm O}(n^{2} dk)\). Step 2.1 computes the eigenvalues of \({\mathbf{L}}_{{\mathbf{S}}}\), which requires \({\rm O}(n^{2} d + n^{3} )\). The time complexity of Eq. (21) is \({\rm O}(mn^{2} dk)\), so the time complexity of Step 2.2 is \({\rm O}(mn^{2} dk)\). Therefore, the total time complexity of Algorithm 1 is \({\rm O}(n^{2} dk + t(n^{2} d + n^{3} + mn^{2} dk))\), where \(t\) is the number of iterations. From the above analysis, we can observe that the time complexity of FSC depends on \(n\), \(d\) and \(k\).