Objective function
Assume that \({\varvec{X}}\in {{\varvec{R}}}^{n\times d}\) represents the training sample set, where \(n\) is the total number of training samples, \(d\) is the number of dimensions, and \(c\) is the number of clusters. Let \({\varvec{S}}\in {{\varvec{R}}}^{n\times n}\) denote the similarity matrix, where each element \({s}_{ij}\;(i,j=1,2,\ldots ,n)\) represents the similarity between the \(i\)-th and \(j\)-th samples. Similarly, \({\varvec{D}}\in {{\varvec{R}}}^{n\times n}\) is the distance matrix of \({\varvec{X}}\), where \({d}_{ij}\) is the Euclidean distance between the \(i\)-th and \(j\)-th samples, \({d}_{ij}={\Vert {{\varvec{x}}}_{i}-{{\varvec{x}}}_{j}\Vert }_{2}\). By incorporating the fuzzy index \(m\) into the local adaptive spectral clustering framework and referencing the objective function of FCM, we define the objective function of FSC as follows:
$$\underset{{\varvec{S}}}{min}\hspace{0.33em}J={\sum }_{i=1}^{n}{\sum }_{j=1}^{n}{s}_{ij}^{m}{d}_{ij}^{2}\quad s.t.\;\forall i,\;{\sum }_{j=1}^{n}{s}_{ij}=1,\;0<{s}_{ij}<1,\;m>0,\;m\ne 1$$
(10)
Obviously, compared with the classical spectral clustering method, the optimized similarity matrix \({\varvec{S}}\) becomes sparser, so it can be used to partition the data directly, without a subsequent K-means step. Besides, by means of the fuzzy index, the method reduces the sensitivity of the clustering results to the similarity matrix \({\varvec{S}}\) and improves the adaptability of spectral clustering.
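To make the objective concrete, the following is a minimal sketch (not the authors' implementation) that evaluates Eq. (10) for a given data matrix and similarity matrix, assuming squared Euclidean distances:

```python
import numpy as np

def fsc_objective(X, S, m):
    """Evaluate J = sum_{i,j} s_ij^m * d_ij^2 from Eq. (10)."""
    # Pairwise squared Euclidean distances d_ij^2.
    sq = np.sum(X ** 2, axis=1)
    D2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    np.maximum(D2, 0.0, out=D2)  # guard against tiny negative round-off
    return np.sum((S ** m) * D2)
```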
Although the sparsity of the similarity matrix \({\varvec{S}}\) has been improved by introducing the fuzzy index, this alone cannot guarantee that the dataset is partitioned into exactly \(c\) clusters, because the number of connected components of \({\varvec{S}}\) is not necessarily equal to \(c\). To ensure that the optimal similarity matrix \({\varvec{S}}\) can be divided into exactly \(c\) clusters, following14, we add a rank constraint \(rank({{\varvec{L}}}_{{\varvec{S}}})=n-c\) on the Laplacian matrix of \({\varvec{S}}\). The new objective function is as follows:
$$\underset{{\varvec{S}}}{min}\hspace{0.33em}J={\sum }_{i=1}^{n}{\sum }_{j=1}^{n}{s}_{ij}^{m}{d}_{ij}^{2}\quad s.t.\;\forall i,\;{\sum }_{j=1}^{n}{s}_{ij}=1,\;0<{s}_{ij}<1,\;m>0,\;m\ne 1,\;rank\left({{\varvec{L}}}_{{\varvec{S}}}\right)=n-c$$
(11)
According to14, adding the rank constraint on the Laplacian matrix of \({\varvec{S}}\) ensures that the optimized \({\varvec{S}}\) has exactly \(c\) connected components, which equals the number of clusters. Compared with the objective function in14, the fuzzy index further improves the sparsity of the similarity matrix and reduces the sensitivity of spectral clustering, thereby improving the adaptivity of FSC.
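The rank constraint can be verified numerically: the multiplicity of the zero eigenvalue of a graph Laplacian equals the number of connected components. A small sketch, where the symmetrization and tolerance are our own assumptions:

```python
import numpy as np

def num_connected_components(S, tol=1e-10):
    W = (S + S.T) / 2.0              # symmetrize the similarity graph
    L = np.diag(W.sum(axis=1)) - W   # unnormalized Laplacian L_S
    vals = np.linalg.eigvalsh(L)     # ascending; L_S is positive semi-definite
    return int(np.sum(vals < tol))   # multiplicity of the zero eigenvalue
```

If this returns \(c\), then \(rank({{\varvec{L}}}_{{\varvec{S}}})=n-c\) holds.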
Optimizing the objective function
To optimize Eq. (11), we first need to transform it. Suppose \({\sigma }_{i}({{\varvec{L}}}_{{\varvec{S}}})\) denotes the i-th smallest eigenvalue of \({{\varvec{L}}}_{{\varvec{S}}}\); since \({{\varvec{L}}}_{{\varvec{S}}}\) is a positive semi-definite matrix, we have \({\sigma }_{i}({{\varvec{L}}}_{{\varvec{S}}})\ge 0\). For a sufficiently large \(\beta\), Eq. (11) can be transformed as follows:
$$\underset{{\varvec{S}}}{min}\hspace{0.33em}J={\sum }_{i=1}^{n}{\sum }_{j=1}^{n}{s}_{ij}^{m}{d}_{ij}^{2}+2\beta {\sum }_{i=1}^{c}{\sigma }_{i}\left({{\varvec{L}}}_{{\varvec{S}}}\right)\quad s.t.\;\forall i,\;{\sum }_{j=1}^{n}{s}_{ij}=1,\;0<{s}_{ij}<1,\;m>0,\;m\ne 1$$
(12)
When \(\beta\) is sufficiently large, the second term of Eq. (12) tends to zero during the optimization of Eq. (12), so the constraint \(rank({{\varvec{L}}}_{{\varvec{S}}})=n-c\) is satisfied. According to Ky Fan's theorem26, we have
$${\sum }_{i=1}^{c}{\sigma }_{i}\left({{\varvec{L}}}_{{\varvec{S}}}\right)=\underset{{\varvec{F}}\in {{\varvec{R}}}^{n\times c},{{\varvec{F}}}^{T}{\varvec{F}}={\varvec{I}}}{min}Tr\left({{\varvec{F}}}^{T}{{\varvec{L}}}_{{\varvec{S}}}{\varvec{F}}\right)$$
(13)
where \({\varvec{F}}\) is formed by the eigenvectors corresponding to the \(c\) smallest eigenvalues of \({{\varvec{L}}}_{{\varvec{S}}}\). Eq. (12) can then be further transformed into:
$$\underset{{\varvec{S}},{\varvec{F}}}{min}\hspace{0.33em}J={\sum }_{i,j=1}^{n}{s}_{ij}^{m}{d}_{ij}^{2}+2\beta Tr\left({{\varvec{F}}}^{T}{{\varvec{L}}}_{{\varvec{S}}}{\varvec{F}}\right)\quad s.t.\;\forall i,\;{\sum }_{j=1}^{n}{s}_{ij}=1,\;0\le {s}_{ij}\le 1,\;m>0,\;m\ne 1,\;{\varvec{F}}\in {{\varvec{R}}}^{n\times c},\;{{\varvec{F}}}^{T}{\varvec{F}}={\varvec{I}}$$
(14)
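For monitoring the alternating scheme below, the relaxed objective of Eq. (14) can be evaluated directly; a minimal sketch, where D2 denotes the matrix of squared pairwise distances:

```python
import numpy as np

def relaxed_objective(S, D2, F, beta, m):
    W = (S + S.T) / 2.0
    L = np.diag(W.sum(axis=1)) - W   # Laplacian of S
    return np.sum((S ** m) * D2) + 2.0 * beta * np.trace(F.T @ L @ F)
```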
Obviously, compared to Eq. (12), Eq. (14) is easier to optimize. We therefore propose an alternating optimization method for Eq. (14); the main steps are summarized as follows:
When \({\varvec{S}}\) is fixed, we optimize \({\varvec{F}}\). Equation (14) becomes
$$\underset{{\varvec{F}}\in {{\varvec{R}}}^{n\times c},{{\varvec{F}}}^{T}{\varvec{F}}={\varvec{I}}}{min}J=Tr\left({{\varvec{F}}}^{T}{{\varvec{L}}}_{{\varvec{S}}}{\varvec{F}}\right)$$
(15)
The optimal solution of Eq. (15) is formed by the eigenvectors corresponding to the \(c\) smallest eigenvalues of \({{\varvec{L}}}_{{\varvec{S}}}\).
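A sketch of this F-update, using a dense symmetric eigensolver for clarity (a sparse or truncated solver would be preferable for large \(n\)):

```python
import numpy as np

def update_F(S, c):
    W = (S + S.T) / 2.0
    L = np.diag(W.sum(axis=1)) - W
    _, vecs = np.linalg.eigh(L)   # eigenvalues in ascending order
    return vecs[:, :c]            # F in R^{n x c}, satisfying F^T F = I
```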
When \({\varvec{F}}\) is fixed, we optimize \({\varvec{S}}\). Equation (14) becomes
$$\underset{{\varvec{S}}}{min}\hspace{0.33em}J={\sum }_{i,j=1}^{n}{s}_{ij}^{m}{d}_{ij}^{2}+2\beta Tr\left({{\varvec{F}}}^{T}{{\varvec{L}}}_{{\varvec{S}}}{\varvec{F}}\right)\quad s.t.\;\forall i,\;{\sum }_{j=1}^{n}{s}_{ij}=1,\;0\le {s}_{ij}\le 1,\;m>0,\;m\ne 1,\;{\varvec{F}}\in {{\varvec{R}}}^{n\times c},\;{{\varvec{F}}}^{T}{\varvec{F}}={\varvec{I}}$$
(16)
Considering that \({\sum }_{i,j=1}^{n}{\Vert {{\varvec{f}}}_{i}-{{\varvec{f}}}_{j}\Vert }_{2}^{2}{s}_{ij}=2Tr({{\varvec{F}}}^{T}{{\varvec{L}}}_{{\varvec{S}}}{\varvec{F}})\), where \({{\varvec{f}}}_{i}\) denotes the i-th row of \({\varvec{F}}\in {{\varvec{R}}}^{n\times c}\), Eq. (16) becomes:
$$\underset{{\varvec{S}}}{min}\hspace{0.33em}J={\sum }_{i,j=1}^{n}\left({s}_{ij}^{m}{d}_{ij}^{2}+\beta {\Vert {{\varvec{f}}}_{i}-{{\varvec{f}}}_{j}\Vert }_{2}^{2}{s}_{ij}\right)\quad s.t.\;\forall i,\;{\sum }_{j=1}^{n}{s}_{ij}=1,\;0\le {s}_{ij}\le 1,\;m>0,\;m\ne 1$$
(17)
To optimize Eq. (17), we adopt the Lagrange multiplier method; the Lagrangian function is as follows:
$$J={\sum }_{i,j=1}^{n}\left({s}_{ij}^{m}{d}_{ij}^{2}+\beta {\Vert {{\varvec{f}}}_{i}-{{\varvec{f}}}_{j}\Vert }_{2}^{2}{s}_{ij}\right)+{\sum }_{i=1}^{n}{\alpha }_{i}\left(1-{\sum }_{j=1}^{n}{s}_{ij}\right)$$
(18)
where \({\alpha }_{i}\) is the Lagrange multiplier. Setting \(\frac{\partial J}{\partial {s}_{ij}}=0\), we have
$$\frac{\partial J}{\partial {s}_{ij}}=m{s}_{ij}^{m-1}{d}_{ij}^{2}+\beta {\Vert {{\varvec{f}}}_{i}-{{\varvec{f}}}_{j}\Vert }_{2}^{2}-{\alpha }_{i}=0$$
(19)
Then we have:
$${s}_{ij}={\left(\frac{{\alpha }_{i}-\beta {\Vert {{\varvec{f}}}_{i}-{{\varvec{f}}}_{j}\Vert }_{2}^{2}}{m{d}_{ij}^{2}}\right)}^{\frac{1}{m-1}}$$
(20)
Substituting Eq. (20) into the constraint \({\sum }_{j=1}^{n}{s}_{ij}=1\), we can solve for \({\alpha }_{i}\); substituting \({\alpha }_{i}\) back into Eq. (20), we obtain:
$${s}_{ij}={\left(\frac{{\left({\sum }_{k=1,k\ne i}^{n}\left(m{d}_{ik}^{2}+\beta {\Vert {{\varvec{f}}}_{i}-{{\varvec{f}}}_{k}\Vert }_{2}^{2}\right)\right)}^{\frac{1}{m-1}}-\beta {\Vert {{\varvec{f}}}_{i}-{{\varvec{f}}}_{j}\Vert }_{2}^{2}}{m{d}_{ij}^{2}}\right)}^{\frac{1}{m-1}}$$
(21)
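Below is a row-wise transcription of the update in Eq. (21), assuming \(m>1\); the clipping of negative numerators and the final row renormalization (to enforce \({\sum }_{j}{s}_{ij}=1\) exactly in floating point) are our own safeguards rather than part of the derivation:

```python
import numpy as np

def update_S(D2, F, beta, m, eps=1e-12):
    """One S-update per Eq. (21); D2[i, j] = d_ij^2, F has shape (n, c)."""
    n = D2.shape[0]
    # E[i, j] = ||f_i - f_j||_2^2 between rows of F.
    E = np.sum((F[:, None, :] - F[None, :, :]) ** 2, axis=2)
    S = np.zeros((n, n))
    for i in range(n):
        mask = np.arange(n) != i
        alpha = np.sum(m * D2[i, mask] + beta * E[i, mask]) ** (1.0 / (m - 1.0))
        num = np.maximum(alpha - beta * E[i], 0.0)       # clip negative values
        S[i] = (num / (m * D2[i] + eps)) ** (1.0 / (m - 1.0))
        S[i, i] = 0.0                                    # no self-similarity
        S[i] /= max(S[i].sum(), eps)                     # renormalize the row
    return S
```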
Time complexity
The main time complexity of the proposed method FSC is concentrated in Step 1, Step 2.1 and Step 2.2 of Algorithm 1. Step 1 initializes the similarity matrix \({\mathbf{S}}\), which requires \({\rm O}(n^{2} dk)\). Step 2.1 computes the eigenvalues of \({\mathbf{L}}_{{\mathbf{S}}}\), which requires \({\rm O}(n^{2} d + n^{3} )\). The time complexity of Eq. (21) is \({\rm O}(mn^{2} dk)\), so the time complexity of Step 2.2 is \({\rm O}(mn^{2} dk)\). Therefore, the total time complexity of Algorithm 1 is \({\rm O}(n^{2} dk + t(n^{2} d + n^{3} + mn^{2} dk))\), where \(t\) is the number of iterations. From the above analysis, we can observe that the time complexity of FSC depends on \(n\), \(d\) and \(k\).