C4O: chain-based cooperative clustering using coati optimization algorithm in WSN

ABSTRACT


INTRODUCTION
Wireless sensor networks (WSNs) consist of small, autonomous, widely scattered sensor nodes that sense physical conditions [1], [2] and relay the sensed data through the network to a sink. If all of the nodes in a WSN have the same capabilities, it is called a homogeneous WSN; otherwise, it is a heterogeneous WSN [3]. WSNs are used in a wide variety of fields, including medicine, transportation, the military, industry, the environment, and agriculture. Because communication between sensor nodes consumes a large share of their power, it is crucial to devise a routing technique that helps the sensor nodes conserve energy [4], [5].
Clustering, also known as cluster analysis, is a technique used to organise large amounts of sensor data into manageable groups based on their shared properties [6]. A critical function of the cluster head (CH) is to collect information from the cluster's nodes and forward it to the sink [7]. Power efficiency and system longevity are prioritised during CH selection in a WSN. Several criteria are used to determine the CH [8], [9]. The ideal CH is determined by optimising the number of nearest neighbours, the distance from the sink, and the residual energy. The lifespan of a WSN is also affected by other factors, but the choice of CH is especially crucial [10]. Selecting the appropriate CH is therefore critical for enhancing the performance of the network as a whole. In the search for the best CH, numerous meta-heuristic approaches have been developed, including particle swarm optimization (PSO), genetic algorithm (GA), and ant colony optimization (ACO). Yet it remains challenging to account accurately for all relevant criteria in real-time scenarios [11], [12].
ISSN: 2089-4864. C4O: chain-based cooperative clustering using coati optimization algorithm in WSN (Preet Kamal Singh)
In contrast to conventional CH selection approaches, the proposed work uses the coati optimization algorithm (COA) [13]. COA converges rapidly in its exploitation phase, and its search agents explore the solution space efficiently. The proposed approach takes advantage of COA's exploitation behaviour after convergence, while COA's global search keeps it from being confined to a local optimum, so that it can instead discover the global one. The main contributions of the proposed work are as follows:
− A chain-based clustering architecture is proposed for data dissemination in the network, as illustrated in Figure 1.
− For CH selection, we employ the COA, which was recently proposed and has demonstrated significant improvement over other optimisation algorithms.
− The parameters considered for selecting the CH are energy, node density, distance, and the network's average energy.
− The simulation results show a marked improvement over competing cluster-based routing algorithms in terms of network lifetime, stability period (first node dead), throughput, and the network's remaining energy.
Section 2 reviews the literature on CH selection in WSNs. Section 3 presents the model by which the WSN chooses its cluster head. Section 4 analyses the results and discusses what they mean. Section 5 presents our conclusion and future directions.

LITERATURE REVIEW
Various researchers have focused on the problem of CH selection; some of the important contributions are summarised here [14]-[16]. By incorporating fuzzy logic into WSNs, Baradaran and Navi [17] introduced the high-quality clustering algorithm (HQCA) with optimised CHs in 2020. HQCA uses a cluster-quality criterion to improve the intra-cluster and inter-cluster distances and to decrease the error rate during clustering. Fuzzy logic was used together with other factors, such as the distances between cluster nodes and the base station (BS), cluster node energies, and residual node energies, to determine the best CH. The experimental results showed that the proposed method achieved higher reliability and a lower error rate than more conventional methods.
Sahoo et al. [18] proposed the PSO-based energy efficient clustering and sink mobility (PSO-ECSM) method for CH selection in WSNs in 2020. PSO-ECSM addresses both the sink mobility problem and the CH selection problem, and comprehensive simulations were run to evaluate its effectiveness. Using multi-hop data transmission and the PSO-ECSM algorithm with a mobile sink, the authors evaluated the method on five criteria. Compared with more conventional models, the approach showed considerable improvements in network lifetime, throughput, and stability period.
A GA-based method for optimal CH selection with both single and multiple data sinks in heterogeneous WSNs was proposed in 2019 by Verma et al. [19]. The GA-based optimized clustering (GAOC) protocol selects optimal CHs using density, remaining energy, and distance as parameters. In addition, the multiple data sinks based GAOC (MS-GAOC) addresses the hotspot problem and reduces the communication distances between the nodes and the sink. MS-GAOC was tested empirically, and the results showed that it outperformed the other algorithms.
A misbehaviour detection approach and a secure CH selection algorithm for clustered WSNs were proposed by Ghawy et al. [20]. The work is based on a trust management strategy for clustered WSNs and solves the issue of selecting a reliable node as CH. The behaviour of the sensor nodes (SNs) served as the basis for the monitoring scheme. Test results showed that the approach is superior in protecting the network from compromised nodes becoming CHs.
To facilitate clustering in WSNs, Priya et al. [21] presented a hybrid energy management approach in 2020. The authors proposed a new method based on Lagrangian relaxation and entropy to achieve energy efficiency. In addition, the approach sustains multi-hop connectivity by changing positions. Simulation results showed that the adopted model performed better than the baseline.

Inferences drawn from the literature work
In this section, we discuss the inferences drawn from the existing work:
− The selection of the CH has been a topic of concern for various researchers [22], [23].
− Meta-heuristic approaches have been considered because multiple parameters decide CH selection; hence, an effective fitness function is computed using various optimization methods.
− Since these optimization methods have their own merits and demerits, selecting a suitable optimization method becomes a crucial task [24].
− As is evident from the literature survey, PSO has been used extensively for clustering because it converges quickly to a solution [25].
− However, PSO is weak in exploration, whereas COA has better exploration capabilities than PSO.

Background of coatis
Coatis, or coatimundis, belong to the family Procyonidae and are found in the genera Nasua and Nasuella. All coatis share a slim body, a long, non-prehensile tail used for signalling and balance, black paws, small ears, and a long, flexible snout. Adult coatis reach lengths of 33-69 cm (13-27 inches) from snout to tail tip. The average coati weighs between 2 and 8 kg and stands about 30 cm at the shoulder. Adult males can grow to almost twice the size of females and have larger, more pointed canine teeth. These dimensions apply to the South American coati and the white-nosed coati; the mountain coati is smaller. Coatis are omnivores that consume a wide variety of foods, including invertebrates such as tarantulas and small vertebrate prey such as birds, lizards, and rodents, as well as crocodile eggs and bird eggs. The green iguana is one of the coati's favourite foods. Since these large lizards prefer to spend their time in trees, coatis often hunt them in groups. Some coatis climb the tree to scare the iguana into jumping to the ground, while others wait below and attack it as soon as it falls. Coatis are themselves vulnerable to attacks from a variety of predators. They plan their attacks on iguanas cleverly, and they behave intelligently when confronting and fleeing from predators. The COA method was largely inspired by simulating this behaviour of wild coatis.
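The hunting and escape behaviour described above is what COA [13] turns into two update phases: an exploration phase modelled on the iguana hunt and an exploitation phase modelled on fleeing from predators. The following is a minimal Python sketch of that idea, written from the general published description of COA and not taken from this paper's implementation. The function name `coati_optimize`, the population size, the iteration count, the clipping of candidates to the search bounds, and the test objective are all illustrative assumptions.

```python
import random

def coati_optimize(fitness, dim, lb, ub, n_coatis=20, n_iter=100, seed=0):
    """Minimise `fitness` over the box [lb, ub]^dim with a simplified COA."""
    rng = random.Random(seed)
    clip = lambda v: min(max(v, lb), ub)
    pop = [[rng.uniform(lb, ub) for _ in range(dim)] for _ in range(n_coatis)]
    cost = [fitness(x) for x in pop]
    best_i = min(range(n_coatis), key=cost.__getitem__)
    best, best_cost = pop[best_i][:], cost[best_i]

    def try_move(i, cand):
        # Greedy acceptance: keep the candidate only if it improves coati i.
        nonlocal best, best_cost
        c = fitness(cand)
        if c < cost[i]:
            pop[i], cost[i] = cand, c
            if c < best_cost:
                best, best_cost = cand[:], c

    half = n_coatis // 2
    for t in range(1, n_iter + 1):
        iguana = best[:]  # the best member plays the role of the iguana
        # Phase 1a (exploration): the first half climbs the tree towards the iguana.
        for i in range(half):
            step = rng.choice((1, 2))
            cand = [clip(x + rng.random() * (g - step * x))
                    for x, g in zip(pop[i], iguana)]
            try_move(i, cand)
        # Phase 1b (exploration): the iguana falls to a random spot on the ground
        # and the second half moves towards it, or away if it is worse.
        for i in range(half, n_coatis):
            ground = [rng.uniform(lb, ub) for _ in range(dim)]
            step = rng.choice((1, 2))
            if fitness(ground) < cost[i]:
                cand = [clip(x + rng.random() * (g - step * x))
                        for x, g in zip(pop[i], ground)]
            else:
                cand = [clip(x + rng.random() * (x - g))
                        for x, g in zip(pop[i], ground)]
            try_move(i, cand)
        # Phase 2 (exploitation): escape to a nearby safe position; the local
        # search range shrinks as the iteration counter t grows.
        lo_t, hi_t = lb / t, ub / t
        for i in range(n_coatis):
            cand = [clip(x + (1 - 2 * rng.random())
                         * (lo_t + rng.random() * (hi_t - lo_t)))
                    for x in pop[i]]
            try_move(i, cand)
    return best, best_cost

# Usage on a toy objective (sphere function); not a WSN fitness function.
best, cost = coati_optimize(lambda x: sum(v * v for v in x), dim=2, lb=-5.0, ub=5.0)
```

In a CH-selection setting, a position vector would have to be mapped to a set of candidate CHs before the fitness is evaluated; that encoding is not specified here and is left out of the sketch.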

PROPOSED WORK
Network presumptions
This model makes the following assumptions about the network and its sensor nodes:
− All nodes are stationary after deployment. These low-cost, small nodes collect data and relay it to the sink.
− The nodes have different energy levels, making the network heterogeneous. The protocol considers nodes at the normal, intermediate, and advanced energy levels.
− Once a node's energy is depleted, it cannot be replenished.
− The nodes are deployed over a square field of area A = M × M. The sink is positioned at the centre of the field.
− As soon as the nodes are deployed, each is given a unique identifier.
− Security concerns of wireless data transmission are outside the scope of this paper.
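The deployment assumptions above can be sketched in a few lines of Python. This is an illustration only: the uniform random placement, the function name `deploy_nodes`, and the example values (100 nodes, a 100 m side) are assumptions, not settings reported in this paper.

```python
import random

def deploy_nodes(n_nodes, side, seed=0):
    """Place stationary nodes at random in a side x side square field.

    Returns a dict mapping a unique node identifier to its (x, y) position,
    plus the sink position at the centre of the field.
    """
    rng = random.Random(seed)
    nodes = {
        node_id: (rng.uniform(0, side), rng.uniform(0, side))
        for node_id in range(1, n_nodes + 1)  # identifiers assigned at deployment
    }
    sink = (side / 2, side / 2)
    return nodes, sink

nodes, sink = deploy_nodes(n_nodes=100, side=100.0)
```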

Operating phases of proposed work
Set up phase
The proposed method is used to pick the CH at this stage. The network's nodes are of mixed types: three tiers of heterogeneity are built into the network, which results in nodes with different amounts of available energy [26]. In (1)-(9), $N_{NOR}$, $N_{INT}$, and $N_{HGH}$ stand for the numbers of low-energy, medium-energy, and high-energy nodes in the network, respectively. The proportions of nodes with medium and high energy levels are denoted by λ1 and µ1, respectively.
The intermediate- and high-energy nodes each start with twice the energy of the low-energy nodes. From these quantities, the total energy of the network, denoted $E_{Tot}$, can be calculated. $E_{HGH}$, $E_{INTI}$, and $E_{NORM}$ denote the energies of the high-energy, intermediate, and normal nodes, respectively.
In (1)-(9), λ1 and µ1 represent the proportions of intermediate and advanced nodes, respectively, and λ11 and µ11 represent the energy portions of these nodes. The total network energy is calculated for inclusion in the fitness function, which captures the factors that drive the selection of the CH.
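Equations (1)-(9) are not reproduced in this text, so the following Python sketch only illustrates one common way of setting up a three-tier energy model. It assumes that the intermediate and advanced nodes carry (1 + extra) times the normal node energy; the function name, the parameter names (`frac_int`, `frac_adv`, `extra_int`, `extra_adv`), and the example values are hypothetical and are not taken from the paper's equations.

```python
def three_tier_energy(n_nodes, e_normal, frac_int, frac_adv, extra_int, extra_adv):
    """Node counts and total energy for an assumed three-tier heterogeneous network."""
    n_int = round(frac_int * n_nodes)   # intermediate nodes
    n_adv = round(frac_adv * n_nodes)   # advanced (high-energy) nodes
    n_nor = n_nodes - n_int - n_adv     # remaining nodes are normal
    e_int = e_normal * (1 + extra_int)  # assumed form, not the paper's equation
    e_adv = e_normal * (1 + extra_adv)  # assumed form, not the paper's equation
    e_tot = n_nor * e_normal + n_int * e_int + n_adv * e_adv
    return {"N_NOR": n_nor, "N_INT": n_int, "N_HGH": n_adv, "E_Tot": e_tot}

# Example: 100 nodes, 0.5 J normal energy, 30% intermediate and 20% advanced
# nodes, both with twice the normal energy (extra = 1.0).
model = three_tier_energy(100, 0.5, frac_int=0.3, frac_adv=0.2,
                          extra_int=1.0, extra_adv=1.0)
```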

Fitness function for C4O
A fitness function is an expression that is optimised by increasing or decreasing the values of a set of performance parameters. The fitness of an individual (a candidate solution) is determined by a number of factors, and the computed fitness function takes these into account. The fitness function uses the following parameters.

Fitness parameters
The value of each fitness parameter (FP) is computed from its own expression, and the weight assigned to a parameter determines how strongly it influences the optimum. Here, energy efficiency and network longevity are prioritised as fitness factors. The following criteria are considered in the fitness function design.
a) Residual energy
The energy parameter chooses the CH based on the sensor node's residual energy. Since the sensor nodes' energy is depleted over the course of each round, the residual energy of the nodes must be tracked in order to choose one as CH. The first fitness parameter ($FP_{1st}$) is defined in (10). In (10), the residual energy of the $i$-th sensor node appears in the denominator, so that minimising $FP_{1st}$ favours the node with the highest remaining energy.
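Since the body of (10) is not reproduced here, the sketch below shows only one plausible reading of the description: a sum of reciprocals of residual energies, so that nodes with more energy left contribute a smaller cost. The exact form, the function name `fp_residual_energy`, and the skipping of dead nodes are assumptions.

```python
def fp_residual_energy(residual_energies):
    """Assumed form of the first fitness parameter: sum of 1 / E_res(i).

    Nodes with zero energy are treated as dead and skipped (an assumption),
    which also avoids division by zero.
    """
    return sum(1.0 / e for e in residual_energies if e > 0)

# A candidate with more residual energy yields a smaller (better) value.
low_cost = fp_residual_energy([1.0])
high_cost = fp_residual_energy([0.5])
```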

b) Distance between node and sink
Because the nodes are randomly distributed over the network, their distances from the sink vary. Energy usage can be lowered by minimising the distance between the sensor nodes and the sink, which makes distance an important consideration when choosing a CH. To save power, the distances between the cluster nodes, as well as between the cluster centres and the sink, should be as small as possible. The distance is calculated using the Euclidean distance formula, which takes the coordinates of the two points as inputs.
As shown in (11), the second fitness parameter ($FP_{2nd}$) brings the distance criterion into the fitness function used for CH selection.
For each node, $FP_{2nd}$ totals the distance costs, where $i$ runs from 1 to $N$ (the total number of nodes). In (11), $D_{N(i)-Sink}$ represents the Euclidean distance between the $i$-th node and the sink, and $D_{AVG(N(i)-Sink)}$ represents the average distance between the nodes and the sink. The lower the value of this parameter, the better the network's CH selection will be.
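The distance parameter can be illustrated with a short Python sketch. The body of (11) is not reproduced here, so the ratio form used below (each node-to-sink distance divided by an average node-to-sink distance) is an assumption based on the symbols named in the text. Note that if the average were taken over exactly the same nodes that are summed, the result would always equal the number of nodes; the sketch therefore scores a set of candidate nodes against the network-wide average, which is also an assumption.

```python
import math

def fp_distance(candidates, all_nodes, sink):
    """Assumed form of the second fitness parameter.

    candidates: positions of the nodes being scored (for example, candidate CHs)
    all_nodes:  positions of every node, used for the average distance to the sink
    """
    d_avg = sum(math.dist(p, sink) for p in all_nodes) / len(all_nodes)
    return sum(math.dist(p, sink) / d_avg for p in candidates)

sink = (0.0, 0.0)
all_nodes = [(3.0, 4.0), (6.0, 8.0)]         # distances 5 and 10 from the sink
near = fp_distance([(3.0, 4.0)], all_nodes, sink)
far = fp_distance([(6.0, 8.0)], all_nodes, sink)
```

A candidate closer to the sink gives a smaller value, matching the statement that lower is better.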
c) Node density
The CH is also chosen according to the number of surrounding nodes, since it is vital to minimise the distance between the cluster nodes and the CH. This is done by finding the nodes that lie closest to one another. The number of nearby nodes is captured by the third fitness parameter ($FP_{3rd}$) in (12). For the purpose of computing cluster member nodes, (12) uses the notation $D_{(Nd(i)-Nd(j))}$ for the Euclidean distance between each pair of nodes in the cluster, and $N_{CLUS}$ represents the total number of cluster nodes. $FP_{3rd}$ therefore needs to be kept low for the CH to use energy efficiently.
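As an illustration of the node density parameter, the following Python sketch averages the Euclidean distance from a candidate CH to its cluster members. The body of (12) is not reproduced in this text, so dividing the summed distances by the number of cluster nodes is an assumed form, and the function name is hypothetical.

```python
import math

def fp_node_density(candidate, members):
    """Assumed form of the third fitness parameter.

    Mean Euclidean distance between a candidate CH and its cluster members;
    a smaller value means the candidate sits in a denser neighbourhood.
    """
    if not members:
        return 0.0
    return sum(math.dist(candidate, m) for m in members) / len(members)

value = fp_node_density((0.0, 0.0), [(3.0, 4.0), (0.0, 5.0)])
```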

d) Network's average value of energy
Because it is important to minimise the average energy spent between the cluster nodes and the CH, the CH is also chosen using this metric. The fourth fitness parameter ($FP_{4th}$) deals with the average energy of the network and is computed by (13).
In (13), $E(i)$ stands for the energy of the $i$-th node in the network, and $N$ stands for the total number of nodes in the network. Thus, maximising $FP_{4th}$ is necessary for optimal CH selection.
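The average network energy described here can be computed as in the short Python sketch below. It assumes that (13) is the arithmetic mean of the node energies, which follows from the description but is not confirmed by an equation in this text; the function name is hypothetical.

```python
def fp_average_energy(node_energies):
    """Assumed form of the fourth fitness parameter: mean energy over all N nodes."""
    return sum(node_energies) / len(node_energies)

avg = fp_average_energy([0.5, 1.0, 1.5])
```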

The optimisation process and its fitness function
The fitness function combines the fitness parameters into a single expression, as shown in (14). For the best CH selection, the fitness function in (14) is minimised. The weights applied to the fitness parameters are denoted by α1, β1, γ1, and Ω1 in (15). All of these parameters are assigned values on the same scale, and according to (15) the weights sum to 1.
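A minimal Python sketch of the weighted combination follows. The weighted-sum form is an assumption, since the body of (14) is not reproduced here; the only property taken directly from the text is that the weights must sum to 1. How the fourth parameter, which is to be maximised, enters a fitness that is minimised is not specified, so the sketch expects the caller to pass every term already oriented so that smaller is better.

```python
def combined_fitness(fp_values, weights, tol=1e-9):
    """Weighted sum of fitness parameters (assumed form of the combined fitness).

    fp_values: the fitness parameter values, each oriented so that lower is better
    weights:   one weight per parameter; they must sum to 1
    """
    if len(fp_values) != len(weights):
        raise ValueError("need exactly one weight per fitness parameter")
    if abs(sum(weights) - 1.0) > tol:
        raise ValueError("weights must sum to 1")
    return sum(w * f for w, f in zip(weights, fp_values))

# Hypothetical parameter values and weights, for illustration only.
score = combined_fitness([1.0, 2.0, 3.0, 4.0], [0.4, 0.3, 0.2, 0.1])
```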

Steady state phase
As soon as the CHs have been chosen by the proposed method, the positions of the search agents are updated. After that, in the data sending phase, packet data transfer continues. All the nodes send their data to their CH node, which aggregates it and then sends it to the sink.
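One round of the data sending phase can be sketched as follows. The nearest-CH assignment and the use of the mean as the aggregation function are assumptions made for illustration; the text does not state how members are assigned or how readings are aggregated, and the function name `run_round` is hypothetical.

```python
import math

def run_round(readings, positions, cluster_heads):
    """Collect one reading per node at its CH and aggregate it for the sink.

    readings:      node id -> sensed value for this round
    positions:     node id -> (x, y)
    cluster_heads: list of node ids acting as CH in this round
    Returns CH id -> aggregated value that the CH would forward to the sink.
    """
    clusters = {ch: [readings[ch]] for ch in cluster_heads}
    for node, pos in positions.items():
        if node in clusters:
            continue  # a CH keeps its own reading
        nearest = min(cluster_heads, key=lambda ch: math.dist(pos, positions[ch]))
        clusters[nearest].append(readings[node])
    # Aggregation by averaging (assumed): one packet per CH goes to the sink.
    return {ch: sum(vals) / len(vals) for ch, vals in clusters.items()}

positions = {1: (0.0, 0.0), 2: (1.0, 0.0), 3: (10.0, 0.0), 4: (9.0, 0.0)}
readings = {1: 10.0, 2: 20.0, 3: 30.0, 4: 50.0}
to_sink = run_round(readings, positions, cluster_heads=[1, 3])
```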

SIMULATION RESULTS
The simulation results are given in Figures 2-6. The performance comparison of the proposed work is given in Figure 3. The results show that the proposed work, C4O, outperforms the other protocols both in first node dead and at the different percentages of dead nodes.
Each node's energy consumption during communication with other nodes and with the data collection sink is calculated using the radio energy model [26]. The nodes begin consuming energy in accordance with this model as soon as the second phase, the data transmission phase, begins. A sensor node's transmission energy increases with the square of the transmission distance. Distance is therefore an important factor in the remaining energy of the nodes that take part in message exchange.
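The distance dependence can be illustrated with the first-order radio model that is widely used in WSN clustering studies. The sketch below assumes that form; the constants are values commonly used in the literature and are not reported in this text. In this model the d² term applies below a crossover distance, and a d⁴ multipath term applies beyond it.

```python
import math

# Commonly used constants for the first-order radio model (assumed values).
E_ELEC = 50e-9       # J/bit, electronics energy for transmitting or receiving
EPS_FS = 10e-12      # J/bit/m^2, free-space amplifier energy
EPS_MP = 0.0013e-12  # J/bit/m^4, multipath amplifier energy
D0 = math.sqrt(EPS_FS / EPS_MP)  # crossover distance between the two regimes

def tx_energy(bits, distance):
    """Energy in joules to transmit `bits` over `distance` metres."""
    if distance < D0:
        return bits * (E_ELEC + EPS_FS * distance ** 2)
    return bits * (E_ELEC + EPS_MP * distance ** 4)

def rx_energy(bits):
    """Energy in joules to receive `bits`."""
    return bits * E_ELEC

# Energy to send and receive a 4000-bit packet over 50 m.
e_tx = tx_energy(4000, 50.0)
e_rx = rx_energy(4000)
```

Doubling the distance within the free-space regime quadruples the amplifier term, which is why the distance to the CH and to the sink features in the fitness function.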
It is evident from Figures 3-6 and Table 1 that the proposed work improves the network lifetime at the various stages of dead nodes. The stability period (first node dead) is improved by 104.1% compared with the PSO-based dual sink mobility (PSODSM) protocol. Further, the 75% node dead stage shows a 142% improvement over the PSODSM protocol. This improvement is due to the optimised CH selection and to the hybrid optimization method, which provides the network with an optimised solution at a faster rate.

CONCLUSION
The primary focus of WSN research is on increasing energy efficiency and extending network lifetime. To address this concern, researchers have proposed a multitude of routing algorithms. CH selection is an NP-hard (non-deterministic polynomial-time hard) problem and deserves serious attention. In this paper, CH selection is performed using the COA algorithm. The simulations were performed in MATLAB, and the outcomes show that the proposed algorithm substantially improves both the lifetime and the stability period of the network. The network performance is also improved in terms of throughput and the network's residual energy. The reasons behind this improvement are the fast convergence and wide exploration of the COA, and the CH selection parameters, i.e., energy, distance, neighbouring nodes, and the network's average energy. However, the proposed work has some limitations that can be addressed in the future. Firstly, the assumptions about the physical medium pose challenges that need to be handled for real-time implementation. Further, security should be addressed, as wireless communication is vulnerable to various attacks. Lastly, sink mobility can be considered to check the performance of the proposed work under different use cases.

Figure 1. Proposed architecture of COA