Anno: 
2018
Nome e qualifica del proponente del progetto: 
sb_p_925165
Abstract: 

Emerging underwater applications such as critical infrastructure monitoring, coastline protection, ecosystem analysis, pollution control, and the prediction of disasters such as underwater seismic and volcanic events are becoming increasingly sophisticated and produce ever more complex data that need to be delivered to collection points on the surface. However, due to the time-varying and unstable underwater environment, designing and deploying a reliable, low-latency, and energy-efficient underwater sensor network remains a challenge. This research aims to design an adaptive, model-free reinforcement-learning-based communication algorithm for underwater sensor networks that keeps up with the dynamic changes of the environment, acquires knowledge of the underwater channel conditions in real time, and automatically adapts, deciding how to transmit data packets accordingly. A model-free reinforcement learning approach is well suited to the time-varying underwater scenario, since no predefined model or static assumptions about the environment are required in advance: knowledge of the environment can be learned and estimated through experience in the field. The decision on how to send packets will be made based on transmission cost, link quality, and battery status. These decision factors will be updated according to the current state of the underwater network and are indeed the key to making the system adaptive to the environment. The performance of this approach will then be evaluated with "SUNSET", a simulator developed by the Sapienza University UWSNs group that realistically models a wide variety of details of the underwater channel and environment, before deployment in the field.

ERC: 
PE6_2
PE6_7
PE6_11
Innovatività: 

In this framework, unlike some earlier solutions, underwater nodes do not need to be anchored to the sea floor or equipped with external tools or devices, communication is not limited to specific node pairs, and no static assumptions or information (such as geographic information about the state of the network) are required. The feature that differentiates this algorithm from those solutions is its model-free, adaptive approach, which copes with the dynamic and time-varying underwater environment. In a model-free approach, the agent does not need a predefined model of the environment, which makes it a good fit for time-varying underwater sensor networks: the conditions of the environment can be learned, updated, and estimated through experience and exploration in real time.

In this work, when a packet needs to be transmitted to a collection point on the surface (i.e., the sink) in an underwater sensor network, the decision of how to send the packet, and which links to use for the transmission, will be made by formulating a Markov Decision Process (a framework for stochastic, dynamic decision problems) and finding an optimal solution to the formulated problem by reinforcement learning (a machine learning technique, inspired by behaviorist psychology, whose goal is to find an optimal policy by maximizing a cumulative reward function).
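As a minimal sketch of this formulation, each node's current packet-holding state can be mapped to an MDP state, its candidate next hops to actions, and the policy learned with a standard model-free Q-learning update. The topology, parameter values, and function names below are illustrative assumptions, not the project's actual design:

```python
import random

# Hypothetical toy topology: each node's candidate next hops toward the sink.
NEIGHBORS = {"n1": ["n2", "n3"], "n2": ["sink"], "n3": ["sink"], "sink": []}

# Q[state][action]: estimated long-term value of forwarding from `state` via `action`.
Q = {s: {a: 0.0 for a in acts} for s, acts in NEIGHBORS.items()}

ALPHA, GAMMA, EPS = 0.5, 0.9, 0.1  # learning rate, discount, exploration rate

def choose_next_hop(state):
    """Epsilon-greedy selection over candidate next hops: mostly exploit
    the best-known link, occasionally explore an alternative."""
    if random.random() < EPS:
        return random.choice(NEIGHBORS[state])
    return max(Q[state], key=Q[state].get)

def update(state, action, reward, next_state):
    """Model-free Q-learning update: no channel model is assumed; the value
    estimates are refined from observed rewards after each transmission."""
    future = max(Q[next_state].values()) if Q[next_state] else 0.0
    Q[state][action] += ALPHA * (reward + GAMMA * future - Q[state][action])
```

Because the update uses only observed rewards and next states, no predefined model of the underwater channel is required, matching the model-free rationale above.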

The decision of how to send the packets is made based on:

1) Link quality: The link quality, which determines the probability of a successful node-to-node transmission, will be estimated from the recent history of packet transmissions and updated every time the link is used to transmit a packet.

2) Node battery status: The battery status must be considered to prevent network failures and extend network lifetime. Taking the battery status of nodes into account when choosing a forwarder balances the workload and prevents network hotspots, since it avoids nodes with low batteries, which were probably the ones used most frequently in recent transmissions.

3) Energy cost: Every attempt to forward a packet consumes energy, occupies channel bandwidth, and adds to the number of hops to the sink. Considering this factor leads to choosing shorter paths and hence decreases transmission delay.
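One common way to estimate the first factor from the recent history of transmissions, as the text describes, is an exponentially weighted moving average of per-packet outcomes. The smoothing weight below is an illustrative assumption, not a value from the proposal:

```python
def update_link_quality(prev_quality, success, beta=0.8):
    """EWMA estimate of link quality in [0, 1], updated after each
    transmission on the link: 1.0 on a successful (acknowledged)
    transmission, 0.0 on a failure. `beta` weights the history."""
    outcome = 1.0 if success else 0.0
    return beta * prev_quality + (1.0 - beta) * outcome
```

A higher `beta` makes the estimate smoother but slower to track the time-varying channel; tuning it is part of adapting to the environment.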

The cumulative reward is a function of all the above factors with their assigned weights. The weights can be set and altered according to the priorities and the state of the network. The goal is to find the optimal policy (that is, the optimal way to transmit packets from the underwater nodes to the sink) by making decisions such that the cumulative reward is maximized.
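Such a weighted reward could be sketched as follows; the weight values and the sign convention (penalizing energy cost) are illustrative assumptions, since the proposal leaves the weights to be tuned to the network state:

```python
def reward(link_quality, battery_level, energy_cost,
           w_link=0.5, w_batt=0.3, w_energy=0.2):
    """Per-transmission reward: weighted combination of the three decision
    factors. Good links and well-charged forwarders are rewarded; each
    forwarding attempt's energy cost is penalized, favoring shorter paths."""
    return (w_link * link_quality
            + w_batt * battery_level
            - w_energy * energy_cost)
```

Raising `w_batt`, for instance, shifts traffic away from depleted nodes, while raising `w_energy` biases the learned policy toward fewer hops.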

Updating the decision factors and their assigned weights based on the current state of the network is the key that makes this algorithm adaptive to the environment.

The performance of this system will then be evaluated with SUNSET [12], a simulator developed by the Sapienza University UWSNs group that realistically models a wide variety of details of the underwater channel and environment, before deployment in the field.

[12] C. Petrioli, R. Petroccia, and D. Spaccini, "SUNSET version 2.0: Enhanced framework for simulation, emulation and real-life testing of underwater wireless sensor networks," in Proceedings of ACM WUWNet 2013, Kaohsiung, Taiwan, November 11-13, 2013, pp. 1-8.

Codice Bando: 
925165

© Università degli Studi di Roma "La Sapienza" - Piazzale Aldo Moro 5, 00185 Roma