
Underwater Acoustics Data Challenge Workshop 2023: Using Machine Learning to Identify Bioacoustic Signals

  • Writer: A N
  • Oct 1, 2023
  • 5 min read

A two-day residential Underwater Acoustics Data Challenge Workshop 2023 was organized by the Special Interest Group for Underwater Acoustics (SIGUA) of the UK Acoustics Network on 11-12 September 2023 at Guyers House, near Bath, UK. The workshop's objective was to explore solutions to research challenges set by industry.


Thirty people attended the workshop, mainly from the UK but some also from France and the Netherlands. I was the only representative from Canada. The event was organized as a hackathon, with underwater ocean environment data made available to the participants in advance. Over the two days, participants worked in teams, based on the challenges that interested them, to develop innovative solutions. Each challenge was anchored by a company, including Thales, Ultra, and ORE Catapult. Representatives from the companies were present to provide contextual information and answer questions.



When I worked on my www.MonitorMyOcean.com project to measure the quietening of global oceans during the COVID-19 lockdowns, I spent a lot of time gathering hydrophone data from different ocean observatories. While some of these sources were comfortable with sharing the data, many were not, as hydrophone data has military uses, too. This also explained why representatives from companies such as Thales, which undertakes contracts for the UK Government and the military, could not fully disclose all the purposes for which they use hydrophone data.


The Hackathon and Team-Making


On the evening prior to the start of the event, I met several participants, including the organizer, Dr. Alan Hunter, as well as researchers from the companies behind the three challenges. This allowed me to better understand the participants' backgrounds, their interests, what they expected from the event, and which challenge interested them the most. Being the only high school student in the group, it was a wonderful learning opportunity to see a broad gathering of people from different universities and institutions interested in the specific subject of underwater acoustics.


The next day, after a hearty breakfast, everyone met in the common room. Alan gave an introductory talk about the workshop objectives and the schedule ahead. It was followed by a short presentation from representatives of companies acting as anchors for each of the three challenges. Many participants had already decided upon the challenge they were most interested in and had identified groups they would like to work with.


Working on Challenge 3: Using Machine Learning to Identify Bioacoustic Signals


I was interested in Challenge 3, as I had been working on acoustic signals for a few months using hydrophone data from the Monterey Bay Aquarium Research Institute (MBARI), and also as a member of a working group of the International Quiet Ocean Experiment (IQOE).


There were around ten people interested in the challenge. We all went into the courtyard and started brainstorming ideas on how to tackle the dataset. As some of the participants had already viewed the datasets and had a good understanding of them, a lot of great ideas were generated. We decided to break into smaller groups to build upon those ideas. Three sub-groups were formed. I teamed up with Matthew Gracia, a master's student from the University of Bath, and Valentin Bordoux, a master's student from France.



As Matthew had been working on the DCLDE dataset for his master's degree, we were able to get a head start on the project. The main issue we discovered was that the annotations needed to localize calls across multiple sonobuoys were missing. Labelling 32 sonobuoys manually was a challenging task, so the university researchers who created the annotations for the DCLDE dataset usually annotated only the first observed time of a specific whale call. This meant there were many unannotated signals.


Our immediate objective was to use the pre-existing annotated data samples to train a machine-learning model to identify unannotated calls in the dataset. The model would slide a window across each recording and detect whether a given 5-second segment contained a signal. To do this, we needed to prepare a dataset containing samples of segments with signals in them as well as segments without any signal.
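To give a feel for the sliding-window idea, here is a toy numpy sketch. The sample rate, hop size, and signal are made up for illustration, and a simple RMS-energy score stands in for the trained model:

```python
import numpy as np

def sliding_window_detect(audio, sample_rate, score_fn, win_s=5.0, hop_s=2.5):
    """Score every 5-second window of a 1-D signal; return (start_time, score) pairs."""
    win = int(win_s * sample_rate)
    hop = int(hop_s * sample_rate)
    results = []
    for start in range(0, len(audio) - win + 1, hop):
        segment = audio[start:start + win]
        results.append((start / sample_rate, score_fn(segment)))
    return results

def energy_score(segment):
    """Stand-in scorer: RMS energy (the real detector would be the trained model)."""
    return float(np.sqrt(np.mean(segment ** 2)))

fs = 1000  # toy sample rate
t = np.arange(0, 20 * fs) / fs
audio = 0.01 * np.random.default_rng(0).standard_normal(t.size)
audio[8 * fs:9 * fs] += np.sin(2 * np.pi * 50 * t[:fs])  # inject a loud "call" at 8-9 s

detections = sliding_window_detect(audio, fs, energy_score)
best_start, best_score = max(detections, key=lambda r: r[1])
print(best_start)  # a window that contains the injected call scores highest
```

In the real pipeline, the score function would be the CNN's output probability rather than raw energy, but the windowing logic is the same.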



To build the positive examples, we created 5-second windows centred on each annotated call. For the negative examples, we picked random timestamps that did not overlap with any annotated call. Although these windows may have contained a few unannotated calls, the duration of any such call would have been minuscule compared to the duration in which there were no calls at all.
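That windowing scheme can be sketched like this. The function name and parameters are my own for illustration; the real dataset was cut from actual recordings and annotation files:

```python
import numpy as np

def make_training_windows(audio, fs, call_times, win_s=5.0, n_negatives=None, seed=0):
    """Cut positive 5-second windows centred on annotated calls, plus random
    negative windows that do not overlap any annotation."""
    rng = np.random.default_rng(seed)
    half = win_s / 2.0
    win = int(win_s * fs)
    duration = len(audio) / fs

    positives = []
    for t in call_times:
        start = int((t - half) * fs)
        if start >= 0 and start + win <= len(audio):
            positives.append(audio[start:start + win])

    negatives = []
    n_negatives = n_negatives or len(positives)
    while len(negatives) < n_negatives:
        t0 = rng.uniform(0, duration - win_s)
        # reject any candidate window that overlaps an annotated call
        if any(t0 < t < t0 + win_s for t in call_times):
            continue
        start = int(t0 * fs)
        negatives.append(audio[start:start + win])
    return np.array(positives), np.array(negatives)

fs = 100
audio = np.random.default_rng(1).standard_normal(60 * fs)  # 60 s toy recording
pos, neg = make_training_windows(audio, fs, call_times=[10.0, 30.0, 50.0])
print(pos.shape, neg.shape)  # (3, 500) (3, 500)
```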



We trained a Convolutional Neural Network (CNN) on the positive and negative examples. As we only had 36 hours to work on our project, we didn't have time for much preprocessing and fine-tuning, which is essential for getting higher accuracy. Consequently, we could only achieve a test accuracy of 65%. Although low, it demonstrated that a highly accurate model capable of detecting new unannotated calls was feasible. We applied our model to a sample buoy and were able to detect 7 new calls that had not been annotated before!
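For readers curious what such a detector looks like under the hood, here is a minimal numpy sketch of a CNN forward pass on a spectrogram-like input. The layer count, kernel sizes, and random weights are illustrative assumptions only, not our actual trained model:

```python
import numpy as np

def conv2d(x, kernels, bias):
    """Valid 2-D convolution: x (H, W), kernels (K, kh, kw) -> (K, H-kh+1, W-kw+1)."""
    K, kh, kw = kernels.shape
    H, W = x.shape
    out = np.empty((K, H - kh + 1, W - kw + 1))
    for k in range(K):
        for i in range(H - kh + 1):
            for j in range(W - kw + 1):
                out[k, i, j] = np.sum(x[i:i + kh, j:j + kw] * kernels[k]) + bias[k]
    return out

def cnn_score(spectrogram, params):
    """Conv -> ReLU -> global average pool -> linear -> sigmoid."""
    h = np.maximum(conv2d(spectrogram, params["kernels"], params["bias"]), 0.0)
    pooled = h.mean(axis=(1, 2))            # one value per feature map
    logit = pooled @ params["w"] + params["b"]
    return 1.0 / (1.0 + np.exp(-logit))     # probability the window holds a call

rng = np.random.default_rng(0)
params = {
    "kernels": rng.standard_normal((4, 3, 3)) * 0.1,
    "bias": np.zeros(4),
    "w": rng.standard_normal(4),
    "b": 0.0,
}
spec = rng.standard_normal((32, 50))  # toy 32-bin x 50-frame spectrogram
p = cnn_score(spec, params)
print(0.0 < p < 1.0)  # True
```

A real model would have several convolutional layers, learned weights, and proper spectrogram preprocessing, which is exactly the fine-tuning we ran out of time for.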

We planned to continue working to improve the model and extend the annotations to the calls recorded on the surrounding hydrophones.




Group Presentations


On the last day, all the teams created presentations of the work they did and presented them to the entire group. 


For Challenge 1, three teams presented their work. I found it interesting that one of the teams correlated signals detected from the fibre optic cables with the AIS vessel data. They used different segments of the cables to triangulate the distance to ships. Teams were also able to observe the overlap between marine mammal calls and the noise of the ship. In one instance, the sound of a ship overshadowed the mammal call when it passed close by, validating the known fact that growing shipping noise is hampering the ability of marine mammals to use sound for communication and navigation. Some of the teams did an additional analysis to determine the vessel and propeller type from the sound signals.



Two teams tackled Challenge 2 on sonar tracing. One of the teams tried three different models: convolutional autoencoders to generate traces from a full image, a CNN model that predicted segments, and a CNN/RNN hybrid that attempted tracing in real time.

Four teams tackled Challenge 3. One team went in-depth into a single observed scenario, carefully analyzed a signal detected by 4 hydrophones, and used an algorithm to improve the time stamps for each call, allowing for accurate TDOA localization. Another team also went in-depth, tracing one call across 6 hydrophones and identifying the directional property of a North Atlantic right whale (NARW) call. Our group took a broader approach aimed at the entire dataset. Our model would automatically create annotations, allowing the models created by other teams to do more in-depth analysis for each specific scenario in the dataset.
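TDOA (time difference of arrival) localization starts from estimating how much later the same call arrives at one hydrophone than at another, which can be done with cross-correlation. Here is a toy numpy sketch with a synthetic call and a known delay; the signal shape and sample rate are invented for the example:

```python
import numpy as np

def tdoa(sig_a, sig_b, fs):
    """Estimate the delay of sig_b relative to sig_a (in seconds) via cross-correlation."""
    corr = np.correlate(sig_b, sig_a, mode="full")
    lag = np.argmax(corr) - (len(sig_a) - 1)
    return lag / fs

fs = 1000
t = np.arange(0, 1.0, 1 / fs)
call = np.sin(2 * np.pi * 100 * t) * np.exp(-5 * t)  # toy decaying pulse
delay_samples = 37                                    # true offset: 37 ms
a = np.concatenate([call, np.zeros(delay_samples)])
b = np.concatenate([np.zeros(delay_samples), call])
print(tdoa(a, b, fs))  # 0.037
```

With the delay known between several hydrophone pairs and the hydrophone positions, the call can then be localized geometrically, which is why accurate per-buoy time stamps mattered so much to that team.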



Thanks, and Until the Next Workshop

I learned a lot during this hackathon-styled workshop. The opportunity to work together with many researchers and young professionals allowed me to pick up knowledge of new tools, algorithms, and developments in the area of underwater acoustics.

Thank you to the UK Acoustics Network for providing the overnight stay, with breakfast, lunch, and dinner, and transportation to and from the Bath Spa train station. I hope to join you again at the next workshop.
