Automatic interpretation of salmon scales using deep learning
Abstract
Determining the age structure of a fish species is important for understanding population and ecosystem dynamics and for stock assessment and management. For Atlantic salmon, age and other important biological information is collected from scale samples through manual qualitative interpretation. Reliable automatic methods are so far not widely utilised.
We use a state-of-the-art Convolutional Neural Network (CNN) architecture called EfficientNet and a novel data set consisting of 9056 images of salmon scales to train different CNNs for four different prediction tasks. We consider two binary classification tasks regarding the origin of the fish (wild/farmed) and the spawning state (spawner/non-spawner), as well as two regression tasks predicting the number of years spent in the river and the sea. We take advantage of transfer learning by starting our training process with a CNN pre-trained on the open-access image database ImageNet. To further test the predictive performance of our two regression CNNs, a set of 150 additional salmon scale images was analyzed for river and sea age both by the CNNs and by six human expert readers.
We find that the CNNs perform well on the two binary classification tasks and on predicting sea age, while the prediction of river age is less accurate. Estimates of river age by experts exhibit higher variance and lower levels of agreement compared to sea age, which may indicate why this task is also more difficult for the CNN. We see substantial benefit in using transfer learning. Comparing the performance of the CNN to six expert readers using standard precision measures for age reading, we confirmed the high performance of the CNN in predicting sea age, well within the top of human expertise.
Automatic interpretation of scales offers a cost-efficient and effective way to investigate fish age and life-history traits, which may further support the management of these biological resources.
salmon-scale CNN results
Comparison of different metrics for predictions from salmon scales. I have also added metrics from Greenland halibut otolith prediction for comparison. The metrics are from the validation set, except for the first row, which is for Greenland halibut and is calculated from the mean of pairs of right and left otoliths.
In the wild/farmed dataset there are 5427 wild salmon and 505 (8.5%) farmed salmon. Salmon classified as something else, such as unknown or trout, are not included in training.
In the spawning/non-spawning dataset there are 8835 non-spawning scales and 238 spawned scales (2.6%). Note: There are 422 spawners (4.6%), but images are missing for 184 of them, so those are not included in the training set.
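Both datasets are heavily imbalanced, which is why class weights appear in the results table. The sketch below shows one common way to derive such weights from the counts above: the "balanced" heuristic `n_samples / (n_classes * class_count)`, which is also what scikit-learn uses for `class_weight="balanced"`. That this repository computed its weights this way is an assumption, as are the function name `balanced_class_weights` and the mapping of labels 0 and 1 to classes; the results are merely close to the listed `{0: 5.87, 1: 0.54}` and `{0: 0.5, 1: 19}`.

```python
def balanced_class_weights(counts):
    """Return a weight per class: n_samples / (n_classes * class_count).

    counts: dict mapping class label -> number of samples in that class.
    Hypothetical helper for illustration; not taken from the repository.
    """
    total = sum(counts.values())
    n_classes = len(counts)
    return {label: total / (n_classes * n) for label, n in counts.items()}


# Assumption: in the wild/farmed task, label 0 is farmed and label 1 is wild.
wild_farmed = balanced_class_weights({0: 505, 1: 5427})

# Assumption: in the spawning task, label 0 is non-spawner and label 1 is spawner.
spawned = balanced_class_weights({0: 8835, 1: 238})

print({k: round(v, 2) for k, v in wild_farmed.items()})
print({k: round(v, 2) for k, v in spawned.items()})
```

A dictionary of this shape can be passed as the `class_weight` argument of Keras `Model.fit`, so that errors on the rare class contribute more to the loss.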
(MAPE: mean absolute percentage error)
(MCC: Matthews correlation coefficient)
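For reference, the metrics named above can be sketched in plain Python as follows. This is a minimal illustration of the standard definitions, not the code used to produce the numbers in the table; the function names are hypothetical, and reporting MAPE in percent is an assumption.

```python
import math


def mse(y_true, y_pred):
    """Mean squared error between true and predicted ages."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent.

    Undefined when a true value is 0, so this sketch assumes all true
    values are non-zero.
    """
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (0 or 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal count is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0
```

MCC is useful for the two classification tasks because, unlike accuracy, it stays low when a model simply predicts the majority class on an imbalanced dataset.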
| Species | Predict | testLOSS | MSE | MAPE | ACC | MCC | #trained | activ. f | classWeights |
|---|---|---|---|---|---|---|---|---|---|
| Greenland Halibut(1) | age | x | 2.65 | 0.124 | 0.262 | x | 8875 | linear | x |
| Greenland Halibut(2) | age | -"- | 2.82 | 0.136 | 0.294 | x | 8875 | linear | x |
| Salmon | sea age | -"- | 0.239 | 0.141 | 0.822 | x | 9056 | linear | x |
| Salmon B4(12) | sea age | 1.476 | 1.476 | 60.25 | 0.471 | x | 9056 | linear | x |
| Salmon B4(13) | sea age | 0.17 | 0.173 | 8.97 | 0.846 | x | 8286 | linear | x |
| Salmon B4 v1.1.0 | sea age | 0.1570 | 0.1570 | 8.6405 | 0.8699 | x | 8286 | linear | x |
| Salmon B4(14)patience20 | sea age | 0.158 | 0.158 | 7.88 | 0.863 | x | 8286 | linear | x |
| Salmon B4(14)rerun(lr=0.00007) | sea age | 0.158 | 0.158 | 7.1598 | 0.864 | x | 8286 | linear | x |
| Salmon B4(14)rerun(lr=0.00007) seed=9 | sea age | 0.199 | 0.199 | 7.1524 | 0.863 | x | 8286 | linear | x |
| Salmon B4(14)rerun(lr=0.00007) no weights | sea age | 1.08 | 1.08 | 53.9 | 0.496 | x | 8286 | linear | x |
| Salmon B4(15)path20batch16 | sea age | x | x | x | x | x | 8299 | linear | x |
| Salmon | river age | -"- | 0.431 | 0.252 | 0.585 | x | 6300 | linear | x |
| Salmon B4(9) | river age | 2.35 | 2.35 | x | 0.37 | x | 9056 | linear | x |
| Salmon B4(11) | river age | 0.359 | 0.359 | 19.58 | 0.618 | x | 6238 | linear | x |
| Salmon B4 v1.1.0 | river age | 0.336 | 0.336 | 17.34 | 0.632 | x | 6238 | linear | x |
| Salmon B4(16)patience20 | river age | 0.359 | 0.359 | 17.315 | 0.6297 | x | 6238 | linear | x |
| Salmon B4(16) rerun(lr=0.00008) | river age | 0.3237 | 0.3237 | 17.47 | 0.6371 | x | 6238 | linear | x |
| Salmon B4(16) rerun(lr=0.00008) seed=9 | river age | 0.3884 | 0.3884 | 17.11 | 0.6339 | x | 6238 | linear | x |
| Salmon B4(16x) rerun(lr=0.00008) no weights | river age | 0.4896 | 0.4896 | 26.70 | 0.5347 | x | 6238 | linear | x |
| Salmon missing_loss1 | river & sea | 9.4372 | 2.955 | 0.97 | 0.707 | x | 9056 | linear | x |
| Salmon missing_loss2 | river & sea | 0.5915 | 2.992 | 0.974 | 0.707 | x | 9056 | linear | x |
| Salmon missing_loss3 | river & sea | 2.0107 | 2.011 | 0.744 | 0.607 | x | 9056 | linear | x |
| Salmon (3) | Spawned | 0.113 | x | x | 0.964 | x | 9056 | softmax | {0: 0.5, 1: 19} |
| Salmon (5) | Spawned | 0.132 | x | x | 0.958 | x | 476 | softmax | {0: 1, 1: 1} |
| Salmon (8) | spawned | 0.6417 | x | x | 0.944 | x | 476 | sigmoid | {0: 1, 1: 1} |
| Salmon (18) | spawned | x | x | x | x | x | 9056 | softmax | {0: 0.5, 1: 19} |
| Salmon (6) | Wild/farmed | 0.155 | x | x | 0.9697 | x | 5917 | softmax | {0: 5.87, 1: 0.54} |
| Salmon batch=8 | Wild/farmed | 0.187 | x | x | 0.967 | x | 5919 | softmax | {0: 5.87, 1: 0.54} |
| Salmon (10)lr=0.0005 | Wild/farmed | 1.21 | x | x | 0.924 | x | 5919 | softmax | {0: 5.87, 1: 0.54} |
| Salmon (4) | Wild/farmed | 0.213 | x | x | 0.94 | x | 1010 | softmax | {0: 1, 1: 1} |
| Salmon (7) | Wild/farmed | 0.693 | x | x | 0.075 | x | 5919 | sigmoid | {0: 5.87, 1: 0.54} |
| Salmon (17) | Wild/farmed | 0.2057 | x | x | 0.96292 | x | 5919 | softmax | {0: 5.87, 1: 0.54} |
(1) is test-set
(2) is validation-set
(3) train/val/test size: 70, 15, 15
Training-set (negative example, positive example): (4861, 129)
(16) river age: NB: I forgot to set a new directory: checkpoints_salmon_sea_uten_ukjent. Patience 20.
rerun: batch size=12
(16x) river age: Same as (16) but with no weights. 150 epochs, 1600 steps per epoch, and a batch size of 12: 150 * 1600 * 12 = 2,880,000 images seen over 150 epochs. The 6246 images are augmented by rotation through 360 degrees with mirroring, which gives 360 * 2 * 6246 = 4,497,120 possible images. The best epoch was epoch 122.
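The arithmetic in note (16x) can be checked with a few lines of Python. This is only a sanity check of the numbers quoted above; the variable names are illustrative, and counting 360 rotations per image assumes whole-degree rotation steps.

```python
# Training budget quoted in note (16x).
epochs = 150
steps_per_epoch = 1600
batch_size = 12
images_seen = epochs * steps_per_epoch * batch_size

# Augmentation space: each image rotated in whole-degree steps
# (assumption) and optionally mirrored.
n_images = 6246
rotations = 360
mirror_variants = 2
possible_images = rotations * mirror_variants * n_images

# Fraction of the augmented space that could be visited at most,
# if no augmented image were ever drawn twice (about 0.64).
coverage = images_seen / possible_images

print(images_seen, possible_images, round(coverage, 2))
```

Even over 150 epochs the network sees fewer samples than there are distinct augmented images, so it is unlikely to see exactly the same input many times.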