author:"Alexandros Iosifidis" | Pollux - Fachinformationsdienst Politikwissenschaft

6 results

Open Access #1 (2021)

Feedforward neural networks initialization based on discriminant learning

Kateryna Chumachenko; Alexandros Iosifidis; Moncef Gabbouj

In this paper, a novel data-driven method for weight initialization of Multilayer Perceptrons and Convolutional Neural Networks based on discriminant learning is proposed. The approach relaxes some of the limitations of competing data-driven methods, including unimodality assumptions, limitations on the architectures related to limited maximal dimensionalities of the corresponding projection spaces, as well as limitations related to high computational requirements due to the need for eigendecomposition on high-dimensional data. We also consider the assumptions the method makes about the data and propose a way to account for them in the form of a new normalization layer. The experiments on three large-scale image datasets show improved accuracy of the trained models compared to competing random-based and data-driven weight initialization methods, as well as better convergence properties in certain cases. This work is supported by Business Finland under project 5G Vertical Integrated Industry for Massive Automation (5G-VIIMA). A. Iosifidis acknowledges funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No 957337 (MARVEL).

Access (Open Access)

BASE

Open Access #2 (2021)

Improving the Accuracy of Early Exits in Multi-Exit Architectures via Curriculum Learning

Arian Bakhtiarnia; Qi Zhang; Alexandros Iosifidis

Deploying deep learning services for time-sensitive and resource-constrained settings such as IoT using edge computing systems is a challenging task that requires dynamic adjustment of inference time. Multi-exit architectures allow deep neural networks to terminate their execution early in order to adhere to tight deadlines at the cost of accuracy. To mitigate this cost, in this paper we introduce a novel method called Multi-Exit Curriculum Learning that utilizes curriculum learning, a training strategy for neural networks that imitates human learning by sorting the training samples based on their difficulty and gradually introducing them to the network. Experiments on CIFAR-10 and CIFAR-100 datasets and various configurations of multi-exit architectures show that our method consistently improves the accuracy of early exits compared to the standard training approach. This work was partly funded by the European Union's Horizon 2020 research and innovation programme under grant agreement No 957337, and by the Danish Council for Independent Research under Grant No. 9131-00119B. This publication reflects the authors' views only. The European Commission and the Danish Council for Independent Research are not responsible for any use that may be made of the information it contains.
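
The curriculum idea named in the abstract above (sort samples by difficulty, then introduce them gradually) can be illustrated with a minimal sketch. The abstract does not say how difficulty is scored or how fast samples are introduced, so the function name `curriculum_stages`, the externally supplied `difficulty` scores, and the equal-sized pacing steps are all assumptions made for illustration, not the authors' method.

```python
import math
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def curriculum_stages(
    samples: Sequence[T], difficulty: Sequence[float], num_stages: int
) -> List[List[T]]:
    """Return growing training subsets, easiest samples first.

    Hypothetical helper: difficulty scores and linear pacing are assumptions.
    """
    if len(samples) != len(difficulty):
        raise ValueError("samples and difficulty must have the same length")
    if num_stages < 1:
        raise ValueError("num_stages must be at least 1")

    # Indices of the samples ordered from easiest to hardest.
    order = sorted(range(len(samples)), key=lambda i: difficulty[i])

    stages = []
    for stage in range(1, num_stages + 1):
        # Each stage exposes a larger prefix of the sorted data;
        # the final stage contains every sample.
        cutoff = math.ceil(len(order) * stage / num_stages)
        stages.append([samples[i] for i in order[:cutoff]])
    return stages
```

A training loop would then run one or more epochs on each returned subset in turn, so the network sees the hardest samples only in the later stages.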

Access (Open Access)

BASE

Open Access #3 (2021)

Multi-Exit Vision Transformer for Dynamic Inference

Arian Bakhtiarnia; Qi Zhang; Alexandros Iosifidis

Deep neural networks can be converted to multi-exit architectures by inserting early exit branches after some of their intermediate layers. This allows their inference process to become dynamic, which is useful for time-critical IoT applications with stringent latency requirements but time-variant communication and computation resources, in particular in edge computing systems and IoT networks where the exact computation time budget is variable and not known beforehand. Vision Transformer is a recently proposed architecture which has since found many applications across various domains of computer vision. In this work, we propose seven different architectures for early exit branches that can be used for dynamic inference in Vision Transformer backbones. Through extensive experiments involving both classification and regression problems, we show that each one of our proposed architectures could prove useful in the trade-off between accuracy and speed. This work was funded by the European Union's Horizon 2020 research and innovation programme under grant agreement No 957337, and by the Danish Council for Independent Research under Grant No. 9131-00119B. This publication reflects the authors' views only. The European Commission and the Danish Council for Independent Research are not responsible for any use that may be made of the information it contains.
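
To make the notion of dynamic inference under a variable time budget concrete, here is a small sketch of one possible exit-selection rule: use the deepest exit whose estimated cumulative latency still fits the budget known at run time. The abstract does not specify a selection rule, so the function `select_exit` and the per-exit latency estimates are hypothetical and serve only as an illustration.

```python
from typing import Optional, Sequence


def select_exit(
    cumulative_latency_ms: Sequence[float], budget_ms: float
) -> Optional[int]:
    """Return the index of the deepest exit that fits the time budget.

    cumulative_latency_ms[i] is an assumed estimate of the time needed to run
    the backbone up to exit i plus that exit's branch. The values are expected
    to be non-decreasing with depth. Returns None if no exit fits.
    """
    chosen = None
    for index, latency in enumerate(cumulative_latency_ms):
        if latency <= budget_ms:
            chosen = index
        else:
            # Latency grows with depth, so no later exit can fit either.
            break
    return chosen
```

With assumed latencies of 5, 12, 20 and 35 ms, a 25 ms budget selects the third exit (index 2), while a looser budget lets inference run to the final exit for higher accuracy.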

Access (Open Access)

BASE

Open Access #4 (2021)

Within-layer Diversity Reduces Generalization Gap

Firas Laakom; Jenni Raitoharju; Alexandros Iosifidis; Moncef Gabbouj

Neural networks are composed of multiple layers arranged in a hierarchical structure and jointly trained with gradient-based optimization. At each optimization step, neurons at a given layer receive feedback from neurons belonging to higher layers of the hierarchy. In this paper, we propose to complement this traditional 'between-layer' feedback with additional 'within-layer' feedback to encourage diversity of the activations within the same layer. To this end, we measure the pairwise similarity between the outputs of the neurons and use it to model the layer's overall diversity. By penalizing similarities and promoting diversity, we encourage each neuron to learn a distinctive representation and, thus, to enrich the data representation learned within the layer and to increase the total capacity of the model. We theoretically and empirically study how the within-layer activation diversity affects the generalization performance of a neural network and prove that increasing the diversity of hidden activations reduces the generalization gap. This work was funded by the NSF-Business Finland Center for Visual and Decision Informatics (CVDI) project AMALIA. Jenni Raitoharju acknowledges funding from the Academy of Finland under project No 324475. Alexandros Iosifidis acknowledges funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No 957337 (MARVEL).
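
The pairwise-similarity penalty described in the abstract above can be sketched as follows. The abstract does not state which similarity measure or aggregation is used, so squared cosine similarity averaged over all neuron pairs is an assumption chosen for illustration, and the names `cosine_similarity` and `diversity_penalty` are hypothetical.

```python
import math
from itertools import combinations
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def diversity_penalty(neuron_outputs: Sequence[Sequence[float]]) -> float:
    """Mean squared cosine similarity over all pairs of neurons in one layer.

    neuron_outputs[k] holds the outputs of neuron k across a batch of inputs.
    The value is near 1 when all neurons respond alike and near 0 when their
    responses are mutually orthogonal. (Assumed similarity measure.)
    """
    pairs = list(combinations(neuron_outputs, 2))
    if not pairs:
        return 0.0
    return sum(cosine_similarity(u, v) ** 2 for u, v in pairs) / len(pairs)
```

In training, such a term would typically be added to the task loss with a small weight, for example `total_loss = task_loss + 0.01 * diversity_penalty(outputs)`, where the weight 0.01 is an arbitrary placeholder.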

Access (Open Access)

BASE