By Carsten Alesch (ERNI Germany)
This article is for readers who are interested in neural networks and want to dive deeper into the topic. We encounter neural networks in numerous applications, including some very well-known examples such as LLMs (Large Language Models) used in chatbots, facial recognition, autonomous driving and medical image processing. They are fundamental to a specialised branch of AI known as machine learning (ML). Over the course of the development of neural networks, different learning methods have become established in machine learning. The best known are Supervised Learning (called Deep Learning when applied to networks with more than one hidden layer), Reinforcement Learning and Unsupervised Learning (self-organised learning). In this Techletter, we focus on supervised learning, one of the most commonly used learning methods.
Principle of neural networks
The real strength of a neural network (NN) lies in its ability to abstract, i.e. to deal with fuzziness. This is particularly useful in image processing for identifying similarities. For example, a NN learns what each digit looks like from numerous images of handwritten digits. If you then show the network a new, unknown image of a handwritten digit, it deduces which digit was written, even though it has never explicitly learned (“seen”) this exact image before. This property dramatically increases the range of possible uses of a NN.
Consider, for example, recognising an object in a camera image. Even if you mount a camera and leave all its settings unchanged, it is highly unlikely that you will ever capture an identical image twice: the image will always differ in some pixel. A one-to-one assignment of an input pattern to a specific output pattern is therefore not possible. It is fortunate that a NN can deal with this kind of fuzziness!
In order to explain the structure and topology of a NN, some basic components and terms need to be understood.
Neuron/unit and layers
Modelled on the biological nerve cells in the brain, neurons, also called units, are used in a NN to store information and transmit it to other units. While the biological brain contains very complex networks, a neural network is usually a greatly simplified model. A typical topology maps the units onto several layers, with connections usually running strictly layer by layer.

A NN usually consists of at least one input layer and one output layer, as well as one or more hidden layers in between. If a NN has more than one hidden layer, the training process is called deep learning. Each connection between two units has a weight, and each unit has its own activity level in the form of a simple numerical value. The activity level is produced by applying an activation function to the unit's network input. There are numerous activation functions; the simple case of a sigmoid function is often used. Activation functions primarily serve to limit values to an easily usable range of small numbers; typical value ranges are 0.0 to 1.0 or -1.0 to 1.0.

The activity level itself is usually taken directly as the unit's output, i.e. the value that a unit passes on to subsequent units. Together with the weights on the connections (edges), the network input of each unit is then calculated.
The calculation of the activity levels of all output units for input values applied to the input layer is called forward propagation. The following simple formulas apply for each unit i:

    Network input:  net_i = Σ_j (w_ij · o_j)
    Activity level: a_i = activation function(net_i) = sigmoid(net_i)
    Output:         o_i = a_i (simplest and most common case)

Here the o_j are the values that “come in” via the individual connections from the units j of the previous layer, and w_ij is the weight of the connection from unit j to unit i.
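Assuming fully connected layers with sigmoid activation, forward propagation can be sketched in a few lines of NumPy. The weight shapes and values below are illustrative only:

```python
import numpy as np

def sigmoid(x):
    # Squashes any real network input into the range (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

def forward(inputs, w_hidden, w_out):
    # Network input of each unit: weighted sum of the previous
    # layer's outputs (net_i = sum_j w_ij * o_j)
    o_hidden = sigmoid(w_hidden @ inputs)   # activity level = output
    return sigmoid(w_out @ o_hidden)

rng = np.random.default_rng(0)
x = rng.random(4)                    # example input vector
w1 = rng.standard_normal((3, 4))     # input -> hidden weights
w2 = rng.standard_normal((2, 3))     # hidden -> output weights
y = forward(x, w1, w2)               # one output value per output unit
```

Because the sigmoid is applied at every layer, all activity levels automatically stay within the range 0.0 to 1.0 mentioned above.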
Learning in NN through backpropagation
Now we know how the output is calculated for a given input. But how does such a neural network learn? Everything that is “learned” is effectively stored in the weights between the individual units. These weights are by no means fixed; they are adjusted algorithmically, just like the activity levels of the units.
In a NN trained with supervised learning, this is done by comparing the output currently calculated by the network with the desired output assigned to the current input. If the actual output does not match the desired one, the deviations flow backwards through the network (backpropagation) according to a certain rule and are incorporated into the weights.
The question now is: by how much should each weight in the network change? These so-called deltas are determined as follows, with learning rate η and the sigmoid derivative f'(net_i) = o_i · (1 − o_i):

    Weight change:          Δw_ij = η · δ_i · o_j

    If i is an output unit: δ_i = (t_i − o_i) · f'(net_i)

    If i is a hidden unit:  δ_i = f'(net_i) · Σ_k (δ_k · w_ki)

Here t_i is the desired (target) output of unit i, and the sum over k runs over all units in the layer after unit i.

Sample container detection (demonstrator)
We used the algorithm described above to build a demonstrator that addresses a recurring challenge in the medical sector: the recognition of so-called sample containers, as shown in the following figure, by their closure caps.

The network was trained on twenty-six different sample containers, each of which was given a label (0–25).
The demonstrator consists of a camera, a holder for a sample container, a Raspberry Pi 4, a display with a button and indicator light, and a power supply for autonomous operation. We packed the whole thing into a foldable, portable box. Attached are some pictures of this installation.

The process is as follows. Pressing the button prompts the camera to take a picture of the sample container opposite it. A static part of this image (a vertical strip containing the sample container) is cut out algorithmically. Within this partial image, the exact position of the sample container is determined using an edge detection algorithm; this image area is cut out and standardised to a fixed size. The RGB (red-green-blue) colour values are written into the input vector, together with the height of the sample container. The height is taken from the container region determined by edge detection and noted in the file name when the image is saved. Based on a height limit (currently 75 pixels), it is then decided whether the detected sample container needs to be halved vertically. The halved or full image is then standardised to a 60×80 pixel image, and the data from these standardised images serve as the network input.
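The preprocessing steps can be sketched in pure NumPy. A simple nearest-neighbour resize stands in for the OpenCV routines actually used in the demonstrator, and the helper names and dummy image are illustrative:

```python
import numpy as np

TARGET_W, TARGET_H = 60, 80   # standardised input size from the article
HEIGHT_LIMIT = 75             # pixel limit above which the tube is halved

def nearest_resize(img, width, height):
    # Minimal nearest-neighbour resize (stand-in for cv2.resize)
    rows = (np.arange(height) * img.shape[0] / height).astype(int)
    cols = (np.arange(width) * img.shape[1] / width).astype(int)
    return img[rows][:, cols]

def build_input_vector(roi_rgb, container_height):
    # Halve the crop vertically if the detected container is tall
    if container_height > HEIGHT_LIMIT:
        roi_rgb = roi_rgb[: roi_rgb.shape[0] // 2]
    std = nearest_resize(roi_rgb, TARGET_W, TARGET_H)
    # Flatten the RGB values, then append the height as an extra input
    return np.concatenate([std.astype(np.float64).ravel() / 255.0,
                           [container_height / 255.0]])
```

The resulting vector has 80 × 60 × 3 colour values plus one height value, which matches the fixed-size network input described above.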


After the network output has been calculated, the output layer contains the probability values for all previously learned sample containers. The output unit with the largest value wins and should represent the correct result. The camera image, the ROI (region of interest) within it and the determined result are then shown on the display.
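Picking the winning output unit is a simple argmax over the output layer; the example activations below are made up:

```python
import numpy as np

def classify(output_activations):
    # The output unit with the largest activity level "wins";
    # its index is the predicted label (0-25 in the demonstrator)
    label = int(np.argmax(output_activations))
    return label, float(output_activations[label])

label, confidence = classify(np.array([0.05, 0.91, 0.12]))
```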

During operation, a light to the right of the sample holder is switched on. This light makes the system less sensitive to external brightness influences, which could otherwise impair correct detection of the sample container and lead to incorrect results.
The network topology and dimensions of the NN used in this demonstrator are as follows:

The general characteristics of a neural network are described by its key data, which, among other things, specify its topology, its dimensions and the properties of the learning algorithm used. These are the adjustment screws for tuning a network in terms of result quality and performance, and they are called hyperparameters. The hyperparameters used in this network are as follows:
- Number of hidden nodes: 150
- Learning rate: 0.073
- Batch size: 128
- Number of training images: 1,428
- Number of validation images: 749
During the learning process, all training images were repeatedly presented to the network. They were divided into smaller sets of images, so-called batches, which were fed to the neural network as input. This procedure was found to accelerate learning somewhat.
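Splitting the 1,428 training images into batches of 128 can be sketched as follows. The feature vectors are random stand-ins; only the batching logic matters here:

```python
import numpy as np

# Hyperparameters from the article
BATCH_SIZE = 128

def iter_batches(images, labels, batch_size, rng):
    # Shuffle once per epoch, then yield mini-batches
    order = rng.permutation(len(images))
    for start in range(0, len(images), batch_size):
        idx = order[start:start + batch_size]
        yield images[idx], labels[idx]

rng = np.random.default_rng(42)
images = rng.random((1428, 10))          # stand-in for 1,428 training vectors
labels = rng.integers(0, 26, 1428)       # one of the 26 container labels
batches = list(iter_batches(images, labels, BATCH_SIZE, rng))
```

Each epoch thus presents twelve batches (eleven full ones plus a remainder), and repeating this over many epochs yields the roughly 2.5 million processed images mentioned below.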
Ultimately, around 2.5 million images were processed together with their associated expected results. After the learning process, the validation images were recognised correctly 99% of the time. The validation images are a set of images that are not included in the training data and are used repeatedly to validate what has been “learned”.
Demonstrator as MQTT client
The demonstrator was designed as an MQTT client, so that all essential operations – such as creating new training or validation images, evaluating a camera recording, initiating the learning process, and reading information – can be remotely controlled from another computer.
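The remote-control commands can be sketched as a simple topic-to-handler dispatch table. The topic names below are hypothetical (the article does not specify them); in the real demonstrator such handlers would be wired to the MQTT client's message callback:

```python
def make_dispatcher(handlers):
    # Map an incoming MQTT topic to its handler function
    def dispatch(topic, payload):
        handler = handlers.get(topic)
        if handler is None:
            return None          # unknown topic: ignore the message
        return handler(payload)
    return dispatch

# Hypothetical topics mirroring the operations listed above
dispatch = make_dispatcher({
    "demo/capture":  lambda payload: "image captured",
    "demo/train":    lambda payload: "training started",
    "demo/info":     lambda payload: "status: idle",
})
```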
Technologies used
- Raspberry Pi 4
- Raspbian OS
- Python 3 (NumPy Arrays)
- Tkinter
- Visual Studio Code 1.76.0
- OpenCV 4.5.1
- Mosquitto for MQTT connection

Conclusion
The demonstrator developed here serves only to show the possibilities and basic functionality of neural networks. It is used at trade fairs and IT meetups to demonstrate the capabilities of NNs in MedTech applications.
The demonstrator does not claim to be suitable for industrial use, in terms of either speed or accuracy. An industrial application would use appropriate cameras and a different network topology, for example a convolutional neural network, which would enable much more precise image evaluation.
For this demonstrator, relatively few images were used for training, and the number of units is also rather small. Nevertheless, quite good results were achieved.
In many production networks, the number of neurons is well over 100,000, and may be significantly more depending on the application. By comparison, the model behind ChatGPT works with over 175 billion parameters (weights), which makes it clear how large such a NN can become and what computing capacity is required to train it.
In general, it can be said that the development of suitable NNs for certain applications also requires a lot of experimentation. In our case, numerous test runs were indeed necessary to determine the appropriate hyperparameters that delivered the best results.