Why is feature engineering so difficult?

Deep learning replaces feature engineering

Deep learning is a form of feature engineering that aims to learn features from raw, largely unprocessed data. To this end, the raw data is abstracted over several stacked layers, hence the term deep learning. An example: the training data consists of photos. If a neural network is trained on images of faces, individual neurons of the first hidden layer become maximally active when a particular edge is present in the photo. This edge is, in a sense, the key stimulus for the neurons of the first layer.

  1. ALPMA through R. B. Personnel & Management Consulting, Rott am Inn
  2. Konstanz University of Applied Sciences, Constance

The neurons of the next layer, in contrast, respond to the presence of parts of a face, such as a nose or an eye. The neurons of the layer after that are in turn maximally active when prototypes of faces are presented at the input of the neural network. A feature hierarchy is thus learned, with higher layers corresponding to more abstract, higher-level features.

This also makes it clear why the decision function is easier to learn on the higher-level representations. If, for example, a neuron of the 3rd layer that stands for a face prototype is active, this directly means that a face is visible in the image. If the decision has to be made on the activities of the 1st neuron layer instead, the task is much harder, since particular combinations of edges must be recognized as a face.

Where does the basic principle of deep learning come from?

The basic idea of learning features hierarchically over many layers comes, among other things, from the cognitive sciences: it has long been known that the information received by the eyes is processed layer by layer in the visual cortex and transformed into higher-level representations. The neurons in the visual cortex are likewise arranged in layers, and the key stimuli become more and more complex in the higher layers.

In the past, neural networks with many hidden layers could not be trained properly: among other things, the available data sets were too small and the computing power too low. In practice, therefore, mostly neural networks with a single hidden layer and very few neurons were used. This only changed in 2006, when researchers around Professor Geoffrey Hinton in Toronto presented a training algorithm with which feature transformations can be learned layer by layer. This publication sparked renewed strong interest in neural networks in the research community.
