A single neuron is limited in the complexity of the tasks it can perform. To analyze more complex data and mirror more complex processes, we need to combine neurons into stacks or groups. In its simplest form, such a network has an input layer, a hidden layer, and an output layer.
Each layer has multiple neurons, and every neuron in a layer is connected to every neuron in the next layer. Such networks are also called fully connected networks. The following figure shows examples of Multilayer Perceptrons:
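Apart from the figure, the layered, fully connected structure can also be sketched in code. The following is a minimal illustration, not a reference implementation: the layer sizes (3 inputs, 4 hidden neurons, 2 outputs), the sigmoid activation, the random weight initialization, and all function names are assumptions chosen for the example. It uses plain Python to stay dependency-free; in practice an array or deep learning library would do this work.

```python
import math
import random


def sigmoid(z):
    """Logistic activation, squashing any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def init_layer(n_inputs, n_neurons, rng):
    """Create one fully connected layer: a weight row and a bias per neuron."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_inputs)]
               for _ in range(n_neurons)]
    biases = [0.0] * n_neurons
    return weights, biases


def layer_forward(inputs, weights, biases):
    """Every neuron receives every input, which is what 'fully connected' means."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def mlp_forward(inputs, layers):
    """Pass the input vector through each layer in turn."""
    activations = inputs
    for weights, biases in layers:
        activations = layer_forward(activations, weights, biases)
    return activations


rng = random.Random(0)  # fixed seed so the run is repeatable
# Assumed sizes: 3 input features, 4 hidden neurons, 2 output neurons.
layers = [init_layer(3, 4, rng), init_layer(4, 2, rng)]
output = mlp_forward([0.5, -1.2, 3.0], layers)
print(output)  # two values, each strictly between 0 and 1
```

The input layer performs no computation here; it simply holds the feature values that are fed to the hidden layer. The weights are random, so the output is meaningless until the network is trained.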
Features of a Multilayer Perceptron (MLP)
You will spend a lot of time dealing with the features, that is, the different characteristics, of the datasets you are using. It is very important to gather and identify these features. If there are too few features, our data might not be informative enough to get the results we need. Too many features cause problems as well. We should not blindly feed the network everything we know about the dataset; we must be careful and selective about which features we use and why. Feature processing and selection is a discipline in and of itself, with its own best practices.
When it comes to feature selection, proper domain knowledge is a must. You must understand the problem domain in order to know what is and what is not important. Additionally, we usually do not want to present the entire feature set to our model. Using fewer features is desirable for the following reasons:
- It reduces the complexity of the model.
- It is faster and more cost-effective.
- A simpler model is simpler to understand and explain.
- It makes the machine learning algorithm faster to train.
- It can improve the accuracy of a model if the right subset is chosen.
- It can reduce overfitting.
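As a small illustration of reducing a feature set, the sketch below drops features that never vary, since a constant column carries no information for the model. This is only one simple filter among many, and the toy data, the zero threshold, and the helper names `low_variance_columns` and `drop_columns` are assumptions made for the example.

```python
from statistics import pvariance


def low_variance_columns(rows, threshold):
    """Return indices of feature columns whose variance is at or below threshold."""
    columns = list(zip(*rows))  # transpose: one tuple per feature
    return [i for i, col in enumerate(columns) if pvariance(col) <= threshold]


def drop_columns(rows, to_drop):
    """Remove the given column indices from every row."""
    drop = set(to_drop)
    return [[v for i, v in enumerate(row) if i not in drop] for row in rows]


# Hypothetical toy data: 4 samples, 3 features. Feature 1 is constant.
data = [
    [1.0, 7.0, 0.2],
    [2.0, 7.0, 0.9],
    [3.0, 7.0, 0.1],
    [4.0, 7.0, 0.7],
]
constant = low_variance_columns(data, threshold=0.0)
reduced = drop_columns(data, constant)
print(constant)    # [1]
print(reduced[0])  # [1.0, 0.2]
```

A filter like this needs no domain knowledge, so it complements careful, domain-driven selection rather than replacing it.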
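A word of caution: do not perform feature selection on a different dataset than the one you train on. This is a big no-no, because it can introduce bias into your models. For example, selecting features on the full dataset, including the held-out test data, lets information from the test data leak into the model.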
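One way to respect this caution is to choose the features using only the training rows and then apply the same choice to the held-out rows. The sketch below assumes a simple ranking by absolute Pearson correlation with the target; the toy data, the 4/2 split, and the helper names `pearson`, `select_top_k`, and `keep_columns` are all hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def select_top_k(train_rows, train_targets, k):
    """Rank features by |correlation with the target| using training rows only."""
    columns = list(zip(*train_rows))
    scores = [abs(pearson(col, train_targets)) for col in columns]
    ranked = sorted(range(len(columns)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


def keep_columns(rows, keep):
    """Keep only the given column indices in every row."""
    return [[row[i] for i in keep] for row in rows]


# Hypothetical toy data: features 0 and 2 follow the target, feature 1 is noise.
rows = [
    [1.0, 5.0, -1.1],
    [2.0, 1.0, -2.0],
    [3.0, 4.0, -2.9],
    [4.0, 2.0, -4.2],
    [5.0, 3.0, -5.0],
    [6.0, 6.0, -6.1],
]
targets = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

# Split first, then select: the test rows are never inspected during selection.
train_rows, test_rows = rows[:4], rows[4:]
train_targets = targets[:4]

selected = select_top_k(train_rows, train_targets, k=2)
train_reduced = keep_columns(train_rows, selected)
test_reduced = keep_columns(test_rows, selected)  # reuse the same indices
print(selected)  # [0, 2]
```

When cross-validation is used, the same idea applies inside each fold: the selection step is repeated on that fold's training portion only.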