What are neural networks? (Part I)

While browsing job postings for machine learning engineers, I noticed that knowledge of deep learning and PyTorch was among the requirements. For the traditional data scientist, deep learning is typically not part of the repertoire of studies; it belongs more to computer science and artificial intelligence. However, given the rise of AI and the demand for these skills, it may behoove the data scientist to know a bit about deep learning.

I found two references that may be a good introduction to the topic:

Mueller, J. P., & Massaron, L. (2024). Python for data science for dummies (3rd ed.). John Wiley & Sons.

Julian, D. (2018). Deep learning with PyTorch quick start guide: Learn to train and deploy neural network models in Python. Packt Publishing Ltd.

For this post, let’s focus on Chapter 19 of Mueller and Massaron (2024), “Playing with Neural Networks.”

Neural networks (NNs) developed as an attempt to reverse engineer how the brain processes signals, borrowing terms such as axons and neurons, but they are effectively a sophisticated form of linear regression. NNs are effective for complex problems such as image and sound recognition and machine translation. Deep learning (DL) lies behind Siri and other digital assistants as well as ChatGPT. DL typically requires specialized hardware such as GPUs, along with frameworks such as Keras, TensorFlow, and PyTorch.

The core building block of a NN is the neuron, or unit. Many neurons arranged in an interconnected structure make up the layers of a neural network, with each neuron linking to the inputs and outputs of other neurons. NNs can only process numeric (continuous) information rather than categorical variables, but this can be resolved by converting categories to binary values (similar to dummy coding in regression), as sketched below.
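As a quick illustration of that conversion (my own sketch, not from the book), pandas can expand a categorical column into binary indicator columns, much like dummy coding in regression:

```python
import pandas as pd

# A made-up categorical feature; the column and category names are hypothetical.
df = pd.DataFrame({"color": ["red", "green", "blue", "green"]})

# One-hot encode: each category becomes its own 0/1 column,
# giving the network purely numeric input.
encoded = pd.get_dummies(df, columns=["color"], dtype=int)
print(encoded)
```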

In biology, neurons receive signals but don’t always release one: a neuron fires only when it receives enough stimuli and otherwise remains silent. A NN neuron behaves analogously: it receives weighted values, sums them, and evaluates the result with an activation function, which transforms it in a possibly nonlinear way. For example, the activation function can release a zero value unless the input reaches a certain threshold, or it can dampen or enhance a value by nonlinearly rescaling it. Each neuron in the network receives inputs from the previous layer, weights them, sums them all, and transforms the result with its activation function. After activation, the computed output becomes input for other neurons or the prediction of the network. The weights of the NN are similar to the coefficients of a linear regression, and the network learns their values through repeated passes (iterations, or epochs) over the dataset.
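To make the weight-sum-activate step concrete, here is a minimal NumPy sketch of a single neuron (the inputs, weights, and bias are made-up values, not from the book):

```python
import numpy as np

def relu(z):
    """Activation function: passes positive values through, outputs 0 otherwise."""
    return np.maximum(0, z)

# Made-up inputs from three upstream neurons, one weight per input, plus a bias.
x = np.array([0.5, -1.2, 3.0])
w = np.array([0.8, 0.4, -0.2])
b = 0.1

# Weight the inputs, sum them, then apply the activation function.
output = relu(np.dot(w, x) + b)
print(output)  # 0.0 here: the weighted sum falls below 0, so the neuron stays silent
```

In a full network, this output would feed forward as an input to the neurons in the next layer.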

The book provides code examples using Keras from TensorFlow. The example uses the handwritten digits dataset as a case of multiclass classification. NNs are sensitive to the scale and distribution of the data, so it is good practice to normalize the variables by creating z-scores, i.e., rescaling each variable to a mean of 0 and a standard deviation of 1. The target or outcome requires each terminal neuron to make a prediction, which is a numeric value or a probability. For classification, one approach involves one-hot encoding the target (similar to dummy coding in regression) and assigning a separate neuron with sigmoid activation to predict the probability of each class. The class with the highest probability is the winner.
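A sketch of that preprocessing, assuming scikit-learn’s bundled handwritten digits dataset and an 80/20 train/test split (the book’s exact loading code may differ):

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from tensorflow.keras.utils import to_categorical

# Load the 8x8 handwritten digits: 1,797 samples, 64 pixel features, 10 classes.
digits = load_digits()
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.2, random_state=42)

# Standardize features to z-scores: mean 0, standard deviation 1.
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# One-hot encode the targets so each class gets its own output neuron.
y_train = to_categorical(y_train, num_classes=10)
y_test = to_categorical(y_test, num_classes=10)
```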

To construct the architecture and train the network, initialize an empty network and add layers of neurons progressively, starting from the top where the data enters and ending at the bottom where results are obtained. The example has two hidden layers of 64 and 32 neurons activated by the ReLU (Rectified Linear Unit) function, defined mathematically as f(x) = \max(0, x), meaning it outputs only non-negative values. The activation function enables the network to learn nonlinear patterns, and each layer is followed by a dropout layer that serves as a regularization technique to prevent overfitting. The network concludes with a layer containing the probabilities for the classes, from which the winning class is determined; this final layer uses the softmax activation function, which was defined previously in Transformers, BERT, and GPT (Chapter 4).

The code iterates over the data 50 times (epochs) and processes the data in batches of 32 examples each. During training, progress updates are reported along with evaluation metrics computed on the test data. The training loss and validation loss can then be plotted over the epochs. Training loss is the error between the predictions and the actual outcomes in the training data; in linear regression this is similar to the residual \hat{e}_{train} = y_{train} - \hat{y}_{train}. Validation loss is assessed the same way but on the test data: \hat{e}_{test} = y_{test} - \hat{y}_{test}.
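Putting the pieces together, here is a hedged sketch of the architecture just described (64 and 32 ReLU neurons, dropout after each, a softmax output layer, 50 epochs in batches of 32). It continues from the preprocessing sketch above; the dropout rate of 0.2 is my assumption, not necessarily the book’s:

```python
import matplotlib.pyplot as plt
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Input

# Initialize an empty network and add layers from input (top) to output (bottom).
model = Sequential([
    Input(shape=(64,)),                # one input per pixel feature
    Dense(64, activation="relu"),
    Dropout(0.2),                      # regularization to prevent overfitting
    Dense(32, activation="relu"),
    Dropout(0.2),
    Dense(10, activation="softmax"),   # one probability per digit class
])

model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])

# Iterate over the data 50 times (epochs) in batches of 32 examples,
# evaluating against the test data after each epoch.
history = model.fit(X_train, y_train,
                    epochs=50, batch_size=32,
                    validation_data=(X_test, y_test))

# Plot training loss and validation loss over the epochs.
plt.plot(history.history["loss"], label="training loss")
plt.plot(history.history["val_loss"], label="validation loss")
plt.xlabel("epoch")
plt.ylabel("loss")
plt.legend()
plt.show()
```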
