activation
for a neuron, the process of sending an output signal after having received appropriate input signals
activation function
non-decreasing function f that determines whether the neuron activates
artificial intelligence (AI)
branch of computer science that aims to create intelligent systems capable of simulating humanlike cognitive abilities, including learning, reasoning, perception, and decision-making
backpropagation
an algorithm used to train neural networks by determining how the errors depend on changes in the weights and biases of all the neurons, starting from the output layer and working backward through the layers, recursively updating the parameters based on how the error changes in each layer
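To make the chain-rule bookkeeping concrete, here is a minimal NumPy sketch of backpropagation for a single sigmoid neuron with squared-error loss; the tiny dataset, learning rate, and variable names are illustrative choices, not part of the definition above.

```python
import numpy as np

# Minimal backpropagation sketch: one sigmoid neuron, squared-error loss.
# The example data, initial weights, and learning rate are illustrative.
x, y = np.array([0.5, -1.0]), 1.0      # one training example and its target
w, b = np.array([0.1, 0.2]), 0.0       # initial weights and bias
lr = 0.1                                # learning rate

for epoch in range(100):
    z = w @ x + b                       # weighted input
    y_hat = 1.0 / (1.0 + np.exp(-z))    # sigmoid activation (forward pass)
    error = y_hat - y                   # d(0.5*(y_hat - y)**2) / d(y_hat)
    dz = error * y_hat * (1.0 - y_hat)  # chain rule back through the sigmoid
    w -= lr * dz * x                    # gradient step for the weights
    b -= lr * dz                        # gradient step for the bias
```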
bias
value b that is added to the weighted signal, making the neuron more likely (or less likely, if b is negative) to activate on any given input
binary cross entropy loss
loss function commonly used in binary classification tasks: $-\frac{1}{n}\sum_{i=1}^{n}\left[y_i \ln \hat{y}_i + (1 - y_i)\ln(1 - \hat{y}_i)\right]$
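A small NumPy sketch of this formula; the clipping constant and sample values are illustrative:

```python
import numpy as np

# Binary cross entropy: -(1/n) * sum(y*ln(y_hat) + (1-y)*ln(1-y_hat))
def binary_cross_entropy(y, y_hat, eps=1e-12):
    y_hat = np.clip(y_hat, eps, 1 - eps)   # avoid log(0)
    return -np.mean(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))

y = np.array([1, 0, 1, 1])                  # true labels
y_hat = np.array([0.9, 0.2, 0.7, 0.6])      # predicted probabilities
print(binary_cross_entropy(y, y_hat))       # approximately 0.30
```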
ChatGPT
powerful natural language processing platform created by OpenAI
connecting weight
in a recurrent neural network, a weight that exists on a connection from one neuron to itself or to a cycle of neurons in a feedback loop
convolutional layers
layers of a CNN that apply convolution filters to the input data to produce a feature map
convolutional neural network (CNN)
class of neural network models developed to process structured, grid-like data, such as images, making use of the mathematical operation of convolution
deep learning
training and implementation of neural networks with many layers to learn hierarchical (structured) representations of data
deepfake
product of an AI system that seems realistic, created with malicious intent to mislead people
depth
number of hidden layers in a neural network
dimension
the number of components in a vector
dynamic backpropagation
adjustment of parameters (weights and biases) and the underlying structure (neurons, layers, connections, etc.) in the training of a neural network
epoch
single round of training of a neural network using the entire training set (or a batch thereof)
exploding gradient problem
failure to train an RNN due to instability introduced by having connecting weights at values larger than 1
feature map
output of convolutional layers in a CNN, representing the learned features of the input data
feedback loop
internal connection from one neuron to itself or among multiple neurons in a cycle
fully connected layers
layers of a neural network in which every neuron in one layer is connected to every neuron in the next layer
generative art
use of AI tools to enhance or create new artistic works
gradient descent
method for locating minimum values of a multivariable function using small steps in the direction of greatest decrease from a given point
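A minimal sketch of gradient descent on a simple two-variable function; the function, starting point, step size, and iteration count are illustrative:

```python
import numpy as np

# Gradient descent on f(x, y) = (x - 3)**2 + (y + 1)**2, minimized at (3, -1).
def grad(p):
    x, y = p
    return np.array([2 * (x - 3), 2 * (y + 1)])   # gradient of f

p = np.array([0.0, 0.0])        # starting point
lr = 0.1                        # step size (learning rate)
for _ in range(200):
    p = p - lr * grad(p)        # small step in the direction of greatest decrease
print(p)                        # close to [3, -1]
```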
hallucinations
in the context of NLP, AI-generated responses that have no basis in reality
hidden layers
layers between the input and output layers
hinge loss
loss function commonly used in binary classification tasks: $\frac{1}{n}\sum_{i=1}^{n}\max(0,\, 1 - y_i \hat{y}_i)$
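A short NumPy illustration of this formula, with labels in {−1, +1} and made-up scores:

```python
import numpy as np

# Hinge loss: (1/n) * sum(max(0, 1 - y*y_hat)) for labels y in {-1, +1}.
y = np.array([1, -1, 1])                 # true labels
y_hat = np.array([0.8, -2.0, -0.3])      # raw classifier scores
loss = np.mean(np.maximum(0.0, 1.0 - y * y_hat))
print(loss)                              # (0.2 + 0 + 1.3) / 3 = 0.5
```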
hyperbolic tangent (tanh)
common activation function, $\tanh x = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}$
imbalanced data
datasets that contain significantly more data points of one class than another class
input layer
neurons that accept the initial input data
large language model (LLM)
powerful natural language processing model designed to understand and generate humanlike text based on massive amounts of training data
leaky ReLU
common activation function, $\mathrm{LReLU}(x) = \max(cx, x)$, for some small positive parameter $c$
long short-term memory (LSTM) network
type of RNN incorporating memory cells that can capture long-term dependencies
loss (or cost) function
measure of error between the predicted output and the actual target values for a neural network
margin
measure of the separation of data points belonging to different classifications
memory cells
internal structures that allow the network to store and access information over long time intervals
multilayer perceptron (MLP)
basic paradigm for neural networks having multiple hidden layers
natural language processing (NLP)
area of AI concerned with recognizing written or spoken language and generating new language content
neural network
structure made up of neurons that takes in input and produces output that classifies the input information
neuron
individual decision-making unit of a neural network that takes some number of inputs and produces an output
nonlinear
not linear; that is, not of the form $f(x) = mx + b$
output layer
neurons that are used to interpret the answer or give classification information
perceptron
single-layer neural network using the step function as activation function, designed for binary classification tasks
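A minimal sketch of a perceptron using the step activation and the classic mistake-driven update rule, trained here on the AND function as an illustrative, linearly separable example:

```python
import numpy as np

# Perceptron sketch: step-function activation, updates only on mistakes.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 0, 0, 1])              # AND labels (illustrative dataset)
w, b, lr = np.zeros(2), 0.0, 0.1

for epoch in range(20):
    for xi, target in zip(X, y):
        out = 1 if (w @ xi + b) > 0 else 0   # step activation
        w += lr * (target - out) * xi        # adjust weights on errors only
        b += lr * (target - out)             # adjust bias on errors only

print([1 if (w @ xi + b) > 0 else 0 for xi in X])   # [0, 0, 0, 1]
```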
pooling layers
layers of a CNN that reduce the dimensions of data coming from the feature maps produced by the convolutional layers while retaining important information
rectified linear unit (ReLU)
common activation function, $\mathrm{ReLU}(x) = \max(0, x)$
recurrent neural network (RNN)
neural network that incorporates feedback loops
responsible AI
ethical and socially conscious development and deployment of artificial intelligence systems
semantic segmentation
process of partitioning a digital image into multiple components by classifying each pixel of an image into a specific category or class
sigmoid function
common activation function, $\sigma(x) = \frac{1}{1 + e^{-x}}$
softmax
activation function that takes a vector of real-number values and yields a vector of values scaled into the interval between 0 and 1, which can be interpreted as a discrete probability distribution
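A short NumPy sketch of softmax; subtracting the maximum before exponentiating is a common implementation choice for numerical stability, not part of the definition:

```python
import numpy as np

# Softmax: maps real scores to positive values that sum to 1, so the output
# can be read as a discrete probability distribution.
def softmax(z):
    e = np.exp(z - np.max(z))   # shift by the max for numerical stability
    return e / e.sum()

print(softmax(np.array([2.0, 1.0, 0.1])))   # approx [0.66, 0.24, 0.10]
```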
softplus
common activation function, $f(x) = \ln(1 + e^{x})$
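For a quick comparison of the activation-function formulas defined in this glossary (ReLU, leaky ReLU, sigmoid, softplus, tanh), here is a minimal NumPy sketch; the sample inputs are illustrative:

```python
import numpy as np

# Common activation functions from this glossary as NumPy one-liners.
relu       = lambda x: np.maximum(0.0, x)              # ReLU(x) = max(0, x)
leaky_relu = lambda x, c=0.01: np.maximum(c * x, x)    # LReLU(x) = max(cx, x)
sigmoid    = lambda x: 1.0 / (1.0 + np.exp(-x))        # 1 / (1 + e^-x)
softplus   = lambda x: np.log1p(np.exp(x))             # ln(1 + e^x)
tanh       = np.tanh                                   # (e^x - e^-x) / (e^x + e^-x)

x = np.array([-2.0, 0.0, 2.0])                         # illustrative inputs
for name, f in [("ReLU", relu), ("leaky ReLU", leaky_relu),
                ("sigmoid", sigmoid), ("softplus", softplus), ("tanh", tanh)]:
    print(name, f(x))
```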
sparse categorical cross entropy
generalization of binary cross entropy, useful when the target labels are integers
static backpropagation
adjustment of only the parameters (weights and biases), not the underlying structure, in the training of a neural network
step function
function that returns 0 when input is below a threshold and returns 1 when input is above the threshold
tensor
multidimensional array, generalizing the concept of vector
vanishing gradient problem
failure to train an RNN due to very slow learning rates caused by having connecting weights at values smaller than 1
vector
ordered list of numbers, $\mathbf{x} = (x_1, x_2, \ldots, x_n)$
weight
value w by which the incoming signal is multiplied, essentially determining the strength of the connection