API
SimpleNNs.AdamOptimiser — Method
AdamOptimiser(gradients::AbstractArray{T}; lr = Float32(1e-3), beta_1 = 0.9f0, beta_2 = 0.999f0) where {T}
Create an Adam optimiser for gradient-based parameter updates.
Arguments
gradients (AbstractArray{T}): Template array matching the shape of the gradients to be optimised
Keyword Arguments
lr (T): Learning rate (default: 1e-3)
beta_1 (T): Exponential decay rate for first moment estimates (default: 0.9)
beta_2 (T): Exponential decay rate for second moment estimates (default: 0.999)
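A minimal usage sketch (the `params` and `grads` arrays here are placeholders for illustration; `update!` is documented further below):

```julia
using SimpleNNs

# Hypothetical flat parameter and gradient vectors
params = randn(Float32, 100)
grads = similar(params)

opt = AdamOptimiser(grads; lr = 1f-3)

# After filling `grads` via backpropagation, apply one Adam step:
update!(params, grads, opt)
```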
SimpleNNs.Conv — Method
Conv(kernel_size::NTuple{N, Int}, out_channels::Int; kwargs...)
A convolutional layer with a given kernel size and specified number of output channels.
This can automatically infer the number of input channels based on the preceding layers.
Keyword Arguments
use_bias (default: Val(true)) - Whether or not to add a bias vector to the output. Wrapped in a Val for optimisation.
activation_fn (default: identity) - A custom activation function. Note that not all functions are supported by backpropagation.
parameter_type (default: Val(Float32)) - The datatype to use for the parameters, wrapped in a Val type.
init (default: GlorotNormal()) - Weight initialisation scheme. See Initialiser for available options.
SimpleNNs.Dense — Method
Dense(outputs::Integer; kwargs...)
A representation of a dense layer. By default this can be constructed by specifying the desired number of outputs. The input size can be inferred from the rest of the chain when constructing a model.
Keyword Arguments
use_bias (default: Val(true)) - Whether or not to add a bias vector to the output. Wrapped in a Val for optimisation.
activation_fn (default: identity) - A custom activation function. Note that not all functions are supported by backpropagation.
parameter_type (default: Val(Float32)) - The datatype to use for the parameters, wrapped in a Val type.
inputs (default: Infer()) - Specify the number of inputs, or infer them from the rest of the model.
init (default: GlorotNormal()) - Weight initialisation scheme. See Initialiser for available options.
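A short sketch of a Dense layer used inside chain, where the input size is inferred from the previous layer:

```julia
using SimpleNNs

model = chain(
    Static(8),                           # 8 input features
    Dense(16, activation_fn = relu),     # inputs inferred as 8
    Dense(1, activation_fn = identity),  # inputs inferred as 16
)
```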
SimpleNNs.Flatten — Type
Flatten()
Flatten the dimensions of the preceding layer, leaving the batch dimension unaffected. The output should be (k x n) where k is the product of the non-batch dimensions of the previous layer.
SimpleNNs.GlorotNormal — Type
GlorotNormal()
Glorot normal initialisation (also called Xavier normal). Samples weights from a normal distribution with mean 0 and standard deviation √(2 / (fan_in + fan_out)).
Best suited for layers with sigmoid or tanh activations.
SimpleNNs.GlorotUniform — Type
GlorotUniform()
Glorot uniform initialisation (also called Xavier uniform). Samples weights from a uniform distribution in the range [-limit, limit] where limit = √(6 / (fan_in + fan_out)).
Best suited for layers with sigmoid or tanh activations.
SimpleNNs.HeNormal — Type
HeNormal()
He normal initialisation (also called Kaiming normal). Samples weights from a normal distribution with mean 0 and standard deviation √(2 / fan_in).
Best suited for layers with ReLU activations.
SimpleNNs.HeUniform — Type
HeUniform()
He uniform initialisation (also called Kaiming uniform). Samples weights from a uniform distribution in the range [-limit, limit] where limit = √(6 / fan_in).
Best suited for layers with ReLU activations.
SimpleNNs.Initialiser — Type
Abstract type for weight initialisation strategies.
SimpleNNs.LeCunNormal — Type
LeCunNormal()
LeCun normal initialisation. Samples weights from a normal distribution with mean 0 and standard deviation √(1 / fan_in).
Best suited for layers with SELU activations.
SimpleNNs.LogitCrossEntropyLoss — Type
LogitCrossEntropyLoss(targets, num_classes::Int)
Expects the targets in a single vector containing class labels, which must be between 1 and num_classes inclusive.
SimpleNNs.MSELoss — Type
MSELoss(targets)
Expects the targets in the form (K x N) where K is the output dimension (usually 1) and N is the batch size.
For efficiency, this is just ∑(y - ŷ)² and NOT scaled by a half.
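A sketch of attaching an MSE loss to a small regression model via add_loss (documented further below); the target values here are placeholders:

```julia
using SimpleNNs

targets = zeros(Float32, 1, 32)  # (K x N): one output, batch of 32
model = chain(Static(4), Dense(8, activation_fn = relu), Dense(1))
model_with_loss = add_loss(model, MSELoss(targets))
```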
SimpleNNs.MaxPool — Method
MaxPool(pool_size::NTuple{N, Int}; kwargs...)
A convolutional max-pool layer with a given kernel size.
The necessary input sizes can be inferred automatically from the rest of the chain.
SimpleNNs.RMSPropOptimiser — Type
RMSPropOptimiser{T, X<:AbstractArray{T}} <: AbstractOptimiser
RMSProp optimiser with exponential moving average of squared gradients.
Fields
lr::T: Learning rate
rho::T: Exponential decay rate for moving average
eps::T: Small constant for numerical stability
v::X: Exponential moving average of squared gradients
SimpleNNs.RMSPropOptimiser — Method
RMSPropOptimiser(gradients::AbstractArray{T}; lr = Float32(1e-3), rho = 0.9f0, eps = Float32(1e-8)) where {T}
Create an RMSProp optimiser for gradient-based parameter updates.
RMSProp maintains a moving average of squared gradients to adaptively scale the learning rate.
Arguments
gradients (AbstractArray{T}): Template array matching the shape of the gradients to be optimised
Keyword Arguments
lr (T): Learning rate (default: 1e-3)
rho (T): Exponential decay rate for moving average of squared gradients (default: 0.9)
eps (T): Small constant added to denominator for numerical stability (default: 1e-8)
Examples
opt = RMSPropOptimiser(gradients; lr=0.001f0, rho=0.9f0)
SimpleNNs.SGDOptimiser — Type
SGDOptimiser{T} <: AbstractOptimiser
Stochastic Gradient Descent optimiser with optional momentum.
Fields
lr::T: Learning rate
momentum::T: Momentum coefficient (0.0 for no momentum)
velocity::AbstractArray{T}: Velocity buffer for momentum (internal state)
SimpleNNs.SGDOptimiser — Method
SGDOptimiser(gradients::AbstractArray{T}; lr = Float32(1e-3), momentum = 0.0f0) where {T}
Create an SGD optimiser for gradient-based parameter updates.
Arguments
gradients (AbstractArray{T}): Template array matching the shape of the gradients to be optimised
Keyword Arguments
lr (T): Learning rate (default: 1e-3)
momentum (T): Momentum coefficient, 0.0 for standard SGD (default: 0.0)
Examples
# Standard SGD
opt = SGDOptimiser(gradients; lr=0.01f0)
# SGD with momentum
opt = SGDOptimiser(gradients; lr=0.01f0, momentum=0.9f0)
SimpleNNs.Static — Method
Static(inputs::Union{Int, NTuple}; kwargs...)
Used for specifying the input type to a neural network. inputs should be a single integer for a dense network, representing the number of features. For an image network, inputs can be a tuple specifying the size of the images in the form (WIDTH, HEIGHT, CHANNELS).
SimpleNNs.Zeros — Type
Zeros()
Initialise all weights to zero. Note: This is generally not recommended for training as it breaks symmetry.
Base.deepcopy — Method
Base.deepcopy(model::Model)
Create a deep copy of the model with its own independent parameter array.
This function creates a new model with:
- A new parameter array (using similar and copyto!)
- New parameter views for each layer pointing to the new array
- The same layer structure and configuration
The copied model is completely independent from the original - modifying parameters in one will not affect the other.
Arguments
model::Model: The model to copy
Returns
A new Model with copied parameters and structure.
Examples
model = chain(Static(10), Dense(5))
model_copy = deepcopy(model)
# Modify copy - original unchanged
parameters(model_copy) .= 0.0
See also parameters, chain.
SimpleNNs.activation_gradient_fn — Method
Derivatives are used to backpropagate the gradients of the layer outputs back to the activations of that layer. To save space, these are calculated exclusively from the outputs of the layer. Instead of functions written as dy/dx = f(x), we instead write dy/dx = g(y). This can be done for the 3 major functions.
Whenever y is used below, assume this is a function of the output, not the input.
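As a mathematical illustration (these are standard identities, not the package's internal code), the output-based forms g(y) for the common activations are:

```julia
# Derivatives expressed in terms of the layer output y = f(x):
sigmoid_deriv(y) = y * (1 - y)                     # f(x) = 1 / (1 + exp(-x))
tanh_deriv(y)    = 1 - y^2                         # f(x) = tanh(x)
relu_deriv(y)    = ifelse(y > 0, one(y), zero(y))  # f(x) = max(0, x)
```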
SimpleNNs.add_loss — Method
add_loss(model::Model, loss_layer::AbstractTargetsLayer)
Create a new model with the given loss layer appended to the end of the existing model.
This function reconstructs the entire model chain with the loss layer added as the final layer. The original model's parameters are copied to the new model.
Arguments
model::Model: The existing model to extend
loss_layer::AbstractTargetsLayer: The loss layer to append (e.g., BatchCrossEntropyLoss)
Returns
A new Model with the loss layer appended.
Examples
model = chain(
Static(10),
Dense(32, activation_fn=relu),
Dense(5, activation_fn=identity)
)
# Add a loss layer
targets = zeros(Int, batch_size)
loss_layer = BatchCrossEntropyLoss(targets=targets, num_classes=5)
model_with_loss = add_loss(model, loss_layer)
See also remove_loss, has_loss, get_predictions.
SimpleNNs.backprop! — Method
backprop!(partials_buffer, gradient_buffer, inputs, outputs, layer)
Backpropagates the partial gradients of the outputs of the current layer into the parameters of the current layer. partials_buffer is used as a buffer for the gradients of the output of this layer. gradient_buffer should be filled with the gradients of the parameters of the current layer, using the chain rule. inputs is the array fed into the layer and outputs is the output of this layer in the forward pass. layer is the struct containing information about the layer.
SimpleNNs.chain — Method
chain(layers...)
Combines the given layer definitions into a single model and propagates the layer sizes through the network.
The first layer must always be a Static layer which specifies the feature size. If this is a simple fully connected network, then the first layer should be Static(nf) where nf is the number of features in your input matrix. Do not specify the batch size in this static input.
The default datatype for most layers is Float32, but this may be changed. The parameters of the entire model must be of the same datatype. This function will create a flat parameter vector for the model which can be accessed using the parameters function.
Examples
A simple dense, fully-connected, neural network which has 3 input features:
model = chain(
Static(3),
Dense(10, activation_fn=tanh),
Dense(10, activation_fn=sigmoid),
Dense(1, activation_fn=identity),
);
An example convolutional neural network:
# Image size is (WIDTH, HEIGHT, CHANNELS)
img_size = (28, 28, 1)
model = chain(
Static(img_size),
Conv((5,5), 16; activation_fn=relu),
MaxPool((2,2)),
Conv((3,3), 8; activation_fn=relu),
MaxPool((4,4)),
Flatten(),
Dense(10, activation_fn=identity)
)
See also Static, Dense, Conv, MaxPool, Flatten and preallocate.
SimpleNNs.forward! — Method
forward!(cache::ForwardPassCache, model::Model)
Execute a forward pass through the neural network model.
This function computes the forward propagation through all layers of the model, storing intermediate results in the pre-allocated cache. This is a zero-allocation operation when used with properly pre-allocated caches.
Arguments
cache::ForwardPassCache: Pre-allocated cache containing input data and space for intermediate results
model::Model: The neural network model to evaluate
Returns
- The cache object (for convenience), with updated intermediate and output values
Examples
# Create model and data
model = chain(Static(4), Dense(8, activation_fn=relu), Dense(1))
inputs = randn(Float32, 4, 32) # 32 samples, 4 features each
# Pre-allocate cache and set inputs
cache = preallocate(model, 32)
set_inputs!(cache, inputs)
# Execute forward pass
forward!(cache, model)
# Get outputs
outputs = get_outputs(cache)
Notes
- Requires a pre-allocated cache from preallocate(model, batch_size)
- Input data must be set using set_inputs!(cache, inputs) before calling
- This is a mutating operation that modifies the cache in-place
- Designed for zero allocations when properly used
- Works on both CPU and GPU when model and data are on the same device
See also: preallocate, set_inputs!, get_outputs
SimpleNNs.get_loss — Method
get_loss(model::Model)
Returns the loss layer if the model has one, otherwise returns nothing.
If the model has a loss layer (a layer extending AbstractTargetsLayer) as its final layer, this function returns that layer. Otherwise, it returns nothing.
Returns
- The loss layer if present
- nothing if the model has no loss layer
Examples
model = chain(Static(10), Dense(5, activation_fn=relu))
get_loss(model) # nothing
model_with_loss = add_loss(model, BatchCrossEntropyLoss(targets=zeros(Int, 32), num_classes=5))
loss_layer = get_loss(model_with_loss) # Returns the BatchCrossEntropyLoss layer
See also has_loss, add_loss, remove_loss.
SimpleNNs.get_outputs — Method
get_outputs(cache::ForwardPassCache)
Gets the last output from the forward pass buffer.
SimpleNNs.get_predictions — Method
get_predictions(model::Model, forward_cache)
Extract predictions from the forward cache based on whether the model has a loss layer.
If the model does not have a loss layer, returns the final output from the cache. If the model has a loss layer, returns the input to the loss layer (i.e., the output of the second-to-last layer).
Arguments
model::Model: The model that was used for the forward pass
forward_cache: The forward cache containing layer outputs
Returns
An array containing the model's predictions (before the loss computation if applicable).
Examples
model = chain(Static(10), Dense(5, activation_fn=identity))
forward_cache = preallocate(model, batch_size)
set_inputs!(forward_cache, inputs)
forward!(forward_cache, model)
predictions = get_predictions(model, forward_cache) # Returns final layer output
# With loss layer
model_with_loss = add_loss(model, loss_layer)
forward_cache_with_loss = preallocate(model_with_loss, batch_size)
set_inputs!(forward_cache_with_loss, inputs)
forward!(forward_cache_with_loss, model_with_loss)
predictions = get_predictions(model_with_loss, forward_cache_with_loss) # Returns output before loss
See also add_loss, remove_loss, has_loss, get_outputs.
SimpleNNs.gpu — Method
gpu(x)
Move data or models to GPU using CUDA. This function requires CUDA.jl, cuDNN.jl, and NNlib.jl to be loaded before use.
Arguments
x: The object to move to GPU. Can be a Model, AbstractArray, or other supported types.
Returns
- GPU version of the input object
Examples
using CUDA, cuDNN, NNlib
using SimpleNNs
# Move model to GPU
model = chain(Static(10), Dense(5))
gpu_model = gpu(model)
# Move array to GPU
cpu_array = randn(Float32, 10, 32)
gpu_array = gpu(cpu_array)
Notes
- Requires NVIDIA GPU with CUDA support
- CUDA.jl, cuDNN.jl, and NNlib.jl must be loaded before calling this function
- For models, creates a new model with parameters on GPU
- For arrays, converts to CuArray
- Returns input unchanged with warning for unsupported types
SimpleNNs.gradients — Method
gradients(cache::BackpropagationCache)
Extracts the gradient array from the backwards pass buffer, filled from use of the backprop! function.
SimpleNNs.has_loss — Method
has_loss(model::Model)
Check whether the model has a loss layer (a layer extending AbstractTargetsLayer) as its final layer.
Returns true if the last layer is a loss layer, false otherwise.
Examples
model = chain(Static(10), Dense(5, activation_fn=relu))
has_loss(model) # false
model_with_loss = add_loss(model, BatchCrossEntropyLoss(targets=zeros(Int, 32), num_classes=5))
has_loss(model_with_loss) # true
See also add_loss, remove_loss, get_predictions.
SimpleNNs.initialise! — Method
initialise!(model::Model)
Initialise the parameters of a model according to each layer's initialisation scheme.
This function walks through all layers in the model and initialises their weights and biases according to the initialisation method specified in each layer's init field.
Examples
model = chain(
Static(10),
Dense(64, activation_fn=relu, init=HeNormal()),
Dense(10, activation_fn=identity, init=GlorotNormal())
)
initialise!(model)
See also: GlorotUniform, GlorotNormal, HeNormal, HeUniform, LeCunNormal
SimpleNNs.parameters — Method
parameters(model::Model)
Returns the array used to store the parameters of the model.
Modifying this array will change the parameters of the model.
SimpleNNs.preallocate — Method
preallocate(model::Model, batch_size::Integer)
Creates a buffer to store the intermediate layer outputs of a forward pass, along with the input.
The inputs can be set using set_inputs! and the outputs can be retrieved using get_outputs.
SimpleNNs.preallocate_grads — Method
preallocate_grads(model::Model, batch_size::Integer)
Creates a buffer to store the intermediate arrays needed for backpropagation.
The gradients can be retrieved from the buffer using gradients on the buffer.
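A brief sketch of allocating the backward-pass buffer alongside a model (the model and batch size here are illustrative):

```julia
using SimpleNNs

model = chain(Static(4), Dense(8, activation_fn = relu), Dense(1))
batch_size = 32

grad_cache = preallocate_grads(model, batch_size)
grads = gradients(grad_cache)  # flat gradient array, filled by backprop!
```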
SimpleNNs.pullback! — Method
pullback!(input_partials, output_partials, layer)
Completes the backpropagation of the partial gradients to the inputs of the current layer. This should be called after backprop!. This method fills the input_partials buffer with partial gradients calculated via the chain rule from the partial gradients of this layer's output.
SimpleNNs.relu — Method
relu(x)
Rectified linear unit activation function.
Computes max(0, x)
Arguments
x: Input value
Returns
Output in range [0, ∞)
SimpleNNs.remove_loss — Method
remove_loss(model::Model)
Create a new model with the loss layer removed from the end, if one exists.
This function reconstructs the model chain without the final loss layer. The original model's parameters are copied to the new model.
Arguments
model::Model: The model to remove the loss layer from
Returns
A new Model without the loss layer. If the model doesn't have a loss layer, returns the original model unchanged.
Examples
model_with_loss = chain(
Static(10),
Dense(5, activation_fn=identity),
BatchCrossEntropyLoss(targets=zeros(Int, 32), num_classes=5)
)
model = remove_loss(model_with_loss)
has_loss(model) # false
See also add_loss, has_loss, get_predictions.
SimpleNNs.reset! — Method
reset!(opt::AbstractOptimiser)
Reset the internal state of the optimiser to its initial values.
SimpleNNs.set_inputs! — Method
set_inputs!(cache::ForwardPassCache, inputs)
Sets the input array in the forward pass cache.
SimpleNNs.sigmoid — Method
sigmoid(x)
Logistic sigmoid activation function.
Computes the sigmoid function: 1 / (1 + exp(-x)).
Arguments
x: Input value
Returns
Output in range (0, 1)
SimpleNNs.tanh_fast — Method
tanh_fast(x)
Fast hyperbolic tangent activation function.
Computes an optimised version of the hyperbolic tangent function. This may use approximations for better performance compared to the standard tanh for Float32 and Float64.
Arguments
x: Input scalar
Returns
Output in range (-1, 1)
SimpleNNs.update! — Method
update!(parameters, gradients, opt::AbstractOptimiser)
Update the parameters using the provided gradients and optimiser.
Arguments
parameters: Model parameters to be updated
gradients: Gradients computed from the loss function
opt (AbstractOptimiser): The optimiser instance containing update rules
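A sketch of a single optimisation step combining the documented pieces (the backward pass that fills the gradient array is elided here):

```julia
using SimpleNNs

model = chain(Static(4), Dense(8, activation_fn = relu), Dense(1))
params = parameters(model)

grad_cache = preallocate_grads(model, 32)
grads = gradients(grad_cache)

opt = SGDOptimiser(grads; lr = 0.01f0, momentum = 0.9f0)

# ... run the forward and backward passes to fill `grads` ...
update!(params, grads, opt)  # one in-place parameter update
```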