API
SimpleNNs.AdamOptimiser — Method
AdamOptimiser(gradients::AbstractArray{T}; lr = Float32(1e-3), beta_1 = 0.9f0, beta_2 = 0.999f0) where {T}
Create an Adam optimiser for gradient-based parameter updates.
Arguments
gradients (AbstractArray{T}): Template array matching the shape of the gradients to be optimised
Keyword Arguments
lr (T): Learning rate (default: 1e-3)
beta_1 (T): Exponential decay rate for first moment estimates (default: 0.9)
beta_2 (T): Exponential decay rate for second moment estimates (default: 0.999)
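A minimal usage sketch (the `params` and `grads` arrays here are placeholders for illustration; `update!` is documented further below):

```julia
using SimpleNNs

# Hypothetical flat parameter and gradient vectors
params = randn(Float32, 100)
grads = similar(params)

opt = AdamOptimiser(grads; lr = 1f-3)

# After filling `grads` via backpropagation, apply one Adam step:
update!(params, grads, opt)
```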
SimpleNNs.Conv — Method
Conv(kernel_size::NTuple{N, Int}, out_channels::Int; kwargs...)
A convolutional layer with a given kernel size and specified number of output channels.
This can automatically infer the number of input channels based on the preceding layers.
Keyword Arguments
use_bias (default: Val(true)) - Whether or not to add a bias vector to the output. Wrapped in a Val for optimisation.
activation_fn (default: identity) - A custom activation function. Note that not all functions are supported by backpropagation.
parameter_type (default: Val(Float32)) - The datatype to use for the parameters, wrapped in a Val type.
init (default: GlorotNormal()) - Weight initialisation scheme. See Initialiser for available options.
SimpleNNs.Dense — Method
Dense(outputs::Integer; kwargs...)
A representation of a dense layer. By default this can be constructed by specifying the desired number of outputs. The input size can be inferred from the rest of the chain when constructing a model.
Keyword Arguments
use_bias (default: Val(true)) - Whether or not to add a bias vector to the output. Wrapped in a Val for optimisation.
activation_fn (default: identity) - A custom activation function. Note that not all functions are supported by backpropagation.
parameter_type (default: Val(Float32)) - The datatype to use for the parameters, wrapped in a Val type.
inputs (default: Infer()) - Specify the number of inputs, or infer them from the rest of the model.
init (default: GlorotNormal()) - Weight initialisation scheme. See Initialiser for available options.
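A short sketch of a Dense layer used inside chain, where the input size is inferred from the previous layer:

```julia
using SimpleNNs

model = chain(
    Static(8),                           # 8 input features
    Dense(16, activation_fn = relu),     # inputs inferred as 8
    Dense(1, activation_fn = identity),  # inputs inferred as 16
)
```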
SimpleNNs.Flatten — Type
Flatten()
Flatten the dimensions of the preceding layer, leaving the batch dimension unaffected. The output should be (k x n) where k is the product of the non-batch dimensions of the previous layer.
SimpleNNs.GlorotNormal — Type
GlorotNormal()
Glorot normal initialisation (also called Xavier normal). Samples weights from a normal distribution with mean 0 and standard deviation √(2 / (fan_in + fan_out)).
Best suited for layers with sigmoid or tanh activations.
SimpleNNs.GlorotUniform — Type
GlorotUniform()
Glorot uniform initialisation (also called Xavier uniform). Samples weights from a uniform distribution in the range [-limit, limit] where limit = √(6 / (fan_in + fan_out)).
Best suited for layers with sigmoid or tanh activations.
SimpleNNs.HeNormal — Type
HeNormal()
He normal initialisation (also called Kaiming normal). Samples weights from a normal distribution with mean 0 and standard deviation √(2 / fan_in).
Best suited for layers with ReLU activations.
SimpleNNs.HeUniform — Type
HeUniform()
He uniform initialisation (also called Kaiming uniform). Samples weights from a uniform distribution in the range [-limit, limit] where limit = √(6 / fan_in).
Best suited for layers with ReLU activations.
SimpleNNs.Initialiser — Type
Abstract type for weight initialisation strategies.
SimpleNNs.LeCunNormal — Type
LeCunNormal()
LeCun normal initialisation. Samples weights from a normal distribution with mean 0 and standard deviation √(1 / fan_in).
Best suited for layers with SELU activations.
SimpleNNs.LogitCrossEntropyLoss — Type
LogitCrossEntropyLoss(targets, num_classes::Int)
Expects the targets in a single vector containing class labels, which must be between 1 and num_classes inclusive.
SimpleNNs.MSELoss — Type
MSELoss(targets)
Expects the targets in the form (K x N) where K is the output dimension (usually 1) and N is the batch size.
For efficiency, this is just ∑(y - ŷ)² and NOT scaled by a half.
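A sketch of attaching an MSE loss to a small regression model via add_loss (documented further below); the target values here are placeholders:

```julia
using SimpleNNs

targets = zeros(Float32, 1, 32)  # (K x N): one output, batch of 32
model = chain(Static(4), Dense(8, activation_fn = relu), Dense(1))
model_with_loss = add_loss(model, MSELoss(targets))
```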
SimpleNNs.MaxPool — Method
MaxPool(pool_size::NTuple{N, Int}; kwargs...)
A convolutional max-pool layer with a given kernel size.
The necessary input sizes can be inferred automatically from the rest of the chain.
SimpleNNs.RMSPropOptimiser — Type
RMSPropOptimiser{T, X<:AbstractArray{T}} <: AbstractOptimiser
RMSProp optimiser with exponential moving average of squared gradients.
Fields
lr::T: Learning rate
rho::T: Exponential decay rate for moving average
eps::T: Small constant for numerical stability
v::X: Exponential moving average of squared gradients
SimpleNNs.RMSPropOptimiser — Method
RMSPropOptimiser(gradients::AbstractArray{T}; lr = Float32(1e-3), rho = 0.9f0, eps = Float32(1e-8)) where {T}
Create an RMSProp optimiser for gradient-based parameter updates.
RMSProp maintains a moving average of squared gradients to adaptively scale the learning rate.
Arguments
gradients (AbstractArray{T}): Template array matching the shape of the gradients to be optimised
Keyword Arguments
lr (T): Learning rate (default: 1e-3)
rho (T): Exponential decay rate for moving average of squared gradients (default: 0.9)
eps (T): Small constant added to denominator for numerical stability (default: 1e-8)
Examples
opt = RMSPropOptimiser(gradients; lr=0.001f0, rho=0.9f0)
SimpleNNs.SGDOptimiser — Type
SGDOptimiser{T} <: AbstractOptimiser
Stochastic Gradient Descent optimiser with optional momentum.
Fields
lr::T: Learning rate
momentum::T: Momentum coefficient (0.0 for no momentum)
velocity::AbstractArray{T}: Velocity buffer for momentum (internal state)
SimpleNNs.SGDOptimiser — Method
SGDOptimiser(gradients::AbstractArray{T}; lr = Float32(1e-3), momentum = 0.0f0) where {T}
Create an SGD optimiser for gradient-based parameter updates.
Arguments
gradients (AbstractArray{T}): Template array matching the shape of the gradients to be optimised
Keyword Arguments
lr (T): Learning rate (default: 1e-3)
momentum (T): Momentum coefficient, 0.0 for standard SGD (default: 0.0)
Examples
# Standard SGD
opt = SGDOptimiser(gradients; lr=0.01f0)
# SGD with momentum
opt = SGDOptimiser(gradients; lr=0.01f0, momentum=0.9f0)
SimpleNNs.Static — Method
Static(inputs::Union{Int, NTuple}; kwargs...)
Used for specifying the input type to a neural network. inputs should be a single integer for a dense network, representing the number of features. For an image network, inputs can be a tuple specifying the size of the images in the form (WIDTH, HEIGHT, CHANNELS).
SimpleNNs.Zeros — Type
Zeros()
Initialise all weights to zero. Note: This is generally not recommended for training as it breaks symmetry.
Base.deepcopy — Method
Base.deepcopy(model::Model)
Create a deep copy of the model with its own independent parameter array.
This function creates a new model with:
- A new parameter array (using similar and copyto!)
- New parameter views for each layer pointing to the new array
- The same layer structure and configuration
The copied model is completely independent from the original - modifying parameters in one will not affect the other.
Arguments
model::Model: The model to copy
Returns
A new Model with copied parameters and structure.
Examples
model = chain(Static(10), Dense(5))
model_copy = deepcopy(model)
# Modify copy - original unchanged
parameters(model_copy) .= 0.0
See also parameters, chain.
SimpleNNs.activation_gradient_fn — Method
Derivatives are used to backpropagate the gradients of the layer outputs back to the activations of that layer. To save space, these are calculated exclusively from the outputs of the layer. Instead of functions written as dy/dx = f(x), we instead write dy/dx = g(y). This can be done for the 3 major functions.
Whenever y is used below, assume this is a function of the output, not the input.
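As a mathematical illustration (these are standard identities, not the package's internal code), the output-based forms g(y) for the common activations are:

```julia
# Derivatives expressed in terms of the layer output y = f(x):
sigmoid_deriv(y) = y * (1 - y)                     # f(x) = 1 / (1 + exp(-x))
tanh_deriv(y)    = 1 - y^2                         # f(x) = tanh(x)
relu_deriv(y)    = ifelse(y > 0, one(y), zero(y))  # f(x) = max(0, x)
```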
SimpleNNs.add_loss — Method
add_loss(model::Model, loss_layer::AbstractTargetsLayer)
Create a new model with the given loss layer appended to the end of the existing model.
This function reconstructs the entire model chain with the loss layer added as the final layer. The original model's parameters are copied to the new model.
Arguments
model::Model: The existing model to extend
loss_layer::AbstractTargetsLayer: The loss layer to append (e.g., BatchCrossEntropyLoss)
Returns
A new Model with the loss layer appended.
Examples
model = chain(
Static(10),
Dense(32, activation_fn=relu),
Dense(5, activation_fn=identity)
)
# Add a loss layer
targets = zeros(Int, batch_size)
loss_layer = BatchCrossEntropyLoss(targets=targets, num_classes=5)
model_with_loss = add_loss(model, loss_layer)
See also remove_loss, has_loss, get_predictions.
SimpleNNs.backprop! — Method
backprop!(partials_buffer, gradient_buffer, inputs, outputs, layer)
Backpropagates the partial gradients of the outputs of the current layer into the parameters of the current layer. partials_buffer is used as a buffer for the gradients of the output of this layer. gradient_buffer should be filled with the gradients of the parameters of the current layer, using the chain rule. inputs is the array fed into the layer and outputs is the output of this layer in the forward pass. layer is the struct containing information about the layer.
SimpleNNs.chain — Method
chain(layers...)
Combines the given layer definitions into a single model and propagates the layer sizes through the network.
The first layer must always be a Static layer which specifies the feature size. If this is a simple fully connected network, then the first layer should be Static(nf) where nf is the number of features in your input matrix. Do not specify the batch size in this static input.
The default datatype for most layers is Float32, but this may be changed. The parameters of the entire model must be of the same datatype. This function will create a flat parameter vector for the model which can be accessed using the parameters function.
Examples
A simple dense, fully-connected, neural network which has 3 input features:
model = chain(
Static(3),
Dense(10, activation_fn=tanh),
Dense(10, activation_fn=sigmoid),
Dense(1, activation_fn=identity),
);
An example convolutional neural network:
# Image size is (WIDTH, HEIGHT, CHANNELS)
img_size = (28, 28, 1)
model = chain(
Static(img_size),
Conv((5,5), 16; activation_fn=relu),
MaxPool((2,2)),
Conv((3,3), 8; activation_fn=relu),
MaxPool((4,4)),
Flatten(),
Dense(10, activation_fn=identity)
)
See also Static, Dense, Conv, MaxPool, Flatten and preallocate.
SimpleNNs.forward! — Method
forward!(cache::ForwardPassCache, model::Model)
Execute a forward pass through the neural network model.
This function computes the forward propagation through all layers of the model, storing intermediate results in the pre-allocated cache. This is a zero-allocation operation when used with properly pre-allocated caches.
Arguments
cache::ForwardPassCache: Pre-allocated cache containing input data and space for intermediate results
model::Model: The neural network model to evaluate
Returns
- The cache object (for convenience), with updated intermediate and output values
Examples
# Create model and data
model = chain(Static(4), Dense(8, activation_fn=relu), Dense(1))
inputs = randn(Float32, 4, 32) # 32 samples, 4 features each
# Pre-allocate cache and set inputs
cache = preallocate(model, 32)
set_inputs!(cache, inputs)
# Execute forward pass
forward!(cache, model)
# Get outputs
outputs = get_outputs(cache)
Notes
- Requires a pre-allocated cache from preallocate(model, batch_size)
- Input data must be set using set_inputs!(cache, inputs) before calling
- This is a mutating operation that modifies the cache in-place
- Designed for zero allocations when properly used
- Works on both CPU and GPU when model and data are on the same device
See also: preallocate, set_inputs!, get_outputs
SimpleNNs.get_loss — Method
get_loss(model::Model)
Returns the loss layer if the model has one, otherwise returns nothing.
If the model has a loss layer (a layer extending AbstractTargetsLayer) as its final layer, this function returns that layer. Otherwise, it returns nothing.
Returns
- The loss layer if present
- nothing if the model has no loss layer
Examples
model = chain(Static(10), Dense(5, activation_fn=relu))
get_loss(model) # nothing
model_with_loss = add_loss(model, BatchCrossEntropyLoss(targets=zeros(Int, 32), num_classes=5))
loss_layer = get_loss(model_with_loss) # Returns the BatchCrossEntropyLoss layer
See also has_loss, add_loss, remove_loss.
SimpleNNs.get_outputs — Method
get_outputs(cache::ForwardPassCache)
Gets the last output from the forward pass buffer.
SimpleNNs.get_predictions — Method
get_predictions(model::Model, forward_cache)
Extract predictions from the forward cache based on whether the model has a loss layer.
If the model does not have a loss layer, returns the final output from the cache. If the model has a loss layer, returns the input to the loss layer (i.e., the output of the second-to-last layer).
Arguments
model::Model: The model that was used for the forward pass
forward_cache: The forward cache containing layer outputs
Returns
An array containing the model's predictions (before the loss computation if applicable).
Examples
model = chain(Static(10), Dense(5, activation_fn=identity))
forward_cache = preallocate(model, batch_size)
set_inputs!(forward_cache, inputs)
forward!(forward_cache, model)
predictions = get_predictions(model, forward_cache) # Returns final layer output
# With loss layer
model_with_loss = add_loss(model, loss_layer)
forward_cache_with_loss = preallocate(model_with_loss, batch_size)
set_inputs!(forward_cache_with_loss, inputs)
forward!(forward_cache_with_loss, model_with_loss)
predictions = get_predictions(model_with_loss, forward_cache_with_loss) # Returns output before loss
See also add_loss, remove_loss, has_loss, get_outputs.
SimpleNNs.gpu — Method
gpu(x)
Move data or models to GPU using CUDA. This function requires CUDA.jl, cuDNN.jl, and NNlib.jl to be loaded before use.
Arguments
x: The object to move to GPU. Can be a Model, AbstractArray, or other supported types.
Returns
- GPU version of the input object
Examples
using CUDA, cuDNN, NNlib
using SimpleNNs
# Move model to GPU
model = chain(Static(10), Dense(5))
gpu_model = gpu(model)
# Move array to GPU
cpu_array = randn(Float32, 10, 32)
gpu_array = gpu(cpu_array)
Notes
- Requires NVIDIA GPU with CUDA support
- CUDA.jl, cuDNN.jl, and NNlib.jl must be loaded before calling this function
- For models, creates a new model with parameters on GPU
- For arrays, converts to CuArray
- Returns input unchanged with warning for unsupported types
SimpleNNs.gradients — Method
gradients(cache::BackpropagationCache)
Extracts the gradient array from the backwards pass buffer, filled from use of the backprop! function.
SimpleNNs.has_loss — Method
has_loss(model::Model)
Check whether the model has a loss layer (a layer extending AbstractTargetsLayer) as its final layer.
Returns true if the last layer is a loss layer, false otherwise.
Examples
model = chain(Static(10), Dense(5, activation_fn=relu))
has_loss(model) # false
model_with_loss = add_loss(model, BatchCrossEntropyLoss(targets=zeros(Int, 32), num_classes=5))
has_loss(model_with_loss) # true
See also add_loss, remove_loss, get_predictions.
SimpleNNs.initialise! — Method
initialise!(model::Model)
Initialise the parameters of a model according to each layer's initialisation scheme.
This function walks through all layers in the model and initialises their weights and biases according to the initialisation method specified in each layer's init field.
Examples
model = chain(
Static(10),
Dense(64, activation_fn=relu, init=HeNormal()),
Dense(10, activation_fn=identity, init=GlorotNormal())
)
initialise!(model)
See also: GlorotUniform, GlorotNormal, HeNormal, HeUniform, LeCunNormal
SimpleNNs.parameters — Method
parameters(model::Model)
Returns the array used to store the parameters of the model.
Modifying this array will change the parameters of the model.
SimpleNNs.preallocate — Method
preallocate(model::Model, batch_size::Integer)
Creates a buffer to store the intermediate layer outputs of a forward pass, along with the input.
The inputs can be set using set_inputs! and the outputs can be retrieved using get_outputs.
SimpleNNs.preallocate_grads — Method
preallocate_grads(model::Model, batch_size::Integer)
Creates a buffer to store the intermediate arrays needed for backpropagation.
The gradients can be retrieved from the buffer using gradients on the buffer.
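A brief sketch of allocating the backward-pass buffer alongside a model (the model and batch size here are illustrative):

```julia
using SimpleNNs

model = chain(Static(4), Dense(8, activation_fn = relu), Dense(1))
batch_size = 32

grad_cache = preallocate_grads(model, batch_size)
grads = gradients(grad_cache)  # flat gradient array, filled by backprop!
```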
SimpleNNs.pullback! — Method
pullback!(input_partials, output_partials, layer)
Completes the backpropagation of the partial gradients to the inputs of the current layer. This should be called after backprop!. This method fills the input_partials buffer with partial gradients calculated via the chain rule from the partial gradients of this layer's output.
SimpleNNs.relu — Method
relu(x)
Rectified linear unit activation function.
Computes max(0, x)
Arguments
x: Input value
Returns
Output in range [0, ∞)
SimpleNNs.remove_loss — Method
remove_loss(model::Model)
Create a new model with the loss layer removed from the end, if one exists.
This function reconstructs the model chain without the final loss layer. The original model's parameters are copied to the new model.
Arguments
model::Model: The model to remove the loss layer from
Returns
A new Model without the loss layer. If the model doesn't have a loss layer, returns the original model unchanged.
Examples
model_with_loss = chain(
Static(10),
Dense(5, activation_fn=identity),
BatchCrossEntropyLoss(targets=zeros(Int, 32), num_classes=5)
)
model = remove_loss(model_with_loss)
has_loss(model) # false
See also add_loss, has_loss, get_predictions.
SimpleNNs.reset! — Method
reset!(opt::AbstractOptimiser)
Reset the internal state of the optimiser to its initial values.
SimpleNNs.set_inputs! — Method
set_inputs!(cache::ForwardPassCache, inputs)
Sets the input array in the forward pass cache.
SimpleNNs.sigmoid — Method
sigmoid(x)
Logistic sigmoid activation function.
Computes the sigmoid function: 1 / (1 + exp(-x)).
Arguments
x: Input value
Returns
Output in range (0, 1)
SimpleNNs.tanh_fast — Method
tanh_fast(x)
Fast hyperbolic tangent activation function.
Computes an optimised version of the hyperbolic tangent function. This may use approximations for better performance compared to the standard tanh for Float32 and Float64.
Arguments
x: Input scalar
Returns
Output in range (-1, 1)
SimpleNNs.update! — Method
update!(parameters, gradients, opt::AbstractOptimiser)
Update the parameters using the provided gradients and optimiser.
Arguments
parameters: Model parameters to be updated
gradients: Gradients computed from the loss function
opt (AbstractOptimiser): The optimiser instance containing update rules
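A sketch of a single optimisation step combining the documented pieces (the backward pass that fills the gradient array is elided here):

```julia
using SimpleNNs

model = chain(Static(4), Dense(8, activation_fn = relu), Dense(1))
params = parameters(model)

grad_cache = preallocate_grads(model, 32)
grads = gradients(grad_cache)

opt = SGDOptimiser(grads; lr = 0.01f0, momentum = 0.9f0)

# ... run the forward and backward passes to fill `grads` ...
update!(params, grads, opt)  # one in-place parameter update
```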