API

SimpleNNs.AdamOptimiserMethod
AdamOptimiser(gradients::AbstractArray{T}; lr = Float32(1e-3), beta_1 = 0.9f0, beta_2 = 0.999f0) where {T}

Create an Adam optimiser for gradient-based parameter updates.

Arguments

  • gradients (AbstractArray{T}): Template array matching the shape of gradients to be optimised

Keyword Arguments

  • lr (T): Learning rate (default: 1e-3)
  • beta_1 (T): Exponential decay rate for first moment estimates (default: 0.9)
  • beta_2 (T): Exponential decay rate for second moment estimates (default: 0.999)
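
Examples

A minimal sketch, mirroring the RMSProp and SGD examples below; gradients is assumed to be a pre-allocated gradient buffer (e.g. retrieved via gradients from a backpropagation cache):

opt = AdamOptimiser(gradients; lr=0.001f0, beta_1=0.9f0, beta_2=0.999f0)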
source
SimpleNNs.ConvMethod
Conv(kernel_size::NTuple{N, Int}, out_channels::Int; kwargs...)

A convolutional layer with a given kernel size and specified number of output channels.

This can automatically infer the number of input channels based on the preceding layers.

Keyword Arguments

  • use_bias (default: Val(true)) - Whether or not to add a bias vector to the output. Wrapped in a Val for optimisation.
  • activation_fn (default: identity) - A custom activation function. Note that not all functions are supported by backpropagation.
  • parameter_type (default: Val(Float32)) - The datatype to use for the parameters, wrapped in a Val type.
  • init (default: GlorotNormal()) - Weight initialisation scheme. See Initialiser for available options.
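
Examples

A brief sketch of a 3×3 convolution with 8 output channels (the input channels are inferred from the preceding layer):

Conv((3, 3), 8; activation_fn=relu, init=HeNormal())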
source
SimpleNNs.DenseMethod
Dense(outputs::Integer; kwargs...)

A representation of a dense layer. By default this can be constructed by specifying the desired number of outputs. The input size can be inferred from the rest of the chain when constructing a model.

Keyword Arguments

  • use_bias (default: Val(true)) - Whether or not to add a bias vector to the output. Wrapped in a Val for optimisation.
  • activation_fn (default: identity) - A custom activation function. Note that not all functions are supported by backpropagation.
  • parameter_type (default: Val(Float32)) - The datatype to use for the parameters, wrapped in a Val type.
  • inputs (default: Infer()) - Specify the number of inputs, or infer them from the rest of the model.
  • init (default: GlorotNormal()) - Weight initialisation scheme. See Initialiser for available options.
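
Examples

For instance, a 64-unit hidden layer with He initialisation, as also used in the initialise! example below:

Dense(64, activation_fn=relu, init=HeNormal())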
source
SimpleNNs.FlattenType
Flatten()

Flatten the dimensions of the preceding layer, leaving the batch dimension unaffected. The output has shape (k x n), where k is the product of the non-batch dimensions of the previous layer and n is the batch size.

source
SimpleNNs.GlorotNormalType
GlorotNormal()

Glorot normal initialisation (also called Xavier normal). Samples weights from a normal distribution with mean 0 and standard deviation √(2 / (fan_in + fan_out)).

Best suited for layers with sigmoid or tanh activations.

source
SimpleNNs.GlorotUniformType
GlorotUniform()

Glorot uniform initialisation (also called Xavier uniform). Samples weights from a uniform distribution in the range [-limit, limit] where limit = √(6 / (fan_in + fan_out)).

Best suited for layers with sigmoid or tanh activations.

source
SimpleNNs.HeNormalType
HeNormal()

He normal initialisation (also called Kaiming normal). Samples weights from a normal distribution with mean 0 and standard deviation √(2 / fan_in).

Best suited for layers with ReLU activations.

source
SimpleNNs.HeUniformType
HeUniform()

He uniform initialisation (also called Kaiming uniform). Samples weights from a uniform distribution in the range [-limit, limit] where limit = √(6 / fan_in).

Best suited for layers with ReLU activations.

source
SimpleNNs.LeCunNormalType
LeCunNormal()

LeCun normal initialisation. Samples weights from a normal distribution with mean 0 and standard deviation √(1 / fan_in).

Best suited for layers with SELU activations.

source
SimpleNNs.LogitCrossEntropyLossType
LogitCrossEntropyLoss(targets, num_classes::Int)

Expects the targets as a single vector containing class labels, which must be between 1 and num_classes inclusive.
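
Examples

A minimal sketch with a hypothetical batch_size and 10 classes:

batch_size = 32
targets = rand(1:10, batch_size)  # integer class labels in 1:10
loss_layer = LogitCrossEntropyLoss(targets, 10)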

source
SimpleNNs.MSELossType
MSELoss(targets)

Expects the targets in the form (K x N) where K is the output dimension (usually 1) and N is the batch size.

For efficiency, this is just ∑(y − ŷ)² and is NOT scaled by a half.
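
Examples

A minimal sketch for a single-output model (K = 1) with a hypothetical batch_size:

batch_size = 32
targets = randn(Float32, 1, batch_size)
loss_layer = MSELoss(targets)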

source
SimpleNNs.MaxPoolMethod
MaxPool(pool_size::NTuple{N, Int}; kwargs...)

A max-pooling layer with a given pool size.

The necessary input sizes are inferred automatically from the preceding layers.

source
SimpleNNs.RMSPropOptimiserType
RMSPropOptimiser{T, X<:AbstractArray{T}} <: AbstractOptimiser

RMSProp optimiser with exponential moving average of squared gradients.

Fields

  • lr::T: Learning rate
  • rho::T: Exponential decay rate for moving average
  • eps::T: Small constant for numerical stability
  • v::X: Exponential moving average of squared gradients
source
SimpleNNs.RMSPropOptimiserMethod
RMSPropOptimiser(gradients::AbstractArray{T}; lr = Float32(1e-3), rho = 0.9f0, eps = Float32(1e-8)) where {T}

Create an RMSProp optimiser for gradient-based parameter updates.

RMSProp maintains a moving average of squared gradients to adaptively scale the learning rate.

Arguments

  • gradients (AbstractArray{T}): Template array matching the shape of gradients to be optimised

Keyword Arguments

  • lr (T): Learning rate (default: 1e-3)
  • rho (T): Exponential decay rate for moving average of squared gradients (default: 0.9)
  • eps (T): Small constant added to denominator for numerical stability (default: 1e-8)

Examples

opt = RMSPropOptimiser(gradients; lr=0.001f0, rho=0.9f0)
source
SimpleNNs.SGDOptimiserType
SGDOptimiser{T} <: AbstractOptimiser

Stochastic Gradient Descent optimiser with optional momentum.

Fields

  • lr::T: Learning rate
  • momentum::T: Momentum coefficient (0.0 for no momentum)
  • velocity::AbstractArray{T}: Velocity buffer for momentum (internal state)
source
SimpleNNs.SGDOptimiserMethod
SGDOptimiser(gradients::AbstractArray{T}; lr = Float32(1e-3), momentum = 0.0f0) where {T}

Create an SGD optimiser for gradient-based parameter updates.

Arguments

  • gradients (AbstractArray{T}): Template array matching the shape of gradients to be optimised

Keyword Arguments

  • lr (T): Learning rate (default: 1e-3)
  • momentum (T): Momentum coefficient, 0.0 for standard SGD (default: 0.0)

Examples

# Standard SGD
opt = SGDOptimiser(gradients; lr=0.01f0)

# SGD with momentum
opt = SGDOptimiser(gradients; lr=0.01f0, momentum=0.9f0)
source
SimpleNNs.StaticMethod
Static(inputs::Union{Int, NTuple}; kwargs...)

Used for specifying the input type to a neural network. inputs should be a single integer for a dense network, representing the number of features. For an image network, inputs can be a tuple specifying the size of the images in the form (WIDTH, HEIGHT, CHANNELS).
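
Examples

Both forms follow directly from the description above:

Static(3)            # dense network with 3 input features
Static((28, 28, 1))  # 28×28 single-channel images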

source
SimpleNNs.ZerosType
Zeros()

Initialise all weights to zero. Note: This is generally not recommended for training as it breaks symmetry.

source
Base.deepcopyMethod
Base.deepcopy(model::Model)

Create a deep copy of the model with its own independent parameter array.

This function creates a new model with:

  • A new parameter array (using similar and copyto!)
  • New parameter views for each layer pointing to the new array
  • The same layer structure and configuration

The copied model is completely independent from the original - modifying parameters in one will not affect the other.

Arguments

  • model::Model: The model to copy

Returns

A new Model with copied parameters and structure.

Examples

model = chain(Static(10), Dense(5))
model_copy = deepcopy(model)

# Modify copy - original unchanged
parameters(model_copy) .= 0.0

See also parameters, chain.

source
SimpleNNs.activation_gradient_fnMethod

Derivatives are used to backpropagate the gradients of the layer outputs back to the activations of that layer. To save space, these are calculated exclusively from the outputs of the layer: instead of writing dy/dx = f(x), we write dy/dx = g(y). This can be done for the three major activation functions.

Whenever y is used below, it refers to the output of the layer, not the input.
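
For example, the standard output-based identities for these functions (a sketch of the idea; the helper names here are illustrative, not the package's API):

sigmoid_grad(y) = y * (1 - y)                  # y = sigmoid(x) ⇒ dy/dx = y(1 - y)
tanh_grad(y) = 1 - y^2                         # y = tanh(x)    ⇒ dy/dx = 1 - y²
relu_grad(y) = ifelse(y > 0, one(y), zero(y))  # y = relu(x)    ⇒ dy/dx = 1 if y > 0, else 0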

source
SimpleNNs.add_lossMethod
add_loss(model::Model, loss_layer::AbstractTargetsLayer)

Create a new model with the given loss layer appended to the end of the existing model.

This function reconstructs the entire model chain with the loss layer added as the final layer. The original model's parameters are copied to the new model.

Arguments

  • model::Model: The existing model to extend
  • loss_layer::AbstractTargetsLayer: The loss layer to append (e.g., BatchCrossEntropyLoss)

Returns

A new Model with the loss layer appended.

Examples

model = chain(
    Static(10),
    Dense(32, activation_fn=relu),
    Dense(5, activation_fn=identity)
)

# Add a loss layer
targets = zeros(Int, batch_size)
loss_layer = BatchCrossEntropyLoss(targets=targets, num_classes=5)
model_with_loss = add_loss(model, loss_layer)

See also remove_loss, has_loss, get_predictions.

source
SimpleNNs.backprop!Method
backprop!(partials_buffer, gradient_buffer, inputs, outputs, layer)

Backpropagates the partial gradients of the outputs of the current layer into the parameters of the current layer. partials_buffer holds the gradients of the output of this layer. gradient_buffer is filled with the gradients of the parameters of the current layer, computed via the chain rule. inputs is the array fed into the layer and outputs is the output of this layer from the forward pass. layer is the struct containing information about the layer.

source
SimpleNNs.chainMethod
chain(layers...)

Combines the given layer definitions into a single model and propagates the layer sizes through the network.

The first layer must always be a Static layer which specifies the feature size. If this is a simple fully connected network, then the first layer should be Static(nf) where nf is the number of features in your input matrix. Do not specify the batch size in this static input.

The default datatype for most layers is Float32, but this may be changed. The parameters of the entire model must be of the same datatype. This function will create a flat parameter vector for the model which can be accessed using the parameters function.

Examples

A simple dense, fully-connected, neural network which has 3 input features:

model = chain(
    Static(3),
    Dense(10, activation_fn=tanh),
    Dense(10, activation_fn=sigmoid),
    Dense(1, activation_fn=identity),
);

An example convolutional neural network:

# Image size is (WIDTH, HEIGHT, CHANNELS)
img_size = (28, 28, 1)
model = chain(
    Static(img_size),
    Conv((5,5), 16; activation_fn=relu),
    MaxPool((2,2)),
    Conv((3,3), 8; activation_fn=relu),
    MaxPool((4,4)),
    Flatten(),
    Dense(10, activation_fn=identity)
)

See also Static, Dense, Conv, MaxPool, Flatten and preallocate.

source
SimpleNNs.forward!Method
forward!(cache::ForwardPassCache, model::Model)

Execute a forward pass through the neural network model.

This function computes the forward propagation through all layers of the model, storing intermediate results in the pre-allocated cache. This is a zero-allocation operation when used with properly pre-allocated caches.

Arguments

  • cache::ForwardPassCache: Pre-allocated cache containing input data and space for intermediate results
  • model::Model: The neural network model to evaluate

Returns

  • The cache object (for convenience), with updated intermediate and output values

Examples

# Create model and data
model = chain(Static(4), Dense(8, activation_fn=relu), Dense(1))
inputs = randn(Float32, 4, 32)  # 32 samples, 4 features each

# Pre-allocate cache and set inputs
cache = preallocate(model, 32)
set_inputs!(cache, inputs)

# Execute forward pass
forward!(cache, model)

# Get outputs
outputs = get_outputs(cache)

Notes

  • Requires pre-allocated cache from preallocate(model, batch_size)
  • Input data must be set using set_inputs!(cache, inputs) before calling
  • This is a mutating operation that modifies the cache in-place
  • Designed for zero allocations when properly used
  • Works on both CPU and GPU when model and data are on the same device

See also: preallocate, set_inputs!, get_outputs

source
SimpleNNs.get_lossMethod
get_loss(model::Model)

Returns the loss layer if the model has one, otherwise returns nothing.

If the model has a loss layer (a layer extending AbstractTargetsLayer) as its final layer, this function returns that layer. Otherwise, it returns nothing.

Returns

  • The loss layer if present
  • nothing if the model has no loss layer

Examples

model = chain(Static(10), Dense(5, activation_fn=relu))
get_loss(model)  # nothing

model_with_loss = add_loss(model, BatchCrossEntropyLoss(targets=zeros(Int, 32), num_classes=5))
loss_layer = get_loss(model_with_loss)  # Returns the BatchCrossEntropyLoss layer

See also has_loss, add_loss, remove_loss.

source
SimpleNNs.get_predictionsMethod
get_predictions(model::Model, forward_cache)

Extract predictions from the forward cache based on whether the model has a loss layer.

If the model does not have a loss layer, returns the final output from the cache. If the model has a loss layer, returns the input to the loss layer (i.e., the output of the second-to-last layer).

Arguments

  • model::Model: The model that was used for the forward pass
  • forward_cache: The forward cache containing layer outputs

Returns

An array containing the model's predictions (before the loss computation if applicable).

Examples

model = chain(Static(10), Dense(5, activation_fn=identity))
forward_cache = preallocate(model, batch_size)
set_inputs!(forward_cache, inputs)
forward!(forward_cache, model)

predictions = get_predictions(model, forward_cache)  # Returns final layer output

# With loss layer
model_with_loss = add_loss(model, loss_layer)
forward_cache_with_loss = preallocate(model_with_loss, batch_size)
set_inputs!(forward_cache_with_loss, inputs)
forward!(forward_cache_with_loss, model_with_loss)

predictions = get_predictions(model_with_loss, forward_cache_with_loss)  # Returns output before loss

See also add_loss, remove_loss, has_loss, get_outputs.

source
SimpleNNs.gpuMethod
gpu(x)

Move data or models to GPU using CUDA. This function requires CUDA.jl, cuDNN.jl, and NNlib.jl to be loaded before use.

Arguments

  • x: The object to move to GPU. Can be a Model, AbstractArray, or other supported types.

Returns

  • GPU version of the input object

Examples

using CUDA, cuDNN, NNlib
using SimpleNNs

# Move model to GPU
model = chain(Static(10), Dense(5))
gpu_model = gpu(model)

# Move array to GPU  
cpu_array = randn(Float32, 10, 32)
gpu_array = gpu(cpu_array)

Notes

  • Requires NVIDIA GPU with CUDA support
  • CUDA.jl, cuDNN.jl, and NNlib.jl must be loaded before calling this function
  • For models, creates a new model with parameters on GPU
  • For arrays, converts to CuArray
  • Returns input unchanged with warning for unsupported types
source
SimpleNNs.gradientsMethod
gradients(cache::BackpropagationCache)

Extracts the gradient array from the backwards-pass buffer, which is filled by the backprop! function.

source
SimpleNNs.has_lossMethod
has_loss(model::Model)

Check whether the model has a loss layer (a layer extending AbstractTargetsLayer) as its final layer.

Returns true if the last layer is a loss layer, false otherwise.

Examples

model = chain(Static(10), Dense(5, activation_fn=relu))
has_loss(model)  # false

model_with_loss = add_loss(model, BatchCrossEntropyLoss(targets=zeros(Int, 32), num_classes=5))
has_loss(model_with_loss)  # true

See also add_loss, remove_loss, get_predictions.

source
SimpleNNs.initialise!Method
initialise!(model::Model)

Initialise the parameters of a model according to each layer's initialisation scheme.

This function walks through all layers in the model and initialises their weights and biases according to the initialisation method specified in each layer's init field.

Examples

model = chain(
    Static(10),
    Dense(64, activation_fn=relu, init=HeNormal()),
    Dense(10, activation_fn=identity, init=GlorotNormal())
)
initialise!(model)

See also: GlorotUniform, GlorotNormal, HeNormal, HeUniform, LeCunNormal

source
SimpleNNs.parametersMethod
parameters(model::Model)

Returns the array used to store the parameters of the model.

Modifying this array will change the parameters of the model.
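
Examples

For example, since the returned array is the model's own storage:

model = chain(Static(3), Dense(1))
θ = parameters(model)
θ .= 0.0f0  # zeroes every weight and bias in the model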

source
SimpleNNs.preallocateMethod
preallocate(model::Model, batch_size::Integer)

Creates a buffer to store the intermediate layer outputs of a forward pass, along with the input.

The inputs can be set using set_inputs! and the outputs can be retrieved using get_outputs.
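
Examples

A typical round trip, mirroring the forward! example above:

model = chain(Static(4), Dense(8, activation_fn=relu), Dense(1))
cache = preallocate(model, 32)
set_inputs!(cache, randn(Float32, 4, 32))
forward!(cache, model)
outputs = get_outputs(cache)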

source
SimpleNNs.preallocate_gradsMethod
preallocate_grads(model::Model, batch_size::Integer)

Creates a buffer to store the intermediate arrays needed for backpropagation.

The gradients can be retrieved using gradients on the buffer.
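
Examples

A minimal sketch (the backwards pass itself is elided, and model is assumed to exist):

grads_cache = preallocate_grads(model, 32)
# ... run the backwards pass to fill the buffers ...
g = gradients(grads_cache)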

source
SimpleNNs.pullback!Method
pullback!(input_partials, output_partials, layer)

Here, we complete the backpropagation of the partial gradients to the inputs of the current layer. This should be called after backprop!. This method fills the input_partials buffer with partial gradients calculated via the chain rule from the partial gradients of this layer's outputs.

source
SimpleNNs.reluMethod
relu(x)

Rectified linear unit activation function.

Computes max(0, x).

Arguments

x: Input value

Returns

Output in range [0, ∞)
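
Examples

relu(-2.0f0)  # 0.0f0
relu(3.0f0)   # 3.0f0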

source
SimpleNNs.remove_lossMethod
remove_loss(model::Model)

Create a new model with the loss layer removed from the end, if one exists.

This function reconstructs the model chain without the final loss layer. The original model's parameters are copied to the new model.

Arguments

  • model::Model: The model to remove the loss layer from

Returns

A new Model without the loss layer. If the model doesn't have a loss layer, returns the original model unchanged.

Examples

model_with_loss = chain(
    Static(10),
    Dense(5, activation_fn=identity),
    BatchCrossEntropyLoss(targets=zeros(Int, 32), num_classes=5)
)

model = remove_loss(model_with_loss)
has_loss(model)  # false

See also add_loss, has_loss, get_predictions.

source
SimpleNNs.reset!Method
reset!(opt::AbstractOptimiser)

Reset the internal state of the optimiser to its initial values.

source
SimpleNNs.sigmoidMethod
sigmoid(x)

Logistic sigmoid activation function.

Computes the sigmoid function: 1 / (1 + exp(-x)).

Arguments

x: Input value

Returns

Output in range (0, 1)
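
Examples

sigmoid(0.0f0)  # 0.5f0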

source
SimpleNNs.tanh_fastMethod
tanh_fast(x)

Fast hyperbolic tangent activation function.

Computes an optimised version of the hyperbolic tangent function. This may use approximations for better performance compared to the standard tanh for Float32 and Float64 inputs.

Arguments

x: Input scalar

Returns

Output in range (-1, 1)

source
SimpleNNs.update!Method
update!(parameters, gradients, opt::AbstractOptimiser)

Update the parameters using the provided gradients and optimiser.

Arguments

  • parameters: Model parameters to be updated
  • gradients: Gradients computed from the loss function
  • opt (AbstractOptimiser): The optimiser instance containing update rules
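
Examples

A sketch of a single optimisation step, assuming a model and a grads_cache already filled by the backwards pass:

g = gradients(grads_cache)
opt = SGDOptimiser(g; lr=0.01f0)
update!(parameters(model), g, opt)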
source