# Getting Started

First, you can add this package directly using its repository URL:

```julia
import Pkg
Pkg.add(url="https://github.com/JamieMair/SimpleNNs.jl")
```

Note: this package has plans to be registered, after which it should be available with `] add SimpleNNs`.
Once the package is installed, go ahead and load it:

```julia
using SimpleNNs
```

To start with, let's create a simple test dataset:
```julia
batch_size = 256
x = collect(LinRange(0, 2*pi, batch_size)')
y = sin.(x)
```

Note that we use the adjoint `'` so that the last dimension is the batch dimension.
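As a quick sanity check on the data layout, the snippet below recreates the same dataset and prints its shape (base Julia only, independent of SimpleNNs):

```julia
# Recreate the toy dataset and check its layout.
batch_size = 256
x = collect(LinRange(0, 2*pi, batch_size)')  # adjoint ' makes a 1×256 row matrix
y = sin.(x)                                  # broadcast sin over each entry

println(size(x))  # (1, 256): feature dimension first, batch dimension last
println(size(y))  # (1, 256)
```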
We can now create our small neural network to fit a curve that maps from $x$ to $y$. The syntax will be familiar to users of Flux.jl or SimpleChains.jl.
```julia
model = chain(
    Static(1),
    Dense(10, activation_fn=tanh),
    Dense(10, activation_fn=sigmoid),
    Dense(1, activation_fn=identity),
);
```

Here, we specify the expected feature size of the input (`Static(1)`), leaving out the batch dimension.
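As a sanity check, we can count this network's parameters by hand. Assuming each `Dense` layer is a standard affine map (an in×out weight matrix plus a bias vector), the 1 → 10 → 10 → 1 model above has:

```julia
# Parameter count for the 1 → 10 → 10 → 1 model, assuming standard
# dense layers with a weight matrix and a bias vector each.
layer_sizes = [1, 10, 10, 1]
n_params = sum(layer_sizes[i] * layer_sizes[i + 1] + layer_sizes[i + 1]
               for i in 1:length(layer_sizes) - 1)
println(n_params)  # 20 + 110 + 11 = 141
```

Under that assumption, this count should match `length(parameters(model))`.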
## Inference (Forward-Pass)
To run inference with this model, we first need to preallocate a buffer to store the intermediate forward pass values. This preallocation is by design, so that memory is only allocated once at the beginning of training.
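The benefit of preallocation can be seen with a small base-Julia illustration (the helper names `f_inplace!` and `f_fresh` are our own, purely for demonstration): writing into an existing buffer avoids the per-call cost of allocating a fresh output array.

```julia
# Compare an in-place update against one that allocates a new array each call.
buf = zeros(1000)
xs = rand(1000)

f_inplace!(out, v) = (out .= tanh.(v); out)  # reuses the buffer `out`
f_fresh(v) = tanh.(v)                        # allocates a new array every call

f_inplace!(buf, xs); f_fresh(xs)             # warm up, so compilation is excluded

a_inplace = @allocated f_inplace!(buf, xs)
a_fresh   = @allocated f_fresh(xs)
println((a_inplace, a_fresh))  # the in-place version allocates far less
```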
```julia
forward_buffer = preallocate(model, batch_size);
```

This buffer also contains the input to the neural network. We can set the inputs via:

```julia
set_inputs!(forward_buffer, x);
```

This function can be used to set new inputs at each epoch.
We can access the flat parameter vector of the model via `parameters(model)` to initialise the weights of the network, for example:
```julia
using Random
Random.seed!(1234)
params = parameters(model);
randn!(params);
params .*= 0.1;
```

We can run inference with:
```julia
forward!(forward_buffer, model);
yhat = get_outputs(forward_buffer);
```

## Training (Backward-Pass)
We can specify a mean-squared error loss via:

```julia
loss = MSELoss(y);
```

and preallocate the buffer used for calculating the gradients via back-propagation:

```julia
gradient_buffer = preallocate_grads(model, batch_size);
```

Now we have all the ingredients we need to write a simple training loop, making use of the built-in optimisers.
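For reference, the quantity the loop below minimises is the standard mean-squared error. Here is an illustrative base-Julia sketch (the function name `mse` is ours; SimpleNNs' `MSELoss` may normalise differently internally):

```julia
# Standard mean-squared error: average of the squared prediction errors.
mse(yhat, y) = sum(abs2, yhat .- y) / length(y)

yhat_demo = [0.0 1.0 2.0]
y_demo    = [0.0 0.0 2.0]
println(mse(yhat_demo, y_demo))  # 1/3: only the middle prediction is off, by 1
```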
```julia
# Set up the Adam optimiser
optimiser = AdamOptimiser(gradient_buffer.parameter_gradients; lr=0.01f0)
epochs = 1000
losses = zeros(Float32, epochs)
for i in 1:epochs
    forward!(forward_buffer, model)
    losses[i] = backprop!(gradient_buffer, forward_buffer, model, loss)
    grads = gradients(gradient_buffer) # extract the gradient vector
    # Apply the optimiser update
    update!(params, grads, optimiser)
end
```

We can plot the losses over time, for example using Plots.jl:
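For intuition about what `update!` does each step, here is a minimal base-Julia sketch of the Adam update rule applied to a toy quadratic loss. The function `adam_step!` and its hyperparameter defaults are our own illustrative choices, not SimpleNNs' implementation:

```julia
# One Adam step: maintain running first- and second-moment estimates of the
# gradient, correct their initialisation bias, then take a scaled step.
function adam_step!(p, g, m, v, t; lr=0.01, β1=0.9, β2=0.999, ϵ=1e-8)
    @. m = β1 * m + (1 - β1) * g       # first-moment (mean) estimate
    @. v = β2 * v + (1 - β2) * g^2     # second-moment (uncentred variance) estimate
    mhat = m ./ (1 - β1^t)             # bias correction for the zero initialisation
    vhat = v ./ (1 - β2^t)
    @. p -= lr * mhat / (sqrt(vhat) + ϵ)
    return p
end

# Minimise f(p) = sum(p.^2), whose gradient is 2p.
p = [1.0, -2.0]
m = zeros(2); v = zeros(2)
for t in 1:2000
    g = 2 .* p
    adam_step!(p, g, m, v, t; lr=0.05)
end
println(p)  # both entries approach 0
```

Because the step size is normalised per parameter by `sqrt(vhat)`, Adam adapts its effective learning rate for each weight individually.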
```julia
using Plots
plot(losses, xlabel="Epochs", ylabel="MSE Loss", lw=2, label=nothing)
```

Finally, we can run one final forward pass to get the predictions:
```julia
forward!(forward_buffer, model);
yhat = get_outputs(forward_buffer);
```

and then plot the predictions:
```julia
using Plots
plt = plot(x', y', linestyle=:solid, label="Original", lw=2);
plot!(plt, x', yhat', linestyle=:dashdot, label="Prediction", lw=2);
xlabel!("x")
ylabel!("y")
plt
```

To see an example using convolutional layers and GPU training, see the MNIST training example.