# Getting Started

First, you can add this package directly using its repository URL:

```julia
import Pkg
Pkg.add(url="https://github.com/JamieMair/SimpleNNs.jl")
```

Note: this package has plans to be registered, after which it should be available with `] add SimpleNNs`.
Once the package is installed, go ahead and load it:

```julia
using SimpleNNs
```

To start with, let's create a simple test dataset:
```julia
batch_size = 256
x = collect(LinRange(0, 2*pi, batch_size)')
y = sin.(x)
```

Note that we use the adjoint `'` so that the last dimension is the batch dimension.
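As a quick sanity check on the data layout, the snippet below recreates the same dataset and prints its shape (base Julia only, independent of SimpleNNs):

```julia
# Recreate the toy dataset and check its layout.
batch_size = 256
x = collect(LinRange(0, 2*pi, batch_size)')  # adjoint ' makes a 1×256 row matrix
y = sin.(x)                                  # broadcast sin over each entry

println(size(x))  # (1, 256): feature dimension first, batch dimension last
println(size(y))  # (1, 256)
```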
We can now create our small neural network to fit a curve that maps from $x$ to $y$. The syntax will be familiar to users of Flux.jl or SimpleChains.jl.
```julia
model = chain(
    Static(1),
    Dense(10, activation_fn=tanh),
    Dense(10, activation_fn=sigmoid),
    Dense(1, activation_fn=identity),
);
```

Here, we specify the expected feature size of the input (`Static(1)`), leaving out the batch dimension.
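As a sanity check, we can count this network's parameters by hand. Assuming each `Dense` layer is a standard affine map (an in×out weight matrix plus a bias vector), the 1 → 10 → 10 → 1 model above has:

```julia
# Parameter count for the 1 → 10 → 10 → 1 model, assuming standard
# dense layers with a weight matrix and a bias vector each.
layer_sizes = [1, 10, 10, 1]
n_params = sum(layer_sizes[i] * layer_sizes[i + 1] + layer_sizes[i + 1]
               for i in 1:length(layer_sizes) - 1)
println(n_params)  # 20 + 110 + 11 = 141
```

Under that assumption, this count should match `length(parameters(model))`.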
## Inference (Forward-Pass)
To run inference with this model, we first need to preallocate a buffer to store the intermediate forward pass values. This preallocation is by design, so that memory is only allocated once at the beginning of training.
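The benefit of preallocation can be seen with a small base-Julia illustration (the helper names `f_inplace!` and `f_fresh` are our own, purely for demonstration): writing into an existing buffer avoids the per-call cost of allocating a fresh output array.

```julia
# Compare an in-place update against one that allocates a new array each call.
buf = zeros(1000)
xs = rand(1000)

f_inplace!(out, v) = (out .= tanh.(v); out)  # reuses the buffer `out`
f_fresh(v) = tanh.(v)                        # allocates a new array every call

f_inplace!(buf, xs); f_fresh(xs)             # warm up, so compilation is excluded

a_inplace = @allocated f_inplace!(buf, xs)
a_fresh   = @allocated f_fresh(xs)
println((a_inplace, a_fresh))  # the in-place version allocates far less
```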
```julia
forward_buffer = preallocate(model, batch_size);
```

This buffer also contains the input to the neural network. We can set the inputs via:

```julia
set_inputs!(forward_buffer, x);
```

This function can be used to set new inputs at each epoch.
We can access the flat parameter vector of the model via `parameters(model)` to initialise the weights of the network, for example:
```julia
using Random
Random.seed!(1234)
params = parameters(model);
randn!(params);
params .*= 0.1;
```

We can run inference with:
```julia
forward!(forward_buffer, model);
yhat = get_outputs(forward_buffer);
```

## Training (Backward-Pass)
We can specify a mean-squared error loss via:

```julia
loss = MSELoss(y);
```

and preallocate the buffer used for calculating the gradients via back-propagation:

```julia
gradient_buffer = preallocate_grads(model, batch_size);
```

Now we have all the ingredients we need to write a simple training loop, making use of the built-in optimisers.
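For reference, the quantity the loop below minimises is the standard mean-squared error. Here is an illustrative base-Julia sketch (the function name `mse` is ours; SimpleNNs' `MSELoss` may normalise differently internally):

```julia
# Standard mean-squared error: average of the squared prediction errors.
mse(yhat, y) = sum(abs2, yhat .- y) / length(y)

yhat_demo = [0.0 1.0 2.0]
y_demo    = [0.0 0.0 2.0]
println(mse(yhat_demo, y_demo))  # 1/3: only the middle prediction is off, by 1
```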
```julia
# Set up the Adam optimiser
optimiser = AdamOptimiser(gradient_buffer.parameter_gradients; lr=0.01f0)
epochs = 1000
losses = zeros(Float32, epochs)
for i in 1:epochs
    forward!(forward_buffer, model)
    losses[i] = backprop!(gradient_buffer, forward_buffer, model, loss)
    grads = gradients(gradient_buffer) # extract the gradient vector
    # Apply the optimiser update
    update!(params, grads, optimiser)
end
```

We can plot the losses over time, for example using Plots.jl:
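For intuition about what `update!` does each step, here is a minimal base-Julia sketch of the Adam update rule applied to a toy quadratic loss. The function `adam_step!` and its hyperparameter defaults are our own illustrative choices, not SimpleNNs' implementation:

```julia
# One Adam step: maintain running first- and second-moment estimates of the
# gradient, correct their initialisation bias, then take a scaled step.
function adam_step!(p, g, m, v, t; lr=0.01, β1=0.9, β2=0.999, ϵ=1e-8)
    @. m = β1 * m + (1 - β1) * g       # first-moment (mean) estimate
    @. v = β2 * v + (1 - β2) * g^2     # second-moment (uncentred variance) estimate
    mhat = m ./ (1 - β1^t)             # bias correction for the zero initialisation
    vhat = v ./ (1 - β2^t)
    @. p -= lr * mhat / (sqrt(vhat) + ϵ)
    return p
end

# Minimise f(p) = sum(p.^2), whose gradient is 2p.
p = [1.0, -2.0]
m = zeros(2); v = zeros(2)
for t in 1:2000
    g = 2 .* p
    adam_step!(p, g, m, v, t; lr=0.05)
end
println(p)  # both entries approach 0
```

Because the step size is normalised per parameter by `sqrt(vhat)`, Adam adapts its effective learning rate for each weight individually.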
```julia
using Plots
plot(losses, xlabel="Epochs", ylabel="MSE Loss", lw=2, label=nothing)
```

Finally, we can run one final forward pass to get the predictions:
```julia
forward!(forward_buffer, model);
yhat = get_outputs(forward_buffer);
```

and then plot the predictions:
```julia
using Plots
plt = plot(x', y', linestyle=:solid, label="Original", lw=2);
plot!(plt, x', yhat', linestyle=:dashdot, label="Prediction", lw=2);
xlabel!("x")
ylabel!("y")
plt
```

To see an example using convolutional layers and GPU training, see the MNIST training example.