In this visualization, we train a neural network N to approximate the sine function, in the sense that N(x) should be approximately sin(2*pi*x) whenever |x| is small enough. In particular, we want to minimize the mean squared distance between N(x) and sin(2*pi*x) over the training values x.
The neural network is of the form Chain(Dense(1,mn),SkipConnection(Dense(mn,mn,atan),+),SkipConnection(Dense(mn,mn,atan),+),SkipConnection(Dense(mn,mn,atan),+),SkipConnection(Dense(mn,mn,atan),+),SkipConnection(Dense(mn,mn,atan),+),SkipConnection(Dense(mn,mn,atan),+),Dense(mn,1)) where mn=40.
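The Chain/Dense/SkipConnection notation appears to be that of the Flux.jl library; written out as runnable Julia code, the model looks like this (a sketch, with my own variable names):

using Flux

mn = 40  # hidden width

model = Chain(
    Dense(1, mn),                              # input layer
    SkipConnection(Dense(mn, mn, atan), +),    # six residual blocks
    SkipConnection(Dense(mn, mn, atan), +),
    SkipConnection(Dense(mn, mn, atan), +),
    SkipConnection(Dense(mn, mn, atan), +),
    SkipConnection(Dense(mn, mn, atan), +),
    SkipConnection(Dense(mn, mn, atan), +),
    Dense(mn, 1),                              # output layer
)

Counting weights and biases gives (1*40+40) + 6*(40*40+40) + (40*1+1) = 9,961 parameters, matching the figure quoted below.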
In particular, the neural network computes a function from the real numbers to the real numbers. The visualization shows the graph of y=N(x).
The neural network is trained to minimize the L_2 distance between N(x) and sin(2*pi*x) on the interval [-d,d], where d is the difficulty level. The difficulty level is a self-adjusting constant that increases whenever the neural network approximates sin(2*pi*x) well on [-d,d] and decreases otherwise.
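A minimal training loop in this style might look like the following sketch. The optimizer, batch size, loss threshold, and the exact rule for adjusting d are my own guesses rather than the settings used for the visualization.

using Flux, Statistics

function train!(model; steps=10_000)
    opt_state = Flux.setup(Adam(1e-3), model)
    d = 1.0f0
    for _ in 1:steps
        x = (2f0 .* rand(Float32, 1, 256) .- 1f0) .* d   # batch drawn uniformly from [-d, d]
        y = sin.(2f0 * Float32(pi) .* x)                 # training targets
        loss, grads = Flux.withgradient(m -> mean(abs2, m(x) .- y), model)
        Flux.update!(opt_state, model, grads[1])
        # Self-adjusting difficulty: grow d when the fit is good, shrink it otherwise.
        d = loss < 1f-3 ? d * 1.01f0 : d * 0.999f0
    end
    return d
end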
The weight matrices of the Dense layers inside the skip connections were initialized to zero.
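In Flux, this initialization can be requested with the init keyword (again a sketch; the original code may differ):

SkipConnection(Dense(mn, mn, atan; init=Flux.zeros32), +)

With zero weights (and Flux's default zero biases), each residual block initially computes the identity, so the whole network starts out as an affine function of x.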
The notion of a neural network is not my own; I am simply making these sorts of visualizations in order to analyze the behavior of neural networks. We observe that the neural network exhibits some symmetry about the origin, which is a good sign for AI interpretability and safety. We also observe that the neural network is unable to generalize, that is, to approximate the sine function outside the interval [-d,d]. This shows that neural networks may behave very poorly on data that is even slightly outside the training distribution.
The neural network was able to approximate sin(2*pi*x) on [-d,d] when d was about 12, but it was not able to do so for much larger values of d. On the other hand, the network has 9,961 parameters, and it could easily use these parameters to memorize thousands of real numbers. In other words, its capacity to reproduce the sine function is much more limited than its capacity to memorize. I hypothesize that this limited ability to approximate sine is mainly due to the inputs all lying in a one-dimensional space. A neural network that first transforms the input x into an object L(x), where the image L([-d,d]) is highly non-linear, would probably perform much better on this task.
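To illustrate what such a transformation could look like (my own sketch, not something tested here), one could lift x into several non-linear components before the main network, so that the image of [-d,d] becomes a curve in a higher-dimensional space rather than a line segment. Whether this particular choice actually helps is an empirical question.

# Hypothetical feature map: four views of x at different scales.
featurize(x) = vcat(atan.(x), atan.(x ./ 4f0), atan.(x ./ 16f0), x ./ 32f0)

model2 = Chain(
    featurize,
    Dense(4, mn),
    SkipConnection(Dense(mn, mn, atan), +),
    Dense(mn, 1),
)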
It is possible to construct a neural network that computes a function from [0,1] to the real numbers with an exponential (in the number of layers) number of oscillations, simply by iterating the tent map L from [0,1] to [0,1] defined by L(x)=2x for x in [0,1/2] and L(x)=2-2x for x in [1/2,1] as many times as one would like. But the iterates of L have very large gradients (the n-fold iterate has slope of magnitude 2^n everywhere it is differentiable), and I do not know how to train functions with very large gradients.
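Here is a sketch of that construction; expressing the tent map with two ReLU units is a standard observation rather than something specific to this video.

using Flux: relu

# Tent map: each iteration doubles the number of oscillations on [0, 1].
tent(x) = x <= 0.5 ? 2x : 2 - 2x

# n-fold iterate: 2^(n-1) triangular teeth, with slopes of magnitude 2^n.
tent_iter(x, n) = n == 0 ? x : tent_iter(tent(x), n - 1)

# On [0, 1], tent(x) = 2*relu(x) - 4*relu(x - 1/2), so each iteration
# costs only one two-unit ReLU layer.
tent_relu(x) = 2 * relu(x) - 4 * relu(x - 0.5)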
Unless otherwise stated, all algorithms featured on this channel are my own. You can go to github.com/spo... to support my research on machine learning algorithms. I am also available to consult on the use of safe and interpretable AI for your business. I am designing machine learning algorithms for AI safety such as LSRDRs. In particular, my algorithms are designed to be more predictable and understandable to humans than other machine learning algorithms, and my algorithms can be used to interpret more complex AI systems such as neural networks. With more understandable AI, we can ensure that AI systems will be used responsibly and that we will avoid catastrophic AI scenarios. There is currently nobody else who is working on LSRDRs, so your support will ensure a unique approach to AI safety.
Sep 21, 2024