A really nice and helpful video! I'm just reading through books about digital signal processing, but mostly there are formulas with integrals. I want to cross-correlate two signals in an FPGA (looking for a Barker code in a stochastic signal), and an FPGA doesn't understand integrals. This video gives me an idea of how to do it - thank you!
Thanks a lot.. I did not expect it to be so simple. Going straight to look for your other tutorials. Maybe I really have a chance to understand the math I did not have an opportunity to learn until now... and how to use it in Matlab. Once more, great thanks for your efforts in providing such a nice (and dummies-friendly) explanation.
But hmmm.. I have an issue with the lag being 2, because I thought data2 would be the source and data1 the time-lagged series, which makes sense in reality. So a lag of 2 should be a negative time (-2) in reality, but here it is the opposite. I wonder whether this is right or not; please explain it for me. Thanks a lot.
Hi, nice tutorial. Thanks. I have a small query. I am supposed to calculate the "average of cross correlation" of over 20 series at zero lag. If I do it pairwise, I assume there would be 20C2 (20 choose 2, i.e. 190) coefficients, which is a very high number, and then I will have to calculate the average. Is there an easier way to do it? Perhaps something that can be implemented in Excel? Many thanks.
Actually, when I plot lags against xcorr I get values up to 150 on the Y axis. How do I make sense of this? More generally, how do I know if two signals are cross-correlated? Is there an objective measure?
Great explanation, but I'm curious why Matlab's xcorr function uses the FFT to calculate the correlation, instead of shifted and truncated Hadamard products of the signals?
Hi! Why does the correlation look different when we use digital samples like [.... 1 0 1 1 1 0 ..... ]? In this case, the final plot shows where the correlation occurs (which is OK), but the remaining values seem to form a triangular shape along the "lag" values.
I have a doubt about some homework I have to do. I have a .wav signal and I have to compute its autocorrelation. I wrote this code in a script:
[xt,fs]=wavread('signal8.wav');
Nt=length(xt);
t1=0;
t2=Nt/fs;
t0=(t2-t1)/Nt;
t=t1:t0:t2-t0;
%Compute the autocorrelation, phitau and the shift tau using the xcorr function
[phitau,tau]=xcorr('signal8.wav');
close all;
plot(t,xt); xlabel('t sek'); ylabel('x(t)');
figure;
plot(t0*tau,t0*phitau); xlabel('tau sek'); ylabel('phi(tau)');
When I try to execute my script in the command window I get an error like this:
Undefined function 'fft' for input arguments of type 'char'.
Error in xcorr>vectorXcorr (line 105) X = fft(x,2^nextpow2(2*M-1));
Error in xcorr (line 53) [c,M,N] = vectorXcorr(x,autoFlag,varargin{:});
Error in lab4b (line 8) [phitau,tau]=xcorr('signal8.wav');
Could you help me with this problem?
Great explanation. I have one problem with this approach; maybe you can clear things up. I have two signals whose peaks look like they should correlate, where one peak is towards the beginning of signal 'A' and the other peak is towards the end of signal 'B'. Using cross-correlation, the lag (which is large) needed to align these samples doesn't provide the largest correlation value, purely because there are fewer points involved in calculating the correlation at this lag, i.e. many of the data points on one signal don't have an associated point on the other signal to be multiplied by, because the two signals now have only a small region of overlap. I hope I have explained that comprehensibly. Do you know of a solution for this? Is this a known problem of cross-correlation, or am I missing something major in my understanding? Thanks in advance.
+volcEmpire In Matlab there is an unbiased version of the xcorr (cross-correlation) function. I think this just divides each correlation measure by the number of 'overlapping' vertically aligned samples, which gives more weight to the correlations associated with larger lags. You should be careful when using this technique, as sometimes the correlation measures at large lags can be excessively scaled.
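For anyone who wants to see what that scaling does, here is a minimal pure-Python sketch (not Matlab's xcorr itself). The function names `xcorr_raw` and `xcorr_unbiased`, the lag convention (sum of x[n + lag] * y[n]) and the toy signals are my own assumptions for illustration.

```python
def xcorr_raw(x, y):
    """Raw cross-correlation of two equal-length lists.

    For each lag, sums x[n + lag] * y[n] over the samples that overlap.
    Returns (lags, values).
    """
    n = len(x)
    lags = list(range(-(n - 1), n))
    values = []
    for lag in lags:
        total = 0.0
        for k in range(n):
            if 0 <= k + lag < n:
                total += x[k + lag] * y[k]
        values.append(total)
    return lags, values


def xcorr_unbiased(x, y):
    """Divide each raw value by the number of overlapping samples, n - |lag|."""
    n = len(x)
    lags, raw = xcorr_raw(x, y)
    return lags, [v / (n - abs(lag)) for lag, v in zip(lags, raw)]


# Toy signals (assumed): the same two-sample peak, late in a and early in b.
a = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 2.0]
b = [1.0, 2.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]

lags, raw = xcorr_raw(a, b)
_, unbiased = xcorr_unbiased(a, b)

for lag in (6, 7):
    i = lags.index(lag)
    print(lag, raw[i], unbiased[i])
```

At lag 6 two samples overlap and at lag 7 only one does. The raw values there are 5 and 2, while the unbiased values are 2.5 and 2.0, so the gap between the true alignment and its neighbour shrinks. That is the "excessively scaled at large lags" effect to watch out for.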
Hi David, excellent video. I'm using Excel 2003 with a vast set of data (over 40,000 rows). Is there a formula to calculate the correlation sequence values without having to individually multiply each numerical value associated with each sample? This is killing me!
You could downsample your data before correlating. Or you could cross-correlate over a smaller range of lags. Both of these approaches would require a good understanding of the data you are working with to avoid missing useful info. Alternatively, you could use Octave to process your data (an online version is available at octave-online.net/)
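As a rough illustration of the "smaller range of lags" idea, here is a pure-Python sketch. The helper name `xcorr_limited`, the lag convention (sum of x[n + lag] * y[n]) and the toy data are assumptions of mine, not anything from the video.

```python
def xcorr_limited(x, y, max_lag):
    """Cross-correlate equal-length lists only for lags in [-max_lag, max_lag].

    Returns a dict mapping lag -> sum of x[n + lag] * y[n] over the overlap.
    """
    n = len(x)
    result = {}
    for lag in range(-max_lag, max_lag + 1):
        result[lag] = sum(
            x[k + lag] * y[k] for k in range(n) if 0 <= k + lag < n
        )
    return result


# Toy data (assumed): x is y delayed by 3 samples.
y = [0.0, 1.0, 3.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
x = [0.0, 0.0, 0.0, 0.0, 1.0, 3.0, 1.0, 0.0, 0.0, 0.0]

corr = xcorr_limited(x, y, max_lag=4)
best_lag = max(corr, key=corr.get)
print(best_lag, corr[best_lag])
```

Only 9 lags are evaluated here instead of all 19, which is where the saving comes from on 40,000 rows. The catch is the one mentioned above: if the true delay lies outside the chosen range, it is simply never seen.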
I have a query that I was hoping you would be able to help me with. For my final year project I have been researching how to calculate distance using sound on an iPhone 6. I have been playing short frequency sweeps on one iPhone and recording the data on another phone sitting on top of it. What I'm planning to do is calculate the delay between the initial sound and the reflected sound and combine that with the speed of sound to give me the distance between a wall and the iPhone. However, I'm struggling to do so. I know you are a wizard on MatLab and was wondering if there were any techniques or methods for calculating that time delay within MatLab or Audacity.
+Khash Ghalam This should be doable, though I'm not sure what kind of resolution you'll get. The key is to send a very short but high amplitude, high frequency impulse (a click) and record it on the second phone. Ideally, the second phone should record two clicks, one directly from the first phone, and one (much weaker) from the reflecting surface. Auto-correlate the signal to determine the delay between the two pulses. That delay, multiplied by the speed of sound, gives the extra distance travelled by the reflected click (roughly twice the distance to the wall, since the sound goes out and back). Complicating factors will be the limited bandwidth of the audio circuits at high frequency distorting your pulse (that's why this is usually done with ultrasonic transducers) and the multipath and smearing of the return signals, since the sound isn't going to bounce off just one point on the wall, but off multiple points with slightly different times.
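The auto-correlation step can be sketched in pure Python on a synthetic recording. Everything concrete below is an assumption for illustration: the 44100 Hz sample rate, 343 m/s for the speed of sound, the idealised one-sample clicks, the `min_lag` guard, and the divide-by-two, which assumes the phones are stacked so the direct path is negligible.

```python
SPEED_OF_SOUND = 343.0  # m/s, assumed (air at roughly 20 degrees C)
FS = 44100              # Hz, assumed sample rate of the recording


def autocorr_positive(x, max_lag):
    """Autocorrelation of x for lags 0..max_lag (sum of x[n] * x[n + lag])."""
    n = len(x)
    return [
        sum(x[k] * x[k + lag] for k in range(n - lag))
        for lag in range(max_lag + 1)
    ]


# Synthetic recording (assumed): a direct click and a weaker echo 257 samples later.
recording = [0.0] * 1000
recording[100] = 1.0
recording[100 + 257] = 0.3

r = autocorr_positive(recording, max_lag=600)

# Skip lag 0 and its immediate neighbours, where the click correlates with itself.
min_lag = 10
delay_samples = max(range(min_lag, len(r)), key=lambda lag: r[lag])

delay_seconds = delay_samples / FS
extra_path_m = SPEED_OF_SOUND * delay_seconds
distance_m = extra_path_m / 2.0  # out and back, assuming stacked phones

print(delay_samples, round(distance_m, 3))
```

With real recordings the clicks are smeared over many samples, so the second peak will be broader and `min_lag` needs to be large enough to clear the main lobe around lag 0.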
Hello David. Great tutorials. I have one question: I want to derive approaches for signals that are not aligned vertically. In normalized correlation, too, the data points are vertically aligned. How can we derive correlation and normalized correlation for signals that are not vertically aligned?
So if we have the largest number in the correlation sequence (in this video, it is 23.18 at lag 2), that means there is the highest similarity there. But at the end of the video, you said that between 0 to 2 the signals are most similar, which means the highest similarity is at lag 1. The story is inconsistent. What is the criterion for selecting the correlation sequence value with the highest similarity?
I'm not sure where I said that "between 0 to 2 the signals are most similar which means the highest similarity at lag 1". If I did then I was incorrect. The signals are most similar at a lag of 2 samples.
Yes. In the event that you have two sequences of numbers that do not have the same number of elements, you can either zero-pad the shorter one or truncate the longer one.
David Dorran magma169 No they do not. Correct your intuition in Matlab if you can. The result of xcorr(a,b) will be of length a.length + b.length - 1 and the 'zero index' will be the (b.length - 1)th entry.
Andrew Gallasch Perhaps we have different versions of Matlab - I have 7.11 (R2010b) and it always returns 2N-1 correlation values, where N is the length of the longer input sequence. There is also a note in the help on xcorr that the shorter sequence is zero padded by the function.
So it does. That will only result in extra zeros appearing at the end of the xcorr result. They can be ignored. There is no mathematical limitation that inputs have to be of equal length, however.
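To make the two length conventions in this exchange concrete, here is a small pure-Python sketch. It is my own direct implementation, not Matlab's xcorr; the name `full_xcorr`, the lag convention (sum of a[n + lag] * b[n]) and the example values are assumptions.

```python
def full_xcorr(a, b):
    """Correlate a and b at every lag where they overlap by at least one sample.

    Lags run from -(len(b) - 1) to len(a) - 1, so the result has
    len(a) + len(b) - 1 entries and the zero-lag entry is at index len(b) - 1.
    """
    la, lb = len(a), len(b)
    out = []
    for lag in range(-(lb - 1), la):
        out.append(
            sum(a[k + lag] * b[k] for k in range(lb) if 0 <= k + lag < la)
        )
    return out


a = [1.0, 2.0, 3.0, 4.0, 5.0]
b = [1.0, 1.0, 1.0]

direct = full_xcorr(a, b)               # 5 + 3 - 1 = 7 values
padded = full_xcorr(a, b + [0.0, 0.0])  # b zero-padded to N = 5: 2*5 - 1 = 9 values

print(direct)
print(padded)
```

The padded result is the direct result plus two zeros. With this convention and b as the shorter input the zeros land at the start; swapping which input is shorter, or flipping the lag convention, moves them to the other end. Either way they carry no information and can be ignored.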
The meaning of the number depends on the signals involved. A value of 7.52 might mean the signals are identical for one pair of signals but extremely dissimilar for another pair. The reason is that the numbers returned by a standard correlation function depend on the energy in the signals. Normalised correlation attempts to resolve this by normalising to the energy of both signals so that the result lies within ±1. A value of 1 in this case means that the signals are identical (up to a positive scale factor), -1 means the signals are an inverted form of each other and 0 means that they are orthogonal to each other. So you are probably wondering what 0.5 means versus, say, 0.9 - the simple answer is that a result of 0.9 means the signals are more similar than signals that have a normalised correlation of 0.5. A more complete answer could be obtained by looking at the equations - I was trying to come up with a verbal description but couldn't come up with something that was easy to interpret - this question did get me thinking about it though, so I'll get back to this at some stage.
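A minimal pure-Python sketch of zero-lag normalised correlation, assuming the usual definition (sum of products divided by the square root of the product of the two energies). The function name `normalised_corr` and the test signals are mine, chosen only for illustration.

```python
import math


def normalised_corr(x, y):
    """Zero-lag correlation of x and y, normalised by the energy of both signals."""
    raw = sum(a * b for a, b in zip(x, y))
    energy_x = sum(a * a for a in x)
    energy_y = sum(b * b for b in y)
    return raw / math.sqrt(energy_x * energy_y)


s = [1.0, 2.0, 3.0, 4.0]
louder = [5.0 * v for v in s]            # same shape, more energy
inverted = [-v for v in s]               # flipped sign
ones = [1.0, 1.0, 1.0, 1.0]
alternating = [1.0, -1.0, 1.0, -1.0]     # orthogonal to `ones`

raw_same = sum(a * b for a, b in zip(s, s))         # 30
raw_louder = sum(a * b for a, b in zip(s, louder))  # 150

print(raw_same, raw_louder)
print(normalised_corr(s, s), normalised_corr(s, louder))
print(normalised_corr(s, inverted), normalised_corr(ones, alternating))
```

The raw values 30 and 150 differ only because of the energy in the second signal, while the normalised value is 1 in both cases. The inverted pair gives -1 and the orthogonal pair gives 0.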
I guess I meant standard, if that is what you're using in this video. You have 7.52 and -12.48 and so on so I am curious as to what 100% identical signals would yield.
17joren As an example, say you had the signal [-2 3 4 -10]; then the standard correlation measure, if you correlated this signal with itself at zero lag, would be 4+9+16+100 = 129.
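The same arithmetic can be checked in a few lines of Python, along with the other lags for comparison. The helper name `autocorr` and the lag convention (sum of x[n + lag] * x[n]) are my own assumptions.

```python
def autocorr(x):
    """Standard (unnormalised) autocorrelation of x at every lag.

    Returns (lags, values) with values[i] = sum of x[n + lag] * x[n] over the overlap.
    """
    n = len(x)
    lags = list(range(-(n - 1), n))
    values = [
        sum(x[k + lag] * x[k] for k in range(n) if 0 <= k + lag < n)
        for lag in lags
    ]
    return lags, values


signal = [-2, 3, 4, -10]
lags, values = autocorr(signal)

zero_lag = values[lags.index(0)]  # 4 + 9 + 16 + 100
print(zero_lag)
print(list(zip(lags, values)))
```

The zero-lag value is just the energy of the signal (the sum of its squared samples), and no other lag exceeds it. That is why "100% identical" has no fixed number in the standard measure: it depends on the signal.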