Extracting the amplitude envelope feature from scratch in Python

Valerio Velardo - The Sound of AI

Published: 29 Oct 2024

Comments: 104

@sawcon 1 year ago

If anyone runs into issues with librosa.display.waveplot ... it is now librosa.display.waveshow

@markusbuchholz3518 4 years ago

From my point of view, there is only one thing that can be improved here: you should publish a new video each day. No, just joking; I understand the effort it takes to make this priceless content. For me you are one of the top ten YouTubers. Exciting and impressive as always. Applause also for the team who make the librosa package.

@ValerioVelardoTheSoundofAI 4 years ago

Thank you very much :)

@lukelutio1246 3 years ago

Absolutely amazing course! Valerio's serenity and clarity while narrating highly convenient slides and code samples make this course accessible even to idiots like me. Outstanding effort! Much, much appreciated.

@pawegrabinski9641 4 years ago

Someone here is not afraid of a copyright strike! Joking. An introduction to librosa is something missing on YouTube, so it's great you are closing the gap!

@ValerioVelardoTheSoundofAI 4 years ago

Thank you Pawel!

@sathyanarayananvittal7832 10 months ago

Such a clear explanation of a complex topic. I like the step-by-step approach.

@yaswanthjagilanka 4 years ago

I couldn't wait to stay up to date with this series... Thanks for the effort and time you are putting into this... Will be waiting for more on the freq part... Truly Sound of AI...

@ValerioVelardoTheSoundofAI 4 years ago

Thank you!

@jaypatel3233 4 years ago

Awesome tutorials. Thank you so much for all your countless efforts.

@ValerioVelardoTheSoundofAI 4 years ago

Thank you Jay!

@WolframBlechner 4 years ago

Wuuah! I got it running. Thank you so much, Valerio! You know what? The biggest problem is not writing the code itself: fighting with package installation, virtual environments, and Jupyter notebooks took me hours. :-((

@ValerioVelardoTheSoundofAI 4 years ago

You're welcome Wolfram!

@pawegrabinski9641 4 years ago

Check out Google Colaboratory. It is not a permanent solution, but you can start coding and studying ASAP.

@prof.ravindravyas3035 3 months ago

Absolutely no words to describe...........amazing❤

@quinxx12 28 days ago

For the calculation of the amplitude envelope you should have taken the absolute value of the sound signal. That's why you see certain blue peaks above the red line in the graphs.
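A minimal sketch of the fix suggested in this comment, assuming the frame-based approach discussed in the thread. The helper name `amplitude_envelope` and the default frame size of 1024 and hop length of 512 are assumptions for illustration. Taking `np.abs` before the maximum makes negative excursions count toward the envelope:

```python
import numpy as np

def amplitude_envelope(signal, frame_size=1024, hop_length=512):
    """Return the maximum absolute sample value of each frame."""
    signal = np.asarray(signal, dtype=float)
    return np.array([
        np.max(np.abs(signal[i:i + frame_size]))
        for i in range(0, len(signal), hop_length)
    ])

# Toy signal whose largest excursion (-0.6) is negative.
toy = np.array([0.1, -0.6, 0.3, -0.2, 0.05, 0.25, -0.4, 0.1])

# Raw maximum per frame: ignores negative samples.
raw_max = np.array([toy[i:i + 4].max() for i in range(0, len(toy), 2)])

# Maximum of the absolute value per frame: covers both polarities.
abs_max = amplitude_envelope(toy, frame_size=4, hop_length=2)

print(raw_max)
print(abs_max)
```

On this toy input the raw maximum of the first frame is 0.3, while the absolute-value version reports 0.6, which is the kind of gap several commenters describe seeing between the waveform and the red line.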

@kobyfr 1 year ago

Hey. I tried to find a cause for the graphical overshoot of the signal plot librosa creates, compared to the MAX plot. It is visible in several places in the plots shown in the video. I could not find any possible cause, or fix. Is this a known issue? Is this something purely graphical, and can I be sure that there is no problem with the calculation of MAX, or the time-axis alignment of the signal and the AE function?

@quinxx12 28 days ago

The calculation of the max value should have been done by using the absolute value (to also consider negative values).

@sohamdats 1 year ago

In librosa version 0.10, the line librosa.display.waveplot should be replaced with librosa.display.waveshow.

@saiswayamshree 1 year ago

thanks for the fix, helped me

@rahuldeora5815 3 years ago

At 33:00, for the Duke Ellington plot, at just ahead of the 5 minute mark, there is a peak with the max marked as lower than the max value. How can this be? There are other such peaks which are not captured in that plot as well.

@ivanlo20 3 years ago

I see this too. I think it may be a bug in librosa, because when I use matplotlib to plot the original signal, all the max points match. You can try it yourself:

    # Build a time axis by hand and plot the raw signal with matplotlib
    original_timestep = []
    for i in range(len(debussy_signal)):
        original_timestep.append(i * sample_duration)
    plt.plot(original_timestep, debussy_signal)

@inverseai 7 months ago

It's a problem with the plot function itself which generates artifacts. The code and result are correct.

@quinxx12 28 days ago

@@inverseai No, that's not true. The code is not considering negative values within the signal data. He should have taken the absolute value of the sound signals.

@gergeerew6811 3 years ago

Great content! I just have a minor question: if fancy_ae_debussy is completely matched with ae_debussy then why are there some peaks not being covered by the red line?

@antunmodrusan828 3 years ago

Had the same problem. When applying the max() function to the whole signal, the max values were not included so I applied np.abs() to the signal and got the desired result. I guess the librosa.display.waveplot() has some logic to do something similar :)

@bec_Divyansh 2 years ago

Thanks a lot, Valerio. I can finally apply my course subjects in a field that interests me.

@stephanefedim6759 1 year ago

Thanks for the tutorial, really helpful. It would be useful to also share the package versions.

@gabrielgardin2592 1 year ago

literally saving my startup

@MarkEdwardsGreenside 7 months ago

first time here - loving the course - will definitely follow. One small bit of feedback if I may be so bold - surely you should be taking ABS() value of each sample rather than relying simply on the positive excursions of the waveform? Amplitude is dependent on both positive and negative excursions of the sample value.

@andrewyzd7746 4 years ago

Hi, I have a question. May I know how to determine the value for the FRAME_SIZE? This is because my result of the red graph does not match the signal perfectly.

@danhamilton6169 1 year ago

If you're getting the error: AttributeError: module 'matplotlib' has no attribute 'pyplot'. You need to import matplotlib BEFORE librosa!!

@saiswayamshree 1 year ago

thank you so much for the fix. btw why does it happen? I am a newbie in python so I am curious and it would be cool if you state the reason

@siddharthkumar5206 2 years ago

At 23:30, why would we use overlapping frames for the envelope? We may end up getting repeated/redundant values in two consecutive hops.

@baghdadiabdellatif1581 8 months ago

You mean every 512?

@siddharthkumar5206 8 months ago

@@baghdadiabdellatif1581 Basically the frame size, i.e. hop size = frame size, meaning no overlap. But I guess overlapping hops give better granularity. They are approximations anyway.
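The trade-off discussed here can be sketched on a toy signal. This is an illustration under assumed values (a 16-sample signal, frame size 8, and the hypothetical helper `amplitude_envelope`), not the code from the video:

```python
import numpy as np

def amplitude_envelope(signal, frame_size, hop_length):
    """Maximum absolute value per frame (illustrative helper)."""
    return np.array([
        np.max(np.abs(signal[i:i + frame_size]))
        for i in range(0, len(signal), hop_length)
    ])

# 16 silent samples with a single spike at index 9.
signal = np.zeros(16)
signal[9] = 1.0

# hop == frame size: frames do not overlap, one value per 8 samples.
no_overlap = amplitude_envelope(signal, frame_size=8, hop_length=8)

# hop == half the frame size: frames overlap by 50%, one value per 4 samples.
half_overlap = amplitude_envelope(signal, frame_size=8, hop_length=4)

print(no_overlap)
print(half_overlap)
```

With overlap, the spike is indeed reported by two neighbouring frames (the redundancy raised in the question), but the envelope is sampled on a time grid twice as fine.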

@baghdadiabdellatif1581 8 months ago

@@siddharthkumar5206 thank you

@baghdadiabdellatif1581 8 months ago

Thank you @@siddharthkumar5206

@baghdadiabdellatif1581 8 months ago

@@siddharthkumar5206 I think using hilbert() is better for envelope detection.

@francomarinelli9078 2 years ago

How do you keep the ae vector aligned with the signal vector? Shouldn't the hop_length jump reduce the size of the ae vector by a factor of hop_length? debussy has 661500 samples and ae_debussy has 1292 values, but in the plot they appear to cover the same span.
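The numbers in this question can be checked with a few lines of arithmetic. The sketch below assumes a hop length of 512 and a sample rate of 22050 Hz (both mentioned elsewhere in the thread) and only reproduces the bookkeeping, not the plotting:

```python
import numpy as np

SAMPLE_RATE = 22050   # assumed, librosa's default sample rate
HOP_LENGTH = 512      # assumed hop length
num_samples = 661500  # length of the debussy signal quoted in the comment

# One envelope value per hop, so the envelope is about 512 times shorter.
frame_starts = np.arange(0, num_samples, HOP_LENGTH)
num_frames = len(frame_starts)

# Each envelope value is placed at the start time of its frame, in seconds.
frame_times = frame_starts / SAMPLE_RATE
signal_duration = num_samples / SAMPLE_RATE

print(num_frames)        # number of envelope values
print(frame_times[-1])   # time of the last envelope value
print(signal_duration)   # duration of the whole signal
```

The envelope does have far fewer points than the signal; the two curves line up in the plot because both are drawn against time in seconds rather than against their array index.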

@ha97sa 1 year ago

Does the spectrum envelope differ from the amplitude envelope? If yes, how can I extend the Python code to get the spectrum envelope?

@tonym2540 3 years ago

Nice tutorial. I agree with the comment by Wolfram Blechner that installing packages is the hard part. Without that, nothing can be done. So I suggest providing info on that, or better yet showing at least one way of doing it. For example, after installing Anaconda and JupyterLab I made a 'tutorial' environment that works for me using:

    conda create -n tutorial matplotlib numpy librosa ipython

@bassman9261995 3 years ago

Would you ever want to use the maximum absolute value instead of the raw maximum? Do you think those would be so similar that it doesn’t matter, or would it be useful on a more volatile signal (something where a digital process has introduced error)?

@raghebbenhamouda3857 4 years ago

Great explanation.... Thank you very much

@pinghe795 4 years ago

Thank you for your video! It's very helpful. I have a question: why are there some signal values which seem higher than the plotted AE values? Shouldn't all the maximal values be included in the amplitude envelope? But the code seems correct.

@yanbingong5638 2 years ago

I have the same question. May I know if it's answered?

@siwarmassoud7989 3 years ago

Hi! How can we split audio by silence detection? Our audio files have silences of different lengths. Any help, please...

@humairakhan3372 3 months ago

Hi Valerio, I know this is an old video, but I would like to point out that to convert frames to time you need to specify the sample rate. I’m assuming librosa uses a default that happens to match the 22050 Hz audio you used?

@patrickzeng5668 1 year ago

Awesome tutorial! Quick question on the amplitude envelope plot. How come some of the amplitude spikes in the original waveform are not captured by the red AE line?

@SuperLucasGuns 4 years ago

How come there were some spikes in the amplitude envelope graph not highlighted in red?? thank you so much for the videos.

@ValerioVelardoTheSoundofAI 4 years ago

I think there are probably some artifacts with the plot. We're using librosa to visualise the waveform and matplotlib for the AE.

@soilman3b 3 years ago

Lucas, as a test, I trimmed down the 3 audio files to just a couple hundred samples where there was the big spike in debussy (just samples 355148:355348). The Debussy amplitudes in these 200 samples ranged from negative 0.62515 to positive 0.2832, but it looks like the librosa.display.waveplot is drawn as an envelope that is symmetric and ranges from negative 0.62515 to positive 0.62515. So I guess the waveplot tool is actually displaying an envelope of plus/minus the abs(sample values) rather than drawing the actual amplitude values.

@kevinchen1339 2 years ago

Thanks! I managed to do this successfully and it helped me a lot :D

@baghdadiabdellatif1581 8 months ago

Thank you. I have a question, please: why is this method better than using hilbert()?

    import numpy as np
    import matplotlib.pyplot as plt
    from scipy.signal import hilbert, chirp

    duration = 1.0
    fs = 400.0
    samples = int(fs * duration)
    t = np.arange(samples) / fs
    signal = chirp(t, 20.0, t[-1], 100.0)
    signal *= (1.0 + 0.5 * np.sin(2.0 * np.pi * 3.0 * t))

    # The amplitude envelope is given by the magnitude of the analytic signal.
    analytic_signal = hilbert(signal)
    amplitude_envelope = np.abs(analytic_signal)
    print(amplitude_envelope)

    plt.plot(t, signal, label='signal')
    plt.plot(t, amplitude_envelope, label='envelope')
    plt.show()

@sanazkhalili4698 6 months ago

Hello, thank you very much. It has been very useful for me. I'm having trouble understanding alpha in librosa.display. How does it affect the color in this plot? I mean, why do some points change even though there isn't any overlap in the plot?

@KingQuetzal 2 years ago

if you are trying to do this now waveplot() is now waveshow()

@i_am-ki_m 2 years ago

Nice tutorial, tip and tool! Keep MLAing! ;)

@meghnaadibhatla2342 4 years ago

I am facing this error while trying to use ipd.Audio: "ValueError: rate must be specified when data is a numpy array or list of audio samples." How do I resolve it?

@DOMINIK32110 3 years ago

just write .wav at the end of path

@la_kukka 2 years ago

@@DOMINIK32110 Doesn't work. Still facing the same issue

@la_kukka 2 years ago

There is a small mistake: remove the quotation marks from the code. E.g. it should look like ipd.Audio(debussy_file) instead of ipd.Audio('debussy_file')

@mizhibridge-to-knowledge7502 1 year ago

I have the same error and wasn't able to fix it.

@richi1235 8 months ago

then your path is not correct

@pvlr 10 months ago

What is the reason for some peaks being higher than the envelope?

@azizgundogdu9916 4 years ago

There's a problem occurring, which I cannot solve, when installing librosa from the cmd window. Can anybody help me?

@tomielcapodeamnesia 1 year ago

thank you so much for your video, it was really useful

@notallama1868 1 year ago

In the final amplitude envelope plots, I notice that there are a few peaks in blue that go above the amplitude envelope that was calculated. How did that happen?

@quinxx12 28 days ago

Gotta use the absolute value for the calculation of the amplitude envelope as it's not considering negative values of the signal data

@yerassylabilkassym7609 3 years ago

Thanks! That's really cool

@qayyumm5759 3 years ago

Hi, thank you for your efforts. It's a really awesome video and explanation. However, is it possible to split audio that contains overlapping speech? For example, if two people are talking in one recording and both speak at the same time, the audio will overlap. Is it possible to split the audio and recognise whether it belongs to speaker 1 or 2? Thank you

@fatmademir554 2 years ago

Hi, first of all I want to thank you so much, because I have watched all the videos up to this one and each is extremely beneficial. But now I have a problem. To import librosa, I first installed it from the Anaconda prompt using "conda install -c conda-forge librosa". After that, I tried to import librosa again but got an error in Jupyter: "ModuleNotFoundError: No module named 'librosa'". I searched on Google but could only find this same command ("conda install -c conda-forge librosa") as the solution. If you can help me, I will be thankful. Thanks in advance 🙏🙏

@Zuke22 3 years ago

this video is so valuable

@nitinmalhotra2560 3 years ago

Amazing videos, thanks a lot. One request: please use a larger font size, as it would be better when watching on mobile devices.

@ValerioVelardoTheSoundofAI 3 years ago

Thanks! When I'm in PyCharm I always use a large font. TBH I haven't found a way to increase the font size in Jupyter notebooks.

@kunalsuri8316 3 years ago

@@ValerioVelardoTheSoundofAI You have most probably found a way by now. In case you haven't, you can try increasing the font size of the entire browser.

@wesleyklhk 4 years ago

It is a great tutorial, but I have a question regarding librosa.frames_to_time. Since each frame contains 1024 samples, does the function always return the timestamp associated with the 1st sample of each frame (1024 samples)? If it does, then let's say there exists a frame where the maximum occurs at the 100th sample. If we now draw the amplitude envelope on the waveplot, then we are assigning the 100th sample value of that frame to the 1st sample timestamp of that frame. Am I correct?

@ValerioVelardoTheSoundofAI 4 years ago

That's a good point! I checked the documentation, but it doesn't say how librosa.frames_to_time works under the hood (you should check the code to learn more). However, I think your assumption regarding getting the timestamp for the first sample of a frame is reasonable. If that's the case, your second assumption is also correct.

@wesleyklhk 4 years ago

@@ValerioVelardoTheSoundofAI Thank you very much for your prompt response and advice. After digging into the implementation, I discovered that the 1st assumption is correct in this scenario because we have not passed anything to the n_fft parameter. Hence, it will be the 1st sample of each frame. However, if n_fft is not 0 or None, then it will not be the 1st sample.
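The behaviour described in this reply can be written out as plain arithmetic. The function below is a hypothetical re-implementation for illustration (the name `frames_to_time_sketch` is made up, and the half-window offset of `n_fft // 2` reflects a reading of the library source that should be verified against the installed librosa version):

```python
import numpy as np

def frames_to_time_sketch(frames, sr=22050, hop_length=512, n_fft=None):
    """Convert frame indices to seconds, mimicking the described behaviour.

    Without n_fft, a frame maps to the time of its first sample.
    With n_fft, the result is shifted by half a window (assumed offset).
    """
    offset = 0 if n_fft is None else n_fft // 2
    samples = np.asarray(frames) * hop_length + offset
    return samples / sr

# Default call: timestamps of the first sample of frames 0, 1 and 2.
start_times = frames_to_time_sketch([0, 1, 2])

# With n_fft=1024 the timestamp of frame 0 moves to sample 512.
centered_times = frames_to_time_sketch([0, 1, 2], n_fft=1024)

print(start_times)
print(centered_times)
```

The default `sr=22050` in this sketch also relates to the sample-rate question raised elsewhere in the thread: if the audio was loaded at a different rate, `sr` has to be passed explicitly or the time axis will be scaled incorrectly.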

@ValerioVelardoTheSoundofAI 4 years ago

Interesting!

@nazmdar 4 years ago

There is another classical way of extracting the envelope of a signal, which uses the Hilbert transform. Please refer to this article: en.wikipedia.org/wiki/Analytic_signal

@shanmukhchandrayama3903 3 years ago

Can someone explain why we need to convert the frames array to time using the frames_to_time function? Can someone briefly describe what we are actually doing there and why it is necessary?

@oussamaabouzid2887 4 years ago

Thank you so much

@sharonm1261 3 years ago

This is great!! Thank you!! A couple of questions if you have time, no worries if not! Do you have a link for signing up to the soundofAI Google community? I've spent an hour already but haven't worked it out yet; I accidentally made my own community 😂 Google Plus seems to have disappeared. I'll have another go later. I'm working with fairly high frequency animal calls and my amplitude envelopes are not looking quite right (they follow the signal both negative and positive). I'm not sure if it's due to the high frequency, or if maybe I need to make the amplitude of all the signals uniform before I start, or probably I've made a mistake somewhere. Not sure if I can post a screenshot; if I can, I will. (My sample frequency is 192 kHz, and I've also tried using a smaller frame size of only 64.) Strangely, the AE worked for one of my three signals but not the other two, so maybe I've just forgotten some code for the other two somewhere.

@mahdialipourstudent4992 2 years ago

Perfect explanation. How can I get the amplitude value at a given time?
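One possible way to answer this, sketched under assumptions: the signal is a NumPy array sampled at `sr` Hz and the envelope was computed with a known hop length. The helper names `sample_at` and `envelope_at` are hypothetical:

```python
import numpy as np

def sample_at(signal, t, sr=22050):
    """Return the raw sample closest to time t (in seconds)."""
    index = int(round(t * sr))
    return signal[index]

def envelope_at(envelope, t, sr=22050, hop_length=512):
    """Return the envelope value of the frame that starts at or before t."""
    frame = int(t * sr) // hop_length
    return envelope[frame]

# Toy data: 8 samples at 4 Hz, envelope computed with frame size 2 and hop 2.
sr = 4
signal = np.array([0.0, 0.5, -1.0, 0.25, 0.1, -0.3, 0.2, 0.0])
envelope = np.array([0.5, 1.0, 0.3, 0.2])

value = sample_at(signal, 0.5, sr=sr)                        # sample index 2
env_value = envelope_at(envelope, 0.5, sr=sr, hop_length=2)  # frame index 1

print(value, env_value)
```

The raw sample can be negative, while the envelope value is the non-negative peak of the surrounding frame, so which one to read depends on what "amplitude" is meant to capture.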

@joaquimgirbauxalabarder9765 3 years ago

Hello! First of all, thank you for all the videos; your content is really interesting! I have two questions (maybe they are basic): 1) I have watched the video on leakage; however, I do not understand why high frequencies appear at a discontinuity, and why this "discontinuity" appears at all (we do not replay the sample, even though it can't be separated into an integer number of periods). 2) I have tried with 2 different samples (some music that I have and the noise file from your website), but when I calculate the time for plt.plot(t, etc.) I have to calculate the time specifically for each sample. I cannot use the same t, because then an error appears. Thank you again for all of your videos! :D

@HelenProkonina 1 year ago

Hi! How can we calculate amplitude envelope for a multi-channel audio?

@ValerioVelardoTheSoundofAI 1 year ago

You can calculate it separately for each channel or, more typically, reduce the signal to mono and then calculate the envelope.
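Both options from this reply can be sketched with NumPy. The example assumes the audio is stored as an array of shape (channels, samples) and that the mono mix is a plain average of the channels; the helper `amplitude_envelope` is hypothetical:

```python
import numpy as np

def amplitude_envelope(signal, frame_size, hop_length):
    """Maximum absolute value per frame of a 1-D signal (illustrative helper)."""
    return np.array([
        np.max(np.abs(signal[i:i + frame_size]))
        for i in range(0, len(signal), hop_length)
    ])

# Toy stereo signal, shape (2 channels, 4 samples).
stereo = np.array([
    [0.1, -0.8, 0.2, 0.4],
    [0.3, 0.2, -0.1, -0.6],
])

# Option 1: one envelope per channel.
per_channel = np.array([
    amplitude_envelope(channel, frame_size=2, hop_length=2)
    for channel in stereo
])

# Option 2: average the channels to mono, then compute a single envelope.
mono = stereo.mean(axis=0)
mono_envelope = amplitude_envelope(mono, frame_size=2, hop_length=2)

print(per_channel)
print(mono_envelope)
```

Note that averaging can lower the peaks when the channels are out of phase, so the mono envelope is not simply the maximum of the per-channel envelopes.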

@siddharthshyamsunder418 3 years ago

A small question: where can I get the audio files so that I can practice on them?

@ValerioVelardoTheSoundofAI 3 years ago

If I remember correctly, I picked these files from the GTZAN Genre Dataset. I think you can find a link to the dataset in a previous video in the series.

@CULTURE_dz 4 years ago

Thanks before watching.

@jeremynx 4 years ago

thank you! great content!!!!

@ValerioVelardoTheSoundofAI 4 years ago

You're welcome!

@NonIntellego 2 years ago

Thank you so much for your videos. I really appreciate the time you've taken for your explanations. I have one question on the amplitude envelope discussed here. I thought it would be the same as an envelope detector (such as a Hilbert transform or a rectifier with a low-pass filter), but in the video you get a scalar per frame, whereas an envelope detector would provide a vector of the length of the frame. I don't understand the difference between the two types of envelopes, concept-wise. Is the one in this video more for assessing loudness? If you have the time, I'd love to watch a video of yours on envelope detectors in Python.
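The difference raised here can be made concrete with a small comparison. This is a sketch under assumptions: the analytic signal is built with a NumPy FFT (the same idea as the `scipy.signal.hilbert` call quoted elsewhere in the thread), the test tone is a 20 Hz carrier modulated at 1 Hz, and the frame size of 32 with hop 16 is arbitrary. The helper names are hypothetical:

```python
import numpy as np

def analytic_signal(x):
    """FFT-based analytic signal: zero the negative frequencies, double the positive."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(np.fft.fft(x) * weights)

def framed_envelope(x, frame_size, hop_length):
    """Frame-based envelope: one max-abs value per frame."""
    return np.array([
        np.max(np.abs(x[i:i + frame_size]))
        for i in range(0, len(x), hop_length)
    ])

fs = 400
t = np.arange(fs) / fs                       # one second of samples
modulation = 1.0 + 0.5 * np.sin(2 * np.pi * 1.0 * t)
signal = modulation * np.cos(2 * np.pi * 20.0 * t)

hilbert_env = np.abs(analytic_signal(signal))                    # one value per sample
framed_env = framed_envelope(signal, frame_size=32, hop_length=16)  # one value per frame

print(hilbert_env.shape, framed_env.shape)
```

The Hilbert envelope is a sample-by-sample curve that recovers the modulation, while the frame-based version is a coarser summary with one number per hop, which is the shape usually wanted for a compact per-frame feature.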

@raghavagrawal6263 3 years ago

Very nice tutorial. But if you could zoom your screen a little bit it would be very very nice.

@ValerioVelardoTheSoundofAI 3 years ago

I don't know how to increase the font size in Jupyter notebooks.

@amangoel8724 4 years ago

Hey Valerio if possible try to increase the font size in your upcoming videos.

@ValerioVelardoTheSoundofAI 4 years ago

Thank you for the tip! I still haven't found a way of resizing the font in Jupyter. That's one of the n reasons why I prefer to make videos with Pycharm :D

@6tyelement979 4 years ago

Great but I wish it was done in matlab but nevertheless great

@ValerioVelardoTheSoundofAI 4 years ago

Thanks!

@noumanijaz5353 2 years ago

It's a great and tiring job you did, thank you a lot. But it would be really good if you took the input data from multiple .wav files in an audio folder. Thank you

@zalasyu 2 years ago

Had to make it 666 Likes lol

@Bihari_Chaman 2 years ago

Wear black tshirt. Your face is only visible in video then :-)

@meganhung9698 2 years ago

Hi! I am excited about this calculation. Currently, the envelope is calculated over the whole piece, but I am wondering whether there is any way to get the array of envelopes for every second. I mean, for the first second the amplitude_envelope is [[0.00000000e+00, 2.32199546e-02, 4.64399093e-02, ..., 4.00544218e+01, 4.00776417e+01, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00], for the second one it is ..., and so on. Thanks!
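One way to get per-second chunks is to group envelope values by the second in which their frame starts. This is a sketch under assumptions: the helper name `envelope_per_second` is hypothetical, and each value is assigned to the start time of its frame:

```python
import numpy as np

def envelope_per_second(envelope, sr=22050, hop_length=512):
    """Split an envelope into a list of arrays, one per second of audio."""
    envelope = np.asarray(envelope)
    if len(envelope) == 0:
        return []
    # Start time of each frame in seconds, then the whole second it falls in.
    times = np.arange(len(envelope)) * hop_length / sr
    seconds = np.floor(times).astype(int)
    return [envelope[seconds == s] for s in range(seconds.max() + 1)]

# Toy example: sample rate 8 Hz and hop 2 give 4 envelope values per second.
toy_envelope = np.array([0.1, 0.4, 0.3, 0.2, 0.9, 0.8, 0.7, 0.6, 0.5, 0.05])
chunks = envelope_per_second(toy_envelope, sr=8, hop_length=2)

for second, chunk in enumerate(chunks):
    print(second, chunk)
```

With the assumed defaults of 22050 Hz and a hop of 512, the number of frames per second is not an integer (about 43.07), so the chunks will alternate between 43 and 44 values.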

@albertgonzalez9691 1 year ago

For anyone in 2023 who is running into issues with librosa.display.waveshow(debussy) or librosa.display.waveplot(debussy): I was able to get it to work using numpy instead:

    import numpy as np
    import matplotlib.pyplot as plt

    # Time axis in seconds, assuming a 22050 Hz sample rate
    plt.plot(np.linspace(0, len(debussy) / 22050, len(debussy)), debussy)
    plt.title("debussy")
    plt.ylim([-1, 1])