So is it basically a variant of a pixel attack? Or an attack based on adding noise? I forget what the corresponding name was. Thank you for all your work on this channel; you are doing a great service to the community.
33:45 Inducing correlated weights might be useful for a kind of distillation, since you could measure the main characteristics of the "teacher" network and induce those correlations and weight distributions in the "student".
I didn't really get your idea at the end. If your alterations to the data are not bound to a specific class, how would you force the network to pay attention to the alterations?
If the cosine difference between unmarked and marked data in the same class is significant, it should be just as easy to tell the two apart in the black-box test by comparing cosine differences between the feature vectors of samples within the same class. Or, say, take all pairwise differences between the feature vectors of samples of the same class and do a PCA or something; we should expect one of the eigenvectors to be a signature of the marked data. For a defence to be effective, though, the effort to thwart the defence has to be greater than the benefit of using said dataset. Given that you have to train a decent model just to detect whether the data has been marked in the first place, I'd say this defence is effective? Somewhat? EDIT: Ooh, your suggestion does make it way more sneaky.
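A rough numpy sketch of that pairwise-difference-plus-PCA idea (this is my own construction, not the paper's procedure; `features` is assumed to be the matrix of feature vectors you extracted for one class with your own trained model):

```python
import numpy as np

def marked_direction_candidates(features, n_components=5):
    """features: (n, d) array of feature vectors from samples of ONE class.
    If part of the class was marked with a carrier direction u, the
    differences between marked and unmarked samples should concentrate
    along u, showing up as a dominant principal component."""
    n = features.shape[0]
    i, j = np.triu_indices(n, k=1)          # all pairs with i < j
    diffs = features[i] - features[j]       # (n*(n-1)/2, d) pairwise diffs
    diffs = diffs - diffs.mean(axis=0)      # center before PCA
    _, s, vt = np.linalg.svd(diffs, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    return vt[:n_components], explained[:n_components]

# usage (hypothetical): feats = phi(images_of_one_class)
# dirs, var = marked_direction_candidates(feats)
# an unusually dominant first component would be the suspicious sign
```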
Can you explain how to read Figures 4 and 5? And since you mention alignment throughout the paper, why don't you use the angle between the translation vector (\phi_0(x) - \phi_t(x)) and u to determine whether the marked data were used? What is the benefit of resorting to the beta distribution?
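Not the authors, but as I understand it the point of the beta distribution is calibration rather than just measuring an angle: for a uniformly random unit vector in R^d, the cosine c with any fixed direction satisfies (c + 1)/2 ~ Beta((d-1)/2, (d-1)/2), so an observed alignment can be turned into an exact p-value under the null hypothesis that u is unrelated to the features. A raw angle threshold gives no such false-positive guarantee. Minimal sketch (scipy; `d` is the feature dimension):

```python
from scipy.stats import beta

def cosine_pvalue(c, d):
    """p-value for seeing cosine similarity >= c between a fixed direction
    and a uniformly random unit vector in R^d, using the null-hypothesis
    law (C + 1) / 2 ~ Beta((d-1)/2, (d-1)/2)."""
    a = (d - 1) / 2.0
    return beta.sf((c + 1) / 2.0, a, a)

# e.g. cosine_pvalue(0.2, 256) is on the order of 1e-3:
# in high dimensions even modest cosines are very unlikely by chance
```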
I'm highly skeptical about this whole data-marking idea: whatever you do, the mark needs to be invisible to the eye, i.e. small. And if it's small, it will surely disappear after converting the image to JPEG, blurring it, or applying some other slight modification. To me it seems downright impossible to get around this problem.
I thought the same. Degrading the data would not only help defend against this kind of membership inference attack, but also make the classifier more robust to noise. I wish the authors had explored the effect of more data augmentations on attack performance, beyond just crop and resize. As for getting around this problem: the watermark needs to be robust to noise at marking time, so Eq. 7 in the paper should take that into account.
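One way to take that into account (my speculation, not something the paper necessarily does) is to optimize the mark under random augmentations, in the spirit of expectation over transformation. A hedged PyTorch sketch where `phi` (a frozen feature extractor), `u` (the carrier direction), and `x` (a CHW image tensor) are all assumptions:

```python
import torch
import torchvision.transforms as T

# random transforms the mark should survive (crop and blur as stand-ins
# for resize/JPEG-style degradation)
augment = T.Compose([
    T.RandomResizedCrop(224, scale=(0.7, 1.0)),
    T.GaussianBlur(kernel_size=5),
])

def mark_image(phi, x, u, steps=100, lr=0.01, lam=1.0):
    """Add a small perturbation delta to x whose features stay aligned
    with the carrier u even after random augmentations."""
    delta = torch.zeros_like(x, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        x_aug = augment(x + delta)                 # fresh transform each step
        feat = phi(x_aug.unsqueeze(0)).squeeze(0)
        align = torch.nn.functional.cosine_similarity(feat, u, dim=0)
        loss = -align + lam * delta.norm()         # align features, keep mark small
        opt.zero_grad()
        loss.backward()
        opt.step()
    return (x + delta).detach()
```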
What happens if you use a bigger model on the radioactive data? Or just a different architecture? Shouldn't that break the whole thing, assuming a different architecture learns different features, e.g. an FFN vs. a CNN?
I was confused too. But I think you just need to feed the model a sample that only contains the radioactive feature for a particular class and see if it tends to classify it as such.
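If the mark really did surface as a pixel-space pattern, the probe could be as simple as the following (pure speculation; `model`, `carrier`, and the class index `k` are all hypothetical, and in the paper the carrier actually lives in feature space, so treat this only as the black-box intuition):

```python
import torch

def probe(model, carrier, k, scale=3.0):
    """Feed an input that is (mostly) just the carrier pattern for class k
    and check whether the suspect model gravitates toward that class."""
    x = scale * carrier                  # carrier-only input, amplified
    with torch.no_grad():
        pred = model(x.unsqueeze(0)).argmax(dim=1).item()
    return pred == k                     # suspicious if it predicts class k
```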
@YannicKilcher But how can you compute the cosine similarity when the feature sizes are different? The transformation M would not be d×d. In that case, do you need to train a model with the same architecture to find out?
And in that case, you cannot guarantee your training process is the same as the other trainer's, which contradicts the prior assumption that the training process is unknown.
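For what it's worth, nothing forces M to be square: you can fit a rectangular map between the two feature spaces by ordinary least squares on features of common, unmarked images, and then carry u across. A numpy sketch, where `F0` (n, d0) and `Ft` (n, dt) are assumed feature matrices from your own extractor and the suspect one on the same images:

```python
import numpy as np

def fit_alignment(F0, Ft):
    """Least-squares fit of M with Ft ≈ F0 @ M; M has shape (d0, dt),
    so the two feature dimensions need not match."""
    M, *_ = np.linalg.lstsq(F0, Ft, rcond=None)
    return M

def aligned_cosine(u, M, ft):
    """Cosine similarity between the mapped carrier u @ M and a feature
    vector ft from the suspect network."""
    v = u @ M
    return float(v @ ft / (np.linalg.norm(v) * np.linalg.norm(ft)))
```

Whether a linear map is expressive enough across very different architectures is another question, of course.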
Well, here is what I feel uneasy about: the assumption that the feature extractors are related by a simple linear transformation. I may be wrong, but there was a video on your channel showing that even different initializations of a network with the same architecture can lead to drastically different results after training, with the weights stuck in completely different regions of the weight space. And with a different architecture, the internal behaviour, the feature extraction, seems to have little in common with the setup trained by those who want to protect their data.
This seems intended to work like watermarking pictures, in that it lets you demonstrate that a network was trained on data you marked (analogous to proving a picture is yours by pointing at the watermark), but different in that, without knowledge of how the data was marked, one can't tell whether it was marked at all? Or wait, is watermarking already an established idea in the context of training data?