Answering your question in the video about Pydantic validation (~14:53): pydantic's default mode is to validate on instantiation only. But you can set validate_assignment=True in your model's Config to validate on assignment as well.
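A minimal sketch of the setting mentioned above, shown with the v1-style inner Config class (Pydantic 2.x prefers model_config = ConfigDict(...) instead):

```python
from pydantic import BaseModel, ValidationError

class Item(BaseModel):
    price: int

    class Config:
        validate_assignment = True  # re-validate on every attribute assignment

item = Item(price=10)  # validated at instantiation, as always
try:
    item.price = "not a number"  # now also validated on assignment
except ValidationError:
    print("assignment rejected, item.price is still 10")
```

Without validate_assignment, the bad assignment above would silently succeed and leave a string in an int field.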
I don't get the purpose of this. When would you want to validate on instantiation but not on assignment? Sounds like a good way to complicate your debugging if you assume a variable is a particular type and it then turns out to have been overwritten with something completely different.
@@drheaddamage I guess the use case for validating only at instantiation is that after that point you can trust the data based on the code, and skipping later validation may provide performance benefits. The instantiation could be with external data (user input, config file, API data), while the subsequent assignments come from the code itself (e.g. calculations).
Also, validation on every assignment is quite an expensive operation for pydantic. There is a pattern where you create only unchangeable instances, like using the dataclasses 'frozen' init parameter. If you want to change an instance, you create a new one instead.
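A minimal sketch of that "never mutate, create a new one" pattern with frozen dataclasses; dataclasses.replace builds a modified copy:

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Order:
    order_id: int
    status: str = "open"

order = Order(order_id=1)
# order.status = "shipped"  # would raise FrozenInstanceError
shipped = replace(order, status="shipped")  # new instance instead of mutation
print(order.status, shipped.status)  # → open shipped
```

The original instance is untouched, so anything already holding a reference to it can keep trusting its validated state.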
@@drheaddamage Validation on assignment can be very expensive. If you use a functional approach, you never modify data (objects), just create new ones, so it makes sense.
@@drheaddamage It is not needed with frozen classes, and those are extensively used in some projects. I think there was an argument that this is a slightly more secure way of keeping data: creating new objects instead of editing existing ones.
Arjan it would be awesome if you'd interview the creator of pydantic. It would be a fantastic episode of "Become A Better Software Developer" or something like that. I don't know if you've thought about this kind of format. I'd love to see pros dive into the nitty gritty and share ideas or the way to do stuff. Just a thought.
Great video! The idea is great... comparing tools and then giving your nuggets of wisdom on which one to choose for each job... It's great to get a glimpse of your experience when choosing tools. Suggestion for a next one: a comparison of web frameworks (such as FastAPI, Django, Flask).
It's worth mentioning performance. Dataclasses are way faster than BaseModel (I learnt this the hard way :-/). Some improvements are expected with Pydantic 2.0, but for now:

In [1]: from dataclasses import dataclass
In [2]: @dataclass
   ...: class A:
   ...:     a: int = 1
   ...:     b: str = "one"
   ...:
In [3]: %timeit a = A()
80.5 ns ± 0.212 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)
In [5]: from pydantic import BaseModel
In [6]: class B(BaseModel):
   ...:     a: int = 1
   ...:     b: str = "one"
   ...:
In [7]: %timeit b = B()
1.03 µs ± 3.7 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
Hi Arjan, thanks for the effort you put into these videos; I've gotten a lot of inspiration from them over the last years! I wanted to add that Pydantic also has a dataclass decorator which is a drop-in replacement for standard dataclasses, with all the validation features available as for the Pydantic BaseModel. Perhaps that would have been the more comparable choice. Keep up the great work! Timo
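A hedged sketch of that decorator (the field names here are just illustrative): pydantic.dataclasses.dataclass adds validation to what otherwise looks like a standard dataclass.

```python
from pydantic.dataclasses import dataclass

@dataclass
class Fruit:
    name: str
    price: int

fruit = Fruit(name="banana", price="3")  # "3" is coerced to int by validation
print(fruit)

try:
    Fruit(name="apple", price="cheap")   # non-numeric string fails validation
except Exception:
    print("validation error, as a plain dataclass would NOT have raised")
```

A stdlib @dataclass would have happily stored "cheap" in the price field; that is the whole difference.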
@@evandrofilipe1526 it has all the validation features, but it is not a replacement for the Pydantic BaseModel. I recommend their docs for further details.
One thing to note is that the Pydantic dataclass has some restrictions compared to the Pydantic BaseModel features. Would be nice to compare the Python built-in dataclass, the Pydantic dataclass and the traditional Pydantic BaseModel.
I had chosen pydantic for a few projects, but now I'm ripping it all out because of the incompatible API changes brought by 2.0. It appears the authors of pydantic are _very_ opinionated about how method names should be formatted, so they cavalierly replaced parse_raw with model_validate_json. Not only is the function name much longer, but it apparently does exactly the same thing. If I keep pydantic in the project, it's like a ticking time bomb for future developers.
I've been rocking Pydantic for about a month now and I'm absolutely in love with it. It's simpler to use than dataclasses IMO when working with REST APIs. Granted, I compose a bunch of base models into another class and then use my own methods to manipulate the data, but it works out nicely in the end.
Thanks for the video! We use both dataclasses and pydantic in our projects for 2 different cases. Pydantic is more than a schema validation library today but we consider it should be used as a schema validation tool only. So, we use it extensively for the request/response schemas and some other things. On the other hand we have data models that we map on the corresponding DB tables and we use dataclasses for that. Basically, the standard flow is request -> pydantic validation -> some logic and dataclass models -> pydantic validation -> response. Sometimes it may be handy to map the schema to the model data automagically and for that case there is an orm_mode flag in pydantic. I just want to say they are not competitors at all.
I've used pandera pretty heavily in production and it's very capable and the developer is super helpful. But I haven't explored pydantic extensively, and I feel there might be some advantages to using it in exchange for writing quite a lot more code
I have a question about dataclasses vs. normal classes. When I use a dataclass to define one kind of data, it's perfect at the beginning. However, more and more methods then get added based on the attributes of the dataclass, the class becomes heavier and heavier, and some of the methods do something very complicated. I am not sure whether this is good practice and whether I should convert the dataclass to a normal class. Could you please give me some suggestions about the use cases for normal classes and dataclasses?
Great video! A question that pops up: where are you getting your knowledge from? Reading the documentation is nice, and playing around with a module or a library is great, but aren't you afraid of missing a few of the features? Are there any good forums or knowledge resources you use to stay up to date and understand all the possible features of a library or module you use?
For pydantic not printing the object type, just do print(repr(banana)) and it will print it in a nice form that looks exactly like you would create it in code. repr calls the __repr__ dunder method, which all Python classes inherit from object by default. In Python, print calls __str__, which, unless overridden, actually calls __repr__. If you have ever printed an object to the terminal and gotten something like <__main__.Banana object at 0x...>, that's because __str__ has not been overridden and the object has inherited __repr__ from object. I guess YouTube is somehow messing with my underscores and turning my text italic, but I think it's still understandable.
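A tiny illustration of the print/repr chain described above (class names are made up):

```python
class Plain:
    pass  # no __repr__ or __str__ of its own

class WithRepr:
    def __repr__(self):
        return "WithRepr()"

# Falls back to object.__repr__: something like <__main__.Plain object at 0x...>
print(Plain())

# print uses __str__, which by default delegates to __repr__
print(WithRepr())  # → WithRepr()
```

So overriding __repr__ alone is usually enough to get readable output from both print and the interactive prompt.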
Hehe, this is REALLY a nit. I was taught that zero is neither negative nor positive, so that function in the attrs example is non-negative, not positive. (Of course, with IEEE-754 we can get both a negative and a positive zero, but that is only useful for a zero that isn't really zero, just too small.)
Can't read all the comments, but I think I saw a bug in the dataclass example. If your code ever ran through midnight, you'd need that default for the order date to also be dynamic with default_factory; otherwise it'll be the date of the creation of the *class*, not the instance. So if I started my program in December 2023, all my orders would come up with that date by default.
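A sketch of the fix being suggested: with default_factory, the date is evaluated per instance, not once when the class body runs. The Order class here is a stand-in for the one in the video.

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass
class Order:
    # order_date: date = date.today()  # BUG: frozen at class-definition time
    order_date: date = field(default_factory=date.today)  # evaluated per instance

order = Order()
print(order.order_date)  # today's date at instantiation time
```

A plain `date.today()` default would also be rejected for mutable types; for immutable ones like date, it is accepted but silently stale, which is the subtle bug above.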
As a newcomer I was hoping to find The Answer: what should I be using from now on? After watching this I'm not sure; it seems odd to have a choice of 3. What will it be like 5 years from now?
And another nit in the attrs example, and this may very well be my misunderstanding of str: instead of lower, shouldn't we rather use casefold for comparisons? As I understand it, that is dedicated to exactly that purpose.
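A quick illustration of the lower() vs casefold() difference: casefold is more aggressive and is the one intended for caseless comparisons.

```python
s = "straße"
print(s.lower())     # → straße  (German ß is already lowercase, unchanged)
print(s.casefold())  # → strasse (ß is folded to ss for comparison purposes)

# This is why a caseless match can succeed with casefold but fail with lower:
print("STRASSE".casefold() == s.casefold())  # → True
print("STRASSE".lower() == s.lower())        # → False
```

For pure-ASCII data the two behave identically, which is why lower() usually works anyway.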
Awesome video! I use dataclasses and Pydantic a lot. I'll take a look at attrs now. Adding Mypy to my dev workflow has also helped me deliver better code.
Strange that attrs was the basis for dataclasses yet the same functions for validation were not carried over. Is attrs using a different package to add validation? What about just using a validation class in the first place, as this is a case where inheritance is a good thing?
Great video. Good explanation for using int in price. This, like your other videos, come in handy in lots of different scenarios. Thank you for your contribution.
About the representation in pydantic, instead of printing the banana, you can do print(repr(banana)), you will get Banana(name='banana', category='fruit'....).
I haven't been able to understand Enum classes and what exactly they are useful for. I've been looking for videos that explain them well, but there aren't any. Could you please make one?
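Not a full answer, but a minimal sketch of what Enum gives you: a fixed set of named constants instead of loose strings scattered through the code (the Category example mirrors the video's fruit theme):

```python
from enum import Enum

class Category(Enum):
    FRUIT = "fruit"
    VEGETABLE = "vegetable"

item = Category.FRUIT
print(item, item.value)        # Category.FRUIT fruit
print(Category("fruit"))       # look a member up by its value
print(item is Category.FRUIT)  # members are singletons, so `is` works: True

# A typo'd string would pass silently; a typo'd member name fails loudly:
# Category.FRUTI  -> AttributeError at the point of the mistake
```

Typos, exhaustive matching, and iteration over all allowed values are the main wins over plain string constants.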
I honestly used dataclasses for some time, but I stopped using them. I prefer having all classes written according to the standard rules for classes. So no instance variables in the part where you would standard see the class variables defined. Instead of mixing the dataclasses into regular classes, It would have been better if a new type of datastructure would be implemented in Python, and which is available by DEFAULT. Making the distinction with a decorator just doesn't do it for me... I like to say that pydantic is by the way a 'cleaner' solution imo...
You missed something important: pydantic has its own dataclass decorator which can be used much like a BaseModel, but is fully API compatible with a built-in dataclass
Great overview, thank you :) Personal preference: dataclass with `slots=True`. I hope we get built-in object pooling like `pool=300` one day for dataclasses.
Nice video. I learned about attrs from Fluent Python, 2nd edition, by Luciano Ramalho; there is an entire section in that book about data classes. If you ask me, I prefer attrs or Pydantic. You can use dataclass to prototype an initial version of some tiny app, but once you need to build something more professional, you definitely need to go beyond it, and attrs or Pydantic is the correct choice.
Great video! I actually started using dataclasses when I saw your video about them a few months ago. I'd like you to make a short video with the pros and cons of your recommendation to use int instead of float for price fields. You left me thinking about that idea. Many thanks! 😋
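An illustrative sketch of the int-for-prices idea (variable names are made up): store cents as an integer so all arithmetic is exact, and only format as dollars at the edges.

```python
price_cents = 19_99      # $19.99 stored exactly as 1999 cents
total = price_cents * 3  # exact integer arithmetic: 5997
print(total / 100)       # format as dollars only for display: 59.97

# The binary floating-point pitfall this avoids:
print(0.1 + 0.2 == 0.3)  # → False
```

decimal.Decimal is the other common answer here; int cents is the simpler option when you only ever need one fixed precision.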
I like attrs a lot, but it is a bit funny to me that we say: "Python is great because it is dynamically typed", and then the first thing we do is strictly typing using one of these packages...
I am not sure if my question is clear and related to this video. Sometimes in my classes there are a lot of lines in the init/post_init method. The reason is that I don't want the same thing to be calculated many times, and saving the result in memory can speed up the process. Maybe I should just use a function instead?
If it's a complicated computation, I would suggest moving it out of the init method and into a separate function. You could also look into functools' cached_property decorator, which computes a value once and then memoizes it.
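A sketch of that cached_property suggestion: the expensive body runs on first access only, and the result is memoized on the instance (the call counter here is just to make that visible).

```python
from functools import cached_property

class Report:
    def __init__(self, data):
        self.data = data
        self.calls = 0

    @cached_property
    def total(self):
        self.calls += 1        # count how often the body actually runs
        return sum(self.data)  # stand-in for an expensive computation

r = Report([1, 2, 3])
print(r.total, r.total)  # → 6 6
print(r.calls)           # body ran only once: 1
```

This keeps __init__ thin and defers the cost until the value is actually needed, which also helps when some instances never use it.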
7:38 As soon as you said it uses subclassing, I immediately thought “there must be a metaclass involved”. Had a look at the source code, and yes, that is how it works.
Pydantic's so good for ensuring that your Python objects can easily become dicts, serialize to JSON, parse_raw query results, and validate, all at the same time. I use pydantic all the time so other developers only need to learn pydantic models and don't have to deal with the quirks of dataclasses fields.
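A hedged sketch of that serialize/parse round-trip, shown with the v1-style names used in this thread (Pydantic 2.x renames these to model_dump_json() / model_validate_json(), as another comment here notes):

```python
from pydantic import BaseModel

class Fruit(BaseModel):
    name: str
    price: int

fruit = Fruit(name="banana", price=3)
raw = fruit.json()               # serialize to a JSON string
restored = Fruit.parse_raw(raw)  # parse it back, validating along the way
print(restored == fruit)         # field-by-field equality: True
```

The same validation runs on the way back in, so malformed or hand-edited JSON fails at the boundary rather than deep inside the program.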