2018-11-09 | DjangoCon US 2018 | Django REST Framework: Moving Past the Tutorial to Production by Drew Winstel
Django REST Framework from Tutorial to Production: Optimizing API Design and Using Third-Party Libraries
Tags
Media details
- Upload date
- 2025-06-21 19:08
- Source
- https://www.youtube.com/watch?v=-9WniUBt0fo
- Processing status
- Completed
- Transcription status
- Completed
- Latest LLM Model
- gemini-2.5-pro
Transcript
speaker 1: Thank you, Russell. Morning, everyone, and welcome to my first ever conference talk. Today I'll be talking about moving Django REST Framework from the tutorial to production. Django REST Framework is a great piece of software, but it's a bit of a mouthful, so I'm just going to say DRF from here on out.
So why are we here today? What's the goal behind the presentation? Are you here just to see me do a song and dance? Too bad, you're all in the wrong room. You've made your first stab at creating a DRF API. But what's next? You want to make things easier on your client developers so they can actually get the data they want. I'm here to show you the things that helped me and my team move from the DRF tutorial into production while converting a homegrown PHP stack into a DRF and, excuse me, a DRF and React web app. I'm not here to authoritatively say this is the only way to get things done, but it works for me.
So by now, you've probably seen jokes like this on the web before. How to draw a horse? Step one, draw two circles. Step two, draw the legs. Step three, draw the face. And step four, draw the hair. Step five, add small details. Well, today I'm going to help you put some of those small details into your project to help you move towards production.
So, a few assumptions. I'm assuming that you at least have basic familiarity with DRF terminology and you've at least touched django-filter and how it works with DRF. If you were at Phil's API-driven Django tutorial on Sunday, you'll be fine.
So here's a quick overview of what I'll be talking about today. I'll introduce myself, talk about making the APIs nice for end users, and talk about a couple of useful libraries for documentation. Before I get started, though, a couple of quick notes. If you have questions but you're a little camera shy or you don't want your voice recorded, that's fine. I understand that feeling entirely.
You can either use that link up at the top of the slide or send a Slack DM or Twitter message to Russ, our conference chair. He is @freakboy3742, that's freakboy 3-7-4-2, on both Twitter and Slack, and he'll be happy to ask for you. I also just tweeted out that link a few minutes ago. You can find me on Twitter at hobsansmoke; like any good Python developer, that's in snake case.
So here's a rough overview of how I got to where I am today in front of you. I've had about four years of DRF experience, starting way back when South migrations were still a thing, when it wasn't built into Django yet, and kept going all the way through Django 2.0. My degree is in wireless electrical engineering from Auburn, War Eagle, in 2008. I did a mix of defense and Internet of Things work before coming to Rackspace, where I've been since June of this year and where I'm developing REST APIs using Flask and MongoDB. I've got code up on GitHub and GitLab already. You can find me there and take a look at the example code. Run, sorry, read it and hopefully learn from it. No guarantees. And as I mentioned, I'm on Twitter and also on Mastodon as well.
So, a quick recap. Serializers: these are probably the most powerful thing in DRF. They're wonderful. They're responsible for converting your data between the Django instances you know and love and formats that can easily be transferred over the web, like JSON or XML, or you can write your own renderers if you really feel like doing something unusual. They delegate that responsibility to the individual fields, which are defined in your serializer classes. Those use the to_internal_value and to_representation methods to convert to and from serializable types, respectively.
On the left, it's JSON submitted from a client. It's been converted to a Python dictionary by the viewset. It has a date and a couple of decimal objects.
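The conversion described above can be sketched without DRF at all. What follows is a minimal, framework-free illustration of what field-level `to_internal_value` and `to_representation` do for a date and two decimal values. The field names (`visit_date`, `weight_kg`, `fee`) and both helper functions are hypothetical stand-ins, not DRF's API and not the speaker's example code.

```python
from datetime import date
from decimal import Decimal

# Hypothetical payload, as a client might submit it. JSON has no decimal
# type, so the decimal values travel as strings.
payload = {"visit_date": "2018-10-14", "weight_kg": "23.50", "fee": "45.00"}


def to_internal_value(data):
    """Convert serializable types (strings) into richer Python types."""
    return {
        "visit_date": date.fromisoformat(data["visit_date"]),
        "weight_kg": Decimal(data["weight_kg"]),
        "fee": Decimal(data["fee"]),
    }


def to_representation(obj):
    """Do the reverse: turn Python types back into serializable strings."""
    return {
        "visit_date": obj["visit_date"].isoformat(),
        "weight_kg": str(obj["weight_kg"]),
        "fee": str(obj["fee"]),
    }


validated = to_internal_value(payload)
round_tripped = to_representation(validated)
```

Keeping the decimals as strings on the wire and as `Decimal` objects in Python means the round trip returns exactly what the client sent.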
A quick tip: if you're using coordinates in your models but don't actually have GeoDjango installed, make sure you save them as decimal fields, otherwise you'll lose data to rounding errors. My previous database found that out the hard way. Also, JSON does not have a decimal object, which is why you see them as strings in there. And then this is what results from the validated_data property of the serializer. As you can see, the serializer has converted the date into a datetime.date object and the two decimal strings into Python Decimal objects. Then the serializers use either the create or update methods to save those fields into the database using your Django models.
So to_internal_value, like I mentioned, takes serializable types like numbers and strings and turns them into Python types, like datetimes, decimals, you name it. You can build a translator for it. And then there's to_representation, which does the same thing, only in reverse. Then DRF serializers save that data using create and update. Typically, you won't have to modify these unless you're modifying multiple models in the same serializer, which I'm going to demonstrate a little bit later.
Viewsets are extremely powerful combinations of generic classes that provide general-purpose methods for basic API access. You get auth, access control, queryset generation, serializer selection, and filtering, all just by providing class attributes like these. If your first exposure to DRF was Phil's tutorial on Sunday, just combine the list and detail generic classes that you used during that tutorial, and that gives you the concept behind a viewset. It also includes very simple create, retrieve, update and destroy actions, known as CRUD, that will serve the majority of your needs really well.
So let's talk about what it does in detail. After authenticating the user, the viewset's first task is to call each permission class's has_permission method.
If any of these methods don't return True, the request stops and returns a 403 Forbidden. Next up, the viewset will refresh the queryset by calling all() on this attribute. If it's looking for a specific object, like for an update or a retrieve action, then the viewset will also call each permission class's has_object_permission with that object as an argument. That's another chance for a 403 Forbidden to fall out. In the case of list actions, the viewset will then pass the queryset through each filter backend you specified, which modifies the queryset appropriately. If, like me, you're using the django-filter backend, which you probably will be, the django-filter backend looks up the filter_class attribute and uses that to modify the queryset. Next up, the viewset passes the queryset results into the serializer class, which converts from complex data types into easily serializable types, which, you'll remember, are things like strings, numbers, booleans, and null. And if you're using a list route, it'll feed the pagination class into the serializer, so it only has to serialize a subset of the queryset, assuming you have more results than your pagination allows.
In this talk, I'm going to use some horribly, horribly contrived vet clinic examples. This has nothing to do with the market that our app was working in. It's just a convenient excuse for animal pictures. So please enjoy. This is Sassy. She was my grandparents' half black Lab, half Rottweiler. She would never be caught dead without a toy in her hands, or her mouth, at any time. She was probably the sweetest dog you ever could have met.
So what do I mean by making the API client friendly? It's about knowing your users.
If you're working with end users accessing your data via a web or mobile app, speed is usually far more of a concern than if you're dealing with automated services, where an extra second of delay for getting all the data at once is worthwhile. But for mobile apps especially, speed is so important that an extra delay of, say, a second while fetching data is the difference between a four-star and a two-star rating in the app stores. Nobody wants a bad rating.
So let's talk about related fields. There's a problem you'll always run into when dealing with information transfer. Your model of the data never matches how your users see the data. It's not a bad thing. It's just a fact of life. Your users care about different things than you do. For example, in a vet clinic, your user only cares that cida is a black Lab, not that he is breed ID 42. How do we simultaneously give the user the information she needs (black Lab) and the information she needs in case she needs to update things later (breed ID 42)? We use nested serializers.
So here's a trivial example of how you embed those into the serializer. You can just call the serializer, and then DRF will take care of the rest for GET operations. But wait, remember how I only said GET requests? The DRF documentation specifically says they don't provide an implementation for saving serializers with writable nested fields. So how do we work around that?
By the way, that is my dog, Henry. He's half corgi, half something. He was badly abused as a puppy, left outside during those terrific tornadoes that came through Alabama in 2011, and horribly afraid of people. It took almost a year before I could pet him. But with my three-year-old daughter, he's like her best friend ever. So it's kind of fun. Of course, she drops food all the time. That makes it easier.
Now, you're not going to like this part. There is no single right answer for all use cases.
You've got three really viable options for handling updates to related data. You can override the create and update methods in the serializer that hosts the relationship, such as the breed serializer hosting the species relationship. Or you could create a separate write-only field, like species_id. Or, last, you could create a separate field entirely to describe the relationship, where you actually build a field class.
Option one is overriding create and update. You use this if your users are most likely to create new data rather than updating existing data. You have a few questions to answer, though. Number one, how does the user create data? The easy answer: you post a dictionary without a primary key. The next question is, should the user be allowed to update existing data if the data in that related dictionary does not match what's in the database for the object? You can say yes. You can say no. Either way is fine. Just make sure you document it and be extremely firm and consistent with your decision. So don't have one endpoint where you can update things and one where you can't. That will just confuse your users and make everyone miserable.
So the create method is relatively simple. You have to pull out the related field and look up the source data. Let's dig in. The first thing we do is pull out the related object, which will be a Python dictionary. We don't have to worry about the related object being missing, because we did not specify it as an optional field, so DRF validation will return a 400 Bad Request before this code even runs if it's missing. Next we've got to handle that related object. If it's a new object, we need to create it. If you have nested fields inside those nested fields, first of all, I'm sorry. Second of all, you have to go into that serializer and handle those related fields.
You can call that serializer's create method, where I just call Species.objects.create. Also note that you don't have to manually validate the related object. DRF took care of that for you already. You also have a big design decision to make here. What happens if the data the user provides for the related object does not match what's already in the database? You can either reject the request, giving a 400 Bad Request, or you can implicitly update the related data. Both options have their pros and cons. Just make sure you document your design decision extremely well. Also, if you're a fan of functional programming, you'll hate the side-effect-laden option of updating the related models, and you're probably cringing right now. And then lastly, we just have to drop that Django model instance back into the validated data dictionary and pass it to the DRF base implementation, where they'll take care of saving everything.
The update method is pretty much the same as create, but we have to consider one edge case, where the user does a PATCH, where you don't have to include the entire body of the object when you're submitting the request. So just pop it out first and use a value that's completely invalid if it's not present, like I used False there, for example. And then, if it is that bogus value, just return it upstream and let upstream do its thing. It's easier that way than writing it yourself. Other than that, it is exactly the same as create.
Now, option two is creating a separate write-only field. This requires firm agreement with your API clients that what you receive via GET as a user is not what you POST or PUT. There's a convention I've seen in a lot of APIs where you could take the result of a GET and put that into a PUT, and it will just work and be a completely null operation. But it is a valid operation. This breaks that convention. It's not a huge deal.
You just have to document it thoroughly and show good examples in the documentation. Then if your users complain about it, point to the docs and say, hey, you didn't read them, that's on you. It's not the end of the world. Just a caveat you have to be aware of. And then you'll also have to write a very small validate method, which I'll show in a moment.
In this option, it's pretty simple. First, you define two separate fields. One is the read-only nested serializer from before, just like in the read-only version. Secondly, you use a write-only PrimaryKeyRelatedField. This field's to_internal_value method looks up the object using the given queryset, also does validation, and then returns an instance of the model being looked up. Remember that I said it returns an instance of the model. That's important, because if you try and save it as is, the Django ORM will barf at you, because you're trying to save an instance to a field under the ID field, where Django's expecting a primary key, like an integer or a UUID. Working around that is very simple. All you have to do is move the species_id into the species key in the dictionary. After that, DRF takes care of everything for us.
Sorry, wrong button. This is the option that we chose to use in the API that we put into production, because our primary front end was our in-house web developer. So your mileage may vary. And then one thing I want to point out that may not be readable in the back: even though we set species_id as required in the serializer, it is possible for that to be missing in the body in the case of patches again. All that means is we have to handle the case where the client didn't provide the species_id, by just doing a try and catch on it. You're really fine ignoring this error and moving on, unless you really like making your users miserable, in which case, why are you developing APIs? Shouldn't you be forcing them to write HTML scrapers instead?
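A framework-free sketch of that last step may help. It assumes the write-only field has already turned the submitted primary key into a model instance and stored it under `species_id`. The function name `move_species_id` and the `Species` stand-in class are hypothetical; in a real serializer this logic would sit in the small validate method mentioned above.

```python
class Species:
    """Hypothetical stand-in for a Django model instance."""

    def __init__(self, pk, name):
        self.pk = pk
        self.name = name


def move_species_id(attrs):
    """Move the looked-up instance from 'species_id' to 'species'.

    On a PATCH the client may omit the field entirely, so a missing key
    is ignored instead of being treated as an error.
    """
    try:
        attrs["species"] = attrs.pop("species_id")
    except KeyError:
        pass  # partial update that did not touch the relationship
    return attrs


dog = Species(pk=1, name="Dog")
full_update = move_species_id({"name": "Black Lab", "species_id": dog})
partial_update = move_species_id({"name": "Black Labrador"})
```

After the move, the dictionary holds an instance under the relationship's own name, which is the shape the save step expects, and a partial update passes through untouched.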
So, creating a separate relationship field. The third option is nice because it doesn't require your clients to use a separate field for sending versus receiving data, but it has its own tradeoffs. You can accept a primary key or a dictionary of the data type coming in, but if you accept a primary key, you prevent the user from creating new data. Or if you do accept a dictionary, you have to handle the same question as when overriding update and create: if you get the primary key of an existing object, do you update it, create a new object, or return a 400 Bad Request? You can create a field that does both and just use this statement to switch back and forth. But again, you have to be clear with your users about what will happen.
Okay, that's all great, but this means you have to do extra database lookups, right? As you probably know, you need to use select_related and/or prefetch_related to look up extra data. However, it might not make sense to dump everything when you're looking at a list route, particularly when you're dealing with lots of data. I had one endpoint that returned probably over 100 fields by the time it was done expanding everything, and it would take 10 seconds to return 200 entries in a list route. So how do we deal with this? We use separate serializers for list and detail routes, and queries to match.
Those are penguins at the Lincoln Park Zoo last month in Chicago. It's a nice little zoo. It's a little on the small side, but it had the upside of being free to enter, which is great when you're going with five kids.
Well, yes and no. For a single related object, there's select_related. It's pretty much almost free. The only cost is the SQL join. It's much faster than doing a second database lookup, unless your tables are horribly misconfigured, in which case you may need to go talk to the Postgres people out there, who might be able to help you. I can't.
If you're traversing a many-to-many relationship or looking across a reverse foreign key lookup, then you need to use prefetch_related to look up the data. This causes an extra database lookup and then makes Django do the merging of data in Python land. If that sounds slow to you, you're right. But it is way faster than not using prefetch_related, in which case Django does one database hit per record returned by the main query. That's what's referred to as the N+1 problem, and it's the bane of many developers. Prefetch objects are absolutely wonderful. They let you filter related model lookups and also add select_related as part of the prefetch, which could save you an extra database hit if you do it right. But be careful with using prefetch_related and make sure you cover every related field lookup. If you don't, things will get hairy quickly.
Now, what do I mean by that? Thanks to Jeff for letting me use him as an example here. I've made this mistake many times as well. I just didn't have the foresight to tweet about it. I don't have time to cover it today, but I highly, highly, highly recommend using tests to count the number of queries used in a particular API test. Django's test case has a method to count the number of queries run. It's easy, and it's a good way to make sure you don't accidentally trigger an N+1 problem when you modify a view.
So next up, we'll talk about using different serializers for different actions. Now, you've probably seen this pattern before, where you have get_serializer_class looking at the action of the request. If it's a detail route, you simply return the detail serializer, otherwise you return the list serializer. This is in the viewset, simple and obvious. But what if I told you DRF provides a way to differentiate between list and detail serializer classes with one line of code?
This was during a road trip last year where the dogs objected to being left in the car while we went inside to take the kid to the bathroom.
So they jumped over the back seat, and both of them somehow fit in my kid's car seat. It was very fun getting them back over willingly.
DRF provides a very handy list_serializer_class attribute in the Meta class. How does it work, though? You use the attribute in your detail serializer to point to your list serializer class. Here's how it works. When the viewset initializes a serializer instance with many=True as an argument, the serializer will actually switch out the instance created and replace it with the class defined by that list_serializer_class attribute. Now wait, you may be saying, aren't you supposed to get an instance of the class you instantiate when you construct an instance of the class? Let's take a look at the DRF source code and see what happens. It uses a little bit of Python magic and overrides the double-underscore new method, __new__, to call a different init method entirely, which is too long to show here, but it ultimately returns an instance of that list_serializer_class, if it's specified. It's pretty nifty, and it was a nice little light bulb moment when I discovered this.
Now here's the viewset using a serializer that has the list_serializer_class defined. There are two things I want to point out here. When you're specifying the serializer_class attribute, you want to use the detail serializer. It's a little bit counterintuitive, but DRF doesn't know how to go from the list serializer back to find the detail serializer, because that relationship is only a one-way relationship. And then also, because you have two different serializers showing different data, you should definitely override get_queryset to return just the data you want and nothing else. When you override get_queryset in the viewset, you have control over what data is looked up for different methods. It's pretty easy. You just pick based on the method, the action being chosen, and go from there.
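The `__new__` behaviour described above can be imitated in plain Python. This is a simplified sketch of the idea only, not DRF's actual source: every class name here is a hypothetical stand-in, and the real implementation does more work when it builds the list serializer.

```python
class ListSerializer:
    """Default wrapper used when many=True and nothing else is configured."""

    def __init__(self, child):
        self.child = child


class Serializer:
    class Meta:
        list_serializer_class = None  # subclasses may point this elsewhere

    def __new__(cls, *args, many=False, **kwargs):
        if many:
            # Hand back an instance of a *different* class. Because it is
            # not an instance of cls, Python skips cls.__init__ for it.
            list_cls = getattr(cls.Meta, "list_serializer_class", None)
            return (list_cls or ListSerializer)(child=cls(*args, **kwargs))
        return super().__new__(cls)

    def __init__(self, *args, many=False, **kwargs):
        pass


class AnimalListSerializer(ListSerializer):
    """Hypothetical list serializer for the animal endpoints."""


class AnimalDetailSerializer(Serializer):
    class Meta:
        list_serializer_class = AnimalListSerializer


detail = AnimalDetailSerializer()
listing = AnimalDetailSerializer(many=True)
```

The link only runs from the detail class to the list class, which matches the point about setting the detail serializer on the viewset: nothing on `AnimalListSerializer` points back.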
Now, in addition to changing what you do based on the HTTP action, you can also change it based on who's looking at your API. If you have different classes of users that need different data, you can override get_serializer_class and get_queryset to limit or expand data as needed. Here are the trivial serializers I'm using for this example. Nothing fancy, I'm just extending the animal detail serializer to add an extra field that only matters to staff. Just an appointment serializer. There is, excuse me, a lot to go through here, so I'm going to break it into a couple of chunks. I just wanted to show it all so you can get a quick glance at how it interacts.
So here is get_serializer_class only, at a readable zoom level. It's pretty simple. Just check to see if a user has the permission. Remember that you define those at the model level. And then if the user has the permission in question, you give them the expanded serializer. Otherwise you fall back through to the normal DRF implementation, which is just refreshing that all from the queryset attribute. And then here is get_queryset. Sorry, that was get_serializer_class, not get_queryset, my bad. And then here is get_queryset, only slightly more readable. Now what you do is the permission check, and you make whatever changes you want. I snipped those out here because otherwise it was way too small to read. And then you can do the same thing based on the action as well. You can see the full listing on my GitHub to see what was going on.
Next up, we will talk about viewset actions, which are additional HTTP endpoints you can define related to a model or an instance. When your user needs to take action on a model that's related to the one you care about, use an action to make your user's life easier. For instance, I probably don't want to pass that primary key when booking an appointment at the groomer.
When Ringo here decided to roll in Canadian goose poop for the third time in as many weeks, that was a lovely smell. This isn't the only way to use actions, but it definitely has been the most convenient for me.
Here's a simple example of booking an appointment using an action. The user does not have to specify the animal in the request body at all, because it's already in the URL, thereby reducing the chance of error. You still have to write your own validation and access code. I'm not going to write that for you. You've got to do something here. The code looks up the animal, passes it to a serializer, validates that serializer, saves a new instance, serializes the new instance, and then renders that response back to the user.
The detail argument determines whether the action operates on a single instance or a list of instances. You can set detail=False to perform an action on a list of instances. Why would you do this? A couple of ideas. You can use it for pre-canned filters, such as looking up all dogs who are overdue for their shots, or for alternative output formats like spreadsheets or PDFs. Say you've got a manager who demands everything in Excel format, even if your web tables are much easier to use. That's one way you can do this. openpyxl is quite useful for that, by the way.
The action decorator also takes a couple of other very useful arguments. Methods is just a list of strings: get, put, patch, delete, et cetera. If you do not use that methods argument, the default is GET and GET only. And then permission_classes is a list of classes that will be applied just to that particular action. However, the larger viewset's permission classes are also enforced. So if you have, say, an endpoint that is accessible to people who don't have access to the larger outer endpoint, you'll need to put in code to let that single endpoint fall through the permission classes of the viewset.
So far we've only covered presenting data. How do we help users find the right data, like helping Celine here find the right perch to sleep on while trying to stay out of reach of Ringo and Henry? This was actually taken by my wife last night. She was sewing her Halloween costume and Celine decided to help by pulling the pins out with her teeth. Yes, she's about two years old and very fluffy, very lovely, but also very obnoxious. So, a cat.
So we'll talk about filtering. Writing filters for each model gets tedious rather quickly. It also doesn't easily handle looking up attributes based on related objects, like searching for a species name of dog while you're looking at the animal viewset, because it has to go through breed to get there.
This is Sherlock. He wasn't my cat, but he belonged to one of my wife's best friends. He is probably the most stereotypical cat possible when it came to sitting in boxes. If there was an open box anywhere, he was in there within 30 seconds, no matter how small the box may have been compared to his body.
Rest framework filters is a very handy library for extending django-filter to make it even more powerful. Its headlining feature is the ability to nest filter sets, allowing you to traverse related models in your query parameters, like that little filter expression right there. That way you can wire your species filters into your breed filters, meaning you don't have to write a second filter to look up only dogs, which are two levels deep in this example. Now, there is a risk of information disclosure when you're nesting like this. Read the docs very carefully and make sure you know what you're doing.
Now here is a quick example of how to implement rest framework filters in the view layer. It is literally a drop-in replacement for django-filter. It's all based on django-filter, so you just swap out how you're importing it and then everything else is exactly the same.
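To make the idea of traversing related models from a query parameter concrete, here is a toy, in-memory sketch. It does not use the filtering library at all. The slide's actual filter expression is not in the transcript, so the parameter name `breed__species__name` is an assumption that follows Django's double-underscore lookup convention, and the records and both helper functions are made up for illustration.

```python
# Toy in-memory data: animal -> breed -> species, mirroring the two-level
# relationship described in the talk. All records are made up.
ANIMALS = [
    {"name": "Henry", "breed": {"name": "Corgi mix", "species": {"name": "dog"}}},
    {"name": "Celine", "breed": {"name": "Longhair", "species": {"name": "cat"}}},
]


def resolve(record, path):
    """Follow a double-underscore path such as 'breed__species__name'."""
    value = record
    for part in path.split("__"):
        value = value[part]
    return value


def apply_filters(records, query_params):
    """Keep the records whose related values match every query parameter."""
    return [
        record
        for record in records
        if all(resolve(record, path) == wanted for path, wanted in query_params.items())
    ]


dogs = apply_filters(ANIMALS, {"breed__species__name": "dog"})
```

The same mechanism is the source of the disclosure risk mentioned above: any relationship reachable through such a path can be probed by a client, so only the paths you intend to expose should be wired in.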
The base filter set class is actually a subclass of django-filter's version, too. So here's the target filter, which is just a trivially simple species filter. Those constants I defined further up in the class, which you can see on my GitHub. It's just, you know, for numerics: equal, not equal, is null, greater than, less than, et cetera. Pretty simple stuff. And then next you just use that RelatedFilter class to tell Django where to look. And the rest is almost magic.
Remember what I said about information disclosure risk? That is in the queryset of the breed filter that I showed a couple of slides back. You have to be very careful about what you expose in your filters, especially to untrusted users. A clever adversary can use well-built filter expressions to determine the existence of objects they wouldn't have access to, like, say, an unpublished draft in a blog app.
Next, last but not least, the most important thing: documentation. I took this polar bear picture at the Cincinnati Zoo back in 2008. If you've never been there, I highly recommend it. It's probably the second best zoo I've been to, behind the one right here in San Diego. And if you're looking to get a trip together to go to the zoo while you're here, I think Andrew Carl's organizing one for tomorrow, so you may be able to check with him.
There are many ways you can present documentation. There are far too many for me to even list here. The built-in browsable API is spartan but functional. It works. You can fill in form data and make test requests with relative ease. You can use django-rest-swagger to provide an easy-to-use playground for users to test out requests, with slightly better formatting than the browsable UI provides. It also lets you show exactly what methods you can use with a given endpoint, all in one large, very long list.
Now, DRF 3.7 did add schema-based documentation generation that mostly renders django-rest-swagger obsolete. However, it requires you to manually update the schema before it can read from it, so you actually have to manually run a command from the command line. Just integrate that into your tooling and you're done. Now, if you're a fan of readthedocs.io, you can use MkDocs, which is actually included with the cookiecutter template I built this example code project on. If you're starting a new project, definitely use it. It's called cookiecutter-django-rest on GitHub. It doesn't quite work with Pipenv yet, although you can use my code base too, which is modified to work with Pipenv, if that's your style.
So here is rest framework swagger. It's a nice way to generate your standard Swagger UIs using your viewsets and the filter sets they reference. It requires basically no work, aside from writing good docstrings in the viewsets themselves. Now, you are enforcing good docstrings in your pull requests, right? Me neither.
Here's a screenshot of my example code using rest framework swagger. Each API action is expandable, letting you play with filtering options and posting data where appropriate. Think of it as the built-in browsable API on steroids. Also, quick tip: if you have API resources you don't want visible to users, like, say, they're for internal use only, you just put the attribute exclude_from_schema in your viewset and set that to True, and that will hide it from this documentation entirely. So here's clicking on one of those particular endpoints, and you can see what rest framework swagger offers. It gives you almost all the things you would normally use Postman for. One thing to note: it does not seem to automatically discover nested filters well, like those from rest framework filters. Not the end of the world, just slightly disappointing.
And then there's also MkDocs, which, as I mentioned, came preinstalled with that cookiecutter template. It's preconfigured and runs in a separate Docker container inside that template. Very easy. And this is what the home page looks like. It's just your README, formatted very, very nicely. The template also comes with auth and the user API preconfigured. Now, MkDocs requires you to write all of your docs in Markdown, which is wonderful. And it's nice that you get control over what goes where, but it also requires you to do all the work manually. If you have lazy developers like me, that might be troublesome.
So I've got a few minutes of extra time, and I'll talk about a couple of other useful libraries we had. django-simple-history, which was written originally by Trey Hunner, who's actually talking next in this room, is a great little audit tool for tracking when users make changes. So your user comes back to you and says, hey, where'd my dog go? And you can look at the history for that dog and say, you deleted that. That's on you.
And then django-markupfield is great if, say, you want to let your users create things like announcements or general-purpose messages that would be sent to the user, but you don't want to go through the trouble of putting in a full CMS. They can just use Markdown to put in their message, and then you can just read that from the API endpoint. It gives you both the raw Markdown and the HTML-formatted output.
And then django-countries. If you've ever had to deal with addresses, you know that countries are a pain. For example, is England a country? It depends on who you ask. For soccer, yes. For the Olympics, no.
Okay, so I'll go ahead and wrap it up. I talked about making your API user friendly with related fields, list and detail serializers, and actions. I also talked about improving filtering: you can use rest framework filters to get a few extra niceties. And then I talked about documentation.
And I've got example code over there on GitHub. Feel free to take a look. And the link to the slides is there as well. I'd like to take a moment to give special thanks to Lacey, Anna and Jeff for reviewing my talk and my proposal. This would not have gotten anywhere near this far without their help. And then I'd also thank my wife, Bonnie, for putting up with me going to San Diego without her. All right, that's it. So Django REST framework exists. It's a wonderfully flexible framework. There is also GraphQL out there. Yes. Are you able to comment on why you would use one or the other, other than buzzword compliance? Buzzword compliance is exactly correct. But realistically, I mean, I do not have enough experience with GraphQL to provide an educated answer on that one. All right, then if there are no other questions, I'll just show you a couple more. Which one is your favorite dog? Definitely Henry. He gets into less trash that way. I did. Thank you. Excellent talk, by the way. Thank you. My question is, has your company done any work with using binary serializers with DRF? We have not. We've pretty much been fortunate enough that we've been able to use JSON for everything. We haven't really had a format that we need something binary for. Do you have any recommended patterns for testing serializers? Yeah, what I typically do is just feed it in a... actually, Phil's example code from his tutorial on Sunday has a great example of testing the serializers. You basically feed it a dictionary and then walk through the validation steps on it, and then make sure that the returned validated data matches what you'd expect it to be, and then go vice versa in that same order. All right. That being the case, thank you again, Drew, for the wonderful presentation. Thank you so much.
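Latest Summary (Detailed Summary)
Overview / Executive Summary
This talk by Drew Winstel, "Django REST Framework (DRF): Moving Past the Tutorial to Production," shares practical techniques and patterns for taking a DRF project from tutorial level to production readiness. It centers on building APIs that are friendlier to client developers, faster, and more complete.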
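Winstel stresses that what makes an API production-grade is getting the "last mile" details right. He digs into the core challenge of nested relational data, writable nested serializers in particular, and offers three solutions: 1) override the create/update methods; 2) use a separate write-only field; 3) write a custom relational field. He analyzes the trade-offs of each and the situations it suits.
On performance, the talk proposes using different serializers and querysets for different contexts (list vs. detail views, users with different permissions). He highlights one shortcut in particular: defining list_serializer_class on the detail serializer's Meta class so that list and detail views switch serializers automatically, with no need to override get_serializer_class by hand. He also stresses using select_related and prefetch_related to avoid N+1 queries, and recommends tests (for example the assertNumQueries method on Django's TestCase) to keep an eye on query counts.
To extend the API, Winstel recommends DRF's @action decorator for custom endpoints beyond standard CRUD, which simplifies client code. He closes by recommending several third-party libraries that noticeably speed up development: rest-framework-filters for advanced filtering, django-rest-swagger for API documentation, and django-simple-history for auditing.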
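Introduction and Core Idea
Winstel compares building a production-grade API to the last step of "how to draw a horse": adding the details. Once developers have finished the DRF tutorial, he argues, they need to focus on making the API friendlier and easier to use for clients, front-end developers especially. What he shares is not the only right way; it is what worked for his team while migrating a PHP stack to DRF and React.
- Target audience: developers who already know basic DRF terminology and have worked with django-filter.
- Core idea:
  > "You want to make things easier on your client developer so they can get actually the data they want."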
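Core Challenge: Writable Nested Serializers
In API design, the database model (for example breed_id: 42) often does not match the data model users expect (for example breed: "Black Lab"). Nested serializers solve the display problem for GET requests, but DRF ships no default implementation for writable nested fields, which makes this a central pain point.
Winstel presents three workable approaches: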
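- Option 1: Override the create and update methods on the parent serializer
  - When it fits: users are more likely to create new related data than to update existing data.
  - Implementation notes:
    - In create, manually pop the related-data dictionary out of validated_data.
    - Use the presence of a primary key in the related data to decide whether to create a new object or update an existing one.
    - Key decision: when the related data a user submits does not match what is already in the database, do you reject the request (return a 400) or update implicitly? The speaker stresses that whichever you choose, it must be applied consistently and documented clearly.
    - update follows the same logic, but must also handle PATCH requests in which the related field may be absent.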
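- Option 2: Use a separate write-only field
  - Description: this is the approach the speaker's team uses in production. It gives reads (the nested object) and writes (the primary-key ID) their own fields.
  - Implementation:
    - Define a read-only nested serializer field for display (read_only=True).
    - Define a write-only PrimaryKeyRelatedField that accepts the ID submitted by the client (write_only=True).
  - Drawbacks:
    - It breaks the common convention that the body of a GET response can be sent straight back in a PUT.
    - It needs a validate method that moves the model instance looked up from the ID from the species_id key to the species key, so that the Django ORM does not fail on save.
  - Recommendation: document this behavior clearly and provide examples.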
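- Option 3: Write a custom relational field class
  - Advantage: clients do not have to distinguish read fields from write fields, so the experience is more uniform.
  - Implementation: create a custom field that inherits from Field and accepts either a primary-key ID or a dictionary of data.
  - Trade-offs:
    - If it only accepts primary keys, users cannot create new related data.
    - If it accepts dictionaries, it faces the same "update or create" decision as Option 1.
    - A single field can support both inputs, but that adds complexity and again calls for clear documentation.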
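API Performance and Usability
Context-Specific Serializers
Serializers can be chosen dynamically, both to improve performance and to show different data depending on user permissions.
- List vs. detail:
  - Problem: returning all nested data in a list view is very slow.
  - Common approach: override get_serializer_class on the ViewSet and return a different serializer depending on self.action (for example 'list' or 'retrieve').
  - Shortcut (DRF tip): define list_serializer_class on the detail serializer's Meta class so that list and detail views switch serializers automatically, with no manual get_serializer_class override. When the ViewSet instantiates the serializer with many=True, DRF's internals (an overridden __new__ method) switch to the class named by list_serializer_class.
- Based on user permissions:
  - Override get_serializer_class and get_queryset to return, depending on the user's permissions (for example request.user.has_perm(...)), a "staff" serializer with more fields or a filtered queryset.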
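Query Optimization and the N+1 Problem
- select_related: optimizes one-to-one and foreign-key relationships by fetching the related rows in the same query through a SQL JOIN, at very little cost.
- prefetch_related: optimizes many-to-many and reverse foreign-key relationships. It runs one extra database query and then joins the results in Python. That is much faster than N+1 (one query per parent record), but it should still be used with care.
- Prefetch objects: allow further filtering, or a nested select_related, inside prefetch_related for finer-grained control. For example, when prefetching all of an author's books you can prefetch only the published ones and use select_related to pull in each book's publisher at the same time, collapsing many queries into two.
- Key recommendation:
  > "I highly, highly, highly recommend using tests to count the number of queries used in a particular api test."
  > The assertNumQueries method on Django's TestCase lets automated tests track query counts and catch an accidentally introduced N+1 problem.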
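Extending the API: ViewSet Actions
The @action decorator is useful when a model needs operations beyond standard CRUD.
- Purpose: add custom HTTP endpoints for a model or a model instance, simplifying client logic.
- Example: to book a grooming appointment for an animal, rather than having the client POST to /appointments/ with the animal's ID, offer a more intuitive /animals/{pk}/book_appointment/ endpoint.
- Key parameters:
  - detail=True: the action operates on a single instance and the URL includes the primary key. (detail is a required argument; it has no default.)
  - detail=False: the action operates on the collection, for example listing all dogs with overdue shots (/animals/overdue_shots/) or exporting a report.
  - methods: the list of allowed HTTP methods (defaults to ['get']).
  - permission_classes: permission classes for this specific action; they override the ViewSet-level permission_classes for that action.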
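Recommended Third-Party Libraries
- Advanced filtering: rest-framework-filters (top recommendation)
  - What it does: an enhanced version of django-filter whose headline feature is nested filtering.
  - Example: lets clients filter across models with query parameters such as ?breed__species__name=Dog.
  - Implementation: at the code level it is a drop-in replacement for django-filter; RelatedFilter links the different FilterSet classes together.
  - Risk warning: information leakage. A careless configuration can let a malicious user craft filters that reveal the existence of data they are not allowed to see. In a blog application, for example, an attacker could use ?status=draft&author_id=123 to test whether a given author has draft posts, even without permission to read the drafts.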
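- API documentation (top recommendation)
  - django-rest-swagger: generates an interactive Swagger UI where users can try the API directly in the browser. It is more capable than DRF's built-in browsable API.
    - Note: the speaker mentions that the schema generation built into DRF since 3.7 largely replaces it, but the schema has to be generated by running a command manually.
    - Hiding endpoints: set exclude_from_schema = True on a ViewSet to hide it from the documentation.
  - mkdocs: builds a static documentation site in the Read the Docs style. Docs are written by hand in Markdown, which gives full control at the cost of more work.
- Other useful tools (optional)
  - django-simple-history: a powerful audit tool that tracks every change to a model instance, making it easy to trace who changed what and when. When a user asks "where did my data go?", you can check the history and settle who was responsible.
  - django-market-field (note: the name as transcribed is probably a transcription error for a markup or Markdown field package such as django-markupfield): lets users enter Markdown in a field and exposes both the raw Markdown and the rendered HTML, which suits announcements, messages, and similar rich-text content.
  - django-countries: handles the messy details of country and region data, so you do not reinvent the wheel when dealing with address forms and the like.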
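Q&A Highlights
- DRF vs. GraphQL: the speaker says he does not have enough GraphQL experience to offer an informed comparison.
- Binary serializers: the team has not used them, because JSON has covered all of their needs.
- Testing serializers: the suggested pattern is:
  - Feed a dictionary of data into the serializer.
  - Run validation.
  - Assert that validated_data matches what you expect.
  - Test the reverse direction: check that the output of to_representation is correct.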
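Conclusion
The talk concludes that Django REST Framework is an extremely flexible framework. By handling nested relationships, tailoring serializers and queries to each context, extending the API with @action, and making good use of well-chosen third-party libraries, developers can build production-grade APIs that are robust, efficient, and friendly to clients.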