2023-12-15 | DjangoCon 2023 | Building Powerful APIs with Django, Django Rest Framework, and OpenAPI with Velda Kiara

Media details

Uploaded: 2025-06-21 17:55
Source: https://www.youtube.com/watch?v=qcxio8C9Mh0
Processing status: Completed
Transcription status: Completed
LLM provider/model: openai/gemini-2.5-pro

Transcript

speaker 1: This is amazing. First of all, I'm grateful to be here. But apologies, I'm going to have to rush through a couple of things. Don't worry, you'll have good content to go home with. So, when I was younger, say six or seven years old, we used to get the Super Strikers magazine, which is about football players playing football. They have other teams they need to come up against, they face a couple of challenges, and at some point they either win or lose and learn a couple of lessons. This is actually the first comic book I came across, and it used to come with the weekend newspaper, so every single week this is what I looked forward to. On this particular occasion, I didn't get the issue for that weekend, which was a bummer, because we had been left on a cliffhanger. We didn't know whether the person had scored the goal or not, or what was going on. So I was really excited for this particular issue, but when the newspaper was delivered to us, the issue was missing. So I told my parents. I was like, okay, you know my routine normally involves reading Super Strikers every morning and then telling my brothers fake stories. So every time they read it, they're looking for the scenario I told them about, but they can't find it. That's how I got to improve my storytelling skills. Don't tell them that, though. On this particular weekend they were like, okay, we'll find or get you one at the end of the day. I'm seven years old; the end of the day is like a decade to me. My time is different, right? So I devised a plan to leave the house and go to the distribution center, which was like two or three minutes away, so I could easily walk. And I walked.
I rehearsed what I was supposed to say, tried to act like an adult, and asked for the magazine that was owed to me. The person at the reception desk was actually really nice, and she was like, oh yeah, there was an issue that was missing, but we were supposed to deliver it to you at the end of the day. I'm like, my time doesn't work like that, so I just need to pick up my issue right now. So the person was nice to me and actually gave me the issue, and I went back to my routine of telling my brothers fake stories. So that went well. In this case, when I'm talking about APIs, the request I raised to the reception lady is the request that a user, a client in this case, raises to an API. And the response, her checking the back to see whether there was actually an issue and whether I should actually get it, is the response from the server. So that's basically how APIs work: you raise a request and get a response. So yes, I needed answers. I could not wait the whole freaking day. Moving on to status codes. Generally, when you raise a request, the response you get comes with a status code. There are five general classes of responses; the first digit signifies the general class, and the last two digits give specific information about that particular response. So if you're getting a status code starting with one, that means, okay, we got your request, we're working on it. If you get two, like 201 or 200, that means we've got your request and here's the result. If you're getting three, that means ain't nobody got time for that; we're not going to do that today, but we can send you to a place where you could get your request fulfilled. And if you're getting four, as the server my response to you is, nah, you did something wrong. It's your fault. Check the syntax that you're using.
And if you're getting five, that means, okay, my bad, it's me, it's not you. If you know, you know. Also, whenever you're building your APIs, we tend to first design and build out the specification before building the front-end interface and other components. The first reason it's recommended to design and define your API first is that you get to separate your concerns. You separate the backend logic from the front-end implementation, so you can easily develop decoupled components, and they're easier to maintain and update. You also get flexibility and scalability, because you're able to have different teams working at different times, and you're able to get all the information you need to accommodate both existing users and the new clients you're going to have. It also improves collaboration and parallel development, because the teams can work side by side without depending on each other. And thinking about what the API is going to be, or what the results and responses are going to be, gives you a clear and concise interface, because you've already thought about the responses and the various formats and error messages that users are supposed to get. If you're starting with the API first, it's also easier to get your documentation. You can easily automate this and then add the few concepts you want the user to know, so it makes documentation work easier. In terms of future-proofing, you're able to have a stable version. So regardless of whether the front end changes, or you're using a different backend, or there's a recent development, you can still use the API. And another thing is that you're also able to have other people join.
You could have other people support and use your API to build other services, so they increase the user base that you have. So it's a win-win for everybody. If you're working with APIs, these are the five common request methods. These are not all of the request methods; there is also one I came across called PaaS. I haven't used it yet, but these are basically the common ones that you should expect, like the GET method. The GET request method is where the request retrieves data that already exists on the server, without making any adjustments to it. POST is where you actually add a resource to the server. And for PUT and PATCH, interviewers' favorite question is what the differences are between PUT and PATCH. If you're using a PUT request, you're changing the resource in its entirety. That is, if the resource has, like, ID, price, description and all that, you get to change all of it. But if you're patching, you're just updating partial information. And DELETE is where you get rid of the entire resource. In this case, we're going to be talking about REST APIs, where REST is representational state transfer, which means that with REST APIs the state is included in the request, so the server doesn't need to store anything pertaining to that request. And then when it comes to the OpenAPI specification, it's a list of what needs to happen at what time: what the resource is, what the endpoint is going to look like, what format the responses are supposed to have, what data schema you're using, what data objects are involved, and whatnot. So moving on: why did I choose to use DRF? I'm going to explain like five of these because of time. I chose DRF because, one, there is a lot of community support. You can easily get on and learn, because there are so many forums and so many tutorials out there.
And it's also built for Django, so it just makes it an easier choice. Also, it handles serialization, which means that rendering and parsing your data is quite easy. In regards to pagination and filtering, it also helps; it comes with these out of the box, so you just need to specify the number of fields or the amount of data the particular user is supposed to get. There is also a lot of third-party usage, and it also covers security. In terms of views, there are different ways you could implement views. You could use generics or viewsets. This is largely determined by the decision your team needs to make, or rather by the project specifications and the needs of the team. For me, I would go with generics if I have a simple application I want to build and the operations are really easy to implement. It also uses the standard conventions, and you have less code, because most of the time a one-liner can handle, let's say, retrieving, updating and destroying data. It also runs on the default behavior of how APIs should work, so it still works. Now, if you want more control, or if the application you're working on needs more logic implemented based on your business needs, then I'd recommend viewsets, because you get to customize the logic that you have. You get control over who has access to what, and you can easily group this into roles. There is also reusable code. And you're also able to manage complex relationships, whether you have a lot of interconnections or it's about authentication: what are you using for authentication, and how are you going to group people into particular groups? And before I go into caching, I have code that I was supposed to demo here. Just a second, so it actually works. The endpoint I built was on treasures. Okay, yeah, it works. This is what we were trying to achieve. So it actually works.
I swear it works. I actually have REST tests here, so you could easily test it out. It's also a Docker image, and I'll share the link in just a second. So for treasures, say we want to create one, a resource. We have name, price and description. So let's say a sweater, and let's price it at, say, $5.66, and the description is yellow sweater. Then post this and see. Yes, it works. So when you go to the treasures list, you can actually see the sweater has been added. So the endpoints actually work. You can create, you can see the list view and whatnot. Something else about the code: it's also a Docker container, so you can easily run it and give me feedback, or if you have any questions about it, we could get into that. So caching is storing frequently accessed data in a temporary location, which means you don't necessarily need to send requests to your server every single time, and that means your users get the data as fast as possible. First things first: when you're implementing caching, you need to know your data volatility. How frequently does your data change? Is it like stock prices, which are highly volatile and change constantly? And how do you manage your server traffic? Can your API manage a lot of traffic at peak times? How does it handle all the requests that come in? And in regards to response time, how fast do you want your responses to get to your users? Is it instantaneous? Does it lag for a minute? Also, API resources are finite, so you need to be able to stretch these resources to fit the particular users that you have. Something I'd like to talk about: if you're caching, the next question is how you actually maintain the freshness of your data. To maintain the freshness of your data, you need things like cache expiration. This is where you set timelines based on the data.
As I mentioned, if you have stock-price kind of data, then you need to say that at a certain interval, say every five minutes, this data is refreshed. In terms of invalidation, there are three methods you could use. Cache invalidation means mechanisms that remove or update cache entries once the data changes. One is time-based, which is basically like cache expiration. Then there is event-driven, for people who love event-driven architecture; that is where you set triggers to notify the cache system when data changes occur. And then manual invalidation, where you have an interface for a person to do it for you. We also have lazy loading, or the cache-aside pattern, where the cache is only updated when the data is requested. So when the data is requested, it first gets to the cache, and then the cache sends the data to the user. You could also version your API endpoints, where the server compares the data it has to the data that is cached; if the cached data is older than the server data, it updates the cache. You could also have cache-control headers, where you leverage Cache-Control directives like max-age or no-cache to instruct cache proxies and clients on how to cache or revalidate the data. We also have conditional requests, which means implementing mechanisms like ETags and Last-Modified headers, so clients can include these headers in their requests, and the server can respond with a 304, which is the Not Modified status code, if the data is not modified. And then you could also have background updates, for those who have cron jobs. You could easily set up jobs to refresh and update data that doesn't necessarily change every single time.
And if you're using real-time data, you could have WebSockets or server-sent events to push updates to clients immediately, bypassing the cache whenever necessary. You could also monitor and measure this by implementing alerts and systems to track the health and freshness of your data. The last thing, I hope I'm still in time, is the benefits of caching. When you implement caching, there are a few things you gain. You get improved performance, because users get the data instantaneously. Nobody is waiting for the data, because as a user, every time you click okay, you want it to be like, okay, you're done, you can go to the next thing. Otherwise, if it takes a few more seconds, you're complaining that it has a performance issue, right? You also get reduced latency, because the user can get the data instantaneously, and you get lower server load, because the server is not being engaged on every single request. You also have enhanced scalability, because once the users who have already requested their data have it in the cache, you can serve new users who are using your API. So I had something on implementation. You could do either in-memory caching or database caching. In-memory caching basically means that you store data in the server's RAM for fast retrieval. This is how you implement it. Don't worry, I'll share the slides. You could use something like django-redis for this. The code is updated in the file, so don't worry about that. Then we have database caching, which is storing frequently accessed data in the database itself for quick retrieval, and that's how you implement that. And then we have CDN caching, if you're using a CDN, which is storing static assets and content on distributed servers for global delivery.
So the last thing I want to look at is security. We have four factors: we can look into authentication, authorization and data protection, as well as API keys. I think you've all heard of the times when people are celebrating each other and they're like, okay, say, Abby is exactly who she says she is, because she's done a bit of boss moves or something. That is basically what authentication is: proving you are who you say you are. A few things you can look into for user authentication are using tokens, API keys, or 2FA, which we all use; every time somebody else logs into your Google account, you always get: is it you? Are you sure? Is it you? So yeah, that's pretty much it. Authorization is more or less what actions you can take based on who you are; what are you allowed to do in the system? The next thing is data protection itself. You could do this through encryption, such as using AES, which is a symmetric block cipher, which means the sender and the receiver have the same key to encrypt or decrypt the data. Data masking is what happens when you're entering your credit card information into a system, because you don't want just anybody to have this data. Then there's also input validation, which is validating and sanitizing user input to prevent injection attacks. Lastly we have API keys, where you can do key management by rotating the keys that are there. You could have usage limits if you have rate limiting on your API, which is highly advised so that you don't have your resources disrupted or destroyed. And you could also have scope-based keys, which means that if you have particular keys, you have access to more functionality. So I tried my best. Any questions so far? Okay, there are no questions.
This is a QR code that has the GitHub repo. It has the slides with more information. It also has access to the codebase itself. I have been wanting to come here for the longest time. It only took me three years to actually attend this conference; I knew about it in 2020. So this is a huge moment for me. Regardless of whatever took place, I enjoyed coming to this conference, and I have enjoyed talking to you. If any questions come up, I'm still around the conference, so we could easily talk about them. And yes, thank you so much. I hope you enjoy the rest of your conference.

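Overview / Executive Summary

In this talk, the speaker gives a broad introduction to building powerful, scalable, and secure APIs with Django, Django Rest Framework (DRF), and OpenAPI. The talk opens with a personal story used as an analogy for the API request-response model and explains the core workflow and best practices of API development in accessible terms. Its central argument is an API-first design philosophy: design and define the API specification before writing any implementation code. This approach separates front-end and back-end concerns, enables parallel development, improves flexibility and maintainability, and simplifies documentation.

The talk covers API fundamentals, including the meaning of HTTP status codes, the differences between five common request methods (GET, POST, PUT, PATCH, DELETE), and the core principle of REST APIs. For tooling, the speaker recommends DRF because of its strong community support and its built-in serialization, pagination, and security features. In particular, the talk compares when to choose DRF's Generics (suited to simple, standardized cases) and ViewSets (suited to cases that need complex business logic and more control).

On performance, the talk focuses on caching and describes several techniques for keeping data fresh, such as cache expiration, event-driven invalidation, lazy loading, versioning, and conditional requests. Security is the other main focus, covering four areas: authentication (who you are), authorization (what you may do), data protection (encryption, masking), and API key management (rotation, rate limiting). The speaker also gave a live code demo and shared a GitHub repository containing the code, the slides, and further information.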

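Opening and an Analogy for Core API Concepts

The speaker uses a childhood story to explain how APIs work:

Background: As a child, she looked forward every week to the "Super Strikers" comic that came with the weekend newspaper. One weekend the newspaper arrived without it, leaving her anxious about a cliffhanger (whether the player had scored the goal).
Analogy:
- API request: Unable to wait, she walked to the distribution center herself and, acting like an adult, asked the receptionist for the comic she was owed. This is like a client sending a request to an API.
- API processing and response: The receptionist checked, confirmed that an issue was indeed missing, and handed it over. This is like a server processing a request and returning data or a result.
Core idea: The whole exchange boils down to the basic model of API interaction: "You raise a request and get a response."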

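API Fundamentals

HTTP Status Codes

The first digit of an HTTP status code indicates its class. The speaker explained the five main classes in plain language:

1xx (Informational): "We got your request, we're working on it."
2xx (Success): "We've got your request and here's the result." (e.g. 200 OK)
3xx (Redirection): "We're not going to do that today, but we can send you to a place where you could get your request fulfilled."
4xx (Client error): "Nah, you did something wrong. It's your fault." (e.g. 404 Not Found)
5xx (Server error): "My bad, it's me. It's not you." (e.g. 500 Internal Server Error)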
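
The "first digit gives the class" rule can be sketched in a few lines of Python. This is an illustrative helper, not code from the talk; the function name `status_class` and the wording of the class labels are assumptions made for this example.

```python
from http import HTTPStatus

# First digit of the status code -> general class, as described above.
STATUS_CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def status_class(code: int) -> str:
    """Return the general class of an HTTP status code from its first digit."""
    if not 100 <= code <= 599:
        raise ValueError(f"not a valid HTTP status code: {code}")
    return STATUS_CLASSES[code // 100]


if __name__ == "__main__":
    for status in (HTTPStatus.OK, HTTPStatus.CREATED, HTTPStatus.NOT_MODIFIED,
                   HTTPStatus.NOT_FOUND, HTTPStatus.INTERNAL_SERVER_ERROR):
        print(status.value, status.phrase, "->", status_class(status))
```

The standard library's `http.HTTPStatus` enum is used only to supply well-known codes and their reason phrases.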

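HTTP Request Methods

The talk introduced the five most common HTTP request methods:

GET: Retrieves data that already exists on the server without modifying it.
POST: Submits data to the server to create a new resource.
PUT: Replaces an existing resource in its entirety. The request body must contain the full resource.
PATCH: Partially updates an existing resource. Only the fields being changed need to be supplied.
- The speaker noted that the difference between PUT and PATCH is "interviewers' favorite question."
DELETE: Removes the specified resource from the server.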
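
The PUT versus PATCH distinction can be illustrated with plain dictionaries. This is a minimal sketch, not the talk's demo code: the in-memory `store`, the `put` and `patch` helpers, and the `REQUIRED_FIELDS` set are hypothetical, with field names borrowed from the treasures demo (name, price, description).

```python
from copy import deepcopy

# Hypothetical schema for illustration: the fields a full resource must carry.
REQUIRED_FIELDS = {"name", "price", "description"}


def put(store: dict, resource_id: int, body: dict) -> dict:
    """PUT: replace the stored resource in its entirety."""
    missing = REQUIRED_FIELDS - body.keys()
    if missing:
        raise ValueError(f"PUT requires the full resource; missing: {sorted(missing)}")
    store[resource_id] = deepcopy(body)
    return store[resource_id]


def patch(store: dict, resource_id: int, body: dict) -> dict:
    """PATCH: update only the supplied fields and leave the rest untouched."""
    if resource_id not in store:
        raise KeyError(resource_id)
    store[resource_id].update(deepcopy(body))
    return store[resource_id]


if __name__ == "__main__":
    store = {1: {"name": "sweater", "price": "5.66", "description": "yellow sweater"}}
    print(patch(store, 1, {"price": "6.00"}))  # only the price changes
    print(put(store, 1, {"name": "hat", "price": "3.00", "description": "red hat"}))
```

Rejecting an incomplete PUT body is a design choice made here to make the contrast visible; a real API would typically report it as a 4xx validation error.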

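REST APIs and the OpenAPI Specification

REST API: REST stands for Representational State Transfer. Its defining trait is statelessness: the request itself carries all the information the server needs to process it, so the server does not have to store client state.
OpenAPI Specification: An industry standard for describing APIs. It spells out the details of an API, including:
- The structure of resources and endpoints.
- The format of responses.
- The data schemas and data objects in use.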
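
To make the specification concrete, here is a minimal sketch of an OpenAPI 3.0 document for a treasures endpoint, written as a Python dictionary and printed as JSON. It is not the specification from the talk's repository: the `/treasures/` path, the `Treasure` schema name, and the choice to model `price` as a string are assumptions for illustration.

```python
import json

# Hypothetical data schema for one treasure (fields taken from the demo).
treasure_schema = {
    "type": "object",
    "required": ["name", "price", "description"],
    "properties": {
        "name": {"type": "string"},
        "price": {"type": "string", "description": "Decimal amount, e.g. '5.66'"},
        "description": {"type": "string"},
    },
}

# Minimal OpenAPI document: metadata, one path with two operations, one schema.
spec = {
    "openapi": "3.0.3",
    "info": {"title": "Treasures API (illustrative)", "version": "1.0.0"},
    "paths": {
        "/treasures/": {
            "get": {
                "summary": "List treasures",
                "responses": {
                    "200": {
                        "description": "A list of treasures",
                        "content": {"application/json": {"schema": {
                            "type": "array",
                            "items": {"$ref": "#/components/schemas/Treasure"},
                        }}},
                    }
                },
            },
            "post": {
                "summary": "Create a treasure",
                "requestBody": {
                    "required": True,
                    "content": {"application/json": {"schema": {
                        "$ref": "#/components/schemas/Treasure"}}},
                },
                "responses": {"201": {"description": "Treasure created"}},
            },
        }
    },
    "components": {"schemas": {"Treasure": treasure_schema}},
}

if __name__ == "__main__":
    print(json.dumps(spec, indent=2))
```

Writing the document before any view code exists is the API-first step: front-end and back-end teams can both work against this contract.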

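The API-First Design Principle

The talk strongly recommends an API-first approach: design and define the API before building the front end or other components. Its main advantages are:

Separation of concerns: Decouples back-end logic from the front-end implementation, so each can be developed and maintained independently.
Flexibility and scalability: Lets different teams work at different times and accommodates both existing and new clients.
Collaboration and parallel development: Front-end and back-end teams can work side by side against a shared API contract, with fewer dependencies.
A clear interface: Forces developers to think through response formats and error messages up front, giving a clear and concise interface.
Easier documentation: Documentation can be generated automatically from the API specification and then supplemented with the concepts users need to know.
Future-proofing: A stable API version stays usable even if the back end or the front end changes.
Ecosystem growth: A clear API lets other people build services on top of yours, which grows your user base.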

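Choosing the Tooling: Django Rest Framework (DRF)

The speaker chose and recommends DRF for the following reasons:

Strong community support: Plenty of tutorials and forums, so it is easy to learn and to get help.
Built for Django: It integrates with Django directly.
Built-in serialization: Rendering and parsing data is straightforward.
Out-of-the-box features: Pagination and filtering are included.
Third-party usage: Many third-party packages build on it.
Security: It also covers security.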
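
As a rough picture of what out-of-the-box pagination hands to the client, the sketch below slices a list into pages in plain Python. It does not use DRF's pagination classes; the `paginate` helper and the `/treasures/` URL are hypothetical, and the envelope keys (`count`, `next`, `previous`, `results`) follow the shape of DRF's page-number pagination responses.

```python
from typing import Sequence


def paginate(items: Sequence, page: int, page_size: int, base_url: str) -> dict:
    """Return one page of items wrapped in a count/next/previous/results envelope."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    start = (page - 1) * page_size
    end = start + page_size
    # Only link to neighbouring pages that actually exist.
    next_url = f"{base_url}?page={page + 1}" if end < len(items) else None
    previous_url = f"{base_url}?page={page - 1}" if page > 1 else None
    return {
        "count": len(items),
        "next": next_url,
        "previous": previous_url,
        "results": list(items[start:end]),
    }


if __name__ == "__main__":
    treasures = [f"treasure-{n}" for n in range(1, 8)]
    print(paginate(treasures, page=2, page_size=3, base_url="/treasures/"))
```

In a DRF project the same effect is configured rather than hand-written, which is the convenience the speaker is pointing at.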

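Implementing Views: Generics vs. ViewSets

How to implement views in DRF is a key decision that depends on project requirements and the team's needs:

Generics (generic views):
- When to use: Simple applications with standard CRUD (create, read, update, delete) operations.
- Advantages: Less code (sometimes a single line covers several operations) and adherence to standard conventions.
ViewSets:
- When to use: More complex applications that need more control and custom logic.
- Advantages: Custom business logic, finer control over who can access what, reusable code, and better handling of complex data relationships and authentication logic.

The speaker also ran a live demo of an API endpoint for managing "treasures," showing that creating a resource and viewing the list both worked.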

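Performance Optimization: Caching

Caching stores frequently accessed data in a temporary location to speed up responses and reduce server load.

Considerations Before Implementing Caching

Data volatility: How often does the data change? (Stock prices, for example, are highly volatile.)
Traffic patterns: Can the API handle a large number of requests at peak times?
Response-time requirements: How quickly do users need a response?
API resources: API resources are finite and need to be used efficiently.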

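Strategies for Keeping Data Fresh

Cache expiration: Give cached data a fixed lifetime (TTL, time to live).
Cache invalidation: Actively remove or update cache entries when the source data changes.
- Time-based: Similar to cache expiration.
- Event-driven: Set triggers that notify the cache system when data changes.
- Manual invalidation: Provide an interface through which a person can clear the cache.
Lazy loading / cache-aside: Update the cache only when data is requested. A request checks the cache first; on a miss, the data is loaded from the database, stored in the cache, and then returned to the user.
API versioning: The server compares the version of the cached data with its own data and updates the cache if the cached copy is older.
Cache-control headers: Use the Cache-Control HTTP header with directives such as max-age or no-cache to tell clients and proxies how to cache or revalidate data.
Conditional requests: Use ETag or Last-Modified headers. Clients send these validators with their requests, and if the data has not changed the server can return 304 Not Modified instead of the full payload.
Background updates: Use scheduled jobs (such as cron jobs) to periodically refresh data that does not change often.
Real-time push: For real-time data, use WebSockets or Server-Sent Events (SSE) to push updates straight to clients, bypassing the cache.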
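
The conditional-request strategy can be sketched without a web framework. The example below is an assumption-laden illustration, not the talk's code: `make_etag` and `handle_get` are hypothetical helpers, the ETag is derived from a SHA-256 hash of the body, and only a single exact `If-None-Match` value is handled (real clients may send lists or weak validators).

```python
import hashlib
import json


def make_etag(payload: bytes) -> str:
    """Derive an ETag from the response body (quoted, as HTTP expects)."""
    return '"' + hashlib.sha256(payload).hexdigest() + '"'


def handle_get(resource: dict, request_headers: dict) -> tuple:
    """Return (status, headers, body) for a GET, honouring If-None-Match."""
    body = json.dumps(resource, sort_keys=True).encode("utf-8")
    etag = make_etag(body)
    # max-age=300 mirrors the "refresh every five minutes" example from the talk.
    headers = {"ETag": etag, "Cache-Control": "max-age=300"}
    if request_headers.get("If-None-Match") == etag:
        return 304, headers, b""  # Not Modified: the client reuses its cached copy
    return 200, headers, body


if __name__ == "__main__":
    treasure = {"name": "sweater", "price": "5.66"}
    status, headers, _ = handle_get(treasure, {})
    print(status, headers["ETag"])
    status, _, body = handle_get(treasure, {"If-None-Match": headers["ETag"]})
    print(status, len(body))
```

The saving is in the second exchange: the server still computes the validator, but no payload crosses the network when nothing has changed.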

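Ways to Implement Caching

In-memory caching: Store data in the server's memory for fast retrieval, for example with Redis (the speaker mentions django-redis).
Database caching: Store frequently accessed data in the database itself.
CDN caching: Store static assets and content (such as images and CSS) on the distributed servers of a content delivery network (CDN).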
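
A minimal sketch of in-memory caching with the cache-aside pattern, a TTL, and explicit invalidation follows. It is a standard-library stand-in, not the django-redis setup the speaker refers to; the `TTLCache` class, its method names, and the injectable `clock` parameter are assumptions made so the behaviour is easy to test.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """In-process cache: entries expire after ttl_seconds or on invalidate()."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be simulated
        self._entries: Dict[str, Tuple[float, Any]] = {}  # key -> (expires_at, value)

    def get_or_load(self, key: str, loader: Callable[[str], Any]) -> Any:
        """Cache-aside read: serve a fresh entry, else load, store, and return."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit
        value = loader(key)  # cache miss or expired: go to the source of truth
        self._entries[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key: str) -> None:
        """Event-driven or manual invalidation: drop the entry when data changes."""
        self._entries.pop(key, None)


if __name__ == "__main__":
    def load_from_database(key: str) -> str:
        print("loading", key, "from the database")
        return f"row for {key}"

    cache = TTLCache(ttl_seconds=300)
    cache.get_or_load("treasure:1", load_from_database)  # miss, loads
    cache.get_or_load("treasure:1", load_from_database)  # hit, no load
```

A per-process dictionary like this is not shared between server processes, which is one reason deployments usually move to an external store such as Redis.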

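API Security

The talk examined API security from four angles:

Authentication: proving who you are
- Definition: The process of verifying a user's identity. The speaker's analogy: "Abby is exactly who she says she is."
- Methods: Username and password, tokens, API keys, two-factor authentication (2FA).
Authorization: what you can do
- Definition: Determining which actions a user is allowed to take based on who they are.
Data protection
- Encryption: Protect data with a symmetric block cipher such as AES (Advanced Encryption Standard), where sender and receiver share the same key.
- Data masking: Hide sensitive data when it is displayed, for example showing a credit card number as **** **** **** 1234.
- Input validation: Validate and sanitize user input to prevent injection attacks such as SQL injection.
API key management
- Key rotation: Replace API keys regularly.
- Usage limits: Apply rate limiting so that resources are not disrupted or exhausted by abuse.
- Scope-based keys: Assign different permission scopes to different keys.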
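
Two of the points above, data masking and per-key usage limits, can be sketched in plain Python. Neither snippet comes from the talk: `mask_card_number` and `FixedWindowRateLimiter` are hypothetical names, and a fixed-window counter is only one of several rate-limiting schemes, chosen here because it is the shortest to show.

```python
import time
from typing import Callable, Dict, Tuple


def mask_card_number(card_number: str) -> str:
    """Keep only the last four digits visible, grouped in blocks of four."""
    digits = [ch for ch in card_number if ch.isdigit()]
    if len(digits) < 4:
        raise ValueError("card number too short to mask")
    masked = "*" * (len(digits) - 4) + "".join(digits[-4:])
    return " ".join(masked[i:i + 4] for i in range(0, len(masked), 4))


class FixedWindowRateLimiter:
    """Allow at most `limit` requests per API key in each time window."""

    def __init__(self, limit: int, window_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self._limit = limit
        self._window = window_seconds
        self._clock = clock  # injectable so windows can be simulated
        self._counters: Dict[str, Tuple[int, int]] = {}  # key -> (window index, count)

    def allow(self, api_key: str) -> bool:
        window = int(self._clock() // self._window)
        seen_window, count = self._counters.get(api_key, (window, 0))
        if seen_window != window:
            seen_window, count = window, 0  # a new window resets the counter
        if count >= self._limit:
            return False  # an API would typically answer 429 Too Many Requests
        self._counters[api_key] = (seen_window, count + 1)
        return True


if __name__ == "__main__":
    print(mask_card_number("4111 1111 1111 1234"))
    limiter = FixedWindowRateLimiter(limit=2, window_seconds=60)
    print([limiter.allow("key-a") for _ in range(3)])
```

Counting per key rather than globally is what lets one noisy client be throttled without affecting the others.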

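Conclusion and Shared Resources

The speaker closed by saying how much it meant to her to attend DjangoCon and that she would be happy to keep talking with attendees at the conference. She shared a QR code that links to:
* GitHub repository: contains the code examples.
* Slides: with more detailed information.
* Access to the codebase.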
