2024-10-02 | Michał Rokita | Multimedia processing with FFmpeg and Python

Multimedia Processing with FFmpeg and Python

Media details

Upload date
2025-06-21 19:33
Source
https://www.youtube.com/watch?v=J9PhTL9NnxM
Processing status
Completed
Transcription status
Completed
Latest LLM Model
gemini-2.5-pro

Transcript

speaker 1: It's actually my first time at EuroPython, so nice to see you. First, very briefly about me: I'm a Python developer. I'm also one of the PyCon Poland organizers and have been since 2016, helping Philip, who is present here too. I also work part time for the Warsaw University of Technology, which is displayed here, as a teaching assistant teaching advanced Python programming. I'm also an amateur beach volleyball player, and yes, you can play beach volleyball indoors, especially in the Polish winter. It's awesome, really. And I play guitar; this is me on the train on the way to last year's PyCon CZ. Hopefully we'll play together this evening.

The agenda of today's talk: first, I'm going to talk about what FFmpeg actually is. Then we'll go through some very simple tasks using the FFmpeg CLI, then move on to complex video stream processing with the FFmpeg CLI, and then move the definition of the whole processing into Python, to make it less complex, with ffmpeg-python. After that we'll go through a more advanced use case of frame-by-frame object detection with ffmpeg-python and OpenCV, and then we'll briefly talk about testing.

So what is FFmpeg exactly? Does anybody here know what FFmpeg stands for? The abbreviation? Anyone? Fast and furious? No, I don't think so, but it would be funny. It's just "fast forward Motion Picture Experts Group": MPEG stands for Motion Picture Experts Group, and FF stands for fast forward. So fast and furious wouldn't have been so bad either, I guess. It's like a Swiss army knife for multimedia processing. It supports most audio, video, and even subtitle codecs and formats, and it is used internally by a lot of tools you probably use without being aware that they run FFmpeg underneath: Google Chrome, Audacity, Blender, OBS Studio, Emby, Jellyfin, and many more, including youtube-dl. Actually, youtube-dl's core functionality for handling m3u8 playlists is based on FFmpeg. The tool was created in 2000 by Fabrice Bellard, and it has developed rapidly in the years since.

To get started with FFmpeg, you can trim a video file using a very simple command. First you specify the binary, which is ffmpeg. Then you have the -hide_banner argument, which is not strictly necessary, but it's nice because it hides all the information FFmpeg prints by default, including the flags and features it was compiled with, so you just get the output you probably expect. Then you pass the input flag, -i, with the video file as an argument. You tell it to seek to the 30th second of that video file, and the -t flag means to take the 10 seconds after that. The -y flag means that if the output file already exists, it should overwrite it. And the output file is video_trimmed.mp4 this time. So it's very simple.

Let's do something a bit more complex. Say we have this original video. Actually, in Poland it's quite nice, because it's legal to record and publish dash-cam videos, so I can use this as an example and avoid copyright infringement claims on YouTube. We have this original video, and let's detect some edges using FFmpeg. Again we open the terminal, we use the ffmpeg command, we hide the banner, then we pass the input, and here's the input file.
Then we tell it to use a video filter called edgedetect, which uses the Canny edge detection algorithm. We pass some arguments, like the low and high thresholds. Again, we tell it to overwrite the output, and we specify the output file. Then it runs and reports the progress of our processing; we wait for some time, and then we can open our output file, which looks like this: we have detected all the edges. It was very simple. It's a basic use case, but FFmpeg has a lot of filters; just look at the documentation to see how many are available.

Another tool provided by FFmpeg is ffprobe, a very nice small tool that shows you information about the multimedia files you're inspecting. When you run it on some MP4 file, for example, you can see the codecs it's using and a lot of metadata, like the encoder, the duration, the bitrate, and the streams present in the file, including the codecs those streams use, like H.264 here. And as you can see, there's an audio stream which is also marked as Japanese. So you can learn a lot about a video file if you run ffprobe on it, and it's a very useful tool for testing too, which we'll talk about later.

So how do we do more complex filtering? What I've shown you so far are very simple tasks: applying one filter, with one input and one output. But FFmpeg can do a lot more than that; you can design pretty complex processing pipelines using just the CLI. Now we'll go through an example of generating an RGB color histogram from the source video, running edge detection on the source video in parallel, and placing the results of both on top of the source video as overlays, so we can see all three streams in one video.

The FFmpeg command for this looks like this, and it might not be very pleasant to look at at first sight. Over time, if you use FFmpeg long enough, it starts to become understandable, but it's still not easy to maintain, especially once you get something more complex than what I'm showing here. The basic concept is that in the -filter_complex flag, which is used for, well, complex filters, you specify the signal graph used for processing. Here you actually have a small chain of filters on the first line. The first filter is the format filter, which changes the format of the stream to gbrp, which is just planar RGB; we need it to generate the RGB histogram later. Next is the filter we use to generate the histogram. In the square brackets on the left you have the filter's input arguments, and on the right you have the output argument, more like an output node. The zero here always refers to the first file passed to the ffmpeg call. It's pretty complex, and later on there is more of the same. An easier way to represent this is just to draw the graph. Here you can see that first we have our input file, marked as zero. We apply the gbrp format, then the histogram filter with display_mode=stack, and assign the result to the hist node. Then we scale that node to make it small enough to display on top of the original video with the right proportions, mark it as hist_scaled, and pass it as the second argument to the overlay filter, which accepts two arguments. The first overlay filter usage gets the original zero stream as its first argument.
And the second argument is our histogram, which we overlay on top. The second path is the edge detection and scaling: we scale the video to 25% of the original size so it fits on top of the original video, then apply edge detection, mark the result as edges, and pass it to the overlay filter again with the arguments x=800 and y=0, which is just the placement of the overlay. Then we merge everything together into output/histogram.mp4, the output file. This is the result, and it's pretty cool, right? Compared to the plain edge detection, this is already much more useful.

So let's move to ffmpeg-python, because writing that filter_complex argument wasn't too hard, but it wasn't exactly straightforward either, especially if you've never used FFmpeg before. And if you used something like this in a project and added some conditional filtering, it would become very unpleasant to maintain, I would say. ffmpeg-python is a nice, pretty small Python project available on GitHub (here's the GitHub link). It's a convenient wrapper around the FFmpeg CLI that focuses on the filter graphs FFmpeg supports, and it can generate the FFmpeg arguments from your Python code, which sounds very nice compared to hand-writing the CLI, right? It also provides helper functions for running FFmpeg in a subprocess, and it includes a probe function which returns ffprobe results as a Python dict, which is much easier to parse than the outputs I showed you before.

So how do we do the very same thing we did a few slides ago, but in Python? First we import ffmpeg, then we define our input file and assign it to the input_video variable. Then we define our histogram stream, the filter chain: we apply a few filters to the input video. We don't really run the filters here; they will run when we run the ffmpeg command. We just design our filter graph using these filter calls. First we apply the format filter with the argument gbrp, then the histogram filter with display_mode=stack, and then the scale filter to adjust the size of our histogram appropriately. Then we define our edges processing chain: again our input video, the edgedetect filter, and then the scale filter, which makes the video fit on top of the original. Then we say that we want to use the overlay filter to overlay the hist stream we defined before on top of the input video, and we use overlay again to place the edges stream on top of the hist overlay stream. Finally we define our output, say it should pass the -y flag to FFmpeg to overwrite the output if it exists, and just run it.

So it's much simpler, and it still generates pretty much the same command. When you compile it to print the string that ffmpeg-python generates, it returns basically the same call with very small differences. For example, for the overlay it uses eof_action=repeat by default, but in this case that's not necessary because the streams are the same length: if one stream ends, we don't have to repeat anything in the overlay. Another thing it does that's not strictly necessary is to explicitly map
this stream, node number six, to the output file. With just one output file that's not necessary, but FFmpeg does support multiple output files in one processing run. So again, the output is pretty much the same; it's exactly the same call, so we get the same results.

So we've done some processing with ffmpeg-python: we used it to generate an FFmpeg call for complex processing. But how can we actually run some Python code on the frames of our source video and then write them to an output video? Now we're going to do something a bit more interesting. We have this original video, again some dash-cam footage, and we're going to try to detect the registration plates of the cars passing by.

Let's go briefly through the logical concept of the solution I'm going to propose. This is just one of many possible solutions, but it's fairly simple and intuitive. First, we spawn our first FFmpeg process, which is just going to decode our stream, because MP4 files usually contain compressed streams and we can't load them as plain frames into Python; we have to decompress them first. FFmpeg is an excellent tool for decompressing video streams because it accepts pretty much any codec in existence. Then, as we read its standard output, which is piped to our Python process, we load the frames as NumPy arrays, use OpenCV's cascade classifier to find the registration plates, and mark them using OpenCV again. After we've found and marked the registration plates, we re-encode the frames into another MP4 file using a second FFmpeg process.

So what does the source code look like? First we have some imports; we need numpy and cv2. Then we define our input and output files, and then we have this mark_plates function. I'm not going to walk through it step by step, because I don't think that's necessary for this talk; all the examples are available on the GitHub Pages site for the talk, so you'll be able to access and run everything afterwards, including the source videos. The important part is that this function accepts an OpenCV frame, which is basically a NumPy array, and a list of the plates we found, as OpenCV rectangles. We just pass in the coordinates of the plates found on the frame, and the function draws a nice rectangle around each registration plate.

To get information about our input video, we use ffprobe. We call the probe function on the input file, take the streams from the result, and extract information like the stream's width, height, and frame rate, because we're going to want to slow the video down so we can see, step by step, what our tool detects. We assume there is exactly one video stream in the file, so we use an if condition in a generator expression and Python's next() to get that one video stream. If there is none, we'd get a StopIteration error, which is not too bad in this case.

Then we define our first FFmpeg process, which is also not too complex. We define the input, then we define our output, which is just a pipe.
We want the output format to be rawvideo, because we want to pass raw frames to Python, and our pixel format is bgr24, which stands for blue, green, red, at 24 bits per pixel. So it's pretty understandable. Then we run the stream asynchronously, and we tell the subprocess that ffmpeg-python creates underneath to pipe its standard output.

Then we define our second process, which is a bit more complex, because we need to specify information about the input: FFmpeg can't really guess what the input is when it's just piped, especially when it's raw frames. We define our input, which is a pipe, and the format, rawvideo, again. We pass the pixel format we use, then we tell it the size of the video, which is the information we extracted from the probe call. We specify the output frame rate, which is the original frame rate divided by four, so the result will be four times slower than the original video. Then we define our output file, an MP4 this time, and the pixel format: the standard, most common pixel format for streams in MP4s is yuv420p. We tell it again to overwrite the output if it exists, and again run it asynchronously.

The most important part of the detection is the OpenCV classifier. The XML file is available in the OpenCV repository, so you can just download it; it's also in the GitHub repo for this talk, so you can run and test it yourself. This cascade classifier XML file contains the features the classifier applies to detect registration plates. Here we iterate over the frames simply by reading from the standard output of the first process. We know that the size of one frame is the video width multiplied by the video height multiplied by three channels, because it's BGR: three bytes per pixel.

Here we construct our input frame: we read that many bytes from the standard output of the first process, and we need to load them as an OpenCV frame. We run numpy.frombuffer, knowing these are unsigned 8-bit integers, then reshape the array to video height by video width by three channels, and convert it to dtype uint8 again. One thing that's not strictly necessary in this case, but worth remembering, is that OpenCV does most of its processing in place on the objects you pass, so you can expect this mark_plates function, which applies some OpenCV functions, to mutate the frame. If you apply something afterwards or try to detect something else, you might be surprised to find you're detecting on a frame you've already painted on.

Then here's our function call that finds the plates. We use OpenCV's detectMultiScale function, pass it the frame, and give it some parameters which we can fine-tune according to our results. The most obvious part is that we define that a registration plate should be bigger than 50 by 20 pixels and smaller than 100 by 40 pixels. Then we run our mark_plates function and write to the standard input of the second process; again, we have to convert our OpenCV frame to plain bytes so we can write it to the other process. And after everything, we close the second process's standard input and wait for both processes to exit.
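For reference, the byte arithmetic described above looks roughly like this (a minimal sketch with placeholder dimensions and a stand-in for the decoder's pipe, not the talk's exact code):

```python
import io

import numpy as np

video_width, video_height = 1280, 720        # in the talk these come from ffprobe
frame_size = video_width * video_height * 3  # bgr24: three bytes per pixel

# Stand-in for the first process's stdout; in the real pipeline this is the pipe.
stdout = io.BytesIO(b'\x00' * frame_size)

raw = stdout.read(frame_size)                # exactly one frame's worth of bytes
frame = (
    np.frombuffer(raw, np.uint8)             # unsigned 8-bit samples
    .reshape((video_height, video_width, 3)) # rows x columns x channels
    .copy()                                  # frombuffer is read-only; OpenCV draws in place
)
print(frame.shape)                           # (720, 1280, 3)
```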
So this is the result of this code. It's not 100% accurate, but as you can see, it does work. With some more complex processing, and maybe if the video showed the plates at a somewhat better resolution, it could do better; honestly, it's pretty hard for me to read them, and I'm a human, not a computer program. But yes, it works.

So if you're using ffmpeg-python, how are you supposed to test your software? Because that's a very important thing to do, right? For simple processing, you can just use ffprobe to check that the output file is healthy and contains streams of the duration you expect. You can also use the compile method to retrieve the generated command, to make sure that the complex conditions you express with ffmpeg-python when defining these processing pipelines generate the correct output. And a very useful thing I found a long time ago, which is not that easy to find, is that the FFmpeg maintainers provide a very diverse library of multimedia samples, the FATE suite, at fate-suite.ffmpeg.org. You can get codecs there that you rarely see in the wild, so it's very useful if you want to develop a solution that accepts anything as input.

Some conclusions. FFmpeg is very powerful and complex; just run the full help to see how long it is. It's impossible to remember every FFmpeg filter while you're using it. For complex multimedia processing, you should probably use a wrapper to make things more human-readable. The CLI is awesome for basic operations; I use it all the time when I need to compress some video file I want to send over chat or something. It's excellent. But ffmpeg-python has one issue: it seems to be abandoned. The last commit was two years ago, and there is no response from the maintainer on the issues asking if the project is still alive. That is a problem, but it's still an open-source project, I think under the MIT license, so you're free to clone it.

And a very nice thing that happened recently is that my students at the Warsaw University of Technology took up a project idea I proposed of rewriting this library, because it lacked some features like proper type hints. They attempted a rewrite with a modern API, and it actually covers, I would say, half of the features the original library provides. It's a nice start, and it actually generates the filter functions from the FFmpeg source code, the C source code, with all the type hints and everything. It's a very nice project if you want to check it out; it will also be linked from this talk. I'm not really affiliated with it, it's my students' project, also under the MIT license and published on GitHub. It has some very nice ideas. It's not perfect, but it's a good way to start building something that could be a worthy successor to the original library.

Another thing I'd like to tell you is that PyCon Poland, which I help organize, happens this year at the end of August, from the 29th of August until the 1st of September, in Gliwice in southern Poland. 90% of the content is in English; we have talks, workshops, a social event, everything. It's very nice, and you are very welcome there. Thank you very much. This is the QR code for the links related to this talk. Now it's time for questions.
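To make the testing ideas concrete, here is a minimal sketch (file names and the expected duration are placeholders, and it assumes the container reports per-stream durations, as MP4 does):

```python
import ffmpeg

# 1) Health-check an output file with ffprobe: is there a video stream of
#    roughly the duration we expect?
info = ffmpeg.probe('output.mp4')
video = next(s for s in info['streams'] if s['codec_type'] == 'video')
assert abs(float(video['duration']) - 10.0) < 0.5

# 2) Inspect the command a pipeline would generate, without running FFmpeg.
cmd = ffmpeg.input('in.mp4').filter('edgedetect').output('out.mp4').compile()
assert 'edgedetect' in ' '.join(cmd)
```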
speaker 2: So if you have any questions, please stand by the mic.
speaker 1: Let's see if this is on. Thanks for the talk, that was very inspiring. The Python library, is it focused on creating output files, or can you also have a live output, like from the FFmpeg command line, that you could reuse? Yes, you can do live output using this library. This library is just an interface for spawning FFmpeg processes, and that is a plain FFmpeg process; FFmpeg does support doing things in real time, even live RTMP streaming and things like that. FFmpeg really is a Swiss army knife of multimedia processing and streaming. I even have some extra slides, not for this talk really, that show how to connect to the FFmpeg process using a unix socket, so you can also track the progress of your processing in real time, which is a very nice thing. So yes, you can stream real-time content using FFmpeg and ffmpeg-python.
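The unix-socket progress trick mentioned in this answer can be sketched roughly like this (my own minimal version, not the speaker's extra slides; it relies on FFmpeg's -progress option, which writes key=value lines such as out_time=... and progress=end to the given target):

```python
import os
import socket
import tempfile

import ffmpeg

# Listen on a unix socket before FFmpeg starts; FFmpeg connects as a client
# and writes its progress report into the socket.
sock_path = os.path.join(tempfile.mkdtemp(), 'progress.sock')
server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
server.bind(sock_path)
server.listen(1)

proc = (
    ffmpeg.input('input.mp4')
    .output('output.mp4')
    .global_args('-progress', f'unix://{sock_path}')
    .overwrite_output()
    .run_async()
)

conn, _ = server.accept()
buf = b''
while True:
    chunk = conn.recv(4096)
    if not chunk:
        break
    buf += chunk
    *lines, buf = buf.split(b'\n')
    for line in lines:
        print(line.decode('utf-8', 'replace'))  # e.g. frame=42, out_time=..., progress=continue
proc.wait()
```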
speaker 3: Thank you so much for your talk, it was really inspiring. Throughout it, my mind was racing with side-project ideas. Have you used FFmpeg in any side projects, or have friends of yours used it?
speaker 1: have used it? You mean like for my side projects? Like Yeah, Yeah, actually I well, I actually use it for a project at the company I work for, and I'm not sure how much I can tell about the project, so maybe we can talk about it afterwards, but well, it's a very nice tool. Like I use it for you know for the very simple tasks. If you you know there is some video you want to download from some page, for example, YouTube deal doesn't support it. You can use ffack for that too. Yeah. Like it's a very, very fun thing. I even used it to downscale the image of the Warsaw University of	technology in this talk because it was like 10 mb. And I was like, this is too much. And Yeah, like there's a lot of tools that can do this about fdo. Ffnack can even process gifts and generate gifts. So I think I generated a few gifts in ffm peg just for fun. So Yeah, it's a very, very versatile tool. And at first it looks scary to use it, but it's not really like after you get used to it, it's much, much more pleasant than like running some graphical user interface for editing videos. So Yeah.
speaker 3: thank you for the talk. Is there more higher level library also available? Because all of this piping seemed quite low level. And it seems like how .
speaker 1: you mean that this process communication exactly? Yeah, I don't think so. At least I haven't found a more high level library. You should use ffand. There are some high level libraries that utilize fflike playing audio from Python as far as I know. But I mean like somebody can implement something that's even easier, I mean like piping. Yes, it does require some knowledge about about no mulprocessing in Python, which can be pretty hard if you have not used it before. But I think this is one of the easiest libraries to get started with ffm. For example, there is pi av, which is a binding to the ffm peg c libraries. And this is much, much harder to use because you know like it's you have great control over everything, like over processing videos frame by frame, but you have to think about a lot of things. But actually this code, yes, it's a bit complex what I have shown you with this OpenCV processing, but I think after you you know you can run the examples afterwards and I think you can text me on LinkedIn or just like anywhere on the discord. I will be glad to to help to somebody to understand how this mulprocessing part works.
speaker 3: Thank you.
speaker 2: Thank you, everyone, for participating and asking questions. You can ask more questions on the Discord. Thanks again, and let's give a round of applause for Michał.

Latest Summary (Detailed)

Generated 2025-06-21 19:39

Overview / Executive Summary

This talk, delivered by Python developer Michał Rokita, is a comprehensive introduction to audio/video processing with the powerful multimedia tool FFmpeg and its Python wrapper library ffmpeg-python. The talk first likens FFmpeg to a "Swiss army knife" of multimedia processing, demonstrating both basic command-line (CLI) tasks (such as trimming video and edge detection) and complex processing pipelines (such as overlaying an RGB histogram and edge-detection output on top of the source video).

The core argument is that although FFmpeg is powerful, its CLI syntax becomes hard to read and maintain for complex tasks. The talk therefore focuses on the ffmpeg-python library, which builds and executes FFmpeg commands from more readable, structured Python code, greatly simplifying development. Through an advanced case study, the talk demonstrates frame-by-frame license-plate detection implemented with ffmpeg-python and OpenCV, clearly showing the library's usefulness when integrating external tools for complex analysis.

Finally, the speaker points out a key issue: the ffmpeg-python project appears to be unmaintained. He also introduces a successor project started by students at the Warsaw University of Technology, which aims to provide a more modern and complete API by generating type-hinted Python functions directly from FFmpeg's C source code. Overall, the talk offers a clear introduction and practical examples for developers who want to harness FFmpeg from Python for efficient, maintainable multimedia processing.


Introduction to FFmpeg and Basic Usage

FFmpeg is often called the "Swiss army knife" of multimedia processing; its name stands for "Fast Forward Motion Picture Experts Group". It is a comprehensive, cross-platform solution that can handle encoding, decoding, and format conversion for the vast majority of audio, video, and subtitle codecs and formats.

  • Widely used: many everyday tools and platforms use FFmpeg under the hood, including:
    • Google Chrome
    • Audacity
    • Blender
    • OBS Studio
    • YouTube-dl
  • Basic CLI operations
    • Video trimming: a single simple command, e.g. taking the 10 seconds starting at the 30-second mark:

      ```bash
      ffmpeg -hide_banner -i video.mp4 -ss 30 -t 10 -y video_trimmed.mp4
      ```
    • Video filters: apply one of the many built-in filters, e.g. edge detection with the edgedetect filter:

      ```bash
      ffmpeg -hide_banner -i input.mp4 -vf edgedetect=low=0.1:high=0.4 -y output.mp4
      ```
  • The ffprobe tool
    • A small companion tool for inspecting multimedia files and printing detailed information about them (a Python sketch of the same idea follows this list).
    • Data it reports includes: encoder, duration, bitrate, and the streams present in the file, with their codecs (e.g. H.264) and metadata such as audio-language tags.
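As a rough illustration (my sketch, not from the talk), ffprobe's JSON mode can be parsed from Python with nothing but the standard library; the ffmpeg-python wrapper covered below exposes the same call as ffmpeg.probe():

```python
import json
import subprocess

# Ask ffprobe for machine-readable JSON instead of its human-oriented report.
out = subprocess.run(
    ['ffprobe', '-v', 'quiet', '-print_format', 'json',
     '-show_format', '-show_streams', 'input.mp4'],
    capture_output=True, check=True,
).stdout

info = json.loads(out)
print(info['format']['duration'])                          # overall duration, in seconds
for stream in info['streams']:
    print(stream['codec_type'], stream.get('codec_name'))  # e.g. "video h264", "audio aac"
```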

Complex Filter Graphs from the CLI

FFmpeg's -filter_complex argument lets you build complex processing pipelines (signal graphs) that process multiple input streams in parallel and combine them into one output.

  • Example: generating a multi-overlay video
    1. Generate an RGB color histogram from the source video.
    2. Run edge detection on the source video in parallel.
    3. Place both results on top of the source video as overlays.
  • Complexity of the CLI command
    • The CLI command implementing this is very long and hard to read, especially for newcomers:
      > ffmpeg -hide_banner -i ... -filter_complex "[0]format=gbrp,histogram=display_mode=stack[hist];[hist]scale=...,overlay=...[out];[0]scale=...,edgedetect[edges];[out]overlay=...[final]" ...
    • The speaker notes that such commands become understandable with long use, but remain hard to maintain in a project, especially once conditional logic is added.
  • The processing graph
    • Path one (histogram): input file → format conversion (gbrp) → histogram → scale → overlay onto the source video.
    • Path two (edge detection): input file → scale → edge detection → overlay onto the result of path one.
    • Everything is finally merged into a single MP4 output file.

Simplifying Complex Processing with ffmpeg-python

ffmpeg-python is a Python wrapper around the FFmpeg CLI that lets developers define and build filter graphs in Python code instead of hand-assembling complex CLI strings.

  • Key advantages
    • Readability and maintainability: a complex filter chain becomes a series of clearly structured Python function calls.
    • Command generation: the corresponding FFmpeg CLI arguments are generated from your Python code.
    • Helpers: functions for running FFmpeg in a subprocess, plus a probe function that parses ffprobe output into a Python dict.
  • Code example: reproducing the multi-overlay video above with ffmpeg-python.
    ```python
    import ffmpeg

    input_video = ffmpeg.input('input.mp4')

    # Define the histogram processing stream
    histogram_stream = (
        input_video
        .filter('format', 'gbrp')
        .filter('histogram', display_mode='stack')
        .filter('scale', ...)
    )

    # Define the edge-detection processing stream
    edges_stream = (
        input_video
        .filter('edgedetect')
        .filter('scale', ...)
    )

    # Overlay both streams on the source video
    processed_stream = (
        input_video
        .overlay(histogram_stream)
        .overlay(edges_stream, x=800, y=0)
    )

    # Define the output and run
    (
        processed_stream
        .output('output.mp4', y=None)  # y=None corresponds to the -y flag
        .run()
    )
    ```
    The CLI command this code generates is nearly identical to the hand-written one, but the code itself is much easier to understand and modify.
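As a hedged aside (standard ffmpeg-python features, though not shown on the slides above): the generated command line can be inspected with compile(), and the filter graph itself can be drawn with the library's view() helper, which needs the graphviz package installed:

```python
import ffmpeg

stream = (
    ffmpeg.input('input.mp4')
    .filter('edgedetect')
    .output('output.mp4')
)

print(' '.join(stream.compile()))  # the exact command line that .run() would execute
ffmpeg.view(stream)                # renders the filter graph (requires graphviz)
```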

Advanced Case Study: Frame-by-Frame License-Plate Detection with OpenCV

The talk demonstrates frame-by-frame video processing with ffmpeg-python and OpenCV; a condensed code sketch follows the list below.

  • Goal: detect license plates in dash-cam footage.
  • Approach
    1. Process one (decode): spawn an FFmpeg process that decodes the input MP4 into raw video frames (rawvideo, BGR24) and pipes the stream to the main Python process.
    2. Python processing
      • Read raw frame data from the pipe.
      • Load the bytes into a NumPy array.
      • Use OpenCV's cascade classifier to locate license plates in each frame.
      • Draw rectangles around the detected plates with OpenCV.
    3. Process two (encode): spawn a second FFmpeg process that receives the processed raw frames over a pipe and re-encodes them into a new MP4 file.
  • Key implementation details
    • Use ffprobe up front to obtain the input video's width, height, and frame rate.
    • In Python, read the decoder's standard output in a loop, computing each frame's byte size as width * height * 3 (three BGR channels), and turn the bytes into an image array with numpy.frombuffer.
    • After processing, convert the NumPy array back to bytes and write it to the second FFmpeg process's standard input.
  • Result: the approach successfully detects plates in the video, though accuracy is not 100%.
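A condensed sketch of the whole pipeline (file names and the cascade file name are placeholders, and the talk's frame-rate slowdown is omitted; the complete examples live on the talk's GitHub page):

```python
import cv2
import ffmpeg  # the ffmpeg-python package
import numpy as np

IN_FILE, OUT_FILE = 'input.mp4', 'output.mp4'          # placeholder paths
CASCADE_FILE = 'haarcascade_russian_plate_number.xml'  # from the OpenCV repo

# Probe the input so the raw-frame byte arithmetic below is correct.
video = next(s for s in ffmpeg.probe(IN_FILE)['streams']
             if s['codec_type'] == 'video')
width, height = int(video['width']), int(video['height'])

# Process one (decode): compressed MP4 in, raw BGR24 frames out on stdout.
decoder = (
    ffmpeg.input(IN_FILE)
    .output('pipe:', format='rawvideo', pix_fmt='bgr24')
    .run_async(pipe_stdout=True)
)

# Process two (encode): raw BGR24 frames in on stdin, MP4 out.
encoder = (
    ffmpeg.input('pipe:', format='rawvideo', pix_fmt='bgr24',
                 s=f'{width}x{height}')
    .output(OUT_FILE, pix_fmt='yuv420p')
    .overwrite_output()
    .run_async(pipe_stdin=True)
)

classifier = cv2.CascadeClassifier(CASCADE_FILE)
frame_size = width * height * 3  # bgr24: three bytes per pixel

while True:
    raw = decoder.stdout.read(frame_size)
    if len(raw) < frame_size:  # end of stream
        break
    # frombuffer gives a read-only view, so copy before OpenCV draws in place.
    frame = np.frombuffer(raw, np.uint8).reshape((height, width, 3)).copy()
    plates = classifier.detectMultiScale(frame, minSize=(50, 20), maxSize=(100, 40))
    for (x, y, w, h) in plates:
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 0, 255), 2)
    encoder.stdin.write(frame.tobytes())

encoder.stdin.close()
decoder.wait()
encoder.wait()
```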

Testing Strategy and Project Status

  • Testing approaches
    • Simple checks: use ffprobe to verify that the output file is healthy and that its streams (e.g. their duration) match expectations.
    • Logic checks: use the compile() method to retrieve the generated CLI command string, verifying that complex conditional logic produces the correct command.
    • Sample library: the FFmpeg maintainers provide a very diverse library of multimedia samples (the FATE suite at fate-suite.ffmpeg.org) with many rarely-seen codecs, ideal for developing solutions that must accept anything as input.
  • Status of ffmpeg-python (caveats and uncertainty)
    • The speaker states plainly that the ffmpeg-python project appears to be abandoned:
      > "it seems to be abandoned. The last commit was two years ago, and there is no response from the maintainer on the issues asking if the project is still alive."
    • This is a significant problem and a risk for long-term adoption.
  • A potential successor (recommendation)
    • The speaker introduced a rewrite of ffmpeg-python started by his students at the Warsaw University of Technology.
    • Features of the new project
      • A more modern API design.
      • Filter functions generated automatically from FFmpeg's C source code, complete with type hints, keeping the API comprehensive and accurate.
    • The speaker calls it a nice start: not yet feature-complete, but a promising alternative for the community.

Q&A and Other Information

  • Live output: ffmpeg-python can drive real-time streaming (e.g. RTMP), since it merely spawns FFmpeg processes and FFmpeg itself supports real-time operation.
  • Higher-level libraries: the speaker has not found a higher-level library that fully hides the process piping. He mentioned PyAV, a lower-level binding to FFmpeg's C libraries that is much harder to use. Despite the inter-process plumbing, he considers ffmpeg-python one of the easiest ways to get started with FFmpeg.
  • Conference plug: the speaker invited attendees to PyCon Poland, held from August 29 to September 1 in Gliwice, southern Poland, with 90% of the content in English.