The `AudioFifo.read` method expects the number of samples to read. You mentioned that the library supplying `length` expects `length` bytes. With audio, the number of samples is not equal to the number of bytes. For example, in your case, `pcm_s16le` means each sample is 16-bit little-endian, i.e. 2 bytes per sample. If the library is indeed supplying the number of bytes it expects, you need to read `length / 2` samples. That said, I hope the library is aligned with audio realities and actually asks for a number of samples.
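To illustrate the arithmetic, here is a minimal sketch (the `bytes_to_samples` helper is mine, and it assumes 2 bytes per sample as with `pcm_s16le`):

```python
BYTES_PER_SAMPLE = 2  # pcm_s16le: 16-bit samples


def bytes_to_samples(length_in_bytes: int, channels: int = 1) -> int:
    """Convert a byte count to a per-channel sample count."""
    return length_in_bytes // (BYTES_PER_SAMPLE * channels)


print(bytes_to_samples(960))      # mono: 480 samples
print(bytes_to_samples(1920, 2))  # stereo: 480 samples per channel
```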
```python
fifo: av.AudioFifo = av.AudioFifo(format="s16le")
resampler: av.AudioResampler = av.AudioResampler(
    format="s16", layout="stereo", rate=48000)


def on_played_data(_, length):
    # length is in bytes; s16 stereo is 4 bytes per sample
    # (2 bytes per sample * 2 channels), so use integer division
    data = fifo.read(length // 4)
    if data:
        data = data.to_ndarray().tobytes()
    return data


group_call_factory = GroupCallFactory(client, CLIENT_TYPE)
group_call = group_call_factory.get_raw_group_call(
    on_played_data=on_played_data)
await group_call.start(PEER)
while not group_call.is_connected:
    await asyncio.sleep(1)

input_ = av.open(STREAM)
for frame in input_.decode():
    if frame:
        print(frame)
        frame = resampler.resample(frame)
        fifo.write(frame)

await client.run_until_disconnected()
```
```python
_input = av.open(SOURCE, metadata_errors="ignore")
for frame in _input.decode():
    fifo.write(frame)
```
Invalid data found when processing input; the last error log: `[mp3float] Header missing`
@dong-zeyu yes, you have to calculate pts / dts yourself. Something like below worked for me:
```python
import av

avin = av.open("test.264", "r", "h264")
avout = av.open("test.mp4", "w", format="mp4")
avout.add_stream(template=avin.streams[0])

time_base = int(1 / avin.streams[0].time_base)
rate = avin.streams[0].base_rate
ts_inc = int(time_base / rate)

ts = 0
for pkt in avin.demux():
    pkt.pts = ts
    pkt.dts = ts
    avout.mux(pkt)
    ts += ts_inc

avin.close()
avout.close()
```
That works! Thank you!
pts / dts values are expressed in `time_base` units. For example, if the `time_base` for the video is `1/1000`, then 1 unit of pts is a millisecond. The time base is something arbitrary that the video creator (or the software creating the video) chooses, and the other timestamps (like pts, dts) align with it. Staying with the `1/1000` time base example: if a video has a 2 FPS frame rate, each packet / frame will step by half a second, which is an increase of 500 in pts / dts terms.
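That 500 can be checked quickly (hypothetical values, using `fractions.Fraction` to mirror how PyAV exposes time bases):

```python
from fractions import Fraction

time_base = Fraction(1, 1000)  # 1 pts unit == 1 millisecond
fps = Fraction(2)              # 2 frames per second

# duration of one frame, expressed in time_base units
pts_step = int((1 / fps) / time_base)
print(pts_step)  # 500
```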
Hi everyone, I could use some help with trimming the padding from a video's frames. I've created an issue with more details here: PyAV-Org/PyAV#802. But basically, if I reshape the numpy array of a frame's plane along the line size and then slice along the frame width to get a frame without padding, I don't get a properly aligned frame. However, when I try accounting for memory alignment, the results are better, which leads me to suspect that PyAV may be ignoring memory alignment in its frame width property.
```python
def _remove_padding(self, plane):
    """Remove padding from a video plane.

    Args:
        plane (av.video.plane.VideoPlane): the plane to remove padding from

    Returns:
        numpy.array: an array with proper memory aligned width
    """
    buf_width = plane.line_size
    bytes_per_pixel = 1
    frame_width = plane.width * bytes_per_pixel
    arr = np.frombuffer(plane, np.uint8)
    if buf_width != frame_width:
        align_to = 16
        # Round the frame width up to a 16-byte boundary
        frame_width = (frame_width + align_to - 1) & ~(align_to - 1)
        # Slice (create a view) at the aligned boundary
        arr = arr.reshape(-1, buf_width)[:, :frame_width]
    return arr.reshape(-1, frame_width)
```
Although the manual alignment above kind of works, it's not correct for every format. In my case, it works correctly for the luma plane but not for the chroma planes. I'm not sure how to proceed, so any advice would be really helpful. Thanks.
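For what it's worth, here is a per-plane sketch that relies only on the plane's `line_size` and `width` instead of guessing the alignment (the `strip_padding` helper is hypothetical, and it assumes 1 byte per pixel, e.g. yuv420p planes; note that for yuv420p the chroma planes already report half the luma width, so no extra halving is needed):

```python
import numpy as np


def strip_padding(buf, line_size, width):
    """Drop the per-row stride padding from a planar image buffer.

    buf: bytes-like object made up of `line_size`-byte rows
    width: number of valid bytes per row (plane width * bytes per pixel)
    """
    rows = np.frombuffer(buf, dtype=np.uint8).reshape(-1, line_size)
    # Keep only the valid pixels; copy so the result is contiguous
    return np.ascontiguousarray(rows[:, :width])


# With a PyAV frame this would be called once per plane, e.g.:
# y = strip_padding(frame.planes[0], frame.planes[0].line_size, frame.planes[0].width)
```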
```python
output = av.open(output_name, 'w')
stream = output.add_stream('h264', fps)
...

# later, in a loop
frame = av.VideoFrame.from_ndarray(frame, format='bgr24')
print(frame)
packet = stream.encode(frame)
output.mux(packet)

# after the loop: flush the encoder and close the container
output.mux(stream.encode(None))
output.close()
```