So I've started with something like this:
self._fifo = av.AudioFifo(format="s16le")

def _start_av(self):
    input_ = av.open(self.url, options={"format": "s16le", "acodec": "pcm_s16le", "ac": "2", "ar": "48k"})
    for frame in input_.decode():
        if frame:
            self._fifo.write(frame)

def read(self, length):
    data = self._fifo.read(length)
    if data:
        data = data.to_ndarray().tobytes()
    return data  # returns the data to the library
But I have a small problem: the data returned by the AudioFifo is bigger than what was requested.
The AudioFifo.read method expects the number of samples to read. You mentioned that the library supplying length expects that many bytes. With audio, the number of samples is not equal to the number of bytes. For example, in your case pcm_s16le means each sample is 16-bit little-endian, which is 2 bytes per sample per channel; since you opened the input with 2 channels, one interleaved sample takes 4 bytes. If the library is indeed supplying the number of bytes it expects, you need to read length / 4 samples (length / 2 for mono). That said, I hope the library is aligned with audio realities and actually asks for a number of samples.
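A minimal sketch of that bytes-to-samples conversion, assuming interleaved s16 audio (the helper name and the channels parameter are mine, not from any library):

BYTES_PER_SAMPLE = 2  # pcm_s16le: 16 bits = 2 bytes per sample, per channel

def bytes_to_samples(length_in_bytes, channels):
    # One interleaved sample spans all channels at once.
    return length_in_bytes // (BYTES_PER_SAMPLE * channels)

# 48 kHz stereo: a 3840-byte request is 960 samples, i.e. 20 ms of audio.
assert bytes_to_samples(3840, channels=2) == 960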
import asyncio

import av
from pytgcalls import GroupCallFactory

# Assumes this snippet runs inside an async function, with `client`,
# CLIENT_TYPE, PEER and STREAM already defined.
fifo: av.AudioFifo = av.AudioFifo(format="s16le")
resampler: av.AudioResampler = av.AudioResampler(
    format="s16", layout="stereo", rate=48000)

def on_played_data(_, length):
    # length is a byte count; stereo s16 audio is 4 bytes per sample,
    # so convert to a sample count with integer division.
    data = fifo.read(length // 4)
    if data:
        data = data.to_ndarray().tobytes()
    return data

group_call_factory = GroupCallFactory(client, CLIENT_TYPE)
group_call = group_call_factory.get_raw_group_call(
    on_played_data=on_played_data)
await group_call.start(PEER)
while not group_call.is_connected:
    await asyncio.sleep(1)

input_ = av.open(STREAM)
for frame in input_.decode():
    if frame:
        print(frame)
        frame = resampler.resample(frame)  # note: on PyAV >= 9 this returns a list of frames
        fifo.write(frame)

await client.run_until_disconnected()
_input = av.open(SOURCE, metadata_errors="ignore")
for frame in _input.decode():
    fifo.write(frame)
This gives Invalid data found when processing input; the last error log is [mp3float] Header missing.
@dong-zeyu yes, you have to calculate pts / dts yourself. Something like below worked for me:
import av

avin = av.open("test.264", "r", "h264")
avout = av.open("test.mp4", "w", format="mp4")
avout.add_stream(template=avin.streams[0])

# One second is `time_base` ticks; each frame advances by time_base / frame_rate ticks.
time_base = int(1 / avin.streams[0].time_base)
rate = avin.streams[0].base_rate
ts_inc = int(time_base / rate)

ts = 0
for pkt in avin.demux():
    pkt.pts = ts
    pkt.dts = ts
    avout.mux(pkt)
    ts += ts_inc

avin.close()
avout.close()
That works! Thank you!
pts / dts values are expressed in time_base units. For example, if the time_base for the video is 1/1000, then 1 unit of pts is actually a millisecond. The time base is something arbitrary that the video creator (or the software creating the video) chooses, and then the other timestamps (like pts, dts) align with it. Going with the example of 1/1000 as the time base, if a video has a 2 FPS frame rate, each packet / frame will have a step of half a second, which is 500 in terms of pts / dts increments.
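A quick worked version of that arithmetic (the names here are just illustrative):

from fractions import Fraction

time_base = Fraction(1, 1000)  # one pts/dts tick = 1 ms
fps = Fraction(2)              # 2 frames per second

# Ticks between consecutive frames: (1 / fps) / time_base = 500
ticks_per_frame = int((1 / fps) / time_base)
assert ticks_per_frame == 500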
Hi everyone, I could use some help with trimming the padding from a video's frames. I've created an issue here with more details: PyAV-Org/PyAV#802. But basically, if I reshape the numpy array of a frame's plane along line_size and then slice along the frame width to get a frame without padding, I'm not getting a properly aligned frame. However, when I account for memory alignment, the results are better, which leads me to suspect it might be something to do with PyAV ignoring memory alignment in its frame width property.
import numpy as np

def _remove_padding(self, plane):
    """Remove padding from a video plane.

    Args:
        plane (av.video.plane.VideoPlane): the plane to remove padding from

    Returns:
        numpy.array: an array with a properly memory-aligned width
    """
    buf_width = plane.line_size
    bytes_per_pixel = 1
    frame_width = plane.width * bytes_per_pixel
    arr = np.frombuffer(plane, np.uint8)
    if buf_width != frame_width:
        align_to = 16
        # Round the frame width up to a 16-byte boundary
        frame_width = (frame_width + align_to - 1) & ~(align_to - 1)
        # Slice (create a view) at the aligned boundary
        arr = arr.reshape(-1, buf_width)[:, :frame_width]
    return arr.reshape(-1, frame_width)
Although the manual alignment above kind of works, it's not correct for every format. In my case, it works correctly for the luma plane but not the chroma planes. I'm not sure how to proceed, so any advice would be really helpful. Thanks.
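For context, the plain reshape-and-slice described in the post, without the alignment workaround, would look something like this (a sketch only; input.mp4 and the 8-bit-per-component assumption are placeholders):

import av
import numpy as np

container = av.open("input.mp4")
frame = next(container.decode(video=0))  # e.g. a yuv420p frame

for plane in frame.planes:
    # Each buffer row is line_size bytes; only the first plane.width
    # bytes per row are pixel data, the rest is padding.
    arr = np.frombuffer(plane, np.uint8).reshape(-1, plane.line_size)
    unpadded = arr[:, : plane.width]
    print(plane.line_size, plane.width, unpadded.shape)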
output = av.open(output_name, 'w')
stream = output.add_stream('h264', fps)

... # later in a loop
frame = av.VideoFrame.from_ndarray(frame, format='bgr24')
print(frame)
for packet in stream.encode(frame):  # encode() may buffer and returns zero or more packets
    output.mux(packet)
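One follow-up worth noting (a sketch of the usual PyAV teardown, assuming the loop above has finished): the encoder buffers frames internally, so it needs to be flushed before the container is closed.

# Flush any frames still buffered in the encoder, then finalize the file.
for packet in stream.encode():
    output.mux(packet)
output.close()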