Anshin A.
@Icekhaos
Whatever "VOX ADPCM" is, it seems really noisy with it.
Anshin A.
@Icekhaos
On to other things.
image.png
I know the second 4 byte field is file size.
The third is related to it, and the difference between the two fields grows with file size.
Considering this stores uncompressed .wav, I'm not sure this is decompressed size.
Anshin A.
@Icekhaos
Any ideas?
I'm stuck.
Mikhail Yakshin
@GreyCat
Might be offset?
Might be compressed / uncompressed file size?..
sebbu2
@sebbu2
oh yeah, talking about compression, i had a hard time detecting RAW deflate (without gzip or zlib header)
heard there wasn't any magic signature, just that a certain operation on the bits of the first (two?) byte(s) should return a multiple of 32 or something
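For what it's worth, the bit check described here sounds like the zlib header rule from RFC 1950: the first two bytes, read as a big-endian 16-bit number, must be a multiple of 31. Raw deflate (RFC 1951) has no header at all, so the usual approach is simply to try inflating the data. A minimal sketch in Python, where the function names and the `max_out` limit are made up for illustration:

```python
import zlib

def looks_like_zlib_header(data: bytes) -> bool:
    """RFC 1950 check: CM == 8, CINFO <= 7, (CMF * 256 + FLG) % 31 == 0."""
    if len(data) < 2:
        return False
    cmf, flg = data[0], data[1]
    return (cmf & 0x0F) == 8 and (cmf >> 4) <= 7 and (cmf * 256 + flg) % 31 == 0

def try_raw_deflate(data: bytes, max_out: int = 1 << 20):
    """Try to inflate `data` as raw deflate; return the bytes or None.

    Returns None if zlib reports an error, or if the end of the final
    deflate block was not reached within `max_out` output bytes.
    """
    d = zlib.decompressobj(wbits=-15)  # negative wbits = raw deflate, no header
    try:
        out = d.decompress(data, max_out)
    except zlib.error:
        return None
    return out if d.eof else None
```

Trial inflation is only a heuristic: short random data can occasionally inflate without error, so it helps to also sanity-check the output length or content.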
Anshin A.
@Icekhaos
@GreyCat Thing is, ripping this with another rather brute force utility (Dragon Unpacker) gives me raw WAVs in the size of what I have determined to be file size
and they are the correct sounds
I got a theory that it was number of samples, but it still varies, but is always more samples than are in the file
Anshin A.
@Icekhaos
but then that is according to audacity
and I am not sure whether it is accurate
Anshin A.
@Icekhaos
image.png
Redid some numbers.
They seem much closer and more predictable... but
Using as offset lands in the middle of a .wav
doesn't seem to be valid
Anshin A.
@Icekhaos
now that i did the math correctly
these are almost double the actual file sizes
but never quite
they still relate somehow, and I do not know how
Mikhail Yakshin
@GreyCat
Number of bytes vs number of samples?
Number of samples as in whole vs as per channel?
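Both questions come down to arithmetic: for uncompressed PCM, data bytes = samples per channel × channels × bytes per sample. A small Python sketch for comparing a candidate header value against each of these quantities, assuming the ripped files are plain PCM WAVs that the standard `wave` module can read (the helper names are hypothetical):

```python
import io
import wave

def pcm_sizes(wav_bytes: bytes):
    """Return (frames, channels, bytes_per_sample, data_bytes) for a PCM WAV."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as w:
        frames = w.getnframes()      # samples per channel
        channels = w.getnchannels()
        width = w.getsampwidth()     # bytes per sample
    return frames, channels, width, frames * channels * width

def make_silent_wav(n_frames: int, channels: int = 1, width: int = 2,
                    rate: int = 11025) -> bytes:
    """Build a silent PCM WAV in memory, only to have something to measure."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(channels)
        w.setsampwidth(width)
        w.setframerate(rate)
        w.writeframes(b"\x00" * (n_frames * channels * width))
    return buf.getvalue()
```

For a 16-bit mono file written this way, `pcm_sizes(make_silent_wav(1000))` gives 1000 frames and 2000 data bytes, and the whole file is 44 bytes longer than the data because of the header the `wave` module writes.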
dgelessus
@dgelessus
Perhaps there should be a separate room for reverse engineering discussions/brainstorming/help? I don't have anything against these kinds of discussions - quite the opposite, sometimes I could use some help myself when reverse engineering data formats. But I think it would be a good idea to have a separate room for this, so that the main room can stay focused on Kaitai Struct itself, and so that people can opt out of the RE help discussions if they're not interested.
(I can't help much with the actual question, sorry - I know basically nothing about audio formats, and everything I could think of has been said already.)
Anshin A.
@Icekhaos
These are mono, @GreyCat
Everything in the game is mono, at a frequency specified of course by the WAV itself and reflected inside the .vxvs.
Petras Jokubauskas
@sem-geologist

Hello there,
I am trying to parse some C#/.NET produced binary files, and I am quite annoyed with strings. Let me explain: in those binary files, strings are prefixed with their length, and to read them I have defined this type:

types:
  c_hash_string:
    seq:
      - id: str_len
        type: u4
      - id: text
        type: str
        size: str_len
        encoding: ASCII

Now this makes an unnecessarily ugly tree when parsed, and to access a parsed element I need to call path_to_the_string.text. I would like to be able to access it without the '.text' part. Is there some easy technique to achieve this elegantly?

Mikhail Yakshin
@GreyCat
Hi @sem-geologist! There are multiple concerns about this one.
One of the tasks that KS aims to achieve is to fully reflect all the attributes encountered in the data stream — so, from that point of view, it is necessary to output and give access to all of them.
However, you're completely right, sometimes it makes sense to hide all the details on this level from other levels of abstraction.
There is a proposal to implement something to make it better, let me find it...
Mikhail Yakshin
@GreyCat
Petras Jokubauskas
@sem-geologist
Thanks @GreyCat, I understand now how it is problematic. I guess I have two options: use opaque types (but that cripples the RE process a bit, and I need it available because the vendor is constantly changing the binary format, so it is a moving target), or abstract the reading of text fields in my application.
Mikhail Yakshin
@GreyCat
@sem-geologist Opaque types are always opaque user types, i.e. you won't be able to get a value from them in any way
i.e. you can't do:
instances:
  my_str:
    pos: 0
    type: c_hash_string
  my_str_2:
    value: my_str + 'blah_blah'
as the compiler will have no idea that my_str actual type is a string
Could you please add your use case to kaitai-io/kaitai_struct#171
Petras Jokubauskas
@sem-geologist
I have another question. I am reverse engineering the parsing of a really terribly designed binary format. It is a scientific data format for some kind of electron microprobe. The format contains datasets and items. Every dataset and item has its header with sub-headers and so on. The terrible part is that to get to any item n in dataset m you need to read everything before it, as the positions/address offsets of items are not mapped in any table (which is common in other formats); they simply do not exist. As for the main data, it is much larger than the headers. The header contains the size of the data. Now, my question: is it possible to get any lazy loading implementation using ksy? What I think I need is the ability to get the current pointer position and save it for reuse when making instances of the data, but I suspect I am overcomplicating this. I am not sure if substreams would help me out here.
Mikhail Yakshin
@GreyCat
@sem-geologist As of now, all seq attributes are eager and all instances are lazy
You can get away with some tricks like saving _io.pos into instance and then reusing it with some other instance
like that:
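The same idea can be sketched outside Kaitai Struct, in plain Python: walk the headers eagerly, remember where each payload starts, and read the payload only when it is first requested. This is not KS syntax or generated code, just an illustration; the record layout (a little-endian u4 size followed by that many payload bytes) and the names `LazyItem` and `index_items` are assumptions.

```python
import io
import struct

class LazyItem:
    """Remembers where a payload lives; reads it only on first access."""

    def __init__(self, stream, pos, size):
        self._stream = stream
        self.pos = pos        # saved stream position of the payload
        self.size = size
        self._data = None

    @property
    def data(self):
        if self._data is None:
            saved = self._stream.tell()
            self._stream.seek(self.pos)
            self._data = self._stream.read(self.size)
            self._stream.seek(saved)  # restore the caller's position
        return self._data

def index_items(stream):
    """Read headers only. Hypothetical layout: u4le size, then the payload."""
    items = []
    while True:
        header = stream.read(4)
        if len(header) < 4:
            break
        (size,) = struct.unpack("<I", header)
        items.append(LazyItem(stream, stream.tell(), size))
        stream.seek(size, io.SEEK_CUR)  # skip the payload without reading it
    return items
```

Indexing still has to visit every header in order, since the format has no offset table, but the large payloads are skipped with a seek rather than loaded.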
Petras Jokubauskas
@sem-geologist
Thanks. However, I have realized that the vendor program which produces the binary files is limited to 3GB, as it is 32-bit software. So I think I don't need a lazy method for this, as it overcomplicates things a bit.