2020-02-21 19:16:31 (edited by camlorn 2020-02-21 19:19:16)

@19
Along those lines, yes.

@21
You have to equalize the MIT dataset unless you use the diffuse-field-equalized ones they provide, and their equalization isn't very good.  The most basic way is to take the average of the power spectrum--the magnitude response squared--and divide by that, but that's what they did and it's meh.  One of the things I don't like about the MIT dataset is that they're kind of blasé about their methodology.  Good to know CIPIC is worthwhile; I've considered going that route, but their data is in a weird, undocumented format.  I may have a question for you regarding their coordinate system at some point; their papers are completely inaccessible.
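
A minimal NumPy sketch of that basic approach (my illustration, not MIT's code; a real pipeline would also smooth the average before dividing):

```python
import numpy as np

def diffuse_field_equalize(hrirs, fft_len=512):
    # hrirs: 2D array, one impulse response per row.
    # Average the power spectra (magnitude response squared) across all
    # measurement positions, then divide each spectrum by that average.
    spectra = np.fft.rfft(hrirs, n=fft_len, axis=1)
    avg_mag = np.sqrt(np.mean(np.abs(spectra) ** 2, axis=0))
    avg_mag = np.maximum(avg_mag, 1e-12)  # guard against divide-by-zero
    return np.fft.irfft(spectra / avg_mag, n=fft_len, axis=1)
```

After this, the average power spectrum of the whole set is flat, which is exactly why it's meh: it flattens the set on average without doing anything smarter per direction.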

@22/23
It wouldn't be graph based, no.  I'm thinking parallel (synth generators) -> source -> (serial chain of effects) -> (split to HRTF panner bank/global effects).  This gives at least a factor-of-2 performance increase over Libaudioverse, enough that fiddling with multithreaded work schedulers shouldn't be necessary, and possibly much more than a factor of 2 if SIMD counts for as much as I think.  That's on top of the factor of 2 or 3 from shortening/preprocessing the impulse responses.  You can get much higher performance with less than half the code--the MVP of this thing is probably 2000 lines at most; Libaudioverse is around 15k and unfinished.
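
Purely as an illustration of that signal flow (the names and structure here are mine, not the actual design):

```python
def render_source(generators, effects, panner, block_size):
    # Mix all generators in parallel into one block, run the sum through
    # a serial chain of effects, then hand the result to the panner.
    block = [0.0] * block_size
    for gen in generators:
        for i, sample in enumerate(gen(block_size)):
            block[i] += sample
    for effect in effects:
        block = effect(block)
    return panner(block)
```

The win over a general graph is that the shape is fixed, so there's no per-block scheduling or topological sorting, just straight-line loops that SIMD well.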

SteamAudio might be good.  I think they're Ambisonics, though.  A lot of the Ambisonics stuff turns into simulating 12-16 speakers, not full HRTF.  Full HRTF from Ambisonics would be the holy grail, but no one has both done it and explained how they did it.

The issue with WebAudio is that if you want to change how fast the footstep is playing, you need to change the length of that silence.  The other issue with WebAudio is that to do anything interesting with custom nodes, you have to make 2 or 3 copies of large audio assets.  By the time I was copying out of a buffer into an ArrayBuffer, truncating/extending it, then converting it back to a buffer, I was sad inside, and that's before we even talk about the part where I want to be able to yank audio out of the middle of assets so that you don't have to open Reaper every time you want to grab a snippet from some long audio file you got off freesound.org or whatever.  Also, any decent custom effects for WebAudio have to be Rust/C++ compiled to WebAssembly, and keep in mind that these sorts of copy-the-array manipulations have to happen at least in part on the UI thread.

That's not to say it doesn't beat BGT/Pygame/Bass/etc, because it does.  It just doesn't go as far as I want it to.

@24
I've given thought to surround sound, but surround sound is both easy from the aspect of there being tons and tons of things that'll work for it, and hard from my perspective in that I don't have a home theater or the space for one.

So: watch this space?  I've given thought to how to abstract this so that the panner is swappable, but it's in such an early stage that I have no idea what any of that looks like yet.

My Blog
Twitter: @ajhicks1992

2020-02-21 19:41:41

@26 Yeah, I ended up equalizing by hand. It took hours, and it hurt. A lot. smile

The format took some figuring out. I ended up converting it to tables in C, and it worked well even though the resulting file is a couple of megabytes.

Kind regards,

Philip Bennefall

2020-02-21 19:52:33

@27
The format can be read by Numpy, I'm almost sure.  The tricky thing is that I think they have some sort of weird elevation scheme, so getting from 360 degree azimuth to what they have is weird, or something like that?
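
Assuming the CIPIC files really are MATLAB .mat files, something like this should get the raw arrays out. The field names ('hrir_l', 'hrir_r') and the 25 azimuths x 50 elevations x 200 samples layout are from memory, so double-check them against the real data:

```python
import numpy as np
from scipy.io import loadmat, savemat

def load_cipic(path):
    # loadmat returns a dict of MATLAB variable name -> NumPy array.
    data = loadmat(path)
    return data["hrir_l"], data["hrir_r"]

if __name__ == "__main__":
    # Round-trip a fake subject file just to show the mechanics.
    fake = {
        "hrir_l": np.zeros((25, 50, 200)),
        "hrir_r": np.zeros((25, 50, 200)),
    }
    savemat("subject_fake.mat", fake)
    left, right = load_cipic("subject_fake.mat")
```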

Have a gist: https://gist.github.com/camlorn/83a0666 … dd5412be72

I took the two files of my HRTF Python thing and threw them back to back with comments; search for main.py to skip the first one.  That equalizes the MIT dataset if you extract it to the hardcoded path.  It's not complete, but it has the minimum phase formulas, passable equalization code (could be better, but passable), the bilinear interpolation formulas, and the Woodworth formula for computing the interaural time difference.  It's not really commented, but the prints give some idea what each step is.  Your mileage may vary depending on how much of the underlying math you know, but it's at least something until I have time to put together a blog post.
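
Of those, the Woodworth formula is compact enough to show inline. This is the standard textbook version with a commonly assumed head radius; the rear-hemisphere mirroring is the usual extension, not necessarily exactly what my script does:

```python
import math

HEAD_RADIUS = 0.0875    # meters; a commonly assumed average head radius
SPEED_OF_SOUND = 343.0  # m/s

def woodworth_itd(azimuth_rad):
    # Woodworth's spherical-head approximation:
    # ITD = (a / c) * (theta + sin(theta)), valid for 0 <= theta <= pi/2.
    # Rear azimuths are mirrored to the front; the sign convention for
    # left vs. right is left to the caller.
    theta = abs(azimuth_rad)
    if theta > math.pi / 2:
        theta = math.pi - theta
    return (HEAD_RADIUS / SPEED_OF_SOUND) * (theta + math.sin(theta))
```

At 90 degrees this comes out to roughly 0.66 ms, which is in the right ballpark for an average head.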

If you separate the files and install numpy, scipy, and pyo, it should be runnable.

My Blog
Twitter: @ajhicks1992

2020-02-21 19:53:44

Also, do note that if you use Python for your data processing, you can bring in jinja2 and pretty trivially write generators that turn Python lists of floats and/or NumPy arrays into C array variables.  I think the Libaudioverse code has one somewhere, but I don't have it handy.
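
The core of such a generator is tiny; with jinja2 you'd move the skeleton into a template instead, which pays off once you're emitting many tables. Names here are made up:

```python
def floats_to_c_array(name, values):
    # Render a sequence of floats as a C array definition.
    body = ", ".join(f"{v!r}f" for v in values)
    return f"static const float {name}[{len(values)}] = {{ {body} }};"
```

For example, floats_to_c_array("hrtf_table", [1.0, 2.5]) produces `static const float hrtf_table[2] = { 1.0f, 2.5f };`.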

My Blog
Twitter: @ajhicks1992

2020-02-21 19:59:56 (edited by philip_bennefall 2020-02-21 20:01:01)

Thanks for that! I'll check it out next time I dive back into the HRTF stuff; right now I'm a bit swamped with other things, and my existing implementation works well enough for my needs at the moment. As I say, though, I need to profile it properly before I consider releasing it. Honestly, if you put your stuff out under a liberal license that doesn't require binary attribution, I'll give it a spin and replace mine if yours is better.

Kind regards,

Philip Bennefall

2020-02-22 00:58:04

Repository is here: https://github.com/camlorn/synthizer

I'm using the Unlicense.  Hopefully this will work out without having to drag in things that require attribution; I know of at least public domain MP3 and Ogg libraries.  If I can find WAV and FLAC ones, we appear to be good.

My Blog
Twitter: @ajhicks1992

2020-02-22 11:26:36

There's dr_wav, dr_mp3 and dr_flac from the same author as miniaudio.

https://github.com/mackron/dr_libs/

Kind regards,

Philip Bennefall

2020-02-22 20:21:53

I would be happy to help any way I can to implement macOS support. I know Python, we learned C++ in my comp sci degree, and I have a math degree. I am quick at picking up formulas and algorithms. Let me know how I can help. A DM on Twitter is probably the best way to contact me; see my signature.

I don’t believe in fighting unnecessarily.  But if something is worth fighting for, then its always a fight worth winning.
check me out on Twitter and on GitHub

2020-02-23 17:28:21

The math will be portable, which is part of why I chose Clang.  Not the only reason; Clang generally tends to be ahead of everyone else for the latest C++ features.  In theory the audio backend is also portable.  Some of the I/O stuff may not be.

It isn't that porting this to Linux or Mac is particularly hard, it's that I don't have those systems or the motivation, and I also hate VoiceOver with the burning intensity of a thousand suns.  But actually doing at least Mac should be pretty easy, though Linux with ALSA was a really big challenge for Libaudioverse because it's barely documented and buggy.  Hopefully miniaudio just handles ALSA, however.

My Blog
Twitter: @ajhicks1992

2020-02-23 18:18:22

@camlorn, that is a very good idea. If I had a DLL that could bring 3D sound to BGT, I could make almost any game I want.

2020-02-23 20:54:02

I was right about the performance.  A naive cross-platform SIMD implementation of convolution seems to give a theoretical max of between 1000 and 6000 HRTF sources on my machine.  It will be less in practice because there's more to HRTF than just convolution--crossfading and so on; the list goes on.
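
For reference, the inner loop SIMD is accelerating is just direct-form FIR convolution; a scalar version of the math looks like this:

```python
def fir_convolve(signal, impulse):
    # Direct-form FIR: every output sample is a dot product of the
    # impulse response with a window of the input.
    out = [0.0] * (len(signal) + len(impulse) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse):
            out[i + j] += s * h
    return out
```

With the impulse responses shortened to a few hundred taps, that inner product vectorizes very well, which is where a lot of the headroom comes from.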

By comparison, Libaudioverse needed several cores to even get close to 1000.

Eventually I'll probably make a Synthizer thread, but so far so good.  Probably no audio output for a couple of weeks, however; there's a lot of groundwork yet.

My Blog
Twitter: @ajhicks1992

2020-02-24 17:57:19

Hi,
Will I be able to statically link this using the Mingw64 GCC compiler?
I'd rather not lug around an extra DLL in every project if I could avoid it.
Thanks.

2020-02-24 21:26:05

@37
Probably, but making sure that's the case isn't my priority.  If Mingw has Clang, or it can consume the output from clang-cl, or it can consume the output from Microsoft's linker, then it will probably always be possible one way or the other.  But the dll is more valuable by far, so I won't actively be working to preserve this capability (but I won't actively be working to get rid of it either).

My Blog
Twitter: @ajhicks1992

2020-02-25 00:28:47

Again @camlorn, just let me know if you want help with Mac-related stuff: building, testing, etc. I have a Mac and don't find VoiceOver such a dumpster fire, nor do I hate it with the burning intensity of a thousand suns as you say. It's by no means perfect, and I can think of at least 3 or 4 things off the top of my head that I'd like to add or change about VoiceOver, but it gets the job done.

I don’t believe in fighting unnecessarily.  But if something is worth fighting for, then its always a fight worth winning.
check me out on Twitter and on GitHub

2020-02-26 14:36:00

@camlorn: any news regarding this project, or was it abandoned or anything?

2020-02-26 16:14:40

I started it last weekend (see above for the repo link) and am doing a 9-to-5 programming job Monday through Friday.  Patience is required.  This isn't going to happen in a few days.  I know what needs to happen and there is a plan, but I only have weekends and some weeknights to work on it.  Keep in mind I spend all of every day Monday through Friday programming; it's not like I get off work with tons of creative energy, going "hey, I know, let's do what I just got done doing for 8 hours".

My Blog
Twitter: @ajhicks1992

2020-02-26 17:10:29

This is possibly the fastest abandonment assumption I've ever seen. At least wait a month since the last post by the author LOL.

2020-02-28 13:55:35

This is awesome. I was going to strongly suggest that you do it and thank you a lot at the end of my post, but now I just thank you a lot, as you have already started working on this project. I'm very confident that we're going to use it in our upcoming projects if it's as easy to use and as low-latency as you said. Plus, since the licensing complies with Lucia, we're gonna include it in Lucia too, provided its performance doesn't bring up weird issues like sound_lib--though I'm very confident that it's gonna be all right.
Keep it up. We're patiently waiting for your great work. Good luck.

---
Co-founder of Sonorous Arts.
Check out Sonorous Arts on github: https://github.com/sonorous-arts/
my Discord: kianoosh.shakeri2#2988

2020-03-03 02:43:35

@camlorn
Nice to see that it's getting somewhere, even though progress is slow.
I've starred it on GH, and as @kianoosh said in #43, we (well, me atm) would love to include it in Lucia if everything goes as planned (and it doesn't take multiple years to complete).
Would love to help out, but I have very limited knowledge of C/C++ programming.
I wish you the best of luck

If you like what I do, Feel free to check me out on GitHub, or follow me on Twitter

2020-03-03 03:17:53

Multiple years, hah, no.  I already almost have the world's most underwhelming media player.  There may be something to show for it by this time next week. Just depends how much free time I get.  Once the audio decoding infrastructure is finished, things should move at a pretty decent pace for a bit.  I've just not bothered posting progress here because it's beyond technical.  But to provide the quick list:

1. Finish the media decoding infrastructure to a basic level. That's in progress now.
2. Code a C++ source/HRTF panner bank architecture. I have a chunk of this in Python outside the repo waiting to be ported, some in Libaudioverse where I'm the sole copyright holder, and some in the repo already.
3. Write some really annoying code generators and implement a command/response architecture that I could go on about at length, the short version being that I want to be able to let game code not have to wait on locks *and* batch updates so that you don't hear the audio equivalent of tearing.

At 3 you can use it in something because there'll be a C++ API.  I'd call that an alpha, because it's where we find out if miniaudio is amazing or amazingly sucky.  I'm hoping for amazing, but if it's amazingly sucky then we might have a 3.5 where I have to write an audio backend which I really really don't want to have to do.  But then the fun starts (in rough priority order):

4. Implement a transparent buffer caching system.  We can hold audio assets in memory in an encoded format and we can take your sounds directory and load it in the background so that it's ready for you when you need it.
5. Reverb. Probably a feedback delay network. I expect several iterations of this, in order to build up to reverb zones and things like that but want this in the hands of people early.
6. Make sure that things are in order enough to make the protocol system public--that is, there are already the beginnings of a piece, already used internally, that lets you plug in your own asset pack format for encryption, fetching from the internet, etc.
7. Implement effects chains, per source and global (i.e. chorus, etc).
8. Investigate bringing EEL2 in for custom effects.  For the Reaper people, it turns out that the language you do JSFX in is open source, so anyone who knows that can write effects in theory.
8.5. Also maybe investigate bringing Lua in via LuaJIT.
9. Maybe work on raycasting to compute reverb.
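
For the curious, item 5's feedback delay network reduces to something like this toy version. Delay lengths, feedback amount, and the Hadamard mix here are illustrative, not what Synthizer will ship:

```python
import math

def make_fdn(delays, feedback=0.7):
    # Four delay lines; their outputs are mixed through a Hadamard matrix
    # (orthogonal when scaled by 1/2) and fed back, so any feedback < 1
    # keeps the loop stable while the echoes densify into reverb.
    lines = [[0.0] * d for d in delays]
    idx = [0] * len(delays)
    norm = 1.0 / math.sqrt(len(delays))

    def tick(sample):
        outs = [line[i] for line, i in zip(lines, idx)]
        wet = sum(outs) * norm
        mixed = [
            outs[0] + outs[1] + outs[2] + outs[3],
            outs[0] - outs[1] + outs[2] - outs[3],
            outs[0] + outs[1] - outs[2] - outs[3],
            outs[0] - outs[1] - outs[2] + outs[3],
        ]
        for k, line in enumerate(lines):
            line[idx[k]] = sample + feedback * norm * mixed[k]
            idx[k] = (idx[k] + 1) % len(line)
        return wet

    return tick
```

Building up to reverb zones would then presumably be mostly parameter control (delays, feedback, filtering) on top of a core like this.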

And then after that it's based off demand or my own usage or what have you.  For some idea as to the C API I have in mind:

syz_handle audio, source;
// could be http, my-magic-format, whatever.
syz_createGenerator(&audio, "file", "my_file.ogg", "");
syz_createSource(&source);
// TBD. Also the default.
// Third parameter is options, key1=value1 key2=value2. Almost never used.
// In future/from a higher level language you can parse urls to this.
syz_sets(source, "panning_strategy", "hrtf");
syz_setf3(source, "position", 5, 5, 0);
syz_addGenerator(source, file);
syz_command(file, "play");

The point being that anyone wanting to use this just has to bind seti/setf/setf3/sets, command, and then the general purpose library initialization stuff and the few things that sadly absolutely must be special cases.  No enums or anything like that.  Do one magic 20 line Python class and you're done, you can set all the properties for today and tomorrow as well.  Obviously I'm omitting error handling for the sake of expediency.
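
To make that concrete, the "magic 20 line Python class" would look something like this sketch. The syz_* callables stand in for real ctypes bindings, and every name here comes from the hypothetical API above:

```python
class SynthizerObject:
    # One entry point for property setting: dispatch on the Python type
    # to the matching C setter. No enums, no per-property bindings.
    def __init__(self, handle, lib):
        self.handle = handle
        self.lib = lib  # exposes syz_seti, syz_setf, syz_setf3, syz_sets

    def set(self, name, value):
        if isinstance(value, (bool, int)):
            self.lib.syz_seti(self.handle, name, int(value))
        elif isinstance(value, float):
            self.lib.syz_setf(self.handle, name, value)
        elif isinstance(value, str):
            self.lib.syz_sets(self.handle, name, value)
        elif isinstance(value, tuple) and len(value) == 3:
            self.lib.syz_setf3(self.handle, name, *value)
        else:
            raise TypeError(f"unsupported property type: {type(value)!r}")
```

Properties added to the library tomorrow work through the same class with zero binding changes, which is the whole point.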

If you're wondering "but strings are slow": yes, they can be.  Probably not too slow for this application. But through magic pointer fuckery, anyone who cares will be able to go through syz_magicFastString("foo") and get a magic value that is magic that they can magically use instead to be faster. Because it's magic, in case you didn't get the point. I probably won't actually name the function that.

My Blog
Twitter: @ajhicks1992

2020-03-03 17:46:25

Could you maybe add a way to output Windows SAPI through Synthizer so that silence can be trimmed from the beginning?
This looks really, really cool!

2020-03-03 18:08:40

Just noticed I dropped the file line from the above post: the comments about URL stuff were supposed to go with something along the lines of syz_loadAsset("file", "path", "");.

@46
Maybe. Problem is that doing that requires also forwarding to all the other methods that those interfaces provide, and they're daunting.

I might tackle speech at some point but if I do, I'm going to do it via reimplementing one of the screen reader speech libs under the unlicense.

Regardless it's not really my priority.

My Blog
Twitter: @ajhicks1992

2020-03-03 19:22:51

Or, you could also add a way to send raw audio samples--say 16-bit stereo at 44100 Hz.

2020-03-04 01:40:11

Raw audio samples are possible one way or the other, yeah.  That specific feature was a goes-without-saying sort of thing, in my book.  It might be the one thing that's not doable from BGT, however.  It depends on whether you're looking to just send everything at once and then close, or if you want to do more.  It's hard in the extreme to send samples interactively without either a thread or a function pointer, or both.

That said, the protocol thing is a protocol (i.e. file), a path (i.e. the protocol gets a string and can do whatever), and a set of options.  In theory this could eventually be extended enough to let you hook weirder things like SAPI in, though today I'm coding it on the assumption that protocols are opaque streams of bytes.

My Blog
Twitter: @ajhicks1992

2020-03-04 07:54:44 (edited by manamon_player 2020-03-04 07:55:33)

Please make a great 3D soundpool for BGT that works with 512 MB or 1 or 2 GB of RAM, too.