2020-04-26 22:35:02

This is the Synthizer thread, where I update from time to time with progress and try to answer questions.

For those who don't know, Synthizer is my third attempt at solving the 3D audio problem.  I've got several more years of experience under my belt from Libaudioverse and I know what I did wrong there, but unfortunately those mistakes make finishing Libaudioverse harder than doing this.  So I'm doing a game-specific library that drops a lot of the Libaudioverse "you can build any sort of synthesizer" features in exchange for speed and for being maintainable around my day job.

For those who do know, this isn't dead.  It's maybe 5000 lines of C++ and Python for HRTF and wave file reading.  I've been plugging away every weekend, but I didn't have progress to show until now.  As I said, this is certainly an undertaking.

To cut to the chase, have a demo.  This is Jewel Thieves from Sound Image going in a clockwise circle around your head.  It's not perfect: there's still a little bit of bugginess with the panning and some artifacts, and going through Virtual Audio Cable certainly didn't help.  But it's better than (what used to be and I think still is) OpenALSoft's default settings and it's certainly better than Libaudioverse.

So: to answer the obvious questions.  Elaboration below.  It will probably be usable from BGT.  Not by design, just because the design of this overlaps well with what BGT can call.  You can't use it yet.  I estimate 3 weeks until you can.  It's very, very, very fast and I haven't even optimized it yet.  Reverb isn't here but will be coming.  You can't customize the HRTF but you will be able to customize some parameters, most notably the speed of sound and the radius of the listener's head.  The only supported file format right now is wav, but MP3, Ogg, and Flac are coming.
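To give a feel for why the speed of sound and the head radius are the interesting knobs, here is a minimal sketch of Woodworth's spherical-head approximation for interaural time difference, a common textbook model. It is only an illustration, not necessarily what Synthizer computes; the function name and the default values (8.75 cm head radius, 343 m/s) are assumptions.

```python
import math

def woodworth_itd(azimuth_rad, head_radius_m=0.0875, speed_of_sound_m_s=343.0):
    """Interaural time difference in seconds for a distant source.

    Woodworth's spherical-head model: ITD = (r / c) * (theta + sin(theta)),
    where theta is the azimuth (0 = straight ahead, pi/2 = directly to one side).
    The defaults are illustrative assumptions, not Synthizer's values.
    """
    if not 0.0 <= azimuth_rad <= math.pi / 2:
        raise ValueError("azimuth must be between 0 and pi/2 radians")
    return (head_radius_m / speed_of_sound_m_s) * (azimuth_rad + math.sin(azimuth_rad))

# A source directly to one side arrives at the far ear a bit over half a millisecond late.
delay_at_side = woodworth_itd(math.pi / 2)
```

A larger head radius or a lower speed of sound both stretch the delay, which is why exposing those two values changes how wide the panning feels.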

License is the Unlicense and I'm taking care to only use permissively licensed components, so you can do whatever with it without even crediting me.  I ask that you do credit me, but you don't have to per the license.

Now the long form:

The current state of the project is such that there's a file wav_test.cpp which grabs a bunch of internal components, mashes them together, and produces HRTF.  I tackled that first since if I can't do all the things Libaudioverse didn't do to make this sound good and be fast, it's not worth continuing.  The HRTF probably isn't done, but it's to the point of proven: there's some automated Python that runs the MIT Kemar dataset and produces some C files, and a bunch of C to run it.

Wav is implemented because it's the easiest, but code is in the repository to implement Flac, MP3, and Ogg.  It's just not wired up as of yet.  MP3 is now patent free, so we get that, and a bunch of people have written helper libraries that aren't libsndfile so we get all of these formats while maintaining the ability to statically link the library and producing only one DLL if you use it dynamically.  The system to manage file I/O is extensible, and will be able to read from anything.  HTTP streaming, custom DRM, and possibly even streaming realtime encoding from someone else's machine are all on the table, albeit not implemented.
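One way to picture an extensible I/O layer is a registry that maps a protocol name to a function returning a byte stream. The sketch below is a guess at the general shape only; `register_protocol`, `open_stream`, and the "memory" protocol are hypothetical names for illustration, not Synthizer functions.

```python
import io

# Hypothetical registry: protocol name -> function(path, options) returning a binary stream.
_protocols = {}

def register_protocol(name, opener):
    """Make a new kind of byte source available under the given protocol name."""
    _protocols[name] = opener

def open_stream(protocol, path, options=""):
    """Look up the protocol and let its opener produce a file-like object."""
    try:
        opener = _protocols[protocol]
    except KeyError:
        raise ValueError(f"unknown protocol: {protocol!r}") from None
    return opener(path, options)

# Plain files on disk.
register_protocol("file", lambda path, options: open(path, "rb"))

# An in-memory source, standing in for things like decrypted assets or network data.
_memory_blobs = {"beep": b"RIFF....WAVE"}
register_protocol("memory", lambda path, options: io.BytesIO(_memory_blobs[path]))

stream = open_stream("memory", "beep")
header = stream.read(4)
```

The decoder only ever sees a stream with `read`, so adding HTTP or a custom container would mean registering one more opener rather than touching the decoding code.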

I know how to do reverb, and when we get to that point it shouldn't take me long.  A bunch of DSP components toward that exist.  The initial version of that won't support reverb zones or anything, but there's a good algorithm in Libaudioverse that I can port over.

Unfortunately we don't currently have surround sound, and we likely won't get it either.  I'm no longer living in a big house with a home theater I can borrow.  Once this is further along, if someone with the gear to test and the knowledge to write the algorithms properly offers to contribute I'm certainly open to making it happen, but I live in an apartment where my living room is also my kitchen and my office so even if I wanted to spend that kind of money I couldn't put it anywhere (however: if you buy me a house with space for this, I also promise to do surround sound).

The next task is taking what I have and turning it into a C API and a Python package.  This will take a couple of weeks because I only have weekends and I need to abstract things just so.  How that will look is something like this very undocumented and wildly incorrect pseudocode:

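// Pseudocode only: these names and signatures are placeholders, not a final API.
// Open bla.mp3; file_handle ends up being the generator that is linked and started below.
syz_createTable(&file_handle, "file", "bla.mp3", "");
// Create a source to play it through.
syz_createSource(&source_handle, file_handle);
// Attach the generator to the source, then start it.
syz_linkGenerator(source_handle, file_handle);
syz_generatorStart(file_handle);
// then in a loop or somewhere, you call the function to set a vector of f3 given a property...
syz_setf3(source_handle, SYZ_SOURCE_POSITION, x, y, z);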

Unlike Libaudioverse and OpenALSoft it will be possible to batch commands like a database transaction so that you don't get halfway done changing sources and then it decides to update.  One of the places that a lot of libraries for this fall down (including OpenALSoft) is that all the commands grab a mutex or something and it's fast but only until you start calling into it all the time.  Batching also opens up a few things like streaming Synthizer commands over the network if you want, but primarily it's for performance.
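The batching idea can be sketched as follows: the game thread collects commands in a local list, hands the whole list over in one step, and the audio thread applies complete batches between audio blocks. Everything here is hypothetical (`CommandQueue`, `batch`, `apply_pending`), and Python's `queue.Queue` takes a lock internally, so this shows the transaction shape rather than the lock-free performance a real implementation would be after.

```python
import queue
from contextlib import contextmanager

class CommandQueue:
    """Hypothetical sketch: commands reach the audio side one whole batch at a time."""

    def __init__(self):
        self._pending = queue.Queue()
        self.positions = {}  # state that only the "audio thread" touches

    @contextmanager
    def batch(self):
        commands = []
        yield commands.append
        # Only reached if the with-body finished without raising,
        # so a failed batch is dropped entirely, like a rolled-back transaction.
        if commands:
            self._pending.put(commands)

    def apply_pending(self):
        """Called by the audio thread between blocks; returns the number of batches applied."""
        applied = 0
        while True:
            try:
                commands = self._pending.get_nowait()
            except queue.Empty:
                return applied
            for source, position in commands:
                self.positions[source] = position
            applied += 1

commands = CommandQueue()
with commands.batch() as push:
    push(("footsteps", (1.0, 0.0, 0.0)))
    push(("door", (0.0, 2.0, 0.0)))
# Nothing is visible to the audio side until it picks up the finished batch.
batches_applied = commands.apply_pending()
```

Because the audio side never sees a half-filled batch, it cannot update after the first source moved but before the second one did.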

And that brings me to the last point that I think is worth bringing up: how fast is it?  I don't know exactly yet because it's kind of unfair to benchmark it without having a proper file reader and stuff like that, but the above demo is live streaming a file from disk and applying HRTF in a debug build of C++, and (according to task manager and very unscientific measurements) is taking about a second to synthesize a minute of audio for 4 sources (you always synthesize at least 4 sources; if you only do 3 there's a silent fourth there).  So in theory a debug build gets you into the hundreds of sources in realtime.  That won't be the case in practice because debug builds are horrible once abstracted, but I fully expect to see upwards of 1000 sources on a single core when this is done, possibly much more.  I've run an OpenALSoft benchmark on my machine, and it gets 1600 sources per core.  So preliminary indications are that this competes with OpenALSoft even as-is, but we'll find out in the longer run if that lasts.  For the curious, you can thank Clang's vector extensions for easy SIMD stuff.
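The "into the hundreds" figure is just the numbers above multiplied out: one second of CPU for sixty seconds of audio is 60 times realtime, and that was for 4 sources at once. A small sketch of the arithmetic (the function name is made up for illustration, and it assumes cost scales linearly with source count):

```python
def realtime_source_budget(seconds_of_audio, seconds_of_cpu, sources_rendered):
    """Estimate how many sources fit in realtime on one core, assuming linear scaling."""
    realtime_factor = seconds_of_audio / seconds_of_cpu
    return realtime_factor * sources_rendered

# About a second of CPU for a minute of audio with 4 sources.
estimate = realtime_source_budget(60.0, 1.0, 4)  # 240.0
```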

For people who want to see the code, it's here.

My Blog
Twitter: @ajhicks1992

2020-04-27 09:13:02

Thanks Camlorn, finally an engine that really makes 3D sound heard. I hope the Python wrapper will arrive soon.

2020-04-27 10:11:05

Yeah, I also can't wait for a Python wrapper.

best regards
never give up on what ever you are doing.

2020-04-27 15:15:46

Will be following this. Thanks for the effort.

If you like what I do, Feel free to check me out on GitHub, or follow me on Twitter

2020-04-27 16:56:28

Hi.
Last time I tried OpenAL I thought that I saw some filters which you can apply to the effects. Are you planning to add those, or is reverb going to be the only thing you add besides HRTF?

2020-04-27 18:23:36

@5
I'm doing the most important things first, but I plan for this to go as far as allowing DSP-capable people to write their own effects.  I don't think that I will be matching OpenAL parameters however, both because OpenAL actually provides way more than is useful and also because for something like echo it's convenient to have some other arrangements that let you do things like the Shades of Doom thing for hallways.  There will be parameters, and plenty of them, but probably not the ones in the OpenAL manual.  In general literally 50+ sliders for your effect which the hardware can implement however it wants is only useful to masochists.

The signal graph for a source will eventually go as follows, each stage feeding into the next:

One or more generators which output sound.
A set of effects that apply to all generators on a source, one after another.
A set of effects that bypass panning and route directly to speakers, for example echo.
HRTF/stereo/no panning, reverb (with reverb zones), other spatialization stuff.
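As a rough picture of that chain, here is a toy version operating on plain lists of samples. It is a sketch under stated assumptions: `render_source` is a hypothetical name, effects are simple functions from a list of samples to a list of samples, and a linear stereo pan stands in for the HRTF/panning stage.

```python
def render_source(generators, source_effects, direct_effects, pan):
    """Toy model of one source's signal graph.

    generators: list of equal-length lists of mono samples.
    source_effects / direct_effects: functions mapping a sample list to a sample list.
    pan: -1.0 (hard left) to 1.0 (hard right); a stand-in for HRTF.
    Returns (left, right) sample lists.
    """
    # Stage 1: generators produce sound and are summed.
    mono = [sum(samples) for samples in zip(*generators)]

    # Stage 2: effects applied to the whole source, one after another.
    for effect in source_effects:
        mono = effect(mono)

    # Stage 3: effects that bypass panning go equally to both speakers.
    direct = [0.0] * len(mono)
    for effect in direct_effects:
        direct = [d + w for d, w in zip(direct, effect(mono))]

    # Stage 4: panning of the dry signal (linear pan as a placeholder).
    left_gain = (1.0 - pan) / 2.0
    right_gain = (1.0 + pan) / 2.0
    left = [s * left_gain + d for s, d in zip(mono, direct)]
    right = [s * right_gain + d for s, d in zip(mono, direct)]
    return left, right

def halve(samples):
    return [s * 0.5 for s in samples]

left, right = render_source([[1.0, 0.0], [1.0, 2.0]], [halve], [], pan=1.0)
```

With the source panned hard right and no direct effects, everything ends up in the right channel; adding `halve` as a direct effect would put half the signal into both channels regardless of the pan.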

The first stage gets implemented first, the last stage is halfway done.  Then comes the C API and Python package. Then reverb (but not reverb zones).  Then a more comprehensive manual and a synthizer.github.io or something for the manual.  Then after that I prioritize depending on what's easy: for instance simple filters are exposing something I've already got, as is echo, but chorus and nonlinear stuff like waveshapers are decidedly harder and full-on custom effects have too many open design questions for me to even estimate the effort yet.


2020-05-24 04:41:00 (edited by camlorn 2020-05-24 04:42:23)

We now have:

A basic C API. Example here.  This is fragile; in particular there's at least one concurrency related thing that I haven't found yet, which seems to be causing rare crashes/freezes.  If you want to try it synthizer.h is the functions, synthizer_constants.h is the constants, and synthizer_properties.h is a DSL that shows you what the properties on various things are.

CI infrastructure for Windows.  Build status is here.

And finally, build artifacts.  This is way too early for an official release to be worth doing yet, but every CI build on Appveyor gives you a zip file containing the most recent versions of the public headers and 64-bit versions of the library in all combinations of debug/release static/dynamic named as synthizer[d]_[static].lib and synthizer[d]_[static].dll (so dynamic release is synthizer.lib/synthizer.dll).
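Read literally, that naming pattern can be spelled out like this. It is only my reading of the pattern: the helper name is made up, and I am assuming the underscore belongs to the optional "static" part, since the dynamic release build is plain synthizer.

```python
def library_stem(debug, static):
    """Build the file stem from the synthizer[d]_[static] pattern (assumed reading)."""
    stem = "synthizer"
    if debug:
        stem += "d"
    if static:
        stem += "_static"
    return stem

# All four combinations of debug/release and static/dynamic.
stems = [library_stem(debug, static) for debug in (False, True) for static in (False, True)]
```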

I'll work out 32-bit artifacts eventually.  The library works on 32-bit machines if you build it, but it's not so trivial to make Appveyor do both without putting in more time than I want to put in at the moment.

All of this is literally so untested that I haven't even tried playing more than one source yet, to be clear; while playing more than one source is interesting, being able to give things to people is also interesting, especially when giving things to you now just means committing something and doing git push.  After I write python bindings, hopefully this weekend but we'll see, I'm going to put together some sort of testbed where you can walk around with playing sources.  That'll knock a lot of the obvious bugs out.  But be warned: here there be incredibly alpha-quality software.

The one notable limitation is that we still only support wav.  That's going to be changing, probably in the order mp3, flac, ogg as I have time.  There are also obvious improvements needed to the HRTF if you start playing with elevation.  But as with the lack of thorough testing, it's a question of priorities: getting HRTF working, Python bindings, etc. are higher priority than other audio types at the moment.


2020-05-24 05:01:05

Wow, cool!
Will there be a PureBasic wrapper for this?
It's my native programming language and the sound3d in it is less than optimal. You can't even tell the difference between sounds behind you and sounds in front of you.

2020-05-24 05:08:48

@8
If you or someone else writes and maintains it, yes. Otherwise no.  Even if I had a particular interest in non-mainstream languages, if I start being responsible for all the bindings, I won't have time to write the library itself.


2020-05-24 05:45:32

Hi.

@Camlorn, so, for the time being, will it only work with C, or will it work with Python?

Also, since it's C, will it work in BGT? Just out of curiosity.

You ain't done nothin' if you ain't been cancelled
_____
I'm working on a playthrough series of the space 4X game Aurora4x. Find it here

2020-05-24 06:09:25

@10
The short answer is yes for now to BGT and Python is coming.

If by work with Python you mean "could I get out ctypes and use this by writing my own wrapper" then yes, but I suspect what you want is pip install synthizer, which we are close to.
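For anyone wondering what "getting out ctypes" looks like in practice, the usual shape is: describe the C prototype, call it with an out-parameter, and turn the returned error code into an exception. The sketch below uses a Python function wrapped as a C callback as a stand-in for the real DLL so that it runs anywhere; the prototype, the names, and the "0 means success" convention are assumptions for illustration, not taken from synthizer.h.

```python
import ctypes

# Hypothetical C prototype: int create_thing(uint64_t *out_handle);
PROTOTYPE = ctypes.CFUNCTYPE(ctypes.c_int, ctypes.POINTER(ctypes.c_uint64))

def _fake_create(out_handle):
    """Stand-in for the native function: writes a handle and reports success."""
    out_handle[0] = 42
    return 0

# With a real library this would instead be an attribute of ctypes.CDLL(...),
# with argtypes and restype set to match the header.
create_thing = PROTOTYPE(_fake_create)

class NativeError(Exception):
    """Raised when the native call returns a nonzero error code (assumed convention)."""

def create_handle():
    handle = ctypes.c_uint64(0)
    code = create_thing(ctypes.byref(handle))  # pass a pointer so C can fill it in
    if code != 0:
        raise NativeError(code)
    return handle.value

new_handle = create_handle()
```

A hand-written wrapper is mostly this pattern repeated once per function in the header.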

BGT is more complicated.  One of the limitations of BGT is that the thing for calling C can only "speak" about half of what C actually has to offer.  The designs I have in mind for much of Synthizer avoid going outside what BGT can handle, but I'm not going to make any particular effort to keep BGT working, and if the choice ends up being between "offer a nice C API" and "BGT works" I'll pick a nice API every time.  Also one of the big limitations of BGT is that you can't use function pointers, so things like custom effects, handling logs, and a variety of other sorts of "respond to the interesting thing" sorts of stuff won't be on the table ever.


2020-05-24 07:48:59

hi
Just out of curiosity, what language is this program made in?
Also, if you provide a readme showing how to wrap the functions and their variable types, I could quite easily make a wrapper. I just need the fundamentals. I'm not sufficiently knowledgeable about C headers and don't really know what they are, though I think PB can import them as functions.

2020-05-24 09:23:11

Is this dll in a usable state as of yet?


2020-05-24 14:23:29 (edited by amerikranian 2020-05-24 14:37:23)

Post 12: Google it, please! It’s not that hard
Also, it’s going to be a lot more complicated than just wrapping a few simple functions.

2020-05-24 16:15:10

@12
The language is C++, with a C API.

You can probably call it from Purebasic, but I can't just answer a couple questions and have you doing it.  You'll need to learn to read C headers and figure out how Purebasic maps them.  But since Purebasic is basically C with a different syntax and added weirdness, it almost certainly can do it, and it almost certainly can do it easily.

Alternatively wait until this is easier to use and stable and then use it as your excuse to finally switch to mainstream stuff, helping audiogames.net enter the 21st century at long last.

But even if I put the functions in the readme, the prototypes would just be the same thing as in the C header, because that's just how this works.  There's no "easier" syntax or something that I'm hiding from you where I use words and then it somehow maps to the Purebasic manual.  C is the least common denominator when it comes to what programming languages can talk to, and part of learning to write bindings (for anyone and any language, not just you and Purebasic) is learning just enough C to handle reading a C header.

There will be a manual though, probably using mdbook.  Just not yet.  Like with many other things, it comes down to write the library or write the manual.  This is also early enough that if I did decide to write the manual, lots of it would probably change before I start making firm compatibility guarantees.

@13 and anyone else who wants to ask the "is this DLL usable", please read post 7 where I lay out what the state of it is and what the limitations are.  I'm not going to copy/paste that down here.


2020-05-24 16:16:49

@14
It actually is just calling a few functions.  Probably more than people are expecting, but the current header only has about 15 and the linked example only uses 6 or 7 of them, at least if my it's-early-in-the-morning brain is remembering properly.


2020-05-24 16:49:06

Any chance this may not need the latest and greatest LLVM at some point? I'm a bit worried about projects' likelihood to adopt this as a dependency if it only builds under 10.

For example, the Godot folks desperately need a better audio subsystem than what they have now, and I'd like to try writing a Godot addon for use in my own projects maybe in a few months. I can require LLVM 10 for my own addon, but if I've proven out the concept and want to pitch it upstream, it'll probably need to be a bit less strict on dependencies.

If the answer is "that's how it is in alpha and we can fix that later," that's fine. Just want to make sure it isn't LLVM 10 or nothing forever.

Thanks for writing this.

2020-05-24 17:42:34

@17
It's C++17, maybe with C++20 features.  Trying to use older C++ versions or to specifically track the required C++ standard is a massive maintenance burden in the specific case of this being a weekend project that I still want to move on quickly.  The recent C++ standards have been fixing gaping ergonomics holes in the language and they've already saved me at least a man-day of effort.

It probably compiles under GCC with a little work to make CMake aware that GCC exists, but I say Clang because on Windows getting GCC to work is an epic headache and even if you do I expect it still falls over because of CMake being weird.  Clang 9 probably still works and probably will work for a while, and maybe I can get Appveyor to specifically build with it to prove that that's the case.

MSVC won't work for a while because in MSVC doing SIMD requires using architecture-specific extensions rather than just declaring some vector types, meaning that future plans to target AVX/AVX2 would require writing multiple versions of everything, plus yet another version for ARM.  I could, and I will eventually if there's demand, but it's nontrivial and right now it should compile to something decent on all Clang-supported architectures, even the weird ones.

If someone big comes to me and says "We need xyz compiler versions" I will make xyz compiler versions happen because that makes my time more valuable, if that makes sense, but I do not relish the idea of using C++11 or C++14 *and* writing multiple algorithms because of MSVC stupidity at the moment.  Also before anyone like Godot can be interested there's the 5.1 and 7.1 surround sound problem, namely that I'm in an apartment and there's no way you're getting a home theater in here, and that's in many ways bigger than the compiler issue since it needs someone with domain specific knowledge to do it.


2020-05-24 17:55:09

Thanks for the explanation. Clang 9 should be sufficient--I thought 10 was a hard requirement.

Regarding 5.1 setups, I snagged a Vizio 45" 5.1 soundbar on Amazon for $250 or so a while back. The bar itself is the most complicated bit, but it sits on my desk in front of my monitor and takes up very little depth. The wall speakers are 1X4 inches or so, and the sub has a flat profile and can either stand on its side or lay flat under furniture. The sub is also wireless, so you can position it away from your desk, then hook the rear speakers into that. There are command line utilities and Python APIs for controlling the thing. Sound quality isn't great, but good enough for my purposes.

Anyhow, I know that costs money and takes effort, so I'm not saying go out and get one. Just noting that this setup takes up far less space than the 5.1 receiver and more traditional system I have in the living room. If you've got enough desk space for the 2X2X45 bar, and about 2X12X18 for the sub either on its side or flat on the floor, soundbars might be an option. Amazon also sells smaller Vizio bars, but I went for the 45.

2020-05-24 18:28:19

@19
Problem is that I only have 2 walls because this is a nice open floor plan where the kitchen and the living room sort of flow into one another.  It is actually very nice, that's not sarcasm, but one of the speakers would be literally on the microwave and that's only one of the problems.

Money isn't the issue.  The issue is that if it's not good quality audio gear with the speakers at the right angles, someone with good quality audio gear and the speakers at the right angles will rightly start reporting bugs--"What do you mean you can't hear xyz audio artifact and man this panning is way off" sorts of things.  So right now I'm punting it down the road.  It'll either be me working out some home theater thing that I haven't invented yet or some sort of math-driven test suite.  Or a volunteer, of course.

Libaudioverse got this because I used to have access to a home theater.  But since Libaudioverse has become my "what was I thinking" project, just lifting the algorithm from there probably only goes so far.


2020-05-24 23:19:58

Fixed the lifetime issues.  I think it won't freeze anymore, but no promises.  Next up is Python but as usual with software the other stuff this weekend took more time than I'd have liked, so it might not happen for a week or so.


2020-05-26 00:36:02

We have Python bindings. You can pip install synthizer.  It probably runs on your machine, but maybe not, because I haven't tested it on anything but mine yet and have spent almost all of the last 3 days coding as fast as possible to get us over this hurdle. Incredibly minimal example here.

It supports anything 3.6 and later.  I'm not doing Python 2.

Now that said, you can stream wav to a source and move the source in a circle and that's it.  Also you probably have to read the bindings source code itself to figure out how to use the thing and if you don't use the initialization context manager and forget to shut synthizer down your app hangs on exit.  So, usable, no.  But it exists and I can release packages.

Unfortunately I don't have 32-bit builds for you.  I should, but something goes wrong in CI that I will have to fix, and that fixing will take several hours to a day.  For those who haven't done CI before, basically every tiny change means waiting 5-plus minutes to find out if it worked.  Since 32-bit is vanishingly rare now, I'm leaving this one until there's enough demand and we have something worth using and less it-technically-exists, plus if it comes down to it I can do releases by hand for major versions if I must.

From here it mostly turns into me adding features until it's worthwhile.  Fortunately most of that is 2-3 hours each and not a bunch of frameworky stuff like getting set up to do releases or doing lockfree concurrency.  Things should finally move faster in other words.  In all honesty you could use this in a game after mp3 and a proper 3d source as opposed to one you control directly, which are probably about 3 hours of work each, and a manual as to how to use it, which is probably another 2-3 hours, but this weekend is coming to a close and if I hack the rest of it away I won't be in the right mindset to work tomorrow so it'll probably have to wait.


2020-05-26 00:57:27

@camlorn, will you add in the option to load bytestreams? You might have done so already, I haven't looked at this fully yet. Asking for the purposes of encrypting one's assets.

2020-05-26 01:22:17

@23
Yes I will be, but no it's not there yet. As I said this is very early alpha quality stuff that doesn't even quite have the "I can put a source in 3d" bit done.

Also, though this isn't the place, you can't encrypt assets effectively no matter what you do, in any language or with any library. People can rip stuff out of BGT easily for one thing.


2020-05-26 05:29:23 (edited by visualstudio 2020-05-26 05:38:56)

@22, something regarding your bindings:
First, to initialize your C-level members, use __cinit__, and to clean them up, use __dealloc__.
Also use __cinit__ in your exceptions, since you are initializing C variables there.
Instead of str, you can use char*, which is native in C, or if you want Unicode, Py_UNICODE*, which on Windows becomes wchar_t*.
I'm saying this after only a quick look at the bindings.
thanks.