Yeah, and to fill in the blanks a bit more: don't try to code up complex logic for every single swipe. Start with one, either left or right. Using the pointers above, take the mouse position, detect only that one swipe, and make TTS say something like "Swipe right detected" whenever you make that gesture. Then do it again and again until it's detected way more often than not. Then move on to another. I'm hoping you have TTS working, because you'll need something to quickly tell you what's going on in your world. Console debugging may not be quick enough. That's how I ended up doing it for godot-accessibility. I started by trying to detect everything, then realized I was overwhelmed because I had no idea how many pixels made sense, and I was getting false positives in all the wrong directions at once.
Anyhow, start simple. The swipes have to feel natural, and the only way to know if they do is to add them one at a time.
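
For what it's worth, here's a rough sketch of the "one swipe first" idea in plain Python. To be clear, this is not the godot-accessibility code, just an illustration. The names `detect_swipe_right`, `on_release`, and `speak` are made up, `speak` is a stand-in for whatever TTS call you have, and the pixel thresholds are guesses you'd tune by repeating the gesture.

```python
from typing import Tuple

Point = Tuple[float, float]

# Hypothetical starting thresholds, in pixels. These are guesses, not
# recommended values; tune them by making the gesture over and over.
MIN_DISTANCE = 100.0  # how far right the pointer must travel
MAX_DRIFT = 60.0      # how much vertical wobble is tolerated


def detect_swipe_right(start: Point, end: Point,
                       min_distance: float = MIN_DISTANCE,
                       max_drift: float = MAX_DRIFT) -> bool:
    """Return True only for a mostly horizontal left-to-right movement."""
    dx = end[0] - start[0]  # positive means the pointer moved right
    dy = end[1] - start[1]  # vertical movement, either direction
    return dx >= min_distance and abs(dy) <= max_drift


def speak(text: str) -> None:
    """Stand-in for a TTS call (assumption); here it just prints."""
    print(text)


def on_release(start: Point, end: Point) -> bool:
    """Call with the press and release positions of one gesture."""
    if detect_swipe_right(start, end):
        speak("Swipe right detected")
        return True
    return False
```

You'd record the position on press, call `on_release` with the position on release, and listen for the announcement. Only once that one check feels reliable would I copy it and flip the sign for a left swipe.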