This post was originally made in 2021. Since then, I've been talking with a friend about Vocaloid, and how much I'd love to use it. He showed me hands on that it's possible, and with his help, I managed to throw together a makeshift virtual instruction pamphlet of sorts. These steps are aimed at the V4 Editor, but they'll probably work with the V3 Tiny Editor as well. Also, read the resources in post 39, because they give some pretty useful keyboard shortcuts. My original post is at the end.
Configuration
When you first launch the editor, it'll tell you that no audio device is set or found or something like that. Press OK and set your playback device. I can't remember if this is the preferences window or a specialized property page just for setting audio settings. If this is just an audio settings dialogue, just press OK, press alt to go into the menu bar, find settings, then preferences. Once you're in preferences, or if you were already in preferences, go through all the tabs and choose your preferred settings. I like the idea of configuring pitch bends and vibrato manually, so I would uncheck Automatic Vibrato Detect. If you're used to Reaper, check return to start position when stopped. On the other settings tab I would set time out for software update check at start up to never, because I don't think Yamaha will be updating Vocaloid anymore for as long as humans continue to inhabit the universe. Last and most important, be sure to enable sound previews. This single feature will turn Vocaloid from being completely unusable to mostly usable.
Once you’re done setting your preferences, press OK. Now you want to go back into the settings menu and find customize shortcuts. I can’t stress this enough. Go through every single category and make sure you customize everything you think you’ll need. In particular, make sure moving to previous/next note, jumping to first and last note, and preview selected note all have shortcut keys. These keys are very important.
Midi Creation
Now that you set everything up the way you want, it's time to make your midi file. To my knowledge, making a song directly in Vocaloid is inaccessible, so use your DAW of choice to do this.
The way you do this next part depends on what language you're making your song in. If you're making a song in Japanese, each symbol takes up one note, so you can't have nai (ない) as a single note. Rather than having one whole note for nai (ない), have 2 notes right next to each other, both half the length of the note you were going to create. Make the first note na (な), and make the second note i (い). Also, you’re supposed to be able to write the small tsu glottal stop symbol in Romaji by typing a question mark (?) but I couldn’t get it to work. You’ll have to use the actual symbol, which is “つ”. If you want to copy it from my post, don’t copy the quotation marks surrounding the symbol, because those will probably break the engine.
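If it helps to see the note-splitting arithmetic written out, here's a rough Python sketch. It isn't part of Vocaloid or any DAW. The function name split_note and the idea of measuring notes in ticks (480 per beat is a common MIDI resolution, but yours may differ) are my own assumptions for illustration.

```python
def split_note(start, length, syllables):
    """Split one note into equal back-to-back notes, one per syllable.

    start and length are in ticks (an assumed unit); syllables is a list
    such as ["na", "i"]. Returns (start, length, syllable) tuples.
    """
    if not syllables:
        raise ValueError("need at least one syllable")
    step = length // len(syllables)
    notes = []
    for i, syllable in enumerate(syllables):
        is_last = i == len(syllables) - 1
        # The last note absorbs any leftover ticks so the total length is unchanged.
        this_length = length - step * i if is_last else step
        notes.append((start + step * i, this_length, syllable))
    return notes


# A whole note (4 beats at 480 ticks per beat) holding "nai" becomes two half notes.
print(split_note(0, 1920, ["na", "i"]))
```

You would still enter the resulting notes by hand in your DAW. The sketch only shows where each note starts and how long it lasts.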
English works a bit differently. One syllable does fit on one note. If you're making an English song, remember to make one note for each syllable. Also, to my knowledge, English doesn’t have a glottal stop symbol.
Universally, regardless of language, the hyphen symbol (-) will stretch the syllable over to the next note. This is good for vocal runs, because instead of having 5 notes that go "a a a a a", you can do "a - - - -" instead. Just so you know though, this doesn’t work too well for English voices.
Finally, if you want to have breath sounds, those aren’t automatic. You have to add a single note for that. I wouldn’t make a breath note any longer than a beat or shorter than half a beat if I were you, because I heard that can cause weird glitches.
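As a quick sanity check on breath note lengths, here's a small Python sketch of the rule of thumb above. The function name and the 480 ticks per beat default are assumptions on my part. The half-beat and one-beat limits are just the range I suggested, not anything official.

```python
def breath_length_ok(length_ticks, ticks_per_beat=480):
    """Return True if a breath note is between half a beat and one beat long."""
    # Limits come from the rule of thumb in this post, not from Yamaha documentation.
    return ticks_per_beat / 2 <= length_ticks <= ticks_per_beat


print(breath_length_ok(480))  # exactly one beat
print(breath_length_ok(960))  # two beats, longer than suggested
```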
Importing Your Midi into Vocaloid
Once you finish your midi, export it as a midi file. Now go to your Vocaloid Editor and press Ctrl + O to open a Vocaloid project. Instead of a Vocaloid project though, you'll want to change the files of type to Vocaloid midi, then select the midi file you just made.
It will ask you if you want to pan your multi-channel out to separate tracks. Choose yes, then import just one channel from that list. You can import multiple tracks, but you'll only want to have one, as adding more destroys accessibility even more.
Setting Your Singer
Now that you imported the midi file, it puts you in the main interface. Pressing Ctrl + Tab will move you between the Musical Editor and Track Editor. Make sure you’re in the Track Editor, then press Ctrl + A to select all. Press Alt + P for the Parts Menu, and press Y to go into the Part Properties dialogue box. From here, you can select the singer you want, and you can also turn on Pitch Snap to give your song a more autotuned sound.
Setting Your Lyrics
Now, press Ctrl + Tab to focus on the Musical Editor. Select all, press Alt + J for the Job Menu, then press S for the Insert Lyrics dialogue. Remember the tips I gave above for writing lyrics. Press enter or tab to the OK button and press that when you’re done.
Now that you’re back in the Musical Editor, press the top of file key (should be Ctrl + Home), then press the Spacebar or Enter key to listen to your masterpiece. If it sounds good on your first try, awesome! If you mistyped or forgot a part of your lyrics just go back to the Insert Lyrics window to fix it. If you missed a note in your midi, you’ll have to fix the midi in your DAW, scrap your current Vocaloid project, reopen the midi in the editor, and do everything mentioned afterwards all over again. Once you’ve taken care of fixing all the mismatched lyrics, we can continue.
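Since a missed note means redoing the whole import, it can be worth counting your lyrics against your notes before you paste them in. Here's a minimal Python sketch. It assumes your lyrics are separated by spaces, one syllable (or hyphen) per note, like the examples in this post. The function name check_lyrics is made up for illustration.

```python
def check_lyrics(lyrics, note_count):
    """Return how many lyric tokens you have beyond the number of notes.

    0 means they match, a positive number means too many syllables,
    and a negative number means too few.
    """
    # Hyphens count as tokens because each one occupies a note of its own.
    tokens = lyrics.split()
    return len(tokens) - note_count


print(check_lyrics("a - - - -", 5))  # a five-note vocal run
print(check_lyrics("na i", 3))       # one syllable short
```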
Setting the Tempo
If your lyrics match properly, it's fairly likely that your chosen tempo won't be 120 BPM, which is the default. Before, it was thought that to edit the tempo, you had to edit the VSQX project file in a text editor. This part of the interface can be tricky, but it's perfectly doable. There's an awesome thing called a job plugin that can do any number of cool effects and tasks with Vocaloid. Setting tempo is supposed to be one of those things, but if you use the tempo job plugin, it messes up the file in some pretty nasty ways. Job plugins are out of the scope of this post for right now, but suffice it to say they're Lua scripts that interface with Vocaloid to do some pretty cool things. Since they won't really help us right now, I will move on.
If you're using NVDA, setting the tempo is fairly simple. You'll have to use object navigation to accomplish this. Your focus should be in either the Musical Editor or Track Editor, depending on where you chose to go. Stop interacting with the window by pressing NVDA + Numpad 8 on the desktop keyboard layout, or NVDA + Shift + Up-Arrow on the laptop keyboard layout. Go right several times with either NVDA + Numpad 6 or NVDA + Shift + Right-Arrow. You should pass by a few things, including a few dialogues. Go into the second dialogue you see. You should see a few buttons, then eventually you should see an area that says 120.00. Use NVDA + Numpad Divide or NVDA + Shift + M to move the mouse pointer to the current navigator object, then press Numpad Divide or NVDA + Left-Bracket ([) twice quickly to double click this text. It should bring you into a global tempo set window. The default tempo is 120.00. Simply change the BPM to whatever you wish, then press enter.
If you're unable to change tempo through the interface, either because it doesn't work with other screen readers, something's broken, or you just want to change it manually, you can use a text editor to change the tempo as well. Save your project somewhere (Ctrl + S to save), then close out Vocaloid completely. Find the file you just saved in your file explorer of choice, then open it with something like Notepad or Notepad++.
With your VSQX file opened in your text editor, use the find command to search for “12000” without the quotes. This number 12000 actually represents 120.00 BPM, so all you have to do is keep the two zeros at the end and type the BPM you want in front of them. For example, if you want 180 BPM, make sure the number is 18000.
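If you'd rather script that find and replace, here's a rough Python sketch of the same idea. It treats the VSQX as plain text and swaps the number, exactly like the manual method. The function names are my own, and it assumes the only standalone 12000 in your file is the tempo. If some other value happens to be 12000 it will get changed too, so keep a backup of the project.

```python
import re


def bpm_to_vsqx(bpm):
    """Convert a BPM like 180 or 128.5 to the stored form (BPM times 100)."""
    return str(round(bpm * 100))


def set_tempo(vsqx_text, new_bpm, old_bpm=120.0):
    """Replace every standalone old tempo number with the new one.

    Returns (new_text, number_of_replacements). The digit checks on both
    sides stop it from touching a longer number that merely contains 12000.
    """
    old = bpm_to_vsqx(old_bpm)
    new = bpm_to_vsqx(new_bpm)
    pattern = re.compile(r"(?<!\d)" + re.escape(old) + r"(?!\d)")
    return pattern.subn(new, vsqx_text)


# Made-up sample text, not a real VSQX excerpt.
sample = "tempo 12000 other 112000"
print(set_tempo(sample, 180))
```

To use it on a real project, read the file into a string, pass it through set_tempo, check that the replacement count is what you expect (probably 1), then write the result to a copy of the file rather than over the original.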
After you change the tempo, save the file and close the text editor. Reopen the Vocaloid Editor, then reopen the VSQX. Make sure you’re in the Musical Editor again.
Cosmetic Changes
For a first timer your song probably sounds decent, but if you want it to flow better, there’s a few things you’ll want to change, like turning your breath notes into actual breaths if you added any or turning a Japanese lyric like shitei into shtei or something like that.
Jump to the first note (I believe the default key for that is home). Then preview it with your preview selected note shortcut key. You will have to push the preview shortcut every time you want to hear the note your cursor is on. Your screen reader will not report any of the notes, and the notes won’t automatically preview when you use the arrow keys to move between them either.
For simplicity’s sake, let’s pretend your lyrics are in Japanese, you’re writing in Romaji, and you wrote “ha wa ta shi ha tsu ne mi ku de su (I’m Hatsune Miku)”. I know; grammatically incorrect and very unoriginal lyrics, but I’m trying to keep this simple. So, the ha at the beginning is going to be your breath sample, you want “Watashi” to be pronounced “Watash”, and you want “desu” to be pronounced as “des”.
Adding a breath phoneme
Make sure that you’re on the first note by pressing the preview key. If you’re on the first note, you should hear “Ha”. Press Alt + L to go into the Lyrics Menu, and press P to enter Note Properties. Shift Tab until you land on an edit box. In this case, you should see “h a” in the box. To change this to a breath, you type br, then any number between 1 and 5. For example, if you wanted the fifth breath sound, you would type “br5”.
Press tab, and your screen reader should say something like “Protect checkbox checked”. At this point, you have to press down arrow to find the OK button. Also, you have to press enter on the button and not space, because for some reason, pressing any key other than enter forces you back to the protect checkbox.
Now that you changed the note to a breath, press your preview shortcut. If you don’t like your breath sound, you can always go back into the Note Properties window and change the number at the end in the phoneme box to whatever you want, as long as it’s between 1 and 5.
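For reference, the breath phoneme rule described above (br followed by a number from 1 to 5) can be written as a tiny Python check. The function name is made up, and the 1 to 5 range is just what I've seen work, so treat it as a sketch rather than an official list.

```python
import re


def is_breath_phoneme(text):
    """Return True for br1 through br5, the breath phonemes described in this post."""
    return re.fullmatch(r"br[1-5]", text) is not None


print(is_breath_phoneme("br5"))
print(is_breath_phoneme("br6"))
```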
Chopping off Vowel Sounds With Phonemes
Now, it’s time to change the “shi” in “Watashi” to “sh”. Press right arrow followed by the preview key to go to the next note. Keep doing this until you hear “shi”. Once you find it, go into Note Properties again, and back into the edit box. You should see “S i”. Just take “ i” out of there, so you only have “S”. Now find the OK button.
Finally, let’s turn the “su” at the end into “s”. Press the end key to find the last note, press the preview key, then reenter Note Properties if you’re on the right note. Go back to the edit box. You should see “s M”. Just delete the “ M” at the end, then find OK.
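Both of those edits do the same thing: they remove the last phoneme from the box. Here's that step as a short Python sketch, assuming phonemes are separated by spaces the way they appear in the edit box. The function name is hypothetical, and you still have to type the result into Note Properties yourself.

```python
def drop_last_phoneme(phonemes):
    """Remove the final phoneme, e.g. the vowel at the end of a syllable."""
    parts = phonemes.split()
    if len(parts) < 2:
        # Nothing to trim if there is only one phoneme (or none).
        return phonemes
    return " ".join(parts[:-1])


print(drop_last_phoneme("S i"))  # the "shi" in "Watashi"
print(drop_last_phoneme("s M"))  # the "su" in "desu"
```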
Reviewing and Completing Your Work
Now that you’re done editing phonemes, jump to the top of the file again, then press the Spacebar to start playing the file. You should now hear “*breath* watash Hatsune Miku des”.
Now that you have finished your glorious masterpiece of a song that is guaranteed to earn you $1000000000 in streaming revenue, it’s time to save. Press Ctrl + S. If you want, you can now export as Wav by pressing Ctrl + Shift + Alt + S.
Final Notes
As far as I’m aware, having more than one track is completely impossible for blind people to work with, and I don't know why. You'll have to work on each track using a separate Midi and VSQX file.
Previously, I thought there was a glitch that involved navigating between notes. Sometimes you won’t move at all, and I thought you had to spam left and right arrow a few times to get it to move. What's actually happening is that you can't move notes while the current note is previewing. Wait until the note stops playing, then navigate to the next note.
I found a tutorial about phoneme input on the Vocaloid Fandom Wiki if you need guidance on doing more creative things with your phonemes. HERE’S A LINK TO THE PAGE.
That’s pretty much everything. I still haven’t figured out how to use pitch bends and other expressive settings properly, so I’m hoping @Nuno and @Luyi still have the info about it so they can post it here. If anyone has any more info or if I missed something, please do let me know and I’ll edit this post with the necessary information. I hope this helps someone. All the best!
Original Post
I've been curious about these things for a while, and ever since I've listened to Vocaloid covers on YouTube and heard how good they are, I wanted to try it. Does anyone have experience with either of these programs? I tried Utau but I couldn't figure out where to start. I wanted to try Vocaloid, but I couldn't find a demo or trial version to test it out. I checked the price and saw it was about 300 USD, which is a little out of my budget. Even if it wasn't, I'd rather not spend money on something until I'm used to it or I find I like it. I heard about things like Piapro and the Vocaloid fandom but I don't know if they let you in unless you bought Vocaloid first. I also figured I'd ask here because I don't hear a whole lot about Vocaloid from other blind people, so I wasn't sure of its accessibility status. Do you guys use it? Did you like it? Let me know.