I'm using LipSync on my Gear VR project, but it's taking up about 25% of my frame time when decoding the phonemes for the blendshapes. It looks great, but it drops frames on my S6. On my S7 it doesn't drop frames, but the sound can become choppy with lots of pops and cracks. I suspect this is due to how much performance is gobbled up by the dynamic sound analyzer.
Are there any optimizations planned for the LipSync SDK? It hasn't been updated in a while. One thing I can think of is the ability to generate lipsync files in the editor to go with each sound effect. That way it can just play back pre-set blendshapes per frame instead of doing it on the fly. Maybe an editor tool where you select a bunch of audio files, and it generates a lipsync file for each one with the same name -- something like the sketch below.
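Something like this editor skeleton is what I'm picturing (the BakeLipSyncFile call is made up -- it just stands in for whatever offline analysis the SDK would have to expose):

```csharp
using System.IO;
using UnityEngine;
using UnityEditor;

// Hypothetical batch baker: select AudioClips in the Project view and
// generate one .lipsync file per clip, named to match the audio asset.
public static class LipSyncBaker
{
    [MenuItem("Assets/Generate LipSync Files")]
    private static void GenerateForSelection()
    {
        foreach (Object obj in Selection.GetFiltered(typeof(AudioClip), SelectionMode.Assets))
        {
            AudioClip clip = (AudioClip)obj;
            string clipPath = AssetDatabase.GetAssetPath(clip);
            string outPath = Path.ChangeExtension(clipPath, ".lipsync");

            // Run the phoneme analyzer over the whole clip once, offline,
            // and write the per-frame viseme weights next to the audio asset.
            // BakeLipSyncFile(clip, outPath);  // made-up API, see note above
            Debug.Log("Would bake " + clipPath + " -> " + outPath);
        }
        AssetDatabase.Refresh();
    }
}
```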
The only time I'd need dynamic lipsync is for voice chat, which my game does not have.
We are looking at updating ovrLipSync for better phoneme detection, which will give much smoother-looking lipsync. We are also planning optimizations that will reduce the amount of CPU used by the library.
Unfortunately, the technology is quite CPU intensive; running more than one voice stream on Android may not be feasible right now.
We have had discussions around generating lipsync files. For Unity, it's a matter of saving off the generated visemes while the animation plays and replaying them later. Included with ovrLipSync is a C# file, ovrLipSyncContextSequencer, which was used to capture keystrokes and serialize them into PlayerPrefs. You might want to take a look at that script and use it to record the lipsync values for playback (which is guaranteed to be much faster than the current version 🙂 ). Keep in mind, if you do, that the audio track is slightly delayed to stay in sync with the generated visemes (the Process functions return this delay as frameDelay, in milliseconds).
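Very roughly, the record-once/replay-later idea looks like this. This is a sketch only: GetVisemeWeights() is a placeholder for however you read the current viseme array out of your lipsync context, since the exact accessor varies between plugin versions.

```csharp
using System.Collections.Generic;
using UnityEngine;

// Sketch: run the expensive analysis once in "record" mode, capture the
// viseme weights each frame, then replay them on later runs for nearly
// zero CPU cost.
public class VisemeRecorder : MonoBehaviour
{
    public SkinnedMeshRenderer face; // mesh with one blendshape per viseme
    public bool record;              // true: analyze live and capture; false: replay

    private readonly List<float[]> recorded = new List<float[]>();
    private int playbackFrame;

    void LateUpdate()
    {
        float[] weights;
        if (record)
        {
            weights = GetVisemeWeights();            // live analysis (expensive)
            recorded.Add((float[])weights.Clone());  // capture for later replay
        }
        else
        {
            if (playbackFrame >= recorded.Count) return;
            weights = recorded[playbackFrame++];     // replay (nearly free)
        }

        // Drive one blendshape per viseme channel; Unity blendshape
        // weights run 0-100, viseme weights 0-1.
        for (int i = 0; i < weights.Length; i++)
            face.SetBlendShapeWeight(i, weights[i] * 100f);
    }

    private float[] GetVisemeWeights()
    {
        // Placeholder: pull the current frame's viseme weights from your
        // lipsync context here.
        return new float[15];
    }
}
```

If you serialize the recorded frames out to a file, you'd want to store a timestamp per frame rather than relying on matching frame rates, and offset the audio start by the frameDelay mentioned above so the baked track lines up with undelayed audio on replay.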
In the meantime, I will open up discussion about possibly generating these sequence files to make it easier for developers, and I will post a note on the forum when we have updated to a better and faster lipsync lib.
Thanks -- I ended up decreasing my scene complexity a little, playing with the CPU throttling, and got the lipsync to fit in a frame. But yeah -- I think for most games, precalculated lipsync tracks would work fine. I'll check out the context sequencer... but if you could include a precalculate option in the next version, that would be great. The lip sync looks fantastic when it's not dropping frames!
Just an update on my adventures in LipSync with Unity. It's pretty much unusable on the S7 -- I guess because of throttling issues? Using CPU 1 / GPU 3 works fine on the S6 (while using GPU skinning to move that load over to the GPU). On the S7 this is a stuttering mess -- I tried CPU 2 / GPU 3 on the S7 and it still stutters a bit due to the CPU consumption of the lipsync system.
I'm investigating alternatives, but there's really only one viable Unity lipsync plugin on the Asset Store.
I'd *reeaaaaalllly* appreciate a baked phoneme file feature for your system. It works so well otherwise and seems like a relatively simple feature to add.