Standalone SDK

Voiceful lets you create new digital voice experiences for apps and services. It features speech and singing synthesis, voice transformation, pitch correction, time alignment, and audio-to-MIDI, among others.

Our native SDK can be integrated as a set of cross-platform C++ libraries for mobile (iOS/Android), desktop (macOS/Windows/Linux), or server applications. Contact us for an evaluation account.

TOOLKIT FEATURES

Voice generation | VoSyn

Our expressive voice generation approach, based on deep learning, was initially developed to generate artificial singing voices with high realism. It can learn a model from existing recordings of any individual and generate new speech or singing content.

Voice Transformation | VoTrans

We can transform an actor's voice into a monster vocalization for a film, change a male voice into a child's or an elder's voice, and integrate the transformation in real time in games, social apps, or music applications.

Voice Alignment and Pitch Correction | VoAlign

VoAlign analyzes and automatically corrects a voice recording without losing quality. It can align the recording to a reference for lip-syncing or ADR, or automatically apply pitch correction to an estimated musical key.
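The pitch correction described above rests on a simple core idea: snap the detected pitch to the nearest semitone of an equal-tempered scale (a full implementation would restrict the targets to the notes of the estimated key). A minimal sketch of that quantization step, independent of the Voiceful API:

```cpp
#include <cmath>

// Snap a detected frequency (Hz) to the nearest equal-tempered
// semitone and return the corrected frequency. A full pitch
// corrector would limit the target notes to the estimated key.
double quantizeToSemitone(double freqHz) {
    // MIDI note number: A4 = 440 Hz = note 69
    double midi = 69.0 + 12.0 * std::log2(freqHz / 440.0);
    double nearest = std::round(midi); // nearest semitone
    return 440.0 * std::pow(2.0, (nearest - 69.0) / 12.0);
}
```

For example, a slightly sharp 450 Hz is pulled back to A4 (440 Hz), while 460 Hz snaps up to A#4 (about 466.16 Hz).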

Voice Description | VoDesc

This voice analysis tool extracts acoustic and musical information from a voice recording. Data can be streamed in real time or exported in a readable format for applications such as visualization, classification, monitoring, or singing rating.
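One of the simplest acoustic descriptors such a tool reports is the short-term level of the voice. As a sketch of the idea (not the VoDesc API), the RMS level of one audio block:

```cpp
#include <cmath>
#include <cstddef>

// Root-mean-square level of one block of samples: a basic acoustic
// descriptor usable for level monitoring or visualization.
float blockRMS(const float* samples, std::size_t n) {
    if (n == 0) return 0.0f;
    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i)
        sum += static_cast<double>(samples[i]) * samples[i];
    return static_cast<float>(std::sqrt(sum / static_cast<double>(n)));
}
```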

Time-scaling and Pitch-shifting | VoScale

Beyond voice signals, Voiceful also includes high-quality time-scaling and pitch-shifting to process any audio content (music, field recordings, dialogue, etc.).
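As a point of reference for what a pitch-shifter computes: shifting by n semitones scales frequency by 2^(n/12), while time-scaling stretches duration without changing pitch. A small helper (illustrative only, not part of the Voiceful API):

```cpp
#include <cmath>

// Frequency ratio corresponding to a pitch shift of n semitones:
// +12 semitones doubles the frequency (one octave up), -12 halves it.
double semitonesToRatio(double semitones) {
    return std::pow(2.0, semitones / 12.0);
}
```

For instance, `semitonesToRatio(12.0)` is 2.0 and `semitonesToRatio(-12.0)` is 0.5.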

Mixing and FX | VoMix

VoMix works as a virtual DAW with all standard audio effects (autogain, compression, EQ, reverb, delay, panning, mixing, etc.) to deliver audio of professional-grade quality.
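Effects such as autogain and compression operate on the decibel scale; the conversion from a dB value to the linear multiplier actually applied to samples is 10^(dB/20). A one-line helper (illustrative, not the VoMix API):

```cpp
#include <cmath>

// Convert a gain in decibels to a linear amplitude multiplier:
// 0 dB -> 1.0 (unchanged), +20 dB -> 10x, -20 dB -> 0.1x.
float dbToLinear(float db) {
    return std::pow(10.0f, db / 20.0f);
}
```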

DOCUMENTATION

The example below shows how to integrate our SDK tools in C++.

  #include "VoTransLib_api.h"

  /***************************************************************
   C++ usage example for the VOICEFUL VoTrans Standalone SDK.
   The code below is incomplete: it lacks the audio I/O, which
   would be implemented with standard audio driver libraries or
   by reading from audio files on disk.
  ****************************************************************/

  // VoTrans object
  void* mVoTrans = NULL;

  // Parameters
  float gain_val = 0.9f;
  float pitch_val = 3.f;
  float mTimbreParameters[5] = { 0.5f, 0.6f, 0.7f, 0.55f, 0.5f };
  float vibratodepth_val = 0.2f;
  float vibratofreq_val = 0.3f;
  float robot_val = 0.0f;
  float alien_val = 0.0f;
  float pitchcorrection_val = 0.0f;
  float harmonizer_val = 0.0f;

  // INITIALIZATION OF THE LIBRARY
  int init()
  {
      int mnCh = 2;                // number of channels
      float mSampleRate = 44100.f; // sample rate in Hz
      int mBlockSize = 256;        // block size in frames

      mVoTrans = VT_Create();

      // CONFIGURATION
      int highQuality = 1;
      int noisy = 0;
      int bypass = 0;
      VT_Configure(mnCh, mSampleRate, mBlockSize,
                   highQuality, noisy, bypass, mVoTrans);
      VT_SetPreAnalysis(NULL, mVoTrans);
      VT_BeginProcess(mVoTrans);
      return 0;
  }

  // PROCESS (in a loop)
  // This function is the audio driver callback when used in real time,
  // or is called in a loop while reading from a disk file or memory
  // buffer when transforming samples offline.
  void ProcessCallback(float* inbuffer, float* outbuffer)
  {
      // Set the parameters (e.g. from the GUI)
      VT_SetGainParameter(gain_val, mVoTrans);
      VT_SetPitchTranspositionInSemitonesParameter(pitch_val, mVoTrans);
      VT_SetTimbreParameters(mTimbreParameters, mVoTrans);
      VT_SetVibratoDepthParameter(vibratodepth_val, mVoTrans);
      VT_SetVibratoFreqParameter(vibratofreq_val, mVoTrans);
      VT_SetRobotParameter(robot_val, mVoTrans);
      VT_SetAlienParameter(alien_val, mVoTrans);
      VT_SetPitchCorrectionParameter(pitchcorrection_val, mVoTrans);
      VT_SetHarmonizerParameter(harmonizer_val, mVoTrans);

      // inbuffer & outbuffer memory is allocated by the user;
      // VT_DoProcessFloat fills outbuffer with at most the configured
      // block size (e.g. mBlockSize = 256).
      int n = VT_DoProcessFloat(inbuffer, outbuffer, mVoTrans);
      (void)n; // number of processed samples, unused in this example
  }

  // DESTROY
  void end()
  {
      VT_EndProcess(mVoTrans);
      VT_Destroy(mVoTrans);
  }
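When transforming samples offline, ProcessCallback is driven by a loop over the input signal in blocks of mBlockSize frames. Below is a minimal sketch of such a driver loop, with a plain copy standing in for the SDK processing call so the snippet is self-contained:

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

static const std::size_t kBlockSize = 256; // matches the configured block size

// Stand-in for the per-block processing; in a real integration this
// would invoke the SDK's processing call on the two buffers.
void processBlock(const float* in, float* out, std::size_t n) {
    std::copy(in, in + n, out);
}

// Feed an in-memory signal to the processor one block at a time,
// exactly as an offline file-based integration would.
std::vector<float> processOffline(const std::vector<float>& input) {
    std::vector<float> output(input.size());
    for (std::size_t pos = 0; pos < input.size(); pos += kBlockSize) {
        const std::size_t n = std::min(kBlockSize, input.size() - pos);
        processBlock(input.data() + pos, output.data() + pos, n);
    }
    return output;
}
```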

Q&A

Q: In what form is the SDK delivered?
A: We can provide it as native C++ built from the existing code, or as a wrapper library for desktop (Windows, macOS, and Linux) or mobile (Android/iOS). We also provide an example with its source code to show the integration. For iOS we distribute an iOS Framework.

Q: Can each SDK tool be licensed individually?
A: Yes, each SDK tool (VoSyn, VoTrans, VoAlign, VoDesc, VoScale) is licensed individually, so only the required processing library needs to be integrated in the client's app.

Q: Is pitch correction available both offline and in real time?
A: Yes. Because the internal functionality differs, it is split across two tools: VoAlign for offline use and VoTrans for real time. VoAlign provides offline pitch- and time-correction, processing a complete vocal recording at once (e.g. a 20-30 second excerpt). It first carries out a musical-knowledge analysis step (key estimation, note analysis, etc.), which is then used in the signal-correction step. VoTrans provides the real-time version: there is no musical-knowledge analysis, and the pitch is quantized instantly, which may result in more rapid changes.

Q: Can VoSyn generate both speech and singing?
A: Yes, the same tool can generate artificial spoken or sung content. Only the input differs: for speech the input is text (or SSML), and for singing it is a music score.

Q: What input does VoSyn require to generate a vocal track?
A: VoSyn requires a digital musical score containing the musical notes and the lyrics. The required input format is MusicXML, which follows the notation of a traditional music score.

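For illustration, the core of such a score is a sequence of MusicXML note elements pairing a pitch and duration with a lyric syllable. A minimal fragment (a complete file also needs the score-partwise header, part list, and measure attributes):

```xml
<!-- One quarter note (C4) carrying the lyric syllable "la" -->
<note>
  <pitch>
    <step>C</step>
    <octave>4</octave>
  </pitch>
  <duration>1</duration>
  <type>quarter</type>
  <lyric>
    <syllabic>single</syllabic>
    <text>la</text>
  </lyric>
</note>
```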
Q: Who owns the generated voice?
A: When using VoSyn, the generated voice can be used for the purposes of the app, but users do not own the voice track. They do not have the right to use it commercially, broadcast it, or receive royalties without permission.

Q: Is there a desktop demo app for VoTrans?
A: Yes, we can provide a real-time GUI demo app (macOS/Windows) to evaluate VoTrans, letting you apply voice transformation effects in real time.

Q: Is there a mobile test app?
A: Yes! We have prepared an Android test app for evaluation, so you can try the user experience on different devices. Contact us!

Q: Can the transformation parameters be controlled programmatically?
A: Yes, the VoTrans SDK provides functions to control the transformation parameters programmatically (e.g. changing pitch shift, timbre, vibrato rate, etc.). Presets can also be stored so that they can be recalled directly within the app.

Q: Is real-time pitch correction part of VoAlign?
A: No, real-time pitch correction is part of VoTrans. With VoTrans you can test pitch quantization (correction) in real time.

Q: Can the musical key be selected?
A: Key selection is not implemented in the test app GUI; however, it can be selected when using the API directly.

No answer? Maybe you're interested in the Cloud API, or need custom development!

Our Cloud API is a RESTful API that can be easily integrated into web sites, mobile apps and other SaaS platforms.

We offer Custom Services to extend and customize our technologies for the specific needs of your project.

EVALUATION ACCOUNT

SDK tools are also available on our Cloud API for evaluation.

Request a free one-month evaluation account to obtain the Terms of Use and pricing information. (Accounts are limited to evaluation purposes and to professionals: companies, business individuals, or research institutions.)