Standalone SDK

Voiceful lets you create new digital voice experiences for apps and services. It features speech and singing synthesis, voice transformation, pitch correction, time alignment, and audio-to-MIDI, among others.

Our native SDK can be integrated as a set of cross-platform C++ libraries for mobile (iOS/Android), desktop (macOS/Windows/Linux), or server applications. Contact us for an evaluation account.

TOOLKIT FEATURES

Voice generation | VoSyn

Our expressive voice generation approach, based on deep learning, was initially developed to generate artificial singing voices with high realism. It can learn a model from existing recordings of any individual and generate new speech or singing content.

Voice Transformation | VoTrans

We can transform an actor's voice into a monster vocalization for a film, change a male voice into a child's or an elder's voice, and integrate the transformation in real time in games, social apps, or music applications.

Voice Alignment and Pitch Correction | VoAlign

VoAlign analyzes and automatically corrects a voice recording without losing quality. It can align the recording to a reference for lip-syncing or ADR, or automatically apply pitch correction to an estimated musical key.
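The pitch correction described above rests on a simple core idea: snap the detected pitch to the nearest semitone of an equal-tempered scale (a full implementation would restrict the targets to the notes of the estimated key). A minimal sketch of that quantization step, independent of the Voiceful API:

```cpp
#include <cmath>

// Snap a detected frequency (Hz) to the nearest equal-tempered
// semitone and return the corrected frequency. A full pitch
// corrector would limit the target notes to the estimated key.
double quantizeToSemitone(double freqHz) {
    // MIDI note number: A4 = 440 Hz = note 69
    double midi = 69.0 + 12.0 * std::log2(freqHz / 440.0);
    double nearest = std::round(midi); // nearest semitone
    return 440.0 * std::pow(2.0, (nearest - 69.0) / 12.0);
}
```

For example, a slightly sharp 450 Hz is pulled back to A4 (440 Hz), while 460 Hz snaps up to A#4 (about 466.16 Hz).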

Voice Description | VoDesc

This voice analysis tool extracts acoustic and musical information from a voice recording. Data can be streamed in real time or exported in a readable format for applications such as visualization, classification, monitoring, or singing rating.
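One of the simplest acoustic descriptors such a tool reports is the short-term level of the voice. As a sketch of the idea (not the VoDesc API), the RMS level of one audio block:

```cpp
#include <cmath>
#include <cstddef>

// Root-mean-square level of one block of samples: a basic acoustic
// descriptor usable for level monitoring or visualization.
float blockRMS(const float* samples, std::size_t n) {
    if (n == 0) return 0.0f;
    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i)
        sum += static_cast<double>(samples[i]) * samples[i];
    return static_cast<float>(std::sqrt(sum / static_cast<double>(n)));
}
```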

Time-scaling and Pitch-shifting | VoScale

Beyond voice signals, Voiceful also includes high-quality time-scaling and pitch-shifting to process any audio content (music, field recordings, dialogue, etc.).
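As a point of reference for what a pitch-shifter computes: shifting by n semitones scales frequency by 2^(n/12), while time-scaling stretches duration without changing pitch. A small helper (illustrative only, not part of the Voiceful API):

```cpp
#include <cmath>

// Frequency ratio corresponding to a pitch shift of n semitones:
// +12 semitones doubles the frequency (one octave up), -12 halves it.
double semitonesToRatio(double semitones) {
    return std::pow(2.0, semitones / 12.0);
}
```

For instance, `semitonesToRatio(12.0)` is 2.0 and `semitonesToRatio(-12.0)` is 0.5.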

Mixing and FX | VoMix

VoMix works as a virtual DAW with all standard audio effects (autogain, compression, EQ, reverb, delay, panning, mixing, etc.) to deliver audio of professional-grade quality.
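Effects such as autogain and compression operate on the decibel scale; the conversion from a dB value to the linear multiplier actually applied to samples is 10^(dB/20). A one-line helper (illustrative, not the VoMix API):

```cpp
#include <cmath>

// Convert a gain in decibels to a linear amplitude multiplier:
// 0 dB -> 1.0 (unchanged), +20 dB -> 10x, -20 dB -> 0.1x.
float dbToLinear(float db) {
    return std::pow(10.0f, db / 20.0f);
}
```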

DOCUMENTATION

The example below shows how to integrate our SDK tools in C++.

  #include "VoTransLib_api.h"

  /***************************************************************
   C++ usage example for the VOICEFUL VoTrans Standalone SDK.
   The code below is incomplete: it lacks the audio I/O, which
   would be implemented with standard audio driver libraries or
   by reading from audio files on disk.
  ****************************************************************/

  // VoTrans object
  void* mVoTrans = NULL;

  // Parameters
  float gain_val = 0.9f;
  float pitch_val = 3.f;
  float mTimbreParameters[5] = { 0.5f, 0.6f, 0.7f, 0.55f, 0.5f };
  float vibratodepth_val = 0.2f;
  float vibratofreq_val = 0.3f;
  float robot_val = 0.0f;
  float alien_val = 0.0f;
  float pitchcorrection_val = 0.0f;
  float harmonizer_val = 0.0f;

  // INITIALIZATION OF THE LIBRARY
  int init()
  {
      int mnCh = 2;                // number of channels
      float mSampleRate = 44100.f; // sample rate in Hz
      int mBlockSize = 256;        // block size in frames

      mVoTrans = VT_Create();

      // CONFIGURATION
      int highQuality = 1;
      int noisy = 0;
      int bypass = 0;
      VT_Configure(mnCh, mSampleRate, mBlockSize,
                   highQuality, noisy, bypass, mVoTrans);
      VT_SetPreAnalysis(NULL, mVoTrans);
      VT_BeginProcess(mVoTrans);
      return 0;
  }

  // PROCESS (in a loop)
  // This function is the audio driver callback when used in real time,
  // or is called in a loop while reading from a disk file or memory
  // buffer when transforming samples offline.
  void ProcessCallback(float* inbuffer, float* outbuffer)
  {
      // Set the parameters (e.g. from the GUI)
      VT_SetGainParameter(gain_val, mVoTrans);
      VT_SetPitchTranspositionInSemitonesParameter(pitch_val, mVoTrans);
      VT_SetTimbreParameters(mTimbreParameters, mVoTrans);
      VT_SetVibratoDepthParameter(vibratodepth_val, mVoTrans);
      VT_SetVibratoFreqParameter(vibratofreq_val, mVoTrans);
      VT_SetRobotParameter(robot_val, mVoTrans);
      VT_SetAlienParameter(alien_val, mVoTrans);
      VT_SetPitchCorrectionParameter(pitchcorrection_val, mVoTrans);
      VT_SetHarmonizerParameter(harmonizer_val, mVoTrans);

      // inbuffer & outbuffer memory is allocated by the user;
      // VT_DoProcessFloat fills outbuffer with at most the configured
      // block size (e.g. mBlockSize = 256).
      int n = VT_DoProcessFloat(inbuffer, outbuffer, mVoTrans);
      (void)n; // number of processed samples, unused in this example
  }

  // DESTROY
  void end()
  {
      VT_EndProcess(mVoTrans);
      VT_Destroy(mVoTrans);
  }
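When transforming samples offline, ProcessCallback is driven by a loop over the input signal in blocks of mBlockSize frames. Below is a minimal sketch of such a driver loop, with a plain copy standing in for the SDK processing call so the snippet is self-contained:

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

static const std::size_t kBlockSize = 256; // matches the configured block size

// Stand-in for the per-block processing; in a real integration this
// would invoke the SDK's processing call on the two buffers.
void processBlock(const float* in, float* out, std::size_t n) {
    std::copy(in, in + n, out);
}

// Feed an in-memory signal to the processor one block at a time,
// exactly as an offline file-based integration would.
std::vector<float> processOffline(const std::vector<float>& input) {
    std::vector<float> output(input.size());
    for (std::size_t pos = 0; pos < input.size(); pos += kBlockSize) {
        const std::size_t n = std::min(kBlockSize, input.size() - pos);
        processBlock(input.data() + pos, output.data() + pos, n);
    }
    return output;
}
```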

Q&A

Q: In what form is the SDK delivered?
A: We can provide it as native C++ built from the existing code, or as a wrapper library for desktop (Windows, macOS, and Linux) or mobile (Android/iOS). We also provide an example with its source code to show the integration. For iOS we distribute an iOS Framework.

Q: Can each SDK tool be licensed individually?
A: Yes, each SDK tool (VoSyn, VoTrans, VoAlign, VoDesc, VoScale) is licensed individually, so only the required processing library needs to be integrated in the client's app.

Q: Is pitch correction available both offline and in real time?
A: Yes. Because the internal functionality differs, it is split across two tools: VoAlign for offline use and VoTrans for real time. VoAlign provides offline pitch- and time-correction, processing a complete vocal recording at once (e.g. a 20-30 second excerpt). It first carries out a musical-knowledge analysis step (key estimation, note analysis, etc.), which is then used in the signal-correction step. VoTrans provides the real-time version: there is no musical-knowledge analysis, and the pitch is quantized instantly, which may result in more rapid changes.

Q: Can VoSyn generate both speech and singing?
A: Yes, the same tool can generate artificial spoken or sung content. Only the input differs: for speech the input is text (or SSML), and for singing it is a music score.

Q: What input does VoSyn require to generate a vocal track?
A: VoSyn requires a digital musical score containing the musical notes and the lyrics. The required input format is MusicXML, which follows the notation of a traditional music score.

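For illustration, the core of such a score is a sequence of MusicXML note elements pairing a pitch and duration with a lyric syllable. A minimal fragment (a complete file also needs the score-partwise header, part list, and measure attributes):

```xml
<!-- One quarter note (C4) carrying the lyric syllable "la" -->
<note>
  <pitch>
    <step>C</step>
    <octave>4</octave>
  </pitch>
  <duration>1</duration>
  <type>quarter</type>
  <lyric>
    <syllabic>single</syllabic>
    <text>la</text>
  </lyric>
</note>
```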
Q: Who owns the generated voice?
A: When using VoSyn, the generated voice can be used for the purposes of the app, but users do not own the voice track. They do not have the right to use it commercially, broadcast it, or receive royalties without permission.

Q: Is there a desktop demo app for VoTrans?
A: Yes, we can provide a real-time GUI demo app (macOS/Windows) to evaluate VoTrans, letting you apply voice transformation effects in real time.

Q: Is there a mobile test app?
A: Yes! We have prepared an Android test app for evaluation, so you can try the user experience on different devices. Contact us!

Q: Can the transformation parameters be controlled programmatically?
A: Yes, the VoTrans SDK provides functions to control the transformation parameters programmatically (e.g. changing pitch shift, timbre, vibrato rate, etc.). Presets can also be stored so that they can be recalled directly within the app.

Q: Is real-time pitch correction part of VoAlign?
A: No, real-time pitch correction is part of VoTrans. With VoTrans you can test pitch quantization (correction) in real time.

Q: Can the musical key be selected?
A: Key selection is not implemented in the test app GUI; however, it can be selected when using the API directly.

No answer? Maybe you're interested in the Cloud API, or need custom development!

Our Cloud API is a RESTful API that can be easily integrated into web sites, mobile apps and other SaaS platforms.

We offer Custom Services to extend and customize our technologies for the specific needs of your project.

EVALUATION ACCOUNT

SDK tools are also available on our Cloud API for evaluation.

Request a free one-month evaluation account to obtain the Terms of Use and pricing information. (Accounts are limited to evaluation purposes and to professionals: companies, business individuals, or research institutions.)