AutoSync Modules

Modules are the core components of AutoSync3. Each module is a self-contained system that does some work on an input LipSyncData clip, adding, removing or modifying data, then returns the clip with those changes applied.

Using A Module

While the simplest way of using AutoSync involves Presets, there are times when you may want a greater level of control. From the AutoSync menu in the Clip Editor, you'll see a "Modules" option. Under this submenu, you'll find all the available modules in your project. Clicking one of these will open the AutoSync Window and automatically add the chosen module to the list.

You can also just open the window and add any modules yourself, or select a preset to see all the pre-configured modules it contains. More detailed info on using this window is contained in the AutoSync Window page.

Module Requirements

Most modules will need some data to work on, whether that's an AudioClip and/or Transcript for a module that generates phoneme markers, or existing markers for something like the Marker Cleanup Module. In order to ensure all modules in a process will be able to run, and can build on the work of previous modules, each one defines a list of features that it needs, and a list of features that it will provide.

These features are defined by the ClipFeatures enum, and are as follows:

  • AudioClip - An AudioClip, usually containing dialogue, assigned to the clip.

  • Emotions - One or more emotion markers added to the timeline.

  • Gestures - One or more gesture markers added to the timeline.

  • Phonemes - One or more phoneme markers added to the timeline.

  • Transcript - A text transcript of the dialogue, accessible from the Clip Settings Window.

This allows the system to determine, before starting, whether a set of modules will be able to run, based on the features already present on the clip and the order of the modules in the set. If the clip doesn't meet a module's requirements, you will have to either provide the missing clip feature(s) yourself or insert a module into the list, ahead of that module, that provides them.
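The check described above can be sketched in a few lines. This is only an illustration of the idea, not AutoSync's actual implementation: the ClipFeatures enum below is a stand-in that reuses the feature names listed above with assumed numeric values, and ModuleSpec, SequenceCheck and CanRun are hypothetical names invented for this example.

```csharp
using System;
using System.Collections.Generic;

// Stand-in for the real ClipFeatures enum. The names match the list above,
// but the numeric values are assumptions chosen so the flags can be combined.
[Flags]
public enum ClipFeatures
{
    None = 0,
    AudioClip = 1,
    Emotions = 2,
    Gestures = 4,
    Phonemes = 8,
    Transcript = 16
}

// Hypothetical description of a module: what it needs and what it provides.
public sealed class ModuleSpec
{
    public readonly string Name;
    public readonly ClipFeatures Requires;
    public readonly ClipFeatures Provides;

    public ModuleSpec(string name, ClipFeatures requires, ClipFeatures provides)
    {
        Name = name;
        Requires = requires;
        Provides = provides;
    }
}

public static class SequenceCheck
{
    // Walks the sequence in order, starting from the features already on the clip.
    // Returns true if every module's requirements are met when its turn comes.
    // Otherwise returns false and reports the first blocked module and what it lacks.
    public static bool CanRun(ClipFeatures onClip, IReadOnlyList<ModuleSpec> sequence,
                              out string blockedModule, out ClipFeatures missing)
    {
        ClipFeatures available = onClip;
        foreach (ModuleSpec module in sequence)
        {
            // Required flags that are not yet available.
            ClipFeatures lacking = module.Requires & ~available;
            if (lacking != ClipFeatures.None)
            {
                blockedModule = module.Name;
                missing = lacking;
                return false;
            }
            // Later modules can build on what this one provides.
            available |= module.Provides;
        }
        blockedModule = "";
        missing = ClipFeatures.None;
        return true;
    }
}
```

Under these assumptions, a sequence of a phoneme detection module (requires AudioClip and Transcript, provides Phonemes) followed by a cleanup module (requires Phonemes) passes for a clip that has both an AudioClip and a Transcript. For a clip with only an AudioClip it fails at the first module, with Transcript reported as missing.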

Module Options

Modules can also have per-instance options. These are not stored globally, like the settings found in the Clip Editor's Settings screen, but are local to one particular instance. Generally, these are things you may want to change based on what outcome you want from the module, or the specific properties of the clip.

As an example, the Montreal Forced Aligner Module lets you select the Language Model you wish to use. If you were processing dialogue in a different language for one particular clip, you would probably want to switch to a different model for that clip only.

When creating a Preset, any options for the chosen modules will be stored as well, and applied when the preset is loaded. This allows you to create a wide array of presets for purposes including different languages, different audio settings, or just personal preference about the look of the final animation.

Writing a Module

AutoSync Modules are standard C# classes that inherit from the RogoDigital.AutoSync.AutoSyncModule class. They are ultimately ScriptableObjects, though they are usually not serialized or saved to disk. Instead, when a module is to be run, an instance of it is created, set up with any necessary options, then passed to an AutoSync object as part of a list of modules called the moduleSequence.

When writing a new AutoSync Module, you must implement three methods defined by the parent class:

  • ClipFeatures GetCompatibilityRequirements ()
    This is the method in which your module defines the ClipFeatures it requires in order to run. The ClipFeatures enum contains flags, so you can return any combination of features as needed. For example, return ClipFeatures.AudioClip | ClipFeatures.Transcript; indicates that the module requires both an AudioClip and a Transcript in order to function.

  • ClipFeatures GetOutputCompatibility ()
    This is the method in which your module defines the ClipFeatures it will provide if it completes successfully. Again, this can be a combination of ClipFeatures if your module will add more than one type of feature to a clip. However, the method should only return features that were not necessarily present before the module ran but will be present afterwards. If the module only manipulates existing data, simply return ClipFeatures.None here.

  • void Process (LipSyncData inputClip, AutoSync.ASProcessDelegate callback)
    This is where the meat of the module goes. The AutoSync system calls this method when it is the module's turn to execute. There may have been several modules before it in the sequence, and there may be more after it. The inputClip is a LipSyncData object containing any data that was already in the clip or was added by previous modules. Any manipulation of data can be done on this inputClip. When the work is complete, the method should call callback.Invoke, passing it the clip with all modifications made and a new AutoSync.ASProcessDelegateData object. That object indicates whether the process completed successfully, gives a message (either an error or more information), and indicates which ClipFeatures have been added. Usually the added features are just GetOutputCompatibility() if the process was successful, but there may be special cases where the actual features differ from those expected.

As an example, here are excerpts from the Montreal Forced Aligner Phoneme Detection Module:

GetCompatibilityRequirements

public override ClipFeatures GetCompatibilityRequirements()
{
    return ClipFeatures.AudioClip | ClipFeatures.Transcript;
}

GetOutputCompatibility

public override ClipFeatures GetOutputCompatibility ()
{
    return ClipFeatures.Phonemes;
}

Process

public override void Process (LipSyncData inputClip, AutoSync.ASProcessDelegate callback)
{
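	// (Code for running the Aligner application and retrieving results goes here.
	// In this excerpt it is assumed to set nothingFound when no speech was recognised.)

	// Example of error handling: report the failure, with no features added, and stop.
	if (nothingFound)
	{
		callback.Invoke(inputClip, new AutoSync.ASProcessDelegateData(false, "Speech was not clear enough for recognition.", ClipFeatures.None));
		return;
	}

	// (Code for mapping the output to phoneme markers goes here.
	// In this excerpt it is assumed to fill data, a list of phoneme markers.)

	inputClip.phonemeData = data.ToArray();

	// Report success, along with the features this module has added to the clip.
	callback.Invoke(inputClip, new AutoSync.ASProcessDelegateData(true, "", GetOutputCompatibility()));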
}
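
The excerpts above show a module that adds a new feature. To complement them, here is a rough, self-contained sketch of the other case mentioned earlier: a module that only manipulates existing data and therefore returns ClipFeatures.None from GetOutputCompatibility. So that it compiles outside Unity, it does not use the real AutoSync types. MarkerStub, ClipStub, ResultStub, ProcessCallback, MinimumGapModule and the minimumGap option are all hypothetical stand-ins invented for this example, and the ClipFeatures values are assumed. Only the three method names and the overall shape of Process follow the description above.

```csharp
using System;
using System.Collections.Generic;

// Stand-in for the real ClipFeatures enum (names from this page, values assumed).
[Flags]
public enum ClipFeatures { None = 0, AudioClip = 1, Emotions = 2, Gestures = 4, Phonemes = 8, Transcript = 16 }

// Hypothetical stand-ins for a phoneme marker, a LipSyncData clip,
// the ASProcessDelegateData result and the ASProcessDelegate callback.
public sealed class MarkerStub
{
    public float time;
    public MarkerStub(float time) { this.time = time; }
}

public sealed class ClipStub
{
    public MarkerStub[] phonemeData = new MarkerStub[0];
}

public sealed class ResultStub
{
    public readonly bool success;
    public readonly string message;
    public readonly ClipFeatures addedFeatures;

    public ResultStub(bool success, string message, ClipFeatures addedFeatures)
    {
        this.success = success;
        this.message = message;
        this.addedFeatures = addedFeatures;
    }
}

public delegate void ProcessCallback(ClipStub clip, ResultStub result);

// Hypothetical module that drops phoneme markers placed too close to the previous kept one.
public sealed class MinimumGapModule
{
    // Per-instance option: the smallest allowed gap between two kept markers.
    public float minimumGap = 0.05f;

    // Needs existing phoneme markers to work on.
    public ClipFeatures GetCompatibilityRequirements() { return ClipFeatures.Phonemes; }

    // Only manipulates existing data, so it adds no new features.
    public ClipFeatures GetOutputCompatibility() { return ClipFeatures.None; }

    public void Process(ClipStub inputClip, ProcessCallback callback)
    {
        // Failure path: report the problem through the callback, then stop.
        if (inputClip.phonemeData.Length == 0)
        {
            callback.Invoke(inputClip, new ResultStub(false, "No phoneme markers to clean up.", ClipFeatures.None));
            return;
        }

        // Assumes the markers are already sorted by time.
        List<MarkerStub> kept = new List<MarkerStub>();
        foreach (MarkerStub marker in inputClip.phonemeData)
        {
            if (kept.Count == 0 || marker.time - kept[kept.Count - 1].time >= minimumGap)
            {
                kept.Add(marker);
            }
        }
        inputClip.phonemeData = kept.ToArray();

        // Success path: hand the modified clip back along with the features added.
        callback.Invoke(inputClip, new ResultStub(true, "", GetOutputCompatibility()));
    }
}
```

The point of the sketch is the contract rather than the cleanup logic: every path through Process ends by invoking the callback exactly once, whether the module succeeded or failed.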
