Vocaloid

hatsune miku

Vocaloid is a singing synthesizer. Its signal processing part was developed through a joint research project led by Kenmochi Hideki at the Pompeu Fabra University in Spain in 2000. Backed by the Yamaha Corporation, it was developed into a commercial product, which was first released in 2004. The software enables users to synthesize singing by typing in lyrics and a melody. It uses synthesizing technology with specially recorded vocals of voice actors or singers. A piano-roll interface is used to input the melody, and the lyrics can be entered on each note. The software can change the stress of the pronunciations, add effects such as vibrato, or change the dynamics and tone of the voice.
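To make the piano-roll idea concrete, here is a minimal sketch of how a melody with one lyric syllable per note might be represented. The `Note` class, its field names, and the `describe` helper are hypothetical illustrations, not Vocaloid's actual file format or API; the only standard fact used is the MIDI convention that note 69 is A4 at 440 Hz.

```python
from dataclasses import dataclass


@dataclass
class Note:
    """One piano-roll note (hypothetical structure, not Vocaloid's own format)."""
    midi_pitch: int             # piano-roll row as a MIDI note number (69 = A4)
    start_beats: float          # where the note begins on the timeline
    length_beats: float         # how long the note is held
    lyric: str                  # the syllable typed onto this note
    vibrato_depth: float = 0.0  # illustrative per-note expression parameter


def midi_to_hz(midi_pitch: int) -> float:
    """Convert a MIDI note number to frequency in equal temperament."""
    return 440.0 * 2.0 ** ((midi_pitch - 69) / 12.0)


def describe(notes):
    """Summarize each note as 'lyric@frequency' for quick inspection."""
    return [f"{n.lyric}@{midi_to_hz(n.midi_pitch):.1f}Hz" for n in notes]


# A two-note phrase: middle C followed by A4, both sung on "la".
phrase = [Note(60, 0.0, 1.0, "la"), Note(69, 1.0, 1.0, "la")]
print(describe(phrase))
```

A real editor would carry many more per-note parameters (stress, dynamics, tone), but the core pairing of pitch, timing, and lyric is the same.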

Each Vocaloid is sold as ‘a singer in a box’ designed to act as a replacement for an actual singer. The software was originally only available in English and Japanese, but as of Vocaloid 3, Spanish, Chinese, and Korean have been added. The software is intended for professional musicians as well as light computer music users and has so far been sold on the idea that the only limits are the users’ own skills. Japanese musical groups Livetune and Supercell have released songs featuring Vocaloids as vocals. Japanese record label Exit Tunes of Quake Inc. has also released compilation albums featuring Vocaloids. Artists such as Mike Oldfield have also used Vocaloids within their work for backup vocals and sound samples.

The Vocaloid singing synthesizer technology is categorized as concatenative synthesis (joining fragments end-to-end): it splices and processes vocal fragments extracted from human singing voices in the frequency domain. In singing synthesis, the system produces realistic voices by adding information about vocal expressions, such as vibrato, to the score information. The Vocaloid synthesis technology was initially called ‘Frequency-domain Singing Articulation Splicing and Shaping.’ ‘Singing Articulation’ is explained as ‘vocal expressions’ such as vibrato and the vocal fragments necessary for singing. The Vocaloid and Vocaloid 2 synthesis engines are designed for singing, not reading text aloud. Nor can they naturally replicate singing expressions like hoarse voices or shouts.
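The two ideas in this paragraph, splicing fragments and adding vibrato to score information, can be illustrated with a rough sketch. This is not Yamaha's algorithm: Vocaloid is described as working in the frequency domain, whereas the `splice` function below uses a simple time-domain linear crossfade as a stand-in. The function names and the default vibrato rate and depth are assumptions chosen for illustration.

```python
import math


def vibrato_contour(base_hz, duration_s, rate_hz=5.5, depth_cents=50.0,
                    frames_per_s=100):
    """Pitch curve for one note: the score pitch plus a sinusoidal vibrato.

    depth_cents is the peak deviation from base_hz (100 cents = 1 semitone).
    The default rate and depth are illustrative guesses, not Vocaloid values.
    """
    n = int(duration_s * frames_per_s)
    return [
        base_hz * 2.0 ** (depth_cents / 1200.0
                          * math.sin(2.0 * math.pi * rate_hz * i / frames_per_s))
        for i in range(n)
    ]


def splice(a, b, overlap):
    """Join two sample fragments with a linear crossfade over `overlap` samples.

    Time-domain simplification of concatenation; the real engine is said to
    splice in the frequency domain.
    """
    if overlap <= 0 or overlap > min(len(a), len(b)):
        raise ValueError("overlap must be between 1 and the shorter fragment length")
    blended = []
    for i in range(overlap):
        w = (i + 1) / (overlap + 1)  # weight of fragment b rises across the overlap
        blended.append((1.0 - w) * a[len(a) - overlap + i] + w * b[i])
    return a[:-overlap] + blended + b[overlap:]


contour = vibrato_contour(440.0, 1.0)           # one second of A4 with vibrato
joined = splice([1.0] * 4, [0.0] * 4, overlap=2)
```

The crossfade avoids an abrupt jump where two fragments meet, and the contour shows how an expression parameter can be layered on top of a fixed score pitch.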

Vocaloid 3 launched in 2011. It includes a ‘Vocalistener,’ which adjusts parameters iteratively from a user’s singing to create natural synthesized singing. New technology is also being used to bring back the voice of the singer Hitoshi Ueki, who died in 2007. This is the first attempt to bring back a singer whose voice had been lost, something that had been considered a possibility since the software was first released in 2004. However, this is only being done for private use.

Yamaha developed Vocaloid-flex, a singing software application based on the Vocaloid engine, which contains a speech synthesizer. Users can edit its phonological system more finely than in the other Vocaloid products to get closer to actual spoken language; for example, it enables final devoicing, unvoicing vowel sounds, or weakening and strengthening consonant sounds. It was used in the video game ‘Metal Gear Solid: Peace Walker.’ It is still a corporate product, and a consumer version has not been announced. This software was also used for the robot model HRP-4C at CEATEC Japan 2009.

Though Vocaloid is developed by Yamaha, the marketing of each Vocaloid is left to the respective studios. Japanese magazines such as ‘DTM’ magazine are responsible for promoting and introducing many of the Japanese Vocaloids to Japanese Vocaloid fans. ‘DTM’ has featured Vocaloids such as Miku, Kagamine Rin and Len, and Lily, printing some of the sketches by artist Kei and reporting the latest news on the Vocaloids. Thirty-day trial versions of Miriam, Lily, and Iroha have also contributed to the marketing success of those particular voices. Crypton has been involved with the marketing of its Character Vocal Series. Hatsune Miku in particular has been actively involved in the GT300 class of the Super GT since 2008 with the support of Good Smile Racing, a branch of Good Smile Company mainly in charge of car-related products, especially stickers for itasha (cars featuring illustrations of anime-styled characters). Although Good Smile Company was not the first to bring anime and manga culture to Super GT, it departs from others by featuring itasha directly rather than applying colorings to vehicles.

Originally, Hiroyuki Ito, President of Crypton Future Media, claimed that Hatsune Miku was not a virtual idol but a kind of Virtual Studio Technology instrument. However, Hatsune Miku performed her first ‘live’ concert like a virtual idol on a projection screen during Animelo Summer Live at the Saitama Super Arena in 2009. Her image was screened by rear projection on a mostly transparent screen.

It is difficult to know how many songs and albums use the Vocaloid software, since songwriters must ask permission before being allowed to state specifically that they are using a Vocaloid in their songs. The earliest use of Vocaloid-related software involved prototypes of Kaito and Meiko, which were featured on the album ‘History of Logic System’ by Hideki Matsutake, released in 2003. The first album to be released using a full commercial Vocaloid was ‘A Place in the Sun,’ which used Leon’s voice for the vocals, singing in both Russian and English. Miriam has also been featured in two albums, ‘Light + Shade’ and ‘Continua.’ Japanese electropop artist Susumu Hirasawa used Vocaloid Lola in the original soundtrack of ‘Paprika’ by Satoshi Kon. The software’s biggest asset is its ability to see continued usage long after its initial release date. Leon was featured in the album ’32bit Love’ by Muzehack and Lola in ‘Operator’s Manual’ by anaROBIK; both were featured in these albums six years after they were released. Even early in the software’s history, progress in music making proved to be a valuable asset to Vocaloid development: it not only opened up possibilities for how the software may be applied in practice, but also led to the creation of further Vocaloids to fill the roles the software had yet to cover. The album ‘A Place in the Sun’ was noted to have songs that were designed for a male voice with a rougher timbre than the Vocaloid Leon could provide; this later led to the development of Big Al to fulfill this particular role.

According to Crypton, professional female singers refused to provide voice samples for fear that the software might create clones of their singing voices, so Crypton changed its focus from imitating certain singers to creating characteristic vocals. This change of focus led to sampling the vocals of voice actors, and the Japanese voice actor agency Arts Vision supported the development. Similar concerns are expressed at the other studios using Vocaloid. Zero-G refuses to release the names of its providers, and Miriam Stockley (who provided the voice for Miriam) remains the only known Zero-G voice provider. PowerFX only hinted at Sweet Ann’s voice provider, and only Big Al’s is known. AH Software named Miki’s voice provider, but for legal reasons cannot name Kaai Yuki’s, as minors were the subject of the recordings.

Any rights or obligations arising from the vocals created by the software belong to the software user. Just as with any music synthesizer, the software is treated as a musical instrument and the vocals as sound. Under the terms of the license, the software can be used to create vocals for commercial or non-commercial use as long as the vocals do not offend public policy. In other words, the user is bound by the terms of the software license not to synthesize derogatory or disturbing lyrics. On the other hand, copyrights to the mascot image and name belong to their respective studios. Under the terms of the license, a user cannot commercially distribute a vocal as a song sung by the character, nor use the mascot image on commercial products, without the consent of the studio that owns them.

Since the Vocaloid or its vocal library is released for producers to do with as they please, some producers liken the Vocaloids to dolls that they can make sing whatever they want. The portrayals of Vocaloids can at times touch on controversial issues. Releases presented as young children risk becoming subject to sexual or pedophilic portrayals. One of the most controversial tests of the legal agreements of any Vocaloid-producing studio came from the Democratic Party of Japan, whose candidate, Kenzo Fujisue, attempted to secure the use of Miku’s image in the Japanese House of Councillors election in 2010. The hope was that the party could use her image to appeal to younger voters. Although Crypton Future Media rejected the party’s use of her image or name for political purposes, Fujisue released the song ‘We Are the One’ on YouTube, using her voice without crediting her and replacing her image with the party’s character in the music video.

Despite the success of the software in Japan, overseas customers have been reluctant to adopt it. Some, however, such as Michael Stipe of R.E.M., praised it when it was first announced in 2003. Stipe noted that one of the more useful aspects of the software was that it gave singers a method of preserving their voice for future use should they lose their own, and that as the technology progressed it could also be used to bring back the voices of singers whose voices have already been lost. However, while the provider of ‘Miriam,’ Miriam Stockley, accepted that there was little point in fighting progress, she noted that there was no control over how her voice was used once the software was in the hands of others.

Reception to Vocaloid 2 was generally better. By the time Sweet Ann was first released, John Walden of ‘Sound on Sound’ had already reviewed Leon, Lola, and Miriam. He had noted that Vocaloid itself had no previous rival technology to contend with, and praised Yamaha for its efforts, as Vocaloid was an ambitious project to undertake, considering that the human voice is more complex to synthesize than instruments such as the violin. In reviewing Vocaloid 2, he referred to the original software engine in a passing comment, stating, ‘Undoubtedly a remarkable and innovative product and, with experience and patience, was capable of producing results that could be frighteningly realistic.’ While he praised the improvements made in Vocaloid 2, he noted that the software was still far from being regarded as a top-rate singer. What makes Vocaloid particularly difficult to sell as a product is the notion that the human ear can pick up faults in vocal speech. When reviewing Tonio, ‘Sound on Sound’ writer Tom Flint argued that in the amount of time it takes to understand and learn how to use the software, it would be easier to hire a singer for half an hour to do the recording session. Both he and fellow writer John Walden, in a review of Sonika, stated that singers need not fear losing their jobs just yet.

When interviewed by the Vocaloid-producing company Zero-G, music producer Robert Hedin described how the software offered creative freedom. He compared it to auto-tuning software, stating that the Vocaloid software itself has enough imperfections to present itself as a singer who does not sound human. However, he stated that Vocaloid also does not ‘snap into tune’ like auto-tuning software, which the music industry seems to favor these days. Giuseppe, who had produced demo songs for both Zero-G and PowerFX Vocaloids and is now aiding in the production of Spanish-language Vocaloids, noted that each Vocaloid package works the same way. However, each vocal has its own unique personality, so choosing one vocal over another is not easy.
