"Your scientists were so preoccupied with whether or not they could, they didn't stop to think if they should”

- Jurassic Park

Having done the training, continued to coach, learnt how to edit, made a home studio, recorded top-class demos, built a website, co-ordinated a marketing strategy, and started making a proper living for yourself… you could be forgiven for getting a little annoyed at all of that being placed in jeopardy.  The last decade has witnessed the biggest shift in the voiceover industry to date, with huge swathes of work migrating to home-based talent.

Within the voiceover community, you still see the dying embers of this change occasionally flicker.  Talents who were used to driving from studio to studio and only working through their agents lament the good old days.  Some still rail against having to learn how to edit and master recordings.

By contrast, home studio talents revel in the opportunities advanced technology has afforded them - gaining ‘broadcast quality’ equipment at a fraction of its past cost, and being able to learn how to edit and create a recording space through a glut of free resources.  I’m a living embodiment of the latter.  Having recorded half a dozen voice jobs in my first eight years as an actor, and then hit a glass ceiling guarded by voice agents who never responded, recording from my home studio became a full-time job for me within a matter of months, allowing me to go directly to source for the work.

If anything, the Covid pandemic has accentuated this process, acting as a catalyst for many producers to ‘transition’ to home-studios faster than expected.  And many studio-only talents have been forced to build their set-ups quickly, encouraged by their agents and supported by audio-engineers, in order to maintain their previous income streams.

But time doesn’t stand still. 

TTS (text to speech) and AI voices have been with us for years, and have gradually crept into everyday use since the turn of the century.  Somewhat inevitably, they are starting to make inroads into the voiceover market - and to ignore this reality as a talent would be irresponsible.


For many years, computer-generated voices were laughably robotic.  Then the arrival of Apple’s Siri in mainstream use started to change public perception.  Since then, technology companies around the world have been working hard to make these voices more and more credible, more human.

Affordable TTS systems that might actually be confused with real people are now on the market (see below).  And whatever you think of the product, the implications are clear.  As the technology continues to be rapidly improved and iterated on, certain genres of voiceover will be profoundly affected.

The biggest sector is obviously e-learning: an area that requires vast numbers of words to be recorded to a consistent quality, that often deals with highly complex terminology, and that often needs regular updating.  The ‘churn-rate’ of the material is high - which is partly why the market has boomed during Covid, as businesses and organisations have rapidly had to retrain and reorganise their workforce.  And it requires a particular skill - a subtle enough variance of tone to resist becoming monotonous, while keeping the overall style consistent across thousands of words.

For this work, high quality TTS services used on a subscription basis seem like a logical next step.  Producing recordings will be a daily process, and by eliminating the human risk factor (inconsistent availability/acoustics/delivery), the product would be even more flexible in the long run.  If you can change the gender or tone of 300,000 words at a keystroke - AND at no extra cost or time - then these producers will be passing those savings on to their end-clients.  The product may not be quite there yet, but it’s a matter of time.

The corporate sector is certainly exposed to this risk in a similar way, though it varies a lot more from project to project.  The shorter-form nature of some work means that the attention to detail required may be greater, and micro-managing AI algorithms might take just as much time as directing a ‘live’ talent, thus negating some of the savings.  But in certain markets, the balance between quality and cost will definitely shift sooner rather than later.  Importantly, an audience’s knowledge that the voice might be artificial is a bigger factor here - associating the public voice of a brand with artifice would inherently detract from the credibility of its messaging.  But over (perhaps considerable) time, as our engagement with AI continues to develop, this may change.

Audiobooks, which have never been the highest paid work, will probably only be significantly affected in the business sectors.  Celebrities, authors and well-known voices will still be in high demand for the foreseeable future, and obviously the myriad storytelling skills involved mean a rapid switch to TTS would seem unlikely.  Audiobooks trade deeply on the narratorial voice, and the skills involved vary hugely from book to book, so the potential for a ‘rollout’ approach is significantly diminished.

Commercials, again, would probably be considered safe for now - with reads being short, and the exact skillsets needed to deliver such copy being at a premium.  Great voiceover talents pay a lot for coaching to be as flexible, nuanced and responsive as possible.  And tastes change - in a way that an algorithm might struggle to accommodate quickly.  Again, associating the public face of a brand with AI would diminish its power.  But mid to low-level commercial projects might be chipped away at sooner rather than later, as TTS won’t be charging anything like the usage fees for extended campaigns.  To a lesser extent than in the corporate sector, the lower end of the market will lose ground over time - and going rates might gradually be eroded as a result.

Gaming is often thought of as the obvious ‘safe place’ for live talent.  With such a rapidly expanding industry - not just in VO but also performance capture - and the necessity of acting skills and live responsiveness to direction, it will take another huge leap for live performers to be replaced.  At the very top level, there will always be the demand for ‘prestige’ names - either deeply embedded in gaming already (your Laura Baileys and Dave Fennoys) or parachuted-in movie celebrities (looking at you, Ben Kingsley).

But ALL of the above pertains to the AAA market.  For indie developers, voice acting is still regularly put aside altogether because of the related costs.  Choosing to voice all your characters is a huge design decision, and hiring VO talent is still often the biggest single cost.  For developers at this level, who may only want to dabble and/or might have left VO talent to self-direct in the first place (“just three of each line please”), cheap subscription options that might offer something useable may prove a viable alternative - at least in the earlier stages of development - providing a ‘placeholder’ before real talent is recruited.  Sonantic (responsible for the video on the right) are making particular inroads in this sector, and the development is significant.  Currently, they market their AI service as ‘complementing’ a voice actor’s career, enabling talent to effectively outsource their voice for a percentage of profits.  That this would ultimately result in much lower fees, and less actual work for ‘live’ talent, goes without saying - but the issue hasn’t been directly debated… yet.  Their pitch is one entirely orientated around developers’ workflow, but will inevitably eat into actors’ sessions from a variety of angles, such as multi-character work, pick-ups, additional lines etc.

So - what to do?

Isn’t writing about the rise of AI on my website - where I SELL MY HUMAN VOICE - tantamount to professional suicide?

I’d argue not. 

I think the technology still has (some) way to go, though the pace of change is fast.

I think that in a lot of genres, as I’ve hopefully indicated, the value of human talent is still hugely significant.

When you are the aural embodiment of a client, authenticity, creativity and technical proficiency are key.

I pride myself on being able to give my clients what they want, but I often try and provide my own take as an alternative.  The best work is often a surprise, where brief and inspiration intersect - when I get feedback like “that’s totally not what we expected, but it works!”

Believing in the importance of that, combined with professionalism and continued development, is central to my approach. Ensuring that all the infrastructure built around the talent - from a technical and marketing perspective - is of the highest quality, is non-negotiable.

Knowledge is power - so being clear on what you REALLY offer as a performer, rather than ignoring the developing landscape, is the only responsible act.  As actors, we need to pitch ourselves not just as ‘good’ or ‘capable’ performers, but as potential collaborators who can give suggestions, insights and creative variety in response to material.

Otherwise, we’re just waiting for Skynet…