(For context for other people reading this: here’s the recent newsletter about TranslateAI: https://mailchi.mp/leanpub/2023-04-10-author)
The follow-up to this post is here: http://help.leanpub.com/en/articles/7264702-won-t-the-translateai-translations-be-sub-par-i-think-this-compromises-leanpub-s-brand-and-mission
The proposition behind the TranslateAI service is simple, and a bit shocking:
GPT-4 translations are Good Enough for many types of books to ship, even without a human fluent in the language reading it before the first paying reader does.
We did not feel this was true with GPT-3, so we did not offer this service when we first got access to GPT-3.
The hypothesis is the following (I’m using English here since many people speak English as a second language, but it applies to all languages):
For almost everyone for whom English is a second language, GPT-4 can translate English into their language better than they can read English.
Now, we’re pricing this service at $249, which is not a price at which we can go and pay human translators in each target language to review the output.
Ironically, we currently have GPT-3 API access, but we do not yet have GPT-4 API access. So, for the books which have used this service so far, one of us has pasted the manuscript piece-by-piece into ChatGPT (using GPT-4), and pasted the output into the appropriate place. For us, the appropriate place is a git repo for the new translation, which started as a bare clone. Then, when we’re done with this semi-automated process, we convert the translated book into a Browser mode book, unless the author was already using GitHub.
Why are we doing this semi-automated process? It’s not for the money. It’s to build domain expertise in prompt engineering, so that we can avoid the types of mistakes GPT-4 can make when translating a Markua manuscript, and so that the code we write is better. I believe in understanding requirements fully, and doing something manually is a good way to get that understanding.
This is a chore, but it’s worth doing: over the past week I personally did this for a handful of chapters of a computer programming book which is being translated into Polish, and for an entire non-technical book which was translated into French. I did it as a background task, while doing other stuff. (I don’t speak Polish at all, and as a western Canadian, my French is “cereal box French”. But as a software developer I can read git diffs, and see very clearly where output was missed, etc.) At this point, one of our developers who will be building some of the TranslateAI service is finishing the Polish book, and they are also learning the edge cases by interacting with GPT-4 via the ChatGPT service. So, the code that they write will be informed by a deep understanding of the requirements and of the quirks of GPT-4, both through their own experience of interacting with it via ChatGPT and through mine.
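To give a concrete (and purely illustrative) sense of what that diff review is checking for, here is a small Python sketch that compares structural markers between an original chapter file and its translation, to flag places where output may have been dropped. This is not our actual tooling, and the file names are made up; it just shows the kind of check that reading the git diff amounts to.

import re

def structure_counts(text):
    # Count structural markers that a translation should preserve exactly.
    return {
        "headings": len(re.findall(r"^#{1,6} ", text, flags=re.MULTILINE)),
        "code_fences": text.count("```") // 2,
        "images": len(re.findall(r"!\[[^\]]*\]\([^)]*\)", text)),
        "blank_line_breaks": text.count("\n\n"),
    }

def report_differences(original_path, translated_path):
    # Print any structural counts that differ between original and translation.
    with open(original_path, encoding="utf-8") as f:
        original = structure_counts(f.read())
    with open(translated_path, encoding="utf-8") as f:
        translated = structure_counts(f.read())
    for key, count in original.items():
        if count != translated[key]:
            print(f"{key}: original has {count}, translation has {translated[key]}")

# Example (hypothetical file names):
# report_differences("manuscript/chapter1.md", "translation/chapter1.md")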
Through the process of translating part of one book and all of a second, I’ve gotten better at writing prompts for GPT-4. I never expected “prompt engineering” to be a skill I’d learn, but here we are. It’s actually kind of fun, and it’s an interesting challenge. This is something that Leanpub will get better and better at as we translate more books.
(One really cool thing I’ve learned from the ChatGPT interactions I just had is that GPT-4 does seem to understand Markua, not just Markdown! Hooray for having a spec; I’m pretty sure that GPT-4 has read it.)
We’re on the waiting list for GPT-4 API access. Once we have it (presumably in the next month or two), we expect that the first pass will be done by our code interacting with GPT-4, and that there will still be human review of the git diff, to ensure that certain types of errors did not occur.
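For the curious, here is a rough sketch of what that automated first pass might look like, using the openai Python library’s ChatCompletion API. To be clear, we have not written the real TranslateAI code yet; the prompt wording, the chunking, and the parameters here are assumptions for illustration, not our production approach.

import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

SYSTEM_PROMPT = (
    "You are translating a book manuscript written in Markua (a superset of Markdown). "
    "Translate the prose into {language}, but leave all Markua markup, code samples, "
    "and URLs exactly as they are."
)

def translate_chunk(chunk, language):
    # Send one manuscript chunk to GPT-4 and return the translated text.
    response = openai.ChatCompletion.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT.format(language=language)},
            {"role": "user", "content": chunk},
        ],
        temperature=0,  # favour consistency over creativity when translating
    )
    return response["choices"][0]["message"]["content"]

def translate_manuscript(chunks, language):
    # First pass only: a human still reviews the resulting git diff afterwards.
    return [translate_chunk(chunk, language) for chunk in chunks]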
Over time, as our prompt engineering skills continue to improve, the amount of human review of the git diffs will decrease, and one day, when we’ve concluded the service is good enough, we’ll make it available as a self-serve option. When we do, the pricing will still be similar to what it is now, by the way, so there’s no reason to wait. Right now we’re pricing the TranslateAI service based on what the price will be when it’s 100% automated. I’m not pricing it the way I’d price a consulting service: if this could not be automated, I would not want to establish $249 as the perceived price of TranslateAI; I’d have to add a zero, at least.
We try to price things based on value, and I’m pretty confident that the value received by the first customers of TranslateAI exceeds $249, even though we’re still learning.
Our expectation is not that the translations will be perfect. We expect that they will have minor problems and small inconsistencies.
So, I should be more clear about what we’re trying to accomplish here:
First, it will not be 100% perfect. Then again, nothing is. Since this is for Leanpub authors, and Leanpub books can be updated and republished at any time, readers of the translated book can report any minor issues and the author can fix them. Also, authors can solicit feedback on the translation from people they know who speak that language, either before or after publishing the translation. Or they can just publish the translation without any review, and ask the readers in that language to provide feedback or corrections.
Second, the main use case is translating an English-language computer programming book or business book into French, German or Japanese. Now, most people in France, Germany or Japan who would be reading that book would have English as a second language. If they were really interested in the material, they could probably read the book in English, and have a tolerable experience. However, we want to offer them a better experience than that. Our expectation is that what TranslateAI produces is a better experience for a French, German or Japanese reader for whom English is a second language than reading the English language original book.
So, as I said earlier, the hypothesis is the following:
For almost everyone for whom English is a second language, GPT-4 can translate English into their language better than they can read English.
If this is true (and I believe it is), a GPT-4 powered translation is a benefit to all those readers. Since it’s a benefit to the readers, it’s a benefit to the authors as well. So, Leanpub should definitely be in the business of providing it, since we’re here to serve authors and their readers…
Thanks,
Peter
This was originally posted on the Leanpub Authors forum here.
If you have any feedback or questions about this article, please email the Leanpub team about it at hello@leanpub.com!
If you have any questions or thoughts on writing and self-publishing with Leanpub, please join our global community of authors in our Authors Forum here!
Are you interested in self-publishing, and creating your first Leanpub book? Here are some quick tutorials for our most popular writing modes: http://help.leanpub.com/en/articles/3088382-quick-walkthroughs-for-getting-started-on-a-leanpub-book
Are you looking for great deals on Leanpub ebooks, ebook bundles, and courses? Sign up for our Weekly and Monthly newsletter sales here!
Subscribe to our YouTube channel here: https://www.youtube.com/leanpub
You can also follow Leanpub in lots of other places!