Streaming video and audio content is an effective way to share information, entertain, and engage users. Many organizations now maintain extensive collections of video and audio with captions and subtitles, and translating those captions and subtitles into multiple languages makes the content available to more viewers. This blog shows how to use Amazon Translate to build an automated flow that translates captions and subtitles without losing context.
Captions and subtitles give people with hearing impairments access to video and audio, help viewers in both noisy and quiet environments, and support non-native speakers. They are usually delivered in SRT (.srt) or WebVTT (.vtt) format. SRT stands for SubRip Subtitle and is the most common file format for subtitles and captions. WebVTT stands for Web Video Text Tracks and is becoming a popular format for the same purpose. In this blog, we will translate SRT files into different languages.
Amazon Translate is a neural machine translation service that delivers fast, high-quality, affordable, and customizable language translation. Neural machine translation is a form of automated language translation that uses machine learning models to deliver more accurate and more natural-sounding translations than standard rule-based translation algorithms.
With Amazon Translate, you can localize content such as websites and apps for diverse users, easily translate large volumes of text for analysis, and efficiently enable cross-lingual communication between users.
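To make the SRT layout concrete, here is a minimal sketch of how a cue is structured: a numeric index, a `start --> end` timing line, and one or more lines of text, with a blank line between cues. The sample text and the `parse_srt` and `to_seconds` helpers are illustrative assumptions, not part of the demo code, which relies on the `webvtt` library for parsing.

```python
import re

# Hypothetical two-cue sample, used only to illustrate the SRT layout.
SAMPLE_SRT = """1
00:00:01,000 --> 00:00:03,500
Welcome to the demo.

2
00:00:04,000 --> 00:00:06,250
Captions can span
more than one line.
"""

TIMING = re.compile(r"(\d{2}:\d{2}:\d{2},\d{3}) --> (\d{2}:\d{2}:\d{2},\d{3})")


def to_seconds(stamp):
    """Convert an SRT timestamp (HH:MM:SS,mmm) to seconds."""
    hms, millis = stamp.split(",")
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds + int(millis) / 1000


def parse_srt(text):
    """Split SRT text into cues: index, start/end in seconds, caption text."""
    cues = []
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.splitlines()
        if len(lines) < 3:
            continue  # skip blocks without index, timing, and text
        match = TIMING.match(lines[1].strip())
        if not match:
            continue  # skip blocks whose second line is not a timing line
        cues.append({
            "index": int(lines[0].strip()),
            "start": to_seconds(match.group(1)),
            "end": to_seconds(match.group(2)),
            "text": " ".join(lines[2:]),
        })
    return cues


for cue in parse_srt(SAMPLE_SRT):
    print(cue)
```

Only the text part of each cue needs translating; the index and timing lines must come through unchanged, which is why the flow in this blog separates them before calling Amazon Translate.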
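In this article, we will translate the captions stored in an SRT file into a different language, using an S3 trigger to automate the translation from start to end. At a high level, an .srt file is uploaded to the “input” folder of an S3 bucket, the upload event invokes a Lambda function, the function strips the timestamps and joins the caption text into a single delimited string, and Amazon Translate translates that string.
After translation, we create the SRT file from the translated delimited text by adding the timestamps back, and store it in the “output” folder.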
Click the ‘Add Trigger’ option on the Lambda function, select ‘S3’ as the source, and select ‘PUT’ as the event type. The prefix is the folder and the suffix is the file type. For this demo we only consider .srt files, so our Lambda will be triggered when an .srt file is uploaded to the “input” folder.
This code is invoked by the S3 event and fetches the file data. It then calls functions from the “srtCaptions” file, which remove the timestamps from the file and convert it into plain text for translation. We then translate the text into the target language and add the timestamps back to the translated text.
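The same trigger can also be expressed in code rather than through the console. The sketch below builds the notification configuration that boto3's `put_bucket_notification_configuration` accepts, using the “input” prefix and “.srt” suffix from the demo. The Lambda ARN is a placeholder assumption, and `build_srt_trigger` and `apply_trigger` are hypothetical helper names. Note that the console also grants S3 permission to invoke the function, which you would need to add yourself when going through the API.

```python
def build_srt_trigger(lambda_arn, prefix="input/", suffix=".srt"):
    """Build an S3 notification configuration for PUT events on .srt files."""
    return {
        "LambdaFunctionConfigurations": [
            {
                "LambdaFunctionArn": lambda_arn,
                "Events": ["s3:ObjectCreated:Put"],
                "Filter": {
                    "Key": {
                        "FilterRules": [
                            {"Name": "prefix", "Value": prefix},
                            {"Name": "suffix", "Value": suffix},
                        ]
                    }
                },
            }
        ]
    }


def apply_trigger(bucket, lambda_arn):
    """Attach the trigger to a bucket. Requires boto3 and AWS credentials."""
    import boto3  # imported here so the config builder runs without boto3

    s3 = boto3.client("s3")
    s3.put_bucket_notification_configuration(
        Bucket=bucket,
        NotificationConfiguration=build_srt_trigger(lambda_arn),
    )


# Placeholder ARN, for illustration only.
EXAMPLE_ARN = "arn:aws:lambda:us-east-1:123456789012:function:translate-srt"
config = build_srt_trigger(EXAMPLE_ARN)
print(config["LambdaFunctionConfigurations"][0]["Filter"])
```

Be aware that this call replaces the bucket's entire notification configuration, so any existing triggers need to be included in the same request.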
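import boto3
from urllib.parse import unquote_plus
from srtCaptions import *

s3 = boto3.client('s3')
translate = boto3.client('translate')


def lambda_handler(event, context):
    try:
        print(event)
        # S3 URL-encodes object keys in event notifications, so decode the key first.
        fileName = unquote_plus(event['Records'][0]['s3']['object']['key'])
        fileBucket = event['Records'][0]['s3']['bucket']['name']

        # Fetch the uploaded SRT file.
        resp = s3.get_object(Bucket=fileBucket, Key=fileName)
        subtitles = resp['Body'].read().decode("utf-8")

        # Renumber a leading cue index of 0 to 1 before parsing. The guard keeps
        # files that already start at 1 from having a timestamp digit rewritten.
        firstLine = subtitles.lstrip().split('\n', 1)[0].strip()
        rep = subtitles.replace('0', '1', 1) if firstLine == '0' else subtitles

        # Strip the timestamps and join the captions into one delimited string.
        srt = srtToCaptions(rep)
        delimitedFile = ConvertToDemilitedFiles(srt)

        # Translate from English to Hindi; change TargetLanguage as required.
        TargetLanguage = 'hi'
        translatedData = translate.translate_text(
            Text=delimitedFile,
            SourceLanguageCode='en',
            TargetLanguageCode=TargetLanguage
        )

        # Re-attach the timestamps and rebuild the SRT text.
        translatedCaptionsList = DelimitedToWebCaptions(srt, translatedData['TranslatedText'], "<span>", 15)
        captionsSRT = captionsToSRT(translatedCaptionsList)

        # Write to /tmp (writable in Lambda), then upload to the output folder.
        filename = fileName.split('/')[1].split('.')[0] + '_' + TargetLanguage + '.srt'
        with open(f'/tmp/{filename}', "w", encoding="utf-8") as file:
            file.write(captionsSRT)
        s3.upload_file(
            Filename=f'/tmp/{filename}',
            Bucket="test-bucket-translate-demo",
            Key=f'output/{filename}'
        )
        return {"statusCode": 200, "body": captionsSRT}
    except Exception as e:
        print(e)
        return {"statusCode": 400, "body": 'Error in Execution !!'}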
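The first step of the handler, pulling the bucket and key out of the S3 event, can be tried locally without AWS. The sketch below uses a trimmed, hand-written event (an assumption: real events carry many more fields) to show why `unquote_plus` is needed. S3 URL-encodes object keys in event notifications, so a space in a file name arrives as `+`. The `extract_s3_object` and `output_key` helpers are hypothetical names introduced for illustration.

```python
from urllib.parse import unquote_plus

# Trimmed, hand-written S3 PUT event; real events contain more fields.
sample_event = {
    "Records": [
        {
            "s3": {
                "bucket": {"name": "test-bucket-translate-demo"},
                "object": {"key": "input/product+demo.srt"},
            }
        }
    ]
}


def extract_s3_object(event):
    """Return (bucket, decoded key) for the first record of an S3 event."""
    record = event["Records"][0]["s3"]
    bucket = record["bucket"]["name"]
    key = unquote_plus(record["object"]["key"])
    return bucket, key


def output_key(key, target_language):
    """Build the output key, e.g. input/a.srt -> output/a_hi.srt."""
    base = key.rsplit("/", 1)[-1].rsplit(".", 1)[0]
    return f"output/{base}_{target_language}.srt"


bucket, key = extract_s3_object(sample_event)
print(bucket, key, output_key(key, "hi"))
```

Unlike the handler's `split('/')[1].split('.')[0]`, this variant uses `rsplit`, so it would also cope with nested folders and file names containing extra dots. Whether that matters depends on how your bucket is laid out.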
Code in srtCaptions.py file
This file contains the code that removes the timestamps from the SRT file and converts the captions into text that we can send for translation. After translation, it adds the timestamps back to the translated text so that the result can be stored in S3.
from tempfile import NamedTemporaryFile
import html
import webvtt
import logging

logging.basicConfig(level=logging.DEBUG)
logger = logging.getLogger(__name__)


def srtToCaptions(srt):
    # webvtt.from_srt reads from a file path, so write the SRT text to a temp file first.
    captions = []
    with NamedTemporaryFile(mode='w+', delete=False, encoding='utf-8') as f:
        f.write(srt)
    for srtcaption in webvtt.from_srt(f.name):
        caption = {}
        logger.debug(srtcaption)
        caption["start"] = formatTimeSRTtoSeconds(srtcaption.start)
        caption["end"] = formatTimeSRTtoSeconds(srtcaption.end)
        # Join every line of the cue so multi-line captions are not truncated.
        caption["caption"] = " ".join(srtcaption.lines)
        logger.debug("Caption Object:{}".format(caption))
        captions.append(caption)
    return captions


def formatTimeSRT(timeSeconds):
    # Work in whole milliseconds to avoid floating-point off-by-one errors.
    totalMillis = round(timeSeconds * 1000)
    hours, remainder = divmod(totalMillis, 60 * 60 * 1000)
    minutes, remainder = divmod(remainder, 60 * 1000)
    seconds, millis = divmod(remainder, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def formatTimeSRTtoSeconds(timeHMSf):
    # Accepts HH:MM:SS.mmm and returns the total seconds as a string.
    hours, minutes, seconds = (timeHMSf.split(":"))[-3:]
    timeSeconds = float(3600 * int(hours) + 60 * int(minutes) + float(seconds))
    return str(timeSeconds)


def captionsToSRT(captions):
    # SRT cue numbers conventionally start at 1.
    srt = ''
    for index, caption in enumerate(captions, start=1):
        srt += str(index) + '\n'
        srt += formatTimeSRT(float(caption["start"])) + ' --> ' + formatTimeSRT(float(caption["end"])) + '\n'
        srt += caption["caption"] + '\n\n'
    return srt.rstrip()


def ConvertToDemilitedFiles(inputCaptions):
    marker = "<span>"
    # Convert captions to text with marker between caption lines
    inputEntries = map(lambda c: c["caption"], inputCaptions)
    inputDelimited = marker.join(inputEntries)
    logger.debug(inputDelimited)
    return inputDelimited


def DelimitedToWebCaptions(sourceWebCaptions, delimitedCaptions, delimiter, maxCaptionLineLength):
    # maxCaptionLineLength is accepted for compatibility with the caller but is not used.
    delimitedCaptions = html.unescape(delimitedCaptions)
    entries = delimitedCaptions.split(delimiter)
    outputWebCaptions = []
    for i, c in enumerate(sourceWebCaptions):
        caption = {}
        caption["start"] = c["start"]
        caption["end"] = c["end"]
        # Fall back to an empty caption if the translation dropped a marker.
        caption["caption"] = entries[i].strip() if i < len(entries) else ""
        caption["sourceCaption"] = c["caption"]
        outputWebCaptions.append(caption)
    return outputWebCaptions
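The delimiter approach above depends on the marker surviving translation: the captions are joined with `<span>`, translated as one string so that sentence context is kept, and split again on the same marker. Below is a minimal sketch of that round trip. The `fake_translate` function is an assumption that stands in for the real Amazon Translate call so the example runs offline, and `join_captions` and `split_translation` are hypothetical helpers that add an explicit check on the number of returned segments.

```python
MARKER = "<span>"


def join_captions(captions):
    """Join caption texts into one delimited string for translation."""
    return MARKER.join(c["caption"] for c in captions)


def split_translation(captions, translated):
    """Re-attach timings to translated segments, verifying the counts match."""
    segments = [s.strip() for s in translated.split(MARKER)]
    if len(segments) != len(captions):
        raise ValueError(
            f"expected {len(captions)} segments, got {len(segments)}"
        )
    return [
        {"start": c["start"], "end": c["end"], "caption": seg}
        for c, seg in zip(captions, segments)
    ]


def fake_translate(text):
    """Stand-in for the translation call: upper-cases text, keeps markers."""
    return MARKER.join(part.upper() for part in text.split(MARKER))


captions = [
    {"start": "1.0", "end": "3.5", "caption": "Welcome to the demo."},
    {"start": "4.0", "end": "6.25", "caption": "Thanks for watching."},
]
translated = split_translation(captions, fake_translate(join_captions(captions)))
for c in translated:
    print(c["start"], c["end"], c["caption"])
```

Raising an error on a segment mismatch is one design choice; another is to fall back to translating each caption individually when the counts disagree, at the cost of losing cross-caption context.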
When we upload an SRT file to the “input” folder of our S3 bucket, our Lambda is triggered, and once it finishes executing, the translated SRT file appears in the bucket's “output” folder. This SRT file can then be processed further, depending on the business requirements and use case.
Refer to ‘Translate Text to Different Languages using Amazon Translate- Part 1’ for more information about Amazon Translate.
CloudThat is an official AWS Advanced Consulting Partner, Microsoft Gold Partner, and training partner that helps people build cloud knowledge and helps businesses aim for higher goals using best-in-industry cloud computing practices and expertise. We are on a mission to build a robust cloud computing ecosystem by disseminating knowledge on technological intricacies within the cloud space. Our blogs, webinars, case studies, and white papers enable all the stakeholders in the cloud computing sphere.
If you have any queries about Amazon SageMaker, Natural Language Processing, Hugging Face, or anything related to AWS services, feel free to drop in a comment. We will get back to you quickly. Visit our Consulting Page for more updates on our customer offerings, expertise, and cloud services.
Ans. Amazon Translate supports plain text input in UTF-8 format.
Ans. Real-time Amazon Translate API calls are limited to 5,000 bytes of text per API call. Amazon Translate's asynchronous Batch Translation service accepts a batch of up to 5 GB in size per API call.
Ans. Amazon Translate automatically detects the source language, using Amazon Comprehend behind the scenes, if the source language is unknown.
Ans. No, requests are not charged if the source language is the same as the target language.
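Because of the per-call size limit mentioned above, a long subtitle file may not fit in a single request. One possible approach, sketched below, is to group captions into batches whose delimited, UTF-8 encoded size stays under a byte budget and to translate each batch separately. The `batch_captions` helper and its default of 5,000 bytes (the figure quoted above) are assumptions for illustration; check the current Amazon Translate quotas before relying on a specific number.

```python
MARKER = "<span>"


def batch_captions(texts, max_bytes=5000, marker=MARKER):
    """Group caption texts so each joined batch fits within max_bytes (UTF-8)."""
    marker_size = len(marker.encode("utf-8"))
    batches, current, current_size = [], [], 0
    for text in texts:
        size = len(text.encode("utf-8"))
        if size > max_bytes:
            raise ValueError("a single caption exceeds the byte budget")
        # Bytes this caption adds to the batch, including a joining marker.
        extra = size + (marker_size if current else 0)
        if current and current_size + extra > max_bytes:
            # Close the full batch and start a new one with this caption.
            batches.append(current)
            current, current_size = [], 0
            extra = size
        current.append(text)
        current_size += extra
    if current:
        batches.append(current)
    return batches


# Devanagari characters take 3 bytes each in UTF-8, so bytes != characters.
texts = ["नमस्ते", "hello", "world"]
for batch in batch_captions(texts, max_bytes=24):
    joined = MARKER.join(batch)
    print(len(joined.encode("utf-8")), joined)
```

Measuring in bytes rather than characters matters for languages such as Hindi, where the encoded text is considerably larger than its character count suggests.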
Pratiksha
Jul 28, 2022
Informative!!