Current state of Andreas Antonopoulos video transcriptions & translations
Before translating any video, I decided to summarize what has already been done. Don't get me wrong - the subtitles listed here are not the result of my work. I only wrote a script that analyzes the current state of subtitles for the YouTube videos on the aantonop channel. First I downloaded all available subtitles with youtube-dl:
youtube-dl https://www.youtube.com/user/aantonop/ --write-sub --all-subs --skip-download
and then I ran my script on the downloaded .vtt files (it expects them in ./subtitles/original/):
import re
from collections import defaultdict
from glob import glob
languages = set()
videos = {}
for filepath in glob("./subtitles/original/*"):
    filename = filepath.replace("./subtitles/original/", "")
    # Files are expected to be named "<title>-<video id>.<lang>.vtt"
    title, youtube_id, lang = re.match(r"(.*)-(.{11,13})\.(.*)\.vtt", filename).groups()
    languages.add(lang)
    # Key on the video ID (not the title) so each video collects all its subtitles
    if youtube_id not in videos:
        videos[youtube_id] = {
            "title": title,
            "subtitles": [{"lang": lang, "filepath": filepath}]
        }
    else:
        videos[youtube_id]["subtitles"].append({"lang": lang, "filepath": filepath})
    print(filename)
# Markdown table header: one column per subtitle language, in a stable order
langs = sorted(languages)
headers = ["No.", "Title"] + langs
print("|", end="")
for header in headers:
    print(" <sup><sub>{}</sub></sup> |".format(header), end="")
print("")
# Separator row required by Markdown tables
print("|", end="")
for _ in headers:
    print("----|", end="")
print("")
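def multiline_split(text, char_per_line):
    # Wrap text at word boundaries, joining the lines with <br> for table cells
    lines = []
    line = ""
    for word in text.split(" "):
        candidate = line + " " + word if line else word
        if not line or len(candidate) < char_per_line:
            line = candidate
        else:
            lines.append(line)
            line = word
    lines.append(line)
    return "<br>".join(lines)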
lang_stat = defaultdict(int)
for i, (youtube_id, video) in enumerate(videos.items()):
    print("| <sup><sub>{}</sub></sup> |".format(i+1), end="", flush=True)
    print(" <sup><sub>[{title}]({youtube_link})</sub></sup> |".format(
        title=multiline_split(video["title"], 25),
        youtube_link="https://www.youtube.com/watch?v={}".format(youtube_id)
    ), end="", flush=True)
    for lang in langs:
        if lang in [sub["lang"] for sub in video["subtitles"]]:
            print(" <sup><sub>✓</sub></sup> |", end="", flush=True)
            lang_stat[lang] += 1
        else:
            print(" |", end="", flush=True)
    print("")
print("")
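The whole script depends on splitting each subtitle file name into a title, a video ID and a language code. Here is a minimal sketch of just that step, using the same regular expression as above. The helper name `parse_subtitle_filename` and the example file names are made up for illustration; they are not real files from the channel.

```python
import re

# Same pattern as in the script: "<title>-<video id>.<lang>.vtt"
SUBTITLE_RE = re.compile(r"(.*)-(.{11,13})\.(.*)\.vtt")


def parse_subtitle_filename(filename):
    """Return (title, youtube_id, lang), or None if the name does not match."""
    match = SUBTITLE_RE.match(filename)
    return match.groups() if match else None


# Hypothetical file names, invented for this example only
examples = [
    "Some Talk Title-abcdefghijk.en.vtt",
    "Some Talk Title-abcdefghijk.es-419.vtt",
    "notes.txt",
]
for name in examples:
    print(name, "->", parse_subtitle_filename(name))
```

Returning None for names that do not match lets the caller skip stray files instead of crashing on `.groups()`, which is what the script would do if anything other than subtitle files ended up in the directory.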
Result
More stats
Andreas currently has 238 videos uploaded on his channel.
- 23 of them have an English transcription
- 7 of them have a Spanish translation
- 4 of them have a Latin American Spanish (es-419) translation
- 2 of them have a Dutch translation
- single videos have been translated into: Chinese (Simplified), German, Italian, Russian, French
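Per-language counts like the ones above can be derived from the same kind of `videos` dictionary the script builds. The following is only a sketch under assumptions: the function name `count_languages` and the sample data are hypothetical, and it is not necessarily how the numbers in this post were produced.

```python
from collections import Counter


def count_languages(videos):
    """Count, per language code, how many videos have a subtitle track."""
    counts = Counter()
    for video in videos.values():
        # A set avoids counting a language twice for the same video
        counts.update({sub["lang"] for sub in video["subtitles"]})
    return counts


# Hypothetical sample in the same shape as the `videos` dict from the script
sample = {
    "id000000001": {"title": "A", "subtitles": [{"lang": "en"}, {"lang": "es"}]},
    "id000000002": {"title": "B", "subtitles": [{"lang": "en"}]},
    "id000000003": {"title": "C", "subtitles": []},
}
counts = count_languages(sample)
for lang, n in counts.most_common():
    print("- {} of {} videos have {} subtitles".format(n, len(sample), lang))
```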