Sub-to-SRT updated for YouTube transcripts
Posted: Sun Oct 09, 2022 1:49 am
by jameshale
I have just updated the sample stack "sub-to-srt" so that it now converts YouTube's transcripts (sbv extension) to srt files.
Perhaps it is age, but I am increasingly finding it difficult to catch all the words spoken in videos.
The LiveCode videos posted on YouTube are a case in point. Although I have subscribed to most of the "global" events, and watch them live when the timezone differences allow, I usually find myself waiting until they are posted on my account page.
Previously I would then download the YouTube version in order to have closed captions.
Unfortunately, the apps I use to download YouTube videos no longer capture the captions.
However, while on YouTube you can show the transcript and copy the text.
If you do so and save the text file with a ".sbv" extension, the "sub-to-srt" stack will now convert it into a correctly formatted ".srt" file, which most media players can read.
I use "sub-to-srt" as a standalone app on my Mac.
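For anyone who wants to see the shape of the conversion outside LiveCode, here is a minimal Python sketch. It is not the stack's code. It assumes the usual SBV layout, where each cue starts with a timing line such as `0:00:00.599,0:00:04.160` followed by the caption text and a blank line; text pasted from the transcript panel may be laid out differently. The names `sbv_to_srt` and `SBV_TIMING` are made up for this example.

```python
import re

# Assumed SBV timing line: "H:MM:SS.mmm,H:MM:SS.mmm" (start,end).
SBV_TIMING = re.compile(
    r"^(\d+):(\d{2}):(\d{2})\.(\d{3}),(\d+):(\d{2}):(\d{2})\.(\d{3})$"
)


def sbv_to_srt(sbv_text: str) -> str:
    """Convert SBV-style caption text to SRT text (hypothetical helper)."""
    # Cues are separated by one or more blank lines.
    blocks = re.split(r"\n\s*\n", sbv_text.strip())
    cues = []
    for block in blocks:
        lines = block.strip().splitlines()
        if not lines:
            continue
        match = SBV_TIMING.match(lines[0].strip())
        if match is None:
            continue  # skip blocks that do not start with a timing line
        h1, m1, s1, ms1, h2, m2, s2, ms2 = match.groups()
        # SRT pads hours to two digits and uses a comma before milliseconds.
        start = f"{int(h1):02d}:{m1}:{s1},{ms1}"
        end = f"{int(h2):02d}:{m2}:{s2},{ms2}"
        text = "\n".join(lines[1:])
        # SRT cues are numbered from 1.
        cues.append(f"{len(cues) + 1}\n{start} --> {end}\n{text}\n")
    return "\n".join(cues)
```

Calling `sbv_to_srt(open("talk.sbv", encoding="utf-8").read())` (the file name is only an example) returns the SRT text, which can then be written to a file with an ".srt" extension.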
To see the transcript of a YouTube video:
Turn on captions, if not already turned on. It's the "CC" icon at the bottom of the video.
Click the gear icon to adjust any available settings.
At the end of the line of icons with "like", "dislike", etc., you will see an ellipsis "…".
Click on this and select "show transcript".
It will appear to the right of the video.
Simply select and copy the text.
Re: Sub-to-SRT updated for YouTube transcripts
Posted: Sun Oct 09, 2022 8:01 am
by richmond62
One of the reasons you may find it difficult to make out what YouTubers are saying is because an awful lot of them speak through their fundament. LOL.
Re: Sub-to-SRT updated for YouTube transcripts
Posted: Sun Oct 09, 2022 9:43 am
by jameshale
It's not YouTubers, it's Kevin (that accent) and Mark (only speaks at high speed).
As for the LiveCoders themselves, sorry.
Re: Sub-to-SRT updated for YouTube transcripts
Posted: Sun Oct 09, 2022 5:40 pm
by jacque
It took me a couple of years to fully understand Kevin, but now I don't notice the accent any more. Mark was harder and took longer than that.
At the first conference in Edinburgh I remarked to Ollie (he was on the team at the time) that I had trouble understanding Mark and he said, "I have trouble understanding him and I'm from the same town!"
You are not alone.
Re: Sub-to-SRT updated for YouTube transcripts
Posted: Mon Oct 10, 2022 2:44 am
by bobcole
jameshale wrote:
...
Click on this and select "show transcript".
It will appear to the right of the video.
Simply select and copy the text.
I didn't know about the transcript feature of YouTube videos, thanks for pointing it out. Thanks also for pointing out the Sub-to-srt Sample Stack.
Unfortunately, when I tried to copy the transcript from YouTube, I was only able to select one line of caption at a time, without the timecode. Awkward and time consuming to get even a little of the transcript.
Is there an easy way to get the whole transcript at once, in the "sub" or "sbv" format? Perhaps a YouTube account is required (I don't have one at the moment)?
Having the text of the transcripts would be a wonderful enhancement to my learning resources.
Thanks,
Bob
Re: Sub-to-SRT updated for YouTube transcripts
Posted: Tue Oct 11, 2022 12:24 am
by jameshale
Yeah it is a strange thing.
Click on one line and that’s it. You cannot extend the selection.
BUT simply click AND drag down from the first line and the selection extends.
Once it has started, the complete transcript is selected. You do not need to drag all the way down.
Re: Sub-to-SRT updated for YouTube transcripts
Posted: Wed Oct 12, 2022 5:45 am
by bobcole
jameshale:
Thanks for the suggestion. I was able to select whole transcripts!
I reformatted them in BBEdit. The text is difficult to read without punctuation or capitalization and with a timecode at the start of each line. Also, I see a number of goofs in the transcriptions (e.g., "Web Camp" is sometimes transcribed as "webcam").
I am happy with the results however because, despite transcription errors, the text of videos can now be searched.
The plain text (UTF-8) files from BBEdit are fairly small: Web Camp Sessions 1 and 2 are 76 KB and 52 KB respectively.
Here is a sample from the first Web Camp video:
...
2:03 hello and uh very warm welcome to session one of webcamp thanks very much
2:08 for taking the time to join us today i'm kevin miller live code ceo
2:14 and i am truly delighted to be able to present our brand new and very exciting vision
2:20 for live code on the web in today's webinar i'll talk a bit about what that vision is
...
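Lines in the shape of the sample above carry only a start time, so turning them into SRT cues means inventing an end time. The Python sketch below is one way to do that, not a description of what the sub-to-srt stack does: it assumes each line starts with `M:SS` or `H:MM:SS` followed by the caption, ends each cue where the next one begins, and gives the last cue an arbitrary 4-second duration. All function names are hypothetical.

```python
import re

# Assumed line shape: optional hours, then minutes:seconds, then the caption.
LINE = re.compile(r"^\s*(?:(\d+):)?(\d{1,2}):(\d{2})\s+(.+?)\s*$")


def to_seconds(hours, minutes, seconds) -> int:
    """Turn the captured timecode parts into a number of seconds."""
    return int(hours or 0) * 3600 + int(minutes) * 60 + int(seconds)


def srt_time(total_seconds: int) -> str:
    """Format whole seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    hours, rest = divmod(total_seconds, 3600)
    minutes, seconds = divmod(rest, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},000"


def transcript_to_srt(text: str, last_duration: int = 4) -> str:
    """Build SRT text from 'timecode caption' lines (hypothetical helper)."""
    entries = []
    for line in text.splitlines():
        m = LINE.match(line)
        if m:  # lines without a leading timecode (such as "...") are skipped
            start = to_seconds(m.group(1), m.group(2), m.group(3))
            entries.append((start, m.group(4)))
    cues = []
    for i, (start, caption) in enumerate(entries):
        # Each cue ends when the next starts; the last one gets a default length.
        if i + 1 < len(entries):
            end = entries[i + 1][0]
        else:
            end = start + last_duration
        cues.append(f"{i + 1}\n{srt_time(start)} --> {srt_time(end)}\n{caption}\n")
    return "\n".join(cues)
```

The `last_duration` default is a guess; adjust it if the final caption disappears too quickly or lingers too long.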
I played with your sub-to-srt Sample Stack and worked through it to create srt files.
The srt files are recognized by the VLC app on my Mac. They don't play or do anything but I didn't expect they would.
I'm quite happy to have the plain text files.
Thank you for your guidance,
Bob
Re: Sub-to-SRT updated for YouTube transcripts
Posted: Wed Oct 12, 2022 6:01 am
by jameshale
When you say the srt files don’t play or do anything what do you mean?
If the srt file is in the same folder as the video file, you should be able to select the srt file when you play the video. If the video and srt file have the same name, VLC should find the srt file and show it automatically with the video.
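To make the naming rule concrete: the subtitle file sits next to the video and shares its base name, differing only in the extension. A small Python sketch of that check follows; the helper names are hypothetical, and the example paths are made up.

```python
from pathlib import Path


def matching_srt_path(video_path: str) -> Path:
    """Return the srt path with the same folder and base name as the video."""
    return Path(video_path).with_suffix(".srt")


def has_matching_srt(video_path: str) -> bool:
    """True if an srt file with the video's base name exists beside it."""
    return matching_srt_path(video_path).is_file()
```

For a video saved as `Web Camp 1.mp4`, `matching_srt_path` gives `Web Camp 1.srt` in the same folder, which is the name to give the converted file.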
Re: Sub-to-SRT updated for YouTube transcripts
Posted: Wed Oct 12, 2022 6:11 pm
by bobcole
jameshale:
My initial goal was just to get the text so I could conduct searches on various terms.
Following your suggestion, I downloaded the video file and put the srt file in the same folder. The transcript displays perfectly.
Thank you for your good advice.
Bob
Re: Sub-to-SRT updated for YouTube transcripts
Posted: Wed Oct 12, 2022 6:46 pm
by jameshale
Re: Sub-to-SRT updated for YouTube transcripts
Posted: Wed Oct 12, 2022 7:11 pm
by jacque
BTW if you only want the text without the time codes (not for use with VLC) then BBEdit can strip those out with a regular expression. Let us know if that's something you want.
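As a rough idea of what such a pattern could look like, here is a Python sketch rather than a BBEdit recipe. It assumes every line begins with a timecode like `2:03` or `1:02:03` followed by a space, as in Bob's sample; the names `TIMECODE` and `strip_timecodes` are invented for the example.

```python
import re

# Leading timecode (M:SS or H:MM:SS) plus the spaces or tabs after it.
TIMECODE = re.compile(r"^[ \t]*(?:\d+:)?\d{1,2}:\d{2}[ \t]+", re.MULTILINE)


def strip_timecodes(text: str) -> str:
    """Remove the leading timecode from each line, keeping the caption text."""
    return TIMECODE.sub("", text)
```

Lines that do not start with a timecode are left untouched, so running it twice does no harm.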
Re: Sub-to-SRT updated for YouTube transcripts
Posted: Wed Oct 12, 2022 8:01 pm
by bobcole
Jacque:
I think I’ll keep the time codes in my text files to identify the locations in the videos. In case I want to view a video I’ll know where to look.
Thanks for the idea though.
Bob
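With the time codes kept at the start of each line, a search can report where in the video a term comes up. A minimal Python sketch, assuming the same "timecode then text" line shape as the sample earlier in the thread; `find_term` is a hypothetical name, and the match is a plain case-insensitive substring test.

```python
import re

# Capture the leading timecode (M:SS or H:MM:SS) and the caption text.
LINE = re.compile(r"^\s*((?:\d+:)?\d{1,2}:\d{2})\s+(.+)$")


def find_term(transcript: str, term: str):
    """Return (timecode, caption) pairs for lines containing the term."""
    needle = term.casefold()
    hits = []
    for line in transcript.splitlines():
        m = LINE.match(line)
        if m and needle in m.group(2).casefold():
            hits.append((m.group(1), m.group(2).strip()))
    return hits
```

Because the automatic transcription sometimes mishears words, it may be worth searching for likely misspellings as well (for example both "webcamp" and "webcam").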