digip Posted May 11, 2007
OK. I was thinking about a program that could take a video or song and turn the audio into a text file. Then I thought: voice recognition software already kind of does this. Vista (and other OSes) can do voice recognition, and the exploits people have tried are well known, such as sending commands to people through MP3s. But what I was thinking is that Vista understands these commands, which means it must be able to convert speech to text, just like when a handicapped person gives oral dictation in Word. So my question is: how hard is it to make this work for you, to create subtitles for a film or dump the lyrics of a song to a text file? There should be a way to have Vista (or any other OS) create speech-to-text files on the fly. If someone can find a way to make a modded VB app to take advantage of this in Vista, then theoretically you could have it dump the audio from an episode of Hak5 to a text file, and with a little counter system, time stamp each piece so it works like an .srt file to be merged with the video. It could then be translated into other languages so people from around the world could add subtitles to the episodes of Hak5.
Shaun Posted May 11, 2007
Isn't that kind of like the PodZinger thing that they interviewed some guys from in one of the Hak.5 episodes? It recognises the speech in podcasts and transcribes it so you can search the actual podcast content. I'd say that could easily be used for film, perhaps less so for music, since it then has to distinguish between music and lyrics, plus there are the problems of multiple singers and backup singers, and of recognising words that are pronounced strangely to make them fit the music.

Edit: Although it's not that good, here are a few bits from 2x10:

"...some kind of disk imaging software I'd chose Acronis true image nine point one. Let anything just mentioned Maureen Meehan like all of that is homily sailing -- reward lots it is and others partition..."

"...got -- org slash Wiki you can email me directly west of Hak five dot org. Now coming up we're going to have a prerecorded segment with our whole body move aches and his US..."

"...say. Almost looks militarily. Weren't able to use that link slipped one point yeah but it wasn't as easy wasn't disease or is fun because we have gradient around corners -- We'll instantly very cool..."
digip Posted May 11, 2007
I have used PodZinger a few times to find specific segments, like the SSH tunneling episode. That's sort of what I was thinking, only I want it to generate the text to a file while the video plays, so it can be saved and translated later. PodZinger is cool for searching through segments and finding what you want, but what I am talking about is getting a fully transcribed text file generated from the video's audio feed. If PodZinger has full text transcripts of the show, the only thing left would be to time stamp them in SRT format and then translate them into other languages to be merged as subtitles, but I have yet to see where it gives a full transcript of the show. Any idea what underlying software transcribes the audio for PodZinger?
Shaun Posted May 11, 2007
Yeah, I know what you mean; I was just using PodZinger as an example of the technology. Obviously PodZinger does have full transcripts of the show, since otherwise it wouldn't be very useful for searching podcast content, but whether they will give them to people is another matter (probably not).
digip Posted May 11, 2007
I use VirtualDub to merge pre-made SSA files (converted from SRT to SSA, then run through VirtualDub to merge the subtitles), but it would be nice if it had a plugin to write the SRT/SSA file from the audio track of the video. That way it would time stamp each line in the correct spot to line up with the video. The text could then be translated using something like Google, and imported and merged back in with the desired language for the video. (/me needs to stop writing run-on sentences)

Anywho, just throwing out some ideas. If anyone knows of any speech-to-text programs that will capture from a video, or combinations of plugins that might do this, please post the links here. I have seen some paid speech-to-text programs, but they all seem to be based on standard voice recognition programs, and I don't need Darren or someone spouting out commands and my PC going haywire executing them...
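For anyone curious what the SRT-to-SSA step involves on the timing side, here is a minimal sketch in Python. It is not what any particular converter tool does internally; it only assumes the usual timestamp layouts, `HH:MM:SS,mmm` for SRT and `H:MM:SS.cc` (centiseconds) for SSA. The function name `srt_time_to_ssa` is made up for illustration.

```python
import re


def srt_time_to_ssa(srt_time: str) -> str:
    """Convert an SRT timestamp (HH:MM:SS,mmm) to SSA style (H:MM:SS.cc).

    Assumption: SSA timestamps carry centiseconds, so the last
    millisecond digit is dropped (truncated, not rounded).
    """
    match = re.fullmatch(r"(\d+):(\d{2}):(\d{2}),(\d{3})", srt_time.strip())
    if match is None:
        raise ValueError(f"not an SRT timestamp: {srt_time!r}")
    hours, minutes, seconds, millis = (int(group) for group in match.groups())
    centis = millis // 10  # truncate so the value can never roll over to 100
    return f"{hours}:{minutes:02d}:{seconds:02d}.{centis:02d}"


if __name__ == "__main__":
    print(srt_time_to_ssa("00:01:02,345"))
```

A full converter would also have to write the SSA header and wrap each line of text in a dialogue event; this only covers the timestamps.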
digip Posted May 11, 2007
Just a reminder, the Wiki is back up: http://wiki.hak5.org/wiki/Episode_Subtitles
deleted Posted May 11, 2007
This would be really simple. Although, my way would involve a Jack to Jack Cord.
digip Posted May 11, 2007
deleted said: "This would be really simple. Although, my way would involve a Jack to Jack Cord."

I assume you mean out of the sound card and back in? That works as long as it records only the video's audio and no other source; otherwise you get stereo feedback. You could of course go PC to PC, and just use voice recognition to record it from one PC to the other as text, but there would be a lot of manual setup involved, and I was looking more for an app or plug-in to do the whole process on the fly.

See, you can script a media player control into a VB app with Visual Basic, and I know you can do text to speech, but I haven't found any source code to do speech to text in a Visual Basic app. If it is possible, and I think it is, what I would do is have the app start the video and then create the text in a text box control on the form, with a time stamp every time there is new audio. So if there is silence, it starts counting, and when it hears audio, it places a time stamp at the beginning of the sentence. Then, when done, you can save it as an SRT file, which can be used as a subtitle or imported into an SRT converter to make an SSA subtitle file to merge in VirtualDub. Either way, you would have the text from the show, and it could easily be translated into other languages once it has been transcribed and time stamped.

Anyone have any ideas on how to do this? I think it would be a good piece of software if someone could make it work.

This is an example of what I would be doing if I get all the parts worked out. It's basic, and all it would do is load a video, play it, and record the audio to text in the text window with a time stamp; then you can dump it to an SRT file or whatever you want to do with it. This is a very basic concept, and I understand it would probably require some sort of software programming beyond a simple voice recognition plugin, but hey, it's a start, or a concept of what I want to do.
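The time-stamping part of the idea above (count through silence, stamp when audio starts, dump to SRT) can be sketched roughly as follows. This is Python rather than VB, it is not the app described in the post, and it does no speech recognition at all: it assumes something else has already decided, frame by frame, whether audio was heard, and that the recognized text arrives separately. All names here (`speech_spans`, `build_srt`, `format_srt_time`) are hypothetical.

```python
def format_srt_time(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def speech_spans(activity, frame_seconds):
    """Group consecutive 'audio heard' frames into (start, end) spans.

    `activity` is a sequence of booleans, one per fixed-length frame
    (an assumed input; a real app would derive it from audio levels).
    """
    spans = []
    start = None
    for index, heard in enumerate(activity):
        if heard and start is None:
            start = index * frame_seconds  # audio begins: place a time stamp
        elif not heard and start is not None:
            spans.append((start, index * frame_seconds))  # silence: close span
            start = None
    if start is not None:  # audio ran to the end of the clip
        spans.append((start, len(activity) * frame_seconds))
    return spans


def build_srt(cues):
    """Render (start_seconds, end_seconds, text) tuples as SRT text."""
    blocks = []
    for number, (start, end, text) in enumerate(cues, start=1):
        timing = f"{format_srt_time(start)} --> {format_srt_time(end)}"
        blocks.append(f"{number}\n{timing}\n{text}\n")
    return "\n".join(blocks)


if __name__ == "__main__":
    spans = speech_spans([False, True, True, False, True], frame_seconds=0.5)
    lines = ["first recognized sentence", "second recognized sentence"]
    cues = [(start, end, text) for (start, end), text in zip(spans, lines)]
    print(build_srt(cues))
```

Keeping the span detection separate from the SRT writer means the recognizer could be swapped out later without touching the subtitle output.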
Shaun Posted May 11, 2007
deleted said: "This would be really simple. Although, my way would involve a Jack to Jack Cord."

In that case your way is stupid.
digip Posted May 18, 2007
Well, if anything, it's worth a good laugh to see what the speech recognition will come up with. The following is taken from a clip of Wes going over how to use a laser pointer to send audio to a stereo. It's almost gibberish, and unrecognizable. I have to laugh as I watch the video and read along with what it is typing out on the screen, though. Funny stuff.

I needed Evelyn battery that have of user: NT my in the wall ones R's Anne Darcy were certainly I had I had anyone with a envoy-mile at my house was your yellow sawmills lesson on the side of Allied attack of this because the U.S. wages I knew I'd suggest and in the floss on this man's original design was that I want so why the medical lasers just all in all his illness going on on a handling of new name on it days/on off-the the family had to do is wrong or wires in different directions and recover Bracken's tradeoff of the outline of the battery pack your time and then this of this disaster accolades India laser pointer itself to serve the time being will just have trusted on right here on Asahara is connections here shortly that distance tainan meant anywhere that vision of the eyes of events today-music to pay attention to how that is. One major player (say this is a direct line then from there with Internationale have runs to one side of our audience transform the right hand side that and then the other native inspiration from the battery pack to transform you will distresses connections and I have a 10ms in the NASA can take a lot more time-made needs to do not a lot has given me and I boxed radio shack in the middle of pre-villages that have always had sailing in this way

This is just too funny. I wonder what it would do if I gave it a speech by G.W. Bush? I can barely understand half of his ramblings as it is, so it's sure to be funny what the computer interprets his words as.
cooper Posted May 18, 2007
Which software did you use to transcribe that? The Speech Recognition Howto refers to a number of voice recognition programs out there. No idea which one is best.
digip Posted May 18, 2007
Thanks Cooper, but I am using WinXP at the moment. Haven't tried it in Linux. In fact, I haven't ever watched a video using Linux, so I guess it's a good time to start fiddling around with it...
l3db3tt3r Posted May 25, 2007
I read the first post and no more. My response is: metaphor and simile cannot be translated. Language is a construct of culture (and more). In English we understand how some words can mean other things (think "rod"), but in other languages you will not get the metaphor or simile that the word represents. And it doesn't stop at words with multiple meanings; think of how many word combinations or phrases have alternative meanings. It's endless. So we can do our best, but the true problem is getting a program to understand the context of the whole. (Oh, and then add on things like sarcasm.)
Shaun Posted May 25, 2007
Huh? Metaphor and simile can often be translated, especially simile. Where they can't, another phrase with a similar meaning can be used in its place. I'm not really sure what your argument is, since obviously hundreds of movies and TV shows are subtitled in other languages every day.