
automation, trailer.mp4 download from imdb.com


i8igmac

I have an organized list of movies, and I have automated the movie cover download, the actor image download and the movie description document...

I'm compiling this info into a website running on localhost. The most important feature is the trailer src, and I struggle to automate this download... IMDb provides very nice trailers that I hope to download, OR I could just use the page/script source...

I can engineer a proper GET request for a single download, but I notice the src location is not consistent...

Could someone look at the page source of an IMDb trailer? I don't have the JavaScript skill to defeat the security they use to prevent this...

I'm open to ideas. The iframe src could be the main page, but this is sloppy and I want to isolate the video only....


You could use something like Wireshark to find the URL for the MP4 files, but if they are coming from RTSP streaming servers and are not downloadable raw files (which they usually aren't), you need a program that can stream the video data to disk if you want to save them locally. They could also be FLV files, but you need to determine whether it's a stored file or a streamed file, which are two different things.

Network Miner can also save files locally when you view a site, like images and some video and audio files of known file types: http://www.netresec.com/?page=NetworkMiner

It stores them in a local folder based on the sites you visit, so if the file isn't streamed, you can potentially pull the trailers just by visiting the page and watching them.
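As a first rough check of stored versus streamed, the URL scheme seen in a capture can serve as a hint. The sketch below only illustrates that idea: the helper name `delivery_kind` and the example.com URLs are made up here, and an http URL can still turn out to be a stream, so treat the result as a starting point rather than an answer.

```ruby
require 'uri'

# Hypothetical helper: guess how a media URL is delivered from its scheme.
# rtmp/rtsp URLs need a stream dumper; http(s) URLs can usually be fetched directly.
def delivery_kind(url)
  scheme = URI.parse(url).scheme.to_s.downcase
  case scheme
  when /\Artmp/, 'rtsp' then :streamed   # rtmp, rtmpe, rtmps, rtsp
  when 'http', 'https'  then :direct
  else :unknown
  end
end

# Placeholder URLs, not taken from a real capture.
puts delivery_kind("rtsp://example.com/trailer")
puts delivery_kind("http://example.com/trailer.mp4")
```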

Edited by digip

(first post was from a droid so it was quick; now I have an example to share)

I have been using Wireshark, tcpick and Burp to investigate my way through the traffic, and this is a working download request... you can try it if you like or take my word for it...

nc progressive.totaleclips.com.edgesuite.net 80 > out.mp4

GET /127/e12782_301.mp4?eclipId=e12782&bitrateId=471&vendorId=102&type=.mp4&sp_ubid=746-5916787-1173752 HTTP/1.1
Host: progressive.totaleclips.com.edgesuite.net
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:16.0) Gecko/20100101 Firefox/16.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: keep-alive
Referer: http://www.imdb.com/images/js/app/video/mediaplayer.swf

(working on some examples for another reply)


GET /127/e12782_301.mp4?eclipId=e12782&bitrateId=471&vendorId=102&type=.mp4&sp_ubid=746-5916787-1173752 HTTP/1.1
Host: progressive.totaleclips.com.edgesuite.net

is the same as

http://progressive.totaleclips.com.edgesuite.net/127/e12782_301.mp4?eclipId=e12782&bitrateId=471&vendorId=102&type=.mp4&sp_ubid=746-5916787-1173752

Based on your own capture, your file is http://bit.ly/WwXJad
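The same conversion can be sketched in Ruby for any captured request. This is only an illustration: the helper name `url_from_request` is made up, and it assumes plain HTTP with the host taken from the Host header, as in the capture above.

```ruby
# Rebuild a plain URL from a captured HTTP request (request line + Host header).
# Assumes plain HTTP, as in the capture above; returns nil if parts are missing.
def url_from_request(raw_request)
  lines = raw_request.lines.map(&:strip)
  return nil if lines.empty?
  method, path, _version = lines.first.split(" ", 3)
  host_line = lines.find { |l| l =~ /\AHost:/i }
  return nil unless method == "GET" && path && host_line
  host = host_line.split(":", 2).last.strip
  "http://#{host}#{path}"
end

request = <<~REQ
  GET /127/e12782_301.mp4?eclipId=e12782&bitrateId=471&vendorId=102&type=.mp4&sp_ubid=746-5916787-1173752 HTTP/1.1
  Host: progressive.totaleclips.com.edgesuite.net
REQ

puts url_from_request(request)
```

The resulting URL can then be handed to wget or a browser instead of replaying the request through nc.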

Edited by digip

Please share when you do so! I'd be interested in a script/local website like this for my movie collection.


#will get trailers....
#depends on rtmpdump and wget (apt-get install rtmpdump wget)
#rule for this to work: each folder must use the proper movie name as listed below; these names are an exact match of the IMDb title

#/media/500_gig/movies/21 jump street (2012)/movie_file.avi     <--------  GOOD
#/media/500_gig/movies/21_jump_street_xvid_crap/movie_file.avi  <---   BAD

#example
#    ls /media/500_gig/movies/
#		21 Jump Street (2012)
#		antitrust (2001)
#		Avatar (2009)
#		Basketball diaries (1995)
#		be kind rewind (2008)
#		blank check (1994)
#		blow (2001)
#		buffalo soldiers (2001)

#run this script from any directory... the destination directory must be changed below
#sudo ruby get_trailer.rb "movie name (2000)"

#sudo is needed to write data to the hard drive

require 'socket'
require 'cgi'
puts movie_name=ARGV[0]
dst_dir="/media/6E88F3A627ADD9B7/movies/#{movie_name}/"    #-          <--------change this
movie_name=movie_name.gsub(" ","+").chomp



s=TCPSocket.open("www.imdb.com",80)
s.print("GET /find?q=#{movie_name} HTTP/1.0\r\n\r\n")
buff=""
while line=s.gets
	buff<<line
end
s.close

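#gather movie_home link
buff=buff.gsub('"',"")
ping=buff.index("/title/")
if ping==nil
	puts "EXIT: no /title/ link found in the search results"
	exit 1    # stop here, the requests below need the title id
end
movie_home=buff[ping..ping+16]    # IFRAME home page / root page, crawl from this starting point
tt=buff[ping+7..ping+15]          # title id, e.g. tt1232829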



s=TCPSocket.open("www.imdb.com",80)
buff1=""
s.print("GET /title/#{tt}/ HTTP/1.0\r\n\r\n")
while line=s.gets
buff1<<line
end
s.close

# String#[] with a regex returns the first match (Array#to_s no longer joins elements on Ruby 1.9+)
image_link=buff1[/media.rm.*./].to_s[0..26]  # media/rm871673856/tt1232829
rm=image_link[/\/.*.\//].to_s                # /rm871673856/



buff2=""
s=TCPSocket.open("www.imdb.com",80)
s.print("GET /media#{rm}#{tt}/ HTTP/1.0\r\n\r\n")
while line=s.gets
buff2<<line
end
s.close



if ping=buff1.index("video/imdb/vi")
double_trailer_prevent=1
puts trailer_home=buff1[ping..ping+28]
trailer_home=trailer_home[/video.imdb.vi.*.\//].to_s    # keep a String so it interpolates cleanly into the request



payload="GET /#{trailer_home}player?stop=0 HTTP/1.1
Host: www.imdb.com
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:16.0) Gecko/20100101 Firefox/16.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Proxy-Connection: keep-alive

"

	buff3=""
	s=TCPSocket.open("www.imdb.com",80)
	s.print(payload)
	while line=s.recv(5000)
	break if line.empty?    # recv returns "" once the server closes the connection
	buff3<<line
	if buff3.include?("</html>")
	break
	end
	end
	s.close


	buff3=buff3.gsub('"',"")
	ping=buff3.index("so.addVariable(file, ")
	pong=buff3.index(");",ping)
	v_file=buff3[ping+21..pong-1]
	v_file=CGI.unescape(v_file)

	if v_file.include?("rtmp")
	ping=buff3.index("so.addVariable(id, ")
	pong=buff3.index(");",ping)
	v_id=buff3[ping+19..pong-1]
	v_id=CGI.unescape(v_id)
	
	q='"'
	puts"\n"
	system("rtmpdump -r #{q}rtmp://amazonimdb.fcod.llnwd.net/a2643#{q} -a #{q}a2643#{q} -f #{q}LNX 11,2,202,243#{q} -W #{q}http://www.imdb.com/images/js/app/video/mediaplayer.swf#{q} -p #{q}http://www.imdb.com#{q} -y #{q}#{v_id}#{q} -o '#{dst_dir}trailer.flv'")
	end
end



if ping=buff1.index("video/screenplay/vi")
	if double_trailer_prevent==1
		puts "double file download attempt"
		exit
	end

puts trailer_home=buff1[ping..ping+30]
trailer_home=trailer_home[/video.screenplay.vi.*.\//].to_s    # keep a String so it interpolates cleanly into the request

payload="GET /#{trailer_home}player?stop=0 HTTP/1.1
Host: www.imdb.com
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:16.0) Gecko/20100101 Firefox/16.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Proxy-Connection: keep-alive

"

	buff3=""
	s=TCPSocket.open("www.imdb.com",80)
	s.print(payload)
	while line=s.recv(5000)
	break if line.empty?    # recv returns "" once the server closes the connection
	buff3<<line
	if buff3.include?("</html>")
	break
	end
	end
	s.close


	buff3=buff3.gsub('"',"")
	ping=buff3.index("so.addVariable(file, ")
	pong=buff3.index(");",ping)
	v_file=buff3[ping+21..pong-1]
	v_file=CGI.unescape(v_file)

	
	
	if v_file.include?("http")
	puts"\n"
	system("wget '#{v_file}' -O '#{dst_dir}trailer.flv'")
	end
end

So, it's ugly... don't judge me... it was successful about 95% of the time (wrong name = fail, or the trailer does not exist)
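Since a wrong name is the main failure mode, one thing that may be worth trying is escaping the whole name instead of only replacing spaces, so that characters such as the parentheses around the year are encoded too. The helper name `search_path` is made up for this sketch, and whether it actually improves the match rate on the search page is untested.

```ruby
require 'cgi'

# Hypothetical helper: build the /find request path for a folder name.
# CGI.escape turns spaces into "+" and percent-encodes characters such as "(" and ")".
def search_path(movie_name)
  "/find?q=#{CGI.escape(movie_name.strip)}"
end

puts search_path("21 Jump Street (2012)")
```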

There is no error checking... now, processing a whole list will take another small script...
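A little error checking could start with the so.addVariable lookup, which currently raises when the variable is missing from the page. The sketch below is an assumption-heavy illustration: the helper name `player_variable` is made up, and the page format is inferred from the strings the script searches for, not from a fresh capture.

```ruby
require 'cgi'

# Hypothetical helper: pull the value of so.addVariable("name", "value") out of
# the player page. Returns nil instead of raising when the variable is absent.
def player_variable(html, name)
  pattern = /so\.addVariable\(\s*"?#{Regexp.escape(name)}"?\s*,\s*"?([^")]*)"?\s*\)/
  match = html.match(pattern)
  match && CGI.unescape(match[1])
end

# Made-up sample in the shape the script expects, not a real capture.
sample = 'so.addVariable("file", "http%3A%2F%2Fexample.com%2Ftrailer.flv");'
file = player_variable(sample, "file")
if file.nil?
  puts "no file variable found, skipping"
else
  puts file
end
```

Returning nil lets the caller skip a movie and carry on with the rest of the list instead of aborting.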

irb mode...

data=`ls /media/500_gig/movies/`
data.each_line do |movie_name|
system("ruby", "get_trailer.rb", movie_name.chomp)   # pass the name as its own argument, no shell quoting needed
end
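A slightly more defensive version of the loop could skip folders that break the naming rule and folders that already have a trailer. This is only a sketch: the helper names are made up, the root path is the one from the example listing, and it assumes trailers are saved as trailer.flv inside each movie folder.

```ruby
# Hypothetical check for the "Title (Year)" folder naming rule the script relies on.
def valid_movie_folder?(name)
  !!(name =~ /\A.+ \(\d{4}\)\z/)
end

# Keep only sub-directories whose names follow the rule.
def movie_folders(root)
  Dir.children(root).select do |entry|
    File.directory?(File.join(root, entry)) && valid_movie_folder?(entry)
  end.sort
end

movies_root = "/media/500_gig/movies"   # path taken from the example listing
if Dir.exist?(movies_root)
  movie_folders(movies_root).each do |movie_name|
    # Assumption: the trailer lives next to the movie file as trailer.flv.
    next if File.exist?(File.join(movies_root, movie_name, "trailer.flv"))
    # Passing arguments separately avoids shell quoting problems with names.
    system("ruby", "get_trailer.rb", movie_name)
  end
end
```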

Now I hope to get some help with a template for the site... I just want to scroll through a list of images like Netflix... can someone contribute?

I'm very new to building web pages... so some decent example code would be appreciated...

Edited by i8igmac
