Python Web Parsing Help


I'm working on a little project to retrieve videos from certain sites. I'm trying to find out how to save a video file, and whether I am parsing the HTML correctly. I am new to this side of Python.

import urllib
import re

url = "https://www.videourl"
htmlfile = urllib.urlopen(url)
htmltext = htmlfile.read()
regex = '<meta property="og:video" content="http://www.urlinpagesource;version=3">'
video = re.compile(regex)

  • 2 months later...

Well, first off, that's not how you use regex:

import re

my_string = 'Test string to parse'

# Either pre-compile the regex if you are going to reuse it
regex = re.compile('Test')
m = regex.match(my_string)

# Or just use a regex once (patterns are case-sensitive by default)
m = re.match('Test', my_string)

but apart from that, you do not want to be parsing HTML with regular expressions. Take a look at BeautifulSoup (http://www.crummy.com/software/BeautifulSoup/), which will make your life a lot easier.
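
To give a rough idea of what parsing instead of regex matching looks like, here is a minimal sketch. It uses the standard library's html.parser rather than BeautifulSoup, purely so that it runs without installing anything, and it is written for Python 3 (urllib.request instead of the Python 2 style urllib.urlopen in the question). The names OgVideoParser, find_og_video_urls, save_stream and download_video are made up for this example. It also assumes that the og:video value points at something directly downloadable, which may not hold for every site, since that tag sometimes holds a player URL rather than a raw video file.

```python
from html.parser import HTMLParser
import shutil
import urllib.request


class OgVideoParser(HTMLParser):
    """Collects the content attribute of every <meta property="og:video"> tag."""

    def __init__(self):
        super().__init__()
        self.video_urls = []

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs
        if tag != "meta":
            return
        attr_map = dict(attrs)
        if attr_map.get("property") == "og:video" and attr_map.get("content"):
            self.video_urls.append(attr_map["content"])


def find_og_video_urls(html_text):
    """Return all og:video URLs found in the given HTML text."""
    parser = OgVideoParser()
    parser.feed(html_text)
    parser.close()
    return parser.video_urls


def save_stream(source, destination_path):
    """Copy a binary file-like object to disk in chunks."""
    with open(destination_path, "wb") as out_file:
        shutil.copyfileobj(source, out_file)


def download_video(page_url, destination_path):
    """Fetch a page, take its first og:video URL, and save that resource."""
    with urllib.request.urlopen(page_url) as response:
        # Assumption: the page is UTF-8; undecodable bytes are replaced
        html_text = response.read().decode("utf-8", errors="replace")
    urls = find_og_video_urls(html_text)
    if not urls:
        raise ValueError("no og:video meta tag found")
    with urllib.request.urlopen(urls[0]) as video_response:
        save_stream(video_response, destination_path)
    return urls[0]


# Hypothetical usage (placeholder URL and file name, not from the question):
# download_video("https://example.com/some-page", "video.bin")
```

The parsing and the saving are kept in separate functions so you can check each one on its own: feed find_og_video_urls a saved copy of the page source first, and only then point download_video at the live site.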
