Python — StackOverflow Post Checker / Updater


I’m a little obsessed with constantly refreshing Stack Overflow.

I couldn’t find an API for it (I’ve since found the Stack Exchange API, oops), so I wrote a little Python script to parse the site for new questions as an exercise.

I built a fetcher class that creates an SQLite database (if one doesn’t exist), parses Stack Overflow for questions related to Django, and checks whether any are new (based on whether the URL is already stored in the database). If it finds a new one, it sends a Growl notification.
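The "is this URL new?" check is the heart of the idea, so here is a minimal Python 3 sketch of just that step. It is not the script from this post: the `SeenLinks` class and `record_if_new` method are hypothetical names, the database lives in memory to keep the example self-contained, and it leans on a `PRIMARY KEY` plus `INSERT OR IGNORE` rather than a separate `SELECT`.

```python
import sqlite3


class SeenLinks:
    """Track which question URLs have already been seen (hypothetical helper)."""

    def __init__(self, path=":memory:"):
        # ":memory:" keeps the sketch self-contained; pass a file path to persist.
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS questions (link TEXT PRIMARY KEY, text TEXT)"
        )

    def record_if_new(self, link, text):
        """Store the question and return True only if the link was not stored yet."""
        cursor = self.conn.execute(
            "INSERT OR IGNORE INTO questions (link, text) VALUES (?, ?)",
            (link, text),
        )
        self.conn.commit()
        # rowcount is 1 when a row was inserted and 0 when the insert was ignored.
        return cursor.rowcount == 1


seen = SeenLinks()
first = seen.record_if_new("/questions/1/example", "Example question")
second = seen.record_if_new("/questions/1/example", "Example question")
print(first, second)
```

Folding the check and the insert into one statement means a link can never be recorded twice, and the `?` placeholders keep quotes in question titles from breaking the SQL.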

It’s extremely straightforward, but it shows you the power of Python: from idea to implementation in no time at all, including figuring out how Growl’s Python bindings work.

Simply amazing!

"""
Stack Overflow Post Checker

Parse stackoverflow HTML for questions, store in sqlite database
and send notifications of new questions.

By Yuji Tomita
"""
import os
import sqlite3
import urllib2
import BeautifulSoup
import Growl

class StackOverflowFetcher:
    def __init__(self):
        self.base_url = ''
        self.growl = Growl.GrowlNotifier(applicationName='StackOverflowChecker', notifications=['new'])
        self.tags = [('django', True), ('python', False)]
    def get_questions(self):
        """Parse target URL for new questions."""
        while self.tags:
            tag, sticky = self.tags.pop()
            url = self.base_url + tag
            html = urllib2.urlopen(url).read()
            soup = BeautifulSoup.BeautifulSoup(html)
            questions = soup.findAll('h3')
            for question in questions:
                element = question.find('a')
                link = element.get('href')
                question = element.text
                if self.is_new_link(link):
                    self.growl.notify(noteType='new', title='[%s] StackOverflow Post' % tag, description=question, sticky=sticky)
                    self.record_question(link, question)
    def get_or_create_database(self):
        """
        Check if database file exists. Create if not.
        Open file and send query.
        If query fails, create tables.
        """
        path = os.path.join(os.path.dirname(__file__), 'questions.db')
        try:
            f = open(path)
        except IOError:
            f = open(path, 'w')
        f.close()
        self.conn = sqlite3.connect(path)
        try:
            self.conn.execute('SELECT * from questions')
        except sqlite3.OperationalError:
            self.create_database()
    def create_database(self):
        self.conn.execute('CREATE TABLE questions(link VARCHAR(400), text VARCHAR(300));')
    def is_new_link(self, link):
        # Parameterized query: the column name was missing and quotes in a link could break the SQL.
        results = self.conn.execute('SELECT * FROM questions WHERE link = ?;', (link,)).fetchall()
        if not results:
            return True
        return False
    def record_question(self, link, question):
        # Parameterized insert so quotes in question titles are stored safely.
        self.conn.execute('INSERT INTO questions(link, text) VALUES (?, ?);', (link, question))
        self.conn.commit()
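    def close_connection(self):
        # Save any pending inserts before closing.
        self.conn.commit()
        self.conn.close()

stack = StackOverflowFetcher()
stack.get_or_create_database()
stack.get_questions()
stack.close_connection()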

This code is now on GitHub, where it will be kept more up to date.
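The listing above targets Python 2 (`urllib2` and the old `BeautifulSoup` module). As a rough sketch of the same "links inside `h3` headings" extraction on Python 3, here is a version using only the standard library's `html.parser`. The `QuestionLinkParser` class and the sample HTML are made up for illustration, and the real page markup may well differ. Unlike the script, which takes the first link in each `h3`, this collects every link it finds inside one.

```python
from html.parser import HTMLParser


class QuestionLinkParser(HTMLParser):
    """Collect (href, text) pairs for links found inside <h3> elements."""

    def __init__(self):
        super().__init__()
        self.questions = []
        self._in_h3 = False
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "h3":
            self._in_h3 = True
        elif tag == "a" and self._in_h3:
            # Start collecting text for this link; links without href are skipped.
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.questions.append((self._href, "".join(self._text).strip()))
            self._href = None
        elif tag == "h3":
            self._in_h3 = False


# Hypothetical markup standing in for a downloaded questions page.
SAMPLE_HTML = """
<div><h3><a href="/questions/1/first">First question</a></h3></div>
<div><h3><a href="/questions/2/second">Second &amp; last</a></h3></div>
<p><a href="/about">Not a question</a></p>
"""

parser = QuestionLinkParser()
parser.feed(SAMPLE_HTML)
parser.close()
questions = parser.questions
print(questions)
```

Each `(href, text)` pair can then go through the same new-link check and notification step as in the script.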


19 thoughts on “Python — StackOverflow Post Checker / Updater”

  1. OK; when I run this, the text column is not created, which implies the question is not recovered from the screen scrape. I changed line 30 to read
    question = element.renderContents()
    and that seems to work.

    PS Any reason you chose “stack_overflow.db” as the name? The SQLite plugin for Firefox uses “.sqlite” as the default extension it looks for (which one can obviously change, of course…)

  2. I have made a small modification to run this under Ubuntu 10.04. All the references to Growl can be removed and line 35 can be replaced with:
    os.system('notify-send "%s" -t 1' % question)

    1. And another small modification…

      os.system('notify-send "Django on StackOverflow" "%s" -t 1 -i /user/local/share/icons/django.png' % question)

      will create an alert with a suitable title, and a neat icon to the left of the message. (You will need to download and save your own icon in a suitable location.)

  3. Hey Derek,

    Thanks for the feedback! The method name has been updated.

    As for stack.db, I think I picked up the habit arbitrarily from naming my databases db_name, and it was sprint-style coding : )

    “Hmm, I really hate refreshing stack.. time to write something!”

    I haven’t gotten around to the stackexchange api yet, but I do use BeautifulSoup all the time. I swear you can get to parsing what you need with a 3 minute python shell session and experimentation.

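For the `notify-send` variant suggested in the comments above, question titles that contain quotes can break the shell command built with `os.system`. A possible Python 3 sketch that sidesteps this by passing the arguments as a list is shown below. The helper names `build_notify_command` and `notify` are hypothetical, and it assumes `notify-send` accepts `-t` (expire time in milliseconds) and `-i` (icon path), as in the comments.

```python
import shutil
import subprocess


def build_notify_command(title, body, icon=None, timeout_ms=1):
    """Return the notify-send argument list for one notification."""
    command = ["notify-send", "-t", str(timeout_ms)]
    if icon:
        command += ["-i", icon]
    # Title and body are separate list items, so no shell quoting is needed.
    command += [title, body]
    return command


def notify(title, body, icon=None):
    """Send the notification if notify-send is installed; return True on success."""
    if shutil.which("notify-send") is None:
        return False
    result = subprocess.run(build_notify_command(title, body, icon), check=False)
    return result.returncode == 0


command = build_notify_command(
    "Django on StackOverflow", 'Why does "%s" break my query?'
)
print(command)
```

Calling `notify("Django on StackOverflow", question)` in place of the Growl call would then work the same way whether or not the title contains quotes.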
