In this tutorial I’ll show you how to create a simple, yet powerful,
tagging system using SQLAlchemy with TurboGears. As the concept of
tags, and social tagging in particular, have become very popular,
clients now demand “tagging-enabled” applications. So, here’s a simple
way to get you started.
Our application will associate sites with tags (many to many
relationship), like delicious does, but in a much simplified manner.
For instance, delicious keeps tracks of which user gave which tag to
which URL. We will only associate sites with tags. But it will be very
easy to add this functionality later.
We’ll quickstart a new project (the -s argument tells tg-admin that the project will use SQLAlchemy and not SQLObject)
Code (text)
$ tg-admin quickstart -s tags
Enter package name [tags]:
Do you need Identity (usernames/passwords) in this project? [no] yes
Defining The Model
We are going to have a table for the sites, a table for the tags and
a table that associates sites with tags (many-to-many). Here’s the code
which defines the tables (which goes in model.py):
Code (python)
sites_table = Table(’sites’, metadata,
Column(’site_id’, Integer, primary_key=True),
Column(‘title’, Unicode(256)),
Column(‘url’, Unicode(1024)),
)
tags_table = Table(‘tags’, metadata,
Column(‘tag_id’, Integer, primary_key=True),
Column(‘name’, Unicode(32), index=‘tag_idx’))
sites_tags_table = Table(’sites_tags’, metadata,
Column(’site_id’, Integer,
ForeignKey(’sites.site_id’),
primary_key=True),
Column("tag_id", Integer,
ForeignKey(‘tags.tag_id’),
primary_key=True))
We will now create the Python classes that correspond to these tables:
Code (python)
class Tag(object):
def __init__(self, name):
self.name = name
def __repr__(self):
return self.name
def link(self):
return "/tags/"+self.name
class Site(object):
def __init__(self, url, title):
self.url, self.title = url, title
Note the link() method in the Tag class.
You might wonder what it does there. It just a little habbit that I
wanted to share with you. I’ve found myself many times hard-coding URLs
inside my templates. Then, if you want to make a tag linkable in many
different places in your app, you have to hard-code the link every
time. In this way, you can just pass the tag object to your template
and do something like:
Code (xml)
<a href="${tag.link()}">${tag.name}</a>
Ok, now we continue with mapping the classes to the tables:
Code (python)
mapper(Tag, tags_table)
mapper(Site, sites_table, properties = {
‘tags’: relation(Tag, secondary=sites_tags_table, lazy=False),
})
Great. Now we can construct the database and start populating it:
Code (python)
$ tg-admin sql create
$ tg-admin shell
…
>>> g = Site(‘http://www.google.com’, ‘Search engine’)
>>> g.tags
[]
>>> g.tags = [Tag(’search’), Tag(‘google’)]
>>> session.save(g)
>>> session.flush()
(here SQLAlchemy echos the SQL statements it executes)
Handling tags
So we got the model right. The next step is to allow the users to
provide tags for the site. The easiest way (for you and your users) is
to ask them to enter the tags in a space-separated list. Suppose you
are given this kind of space-seperated string of tags from a user, then
you have to:
convert all tags to lower case, in order to avoid case senstivity issuescheck if the string contains the same tag twicefind which tags are already in the database and which are newrecover from some nonsense that users might throw at you
and then get a list of Tag objects that you can assign to a site. So here’s a function that does just that:
Code (python)
def get_tag_list(tags):
"""Get a string of space sperated tag,
and returns a list of tag objects"""
result = []
tags = tags.replace(‘;’,‘ ‘).replace(‘,’,‘ ‘)
tags = [tag.lower() for tag in tags.split()]
tags = set(tags) # no duplicates!
if ” in tags:
tags.remove(”)
for tag in tags:
tag = tag.lower()
tagobj = session.query(Tag).selectfirst_by(name=tag)
if tagobj is None:
tagobj = Tag(name=tag)
result.append(tagobj)
return result
So you can now easily do something like:
Code (python)
>>> f = Site(‘http://www.flickr.com’, ‘Flickr!’)
>>> f.tags = get_tag_list(‘photo sharing photograpy’)
>>> f.tags
[photo, sharing, photograpy]
>>> f.tags[0].link()
‘/tags/photo’
Tag Search
It is straightforward to just list a site together with its tags:
Code (xml)
<h3 class="site-title"><a href="${site.url}" target="_blank">${site.title}</a></h3>
<p class="site-tags">Tags:
<a py:for="tag in site.tags[:5]" href="${tag.link()}" class="tag">${tag.name}</a>
</p>
Search is a bit more tricky. It took me few attempts until I got the
search queries right. Here’s how to fetch all sites that are tagged by
‘google’:
Code (python)
q = session.query(Site)
sites = q.select((Tag.c.name==‘google’) & q.join_to(‘tags’))
the magic is mostly inside the join_to method - it stands for the
SQL statements that makes sure that the Tag clause is associated to the
sites. Without it, the query runs over the entire cartesian product of
Sites x Tags.
You can make the query simpler (for MySQL; not you), if you fetch
the tag_id of ‘google’ first. Then, the query uses only 2 of the 3
tables:
Code (python)
tagobj = session.query(Tag).get_by(name=‘google’)
if not tagobj:
raise cherrypy.InternalRedirect(‘/notfound’)
sites = session.query(Site).select((sites_tags_table.c.tag_id == tagobj.tag_id) &
(sites_tags_table.c.site_id == Site.c.site_id))
To search for google|photo:
Code (python)
q = session.query(Site)
sites = q.select(
Tag.c.name.in_(‘google’, ‘photo’) &
q.join_to(‘tags’))
To search for sharing+photos:
Code (python)
q = session.query(Site)
sites = q.select(
Tag.c.name.in_(’sharing’,‘photos’) &
q.join_to(‘tags’),
group_by=[Site.c.site_id],
having=(func.count(Site.c.site_id)==2))
The idea is that sites that are tagged both with ’sharing’ and
‘photos’ will appear twice in the select, then after grouping by
site_id and getting all which appear twice, we get the desired result.
There are many more things that can be done from this point, like:
associating with the tag-site relationship which user added the tag,
rendering a tag cloud and so on. Feel free to leave comments! |