Day 13: Tags
Previously on symfony
The askeet application can serve data trough a web page, a RSS feed, or email. Questions can be asked and answered. But the organization of questions is still to be developed. Organizing questions in categories and subcategories could end up in an inextricable tree structure, with thousands of branches and no easy way to know in which sub-branch a question you are looking for may be.
However, web 2.0 applications have come out with a new way of organizing items: tags. Tags are words, just as categories are. But the differences are that there is no hierarchy of tags, and that an item can have several tags. While finding a cat with categories could prove cumbersome (animal/mammal/four-legged/feline/, or other mysterious category names), it is very simple with tags (pet+cute). Add to that the ability for all users to add tags to a given question, and you get the famous concept of folksonomy.
Guess what? That's exactly what we are going to do with the askeet questions. It will take us some time (today and tomorrow), but the result is worth the pain. It will also be the occasion to show how to do complex SQL requests to a database using a Creole connection. Let's go.
The QuestionTag
class
There are several ways to implement tags. We chose to add a QuestionTag
table with the following structure:
When a user tags a question, it creates a new record in the question_tag
table, linked to both the user
table and the question
table. There are two versions of the tag recorded: The one entered by the user, and a normalized version (all lower case, without any special character) used for indexing.
Schema update
As usual, adding a table to a symfony project is done by appending its Propel definition to the schema.xml
file:
... <table name="ask_question_tag" phpName="QuestionTag"> <column name="question_id" type="integer" primaryKey="true" /> <foreign-key foreignTable="ask_question"> <reference local="question_id" foreign="id" /> </foreign-key> <column name="user_id" type="integer" primaryKey="true" /> <foreign-key foreignTable="ask_user"> <reference local="user_id" foreign="id" /> </foreign-key> <column name="created_at" type="timestamp" /> <column name="tag" type="varchar" size="100" /> <column name="normalized_tag" type="varchar" size="100" primaryKey="true" /> <index name="normalized_tag_index"> <index-column name="normalized_tag" /> </index> </table>
Rebuild the object model:
$ symfony propel-build-model
Custom class
Add a new Tag.class.php
in the askeet/lib/
directory with the following methods:
<?php class Tag { public static function normalize($tag) { $n_tag = strtolower($tag); // remove all unwanted chars $n_tag = preg_replace('/[^a-zA-Z0-9]/', '', $n_tag); return trim($n_tag); } public static function splitPhrase($phrase) { $tags = array(); $phrase = trim($phrase); $words = preg_split('/(")/', $phrase, -1, PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE); $delim = 0; foreach ($words as $key => $word) { if ($word == '"') { $delim++; continue; } if (($delim % 2 == 1) && $words[$key - 1] == '"') { $tags[] = trim($word); } else { $tags = array_merge($tags, preg_split('/\s+/', trim($word), -1, PREG_SPLIT_NO_EMPTY)); } } return $tags; } } ?>
The first method returns a normalized tag, the second one takes a phrase as argument and returns an array of tags. These two methods will be of great use when manipulating tags.
The interest of adding the class in the lib/
directory is that it will be loaded automatically and only when needed, without needing to require it. It's called autoloading.
Extend the model
In the new askeet/lib/model/QuestionTag.php
, add the following method to set the normalized_tag
when the tag
is set:
public function setTag($v) { parent::setTag($v); $this->setNormalizedTag(Tag::normalize($v)); }
The helper class that we just created is already of great use: It reduces the code of this method to only two lines.
Add some test data
Append a file to the askeet/data/fixtures/
directory with some tag test data in it:
QuestionTag: t1: { question_id: q1, user_id: fabien, tag: relatives } t2: { question_id: q1, user_id: fabien, tag: girl } t4: { question_id: q1, user_id: francois, tag: activities } t6: { question_id: q2, user_id: francois, tag: 'real life' } t5: { question_id: q2, user_id: fabien, tag: relatives } t5: { question_id: q2, user_id: fabien, tag: present } t6: { question_id: q2, user_id: francois, tag: 'real life' } t7: { question_id: q3, user_id: francois, tag: blog } t8: { question_id: q3, user_id: francois, tag: activities }
Make sure this file comes after the other files of the directory in the alphabetical order, so that the sfPropelData
object can link these new records with the related records of the Question
and User
tables. You can now repopulate your database by calling:
$ php batch/load_data.php
We are now ready to work on tags in the actions. But first, let us extend the model for the Question
class.
Display the tags of a question
Before adding anything to the controller layer, let's add a new tag
module so that things keep organized:
$ symfony init-module frontend tag
Extend model
We will need to display the whole list of words tagged by all users for a given question. As the ability to retrieve the related tags should be a method of the Question
class, we will extend it (in askeet/lib/model/Question.php
). The trick here is to group double entries to avoid double tags (two identical tags should only appear once in the result). The new method has to return a tag array:
public function getTags() { $c = new Criteria(); $c->clearSelectColumns(); $c->addSelectColumn(QuestionTagPeer::NORMALIZED_TAG); $c->add(QuestionTagPeer::QUESTION_ID, $this->getId()); $c->setDistinct(); $c->addAscendingOrderByColumn(QuestionTagPeer::NORMALIZED_TAG); $tags = array(); $rs = QuestionTagPeer::doSelectRS($c); while ($rs->next()) { $tags[] = $rs->getString(1); } return $tags; }
This time, as we need only one column (the normalized_tag
), there is no point to ask Propel to return an array of Tag
objects populated from the database (this process, by the way, is called hydrating). So we do a simple query that we parse into an array, which is much faster.
Modify the view
The question detail page should now display the list of tags for a given question. We will use the sidebar for that. As it has been built as a component slot during the seventh day, we can set a specific component for this bar in the question module only.
So in askeet/apps/frontend/modules/question/config/view.yml
, add the following configuration:
showSuccess: components: sidebar: [sidebar, question]
This component of the sidebar
module is not yet created, but it is quite simple (in modules/sidebar/actions/components.class.php
):
public function executeQuestion() { $this->question = QuestionPeer::getQuestionFromTitle($this->getRequestParameter('stripped_title')); }
The longest part to write is the fragment (modules/sidebar/templates/_question.php
):
<?php include_partial('sidebar/default') ?> <h2>question tags</h2> <ul id="question_tags"> <?php include_partial('tag/question_tags', array('question' => $question, 'tags' => $question->getTags())) ?> </ul>
We choose to insert the list of tags as a fragment because it will be refreshed by an AJAX request a bit later.
This partial has to be created in modules/tag/templates/_question_tags.php
:
<?php foreach($tags as $tag): ?> <li><?php echo link_to($tag, '@tag?tag='.$tag, 'rel=tag') ?></li> <?php endforeach; ?>
The rel=tag
attribute is a MicroFormat. It is by no means compulsory, but as it costs nothing to add it here, we'll let it stay.
Add the @tag
routing rule in the routing.yml
:
tag: url: /tag/:tag param: { module: tag, action: show }
Test it
Display the detail of the first question and look for the list of tags in the sidebar:
http://askeet/question/what-can-i-offer-to-my-step-mother
Display a short list of popular tags for a question
The sidebar is a good place to show the whole list of tags for a question. But what about the tags displayed in the list of questions? For each question, we should only display a subset of tags. But which ones? We will choose the most popular ones, i.e. the tags than have been given to this question most often. We will probably have to encourage users to keep on tagging a question with existing tags to increase the popularity of relevant tags for this question. If all users don't do that, maybe "moderators" will do it.
Extend the model
Anyway, this means that we have to add a ->getPopularTags()
method to our Question
object. But this time, the request to the database is not simple. Using Propel to do it would multiply the number of requests and take way too much time. Symfony allows you to use the power of SQL when it is the best solution, so we will just need a Creole connection to the database and execute a regular SQL query.
This query should be something like:
SELECT normalized_tag AS tag, COUNT(normalized_tag) AS count FROM question_tag WHERE question_id = $id GROUP BY normalized_tag ORDER BY count DESC LIMIT $max
However, using the actual column and table names creates a dependency to the database and bypasses the data abstraction layer. If, in the future, you decide to rename a column or a table, this raw SQL query will not work anymore. That's why the symfony version of the request doesn't use the current names but the abstracted ones instead. It is slightly harder to read, but it is much easier to maintain.
public function getPopularTags($max = 5) { $tags = array(); $con = Propel::getConnection(); $query = ' SELECT %s AS tag, COUNT(%s) AS count FROM %s WHERE %s = ? GROUP BY %s ORDER BY count DESC '; $query = sprintf($query, QuestionTagPeer::NORMALIZED_TAG, QuestionTagPeer::NORMALIZED_TAG, QuestionTagPeer::TABLE_NAME, QuestionTagPeer::QUESTION_ID, QuestionTagPeer::NORMALIZED_TAG ); $stmt = $con->prepareStatement($query); $stmt->setInt(1, $this->getId()); $stmt->setLimit($max); $rs = $stmt->executeQuery(); while ($rs->next()) { $tags[$rs->getString('tag')] = $rs->getInt('count'); } return $tags; }
First, a connection to the database is opened in $con
. The SQL query is built by replacing %s
tokens in a string by the column and table names that come from the abstraction layer. A Statement
object containing the query and a ResultSet
object containing the result of the query are created. These are Creole objects, and their use is described in detail in the Creole documentation. The ->setInt()
method of the Statement
object replaces the first ?
in the SQL query by the question id
. The $max
argument is used to limit the number of results returned with the ->setLimit()
method.
The method returns an associative array of normalized tags and popularity, ordered by descending popularity, with only one request to the database.
Modify the view
Now we can add the list of tags for a question, which is formatted in a _list.php
fragment in the modules/question/templates/
directory:
<?php use_helper('Text', 'Date', 'Global', 'Question') ?> <?php foreach($question_pager->getResults() as $question): ?> <div class="question"> <div class="interested_block" id="block_<?php echo $question->getId() ?>"> <?php include_partial('question/interested_user', array('question' => $question)) ?> </div> <h2><?php echo link_to($question->getTitle(), '@question?stripped_title='.$question->getStrippedTitle()) ?></h2> <div class="question_body"> <div>asked by <?php echo link_to($question->getUser(), '@user_profile?nickname='.$question->getUser()->getNickname()) ?> on <?php echo format_date($question->getCreatedAt(), 'f') ?></div> <?php echo truncate_text(strip_tags($question->getHtmlBody()), 200) ?> </div> tags: <?php echo tags_for_question($question) ?> </div> <?php endforeach; ?> <div id="question_pager"> <?php echo pager_navigation($question_pager, $rule) ?> </div>
Because we want to separate the tags by a +
sign, and to avoid too much code in the template to deal with the limits, we write a tags_for_question()
helper function in a new lib/helper/QuestionHelper.php
helper library:
function tags_for_question($question, $max = 5) { $tags = array(); foreach ($question->getPopularTags($max) as $tag => $count) { $tags[] = link_to($tag, '@tag?tag='.$tag); } return implode(' + ', $tags); }
Test
The list of questions now displays the popular tags for each question:
http://askeet/
Display the list of questions tagged with a word
Each time we displayed a tag, we added a link to a @tag
routing rule. This is supposed to link to a page that displays the popular questions tagged with a given tag. It is simple to write, so we won't delay it anymore.
The tag/show
action
Create a show
action in the tag
module:
public function executeShow() { $this->question_pager = QuestionPeer::getPopularByTag($this->getRequestParameter('tag'), $this->getRequestParameter('page')); }
Extend the model
As usual, the code that deals with the model is placed in the model, this time in the QuestionPeer
class since it returns a set of Question
objects. We want the popular question by interested users, so this time, there is no need for a complex request. Propel can do it with a single ->doSelect()
call:
public static function getPopularByTag($tag, $page) { $c = new Criteria(); $c->add(QuestionTagPeer::NORMALIZED_TAG, $tag); $c->addDescendingOrderByColumn(QuestionPeer::INTERESTED_USERS); $c->addJoin(QuestionTagPeer::QUESTION_ID, QuestionPeer::ID, Criteria::LEFT_JOIN); $pager = new sfPropelPager('Question', sfConfig::get('app_pager_homepage_max')); $pager->setCriteria($c); $pager->setPage($page); $pager->init(); return $pager; }
The method returns a pager of questions, ordered by popularity.
Create the template
The modules/tag/templates/showSuccess.php
template is as simple as you expect it to be:
<h1>popular questions for tag "<?php echo $sf_params->get('tag') ?>"</h1> <?php include_partial('question/list', array('question_pager' => $question_pager, 'rule' => '@tag?tag='.$sf_params->get('tag'))) ?>
Add the page
parameter in the routing rule
In the routing.yml
, add a :page
parameter with a default value in the @tag
routing rule:
tag: url: /tag/:tag/:page param: { module: tag, action: show, page: 1 }
Test it
Navigate to the activities
tag page to see all the questions tagged with this word:
http://askeet/tag/activities
See you Tomorrow
The Creole database abstraction layer allows symfony to do complex SQL requests. On top of that, the Propel object-relational mapping gives you the tools to work in an object-oriented world, useful methods that keep you from worrying about the database, and it transforms requests into simple sentences.
Some of you may worry about the important load that the above requests may put on the database. Optimizations are still possible - for instance, you could create a popular_tags
column in the Question
table, updated by a transaction each time a related QuestionTag
is created. The list of questions would then be much less heavy. But the benefits of the cache system - which we will discuss in a few days - make this optimization useless.
Tomorrow, we will finish the tag features of the askeet application. Users will be able to add tags to a question, and a global tag bubble will be made available. Make sure you come back to read about it.
The full code of the askeet application as of today can be grabbed from the askeet SVN repository, tagged /tags/release_day_13/
. If you have any questions about today's tutorial, feel free to ask them in the askeet forum.
This work is licensed under the Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 Unported License license.