Manuel's Blog thoughts, technologies and life

8 Dec 2009

[en] Using php on your Ubuntu to get the TED talks

Intro

Anyone who knows me a little knows I'm a big fan of the presentations on TED.com. I watch them, adore them, learn from them, and try to imitate the techniques of the good speakers there. My Google Reader is subscribed to the RSS feed of ted.com and I download almost every presentation from their site almost as soon as it appears. But wouldn't it be nice if the episodes came straight to my computer? Of course it would, and there are some programs that can help with that, but this article isn't about them. This article is about going all geeky and having a custom solution. Let's start!

Start coding

The first thing you'll need is PHP and Zend Framework. Why Zend Framework? Because it makes it really nice and easy to manipulate the RSS feed, as you'll see a bit later. I am using Ubuntu, and this article is based on what Ubuntu can do, so open up your terminal and type:

sudo apt-get install php5-cli zend-framework-bin

Now open Gedit or any other text editor and start writing some PHP code.

#!/usr/bin/php
<?php
echo "works";
?>

Now save the file in your home directory as ted.php and in the terminal type:
cd ~
chmod +x ted.php
./ted.php

If you can see the message "works" on your screen, then all is OK. Let's continue by getting the RSS feed. Add the following lines to the file, replacing the echo statement:
require_once 'Zend/Feed/Rss.php';
$channel = new Zend_Feed_Rss("http://feeds.feedburner.com/tedtalks_video");
echo $channel->title(); echo "\n";

Run the file again by typing ./ted.php and after a few seconds (depending on your Internet connection) you'll see the title of the TED talks feed. We can include the Zend RSS class this easily because installing Zend Framework through the package manager puts it on the PHP include path. If you installed it differently, then you must include it differently.
Copy the link of the RSS feed and open that link in Firefox. View the feed source and copy the pubDate of the second item that you see (hint: the value of the element looks something like: Thu, 03 Dec 2009 01:00:00 -0600). Now paste the date into a file called last.date inside the same folder as the ted.php file.
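If you're not sure what to look for in the feed source, here is a rough sketch of the structure. The sample feed below is made up for illustration (titles, dates and example.com URLs are assumptions, not the real TED feed), and it uses PHP's built-in SimpleXML instead of Zend Framework just to show where pubDate lives inside each item.

```php
<?php
// Hypothetical, trimmed-down RSS 2.0 document. All values are invented
// for illustration and are not taken from the real TED feed.
$sampleFeed = <<<XML
<rss version="2.0">
  <channel>
    <title>Sample talks feed</title>
    <item>
      <title>Newest talk</title>
      <pubDate>Fri, 04 Dec 2009 01:00:00 -0600</pubDate>
      <guid>http://example.com/videos/newest.mp4</guid>
    </item>
    <item>
      <title>Second talk</title>
      <pubDate>Thu, 03 Dec 2009 01:00:00 -0600</pubDate>
      <guid>http://example.com/videos/second.mp4</guid>
    </item>
  </channel>
</rss>
XML;

$rss = simplexml_load_string($sampleFeed);

// Items come back in document order, so index 1 is the second item.
$secondPubDate = (string) $rss->channel->item[1]->pubDate;

echo $secondPubDate, "\n";
```

The string printed at the end is the kind of value that goes into last.date.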

We will use this file to keep track of the presentations that we have already downloaded. It's simple: if a presentation was published on the TED site after the date in the last.date file, then we want it.

The date problem

In order to establish whether the publication date of one presentation is before or after the one in our file, we need to be able to compare the dates. The format for the date in the RSS feed is something called RFC 822. A quick search on php.net shows that we can create new DateTime objects from the RFC 822 format by using DateTime::createFromFormat(). This method is available in PHP starting from version 5.3. Comparing the dates would be a piece of cake if Ubuntu came with PHP version 5.3 (the latest version). Unfortunately it doesn't. Although PHP 5.3 was released on 30th of June and Ubuntu 9.10 was released at the end of October, it is not the default PHP version, and that is because of stability and reliability issues. Put simply, the Ubuntu community has decided to let PHP 5.3 stabilise a bit before including it. A sound decision, considering that on 19th of November the 5.3.1 version was released, which featured:

Over 100 bug fixes, some of which were security fixes as well.

Now let's get back to our business. Since we can't use DateTime::createFromFormat(), we'll use something simpler that has existed in PHP for a while: strtotime(), which turns a date string into a Unix timestamp that we can compare as a plain number. So let's see how the code looks now:

#!/usr/bin/php
<?php
require_once 'Zend/Feed/Rss.php';

// Timestamp of the newest presentation we already have
$lastDate = strtotime(file_get_contents("last.date"));
$channel = new Zend_Feed_Rss("http://feeds.feedburner.com/tedtalks_video");

// Items are assumed to be ordered newest first, so stop at the first old one
foreach ($channel as $item) {
    if (strtotime($item->pubDate()) > $lastDate) {
        echo $item->title();
        echo "\n";
    } else {
        break;
    }
}
?>
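To convince yourself that strtotime() really is enough for this comparison, here is a small standalone sketch. The isNewer() helper is a name I made up for illustration and the dates are sample values; the idea is simply that both strings become Unix timestamps, which compare as integers, and that an unparseable date is treated as "not newer".

```php
<?php
// Hypothetical helper: true when $pubDate is strictly later than $lastDate.
// Both arguments are date strings in the RFC 822 style used by pubDate.
function isNewer($pubDate, $lastDate)
{
    $pubTs  = strtotime($pubDate);
    $lastTs = strtotime($lastDate);

    // strtotime() returns false when it cannot parse the string
    if ($pubTs === false || $lastTs === false) {
        return false;
    }

    return $pubTs > $lastTs;
}

// Sample values for illustration only
$stored = "Thu, 03 Dec 2009 01:00:00 -0600";
$fresh  = "Fri, 04 Dec 2009 01:00:00 -0600";

var_dump(isNewer($fresh, $stored));
var_dump(isNewer($stored, $fresh));
```

Because the time zone offset is part of the string, the comparison does not depend on the time zone configured on your machine.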

Notice how easily I'm iterating through the news items using Zend Framework's RSS class. Run the script again and you should see the titles of the items newer than the date you copied. I notice one problem though. It's really annoying to see the "TED talks : " before each title, so we'll strip that out by dropping the first 11 characters. Replace the line that contains:

echo $item->title();

with

echo substr($item->title(), 11);

Run it again and you'll see that it looks a bit better.
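Cutting a fixed number of characters works as long as every title starts with the same prefix. If you'd rather be safe, a small sketch like the one below only strips the prefix when it is really there. The stripPrefix() function is my own hypothetical helper, and the exact prefix string is an assumption, so check it against the titles your script actually prints.

```php
<?php
// Hypothetical helper: remove $prefix from the start of $title,
// but leave the title untouched when it does not start with $prefix.
function stripPrefix($title, $prefix)
{
    if (strncmp($title, $prefix, strlen($prefix)) === 0) {
        return substr($title, strlen($prefix));
    }

    return $title;
}

// Assumed prefix and made-up titles, for illustration only
$prefix = "TEDTalks : ";

echo stripPrefix("TEDTalks : Some speaker on some topic", $prefix), "\n";
echo stripPrefix("A title without the prefix", $prefix), "\n";
```

In the script you would then call stripPrefix($item->title(), $prefix) instead of substr($item->title(), 11).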

What we need to take care of now is alerting the user to the new episodes and actually downloading them. Since Ubuntu is a Unix-like system, it adheres to the philosophy of writing small programs that each specialise in one task, and so will our small script. It will not deal with the actual downloading of the episodes nor with the actual notification of the user. Instead we will use two specialised tools for this: wget and notify-send. The first connects to the URL it is given and downloads the file; the second is responsible for the new notifications that were introduced in Ubuntu 9.04. Let's deal with the notifications first. Here's the modified source code:

#!/usr/bin/php
<?php
require_once 'Zend/Feed/Rss.php';

$lastDate = strtotime(file_get_contents("last.date"));
$channel = new Zend_Feed_Rss("http://feeds.feedburner.com/tedtalks_video");
$notifyText = "New presentations:\n";

foreach ($channel as $item) {
    if (strtotime($item->pubDate()) > $lastDate) {
        $notifyText .= substr($item->title(), 11);
        $notifyText .= "\n";
    } else {
        break;
    }
}

// The header alone is 19 characters, so anything longer means new items
if (strlen($notifyText) > 19) {
    // escapeshellarg() keeps quotes and $ signs in titles away from the shell
    exec("DISPLAY=:0.0 /usr/bin/notify-send " . escapeshellarg($notifyText));
}
?>
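One thing to keep in mind when handing text to exec(): talk titles can contain quotes, dollar signs or backticks, and the shell will happily interpret them. A minimal sketch of building the command with PHP's escapeshellarg() is shown below. The buildNotifyCommand() name is hypothetical and the sample title is made up; the command is only printed here, not executed.

```php
<?php
// Hypothetical helper: build the notify-send command line for $text.
// escapeshellarg() wraps the text in single quotes and escapes any
// single quotes inside it, so the shell sees it as one literal argument.
function buildNotifyCommand($text)
{
    return "DISPLAY=:0.0 /usr/bin/notify-send " . escapeshellarg($text);
}

// Made-up notification text with characters the shell would otherwise interpret
$text = "New presentations:\nA \"quoted\" title about \$money and `code`\n";

echo buildNotifyCommand($text), "\n";
```

Passing the result to exec() would then show the text exactly as written, whatever characters the titles contain.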

Run the script again and this time you'll get a nice notification of the new episodes. Now let's deal with the downloading. See the code:

#!/usr/bin/php
<?php
require_once 'Zend/Feed/Rss.php';

$lastDate = strtotime(file_get_contents("last.date"));
$channel = new Zend_Feed_Rss("http://feeds.feedburner.com/tedtalks_video");
$notifyText = "New presentations:\n";

foreach ($channel as $item) {
    if (strtotime($item->pubDate()) > $lastDate) {
        $notifyText .= substr($item->title(), 11);
        $notifyText .= "\n";
        // The guid of each item holds the link to the video file
        $link = $item->guid();
        exec("cd ~/Videos/Ted/ && wget " . escapeshellarg($link));
    } else {
        break;
    }
}

if (strlen($notifyText) > 19) {
    exec("DISPLAY=:0.0 /usr/bin/notify-send " . escapeshellarg($notifyText));
}
?>

ATTENTION: The code I've presented stores all the presentations in the Ted folder located inside your Videos folder. You must make sure the Ted folder exists before you run the code.
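If you'd rather have the script take care of the folder itself, something along these lines could run before the loop. This is only a sketch: ensureDirectory() is a hypothetical helper name, and the demo uses a throwaway folder under the system temp directory so that trying it out doesn't touch your home directory.

```php
<?php
// Hypothetical helper: make sure $dir exists, creating missing parent
// folders as well. Returns true when the directory is there afterwards.
function ensureDirectory($dir)
{
    if (is_dir($dir)) {
        return true;
    }

    // Third argument "true" creates nested folders in one go
    return mkdir($dir, 0755, true);
}

// Demo with a throwaway location (an assumption for this example).
// In ted.php the call would be: ensureDirectory(getenv('HOME') . '/Videos/Ted');
$demoDir = sys_get_temp_dir() . '/ted-demo-' . uniqid() . '/Videos/Ted';

var_dump(ensureDirectory($demoDir));
```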

If you run the script now, it will download the presentations from TED.com. The problem is that it won't remember the new date, so let's fix that by storing the date of the newest presentation. Here's what the new foreach looks like:

foreach ($channel as $item) {
    if (strtotime($item->pubDate()) > $lastDate) {
        // The first new item is the newest one, so remember its date
        if (!isset($newDate)) {
            $newDate = $item->pubDate();
            file_put_contents("last.date", $newDate);
        }
        $notifyText .= substr($item->title(), 11);
        $notifyText .= "\n";
        $link = $item->guid();
        exec("cd ~/Videos/Ted/ && wget " . escapeshellarg($link));
    } else {
        break;
    }
}
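A small caveat about "last.date": it is a relative path, so PHP looks for it in the current working directory, which only matches the script's folder when the script is started from there. The sketch below shows one possible way to make the bookkeeping a bit more robust by building the path from the script's own location and falling back to a default when the file is missing or unreadable. The helper names readLastDate() and writeLastDate() are hypothetical.

```php
<?php
// Hypothetical helper: return the stored date as a Unix timestamp,
// or $default when the file is missing or its content can't be parsed.
function readLastDate($file, $default = 0)
{
    if (!is_readable($file)) {
        return $default;
    }

    $ts = strtotime(trim(file_get_contents($file)));

    return $ts === false ? $default : $ts;
}

// Hypothetical helper: store the pubDate string of the newest item.
function writeLastDate($file, $pubDate)
{
    return file_put_contents($file, $pubDate) !== false;
}

// dirname(__FILE__) is the folder this script lives in (works on PHP 5.2 too)
$stateFile = dirname(__FILE__) . '/last.date';

echo readLastDate($stateFile), "\n";
```

Note that a default of 0 means "everything in the feed is new", so on a first run without a last.date file every listed presentation would be downloaded. Pass time() as the default if you'd rather start from now.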

Good, now we have a functional script that we can use. Let's make it run every hour! In the terminal type: crontab -e. You will be asked to select an editor (if this is the first time you've run this command). I suggest number 4 (vim.basic). After the selection, an almost blank screen appears. I will assume you've selected vim.basic as your editor. Press i and then type:

0 * * * * /home/<your user name>/ted.php

Remember to replace <your user name> with your actual Ubuntu username. Then press ESC, type a colon (the : sign) followed by "wq", and press Enter. This makes sure that the script runs every hour, on the hour.

Making it nice

If you want you can add a representative icon to the notification message. It makes it look a bit more professional. Replace the notify-send line with:

exec("DISPLAY=:0.0 /usr/bin/notify-send -i \"<link to icon>\" " . escapeshellarg($notifyText));

Replace <link to icon> with the file location of the icon file. I've created a simple one for the TED presentations which you can download (see further down).

Download files

You can download both the full source code and the logo and use them.
