AU XMLTV Grabber
NOTE: This has been updated to support the new method provided by tvguide.org.au - for the original version see Old AU XMLTV Grabber
Overview
This is a simple bash-based Australian XMLTV grabber that retrieves its data from the XML feed provided by http://tvguide.org.au.
Another grabber that seems popular is tv_grab_au_reg (found at http://www.cse.unsw.edu.au/~willu/xmltv/tv_grab_au_reg.html). It is written in Python and will optionally accept guide data from http://icetv.com.au if you have a subscription. I used it as a starting point for the main program flow (thanks to William Uther).
For most people the choice of language is probably of no concern, but if you need to customise or fine-tune your grabber and don't like having your control structure defined by the level of indentation, you might like to check this one out.
NOTE: I have put no effort into making this work (or testing) on a non-linux system. It may work on OSX or Windows (via Cygwin) but I have no experience with these systems.
It should be possible to replicate this on Windows as most of the linux utilities are available for Windows too:
- grep : http://www.interlog.com/~tcharron/grep.html
- wget : http://www.interlog.com/~tcharron/wgetwin.html
- xsltproc : http://www.zlatkovic.com/pub/libxml/
General process
This may seem to be over complicating things, as you might think all this could be incorporated in a single tv_grab_au script, but I like this separation for a number of reasons:
- The XMLTV file that the PVR retrieves has been processed and verified. It may not be the most current if the latest download failed, but I know it will not break the PVR and/or clear valid guide data.
- If (or as) I switch PVR solutions, I only have to re-implement the tv_grab_au stub (which effectively just blindly downloads or echoes the XMLTV file previously created).
- If (or as) I have multiple PVRs, I only need to retrieve the guide data from the Internet once - in effect this lightens the load on the server by creating a local cache.
The general process of populating your PVR with guide data using these scripts would operate as such:
- A scheduled task (cron job) runs the grab_xmltv.sh script. Using cron is a good idea because you get emailed, for free, whenever a cron task produces any output. As grab_xmltv.sh only ever outputs text on error, you don't have to worry about checking log files - just your email.
- grab_xmltv.sh functions as follows:
  - wait for a random duration (up to 5 hours) to help ensure the server doesn't get swamped
  - download the registered user's current full XML TV file
  - apply a style sheet to add channels and create a (hopefully) valid XMLTV file
  - apply a verification style sheet to ensure the result is well-formed XML
- If required/configured, grab_xmltv.sh then copies the tvguide.xml file to a location (and filename) your PVR is expecting.
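The "silent unless something fails" behaviour that makes the cron email useful can be sketched as a small helper. This is only an illustration: run_quiet is a hypothetical name and is not part of grab_xmltv.sh.

```shell
#!/bin/bash
# Hypothetical helper (not part of grab_xmltv.sh): run a command, stay
# silent on success, and print its captured output only when it fails.
run_quiet() {
    local output
    if ! output=$("$@" 2>&1); then
        echo "ERROR: '$*' failed"
        if [ -n "${output}" ]; then
            echo "${output}"
        fi
        return 1
    fi
    return 0
}

# Example use inside a cron-driven script:
#   run_quiet wget -q -O /tmp/guide.xml "http://example.invalid/guide" || exit 1
```

Because nothing is printed on success, cron only has something to mail when a step actually fails.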
If your PVR cannot load its guide data from an XMLTV file (i.e. it needs to run the tv_grab_au program) you should be able to easily create a dummy tv_grab_au program that either calls the grab_xmltv.sh script, or (my preference) echoes out the contents of the current tvguide.xml file (as this has already been verified). See the EPG section of Ubuntu MythTV Setup for an example of this.
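A minimal sketch of such a dummy tv_grab_au follows. It assumes the default /root/.xmltv/tvguide.xml location used by grab_xmltv.sh, and the emit_guide name is hypothetical. It also ignores any command line options your PVR passes, which may or may not be acceptable for your PVR.

```shell
#!/bin/bash
# Sketch of a stand-in tv_grab_au: print the previously verified XMLTV
# file instead of fetching anything. The default path is an assumption
# that matches the data_dir setting shown in grab_xmltv.sh.
default_xmltv="/root/.xmltv/tvguide.xml"

emit_guide() {
    local file="${1:-${default_xmltv}}"
    if [ ! -s "${file}" ]; then
        # Complain on stderr so stdout stays clean for the PVR
        echo "ERROR: no guide data at ${file}" >&2
        return 1
    fi
    cat "${file}"
}

# A real stub would finish with a single call:
#   emit_guide
```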
Requirements
The application consists of the following script and style sheets (each is listed in full further down this page):
- grab_xmltv.sh
- xmltv.xsl
- xmltv_check.xsl
The scripts depend on the following utility:
- xsltproc (see http://xmlsoft.org/XSLT/xsltproc2.html)
They also use several other utilities (such as wget, grep, etc) but these should exist on all linux systems.
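If you are unsure whether everything is installed, a quick check along these lines may help. It is a sketch only: missing_tools is a hypothetical helper, and the tool list is simply the utilities named above.

```shell
#!/bin/bash
# Hypothetical helper: print the names of any listed tools that cannot be
# found on the PATH (prints an empty line when nothing is missing).
missing_tools() {
    local tool missing=""
    for tool in "$@"; do
        command -v "${tool}" > /dev/null 2>&1 || missing="${missing} ${tool}"
    done
    echo "${missing# }"
}

# Example:
#   missing=$(missing_tools xsltproc wget grep)
#   [ -z "${missing}" ] || echo "Please install: ${missing}"
```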
Configuration
The configuration is found in the grab_xmltv.sh script and your selected channels are found in the xmltv.xsl style sheet. All configurable options in the script are located between
#
# START USER CONFIGURATION
#
and
#
# END USER CONFIGURATION
#
to make it easy to know what you need to configure.
The channels should similarly be easily identifiable in the xmltv.xsl file.
Scheduling
I have the following cron task scheduled to run between 11:00pm and 4:00am (because of the random delay found in the script) to download the latest guide data and generate an XMLTV file.
0 23 * * * root /usr/local/bin/grab_xmltv.sh
The grab_xmltv.sh script will only ever produce output on error, and with the magic of cron I will receive an email if this ever occurs. Additionally, as seen in the grab_xmltv.sh script details, the resulting XMLTV file will only ever be copied to its target location (where it is picked up by my PVR) if no errors occur (including a final XML check).
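If you change the cron start time or the maximum delay, the end of the download window moves with it. The arithmetic can be sketched as follows; latest_start is a hypothetical helper, and the hour and minute must be given without leading zeros because bash would otherwise read them as octal.

```shell
#!/bin/bash
# Hypothetical helper: given the cron start time (hour, minute) and the
# maximum random delay in seconds, print the upper bound on when the
# download can begin, as HH:MM.
latest_start() {
    local hour=$1 minute=$2 max_delay=$3
    local total=$(( (hour * 60 + minute + max_delay / 60) % 1440 ))
    printf '%02d:%02d\n' $(( total / 60 )) $(( total % 60 ))
}

# Example with the values used on this page (23:00 start, 18000 seconds):
#   latest_start 23 0 18000    # prints 04:00
```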
Bash Script
I have the bash script located in /usr/local/bin.
grab_xmltv.sh
#!/bin/bash
cd "$(dirname "${0}")" || exit 1
#
# START USER CONFIGURATION
#
max_delay=18000
tvguide_user="WikiUsername"
tvguide_password="wikipassword"
wgetopts='-q -T 300 -t 5'
wget_params="--http-user=${tvguide_user} --http-passwd=${tvguide_password} ${wgetopts}"
data_dir="/root/.xmltv"
xmltv="${data_dir}/tvguide.xml"
xsl_fix="${data_dir}/xmltv.xsl"
xsl_check="${data_dir}/xmltv_check.xsl"
target="/var/www/localhost/xmltv.xml"
#
# END USER CONFIGURATION
#
# Ensure prerequisites
xsl_proc="xsltproc"
url="http://minnie.tuhs.org/tivo-bin/xmlguide.pl"
[ -d "${data_dir}" ] || mkdir -p "${data_dir}"
# The output of 'which' differs between systems, so test with command -v instead
if ! command -v "${xsl_proc}" > /dev/null 2>&1; then
    echo "ERROR: XSL processor (${xsl_proc}) required and not found"
    exit 1
fi
if [ ! -f "${xsl_fix}" ]; then
    echo "ERROR: XSL file (${xsl_fix}) not found - can't process guide data"
    exit 1
fi
if [ ! -f "${xsl_check}" ]; then
    echo "Warning: XSL file (${xsl_check}) not found - no final validation will occur"
fi
# Random sleep
sleep $((RANDOM % ${max_delay}))
# Grab guide - 31 day rolling cache
outfile="${data_dir}/$(date +%d).xml"
# wget -q prints nothing, so test its exit status rather than its output
if ! ret=$(wget ${wget_params} "${url}" -O "${outfile}" 2>&1); then
    echo "ERROR: wget failed on ${url}"
    echo "${ret}"
    exit 1
fi
# process output
# stdout goes to the XMLTV file; stderr is captured in ret for the error report
if ! ret=$("${xsl_proc}" "${xsl_fix}" "${outfile}" 2>&1 > "${xmltv}"); then
    echo "ERROR: XMLTV download/process error"
    echo "${ret}"
    exit 1
fi
# verify
if [ -f "${xsl_check}" ]; then
    # discard the copied XML; keep only error messages and the exit status
    if ! ret=$("${xsl_proc}" "${xsl_check}" "${xmltv}" 2>&1 > /dev/null); then
        echo "ERROR: Can't validate resulting XML"
        echo "${ret}"
        exit 1
    fi
fi
if [ "${target}" != "" ] && [ -d "$(dirname "${target}")" ]; then
    cp "${xmltv}" "${target}"
fi
exit 0
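One thing to be aware of with the random sleep in the script: bash's $RANDOM only ranges from 0 to 32767, so a max_delay larger than that (about 9 hours) would never be fully used. The default of 18000 is unaffected. If you do want a longer window, a sketch like the following combines two $RANDOM values; random_delay is a hypothetical name and is not part of the script above.

```shell
#!/bin/bash
# Hypothetical replacement for the $((RANDOM % max_delay)) expression:
# joins two 15-bit $RANDOM values into one 30-bit value so that delays
# longer than 32767 seconds are possible. max must be greater than zero.
random_delay() {
    local max=$1
    echo $(( ((RANDOM << 15) | RANDOM) % max ))
}

# Example:
#   sleep "$(random_delay 86400)"    # somewhere within 24 hours
```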
XSL Style Sheets
The XSL style sheets should be located in the data directory (for me /root/.xmltv). The data directory is configurable in the grab_xmltv.sh script.
xmltv.xsl
This style sheet is used to add the channel information to the downloaded program file.
NOTE: Obviously, this will need to be updated with the list of channels applicable for you. This example is assuming you are downloading the NSW FTA channels and your PVR will be matching the names with the FTA Digital names.
<?xml version="1.0" ?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="xml" />
  <xsl:template match="tv">
    <tv source-info-name="tvguide.org.au" generator-info-name="XMLTV - grab_xmltv.sh" generator-info-url="http://wiki.accordingtokris.com/index.php?title=AU_XMLTV_Grabber">
      <channel id="ABC-NSW"><display-name>ABC TV</display-name></channel>
      <channel id="ABC2"><display-name>ABC2</display-name></channel>
      <channel id="Nine-Syd"><display-name>NINE DIGITAL</display-name></channel>
      <channel id="SBS-NSW"><display-name>SBS Digital</display-name></channel>
      <channel id="Seven-Syd"><display-name>7 Digital</display-name></channel>
      <channel id="Ten-NSW"><display-name>TEN Digital</display-name></channel>
      <xsl:apply-templates select="programme" />
    </tv>
  </xsl:template>
  <xsl:template match="programme">
    <xsl:copy-of select="." />
  </xsl:template>
</xsl:stylesheet>
NOTE: If you have specific channel ids that you need the resulting XMLTV file to use, you can use the following XSL to remap them:
<?xml version="1.0" ?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="xml" />
  <xsl:template match="tv">
    <tv source-info-name="tvguide.org.au" generator-info-name="XMLTV - grab_xmltv.sh" generator-info-url="http://wiki.accordingtokris.com/index.php?title=AU_XMLTV_Grabber">
      <channel id="Sydney.ABC"><display-name>ABC TV</display-name></channel>
      <channel id="Sydney.ABC2"><display-name>ABC2</display-name></channel>
      <channel id="Sydney.NINE"><display-name>NINE DIGITAL</display-name></channel>
      <channel id="Sydney.SBS"><display-name>SBS Digital</display-name></channel>
      <channel id="Sydney.SEVEN"><display-name>7 Digital</display-name></channel>
      <channel id="Sydney.TEN"><display-name>TEN Digital</display-name></channel>
      <xsl:apply-templates select="programme" />
    </tv>
  </xsl:template>
  <xsl:template match="programme">
    <programme>
      <xsl:attribute name="start"><xsl:value-of select="@start" /></xsl:attribute>
      <xsl:attribute name="stop"><xsl:value-of select="@stop" /></xsl:attribute>
      <xsl:choose>
        <xsl:when test="@channel='ABC-NSW'">
          <xsl:attribute name="channel">Sydney.ABC</xsl:attribute>
        </xsl:when>
        <xsl:when test="@channel='ABC2'">
          <xsl:attribute name="channel">Sydney.ABC2</xsl:attribute>
        </xsl:when>
        <xsl:when test="@channel='Nine-Syd'">
          <xsl:attribute name="channel">Sydney.NINE</xsl:attribute>
        </xsl:when>
        <xsl:when test="@channel='SBS-NSW'">
          <xsl:attribute name="channel">Sydney.SBS</xsl:attribute>
        </xsl:when>
        <xsl:when test="@channel='Seven-Syd'">
          <xsl:attribute name="channel">Sydney.SEVEN</xsl:attribute>
        </xsl:when>
        <xsl:when test="@channel='Ten-NSW'">
          <xsl:attribute name="channel">Sydney.TEN</xsl:attribute>
        </xsl:when>
        <xsl:otherwise>
          <xsl:attribute name="channel">Unknown Channel</xsl:attribute>
        </xsl:otherwise>
      </xsl:choose>
      <xsl:copy-of select="*" />
    </programme>
  </xsl:template>
</xsl:stylesheet>
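To work out which channel ids your own download actually contains (and therefore which ones the style sheet needs to list or remap), something like the following may help. It is a rough sketch: list_channel_ids is a hypothetical helper, and it assumes the feed writes the attribute as channel="..." with double quotes, as the XSL above implies.

```shell
#!/bin/bash
# Hypothetical helper: print the distinct channel ids referenced by the
# programme elements of a downloaded guide file, one per line.
# Assumes attributes are written as channel="..." with double quotes.
list_channel_ids() {
    grep -o 'channel="[^"]*"' "$1" \
        | sed 's/^channel="//; s/"$//' \
        | LC_ALL=C sort -u
}

# Example, using the rolling cache file written on the 1st of the month:
#   list_channel_ids /root/.xmltv/01.xml
```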
xmltv_check.xsl
This is a simple style sheet that copies its input straight to its output. If the XSL processor encounters an error (such as malformed XML) while applying it, the processor reports the error, and this is used as an indication of whether the XML is well-formed or not.
<?xml version="1.0" ?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="xml" />
  <xsl:template match="/">
    <xsl:copy-of select="." />
  </xsl:template>
</xsl:stylesheet>
Security
When grab_xmltv.sh is running, you are able to see the wget command line being executed using ps ax. This will include the tvguide.org.au username and password. If you want to hide this information from prying eyes, you should create a wgetrc file somewhere only accessible to the executing user containing the following:
/root/.xmltv/wgetrc:
http-user=WikiUsername
http-passwd=WikiPassword
Then update the grab_xmltv.sh script as such:
Replace
wget_params="--http-user=${tvguide_user} --http-passwd=${tvguide_password} ${wgetopts}"
with
wget_params="${wgetopts}"
and obviously you can remove the lines:
tvguide_user="WikiUsername"
tvguide_password="wikipassword"
and, assuming /root/.xmltv/wgetrc (change to your path as required), point wget at that file by adding the following line immediately before the wget call in the script:
export WGETRC="/root/.xmltv/wgetrc"
and clear it again once the download step has finished:
unset WGETRC
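Creating the wgetrc file so that it is only readable by its owner could look something like this. It is a sketch only: write_wgetrc is a hypothetical helper, and the two setting names are the ones shown above.

```shell
#!/bin/bash
# Hypothetical helper: write a wgetrc holding the credentials and make
# sure only the owner can read it.
write_wgetrc() {
    local file=$1 user=$2 password=$3
    (
        # Restrictive umask so a newly created file is never world-readable
        umask 077
        printf 'http-user=%s\nhttp-passwd=%s\n' "${user}" "${password}" > "${file}"
    )
    # Also tighten permissions in case the file already existed
    chmod 600 "${file}"
}

# Example:
#   write_wgetrc /root/.xmltv/wgetrc "WikiUsername" "WikiPassword"
```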