Converting Your iTunes Library To HTML

1. Introduction

Warning: CCTunes changes almost constantly. Parts of this documentation may not correspond to the way CCTunes worked when you downloaded it.

iTunes keeps all the information about your music library in a proprietary database file and does not provide any interface to manipulate it directly. You have to go through AppleScript to get or set information in it. That is sometimes nice, and some people have websites full of AppleScripts that do whatever they found useful, but I have not found much information on how to publish your whole library in HTML format. So I searched the internet and found some links which set me on the way to my own solution.

It might not be perfect, it might not even be nice, but it could do for you. Anyway, it did for me. Recently I've bundled everything together in a nice little application which you can find here.

You can see a list of similar packages at the end of this document. Some of them are payware, some of them are freeware. All of them have advantages and disadvantages, and you might well find that mine is not the package that best suits your needs. However, you will find that mine comes with a full explanation of what it does. As a matter of fact, it actually grew out of this documentation, and so far its only purpose has been to illustrate this text. That might change in the future, but for now it's better to still think of it that way. If you have some success or ideas with the package, I'd be glad to hear!

2. About the XML file

The XML file resides in your home directory, at ~/Music/iTunes/iTunes Music Library.xml. It has been described as braindead and, agreed, it does not look like what one would expect from an XML file for a music library. It is track based: every track has an ID and most of the information one would expect to find there, with some exceptions, like the images. There does not seem to be any logic in the way the Track IDs are assigned, and they tend to change when, for instance, you reencode songs. Where one would expect the name of a tag to say something about the information it contains, the XML file just has key tags followed by value tags, like a dictionary streamed out. This is Apple's typical plist approach, and it's just another example of how not to use XML, imho.
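
To make the key/value layout concrete, here is a minimal Python sketch that reads a hand-written miniature of the library file with the standard plistlib module. The sample data, the Track ID 35 and the helper name load_tracks are made up for illustration; a real library file holds many more keys per track.

```python
import plistlib

# A hand-written miniature of the library file (illustrative values only).
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<plist version="1.0">
<dict>
	<key>Tracks</key>
	<dict>
		<key>35</key>
		<dict>
			<key>Track ID</key><integer>35</integer>
			<key>Name</key><string>Our Frank</string>
			<key>Artist</key><string>Morrissey</string>
			<key>Album</key><string>Kill Uncle</string>
			<key>Total Time</key><integer>205061</integer>
		</dict>
	</dict>
</dict>
</plist>
"""

def load_tracks(data):
    """Return the track dictionaries stored under the top-level 'Tracks' key."""
    library = plistlib.loads(data)
    return list(library["Tracks"].values())

tracks = load_tracks(SAMPLE)
for track in tracks:
    # Each <key>/<value> pair has become an ordinary dictionary entry.
    print(track["Artist"], "-", track["Album"], "-", track["Name"])
```

Note that the field names live in the text of the key tags, not in the tag names, which is exactly what makes the file awkward to process with XSL.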

When you play tracks in iTunes, it seems to update both a private proprietary database file and the XML file. The proprietary database, ~/Music/iTunes/iTunes 4 Music Library, is the master file. Whatever you change in the XML file is lost if the database file is not updated as well: the next time you play something in iTunes, the XML file is written out again. Very annoying.

3. Prerequisites

Organizing The Library

First of all, my aim was to make a nice list of all the albums I possess in my iTunes library. I am in the ongoing process of converting my whole CD collection, a couple of hundred CDs, to MP3, and while doing so I try to keep a list. I used to do this by updating a spreadsheet file in GNumeric, which is still a viable approach, but this is nicer. However, I use certain rules to tag my MP3s:

  • I don't keep track of Genres. They are never accurate anyway.
  • I try to keep the Artwork of the tracks to just contain the Album cover art, not any other artwork, like singles cover art or logos.
  • I try to keep the disc numbers up to date. They are often ignored but I want albums consisting of multiple CDs to be treated as one album.
  • Make sure compilations are treated as such by iTunes by setting the Compilation ID3 Tag.
  • Compilation albums have the compilation checkbox set correctly, and have the same composer. The artists can be different per track, but the composer is meant to be the one that has composed the compilation. This is open for discussion, I know, but this is the rule I will adjust when making the final layout.
  • Greatest Hits are therefore no compilations! Many of the entries at cddb are set differently, so it might be necessary to correct this.

You do not have to organize your library like I do, but for all of this to work and look nice, I recommend this approach. I recommend this approach anyway, I think it should be forced by law to organize your library like this ;-).

Installing the software used

In earlier versions I used perl modules like MP3::Info to extract the images and ImageMagick command line tools to generate the thumbnails. However, things are much simpler now: I've made a self contained package which is downloadable and has everything you will need.

4. Compiling the Package

In order to compile the package yourself you will need some directions, because I have just assembled a bunch of things using command line scripting and XCode. This chapter gives some directions, but it is not complete yet. The most important things should be covered, and if you're having trouble, just ask. Updates will follow and will in turn be outdated by updates to the software, but anyway, let's hope it's valuable to some.

Compiling XML Starlet

One of the things you will need is XML Starlet. In earlier days CCTunes was based on xsltproc, which is ok but, I thought, not as flexible as XML Starlet. So I compiled the thing and it does all I need it to do, like xsltproc would, but more easily.

You will need to download the source packages from libxml2 (libxml2-2.6.14.tar.gz), and a compatible libxslt (libxslt-1.1.9.tar.gz) and XML Starlet (xmlstarlet-0.9.5.tar.gz). Those are the versions I used. Other versions might work for you in similar ways, but chances are you run into compatibility issues which you will have to resolve yourself. Give me a reason to upgrade to another version of XML Starlet and I will.

Compiling is the usual bunch of configure and make and make install's, but because we don't want to use any system libraries we are going to give some instructions to place the installed binaries in a temporary directory. Here we go for libxml2:

Code listing 4.1

mkdir ~/docxmlstarlet
cd ~/docxmlstarlet/
mkdir tmp
mkdir tmp/prefix
mkdir tmp/eprefix
cp ~/Downloads/libxml2-2.6.14.tar.gz .
tar xvfz libxml2-2.6.14.tar.gz 
cd libxml2-2.6.14
./configure --prefix=$HOME/docxmlstarlet/tmp/prefix/ --exec-prefix=$HOME/docxmlstarlet/tmp/eprefix/
make; make install

This will take a while (and use up my laptop's battery while I'm writing this on the train ;-) ), but in the end we will have the libraries installed and setup to go to the next phase: compiling the xslt libraries. Very similarly the following instructions trigger the same process:

Code listing 4.2

cd ~/docxmlstarlet/
cp ~/Downloads/libxslt-1.1.9.tar.gz .
tar xvfz libxslt-1.1.9.tar.gz
cd libxslt-1.1.9
./configure --prefix=$HOME/docxmlstarlet/tmp/prefix --exec-prefix=$HOME/docxmlstarlet/tmp/eprefix \
    --with-libxml-prefix=$HOME/docxmlstarlet/tmp/eprefix \
    --with-libxml-include-prefix=$HOME/docxmlstarlet/tmp/prefix/include \
    --with-libxml-libs-prefix=$HOME/docxmlstarlet/tmp/eprefix/lib
make; make install

And finally, for XML Starlet. The version we are using has a few glitches that we need to tweak. Let's first do similarly:

Code listing 4.3

cd ~/docxmlstarlet/
cp ~/Downloads/xmlstarlet-0.8.1.tar.gz .
tar xvfz xmlstarlet-0.8.1.tar.gz
cd xmlstarlet-0.8.1
open .

Now, ideally we would use the same set of instructions to make the executables, but it turns out there are some mistakes in the configure script, which is found in the directory that pops up. Locate the configure script and open it with your favourite text editor. First of all, since we are using libxml2 version 2.6.14, the configure script thinks we are using version 2.6.1, which is older than the version it expects. To fix this, change if test "$LIBXML_VERSION" -lt 262; then on line 900 to if test "$LIBXML_VERSION" -lt 260; then. Also, the script doesn't recognize our system as a Macintosh-like system. Therefore, change *mac* on line 1811 to *apple-darwin*. That's it, the usual command sequence should complete the compilation:

Code listing 4.4

./configure --prefix=$HOME/docxmlstarlet/tmp/prefix \
            --exec-prefix=$HOME/docxmlstarlet/tmp/eprefix \
            --with-libxml-prefix=$HOME/docxmlstarlet/tmp/eprefix \
            --with-libxml-include-prefix=$HOME/docxmlstarlet/tmp/prefix/include \
            --with-libxml-libs-prefix=$HOME/docxmlstarlet/tmp/eprefix/lib \
            --with-libxslt-prefix=$HOME/docxmlstarlet/tmp/eprefix \
            --with-libxslt-include-prefix=$HOME/docxmlstarlet/tmp/prefix/include \
            --with-libxslt-libs-prefix=$HOME/docxmlstarlet/tmp/eprefix/lib \
            --with-libiconv-prefix=/usr \
            --with-libiconv-libs-prefix=/usr
make; make install
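
The version check that had to be patched in the configure script treats a dotted version as a single number, which goes wrong as soon as a component has two digits. The following Python sketch is not how the configure script works; it only illustrates, under that assumption, a comparison that keeps the components separate. The function names are made up for this example.

```python
def parse_version(text):
    """Split '2.6.14' into (2, 6, 14) so every component compares numerically."""
    return tuple(int(part) for part in text.split("."))

def at_least(installed, required):
    """True when the installed version is the required version or newer."""
    return parse_version(installed) >= parse_version(required)

print(at_least("2.6.14", "2.6.2"))  # 14 is compared with 2 as a number
print(at_least("2.6.1", "2.6.2"))   # a genuinely older version still fails
```

Comparing tuples component by component avoids the situation where 2.6.14 is mistaken for something older than 2.6.2.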

And at $HOME/docxmlstarlet/tmp/eprefix/bin/ there is the xml executable, which is the XML Starlet we wanted. That's the one you will find checked in in the package directory of the source package you can download.

Compiling id3 libraries

... to be completed ...

Compiling the total package

... to be completed (use XCode on Panther) ...

5. Converting the XML file

Forcing the XML file to make sense

We want a decent XML file to start with, not that plisty file that is unlike anything we would come up with ourselves. Luckily there are people who helped me on the way. Searching Google gave me a nice little link, http://www.xmldatabases.org/WK/blog/1086?t=item, which explained it all. Now that was very useful. Step one finished.

Code listing 5.1: makeProperXml.xsl file makes a nicer XML file

<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="1.0">
        
  <xsl:template match="/">
    <songlist>
      <xsl:text>
      </xsl:text>
      <xsl:apply-templates select="plist/dict/dict/dict"/>
    </songlist>
    <xsl:text>
    </xsl:text>
  </xsl:template>
  <xsl:template match="dict">
    <song>
      <xsl:text>
      </xsl:text>
        <SortKey>
          <xsl:variable name="SortArtist"      select="string[preceding-sibling::node()[1]='Artist']" />
          <xsl:variable name="SortAlbum"       select="string[preceding-sibling::node()[1]='Album']" />
          <xsl:variable name="Composer"       select="string[preceding-sibling::node()[1]='Composer']" />
          <xsl:variable name="StartsWithThe" select="starts-with($SortArtist,'The ')" />
          <xsl:choose>
            <xsl:when test="true[preceding-sibling::node()[1]='Compilation']">
              <xsl:value-of select="'ZZZZZZZZZZZ-VA'" /><xsl:text> - </xsl:text>
              <xsl:value-of select="$Composer" /><xsl:text> - </xsl:text>
              <xsl:value-of select="$SortAlbum" />
            </xsl:when>
            <xsl:otherwise>
              <xsl:choose>
                <xsl:when test="$StartsWithThe">
                  <xsl:value-of select="substring-after($SortArtist,'The ')" /><xsl:text> - </xsl:text>
                  <xsl:value-of select="$SortAlbum" />
                </xsl:when>
                <xsl:otherwise>
                  <xsl:value-of select="$SortArtist" /><xsl:text> - </xsl:text>
                  <xsl:value-of select="$SortAlbum" />
                </xsl:otherwise>
              </xsl:choose>
            </xsl:otherwise>
          </xsl:choose>
	</SortKey>
          <xsl:text>
          </xsl:text>
      <xsl:apply-templates select="key"/>
      <xsl:text>
      </xsl:text>
    </song>
  </xsl:template>
  <xsl:template match="key">
    <!-- main template that makes plist style conversion to XML style conversion -->
    <xsl:element name="{translate(text(), ' ', '_')}">
      <xsl:value-of select="following-sibling::node()[1]"/>
    </xsl:element>
    <xsl:text>
    </xsl:text>
  </xsl:template>
</xsl:stylesheet>


This style sheet applied to the plist-file you get by exporting an iTunes library to XML gives you a list of tracks that is structured in a way that makes it easier to take the XSL formatting a step further.

For those with little XSL experience I will explain what is going on. To simplify the reading, you can ignore the xsl:text nodes, which just serve to insert newlines in the resulting XML file.

The most important part of the stylesheet is where the key is converted from the plist style to regular XML style. That is, <key>Key Name</key> <string>StringValue</string> is converted to an element with name Key_Name and a child text node StringValue, like this: <Key_Name>StringValue</Key_Name>. This is more like what you expect in an XML file. This is the part done in the xsl:template on the last 9 lines of the file.
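
For readers more at home in a scripting language, the same key-to-element conversion can be sketched in Python with the standard xml.etree.ElementTree module. This is only an illustration of the idea, not part of the package; the function name dict_to_song and the sample values are made up.

```python
import xml.etree.ElementTree as ET

def dict_to_song(track_dict):
    """Turn <key>K</key><type>V</type> pairs into <K>V</K> children of <song>."""
    song = ET.Element("song")
    children = list(track_dict)
    # A plist dict alternates key and value elements, so walk them in pairs.
    for key, value in zip(children[0::2], children[1::2]):
        child = ET.SubElement(song, key.text.replace(" ", "_"))
        child.text = value.text  # stays empty for <true/> and <false/>
    return song

sample = ET.fromstring(
    "<dict>"
    "<key>Track ID</key><integer>35</integer>"
    "<key>Name</key><string>Our Frank</string>"
    "<key>Compilation</key><true/>"
    "</dict>"
)
song = dict_to_song(sample)
print(ET.tostring(song, encoding="unicode"))
```

Just like in the style sheet, a boolean such as Compilation ends up as an empty element, so later steps only test whether the element exists.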

The songlist is generated on a per track basis, so as a result we will have one big long songlist. We need to tell the various albums in this list apart, and that is what the SortKey element is for. For every song in the songlist we generate a key based on the artist, the album name and whether or not the song is part of a compilation. Songs that are part of a compilation use the composer tag rather than the artist tag to tell the albums apart, by definition. Also, we want the compilations to be at the end of the list, so we start their SortKey with ZZZZZZZZZZZ-VA so that even ZZ Top will be before any compilation in the library. As a last special thing, we want to ignore a leading "The" in group names, so that The Smiths comes just after Sigur Rós and not after The Notwist. Like iTunes does. That is also arranged via this SortKey.
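
The SortKey rules can be sketched in a few lines of Python. This mirrors the logic of the style sheet under the assumption that plain string comparison is used for sorting; the function name sort_key, the sampler album and its composer are made up for the example.

```python
COMPILATION_PREFIX = "ZZZZZZZZZZZ-VA"  # same marker the style sheet uses

def sort_key(artist, album, composer="", compilation=False):
    """Build an album sort key following the SortKey rules."""
    if compilation:
        # Compilations sort at the end and are told apart by their composer.
        return f"{COMPILATION_PREFIX} - {composer} - {album}"
    if artist.startswith("The "):
        artist = artist[len("The "):]  # ignore a leading "The"
    return f"{artist} - {album}"

albums = [
    ("The Smiths", "The Queen Is Dead", "", False),
    ("ZZ Top", "Eliminator", "", False),
    ("Various", "Some Sampler", "Hypothetical Compiler", True),
    ("The Notwist", "Neon Golden", "", False),
    ("Sigur Rós", "Ágætis byrjun", "", False),
]
ordered = sorted(albums, key=lambda entry: sort_key(*entry))
for artist, album, _, _ in ordered:
    print(artist, "-", album)
```

The Notwist sorts under N, The Smiths lands right after Sigur Rós, and the compilation ends up last, even behind ZZ Top.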

To test this on your exported xml file, using xml starlet, you can use the following command:

Code listing 5.2: xml command to generate proper XML from plist file

xml tr makeProperXml.xsl "$HOME/Music/iTunes/iTunes Music Library.xml" > properiTunes.xml

Sorting the XML file per Album

What remains is to turn the songlist XML file created in the previous section into an XML file with album items: one node per album, containing a song list in the correct order and all other information we could need to make HTML. This is done using the makeMainXml.xsl file:

Code listing 5.3: makeMainXml.xsl file creates the main XML file

<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
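    version="1.0">

<!-- Directory that receives the per-album XML files; normally passed in by the caller. -->
<xsl:param name="destinationDir" select="'.'"/>
<xsl:key name="album-sort-keys" match="song" use="SortKey" />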
<xsl:template match="songlist">
  <xmlLibrary>
    <xsl:for-each select="song[ count(. | key('album-sort-keys',SortKey)[1]) = 1]" >
      <xsl:sort select="SortKey"/>
      <xsl:variable name="albumid" select="position()"/>
      <includeXml><xsl:value-of select="$albumid"/></includeXml>
      <xsl:text>
      </xsl:text>
      <xsl:document href="{$destinationDir}/{$albumid}.xml">
      <AlbumItem>
   <xsl:text>
   </xsl:text>
        <Artist>
          <xsl:variable name="compilationval" select="count(child::Compilation)"/>
          <xsl:choose>
            <xsl:when test="$compilationval=1">
              <xsl:text>Various Artists</xsl:text>
            </xsl:when>
            <xsl:when test="$compilationval=0">
              <xsl:value-of select="Artist"/>
            </xsl:when>
            <xsl:otherwise>
              <xsl:text>????</xsl:text>
            </xsl:otherwise>
          </xsl:choose>
        </Artist>
   <xsl:text>
   </xsl:text>
        <Album>
          <xsl:value-of select="Album"/>
        </Album>
   <xsl:text>
   </xsl:text>
        <AlbumID/>
   <xsl:text>
   </xsl:text>
        <Picture>
      <xsl:text>
      </xsl:text>
          <PictureURL><xsl:value-of
               select="Location" /></PictureURL>
     <xsl:text>
   </xsl:text>
        </Picture>
     <xsl:text>
   </xsl:text>
	        <DateAdded><xsl:value-of select="Date_Added"/></DateAdded>
         <xsl:text>
         </xsl:text>
	        <Year><xsl:value-of select="Year"/></Year>
         <xsl:text>
         </xsl:text>
            <SortKey><xsl:value-of select="SortKey"/></SortKey>
         <xsl:text>
         </xsl:text>
	<TrackList>
          <xsl:for-each select="key('album-sort-keys',SortKey)" >
            <xsl:sort select="Track_Number" data-type="number" />
      <xsl:text>
      </xsl:text>
	      <Track>
         <xsl:text>
         </xsl:text>
	        <Number><xsl:value-of select="Track_Number"/></Number>
         <xsl:text>
         </xsl:text>
	        <Name><xsl:value-of select="Name"/></Name>
         <xsl:text>
         </xsl:text>
	        <TotalTime><xsl:value-of select="Total_Time"/></TotalTime>
         <xsl:text>
         </xsl:text>
	        <DiscNumber><xsl:value-of select="Disc_Number"/></DiscNumber>
         <xsl:text>
         </xsl:text>
	        <PlayCount><xsl:value-of select="Play_Count"/></PlayCount>
      <xsl:text>
      </xsl:text>
              </Track>
	  </xsl:for-each>
   <xsl:text>
   </xsl:text>
	</TrackList>
<xsl:text>
</xsl:text>
      </AlbumItem>
      </xsl:document>
      <xsl:text>
      </xsl:text>
    </xsl:for-each>
  </xmlLibrary>
</xsl:template>

</xsl:stylesheet>

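Once again, for the XSL illiterate, a little explanation. You can again ignore any empty xsl:text element, which only serves to get some nicer formatting in the resulting XML file. The actual grouping is done using a technique called Muenchian grouping (also known as the Muenchian Method): the xsl:key indexes all songs by their SortKey, the outer xsl:for-each keeps only the first song for each SortKey, and the inner xsl:for-each fetches all songs with that SortKey through the key. The link explains it better than I can, so if you want to know what the xsl:key is doing please read all about it there.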
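
As a rough analogy only, the effect of the grouping can be sketched in Python: a dictionary plays the role of the xsl:key, and the sorted keys play the role of the outer loop over the first song of every album. The function name group_albums and the sample songs are made up for this illustration.

```python
def group_albums(songs):
    """Group songs by SortKey and number the albums in sorted key order."""
    by_key = {}
    for song in songs:
        # Equivalent of the xsl:key index: SortKey -> all songs with that key.
        by_key.setdefault(song["SortKey"], []).append(song)
    albums = []
    for album_id, key in enumerate(sorted(by_key), start=1):
        # Equivalent of the inner loop: tracks of one album, by track number.
        tracks = sorted(by_key[key], key=lambda s: s["Track_Number"])
        albums.append((album_id, key, [t["Name"] for t in tracks]))
    return albums

songs = [
    {"SortKey": "Smiths - Album B", "Track_Number": 2, "Name": "b2"},
    {"SortKey": "Notwist - Album A", "Track_Number": 1, "Name": "a1"},
    {"SortKey": "Smiths - Album B", "Track_Number": 1, "Name": "b1"},
]
for album in group_albums(songs):
    print(album)
```

Every distinct SortKey yields exactly one album, numbered by its position in the sorted list, which corresponds to the sequential ids the style sheet hands out.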

Code listing 5.4: Making main XML file using XML Starlet

xml tr makeMainXml.xsl -s destinationDir=. properiTunes.xml > masterXml.xml

The style sheet generates a set of XML files using the xsl:document element, one per album with sequential ids, plus a main XML file that includes these XML files. The destinationDir parameter, passed here with -s, tells the style sheet in which directory to write the album files. This is later postprocessed by the generate.command script, which generates new ids on a per album basis (more about that later) and renames the whole set of XML files.

While generating the XML files, the relevant elements from the songlist are copied over to the album's XML file.

6. Getting the images out of the mp3 files

The trickiest part of the job was getting the images. Since the images are not in the XML file, we need to get them out of the mp3 files themselves. iTunes stores the image in the proper ID3v2 tag of the file. Initially I wrote a perl script to extract the images, but that requires installing the perl modules MP3::Info and URI::URL. This is not too hard to do, but it is still some effort. I could have packaged the perl modules, I guess, but I found it easier to create a little Cocoa command line tool that does the same based on libid3, the library that is distributed by http://www.id3.org. Both approaches are still explained however.

Using a custom command line tool to get the image out of the mp3 file

The downloadable package takes another approach, however. The getPicture command is not a perl command but a compiled tool based on id3v2, which already uses the libid3 library and has a great set of functionalities, except... extracting images.

Adapting the sources to eliminate all we don't need and adding the extra little bit of functionality to extract the image was easy however, since we basically know how to do this in perl already. In order to compile it you will need to have the libid3 headers in place and have a libid3 library that you can link to. The commands needed to compile it with the statically linked library are included in comments in the source code.

Code listing 6.1: Source code to getPicture command line tool

/**
 * Based on id3v2 ... compile something like
 *
 * g++ -I/usr/local/include/ -O3   -c -o getpic.o getpic.cpp
 * c++ -L/usr/local/lib/ -O3 -pedantic -Wall -framework CoreFoundation -lz -liconv
 *                       -g -o id3v2 getpic.o ./libid3.a
 * 
 * Part of CCTunes, distributed under the GPL.
 * details at http://www.coin-c.com/CCTunes/
 */

#include <CoreFoundation/CoreFoundation.h>

#include <cstring>
#include <iostream>

#include <id3/misc_support.h>
#include <id3/tag.h>

#include "frametable.h"
#include "genre.h"

// argv[1]: URL of the mp3 file, argv[2]: path of the image file to write.
int main( int argc, char *argv[])
{
  if (argc < 3)
    {
      std::cerr << "usage: getPicture <mp3 url> <image file>" << std::endl;
      return 1;
    }

  bool tags = false;
  CFURLRef theSourceUrl = CFURLCreateWithBytes(kCFAllocatorDefault,
					       (unsigned char*)argv[1],
					       strlen(argv[1]),
					       kCFStringEncodingUTF8,
					       NULL);
  unsigned char sourceFilePath[500];
  // Convert the URL to a plain file system path that the id3 library can open.
  if (theSourceUrl != NULL &&
      CFURLGetFileSystemRepresentation(theSourceUrl,true,sourceFilePath,500))
    {
      ID3_Tag myTag;
      myTag.Link((char*)sourceFilePath, ID3TT_ID3V2);

      // Walk all frames and write the first picture frame to the output file.
      const ID3_Frame * myFrame = 0;
      const ID3_Tag & myTagConstRef = myTag;
      ID3_Tag::ConstIterator *Iter = myTagConstRef.CreateIterator();
      for (size_t nFrames = 0; nFrames < myTag.NumFrames(); nFrames++)
	{
	  myFrame = Iter->GetNext();
	  if (NULL != myFrame && myFrame->GetID() == ID3FID_PICTURE)
	    {
	      myFrame->Field(ID3FN_DATA).ToFile(argv[2]);
	      tags = true;
	      break;
	    }
	}
      delete Iter;
    }
  if (theSourceUrl != NULL)
    CFRelease(theSourceUrl);

  // Report a missing picture as an XML warning on standard output.
  if(!tags)
    std::cout << "<warning><id>no_picture</id><info>" << (unsigned char*)argv[1]
              << "</info></warning>" << std::endl;
  return 0;
}
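
To show what the library does under the hood, here is a rough Python sketch that pulls the image out of the picture (APIC) frame of an ID3v2.3 or ID3v2.4 tag using only the standard library. It is an illustration and not part of the package: it assumes no extended header and no unsynchronisation, and it does not handle the older ID3v2.2 layout with its three-letter PIC frame. All function names are made up, and the tag at the end is synthetic test data.

```python
import struct

def synchsafe(raw):
    """Decode a 4-byte synchsafe integer (7 significant bits per byte)."""
    return (raw[0] << 21) | (raw[1] << 14) | (raw[2] << 7) | raw[3]

def picture_data(body):
    """Strip the APIC header fields and return the raw image bytes."""
    encoding = body[0]
    mime_end = body.index(b"\x00", 1)        # MIME type is a Latin-1 string
    pos = mime_end + 2                       # skip terminator and picture type
    if encoding in (1, 2):                   # UTF-16 description ends with 00 00
        while pos + 2 <= len(body) and body[pos:pos + 2] != b"\x00\x00":
            pos += 2
        return body[pos + 2:]
    return body[body.index(b"\x00", pos) + 1:]

def extract_picture(data):
    """Return the image of the first APIC frame, or None if there is none."""
    if len(data) < 10 or data[:3] != b"ID3" or data[3] not in (3, 4):
        return None
    major = data[3]
    end = 10 + synchsafe(data[6:10])         # tag size excludes the header
    pos = 10
    while pos + 10 <= end:
        frame_id = data[pos:pos + 4]
        if frame_id == b"\x00\x00\x00\x00":  # padding: no more frames
            break
        raw_size = data[pos + 4:pos + 8]
        size = synchsafe(raw_size) if major == 4 else struct.unpack(">I", raw_size)[0]
        if frame_id == b"APIC":
            return picture_data(data[pos + 10:pos + 10 + size])
        pos += 10 + size
    return None

def frame(frame_id, body):
    """Build an ID3v2.3 frame: id, 32-bit size, two flag bytes, body."""
    return frame_id + struct.pack(">I", len(body)) + b"\x00\x00" + body

# Synthetic ID3v2.3 tag with a title frame and a picture frame.
image = b"\x89PNG not a real image"
frames = frame(b"TIT2", b"\x00Our Frank")
frames += frame(b"APIC", b"\x00image/png\x00\x03cover\x00" + image)
size = len(frames)
header = b"ID3\x03\x00\x00" + bytes(
    [(size >> 21) & 0x7F, (size >> 14) & 0x7F, (size >> 7) & 0x7F, size & 0x7F]
)
tag = header + frames

print(extract_picture(tag) == image)
```

In a real script the data would come from reading the start of the mp3 file, and the returned bytes would be written to the image file.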


Generating Thumbnails

This might be all you need if you are on your own private server, have lots of disk space and don't really care about how fast your collection loads in the browser. Since I have been using this, I found that the images tend to be too big to view the images for all of the albums at once, so it would be nice to generate thumbnails for the overview.

For ease of use, the currently downloadable package comes with a little command line tool programmed in Cocoa and based on pdf2png by Evan Jones which does all of this for you. It simply takes an image, draws it in an offscreen buffer of the specified size and saves that as a png file. I must say however, you will see that the quality is not good. It is the same quality you can see if you use iPhoto to look at the images you've taken with your digital camera. Uuuugly. I must provide some feedback to Apple on that someday.

Code listing 6.2: Little Cocoa command line tool to resize images, based on pdf2png

// A tiny program that resizes images to PNG images with certain dimensions.
// based on pdf2png by Evan Jones, http://evanjones.ca/pdf2png.m
// modified by Kristof Van Landschoot, Coin-C to fit the purpose of resizing.
//
// gcc --std=c99 -g -o resize resize.m -framework Cocoa
//
// Written originally by Evan Jones <ejones@uwaterloo.ca> February, 2004
// http://www.eng.uwaterloo.ca/~ejones/
//
// Released under the GNU Public License

#include <Cocoa/Cocoa.h>

int main( int argc, char* argv[] )
{
	int destinationSize = 36; // width and height of the output image, in pixels
	
	NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];
	
	// Package all arguments as NSStrings in an NSArray
	NSMutableArray* args = [NSMutableArray arrayWithCapacity: argc - 1];
	for ( int i = 1; i < argc; ++ i )
	{
		[args addObject: [NSString stringWithCString: argv[i]] ];
	}
	
	// If we have a "--dpi" along with a corresponding argument ...
	unsigned int index = NSNotFound;
	if ( (index = [args indexOfObject: @"--dpi"]) != NSNotFound && index + 1 < [args count] )
	{
		// Parse it as an integer
		destinationSize = [[args objectAtIndex: index + 1] intValue];
		[args removeObjectAtIndex: index + 1];
		[args removeObjectAtIndex: index];
	}
	
	if ( [args count] != 2 || [args indexOfObject: @"--help"] != NSNotFound || destinationSize <= 0 )
	{
		fprintf( stderr, "resizePicture [options] source destination\n" );
		fprintf( stderr, "\t--dpi dpi\tSpecifies the destination size of the image in pixels\n" );
		fprintf( stderr, "\t--help\tPrint this help message\n" );
		return 1;
	}
	
	NSString* sourcePath = [args objectAtIndex: 0];
	NSImage* source = [ [NSImage alloc] initWithContentsOfFile: sourcePath ];
	[source setScalesWhenResized: YES];
	
	// Tip from http://www.omnigroup.com/mailman/archive/macosx-dev/2002-February/023366.html
	// Allows setCurrentPage to do anything
	[source setDataRetained: YES];
	
	if ( source == nil )
	{
		fprintf( stderr, "Source image '%s' could not be loaded\n", [sourcePath UTF8String] );
		return 2;
	}
	
	// The output file name
	NSString* outputFilePath = [args objectAtIndex: 1];
	
	NSSize sourceSize = [source size];
		
	NSSize size = NSMakeSize( destinationSize, destinationSize );
	
	[NSApplication sharedApplication];
	[[NSGraphicsContext currentContext] setImageInterpolation: NSImageInterpolationHigh];
	
	[source setSize: size];
	NSRect destinationRect = NSMakeRect( 0, 0, size.width, size.height );
	
	NSImage* image = [[NSImage alloc] initWithSize:size];
	[image lockFocus];
	    
	NSEraseRect( destinationRect );
	[source drawInRect: destinationRect
		fromRect: destinationRect
		operation: NSCompositeCopy fraction: 1.0];
	
	NSBitmapImageRep* bitmap = [ [NSBitmapImageRep alloc]
				     initWithFocusedViewRect: destinationRect ];
	
	NSData* data = [bitmap representationUsingType:NSPNGFileType properties:nil];
	[bitmap release];
	
	[[NSFileManager defaultManager]
	  createFileAtPath: outputFilePath
	  contents: data
	  attributes: nil ];
	
	[image unlockFocus];
	[image release];
	[pool release];
}

7. Naming Files

Introduction

During the development of CCTunes various options were implemented to name the files. The initial, and simplest, option was to just use the "Track ID" that is generated by iTunes to name the albums.

This had various drawbacks, the main one being that these ids are assigned per track, not per album. So, from one generation to the next, the HTML files in your library could receive a different id. Also, as I quite often move my mp3 files from one share to another, or from my iPod to my computer, these track ids did not remain the same in my library.

In conclusion, I thought this method was not suited for the job. So, I implemented another one.

First Attempt - MD5 Sums

When you're a bit familiar with md5 sums, they are the first thing that comes to mind in situations like these. They are a hash based on the data of the album, which means that whenever something in the album data changes, the album id changes too. If you combine that with a version control system (Subversion), like I do, it becomes easy to follow the changes in your album collection.

There are drawbacks to this approach too. The md5 sums are quite meaningless and rather long. They can, on top of this, be the same for two different albums so a mechanism would be needed to give a warning when the ids of two different albums are the same.
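
The md5 idea can be sketched in a few lines of Python with the standard hashlib module. Which album data goes into the hash is an assumption made for this example (artist, album title and track names), and the function name album_md5 is made up; the createId.command file may do it differently.

```python
import hashlib

def album_md5(artist, album, track_names):
    """Hash the album's metadata; any change to it yields a different id."""
    digest = hashlib.md5()
    for field in [artist, album, *track_names]:
        digest.update(field.encode("utf-8"))
        digest.update(b"\x00")  # separator, so ("ab", "c") and ("a", "bc") differ
    return digest.hexdigest()

before = album_md5("Morrissey", "Kill Uncle", ["Our Frank"])
after = album_md5("Morrissey", "Kill Uncle (remastered)", ["Our Frank"])
print(before)
print(after)
```

The ids are 32 hexadecimal characters long and change completely on the smallest edit, which illustrates both the appeal for version control and the "long and meaningless" drawback.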

So, this was not the method implemented in the current version. It is still present in the createId.command file in the package, after the exit, though, should you be interested in it.

Use what is there: cddb id's

But there already is a mechanism that does something like md5 for music albums, and it is a commonly accepted one: the CDDB mechanism. Not that this one is free from criticism either. One of the first things I encountered on Google was a fairly harsh criticism of the way the hash sums are calculated.

Nor was it easy to find out how it's calculated. It's open source, but documented only by means of the source. And most of the packages creating such an id did an immediate lookup on either the freedb or Gracenote servers and used data read directly from the CD, using lead-in times that did not seem retrievable from the ripped mp3 data.

In the end, I found a package generating an id from mp3 data, and it did not seem to need any information from the CD itself. I don't know if it will create correct id's in every case, but it looks like the closest I can get for now. I'll explain why I think it will never create correct id's.

Here is the script that is finally generating the id's.

Code listing 7.1: Generating freedb id's on a per album basis

#!/usr/bin/perl
# Prints a freedb-style disc id for one album.
# Arguments: the length of every track in milliseconds, in track order.

use strict;
use warnings;
use POSIX qw(floor);

# Sum of the decimal digits of a number, e.g. 207 -> 9.
sub cddb_sum
{
	my ($n, $ret) = (shift, 0);
	for (split //, $n) { $ret += $_ }
	return $ret;
}

# Takes the track lengths in seconds and returns the disc id as a string.
sub cddb_discid
{
	my @cdtoc      = @_;
	my $n          = 0;
	my $total_time = 0;

	foreach my $track (@cdtoc)
	{
		# Round each track length to the nearest whole second.
		my $track_time = floor($track + .5);
		# Digit sum of the start time of this track, before adding its length.
		$n          += cddb_sum($total_time);
		$total_time += $track_time;
	}
	# Checksum byte, total playing time in seconds, number of tracks.
	return sprintf("%08x", ($n % 0xFF) << 24 | $total_time << 8 | scalar(@cdtoc));
}

# iTunes stores track lengths in milliseconds; convert them to seconds.
my @track_seconds = map { $_ / 1000 } @ARGV;

print cddb_discid(@track_seconds), "\n";
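
For comparison, here is a straightforward Python port of the Perl script above. It is only a sketch to make the arithmetic easier to follow; the function names are made up, and the track lengths are the millisecond values from the iTunes XML file for the Morrissey album used as the example in the next section.

```python
import math

def digit_sum(number):
    """Sum of the decimal digits, e.g. 207 -> 9."""
    return sum(int(digit) for digit in str(number))

def disc_id(track_times_ms):
    """Disc id from track lengths in milliseconds, like the Perl script."""
    n = 0
    total = 0
    for ms in track_times_ms:
        seconds = math.floor(ms / 1000 + 0.5)  # round to the nearest second
        n += digit_sum(total)                  # start time of this track
        total += seconds
    return "%08x" % (((n % 0xFF) << 24) | (total << 8) | len(track_times_ms))

KILL_UNCLE_MS = [205061, 201743, 209084, 212741, 176666,
                 119797, 203363, 334602, 211800, 112431]
print(disc_id(KILL_UNCLE_MS))  # 7907c40a
```

The last two hex digits are the number of tracks, the four before that the total playing time in seconds, and the first two the checksum over the track start times.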

Why the id's don't correspond

An explanation of how the cddb id's are calculated can be found at freedb.org. As can be seen there, the only thing that matters is the number of tracks and the track durations. There is a little program at http://jeremy.zawodny.com/c/discid/ that you can use to generate discid's, or to verify what is going on in our case. We need to get the information out of the mp3's, or out of the iTunes XML file. This turned out to be troublesome.

As an example, I will try to explain the calculation for Morrissey's Kill Uncle album. Not because it's particularly special, just because the CD was lying around at the time I tried these things, so it makes a good example.

We get timings from different sources. The iTunes GUI, of course. Also the iTunes XML file, which is where we would prefer to get the timings, since all of this is based on XML manipulation. But also discid, which reads the Table Of Contents directly from the CD. I needed to manipulate discid a bit to get it to print the timings, but that was not too difficult to do. And last, the timings from the freedb website itself.

Track   iTunes XML  freedb   discid  iTunes UI  Cumulative ms,   Cumulative ms,
number  (ms)        website  (s)     (m:ss)     iTunes XML       iTunes XML
                    (m:ss)                      rounded down     rounded to nearest
0                            2
1       205061      3:25     205     3:25       61               61
2       201743      3:22     202     3:21       804              -196
3       209084      3:29     209     3:29       888              -112
4       212741      3:33     212     3:32       1629             -371
5       176666      2:57     177     2:56       2295             -705
6       119797      2:00     120     1:59       3092             -908
7       203363      3:23     203     3:23       3455             -545
8       334602      5:34     334     5:34       4057             -943
9       211800      3:32     212     3:31       4857             -1143
10      112431      1:52     113     1:52       5288             -712

The first timing column shows the entries from the XML file. It turns out these are stored in milliseconds, so divide them by 1000 to convert to seconds. The differences between these values and the values from the iTunes UI immediately show how strangely uncorrelated these figures are: track 2 seems to indicate we need to round down, track 4 seems to indicate the opposite. So there is no simple rule to go from one to the other.

The next column shows the entries from the freedb website. They would seem to be correct nearest-integer rounding, if it wasn't for track 8, which is all of a sudden rounded down to 334s even though nearest-integer rounding would give 335s. The relation with the discid timings is also interesting, because discid is the only program I tried that gave me the correct ID. The CD table of contents has a lead-in, which is the time in seconds at which the first track starts. This is indicated in the table as the duration of track 0. Here, too, there is no simple correlation between the values from the XML file and the values from discid: tracks 1 to 3 seem to indicate rounding to the nearest integer, but track 4 breaks this rule by all of a sudden expecting rounding down.

So, how to proceed? It could be an interesting idea to inspect the cumulative rounding errors; I've included them in the last two columns, for rounding down and for rounding to the nearest integer. Neither of the two columns seems to give a key to the solution of the problem, but maybe I'm just missing things.
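
The two cumulative columns of the table can be recomputed with a short Python sketch, which makes it easy to try other rounding rules as well. The function names are made up for this example; the input is the iTunes XML column of the table.

```python
TRACK_MS = [205061, 201743, 209084, 212741, 176666,
            119797, 203363, 334602, 211800, 112431]

def round_down(ms):
    """Whole seconds, discarding the remainder."""
    return ms // 1000

def round_nearest(ms):
    """Whole seconds, rounding halves up."""
    return (ms + 500) // 1000

def cumulative_error(track_ms, rounder):
    """Running total of (real length - rounded length) in milliseconds."""
    total = 0
    errors = []
    for ms in track_ms:
        total += ms - rounder(ms) * 1000
        errors.append(total)
    return errors

print(cumulative_error(TRACK_MS, round_down))
print(cumulative_error(TRACK_MS, round_nearest))
```

With rounding down the error only grows, to more than five seconds by the last track; with rounding to the nearest second it stays within about a second of zero.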

So, for now, we will stick to rounding to the nearest integer, which gives us a pretty close ID, but not an exact match. Any ideas on getting an exact match are welcome. For the example we get a freedb id of 7907c40a instead of the expected 7307c30a. You will have to admit it's similar.
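
To see where the two ids differ, the following Python sketch applies the disc id formula to track start offsets that include the lead-in, fed with the discid column and the 2 second lead-in from the table. It is a simplified sketch working in whole seconds; real tools start from the frame offsets in the CD's table of contents. The function names are made up for this example.

```python
def digit_sum(number):
    """Sum of the decimal digits, e.g. 207 -> 9."""
    return sum(int(digit) for digit in str(number))

def freedb_id(lead_in_s, track_lengths_s):
    """Disc id from the start offset of every track, in whole seconds."""
    offsets = []
    start = lead_in_s
    for length in track_lengths_s:
        offsets.append(start)   # this track starts where the previous one ended
        start += length
    n = sum(digit_sum(offset) for offset in offsets)
    total = start - lead_in_s   # end of the disc minus start of the first track
    return "%08x" % (((n % 0xFF) << 24) | (total << 8) | len(offsets))

DISCID_SECONDS = [205, 202, 209, 212, 177, 120, 203, 334, 212, 113]
print(freedb_id(2, DISCID_SECONDS))  # 7307c30a
print(freedb_id(0, DISCID_SECONDS))  # same lengths, but without the lead-in
```

With the discid timings and the lead-in the expected id comes out, so for this album the mismatch is explained by two things: the lead-in shifts every start offset that goes into the checksum, and the per-track seconds from discid differ from the rounded iTunes values for tracks 4, 8 and 10.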

Todo: something about MP3Browser

Todo: url's for the cddb-id perl script.

8. Making HTML

One little stage is still left, and that is formatting the XML files so that they make nice HTML files that we can render in a browser. A full explanation of all things you can do in XSL is out of the scope of this document, but this very basic style sheet should provide a good starting point for anyone willing to get his or her hands dirty.

Code listing 8.1: This very basic XSL style sheet creates a list of all albums in the library

<xsl:stylesheet
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        version="1.0">
            
  <xsl:key name="album-sort-keys" match="AlbumItem" use="SortKey" />
  <xsl:template match="xmlLibrary">
    <html>
      <head>
        <title>Music Library - Exported from iTunes</title>
      </head>
      <body>
        <xsl:for-each select="AlbumItem">
          <xsl:sort select="SortKey"/>
          <xsl:value-of select="Album" /> - <xsl:value-of select="Artist" /><br/>
        </xsl:for-each>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>

This style sheet contains only a couple of instructions that are specific to XSL. Anyone who has ever seen some HTML should recognize the HTML structure within it: the <html> element, the <head> and the <body> are all there.

In the body there is just one big loop, indicated by the xsl:for-each tag. Its first instruction indicates how to get the albums sorted: the XML generated by the script has a sort key ready for your use. Then, using the <xsl:value-of> directive, the loop selects the album and the artist, prints a dash between them and a line break (<br/>) after them.

How easy can it be? You can start from this to lay out the HTML like you want it and use the other XSL files in the package for further inspiration to do the trickier things. Good luck!
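To try the style sheet on its own, you can feed it a tiny hand-made input file. The sketch below makes some assumptions: the style sheet from listing 8.1 has been saved as albums.xsl in the current directory, the album data is invented, and either xsltproc or xmlstarlet (the tool the package bundles as xml) is installed. The function name render_albums is made up for this illustration.

```shell
#!/bin/bash
# Transform INPUT with STYLESHEET using whichever XSLT processor is installed.
render_albums() {
  local stylesheet=$1 input=$2
  if [ ! -r "$stylesheet" ] || [ ! -r "$input" ]; then
    echo "usage: render_albums STYLESHEET INPUT" >&2
    return 2
  fi
  if command -v xsltproc >/dev/null 2>&1; then
    xsltproc "$stylesheet" "$input"
  elif command -v xmlstarlet >/dev/null 2>&1; then
    xmlstarlet tr "$stylesheet" "$input"
  else
    echo "no XSLT processor found (tried xsltproc, xmlstarlet)" >&2
    return 127
  fi
}

# Invented two-album sample with the elements the style sheet reads.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<?xml version="1.0"?>
<xmlLibrary>
  <AlbumItem>
    <SortKey>b example album</SortKey>
    <Album>Example Album</Album>
    <Artist>Example Artist</Artist>
  </AlbumItem>
  <AlbumItem>
    <SortKey>a another album</SortKey>
    <Album>Another Album</Album>
    <Artist>Another Artist</Artist>
  </AlbumItem>
</xmlLibrary>
EOF

# Assumption: listing 8.1 was saved as albums.xsl in the current directory.
if [ -r albums.xsl ]; then
  render_albums albums.xsl "$sample" > albums.html
fi
rm -f "$sample"
```

Because the loop sorts on SortKey, the second album in the sample should end up first in the generated albums.html.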

9. One Simple Package

Now that we've outlined the techniques, the final bash script that does everything is pretty straightforward.

Code listing 9.1: generate.command bash script does it all

#!/bin/bash

debug=0

if [ $debug -ne 0 ]; then
  echo "0 [$0] 1 [$1] 2 [$2] 3 [$3] 4[$4]" > /tmp/debuginfo
fi

# get my location (don't worry about this one)
# this one is used by other scripts too
export WORKINGDIR=`dirname "$0"`

# "/Users/kristof/Desktop/Library.xml" for instance
musicXmlFile="$1"

# "/Users/kristof/Desktop/New Folder" for instance
destinationDir="$2"
destinationDirUrl=`"${WORKINGDIR}/urlencode.sh" "$2"`

# makeFinalXHtml or makeListXHtml (the .xsl extension is appended further down)
styleSheet="$3"

"${WORKINGDIR}/xml" tr "${WORKINGDIR}/../xml/makeProperXml.xsl" \
	"$musicXmlFile" \
	> /tmp/mymusic$$.xml

"${WORKINGDIR}/xml" tr "${WORKINGDIR}/../xml/makeMainXml.xsl" \
    -s "destinationDir=$destinationDirUrl" \
    /tmp/mymusic$$.xml > /tmp/mymusic2$$.xml

# get list with old id's, the ones that were assigned on a starting with number one basis
OLDIDS=`"${WORKINGDIR}/xml" sel -T -t -m xmlLibrary/includeXml -v "concat(node(),' ')" /tmp/mymusic2$$.xml`

echo "<?xml version=\"1.0\"?>" > "${destinationDir}/mymusic.xml"
echo "<xmlLibrary xmlns:xi=\"http://www.w3.org/2001/XInclude\">" >> "${destinationDir}/mymusic.xml"

for oldit in $OLDIDS; do
    file="${oldit}.xml"
    PICTUREURL=`"${WORKINGDIR}/../bin/xml" sel -T -t -v /AlbumItem/Picture/PictureURL \
        "${destinationDir}/${file}"`
    NEWID=`"${WORKINGDIR}/createId.command" "${destinationDir}/$file"`
    "${WORKINGDIR}/getPicture" "${PICTUREURL}" "${destinationDir}/${NEWID}-high"
    "${WORKINGDIR}/resizePicture" --dpi 60 "${destinationDir}/${NEWID}-high" \
        "${destinationDir}/${NEWID}-low.png"
    "${WORKINGDIR}/xml" ed -u /AlbumItem/AlbumID -v "$NEWID" "${destinationDir}/$file" \
        > "${destinationDir}/${NEWID}.xml"
    rm -f "${destinationDir}/$file"
    "${WORKINGDIR}/xml" tr --xinclude \
        "${WORKINGDIR}/../../Resources/English.lproj/makeFinalXHtmlDetail.xsl" \
	    "${destinationDir}/${NEWID}.xml" > "${destinationDir}/${NEWID}.html"
    echo "<xi:include href=\"${NEWID}.xml\" />" >> "${destinationDir}/mymusic.xml"
done

echo "</xmlLibrary>" >> "${destinationDir}/mymusic.xml"

# for mymusic.xml file, apply main stylesheet to obtain index file

"${WORKINGDIR}/xml" tr --xinclude "${WORKINGDIR}/../../Resources/English.lproj/${styleSheet}.xsl" \
   "${destinationDir}/mymusic.xml" > "${destinationDir}/mymusic.html"

/bin/cp "${WORKINGDIR}/../../Resources/default.css" "${destinationDir}/default.css"

/usr/bin/open "${destinationDir}/mymusic.html"

if [ $debug -eq 0 ]; then
  /bin/rm /tmp/mymusic$$.xml
  /bin/rm /tmp/mymusic2$$.xml
fi

For those who like it, having a command line script is nice. You can use it in your own scripting, without having to figure out which AppleEvents to send to an application using AppleScript. Those who have ever done anything like this will appreciate having a command line script ready to trigger from the crontab.

And those who don't can just use the package, which is just a GUI wrapper around this script. You can fill in the arguments with an intuitive interface and everything should work as expected. If nothing goes wrong, that is... feedback is still a bit underdeveloped in the GUI at the moment.
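For unattended runs, a small wrapper around generate.command can check its inputs before starting. This is only a sketch: the location of generate.command, the destination folder and the name publish_library are assumptions to adapt to your own setup. The three arguments are the ones the script above reads as $1, $2 and $3.

```shell
#!/bin/bash
# Hypothetical wrapper around generate.command for unattended runs.
# All three paths are assumptions; adjust them to your own setup.
GENERATE="${GENERATE:-$HOME/CCTunes/generate.command}"
LIBRARY_XML="${LIBRARY_XML:-$HOME/Music/iTunes/iTunes Music Library.xml}"
DEST_DIR="${DEST_DIR:-$HOME/Sites/music}"

publish_library() {
  # Style sheet name without the .xsl extension, as generate.command expects.
  local style=${1:-makeListXHtml}
  [ -x "$GENERATE" ]    || { echo "generate.command not found: $GENERATE" >&2; return 1; }
  [ -r "$LIBRARY_XML" ] || { echo "library file not readable: $LIBRARY_XML" >&2; return 1; }
  mkdir -p "$DEST_DIR"  || return 1
  "$GENERATE" "$LIBRARY_XML" "$DEST_DIR" "$style"
}

# Example crontab entry (every night at 03:00), assuming this file is saved
# as ~/bin/publish-library.sh:
#   0 3 * * * /bin/bash -c '. "$HOME/bin/publish-library.sh" && publish_library'
```

Keep in mind that generate.command ends by opening the result with /usr/bin/open, which you may want to remove for a cron job.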

10. Tips and Tricks

Making your own template

Making your own template is not too difficult. Two templates are provided already. You might want to read up on the XSLT specification, and you may want to peek into the generate.command script which is in the CCTunes package.

If you need any help, asking will never hurt. I don't always have time to answer everything, but you may be lucky. If you feel like giving back the template once it's finished that would be nice. If you want to link to me for giving inspirational credit, that would be nice too.

Combining templates

At the moment there are two templates: one for making a list-like HTML page, and one for making a "peephole" view of the library in one big HTML page. The two approaches can be combined by using the same destination folder for the output.

Avoiding strange characters

My hosting provider has configured the server to announce ISO-8859-1 as the encoding of HTML files in the HTTP headers, but the HTML files are encoded in UTF-8. Since my browser prefers the HTTP headers, I needed to convert the HTML files to PHP files that send a correct header. Not difficult to do, as this little script will show, but interesting to know.

Code listing 10.1: Converting html files to php files with UTF-8 encoding header

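#!/bin/bash

# Wrap every .html file in the current directory in a .php file
# that sends the UTF-8 Content-Type header first.
for htmlfile in *.html; do
  [ -e "$htmlfile" ] || continue   # no .html files: the glob stays unexpanded
  phpfile="${htmlfile%.html}.php"
  echo "<?php header('Content-Type: text/html; charset=utf-8'); ?>" > "$phpfile"
  cat "$htmlfile" >> "$phpfile"
done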

Using ImageMagick or GraphicConverter

If you are dissatisfied with the quality of the small images in the peephole view, like I am, you can choose to use ImageMagick to convert the images from higher size to lower size.

You will need to install it using fink or compile it yourself; both should be easy. With a simple command on the prompt you can then convert all images like this:

Code listing 10.2: Generating thumbnails with ImageMagick convert

find outputdir -type f -name '*-high' -exec sh -c 'convert "$1" -resize 60x60 "${1%-high}-low.png"' sh {} \;

If you have a utility like GraphicConverter, which came for free with my laptop, you can easily generate 60 by 60 thumbnails from the images by using the batch mode to set the Max Size.

Using perl to get the images out of the mp3 files

Warning: This section is outdated with respect to CCTunes. It is here for documentation purposes only.

The images used to be extracted from the MP3 files using perl libraries and a little perl script. This is how it was done.

A link that got me on the way was on the ever interesting macosxhints website, http://www.macosxhints.com/article.php?story=20030429003250559, in which one of the comments describes almost perfectly what needs to be done. Almost, because it does not cover how to open a file based on its URL. Of course, this is not hard to do with some parsing, and I have my "Perl in a Nutshell" book on my desktop, but while getting used to this perl thing I thought there must already be a module out there that does this for me. That module is URI::URL, which we've installed, and which was mentioned in the book.

There we have our perl script, which takes as arguments the file URL of the mp3 and a filename to write the png file to. If this script does not find the corresponding tag, it just creates a link to a nopic.png file instead of the destination file, so that we can put in a nice icon when the image is missing in the mp3.

Code listing 10.3: The getPicture script gets the image from the mp3 file

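#!/usr/bin/perl

use MP3::Info;
use URI::URL;

# First argument: file URL of the mp3; second argument: destination image file.
my $url1 = URI::URL->new(shift);
my $unixpath = $url1->unix_path();
my $e = $ARGV[0];

# Remove any previous image or symlink at the destination.
if ( -e $e || -l $e ) {
    unlink $e;
}

# Read the raw ID3v2 tag and look for the PIC (ID3v2.2 picture) frame.
my $result = '';
if (my $mp3tag_id3v2 = get_mp3tag($unixpath, 2, 1)) {
    $result = $mp3tag_id3v2->{"PIC"};
    $result = '' unless defined $result;
    # Strip the six bytes in front of the image data: encoding, 3-byte format,
    # picture type and, assuming an empty description, its terminator.
    $result =~ s/^(.....).//s;
}
if (length($result) > 0) {
    open(my $f, '>', $e) or die "Cannot write $e: $!";
    binmode $f;
    print $f $result;
    close $f;
}
else {
    # No picture found: link to the placeholder icon instead.
    symlink("./nopic.png", $e);
}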

To invoke this for all the MP3s in the library, a little hack was needed: it added a little comment to the XML file, so that grepping for the getPicture command produced a bash script that used the perl script to extract all the images.

I understand that Microsoft's XSLT implementation offers something like callbacks, with which you can call a script from within your XSL file. That's a nice idea, but since it isn't in the XSLT specification, depending on it is not very portable. If that does not bother you, you might want to check it out. The approach described here did work too, however.

11. Conclusion

So, if you want to do a quick XHTML list of your mp3 collection, you need not do much. If you want, there is a package that you can download, and hopefully with the explanations given here you are even able to adapt everything to your needs.

12. Future Wishlist and Todo

In the beginning I was doing this to get to know something about XSLT and about perl scripting etc... However, things have evolved in the Cocoa direction and it has proven to be a good choice so far, since with little programming effort a very nice result was delivered. Things are pretty much the way I want them to be, so don't expect any drastic changes.

A short list of the things I want to do however:

  • A way to show the play count, which is usually a good measure of how much I like an album I bought. Related to this, a way to merge the xml files, since I listen to albums both at work and at home.
  • You still need the iTunes xml file. It would be nice if it could take extra information from this XML file, but get the main information from the mp3 files themselves.
  • One XSL template to do the whole transformation.
  • Jaguar, Tiger versions. A Windows version, why not?
  • Probably even more...

And a short list of the things I need to do:

  • Update documentation to reflect the current state of CCTunes.
  • Make sure versioning is in place.
  • Provide older versions online.

And a list of things I won't do (and why):

  • There are handy applications to get the Album's artwork into iTunes. CCTunes will not try to replace those, since for those that want to use Free Software, Clutter is available at sourceforge and does the job. A shareware application that, amongst other things, also does this is Synergy.

13. Resources

Downloading all the scripts

These scripts can be found in the package of the application at http://www.coin-c.com/CCTunes/cct-download.html. Surely you could copy and paste everything from this file, but downloading and untarring is a bit safer, since sometimes the whitespace is important.

Resources used in this article

Link Description
http://www.w3.org/TR/xslt#key The XSLT specification. It is the dry stuff, the spec and just the spec.
http://www.jenitennison.com/xslt/grouping/muenchian.html This could be a helpful technique to aid in speeding up the XML parsing we are doing. To be investigated.
http://www.xmldatabases.org/WK/blog/1086?t=item This is the article that tells about how to clean up the property list that iTunes has as an XML file.
http://www.oreillynet.com/lpt/wlg/3130 Nice article about what's in iTunes, and how.
http://www.macosxhints.com/article.php?story=20030429003250559 A tip about how to get images from iTunes, in which the perl hint occurs.

Other Resources

Name Link Description
mp3report http://mp3report.sourceforge.net/ Suppose you have no iTunes XML file, mp3report could be used to generate one. It runs over your collection of mp3 files and does about the same as all these scripts.
AudioHiJack http://www.rogueamoeba.com/audiohijack/ Very good application to record online radio with. Shareware at a nice price.
Doug's AppleScript for iTunes http://www.malcolmadams.com/itunes/index.php Ultimate resource for everything AppleScriptable in iTunes.
ImageMagick http://www.imagemagick.org/ Whatever you need to convert images if GraphicConverter does not seem to do the trick. Available under a weird license, which I think is as free as possible for an image manipulation program that supports a wide range of formats like this one.
Clutter http://www.sprote.com/clutter/ Nice application to get the Artwork in your iTunes. It has not been updated for a while however.
iCatalog http://www.kavasoft.com/iTunesCatalog/ Shareware application that does the same as the scripts in this article. Except, without paying you only get albums from artists starting with letters A-E. Easy workaround: put an A-E in front of all your artists or so? Naaah, just pay or use the scripts here.
iPlaylist http://iplaylist.knownworld.net/index.html Donationware application to publicize your library on a web page that looks like you're in iTunes itself. Does not extract images at all.
itunes2html http://disobey.com/detergent/code/itunes2html.txt Somebody took the perl script approach for converting iTunes playlists to HTML. iPlaylist is based on this.

Document Layout

This document was made with the excellent guide style sheet also used to make the gentoo documentation, with some minor adaptations done by myself.

The XSLT file to generate all of this, is at http://dev.gentoo.org/~swift/local/guide.xsl.



This work is licensed under a Creative Commons License.
Updated $LastChangedDate: 2005-01-07 20:53:38 +0100 (Fri, 07 Jan 2005) $
Kristof Van Landschoot
Author

Summary: This is a short description of the things I did to get my library published from the XML file that iTunes keeps to a simple HTML file, album based.
Copyright 2003-2004 Coin-C bvba. Questions, Comments, Corrections? Email cctunes@coin-c.com.