July 25, 2004

Delete reworked

I've revamped the way that "deleting" songs works on NewMusicRadio. Now the "deleted" songs are just added to a list of files to exclude, and the original files are left as-is.

This means that:

  1. The $deleteFilter is deprecated because none of the mp3 files are actually harmed when "deleted"
  2. I can decide I don't like tracks that came from CDs without ending up with random tracks being deleted from my music collection (even if they aren't tracks I like)
  3. Most importantly, you don't need the --ignore-length switch on wget anymore
  4. You also don't need the chmod g+w because the script doesn't have to overwrite files
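
As a rough illustration of the exclude-list idea (not the actual NewMusicRadio code), here is a minimal shell sketch. The function names and the idea of keeping the list as one path per line are assumptions made for the example.

```shell
#!/bin/sh
# Sketch only: function names and file layout are hypothetical,
# not taken from NewMusicRadio itself.

# "Delete" a track by recording its path; the mp3 itself is never touched.
# $1 = exclude list file, $2 = path of the track to hide
exclude_track() {
    printf '%s\n' "$2" >> "$1"
}

# Print every path in the recent list that is not in the exclude list.
# $1 = recent list file, $2 = exclude list file
# grep flags: -F fixed strings, -x whole-line match, -v invert, -f patterns from file
playable_tracks() {
    if [ -s "$2" ]; then
        grep -vxFf "$2" "$1"
    else
        cat "$1"    # nothing has been excluded yet
    fi
}
```

Because the exclude list is just a text file, "undeleting" a track is a matter of removing its line again.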

I've added a few lines of statistics output to my cron script because I'm curious as to how many files drop off the list as "too old" each day, and how many new files are added. These are the extra find and wc lines.
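
The cron script itself isn't reproduced in this post, so the following is only a sketch of what such statistics lines could look like. The function names and the one-day window for "new" files are assumptions.

```shell
#!/bin/sh
# Sketch only: the real cron script is not shown in this post.

# Count mp3s under directory $1 last modified more than $2 days ago ("too old").
count_expired() {
    find "$1" -name '*.mp3' -mtime +"$2" | wc -l | tr -d ' '
}

# Count mp3s under directory $1 modified within the last $2 days ("new").
count_new() {
    find "$1" -name '*.mp3' -mtime -"$2" | wc -l | tr -d ' '
}

# Example report line; it would need to run before the clean-up find,
# while the expired files are still on disk:
# echo "Too old: $(count_expired /home/tunes/downloads/ 42), new: $(count_new /home/tunes/downloads/ 1)"
```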

The final big change is with the line:
find /home/tunes/downloads/ -name \*.mp3 -mtime +42 -exec rm {} \;

This deletes any mp3s older than 42 days in the directory /home/tunes/downloads/ (or any directories below that). In detail:

  • /home/tunes/downloads/ this is the directory where all my downloaded (mp3 blogs and Soulseek) mp3s live
  • -name \*.mp3 find should only look for files called <something>.mp3
  • -mtime +42 only look for files last modified more than 42 days ago. The + is important: -42 would mean files modified less than 42 days ago
  • -exec rm {} \; for each file found, run rm <filename> to delete it. The {} is replaced by the filename, and the \; marks the end of the rm command
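
Since -exec rm is unforgiving, one way to check the expression first is to swap the action for -print. A minimal sketch, where purge_old and its --dry-run argument are names made up for this example rather than part of the original script:

```shell
#!/bin/sh
# Sketch only: purge_old and --dry-run are illustrative, not from the cron script.

# Remove mp3s under $1 last modified more than $2 days ago.
# With --dry-run as the third argument, list the matches instead of deleting.
purge_old() {
    if [ "${3:-}" = "--dry-run" ]; then
        find "$1" -name '*.mp3' -mtime +"$2" -print
    else
        # "{} +" hands many filenames to a single rm instead of one rm per file
        find "$1" -name '*.mp3' -mtime +"$2" -exec rm -- {} +
    fi
}
```

Running purge_old /home/tunes/downloads/ 42 --dry-run would show what is about to go before anything is removed.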

There's still more to play with to link the mp3s downloaded from mp3 blogs to their source, but I think the bare bones of it are now working correctly.

Posted by Adrian at 09:24 PM

July 19, 2004

Your First Assumption Should Be "It's My Fault"

When writing code, it's always a good idea to remember that when things aren't behaving as you expect, the chances are you've done something wrong. Unfortunately, it's always tempting to explain it away as a bug in someone else's code, or to find some other reason that lets you off the hook.

I try not to fall into that trap, but in my rush to get NewMusicRadio finished over the weekend I succumbed. I let myself think that the reason the cron job was taking most of the day to check a few websites was either NTL being flakey as usual, or that some of the MP3 blogs had some form of advanced wget detection script running which left connections hanging for ages.

Stupid Adrian. A brief look into it this morning shows there's some problem with using the --ignore-length option of wget that results in neither wget nor the web server realising that the transfer has finished in some cases.

So, until I get some time to either work out how to stop the connections hanging, or code an alternative approach to deleting tracks, I've removed the --ignore-length options from the example cron script, which means the delete functionality won't work for now.

Posted by Adrian at 11:43 AM

July 18, 2004

Your Personal New Music Radio Station

There's been a lot of talk about MP3 blogs lately; I've followed newflux and sleeve notes for quite a while now, and there's a growing list of others in my blogroll.

Getting new music to listen to is always good, but the way my system was set up meant that listening to the MP3s required interrupting whatever music I was already listening to (and I'm always listening to something), and often I need a couple of listens to a track before it worms its way into my head.

For a while I've been meaning to write some code to get the latest MP3s from the RSS feeds and somehow insert them into my listening schedule for the day. Then I came across Jeffrey Veen's post about using wget to download the MP3s. That took care of acquiring the tracks, but didn't give me an easy way to listen to them.

So, some experimenting and perl hacking later, I now have NewMusicRadio. There are two parts to this, and when combined they provide my own personal radio station of recent additions to my music collection and an assortment of new discoveries from Soulseek and the MP3 blogs I subscribe to.

First off is the cron job script which runs each day - downloading the new tracks from an assortment of MP3 blogs, and generating the list of MP3s recently added to my collection. Jeff explained most of the wget options, and I've added two others:

  • -nv This just cuts down on the amount of output generated to reduce the size of my cron logs
  • --ignore-length This makes wget ignore the Content-Length header sent by the server, so it won't download a new copy of a file just because the sizes don't match. That allows me to overwrite files I don't like with a zero-byte file and wget won't just get me another copy next time it runs. The chmod g+w after each wget is just because my cron job doesn't run as the same user as the CGI script that might want to "delete" an MP3; they are in the same group, so the chmod makes sure the downloaded MP3s can be overwritten.

Rather than give wget a list of URLs for each MP3 blog, I'm running a different wget for each blog so that I can filter the downloaded MP3s into a directory for that blog, just to help me keep track of which tracks I got from where.
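
A sketch of the one-wget-per-blog arrangement follows. The blog names and URLs are placeholders, and the recursion options (-r -l1 -nd -N -np -A mp3) are an assumed typical set rather than a copy of the real cron script; -P is wget's directory-prefix option. The functions only print the commands, so nothing is downloaded.

```shell
#!/bin/sh
# Sketch only: blog names and URLs are placeholders, and the recursion
# options are an assumption, not copied from the real cron script.

DOWNLOAD_ROOT=/home/musicfiles/downloads

# Print the wget command for one blog: $1 = short name, $2 = URL.
# -P puts the downloads in a per-blog directory; -A mp3 keeps only mp3 files.
wget_command_for() {
    printf 'wget -nv -r -l1 -nd -N -np -A mp3 -P %s/%s %s\n' \
        "$DOWNLOAD_ROOT" "$1" "$2"
}

# One wget per blog, so each blog's tracks land in their own directory.
print_all_commands() {
    while read -r name url; do
        wget_command_for "$name" "$url"
    done <<'EOF'
exampleblog http://blog.example.com/
anotherblog http://another.example.org/
EOF
}

# print_all_commands    # uncomment to see the generated command lines
```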

Then I use find to create a list of tracks which have been added in the past 42 days:
find /home/musicfiles/ -name \*.mp3 -mtime -42 -print > /home/musicfiles/recentfiles.txt

  • /home/musicfiles/ This is where all my MP3s live. When I buy a new CD, it gets ripped and the MP3s are put in /home/musicfiles/<artist>/<album>/<track_n.mp3>, and everything I download with Soulseek or from an MP3 blog gets put into /home/musicfiles/downloads. This parameter tells find where to look for updated files.
  • -name \*.mp3 This option tells find just to look for files called <something>.mp3. The \ is needed before the * to ensure the * isn't expanded by the shell and reaches find as a literal *
  • -mtime -42 tells find to only include files which were modified in the past 42 days (six weeks)
  • -print output the filenames which match
  • > /home/musicfiles/recentfiles.txt redirect the output into the file we'll give to the playlist generator
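
The same find can be wrapped in a small function so the directory, the age window and the output file are parameters. This is only a sketch, and build_recent_list is a name invented for the example.

```shell
#!/bin/sh
# Sketch only: build_recent_list is an illustrative wrapper, not part of the cron script.

# Write the paths of mp3s under $1 modified in the last $2 days to file $3.
# Quoting '*.mp3' does the same job as the backslash in \*.mp3:
# either way the shell passes the pattern to find untouched.
build_recent_list() {
    find "$1" -name '*.mp3' -mtime -"$2" -print > "$3"
}
```

For the setup described above, the call would be build_recent_list /home/musicfiles/ 42 /home/musicfiles/recentfiles.txt.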

Now that I've got a load of MP3s and a list of which ones are new, I use a little perl script installed on my Apache webserver to pick random songs from the list and deliver them to my media player. Musicmatch seems to get the artist/title information when it's available, but Winamp just displays the URL the track is being streamed from, which isn't as useful.
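
The real playlist generator is a Perl CGI script; the shell sketch below only illustrates the core idea of picking random lines from the list and turning them into URLs. The function name and the example URL prefix are assumptions, and it relies on shuf from GNU coreutils.

```shell
#!/bin/sh
# Sketch only: the real script is Perl. The function name and URL prefix
# are assumptions. Requires shuf (GNU coreutils).

# Print a plain m3u playlist (one URL per line) of $2 random tracks
# taken from the list file $1.
# $3 = filesystem prefix to strip, $4 = URL prefix to put in its place.
make_playlist() {
    shuf -n "$2" "$1" | while IFS= read -r path; do
        printf '%s%s\n' "$4" "${path#"$3"}"
    done
}
```

For example, make_playlist /home/musicfiles/recentfiles.txt 10 /home/musicfiles http://localhost/music > playlist.m3u would write ten random tracks. Paths containing spaces or other special characters would still need URL-encoding, which this sketch leaves out.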

When I point my web browser at the script, I get a form to fill in (see an example). Once I've chosen how many tracks I want in the playlist I can either listen to the playlist straight away - generating an m3u playlist which automatically fires up Musicmatch - or have a look at which tracks are chosen. The view playlist option (which looks like this) shows me which tracks are in the playlist, lets me listen to it, and lets me delete songs that have been downloaded (as opposed to ripped from CD) if I decide I don't like them.

If anyone fancies having a play with it, I've made it available over here. I'd be interested in hearing any thoughts about it, or suggestions for improvement.

Posted by Adrian at 08:47 PM