Python Examples

Python is another scripting language, and two things about it caught my eye. First, the GUI module included with it works on the Macintosh (unlike with Perl!), and second, it is supposedly very popular with people at Google. Since I am a big fan of Google, I figured I would give it a look.

My First Really Useful Python Script

Slowly, slowly I'm learning Python. And this evening (morning?) I finished my first really useful Python script (from my point of view). It duplicates what my clipsort.pl Perl script does - but does it better. Why better? Because it includes dialog boxes for picking and saving the file, and it can be turned into an app that I can run like any other applet. Very cool.

Because the source is kinda long I will put it on its own page, python_dom_sort. The big thing for me with this applet is that I used the Document Object Model (light) to manipulate data in an XML document. It was much easier than I expected. While learning DOM was a tad confusing, it was faster than writing a parser to create my own funky XML object tree. Python seems so powerful because the modules that come with it… are well… very powerful.

The source code isn't that long.

What also makes Python powerful is that it is built for manipulating lists. With surprisingly few lines and a rich grammar you can cycle through lists of things, filtering out what you don't want and keeping just the pieces that you do.

Overview

What I did was create a simple Main that provides the file dialogs from EasyDialogs, a Mac-specific UI package. I guess I could have used Tkinter (and maybe one day I will), but these worked so easily that they reminded me of AppleScript, so I used them. Besides, it's not like I can take this script to another box. It's designed to work with iMovie files which… will only appear on a Mac!

  1. Select the file to sort.
  2. Open the file and import it into a minidom object.
  3. Use getElementsByTagName() to return a list of array elements.
  4. Find the one we want (only one of them is interesting right now - the trash one).
  5. Build a list of tuples that contain 4 items (the name of the dict object, a ref to the dict object, and copies of the refs to the dict's siblings - previous and next).
  6. Sort the list of tuples.
  7. Walk through the array and swap/update/exchange dict object references to match the sorted list.
  8. Return the DOM object.
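
The steps above can be sketched roughly as follows. This is a minimal sketch, not the actual python_dom_sort source: the sample XML, the clip_name helper, and the trick of parking a placeholder comment node in each slot are all assumptions made for illustration, and the file dialogs are left out. It only shows one way to reorder the dict elements under an array while leaving the whitespace text nodes where they are.

```python
from xml.dom import minidom

# Hypothetical stand-in for an iMovie file: three dicts separated by newlines
SAMPLE = (
    "<plist><array>\n"
    "<dict><string>clip C</string></dict>\n"
    "<dict><string>clip A</string></dict>\n"
    "<dict><string>clip B</string></dict>\n"
    "</array></plist>"
)

def clip_name(node):
    # Assumed sort key: the text of the first <string> inside the dict
    return node.getElementsByTagName("string")[0].firstChild.data

def sort_first_array(dom):
    # Steps 3-4: pick the array we care about (here simply the first one)
    sort_root = dom.getElementsByTagName("array")[0]
    # Step 5: collect only the element children, skipping whitespace text nodes
    dlist = [d for d in sort_root.childNodes if d.nodeType == d.ELEMENT_NODE]
    # Park a placeholder comment in each element's slot
    placeholders = []
    for d in dlist:
        marker = dom.createComment("slot")
        sort_root.replaceChild(marker, d)
        placeholders.append(marker)
    # Steps 6-7: drop the sorted elements back into the slots, in order
    for marker, d in zip(placeholders, sorted(dlist, key=clip_name)):
        sort_root.replaceChild(d, marker)
    # Step 8: hand back the same DOM object
    return dom

dom = sort_first_array(minidom.parseString(SAMPLE))
names = [clip_name(d) for d in dom.getElementsByTagName("dict")]
print(names)
```

Sorting the elements first and then writing them back slot by slot avoids juggling previous/next sibling references by hand, at the cost of a temporary comment node per element.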

What made all this trickier is that the Python parser treats newlines '\n' as text nodes. Since these are sprinkled throughout the document, I had to do two things.

  1. I had to respect that whitespace text nodes would be sprinkled throughout.
  2. I had to skip those whenever possible.

This created a few side effects, but it also led to a couple of great list comprehensions:

    slots = [ s for s in range(len(sortRoot.childNodes)) if sortRoot.childNodes[s].nodeType == 1]
    dlist = [ d for d in sortRoot.childNodes if d.nodeType == 1]

The first line is a classic filtered loop. It builds a list one element at a time from the values s takes on. The purpose of this loop is to find the index of every dict node in the array (skipping the dreaded text nodes). As s needs to be an integer, we build the integer sequence with range(len(sortRoot.childNodes)), which builds a list of numbers from 0 up to one less than the number of childNodes our sort root has. Then for each value of s, we check whether that childNode has a nodeType of 1 (it's a DOM Element, NOT a text node). When the loop is finished, slots contains a list of integers which, used as indexes into sortRoot.childNodes, would produce all the dict references in the array.

The dlist line does something very similar: it loops through all the childNodes of sortRoot but only puts the ones that are DOM Elements into the list (thus skipping the text nodes).

That's some pretty efficient stuff: creating a loop, building a list, and testing a condition all in one line. Very nice.
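
To see what the two comprehensions produce, here is a minimal sketch on a made-up document. The XML string and the name sort_root are illustrative only, not taken from a real iMovie file. It also uses the named constant ELEMENT_NODE, which minidom defines as 1, in place of the bare number.

```python
from xml.dom import minidom

# Hypothetical stand-in for the real document: two dicts separated by newlines
dom = minidom.parseString("<array>\n<dict/>\n<dict/>\n</array>")
sort_root = dom.documentElement

# Index of every element child, skipping the whitespace text nodes
slots = [s for s in range(len(sort_root.childNodes))
         if sort_root.childNodes[s].nodeType == sort_root.ELEMENT_NODE]

# The element children themselves
dlist = [d for d in sort_root.childNodes if d.nodeType == d.ELEMENT_NODE]

print(slots)
print(len(dlist))
```

The child nodes here alternate text, dict, text, dict, text, so slots comes out as [1, 3] and dlist holds the two dict elements found at those indexes.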

My Second Really Useful Python Script

I wrote a script called findDups.py to locate all the duplicates in an archive of photographs. It's pretty intense so a new article is here: pictureTools

Object-ives

One feature of SGMLParser that looked kinda neat is that you can define a method in a subclass and the parent class will call it. Huh?

Basically, you can create a master object class that will call functions which are defined by its descendants. You can do this:

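class myObj:
	def __init__(self):
		self.reset()
	def reset(self):
		self.x = 2
	def process(self,method_name):
		try:
			# Look up the method by name on this instance (or its subclass)
			method = getattr(self,method_name)
		except AttributeError:
			# No such attribute: report it and stop before 'method' is used
			self.dummy(method_name)
			return
		if callable(method):
			method()
		else:
			# The attribute exists but is not a method (e.g. plain data)
			self.unknownMethod()
	def unknownMethod(self):
		print('Unknown Method')
	def dummy(self,name):
		print('Dummy Called: %s' % name)


class myDesc(myObj):
	def reset(self):
		# Overrides the parent's reset(), so __init__ sets y instead of x
		self.y = 3
	def foo(self):
		print('Foo got called!')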
 
		
>>> o = myDesc()
>>> o.process('foo')
Foo got called!

Basically, the base class provides a mechanism (using getattr()) to let future classes derive functions (add methods) and have those methods called from the base class.

Because Python treats function names as variables (practically everything is a reference), you can build function names on the fly and use getattr() to look up and call those functions.
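
As a small illustration of building a method name on the fly, here is a sketch in the spirit of how SGMLParser-style handlers are named. The classes TagHandler and MyHandler and the "start_" prefix convention used here are hypothetical, made up for this example rather than copied from any library.

```python
class TagHandler(object):
    def handle(self, tag):
        # Build the method name from the tag, e.g. "title" -> "start_title"
        method = getattr(self, "start_" + tag, None)
        if method is None:
            return "no handler for %s" % tag
        return method()

class MyHandler(TagHandler):
    # The base class never mentions this method by name, yet it gets called
    def start_title(self):
        return "title started"

h = MyHandler()
print(h.handle("title"))
print(h.handle("body"))
```

Passing None as the third argument to getattr() gives a default instead of raising AttributeError, which keeps the lookup and the fallback in one place.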

Pythonics

One of the most common bugs in programming is the “off-by-one” bug. Either you didn't count enough of x, or you counted one too many of x. Python seems to take a different approach. Its core idea seems to be built around two types of sequences: lists, which can be changed, and tuples, which are lists that cannot be changed.

lists = mutable (changeable)
tuples = immutable (unchangeable)
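
A quick sketch of the difference, using made-up clip names: the list accepts changes in place, while the same assignment on a tuple raises TypeError.

```python
# A list can grow and have items replaced in place
clips = ["intro", "beach"]
clips.append("credits")
clips[0] = "opening"

# A tuple refuses the same kind of assignment
frozen = ("intro", "beach")
try:
    frozen[0] = "opening"
    changed = True
except TypeError:
    changed = False

print(clips)
print(changed)
```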

They have a simple but powerful construct that iterates over lists. You should almost never manage the index yourself (think of it as manual search); if you use filters, indexes, finds, and ins, you can do almost anything to a list of objects without having to worry about the dreaded off-by-one problem.

In fact, to make this work, they even have a function called range() which converts a number into a list (so that it can be consistently iterated over).

Example:

Instead of

   for ($i = 0; $i <= $#v; $i++) {
      $sum += $v[$i];
   }

Python would do something like:

   for val in v: sum += val

That may be too trivial an example, but a better one from a script I recently wrote is kind of interesting.

   myFiles = os.listdir(os.getcwd())
   needsFixn = [f for f in myFiles if '%' in f ]

The first line is pretty straightforward: we get a list of files from the os module via its listdir function. os.getcwd() makes sure we are looking in the current directory.

The second line contains the magic. It creates a list… from a loop over myFiles (f for f in myFiles), but it also filters that list as it's created with the test if '%' in f. Basically, if the file name string f contains a percent sign, then we need to fix the file name. The new list needsFixn contains only file names that need to be fixed.

Notice there is no loop counter, no incrementing of an index, no end test. We just walk through each element in the list and test it against a condition, adding it to the new list if the condition is true, or tossing it out (skipping) if it fails. That's some pretty crazy stuff, but from what I can tell very Pythonic.
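
For comparison, here is the same filter written both ways on a made-up list of file names (the names are invented so the example does not depend on what is in the current directory). The explicit loop and the one-line version build the same list.

```python
# Hypothetical file names standing in for os.listdir(os.getcwd())
my_files = ["report%20final.xls", "notes.txt", "50%off.doc", "todo"]

# The long way: an explicit loop with an append
needs_fixing_loop = []
for f in my_files:
    if "%" in f:
        needs_fixing_loop.append(f)

# The one-line way: loop, test, and list building together
needs_fixing = [f for f in my_files if "%" in f]

print(needs_fixing == needs_fixing_loop)
```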

Fix Web Outlook File

When I download spreadsheets and such from Microsoft Web Outlook, the spaces always seem to be changed to %20. Very annoying. (Love Microsoft's cross-platform compatibility.) So here's a quick way in Python to fix those. It's cumbersome, and eventually I'd like to write a simple macro to fix it, but here it is.

  • Click the file and hit the right ENTER key, this selects and edits the file name.
  • Copy the file name (Command-C)
  • Launch IDLE or the Python Shell.
  • Type:
    >>>'<command-v>'.replace('%20',' ')
  • and you get a string with all the %20's replaced with spaces.
  • Select the fixed string and paste it back into the icon name.
  • For example:
  >>> 'This%20File%20is%20messed%20up'.replace('%20',' ')
 'This File is messed up'

This little script takes advantage of the fact that any string is an object.
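
The same replace() call could eventually be wrapped in a small script that renames the files directly. What follows is only a sketch of that idea, not a finished macro: the function name fix_names, the choice to skip a file when the fixed name already exists, and the scratch directory used to try it out are all assumptions.

```python
import os
import tempfile

def fix_names(directory):
    """Rename every file in directory whose name contains '%20'."""
    renamed = []
    for name in sorted(os.listdir(directory)):
        if "%20" not in name:
            continue
        fixed = name.replace("%20", " ")
        target = os.path.join(directory, fixed)
        if os.path.exists(target):
            # Never overwrite an existing file; leave this one alone
            continue
        os.rename(os.path.join(directory, name), target)
        renamed.append((name, fixed))
    return renamed

# Try it on a scratch directory rather than on real downloads
scratch = tempfile.mkdtemp()
open(os.path.join(scratch, "This%20File%20is%20messed%20up.xls"), "w").close()
open(os.path.join(scratch, "fine.txt"), "w").close()
result = fix_names(scratch)
print(result)
```

To use it on real files, pass the download folder instead of the scratch directory, for example fix_names(os.getcwd()).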

Cool

python.txt · Last modified: 2010/02/11 06:01 by newacct