Showing posts from February, 2005

Machine vision of GUIs

I just completed a brief foray into machine vision with a project focused on seeing and, to some degree, "understanding" windowed graphical user interfaces (GUIs) like Microsoft Windows. I wrote a test program and an essay on the subject, so I'd rather suggest you visit the project's home page than simply repeat its contents here. But I'll summarize briefly. The base premise of my explorations is that most GUIs are composed of rectangular blocks within blocks. The core of the concept I was experimenting with is a pair of algorithms I called "expansion" and "contraction". "Expansion" here means starting with a test rectangle that begins inside a block and, like a balloon, expands outward until it finds the outer bounds of the current block. Similarly, "contraction" means starting with a rectangle that sits just inside a rectangular block and gradually shrinks inward until it wraps snugly around the