In order to have a computer navigate a labyrinth level using my robot I first had to figure out how to get the computer to determine where the walls are, and where the ball and goal is, and then somehow generate a path for the ball to follow. The first step was to get an image into the computer. To do that I used OpenCV to get the video data from the webcam.

Yeah I know, the camera position needs refinement
It turns out that image processing algorithms really like to work with black and white images. So the next step was to convert the image to grayscale.

Conver the image to grayscale before processing

The next thing that I did with was to extract the positions of the ball and the goal. It turns out that there is a technique called a Hough Transform that can extract certain desired shapes (such as a circle) from images, and OpenCV has the ability to use it built in, saving me from having to write a single line of code – well actually thats a lie, I wrote one line, the line used to find the circles (amongst many others). It turns out that the Hough transform was returning a lot of false positives so I blurred the image to get rid of any speckles or small edges using a gaussian blur (built into OpenCV) and then did the Hough Transformation to find round objects to extract the ball and goal locations (as they are both round).

The Image After Blurring
You can see the extracted circle locations drawn on the image below. Drawing the circles on the image is another great feature of OpenCV!

The extracted circle locations drawn on the image
The next step is to extract the positions of the walls – I found the most reliable method was to use an edge finding algorithm. I used the Canny Edge Detector algoritm built into OpenCV, which produced the output seen below (I filled in the circles above the goal and the ball to prevent detecting them as edges). Note that the blurring used in finding the ball and goal also helped false edges from being found.

The edges extracted using a canny filter
Once I had detected the edges, I used a flood fill near the goal to fill what would be any place I could potentially roll on with white, with the assumption being that there would be a path between the ball and the goal. Any place filled with white is a place the ball could potentially go.

Fill the space between the walls with white. This is the locations the ball can roll
At this point I have reduced the image to a monochromatic representation of walls and potential places to roll, as well as retrieved the locations of the ball and the goal. The next step was to reduce it into something that the computer can more easily digest for the path finding routine (where I figure out the best way to get from the ball to the goal). What I did was to create a two dimensional array 1/16 the size of the image filled with integers. The image from the webcam is 640×480 pixels meaning the array I created was 40×30. I then went across the image horizontally and vertically checking the color of the pixel every 16 pixels. If it was white I filled in the corresponding cell of the array with a 1, otherwise it was a zero. I also filled in the locations of the ball and the goal with a 2. This is confusing. A visualization of the array can be seen below. You can see that it has captured fairly well the locations of the walls and the balls.

The image reduced to 1s and 0s (and 2s).
Now that I had an array of simple integers that represented the geometry of the course I ran an implementation of the A* Search Algoritm which unfortunately did not come with OpenCv so I had to write it. A visulization of the found path can be seen in the image below. The squares filled in with a “7″ correspond to the path that the ball will take through the course. I have highlighted the path in green and the walls in red to help the eye strain. The matrix is good for the computer, bad for humans. You can see that the path is wavy and not actually the most efficient path. This is a result of me leaving out a portion of the A* search algorithm because I was getting OK results and was eager to get to the next step. I will go back and fix it soon.

The path found visualized
Since the matrix was generated by sampling the image every 16 pixels, taking the location of each point of the path found in the matrix and multiplying it by 16 gives the desired location of the ball at each point in screen coordinates (e.g. if I want the next location for the ball to move to is cell 4,8 in the matrix above, that means I want the ball to roll to position 64,128 in the image returned from the camera.
So thats how I go from an image from the webcam to a set of sequential coordinates that I need to move the ball to get through the course. I will post the code eventually – at the moment it is not fit for human consumption.
I’ve gotten my program talking to the arduino over serial (a nightmare to be described at a later time), and am working on the control system to get the ball to go where I want. I’ll post more info as things progress.