Fri Jan 1 00:24:00 PDT 2021

pongbot, a robot that plays pong against itself

(Programming note: this is a blog post I wrote for my previous employer, now defunct, preserved here. Some of the links won't work, since they point at a domain that no longer exists)

For VEX Robotics World Championship 2015, like last year, Robot Mesh needed some kind of booth demo.

The requirements are open-ended, but challenging. It's got to be made of (mostly) VEX parts, be novel, and be interesting.

We wanted to do something with pneumatics, and something with the Pixy camera, since computer vision is both interesting and interestingly difficult. How about... a robot that plays pong against itself?

Let's go over the hardware first.

We used 2906s for the frame because they're cheap and rigid. A robot brain with one fried H-bridge pulled out of the junk pile was used for control, and a joystick with a broken partner port was used for joysticking. Even with heavy use of scrounged parts, final project cost came out to more than five hundred dollars.

The first concern, once we settled on the basic idea, was that paddle traverse speed might not be fast enough. We weren't limited by competition rules, so the motors could have been overvolted, with the corresponding dramatic reduction in motor lifespan, but we wanted to see how far we could get with stock components, so the rack gearbox was built with a 36-tooth pinion gear, and the 393 motor was switched to the high speed configuration. Unloaded, this should have moved the paddle at ((1.5 in. diameter * pi) * (160 RPM / 60)) = 12.56 inches per second. (31.91 cm/s)

As the paddle had nonzero weight, it ended up moving slower than that, but in practice it ended up being good enough. If you were to build your own Pongbot, you might try using the turbo gears.

The Pixy is pretty neat. It does all the heavy processing onboard, then just outputs screen-space coordinates. (It can do color blob detection of 7 different hues, for up to 1000 different objects, though in practice you'd probably run out of sensor resolution and channel bandwidth before then.) It communicates with the Cortex by serial over the UART port.

Like all computer vision sensors, it is exquisitely sensitive to lighting conditions. If the scene is partially lit by natural light, then a correctly calibrated sensor in the morning will give erratic and strange results in the afternoon. It's handy to keep PixyMon open so you can check to make sure that the Pixy actually sees what you think it sees.

Two pitfalls when working with the Pixy: the camera is focused by threading the lens in and out of the camera housing. This thread is fairly loose: if there's any mechanical vibration at all, the lens will gradually drift out of focus. (Oddly enough, this makes signature recognition undependable, which you wouldn't expect) Also, the camera can either output live preview over USB, or data over the UART line. If you're in live preview mode, the camera won't output any data, and your robot will mysteriously stop working.

The robot went together quite quickly, and after about, oh, ten seconds of system testing, it became obvious that the pneumatics were going to need to be directly connected to an air compressor, otherwise we would have to pump it up by hand every five minutes.

It's tough to find air compressors that are both cheap and quiet. Eventually we discovered that the magic keyword here is "airbrush compressor". Any airbrush compressor would work, but the one we used was the TCP Global TC-20. It only had a maximum pressure of 57 psi, but that was fine, since we were running the cylinders almost completely unloaded, and I was worried about damaging the seals by dry firing them. (This was mitigated by only opening the solenoid very briefly, see below.)

A problem arose. All our air tool fittings used big chunky high-flow 1/4 inch NPT threads, while the reservoir was drilled and tapped for 1/8 inch NPT. An adapter of that size is kind of a rare beast, with the result that it was unlikely a general hardware store would have them, and we didn't want to wait around another two business days for a little piece of brass to be shipped to us from a warehouse in Ohio.

Where could we possibly find an obscure pneumatics adapter in Seattle?

Why, at Western Fluid Components, just down the street from us, who happened to have a box of them just sitting on the shelf. If you're ever in Kirkland, and need some fluid components, they're your guys.

Late in the project we chopped up a battery extension cable to run the Pongbot off of a bench power supply, as battery lifespan had become a problem. Unfortunately the PSU had been sized by the time-honored "guessing" method, ("How much current could two little motors draw, anyway?") and was only rated to 3 amps. Well, it turns out that a single motor tries to draw 4.8 amps when stalled, which causes the PSU output voltage to droop alarmingly. This doesn't seem to harm anything, since the Cortex was always tethered to a laptop, making it immune to CPU brownouts. (Low voltage can do weird things to computers.)

Rubber bands were strung across the corners, to keep the ball from getting stuck where the paddle couldn't get at it. (A literal corner case.) Another problem discovered during testing that, due to the square edge on the paddle, it could end up entraining a ball directly next to it, so we chopped up a 1x25 steel bar and bent the ends in a vice so that the ball would be nudged away from the paddle during travel.

Let's talk about the software. (Code examples edited for clarity)

paddle1_kick = vex.DigitalOutput(8)

paddle1_kick.on()

This tells the interpreter that port 8 should be defined as a DigitalOutput, and gives it the name paddle1_kick. (To be precise, we instantiate the vex.DigitalOutput class with the object name paddle1_kick.) To turn it on, we do paddle1_kick.on(), (Calling the on method of vex.DigitalOutput.)

def FirePaddle(device):
    device.on()
    sys.sleep(.05)  # Avoid dry firing by only opening the solenoid for a brief time[1]
    device.off()

One problem is that both sides of the Pongbot have a paddle, which means I need to specify the delay in two places. Copy and paste code is a pain to work with, since every time you want to make one global change, you have to go in and edit the code in multiple places. This results in a surprisingly large number of bugs, even in real, professional software. In the programming industry, the injunction against repeating yourself is called Don't Repeat Yourself.

To avoid Repeating Myself, I define the name of a new function using def, name it FirePaddle, and specify that it takes one argument, device. (Don't forget to end the line with a colon! The error it returns isn't very helpful for figuring out what you did wrong.)

There is a slight amount of cleverness here: you can pass in the name of a function as an argument to another function. This means you can pass in paddle1_kick:

FirePaddle(paddle1_kick)

Causing FirePaddle to execute:

paddle1_kick.on()
sys.sleep(.05)
paddle1_kick.off()

Isn't that neat?

Let's take a look at something a little more complicated.

pixy = vex.Serial(vex.SerialParam.PORT_UART1, 38400)
FRAME_SYNC = (0x55, 0xaa, 0x55, 0xaa)

while True:
    if pixy.read_byte() == FRAME_SYNC[0]:
        if pixy.read_byte() == FRAME_SYNC[1]:
            if pixy.read_byte() == FRAME_SYNC[2]:
                if pixy.read_byte() == FRAME_SYNC[3]:
                    rawdata = pixy.read_bytes_list(40)
                    break

(During testing, we found that 19200 bits/s was not fast enough for the Pixy to transmit three blocks per frame, but at 56600 bits/s the robot brain wasn't fast enough to keep up with processing, and the buffer filled up and tended to overwrite frames while they were being read off, causing checksum failures. The happy medium is 38400 bits/s.)

This stack of if statements is designed to search through an incoming stream of data for the start of a new frame. The Pixy sync word is 0xaa55, sent least significant byte first.[2] It transmits one sync word between each block (each object in the scene that it has recognized) and two sync words at the start of each new video frame.

The while loop runs forever, reading one byte at a time off the serial port. If the byte it reads is FRAME_SYNC[0], 0x55, then it reads off another byte and checks to see if it's 0xaa. It proceeds in this fashion until it finds a complete frame sync, at which point it reads off 40 bytes, (Three object blocks) then breaks out of the infinite loop.

for x in (0, 14, 28):
    checksum = rawdata[x+0] | (rawdata[x+1] << 8)
    signature = rawdata[x+2] | (rawdata[x+3] << 8)
    x_pos = rawdata[x+4] | (rawdata[x+5] << 8)
    y_pos = rawdata[x+6] | (rawdata[x+7] << 8)
    width = rawdata[x+8] | (rawdata[x+9] << 8)
    height = rawdata[x+10] | (rawdata[x+11] << 8)

This one's a little arcane.

Each variable inside a block is transmitted as a 16-bit word, but are read off the wire as 8-bit bytes, so we have to reassemble them before packing them into variables. The LSB byte can be handled normally, but the MSB byte is left-shifted 8 bits, then bitwise (one bit at a time) ORed with the LSB. A fabulous ASCII art diagram:

Let's say the checksum was decimal 258, which would be hex 0x0102. 
That's transmitted as

1   0x02 00000010
2   0x01 00000001

Left shift byte 2 by 8 bits, then OR them

1   0x02 00000000 00000010
|
2   0x10 00000001 00000000
=
1 0x0102 00000001 00000010

There!

Now that we have the coordinates of where the ball is right now, we can make a stab at predicting where it will be.

class PredictLocation():
    def __init__(self):
        self.older = 0.0
        self.old = 0.0
        self.new = 0.0
    def __call__(self, location):
        self.older = self.old
        self.old = self.new
        self.new = (((self.old - self.older) + self.old) + location + location) / 3
        self.predicted = ((self.new - self.old) + self.new)
        return self.predicted

predicted_loc_ball = PredictLocation()

predicted_loc_ball(y_pos)

Here we define a class, instantiate it as predicted_loc_ball, then call it, passing y_pos, which we got from the camera earlier.

def __init__(self):
    self.older = 0.0
    self.old = 0.0
    self.new = 0.0

This defines a special class method, like the on method of the vex.DigitalOutput class earlier. We give it the special method name __init__, which is executed when the class is instantiated. (A class constructor.) We pass another magic word, self, which is replaced with the variable name you chose when you instantiated the class, here predicted_loc_ball.[3]

We initialize the variables with 0.0 to force them to use the floating point data type. Python will return an integer as the result of dividing two integers, which on the one hand makes logical sense, but on the other hand means that 3 / 2 will return 1, while 3.0 / 2.0 will return 1.5.

__call__ is another special method name that is called by default without having to specify a method. As for what it actually does:

self.older = self.old
self.old = self.new
self.new = (((self.old - self.older) + self.old) + location + location) / 3
self.predicted = ((self.new - self.old) + self.new)
return self.predicted

All camera sensors have a fair amount of jitter, so some filtering is required of the raw data. The standard tool used for this in computer vision is the Kalman filter, (It's like a PID algorithim, only inside out) but all the code found from a lazy Google search required the NumPy library, as Python doesn't have a native array type. Under time pressure, I just threw up my hands and averaged the latest reading with the previous reading. This worked well enough in practice, but introduced half a frame of latency, 10ms, which wasn't super ideal. (The Pixy outtputs 50 frames per second, 50hz. 1000ms / 50hz = 20 ms.)

This code makes an effort at predicting the future location of the ball. When called, it takes the difference between the last two readings and uses it to guess at the current location, then does a weighted average of the guess with the sensor reading. It then takes that filtered result and guesses at the future location of the ball.

The paddle control code is greatly simplified by cheating. First, we define a generic Paddle class, since both sides use the same control code, and we don't want to repeat ourselves:

class Paddle():
    def __init__(self, limit_right, limit_left, motor, kick):
        self.limit_right = limit_right
        self.limit_left = limit_left
        self.motor = motor
        self.kick = kick
        self.filtered_loc_paddle = OutputFilter()

paddle1 = Paddle(limit_right_1, limit_left_1, paddle_1, paddle_kick_1)
paddle2 = Paddle(limit_right_2, limit_left_2, paddle_2, paddle_kick_2)

We pass in the objects of the motors, limit switches, and pneumatic solenoids, (We define the names and port numbers in the interface monitor, which makes troubleshooting a little easier) then instantiate an OutputFilter class. (Like PredictLocation, but it doesn't forward predict. If we forward predicted both the ball and the paddle, then the system would oscillate pretty badly.)

def fire(self):
    self.kick.on()
    sys.sleep(.05)
    self.kick.off()
def __call__(self, predicted_loc_ball, loc_ball_x loc_paddle):
    a = self.filtered_loc_paddle(loc_paddle[1])
    if predicted_loc_ball < (a - 5) and not self.limit_right.is_on():
        self.motor.run(((predicted_loc_ball - a) * 2) - 20)
    elif predicted_loc_ball > (a + 5) and not self.limit_left.is_on():
        self.motor.run(((predicted_loc_ball - a) * 2) + 20)
    else:
        self.motor.off()
    if (abs(loc_paddle[0] - loc_ball_x)) < 10 and (abs(a - predicted_loc_ball)) < 10:
        self.fire()

Revision 1 only used the Pixy for tracking the ball, and IMEs[4] for tracking the paddles, which had the advantage of much greater resolution and stability, and the disadvantage of having to convert between screen-space coordinates and encoder ticks, as well as a fiddly and tedious calibration procedure every time the camera shifted in the clamp mount. (Though, now that I think of it, calibration could have been an automated step on startup, if I had left the colored tags on the motors...)

Eventually I ran into a showstopping problem with I2C resets clearing the IME tick counter, and just punted and went with Pixy tracking of all three objects. This has the advantage of making the control code incredibly concise: it just tries to move the paddle until it lines up with the ball.

There is a deadband of 10 pixels around the ball, to prevent excessive hunting. (The > (filtered_loc_paddle1 + 5) < (filtered_loc_paddle1 - 5) stuff)

The paddle motor is directly throttled with the offset from the ball, with a minimum power of 20, and a multiplier of 2 to get it to ramp up to 100 more quickly. Values higher than 100 are just clamped to 100 by the Motor.run API. The solenoid fires when the ball enters a 20x20 pixel box around the paddle.

The __call__ method takes the x and y coordinates of both the ball and the paddle. The code here isn't as clear or concise as it could be, since to keep loop execution time down I'm only doing forward prediction on the ball's y position, and output filtering for the paddle y position.

And that's pretty much it! The code is up on Robot Mesh Python, so you can run it yourself. Check it out!


1: ^ It is tremendously tempting to rephrase what you're doing in a code comment, especially when the API requires you do something a little awkwardly. On my first pass, I wrote this comment as:

sys.sleep(.05)  # Avoid dry firing by only opening the solenoid for 50ms

Ah, but Don't Repeat Yourself! I had to change the value passed to sys.sleep() several times during testing, which meant I had to keep changing the comment in lockstep. Eventually I realized my error, and made the comment more generic. Never put a number from your program logic in a comment!

2: ^ This is why we define FRAME_SYNC as (0x55, 0xaa, ...) not (0xaa, 0x55, ...). Keeping track of endianess is one of the joys of programming against the bare metal.

3: ^ You can examine the internal variables of a class using the same syntax as a method call, of course, since a method is just a variable, like any other. print predicted_loc_ball.old will return 0.0. It is considered bad form to do so outside of debugging, however, and if your code depends on fiddling with a class's internal variables, instead of accessing them from its return value, then you are almost certainly committing a design error.

4: ^ VEX product 276-1321 is actually labeled and titled "Integrated Encoder Module", but the wiki and everyone I've ever talked to have called them IMEs, Integrated Motor Encoders. Even the Inventor's Guide pdf is confused, calling it an encoder module and a motor encoder at different points.


Posted by Samuel Bierwagen | Permanent link | File under: nerdery