Copyright 2000 Stephen Satchell -- all rights reserved. Anyone is free to link directly to this essay from any Web site. Reproduction, but not for profit, is hereby granted. Any commercial use of the information of this paper requires that appropriate publishing rights be secured from the author.
A number of people have performed analyses of the :CueCat barcode reader (affectionately known around here as "the Red-Nosed Pussy") and have even taken them apart and posted pictures on the Web, but as of September 11, 2000 I hadn't seen anyone analyze how the barcoded cues are put together. This essay is an attempt to do just that.
NOTE: On a separate page I detail what I've found regarding the decoding capabilities of the CueCat.
Why did I do this page? I had an article to write, for Planet IT. I decided to publish these notes about the structure of a cue while I was finishing up the article. For those of you who are interested, I've provided a more complete rationale, including the silence on the part of Digital Convergence Inc.
The response to this research from the community has been great. We now have in the public domain a Code-128 derivative that can be used to generate basic boring barcodes that are read as cues. Indeed, some of the more interesting details have been exposed by building test documents with hundreds of barcodes that explore the envelope of the capability and limits of the CueCat. Click here to see more information.
So let's look at how a CueCat cue is put together.
What is a :Cue:Cat cue? The cue is a printed element that can be scanned by the :Cue:Cat in conjunction with the :CRQ software (developed and distributed by Digital Convergence, Inc. of Dallas TX), and interpreted by the Digital Convergence site. The interpreted data is returned from the DC site as a URL that is presented to your browser. For example, an advertisement in Forbes magazine may include a cue. When that cue is swiped by the :Cue:Cat wand, the browser is told to go to the Web site of the advertiser, perhaps even to a particular page.
Here is a picture of one of the sample cues that came on the "Practice Makes Purr-fect" card that came with the Forbes CueCat.
The cue consists of a trademarked graphic element (the ":C"), followed by a stylized representation of data in a machine-readable form. The smallest cue element is 1.78 inches wide by 0.30 inches high; the largest cue here is 3.40 inches wide by 0.55 inches high. The actual barcode area ranges in width from one inch to just under two inches.
The red color of the trademark graphic element is selected so that when the LEDs in the wand illuminate the trademark, the design element appears as though it were white to the barcode reader. This approach guarantees a large clear zone in the swiping area, allowing for the fact that users will tend to start their barcode reading on top of the trademark. The clear area is important to the ease of decoding the barcode, because the decoding software in the wand has to do less work if the code has few or no extra transitions between light and dark for the wand to ignore.
Some of the cues have hairline outlines of black around the red areas. In the cues that show this particular addition, the hairline is thinner than the thinnest bar in the cue. To the wand, this would look like a very small "bar" of black in a wide expanse of white--which should be filtered easily by the software in the wand as noise.
The numbers across the bottom show the data contained in this barcode. (The :CueCat reports the data in the barcode in a different manner when you swipe the code; we talk about that later.) In order to understand how the information C 01 00 00 03 00 02 45 is encoded in our example cue, let's look at just the barcode information itself. First, we need to bring the bars upright:
The original barcode has its bars skewed about 22 degrees to the left, so we reverse-skewed it to obtain an upright code. Another sample showed a 23-degree tilt. The actual design angle is 22.5 degrees, calculated by taking half of 45 degrees. The intent is that the characteristics of pixel fill will be uniform across the barcode if printed properly. With the bars upright, we can now examine the detailed characteristics of the barcode coding scheme.
A bar code consists of arrangements of bars and spaces. Some codes (such as the Plessey code) look for the presence or absence of color. Most barcode schemes use bars of varying width and spaces of varying width to encode data; this is true for the 3-of-9 code, UPC (versions A and E), EAN (13 and 8), UCC/EAN 128, and 2-of-5 encoding. The following table shows the possible ratios of minimum and maximum bar width and space width:
[Table: ratios of minimum and maximum bar width and space width for each barcode scheme]
Examination of our sample above shows that the sample contains one black bar that is four times the size of each of the thinnest black bars. Indeed, if you look at the two bars immediately above the 45 in the first picture of the Cue at the top of this paper, you see two black bars that show exactly this 4:1 ratio in width. This pronounced difference in widths means that the only viable candidate for a standard coding schema is Code 128. This gives us a start.
The standard for Code 128 says that every character position is composed of 11 cells, with each group of 11 cells containing three black bars and three spaces. The sole exception is the "stop character" which consists of four bars in 13 cells; indeed, the stop-character pattern is fixed at BBwwwBBBwBwBB (described in the standards as 2 3 3 1 1 1 2, for the cell widths of each bar and space). The diagram below shows the entire bar code with timing marks and a decoding of the bar pattern according to the Code 128 standard:
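The width notation and the cell pattern are two spellings of the same thing, and converting between them is mechanical. The following is a minimal Python sketch, assuming the widths alternate bar, space, bar, and so on, starting with a bar; the function name `widths_to_cells` is invented for this illustration.

```python
def widths_to_cells(widths, bar="B", space="w"):
    """Expand alternating bar/space widths (in cells) into a cell string.

    Assumes the first width describes a bar and that bars and spaces
    alternate strictly after that.
    """
    cells = []
    for index, width in enumerate(widths):
        symbol = bar if index % 2 == 0 else space
        cells.append(symbol * width)
    return "".join(cells)


# Stop character widths as quoted from the standard in the text.
STOP_WIDTHS = [2, 3, 3, 1, 1, 1, 2]
stop_pattern = widths_to_cells(STOP_WIDTHS)
print(stop_pattern, len(stop_pattern))  # BBwwwBBBwBwBB 13
```

The same helper should work for any ordinary character if you feed it the six widths (three bars, three spaces) of that character, which always add up to 11 cells.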
(The top row of numbers describes the width of spaces in cells. The row of numbers in reverse video describes the width of black bars in cells. The bottom shows the limits and value of each character. The values displayed consist of the Code-C interpretation, a slash, and the Code-B interpretation.)
Excluding the broad white spaces at each end of the code, a standard cue has 101 cells packed in a one-inch wide code field. Reading from left to right, from the dark-to-light transition of the triangular device to the edge of the first bar we encounter 10 cells of white. The first character of each cue, for every cue we've examined, is an exclamation mark (Code B) or the digit pair "01" (Code C). (The usual Code-128 start code is not present in the cue, but see below for surprising information about this so-called start code.) The next six characters define the cue itself; in this example it is "000003 000245" based on Code C interpretation, which the :CueCat reports as the Code-B equivalent "space space hash-mark space double-quote cap-M". The "84" code is the check digit. The Stop code is straight out of the specification.
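The 101-cell figure can be cross-checked from the character widths given earlier: 11 cells for each ordinary character and 13 for the stop character. The short Python sketch below does that arithmetic; the constant and function names are made up for this example, and it assumes a cue is laid out as one lead character, the data characters, one check digit, and the stop character.

```python
CELLS_PER_CHARACTER = 11  # every ordinary Code 128 character
CELLS_IN_STOP = 13        # the stop character is two cells wider


def cue_cell_count(data_characters=6):
    """Cells in a cue: lead character + data + check digit + stop."""
    ordinary_characters = 1 + data_characters + 1
    return ordinary_characters * CELLS_PER_CHARACTER + CELLS_IN_STOP


print(cue_cell_count())  # 101
```

With 101 cells in a one-inch field, each cell is a little under a hundredth of an inch wide.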
Experiments have shown that the triangular graphic elements that bracket each bar code serve no significant technical function other than to ensure that there is a sufficiently large guard band of white on each side of the barcode. The :CueCat is just as happy without any black to each side of the barcode. This lack of technical function leads me to believe this is another possible graphic element for trademark differentiation.
Calculating the check digit. The value of 84 for the check digit derives from a slightly modified form of the standard Code 128 check digit; it appears that the first character, which is taking the place of the Start code, is being included in the check digit with a weight of one. This means the check digit is most likely being calculated in the following way:
Graphic (Code B) | ! | <sp> | <sp> | # | <sp> | " | M |
Value | 1 | 0 | 0 | 3 | 0 | 2 | 45 |
Multiplier | 1 | 1 | 2 | 3 | 4 | 5 | 6 |
Product | 1 | 0 | 0 | 9 | 0 | 10 | 270 |
The sum of the products is 290, and the sum modulo 103 is 84, which becomes the value of the check digit. (These assumptions have been confirmed; the author extends a deep bow to Azalea Software, Inc. for the tools to make the analysis possible.) Indeed, when the check digit is purposely miscalculated, the :CueCat will report nothing when the malformed cue is swiped.
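The calculation described above can be written out in a few lines. This is a minimal Python sketch of the scheme as the text describes it, not code taken from the CueCat or from the Azalea tools; the function name `cue_check_digit` is invented here, and the weighting (lead character at weight 1, data characters at 1, 2, 3, ...) is the assumption stated in the text.

```python
def cue_check_digit(values):
    """Check value for a cue, given the lead character plus data values.

    Assumption from the text: the lead character (standing in for the
    Start code) has weight 1, and the data characters that follow have
    weights 1, 2, 3, ... The weighted sum is reduced modulo 103.
    """
    lead, *data = values
    total = lead * 1
    for position, value in enumerate(data, start=1):
        total += position * value
    return total % 103


sample = [1, 0, 0, 3, 0, 2, 45]  # the cue "C 01 00 00 03 00 02 45"
print(cue_check_digit(sample))   # 84
```

For the sample cue the weighted sum is 1 + 0 + 0 + 9 + 0 + 10 + 270 = 290, and 290 - 2 * 103 = 84.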
So what does "C 01 00 00 03 00 02 45" mean? First, the "C" indicates that the human interpretation is taken from the Code C table of the Code 128 specifications. The numbers are the Code C representations of the characters as they appear in the Code 128 specifications. The check digit is not included in the human interpretation. The Code B equivalents, which are actually reported by the :Cue:Cat, are shown in the check-digit calculation table.
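To see the relationship between the printed numbers and what the wand reports, the human-readable line can be converted to Code B characters directly. The sketch below is a hedged illustration in Python; the function name is hypothetical, and it relies on the Code 128 convention that Code B value n corresponds to ASCII character 32 + n for the values 00 through 95.

```python
def human_readable_to_code_b(text):
    """Convert a line such as 'C 01 00 00 03 00 02 45' to Code B text.

    Assumes the first field names the code set and the remaining fields
    are two-digit values; value n maps to the ASCII character 32 + n,
    which holds for the values 00 through 95.
    """
    code_set, *fields = text.split()
    if code_set != "C":
        raise ValueError("expected a Code C human-readable line")
    values = [int(field) for field in fields]
    if any(not 0 <= value <= 95 for value in values):
        raise ValueError("value outside the 00-95 range")
    return "".join(chr(32 + value) for value in values)


print(repr(human_readable_to_code_b("C 01 00 00 03 00 02 45")))
```

The result starts with the "!" lead character, followed by the six data characters: space, space, hash mark, space, double quote, capital M.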
Are there any limitations on the character set as used in the device? Unfortunately, yes. Experimentation suggests that the CueCat can report only those codes that have assigned graphic characters in the Code B table of the Code-128 specification. Specifically, the values 00 through 95 are reported by the CueCat; attempts to return 96 through 99 cause a null report from the device. That means that there are exactly 96^6, or 782,757,789,696, possible combinations. If you remove the DEL code as problematic, then you have 95^6, or just 735,091,890,625, combinations of cues available in the cue universe.
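These totals are simply the number of usable values raised to the number of data characters in a cue. A small Python sketch of the arithmetic, assuming six data characters per cue as in the sample; the function name is made up for this example.

```python
def cue_combinations(usable_values, data_characters=6):
    """Number of distinct cues with the given alphabet size and length."""
    return usable_values ** data_characters


print(f"{cue_combinations(96):,}")  # 782,757,789,696 (values 00 through 95)
print(f"{cue_combinations(95):,}")  # 735,091,890,625 (DEL excluded)
```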
How many barcode characters can a Cue have in it? Well, in an experiment to see just how the checksum calculation survives the test of more and more digits, we started with a single 11 code, and worked our way up to nineteen 11 codes. On the 20th 11 code, the wand refused to respond. Now, 19 codes (that's 38 characters) makes for a long barcode. To check to see if this may be an inherent limitation of the reader, I explored the boundary in standard Code 128, and found that the wand reacted in exactly the same way when you threw too many digits at it. Ah, well, you get what you pay for...
Does the start code have to be an 01/! character? No. It turns out that the start code can be other characters. The wand reports a cue with a different start character as a different kind of barcode. For example, if the first character is the 11/+ character, then the barcode type is reported by the wand as a "CC+" type. This situation expands the cue universe by another 90 or so groups, to about 66.1 * 10^12 cues. Later experimentation showed that the only codes that could not participate as a "start mark" are 00/sp and 95/Del. I wonder if the Digital Convergence site can handle this style of cue yet?
Why tilt the bars at a 22.5-degree angle? I don't know. Unless there is something in the wand that no one has located, and the Cross pen is also unusual, there is no technical reason to tilt or not tilt the barcodes in the cue. One purpose that comes to mind, though, is that this would be an interesting way to attempt to trademark the "look and feel" of a CueCat cue. Registering the format of a CueCat cue would be child's play for a good intellectual property lawyer. (I won't go into details, just in case Digital Convergence hasn't retained a good one -- why do their work for them?)
What is the required amount of contrast between the bars and the white background? DigitalConvergence hasn't published specifications, but experiments with the :Cue:Cats show that you need clear printing with a great deal of difference. In the September 25, 2000 issue of Forbes magazine, several of the advertisers had included a background. The ad by Lucent Technology had the cue on a medium-brown background...and it didn't read at all. Another ad had a very light halftone that interfered with proper swiping. In another case, a magnifying glass showed that the typography was poor on the cue, which prevented it from being recognized properly.
What is the correction to the base-64 conversion table? The information has been moved to its own page.
Throughout this paper, there are references to "Code B" and "Code C" in the Code 128 specification. What is the relationship between the two interpretations? In the specification for Code 128, there are three different tables for each of the 106 codes that comprise the basic codeset. In order to be able to represent the entire ASCII character set, including control characters, there are mode-switch codes to indicate which code set should be used to interpret the values of the barcodes. The three code sets are called A, B, and C.
In the case of the CueCat, the values of the codes are returned using the Code B interpretation, but the printed, human-readable information is the same data as shown in the Code C table. The table immediately below shows the correlation between the two codesets; remember, though, that the data the CueCat throws back (once you have run the data through the base-64 algorithm) is the Code B representation.
[Table: correlation between Code C values and Code B characters]
The gray areas are values not returned by the CueCat for cues.
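The correlation between the two code sets follows a simple rule for the values the CueCat returns, so the pairs can be generated rather than looked up. The following Python sketch is an illustration only; the function names and the `<sp>` / `<DEL>` labels are choices made for this example. It uses the standard Code 128 layout, in which Code B values 0 through 94 are the printable ASCII characters starting at space, 95 is DEL, and 96 through 99 are function or shift codes, which the text reports as producing no output from the wand.

```python
def code_b_character(value):
    """Return the Code B meaning of a Code 128 value from 0 to 99.

    Values 0-94 are printable ASCII (32 + value), 95 is DEL, and 96-99
    are function or shift codes, returned here as None because the text
    reports that the CueCat gives no output for them.
    """
    if not 0 <= value <= 99:
        raise ValueError("value must be between 0 and 99")
    if value == 0:
        return "<sp>"
    if value <= 94:
        return chr(32 + value)
    if value == 95:
        return "<DEL>"
    return None


def correlation_table():
    """List (Code C digit pair, Code B meaning) for values 00 through 99."""
    return [(f"{value:02d}", code_b_character(value)) for value in range(100)]


table = correlation_table()
for row in (table[0], table[1], table[45], table[95], table[96]):
    code_c, code_b = row
    print(code_c, code_b if code_b is not None else "(not returned)")
```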
If you would like to generate your own cues (but without the trademark devices like skewed bars and the triangle graphic elements), Azalea Software, Inc. has made available a free toolkit for printing those barcodes in a slightly different format. See this page on the Qtools package for the software and fonts required. There is a version for Linux, and also one for Windows. You will need to use a word processing program and the fonts supplied in order to generate the barcodes. If you are using Linux, make sure you get the 1.1 version; the 1.0 version had a little problem. (Yes, I sent a patch back to Azalea Software.)
:Cue:Cat, :CRQ, and the :C device are trademarks of DigitalConvergence, Inc. (Dallas TX)
The mailing address for the author of this paper is Stephen Satchell, PO Box 6900, Incline Village, NV 89450-6900; the author may be reached via electronic mail here.