The Corona Geek episodes ran over a few weeks, around #150 or so. We skipped a week somewhere and did some half episodes toward the end. Off-hand, the one with the glass bakes in some constants, basically taking advantage of shader source being ordinary Lua strings, and touches on the uber-shader concept. The program covering shader precision might come in handy as well.
1-pixel rects are merely the minimum, just so long as you land on one of the pixels in question. If you consult the reference card, the way to get pixels back onto the C / Lua side is glReadPixels(), which takes the position and dimensions of the rectangle to read, along with the pixel format and type. The readback process is fairly involved; if you want the nitty-gritty details, see the spec. Where this becomes troublesome, I think, is once content scaling enters the picture.
If it's useful, I do have PNG-saving code (the "operators" module is compatible with bit, so you could use that instead), with docs here. This will not have any of PNG's compression, so if you want that too, open it up in some program and re-save it. It's been a while since I've touched that, but I'll try to answer any questions.
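For anyone wondering what "no compression" looks like in practice, here is a minimal Python sketch of the same idea. It is not the Lua module mentioned above, and the names `png_chunk` and `encode_png_rgba` are made up for illustration. It assumes 8-bit RGBA input and leans on zlib at level 0, which wraps the rows in stored (uncompressed) deflate blocks, so the result is a valid PNG but roughly as large as the raw pixels.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def png_chunk(tag, data):
    """Build one PNG chunk: length, 4-byte tag, payload, CRC over tag + payload."""
    crc = zlib.crc32(tag + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + tag + data + struct.pack(">I", crc)

def encode_png_rgba(width, height, pixels):
    """Encode 8-bit RGBA bytes (row-major, top row first) as a PNG without real compression."""
    stride = width * 4
    if len(pixels) != stride * height:
        raise ValueError("pixel buffer does not match width * height * 4")
    # Each scanline is prefixed with filter type 0 ("None").
    rows = b"".join(
        b"\x00" + pixels[y * stride:(y + 1) * stride] for y in range(height)
    )
    # Bit depth 8, color type 6 (RGBA), default compression / filter / interlace methods.
    header = struct.pack(">IIBBBBB", width, height, 8, 6, 0, 0, 0)
    return (
        PNG_SIGNATURE
        + png_chunk(b"IHDR", header)
        + png_chunk(b"IDAT", zlib.compress(rows, 0))  # level 0: stored blocks only
        + png_chunk(b"IEND", b"")
    )
```

Writing the returned bytes to a file opened in binary mode should give something any image viewer can open; re-saving from there picks up proper compression.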
You certainly could put some of that information in the data texture. I guess the major point to make there is that, assuming the code flow looks like Connor's, you would be doing a texture fetch, then immediately needing the results to do a second texture fetch. You also immediately need the results of the second (getting the index), of course, but there's less getting around that one. So the shader can stall waiting on each read (these are dependent texture reads), and it's a double whammy to boot. Something to consider if it ends up being slow.
That said, those values being constants, they're perfectly suited to fetching in the vertex shader (which will be executed for each of your four vertices, versus hundreds of thousands of pixels) and feeding into the fragment shader through a varying. ("varying" is rather general terminology; you'd sample identical data in each vertex, so the per-pixel interpolation is a formality.) I haven't tried this yet, so I don't know if it "just works", but ideally you'd just do
if (gl_MaxVertexTextureImageUnits > 0)
    // In vertex shader, fetch data stuff, pass into varying
    // In fragment shader, compute index from varying
else
    // In vertex shader, nothing?
    // In fragment shader, do two fetches, compute index
clauses and sample / not sample as appropriate in each kernel. Vertex texture fetch (VTF) isn't guaranteed to be present on all hardware, thus the checks, but it's not an exotic feature by any means. gl_MaxVertexTextureImageUnits is a built-in constant, so any reasonable driver should do static branching, much like an #if / #else / #endif combo, given the above type of code. Also, I believe VTF only uses "nearest"-style filtering.
Ufff, floating point...
I gave something of a primer on it during one of the shows, and assembled some links here. (WARNING: rabbit hole!) If you refer to the "qualifiers" section of the reference card you'll see something about (minimum guaranteed) relative precisions, 2^-10 or 2^-16. Basically, the way floating point works, the stretch between each power of 2 and the next is split into that many equal steps (1024 or 65536), and you can exactly represent each such step along the way. So between, say, 32 and 64, at 2^-10 precision, we can represent 32, 32 + 32 * 1 / 1024, 32 + 32 * 2 / 1024, ... This applies equally well to the negative powers of 2, i.e. 2^-14, 2^-13, ... 2^-1, so we have, say, as per the reference, 14 or 62 intervals, of 1024 or 65536 values, respectively. So we can get quite accurate for numbers between -1 and 1. I bring this up since your texture coordinates will have been normalized to a [0, 1] range on the shader side. Coordinates into non-power-of-2 textures could be slightly off, since their texel positions aren't exact binary fractions, but with filtering going on anyhow it might not really amount to anything.
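To put numbers on that, here is a rough Python sketch (CPU-side arithmetic only, not shader code). It assumes the simplified model described above: every interval between consecutive powers of 2 is cut into 2^bits equal steps. The helper names are hypothetical, and real hardware may well give you more precision than the guaranteed minimum.

```python
import math

def values_in_interval(exponent, precision_bits):
    """List the values exactly representable in [2**exponent, 2**(exponent + 1))."""
    base = 2.0 ** exponent
    steps = 2 ** precision_bits
    return [base + base * k / steps for k in range(steps)]

def snap(value, precision_bits):
    """Round a positive value to the nearest representable one at this precision."""
    _, e = math.frexp(value)  # value = m * 2**e with 0.5 <= m < 1
    step = 2.0 ** (e - 1 - precision_bits)  # spacing within [2**(e - 1), 2**e)
    return round(value / step) * step

def worst_texel_center_error(size, precision_bits):
    """Largest rounding error over the texel centers (i + 0.5) / size."""
    centers = [(i + 0.5) / size for i in range(size)]
    return max(abs(snap(c, precision_bits) - c) for c in centers)

# Between 32 and 64 at 2^-10 precision: 32, 32 + 32 / 1024, 32 + 64 / 1024, ...
first_three = values_in_interval(5, 10)[:3]

# Power-of-2 texture: centers land exactly on representable values.
pot_error = worst_texel_center_error(512, 10)

# Non-power-of-2 texture: slightly off, but far less than half a texel.
npot_error = worst_texel_center_error(300, 10)
```

Under this model the 512-wide case comes out with zero error, while the 300-wide case is off by a small fraction of a texel, which is why I'd expect it not to matter much in practice.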
On the 16-bit index, sounds good, although of course you'll send the bytes across as multiples of 1 / 255 (normalized 8-bit channels) and then recover them shader-side.
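The round trip might look something like the Python sketch below. It assumes the usual normalized 8-bit convention, where a stored byte b is sampled as b / 255, and that the index is split into a high and a low byte in two channels; the function names are hypothetical. The decode arithmetic is what you would mirror in the shader.

```python
def encode_index(index):
    """Split a 16-bit index into (high, low) bytes for two texture channels."""
    if not 0 <= index <= 0xFFFF:
        raise ValueError("index must fit in 16 bits")
    return index >> 8, index & 0xFF

def as_sampled(byte):
    """What the sampler hands back for a stored byte: a float in [0, 1]."""
    return byte / 255.0

def decode_index(high_sample, low_sample):
    """Recover the index from two sampled channel values."""
    # Adding 0.5 before truncating guards against the sample coming back
    # a hair under the exact multiple of 1 / 255.
    high = int(high_sample * 255.0 + 0.5)
    low = int(low_sample * 255.0 + 0.5)
    return high * 256 + low

high, low = encode_index(4660)  # 0x1234 -> bytes 0x12, 0x34
recovered = decode_index(as_sampled(high), as_sampled(low))
```

In a fragment shader the same thing would be a floor(x * 255.0 + 0.5) per channel, which should sit comfortably within the precision discussed above.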
Now that I'm looking for it I can't find it, but "entry sampler" was just what one of the guys called the data texture's sampler. And you don't want bilinear filtering because you'd be looking up two or four tiles at once!
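A quick way to see the filtering problem is to emulate both modes on a one-row data texture. This Python sketch is a simplified model with clamp-to-edge behavior, not how any particular driver implements sampling, and the function names are made up.

```python
import math

def sample_nearest(texels, u):
    """Pick the single texel that contains normalized coordinate u."""
    n = len(texels)
    return texels[min(int(u * n), n - 1)]

def sample_linear(texels, u):
    """Blend the two texels on either side of u, weighted by distance."""
    n = len(texels)
    x = u * n - 0.5  # position measured in texel centers
    i0 = math.floor(x)
    t = x - i0
    a = texels[min(max(i0, 0), n - 1)]  # clamp to edge
    b = texels[min(max(i0 + 1, 0), n - 1)]
    return a * (1.0 - t) + b * t

entries = [3, 200, 17, 64]  # pretend these are tile indices

nearest = sample_nearest(entries, 0.3)  # lands in texel 1
blended = sample_linear(entries, 0.3)   # mixes texels 0 and 1
centered = sample_linear(entries, 0.375)  # exactly on texel 1's center
```

Off-center, the linear result is a mix of two entries and matches neither, which is meaningless as an index. It only agrees with nearest when you hit a texel center dead on.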
The only comment I'd make about the composite paint is that, if you do end up trying vertex texture fetch, you probably want the data texture in paint1 just in case only one vertex sampler is available.
The single set of texture coordinates won't be an issue for you, since you're coming up with your tile's coordinates on the fly. That limitation shows up instead when, say, trying to use an image sheet frame and a full texture together in the same effect, or two sheets.
The content coordinates are what you see in the vertex shader. Really, rather than making it screen-sized, the rect could just be of the normal size and in its proper position. The fragment shader will only receive those pixels that survive clipping anyhow.