Speedup = 1 / (S + (1-S)/(0.67n) + Hn)
where S = the fraction of execution time that is sequential, n = the number of logical processors, and H = overhead.
Clay Breshears of Intel was kind enough to double-check the revision. Ignoring overhead, if a program is 99% parallelized on a Hyper-Threading chip with two logical processors, the speedup is:
Speedup = 1 / (0.01 + 0.99 / (0.67 * 2)) = 1 / 0.749 ≈ 1.34
Intel has long suggested that a 30% overall speedup is the most improvement that can realistically be achieved. The 34% improvement shown here is more theoretical than real, because it includes no overhead and assumes code that is 99% parallelized. Intel's projections are therefore consistent with this formula.
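
For readers who want to try other values, here is a minimal Python sketch of the formula above. The function name ht_speedup and its parameter names are illustrative choices, not from the original; the 0.67 scaling factor is taken from the formula. The overhead value in the second call is a made-up number used only to show how the H term pulls the result down.

```python
def ht_speedup(s, n, h=0.0, ht_factor=0.67):
    """Revised Amdahl speedup: 1 / (S + (1 - S) / (0.67 * n) + H * n).

    s         -- fraction of execution time that is sequential (0..1)
    n         -- number of logical processors
    h         -- overhead term (0 means overhead is ignored)
    ht_factor -- the 0.67 scaling applied to logical processors
    """
    if not 0.0 <= s <= 1.0:
        raise ValueError("s must be a fraction between 0 and 1")
    if n < 1:
        raise ValueError("n must be at least 1")
    return 1.0 / (s + (1.0 - s) / (ht_factor * n) + h * n)


# The worked example from the text: 99% parallel, two logical processors,
# overhead ignored.
print(round(ht_speedup(0.01, 2), 2))  # 1.34

# Same program with a hypothetical overhead of 0.01 (not a measured value).
print(ht_speedup(0.01, 2, h=0.01))
```

Any nonzero overhead lowers the result below the no-overhead figure, which is the direction the text describes.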
Thanks to Clay and to Gang Chen.