33-Year-Old Unix Bug Fixed In OpenBSD
Ste sends along the cheery little story of Otto Moerbeek, one of the OpenBSD developers, who recently found and fixed a 33-year-old buffer overflow bug in Yacc. "But if the stack is at maximum size, this will overflow if an entry on the stack is larger than the 16 bytes leeway my malloc allows. In the case of C++ it is 24 bytes, so a SEGV occurred. Funny thing is that I traced this back to Sixth Edition UNIX, released in 1975."
Yeah, it's probably you. (Score:3, Informative)
I bet you they're not talking about the system stack pointer. Remember, yacc is a parser generator; parsing algorithms always use some sort of stack data structure. So, the "stack pointer" in question is just a plain old pointer, pointing into a stack that yacc's generated code uses.
Re:Great! (Score:5, Informative)
While I'm sure you're trolling, I feel I should point out that, 1) I agree with you, and 2) this has apparently been fixed, on Linux:
http://agnimidhun.blogspot.com/2007/08/vi-editor-causes-brain-damage-ha-ha-ha.html [blogspot.com]
Re:Other Unixes (Score:5, Informative)
Yes. But OpenBSD fixed it, so they get credit for the fix. It's up to the maintainers of the other unix(ish) versions to implement the fix.
Re:Yeah, it's probably you. (Score:1, Informative)
Exactly. The code:
    yym = yylen[yyn];
    yyval = yyvsp[1-yym];
This is one of the reasons that I hate C code (but I love it most of the time). If your stack were an object (preferably an STL vector), bugs like this wouldn't arise in a way that they could be exploited (your program would instead terminate with an uncaught exception that would point you exactly where your bug was).
Re:Yeah, it's probably you. (Score:3, Informative)
Actually, the [] operator of an STL vector doesn't throw any exceptions, and will happily allow you to reference an index which is out of bounds.
That's not a bad thing, because it's more efficient when you already know that your index is in range. But if you don't know that, you're better off using the at() function.
Re:Was it really a bug back then? (Score:3, Informative)
Now, looking at it just as a bug, if the yacc script overflowed the buffer, yacc can either stop cleanly or crash untidily. It has the same effect - nothing much happens - unless, for some weird reason, the kernel holds onto the memory. That would be a kernel bug, though; the yacc bug would merely be a catalyst for exposing it.
Re:Great! (Score:3, Informative)
Instead of "ls a*"? Seriously? Hopefully, someone will mod you funny.
Unix has extremely low overhead spawning processes. If you prelink and have a little cache this is plenty fast :P
Seriously though, this is a serious annoyance in the way Unix does business. Shell globbing is very convenient for programmers, but not so convenient for users in an awful lot of situations.
Re:You do realize.. (Score:5, Informative)
yacc is not a compiler,
Excuse me?
Yet Another Compiler Compiler most definitely is a compiler.
Re:Great! (Score:5, Informative)
if you want ls -l style output, "find -name 'a*' -exec ls -l {} \;"
Yeah, because nothing endears you to the greybeards like racing through the process table as fast as possible. Use something more sane like:
which only spawns a new process every few thousand entries or so.
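The batching described above, one new process per group of entries rather than one per file, is what piping find into xargs gives you. A minimal sketch, assuming a find and xargs that support -print0 and -0 (GNU and BSD both do); the scratch directory and file names are made up, and -n 2 is only there to make the batches small enough to see:

```shell
#!/bin/sh
# Scratch directory with hypothetical file names.
dir=$(mktemp -d)
touch "$dir/a1" "$dir/a2" "$dir/a3" "$dir/a4" "$dir/a5"

# -print0 / -0 pass NUL-separated names, so spaces and newlines survive.
# -n 2 forces a tiny batch size purely for illustration; without it,
# xargs packs as many names per invocation as the argument limit allows.
# Each invocation of the inner sh prints one line (its argument count).
batches=$(find "$dir" -type f -name 'a*' -print0 \
    | xargs -0 -n 2 sh -c 'echo "$#"' sh \
    | wc -l | tr -d ' ')

echo "5 files handled in $batches invocations"
rm -rf "$dir"
```

In real use you would drop -n and run the actual command, for example find . -type f -name 'a*' -print0 | xargs -0 ls -l.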
Re:Great! (Score:5, Informative)
It's both. The kernel is responsible for setting up the execution environment, and in the past it used a fixed 32 pages for the arguments. 32 pages on an ordinary PC is 128KiB, which is the old limit. The new limit is that any one argument can be up to 32 pages, and all the arguments taken together can be 0x7FFFFFFF bytes, which is ~2GiB.
Here's the diff: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=b6a2fea39318e43fee84fa7b0b90d68bed92d2ba;hp=bdf4c48af20a3b0f01671799ace345e3d49576da [kernel.org]
After that, it was up to libc people to fix the globbing routines. Ulrich Drepper, taking some time off from his full-time job of being an asshole on mailing lists, managed to work this into glibc 2.8:
http://sourceware.org/ml/libc-alpha/2008-04/msg00050.html [sourceware.org]
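The kernel-side numbers quoted in this comment are easy to check with shell arithmetic. This is only a sketch: the 4096-byte page size is an assumption that matches an ordinary x86 PC, and the variable names are made up for illustration.

```shell
#!/bin/sh
page_size=4096                  # assumption: 4 KiB pages
old_limit=$((32 * page_size))   # old fixed budget: 32 pages for all arguments
per_arg=$((32 * page_size))     # new cap on any single argument: 32 pages
new_total=$((0x7FFFFFFF))       # new ceiling on all arguments taken together

echo "old total limit:   $old_limit bytes ($((old_limit / 1024)) KiB)"
echo "new per-arg limit: $per_arg bytes ($((per_arg / 1024)) KiB)"
echo "new total limit:   $new_total bytes"
```

To see what the running system actually enforces, getconf ARG_MAX reports the current limit, which may differ from both figures.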
Re:You do realize.. (Score:4, Informative)
OpenBSD still uses GCC, version 3.3.5 on i386. I can't say which version is used on the other platforms.
You are talking about PCC, which is being worked on by some of the OpenBSD developers, but I think it's a parallel project; see http://pcc.ludd.ltu.se/ for more information.
Jem Matzan talked about this too, see http://www.thejemreport.com/mambo/content/view/369/
Re:Yeah, it's probably you. (Score:3, Informative)
Best of all, even if you use assert() (or similar) for really explicit bounds checking, GCC will omit it from code paths where it's deemed to be unused. So if your accesses are being inlined (and if they're not, take a long hard look at your life) then the already-safe paths won't have the check overhead even in a debug build.
Yes, I've tested it. Yes, it's impressive.
Re:Great! (Score:3, Informative)
On modern systems, find -name 'a*' -exec ls -l {} +
Personally, however, I prefer find -name a\* -exec ls -l {} +
Also, you probably want to add a -type f before the -exec, unless you also want to list directories.
Either that, or make the command ls -ld to not list the contents of directories.
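To see what the trailing + changes compared with \;, you can count how many times find actually runs the command. A small sketch, assuming a find that supports -exec ... {} + (it is in POSIX); the scratch directory and file names are hypothetical:

```shell
#!/bin/sh
# Scratch directory: three files that match a*, one that does not.
dir=$(mktemp -d)
touch "$dir/a1" "$dir/a2" "$dir/a3" "$dir/b1"

# With \; the command runs once per matching file,
# so the inner sh prints one line per file.
per_file=$(find "$dir" -type f -name 'a*' -exec sh -c 'echo run' sh {} \; \
    | wc -l | tr -d ' ')

# With + the matches are appended to as few invocations as will fit,
# so three short names end up in a single run.
batched=$(find "$dir" -type f -name 'a*' -exec sh -c 'echo run' sh {} + \
    | wc -l | tr -d ' ')

echo "with \\; : $per_file invocations"
echo "with +  : $batched invocations"
rm -rf "$dir"
```

The -type f here is the filter suggested above; without it, a matching directory would also be handed to the command.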
Re:Great! (Score:3, Informative)
# IFS= and -r keep leading whitespace and backslashes in file names intact
tar tf archive.tar | while IFS= read -r FILENAME ; do
rm -- "$FILENAME"
done
Re:Yeah, it's probably you. (Score:5, Informative)
From the link you cited:
The code for yacc was certainly not originally written in C - C didn't exist at that time.
The "archaic behaviour" was never part of that standard - it was a mistake in early implementations while they were still "working out the details" of the language, well before K&R, as Ritchie says:
It wasn't an archaism in C - it was an archaism from B that was removed during the development of what became C. Small difference, and for all practical purposes, it gives the same result - previously-working code that wasn't reviewed as the language evolved towards a standard ended up with "implementation-dependent behaviour" - bugs ... The worst part is that the buggy code is syntactically correct, so no compiler warnings. Of course, if your conforming compiler doesn't give a warning, you assume that the code written with the experimental versions is still valid.