






33-Year-Old Unix Bug Fixed In OpenBSD
Ste sends along the cheery little story of Otto Moerbeek, one of the OpenBSD developers, who recently found and fixed a 33-year-old buffer overflow bug in Yacc. "But if the stack is at maximum size, this will overflow if an entry on the stack is larger than the 16 bytes leeway my malloc allows. In the case of C++ it is 24 bytes, so a SEGV occurred. Funny thing is that I traced this back to Sixth Edition UNIX, released in 1975."
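For illustration, here is a minimal sketch (invented names and sizes, not the actual yacc source) of the pattern Otto describes: the parser stack is a plain heap-allocated array with a hard maximum depth. The sketch's push() refuses to store once the stack is full, whereas the buggy pattern stored the entry regardless and relied on whatever slack malloc happened to leave past the end of the array.

#include <cstdio>
#include <cstring>

struct Entry { char pad[24]; };           // 24-byte entries, as with the C++ grammar

const int kMaxDepth = 4;                  // hard cap, standing in for yacc's maximum depth

static Entry *stack_base;                 // heap-allocated parser stack
static Entry *sp;                         // the "stack pointer": just a pointer into that array
static Entry *limit;

// Safe variant: fail the push when the stack is already at its maximum.
// The buggy pattern performed the store anyway, so whether it crashed
// depended on how many spare bytes the allocator left after the array.
static bool push(const Entry &e) {
    if (sp == limit)
        return false;
    *sp++ = e;
    return true;
}

int main() {
    stack_base = new Entry[kMaxDepth];
    sp = stack_base;
    limit = stack_base + kMaxDepth;

    Entry e;
    std::memset(&e, 0, sizeof e);
    for (int i = 0; i <= kMaxDepth; ++i)  // one push too many
        if (!push(e))
            std::fprintf(stderr, "parser stack overflow at push %d\n", i);

    delete[] stack_base;
    return 0;
}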
Time to patch (Score:5, Funny)
Re:Time to patch (Score:5, Funny)
But ./ is already taken over with yak. Seriously.
Re:Time to patch (Score:4, Funny)
Who cares about OpenBSD yacc? BSD is dying and Netcraft confirms it. The world has moved to GNU/Linux and Bison.
Re:Time to patch (Score:4, Funny)
Re:Time to patch (Score:4, Interesting)
Who cares? Like GCC versus TinyCC, being bloated means it can produce a more useful output. GNUware can be faulted for being heavy compared to traditional Unix tools, but the functionality and flexibility provided more than makes up for it.
Except for autotools. What the HELL were they thinking.
Re: (Score:2)
Re-entrant parsers.
Re: (Score:2)
Wouldn't want to let anyone take over your system with yacc. Seriously.
I think if you've installed yacc with setuid bit then you have other problems to worry about. Seriously.
Re: (Score:2)
Re:Time to patch (Score:4, Funny)
Ah, but it would be written as a J2EE application. And the input wouldn't be .y, it'd be an XML document. And the output wouldn't be C, it'd be another XML, passing through a terabyte of XSLT. Then you pass this compiled parser XML, only a gigabyte in size, and your language file to a parser web service and it returns even more XML representing the parse tree.
Ahh, progress.
Re:Time to patch (Score:5, Funny)
Speaking of old bugs, the guy who sits next to me at work hooked a 15yo mainframe bug a few months back. His stock comment whenever someone mentions it is: "Three more years and that one would have been old enough to vote!"
Re: (Score:2)
From back when (Score:5, Funny)
Unix beards were Unix stubble
bad omen (Score:5, Funny)
a 33 year old bug, plus a 25 year old bug (http://it.slashdot.org/article.pl?sid=08/05/11/1339228)....
if we keep going backwards, will the world implode? or will daemons start spewing out of cracks in time and space?
Re:bad omen (Score:5, Funny)
Re: (Score:2)
That isn't necessarily true. It's just as possible people are wasting time fixing unimportant issues and ignoring more important ones.
I'm not trying to disparage the OpenBSD team or anything. It's just that no development team is perfect.
Re:bad omen (Score:5, Funny)
It's just as possible people are wasting time fixing unimportant issues and ignoring more important ones.
We're talking programmers here, not politicians...
Re: (Score:3, Insightful)
Re:bad omen (Score:5, Funny)
Sure. Break malloc even worse to allow for backwards compatibility.
See "Windows 95".
Re: (Score:2)
Is there any other option than to fix yacc?
Divorce is one way to "fix" constant yaccing.
Re: (Score:2)
Then again, the bug came up while failing to compile xulrunner, so it wasn't hunting for stupid 30+ year old code no one uses, but running a compile of something from this side of the millennium that in the end pointed to this bug.
Re: (Score:3, Insightful)
Re:bad omen (Score:5, Funny)
a 33 year old bug, plus a 25 year old bug (http://it.slashdot.org/article.pl?sid=08/05/11/1339228)....
if we keep going backwards, will the world implode?
Well since time began only 38.5 years ago we should find out the answer very soon!
Re: (Score:3, Funny)
In exactly 3.5 years, but I'm afraid the answer will disappoint you.
Re:bad omen (Score:4, Funny)
or will daemons start spewing out of cracks in time and space?
I finally figured out what the UAC were doing on the Mars colony... and it had nothing to do with those artifacts!
Thank god there's a division of Space Marines there...
Re: (Score:2)
They're using Vista on the Mars colony?
Shurely shome mistake
Re: (Score:2)
You are trying to activate the BFG 9000.
Cancel or Allow?
You are trying to frag a cyberdemon.
Cancel or Allow?
Re:bad omen (Score:5, Interesting)
Re: (Score:3, Funny)
Well since bugs before the epoch [wikipedia.org] were actual insects, judging by past precedent they'll get super powers... like wall-climbing ability or maybe spidey senses ??
Re: (Score:2)
This is just a guess with no etymological (or entomological) evidence to back it up, but I've always wondered if "bug" wasn't from "bugbear [wikipedia.org]" -- unseen mischievous forces which screw with how your stuff works.
This would be as in the Merriam-Webster definition [merriam-webster.com], particularly 2b: "a continuing source of irritation". Of course, 2a might sometimes fit: "an object or source of dread". Perhaps even definition 1 after a long shift of maintenance work: "an imaginary goblin or specter used to excite fear".
Re:bad omen (Score:5, Funny)
The next bug will be in Boolean logic. After that, OpenBSD devs will start fixing structural engineering errors in the Tower of Pisa.
Re: (Score:2)
or will daemons start spewing out of cracks in time and space?
Nope, they will just simply spew from our noses [catb.org].
Great! (Score:5, Interesting)
Any word on when they're going to fix the even older "Too many arguments" bug?
Sorry, but any modern system where a command like "ls a*" may or may not work, based exclusively on the number of files in the directory, is broken.
Re:Great! (Score:5, Funny)
Re: (Score:2)
128k should be enough for anyone.
Re: (Score:2)
Re: (Score:2, Interesting)
So, as an example, let's say I want to archive a bunch of files, then remove them from my system, to save space. I packed them up, using:
tar cf archive.tar dir1 dir2 file1 file2 file3
and, because I'm extremely paranoid, I only want to delete files I'm sure are in the archive. How would I do that? Could I use:
rm `tar tf archive.tar`
How about:
tar tf archive.tar | xargs rm
I'm pretty sure neither of those will work in all cases.
Re: (Score:2)
Re:Great! (Score:5, Funny)
Burn the contents of the tar archive onto a CD. Mount the CD over the original directory structure. Use find(1)'s -fstype option to locate all the files that aren't on the CD, copy them to an empty disk image, then eject the CD. Remount the disk image over the original directory, delete all the files in the directory, then unmount the disk image. The files identical in name to those that were on the disk image (which are those that weren't on the CD) won't be deleted thanks to the peculiarities of mount(2).
You're welcome.
Re: (Score:3, Funny)
You forgot "Er.". All Linux advice must contain "Er." at the beginning of the first sentence in order to signify the fact that the poster should have already known how to do this rather than asking this question.
Re:Great! (Score:4, Funny)
So Saturdays at your house must be a real blast, huh?
Re: (Score:2)
You're just jealous you couldn't think of something so simultaneously clever and over-involved.
Re: (Score:3, Informative)
tar tf archive.tar | while read FILENAME ; do
rm "$FILENAME"
done
Re: (Score:2)
Re: (Score:2)
How about:
for f in `tar tf archive.tar`; do
rm $f
done
Re: (Score:2)
Pseudo code, because my geekhood isn't threatened by petty things like the need for debugging and syntax checks:
tar tf archive.tar > /tmp/archive.txt
ls dir1 dir2 file1 file2 file3 > /tmp/tree.txt
diff /tmp/archive.txt /tmp/tree.txt > /tmp/delta.txt
rm /tmp/delta.txt
Re: (Score:2)
It is threatened by the lack of < signs, however. That last line was supposed to read:
rm < /tmp/delta.txt
Re:Great! (Score:5, Informative)
While I'm sure you're trolling, I feel I should point out that, 1) I agree with you, and 2) this has apparently been fixed, on Linux:
http://agnimidhun.blogspot.com/2007/08/vi-editor-causes-brain-damage-ha-ha-ha.html [blogspot.com]
Re:Great! (Score:5, Interesting)
If "ls a*" isn't working, it's because the shell is expanding a* into a command line >100kB in size. That's not the right way to do it.
Try "find -name 'a*'", or if you want ls -l style output, "find -name 'a*' -exec ls -l {} \;"
Re:Great! (Score:5, Informative)
if you want ls -l style output, "find -name 'a*' -exec ls -l {} \;"
Yeah, because nothing endears you to the greybeards like racing through the process table as fast as possible. Use something more sane like piping the find output through xargs, which only spawns a new process every few thousand entries or so.
Re: (Score:3, Informative)
On modern systems, find -name 'a*' -exec ls -l {} +
Personally, however, I prefer find -name a\* -exec ls -l {} +
Also, you probably want to add a -type f before the -exec, unless you also want to list directories.
Either that, or make the command ls -ld to not list the contents of directories.
The Problem is *why* it's the wrong way to do it (Score:4, Interesting)
You're correct that's it's not the right way to do it. The problem is *why* it's not the right way to do it. It's not the right way to do it because the arg mechanism chokes on it due to arbitrary limits, and/or because your favorite shell chokes on it first, forcing you to use workarounds. Choking on arbitrary limits is a bad behaviour, leading to buggy results and occasional security holes. That's separate from the question of whether it's more efficient to feed a list of names to xargs or use ugly syntax with find.
Now, if you were running v7 on a PDP-11, there wasn't really enough memory around to do everything without arbitrary limits, so documenting them and raising error conditions when they get exceeded is excusable, and if you were running on a VAX 11/780 which had per-process memory limits around 6MB for some early operating systems, or small-model Xenix or Venix on a 286, it's similarly excusable to have some well-documented arbitrary limits. But certainly this stuff should have been fixed by around 1990.
In Defense of Limits (Score:3, Interesting)
Soft limits can actually mitigate bugs. If we limit processes by default to 1,024 file descriptors, and one of them hits the limit, that process probably has a bug, and would have brought the system to its knees had it continued to allocate file descriptors. Programs designed to use more descriptors could increase the limit.
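As a sketch of how that looks in practice (standard POSIX getrlimit/setrlimit on RLIMIT_NOFILE, minimal error handling): a program that genuinely needs more descriptors raises its soft limit toward the hard limit instead of living with the default.

#include <sys/resource.h>
#include <cstdio>

int main() {
    struct rlimit rl;
    if (getrlimit(RLIMIT_NOFILE, &rl) != 0) {
        std::perror("getrlimit");
        return 1;
    }
    std::printf("soft limit: %llu, hard limit: %llu\n",
                (unsigned long long)rl.rlim_cur,
                (unsigned long long)rl.rlim_max);

    rl.rlim_cur = rl.rlim_max;            // request everything the hard limit allows
    if (setrlimit(RLIMIT_NOFILE, &rl) != 0) {
        std::perror("setrlimit");         // raising the hard limit itself needs privileges
        return 1;
    }
    std::printf("soft limit raised to %llu\n", (unsigned long long)rl.rlim_cur);
    return 0;
}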
Sane limits (Score:2)
While xargs is a great little workaround/workhorse, it is needed in far too many cases. Why on earth would it be so hard to increase the limits every once in a while? After all, the limit in question was probably perfectly acceptable back in the day when 20mb was a lot of space and 500 files was more files than you could imagine ever creating.
Re: (Score:2)
Re: (Score:3, Informative)
Instead of "ls a*"? Seriously? Hopefully, someone will mod you funny.
Unix has extremely low overhead spawning processes. If you prelink and have a little cache this is plenty fast :P
Seriously though, this is a serious annoyance in the way Unix does business. Shell globbing is very convenient for programmers, but not so convenient for users in an awful lot of situations.
Re: (Score:2)
it was fixed years ago....
find . -name "a*" -prune -exec ls -ld {} \;
(note: this command line was generated by reading the man page for gnu find - may not work on all unix/linux variants)
Re: (Score:2)
or an even shorter solution...
ls -c1 | grep "^a"
and if you wanted upper and lower-case a files,
ls -c1 | grep -i "^a"
Re: (Score:2, Informative)
Re: (Score:2)
uhm - no...
the ls entries listed above work perfectly well on Solaris, AIX, HP-UX and Linux.
ksh is the only shell I use, although I'm sure it would work with bash.
I primarily work on Solaris (SPARC/x86/x64) platforms - I won't go into any kind of flame wars over it, it's just what my company uses primarily.
(modern_system != infinite_memory) (Score:2, Interesting)
It is not broken. The fact that it complains "too many arguments" is evidence that it is not broken, since the program (ls) is doing bounds checks on the input. If it was broken, you wouldn't get the message; there would be a buffer overflow because the programmer didn't do constraints checking.
ERRATA (Score:3, Insightful)
Re: (Score:2)
Re:Great! (Score:5, Informative)
It's both. The kernel is responsible for setting up the execution environment, and in the past it used a fixed 32 pages for the arguments. 32 pages on an ordinary PC is 128KiB, which is the old limit. The new limit is that any one argument can be up to 32 pages, and all the arguments taken together can be 0x7FFFFFFF bytes, which is ~2GiB.
Here's the diff: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=b6a2fea39318e43fee84fa7b0b90d68bed92d2ba;hp=bdf4c48af20a3b0f01671799ace345e3d49576da [kernel.org]
After that, it was up to libc people to fix the globbing routines. Ulrich Drepper, taking some time off from his full-time job of being an asshole on mailing lists, managed to work this into glibc 2.8:
http://sourceware.org/ml/libc-alpha/2008-04/msg00050.html [sourceware.org]
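To see what a given box actually advertises, the limit those kernel pages translate into is exposed through sysconf (or `getconf ARG_MAX` from the shell); a minimal query:

#include <unistd.h>
#include <cstdio>

int main() {
    long arg_max = sysconf(_SC_ARG_MAX);  // bytes available for argv + environment
    if (arg_max == -1)
        std::printf("ARG_MAX: no fixed limit reported\n");
    else
        std::printf("ARG_MAX: %ld bytes\n", arg_max);
    return 0;
}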
Re: (Score:2, Interesting)
Re: (Score:2)
heh... FWIW on Windows people are stuck with only a few kB of command line and no shell wildcard expansion at all, and they don't seem to be crying in their beers (... it's the market leader last time I checked)
The (not-so-)secret is to not do things by passing big lists around using command line arguments. Back in unix land, you can do glob-filtered listings like the one you suggested with the find command. And even the basic commands like ls can take parameters via xargs instead of regular command line
Re: (Score:2)
foreach file ( a* )
echo $file
end
maybe it's just me (Score:1, Interesting)
But this code just seems wrong. What is C code doing referencing the stack pointer directly?
Yeah, it's probably you. (Score:3, Informative)
I bet you they're not talking about the system stack pointer. Remember, yacc is a parser generator; parsing algorithms always use some sort of stack data structure. So, the "stack pointer" in question is just a plain old pointer, pointing into a stack that yacc's generated code uses.
Re: (Score:3, Informative)
Actually, the [] operator of an STL vector doesn't throw any exceptions, and will happily allow you to reference an index which is out of bounds.
That's not a bad thing, because it's more efficient when you already know that your index is in range. But if you don't know that, you're better off using the at() function.
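A tiny self-contained example of the difference: indexing out of bounds with [] compiles and silently invokes undefined behaviour, while at() throws std::out_of_range.

#include <vector>
#include <cstdio>
#include <stdexcept>

int main() {
    std::vector<int> v(3, 0);   // three elements: v[0] .. v[2]

    v[1] = 42;                  // fine: index known to be in range, no check performed

    // v[10] = 7;               // would compile, but is undefined behaviour (no check)

    try {
        v.at(10) = 7;           // bounds-checked access
    } catch (const std::out_of_range &e) {
        std::printf("caught: %s\n", e.what());
    }
    return 0;
}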
Re: (Score:3, Informative)
Best of all, even if you use assert() (or similar) for really explicit bounds checking, GCC will omit it from code paths where it's deemed to be unused. So if your accesses are being inlined (and if they're not, take a long hard look at your life) then the already-safe paths won't have the check overhead even in a debug build.
Yes, I've tested it. Yes, it's impressive.
Re: (Score:2)
There's a bug in the explanation.
From the linky (emphasis mine):
"-=" is NOT the same as "=-".
Example:
returns a
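The example seems to have been lost in posting; presumably it was something along the lines of the classic case below, where very early C (following B) parsed "=-" as a compound assignment operator:

#include <cstdio>

int main() {
    int a = 5;

    a = -1;     // modern reading: assign negative one, so a == -1
    // a =- 1;  // very early C read this as "a = a - 1", so a would become 4;
                // later compilers treat it as a plain assignment of -1

    std::printf("a = %d\n", a);
    return 0;
}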
Re: (Score:2)
Re:Yeah, it's probably you. (Score:5, Informative)
From the link you cited:
The code for yacc was certainly not originally written in c - c didn't exist at that time.
The "archaic behaviour" was never part of that standard - it was a mistake in early implementations while they were still "working out the details" of the language, well before K & R, as Ritchie says:
It wasn't an archaism in c - it was an archaism from b that was removed during the development of what became c. Small difference, and for all practical purposes, it gives the same result - previously-working code that wasn't reviewed as the language evolved towards a standard ended up with "implementation-dependent behaviour" - bugs ... The worst part is that the buggy code is syntactically correct, so no compiler warnings. Of course, if your conforming compiler doesn't give a warning, you assume that the code written with the experimental versions is still valid.
Re: (Score:2)
In the early 70's, a new language evolved from B. In these years, the new language was evolving quickly, but by 1973, it was mature enough to allow the Unix kernel to be rewritten in it. The makers called it C, from the same article:
Who are you to say that it wasn't C?
And for certain, the yacc included in V6 was written in C, it generated C and it would soon be used to write C compilers.
Re: (Score:2)
Because it is C, and C is designed to be able to do so? How do you think the Linux kernel gets implemented, though it also has Assembly to be sure. C was designed to allow implementation of Operating Systems. The capability to reference the Stack Pointer and do other assembly-like things via the asm keyword [gnu.org] is part of its charm ;-)
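For what it's worth, a minimal example of that capability using GCC-style extended asm (x86-64 only; the register name and mnemonic differ on other architectures):

#include <cstdio>

int main() {
    void *sp;
    // Copy the machine stack pointer (%rsp) into a C variable.
    __asm__ volatile ("movq %%rsp, %0" : "=r"(sp));
    std::printf("machine stack pointer is near %p\n", sp);
    return 0;
}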
Was it really a bug back then? (Score:5, Interesting)
Was this a bug when it was originally written, or is it only because of recent developments that it could become exploitable? For instance, the summary mentions stack size. I could imagine that a system written in 1975 would be physically incapable of reaching the process limits we use today, so maybe the program wasn't written to check for them.
Does your software ensure that it doesn't use more than an exabyte of memory? If it doesn't, would you really call it a bug?
Re:Was it really a bug back then? (Score:5, Insightful)
If you overflow a buffer then it's a bug, whether it is exploitable or not.
Re:Was it really a bug back then? (Score:5, Funny)
If you overflow a buffer then it's a bug, whether it is exploitable or not.
If you can overflow an exabyte-sized memory buffer, you deserve a fucking medal.
Re: (Score:2)
while (1) {
    *buffer = 1;
    buffer++;
}
/* Where's my medal? */
Re:Was it really a bug back then? (Score:5, Funny)
You'll get it when the buffer overflows. If you're running it on a system that processes a billion of those loops per second, that should be in a bit over 31 years. Scale accordingly for your processor and memory speed.
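(Taking an exabyte as 10^18 bytes and one byte written per loop iteration: 10^18 bytes / 10^9 bytes per second = 10^9 seconds, which is roughly 31.7 years.)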
Re: (Score:2)
Re: (Score:2)
Unless you're running on pretty rare 64-bit hardware, an int will still only be 32 bits. I can't remember what happens when you overflow 2^31 and try to de-reference a negative pointer (probably it just implicitly casts to unsigned), but you sure aren't going to overflow an exabyte buffer that way.
Re: (Score:2)
Re: (Score:2)
If you overflow a buffer then it's a bug, whether it is exploitable or not.
If you can overflow an exabyte-sized memory buffer, you deserve a fucking medal.
*insert emacs joke here*
Re: (Score:3, Interesting)
If you overflow a buffer then it's a bug, whether it is exploitable or not.
It is today, but my question is whether it was even overflowable (is that a word?) when it was written. For example, if it was written for a 512KB machine and had buffers that could theoretically hold 16MB, then it wasn't really a bug. The OS itself was protecting the process by its inability to manage that much data, and it wouldn't have been considered buggy to not test for provably impossible conditions.
I'm not saying that's what happened, and maybe it really was just a dumb oversight. However,
Re: (Score:2)
Failure to check for a buffer overflow is an error. It doesn't matter if someone else will do it for you and, as such, the error will never result in a problem for someone. It's simply wrong.
Re: (Score:3, Informative)
Now, looking at it just as a bug, if the yacc script overflowed the buffer, yacc can
Other Unixes (Score:2, Interesting)
Re:Other Unixes (Score:5, Informative)
Yes. But OpenBSD fixed it, so they get credit for the fix. It's up to the maintainers of the other unix(ish) versions to implement the fix.
Hilarious! (Score:5, Funny)
Funny thing is that I traced this back to Sixth Edition UNIX, released in 1975
My sides are completely split! Invite this guy to more parties.
Coincidence? (Score:2)
I just finished reading the "The A-Z of Programming Languages" series on Computerworld (found out about it in here [slashdot.org]), and now the next article in the series just came up and it's a chat with the creator of Yacc.
Coincidence?
And for those that want to read the interview, it can be found here [computerworld.com.au].
New tagline for OpenBSD (Score:2)
Only two or three remote holes in the default install not from 33 years ago, in more than 10 years but not less than 33 years!
Re: (Score:2)
Sorry that message was me, didn't seem to want to keep me signed in :|
Re: (Score:2)
Sorry that ^U I am Spartacus.
Re: (Score:2)
Re:You do realize.. (Score:4, Informative)
OpenBSD still uses GCC, version 3.3.5 on i386. I can't say which version is used on the other platforms.
You are talking of PCC, which is being worked on by some of the OpenBSD developers, but I think it's a parallel project, see http://pcc.ludd.ltu.se/
for more information.
Jem Matzen talked of this too, see http://www.thejemreport.com/mambo/content/view/369/
Re:You do realize.. (Score:4, Interesting)
Re:You do realize.. (Score:5, Informative)
yacc is not a compiler,
Excuse me?
Yet Another Compiler Compiler most definitely is a compiler.
Re: (Score:2)
Mod parent -1 Horseshit.
yacc is a compiler, what do you think the two c's stand for?
Re: (Score:2)
Alternate and equivalent volume to 1 ml?