---------- Forwarded message ----------
Date: Wed, 21 Jan 2004 11:46:15 -0500 (EST)
From: Robert G. Brown <rgb@xxxxxxxxxxxx>
To: Jakob Oestergaard <jakob@xxxxxxxxxxxxx>
Cc: prakash borade <hpcatcnc@xxxxxxxxx>, mail-plug@xxxxxxxxxxx, beowulf@xxxxxxxxxxx
Subject: Re: [Beowulf] Which is better GNU C or JAVA (for network programing)

On Wed, 21 Jan 2004, Jakob Oestergaard wrote:

> Well, the bait is out, let's see if someone bites ;)

Having been accused of having early alzheimers and forgetting some silly little symbol, what was it, oh yeah, a "++" in my even handed and totally objective diatribe, I'll have to at least nibble:-)

> It is better to light a flame thrower than curse the darkness.
>  - Terry Pratchett, "Men at Arms"

Alas, my flame thrower is in the shop. The best I can do is make a nifty lamp with a handy wine bottle, some gasoline, and some detergent flakes. There, let me stick this handy rag in the neck like this, now <click source="lighter"> where were we again? My memory is failing me. Oh yes. The pluses.

> > There. Let us bask for a moment in the serenity of our knowledge that
> > we have the complete freedom to choose, and that there are no wrong
> > answers.
>
> I find your lack of faith disturbing... ;)

Trying an old jedi mind trick on me, are you? At least your metaphor is correct. You are indeed attracted to the Power of the Dark Side...:-)

> > Now we can give the correct answer, which is "C".
>
> Typing a little fast there, I think... The correct answer for anything
> larger than 1KLOC is "C++" - of course, you knew that, you were just a
> little fast on the keyboard ;)
>
> (KLOC = Kilo-Lines-Of-Code)

Well, we COULD make a list of all the major programs involving > 1KLOC (exclusive of comments, even) that have been written entirely in C (plus a small admixture, in some cases, of assembler).
Unfortunately, the list would be too long (I'd be here literally all day typing), and would include things like the kernels of pretty much all the major operating systems (certainly all the Unix derivatives and variants) and gcc itself, including its g++ extension.

To be fair, that doesn't prove that your observation is wrong -- there is a rather huge list of fortran sources available in my primary field of endeavor, for example, which doesn't stop fortran from being a cruel joke. However, to be fair the OTHER way most of that fortran was written by scientists (a.k.a. "complete idiots where advanced computer programming is concerned") to do what amounts to simple arithmetic problems, or maybe even complex arithmetic problems, fine, with a trivial data interface (read the data in, write the data out). Quite a bit of the aforementioned C sources and operating systems and advanced tools were written not only by computer professionals, but by "brilliant" computing professionals. Class O stars in a sea of A, B, F and G (where fortran programmers are at best class M, or maybe white dwarfs).

What you probably mean is that IF everybody knew how to program in C++, they would have written all of this in C++, right? Let's see, is there a major operating system out there that favors C++? Hmmmm, I believe there is. Are it and its applications pretty much perpetually broken, to the extent that a lot of its programmers have bolted from C++ and use things like Visual Basic instead? Could be.

This isn't intended to be purely a joke observation. I would rather advance it as evidence that contrary to expectations it is MORE difficult to write and maintain a complex application in C++. The very features of C++ that are intended to make an application "portable" and "extensible" are a deadly trap, because portability and extensibility are largely an illusion.
The more you use them, the more difficult it is to go in and work under the hood, and if you DON'T go in and work under the hood, things you've ported or extended often break.

To go all gooey and metaphorical, programming seems to be a holistic enterprise with lots of layering and feathering of the brush strokes to achieve a particular effect with C providing access to the entire palette (it was somebody on this list, I believe, that referred to C as a "thin veneer of upper-level language syntax on top of naked assembler"). C++, in the same metaphor, is paint by numbers. It ENCOURAGES you to create blocks to be filled in with particular colors, and adds a penalty to those that want to go in and feather.

In some cases, those paint-by-numbers blocks can without doubt be very useful, I'm not arguing that. It is a question of balance. A very good C++ programmer (and I'm certain that you are one:-) very likely has developed a very good sense of this balance, as suggested by your observation that a good C++ programmer writes what amounts to procedural C where it is appropriate (which IMHO is a LARGE block of most code) and reserves C++ extensions for where its structural blocking makes sense. Who could argue with that?

Of course, a very good programmer in ANY language is going to use procedural methodology where appropriate, and create "objects" where they make sense. We must not make the mistake of comparing good programming practice with the language features.

Here are the real questions. Presuming "good programmers, best practice" in all cases:

a) Do C's basic object features (typedefs, structs and unions, for example) suffice to meet the needs for object creation and manipulation in those code segments where they are needed? I would say of course they do.
IMO, protection and inheritance are a nuisance more often than an advantage because, as I observed, real "objects" are almost never portable between programs (exceptions exist for graphical objects -- graphics is one place where OO methodology is very appropriate -- and MAYBE for DB applications where data objects are being manipulated by multiple binaries in a set). "Protection" in particular I think of as being a nuisance and more likely to lead to problems than to solutions. In a large project it often translates into "live with the bugs behind this layer, you fool, mwaahhahahaha". Result: stagnation, ugly hacks, bitterness and pain. In single-developer projects, just who and what are you protecting your structs FROM? Yourself?

b) Do C++'s basic object features increase or decrease the efficiencies of the eventual linked binaries one produces? As you say, C++ is more than an "extension" of C, it has some real differences. In particular, it forces (or at least, "encourages") a certain programming discipline on the programmer, one that pushes them away from the "thin veneer" aspect of pure C. I think it is clear that each additional layer of constraint INCREASES the intrinsic complexity of the compiler itself, thickens the veneer, and narrows the range of choices into particular channels. Those channels in C++ were deliberately engineered to favor somebody's idea of "good programming practice", which amounts to a particular tradeoff between ultimate speed and flexibility and code that can be extended, scaled, maintained. So I would expect C++ to be as efficient as C only in the limit that one programs it like C and to deviate away as one uses the more complex and narrow features. Efficient "enough", sure, why not? CPU clocks and Moore's Law give us cycles to burn in most cases.
And when one uses objects in many cases, they ARE cases where bleeding edge efficiency IS less important than ability to create an API and "reuse" data objects via library calls (for example). Still I think C would have to win here.

c) Do C's type-checking, etc. features suffice to make programming particularly easy or safe? Here I will give C++ a win, as the answer for C at least is no. C is not particularly easy and it most definitely is not safe. It is a THIN veneer on top of assembler, and assembler is as unsafe as it gets (short of writing in naked machine code), so thin that one can easily inline assembler to access and manipulate CPU registers etc and have a pretty good idea of how to smoothly move data around and switch from one programming mode to the other.

Again the art metaphor is appropriate -- with color-by-numbers you are "safer" from creating really, really ugly pictures (although it is always possible, of course:-). With nothing but the canvas, the paints, and a set of brushes and palette knives you can create anything from Da Vinci or Rembrandt to a three-year-old's picture of a dog (big blob of muddy brown in the middle of the canvas that may or may not even have "eyes").

This is why I think that the C++ vs C issue is almost entirely determined by an individual's personal taste and preferences, along with (maybe) the amount and kind of code that they write.

I personally prefer to eat my vegetables (objects) raw -- if I need a struct, I make a struct. If I want to allocate a struct, I use malloc or write a constructor, depending on the complexity of the struct. If I want to de-allocate a struct, I either use free or I write a destructor, again depending on the complexity (whether or not I have to recursively free the contents of the struct, and at how many levels, how many times in the code).
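
To make the "raw vegetables" approach concrete, here is a minimal sketch of a hand-written constructor and destructor in plain C. The Vec struct and the names vec_new and vec_free are made up purely for illustration; they come from no particular library, and this is one way to do it rather than the way.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical example struct: a labelled vector of doubles. */
typedef struct {
    char   *name;   /* heap-allocated copy of the label */
    size_t  n;      /* number of elements in data */
    double *data;   /* heap-allocated array of n doubles */
} Vec;

/* "Constructor": allocate the struct and everything it owns.
 * Returns NULL if any allocation fails, without leaking. */
Vec *vec_new(const char *name, size_t n)
{
    Vec *v;

    assert(name != NULL);
    v = malloc(sizeof *v);
    if (v == NULL)
        return NULL;

    v->n = n;
    v->name = malloc(strlen(name) + 1);
    /* Ask for at least one element so a NULL return always means failure. */
    v->data = calloc(n ? n : 1, sizeof *v->data);
    if (v->name == NULL || v->data == NULL) {
        free(v->name);   /* free(NULL) is a harmless no-op */
        free(v->data);
        free(v);
        return NULL;
    }
    strcpy(v->name, name);
    return v;
}

/* "Destructor": free the contents first, then the struct itself. */
void vec_free(Vec *v)
{
    if (v == NULL)
        return;
    free(v->name);
    free(v->data);
    free(v);
}
```

A caller would pair every vec_new("x", 3) with exactly one vec_free on the returned pointer. Nothing but discipline enforces that pairing, which is rather the point.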
If I want to change the struct (either structurally or by accessing or altering its contents), I change the struct, and am of course responsible for changing all the points in my program that are affected by the change. If I want to "protect" the struct, well, I don't change it, or write to it, or read it, or whatever. My choice, unchanneled by the compiler.

Do I have to deal with sometimes screwing up data typing? Absolutely. Do I have to occasionally deal with working my way backwards through my programs to redo all my structs? Of course (I'm doing it right now with a program I'm writing). This process is INEVITABLE in real-world program design because we DON'T know a priori what the best shape for the data objects is until we've programmed them at least once the wrong way, in most cases.

The really interesting question is whether or not this PROCESS is any easier in C++ than in C. I can't see why it would be, but perhaps it is. I suspect it is still mostly a matter of skill and style (and how one apportions time between active code development and "planning" a program in the first place) and, of course, personal taste.

> > In order to give it, I have to infer an implicit "better than" and a
> > certain degree of generality in your question as in "which language is
> > BETTER suited to write an efficient networking program using linux
> > system calls, in GENERAL".
> >
> > With this qualifier, C wins hands down. A variety of reasons:
> >
> > a) The operating system is written in C (plus a bit of assembler)
>
> This ought not to be a good argument - I guess what makes this a good
> argument is, that the operating system provides very good C APIs.
>
> So, any language that provides easy direct access to those C APIs has a
> chance of being "the one true language".

You miss my point entirely. The argument is that C is a thin veneer on top of assembler -- so thin that one CAN write an operating system in it. Imagine writing an operating system in LISP.
Wait, don't do that. The results are too horrible to imagine. Imagine doing it on top of fortran, instead. That's bad, but you won't have nightmares for more than a week or two afterwards. (IIRC, somebody actually did this once.)

The API issue is moot. Hell, perl and python have direct access to most of the C APIs.

IN THE CONTEXT of the reply, of course, there was also the suggestion that the C APIs are a good way of writing network code since the network drivers and kernel structs those APIs provide access to were all written in C, so your access is pretty much "naked". You can often read or write directly any register or value or memory location that isn't in the protected part of the kernel, if you dare (or need to, to achieve enough efficiency in your particular application). But C++ I'm sure provides the same degree of naked access and C and C++ share a common underlying data organization, really, and even Fortran (with somewhat different data organization) probably does pretty well.

In other languages (especially scripting languages e.g. java, perl, python), the access is typically "wrapped" in a translation layer that is required because one has NO CONTROL over the way the interpreter actually creates data objects. They are the "ultimate" in OO programming -- instantly created by the interpreter in real time, manipulable according to a wide set of rules, they go away when they aren't being used without leaking memory (usually:-) but heaven help you if you look beneath the hood and try to manipulate the raw memory addresses.

This is a lovely tradeoff in a lot of cases, which is why I hesitate to actually stick a rod up java's ass and barbeque it on a spit. I really do believe the BASIC argument associated with OO programming, that the data and what you want to do with it should dictate program design features, including choice of language and programming technique.
For many problems perl or python provide a very good data interface where one DOESN'T have to mess with lots of data issues that are a major chore in C. Wanna make a hash? Use it. Wanna make your hash into an array? Change the little % to a @ and hack your dereferencing a bit. Is your hash actually stored as a linked list, as an array of structs? Don't ask, don't tell, don't try to look. All this and direct syntactical manipulation of regular expressions, what's not to love?

This is what I was trying to keep an open mind to WRT java. Perhaps there are aspects of its data manipulation methodologies and programming features that are excellent fits to particular problems. I just don't know, and would rather have my teeth drilled with a half-charged portable Black and Decker screwdriver (allen bit) than learn YAPL just to find out. Hell, I'm only starting to learn python under extreme duress as I swore perl was going to be the last language I ever learned and that was before PHP and now python.

Somebody would have to pay me a LOT OF MONEY to get me to learn java. Yessir, a whole lot. [Anybody reading this who happens to have a lot of money is welcome to contact me to arrange for a transfer...;-)]

> I prefer to think of "C++" as "A better C", rather than a "C extension",
> as not all C is valid C++, and therefore C++ is not really an extension.

Funny that. I tend to think of C++ as "A broken C extension" for exactly the same reason;-)

If they hadn't broken it, then there really would be no reason not to use it, as if C were a strict subset of C++ all the way down to the compiler level, so that pure C code was compiled as efficiently as pure C is anyway then sure, I'd replace all the gcc's in my makefiles with g++'s. That would certainly lower the barrier to using C++; C programmers could program in C to their heart's content and IF AND WHEN their program needs a "C++ object" they could just add it without breaking or even tweaking the rest of their code.
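
For anyone who hasn't run into "not all C is valid C++" in practice, here is a small sketch of perfectly legal C that a C++ compiler will either reject or quietly interpret differently. It is my own illustration; struct student, make_counts and char_literal_size are hypothetical names invented for the example.

```c
#include <assert.h>
#include <stdlib.h>

/* "class" and "new" are ordinary identifiers in C, but reserved words
 * in C++, so a C++ compiler rejects this struct outright. */
struct student {
    int class;
    int new;
};

/* C converts the void * returned by calloc to int * implicitly.
 * C++ requires an explicit cast and refuses to compile this line. */
int *make_counts(size_t n)
{
    int *counts;

    assert(n > 0);
    counts = calloc(n, sizeof *counts);
    return counts;   /* NULL on allocation failure */
}

/* A character constant has type int in C and type char in C++, so the
 * same expression yields sizeof(int) here but 1 under a C++ compiler. */
int char_literal_size(void)
{
    return (int)sizeof('a');
}
```

None of this is exotic. The uncast malloc/calloc idiom in particular is all over existing C code, which is exactly why "just compile it with g++" tends not to work on a real code base.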
Breaking C was (IMO) a silly design decision. So was adding crap like a completely different stdin/stdout interface, which makes most C++ code, even that which doesn't use objects in any way, most non-back-portable-or-compatible to C. It is very clear that these were all deliberate design decisions INTENDED to break C and FORCE programmers to make what amounts to a religious choice instead of smoothly extending the palette of their programming possibilities.

I'd honestly have to say that it is this particular aspect of C++ more than any other that irritates me the most. There is a HUGE CODE BASE in C. For a reason.

> > c) Nearly all decent books on network programming (e.g. Stevens)
> > provide excellent C templates for doing lots of network-based things
> > d) You can do "anything" with C plus (in a few, very rare cases, a bit
> > of inlined assembler)
>
> Amen! This goes for C++ as well though.

Y'know, you won't believe this, but I actually added the (and C++) references in my original reply just thinking of you...;-)

Last time we had this discussion you were profound and passionate and highly articulate in advancing C++ -- so much so that you almost convinced me. Alas, that silly barrier (which I recall your saying took you YEARS to get over yourself)... I just bounce right off of it in a state of total irritation every time I try.

Another problem is that every time I work with a student C++ programmer coming out of Duke's CPS department (which now teaches C++ as standard fare) I observe that while they are just great at creating objects and so forth they are clueless about things like pointers and malloc and the actual structure of a multidimensional array of structs and how to make one and manipulate its contents. As a consequence I have to spend weeks teaching them about all of the virtues of "C as a thin veneer" in order to get them to where they can cut advanced code at all, in either one. You may not have encountered this as you did C first.
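
As an example of the sort of thing I end up teaching, here is a minimal sketch of a lower-triangular array of structs built by hand out of one block of memory and a vector of row pointers. The Cell type and the names tri_new and tri_free are hypothetical, chosen only for this illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical payload: one entry per pair (i, j) with j <= i. */
typedef struct {
    double value;
    int    count;
} Cell;

/* Allocate a lower-triangular array of Cells: row i holds i + 1 cells,
 * so t[i][j] is valid for 0 <= j <= i < n. Returns NULL on failure. */
Cell **tri_new(size_t n)
{
    Cell **rows;
    Cell *block;
    size_t i, offset = 0;

    assert(n > 0);
    rows = malloc(n * sizeof *rows);
    /* One contiguous, zero-filled block holds all n(n+1)/2 cells. */
    block = calloc(n * (n + 1) / 2, sizeof *block);
    if (rows == NULL || block == NULL) {
        free(rows);
        free(block);
        return NULL;
    }
    for (i = 0; i < n; i++) {
        rows[i] = block + offset;   /* row i starts i(i+1)/2 cells in */
        offset += i + 1;            /* skip over the i + 1 cells of row i */
    }
    return rows;
}

/* Free the single block of cells, then the vector of row pointers. */
void tri_free(Cell **t)
{
    if (t == NULL)
        return;
    free(t[0]);   /* t[0] points at the start of the block */
    free(t);
}
```

After t = tri_new(4), t[3][3] is the last cell in the block and t[2][1] sits four cells in. Understanding why that is true is precisely the "how memory actually works" knowledge I am talking about.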
Or maybe you never have cause to allocate triangular arrays of structs or to build your own linked list of structs, or maybe C++ does this and eats your meatloaf for you and I'm just too ignorant to know how.

It's funny. Back in the 80's all the CPS departments taught pascal because it provided strong type checking and absolutely forced one to use a rigorous bottom-up program design methodology. Precisely the same features that caused all real programmers to run from it screaming, as somehow all the programmers I've ever met work top down initially, then top, bottom, middle, wherever until the project is finished, often with a few epiphany-driven total project reorganizations in the middle as experience teaches one what the REAL data objects are that they are manipulating and whether they need easily maintained code that can run relatively slowly or whether they need the core loop to run faster than humanly possible if they have to hand-code it in assembler to get it there.

Now they teach C++, and as you observed, teach it badly. I humbly submit that the REASON they teach it badly is that they aren't teaching "C++" at all, they are teaching structured programming and programming discipline with a language that simply doesn't permit (or if you prefer encourage) the student to use methodologies with far more power but far less externally enforced structure. Pure pedagogy, in other words.

I personally think the world would be a better place if they FIRST taught the students to code in naked C with no nets, and taught them that the reason for learning and using good programming discipline is BECAUSE the bare machine that they are working with comes with no nets. Then by all means, teach them C++ and object oriented design principles.
I suspect that students who learn C++ in this order are, as you appear to be, really good programmers who can get the most out of C++ and its object oriented features without "forgetting" how to manipulate raw blocks of memory without any sort of OO interface per se when the code structure and extensibility requirements don't warrant all the extra overhead associated with setting one up.

> C, in my opinion, would be somewhat like C++, except for larger
> problems it doesn't fare quite as well (not poorly by any means, just not
> as well).

For CERTAIN larger problems, you could be right. I do consider things like writing operating systems and compilers to be "larger problems" though, and C seems to do very well here;-)

Your arguments from last time were very compelling, as I said.

> The only very very large problem with C++ is, that almost no people know
> the language. There is a mountain of absolutely crap learning material
> out there. This is why you see examples of "C versus C++" where the C++
> code is several times larger or even less efficient than the C example,
> because the author felt that shovelling bad OO and braindead design
> after the problem seemed like a good idea.

Again, I think that is because most C++ programmers learn C++ >>first<< as part of a course on structured programming, which teaches them to use C++ the way it was "intended" to be used. If one learns C first, on the other hand, you already KNOW how to write tight code and can freely switch over to using C++ constructs where they make sense.
In fact, since you actually understand what a struct IS and how data allocation WORKS (a thing that, believe it or not, is largely hidden from the student in most C++ classrooms as this is precisely the under-the-hood stuff considered anathema in a world where programmers are expected to be commodity items and write commodity -- interchangeable -- code) you can probably even do very clever end runs around most of the silliness that they teach as good practice whenever and wherever it suits you.

> I believe that with the state of compilers today, nobody should have to
> start a large project in C - except if it absolutely needs to run on a
> platform for which no decent C++ compiler is available (maybe Novell
> NetWare - but that's the only such platform that comes to mind...)

Give me a few days paring C++ down to the bare minimum extension of C so that it is a pure superset (so that ALL C CODE just plain compiles perfectly and even produces the same runfile, so that the I/O language "extensions" are moved off into a library that you can link or not, mostly not, and so that things like classes and protection and inheritance are suddenly pure and transparent extensions of the basic ideas of structs and unions and typedefs), and I wouldn't even argue. In fact, I'd say that the argument itself becomes silly and moot. One wouldn't even have to change Makefiles, as C++ would just be a feature of gcc that is automatically processed when and where it is encountered. So the kernel would suddenly be written in "C++", sort of, and there would be a very low barrier to converting C into C++ because basically, there would be no conversion at all, only extension.

This of course will never happen. Until it does, then an excellent reason to choose C instead of C++ is if you happen to know C very well but not know (or care for) C++ as well.
Or if your project is derived from a code base already written in C, and you don't feel like going back and fixing everything to make a C++ compiler happy with it (especially when doing so might well "break" it so that a C compiler is no longer happy with it).

Note that this isn't a "theoretical" argument about should -- it is a real-world argument on why the choice you describe is NOT made, over and over again, by most people writing code (judging strictly on the basis of the number of GPL projects written in C vs C++). Even in a world where most schools have been TEACHING C++ for close to a decade, the majority of people who become computer scientists and systems programmers and work on the guts of computer systems underneath the hood seem to at some point cross over to C and stay there, while the majority of people who stick with C++ end up programming for Windows for some big corporation (which might well be Microsoft) -- and produce shitty, broken, huge, inefficient code as often as not.

My next door neighbor is an interesting example that comes to mind. He is a professional programmer. A decade ago he taught CPS at Duke, but got irritated because he had to teach the students to program in C++, and in such a way that they never learned how data structures really work. Believe it or not, I've had to actually teach SEVERAL students who have FINISHED the intro computer courses here just how memory on a computer works -- it is deliberately taught in such a way that one DOESN'T learn that, one learns instead to "think" only about the compiler-provided memory schema. He once spent a whole afternoon in my yard ranting at me about how a struct or union was all the object support any programmer could ever need. So he quit and took over Scholastic Books software division, made a modest fortune, and STILL lives next door writing software and clipping coupons.
Other faculty I know seem to view C++ the same way -- a good thing to use to teach students because it forces them to code a certain way, but THEY don't use it for THEIR research projects. Professionals seem more often than not to prefer having the entire palette and all the brushes, and don't like even MAKING color by numbers codelets.

This is very definitely not exhaustive or definitive. I also know students where the exact opposite is true -- they learned C++, learned about data structs and malloc and C, and continue to code in C++ (as you do) with strictly enriched abilities. Of course they are (like you;-) Very Bright Guys...and could probably even write good Fortran code if they ever learned it.

No, wait, that is an oxymoron -- couldn't happen.

> Seriously though, I think that the language-flamewars are fun and
> informative, since so much happens in the space of compilers and
> real-world projects out there. So, I think it's useful to get an update
> every now and then, from people who have strong feelings about their
> languages - oh, and a discussion in the form of a friendly flame-fest is
> always good fun too ;)
>
>  / jakob

I agree, and hope you realize that I'm deliberately overarguing the case for C above for precisely that reason. I really do believe that it is as much a matter of taste and your particular experience and application space as anything else and wouldn't be surprised if some java coder DOES speak up for java as that's what THEY are really good at and they've discovered some feature that makes it all worthwhile. We've already heard from the python crowd, after all, and THAT'S a language where they even eschew such obviously useful constructs as {}'s and line terminators such as ; in favor of indentation as a matter of deliberate design/dogma.

It really does come down in the end to religious belief/personal taste.
In the best spirit of tongue-in-cheek, I cannot do better than end with:

http://www.phy.duke.edu/~rgb/Beowulf/c++_interview/c++_interview.html

Read it and weep...:-)

   rgb

--
Robert G. Brown    http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525  email:rgb@xxxxxxxxxxxx