6/15/2010

06-15-10 - The end of ad-hoc computer programming

I think we are in an interesting period where the age of the "guerrilla", "seat of the pants", i.e. ad-hoc, programmer is ending. For the last 30 years, you could be a great programmer without much CS or math background if you had good algorithmic intuition. You could figure out things like heuristics for top-down or bottom-up tree building. You could dive into problems without doing research on the prior art and often make up something competitive. Game programmers mythologize these guys as "ninjas" or "hackers" or various other terms.

That's just about over. The state of the art in most areas of coding is getting so advanced that ad-hoc approaches simply don't cut the mustard. The competition is using an approach that is provably optimal, or provably within X% of optimal, or provably optimal in a time-quality tradeoff sense.

And there are so many strong solutions to problems out there now that cutting your own is more and more just foolish and hopeless.

I wouldn't say that the end of the ad-hoc programmer has come just yet, but it is coming. And that's depressing.

14 comments:

michael said...

Nonsense. That's like saying that chess is coming to an end just because there are books that thoroughly document opening theory. The enjoyment is in the learning and the experimentation. We ... none of us ... would start this journey if it were not in us to experiment and walk our own paths ... even if they were already completely documented.

There is always room for improvement and innovation. And while the techniques may be well established the problems are always evolving. What was the correct solution yesterday is not optimal today. Be it because of hardware changes, technology changes, or whatever else, there will always be an element of artistry to programming.

I don't believe for a single second that "ad-hoc" programming is coming to its end if for no other reason than because the technical challenges that inspire solutions are themselves too robust to let that happen.

Autodidactic Asphyxiation said...

Yeah, I don't buy this either. I think it is an artificial distinction; everyone is an ad hoc coder most of the time! It's not like the average coder is getting smarter or more willing to adopt best practices.

And even if it were true, I also fail to see how this is a bad thing.

cbloom said...

First of all, let me be clear that I am talking about *algorithm* development here, not "coding" in the sense of applying existing algorithms or just the manual act of writing code.

"That's like saying that chess is coming to an end just because there are books that thoroughly document opening theory"

Now now, that's not at all analogous to what I said. What I am trying to say is analogous to "great chess players who have not studied books will not be able to win in the future" (but saying it in 1600 or whenever chess was immature). Obviously chess does not come to an end, and coding does not come to an end, but the ability of people who are not serious or schooled does in fact come to an end when a certain art form develops into maturity.

"There is always room for improvement and innovation."

Obviously. That's almost a tautology; I guess the number of algorithms expressible in a specific finite size is finite, hence there is an end to simple algorithm development at some point, but that number, while finite, is very large.

Even though there is always room for improvement, that improvement becomes more arcane, more baroque, more off in the fringes.

"It's not like the average coder is getting smarter or more willing to adopt best practices."

I'm not talking about code construction here, but yes I agree. What I'm suggesting is that the average coder will get farther and farther away from the cutting edge. People who are serious and advanced will do research, and other people will implement applications, and the current existence of people who have no formal background and hack out applications that are also cutting edge will dwindle.

Let me give a concrete example : 10 years ago, you could be one of the top graphics coders in the world and not actually understand the rendering equation or light transport or any of that. That is less true now and will be fully untrue in 10 years.

I think the only way you can possibly disagree with my contention is if you think that the number of problems in computer science is so vast that there will always be some new interesting problem that is immature.

Currently when the "ad hoc" coder is given a task to implement something, he will often just think about it for a minute and come up with an algorithm off the top of his head. People just love to do this. My contention is that in the future people who do this will be making intolerably sub-standard code.

michael said...

"My contention is that in the future people who do this will be making intolerably sub-standard code."

Heh ... we all do this ... or did ... or do ... or both ...

Come on Charles ... if it were not for us hacks you wouldn't have MSufSort to kick around ... to say nothing of rants regarding unsigned integral types! (^;

I proudly list myself amongst the inspired hacks and whatnots! They (we) are an inspired bunch.

"It's not like the average coder is getting smarter or more willing to adopt best practices."

This is the only thing I find interesting anymore ... sigh ....

Long ago it became boring for me. Then I discovered the concept of "best practices" et al.

It's the only thing that keeps me interested any more. Leak proof code, RAII, exception safe code etc.

Tom Forsyth said...

The last two major hardware paradigm shifts of the last twenty years are caching and parallelism. Neither is even remotely supported by any mainstream programming language (hmmm... maybe databases - kinda). We're so far away from a well-structured approach that in most cases we don't even know how to enumerate the full possibility space of solutions.

cbloom said...

"Come on Charles ... if it were not for us hacks you wouldn't have MSufSort to kick around ... "

That's actually an example of what I'm talking about. 10 years ago I would have written my own suffix sort and it would have been close to state of the art. Now it would be very hard for me to do that from scratch; I would have to do a lot of research and spend a lot more time on it. Now I'm more inclined to find some existing code that someone has done a good job on than to roll my own.

This happens to any intellectual endeavor as it becomes mature. Data compression is a great example, it's now very hard work to roll a compressor from scratch that is competitive with state of the art, and to advance the state of the art requires geometrically more work each time to make each step of improvement.

In the past N years computer science has been an exciting "wild west" of new problems where hackers can make their mark, but as it matures the hackers get pushed out by scientists and the more serious people who go and iterate on the minutia.

Certainly parallelism provides a host of new problems that will keep us busy for a while, but are there more beyond that?

Autodidactic Asphyxiation said...

I didn't realize you were talking about algorithms. And yeah, TomF is certainly correct that there isn't a good way to deal with memory locality or concurrency at the language level. Another problem I personally deal with is programming clusters rather than single machines, but this is very much like those other problems at a chunkier granularity (pet peeve: people who say "more granular"). The whole database (really, table-oriented) approach that Casey and Sean wrote about on Molly a few years ago is really starting to make lots of real sense.

One area that seems somewhat open is hashing and sketching. I mean, I wrote a cuckoo hashtable variant that works pretty damn well. I bored Thatcher today at lunch ranting that I should re-implement it since I lost the code with my old laptop. But the whole sketching (e.g. using hashes to speed high-dimensional approximate nearest-neighbor queries) is a pretty interesting open problem as well.

I think there are interesting open problems in the string algorithm space (yes, including suffix sorting). I mean, didn't Sean come up with some fancy regular expression optimization? And I think there are open problems for multiple string alignment (some bioinformatics problem, I'm probably misremembering).

Also, exception safe code as best practice! LOL!

cbloom said...

High dimensional Approximate Nearest Neighbors has some pretty good solutions :

http://www.cs.umd.edu/~mount/ANN/

that's definitely an area where we will continue to see some more improvement, but it's also a great example of an area that is quite mature. It's hard to beat balanced box decomposition with hashing methods (I tried):

http://www.volksbot.de/surmann/papers/icar2005_2/node17.html

Autodidactic Asphyxiation said...

From the site:

"Based on our own experience, ANN performs quite efficiently for point sets ranging in size from thousands to hundreds of thousands, and in dimensions as high as 20. (For applications in significantly higher dimensions, the results are rather spotty, but you might try it anyway.)"

Yeah, if you have millions/billions of points and thousands of dimensions, then you have to figure out how to sample things, and/or sketch your dimensions down to dozens. Or, you could try to improve the data structure. I'm not really an expert on this problem, but I'm aware that scaling up is quite difficult.

cbloom said...

"
Yeah, if you have millions/billions of points and thousands of dimensions, then you have to figure out how to sample things, and/or sketch your dimensions down to dozens. Or, you could try to improve the data structure. I'm not really an expert on this problem, but I'm aware that scaling up is quite difficult. "

Yep this is clearly an area of ongoing research.

I successfully used ANN on image patches, which is 64*3 = 192 dimensional, and with 1920*1080 = 2M data points. It lets me find the absolute best match in an SSD sense for an 8x8 block in video compression.

Obviously you can also use it with approximate dimension reduction (fitting a lower dimensional manifold through high D space which approximately hits the points) - approximate dimension reduction is an interesting and rich problem in its own right.

michael said...

"... I would have to do a lot of research and spend a lot more time on it. Now I'm more inclined to find some existing code that someone has done a good job on than roll my own."

Fair enough. I agree with you. We can't all sit around reinventing the wheel. "If I have seen further it is only by standing on the shoulders of giants." comes to mind. I didn't write the compiler that I used to develop MSufSort. Nor the OS nor build the machine. I focused my energies on a subset of the problem and enjoyed the advantages of previous work to facilitate my goals. So, yes, I agree with you.

"This happens to any intellectual endeavor as it becomes mature. Data compression is a great example, it's now very hard work to roll a compressor from scratch that is competitive with state of the art, and to advance the state of the art requires geometrically more work each time to make each step of improvement."

Well, there surely are advances to be had, always. But as with your previous point, to make such advances requires that you enjoy the privilege of employing the work of those that came before you. So it requires dedication, devotion and, mostly, an increasingly larger investment of time. Mostly because the 'bar' doesn't go down over time. It goes up. So, the more that any given area of research has to harden, the more time is required to significantly improve it.

So I guess I agree with this too.

"Also, exception safe code as best practice! LOL!"

I admit that I was more or less enumerating the things that keep me interested these days rather than a list of "best practices" but, seriously, it's not "a" best practice. It's "the" best practice. But it's one of the last practices that programmers adopt. That's my observation anyhow. Lots of code requires handling exceptions gracefully. It takes good programming practices to be able to handle the exceptional gracefully.

nothings said...

"I mean, didn't Sean come up with some fancy regular expression optimization?"

Yes... in 1990.

Autodidactic Asphyxiation said...

1990? Holy shit. Crap, and it was only 2008 (which seems like forever ago, because I barely remember that exchange) when I looked at the Judy Array stuff you checked out in 2003.

@michael: Thanks for msufsort, and most importantly for having good explanations of efficient suffix sorting!

I guess exception safety is better than not having exception safety, and I shouldn't just mock it even though I have this "been there done that but it is worthless" attitude. Actually, I'm kind of curious how people think about this? Are people compiling with exceptions right now? Does it help them solve real problems? What issues come from shipping a library with exceptions?

Jon Olick said...

One awesome thing about technology is that it always changes. What was provably optimal 5 years ago isn't necessarily optimal now. Time changes all things. Ninjas will always be useful.
