1/28/2009

01-28-09 - Graph Memory

So we have a thing to track memory allocs with a stack trace, la di da, no big whoop. I log it all out to a file. So I wrote a thing to parse them into a hierarchy and spit them out with tabs for tabview . That's awesome.

Then I thought, hey, Atman makes these awesome graphs with "graphviz" so maybe I'll try that. One disadvantage of the pure hierarchy view in tabview is that you can't really see the flow when lines merge back up. That is, call graphs are not *trees* they're *DAGs*. Sometimes the stack hierarchy forks apart but then comes back together again. Graphviz should be able to show this neatly. Graphviz makes "SVG" files that you can just load with Firefox (Firefox 3's support is much better than 2's).

Anyway I made some graphs with various options and it's kinda cool. Here's an example : Allocs1.svg (you'll need to use ctrl-wheel to zoom out to see anything). Apparently if you click this link Firefox does nothing good. You have to download it - then open it with Firefox. WTF. Apparently it's my Verio servers doing the wrong thing . Yegads I hate the web.

Not bad, but I'm a little disappointed with graphviz's layout abilities. In particular the actual cell and edge layout seems very good, but they seem to have literally zero code to try to put the labels in good places.

In a bit of odd deja vu, this was one of the very first things I ever worked on as a professional programmer in 1991; I worked for a GIS company that had an old COBOL GIS database engine that worked with Census data and such; I wrote them a new C front end with graphics for PC's and such, and one of the things you have to do is take this raw street data with lat/long coordinates and street names and do some nice munging to make it look okay; a big part of that is a lot of heuristics about putting the labels for streets in good places (and when to repeat the label, etc.).


ADDENDUM : talking to Sean I realized you really want the graph to have different sizing/color options, be hierarchical, interactive, and stable.

That sounds hard, but I don't think it actually is. The key thing that makes it easy is that there is very good hierarchy in this information, and you can create the graph incrementally. I think that means you can just use simple iterative penalty methods to make the graph stable.

Here's my proposal :

Start with the graph at very coarse granularity ; maybe directory granularity of you have a few directories and that makes sense, else file granularity. Whatever coarse level so you have < 32 nodes or so. Just use a solver like graphviz to make this initial graph.

Now, interactively the user can click any group to expand its hierarchy. When that happens the big cell splits into various pieces. You just create the new pieces inside the parent and make the new edges - and then you just let them time evolve with a semi-physical iterative evoluton.

You apply a penalty force for intersection with neighbors to drive the nodes apart so there's no overlap. You similarly apply forces with the edges to make them never intersect edges. And the edges also act kind of like springs, applying forces to try to be short and straight. Stable 2d physics is a pretty solved problem so you just let them run until they settle down. Note that as they spread apart they can force the other nodes in the graph to move around, but it's all nice and smooth and stable.

I think it's much easier to treat the canvas as just infinitely large and let your nodes move apart all they need to. Graphviz does everything oriented towards being printed on a page which is not necessary for the interactive view.


ADDENDUM 2 : after figuring out some evil things I've got graph_viz making better graphs. Here's an example : allocs2 SVG (direct clicking should work now too).

See comments on this post for details.

I'm still not happy with the edge labeling, and the FF SVG navigation is not ideal, but it's useable now.

17 comments:

  1. If you're unlucky enough to have installed the Adobe SVG plugin for IE, then Firefox won't even load it from your local machine unless you delete the SVG file association. Even if you point it at Firefox. Awesome.

    ReplyDelete
  2. Graphviz is quite cool indeed. Mainly because the source text of the graphs can be generated.

    I've a script that gets the #include from code to generate a nice graph of who includes what. Nice to see if one cpp includes a few headers or a whole forest of crap.

    ReplyDelete
  3. My usual obligatory comment: tcmalloc does something similar for heap and cpu profiling, and I agree it is very useful.

    http://code.google.com/p/google-perftools/wiki/GooglePerformanceTools

    Your MIME type is oddly set to text/html, despite the clear extention and doctype header (although I'm not sure that servers actually look at the latter), so browsers that respect that dutifully render it as quirks html.

    ReplyDelete
  4. I tried to check out tcmalloc for comparison but it doesn't even build in VC7.1 (aka 2003). I sent them an email but I'm sure they don't care.

    Also the cool performance tracking stuff only works in Linux.

    ReplyDelete
  5. Hmm yay now I'm trying to fix my Apache configuration. I think I accidentally deleted my entire "etc" directory at one point so cbloom.com is all screwy.

    ReplyDelete
  6. BTW while the graph thing is kind of neat, it becomes useless really fast for large complex programs.

    To really work you would need an SVG tree view.

    Like start with a graph at file granularity. Each file box is clickable and it expands just that part of the graph to function granularity. Then each function is clickable and it blows up to line granularity.

    That would rock but requires a semi-realtime graphing engine, graphviz is too slow to run like a dhtml thing where you have it make new graphs each time you click. (plus it's not stable so doing that would make the graph change wildly)

    I did see that SVG has some mouse interaction capability, so you could at least add some more info to the nodes and give them some kind of mouse-over popup box or something.

    ReplyDelete
  7. Sly - you should share that include grapher, that sounds pretty cool.

    ReplyDelete
  8. cbloom said...
    > Sly - you should share that
    > include grapher, that sounds
    > pretty cool.

    Already did actually. It's a bit hacky ; I quickly developed it only to work on some of my projects, not as a publishable tool. But I guess others could give it a try. It's a Unix/Cygwin script:
    http://tfpsly.free.fr/Files/GenerateDepGraph.sh

    ReplyDelete
  9. The google perftools mem grapher output looks almost exactly like yours, except that boxes are sized according to total bytes. We use it on giant programs without an interactive mode and it seems to be fine. The other thing it has is a diff mode, so you can find leaks.

    ReplyDelete
  10. One way to cope with large programs is to be able to "focus" on different stack traces. Perftools lets you specify a regex that filters the stacks it includes in the analysis. Really, it would be awesome if it could be done totally interactively. Stable layout would make things work pretty well.

    ReplyDelete
  11. What does Perftools use for the actual graphing?

    I tried making the boxes sized by bytes but it just made graphviz do a really bad job of the layout.

    ReplyDelete
  12. Hmm.. looks like it's in the PProf perl and they use DOT.

    It seems they don't actually size the nodes, rather they size the font on each node proportional to the alloc and then let DOT make the node size fit the font.

    Dear god reading perl is painful.

    ReplyDelete
  13. To change the size of node, you can use the width and height parameters:
    H_5 [height=3];
    0 being the default size, 1 the double, and so on...

    Ref:
    http://www.graphviz.org/doc/info/attrs.html

    ReplyDelete
  14. Sly, yes, though width & height actually set the size in "inches". Also you need to use fixedsize=true.

    ReplyDelete
  15. Urg. Okay I found a major thing :

    The font size output from graph_viz is broken / not supported by FF3. See :

    http://groups.google.com/group/mozilla.dev.tech.svg/browse_thread/thread/1d574c2690e37c7b

    http://groups.google.com/group/mozilla.dev.tech.svg/browse_thread/thread/1d574c2690e37c7b

    I've hacked a fix to this and my graphs look way better now.

    ReplyDelete
  16. The other thing I found from reading the pprof perl was a cool thing the Google guys are doing :

    They set the graph optimization edge weight by the size of the allocation. That way the "fat pipes" of big allocation get nice short straight edges, and the smaller allocs get thrown out to the borders and get long wavey edges.

    ReplyDelete
  17. > height actually set the size in "inches"

    Inches? How the fuck could I guess that? ;) RTFM I guess...

    ReplyDelete