The *other* wizard named Harry.

I read through The Dresden Files by Jim Butcher a bit back, and I simply must recommend them here. The best thing about these books is the author’s deep grasp of the nature of the spiritual world, which shows through in everything from the way he explains the nature of magic in his fictional world to the way he portrays the magical/spiritual characters and worlds involved. These books have prompted a lot of thinking and conversing with God about whether some of my beliefs about the nature of reality are really well founded, or whether I’ve simply gone along with the winds of cultural assumption without really thinking about it. All in all, I’ve found the books to be something that God has used to push me toward new truths that I would likely not have gotten to otherwise, being too comfortable where I was. Do prepare yourself for true depictions of the ugliness of both the spiritual world and the human soul, though; Jim Butcher does not hold back. Also, as the books progress, the depth and complexity of the spiritual concepts Butcher explores increase, so hold onto your seat!

Since I’m reviewing a series of books here, I suppose I should at least comment on the slightly more mundane details… The prose is well written, the main character (who is also the narrator) is enjoyably witty, the plot is unpredictable but believable, all the characters have fullness and depth (proportional to how much they intersect with the plot, of course), the imagined world and its mechanics make sense, and I find the stories quite suspenseful overall. I can’t wait for the next one!

On compositing, APIs, performance, and simplicity

I’ve been thinking about the whole software structure involved in sharing video hardware among processes doing accelerated video operations (which I mean in a general sense, from video decoding to 3D). The development of compositing has certainly changed the paradigm of how this is done, and I believe it’s a great improvement. However, it came as a hack that the designers of the underlying stack had never considered, much like AJAX, and as a result there are some serious design issues that need to be addressed before it will truly work well. The problem that jumps out at me is that the critical path of video content rendering, between its origin in a userspace process and its display on the screen, has to cross the CPU/GPU barrier three times, which is just silly. Allow me to illustrate:

  1. Program(CPU) gives the GPU some data and directions for how to render it into a buffer. (Critical path goes from CPU to GPU)
  2. GPU renders into the buffer, which then sits around waiting for the compositor(CPU) to get a scheduling slot. (Critical path goes from GPU to CPU)
  3. Compositor(CPU) tells the GPU to take the buffer and render it as a part of another 3D scene. (Critical path goes from CPU to GPU)
  4. During output of the next frame, the compositor’s output buffer is put on the screen.

Obviously, the compositor process shouldn’t be in the critical path here. Consider if the path looked like this:

  1. Compositor(CPU) responds to some event (such as input), calculates a description (as a function of time) of how to assemble various buffers into a 3D scene, and sends it to the GPU. (Not part of critical path)
  2. Program(CPU) gives the GPU some data and directions for how to render it into a buffer. (Critical path goes from CPU to GPU)
  3. The GPU uses stored descriptions to automatically composite buffers in preparation for the next frame, thereby getting the program’s data to the screen.
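To make the difference concrete, here is a toy model that just counts how often the critical path changes sides. It is only a sketch: the stage names and the `count_crossings` helper are my own invention for illustration, and nothing here talks to real graphics hardware.

```python
# Toy model: each stage on the critical path is tagged with the side
# (CPU or GPU) that has to act before the next stage can start.
# Stage names and this helper are hypothetical, for illustration only.

CURRENT_PATH = [
    ("program submits render commands", "CPU"),
    ("GPU renders into the program's buffer", "GPU"),
    ("compositor gets scheduled and submits its scene", "CPU"),
    ("GPU composites the scene for output", "GPU"),
]

PROPOSED_PATH = [
    ("program submits render commands", "CPU"),
    ("GPU renders into the program's buffer", "GPU"),
    ("GPU composites using the stored description", "GPU"),
]


def count_crossings(path):
    """Count how many consecutive stages run on different sides."""
    sides = [side for _, side in path]
    return sum(1 for a, b in zip(sides, sides[1:]) if a != b)


print(count_crossings(CURRENT_PATH))   # 3
print(count_crossings(PROPOSED_PATH))  # 1
```

The compositor's own upload of the description is deliberately left out of `PROPOSED_PATH`, since it happens ahead of time and not on the critical path.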

This would not only increase the performance of composited desktops, but also greatly simplify the problems involved in avoiding tearing while maintaining low latency from generation to display, since those problems would no longer be spread across multiple domains but would sit entirely in the control of the graphics driver writers. Furthermore, compositors would no longer use constant CPU time and could be stacked (which has its uses, from a software point of view). Even esoteric things like direct graphics hardware access from inside virtual machines through I/O virtualization wouldn’t require anything particularly unusual on this layer, which I think is one of the marks of a good solution. In order to implement this, though, aside from the things that might need to be done at the hardware and driver levels (about which I know very little), one would need a different kind of graphics API that allows compositors to send these compositing descriptions to the GPU; simply extending an existing API wouldn’t cut it.
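As a rough idea of what "a description as a function of time" might look like from the compositor's side, here is a minimal sketch. Everything in it is hypothetical: `Placement`, `AnimatedLayer`, and `FakeGpu` are names I made up, the interpolation is plain linear, and `FakeGpu` is an ordinary Python object standing in for whatever the driver and hardware would really do.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Placement:
    x: float
    y: float
    opacity: float


@dataclass(frozen=True)
class AnimatedLayer:
    """One client buffer whose placement is interpolated over time (hypothetical)."""
    buffer_id: int
    start: Placement
    end: Placement
    t_start: float
    t_end: float

    def placement_at(self, t):
        # Clamp progress to [0, 1] so the layer holds its end state afterwards.
        span = self.t_end - self.t_start
        u = 1.0 if span <= 0 else min(1.0, max(0.0, (t - self.t_start) / span))

        def lerp(a, b):
            return a + (b - a) * u

        return Placement(lerp(self.start.x, self.end.x),
                         lerp(self.start.y, self.end.y),
                         lerp(self.start.opacity, self.end.opacity))


class FakeGpu:
    """Stand-in for the driver/hardware side; keeps the last description sent."""

    def __init__(self):
        self.layers = []

    def upload_description(self, layers):
        # Called by the compositor only when something changes (e.g. on input).
        self.layers = list(layers)

    def compose_frame(self, t):
        # Runs once per frame with no compositor involvement.
        return [(layer.buffer_id, layer.placement_at(t)) for layer in self.layers]


gpu = FakeGpu()
gpu.upload_description([
    AnimatedLayer(buffer_id=1,
                  start=Placement(0.0, 0.0, 0.0),
                  end=Placement(100.0, 0.0, 1.0),
                  t_start=0.0, t_end=1.0),
])
print(gpu.compose_frame(0.5))  # window 1 halfway through its fade-in
```

The point of the shape is that `upload_description` is called rarely, while `compose_frame` is called every frame and needs nothing from the CPU, which is why the compositor would stop using constant CPU time.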

I don’t pretend to know just how difficult this would be, but I do think it’s the right solution.