Caustic Graphics: Hardware Raytracing

by Derek Wilson on 4/20/2009 4:00 PM EST

  • jido - Thursday, April 30, 2009 - link

    With fast branching and parallel computing, as well as a good amount of RAM, there have to be other applications for this card. Could you do encryption/decryption, or maybe AI?
  • lemonadesoda - Thursday, April 30, 2009 - link

    Wouldn't the ClearSpeed e710 PCI board outperform this? The hardware is there. This company should focus on the software libraries.

    The CATS™ 700 1U rack module contains 12 of these and delivers over 1 teraflop. With the right software you could probably do near-real-time raytracing (seconds rather than minutes or hours per scene). Farm out the CATS 700 and yes, you could do real-time raytracing on a one- or two-second "lag". That is, use 20 of these things, all rendering separate frames, one frame apart. It will take you a few seconds to build the first scene, but then you will have the rest ready to show in real time.
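
    A quick sketch of the pipelining arithmetic, since this is easy to get wrong (all numbers here are illustrative guesses, not CATS 700 specs):

    ```python
    # Staggered-frame pipelining: throughput scales with the number of
    # modules, but the startup "lag" stays at one full frame time.
    # All numbers are illustrative, not CATS 700 measurements.
    seconds_per_frame = 2.0   # assumed render time per frame per module
    modules = 20              # boards rendering consecutive frames in parallel

    throughput_fps = modules / seconds_per_frame   # sustained output rate
    latency_s = seconds_per_frame                  # wait for the first frame
    print(f"~{throughput_fps:.0f} fps sustained after a ~{latency_s:.0f} s lag")
    ```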

    How to manage this? Well that's the software issue these guys should be working on... NOT reinventing hardware that is already out there.

    AFTER they get it working on CATS, then perhaps they might consider developing their own optimised hardware. But that should come SECOND, not FIRST.

    Back to MBA skool boyz.
  • kyleb2112 - Saturday, April 25, 2009 - link

    If this can appreciably speed up raytracing, it'll find a market in the same demographic that buys workstation graphics cards. But the way CPU cores are multiplying, they'll have to hurry up. Nothing loves multicore more than 3D rendering, and once we've got 32-core boxes this tech may be obsolete.
  • Draven31 - Thursday, April 23, 2009 - link

    ART PURE/RenderDrive, anyone? It's been done, and the last couple of times it was tried, it was cheaper within six months to buy six render nodes that ended up being the same speed. If it's gonna be a year before they're even shipping cards to consumers, I doubt they're going to get *anywhere*.
  • 7Enigma - Wednesday, April 22, 2009 - link

    ...and no one has a clue what that will be. Come on, people, this article was little more than a puff press piece; interesting to read, and it makes geeks giddy, but no actual substance. To be honest, this should be a blog post. Yes, the description of the differences between ray tracing and rasterization is nice and all, but there is no meat to this product at the moment.

    So no, it's not 20X faster, or 100X faster, or 2% faster; it doesn't yet exist, and until independent testing has been done, I won't believe a word I read.
  • simtex - Wednesday, April 22, 2009 - link

    To all the people claiming this is a bad idea because Intel will just copy it and make it run on their CPU: the CPU has other tasks too. If ray tracing is to be used in games, I would be rather annoyed if I couldn't get physics and AI calculations because my CPU had to do all the rendering work. So a card to offload some of the computations would surely be a nice addition. Of course, Intel could then cooperate with NVIDIA and offload physics and AI calculations to the graphics card, but that doesn't seem very likely atm.

    Also, Caustic doesn't claim that this is a production board; in fact, they state that this is a prototype and that their final product will use ASICs, not FPGAs. Furthermore, Caustic's design does in fact consider the bandwidth requirements of ray tracing; they actually claim that their algorithms are specially designed to cope with the limited bandwidth, and that this is their major achievement. Personally, I think they use some sort of ray bundling; although this has also been implemented in software ray tracers, they must have invented some new tricks to make it even better.

    Another great aspect of ray tracing is that the frame rate depends more on the number of pixels you wish to render than on the number of triangles in the scene, contrary to rasterization.
  • ssj4Gogeta - Wednesday, April 22, 2009 - link

    Well, I don't know much about all this, but I saw their video. The co-founder says that ray tracing isn't a compute problem anymore and that they looked at it in a different way. So I'm wondering: if they're using a new algorithm or something that's making all the difference, can't Intel use their general-purpose Larrabee to simulate that?
  • simtex - Thursday, April 23, 2009 - link

    That depends on whether the algorithm is publicly available. As I understand it, Caustic claims it's their own algorithm, and it's probably patented.
  • slusallek - Wednesday, April 22, 2009 - link

    It is strange to see someone propose a hardware architecture that is bandwidth-limited by design. They have separated ray tracing at the point where the bandwidth is highest -- which does not seem like a smart move.

    By doing the ray traversal and intersection on their card and the shading on the GPU, they must constantly transfer ray data between the two: transfer the hit point of each ray/pixel to the GPU (point, normal, texture coordinates, shader ID, ...) and then transfer any rays newly generated by a shader back to their chip (origin, direction, min/max_dir, ...). Each of those transfers is easily 30 to 60 bytes per ray.

    So for an HD screen this is easily 100MB for just a single ray generation and only one sample per pixel. Given a PCIe 1.0 x4 bandwidth of 1GB/s, this gives a theoretical maximum of just 5 fps -- and we have not done any real work yet. AA, shadow rays, reflections -- they all generate multiples of this bandwidth. Even with PCIe 3.0 and x16 lanes this will be a huge bandwidth issue.
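
    To make that arithmetic explicit, here is a back-of-the-envelope sketch (the byte counts and bus figure are the rough assumptions from above, not measured numbers):

    ```python
    # Per-frame PCIe traffic for one ray generation at 1080p, using the
    # assumed figures above: ~50 bytes per ray in each direction,
    # PCIe 1.0 x4 ~= 1 GB/s theoretical.
    pixels = 1920 * 1080                 # one primary ray per pixel
    bytes_per_ray = 50                   # hit data out, new ray back: ~30-60 B each
    bus_bytes_per_s = 1e9                # PCIe 1.0 x4

    traffic = pixels * bytes_per_ray * 2           # both directions
    print(f"{traffic / 1e6:.0f} MB/frame -> {bus_bytes_per_s / traffic:.1f} fps cap")
    # ~207 MB/frame -> ~4.8 fps, before any shadow or reflection rays
    ```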

    Let me also just point out that one of the first RTRT papers, on visualizing car headlights (http://graphics.cs.uni-sb.de/Publications/2002/200...), already achieved up to 10 fps at the same video resolution. Note that the headlight used up to 25 rays per pixel and lots of complex multiple reflection and refraction. It ran on a cluster of 16 nodes with dual single-core Athlon 1800 CPUs, providing a total of 32 cores.

    Compare this with the latest machines used by Caustic, likely with dual quad-core CPUs giving something like 16 hardware threads (8 cores with Hyper-Threading), each core easily providing a multiple of the FLOPS of the old Athlon. So it seems their performance is not even reaching what had already been done 7 years ago.

    So in summary, I am not really impressed by what Caustic claims. Their hardware architecture is severely limited by design, and their software results are way behind what was done many years ago.
  • nubie - Tuesday, April 21, 2009 - link

    I for one am looking forward to a time when games no longer have ugly polygons.

    Even in recent games, cylindrical objects are rendered with as few as 8 sides.

    If you could feed the raw math describing a surface directly into the ray intersection equations, it would make the whole thing much more powerful.


    I would love to see this and Larrabee succeed; progress and competition are always good for the consumer.
  • tdenton1138 - Tuesday, April 21, 2009 - link

    Check out their videos here:

    http://vimeo.com/4240520
  • poohbear - Tuesday, April 21, 2009 - link

    I'm so replacing their ugly blue heatsink w/ my own aftermarket cooling..... gonna overclock the hell outta this thing!!!! WOOOOOOOOOOOOOOT! Raytracing, here I come!
  • Flunk - Tuesday, April 21, 2009 - link

    The use of FPGAs in the Caustic One makes it sound more like a prototype than an actual product. It's nice to see people trying to sell new ideas, but it might be a bit premature today. Then again, a proof of concept is always nice.
  • DerekWilson - Tuesday, April 21, 2009 - link

    FPGAs are good enough if the benefit is real. Lots of people use FPGAs in shipping devices. An ASIC would be better (faster), but it requires a lot of start-up money.
  • DeathBooger - Monday, April 20, 2009 - link

    I'm a professional 3D artist, and I don't really see this taking off. Right now I have a Core i7 and it does the job for me just fine. I create photorealistic images and animations for a living, and I don't see the point of this in this day and age. And if I can't see the point, I doubt production companies with access to large render farms will. Especially if it makes more fiscal sense to pop in a new processor instead of changing out all the mainboards to fit a new card that might be faster.

    The software used for ray tracing these days is a lot faster than it used to be. Third-party apps like V-Ray pretty much solved the issue of slow ray tracing years ago.

    I could see this taking off for games, to make real-time global illumination a standard, but only if Microsoft and Sony decide to add Caustic hardware to their next consoles. Keeping it PC-exclusive wouldn't go anywhere long term.

    Actually, another prospect would be for NVIDIA to buy them out, since they own Mental Ray. Mental Ray is the renderer that ships with most 3D software these days. It still won't change the fact that people in the know use V-Ray instead, since it's a lot faster than Mental Ray and more user-friendly. Mental Ray is more powerful in the right hands, and I could see the film industry gobbling these cards up if the SDK were implemented in Mental Ray, but freelance guys like me will probably never touch one.
  • Griswold - Tuesday, April 21, 2009 - link

    You don't sound like you actually know what you're talking about with respect to this hardware, let alone like a professional 3D artist who does what this thing was designed for...
  • ssj4Gogeta - Tuesday, April 21, 2009 - link

    I'd like to disagree too. They say they provide a 20x improvement in rendering times. Surely this card will be cheaper than buying 20 processors. And who said you need to replace the mainboard? It clearly uses a PCIe slot. Look at the pic.

    Now if Intel can deliver something with similar or better performance with Larrabee, at a price point that many consumers can afford, then things would be different.
  • DeathBooger - Wednesday, April 22, 2009 - link

    PCIe is a rare commodity in servers still to this day. Render farms use servers, not typical workstations. This company is essentially trying to add another component where one never existed before. That requires a total reconfiguration for server farms. It's not like each server has a video card that can just be swapped out for this Caustic card easily.

    Tell you what, if Pixar adopts it, then I'll eat my words. Pixar has the resources to do anything they want. If they find value in this card then I was wrong.
  • RagingDragon - Friday, April 24, 2009 - link

    Uh, you might want to check the HP, IBM and Dell server lineups...

    New Intel/AMD servers do have PCIe (mostly x8 and x4).
    New RAID controllers are mostly PCIe x8 or x4.
    10Gb Ethernet, Fibre Channel cards, etc. are mostly available in PCIe too.
  • Tuvok86 - Tuesday, April 21, 2009 - link

    The movie Monsters vs. Aliens required 40 million hours of rendering time.
  • DeathBooger - Wednesday, April 22, 2009 - link

    They're speaking in terms of workstation hours, not actual hours. HP is hyping its products, so it's misleading.
  • Roland00 - Tuesday, April 21, 2009 - link

    The movie runs about 94 minutes at 30 fps, so roughly 94 * 60 * 30 = 169,200 frames.

    Thus each final frame took about 40,000,000 / 169,200 = 236.4 hours of render time.
  • Verdant - Monday, April 20, 2009 - link

    I respectfully disagree. A fully raytraced scene with anything more than basic lighting can easily take well over a minute per frame, and even with a huge render farm it takes a long time to render an animation of any significant length and detail. Most larger animation houses would jump on something like this if it really can render their frames 20x faster without using 20x the power.
  • jabber - Monday, April 20, 2009 - link

    ....that still can't show anything but a rendered image of its product several months after it's been announced.
  • Tuvok86 - Tuesday, April 21, 2009 - link

    the card pictured:
    http://www.pcper.com/images/reviews/694/card-angle...
  • monomer - Monday, April 20, 2009 - link

    Is it just me, or does the Caustic logo look similar to a slightly rotated Quake III logo?
  • SonicIce - Monday, April 20, 2009 - link

    lol yeah, except it's like Quake 6 or something
  • ssj4Gogeta - Monday, April 20, 2009 - link

    If raytracing catches on in games, how long will it take Intel to make similar units and put a couple of them on the Larrabee die? I'm sure that if these guys could do it, Intel's engineers could too.

    Besides, from what I've seen/read, it seems Larrabee will be good enough for raytracing. In a Larrabee research paper from Intel, I read that Larrabee is 4.6 times more efficient than Intel Xeon (Core-based) processors at raytracing on a per-clock, per-core basis. Also, Intel ray traced Quake Wars at around 25 fps @ 1280x720 using 4 Intel Dunnington hexa-core processors (24 cores in total).

    So if Larrabee has 32 cores, and even if we take it to be 4x more efficient instead of 4.6x (scaling etc.), then it will be (32*4)/24 = around 5.3 times faster than the setup they used. That's enormous! Around 130 fps at 1280x720 for a fully ray traced game! Or you could increase the resolution and keep the fps at 60. Besides, Larrabee will most likely have MUCH more bandwidth available than that FSB-based Dunnington system had.
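
    Putting that back-of-the-envelope math in one place (every figure here is an assumption quoted above, not a benchmark):

    ```python
    # Rough scaling estimate: Dunnington setup = 24 cores at ~25 fps,
    # Larrabee assumed to have 32 cores at 4x the per-core efficiency.
    dunnington_cores = 24
    dunnington_fps = 25.0
    larrabee_cores = 32
    per_core_speedup = 4.0     # conservative vs. the quoted 4.6x

    speedup = (larrabee_cores * per_core_speedup) / dunnington_cores
    print(f"~{speedup:.1f}x -> ~{dunnington_fps * speedup:.0f} fps")  # ~5.3x -> ~133 fps
    ```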

    I can't wait, Intel. Hurry up!
  • lopri - Monday, April 20, 2009 - link

    Interesting article. Thank you for the explanation on Ray Tracing v. Rasterization. The difference is still confusing to me, but hopefully I'll eventually understand. I don't expect a simple answer to my questions but maybe someone can enlighten me.

    1. Doesn't ray tracing still require triangles anyway? I understand rasterization as Derek explained it: draw triangles and 'flatten' them. Ray tracing shoots (?) rays at triangles, so it still needs triangles anyway. It sounds more like shooting rays at 'flattened' triangles... Oh, but what do I know.

    2. Is there any 'fundamental' reason why ray-traced images look better than rasterized images? It seems to me they're just different ways to the same result. Yes, I'm a noob.

    Anyway, I agree with others regarding this specific company. It's probably filing some patents and then looking to be bought by a bigger fish. Do they even have working hardware?
  • DerekWilson - Tuesday, April 21, 2009 - link

    1) You can do raytracing without triangles -- you can just use math to describe objects like spheres and such, since all that's really needed is an intersection point. But you can also use triangles, and this is often what is done because it does still make some things easier: you just do an intersection between a line and a plane and see if the intersection point falls inside your triangle. So for rasterization triangles are required, while for raytracing they are perfectly fine to use, but you aren't as locked into them as you are with rasterizers.
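
    For the curious, here is a minimal sketch of that test, using the standard Moller-Trumbore formulation (which folds the plane intersection and the inside-the-triangle check into one computation); the ray and triangle at the bottom are made-up example values:

    ```python
    # Ray/triangle intersection (Moller-Trumbore): returns the distance t
    # along the ray to the hit point, or None on a miss.
    def sub(a, b): return (a[0]-b[0], a[1]-b[1], a[2]-b[2])
    def cross(a, b): return (a[1]*b[2]-a[2]*b[1], a[2]*b[0]-a[0]*b[2], a[0]*b[1]-a[1]*b[0])
    def dot(a, b): return a[0]*b[0] + a[1]*b[1] + a[2]*b[2]

    def ray_triangle(origin, direction, v0, v1, v2, eps=1e-9):
        e1, e2 = sub(v1, v0), sub(v2, v0)
        p = cross(direction, e2)
        det = dot(e1, p)
        if abs(det) < eps:            # ray is parallel to the triangle's plane
            return None
        inv = 1.0 / det
        s = sub(origin, v0)
        u = dot(s, p) * inv           # first barycentric coordinate
        if u < 0.0 or u > 1.0:
            return None               # plane hit falls outside the triangle
        q = cross(s, e1)
        v = dot(direction, q) * inv   # second barycentric coordinate
        if v < 0.0 or u + v > 1.0:
            return None
        t = dot(e2, q) * inv          # distance along the ray
        return t if t > eps else None

    # A ray fired down -z at a unit triangle in the z=0 plane hits at t=1.0:
    print(ray_triangle((0.2, 0.2, 1.0), (0, 0, -1), (0, 0, 0), (1, 0, 0), (0, 1, 0)))
    ```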

    2) Because each pixel can incorporate input from more of the rest of the scene with little programmatic complexity and a high degree of accuracy. It is possible for raytracing to produce a more accurate image /faster/ than rasterization could achieve an equally accurate image. However, it is possible for rasterization to produce an image that is "close enough" MUCH faster than raytracing (especially with modern hardware acceleration).

    ...

    There are some raytraced images that look very bad but accurately portray reflection and refraction. Accuracy in rendering isn't all that's required for a good-looking image. The thing being rendered also needs to be handled well by artists -- accurate textures and materials need to be developed and used correctly, or the rendered image will still look very bad. I think this is why a lot of raytracing proofs of concept use solid-colored glass even when they don't have to. I honestly don't think the sample images Caustic provided are very "good" looking, but they do show off good effects (reflection, refraction, caustics, ambient occlusion, soft shadows ...)

    So ... I could take a diamond and try cutting it myself. I could put this diamond on a table next to a really well-cut cubic zirconia. People might think the imitation looks much better and more "diamond"-like, in spite of the fact that my horribly cut diamond is a diamond ... which one is "better" is a different question from which one is more "accurate" ... both are good though :-)

    hope that helps ...
  • ssj4Gogeta - Monday, April 20, 2009 - link

    I'm no pro, but from what I know the main difference is that things like shadows, refractions and reflections render MUCH better. This is because raytracing also uses secondary rays: the rays reflected off or refracted through a surface can affect the color of other nearby surfaces, producing shadows, reflections, etc. In ray tracing you do it just like nature does in the real world (in reverse, but that doesn't affect the outcome). In rasterization, you need to program shadows, reflections, etc. by hand, so they are mostly just approximations.

    Another advantage of ray tracing is that programmers don't need to work as hard -- things which might take you hundreds of lines of code in rasterization can take only ten lines in ray tracing.
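
    As a toy illustration (my own made-up sketch, not anyone's real renderer), here is roughly how little code recursive reflection via secondary rays takes -- the sphere, colors, and reflectivity are invented example values:

    ```python
    # A toy ray tracer core: reflection is just "spawn a secondary ray
    # and recurse", with no special-case reflection code.
    def dot(a, b): return sum(x * y for x, y in zip(a, b))
    def add(a, b): return tuple(x + y for x, y in zip(a, b))
    def mul(a, s): return tuple(x * s for x in a)

    def hit_sphere(origin, d, center, radius):
        # Smallest useful t where origin + t*d hits the sphere (d unit-length).
        oc = add(origin, mul(center, -1.0))
        b = 2.0 * dot(oc, d)
        disc = b * b - 4.0 * (dot(oc, oc) - radius * radius)
        if disc < 0:
            return None
        t = (-b - disc ** 0.5) / 2.0
        return t if t > 1e-4 else None

    CENTER, RADIUS = (0.0, 0.0, -3.0), 1.0   # a single mirror sphere
    SKY = (0.4, 0.6, 1.0)                    # background color

    def trace(origin, d, depth=0):
        t = hit_sphere(origin, d, CENTER, RADIUS)
        if t is None or depth >= 3:           # missed, or recursion cap
            return SKY
        p = add(origin, mul(d, t))            # hit point
        n = add(p, mul(CENTER, -1.0))         # unit normal (radius is 1)
        r = add(d, mul(n, -2.0 * dot(d, n)))  # reflected direction
        return mul(trace(p, r, depth + 1), 0.8)   # secondary ray, 80% mirror

    print(trace((0.0, 0.0, 0.0), (0.0, 0.0, -1.0)))   # ~(0.32, 0.48, 0.8)
    ```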

    Compare this to 3D vs. 2D rendering of a 3D cube. Suppose you need to render the cube as the camera circles around it. If you're doing it in 3D, you just render it in 3D, flatten the image, and display it. If you're using 2D but want to create the effect of 3D, you can't just render a cube -- you have to directly draw how the flattened image would look. That is, you have to take into account the 3D distortion due to perspective and depth, make the edges of the cube oblique accordingly, and make the part of the cube that is farther away look farther by drawing it smaller than the face closer to the camera. A lot of work for the programmer, and needless to say, it won't be very accurate. Well, in this case it may be accurate, since it's just a cube with straight edges and you can easily calculate things -- but not in the case of a complex object.

    In the same way, rasterization at best offers approximations of phenomena like refraction, and how close they are depends on the programmer.

    Check this rendered pic to see what ray tracing can deliver:
    http://upload.wikimedia.org/wikipedia/commons/e/ec...

    Note: I'm not a graphics programmer but this is how I understand it. Please correct me where I'm wrong.
  • AmbroseAthan - Wednesday, April 22, 2009 - link

    That picture is a ray-traced rendering?! It looks like a photograph! Someone put in a lot of time and crunching power on that one.
  • JimmiG - Monday, April 20, 2009 - link

    I'm also wondering about #2 -- is raytracing really "better", or just "different"?

    Back in the 90's I used to be really impressed with the quality of raytraced animations and pictures, with their shiny, reflective objects, realistic water, lighting, soft shadows, etc. Back then, though, the capabilities of "3D accelerators" were very limited: 3D games used simple models, "flat" lighting with no shadows and only one light source, and blurry textures without any shader effects like bumps, parallax maps, or reflections. Today it seems the latest 3D engines do in real time everything that used to require raytracing and many minutes or hours per scene.
  • jimhsu - Monday, April 20, 2009 - link

    Someone correct me, but I think it is not that raytracing "looks" better; rather, because it is closer to a physical description of light (only in reverse), effects such as ambient occlusion, caustics, and other shiny things can be implemented in a relatively straightforward, physically correct manner, while rasterization requires shaders to "emulate" reality. Those approximations are often complicated to program and implement, even though they achieve nearly the same effect.
  • DerekWilson - Tuesday, April 21, 2009 - link

    This is sort of true ... it's possible to write shaders for a rasterizer that do everything a raytracer does. But on top of the code complexity, in a z-buffer based rasterizer you end up with performance disadvantages.

    At the point where you properly and accurately emulate raytracing with a rasterizer, you need to start generating all kinds of dynamic environment maps every frame for every object. Treating objects as light sources and doing real-time radiosity for rasterization (which can also be done) is difficult as well. To get an image that is as physically accurate as raytracing, rasterization (at this point, with today's technology) would be slower even using a GPU designed for rasterization.

    Honestly, there are some things that rasterization can approximate well enough that we don't notice the difference, and I think for a long time to come we'll still see rasterization as the main vehicle for realtime graphics. I think we'll start to see raytracing come in as a secondary tool to augment rasterization and add effects that are tough to achieve otherwise.
  • lopri - Tuesday, April 21, 2009 - link

    I think I am learning more about raytracing from this article and its comments than from anywhere else to date. Thank you for the excellent explanations and analogies!
  • jimhsu - Monday, April 20, 2009 - link

    An analogy for math majors would be trying to solve a problem analytically (i.e., by the rules of calculus) vs. numerically (e.g., Newton's method, Euler's method). The numerical result is often close to the analytical result, but intuitively the analytical result is the "right" way to do the problem -- except when it's infeasible (when we have something we can't integrate).
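
    A worked version of that analogy, solving x^2 = 2 both ways:

    ```python
    # Analytical: x = sqrt(2). Numerical: Newton's method on f(x) = x^2 - 2,
    # i.e. x_next = (x + 2/x) / 2, converging on the same answer.
    x = 1.0
    for _ in range(5):
        x = (x + 2.0 / x) / 2.0
        print(x)        # 1.5, 1.41666..., 1.41421568..., -> 1.41421356...
    print(2 ** 0.5)     # the analytical answer: 1.4142135623730951
    ```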
  • Einy0 - Monday, April 20, 2009 - link

    This may actually help Larrabee gain ground, as an alternative way to do realtime raytracing. Then again, Larrabee may help this gain ground, since Larrabee will do both rasterizing and raytracing.
    I'm surprised they can make anything worth buying with FPGAs. I guess FPGAs have come a long way. I'd love to get some specifics on the underlying architecture.
    Bad timing considering the global economy, etc...
  • ifkopifko - Tuesday, April 21, 2009 - link

    lol... real-time raytracing? Keep dreaming. :-D Something like that is far far in the future.
  • Sivar - Tuesday, April 21, 2009 - link

    There have been assembly-language demos from the 4k scene that have done realtime raytracing since the late 90's. In software. On a Pentium MMX.
    They aren't quite what I'd call Pixar quality, but it's far from impossible.

  • HelToupee - Tuesday, April 21, 2009 - link

    Go outside. Look around. Real-time raytracing is here today! The future is now!! :)
  • MrPickins - Monday, April 20, 2009 - link

    The FPGA implementation surprised me as well. It's impressive that they can get such performance out of a pair of them.
  • SonicIce - Monday, April 20, 2009 - link

    I'll give them 12 months...
  • Harbinger - Monday, April 20, 2009 - link

    I'm pretty sure they will succeed. Just make a working prototype and prove to Pixar/DreamWorks/Disney/whoever that this thing will hugely accelerate their rendering.

    They don't have to appeal to the masses, who expect a wide variety of features on a wide variety of platforms and software. They target a very, very specific segment, and if they can convince that segment they're gonna be fine.
  • DerekWilson - Tuesday, April 21, 2009 - link

    You are right, unless Larrabee ends up competing with this in terms of speeding up raytracing ... we'll have to wait and see on that one. If they focus on a niche market, they could succeed.
  • RamarC - Monday, April 20, 2009 - link

    Agreed -- unless they get a mainstream rendering app to sign on and can get some royalties out of the software end. If not, NVIDIA will just implement a similar API and promote using Quadros as render accelerators.
  • ssj4Gogeta - Monday, April 20, 2009 - link

    Unlike Ageia PhysX, this is not about the API, but the hardware.
  • smartalco - Monday, April 20, 2009 - link

    Except, given that this is /custom hardware/, NVIDIA can't just roll out a CUDA update.
