<div dir="ltr"><div dir="auto"><div dir="ltr"><span></span></div><div dir="ltr"><div dir="ltr"><span></span></div><div dir="ltr"><div dir="ltr"><span></span></div><div dir="ltr"><div dir="ltr"><span></span></div><div dir="ltr"><div dir="ltr"><span></span></div><div dir="ltr"><div dir="ltr"><span></span></div><div dir="ltr"><div dir="ltr"><span></span></div><div dir="ltr"><div dir="ltr"><span></span></div><div dir="ltr"><div dir="ltr">Hi everyone,<br><br>We just had the Eurographics Symposium on Rendering (EGSR) this past week, so here are some of the things that I learned while I was there that might be of interest to the rendering folks at UCSB.<span class="gmail_default" style="font-family:arial,helvetica,sans-serif"> Lingqi was also there and he can probably add some other things that he thought were interesting.</span><br><br>(Sorry to spam everyone, but we don’t have a list dedicated to everyone who does rendering projects across our labs. Maybe we should either create a mailing list or start a Slack channel focused on rendering.<span class="gmail_default" style="font-family:arial,helvetica,sans-serif"> Maybe one of the students can help with that process.</span>)<br><br><b>SAMPLING</b><br>There were several good papers on sampling that we should probably read and go through. Specifically, the paper by Heitz and Belcour got the “best paper” award, and the paper by Jarosz et al. also got some attention.<span class="gmail_default" style="font-family:arial,helvetica,sans-serif"> Jarosz posted his talk here so you can watch it:
<a href="https://cs.dartmouth.edu/~wjarosz/publications/jarosz19orthogonal-slides.mp4" target="_blank">https://cs.dartmouth.edu/~wjarosz/publications/jarosz19orthogonal-slides.mp4</a> </span></div><div dir="ltr"><span class="gmail_default" style="font-family:arial,helvetica,sans-serif"><br></span></div><div dir="ltr"><span class="gmail_default" style="font-family:arial,helvetica,sans-serif">Let's read these and some of the other papers of interest in the rendering discussion group.</span><br><br><br><b>PATH GUIDING</b><br>It seems path guiding is a very hot area of research and many people are pursuing various aspects of it. Some people feel like this will be the key to the next big rendering algorithm that <span class="gmail_default" style="font-family:arial,helvetica,sans-serif">could potentially </span>transform the <span class="gmail_default" style="font-family:arial,helvetica,sans-serif">field</span>.<span class="gmail_default" style="font-family:arial,helvetica,sans-serif"> Many people that I talked to are looking into ideas similar to ones that we have either already explored or are currently pursuing.</span><br><br>@Steve: I think that our last project with Pixar was precisely starting to drill at the entrance of <span class="gmail_default" style="font-family:arial,helvetica,sans-serif">this </span>potential goldmine. It is most unfortunate that we couldn’t get it to the level needed for acceptance at a proper venue, but hopefully we can get it out soon and <span class="gmail_default" style="font-family:arial,helvetica,sans-serif">release co</span><span class="gmail_default">de that fixes the obvious limitations of the current method (e.g., properly sample the hemisphere, handle specular/glossy surfaces by properly accounting for the BRDF in the importance sampling process, etc.) so that this wo<font face="arial, helvetica, sans-serif">rk can start to get traction. 
It </font></span>seems many people are interested in these ideas<span class="gmail_default" style="font-family:arial,helvetica,sans-serif">! </span></div><div dir="ltr"><span class="gmail_default" style="font-family:arial,helvetica,sans-serif"><br></span></div><div dir="ltr"><span class="gmail_default" style="font-family:arial,helvetica,sans-serif">Of course, the main problem with this algorithm (only guiding the first bounce) is an artifact of the decision to represent the incident radiance with a hemisphere map like we did, and I don't think we are going to be able to fix that in this version of the algorithm. Hopefully we will be able to address this in Chris's project.</span><br><br>@Chris<span class="gmail_default" style="font-family:arial,helvetica,sans-serif">:</span> we need to push hard on your project because there is a sense <span class="gmail_default" style="font-family:arial,helvetica,sans-serif">among researchers in rendering that </span>we are drilling near a <span class="gmail_default" style="font-family:arial,helvetica,sans-serif">large </span>goldmine and you may have some <span class="gmail_default" style="font-family:arial,helvetica,sans-serif">serious </span>competition (e.g., Pascal Grittmann<span class="gmail_default" style="font-family:arial,helvetica,sans-serif">, Jaroslav Křivánek, and others</span>) who seem to be examining <span class="gmail_default" style="font-family:arial,helvetica,sans-serif">quite </span>similar ideas. 
This is a good sign, because it means that we are absolutely on the right track, but now it’s a race to the top to see who gets there first!<br><br>The last time I was this excited about the potential of a rendering project was in 2008-2009 when Soheil and I started working on the first general MC denoisers, and then again in 201<span class="gmail_default" style="font-family:arial,helvetica,sans-serif">2-201</span>3 when we started <span class="gmail_default" style="font-family:arial,helvetica,sans-serif">thinking about </span>using ML for denoising for the first time. <span class="gmail_default" style="font-family:arial,helvetica,sans-serif">In both cases, I felt like we were on the brink of doing something that would change the way the rendering community does things, and both ideas are now gaining huge traction in research and industry. </span></div><div dir="ltr"><br></div><div dir="ltr">I think this has the potential to be the next big thing, but we have to move <span class="gmail_default" style="font-family:arial,helvetica,sans-serif">very, very </span>fast<span class="gmail_default" style="font-family:arial,helvetica,sans-serif"> and make sure we pull out all the stops by demonstrating our algorithms on complex scenes</span>. Some of the things we need to be able to do<span class="gmail_default" style="font-family:arial,helvetica,sans-serif"> are</span>:<br><br><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">- Compare successfully (and convincingly) against standard path tracing, bidirectional path tracing, VCM, etc., and of course the current slew of path guiding methods. However, since many of these path guiding methods do online learning, they should be no problem to beat. We also need to find scenes where neither unidirectional nor bidirectional path tracing has any hope of working at all. 
We can discuss more in detail.</div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"><br></div>- Handle complex materials<span class="gmail_default" style="font-family:arial,helvetica,sans-serif">, not just basic diffuse and specular, but hopefully more complex BRDFs since a lot of the bulk of the rendering time in production goes into evaluating shaders. We need to make sure that our method accounts for these BRDFs properly when computing the PDFs for sampling so that it can handle arbitrary materials.</span></div><div dir="ltr"><font face="arial, helvetica, sans-serif"><br></font></div><div dir="ltr"><font face="arial, helvetica, sans-serif">- Keep memory in check. Although this isn’t a major issue, we need to make sure that our memory usage doesn’t balloon from our method, as usually production scenes have to deal with a large amount of memory for geometry, buffers, etc. I think your approach of culling old passes of vertices is not a bad idea, but we should also explore things like deleting vertices that are not recently used (kind of like in caching). We should also explore the full trade off between memory and quality<span class="gmail_default" style="font-family:arial,helvetica,sans-serif">, to see how much better things can get if we decide to keep every vertex (although that would consume a large amount of memory).</span></font></div><div dir="ltr"><font face="arial, helvetica, sans-serif"><br></font></div><div dir="ltr"><font face="arial, helvetica, sans-serif"><span class="gmail_default" style="font-family:arial,helvetica,sans-serif">- Demonstrate good performance on sufficiently complex scenes. See below.</span><br></font><div><font face="arial, helvetica, sans-serif"><br></font><div><span class="gmail_default" style="font-family:arial,helvetica,sans-serif">- Subsurface scattering/volumetric transport (?) 
Folks were concerned about how some </span><span style="font-family:arial,helvetica,sans-serif">bidirectional or path guiding approaches might work in these kinds of situations. It would be interesting to explore in more detail. This could be the subject of a follow-on paper<span class="gmail_default" style="font-family:arial,helvetica,sans-serif">, as it seems to me that trying to handle this in one submission could be too much.</span></span></div><div><font face="arial, helvetica, sans-serif"><br></font></div><div><font face="arial, helvetica, sans-serif"><span class="gmail_default" style="font-family:arial,helvetica,sans-serif">- Some people seemed to be skeptical about running a neural network at every bounce to do anything, say reconstruct the sampling PDF. They thought it would be too slow. I still think that is the way we should do it for the first version of the paper, say the SIGGRAPH paper, but then we can come up with a "simplified" version that we can submit to, say, EGSR, that does some quick ad hoc thing and hopefully shows comparable results.</span></font></div><div><font face="arial, helvetica, sans-serif"><br></font></div><div><font face="arial, helvetica, sans-serif"><span class="gmail_default" style="font-family:arial,helvetica,sans-serif">We can discuss this in more detail when we meet face to face next week.</span><br></font><br><br><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"><b>KEYNOTES</b></div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"><br></div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">We had three good keynote speakers. 
Here is a summary of the thesis of each talk:</div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"><br></div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">Jaakko Lehtinen (Aalto University/NVIDIA): Machine learning models do not provide semantically meaningful abstractions that allow them to leverage human-understandable constructs, such as Newton's equations of motion or Kajiya's rendering equation. How do we build systems that can bridge this gap? This was not necessarily new; we have talked about this issue ourselves in the ML discussion group and I think it is an interesting area of research. Unfortunately, there were no concrete details about how to do this in this talk.</div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"><br></div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">Marcos Fajardo (Solid Angle): Lots of pretty pictures/short films and a history of the development of the Arnold path tracing renderer. He also made some interesting comments about how only unidirectional path tracing is used in industry, and about keeping things efficient (such as only loading things once). One of the things he was the most proud of was the fact that he doesn’t use pre-processing (like photon mapping) in Arnold, which he didn't seem to think was a good thing (not only because of bias issues). So @Chris, we should consider tracing the eye rays first and then starting the iteration, so that there isn't any pre-processing. 
Let's think about what difference that makes.</div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"><br></div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">Ali Eslami (Google DeepMind): He presented his work in Science, where he trained a machine learning system that takes in a set of images and creates a scene representation as a latent vector (encoding), and then another machine learning system takes this and a camera pose and creates an image (decoder). The latter is essentially learning the rendering process, and can do occlusions, etc. It can also do some very cool "image math", such as image of blue sphere minus image of red sphere plus image of red triangle = image of blue triangle. We've discussed this paper in our group before, but we should probably schedule some time to look at it in more detail. It's a very "far out" kind of research, but it's also very exciting! In particular, I have some interesting ideas based on this kind of work that I think could significantly impact rendering. If anyone is interested in using Machine Learning to directly learn the rendering process, please talk to me. I'm looking for someone to work on this!</div><br><br><b>INDUSTRY PERSPECTIVE<br></b>I also spent some time at the conference talking with folks in production (e.g., Luca Fascione, Johannes Hanika at Weta, Marcos Fajardo at Solid Angle) trying to understand their problems better<span class="gmail_default" style="font-family:arial,helvetica,sans-serif">, what things academic researchers like us should focus on, and how the algorithms we develop can be made more portable to industry. 
We talked about various things:</span><br><br><b><span class="gmail_default" style="font-family:arial,helvetica,sans-serif">Unidirectional Path Tracing</span></b><br>It seems that everyone in the community <span class="gmail_default" style="font-family:arial,helvetica,sans-serif">currently </span>uses unidirectional path tracing<span class="gmail_default" style="font-family:arial,helvetica,sans-serif"> for the bulk of their rendering. Of course, every single system uses some kind of MC denoising method (it’s good to see that our work can have significant impact!) but surprisingly they seldom use bidirectional path tracing.</span><span style="font-family:arial,helvetica,sans-serif"><span class="gmail_default" style="font-family:arial,helvetica,sans-serif"> Specifically, t</span>hey listed several problems with bidirectional path tracing:</span></div><div><ul><li><font face="arial, helvetica, sans-serif"><span class="gmail_default" style="font-family:arial,helvetica,sans-serif">Significant increase in computation. Say you have 8 bounces each for the eye and light paths. You have to compute the PDFs of all possible interconnections (8x8 = 64 possible complete paths between them), which involves computing a lot of complex BRDFs and evaluating everything. Very expensive, especially if you take into account the following bullet points...</span></font></li><li style="text-align:center"><span style="font-family:arial,helvetica,sans-serif">Kinds of scenes. Most of the scenes they work with in production are open, not closed boxes like the Cornell box. Furthermore, the <span class="gmail_default" style="font-family:arial,helvetica,sans-serif">positions of the </span>lights often lead to paths <span class="gmail_default" style="font-family:arial,helvetica,sans-serif">that lead </span>away from the <span class="gmail_default" style="font-family:arial,helvetica,sans-serif">eye</span>. 
<span class="gmail_default" style="font-family:arial,helvetica,sans-serif">For example, a common example in production is a</span> closeup<span class="gmail_default" style="font-family:arial,helvetica,sans-serif"> </span>of a character's face <span class="gmail_default" style="font-family:arial,helvetica,sans-serif">lit by a</span> rim light from behind, like this:<span class="gmail_default" style="font-family:arial,helvetica,sans-serif"></span></span><img src="cid:ii_jy22r5pg1" alt="image.png" width="501" height="249" style="text-align: center; font-family: arial, helvetica, sans-serif;"></li></ul></div></div></div></div></div></div></div></div></div></div></div></div><blockquote style="margin:0 0 0 40px;border:none;padding:0px"><div dir="auto"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div><div>This is a very common shot<span class="gmail_default" style="font-family:arial,helvetica,sans-serif"> in production, since cinematographers like to light their characters from behind because it is more dramatic. So this means that the light paths will hit the back of the character's head, and (most often) bounce off into open space or hit parts of the scene that have nothing to do with any eye path. So bidirectional path tracing doesn't buy you anything, and it only eats up your time trying to make connections with light vertices that have bounced far behind the occluding character and will never be seen by the camera.</span></div></div></div></div></div></div></div></div></div></div></div></div></blockquote><div dir="auto"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div><ul><li><span style="font-family:arial,helvetica,sans-serif">Their scenes have a LARGE number of lights. Thousands. Many of those are nowhere near the camera, nor do they have some path to reach the camera. 
So when you start a light path in bidirectional path tracing, you have to pick a random light and then a random point on that light and trace the path, only to find that most likely it never connects with the eye path at all.<span class="gmail_default" style="font-family:arial,helvetica,sans-serif"></span></span></li><li>Fireflies. Bidirectional path tracing produces fireflies that <span class="gmail_default" style="font-family:arial,helvetica,sans-serif">often </span>unidirectional path tracing does not<span class="gmail_default" style="font-family:arial,helvetica,sans-serif"> because the light source is never seen</span>. Although there are methods to deal with them (e.g., clamping <span class="gmail_default" style="font-family:arial,helvetica,sans-serif">their values </span>during denoising), apparently they are annoying to deal with. Furthermore, the noise patterns from unidirectional path tracing are easier to handle, which is why they prefer it.<br></li></ul><div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">For these reasons, most production rendering systems rely primarily on unidirectional path tracing. Some systems include switches that allow them to turn on bidirectional path tracing in certain scenes, but more often than not unidirectional path tracing is used even though it cannot handle light transport in complex scenes.</div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"><br></div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">However, <span class="gmail_default"></span>@Chris, your project will DIRECTLY address many of these problems by using the eye vertices to guide the light transport paths. This is why it is so exciting: it will combine the benefits of bidirectional path tracing and those of the simpler, unidirectional path tracing. 
It could be a game changer.</div></div><div><br></div><span class="gmail_default" style="font-family:arial,helvetica,sans-serif"><div><span class="gmail_default" style="font-family:arial,helvetica,sans-serif"><br></span></div><b>Scenes to Render</b></span></div><div><span class="gmail_default" style="font-family:arial,helvetica,sans-serif">On a related note, I asked the folks in production about the kinds of scenes we use in academia and how they compare against the scenes used in production. </span><span style="font-family:arial,helvetica,sans-serif">What kinds of scenes should we be using in order to demonstrate that our algorithms would work in practical scenes that are used in production?<span class="gmail_default" style="font-family:arial,helvetica,sans-serif"> </span></span><span style="font-family:arial,helvetica,sans-serif">Are we “overfitting” the algorithms we are working on to a bunch of scenes that are not useful in practice? </span><span style="font-family:arial,helvetica,sans-serif">In short, mostly yes! There are a few reasons for this:</span></div><div><span class="gmail_default" style="font-family:arial,helvetica,sans-serif"><div class="gmail_default"><ul><li><span class="gmail_default" style="font-family:arial,helvetica,sans-serif"><div class="gmail_default">Light oversimplification. Our scenes usually have a few lights, and maybe a single environment light. Most production shots have hundreds (if not thousands) of lights. For example, at Weta their area light sources are tessellated and have different colors per "cell", so a single area light source can easily be considered thousands of smaller lights of different colors. To deal with these complex lights, most production rendering systems build complex light hierarchies and use them for sampling. It is unclear whether they take visibility into account, or only the intensity of each light source. 
We should manually (or rather procedurally) modify our scenes to have these kinds of light sources to emulate more realistic scenes.</div></span></li><li><div class="gmail_default">Geometry. Our scenes usually have much simpler geometry than the billions of polygons found in typical production scenes. For this reason, Disney has released the Moana island scene (<a href="https://www.technology.disneyanimation.com/islandscene" target="_blank">https://www.technology.disneyanimation.com/islandscene</a>) and we should probably think about using it in our experiments more often. We should also build our own scenes that stress our systems.</div></li><li><div class="gmail_default">Shaders. Simply using "diffuse" and "specular" as our material properties is FAR removed from the state of the art in practice. If we want our work to gain the respect we want (in particular for some of our methods that are either computing the PDF for importance sampling, or modeling the "phase" function of a voxel of geometry), we need to handle more complex materials.</div></li><li><span class="gmail_default" style="font-family:arial,helvetica,sans-serif"><div class="gmail_default">I think rather than rendering empty room environments, we would do ourselves a huge favor by putting characters in these same environments and lighting them properly (see pic above) in order to get scenes that behave much closer to what you see in production environments.</div></span></li></ul></div><div class="gmail_default"><br></div></span></div><div><span class="gmail_default" style="font-family:arial,helvetica,sans-serif"><b>Bottlenecks</b></span></div><div><span class="gmail_default" style="font-family:arial,helvetica,sans-serif">I asked a lot about the performance of their systems, and got a rough breakdown of render times for a typical 10-hour render (which is not uncommon in production):</span></div><div><span class="gmail_default" style="font-family:arial,helvetica,sans-serif"><br></span></div><div><span class="gmail_default" 
style="font-family:arial,helvetica,sans-serif">30% tracing rays [3 hrs]</span></div><div><span class="gmail_default" style="font-family:arial,helvetica,sans-serif">30% evaluating shaders [3 hrs]</span></div><div><span class="gmail_default" style="font-family:arial,helvetica,sans-serif">10% picking light sources [1 hr]</span></div><div><span class="gmail_default" style="font-family:arial,helvetica,sans-serif">20% Texturing, other memory access [2 hrs]</span></div><div><span class="gmail_default" style="font-family:arial,helvetica,sans-serif">10% everything else (e.g., writing buffers, file I/O, scene loading) [1 hr]</span></div><div><br></div><div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">Or something like that. Note that many of the renderers do some kind of deferred shading where they output the texture coordinate lookups and then do the memory fetch all at once at the end to reduce memory bandwidth.</div></div><div><font face="arial, helvetica, sans-serif"><span class="gmail_default" style="font-family:arial,helvetica,sans-serif"><br></span></font></div><div><font face="arial, helvetica, sans-serif"><span class="gmail_default" style="font-family:arial,helvetica,sans-serif">@Chris, your project would at least help address 40% of this work load (tracing rays and picking light sources), but as a group we should also think about ways of evaluating shaders more efficiently. Perhaps we should revisit an approach that @Steve explored in one of his internships at Pixar, which tried to use machine learning to simplify the shading/texturing process. I still think there is value here, perhaps by encoding the shader in a latent vector and using that at run-time. 
We should discuss at some point.</span></font></div><div><br></div><div><span class="gmail_default" style="font-family:arial,helvetica,sans-serif"><b>How they get low variance</b></span></div><div><span class="gmail_default" style="font-family:arial,helvetica,sans-serif">Most production shots are rendered at 128-256 spp, and some even at 64 spp (Marcos played a nice short movie that he claimed was rendered at 64 spp!). Of course, most of them are denoised, but even without denoising they still exhibit an extremely low amount of variance for an MC rendering system. I probed them quite a bit about what kinds of things they found to have the biggest impact on variance reduction. Here is what they said:</span><br><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"><ul><li>Direct lighting! The biggest source of noise is the direct lighting. They spend quite a bit of time (10% according to the breakdown above) building the light hierarchy and figuring out which of the light sources they need to sample from to ensure that they are mostly sampling the brightest sources in the scene.</li><li>Good sampling patterns. In his talk, Marcos admitted that they could not use QMC patterns for patent reasons, but they found good sampling patterns that worked very well.</li><li>Multiple Importance Sampling (MIS). This is one of the keys to getting reduced variance, but getting good weighting schemes is tricky. If @Chris's method is successful, how much would we need this?</li></ul></div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"><br></div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"><b>They don't render shots</b></div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">Production rendering systems don't usually render entire shots, just individual frames. 
I pressed them on this point and they said that they don't have a specific reason for doing this, but that's the way they typically do it. Weta folks did admit they had a secret project they couldn't talk about where they are working on rendering a shot all together. Personally, I think we should move towards rendering entire shots; it just makes sense. For example, in your project @Chris we can leverage vertex information in neighboring frames to further inform the process, not just the ones that are neighboring in space. I think this should really make a huge difference, because if you think about it, in a scene where the camera is moving but the scene is not, the light transport will not change from frame to frame, so information from one frame will be completely useful in another. When scene objects start moving around this breaks down a bit, but still there will be considerable coherence. So it will still be tremendously useful.</div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"><br></div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">Overall, I feel that THIS IS AN UNDEREXPLORED ASPECT OF RENDERING, one that I have been clamoring to see someone work on for the past 10 years. So far, however, except for a little bit of temporal coherence in the MC denoising paper, no one has really taken me up on the challenge and taken this head on. I think this would make a huge difference.</div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"></div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">In any event, this is all I have time to core dump for now. 
We can discuss more at a meeting if folks want to discuss it further.</div><div><br></div><span class="gmail_default" style="font-family:arial,helvetica,sans-serif"></span><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">Best,</div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"><br></div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">-Pradeep</div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"></div><br><br><br>---<br>Pradeep Sen<br>Professor<br>UCSB MIRAGE Lab<br>Dept. of Electrical & Computer Engineering<br>University of California, Santa Barbara<br>Santa Barbara, CA 93106-9560<br></div></div>
</div></div></div></div></div></div></div></div></div></div>