<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Der Schmale &#8211; Real-time 3D programming</title>
	<atom:link href="https://www.derschmale.com/feed/" rel="self" type="application/rss+xml" />
	<link>https://www.derschmale.com</link>
	<description>David Lenaerts - Freelance graphics programmer</description>
	<lastBuildDate>Thu, 20 Jul 2017 19:29:52 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.9.4</generator>
	<item>
		<title>Putting my Helix 3D engine code online (JS/WebGL)</title>
		<link>https://www.derschmale.com/2017/07/20/putting-my-helix-3d-engine-code-online-jswebgl/</link>
					<comments>https://www.derschmale.com/2017/07/20/putting-my-helix-3d-engine-code-online-jswebgl/#respond</comments>
		
		<dc:creator><![CDATA[David]]></dc:creator>
		<pubDate>Thu, 20 Jul 2017 19:29:52 +0000</pubDate>
				<category><![CDATA[Graphics]]></category>
		<category><![CDATA[Helix]]></category>
		<category><![CDATA[WebGL]]></category>
		<guid isPermaLink="false">http://www.derschmale.com/?p=1076</guid>

					<description><![CDATA[It&#8217;s been a while! Over the past couple of years, I&#8217;ve been working off and on (and more off than on) on a playground 3D engine called Helix. Starting as a WebGL port of a personal C++ engine I did and after rewriting it umpteen times, it ended up being a platform I do like&#8230;]]></description>
										<content:encoded><![CDATA[<p><img fetchpriority="high" decoding="async" class="size-large wp-image-1077 aligncenter" src="http://www.derschmale.com/blog/wp-content/sponza-1024x511.jpg" alt="" width="700" height="349" srcset="https://www.derschmale.com/blog/wp-content/sponza-1024x511.jpg 1024w, https://www.derschmale.com/blog/wp-content/sponza-300x150.jpg 300w, https://www.derschmale.com/blog/wp-content/sponza-768x383.jpg 768w, https://www.derschmale.com/blog/wp-content/sponza.jpg 1677w" sizes="(max-width: 700px) 100vw, 700px" /></p>
<p>It&#8217;s been a while! Over the past couple of years, I&#8217;ve been working off and on (and more off than on) on a playground 3D engine called Helix. Starting as a WebGL port of a personal C++ engine I did and after rewriting it umpteen times, it ended up being a platform I do like to play around with. Also, the term &#8220;for shits and giggles&#8221; comes to mind.</p>
<p>At this point, it comes with a long list of disclaimers. It&#8217;s not optimised very much, the API is probably a bit different from what you&#8217;d expect, and most importantly: I made it for <em>me</em>, not <em>you </em>;) I have no interest or motivation to compete with existing 3D engines and offer support for things that I&#8217;m not personally working on myself.</p>
<p>But in the spirit of sharing, I decided to make the code public. This way, I can more easily put some shader experiments online and as such it may serve an educational purpose. (Much like my Flash-based engine &#8220;Wick3d&#8221; from 10 years ago ;) )</p>
<p>Anyway, code and some documentation are here:</p>
<ul>
<li><a href="https://github.com/DerSchmale/helixjs">Github</a> (code + wiki)</li>
<li><a href="http://www.derschmale.com/helix/docs/helix-core/">Class reference</a></li>
</ul>
<p>And some examples as coded by a coder (not all are optimised):</p>
<ul>
<li><a href="http://derschmale.com/helix/examples/primitives/">Primitives</a></li>
<li><a href="http://derschmale.com/helix/examples/pbr/">Physically based materials</a></li>
<li><a href="http://derschmale.com/helix/examples/specular-properties/">Specular properties</a></li>
<li><a href="http://derschmale.com/helix/examples/env-map-equirectangular/">Glossy reflections</a></li>
</ul>
<p>Some for desktop only (using WASD + mouse interaction), not optimised at all!</p>
<ul>
<li><a href="http://derschmale.com/helix/examples/blue-marble/">Blue marble</a></li>
<li><a href="http://derschmale.com/helix/examples/sponza-obj/">Sponza</a></li>
<li><a href="http://derschmale.com/helix/examples/terrain/">Terrain</a></li>
</ul>
<p>And finally, a tongue-in-cheek nod to my partners in crime <a href="http://www.dasprinzip.com/">Frank Reitberger</a> and <a href="http://www.barradeau.com/">Nicolas Barradeau</a>!</p>
<ul>
<li><a href="http://derschmale.com/lab/ah80s/">Amazing Horse!</a></li>
</ul>
<p>But more about those guys at some other time soon :)</p>
<p><img decoding="async" class="alignnone size-large wp-image-1078" src="http://www.derschmale.com/blog/wp-content/terrain-1024x416.jpg" alt="" width="700" height="284" srcset="https://www.derschmale.com/blog/wp-content/terrain-1024x416.jpg 1024w, https://www.derschmale.com/blog/wp-content/terrain-300x122.jpg 300w, https://www.derschmale.com/blog/wp-content/terrain-768x312.jpg 768w, https://www.derschmale.com/blog/wp-content/terrain.jpg 1677w" sizes="(max-width: 700px) 100vw, 700px" /></p>
<p>&nbsp;</p>
]]></content:encoded>
					
					<wfw:commentRss>https://www.derschmale.com/2017/07/20/putting-my-helix-3d-engine-code-online-jswebgl/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>Project: WebGL Porsche 911 Showcase</title>
		<link>https://www.derschmale.com/2015/10/18/project-webgl-porsche-911-showcase/</link>
					<comments>https://www.derschmale.com/2015/10/18/project-webgl-porsche-911-showcase/#comments</comments>
		
		<dc:creator><![CDATA[David]]></dc:creator>
		<pubDate>Sun, 18 Oct 2015 14:52:41 +0000</pubDate>
				<category><![CDATA[Projects]]></category>
		<category><![CDATA[WebGL]]></category>
		<category><![CDATA[3D]]></category>
		<category><![CDATA[911]]></category>
		<category><![CDATA[Porsche]]></category>
		<category><![CDATA[rendering]]></category>
		<category><![CDATA[Shaders]]></category>
		<category><![CDATA[Showcase]]></category>
		<category><![CDATA[ThreeJS]]></category>
		<guid isPermaLink="false">http://www.derschmale.com/?p=1041</guid>

					<description><![CDATA[I don&#8217;t really get to post much about actual projects for a couple of reasons. My work is usually behind the scenes graphics coding, which typically results in posts about the techniques rather than the projects themselves. In my last project, a showcase project for the new Porsche 911 with the German agency UDG, I was the user of a 3D&#8230;]]></description>
										<content:encoded><![CDATA[<p><a href="http://www.porsche.com/countries/911/" target="_blank"><img decoding="async" class="aligncenter wp-image-1042 size-large" src="http://www.derschmale.com/blog/wp-content/porschaaaaah-1024x618.jpg" alt="Porsche 911 Showcase" width="700" height="422" srcset="https://www.derschmale.com/blog/wp-content/porschaaaaah-1024x618.jpg 1024w, https://www.derschmale.com/blog/wp-content/porschaaaaah-300x181.jpg 300w, https://www.derschmale.com/blog/wp-content/porschaaaaah.jpg 1113w" sizes="(max-width: 700px) 100vw, 700px" /></a></p>
<p>I don&#8217;t really get to post much about actual projects for a couple of reasons. My work is usually behind the scenes graphics coding, which typically results in posts about the techniques rather than the projects themselves. In my last project, a showcase project for the new Porsche 911 with the German agency <a href="https://www.udg.de/" target="_blank">UDG</a>, I was the <em>user</em> of a 3D engine for a change. Focusing on how things look rather than how things work was a nice change of pace. Furthermore, I was lucky to work together with two close friends: <a href="http://www.dasprinzip.com/" target="_blank">Frank Reitberger</a>, taking the reins of our sub-team and catching the inter-team blows, and <a href="http://simppa.fi/" target="_blank">Simo Santavirta</a> who worked on a lot of the playful background stuff, animations, and so on. Maybe they&#8217;ll put blog posts online about their parts, but I&#8217;ll just focus on my contributions here.</p>
<p>First of all: <a href="http://www.porsche.com/countries/911/" target="_blank">check out the project here</a>!</p>
<p>My tasks (the ones I want to talk about anyway, no one cares about 100 iterations of model imports and texture compression) were mainly shader and engine-oriented: materials, reflections, etc. The engine in question is Mr. Doob&#8217;s ever popular <a href="http://threejs.org/" target="_blank">Three.js</a>. In what follows, I&#8217;ll explain some of the things I did in the projects in words and concepts, not code. If anyone wants to know more about some aspect or other, just let me know.</p>
<p>&nbsp;</p>
<p><strong>Project overview</strong></p>
<p>The project itself is a 5-chapter showcase for Porsche&#8217;s latest 911 models, showing off some facets they seem to be pretty proud of: design, performance (showing off the engine), driving (showing off the wheels/axles), some weird things the headlights do when turning, all that jazz! Parts of the site also had to run on newer mobile devices.</p>
<p>There were a couple of immediate challenges and we had to give the 3D modellers a really hard time to get poly &amp; draw call count down as much as possible, as well as the amount and sizes of the textures.</p>
<p>(Oh, and I&#8217;ll admit it, I know <em>nothing</em> about cars. I don&#8217;t even have a driver&#8217;s license, nor do I want one. So yeah, most communication happened as &#8220;that springy thingy&#8221; or &#8220;that punchy thing inside the engine&#8221;. Since UDG is a German company, I did pick up on some great vocabulary. The winner? <em>&#8220;Auspuff&#8221;</em>, meaning <em>&#8220;exhaust pipe&#8221;</em> :&#8217;) Anyway&#8230; moving on.)</p>
<p>&nbsp;</p>
<p><strong>Custom work is more fun</strong></p>
<p>Most of the work we had to do, even if you can&#8217;t tell by looking at it, required a degree of custom work. We hacked the three.js codebase in places in order to splice in our changes (I can&#8217;t say I generally like being limited to out of the box stuff, and neither should you). The materials were all custom-built so we had full control over lighting models, which type of lights to use depending on the material, baked maps, custom reflections, and weird animation code.</p>
<div id="attachment_1045" style="width: 306px" class="wp-caption alignright"><a href="http://www.derschmale.com/blog/wp-content/flares.jpg"><img loading="lazy" decoding="async" aria-describedby="caption-attachment-1045" class="size-medium wp-image-1045" src="http://www.derschmale.com/blog/wp-content/flares-300x195.jpg" alt="Lens flares" width="300" height="195" srcset="https://www.derschmale.com/blog/wp-content/flares-300x195.jpg 300w, https://www.derschmale.com/blog/wp-content/flares.jpg 488w" sizes="auto, (max-width: 300px) 100vw, 300px" /></a><p id="caption-attachment-1045" class="wp-caption-text">Lens flares not cut off by geometry</p></div>
<p>To give an example of the less obvious: small lights, lens flares or highlighted car parts are made by quads that always point towards the camera. When done manually with default code, these quads would intersect with the car&#8217;s geometry and not be visible (unlike an actual flare which scatters inside the lens). So the quad should be in front of regular geometry, but still fade out depending on how much geometry occludes the light itself. Rather than doing expensive occlusion tests like a default lens flare (there is code for that in the examples repository), we managed to make these things work by changing the vertex shader&#8217;s depth value and some algebra. It&#8217;s not <em>perfect</em>, some cut-off still occurs, but it works well enough given some patience to tweak the numbers since it&#8217;s not a complex occlusion situation (and it&#8217;s much more performant).</p>
<p>Similar tricks were used to get some of the transition animations to work: changing wheels in the showroom required some depth buffer trickery to make them morph into each other nicely.</p>
<p>&nbsp;</p>
<p><strong>Materials</strong></p>
<p>Most of the material shaders were built keeping physical plausibility in mind. Given the limitations of WebGL and not being able to use some extensions, we couldn&#8217;t go all the way with this. Without floating point textures there was no HDR to work with, so we solved some things by, for example, simply scaling environment map values. All of the materials do have fresnel-based BRDFs with normalized distribution functions (we mostly avoided geometric self-shadowing or foreshortening terms for performance reasons). Expecting limited overdraw, we used Three&#8217;s forward renderer which gave us a lot of flexibility to tweak lighting models and materials as required for the surfaces. The scene was relatively static, so all shadows are just baked light- and ambient occlusion maps.</p>
<p>All materials except for the very rough ones (where it would be a nearly invisible waste of resources) use an environment map. We couldn&#8217;t rely on the <em>EXT_shader_texture_lod</em> extension so a mip-chain to handle different roughnesses was out of the question. Instead, we settled for 3 separate environment maps. The largest one for very smooth surfaces was one that&#8217;s updated in real time to represent the actual environment. The two others, for different degrees of roughness, were baked convolved cube maps. These were generated using <a href="https://www.knaldtech.com/lys/" target="_blank">Knald&#8217;s Lys</a>, a tool I&#8217;ve grown very fond of. When required, the environment map was assigned a size and position in the shader. That way, we could calculate where the reflection ray intersects the reflection cube, resulting in much more locally correct reflections, which is especially important for the many flat surfaces we were dealing with.</p>
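<p>To make the &#8220;size and position&#8221; idea concrete, here is a minimal sketch of the ray/box intersection involved. It is written in plain Python rather than GLSL so it can be run standalone, and it is not the project&#8217;s shader code: the function name <em>box_projected_lookup</em> and its parameters are hypothetical, and it assumes an axis-aligned box with the shaded point inside it.</p>

```python
import math


def box_projected_lookup(position, reflection, box_center, box_half_size):
    """Return the cube map lookup direction for a locally corrected reflection.

    Assumptions (not from the original post): the environment is an
    axis-aligned box, `position` lies inside it, and `reflection` is non-zero.
    """
    nearest = math.inf
    for p, r, c, h in zip(position, reflection, box_center, box_half_size):
        if r == 0.0:
            continue  # ray is parallel to this pair of walls
        # Pick the wall the ray is travelling towards on this axis.
        wall = c + h if r > 0.0 else c - h
        nearest = min(nearest, (wall - p) / r)
    # Point where the reflection ray leaves the box.
    hit = [p + r * nearest for p, r in zip(position, reflection)]
    # The cube map is sampled with the direction from the box centre to that point.
    return [x - c for x, c in zip(hit, box_center)]


# A point off-centre reflecting straight up no longer samples straight up:
print(box_projected_lookup((1.0, 0.0, 0.0), (0.0, 1.0, 0.0),
                           (0.0, 0.0, 0.0), (2.0, 1.0, 2.0)))
```

<p>Without the correction, the lookup direction would simply be the reflection vector itself, which is only right for an infinitely distant environment.</p>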
<div id="attachment_1043" style="width: 306px" class="wp-caption alignright"><a href="http://www.derschmale.com/blog/wp-content/duotone.jpg"><img loading="lazy" decoding="async" aria-describedby="caption-attachment-1043" class="size-medium wp-image-1043" src="http://www.derschmale.com/blog/wp-content/duotone-300x191.jpg" alt="Slight duo-tone effect (yellow/red) for colour depth." width="300" height="191" srcset="https://www.derschmale.com/blog/wp-content/duotone-300x191.jpg 300w, https://www.derschmale.com/blog/wp-content/duotone.jpg 637w" sizes="auto, (max-width: 300px) 100vw, 300px" /></a><p id="caption-attachment-1043" class="wp-caption-text">Slight duo-tone effect (yellow/red) for colour depth.</p></div>
<p>The car paint has a GGX Trowbridge-Reitz specular distribution model to get nicer highlight tails that allow for a better soft metallic look. Normals are perturbed both with a normal map and a fleck texture to get some subtle metallic flecks in there. I had hoped to be able to spend more time on the actual metallic clear-coat shader, but instead I had to adapt what we already had to match a series of Photoshopped screenshots (I had forgotten this is how 2D-oriented people like to work ;) ). The diffuse paint model supports a fresnel-based multi-layered &#8220;douchebag&#8221; paint effect, but that actually turned out to be little used except to add some depth in the paint: there&#8217;s no actual douchebag paints in the showcase. What a pity! With some tweaking and subtle use, however, it sometimes even gives a slight impression of subsurface scattering, which is always a nice extra with car paint.</p>
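<p>For reference, the GGX (Trowbridge-Reitz) distribution term in its commonly published form looks like the sketch below. It is plain Python for easy experimenting, not the shader used in the project; the function name and the choice to pass <em>alpha</em> directly (rather than deriving it from a roughness value) are assumptions.</p>

```python
import math


def ggx_distribution(n_dot_h, alpha):
    """GGX / Trowbridge-Reitz normal distribution term.

    n_dot_h: cosine between the surface normal and the half vector.
    alpha:   width of the lobe (how it maps to "roughness" is an assumption
             left to the caller).
    """
    a2 = alpha * alpha
    denom = n_dot_h * n_dot_h * (a2 - 1.0) + 1.0
    return a2 / (math.pi * denom * denom)


# Peak of the highlight versus a grazing half vector for the same lobe width.
print(ggx_distribution(1.0, 0.5), ggx_distribution(0.0, 0.5))
```

<p>The value stays well above zero even far away from the highlight peak, which is where the long &#8220;tails&#8221; come from.</p>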
<p>Other &#8220;solid&#8221; materials just use the normalized Blinn-Phong model with regular Lambertian diffuse scattering. The metallic materials of course just use specular reflections: at least an environment map and optionally including the scene lights. In this case, the albedo colour is used as the normal incident specular reflection colour. In the picture on the right, some are black metal (<em>kvlt!</em>), some are more regularly coloured, but all are metal. Apart from this, there&#8217;s also optional self-occlusion maps that can be used to darken some of the reflections in niches.</p>
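<p>As a small illustration of what &#8220;normalized&#8221; means here: the Blinn-Phong lobe gets a factor that grows with the specular power, so tighter highlights also become brighter. The sketch below uses (n + 2) / (2&pi;), one common normalization for the distribution term; which factor the project actually used is not stated, so treat that and the function name as assumptions.</p>

```python
import math


def normalized_blinn_phong(n_dot_h, power):
    """Blinn-Phong distribution term with an (n + 2) / (2 pi) normalization.

    The normalization factor is an assumption; other variants exist.
    """
    normalization = (power + 2.0) / (2.0 * math.pi)
    return normalization * max(n_dot_h, 0.0) ** power


# The peak gets brighter as the highlight gets tighter.
print(normalized_blinn_phong(1.0, 2.0), normalized_blinn_phong(1.0, 64.0))
```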
<div id="attachment_1044" style="width: 268px" class="wp-caption alignright"><a href="http://www.derschmale.com/blog/wp-content/metal.jpg"><img loading="lazy" decoding="async" aria-describedby="caption-attachment-1044" class="wp-image-1044 " src="http://www.derschmale.com/blog/wp-content/metal-300x245.jpg" alt="Metal materials" width="262" height="215" srcset="https://www.derschmale.com/blog/wp-content/metal-300x245.jpg 300w, https://www.derschmale.com/blog/wp-content/metal.jpg 585w" sizes="auto, (max-width: 262px) 100vw, 262px" /></a><p id="caption-attachment-1044" class="wp-caption-text">A bunch of metallic materials with different configurations.</p></div>
<div id="attachment_1047" style="width: 315px" class="wp-caption alignright"><a href="http://www.derschmale.com/blog/wp-content/wheels.jpg"><img loading="lazy" decoding="async" aria-describedby="caption-attachment-1047" class="wp-image-1047 " src="http://www.derschmale.com/blog/wp-content/wheels-300x206.jpg" alt="Car wheels" width="309" height="215" /></a><p id="caption-attachment-1047" class="wp-caption-text">Not quite metal, not quite plastic.</p></div>
<p>I&#8217;ve been told the car rims aren&#8217;t actually metal, but they do seem to exhibit some definite metallic reflections. To get them to look convincing &#8211; but not quite chrome-like &#8211; we used a hybrid model. Basically, it&#8217;s a somewhat regular Blinn-Phong model with normal incidence reflections boosted, while reducing diffuse reflections based on the specular boost. Not very different from changing the &#8220;metallicness&#8221; value in something like Unreal.</p>
<p>Glass materials are mostly just environment maps using the Fresnel factor as alpha with normal alpha-blending. In the case of the car windows, there&#8217;s a layer that uses multiplicative blending to darken what&#8217;s behind it before applying the environment map in a second pass. It&#8217;s considerably more realistic than doing everything in one pass with default blending.</p>
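<p>The two-pass window idea can be written out per colour channel as a tiny sketch. This is plain Python standing in for the GPU blend stages, not the project&#8217;s code: the Schlick approximation, the 0.04 base reflectance and all function names are assumptions made for illustration.</p>

```python
def schlick_fresnel(cos_theta, f0=0.04):
    """Schlick's Fresnel approximation (assumed here; f0 = 0.04 is a typical
    value for glass-like dielectrics, not a number from the post)."""
    return f0 + (1.0 - f0) * (1.0 - cos_theta) ** 5


def shade_window(background, tint, environment, cos_theta):
    """Blend one colour channel of a tinted window over the background."""
    # Pass 1: multiplicative blending darkens whatever is behind the glass.
    darkened = background * tint
    # Pass 2: regular alpha blending, with the Fresnel factor as alpha.
    alpha = schlick_fresnel(cos_theta)
    return environment * alpha + darkened * (1.0 - alpha)


# Looking straight through the glass versus at a grazing angle.
print(shade_window(1.0, 0.5, 0.8, 1.0), shade_window(1.0, 0.5, 0.8, 0.0))
```

<p>Head-on, the result is mostly the darkened background; at grazing angles the environment reflection takes over.</p>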
<p>Most of the &#8220;special effect&#8221; materials such as the highlights are simply a flat colour with fresnel-based fall-off (think rim-lighting), and additive blending. There&#8217;s also some depth offset being applied to allow overdraw of near pixels while still preserving most occlusions.</p>
<p>&nbsp;</p>
<p><strong>Floor reflections</strong></p>
<div id="attachment_1048" style="width: 710px" class="wp-caption alignnone"><img loading="lazy" decoding="async" aria-describedby="caption-attachment-1048" class="wp-image-1048 size-large" src="http://www.derschmale.com/blog/wp-content/softreflections-1024x367.jpg" alt="softreflections" width="700" height="251" srcset="https://www.derschmale.com/blog/wp-content/softreflections-1024x367.jpg 1024w, https://www.derschmale.com/blog/wp-content/softreflections-300x108.jpg 300w, https://www.derschmale.com/blog/wp-content/softreflections.jpg 1280w" sizes="auto, (max-width: 700px) 100vw, 700px" /><p id="caption-attachment-1048" class="wp-caption-text">Reflections getting softer away from the floor</p></div>
<p>One of the most striking aspects of the original mood boards were the reflections of the car and the environment on the floor. Somewhat soft reflections as in real life: perfect reflections where the objects touch but getting blurrier the further away they are from the surface. Obviously we wanted to replicate this in the project as well. There is code out there to do planar reflections in three.js, but those result in perfect mirror-like reflections. To get what we wanted, we built our own reflection renderer, much like what I did for Away3D back in the day (<a href="http://www.derschmale.com/2012/09/10/away3d-4-1-dev-dynamic-reflections/">see this</a>) with some optimizations/omissions: our reflecting plane was always aligned with the XZ plane going through the origin without the camera ever crossing it. In other words: mirror the camera vertically and render the scene to a texture. To get the distance-based soft reflections working, we had to have all the materials output the fragment&#8217;s world space Y coordinate to the alpha channel.</p>
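<p>Because the floor is the XZ plane through the origin, &#8220;mirroring the camera vertically&#8221; boils down to negating Y. A minimal sketch, with hypothetical function names and a camera described only by a position and a look-at target (an assumption, the post does not say how the camera was set up):</p>

```python
def mirror_across_floor(point):
    """Mirror a world space point across the y = 0 plane."""
    x, y, z = point
    return (x, -y, z)


def mirrored_camera(position, target):
    """Position and look-at target of the camera used for the reflection pass."""
    return mirror_across_floor(position), mirror_across_floor(target)


# A camera 2 units above the floor ends up 2 units below it.
print(mirrored_camera((0.0, 2.0, 5.0), (0.0, 1.0, 0.0)))
```

<p>Mirroring flips the handedness of the scene as seen by that camera, so triangle winding (front/back face culling) generally has to be flipped for the reflection pass.</p>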
<div id="attachment_1049" style="width: 306px" class="wp-caption alignright"><a href="http://www.derschmale.com/blog/wp-content/blurradii.jpg"><img loading="lazy" decoding="async" aria-describedby="caption-attachment-1049" class="wp-image-1049 size-medium" src="http://www.derschmale.com/blog/wp-content/blurradii-300x180.jpg" alt="Blur radii" width="300" height="180" srcset="https://www.derschmale.com/blog/wp-content/blurradii-300x180.jpg 300w, https://www.derschmale.com/blog/wp-content/blurradii.jpg 1014w" sizes="auto, (max-width: 300px) 100vw, 300px" /></a><p id="caption-attachment-1049" class="wp-caption-text">An object close to the surface should not contribute to far object&#8217;s blur radius</p></div>
<p>Using the alpha value, we could calculate an approximate distance of the reflected point to the floor, which in turn could be used in the blurring stage. That blurring worked very much like a depth-aware blur. First, the central point is sampled to figure out how far that is from the floor. This distance is used to calculate the basic blur radius. Every point that&#8217;s then sampled within the blur radius has a weight calculated based on its own distance, so we can calculate a weighted average at the end. If we didn&#8217;t do this, objects close to the surface would be included in the blur of an object further away, which should not always be the case.</p>
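<p>The weighting can be sketched in one dimension as follows. This is a simplified Python stand-in for the blur shader, not the original code: the function name, the linear radius scale and the exact weight formula (a sample&#8217;s own height relative to the centre&#8217;s, clamped to 1) are assumptions chosen to show the principle.</p>

```python
def blur_reflection(colors, heights, center, radius_per_unit_height=2.0):
    """Height-aware 1D blur of a row of reflection samples.

    colors:  one colour channel per sample.
    heights: distance of each reflected point to the floor (the value that
             was written to the alpha channel).
    """
    center_height = heights[center]
    # The further the central point is from the floor, the wider the blur.
    radius = int(round(center_height * radius_per_unit_height))
    total = 0.0
    total_weight = 0.0
    first = max(0, center - radius)
    last = min(len(colors), center + radius + 1)
    for i in range(first, last):
        if i == center:
            weight = 1.0
        else:
            # Samples close to the floor should stay sharp, so they barely
            # contribute to the blur of a point that is further away.
            weight = min(1.0, heights[i] / center_height)
        total += colors[i] * weight
        total_weight += weight
    return total / total_weight


# A bright sample touching the floor does not bleed into its blurry neighbour.
print(blur_reflection([1.0, 0.0, 0.0], [0.0, 1.0, 1.0], 1))
```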
<p>The final blurred texture is then used when rendering the floor itself, as with normal planar reflections, using the floor normals to perturb the sampled point a bit.</p>
<p>&nbsp;</p>
<p><b>Last words</b></p>
<p>I&#8217;m not sure how useful a write-up like this is, as &#8211; again &#8211; it&#8217;s not something I get to do all that often. But at least I can show an actual project!</p>
<p>In the past year or so, I&#8217;ve gotten pretty comfortable with WebGL and while it always feels like a step back from all the amazing things you can do with desktop tech (what?! no compute shaders?!), the fun is also in the limitations themselves: finding cheap solutions or approximations with what you&#8217;ve got. I&#8217;ll never be a fan of Javascript &#8211; or even Typescript &#8211; tho ;)</p>
]]></content:encoded>
					
					<wfw:commentRss>https://www.derschmale.com/2015/10/18/project-webgl-porsche-911-showcase/feed/</wfw:commentRss>
			<slash:comments>11</slash:comments>
		
		
			</item>
		<item>
		<title>Speaking at Reasons to be Creative 2015</title>
		<link>https://www.derschmale.com/2015/04/16/speaking-at-reasons-to-be-creative-2015/</link>
					<comments>https://www.derschmale.com/2015/04/16/speaking-at-reasons-to-be-creative-2015/#respond</comments>
		
		<dc:creator><![CDATA[David]]></dc:creator>
		<pubDate>Thu, 16 Apr 2015 10:52:50 +0000</pubDate>
				<category><![CDATA[Misc]]></category>
		<category><![CDATA[3D]]></category>
		<category><![CDATA[graphics]]></category>
		<category><![CDATA[Reasons to be Creative]]></category>
		<category><![CDATA[rendering]]></category>
		<category><![CDATA[Session]]></category>
		<guid isPermaLink="false">http://www.derschmale.com/?p=1034</guid>

					<description><![CDATA[Hey there! A quick update to plug the fact that I&#8217;ll be speaking again at Reasons to be Creative in Brighton, September 7 to 9! As it&#8217;s one of my all-time favourite events, I&#8217;m stoked to be representing the real-time 3D graphics programming crowd (while being incredibly humbled by the other names on the bill)! What can you expect&#8230;]]></description>
										<content:encoded><![CDATA[<p><a href="http://reasons.to"><img loading="lazy" decoding="async" class="alignright wp-image-1035 size-medium" src="http://www.derschmale.com/blog/wp-content/r2bc-300x250.jpg" alt="Reasons to be Creative" width="300" height="250" srcset="https://www.derschmale.com/blog/wp-content/r2bc-300x250.jpg 300w, https://www.derschmale.com/blog/wp-content/r2bc.jpg 336w" sizes="auto, (max-width: 300px) 100vw, 300px" /></a>Hey there!</p>
<p>A quick update to plug the fact that I&#8217;ll be speaking again at <a title="Reasons to be Creative" href="http://reasons.to/" target="_blank">Reasons to be Creative</a> in Brighton, September 7 to 9! As it&#8217;s one of my all-time favourite events, I&#8217;m stoked to be representing the real-time 3D graphics programming crowd (while being incredibly humbled by the other names on the bill)!</p>
<p>What can you expect from my talk? I haven&#8217;t settled on a topic for a full 100% <em>yet,</em> but if you follow my blog or have seen some of my previous talks, you should have an idea. Obviously, you can expect real-time 3D in some form or another. I&#8217;m playing with some ideas, but if anyone who&#8217;s coming has a specific request, I can try and incorporate it into the talk. If not, it&#8217;ll always be a good subject to talk about over a couple beers at night! :)</p>
<p>In any case, I&#8217;ll keep this post updated once I&#8217;ve submitted my session description, so check back later!</p>
<p>Hope to see you there!</p>
]]></content:encoded>
					
					<wfw:commentRss>https://www.derschmale.com/2015/04/16/speaking-at-reasons-to-be-creative-2015/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>Upcoming Talks: Crib Game Days &#038; FITC Amsterdam</title>
		<link>https://www.derschmale.com/2014/12/17/upcoming-talks-crib-game-days-fitc-amsterdam/</link>
					<comments>https://www.derschmale.com/2014/12/17/upcoming-talks-crib-game-days-fitc-amsterdam/#respond</comments>
		
		<dc:creator><![CDATA[David]]></dc:creator>
		<pubDate>Wed, 17 Dec 2014 13:40:46 +0000</pubDate>
				<category><![CDATA[Graphics]]></category>
		<category><![CDATA[Misc]]></category>
		<category><![CDATA[WebGL]]></category>
		<category><![CDATA[3D]]></category>
		<category><![CDATA[Crib Game Days]]></category>
		<category><![CDATA[FITC]]></category>
		<category><![CDATA[Javascript]]></category>
		<category><![CDATA[presentations]]></category>
		<guid isPermaLink="false">http://www.derschmale.com/?p=1030</guid>

					<description><![CDATA[Hi everyone! A quick update to plug a couple of events that have invited yours truly for a talk. Yes I know, it&#8217;s been about two years since my last real talk. I haven&#8217;t even been able to visit any conferences this year. I&#8217;m showing severe withdrawal symptoms and I&#8217;m ecstatic to be able to spend a&#8230;]]></description>
										<content:encoded><![CDATA[<p><img loading="lazy" decoding="async" class="alignright wp-image-1031 size-medium" src="http://www.derschmale.com/blog/wp-content/21011-194x300.png" alt="FITC Amsterdam 2015" width="194" height="300" srcset="https://www.derschmale.com/blog/wp-content/21011-194x300.png 194w, https://www.derschmale.com/blog/wp-content/21011.png 210w" sizes="auto, (max-width: 194px) 100vw, 194px" />Hi everyone!</p>
<p>A quick update to plug a couple of events that have invited yours truly for a talk. Yes I know, it&#8217;s been about two years since my last real talk. I haven&#8217;t even been able to <em>visit</em> any conferences this year. I&#8217;m showing severe withdrawal symptoms and I&#8217;m ecstatic to be able to spend a couple of days in such fine company again. So check out:</p>
<ul>
<li><a title="Crib Game Days" href="http://cribgamedays.org/" target="_blank">Crib Game Days</a>, January 23: Genk, Belgium</li>
<li><a title="FITC Amsterdam" href="http://fitc.ca/speaker/david-lenaerts/?event=15841" target="_blank">FITC Amsterdam</a>, February 23/24: Amsterdam, The Netherlands</li>
</ul>
<p>While different in size, both events have a great line-up, so be sure not to miss out on them! :)</p>
<p><strong>The talk</strong></p>
<p>The title of my talk will be &#8220;<strong>A Peek at the Future of 3D on the Web</strong>&#8220;. It will draw from having been quite intimate with its past and my experience with WebGL and its extensions. A while ago, I took it upon myself to take my playground DirectX engine Helix and create a JavaScript/WebGL version. Since the original Helix is DX11-based and relies on quite a few &#8216;modern things&#8217;, I wasn&#8217;t too worried about cross-platform functionality. If it ran well enough on both my desktop and laptop &#8211; both relatively capable machines &#8211; it&#8217;s all good. I might fix everything up and make it work &#8220;everywhere&#8221; if I have the time at some point, but that&#8217;s beside the point right now :) The Helix port will function as a sort of leitmotif throughout the presentation.</p>
<p>So obviously, the talk won&#8217;t be about how to build a WebGL game that runs reliably on all platforms; quite the contrary. Of course, there will be segments that show how to improve your rendering <em>today</em>, which I&#8217;ve found oddly lacking in existing projects/engines. I wouldn&#8217;t want you to go home without being able to apply anything directly, now would I? However, the main focus will be on what the future brings: either currently available through extensions (and hence might only work on a select number of devices) or what&#8217;s being proposed but currently only exists in e.g. OpenGL/DirectX.</p>
<p><b>Apart from all that&#8230;</b></p>
<p>I&#8217;ve been taking quite a long time off from paid work to focus on learning new things: brushing up on my JavaScript/WebGL, checking out Outracks&#8217; <a title="Fuse Tools" href="http://www.fusetools.com/" target="_blank">UNO &amp; Fuse Tools</a> (looking promising, I might be showing some of this during the talk as well), some <a title="Unity" href="http://unity3d.com/" target="_blank">Unity</a> (always a pleasure), and a brief foray into Python territory. Yes, watch me embellish my LinkedIn profile so headhunters can try and hire me for completely unrelated things! I&#8217;ve even started <a title="Wayward Bound on SoundCloud" href="https://soundcloud.com/waywardbound/" target="_blank">working on music again</a> (gasp!), if you&#8217;re into post-rock/metal-ish sort of things ;)</p>
<p>Not sure if it counts as a sabbatical, but getting back to <em>actual </em>work (<a title="I Gotsta Get Paid" href="https://www.youtube.com/watch?v=kaIZWjItReI" target="_blank">I gotsta get paid!</a>) with new knowledge under my belt definitely feels gratifying. And hopefully, it will also lead to more blog posts in which I can actually show running examples ;)</p>
<p>Signing out!</p>
<p>&#8211; D</p>
]]></content:encoded>
					
					<wfw:commentRss>https://www.derschmale.com/2014/12/17/upcoming-talks-crib-game-days-fitc-amsterdam/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>Unprojections Explained</title>
		<link>https://www.derschmale.com/2014/09/28/unprojections-explained/</link>
					<comments>https://www.derschmale.com/2014/09/28/unprojections-explained/#respond</comments>
		
		<dc:creator><![CDATA[David]]></dc:creator>
		<pubDate>Sun, 28 Sep 2014 16:32:10 +0000</pubDate>
				<category><![CDATA[Graphics]]></category>
		<category><![CDATA[3D]]></category>
		<category><![CDATA[directx]]></category>
		<category><![CDATA[graphics]]></category>
		<category><![CDATA[homogeneous coordinates]]></category>
		<category><![CDATA[math]]></category>
		<category><![CDATA[opengl]]></category>
		<category><![CDATA[projection]]></category>
		<category><![CDATA[unprojection]]></category>
		<guid isPermaLink="false">http://www.derschmale.com/?p=1025</guid>

					<description><![CDATA[Recently, one of the responses to the Reconstruction Positions [&#8230;] post dealt with the unprojection of frustum corners. More specifically: with the inverted projection matrix and the final division with the coordinate. Being the lazy sod that I am on Sundays, I thought I&#8217;d quickly google it and paste a link with the explanation. Only one problem:&#8230;]]></description>
										<content:encoded><![CDATA[<p>Recently, one of the responses to the <a title="Reconstructing positions from the depth buffer pt. 2: Perspective and orthographic general case" href="http://www.derschmale.com/2014/03/19/reconstructing-positions-from-the-depth-buffer-pt-2-perspective-and-orthographic-general-case/">Reconstructing Positions [&#8230;]</a> post dealt with the unprojection of frustum corners. More specifically: with the inverted projection matrix and the final division by the <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-dfee5c980777976ae8cf6541893fb572_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#119;" title="Rendered by QuickLaTeX.com" height="8" width="13" style="vertical-align: 0px;"/> coordinate. Being the lazy sod that I am on Sundays, I thought I&#8217;d quickly google it and paste a link with the explanation. Only one problem: I couldn&#8217;t find any decent articles! At least not within a reasonable amount of time, that is. I&#8217;m sure they&#8217;re out there somewhere ;) Most people asking &#8220;How do I unproject?&#8221; or &#8220;How can I get view space positions from a screen/mouse position?&#8221;* were told to check out existing open source code and copy it. That would indeed solve the issue at hand, but if you&#8217;re anything like me, you don&#8217;t like using code you don&#8217;t truly understand. So here&#8217;s my attempt to explain (Also&#8230; Mathematical rigour? What&#8217;s that? :) ).</p>
<p><em>*This question is also addressed in the earlier-mentioned posts, but they&#8217;re geared toward shader-based post-processing and skimp over the unprojection part.</em></p>
<h3>Homogeneous coordinates</h3>
<p>In order to understand &#8220;un&#8221;-projections, it would help to know how projections work in the first place. I&#8217;ll probably be a bit too verbose in this part, but I reckon it&#8217;s good to have a proper intuitive grasp on it.</p>
<p>When working in regular 3D space, we tend to use 4D coordinates to differentiate between vectors and points (by setting the fourth &#8211; <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-dfee5c980777976ae8cf6541893fb572_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#119;" title="Rendered by QuickLaTeX.com" height="8" width="13" style="vertical-align: 0px;"/> &#8211; coordinate to 0 or 1, respectively). This lets us perform <a title="Affine transformation" href="http://en.wikipedia.org/wiki/Affine_transformation">affine transformations</a> (for example: rotations, scales, translations, and combinations thereof) with a single 4x4 matrix, without having translations affect vectors (since <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-dfee5c980777976ae8cf6541893fb572_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#119;" title="Rendered by QuickLaTeX.com" height="8" width="13" style="vertical-align: 0px;"/> == 0, the translation component will be nullified). If you&#8217;re not rolling your eyes at this point because I&#8217;m stating the obvious, you should grab any book on 3D programming math and revise :)</p>
<p>Anyway, these 4D coordinates are called homogeneous coordinates. Before projecting, the homogeneous aspect doesn&#8217;t really matter because the <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-dfee5c980777976ae8cf6541893fb572_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#119;" title="Rendered by QuickLaTeX.com" height="8" width="13" style="vertical-align: 0px;"/> coordinate is hardly used. But, since we&#8217;re operating in 4D, we can do things not possible with a simple matrix in 3D, including projections. Projections are extensively covered all over the place, but let&#8217;s revisit homogeneous coordinates and how they&#8217;re relevant for this article.</p>
<p>More generally, homogeneous coordinates can be seen as an &#8220;extension&#8221; to regular <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-0ded95295f8e6e0f18d7ea1c83acadf6_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#40;&#120;&#44;&#121;&#44;&#122;&#41;" title="Rendered by QuickLaTeX.com" height="18" width="56" style="vertical-align: -4px;"/> triplets by adding said <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-dfee5c980777976ae8cf6541893fb572_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#119;" title="Rendered by QuickLaTeX.com" height="8" width="13" style="vertical-align: 0px;"/> coordinate, moving towards 4D. They map back to good old 3D as follows:</p>
<p class="ql-center-displayed-equation" style="line-height: 32px;"><span class="ql-right-eqno"> &nbsp; </span><span class="ql-left-eqno"> &nbsp; </span><img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-c45b0e824482e34ad742c661f2a4db2e_l3.png" height="32" width="174" class="ql-img-displayed-equation quicklatex-auto-format" alt="&#92;&#91; &#40;&#120;&#39;&#44;&#32;&#121;&#39;&#44;&#32;&#122;&#39;&#41;&#32;&#61;&#32;&#40;&#92;&#102;&#114;&#97;&#99;&#123;&#120;&#125;&#123;&#119;&#125;&#44;&#32;&#92;&#102;&#114;&#97;&#99;&#123;&#121;&#125;&#123;&#119;&#125;&#44;&#32;&#92;&#102;&#114;&#97;&#99;&#123;&#122;&#125;&#123;&#119;&#125;&#41; &#92;&#93;" title="Rendered by QuickLaTeX.com"/></p>
<p>This is a <em>projection</em> from 4D to 3D. This is in fact also used to &#8220;rearrange&#8221; the z coordinate for perspective projections to get the divide-by-z, but you can find that explained in any proper 3D book as well. You&#8217;ll see that any scalar multiple of homogeneous coordinates will project to the same 3D point. For example, a point <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-03753a417e0235eae51a1a63ac25ac30_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#92;&#108;&#97;&#109;&#98;&#100;&#97;&#32;&#40;&#120;&#44;&#32;&#121;&#44;&#32;&#122;&#44;&#32;&#119;&#41;&#32;&#61;&#32;&#40;&#92;&#108;&#97;&#109;&#98;&#100;&#97;&#32;&#120;&#44;&#32;&#92;&#108;&#97;&#109;&#98;&#100;&#97;&#32;&#121;&#44;&#32;&#92;&#108;&#97;&#109;&#98;&#100;&#97;&#32;&#122;&#44;&#32;&#92;&#108;&#97;&#109;&#98;&#100;&#97;&#32;&#119;&#41;" title="Rendered by QuickLaTeX.com" height="18" width="233" style="vertical-align: -4px;"/></p>
<p class="ql-center-displayed-equation" style="line-height: 37px;"><span class="ql-right-eqno"> &nbsp; </span><span class="ql-left-eqno"> &nbsp; </span><img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-1338cb1fafbbaee0ddb3a7266e2f76be_l3.png" height="37" width="310" class="ql-img-displayed-equation quicklatex-auto-format" alt="&#92;&#91; &#40;&#92;&#102;&#114;&#97;&#99;&#123;&#92;&#108;&#97;&#109;&#98;&#100;&#97;&#32;&#120;&#125;&#123;&#92;&#108;&#97;&#109;&#98;&#100;&#97;&#32;&#119;&#125;&#44;&#32;&#92;&#102;&#114;&#97;&#99;&#123;&#92;&#108;&#97;&#109;&#98;&#100;&#97;&#32;&#121;&#125;&#123;&#92;&#108;&#97;&#109;&#98;&#100;&#97;&#32;&#119;&#125;&#44;&#32;&#92;&#102;&#114;&#97;&#99;&#123;&#92;&#108;&#97;&#109;&#98;&#100;&#97;&#32;&#122;&#125;&#123;&#92;&#108;&#97;&#109;&#98;&#100;&#97;&#32;&#119;&#125;&#41;&#32;&#61;&#32;&#40;&#92;&#102;&#114;&#97;&#99;&#123;&#120;&#125;&#123;&#119;&#125;&#44;&#32;&#92;&#102;&#114;&#97;&#99;&#123;&#121;&#125;&#123;&#119;&#125;&#44;&#32;&#92;&#102;&#114;&#97;&#99;&#123;&#122;&#125;&#123;&#119;&#125;&#41;&#32;&#61;&#32;&#40;&#120;&#39;&#44;&#32;&#121;&#39;&#44;&#32;&#122;&#39;&#41; &#92;&#93;" title="Rendered by QuickLaTeX.com"/></p>
<p>So, we see that a homogeneous point <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-f58d9de2a86c90953239c4bcd6dcb593_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#92;&#116;&#101;&#120;&#116;&#98;&#102;&#123;&#112;&#125;" title="Rendered by QuickLaTeX.com" height="11" width="11" style="vertical-align: -3px;"/> and <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-fa483b8b46649d6e00ec0e01ca01c471_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#92;&#108;&#97;&#109;&#98;&#100;&#97;&#32;&#92;&#116;&#101;&#120;&#116;&#98;&#102;&#123;&#112;&#125;" title="Rendered by QuickLaTeX.com" height="16" width="21" style="vertical-align: -3px;"/> represent the same 3D point, and we call scalar multiples of homogeneous points <em>equivalent:</em></p>
<p class="ql-center-displayed-equation" style="line-height: 18px;"><span class="ql-right-eqno"> &nbsp; </span><span class="ql-left-eqno"> &nbsp; </span><img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-fca7a4a0897948c632040bda4d4db959_l3.png" height="18" width="190" class="ql-img-displayed-equation quicklatex-auto-format" alt="&#92;&#91; &#40;&#120;&#44;&#32;&#121;&#44;&#32;&#122;&#44;&#32;&#119;&#41;&#32;&#92;&#101;&#113;&#117;&#105;&#118;&#32;&#92;&#108;&#97;&#109;&#98;&#100;&#97;&#32;&#40;&#120;&#44;&#32;&#121;&#44;&#32;&#122;&#44;&#32;&#119;&#41; &#92;&#93;" title="Rendered by QuickLaTeX.com"/></p>
<p>We&#8217;re used to working with the subset where <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-fa10d62255303ad15423cb226b1100c0_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#119;&#32;&#61;&#32;&#49;" title="Rendered by QuickLaTeX.com" height="13" width="45" style="vertical-align: -1px;"/>. I&#8217;m not sure if there&#8217;s an actual name for this set, but let&#8217;s call it the <em>principal </em>representation of the point to make things easier to explain (that&#8217;s right, I&#8217;m <em>coining</em> things here!). This is almost always the representation we want in the end.</p>
<p>A final note about when <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-a49962254fa81e88032c8b317a3a5962_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#119;&#32;&#61;&#32;&#48;" title="Rendered by QuickLaTeX.com" height="12" width="46" style="vertical-align: 0px;"/>. These points are called <em>ideal points</em>, and have some practical applications which we don&#8217;t need to concern ourselves with here. Multiplied with a scalar, an ideal point remains an ideal point. Furthermore, they are projected at infinity (division by 0). They don&#8217;t correspond to proper 3D points, which is the basis of why we can use them to represent vectors. But since we&#8217;re just dealing with points from now on, let&#8217;s let it rest at that :)</p>
<p>Check your 3D math books chapter again on (perspective) projection, and you should have a better idea of how the homogeneous coordinates function theoretically beyond &#8220;divide by <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-dfee5c980777976ae8cf6541893fb572_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#119;" title="Rendered by QuickLaTeX.com" height="8" width="13" style="vertical-align: 0px;"/> for perspective foreshortening&#8221;. In any case, the important part here is this: <em>scalar multiples of homogeneous coordinates represent the same 3D point.</em></p>
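<p>A minimal sketch of that equivalence, in Python purely for illustration. The helper names <code>to_cartesian</code> and <code>scale</code> are made up for this example; the only thing it relies on is the divide-by-w mapping given above.</p>

```python
def to_cartesian(p):
    """Map a homogeneous (x, y, z, w) tuple to its 3D point by dividing by w."""
    x, y, z, w = p
    if w == 0.0:
        # Ideal points have no finite 3D position.
        raise ValueError("ideal point (w == 0) cannot be mapped to 3D")
    return (x / w, y / w, z / w)


def scale(p, s):
    """Multiply every component of a homogeneous coordinate by the scalar s."""
    return tuple(s * c for c in p)


p = (2.0, 4.0, 6.0, 2.0)
print(to_cartesian(p))              # (1.0, 2.0, 3.0)
print(to_cartesian(scale(p, 4.0)))  # (1.0, 2.0, 3.0), the same 3D point
```

<p>Both calls print the same triplet, which is all the equivalence says: the scalar cancels in the division.</p>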
<h3>Unprojecting</h3>
<p>Your usual everyday projection happens as follows:</p>
<ol>
<li>Provide a point in view space (principal representation, <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-fa10d62255303ad15423cb226b1100c0_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#119;&#32;&#61;&#32;&#49;" title="Rendered by QuickLaTeX.com" height="13" width="45" style="vertical-align: -1px;"/>).</li>
<li>Multiply with the projection matrix: this yields a homogeneous coordinate with non-principal representation.</li>
<li>Divide by <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-dfee5c980777976ae8cf6541893fb572_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#119;" title="Rendered by QuickLaTeX.com" height="8" width="13" style="vertical-align: 0px;"/> to get the projected point in principal representation (the GPU does this for you for the vertex shader&#8217;s position output). This yields normalized device coordinates (NDC).</li>
</ol>
<p class="ql-center-displayed-equation" style="line-height: 16px;"><span class="ql-right-eqno"> &nbsp; </span><span class="ql-left-eqno"> &nbsp; </span><img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-ba4b5ea731ce6c5a5419e855254c5e20_l3.png" height="16" width="121" class="ql-img-displayed-equation quicklatex-auto-format" alt="&#92;&#91; &#92;&#116;&#101;&#120;&#116;&#98;&#102;&#123;&#112;&#125;&#95;&#123;&#104;&#111;&#109;&#125;&#32;&#61;&#32;&#77;&#32;&#92;&#116;&#101;&#120;&#116;&#98;&#102;&#123;&#112;&#125;&#95;&#123;&#118;&#105;&#101;&#119;&#125; &#92;&#93;" title="Rendered by QuickLaTeX.com"/></p>
<p class="ql-center-displayed-equation" style="line-height: 35px;"><span class="ql-right-eqno"> &nbsp; </span><span class="ql-left-eqno"> &nbsp; </span><img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-88b54cc06b0a09876208ad09da3bb23b_l3.png" height="35" width="100" class="ql-img-displayed-equation quicklatex-auto-format" alt="&#92;&#91; &#92;&#116;&#101;&#120;&#116;&#98;&#102;&#123;&#112;&#125;&#95;&#123;&#110;&#100;&#99;&#125;&#32;&#61;&#32;&#92;&#102;&#114;&#97;&#99;&#123;&#92;&#116;&#101;&#120;&#116;&#98;&#102;&#123;&#112;&#125;&#95;&#123;&#104;&#111;&#109;&#125;&#125;&#123;&#119;&#95;&#123;&#104;&#111;&#109;&#125;&#125; &#92;&#93;" title="Rendered by QuickLaTeX.com"/></p>
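<p>The three steps can be sketched as follows, again in Python rather than shader code. The OpenGL-style matrix built by <code>perspective</code> and the helper names <code>transform</code> and <code>project</code> are assumptions for this example; any projection matrix would do.</p>

```python
import math


def perspective(fov_y, aspect, near, far):
    """OpenGL-style perspective matrix (an assumption: right-handed view space, NDC z in [-1, 1])."""
    f = 1.0 / math.tan(fov_y / 2.0)
    return [
        [f / aspect, 0.0, 0.0, 0.0],
        [0.0, f, 0.0, 0.0],
        [0.0, 0.0, (far + near) / (near - far), 2.0 * far * near / (near - far)],
        [0.0, 0.0, -1.0, 0.0],
    ]


def transform(m, p):
    """Multiply a 4x4 matrix (list of rows) with a 4-component column vector."""
    return tuple(sum(m[r][c] * p[c] for c in range(4)) for r in range(4))


def project(m, p_view):
    """Steps 2 and 3: multiply with the projection matrix, then divide by w."""
    p_hom = transform(m, p_view)
    w = p_hom[3]
    return tuple(c / w for c in p_hom[:3])


projection = perspective(math.radians(60.0), 16.0 / 9.0, 0.1, 100.0)
# Step 1: a view space point in principal representation (w = 1).
print(project(projection, (0.3, -0.2, -5.0, 1.0)))
```

<p>With this particular matrix, a point on the near plane ends up at an NDC z of -1 and one on the far plane at 1.</p>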
<p>So when &#8220;unprojecting&#8221;, we want to figure out <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-c7ce7561fe0a189ccb53aaf3fea094d7_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#92;&#116;&#101;&#120;&#116;&#98;&#102;&#123;&#112;&#125;&#95;&#123;&#118;&#105;&#101;&#119;&#125;" title="Rendered by QuickLaTeX.com" height="12" width="39" style="vertical-align: -4px;"/> when we know <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-93dad3e35e173a6b44b1d97acc9ee4dd_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#92;&#116;&#101;&#120;&#116;&#98;&#102;&#123;&#112;&#125;&#95;&#123;&#110;&#100;&#99;&#125;" title="Rendered by QuickLaTeX.com" height="12" width="32" style="vertical-align: -4px;"/> *. Simple solving, right?</p>
<p><em>* You may not know the full NDC coordinates and only window <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-f2cc7fdbb9ba90fd8eac175adbd6ac10_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#40;&#120;&#44;&#32;&#121;&#41;" title="Rendered by QuickLaTeX.com" height="18" width="39" style="vertical-align: -4px;"/> coordinates, but that&#8217;s okay, see below.</em></p>
<p class="ql-center-displayed-equation" style="line-height: 12px;"><span class="ql-right-eqno"> &nbsp; </span><span class="ql-left-eqno"> &nbsp; </span><img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-b67d73d8f85c7a585f6e8e315ab157a5_l3.png" height="12" width="136" class="ql-img-displayed-equation quicklatex-auto-format" alt="&#92;&#91; &#92;&#116;&#101;&#120;&#116;&#98;&#102;&#123;&#112;&#125;&#95;&#123;&#104;&#111;&#109;&#125;&#32;&#61;&#32;&#92;&#116;&#101;&#120;&#116;&#98;&#102;&#123;&#112;&#125;&#95;&#123;&#110;&#100;&#99;&#125;&#32;&#119;&#95;&#123;&#104;&#111;&#109;&#125; &#92;&#93;" title="Rendered by QuickLaTeX.com"/></p>
<p class="ql-center-displayed-equation" style="line-height: 21px;"><span class="ql-right-eqno"> &nbsp; </span><span class="ql-left-eqno"> &nbsp; </span><img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-8186c1c34a37d43f2ddcb82f2d699b71_l3.png" height="21" width="140" class="ql-img-displayed-equation quicklatex-auto-format" alt="&#92;&#91; &#92;&#116;&#101;&#120;&#116;&#98;&#102;&#123;&#112;&#125;&#95;&#123;&#118;&#105;&#101;&#119;&#125;&#32;&#61;&#32;&#77;&#94;&#123;&#45;&#49;&#125;&#32;&#92;&#116;&#101;&#120;&#116;&#98;&#102;&#123;&#112;&#125;&#95;&#123;&#104;&#111;&#109;&#125; &#92;&#93;" title="Rendered by QuickLaTeX.com"/></p>
<p>But wait, you&#8217;d need to know <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-23e96ac9b5f4e5bbaf89c3c1e6ace19c_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#119;&#95;&#123;&#104;&#111;&#109;&#125;" title="Rendered by QuickLaTeX.com" height="11" width="40" style="vertical-align: -3px;"/> to calculate <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-b4a46c8f0a341027477613ce759be18c_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#92;&#116;&#101;&#120;&#116;&#98;&#102;&#123;&#112;&#125;&#95;&#123;&#104;&#111;&#109;&#125;" title="Rendered by QuickLaTeX.com" height="12" width="38" style="vertical-align: -4px;"/>! Mission impossible, because that&#8217;s obviously part of what we&#8217;re trying to figure out! But remember, we&#8217;re dealing with homogeneous coordinates here, so we can use the equivalence property. <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-23e96ac9b5f4e5bbaf89c3c1e6ace19c_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#119;&#95;&#123;&#104;&#111;&#109;&#125;" title="Rendered by QuickLaTeX.com" height="11" width="40" style="vertical-align: -3px;"/> is a simple scalar, which means <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-b4a46c8f0a341027477613ce759be18c_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#92;&#116;&#101;&#120;&#116;&#98;&#102;&#123;&#112;&#125;&#95;&#123;&#104;&#111;&#109;&#125;" title="Rendered by QuickLaTeX.com" height="12" width="38" style="vertical-align: -4px;"/> and <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-93dad3e35e173a6b44b1d97acc9ee4dd_l3.png" class="ql-img-inline-formula quicklatex-auto-format" 
alt="&#92;&#116;&#101;&#120;&#116;&#98;&#102;&#123;&#112;&#125;&#95;&#123;&#110;&#100;&#99;&#125;" title="Rendered by QuickLaTeX.com" height="12" width="32" style="vertical-align: -4px;"/> are equivalent; they represent the same point! The matrix transformation does not affect equivalence, which means:</p>
<p class="ql-center-displayed-equation" style="line-height: 24px;"><span class="ql-right-eqno"> &nbsp; </span><span class="ql-left-eqno"> &nbsp; </span><img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-bca2715f0bbb2f50ec88afb2707730fb_l3.png" height="24" width="220" class="ql-img-displayed-equation quicklatex-auto-format" alt="&#92;&#91; &#92;&#116;&#101;&#120;&#116;&#98;&#102;&#123;&#112;&#125;&#95;&#123;&#104;&#111;&#109;&#111;&#103;&#101;&#110;&#101;&#111;&#117;&#115;&#86;&#105;&#101;&#119;&#125;&#32;&#61;&#32;&#77;&#94;&#123;&#45;&#49;&#125;&#32;&#92;&#116;&#101;&#120;&#116;&#98;&#102;&#123;&#112;&#125;&#95;&#123;&#110;&#100;&#99;&#125; &#92;&#93;" title="Rendered by QuickLaTeX.com"/></p>
<p>is a homogeneous coordinate equivalent to <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-c7ce7561fe0a189ccb53aaf3fea094d7_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#92;&#116;&#101;&#120;&#116;&#98;&#102;&#123;&#112;&#125;&#95;&#123;&#118;&#105;&#101;&#119;&#125;" title="Rendered by QuickLaTeX.com" height="12" width="39" style="vertical-align: -4px;"/>. The last thing to do is map that back to the principal representation and we have the correct result:</p>
<p class="ql-center-displayed-equation" style="line-height: 40px;"><span class="ql-right-eqno"> &nbsp; </span><span class="ql-left-eqno"> &nbsp; </span><img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-d48ff049e0a16c62f8bdc2f1bf143c38_l3.png" height="40" width="195" class="ql-img-displayed-equation quicklatex-auto-format" alt="&#92;&#91; &#92;&#116;&#101;&#120;&#116;&#98;&#102;&#123;&#112;&#125;&#95;&#123;&#118;&#105;&#101;&#119;&#125;&#32;&#61;&#32;&#92;&#102;&#114;&#97;&#99;&#123;&#92;&#116;&#101;&#120;&#116;&#98;&#102;&#123;&#112;&#125;&#95;&#123;&#104;&#111;&#109;&#111;&#103;&#101;&#110;&#101;&#111;&#117;&#115;&#86;&#105;&#101;&#119;&#125;&#125;&#123;&#119;&#95;&#123;&#104;&#111;&#109;&#111;&#103;&#101;&#110;&#101;&#111;&#117;&#115;&#86;&#105;&#101;&#119;&#125;&#125; &#92;&#93;" title="Rendered by QuickLaTeX.com"/></p>
<p>To recap, unprojection happens as follows:</p>
<ol>
<li>Provide an NDC coordinate.</li>
<li>Multiply with the inverse projection matrix, yielding a homogeneous coordinate equivalent to the view position.</li>
<li>Divide by <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-dfee5c980777976ae8cf6541893fb572_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#119;" title="Rendered by QuickLaTeX.com" height="8" width="13" style="vertical-align: 0px;"/> to get the principal representation of the view position.</li>
</ol>
<p>This should at least explain what&#8217;s going on in the <a title="Reconstructing positions from the depth buffer pt. 2: Perspective and orthographic general case" href="http://www.derschmale.com/2014/03/19/reconstructing-positions-from-the-depth-buffer-pt-2-perspective-and-orthographic-general-case/">position reconstruction post</a>. The coordinates unprojected there are the NDC coordinates corresponding to the frustum corners.</p>
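<p>As a rough round-trip sketch of the recap (Python, for illustration only): project a view space point, then unproject the resulting NDC coordinate with the inverse matrix and divide by w. The OpenGL-style <code>perspective</code> matrix and the small Gauss-Jordan <code>invert</code> helper are assumptions for this example; in practice you would use whatever matrix library you already have.</p>

```python
import math


def perspective(fov_y, aspect, near, far):
    """OpenGL-style perspective matrix (assumed convention for this sketch)."""
    f = 1.0 / math.tan(fov_y / 2.0)
    return [
        [f / aspect, 0.0, 0.0, 0.0],
        [0.0, f, 0.0, 0.0],
        [0.0, 0.0, (far + near) / (near - far), 2.0 * far * near / (near - far)],
        [0.0, 0.0, -1.0, 0.0],
    ]


def transform(m, p):
    """Multiply a 4x4 matrix (list of rows) with a 4-component column vector."""
    return tuple(sum(m[r][c] * p[c] for c in range(4)) for r in range(4))


def invert(m):
    """Invert a 4x4 matrix with Gauss-Jordan elimination and partial pivoting."""
    n = 4
    a = [list(m[r]) + [1.0 if r == c else 0.0 for c in range(n)] for r in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            raise ValueError("matrix is singular")
        a[col], a[pivot] = a[pivot], a[col]
        d = a[col][col]
        a[col] = [v / d for v in a[col]]
        for r in range(n):
            if r != col:
                k = a[r][col]
                a[r] = [v - k * u for v, u in zip(a[r], a[col])]
    return [row[n:] for row in a]


def project(m, p_view):
    """Forward direction: multiply with the projection matrix, divide by w."""
    p_hom = transform(m, p_view)
    return tuple(c / p_hom[3] for c in p_hom[:3])


def unproject(inv_projection, p_ndc):
    """Multiply NDC (with w = 1) by the inverse projection, then divide by the resulting w."""
    p_hom_view = transform(inv_projection, (p_ndc[0], p_ndc[1], p_ndc[2], 1.0))
    return tuple(c / p_hom_view[3] for c in p_hom_view[:3])


projection = perspective(math.radians(60.0), 16.0 / 9.0, 0.1, 100.0)
p_ndc = project(projection, (0.3, -0.2, -5.0, 1.0))
print(unproject(invert(projection), p_ndc))  # approximately (0.3, -0.2, -5.0)
```

<p>Note that <code>unproject</code> never needs the original w: it feeds in w = 1 and lets the final division sort out the scale, which is exactly the equivalence argument above.</p>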
<h3>What about screen positions?</h3>
<p>If all you have are coordinates on the screen such as a mouse position, there&#8217;s some info lacking, huh? NDC coordinates are 3D so we&#8217;re obviously missing a <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-4586e340cb83d5b642972e97a288fec2_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#122;" title="Rendered by QuickLaTeX.com" height="8" width="9" style="vertical-align: 0px;"/> component. But first things first, let&#8217;s give you the NDC <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-ede05c264bba0eda080918aaa09c4658_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#120;" title="Rendered by QuickLaTeX.com" height="8" width="10" style="vertical-align: 0px;"/> and <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-0af556714940c351c933bba8cf840796_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#121;" title="Rendered by QuickLaTeX.com" height="12" width="9" style="vertical-align: -4px;"/> components. They&#8217;re obtained by a simple remapping to a [-1, 1] range:</p>
<p class="ql-center-displayed-equation" style="line-height: 36px;"><span class="ql-right-eqno"> &nbsp; </span><span class="ql-left-eqno"> &nbsp; </span><img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-95e7d53e84c75860bf55c2bc0bdc7e7f_l3.png" height="36" width="463" class="ql-img-displayed-equation quicklatex-auto-format" alt="&#92;&#91; &#40;&#120;&#95;&#123;&#110;&#100;&#99;&#125;&#44;&#32;&#121;&#95;&#123;&#110;&#100;&#99;&#125;&#44;&#32;&#122;&#95;&#123;&#110;&#100;&#99;&#125;&#41;&#32;&#61;&#32;&#40;&#50;&#32;&#92;&#102;&#114;&#97;&#99;&#123;&#120;&#95;&#123;&#115;&#99;&#114;&#101;&#101;&#110;&#125;&#125;&#123;&#119;&#105;&#100;&#116;&#104;&#95;&#123;&#115;&#99;&#114;&#101;&#101;&#110;&#125;&#125;&#32;&#45;&#32;&#49;&#44;&#32;&#50;&#32;&#92;&#102;&#114;&#97;&#99;&#123;&#121;&#95;&#123;&#115;&#99;&#114;&#101;&#101;&#110;&#125;&#125;&#123;&#104;&#101;&#105;&#103;&#104;&#116;&#95;&#123;&#115;&#99;&#114;&#101;&#101;&#110;&#125;&#125;&#32;&#45;&#32;&#49;&#44;&#32;&#63;&#63;&#63;&#41; &#92;&#93;" title="Rendered by QuickLaTeX.com"/></p>
<p>But <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-a83e07f962e930fcd870d5bd7e388183_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#122;&#95;&#123;&#110;&#100;&#99;&#125;" title="Rendered by QuickLaTeX.com" height="11" width="29" style="vertical-align: -3px;"/> is an unknown. This shouldn&#8217;t be surprising, as a whole ray of points in space projects to that same point on the screen. You&#8217;re essentially free to pick your own <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-4586e340cb83d5b642972e97a288fec2_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#122;" title="Rendered by QuickLaTeX.com" height="8" width="9" style="vertical-align: 0px;"/> coordinate and something along that ray will come up. A value of 1 represents the intersection of the ray with the camera&#8217;s far plane. A value of 0 (DirectX) or -1 (OpenGL) represents the intersection with the near plane. You can use either to get an unprojected position; together with the camera position in the same space, this can be used to construct a ray to perform ray intersection tests in your scene.</p>
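<p>Here is a hedged sketch of turning a screen position into a view space ray, in Python for illustration. It assumes an OpenGL-style perspective projection (so the far plane sits at an NDC z of 1) and uses the closed-form inverse of that specific matrix; <code>inverse_perspective</code>, <code>screen_to_ndc_xy</code> and <code>screen_ray</code> are hypothetical helper names. In view space the camera sits at the origin, so the ray starts there.</p>

```python
import math


def inverse_perspective(fov_y, aspect, near, far):
    """Closed-form inverse of an OpenGL-style perspective matrix (assumed convention)."""
    f = 1.0 / math.tan(fov_y / 2.0)
    c = (far + near) / (near - far)
    d = 2.0 * far * near / (near - far)
    return [
        [aspect / f, 0.0, 0.0, 0.0],
        [0.0, 1.0 / f, 0.0, 0.0],
        [0.0, 0.0, 0.0, -1.0],
        [0.0, 0.0, 1.0 / d, c / d],
    ]


def unproject(inv_projection, p_ndc):
    """Multiply NDC (with w = 1) by the inverse projection, then divide by w."""
    p = (p_ndc[0], p_ndc[1], p_ndc[2], 1.0)
    x, y, z, w = (sum(inv_projection[r][k] * p[k] for k in range(4)) for r in range(4))
    return (x / w, y / w, z / w)


def screen_to_ndc_xy(x_screen, y_screen, width, height):
    """Remap screen coordinates to [-1, 1].

    If the screen origin is top-left with y pointing down, negate the y result.
    """
    return (2.0 * x_screen / width - 1.0, 2.0 * y_screen / height - 1.0)


def screen_ray(inv_projection, x_screen, y_screen, width, height):
    """Return (origin, unit direction) of the view space ray through a screen position."""
    x, y = screen_to_ndc_xy(x_screen, y_screen, width, height)
    far_point = unproject(inv_projection, (x, y, 1.0))  # picked z: the far plane
    length = math.sqrt(sum(c * c for c in far_point))
    origin = (0.0, 0.0, 0.0)  # camera position in view space
    return origin, tuple(c / length for c in far_point)


inv = inverse_perspective(math.radians(60.0), 800.0 / 600.0, 0.1, 100.0)
print(screen_ray(inv, 400.0, 300.0, 800.0, 600.0))  # centre of the screen: looks down -z
```

<p>To intersect with world space geometry, the same idea applies with the inverse view-projection matrix and the camera&#8217;s world position instead.</p>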
<p>I hope this helped if you&#8217;re struggling to figure out this stuff. Until next time!</p>
]]></content:encoded>
					
					<wfw:commentRss>https://www.derschmale.com/2014/09/28/unprojections-explained/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>&#8220;Post-filtered&#8221; Soft Variance Shadow Mapping for Varying Penumbra Sizes</title>
		<link>https://www.derschmale.com/2014/07/24/faster-variance-soft-shadow-mapping-for-varying-penumbra-sizes/</link>
					<comments>https://www.derschmale.com/2014/07/24/faster-variance-soft-shadow-mapping-for-varying-penumbra-sizes/#respond</comments>
		
		<dc:creator><![CDATA[David]]></dc:creator>
		<pubDate>Thu, 24 Jul 2014 17:52:10 +0000</pubDate>
				<category><![CDATA[DirectX]]></category>
		<category><![CDATA[Graphics]]></category>
		<category><![CDATA[Helix]]></category>
		<category><![CDATA[3D]]></category>
		<category><![CDATA[directx]]></category>
		<category><![CDATA[gpu]]></category>
		<category><![CDATA[graphics]]></category>
		<category><![CDATA[Shaders]]></category>
		<category><![CDATA[shading]]></category>
		<category><![CDATA[shadow mapping]]></category>
		<category><![CDATA[variance shadow mapping]]></category>
		<guid isPermaLink="false">http://www.derschmale.com/?p=1013</guid>

					<description><![CDATA[Okay, I&#8217;ll state this up front: I&#8217;m probably not going to use this approach in my own engine because of many issues inherent with Variance Shadow Mapping. However, I think I did end up with some interesting results to play with, so if VSM with fixed penumbra sizes (or just for filtering) is working well&#8230;]]></description>
										<content:encoded><![CDATA[<p><a href="http://www.derschmale.com/blog/wp-content/soft-shadows.jpg"><img loading="lazy" decoding="async" class="alignright size-medium wp-image-1018" src="http://www.derschmale.com/blog/wp-content/soft-shadows-300x187.jpg" alt="soft-shadows" width="300" height="187" srcset="https://www.derschmale.com/blog/wp-content/soft-shadows-300x187.jpg 300w, https://www.derschmale.com/blog/wp-content/soft-shadows-1024x640.jpg 1024w, https://www.derschmale.com/blog/wp-content/soft-shadows.jpg 1280w" sizes="auto, (max-width: 300px) 100vw, 300px" /></a>Okay, I&#8217;ll state this up front: I&#8217;m probably not going to use this approach in my own engine because of many issues inherent with Variance Shadow Mapping. However, I think I did end up with some interesting results to play with, so if VSM with fixed penumbra sizes (or just for filtering) is working well for you, the article may still be useful anyway.<br />
Further worth noting is that most soft shadow articles discuss point lights, so I&#8217;ve done things with directional lights.</p>
<h3>Introduction</h3>
<div id="attachment_1017" style="width: 306px" class="wp-caption alignright"><a href="http://www.derschmale.com/blog/wp-content/soft-shadows-statue1.jpg"><img loading="lazy" decoding="async" aria-describedby="caption-attachment-1017" class="size-medium wp-image-1017" src="http://www.derschmale.com/blog/wp-content/soft-shadows-statue1-300x289.jpg" alt="Penumbra widening with distance" width="300" height="289" srcset="https://www.derschmale.com/blog/wp-content/soft-shadows-statue1-300x289.jpg 300w, https://www.derschmale.com/blog/wp-content/soft-shadows-statue1.jpg 778w" sizes="auto, (max-width: 300px) 100vw, 300px" /></a><p id="caption-attachment-1017" class="wp-caption-text">Penumbra widening with distance</p></div>
<p>If you need an introduction to Variance Shadow Maps, I recommend Andrew Lauritzen&#8217;s classic article in <a title="Summed-Area Variance Shadow Maps" href="http://http.developer.nvidia.com/GPUGems3/gpugems3_ch08.html">GPU Gems 3: Summed-Area Variance Shadow Maps</a>. It also contains a technique for very nice, precise soft shadows. So, yeah, VSM uses probability theory to estimate whether or not a point is in shadow. Groovy! Unlike standard shadow mapping, this allows for texture filtering the same way regular texture sampling does (bilinear/anisotropic sampling, mip-mapping, etc), and you can use anti-aliasing while rendering the shadow maps. What&#8217;s more, the shadow maps can be pre-filtered with a separable blur. This way we can eliminate <em>jaggies</em> using a small filter kernel, or create very soft shadows (with a fixed penumbra size) using a larger kernel.<br />
Real shadows, however, do not have a fixed penumbra size; they get &#8220;softer&#8221; further away from the occluding object. No matter the size of the penumbra, we will need to filter the shadow map to get a blurred version of the original. The two general approaches are:</p>
<ul>
<li><strong>Using mip-maps</strong>: sample the mip-levels since they already contain further filtered data. By default, this starts looking very boxy with larger penumbrae. To alleviate this, you&#8217;d have to generate the mip levels with more expensive filtering (more or less like blurring the mip-maps as you&#8217;re generating them).</li>
<li><strong>Using Summed-Area Tables</strong> as in the GPU Gems article, resulting in very high-quality results. You can generate SATs like this: <a title="Fast Summed-Area Table Generation and its Applications" href="http://www.shaderwrangler.com/publications/sat/SAT_EG2005.pdf" target="_blank">&#8220;Hensley [2005]: Fast Summed-Area Table Generation and its Applications&#8221;</a>. Armed with this, any convolution using a SAT just takes 4 texture samples.</li>
</ul>
<p>Either way, you end up pre-filtering the shadow maps, the cost of which depends on the number and resolution of the shadow maps (cubic shadow maps, different cascades for directional lights, etc&#8230;); not something I wanted to spend too much frame time on. So instead of pre-filtering, I wanted to try and combine it with &#8220;post-filtering&#8221; in the lighting shader in a way similar to <a title="Percentage Closer Soft Shadows" href="http://developer.download.nvidia.com/shaderlibrary/docs/shadow_PCSS.pdf" target="_blank">PCSS</a> but without the crazy number of samples. However, a standard mip-map chain still needs to be generated.</p>
<h3>Overview</h3>
<p>Calculating soft shadows with shadow maps is not exactly physically correct, but it does result in a visually pleasing approximation: penumbrae get wider with distance. Conceptually, we&#8217;ll be using the same method described in NVidia&#8217;s <a title="Percentage-Closer Soft Shadows" href="http://developer.download.nvidia.com/shaderlibrary/docs/shadow_PCSS.pdf" target="_blank">Percentage-Closer Soft Shadows</a> paper (so review it ;) ) but of course the implementation will be quite different. As a recap, the steps involved are:</p>
<ul>
<li>Find the search area where potential occluders could be</li>
<li>Find the average occluder depth</li>
<li>Calculate the penumbra size from the average depth</li>
<li>Test the shadow map to find the percentage (or in our case: probability) of occluders in the penumbra region.</li>
</ul>
<h3>The main building block</h3>
<p>The main building block for our approach, unsurprisingly, uses Chebyshev&#8217;s inequality to find an upper bound for the probability that a sample is lit. This is the default VSM fare:</p>
<div class="codecolorer-container cpp default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;"><div class="cpp codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap;"><span style="color: #666666;">// moments contains float2(E(x), E(x^2))</span><br />
<span style="color: #666666;">// referenceDepth contains the depth value of the point to be compared</span><br />
<span style="color: #0000ff;">float</span> UpperBoundShadow<span style="color: #008000;">&#40;</span>float2 moments, <span style="color: #0000ff;">float</span> referenceDepth<span style="color: #008000;">&#41;</span><br />
<span style="color: #008000;">&#123;</span><br />
&nbsp; &nbsp; <span style="color: #0000ff;">float</span> variance <span style="color: #000080;">=</span> moments.<span style="color: #007788;">y</span> <span style="color: #000040;">-</span> moments.<span style="color: #007788;">x</span> <span style="color: #000040;">*</span> moments.<span style="color: #007788;">x</span><span style="color: #008080;">;</span><br />
&nbsp; &nbsp; <span style="color: #666666;">// clamp to some minimum small variance value for numerical stability</span><br />
&nbsp; &nbsp; variance <span style="color: #000080;">=</span> max<span style="color: #008000;">&#40;</span>variance, MIN_VARIANCE<span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span><br />
&nbsp; &nbsp; <span style="color: #0000ff;">float</span> diff <span style="color: #000080;">=</span> referenceDepth <span style="color: #000040;">-</span> moments.<span style="color: #007788;">x</span><span style="color: #008080;">;</span><br />
<br />
&nbsp; &nbsp; <span style="color: #666666;">// Chebyshev's inequality theorem</span><br />
&nbsp; &nbsp; <span style="color: #0000ff;">float</span> upperBound <span style="color: #000080;">=</span> variance <span style="color: #000040;">/</span> <span style="color: #008000;">&#40;</span>variance <span style="color: #000040;">+</span> diff<span style="color: #000040;">*</span>diff<span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span><br />
<br />
&nbsp; &nbsp; <span style="color: #666666;">// The upper bound is only valid when referenceDepth &gt; moments.x (if not, return 1.0, ie: fully lit)</span><br />
&nbsp; &nbsp; <span style="color: #0000ff;">return</span> max<span style="color: #008000;">&#40;</span>upperBound, referenceDepth <span style="color: #000080;">&lt;</span> moments.<span style="color: #007788;">x</span><span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span><br />
<span style="color: #008000;">&#125;</span></div></div>
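<p>If you want to sanity-check the numbers, here&#8217;s a small CPU-side C++ sketch of the same test. The names (<code>Moments</code>, <code>ComputeMoments</code>, <code>UpperBoundLit</code>) and the value of <code>kMinVariance</code> are my own assumptions for illustration; the shader above doesn&#8217;t prescribe a particular minimum variance.</p>

```cpp
#include <algorithm>
#include <cassert>

// Assumed value, for illustration only; tune it for your depth range.
const double kMinVariance = 1e-5;

struct Moments {
    double mean;          // E(x)
    double meanOfSquares; // E(x^2)
};

// Averages depth and depth^2 over a set of shadow map samples,
// which is what filtering the two-channel shadow map does for us.
Moments ComputeMoments(const double* depths, int count)
{
    assert(count > 0);
    Moments m = {0.0, 0.0};
    for (int i = 0; i < count; ++i) {
        m.mean += depths[i];
        m.meanOfSquares += depths[i] * depths[i];
    }
    m.mean /= count;
    m.meanOfSquares /= count;
    return m;
}

// Upper bound for the probability that a point at referenceDepth is lit.
double UpperBoundLit(Moments m, double referenceDepth)
{
    // The one-tailed bound only applies behind the mean depth; otherwise fully lit.
    if (referenceDepth <= m.mean) return 1.0;
    const double variance = std::max(m.meanOfSquares - m.mean * m.mean, kMinVariance);
    const double diff = referenceDepth - m.mean;
    return variance / (variance + diff * diff);
}
```

<p>With two samples at depth 0.2 (an occluder) and two at 0.8 (the receiver), the mean is 0.5 and the variance 0.09. For a reference depth of 0.8 the bound becomes 0.09 / (0.09 + 0.09) = 0.5, which matches the half-covered filter region.</p>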
<h3>Finding the occluder search area</h3>
<p>This one is exactly the same as for <a title="Percentage Closer Soft Shadows" href="http://developer.download.nvidia.com/shaderlibrary/docs/shadow_PCSS.pdf" target="_blank">PCSS</a>, with an exception for directional lights. If true directional lights existed, there would be no penumbra. After all, all light rays are parallel! The traditional PCSS way of back-projecting makes little sense either, because the light doesn&#8217;t have an actual position in space. To get some handle on it, we&#8217;ll settle for a fixed search area instead.</p>
<h3>Find average occluder depth</h3>
<p>The original <a title="Percentage Closer Soft Shadows" href="http://developer.download.nvidia.com/shaderlibrary/docs/shadow_PCSS.pdf" target="_blank">PCSS</a> approach tests shadow map samples in the search area to figure out whether they&#8217;re occluders. Then, the average depth for the occluders is calculated. Our approach will again use Chebyshev&#8217;s inequality, this time to get an upper bound on the lit probability for the entire search area. From this probability, we can calculate the average occluder depth (see <a href="http://web4.cs.ucl.ac.uk/staff/j.kautz/publications/VSSM_PG2010.pdf" title="Variance Soft Shadow Mapping">[Yang 2010] Variance Soft Shadow Mapping</a>). <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-12898bd915fea772873a6d80d7be8583_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#100;&#95;&#123;&#116;&#111;&#116;&#97;&#108;&#125;" title="Rendered by QuickLaTeX.com" height="16" width="37" style="vertical-align: -3px;"/> is the total average depth, <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-b6769b895f0fb6ab853cf52c772906d6_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#100;&#95;&#123;&#111;&#99;&#99;&#108;&#117;&#100;&#101;&#114;&#125;" title="Rendered by QuickLaTeX.com" height="16" width="59" style="vertical-align: -3px;"/> and <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-353716af54fc1e8e7b166f490946fc82_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#100;&#95;&#123;&#110;&#111;&#110;&#79;&#99;&#99;&#108;&#117;&#100;&#101;&#114;&#125;" title="Rendered by QuickLaTeX.com" height="16" width="87" style="vertical-align: -3px;"/> are the average depths for occluders and non-occluders, respectively. 
<img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-3bf85f1087e9fbed3a319341134ac1a2_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#112;" title="Rendered by QuickLaTeX.com" height="12" width="10" style="vertical-align: -4px;"/> is the probability of a point being lit (ie: a non-occluder). Then, we can make the following observation:</p>
<p class="ql-center-displayed-equation" style="line-height: 18px;"><span class="ql-right-eqno"> &nbsp; </span><span class="ql-left-eqno"> &nbsp; </span><img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-2a22ee4ceacb17b6b6788b88d865b0cc_l3.png" height="18" width="342" class="ql-img-displayed-equation quicklatex-auto-format" alt="&#92;&#091; &#100;&#95;&#123;&#116;&#111;&#116;&#97;&#108;&#125;&#32;&#61;&#32;&#100;&#95;&#123;&#111;&#99;&#99;&#108;&#117;&#100;&#101;&#114;&#115;&#125;&#32;&#42;&#32;&#40;&#49;&#32;&#45;&#32;&#112;&#41;&#32;&#43;&#32;&#100;&#95;&#123;&#110;&#111;&#110;&#79;&#99;&#99;&#108;&#117;&#100;&#101;&#114;&#115;&#125;&#32;&#42;&#32;&#112; &#92;&#093;" title="Rendered by QuickLaTeX.com"/></p>
<p class="ql-center-displayed-equation" style="line-height: 11px;"><span class="ql-right-eqno"> &nbsp; </span><span class="ql-left-eqno"> &nbsp; </span><img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-7a937a1e383c7f6082223acc1cd2f969_l3.png" height="11" width="31" class="ql-img-displayed-equation quicklatex-auto-format" alt="&#92;&#091; &#92;&#105;&#102;&#102; &#92;&#093;" title="Rendered by QuickLaTeX.com"/></p>
<p class="ql-center-displayed-equation" style="line-height: 41px;"><span class="ql-right-eqno"> &nbsp; </span><span class="ql-left-eqno"> &nbsp; </span><img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-cbefad9e25127a55a51ce425ff6a4552_l3.png" height="41" width="274" class="ql-img-displayed-equation quicklatex-auto-format" alt="&#92;&#091; &#100;&#95;&#123;&#111;&#99;&#99;&#108;&#117;&#100;&#101;&#114;&#115;&#125;&#32;&#61;&#32;&#92;&#102;&#114;&#97;&#99;&#123;&#100;&#95;&#123;&#116;&#111;&#116;&#97;&#108;&#125;&#32;&#45;&#32;&#100;&#95;&#123;&#110;&#111;&#110;&#79;&#99;&#99;&#108;&#117;&#100;&#101;&#114;&#115;&#125;&#32;&#42;&#32;&#112;&#125;&#123;&#40;&#49;&#32;&#45;&#32;&#112;&#41;&#125; &#92;&#093;" title="Rendered by QuickLaTeX.com"/></p>
<p>Since the area search approach is based on the simplification that receiving and casting planes are parallel to the shadow map, <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-353716af54fc1e8e7b166f490946fc82_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#100;&#95;&#123;&#110;&#111;&#110;&#79;&#99;&#99;&#108;&#117;&#100;&#101;&#114;&#125;" title="Rendered by QuickLaTeX.com" height="16" width="87" style="vertical-align: -3px;"/> is the reference depth. The only thing left to do is calculate the probability and the average depth for the entire search area. We&#8217;ll do this again very coarsely: we&#8217;ll use a single sample in a coarser mip level. It&#8217;s not exactly precise, but seems to work well enough. The average depth is already right there in the red channel of the shadow map, and for the probability we&#8217;ll use the upper bound again. The code below is for illustration, don&#8217;t expect any optimizations:</p>
<div class="codecolorer-container cpp default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;height:300px;"><div class="cpp codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap;"><span style="color: #666666;">// searchAreaSize is expressed in shadow map UV coords (0 - 1)</span><br />
<span style="color: #666666;">// shadowMapSize is the size of the shadow map in texels</span><br />
<span style="color: #666666;">// shadowMapCoord is the shadow map coord projected into directional light space (so z contains its depth)</span><br />
<span style="color: #0000ff;">float</span> GetAverageOccluderDepth<span style="color: #008000;">&#40;</span><span style="color: #0000ff;">float</span> searchAreaSize, <span style="color: #0000ff;">int</span> shadowMapSize, float4 shadowMapCoord<span style="color: #008000;">&#41;</span> <br />
<span style="color: #008000;">&#123;</span><br />
&nbsp; &nbsp; <span style="color: #666666;">// calculate the mip level corresponding to the search area</span><br />
&nbsp; &nbsp; <span style="color: #666666;">// Really, mipLevel would be passed in as a constant.</span><br />
&nbsp; &nbsp; <span style="color: #0000ff;">float</span> mipLevel <span style="color: #000080;">=</span> log2<span style="color: #008000;">&#40;</span>searchAreaSize <span style="color: #000040;">*</span> shadowMapSize<span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span><br />
<br />
&nbsp; &nbsp; <span style="color: #666666;">// retrieve the distribution's moments for the entire area</span><br />
&nbsp; &nbsp; <span style="color: #666666;">// shadowMapSampler is a trilinear sampler, not a comparison sampler</span><br />
&nbsp; &nbsp; float2 moments <span style="color: #000080;">=</span> shadowMap.<span style="color: #007788;">SampleLevel</span><span style="color: #008000;">&#40;</span>shadowMapSampler, shadowMapCoord.<span style="color: #007788;">xy</span>, mipLevel<span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span><br />
&nbsp; &nbsp; <span style="color: #0000ff;">float</span> averageTotalDepth <span style="color: #000080;">=</span> moments.<span style="color: #007788;">x</span><span style="color: #008080;">;</span> &nbsp; &nbsp; &nbsp; &nbsp;<span style="color: #666666;">// assign for semantic clarity</span><br />
&nbsp; &nbsp; <span style="color: #0000ff;">float</span> probability <span style="color: #000080;">=</span> UpperBoundShadow<span style="color: #008000;">&#40;</span>moments, shadowMapCoord.<span style="color: #007788;">z</span><span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>&nbsp; &nbsp; <br />
&nbsp; &nbsp; <br />
&nbsp; &nbsp; <span style="color: #666666;">// prevent numerical issues</span><br />
&nbsp; &nbsp; <span style="color: #0000ff;">if</span> <span style="color: #008000;">&#40;</span>probability <span style="color: #000080;">&gt;</span> <span style="color:#800080;">.99</span><span style="color: #008000;">&#41;</span> <span style="color: #0000ff;">return</span> <span style="color:#800080;">0.0</span><span style="color: #008080;">;</span><br />
<br />
&nbsp; &nbsp; <span style="color: #666666;">// calculate the average occluder depth</span><br />
&nbsp; &nbsp; <span style="color: #0000ff;">return</span> <span style="color: #008000;">&#40;</span>averageTotalDepth <span style="color: #000040;">-</span> probability <span style="color: #000040;">*</span> shadowMapCoord.<span style="color: #007788;">z</span><span style="color: #008000;">&#41;</span> <span style="color: #000040;">/</span> <span style="color: #008000;">&#40;</span><span style="color:#800080;">1.0</span> <span style="color: #000040;">-</span> probability<span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span><br />
<span style="color: #008000;">&#125;</span></div></div>
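<p>As a quick check of the derivation, here&#8217;s a tiny C++ sketch that plugs known values into the formula and recovers the occluder depth. The function name <code>AverageOccluderDepth</code> and its parameters are hypothetical, chosen for this example only.</p>

```cpp
#include <cassert>

// dTotal:    average depth over the whole search area
// dReceiver: reference depth, standing in for the non-occluder average
// pLit:      probability of a sample being lit (a non-occluder)
double AverageOccluderDepth(double dTotal, double dReceiver, double pLit)
{
    // pLit == 1 means there are no occluders at all, so there is nothing to solve for
    assert(pLit >= 0.0 && pLit < 1.0);
    return (dTotal - dReceiver * pLit) / (1.0 - pLit);
}
```

<p>Say 40% of the search area is covered by occluders at depth 0.3 and the rest sits at the receiver depth 0.7. The total average depth is 0.3 * 0.4 + 0.7 * 0.6 = 0.54, and the formula gives (0.54 - 0.7 * 0.6) / 0.4 = 0.3 again. In the shader the probability is only an upper bound, so the result there is an estimate rather than the exact value.</p>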
<h3>Calculate penumbra size from average depth</h3>
<p>We calculate the penumbra size in exactly the same way as in <a title="Percentage Closer Soft Shadows" href="http://developer.download.nvidia.com/shaderlibrary/docs/shadow_PCSS.pdf" target="_blank">PCSS</a>. For directional lights, this again doesn&#8217;t hold up very well (ah, you missing light position!). Instead, we can simply use the distance to the average occluder as a scale factor. It&#8217;s fun when things get simpler!</p>
<div class="codecolorer-container cpp default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;"><div class="cpp codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap;"><span style="color: #666666;">// lightSize is the light size expressed in shadow map UV coords (0 - 1)</span><br />
<span style="color: #666666;">// shadowMapSize is the size of the shadow map in texels</span><br />
<span style="color: #666666;">// shadowMapCoord is the shadow map coord projected into directional light space (so z contains its depth)</span><br />
<span style="color: #666666;">// penumbraScale is a value describing how fast the penumbra should go soft. It can also be used to control the world space fall-off (by projecting world space distances to depth values)</span><br />
<span style="color: #0000ff;">float</span> EstimatePenumbraSize<span style="color: #008000;">&#40;</span><span style="color: #0000ff;">float</span> lightSize, <span style="color: #0000ff;">int</span> shadowMapSize, float4 shadowMapCoord, <span style="color: #0000ff;">float</span> penumbraScale<span style="color: #008000;">&#41;</span><br />
<span style="color: #008000;">&#123;</span><br />
&nbsp; &nbsp; <span style="color: #666666;">// the search area covers twice the light size</span><br />
&nbsp; &nbsp; <span style="color: #0000ff;">float</span> averageOccluderDepth <span style="color: #000080;">=</span> GetAverageOccluderDepth<span style="color: #008000;">&#40;</span>lightSize, shadowMapSize, shadowMapCoord<span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span><br />
&nbsp; &nbsp; <span style="color: #0000ff;">float</span> penumbraSize <span style="color: #000080;">=</span> lightSize <span style="color: #000040;">*</span> <span style="color: #008000;">&#40;</span>shadowMapCoord.<span style="color: #007788;">z</span> <span style="color: #000040;">-</span> averageOccluderDepth<span style="color: #008000;">&#41;</span> <span style="color: #000040;">*</span> penumbraScale<span style="color: #008080;">;</span><br />
<br />
&nbsp; &nbsp; <span style="color: #666666;">// clamp to the maximum softness, which matches the search area</span><br />
&nbsp; &nbsp; <span style="color: #0000ff;">return</span> min<span style="color: #008000;">&#40;</span>penumbraSize, lightSize<span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span><br />
<span style="color: #008000;">&#125;</span></div></div>
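<p>To compare the two estimates side by side, here&#8217;s a small C++ sketch. <code>PcssPenumbraSize</code> follows the similar-triangles estimate from the PCSS paper, with depths measured from the light; <code>DirectionalPenumbraSize</code> mirrors the shader above. Both function names are hypothetical, and the lower clamp to zero is my own assumption, added so a negative depth difference can&#8217;t produce a negative size.</p>

```cpp
#include <algorithm>
#include <cassert>

// PCSS estimate for lights with a position: similar triangles between
// the light, the blocker and the receiver.
double PcssPenumbraSize(double lightSize, double receiverDepth, double blockerDepth)
{
    assert(blockerDepth > 0.0);
    return (receiverDepth - blockerDepth) * lightSize / blockerDepth;
}

// Directional light variant: the distance to the average occluder acts
// as a plain scale factor, clamped to the search area (lightSize).
double DirectionalPenumbraSize(double lightSize, double receiverDepth,
                               double averageOccluderDepth, double penumbraScale)
{
    assert(lightSize >= 0.0);
    const double size = lightSize * (receiverDepth - averageOccluderDepth) * penumbraScale;
    // the lower bound of zero is an assumption, not part of the shader above
    return std::min(std::max(size, 0.0), lightSize);
}
```

<p>With a light size of 0.05, a receiver at 0.7, an average occluder at 0.3 and a scale of 2, the directional version yields 0.05 * 0.4 * 2 = 0.04. Bump the scale to 5 and the result hits the clamp at 0.05.</p>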
<h3>Calculate occluder probability</h3>
<p>Instead of using a pre-blurred mip-map chain or a SAT, we&#8217;ll perform the filtering on the fly. We&#8217;ll start by sampling a fixed number of points in a Poisson disk distribution to get the (approximate) moments of the entire filter region (ie: the penumbra size). We&#8217;ll rotate the sample points randomly to reduce banding in favour of noise. This is essentially the same as percentage closer filtering, but using probabilities instead. So, a first draft:</p>
<div class="codecolorer-container cpp default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;height:300px;"><div class="cpp codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap;">float4 shadowMapCoord <span style="color: #000080;">=</span> mul<span style="color: #008000;">&#40;</span>fragmentPosition, shadowMapMatrix<span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span><br />
<span style="color: #0000ff;">float</span> penumbraSize <span style="color: #000080;">=</span> EstimatePenumbraSize<span style="color: #008000;">&#40;</span>lightSize, shadowMapSize, shadowMapCoord, penumbraScale<span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span><br />
float2 moments <span style="color: #000080;">=</span> <span style="color:#800080;">0.0</span><span style="color: #008080;">;</span><br />
<span style="color: #666666;">// ditherTexture contains 2d rotation matrix (cos, -sin, sin, cos), this will tile the texture across the screen</span><br />
float4 rotation <span style="color: #000080;">=</span> ditherTexture.<span style="color: #007788;">SampleLevel</span><span style="color: #008000;">&#40;</span>nearestWrapSampler, screenUV <span style="color: #000040;">*</span> screenSize <span style="color: #000040;">/</span> ditherTextureSize, <span style="color: #0000dd;">0</span><span style="color: #008000;">&#41;</span> <span style="color: #000040;">*</span> <span style="color:#800080;">2.0</span> <span style="color: #000040;">-</span> <span style="color:#800080;">1.0</span><span style="color: #008080;">;</span><br />
<br />
<span style="color: #0000ff;">for</span> <span style="color: #008000;">&#40;</span><span style="color: #0000ff;">int</span> i <span style="color: #000080;">=</span> <span style="color: #0000dd;">0</span><span style="color: #008080;">;</span> i <span style="color: #000080;">&lt;</span> numShadowSamples<span style="color: #008080;">;</span> <span style="color: #000040;">++</span>i<span style="color: #008000;">&#41;</span> <span style="color: #008000;">&#123;</span><br />
&nbsp; &nbsp; <span style="color: #666666;">// poissonDiskValues contain the sampling offsets in the unit circle</span><br />
&nbsp; &nbsp; <span style="color: #666666;">// scale by penumbraSize / 2 to get samples within the penumbra radius (penumbraSize is diameter)</span><br />
&nbsp; &nbsp; float2 sampleOffset <span style="color: #000080;">=</span> poissonDiskValues<span style="color: #008000;">&#91;</span>i<span style="color: #008000;">&#93;</span> <span style="color: #000040;">*</span> penumbraSize <span style="color: #000040;">/</span> <span style="color: #0000dd;">2</span><span style="color: #008080;">;</span><br />
&nbsp; &nbsp; float4 coord <span style="color: #000080;">=</span> shadowMapCoord<span style="color: #008080;">;</span><br />
<br />
&nbsp; &nbsp; <span style="color: #666666;">// add rotated sample offset using dithered sample</span><br />
&nbsp; &nbsp; coord.<span style="color: #007788;">x</span> <span style="color: #000040;">+</span><span style="color: #000080;">=</span> sampleOffset.<span style="color: #007788;">x</span> <span style="color: #000040;">*</span> rotation.<span style="color: #007788;">x</span> <span style="color: #000040;">+</span> sampleOffset.<span style="color: #007788;">y</span> <span style="color: #000040;">*</span> rotation.<span style="color: #007788;">y</span><span style="color: #008080;">;</span><br />
&nbsp; &nbsp; coord.<span style="color: #007788;">y</span> <span style="color: #000040;">+</span><span style="color: #000080;">=</span> sampleOffset.<span style="color: #007788;">x</span> <span style="color: #000040;">*</span> rotation.<span style="color: #007788;">z</span> <span style="color: #000040;">+</span> sampleOffset.<span style="color: #007788;">y</span> <span style="color: #000040;">*</span> rotation.<span style="color: #007788;">w</span><span style="color: #008080;">;</span><br />
<br />
&nbsp; &nbsp; <span style="color: #666666;">// shadowMapSampler is a trilinear sampler, not a comparison sampler</span><br />
&nbsp; &nbsp; moments <span style="color: #000040;">+</span><span style="color: #000080;">=</span> shadowMap.<span style="color: #007788;">Sample</span><span style="color: #008000;">&#40;</span>shadowMapSampler, coord.<span style="color: #007788;">xy</span><span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span><br />
<span style="color: #008000;">&#125;</span><br />
moments <span style="color: #000040;">/</span><span style="color: #000080;">=</span> numShadowSamples<span style="color: #008080;">;</span><br />
<br />
<span style="color: #0000ff;">float</span> lightContribution <span style="color: #000080;">=</span> UpperBoundShadow<span style="color: #008000;">&#40;</span>moments, shadowMapCoord.<span style="color: #007788;">z</span><span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span></div></div>
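<p>The shader expects <code>ditherTexture</code> to hold a rotation matrix per texel, remapped from [-1, 1] to [0, 1] (hence the <code>* 2.0 - 1.0</code> when reading it). Here&#8217;s one possible way to fill such a texture on the CPU, as a C++ sketch. The <code>Rgba</code> struct and <code>BuildDitherTexture</code> are hypothetical names, and using uniformly random angles is just an assumption; any well-distributed set of angles would do.</p>

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical texel type; upload it in whatever format your API wants.
struct Rgba {
    float r, g, b, a;
};

// Builds size * size texels, each storing a random 2D rotation matrix
// (cos, -sin, sin, cos) remapped from [-1, 1] to [0, 1].
std::vector<Rgba> BuildDitherTexture(int size, unsigned seed)
{
    assert(size > 0);
    std::mt19937 rng(seed);
    std::uniform_real_distribution<float> angleDist(0.0f, 6.2831853f); // [0, 2*pi)

    std::vector<Rgba> texels;
    texels.reserve(std::size_t(size) * std::size_t(size));
    for (int i = 0; i < size * size; ++i) {
        const float angle = angleDist(rng);
        const float c = std::cos(angle);
        const float s = std::sin(angle);
        const Rgba texel = { c * 0.5f + 0.5f, -s * 0.5f + 0.5f,
                             s * 0.5f + 0.5f, c * 0.5f + 0.5f };
        texels.push_back(texel);
    }
    return texels;
}
```

<p>A small texture (say 4x4) tiled across the screen is typically enough; an 8-bit format quantizes the matrix a little, which is harmless for this purpose.</p>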
<p>But we can do better, observing that when sampling the disk distribution, we&#8217;d get a better approximation if we could get the average over every disk&#8217;s area instead of only at the sample point. Again, we can use the mip levels to get an approximation. A Poisson disk distribution has a minimum distance between any two points, so we can use this to calculate the mip level to sample from. Let&#8217;s replace some of the shader code:</p>
<div class="codecolorer-container cpp default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;height:300px;"><div class="cpp codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap;">float4 shadowMapCoord <span style="color: #000080;">=</span> mul<span style="color: #008000;">&#40;</span>fragmentPosition, shadowMapMatrix<span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span><br />
<span style="color: #0000ff;">float</span> penumbraSize <span style="color: #000080;">=</span> EstimatePenumbraSize<span style="color: #008000;">&#40;</span>lightSize, shadowMapSize, shadowMapCoord, penumbraScale<span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span><br />
float2 moments <span style="color: #000080;">=</span> <span style="color:#800080;">0.0</span><span style="color: #008080;">;</span><br />
<span style="color: #666666;">// ditherTexture contains 2d rotation matrix (cos, -sin, sin, cos), this will tile the texture across the screen</span><br />
float4 rotation <span style="color: #000080;">=</span> ditherTexture.<span style="color: #007788;">SampleLevel</span><span style="color: #008000;">&#40;</span>nearestWrapSampler, screenUV <span style="color: #000040;">*</span> screenSize <span style="color: #000040;">/</span> ditherTextureSize, <span style="color: #0000dd;">0</span><span style="color: #008000;">&#41;</span> <span style="color: #000040;">*</span> <span style="color:#800080;">2.0</span> <span style="color: #000040;">-</span> <span style="color:#800080;">1.0</span><span style="color: #008080;">;</span><br />
<br />
<span style="color: #666666;">// calculate the mip level for the disk sample's area</span><br />
<span style="color: #666666;">// Sample points are expected to be penumbraSize * poissonRadius * shadowMapSize texels apart</span><br />
<span style="color: #666666;">// poissonRadius is half the minimum distance in the disk distribution</span><br />
<span style="color: #0000ff;">float</span> mipLevel <span style="color: #000080;">=</span> log2<span style="color: #008000;">&#40;</span>penumbraSize <span style="color: #000040;">*</span> poissonRadius <span style="color: #000040;">*</span> shadowMapSize<span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span> <br />
<br />
<span style="color: #0000ff;">for</span> <span style="color: #008000;">&#40;</span><span style="color: #0000ff;">int</span> i <span style="color: #000080;">=</span> <span style="color: #0000dd;">0</span><span style="color: #008080;">;</span> i <span style="color: #000080;">&lt;</span> numShadowSamples<span style="color: #008080;">;</span> <span style="color: #000040;">++</span>i<span style="color: #008000;">&#41;</span> <span style="color: #008000;">&#123;</span><br />
&nbsp; &nbsp; <span style="color: #666666;">// poissonDiskValues contain the sampling offsets in the unit circle</span><br />
&nbsp; &nbsp; <span style="color: #666666;">// scale by penumbraSize / 2 to get samples within the penumbra radius (penumbraSize is diameter)</span><br />
&nbsp; &nbsp; float2 sampleOffset <span style="color: #000080;">=</span> poissonDiskValues<span style="color: #008000;">&#91;</span>i<span style="color: #008000;">&#93;</span> <span style="color: #000040;">*</span> penumbraSize <span style="color: #000040;">/</span> <span style="color: #0000dd;">2</span><span style="color: #008080;">;</span><br />
&nbsp; &nbsp; float4 coord <span style="color: #000080;">=</span> shadowMapCoord<span style="color: #008080;">;</span><br />
<br />
&nbsp; &nbsp; <span style="color: #666666;">// add rotated sample offset using dithered sample</span><br />
&nbsp; &nbsp; coord.<span style="color: #007788;">x</span> <span style="color: #000040;">+</span><span style="color: #000080;">=</span> sampleOffset.<span style="color: #007788;">x</span> <span style="color: #000040;">*</span> rotation.<span style="color: #007788;">x</span> <span style="color: #000040;">+</span> sampleOffset.<span style="color: #007788;">y</span> <span style="color: #000040;">*</span> rotation.<span style="color: #007788;">y</span><span style="color: #008080;">;</span><br />
&nbsp; &nbsp; coord.<span style="color: #007788;">y</span> <span style="color: #000040;">+</span><span style="color: #000080;">=</span> sampleOffset.<span style="color: #007788;">x</span> <span style="color: #000040;">*</span> rotation.<span style="color: #007788;">z</span> <span style="color: #000040;">+</span> sampleOffset.<span style="color: #007788;">y</span> <span style="color: #000040;">*</span> rotation.<span style="color: #007788;">w</span><span style="color: #008080;">;</span><br />
<br />
&nbsp; &nbsp; <span style="color: #666666;">// shadowMapSampler is a trilinear sampler, not a comparison sampler</span><br />
&nbsp; &nbsp; moments <span style="color: #000040;">+</span><span style="color: #000080;">=</span> shadowMap.<span style="color: #007788;">SampleLevel</span><span style="color: #008000;">&#40;</span>shadowMapSampler, coord.<span style="color: #007788;">xy</span>, mipLevel<span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span><br />
<span style="color: #008000;">&#125;</span><br />
moments <span style="color: #000040;">/</span><span style="color: #000080;">=</span> numShadowSamples<span style="color: #008080;">;</span><br />
<br />
<span style="color: #0000ff;">float</span> lightContribution <span style="color: #000080;">=</span> UpperBoundShadow<span style="color: #008000;">&#40;</span>moments, shadowMapCoord.<span style="color: #007788;">z</span><span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span></div></div>
<p>Not only does this give us a better estimate, it also reduces the noise from the random rotations because samples are expected to differ less. And what&#8217;s more, we don&#8217;t have to take all that many samples, even for quite large filter sizes! Another way of looking at this approach is as blurring the mip levels in the lighting shader &#8211; just enough to remove the jaggies &#8211; instead of doing so on the shadow map directly.</p>
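<p>For a feel of which mip levels end up being sampled, here&#8217;s the mip selection as a C++ sketch. <code>DiskSampleMipLevel</code> is a hypothetical helper; the explicit clamping to the valid mip range (and the <code>mipCount</code> parameter) is my own addition, since the shader leaves that to the sampler.</p>

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// penumbraSize:  filter diameter in shadow map UV coords (0 - 1)
// poissonRadius: half the minimum distance between points in the unit disk set
// shadowMapSize: shadow map resolution in texels
// mipCount:      number of mip levels available (assumed parameter)
double DiskSampleMipLevel(double penumbraSize, double poissonRadius,
                          int shadowMapSize, int mipCount)
{
    assert(shadowMapSize > 0 && mipCount > 0);
    // expected spacing between neighbouring samples, in texels
    const double texels = penumbraSize * poissonRadius * shadowMapSize;
    // a footprint of one texel or less maps to the base level
    // (this also keeps us away from log2(0))
    if (texels <= 1.0) return 0.0;
    return std::min(std::log2(texels), double(mipCount - 1));
}
```

<p>For example, a penumbra of 0.0625 in UV space, a Poisson radius of 0.25 and a 1024 texel shadow map give a spacing of 16 texels, so each sample reads from mip level 4.</p>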
<h3>Shadow map bounds</h3>
<p>As usual with soft shadows, there&#8217;s an issue with sampling outside the shadow map boundaries. For this reason, it may be necessary to extend the shadow maps to accommodate the largest penumbra size (our &#8220;lightSize&#8221; value). You might also want to keep the light size within certain limits, so that most of the shadow map isn&#8217;t spent on covering areas that are not on the screen.</p>
<h3>Conclusion</h3>
<div id="attachment_1020" style="width: 306px" class="wp-caption aligncenter"><a href="http://www.derschmale.com/blog/wp-content/differences.jpg"><img loading="lazy" decoding="async" aria-describedby="caption-attachment-1020" class="size-medium wp-image-1020" src="http://www.derschmale.com/blog/wp-content/differences-300x215.jpg" alt="Different degrees of softness" width="300" height="215" srcset="https://www.derschmale.com/blog/wp-content/differences-300x215.jpg 300w, https://www.derschmale.com/blog/wp-content/differences-1024x733.jpg 1024w, https://www.derschmale.com/blog/wp-content/differences.jpg 1080w" sizes="auto, (max-width: 300px) 100vw, 300px" /></a><p id="caption-attachment-1020" class="wp-caption-text">Note the difference between the vase&#8217;s shadow and that of the distant flag pole next to it.</p></div>
<div id="attachment_1019" style="width: 306px" class="wp-caption alignright"><a href="http://www.derschmale.com/blog/wp-content/light-leaks.jpg"><img loading="lazy" decoding="async" aria-describedby="caption-attachment-1019" class="size-medium wp-image-1019" src="http://www.derschmale.com/blog/wp-content/light-leaks-300x180.jpg" alt="A problem with contact shadows for thin casters" width="300" height="180" srcset="https://www.derschmale.com/blog/wp-content/light-leaks-300x180.jpg 300w, https://www.derschmale.com/blog/wp-content/light-leaks-1024x614.jpg 1024w, https://www.derschmale.com/blog/wp-content/light-leaks.jpg 1230w" sizes="auto, (max-width: 300px) 100vw, 300px" /></a><p id="caption-attachment-1019" class="wp-caption-text">A problem with contact shadows for thin casters</p></div>
<p>As I&#8217;ve said before, chances are slim that I&#8217;ll actually use this approach in my own code &#8211; unless it&#8217;s for something very specific and manageable. Variance shadow maps have too many issues for me:</p>
<ul>
<li>Light leaking: this can be ameliorated easily, but not completely avoided, and the solutions have a strong impact on the softness of the penumbra, destroying some of our hard work.</li>
<li>Thin caster leaks: the closer a point gets to the occluder, the smaller the depth difference and the larger the upper bound becomes (the estimate considers it less and less likely to be in shadow). This creates severe light leaking close to thin casters.</li>
</ul>
<p>But, again, VSMs <i>have</i> been used with success in the past, so who knows, this article may still be of use to someone. You might run into other problems too, if you&#8217;re up for pursuing this.<br />
And perhaps VSMs could be used only to perform the area search, with PCF sampling for the occlusion tests, which should remove any light leaking problems. Anyway, I&#8217;m open to any ideas, comments or poisonous arrows!</p>
]]></content:encoded>
					
					<wfw:commentRss>https://www.derschmale.com/2014/07/24/faster-variance-soft-shadow-mapping-for-varying-penumbra-sizes/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>Deferred Subsurface Scattering using Compute Shaders</title>
		<link>https://www.derschmale.com/2014/06/02/deferred-subsurface-scattering-using-compute-shaders/</link>
					<comments>https://www.derschmale.com/2014/06/02/deferred-subsurface-scattering-using-compute-shaders/#comments</comments>
		
		<dc:creator><![CDATA[David]]></dc:creator>
		<pubDate>Mon, 02 Jun 2014 10:26:27 +0000</pubDate>
				<category><![CDATA[C++]]></category>
		<category><![CDATA[DirectX]]></category>
		<category><![CDATA[Graphics]]></category>
		<category><![CDATA[Helix]]></category>
		<category><![CDATA[3D]]></category>
		<category><![CDATA[compute shader]]></category>
		<category><![CDATA[DirectX 11]]></category>
		<category><![CDATA[effects]]></category>
		<category><![CDATA[graphics]]></category>
		<category><![CDATA[math]]></category>
		<category><![CDATA[rendering]]></category>
		<category><![CDATA[Shaders]]></category>
		<category><![CDATA[shading]]></category>
		<category><![CDATA[skin rendering]]></category>
		<category><![CDATA[Subsurface Scattering]]></category>
		<guid isPermaLink="false">http://www.derschmale.com/?p=994</guid>

					<description><![CDATA[I&#8217;ve recently decided to look into supporting subsurface scattering in my playground rendering engine Helix. It&#8217;s not the first time I&#8217;ve dabbled in the subject, but not being limited to crappy platforms I could push things a bit further. It&#8217;s a well researched and oft written about topic, so I&#8217;ve been reluctant to write up&#8230;]]></description>
										<content:encoded><![CDATA[<p><a href="http://www.derschmale.com/blog/wp-content/frontview.jpg"><img loading="lazy" decoding="async" class="alignleft size-medium wp-image-995" src="http://www.derschmale.com/blog/wp-content/frontview-233x300.jpg" alt="Deferred Subsurface Scattering using Compute Shaders" width="233" height="300" srcset="https://www.derschmale.com/blog/wp-content/frontview-233x300.jpg 233w, https://www.derschmale.com/blog/wp-content/frontview.jpg 576w" sizes="auto, (max-width: 233px) 100vw, 233px" /></a>I&#8217;ve recently decided to look into supporting subsurface scattering in my playground rendering engine Helix. It&#8217;s not the first time I&#8217;ve dabbled in the subject, but no longer being limited to crappy platforms, I could push things a bit further. It&#8217;s a well researched and oft written about topic, so I&#8217;ve been reluctant to write up the results of my implementation, especially seeing how heavily it&#8217;s based on these writings. But then I reflected on why I started this blog in the first place: sharing a learning process. Not everyone likes reading papers, so an implementation overview to go along with them might be helpful. The implementation I&#8217;ll show is simplified and slightly different from what you&#8217;ll find in the source material:</p>
<ul>
<li>Eugene d&#8217;Eon: <a title="Advanced Techniques for Realistic Real-Time Skin Rendering" href="http://http.developer.nvidia.com/GPUGems3/gpugems3_ch14.html" target="_blank">GPU Gems 3 Chapter 14: Advanced Techniques for Realistic Real-Time Skin Rendering</a></li>
<li><a href="http://www.iryoku.com/" title="Jorge Jimenez" target="_blank">Jorge Jimenez</a> and Diego Gutierrez&#8217; article in <a title="Screenspace Subsurface Scattering" href="http://www.crcpress.com/product/isbn/9781568814728#" target="_blank">GPU Pro: Screen-Space Subsurface Scattering</a></li>
<li>Nicolas Schulz: <a title="The Rendering Technology of Ryse" href="http://www.crytek.com/download/2014_03_25_CRYENGINE_GDC_Schultz.pdf" target="_blank">The Rendering Technology of Ryse</a> (I discovered relatively late how similar my solution was to this, which is both a bummer and encouraging ;) )</li>
</ul>
<p>I really recommend checking out these links if you have more than a passing interest in the subject (or to verify that I really can&#8217;t take credit for much in this post!). Finally, Gaussian convolutions are an important concept in what follows so if you&#8217;re hazy on the subject, read up on <a title="Gaussian blur" href="http://en.wikipedia.org/wiki/Gaussian_blur" target="_blank">Gaussian blurs</a> and how to implement them on compute shaders (explained in any decent DirectX 11 intro book, or <a href="http://amd-dev.wpengine.netdna-cdn.com/wordpress/media/2012/10/Efficient%20Compute%20Shader%20Programming.pps" title="Efficient Compute Shader Programming" target="_blank">here</a>).</p>
<p>The head model used in the screenshots is provided by Ten24 <a title="Ten24: Skin Shading in Unity" href="http://www.ten24.info/?p=1164" target="_blank">here</a>.</p>
<h3>Introduction</h3>
<p>Simply put, light can do two things when hitting an object. Either it reflects (specular reflections) or it enters the object. In the latter case, it bounces around inside for a bit before re-emerging at the surface (diffuse scattering). The properties of the material define how far the light is likely to travel before exiting. This distance is often so small that we traditionally assume that it emerges at the exact same point it enters. For translucent materials, however, this distance can be quite large and the assumption fails to produce convincing images; surfaces look too harsh or claylike.<br />
As light passes through, parts of its spectrum get absorbed. In other words, light tends to discolour more the further it travels underneath the surface. The function of discolouration over distance is expressed using a diffusion profile (refer to <a title="Advanced Techniques for Realistic Real-Time Skin Rendering" href="http://http.developer.nvidia.com/GPUGems3/gpugems3_ch14.html" target="_blank">d&#8217;Eon</a>). Some materials consist of several layers that absorb light differently (consider the layers of skin: oil, epidermis, dermis) so the diffusion profile can get quite complex. d&#8217;Eon uses the sum of 6 Gaussians to approximate the profile for skin, and Jimenez further reduces it to 4 (which can be implemented as 3, more on that later). Again, I suggest reading those articles if you haven&#8217;t yet; I don&#8217;t want to repeat them <em>too</em> much.<br />
Translucency results in a couple of visible effects, depending on where the light enters and exits the surface relative to the viewer. We&#8217;ll deal with each in separate ways.</p>
<ul>
<li><strong>Back-lit transmittance:</strong> Light enters the back side of an object and exits from the front.</li>
<li><strong>Same-side surface scattering:</strong> Light exits on the same side of an object.</li>
</ul>
<div id="attachment_997" style="width: 306px" class="wp-caption aligncenter"><a href="http://www.derschmale.com/blog/wp-content/alien3banner.jpg"><img loading="lazy" decoding="async" aria-describedby="caption-attachment-997" class="size-medium wp-image-997" src="http://www.derschmale.com/blog/wp-content/alien3banner-300x127.jpg" alt="Backlit translucency" width="300" height="127" srcset="https://www.derschmale.com/blog/wp-content/alien3banner-300x127.jpg 300w, https://www.derschmale.com/blog/wp-content/alien3banner.jpg 650w" sizes="auto, (max-width: 300px) 100vw, 300px" /></a><p id="caption-attachment-997" class="wp-caption-text">&#8220;That&#8217;s a nice backlit ear you got there!&#8221;</p></div>
<p>For my own implementation, I put forth the following goals:</p>
<ul>
<li>Both back-lighting and same-side surface scattering should be supported.</li>
<li>It needs to work nicely with Helix&#8217;s deferred rendering pipeline. This meant working in screenspace.</li>
<li>Support multiple material profiles: not only skin, but wax, marble, jade, sticky white substances, &#8230;</li>
<li>While not being limited to it, it should be able to represent believable skin.</li>
</ul>
<h3>Render pipeline</h3>
<p>To get started, I&#8217;ll detail the GBuffer layout and the render pipeline I settled on (even if it&#8217;s something I keep changing constantly ;) ):</p>
<div>
<table width="582" cellspacing="0" cellpadding="4">
<colgroup>
<col width="20" />
<col width="138" />
<col width="135" />
<col width="118" />
<col width="129" /> </colgroup>
<tbody>
<tr valign="TOP">
<td width="20">0</td>
<td colspan="3" bgcolor="#808080" width="407"><span style="color: #ffffff;">Depth</span></td>
<td bgcolor="#808080" width="129"><span style="color: #ffffff;">Stencil</span></td>
</tr>
<tr valign="TOP">
<td width="20">1</td>
<td bgcolor="#c5000b" width="138"><span style="color: #ffffff;">Albedo R</span></td>
<td bgcolor="#579d1c" width="135"><span style="color: #ffffff;">Albedo G</span></td>
<td bgcolor="#0000cc" width="118"><span style="color: #ffffff;">Albedo B</span></td>
<td bgcolor="#808080" width="129"><span style="color: #ffffff;">Emission</span></td>
</tr>
<tr valign="TOP">
<td width="20">2</td>
<td bgcolor="#c5000b" width="138"><span style="color: #ffffff;">Packed normal X</span></td>
<td bgcolor="#579d1c" width="135"><span style="color: #ffffff;">Packed normal Y</span></td>
<td bgcolor="#0000cc" width="118"><span style="color: #ffffff;">Translucency</span></td>
<td bgcolor="#808080" width="129"><span style="color: #ffffff;">Extended Material Profile ID</span></td>
</tr>
<tr valign="TOP">
<td width="20">3</td>
<td bgcolor="#c5000b" width="138"><span style="color: #ffffff;">Metallicness</span></td>
<td bgcolor="#579d1c" width="135"><span style="color: #ffffff;">Normal specular reflectivity</span></td>
<td bgcolor="#0000cc" width="118"><span style="color: #ffffff;">Roughness</span></td>
<td bgcolor="#808080" width="129"><span style="color: #ffffff;">TBD</span></td>
</tr>
</tbody>
</table>
<p>Layer 3 is irrelevant for subsurface scattering since it&#8217;s only used for specular reflections. In case you&#8217;re interested, it&#8217;s not unlike <a title="Real Shading in Unreal Engine 4 " href="http://blog.selfshadow.com/publications/s2013-shading-course/karis/s2013_pbs_epic_slides.pdf" target="_blank">Unreal Engine 4&#8217;s approach to specular representations</a>.</p>
</div>
<div>Two entries are mainly of concern here:</div>
<div>
<ul>
<li><strong>Translucency</strong>: The amount of back-lighting allowed to pass through the surface.</li>
<li><strong>Extended material profile: </strong>Contains an ID of the surface type. For example: 0 = default, 1 = skin, 2 = marble, &#8230; More on that later.</li>
</ul>
<p>Subsurface scattering does not affect specular reflections, so we&#8217;ll need to accumulate the lighting using separate HDR light accumulation buffers for diffuse and specular (the R11G11B11_FLOAT texture format worked well for me). Our diffuse target does not have albedo applied yet. The render pipeline is as follows:</p>
<ol>
<li>Render material properties to the GBuffer.</li>
<li>Render lights to diffuse + specular accumulation buffers (lighting includes transmittance)</li>
<li>Perform subsurface scattering</li>
<li>Combine lighting and apply albedo: <em>light = (diffuse + emission) * albedo + specular</em></li>
<li>Post-processing (bloom, tone mapping, &#8230;)</li>
</ol>
<div id="attachment_998" style="width: 306px" class="wp-caption aligncenter"><a href="http://www.derschmale.com/blog/wp-content/render-pipeline.jpg"><img loading="lazy" decoding="async" aria-describedby="caption-attachment-998" class="size-medium wp-image-998" src="http://www.derschmale.com/blog/wp-content/render-pipeline-300x178.jpg" alt="Render pipeline" width="300" height="178" srcset="https://www.derschmale.com/blog/wp-content/render-pipeline-300x178.jpg 300w, https://www.derschmale.com/blog/wp-content/render-pipeline.jpg 1024w" sizes="auto, (max-width: 300px) 100vw, 300px" /></a><p id="caption-attachment-998" class="wp-caption-text">Helix&#8217; lighting pipeline</p></div>
<p>Why doesn&#8217;t the diffuse accumulation buffer have albedo applied, you ask? It probably doesn&#8217;t matter all that much, but my reasoning is as follows: albedo maps are usually generated from scans in evenly lit situations and as such already exhibit a degree of subsurface scattering (that&#8217;s why they are coloured the way they are in the first place!). Similarly, maps created by artists tend to mimic this look as well.</p>
<h3>Extended Material Profiles</h3>
<p>To support different material types, we store an &#8220;extended material profile&#8221; index in the GBuffer. This will be used to access a <a title="StructuredBuffer" href="http://msdn.microsoft.com/en-us/library/windows/desktop/ff471514(v=vs.85).aspx" target="_blank">structured buffer</a> object in the shaders. Each entry is of type <em>ExtendedMaterialProfile</em> which contains details about the (sub)surface properties. Since these properties are per material type and don&#8217;t vary per pixel (which is of course a simplification of reality) we don&#8217;t need to store all properties in the GBuffer, which would be prohibitively expensive. This construct is not necessarily only used for subsurface scattering but could be extended for other effects. The <em>ExtendedMaterialProfile</em> struct is defined in the shader as follows:</p>
<div class="codecolorer-container cpp default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;"><div class="cpp codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap;"><span style="color: #0000ff;">struct</span> ExtendedMaterialProfile<br />
<span style="color: #008000;">&#123;</span><br />
&nbsp; &nbsp; <span style="color: #666666;">// same-side scattering properties</span><br />
&nbsp; &nbsp; <span style="color: #0000ff;">int</span> enableSubsurfaceScattering<span style="color: #008080;">;</span><br />
&nbsp; &nbsp; uint numGaussians<span style="color: #008080;">;</span><br />
&nbsp; &nbsp; <span style="color: #0000ff;">float</span> subsurfaceRadius<span style="color: #008080;">;</span><br />
&nbsp; &nbsp; float3 originalBlendFactors<span style="color: #008080;">;</span><br />
&nbsp; &nbsp; float3 subsurfaceBlends<span style="color: #008000;">&#91;</span>MAX_GAUSSIANS<span style="color: #008000;">&#93;</span><span style="color: #008080;">;</span><br />
&nbsp; &nbsp; float4 subsurfaceGaussianExponents<span style="color: #008080;">;</span><br />
<br />
&nbsp; &nbsp; <span style="color: #666666;">// back-lit transmittance properties</span><br />
&nbsp; &nbsp; <span style="color: #0000ff;">int</span> enableDistanceBasedTransmittance<span style="color: #008080;">;</span><br />
&nbsp; &nbsp; float3 transmittanceCoefficient<span style="color: #008080;">;</span><br />
<span style="color: #008000;">&#125;</span><span style="color: #008080;">;</span></div></div>
<p>What every property means will be explained as we go.</p>
<h3>Back-lit Transmittance</h3>
<p>For default materials, we handle back-lit transmittance in a very traditional way: we invert the normal, calculate lighting for that and add it to the calculated light:</p>
<div class="codecolorer-container cpp default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;"><div class="cpp codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap;">diffuseLight <span style="color: #000080;">=</span> Diffuse<span style="color: #008000;">&#40;</span>LightDir, Normal<span style="color: #008000;">&#41;</span> <span style="color: #000040;">+</span> Diffuse<span style="color: #008000;">&#40;</span>LightDir, <span style="color: #000040;">-</span>Normal<span style="color: #008000;">&#41;</span> <span style="color: #000040;">*</span> extendedMaterialProfile.<span style="color: #007788;">transmittanceCoefficient</span> <span style="color: #000040;">*</span> GBuffer.<span style="color: #007788;">translucency</span><span style="color: #008080;">;</span></div></div>
<p><em>transmittanceCoefficient</em> is a simple colour value to modulate the amount of light transmitted. This approach is useful for thin surfaces such as leaves or paper. For objects with more volume we need to calculate (or rather: estimate) how far light has travelled through the object in order to know how much of it is absorbed. This is in fact the same as we did way back in <a title="Subsurface Scattering and Advanced Skin Rendering in Away3D 4.0 (“Broomstick”)" href="http://www.derschmale.com/2011/04/22/subsurface-scattering-and-advanced-skin-rendering-in-away3d-4-0-broomstick/" target="_blank">Away3D Broomstick</a>.</p>
<p>To recap: we get the depth value from the shadow map and use that to calculate the position of the occluder (<a title="Reconstructing positions from the depth buffer pt. 2: Perspective and orthographic general case" href="http://www.derschmale.com/2014/03/19/reconstructing-positions-from-the-depth-buffer-pt-2-perspective-and-orthographic-general-case/" target="_blank">not sure how?</a>). We can use the distance between the occluder and the shaded point as an estimate of how far the light has travelled through the object. Unfortunately, this approach requires lights to have shadow maps associated with them. Helix simply ignores distance-based transmittance for those that don&#8217;t. You may also want to consider storing linear depth values for point and spot lights to prevent reduced precision further away from the light.</p>
<p>Armed with this distance value, I&#8217;ve found that using the <a title="Beer-Lambert Law" href="http://en.wikipedia.org/wiki/Beer%E2%80%93Lambert_law" target="_blank">Beer-Lambert law</a> for transmittance allows for convincing enough results for common cases. For each colour channel, the transmitted ratio of light for distance <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-ede05c264bba0eda080918aaa09c4658_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#120;" title="Rendered by QuickLaTeX.com" height="8" width="10" style="vertical-align: 0px;"/> is as follows:</p>
<p><img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-a52d3d72c4e4ac44624828478076586f_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#92;&#92;&#42; &#84;&#40;&#120;&#41;&#32;&#61;&#32;&#101;&#94;&#123;&#45;&#99;&#120;&#125;&#92;&#92;&#42; &#99;&#32;&#61;&#32;&#116;&#114;&#97;&#110;&#115;&#109;&#105;&#116;&#116;&#97;&#110;&#99;&#101;&#67;&#111;&#101;&#102;&#102;&#105;&#99;&#105;&#101;&#110;&#116; " title="Rendered by QuickLaTeX.com" height="40" width="242" style="vertical-align: -4px;"/></p>
<p>Again, we simply use the inverted normal to get an approximation of light hitting the other side of the surface. The total diffuse lighting for the pixel will be:</p>
<div class="codecolorer-container cpp default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;"><div class="cpp codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap;">diffuseLight <span style="color: #000080;">=</span> Diffuse<span style="color: #008000;">&#40;</span>LightDir, Normal<span style="color: #008000;">&#41;</span> <span style="color: #000040;">+</span> Diffuse<span style="color: #008000;">&#40;</span>LightDir, <span style="color: #000040;">-</span>Normal<span style="color: #008000;">&#41;</span> <span style="color: #000040;">*</span> <span style="color: #0000dd;">exp</span><span style="color: #008000;">&#40;</span><span style="color: #000040;">-</span>extendedMaterialProfile.<span style="color: #007788;">transmittanceCoefficient</span> <span style="color: #000040;">*</span> distance<span style="color: #008000;">&#41;</span> <span style="color: #000040;">*</span> GBuffer.<span style="color: #007788;">translucency</span><span style="color: #008080;">;</span></div></div>
<p>Louis Bavoil suggests a nice artist-friendly way to calculate the <em>transmittanceCoefficient</em> value for a measured colour at a given distance, which is implemented on the C++ side of the <em>ExtendedMaterialProfile</em> in the following convenience method:</p>
<div class="codecolorer-container cpp default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;"><div class="cpp codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap;"><span style="color: #0000ff;">void</span> ExtendedMaterialProfile<span style="color: #008080;">::</span><span style="color: #007788;">SetTransmittanceCoefficientByDistance</span><span style="color: #008000;">&#40;</span>float3 measuredColor, <span style="color: #0000ff;">float</span> measureDistance<span style="color: #008000;">&#41;</span><br />
<span style="color: #008000;">&#123;</span><br />
&nbsp; &nbsp; enableDistanceBasedTransmittance <span style="color: #000080;">=</span> <span style="color: #0000dd;">1</span><span style="color: #008080;">;</span><br />
&nbsp; &nbsp; transmittanceCoefficient <span style="color: #000080;">=</span> <span style="color: #000040;">-</span>Ln<span style="color: #008000;">&#40;</span>measuredColor<span style="color: #008000;">&#41;</span> <span style="color: #000040;">/</span> measureDistance<span style="color: #008080;">;</span><br />
<span style="color: #008000;">&#125;</span></div></div>
<div id="attachment_1009" style="width: 156px" class="wp-caption alignright"><a href="http://www.derschmale.com/blog/wp-content/transmissionMask.jpg"><img loading="lazy" decoding="async" aria-describedby="caption-attachment-1009" class="size-thumbnail wp-image-1009" src="http://www.derschmale.com/blog/wp-content/transmissionMask-150x150.jpg" alt="The transmittance mask used for the head" width="150" height="150" srcset="https://www.derschmale.com/blog/wp-content/transmissionMask-150x150.jpg 150w, https://www.derschmale.com/blog/wp-content/transmissionMask-300x300.jpg 300w, https://www.derschmale.com/blog/wp-content/transmissionMask-50x50.jpg 50w, https://www.derschmale.com/blog/wp-content/transmissionMask.jpg 512w" sizes="auto, (max-width: 150px) 100vw, 150px" /></a><p id="caption-attachment-1009" class="wp-caption-text">The transmittance mask used for the head</p></div>
<p>The <em>enableDistanceBasedTransmittance </em>property dictates which approach is used. For leaves, we&#8217;d set it to 0, for skin we&#8217;d want 1. The amount of transmitted light is modulated using a transmittance mask, the values of which are written to the GBuffer.</p>
<p>You could also use the diffusion profile to calculate the transmittance (which is something I&#8217;ll probably experiment with at some point). For now, this is faster and quite acceptable.</p>
<div id="attachment_1007" style="width: 302px" class="wp-caption aligncenter"><a href="http://www.derschmale.com/blog/wp-content/transmission1.jpg"><img loading="lazy" decoding="async" aria-describedby="caption-attachment-1007" class="size-medium wp-image-1007" src="http://www.derschmale.com/blog/wp-content/transmission1-296x300.jpg" alt="Back-lit Transmission" width="296" height="300" srcset="https://www.derschmale.com/blog/wp-content/transmission1-296x300.jpg 296w, https://www.derschmale.com/blog/wp-content/transmission1-50x50.jpg 50w, https://www.derschmale.com/blog/wp-content/transmission1.jpg 732w" sizes="auto, (max-width: 296px) 100vw, 296px" /></a><p id="caption-attachment-1007" class="wp-caption-text">The result of the back-lit implementation</p></div>
<p>&nbsp;</p>
<h3>Same-side subsurface scattering</h3>
<div id="attachment_1008" style="width: 306px" class="wp-caption aligncenter"><a href="http://www.derschmale.com/blog/wp-content/sss-comparison.jpg"><img loading="lazy" decoding="async" aria-describedby="caption-attachment-1008" class="size-medium wp-image-1008" src="http://www.derschmale.com/blog/wp-content/sss-comparison-300x196.jpg" alt="SSS Comparison" width="300" height="196" srcset="https://www.derschmale.com/blog/wp-content/sss-comparison-300x196.jpg 300w, https://www.derschmale.com/blog/wp-content/sss-comparison-1024x671.jpg 1024w, https://www.derschmale.com/blog/wp-content/sss-comparison.jpg 1082w" sizes="auto, (max-width: 300px) 100vw, 300px" /></a><p id="caption-attachment-1008" class="wp-caption-text">Comparison between not using same-side subsurface scattering and the implementation in Helix</p></div>
<p>This aspect deserves a bit more finesse. After all, it&#8217;s what gives skin its organic fleshy look and we want the implementation to be solid enough to support this believably. As humans, we&#8217;re very attuned to recognizing others as humans; we can easily spot fake ones based on small perceptual errors. We&#8217;ll base ourselves on Jimenez&#8217; approach of using 4 Gaussians and we&#8217;ll treat other diffusion profiles the same way. Remember what Gaussian distributions look like for variance <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-6a987274197f5fb6bfd3855d351bc2af_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#92;&#115;&#105;&#103;&#109;&#97;&#94;&#50;" title="Rendered by QuickLaTeX.com" height="15" width="18" style="vertical-align: 0px;"/>:</p>
<p class="ql-center-displayed-equation" style="line-height: 48px;"><span class="ql-right-eqno"> &nbsp; </span><span class="ql-left-eqno"> &nbsp; </span><img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-ee33a62f2bbef7cb409ef5627a507d47_l3.png" height="48" width="101" class="ql-img-displayed-equation quicklatex-auto-format" alt="&#92;&#091; &#71;&#40;&#120;&#41;&#32;&#61;&#32;&#92;&#102;&#114;&#97;&#99;&#123;&#101;&#94;&#123;&#45;&#92;&#102;&#114;&#97;&#99;&#123;&#120;&#94;&#50;&#125;&#123;&#92;&#98;&#101;&#116;&#97;&#125;&#125;&#125;&#123;&#92;&#97;&#108;&#112;&#104;&#97;&#125; &#92;&#093;" title="Rendered by QuickLaTeX.com"/></p>
<p class="ql-center-displayed-equation" style="line-height: 18px;"><span class="ql-right-eqno"> &nbsp; </span><span class="ql-left-eqno"> &nbsp; </span><img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-2b322d7aaba5593e726a34cc81e72c16_l3.png" height="18" width="89" class="ql-img-displayed-equation quicklatex-auto-format" alt="&#92;&#091; &#92;&#97;&#108;&#112;&#104;&#97;&#32;&#61;&#32;&#92;&#115;&#113;&#114;&#116;&#123;&#50;&#92;&#112;&#105;&#92;&#115;&#105;&#103;&#109;&#97;&#94;&#50;&#125; &#92;&#093;" title="Rendered by QuickLaTeX.com"/></p>
<p class="ql-center-displayed-equation" style="line-height: 21px;"><span class="ql-right-eqno"> &nbsp; </span><span class="ql-left-eqno"> &nbsp; </span><img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-e10054ae348d28bf98af3f6d57522686_l3.png" height="21" width="61" class="ql-img-displayed-equation quicklatex-auto-format" alt="&#92;&#091; &#92;&#98;&#101;&#116;&#97;&#32;&#61;&#32;&#50;&#32;&#92;&#115;&#105;&#103;&#109;&#97;&#94;&#50; &#92;&#093;" title="Rendered by QuickLaTeX.com"/></p>
<p>Jimenez&#8217; approach can be thought of as performing 4 Gaussian blurs on the image and blending them together with different weights per colour channel. Note that the sum of all blend weights must be 1 for every colour channel or energy would be lost or gained when compositing.</p>
<div id="attachment_1001" style="width: 306px" class="wp-caption aligncenter"><a href="http://www.derschmale.com/blog/wp-content/weights.jpg"><img loading="lazy" decoding="async" aria-describedby="caption-attachment-1001" class="size-medium wp-image-1001" src="http://www.derschmale.com/blog/wp-content/weights-300x300.jpg" alt="Blending together weighted gaussian convolutions" width="300" height="300" srcset="https://www.derschmale.com/blog/wp-content/weights-300x300.jpg 300w, https://www.derschmale.com/blog/wp-content/weights-150x150.jpg 150w, https://www.derschmale.com/blog/wp-content/weights-50x50.jpg 50w, https://www.derschmale.com/blog/wp-content/weights.jpg 1024w" sizes="auto, (max-width: 300px) 100vw, 300px" /></a><p id="caption-attachment-1001" class="wp-caption-text">Blending together weighted Gaussian convolutions</p></div>
<p>Speaking of Gaussian blurs: some implementations do exactly that. By using a fixed sample count, the samples&#8217; weights can be precomputed (the total blended sum-of-Gaussians, that is). The sampling disk around the centre pixel is rotated to more closely match the orientation of the surface and projected to screen space. This way, a somewhat correct range is sampled. However, using the sampling points&#8217; distances on the sampling disk as input for the Gaussian weights assumes that all sampled points are evenly spaced. This does not necessarily match what&#8217;s on screen. Take a look at a top-down view of such a sampling:</p>
<p><a href="http://www.derschmale.com/blog/wp-content/gaussian-defect.jpg"><img loading="lazy" decoding="async" class="aligncenter size-medium wp-image-1002" src="http://www.derschmale.com/blog/wp-content/gaussian-defect-300x137.jpg" alt="Distance discrepancy" width="300" height="137" srcset="https://www.derschmale.com/blog/wp-content/gaussian-defect-300x137.jpg 300w, https://www.derschmale.com/blog/wp-content/gaussian-defect.jpg 656w" sizes="auto, (max-width: 300px) 100vw, 300px" /></a></p>
<p>As you can see, the real distances to the central point are quite different. This can lead to an over-contribution of samples at strongly varying surfaces, manifesting in halos. In my implementation, I used the depth buffer to reconstruct view space positions to get actual view-space distances. While still not a <em>correct</em> estimate of how far the light has travelled underneath the surface, it&#8217;s a better approximation. However, it also means that the samples aren&#8217;t necessarily evenly distributed with respect to the Gaussian curve. This is really only a problem at discontinuities and is in my opinion less objectionable than halo artefacts. Since our sampling weights are not known beforehand, we need to manually normalize the calculation. This means we&#8217;ll need to sum all calculated weights and use that sum to &#8216;average out&#8217; the total.</p>
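<p>To make the manual normalization concrete, here is a CPU-side C++ sketch for a single Gaussian and a single colour channel. It is not the actual compute shader; the <em>Sample</em> struct and <em>GaussianBlurNormalized</em> are hypothetical names, and the distances are assumed to have been reconstructed from the depth buffer already.</p>

```cpp
#include <cmath>
#include <vector>

// One sample along the blur direction (hypothetical layout for illustration).
struct Sample
{
    float distance; // view-space distance to the centre pixel's position
    float light;    // diffuse light at the sample (one colour channel)
};

// Weighted average using unnormalized Gaussian weights exp(-x^2 / (2 * variance)).
// Dividing by the summed weights takes the place of a precomputed normalization.
float GaussianBlurNormalized(const std::vector<Sample>& samples, float variance)
{
    float total = 0.0f;
    float totalWeight = 0.0f;
    for (const Sample& s : samples)
    {
        float w = std::exp(-(s.distance * s.distance) / (2.0f * variance));
        total += s.light * w;
        totalWeight += w;
    }
    return totalWeight > 0.0f ? total / totalWeight : 0.0f;
}
```

<p>Because the result is divided by the accumulated weight, a uniformly lit surface stays exactly as bright as it was, no matter how unevenly the samples end up being spaced.</p>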
<p>There are some extra benefits to this approach. Because we&#8217;re manually normalizing the total, we don&#8217;t have to use a fixed sample count: we can limit the sampling to exactly the pixels we need. A traditional Gaussian convolution with precomputed weights would require us to sample <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-5793832f979c2268e3694c246d53b1bb_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#78;" title="Rendered by QuickLaTeX.com" height="12" width="16" style="vertical-align: 0px;"/> points, even if the radius is less than <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-5793832f979c2268e3694c246d53b1bb_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#78;" title="Rendered by QuickLaTeX.com" height="12" width="16" style="vertical-align: 0px;"/> pixels. This doesn&#8217;t really contribute any new info (sampling 11 points across 4 pixels is a waste). Similarly, when the radius exceeds <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-5793832f979c2268e3694c246d53b1bb_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#78;" title="Rendered by QuickLaTeX.com" height="12" width="16" style="vertical-align: 0px;"/>, we can now add the contribution for every pixel, increasing the quality.</p>
<p>A smaller bonus is that the Gaussian calculation can be simplified a bit. The Gaussian functions themselves don&#8217;t need to be scaled with a normalization factor as it will be implicit in the total (i.e. we can drop the <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-8f0b6b1a01f8fcc2f95be0364c090397_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#92;&#97;&#108;&#112;&#104;&#97;" title="Rendered by QuickLaTeX.com" height="8" width="11" style="vertical-align: 0px;"/> factor in the Gaussian formula above).</p>
<p>Using the real distance and a manual normalization isn&#8217;t without its drawbacks, however. Most obviously, the position needs to be reconstructed from the depth buffer. This means extra texture fetches and calculations. Furthermore, we can&#8217;t precompute the sum-of-Gaussians and store it in a lookup texture. Every Gaussian needs to be calculated separately, and a total weight has to be accumulated for each, so that each curve can be normalized individually. If we didn&#8217;t do this, we&#8217;d get an incorrect balance between the layers.</p>
<p>Finally, Jimenez observes that the first Gaussian for skin has such a small variance that it usually falls wholly within a single pixel. This means that we only need to calculate 3 Gaussians, and can use the value from the original diffuse buffer in place of the first one.</p>
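<p>To get a feel for when a Gaussian is narrow enough to be replaced by the unblurred sample, a rough check is to compute how much of its area falls inside the footprint of one pixel. The sketch below does this for a 1D Gaussian using the error function. Both helper names, the idea of expressing the pixel footprint in millimeters, and the 99% threshold are assumptions for illustration only; they are not taken from Jimenez or from the engine.</p>

```cpp
#include <cassert>
#include <cmath>

// Fraction of the area of a zero-mean 1D Gaussian that lies within
// [-halfWidth, +halfWidth]. For a normal distribution this is
// erf(halfWidth / sqrt(2 * variance)).
float MassWithin(float halfWidth, float variance)
{
    assert(variance > 0.0f);
    return std::erf(halfWidth / std::sqrt(2.0f * variance));
}

// Hypothetical helper: returns true if (nearly) the whole Gaussian fits in a
// pixel that covers `pixelSizeMm` millimeters of the surface.
// The default threshold of 99% is an arbitrary choice.
bool FitsInSinglePixel(float pixelSizeMm, float varianceMm2, float threshold = 0.99f)
{
    return MassWithin(0.5f * pixelSizeMm, varianceMm2) >= threshold;
}
```

<p>Since the pixel footprint grows with the distance to the camera, such a test would pass more often for distant surfaces and fail in extreme close-ups.</p>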
<h3>Separability</h3>
<p>2D Gaussians are separable: a 2D Gaussian convolution is identical to a horizontal 1D Gaussian followed by a vertical one, reducing the number of samples needed to <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-13b020be327c3fc955f64b2d96c329b1_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#50;&#78;" title="Rendered by QuickLaTeX.com" height="12" width="25" style="vertical-align: 0px;"/> instead of <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-929e3ccb61da0dd7de63dd25661941a3_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#78;&#94;&#50;" title="Rendered by QuickLaTeX.com" height="15" width="23" style="vertical-align: 0px;"/>. Strictly speaking, this is not the case for depth-dependent Gaussians, nor for the sum of several. In practice, ignoring this and merging everything in 2 passes anyway (instead of doing it in up to 8) doesn&#8217;t result in a noticeable difference.</p>
<div id="attachment_1003" style="width: 306px" class="wp-caption aligncenter"><a href="http://www.derschmale.com/blog/wp-content/2dvsv1gaussian.jpg"><img loading="lazy" decoding="async" aria-describedby="caption-attachment-1003" class="size-medium wp-image-1003" src="http://www.derschmale.com/blog/wp-content/2dvsv1gaussian-300x202.jpg" alt="Correct vs Separated" width="300" height="202" srcset="https://www.derschmale.com/blog/wp-content/2dvsv1gaussian-300x202.jpg 300w, https://www.derschmale.com/blog/wp-content/2dvsv1gaussian-1024x691.jpg 1024w, https://www.derschmale.com/blog/wp-content/2dvsv1gaussian.jpg 1122w" sizes="auto, (max-width: 300px) 100vw, 300px" /></a><p id="caption-attachment-1003" class="wp-caption-text">Comparing the correct 2D approach with the less correct 2x1D separated approach</p></div>
<p>I needed to use the histogram in Photoshop with the &#8220;difference&#8221; blend layer to verify that there was in fact a slight difference, mainly due to the wider red Gaussian. In any case, it gets my pseudo-separable stamp!</p>
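<p>For the plain case (a single Gaussian with a fixed variance, distances measured in pixels), the separability claim can be verified with a small CPU sketch. The <em>Image</em> alias and the function names are made up for this example, and the weights are left unnormalized to keep it short. Once the weights depend on per-pixel depth, or several Gaussians are summed in one go, the two results are no longer guaranteed to match exactly.</p>

```cpp
#include <cassert>
#include <cmath>
#include <vector>

using Image = std::vector<std::vector<float>>;

// Unnormalized 1D Gaussian weight for an integer pixel offset.
float Weight1D(int offset, float variance)
{
    return std::exp(-static_cast<float>(offset * offset) / (2.0f * variance));
}

// Reference: full 2D convolution, (2 * radius + 1)^2 samples per pixel.
// Samples outside the image are skipped (treated as black).
Image Blur2D(const Image& src, int radius, float variance)
{
    assert(!src.empty());
    const int height = static_cast<int>(src.size());
    const int width = static_cast<int>(src[0].size());
    Image dst(height, std::vector<float>(width, 0.0f));

    for (int y = 0; y < height; ++y)
        for (int x = 0; x < width; ++x)
            for (int dy = -radius; dy <= radius; ++dy)
                for (int dx = -radius; dx <= radius; ++dx) {
                    const int sy = y + dy, sx = x + dx;
                    if (sy < 0 || sy >= height || sx < 0 || sx >= width)
                        continue;
                    // the 2D weight is the product of two 1D weights
                    dst[y][x] += Weight1D(dx, variance) * Weight1D(dy, variance) * src[sy][sx];
                }
    return dst;
}

// One 1D pass along x (stepX = 1, stepY = 0) or along y (stepX = 0, stepY = 1).
Image Blur1D(const Image& src, int radius, float variance, int stepX, int stepY)
{
    assert(!src.empty());
    const int height = static_cast<int>(src.size());
    const int width = static_cast<int>(src[0].size());
    Image dst(height, std::vector<float>(width, 0.0f));

    for (int y = 0; y < height; ++y)
        for (int x = 0; x < width; ++x)
            for (int i = -radius; i <= radius; ++i) {
                const int sy = y + i * stepY, sx = x + i * stepX;
                if (sy < 0 || sy >= height || sx < 0 || sx >= width)
                    continue;
                dst[y][x] += Weight1D(i, variance) * src[sy][sx];
            }
    return dst;
}

// Separated version: a horizontal pass followed by a vertical pass,
// 2 * (2 * radius + 1) samples per pixel in total.
Image BlurSeparated(const Image& src, int radius, float variance)
{
    return Blur1D(Blur1D(src, radius, variance, 1, 0), radius, variance, 0, 1);
}
```

<p>Comparing <em>Blur2D</em> and <em>BlurSeparated</em> on the same input should only show differences at the level of floating point rounding.</p>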
<p>Performing 1D convolutions that sample only at pixel centres is pretty efficient in compute shaders, which means we can get higher fidelity and less noise than with a fragment shader using jittered samples.</p>
<h3>Implementation</h3>
<p>Returning to the <em>ExtendedMaterialProfile</em> struct, we can now explain what the other properties mean:</p>
<ul>
<li><strong>enableSubsurfaceScattering:</strong> indicates whether or not subsurface scattering should be performed for this material.</li>
<li><strong>numGaussians:</strong> the number of Gaussians. 3 for skin, for example.</li>
<li><strong>subsurfaceRadius:</strong> the sample radius in meters of the largest Gaussian, derived from the variances.</li>
<li><strong>originalBlendFactors:</strong> the amount of unblurred diffuse lighting that is blended in. This is used to replace the smallest Gaussian for skin.</li>
<li><strong>subsurfaceBlends:</strong> the amount of blending for each Gaussian layer. Summed together with originalBlendFactors, these need to add up to 1 for each channel.</li>
<li><strong>subsurfaceGaussianExponents:</strong> The exponents used to calculate the Gaussian weights (<img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-afd3e3ea7519cbdc4cc412c1ce3616a2_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#45;&#92;&#102;&#114;&#97;&#99;&#123;&#49;&#125;&#123;&#92;&#98;&#101;&#116;&#97;&#125;" title="Rendered by QuickLaTeX.com" height="25" width="24" style="vertical-align: -9px;"/> from the Gaussian formula above)</li>
</ul>
<p>On the C++ side, another convenience method is provided to set the subsurface properties:</p>
<div class="codecolorer-container cpp default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;height:300px;"><div class="cpp codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap;"><span style="color: #0000ff;">void</span> ExtendedMaterialProfile<span style="color: #008080;">::</span><span style="color: #007788;">SetSubsurfaceScattering</span><span style="color: #008000;">&#40;</span><span style="color: #0000ff;">unsigned</span> <span style="color: #0000ff;">int</span> numGaussians, <span style="color: #0000ff;">const</span> float3<span style="color: #000040;">*</span> blendWeights, <span style="color: #0000ff;">const</span> <span style="color: #0000ff;">float</span><span style="color: #000040;">*</span> variances<span style="color: #008000;">&#41;</span><br />
<span style="color: #008000;">&#123;</span><br />
&nbsp; &nbsp; float3 total<span style="color: #008000;">&#40;</span><span style="color:#800080;">0.0</span>, <span style="color:#800080;">0.0</span>, <span style="color:#800080;">0.0</span><span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span><br />
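&nbsp; &nbsp; enableSubsurfaceScattering <span style="color: #000080;">=</span> <span style="color: #0000dd;">1</span><span style="color: #008080;">;</span><br />
&nbsp; &nbsp; <span style="color: #666666;">// reset the radius so a previous call cannot leave a stale, larger value behind</span><br />
&nbsp; &nbsp; subsurfaceRadius <span style="color: #000080;">=</span> <span style="color:#800080;">0.0f</span><span style="color: #008080;">;</span><br />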
&nbsp; &nbsp; this<span style="color: #000040;">-&gt;</span>numGaussians <span style="color: #000080;">=</span> numGaussians<span style="color: #008080;">;</span><br />
<br />
&nbsp; &nbsp; <span style="color: #0000ff;">for</span> <span style="color: #008000;">&#40;</span><span style="color: #0000ff;">unsigned</span> <span style="color: #0000ff;">int</span> i <span style="color: #000080;">=</span> <span style="color: #0000dd;">0</span><span style="color: #008080;">;</span> i <span style="color: #000080;">&lt;</span> numGaussians<span style="color: #008080;">;</span> <span style="color: #000040;">++</span>i<span style="color: #008000;">&#41;</span> <span style="color: #008000;">&#123;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #666666;">// calculate the total blend weights of the gaussians, used to automatically set the amount of unblurred light</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; total <span style="color: #000040;">+</span><span style="color: #000080;">=</span> blendWeights<span style="color: #008000;">&#91;</span>i<span style="color: #008000;">&#93;</span><span style="color: #008080;">;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; subsurfaceBlends<span style="color: #008000;">&#91;</span>i<span style="color: #008000;">&#93;</span> <span style="color: #000080;">=</span> blendWeights<span style="color: #008000;">&#91;</span>i<span style="color: #008000;">&#93;</span><span style="color: #008080;">;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #666666;">// gaussian normal distribution</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; subsurfaceGaussianExponents<span style="color: #008000;">&#91;</span>i<span style="color: #008000;">&#93;</span> <span style="color: #000080;">=</span> <span style="color: #000040;">-</span><span style="color:#800080;">1.0f</span> <span style="color: #000040;">/</span> <span style="color: #008000;">&#40;</span><span style="color:#800080;">2.0f</span> <span style="color: #000040;">*</span> variances<span style="color: #008000;">&#91;</span>i<span style="color: #008000;">&#93;</span><span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #666666;">// use standard deviation as a radius estimate</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #666666;">// Gaussian is expressed in terms of millimeters, radius needs to be in meters, so divide by 1000.0f!</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #0000ff;">float</span> radius <span style="color: #000080;">=</span> Sqrt<span style="color: #008000;">&#40;</span>variances<span style="color: #008000;">&#91;</span>i<span style="color: #008000;">&#93;</span><span style="color: #008000;">&#41;</span> <span style="color: #000040;">/</span> <span style="color:#800080;">1000.0f</span><span style="color: #008080;">;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #0000ff;">if</span> <span style="color: #008000;">&#40;</span>radius <span style="color: #000080;">&gt;</span> subsurfaceRadius<span style="color: #008000;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; subsurfaceRadius <span style="color: #000080;">=</span> radius<span style="color: #008080;">;</span><br />
&nbsp; &nbsp; <span style="color: #008000;">&#125;</span><br />
<br />
&nbsp; &nbsp; <span style="color: #666666;">// the amount of the original unblurred diffuse is just as much so that all blend weights sum to 1</span><br />
&nbsp; &nbsp; originalBlendFactors <span style="color: #000080;">=</span> <span style="color:#800080;">1.0</span> <span style="color: #000040;">-</span> total<span style="color: #008080;">;</span><br />
<span style="color: #008000;">&#125;</span></div></div>
<p>We&#8217;re using compute shaders to perform the 1D convolutions, so we can pre-fetch the texture samples and view-space positions once per pixel and share them across the thread group through faster group-shared memory. An overview of the compute shader:</p>
<ul>
<li>Gather and precompute everything required for the current compute shader thread group, so we don&#8217;t need to do this per sample:
<ul>
<li>diffuse radiance, sampled from accumulation buffer</li>
<li>view-space position, derived from depth buffer</li>
</ul>
</li>
<li>Retrieve the extended material ID.</li>
<li>Project the sample radius using the camera to get a radius approximation in screen space.</li>
<li>Loop over all samples falling within the projected sample radius.
<ul>
<li>Calculate distance to central pixel</li>
<li>Calculate Gaussian weight for pixel</li>
<li>Add weighted radiance sample</li>
<li>Add weight to total</li>
</ul>
</li>
<li>&#8220;Average out&#8221; total radiance using total weight count.</li>
</ul>
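<p>The inner loop and the final &#8220;averaging out&#8221; step from the overview above can be sketched on the CPU as follows. This is a simplified illustration, not the shader itself: it handles a single colour channel, the <em>Sample</em> and <em>Layer</em> structs and the <em>BlurPixel</em> function are hypothetical, and positions and variances are assumed to be expressed in the same unit.</p>

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Sample
{
    float radiance;     // diffuse radiance (one channel for brevity)
    float position[3];  // view-space position reconstructed from depth
};

struct Layer
{
    float exponent;  // -1 / (2 * variance)
    float blend;     // blend weight of this Gaussian
};

// Blurs one pixel of a row. Every Gaussian accumulates its own total and its
// own weight sum, so each curve is normalized individually before blending.
float BlurPixel(const std::vector<Sample>& row, int center, int sampleRadius,
                const std::vector<Layer>& layers, float originalBlend)
{
    assert(center >= 0 && center < static_cast<int>(row.size()));

    std::vector<float> totals(layers.size(), 0.0f);
    std::vector<float> totalWeights(layers.size(), 0.0f);
    const Sample& centerSample = row[center];

    for (int i = -sampleRadius; i <= sampleRadius; ++i) {
        const int index = center + i;
        if (index < 0 || index >= static_cast<int>(row.size()))
            continue;

        // squared distance to the central pixel in view space
        const Sample& sample = row[index];
        float distSqr = 0.0f;
        for (int k = 0; k < 3; ++k) {
            const float d = sample.position[k] - centerSample.position[k];
            distSqr += d * d;
        }

        for (std::size_t g = 0; g < layers.size(); ++g) {
            // no normalization factor needed: it cancels in the division below
            const float weight = std::exp(layers[g].exponent * distSqr);
            totals[g] += weight * sample.radiance;
            totalWeights[g] += weight;
        }
    }

    // start from the unblurred light, then add each normalized Gaussian layer
    float result = originalBlend * centerSample.radiance;
    for (std::size_t g = 0; g < layers.size(); ++g)
        result += layers[g].blend * totals[g] / totalWeights[g];

    return result;
}
```

<p>Because the central sample always contributes a weight of 1 to every layer, the divisions are safe, and as long as the blend weights plus <em>originalBlend</em> add up to 1, a uniformly lit row comes out unchanged.</p>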
<p>The code below should make this a bit clearer. It&#8217;s for the horizontal Gaussians only, but the vertical is nearly identical:</p>
<div class="codecolorer-container cpp default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;height:300px;"><div class="cpp codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap;"><span style="color: #339900;">#define NUMTHREADS 256</span><br />
<span style="color: #339900;">#define MAX_RADIUS 32</span><br />
<br />
<span style="color: #339900;">#define MAX_GAUSSIANS 4</span><br />
<br />
<span style="color: #0000ff;">struct</span> ExtendedMaterialProfile<br />
<span style="color: #008000;">&#123;</span><br />
&nbsp; &nbsp; <span style="color: #0000ff;">int</span> enableSubsurfaceScattering<span style="color: #008080;">;</span> <span style="color: #666666;">// whether or not to use subsurface scattering (0 or 1)</span><br />
&nbsp; &nbsp; uint numGaussians<span style="color: #008080;">;</span>&nbsp; &nbsp; &nbsp; <span style="color: #666666;">// the amount of gaussians to approximate the diffusion profile</span><br />
&nbsp; &nbsp; <span style="color: #0000ff;">float</span> subsurfaceRadius<span style="color: #008080;">;</span> &nbsp; &nbsp; <span style="color: #666666;">// the radius in meters of the largest Gaussian</span><br />
&nbsp; &nbsp; float3 originalBlendFactors<span style="color: #008080;">;</span>&nbsp; &nbsp; <span style="color: #666666;">// the ratio of the original unblurred texture to add</span><br />
&nbsp; &nbsp; float3 subsurfaceBlends<span style="color: #008000;">&#91;</span>MAX_GAUSSIANS<span style="color: #008000;">&#93;</span><span style="color: #008080;">;</span> <span style="color: #666666;">// the blend weights for each gaussian</span><br />
&nbsp; &nbsp; float4 subsurfaceGaussianExponents<span style="color: #008080;">;</span> <span style="color: #666666;">// The constant factor of the Gaussian exponents: -1.0f / (2.0f * variances[i]);</span><br />
&nbsp; &nbsp; <span style="color: #666666;">// We store the exponents for all 4 in a single float4 for convenience (see code)</span><br />
<br />
&nbsp; &nbsp; <span style="color: #0000ff;">int</span> enableDistanceBasedTransmittance<span style="color: #008080;">;</span> &nbsp; <span style="color: #666666;">// whether or not to use distance-based transmittance scattering (0 or 1, unused here but used in the lighting shader)</span><br />
&nbsp; &nbsp; float3 transmittanceCoefficient<span style="color: #008080;">;</span>&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #666666;">// if enableDistanceBasedTransmittance = 0, this is just the colour of the backlit light, otherwise, it's the density of the beer-law exponent</span><br />
<span style="color: #008000;">&#125;</span><span style="color: #008080;">;</span><br />
<br />
cbuffer cameraData<br />
<span style="color: #008000;">&#123;</span><br />
&nbsp; &nbsp; float4x4 projectionMatrix<span style="color: #008080;">;</span>&nbsp; &nbsp; &nbsp; <span style="color: #666666;">// the local projection matrix</span><br />
&nbsp; &nbsp; float4 viewFrustumVectors<span style="color: #008000;">&#91;</span><span style="color: #0000dd;">4</span><span style="color: #008000;">&#93;</span><span style="color: #008080;">;</span> &nbsp; <span style="color: #666666;">// contains the view frustum vectors for each edge going from the near to far plane, scaled so that z == 1, in clockwise order starting top-left</span><br />
&nbsp; &nbsp; float2 renderTargetResolution<span style="color: #008080;">;</span>&nbsp; <span style="color: #666666;">// the size of the render target</span><br />
<span style="color: #008000;">&#125;</span><span style="color: #008080;">;</span><br />
<br />
Texture2D<span style="color: #000080;">&lt;</span>float3<span style="color: #000080;">&gt;</span> diffuseSource<span style="color: #008080;">;</span><br />
Texture2D<span style="color: #000080;">&lt;</span><span style="color: #0000ff;">float</span><span style="color: #000080;">&gt;</span> gbufferDepth<span style="color: #008080;">;</span><br />
Texture2D<span style="color: #000080;">&lt;</span>float4<span style="color: #000080;">&gt;</span> gbufferNormalMaterial<span style="color: #008080;">;</span><br />
StructuredBuffer<span style="color: #000080;">&lt;</span>ExtendedMaterialProfile<span style="color: #000080;">&gt;</span> extendedMaterialProfiles<span style="color: #008080;">;</span><br />
<br />
RWTexture2D<span style="color: #000080;">&lt;</span>float3<span style="color: #000080;">&gt;</span> diffuseTarget<span style="color: #008080;">;</span><br />
<br />
groupshared float3 radianceSamples<span style="color: #008000;">&#91;</span><span style="color: #0000dd;">2</span> <span style="color: #000040;">*</span> MAX_RADIUS <span style="color: #000040;">+</span> NUMTHREADS<span style="color: #008000;">&#93;</span><span style="color: #008080;">;</span><br />
groupshared float3 positionSamples<span style="color: #008000;">&#91;</span><span style="color: #0000dd;">2</span> <span style="color: #000040;">*</span> MAX_RADIUS <span style="color: #000040;">+</span> NUMTHREADS<span style="color: #008000;">&#93;</span><span style="color: #008080;">;</span><br />
<br />
<span style="color: #666666;">// Retrieve the view vector for a given pixel coordinate.</span><br />
float4 GetViewVector<span style="color: #008000;">&#40;</span>uint2 coord<span style="color: #008000;">&#41;</span><br />
<span style="color: #008000;">&#123;</span><br />
&nbsp; &nbsp; <span style="color: #666666;">// turn coords into a uv ratio for interpolation</span><br />
&nbsp; &nbsp; float2 uv <span style="color: #000080;">=</span> coord <span style="color: #000040;">/</span> renderTargetResolution<span style="color: #008080;">;</span><br />
<br />
&nbsp; &nbsp; <span style="color: #666666;">// Uses standard bilinear interpolation</span><br />
&nbsp; &nbsp; float4 top <span style="color: #000080;">=</span> lerp<span style="color: #008000;">&#40;</span>viewFrustumVectors<span style="color: #008000;">&#91;</span><span style="color: #0000dd;">0</span><span style="color: #008000;">&#93;</span>, viewFrustumVectors<span style="color: #008000;">&#91;</span><span style="color: #0000dd;">1</span><span style="color: #008000;">&#93;</span>, uv.<span style="color: #007788;">x</span><span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span><br />
&nbsp; &nbsp; float4 bottom <span style="color: #000080;">=</span> lerp<span style="color: #008000;">&#40;</span>viewFrustumVectors<span style="color: #008000;">&#91;</span><span style="color: #0000dd;">3</span><span style="color: #008000;">&#93;</span>, viewFrustumVectors<span style="color: #008000;">&#91;</span><span style="color: #0000dd;">2</span><span style="color: #008000;">&#93;</span>, uv.<span style="color: #007788;">x</span><span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span><br />
&nbsp; &nbsp; <span style="color: #0000ff;">return</span> lerp<span style="color: #008000;">&#40;</span>bottom, top, uv.<span style="color: #007788;">y</span><span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span><br />
<span style="color: #008000;">&#125;</span><br />
<br />
<br />
<span style="color: #666666;">// Unproject a value from the depth buffer to the Z value in view space.</span><br />
<span style="color: #666666;">// Multiply the result with an interpolated frustum vector to get the actual view-space coordinates</span><br />
<span style="color: #666666;">// Refer to http://www.derschmale.com/2014/01/26/reconstructing-positions-from-the-depth-buffer/ for more info on this:</span><br />
<span style="color: #0000ff;">float</span> DepthToViewZ<span style="color: #008000;">&#40;</span><span style="color: #0000ff;">float</span> depthValue<span style="color: #008000;">&#41;</span><br />
<span style="color: #008000;">&#123;</span><br />
&nbsp; &nbsp; <span style="color: #0000ff;">return</span> projectionMatrix<span style="color: #008000;">&#91;</span><span style="color: #0000dd;">3</span><span style="color: #008000;">&#93;</span><span style="color: #008000;">&#91;</span><span style="color: #0000dd;">2</span><span style="color: #008000;">&#93;</span> <span style="color: #000040;">/</span> <span style="color: #008000;">&#40;</span>depthValue <span style="color: #000040;">-</span> projectionMatrix<span style="color: #008000;">&#91;</span><span style="color: #0000dd;">2</span><span style="color: #008000;">&#93;</span><span style="color: #008000;">&#91;</span><span style="color: #0000dd;">2</span><span style="color: #008000;">&#93;</span><span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span><br />
<span style="color: #008000;">&#125;</span><br />
<br />
<br />
<span style="color: #666666;">// returns the view-space position for the point at the given pixel</span><br />
<span style="color: #666666;">// coord: the pixel coordinate to sample the depth at</span><br />
<span style="color: #666666;">// viewDir: the view direction (with z == 1) matching the pixel coordinate</span><br />
<span style="color: #666666;">// Refer to http://www.derschmale.com/2014/01/26/reconstructing-positions-from-the-depth-buffer/ for more info on this</span><br />
float3 GetViewPosition<span style="color: #008000;">&#40;</span>uint2 coord, float3 viewDir<span style="color: #008000;">&#41;</span><br />
<span style="color: #008000;">&#123;</span><br />
&nbsp; &nbsp; <span style="color: #0000ff;">return</span> viewDir <span style="color: #000040;">*</span> DepthToViewZ<span style="color: #008000;">&#40;</span>gbufferDepth<span style="color: #008000;">&#91;</span>coord<span style="color: #008000;">&#93;</span><span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span><br />
<span style="color: #008000;">&#125;</span><br />
<br />
<span style="color: #008000;">&#91;</span>numthreads<span style="color: #008000;">&#40;</span>NUMTHREADS, <span style="color: #0000dd;">1</span>, <span style="color: #0000dd;">1</span><span style="color: #008000;">&#41;</span><span style="color: #008000;">&#93;</span><br />
<span style="color: #0000ff;">void</span> main<span style="color: #008000;">&#40;</span>uint3 dispatchThreadID <span style="color: #008080;">:</span> SV_DispatchThreadID, uint3 groupThreadID <span style="color: #008080;">:</span> SV_GroupThreadID<span style="color: #008000;">&#41;</span><br />
<span style="color: #008000;">&#123;</span> &nbsp; <br />
&nbsp; &nbsp; <span style="color: #666666;">// Store all radiance samples and view-space positions in groupshared memory.</span><br />
&nbsp; &nbsp; <span style="color: #666666;">// See Gaussian blur example at: http://amd-dev.wpengine.netdna-cdn.com/wordpress/media/2012/10/Efficient%20Compute%20Shader%20Programming.pps</span><br />
<br />
&nbsp; &nbsp; float3 viewDir <span style="color: #000080;">=</span> GetViewVector<span style="color: #008000;">&#40;</span>dispatchThreadID.<span style="color: #007788;">xy</span><span style="color: #008000;">&#41;</span>.<span style="color: #007788;">xyz</span><span style="color: #008080;">;</span><br />
&nbsp; &nbsp; float3 frustumDiff <span style="color: #000080;">=</span> float3<span style="color: #008000;">&#40;</span><span style="color: #008000;">&#40;</span>viewFrustumVectors<span style="color: #008000;">&#91;</span><span style="color: #0000dd;">2</span><span style="color: #008000;">&#93;</span>.<span style="color: #007788;">x</span> <span style="color: #000040;">-</span> viewFrustumVectors<span style="color: #008000;">&#91;</span><span style="color: #0000dd;">3</span><span style="color: #008000;">&#93;</span>.<span style="color: #007788;">x</span><span style="color: #008000;">&#41;</span> <span style="color: #000040;">/</span> renderTargetResolution.<span style="color: #007788;">x</span>, <span style="color:#800080;">0.0</span>, <span style="color:#800080;">0.0</span><span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span><br />
&nbsp; &nbsp; float3 centerSample <span style="color: #000080;">=</span> diffuseSource<span style="color: #008000;">&#91;</span>dispatchThreadID.<span style="color: #007788;">xy</span><span style="color: #008000;">&#93;</span><span style="color: #008080;">;</span><br />
&nbsp; &nbsp; float3 viewPosition <span style="color: #000080;">=</span> GetViewPosition<span style="color: #008000;">&#40;</span>dispatchThreadID.<span style="color: #007788;">xy</span>, viewDir<span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span><br />
<br />
&nbsp; &nbsp; <span style="color: #666666;">// the central pixel is placed in &quot;groupThreadID.x + MAX_RADIUS&quot;</span><br />
&nbsp; &nbsp; radianceSamples<span style="color: #008000;">&#91;</span>groupThreadID.<span style="color: #007788;">x</span> <span style="color: #000040;">+</span> MAX_RADIUS<span style="color: #008000;">&#93;</span> <span style="color: #000080;">=</span> centerSample<span style="color: #008080;">;</span><br />
&nbsp; &nbsp; positionSamples<span style="color: #008000;">&#91;</span>groupThreadID.<span style="color: #007788;">x</span> <span style="color: #000040;">+</span> MAX_RADIUS<span style="color: #008000;">&#93;</span> <span style="color: #000080;">=</span> viewPosition<span style="color: #008080;">;</span><br />
<br />
&nbsp; &nbsp; <span style="color: #0000ff;">if</span> <span style="color: #008000;">&#40;</span>groupThreadID.<span style="color: #007788;">x</span> <span style="color: #000080;">&lt;</span> MAX_RADIUS<span style="color: #008000;">&#41;</span> <span style="color: #008000;">&#123;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; float2 coord <span style="color: #000080;">=</span> dispatchThreadID.<span style="color: #007788;">xy</span> <span style="color: #000040;">-</span> uint2<span style="color: #008000;">&#40;</span>MAX_RADIUS, <span style="color: #0000dd;">0</span><span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; radianceSamples<span style="color: #008000;">&#91;</span>groupThreadID.<span style="color: #007788;">x</span><span style="color: #008000;">&#93;</span> <span style="color: #000080;">=</span> diffuseSource<span style="color: #008000;">&#91;</span>coord<span style="color: #008000;">&#93;</span><span style="color: #008080;">;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; positionSamples<span style="color: #008000;">&#91;</span>groupThreadID.<span style="color: #007788;">x</span><span style="color: #008000;">&#93;</span> <span style="color: #000080;">=</span> GetViewPosition<span style="color: #008000;">&#40;</span>coord, viewDir <span style="color: #000040;">-</span> frustumDiff <span style="color: #000040;">*</span> MAX_RADIUS<span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span><br />
&nbsp; &nbsp; <span style="color: #008000;">&#125;</span><br />
&nbsp; &nbsp; <span style="color: #0000ff;">else</span> <span style="color: #0000ff;">if</span> <span style="color: #008000;">&#40;</span>groupThreadID.<span style="color: #007788;">x</span> <span style="color: #000080;">&gt;=</span> NUMTHREADS <span style="color: #000040;">-</span> MAX_RADIUS<span style="color: #008000;">&#41;</span> <span style="color: #008000;">&#123;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; float2 coord <span style="color: #000080;">=</span> dispatchThreadID.<span style="color: #007788;">xy</span> <span style="color: #000040;">+</span> uint2<span style="color: #008000;">&#40;</span>MAX_RADIUS, <span style="color: #0000dd;">0</span><span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; radianceSamples<span style="color: #008000;">&#91;</span>groupThreadID.<span style="color: #007788;">x</span> <span style="color: #000040;">+</span> <span style="color: #0000dd;">2</span> <span style="color: #000040;">*</span> MAX_RADIUS<span style="color: #008000;">&#93;</span> <span style="color: #000080;">=</span> diffuseSource<span style="color: #008000;">&#91;</span>coord<span style="color: #008000;">&#93;</span><span style="color: #008080;">;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; positionSamples<span style="color: #008000;">&#91;</span>groupThreadID.<span style="color: #007788;">x</span> <span style="color: #000040;">+</span> <span style="color: #0000dd;">2</span> <span style="color: #000040;">*</span> MAX_RADIUS<span style="color: #008000;">&#93;</span> <span style="color: #000080;">=</span> GetViewPosition<span style="color: #008000;">&#40;</span>coord, viewDir <span style="color: #000040;">+</span> frustumDiff <span style="color: #000040;">*</span> MAX_RADIUS<span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span><br />
&nbsp; &nbsp; <span style="color: #008000;">&#125;</span><br />
<br />
&nbsp; &nbsp; <span style="color: #666666;">// Wait for all data to be ready</span><br />
&nbsp; &nbsp; GroupMemoryBarrierWithGroupSync<span style="color: #008000;">&#40;</span><span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span><br />
&nbsp; &nbsp; <br />
&nbsp; &nbsp; <span style="color: #666666;">// fetch material profile ID, stored in alpha channel of the normal/material GBuffer texture</span><br />
&nbsp; &nbsp; float4 normalMaterialData <span style="color: #000080;">=</span> gbufferNormalMaterial<span style="color: #008000;">&#91;</span>dispatchThreadID.<span style="color: #007788;">xy</span><span style="color: #008000;">&#93;</span><span style="color: #008080;">;</span><br />
&nbsp; &nbsp; ExtendedMaterialProfile profile <span style="color: #000080;">=</span> extendedMaterialProfiles<span style="color: #008000;">&#91;</span>normalMaterialData.<span style="color: #007788;">w</span> <span style="color: #000040;">*</span> <span style="color: #0000dd;">255</span><span style="color: #008000;">&#93;</span><span style="color: #008080;">;</span><br />
<br />
&nbsp; &nbsp; <span style="color: #666666;">// if no subsurface scattering is required, simply output the original sample</span><br />
&nbsp; &nbsp; <span style="color: #0000ff;">if</span> <span style="color: #008000;">&#40;</span><span style="color: #000040;">!</span>profile.<span style="color: #007788;">enableSubsurfaceScattering</span><span style="color: #008000;">&#41;</span> <span style="color: #008000;">&#123;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; diffuseTarget<span style="color: #008000;">&#91;</span>dispatchThreadID.<span style="color: #007788;">xy</span><span style="color: #008000;">&#93;</span> <span style="color: #000080;">=</span> centerSample<span style="color: #008080;">;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #0000ff;">return</span><span style="color: #008080;">;</span><br />
&nbsp; &nbsp; <span style="color: #008000;">&#125;</span><br />
<br />
&nbsp; &nbsp; <span style="color: #666666;">// project the sample radius</span><br />
&nbsp; &nbsp; <span style="color: #0000ff;">float</span> w <span style="color: #000080;">=</span> viewPosition.<span style="color: #007788;">z</span> <span style="color: #000040;">*</span> projectionMatrix<span style="color: #008000;">&#91;</span><span style="color: #0000dd;">2</span><span style="color: #008000;">&#93;</span><span style="color: #008000;">&#91;</span><span style="color: #0000dd;">3</span><span style="color: #008000;">&#93;</span> <span style="color: #000040;">+</span> projectionMatrix<span style="color: #008000;">&#91;</span><span style="color: #0000dd;">3</span><span style="color: #008000;">&#93;</span><span style="color: #008000;">&#91;</span><span style="color: #0000dd;">3</span><span style="color: #008000;">&#93;</span><span style="color: #008080;">;</span><br />
&nbsp; &nbsp; <span style="color: #0000ff;">float</span> radiusProjection <span style="color: #000080;">=</span> projectionMatrix<span style="color: #008000;">&#91;</span><span style="color: #0000dd;">1</span><span style="color: #008000;">&#93;</span><span style="color: #008000;">&#91;</span><span style="color: #0000dd;">1</span><span style="color: #008000;">&#93;</span> <span style="color: #000040;">/</span> w <span style="color: #000040;">*</span> renderTargetResolution.<span style="color: #007788;">x</span> <span style="color: #000040;">*</span> <span style="color:#800080;">.25</span><span style="color: #008080;">;</span><br />
&nbsp; &nbsp; <span style="color: #0000ff;">int</span> sampleRadius <span style="color: #000080;">=</span> profile.<span style="color: #007788;">subsurfaceRadius</span> <span style="color: #000040;">*</span> radiusProjection<span style="color: #008080;">;</span><br />
<br />
&nbsp; &nbsp; <span style="color: #666666;">// sample radius too small, would just convolve a single pixel, so just return that immediately</span><br />
&nbsp; &nbsp; <span style="color: #0000ff;">if</span> <span style="color: #008000;">&#40;</span>sampleRadius <span style="color: #000080;">&lt;</span> <span style="color: #0000dd;">1</span><span style="color: #008000;">&#41;</span> <span style="color: #008000;">&#123;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; diffuseTarget<span style="color: #008000;">&#91;</span>dispatchThreadID.<span style="color: #007788;">xy</span><span style="color: #008000;">&#93;</span> <span style="color: #000080;">=</span> centerSample<span style="color: #008080;">;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #0000ff;">return</span><span style="color: #008080;">;</span><br />
&nbsp; &nbsp; <span style="color: #008000;">&#125;</span><br />
<br />
&nbsp; &nbsp; <span style="color: #666666;">// make sure we don't go out of bounds (usually when getting the camera very close)</span><br />
&nbsp; &nbsp; sampleRadius <span style="color: #000080;">=</span> min<span style="color: #008000;">&#40;</span>sampleRadius, MAX_RADIUS <span style="color: #000040;">-</span> <span style="color: #0000dd;">1</span><span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span><br />
<br />
&nbsp; &nbsp; float4 totalWeights <span style="color: #000080;">=</span> <span style="color: #0000dd;">0</span><span style="color: #008080;">;</span><br />
&nbsp; &nbsp; <span style="color: #666666;">// stores all 4 blurs in a single var</span><br />
&nbsp; &nbsp; float4x3 totals <span style="color: #000080;">=</span> <span style="color: #0000dd;">0</span><span style="color: #008080;">;</span>&nbsp; &nbsp; <br />
<br />
&nbsp; &nbsp; <span style="color: #0000ff;">for</span> <span style="color: #008000;">&#40;</span><span style="color: #0000ff;">int</span> i <span style="color: #000080;">=</span> <span style="color: #000040;">-</span>sampleRadius<span style="color: #008080;">;</span> i <span style="color: #000080;">&lt;=</span> sampleRadius<span style="color: #008080;">;</span> <span style="color: #000040;">++</span>i<span style="color: #008000;">&#41;</span> <span style="color: #008000;">&#123;</span> &nbsp; &nbsp; &nbsp; <br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #666666;">// Remember the central pixel is placed in &quot;groupThreadID.x + MAX_RADIUS&quot;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #0000ff;">int</span> index <span style="color: #000080;">=</span> groupThreadID.<span style="color: #007788;">x</span> <span style="color: #000040;">+</span> MAX_RADIUS <span style="color: #000040;">+</span> i<span style="color: #008080;">;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <br />
&nbsp; &nbsp; &nbsp; &nbsp; float3 dir <span style="color: #000080;">=</span> positionSamples<span style="color: #008000;">&#91;</span>index<span style="color: #008000;">&#93;</span> <span style="color: #000040;">-</span> viewPosition<span style="color: #008080;">;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #666666;">// Calculate the squared distance and convert from meters^2 to millimeters^2 (squared, so multiply by 1000^2) &nbsp; &nbsp; &nbsp; </span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #0000ff;">float</span> distSqr <span style="color: #000080;">=</span> dot<span style="color: #008000;">&#40;</span>dir, dir<span style="color: #008000;">&#41;</span> <span style="color: #000040;">*</span> <span style="color:#800080;">1000000.0f</span><span style="color: #008080;">;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #666666;">// Calculate all 4 Gaussian weights</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; float4 weights <span style="color: #000080;">=</span> <span style="color: #0000dd;">exp</span><span style="color: #008000;">&#40;</span>distSqr <span style="color: #000040;">*</span> profile.<span style="color: #007788;">subsurfaceGaussianExponents</span><span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span><br />
<br />
&nbsp; &nbsp; &nbsp; &nbsp; totalWeights <span style="color: #000040;">+</span><span style="color: #000080;">=</span> weights<span style="color: #008080;">;</span><br />
<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #666666;">// add the sample to each layer with their respective Gaussian weight</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #008000;">&#91;</span>unroll<span style="color: #008000;">&#93;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #0000ff;">for</span> <span style="color: #008000;">&#40;</span><span style="color: #0000ff;">int</span> j <span style="color: #000080;">=</span> <span style="color: #0000dd;">0</span><span style="color: #008080;">;</span> j <span style="color: #000080;">&lt;</span> MAX_GAUSSIANS<span style="color: #008080;">;</span> <span style="color: #000040;">++</span>j<span style="color: #008000;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; totals<span style="color: #008000;">&#91;</span>j<span style="color: #008000;">&#93;</span> <span style="color: #000040;">+</span><span style="color: #000080;">=</span> weights<span style="color: #008000;">&#91;</span>j<span style="color: #008000;">&#93;</span> <span style="color: #000040;">*</span> radianceSamples<span style="color: #008000;">&#91;</span>index<span style="color: #008000;">&#93;</span><span style="color: #008080;">;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <br />
&nbsp; &nbsp; <span style="color: #008000;">&#125;</span><br />
<br />
&nbsp; &nbsp; <span style="color: #666666;">// start with the amount of original diffuse light that we specified</span><br />
&nbsp; &nbsp; float3 total <span style="color: #000080;">=</span> centerSample <span style="color: #000040;">*</span> profile.<span style="color: #007788;">originalBlendFactors</span><span style="color: #008080;">;</span><br />
<br />
&nbsp; &nbsp; <span style="color: #666666;">// add blended blurred results</span><br />
&nbsp; &nbsp; <span style="color: #008000;">&#91;</span>unroll<span style="color: #008000;">&#93;</span><br />
&nbsp; &nbsp; <span style="color: #0000ff;">for</span> <span style="color: #008000;">&#40;</span><span style="color: #0000ff;">int</span> j <span style="color: #000080;">=</span> <span style="color: #0000dd;">0</span><span style="color: #008080;">;</span> j <span style="color: #000080;">&lt;</span> MAX_GAUSSIANS<span style="color: #008080;">;</span> <span style="color: #000040;">++</span>j<span style="color: #008000;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; total <span style="color: #000040;">+</span><span style="color: #000080;">=</span> totals<span style="color: #008000;">&#91;</span>j<span style="color: #008000;">&#93;</span> <span style="color: #000040;">/</span> totalWeights<span style="color: #008000;">&#91;</span>j<span style="color: #008000;">&#93;</span> <span style="color: #000040;">*</span> profile.<span style="color: #007788;">subsurfaceBlends</span><span style="color: #008000;">&#91;</span>j<span style="color: #008000;">&#93;</span>.<span style="color: #007788;">xyz</span><span style="color: #008080;">;</span><br />
<br />
&nbsp; &nbsp; diffuseTarget<span style="color: #008000;">&#91;</span>dispatchThreadID.<span style="color: #007788;">xy</span><span style="color: #008000;">&#93;</span> <span style="color: #000080;">=</span> total<span style="color: #008080;">;</span><br />
<span style="color: #008000;">&#125;</span></div></div>
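<p>As a sanity check, the same horizontal pass can be mirrored on the CPU for a single pixel. The following is only a minimal sketch under stated assumptions: <code>Float3</code>, <code>BlurProfile</code> and <code>blurPixel</code> are hypothetical stand-ins for the shader's types, the exponents are assumed to be negative and expressed per square millimetre, and out-of-range samples are simply skipped instead of being read from padded group-shared memory.</p>

```cpp
#include <array>
#include <cmath>
#include <vector>

// Hypothetical CPU-side stand-in for the shader's float3.
struct Float3 { float x, y, z; };
inline Float3 operator+(Float3 a, Float3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
inline Float3 operator-(Float3 a, Float3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
inline Float3 operator*(Float3 a, Float3 b) { return {a.x * b.x, a.y * b.y, a.z * b.z}; }
inline Float3 operator*(Float3 a, float s) { return {a.x * s, a.y * s, a.z * s}; }
inline float dot(Float3 a, Float3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

constexpr int kNumGaussians = 4; // mirrors MAX_GAUSSIANS in the shader

// Hypothetical profile layout; the names are illustrative only.
struct BlurProfile {
    std::array<float, kNumGaussians> exponents;  // assumed negative, in 1/mm^2
    std::array<Float3, kNumGaussians> blends;    // per-channel weight of each blurred layer
    Float3 originalBlend;                        // weight of the unblurred centre sample
};

// Blurs one pixel of a row. Positions are view-space positions in metres.
Float3 blurPixel(const std::vector<Float3>& radiance,
                 const std::vector<Float3>& positions,
                 int center, int sampleRadius, const BlurProfile& profile)
{
    std::array<float, kNumGaussians> totalWeights{};
    std::array<Float3, kNumGaussians> totals{};
    const int count = static_cast<int>(radiance.size());

    for (int i = -sampleRadius; i <= sampleRadius; ++i) {
        const int index = center + i;
        if (index < 0 || index >= count) continue; // simplification: skip instead of padding

        // Squared distance, converted from m^2 to mm^2.
        const Float3 dir = positions[index] - positions[center];
        const float distSqr = dot(dir, dir) * 1000000.0f;

        for (int j = 0; j < kNumGaussians; ++j) {
            const float weight = std::exp(distSqr * profile.exponents[j]);
            totalWeights[j] += weight;
            totals[j] = totals[j] + radiance[index] * weight;
        }
    }

    // Start from the unblurred sample, then add each normalized blurred layer.
    Float3 total = radiance[center] * profile.originalBlend;
    for (int j = 0; j < kNumGaussians; ++j)
        total = total + totals[j] * (1.0f / totalWeights[j]) * profile.blends[j];
    return total;
}
```

<p>The centre sample always contributes a weight of 1 to every layer, so the normalization never divides by zero. If the blend factors sum to 1 per channel, a uniformly lit row should come out unchanged.</p>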
<p>&nbsp;</p>
<h3>Conclusion</h3>
<div id="attachment_1005" style="width: 306px" class="wp-caption alignright"><a href="http://www.derschmale.com/blog/wp-content/chin-detail.jpg"><img loading="lazy" decoding="async" aria-describedby="caption-attachment-1005" class="size-medium wp-image-1005" src="http://www.derschmale.com/blog/wp-content/chin-detail-300x192.jpg" alt="Chin close-up" width="300" height="192" srcset="https://www.derschmale.com/blog/wp-content/chin-detail-300x192.jpg 300w, https://www.derschmale.com/blog/wp-content/chin-detail.jpg 580w" sizes="auto, (max-width: 300px) 100vw, 300px" /></a><p id="caption-attachment-1005" class="wp-caption-text">Showing the subtle fleshy discolouration using the skin settings</p></div>
<p>By adding different configurations to the StructuredBuffer in the shader, you can easily support different materials per shader. Skin is created using Jimenez&#8217; settings:</p>
<div class="codecolorer-container cpp default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;"><div class="cpp codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap;"><span style="color: #0000ff;">float</span> variances<span style="color: #008000;">&#91;</span><span style="color: #0000dd;">3</span><span style="color: #008000;">&#93;</span> <span style="color: #000080;">=</span> <span style="color: #008000;">&#123;</span> <span style="color:#800080;">.0516f</span>, <span style="color:#800080;">.2719f</span>, <span style="color:#800080;">2.0062f</span> <span style="color: #008000;">&#125;</span><span style="color: #008080;">;</span><br />
float3 blends<span style="color: #008000;">&#91;</span><span style="color: #0000dd;">3</span><span style="color: #008000;">&#93;</span> <span style="color: #000080;">=</span> <span style="color: #008000;">&#123;</span> float3<span style="color: #008000;">&#40;</span><span style="color:#800080;">.1158f</span>, <span style="color:#800080;">.3661f</span>, <span style="color:#800080;">.3439f</span><span style="color: #008000;">&#41;</span>, float3<span style="color: #008000;">&#40;</span><span style="color:#800080;">.1836f</span>, <span style="color:#800080;">.1864f</span>, <span style="color:#800080;">.0f</span><span style="color: #008000;">&#41;</span>, float3<span style="color: #008000;">&#40;</span><span style="color:#800080;">.46f</span>, <span style="color:#800080;">.0f</span>, <span style="color:#800080;">.0402f</span><span style="color: #008000;">&#41;</span> <span style="color: #008000;">&#125;</span><span style="color: #008080;">;</span><br />
profile.<span style="color: #007788;">SetSubsurfaceScattering</span><span style="color: #008000;">&#40;</span><span style="color: #0000dd;">3</span>, blends, variances<span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span><br />
profile.<span style="color: #007788;">SetTransmittanceCoefficientByDistance</span><span style="color: #008000;">&#40;</span>float3<span style="color: #008000;">&#40;</span><span style="color:#800080;">.94f</span>, <span style="color:#800080;">.14f</span>, <span style="color:#800080;">.14f</span><span style="color: #008000;">&#41;</span>, <span style="color:#800080;">.0002f</span><span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span></div></div>
<p>Other materials can be created, such as wax:</p>
<div class="codecolorer-container cpp default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;"><div class="cpp codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap;"><span style="color: #0000ff;">float</span> variances<span style="color: #008000;">&#91;</span><span style="color: #0000dd;">4</span><span style="color: #008000;">&#93;</span> <span style="color: #000080;">=</span> <span style="color: #008000;">&#123;</span> <span style="color:#800080;">.362f</span>, <span style="color:#800080;">2.144f</span>, <span style="color:#800080;">8.555f</span>, <span style="color:#800080;">34.833f</span> <span style="color: #008000;">&#125;</span><span style="color: #008080;">;</span><br />
float3 blends<span style="color: #008000;">&#91;</span><span style="color: #0000dd;">4</span><span style="color: #008000;">&#93;</span> <span style="color: #000080;">=</span> <span style="color: #008000;">&#123;</span> float3<span style="color: #008000;">&#40;</span><span style="color:#800080;">.0544f</span>, <span style="color:#800080;">.1245f</span>, <span style="color:#800080;">.2177f</span><span style="color: #008000;">&#41;</span>, float3<span style="color: #008000;">&#40;</span><span style="color:#800080;">.2436f</span>, <span style="color:#800080;">.2435f</span>, <span style="color:#800080;">.1890f</span><span style="color: #008000;">&#41;</span>, float3<span style="color: #008000;">&#40;</span><span style="color:#800080;">.3105f</span>, <span style="color:#800080;">.3158f</span>, <span style="color:#800080;">.3742f</span><span style="color: #008000;">&#41;</span>, float3<span style="color: #008000;">&#40;</span><span style="color:#800080;">.3913f</span>, <span style="color:#800080;">.3161f</span>, <span style="color:#800080;">.2189f</span><span style="color: #008000;">&#41;</span> <span style="color: #008000;">&#125;</span><span style="color: #008080;">;</span><br />
profile.<span style="color: #007788;">SetSubsurfaceScattering</span><span style="color: #008000;">&#40;</span><span style="color: #0000dd;">4</span>, blends, variances<span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span><br />
profile.<span style="color: #007788;">SetTransmittanceCoefficientByDistance</span><span style="color: #008000;">&#40;</span>float3<span style="color: #008000;">&#40;</span><span style="color:#800080;">.3913f</span>, <span style="color:#800080;">.3161f</span>, <span style="color:#800080;">.2189f</span><span style="color: #008000;">&#41;</span>, <span style="color:#800080;">.1f</span><span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span></div></div>
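<p>The shader evaluates each Gaussian as <code>exp(distSqr * exponent)</code>, so the variances passed in here have to be turned into exponents somewhere. A minimal sketch of one way to do that follows, assuming the variances are in square millimetres and the usual Gaussian form exp(-r&#178; / (2v)); <code>gaussianExponents</code> and <code>gaussianWeight</code> are hypothetical helpers and not necessarily what <code>SetSubsurfaceScattering</code> does internally.</p>

```cpp
#include <cmath>
#include <vector>

// Assumption: each variance v (in mm^2) describes a Gaussian exp(-r^2 / (2 v)),
// so the per-layer exponent multiplied with the squared distance is -1 / (2 v).
std::vector<float> gaussianExponents(const std::vector<float>& variances)
{
    std::vector<float> exponents;
    exponents.reserve(variances.size());
    for (float v : variances)
        exponents.push_back(-1.0f / (2.0f * v));
    return exponents;
}

// Weight of a sample at squared distance distSqrMm (in mm^2), as the shader computes it.
float gaussianWeight(float distSqrMm, float exponent)
{
    return std::exp(distSqrMm * exponent);
}
```

<p>Under that assumption, larger variances give exponents closer to zero, which is why the wax profile above blurs over a much wider area than the skin profile.</p>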
<div id="attachment_1004" style="width: 294px" class="wp-caption aligncenter"><a href="http://www.derschmale.com/blog/wp-content/wax-man.jpg"><img loading="lazy" decoding="async" aria-describedby="caption-attachment-1004" class="size-medium wp-image-1004" src="http://www.derschmale.com/blog/wp-content/wax-man-288x300.jpg" alt="Using the wax settings" width="288" height="300" srcset="https://www.derschmale.com/blog/wp-content/wax-man-288x300.jpg 288w, https://www.derschmale.com/blog/wp-content/wax-man.jpg 765w" sizes="auto, (max-width: 288px) 100vw, 288px" /></a><p id="caption-attachment-1004" class="wp-caption-text">Using the wax settings</p></div>
<p>I&#8217;ll leave you with an observation. Recently, I&#8217;ve been watching Breaking Bad (yeah, I&#8217;m way behind on the cool stuff). Don&#8217;t you think Cranston has the <em>best</em> distribution profile going on?<a href="http://www.derschmale.com/blog/wp-content/breaking_bad_tv_series_bryan_cranston_walter_white_1680x1050_45522.jpg"><br />
<img loading="lazy" decoding="async" class="aligncenter size-medium wp-image-1006" src="http://www.derschmale.com/blog/wp-content/breaking_bad_tv_series_bryan_cranston_walter_white_1680x1050_45522-300x187.jpg" alt="Bryan Cranston's skin is epic" width="300" height="187" srcset="https://www.derschmale.com/blog/wp-content/breaking_bad_tv_series_bryan_cranston_walter_white_1680x1050_45522-300x187.jpg 300w, https://www.derschmale.com/blog/wp-content/breaking_bad_tv_series_bryan_cranston_walter_white_1680x1050_45522-1024x640.jpg 1024w, https://www.derschmale.com/blog/wp-content/breaking_bad_tv_series_bryan_cranston_walter_white_1680x1050_45522.jpg 1680w" sizes="auto, (max-width: 300px) 100vw, 300px" /></a></p>
</div>
]]></content:encoded>
					
					<wfw:commentRss>https://www.derschmale.com/2014/06/02/deferred-subsurface-scattering-using-compute-shaders/feed/</wfw:commentRss>
			<slash:comments>4</slash:comments>
		
		
			</item>
		<item>
		<title>Reconstructing positions from the depth buffer pt. 2: Perspective and orthographic general case</title>
		<link>https://www.derschmale.com/2014/03/19/reconstructing-positions-from-the-depth-buffer-pt-2-perspective-and-orthographic-general-case/</link>
					<comments>https://www.derschmale.com/2014/03/19/reconstructing-positions-from-the-depth-buffer-pt-2-perspective-and-orthographic-general-case/#comments</comments>
		
		<dc:creator><![CDATA[David]]></dc:creator>
		<pubDate>Wed, 19 Mar 2014 21:30:56 +0000</pubDate>
				<category><![CDATA[C++]]></category>
		<category><![CDATA[DirectX]]></category>
		<category><![CDATA[Graphics]]></category>
		<category><![CDATA[OpenGL]]></category>
		<category><![CDATA[3D]]></category>
		<category><![CDATA[directx]]></category>
		<category><![CDATA[DirectX 11]]></category>
		<category><![CDATA[gpu]]></category>
		<category><![CDATA[graphics]]></category>
		<category><![CDATA[math]]></category>
		<category><![CDATA[rendering]]></category>
		<category><![CDATA[Shaders]]></category>
		<guid isPermaLink="false">http://www.derschmale.com/?p=989</guid>

					<description><![CDATA[Introduction It&#8217;s been a while since last time, when I promised a general method for depth reconstruction regardless of projection type. I had told myself to do it &#8220;soon&#8221;. Due to lack of time partly caused by moving to Ghent and other random occurrences &#8220;soon&#8221; changed into &#8220;sooner or later&#8221; and finally we&#8217;ve arrived at&#8230;]]></description>
										<content:encoded><![CDATA[<h2>Introduction</h2>
<p>It&#8217;s been a while since <a href="http://www.derschmale.com/2014/01/26/reconstructing-positions-from-the-depth-buffer/" title="Reconstructing positions from the depth buffer">last time</a>, when I promised a general method for depth reconstruction regardless of projection type. I had told myself to do it &#8220;soon&#8221;. Due to a lack of time, partly caused by moving to Ghent and other random occurrences, &#8220;soon&#8221; changed into &#8220;sooner or later&#8221;, and finally we&#8217;ve arrived at &#8220;later&#8221;. But here we are!<br />
As a small disclaimer, implementing such a general case only makes sense if you need to support different projection types and can&#8217;t provide separate shaders for each. Bespoke implementations are obviously faster, especially in the case of orthographic projections.<br />
For what follows I will continue to use excruciatingly slow step-by-step derivations.</p>
<h2>The orthographic case</h2>
<p>For completeness&#8217; sake, I&#8217;ll show position reconstruction for orthographic-only projections. This is considerably easier than for perspective projections. After all, the value stored in the depth buffer maps linearly between the near and far planes. As a result, reconstructing the view-space <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-4586e340cb83d5b642972e97a288fec2_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#122;" title="Rendered by QuickLaTeX.com" height="8" width="9" style="vertical-align: 0px;"/> value is a simple linear interpolation between the two:</p>
<p class="ql-center-displayed-equation" style="line-height: 20px;"><span class="ql-right-eqno"> &nbsp; </span><span class="ql-left-eqno"> &nbsp; </span><img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-89551aec3230534ac23252b674934c5d_l3.png" height="20" width="254" class="ql-img-displayed-equation quicklatex-auto-format" alt="&#92;&#91; &#122;&#95;&#123;&#118;&#105;&#101;&#119;&#125;&#32;&#61;&#32;&#122;&#95;&#123;&#110;&#101;&#97;&#114;&#125;&#32;&#43;&#32;&#122;&#95;&#123;&#110;&#100;&#99;&#125;&#40;&#122;&#95;&#123;&#102;&#97;&#114;&#125;&#32;&#45;&#32;&#122;&#95;&#123;&#110;&#101;&#97;&#114;&#125;&#41; &#92;&#93;" title="Rendered by QuickLaTeX.com"/></p>
<p>Recall that in DirectX, <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-a83e07f962e930fcd870d5bd7e388183_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#122;&#95;&#123;&#110;&#100;&#99;&#125;" title="Rendered by QuickLaTeX.com" height="11" width="29" style="vertical-align: -3px;"/> is simply the depth buffer value, while in OpenGL it&#8217;s <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-f63cc8f4e183be74e0e6cd3f495bc8f0_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#122;&#95;&#123;&#110;&#100;&#99;&#125;&#32;&#61;&#32;&#50;&#32;&#100;&#101;&#112;&#116;&#104;&#32;&#45;&#32;&#49;" title="Rendered by QuickLaTeX.com" height="17" width="136" style="vertical-align: -4px;"/>. The complete position can be similarly reconstructed:</p>
<p class="ql-center-displayed-equation" style="line-height: 20px;"><span class="ql-right-eqno"> &nbsp; </span><span class="ql-left-eqno"> &nbsp; </span><img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-8cb680055776d84600c4541a9ea829bc_l3.png" height="20" width="267" class="ql-img-displayed-equation quicklatex-auto-format" alt="&#92;&#91; &#80;&#95;&#123;&#118;&#105;&#101;&#119;&#125;&#32;&#61;&#32;&#80;&#95;&#123;&#110;&#101;&#97;&#114;&#125;&#32;&#43;&#32;&#122;&#95;&#123;&#110;&#100;&#99;&#125;&#40;&#80;&#95;&#123;&#102;&#97;&#114;&#125;&#32;&#45;&#32;&#80;&#95;&#123;&#110;&#101;&#97;&#114;&#125;&#41; &#92;&#93;" title="Rendered by QuickLaTeX.com"/></p>
<p><img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-55e4e9120ff1c3e2268e0b6dae185399_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#80;&#95;&#123;&#110;&#101;&#97;&#114;&#125;" title="Rendered by QuickLaTeX.com" height="15" width="38" style="vertical-align: -3px;"/> and <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-c4a19f10e37cc049b656d5a8e4411981_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#80;&#95;&#123;&#102;&#97;&#114;&#125;" title="Rendered by QuickLaTeX.com" height="18" width="32" style="vertical-align: -6px;"/> are the positions on the near and far plane for the current pixel being shaded. They are calculated by bilinearly interpolating between the near and far frustum corners. By simply passing the frustum corners from the vertex shader to the fragment shaders as interpolated values, this is handled by the hardware in the same way we interpolated the frustum direction <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-9ae8526feb6b8a99678e1d7ce2841d22_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#68;&#39;" title="Rendered by QuickLaTeX.com" height="14" width="19" style="vertical-align: 0px;"/> in the previous post. 
The corners themselves are the 8 different combinations of <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-e118de3ea74ee8930f53b66aaf4966e9_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#40;&#92;&#112;&#109;&#32;&#92;&#102;&#114;&#97;&#99;&#123;&#112;&#114;&#111;&#106;&#101;&#99;&#116;&#105;&#111;&#110;&#87;&#105;&#100;&#116;&#104;&#125;&#123;&#50;&#125;&#44;&#32;&#92;&#112;&#109;&#32;&#92;&#102;&#114;&#97;&#99;&#123;&#112;&#114;&#111;&#106;&#101;&#99;&#116;&#105;&#111;&#110;&#72;&#101;&#105;&#103;&#104;&#116;&#125;&#123;&#50;&#125;&#44;&#32;&#122;&#95;&#123;&#110;&#101;&#97;&#114;&#32;&#111;&#114;&#32;&#102;&#97;&#114;&#125;&#41;" title="Rendered by QuickLaTeX.com" height="23" width="346" style="vertical-align: -6px;"/>.</p>
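<p>The orthographic reconstruction can be sketched on the CPU as follows. This is only an illustration under stated assumptions: it uses the DirectX convention where the interpolation factor is the depth buffer value in [0, 1], and <code>Vec3</code>, <code>mix</code>, <code>planePosition</code> and <code>reconstructOrtho</code> are hypothetical names. On the GPU, the bilinear interpolation of the corners is done by the rasterizer instead of the explicit <code>u</code>/<code>v</code> arithmetic shown here.</p>

```cpp
// Hypothetical minimal vector type for the sketch.
struct Vec3 { float x, y, z; };

// Linear interpolation between a and b.
Vec3 mix(const Vec3& a, const Vec3& b, float t)
{
    return { a.x + (b.x - a.x) * t,
             a.y + (b.y - a.y) * t,
             a.z + (b.z - a.z) * t };
}

// Position on the plane at depth z for normalized screen coordinates (u, v) in [0, 1].
// Equivalent to bilinearly interpolating the four corners
// (+-projectionWidth / 2, +-projectionHeight / 2, z).
// Assumption: (0, 0) maps to the corner with negative x and negative y.
Vec3 planePosition(float u, float v, float projectionWidth, float projectionHeight, float z)
{
    return { (u - 0.5f) * projectionWidth, (v - 0.5f) * projectionHeight, z };
}

// P_view = P_near + z_ndc * (P_far - P_near), with z_ndc in [0, 1].
Vec3 reconstructOrtho(float u, float v, float zNdc,
                      float projectionWidth, float projectionHeight,
                      float zNear, float zFar)
{
    const Vec3 pNear = planePosition(u, v, projectionWidth, projectionHeight, zNear);
    const Vec3 pFar  = planePosition(u, v, projectionWidth, projectionHeight, zFar);
    return mix(pNear, pFar, zNdc);
}
```

<p>Since the near and far positions share the same x and y in an orthographic projection, only the z component actually changes with depth.</p>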
<h2>Reconstructing z: Generalization</h2>
<p>Reconstructing the view-space <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-4586e340cb83d5b642972e97a288fec2_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#122;" title="Rendered by QuickLaTeX.com" height="8" width="9" style="vertical-align: 0px;"/> value for the general case doesn&#8217;t change much compared to the perspective-only version, except that we can&#8217;t make the same assumptions about the projection matrix: <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-e8831897663574da67b56c5462c319d4_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#77;&#95;&#123;&#51;&#52;&#125;" title="Rendered by QuickLaTeX.com" height="15" width="31" style="vertical-align: -3px;"/> and <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-6f719d38a515bcc9e9a56920c1962ea6_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#77;&#95;&#123;&#52;&#52;&#125;" title="Rendered by QuickLaTeX.com" height="15" width="31" style="vertical-align: -3px;"/> no longer have the fixed values they have in a perspective projection (in particular, the latter is not 0 for orthographic projections). So we&#8217;re stuck with this:</p>
<p class="ql-center-displayed-equation" style="line-height: 39px;"><span class="ql-right-eqno"> &nbsp; </span><span class="ql-left-eqno"> &nbsp; </span><img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-e2eb78ad79518feb5b6ff08190296e07_l3.png" height="39" width="192" class="ql-img-displayed-equation quicklatex-auto-format" alt="&#92;&#91; &#122;&#95;&#123;&#118;&#105;&#101;&#119;&#125;&#32;&#61;&#32;&#45;&#92;&#102;&#114;&#97;&#99;&#123;&#122;&#95;&#123;&#110;&#100;&#99;&#125;&#32;&#77;&#95;&#123;&#52;&#52;&#125;&#32;&#45;&#32;&#77;&#95;&#123;&#52;&#51;&#125;&#125;&#123;&#122;&#95;&#123;&#110;&#100;&#99;&#125;&#32;&#77;&#95;&#123;&#51;&#52;&#125;&#32;&#45;&#32;&#77;&#95;&#123;&#51;&#51;&#125;&#125; &#92;&#93;" title="Rendered by QuickLaTeX.com"/></p>
<p>Review the <a href="http://www.derschmale.com/2014/01/26/reconstructing-positions-from-the-depth-buffer/" title="Reconstructing Positions from the Depth Buffer">previous post</a> if you&#8217;re hazy on why.</p>
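<p>In code, the general formula is a one-liner. The sketch below assumes the row-vector convention the formula implies (the same one the HLSL in these posts uses), so the 1-based <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-e8831897663574da67b56c5462c319d4_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#77;&#95;&#123;&#51;&#52;&#125;" title="Rendered by QuickLaTeX.com" height="15" width="31" style="vertical-align: -3px;"/> becomes the 0-based <code>m[2][3]</code>. <code>Mat4</code> and <code>viewZFromNdc</code> are hypothetical names.</p>

```cpp
#include <array>

// Hypothetical 4x4 matrix type: m[row][column], zero-based, row-vector convention.
using Mat4 = std::array<std::array<float, 4>, 4>;

// z_view = -(z_ndc * M44 - M43) / (z_ndc * M34 - M33)
// Works for both perspective and orthographic projection matrices,
// because no entry is assumed to be 0 or 1.
float viewZFromNdc(const Mat4& m, float zNdc)
{
    return -(zNdc * m[3][3] - m[3][2]) / (zNdc * m[2][3] - m[2][2]);
}
```

<p>For example, with a DirectX-style perspective matrix with near plane 1 and far plane 5 (<code>m[2][2] = 1.25</code>, <code>m[2][3] = 1</code>, <code>m[3][2] = -1.25</code>, <code>m[3][3] = 0</code>), a view-space depth of 2 projects to a normalized depth of 0.625, and the function maps 0.625 back to 2. With the matching orthographic matrix (<code>m[2][2] = 0.25</code>, <code>m[2][3] = 0</code>, <code>m[3][2] = -0.25</code>, <code>m[3][3] = 1</code>), the same depth projects to 0.25 and is again mapped back to 2.</p>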
<h2>Calculating the position from the z-value: Generalization</h2>
<p>In the perspective-only case, we reconstructed the position value assuming the ray origin was 0. This is no longer the case for orthographic projections, as you can see below:</p>
<p style="text-align: center;"><img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-84ca0dcda8c7677a20cd49d91f4536f1_l3.png" height="268" width="582" class="ql-manual-mode quicklatex-auto-format" alt="Rendered by QuickLaTeX.com" title="Rendered by QuickLaTeX.com"/>
</p>
<p>Evidently this means we&#8217;re going to have to use the full ray equation. We&#8217;ll have to define a ray origin point which we&#8217;ll keep on the <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-12ecccbb9aee154a4cd4992c940458d7_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#88;&#89;" title="Rendered by QuickLaTeX.com" height="12" width="30" style="vertical-align: 0px;"/>-plane going through the origin to remain compatible with the origin used in the perspective version. We&#8217;ll call this point <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-039697931c388072c796f6752c1a142d_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#79;&#95;&#48;" title="Rendered by QuickLaTeX.com" height="15" width="21" style="vertical-align: -3px;"/>, with <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-5e8deda4e34d5405f4c5ac306d33404a_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#122;&#95;&#123;&#79;&#95;&#48;&#125;&#32;&#61;&#32;&#48;" title="Rendered by QuickLaTeX.com" height="17" width="59" style="vertical-align: -5px;"/>. For both cases, we define the ray to be:</p>
<p style="text-align: center;"><img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-3a11303f0cde8f1df5212f957d4c5ae4_l3.png" height="42" width="134" class="ql-manual-mode quicklatex-auto-format" alt="Rendered by QuickLaTeX.com" title="Rendered by QuickLaTeX.com"/>
</p>
<p>Assuming <em>nothing</em>, we need to calculate <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-039697931c388072c796f6752c1a142d_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#79;&#95;&#48;" title="Rendered by QuickLaTeX.com" height="15" width="21" style="vertical-align: -3px;"/>. The <em>line</em> coinciding with the ray can be expressed as <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-67ae19f0b288c1a8088d407ad066a5ec_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#76;&#40;&#116;&#41;" title="Rendered by QuickLaTeX.com" height="18" width="31" style="vertical-align: -4px;"/> using a different origin, one whose value we know. Let&#8217;s pick the near position <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-55e4e9120ff1c3e2268e0b6dae185399_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#80;&#95;&#123;&#110;&#101;&#97;&#114;&#125;" title="Rendered by QuickLaTeX.com" height="15" width="38" style="vertical-align: -3px;"/>. 
<img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-039697931c388072c796f6752c1a142d_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#79;&#95;&#48;" title="Rendered by QuickLaTeX.com" height="15" width="21" style="vertical-align: -3px;"/> is simply that line&#8217;s intersection with the <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-12ecccbb9aee154a4cd4992c940458d7_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#88;&#89;" title="Rendered by QuickLaTeX.com" height="12" width="30" style="vertical-align: 0px;"/> plane, found by solving for <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-908111d09df817ce0ec84757339a6214_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#122;&#32;&#61;&#32;&#48;" title="Rendered by QuickLaTeX.com" height="12" width="42" style="vertical-align: 0px;"/>:</p>
<p style="text-align: center;"><img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-6c796c7c1cddc86e4e253f86fd77a8da_l3.png" height="44" width="368" class="ql-manual-mode quicklatex-auto-format" alt="Rendered by QuickLaTeX.com" title="Rendered by QuickLaTeX.com"/>
</p>
<p>Plugging <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-b4e3cbf5d4c5c6d9b702dd139f14c147_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#116;" title="Rendered by QuickLaTeX.com" height="12" width="6" style="vertical-align: 0px;"/> back into the line equation <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-67ae19f0b288c1a8088d407ad066a5ec_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#76;&#40;&#116;&#41;" title="Rendered by QuickLaTeX.com" height="18" width="31" style="vertical-align: -4px;"/>, we get <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-039697931c388072c796f6752c1a142d_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#79;&#95;&#48;" title="Rendered by QuickLaTeX.com" height="15" width="21" style="vertical-align: -3px;"/>:</p>
<p class="ql-center-displayed-equation" style="line-height: 39px;"><span class="ql-right-eqno"> &nbsp; </span><span class="ql-left-eqno"> &nbsp; </span><img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-b6cb3a8eaf75a083d6188721faf93350_l3.png" height="39" width="415" class="ql-img-displayed-equation quicklatex-auto-format" alt="&#92;&#91; &#79;&#95;&#48;&#32;&#61;&#32;&#76;&#40;&#45;&#92;&#102;&#114;&#97;&#99;&#123;&#122;&#95;&#123;&#110;&#101;&#97;&#114;&#125;&#125;&#123;&#122;&#95;&#68;&#125;&#41;&#32;&#61;&#32;&#80;&#95;&#123;&#110;&#101;&#97;&#114;&#125;&#32;&#45;&#32;&#92;&#102;&#114;&#97;&#99;&#123;&#68;&#125;&#123;&#122;&#95;&#68;&#125;&#122;&#95;&#123;&#110;&#101;&#97;&#114;&#125;&#32;&#61;&#32;&#80;&#95;&#123;&#110;&#101;&#97;&#114;&#125;&#32;&#45;&#32;&#68;&#39;&#122;&#95;&#123;&#110;&#101;&#97;&#114;&#125; &#92;&#93;" title="Rendered by QuickLaTeX.com"/></p>
<p>Remember <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-65ab7656eb59f5dc88046cbd3f985ed5_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#92;&#102;&#114;&#97;&#99;&#123;&#68;&#125;&#123;&#122;&#95;&#68;&#125;" title="Rendered by QuickLaTeX.com" height="24" width="18" style="vertical-align: -8px;"/>? We defined this as <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-9ae8526feb6b8a99678e1d7ce2841d22_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#68;&#39;" title="Rendered by QuickLaTeX.com" height="14" width="19" style="vertical-align: 0px;"/>, the <em>z-normalized</em> view vector. Armed with a ray origin, we can redo the same math as before. We&#8217;re interested in the point on the ray that is the view position, i.e. <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-ebefbea239a57fb7ac3860d16c86eac1_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#82;&#40;&#116;&#41;&#32;&#61;&#32;&#80;&#95;&#123;&#118;&#105;&#101;&#119;&#125;" title="Rendered by QuickLaTeX.com" height="18" width="97" style="vertical-align: -4px;"/>.</p>
<p class="ql-center-displayed-equation" style="line-height: 35px;"><span class="ql-right-eqno"> &nbsp; </span><span class="ql-left-eqno"> &nbsp; </span><img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-9900920275f98d666b358f4ee88b4b1f_l3.png" height="35" width="364" class="ql-img-displayed-equation quicklatex-auto-format" alt="&#92;&#91; &#80;&#95;&#123;&#118;&#105;&#101;&#119;&#125;&#32;&#61;&#32;&#79;&#95;&#48;&#32;&#43;&#32;&#116;&#68;&#32;&#92;&#105;&#102;&#102;&#32;&#116;&#32;&#61;&#32;&#92;&#102;&#114;&#97;&#99;&#123;&#122;&#95;&#123;&#118;&#105;&#101;&#119;&#125;&#32;&#45;&#32;&#122;&#95;&#123;&#79;&#95;&#48;&#125;&#125;&#123;&#122;&#95;&#68;&#125;&#32;&#61;&#32;&#92;&#102;&#114;&#97;&#99;&#123;&#122;&#95;&#123;&#118;&#105;&#101;&#119;&#125;&#125;&#123;&#122;&#95;&#68;&#125; &#92;&#93;" title="Rendered by QuickLaTeX.com"/></p>
<p>The last equality uses the fact that <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-f58bfa981fcb708c037580c6634380da_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#79;&#95;&#48;&#32;&#61;&#32;&#48;" title="Rendered by QuickLaTeX.com" height="15" width="54" style="vertical-align: -3px;"/> in its z component, since the origin was constructed to lie at z = 0. Plugging <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-b4e3cbf5d4c5c6d9b702dd139f14c147_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#116;" title="Rendered by QuickLaTeX.com" height="12" width="6" style="vertical-align: 0px;"/> back into the ray equation:</p>
<p class="ql-center-displayed-equation" style="line-height: 39px;"><span class="ql-right-eqno"> &nbsp; </span><span class="ql-left-eqno"> &nbsp; </span><img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-91395978460fad8e4208c8169c6605c7_l3.png" height="39" width="381" class="ql-img-displayed-equation quicklatex-auto-format" alt="&#92;&#91; &#80;&#95;&#123;&#118;&#105;&#101;&#119;&#125;&#32;&#61;&#32;&#79;&#95;&#48;&#32;&#43;&#32;&#116;&#68;&#32;&#61;&#32;&#79;&#95;&#48;&#32;&#43;&#32;&#122;&#95;&#123;&#118;&#105;&#101;&#119;&#125;&#92;&#102;&#114;&#97;&#99;&#123;&#68;&#125;&#123;&#122;&#95;&#68;&#125;&#32;&#61;&#32;&#79;&#95;&#48;&#32;&#43;&#32;&#122;&#95;&#123;&#118;&#105;&#101;&#119;&#125;&#68;&#39; &#92;&#93;" title="Rendered by QuickLaTeX.com"/></p>
<p>I&#8217;ve taken the long way round to show what you probably already figured out intuitively: it&#8217;s the same as the perspective case, but simply taking into account the origin vector.<br />
As in both the bespoke orthographic and the perspective cases, <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-039697931c388072c796f6752c1a142d_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#79;&#95;&#48;" title="Rendered by QuickLaTeX.com" height="15" width="21" style="vertical-align: -3px;"/> and <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-9ae8526feb6b8a99678e1d7ce2841d22_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#68;&#39;" title="Rendered by QuickLaTeX.com" height="14" width="19" style="vertical-align: 0px;"/> are to be precomputed for each corner of the quad that&#8217;s being rendered. The vertex shader can then simply pass them along to the fragment shader so they&#8217;re automatically interpolated for the pixel we&#8217;re currently operating on.</p>
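<p>As a rough CPU-side illustration of that interpolation followed by the reconstruction, here is a minimal C++ sketch. The names <code>Vec3</code>, <code>lerpVec3</code>, <code>interpolateCorners</code> and <code>reconstructViewPosition</code> are made up for this example, and the corner order (bottom-left, bottom-right, top-right, top-left) is an assumption. On the GPU the rasterizer does the interpolation for you; this only mimics it for a full-screen quad.</p>

```cpp
#include <cassert>

struct Vec3 { float x, y, z; };

// Linear interpolation between two vectors.
Vec3 lerpVec3(const Vec3& a, const Vec3& b, float t) {
    return { a.x + (b.x - a.x) * t,
             a.y + (b.y - a.y) * t,
             a.z + (b.z - a.z) * t };
}

// Bilinear interpolation of four per-corner values across the quad.
// Assumed corner order: bottom-left, bottom-right, top-right, top-left.
// u and v are the pixel's normalized quad coordinates in [0, 1].
Vec3 interpolateCorners(const Vec3 corners[4], float u, float v) {
    assert(u >= 0.0f && u <= 1.0f && v >= 0.0f && v <= 1.0f);
    Vec3 bottom = lerpVec3(corners[0], corners[1], u);
    Vec3 top = lerpVec3(corners[3], corners[2], u);
    return lerpVec3(bottom, top, v);
}

// P_view = O_0 + z_view * D'
Vec3 reconstructViewPosition(const Vec3& origin, const Vec3& zNormalizedRay, float zView) {
    return { origin.x + zView * zNormalizedRay.x,
             origin.y + zView * zNormalizedRay.y,
             origin.z + zView * zNormalizedRay.z };
}
```

<p>For an orthographic projection all four rays are (0, 0, 1) and only the origins differ per corner; for a perspective projection it is the other way round.</p>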
<h2>Calculating the view vectors</h2>
<p>All that remains is to calculate the z-normalized view vectors <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-9ae8526feb6b8a99678e1d7ce2841d22_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#68;&#39;" title="Rendered by QuickLaTeX.com" height="14" width="19" style="vertical-align: 0px;"/> for each quad corner. This is simply done by calculating the near and far frustum corners and z-normalizing the difference. The corners in view space are obtained exactly as last time: by unprojecting NDC coordinates. The following code shows this, and also performs the ray origin calculations.</p>
<div class="codecolorer-container cpp default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;height:300px;"><div class="cpp codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap;"><span style="color: #666666;">// For near corners, you should set z = -1.0f instead of 0.0f for OpenGL</span><br />
<span style="color: #666666;">// This time it does matter since we're using the unprojected position for the origin calculation.</span><br />
Vector3D nearHomogenousCorners<span style="color: #008000;">&#91;</span><span style="color: #0000dd;">4</span><span style="color: #008000;">&#93;</span> <span style="color: #000080;">=</span> <span style="color: #008000;">&#123;</span> &nbsp; <br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; Vector3D<span style="color: #008000;">&#40;</span><span style="color: #000040;">-</span><span style="color:#800080;">1.0f</span>, <span style="color: #000040;">-</span><span style="color:#800080;">1.0f</span>, <span style="color:#800080;">0.0f</span>, <span style="color:#800080;">1.0f</span><span style="color: #008000;">&#41;</span>,<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; Vector3D<span style="color: #008000;">&#40;</span><span style="color:#800080;">1.0f</span>, <span style="color: #000040;">-</span><span style="color:#800080;">1.0f</span>, <span style="color:#800080;">0.0f</span>, <span style="color:#800080;">1.0f</span><span style="color: #008000;">&#41;</span>,<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; Vector3D<span style="color: #008000;">&#40;</span><span style="color:#800080;">1.0f</span>, <span style="color:#800080;">1.0f</span>, <span style="color:#800080;">0.0f</span>, <span style="color:#800080;">1.0f</span><span style="color: #008000;">&#41;</span>,<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; Vector3D<span style="color: #008000;">&#40;</span><span style="color: #000040;">-</span><span style="color:#800080;">1.0f</span>, <span style="color:#800080;">1.0f</span>, <span style="color:#800080;">0.0f</span>, <span style="color:#800080;">1.0f</span><span style="color: #008000;">&#41;</span><br />
<span style="color: #008000;">&#125;</span><span style="color: #008080;">;</span><br />
<br />
Vector3D farHomogenousCorners<span style="color: #008000;">&#91;</span><span style="color: #0000dd;">4</span><span style="color: #008000;">&#93;</span> <span style="color: #000080;">=</span> <span style="color: #008000;">&#123;</span>&nbsp; &nbsp; <br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; Vector3D<span style="color: #008000;">&#40;</span><span style="color: #000040;">-</span><span style="color:#800080;">1.0f</span>, <span style="color: #000040;">-</span><span style="color:#800080;">1.0f</span>, <span style="color:#800080;">1.0f</span>, <span style="color:#800080;">1.0f</span><span style="color: #008000;">&#41;</span>,<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; Vector3D<span style="color: #008000;">&#40;</span><span style="color:#800080;">1.0f</span>, <span style="color: #000040;">-</span><span style="color:#800080;">1.0f</span>, <span style="color:#800080;">1.0f</span>, <span style="color:#800080;">1.0f</span><span style="color: #008000;">&#41;</span>,<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; Vector3D<span style="color: #008000;">&#40;</span><span style="color:#800080;">1.0f</span>, <span style="color:#800080;">1.0f</span>, <span style="color:#800080;">1.0f</span>, <span style="color:#800080;">1.0f</span><span style="color: #008000;">&#41;</span>,<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; Vector3D<span style="color: #008000;">&#40;</span><span style="color: #000040;">-</span><span style="color:#800080;">1.0f</span>, <span style="color:#800080;">1.0f</span>, <span style="color:#800080;">1.0f</span>, <span style="color:#800080;">1.0f</span><span style="color: #008000;">&#41;</span><br />
<span style="color: #008000;">&#125;</span><span style="color: #008080;">;</span><br />
<br />
<br />
Matrix3D inverseProjection <span style="color: #000080;">=</span> projectionMatrix.<span style="color: #007788;">Inverse</span><span style="color: #008000;">&#40;</span><span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span><br />
Vector3D rays<span style="color: #008000;">&#91;</span><span style="color: #0000dd;">4</span><span style="color: #008000;">&#93;</span><span style="color: #008080;">;</span><br />
Vector3D origins<span style="color: #008000;">&#91;</span><span style="color: #0000dd;">4</span><span style="color: #008000;">&#93;</span><span style="color: #008080;">;</span><br />
<br />
<span style="color: #0000ff;">for</span> <span style="color: #008000;">&#40;</span><span style="color: #0000ff;">unsigned</span> <span style="color: #0000ff;">int</span> i <span style="color: #000080;">=</span> <span style="color: #0000dd;">0</span><span style="color: #008080;">;</span> i <span style="color: #000080;">&lt;</span> <span style="color: #0000dd;">4</span><span style="color: #008080;">;</span> <span style="color: #000040;">++</span>i<span style="color: #008000;">&#41;</span> <span style="color: #008000;">&#123;</span>&nbsp; <br />
&nbsp; &nbsp; Vector3D<span style="color: #000040;">&amp;</span> ray <span style="color: #000080;">=</span> rays<span style="color: #008000;">&#91;</span>i<span style="color: #008000;">&#93;</span><span style="color: #008080;">;</span><br />
<br />
&nbsp; &nbsp; <span style="color: #666666;">// unproject the far and near frustum corners from NDC to view space</span><br />
&nbsp; &nbsp; Vector3D nearPos <span style="color: #000080;">=</span> inverseProjection <span style="color: #000040;">*</span> nearHomogenousCorners<span style="color: #008000;">&#91;</span>i<span style="color: #008000;">&#93;</span><span style="color: #008080;">;</span><br />
&nbsp; &nbsp; Vector3D farPos <span style="color: #000080;">=</span> inverseProjection <span style="color: #000040;">*</span> farHomogenousCorners<span style="color: #008000;">&#91;</span>i<span style="color: #008000;">&#93;</span><span style="color: #008080;">;</span><br />
&nbsp; &nbsp; nearPos <span style="color: #000040;">/</span><span style="color: #000080;">=</span> nearPos.<span style="color: #007788;">w</span><span style="color: #008080;">;</span><br />
&nbsp; &nbsp; farPos <span style="color: #000040;">/</span><span style="color: #000080;">=</span> farPos.<span style="color: #007788;">w</span><span style="color: #008080;">;</span><br />
&nbsp; &nbsp; ray <span style="color: #000080;">=</span> farPos <span style="color: #000040;">-</span> nearPos<span style="color: #008080;">;</span><br />
<br />
&nbsp; &nbsp; <span style="color: #666666;">// z-normalize this vector</span><br />
&nbsp; &nbsp; ray <span style="color: #000040;">/</span><span style="color: #000080;">=</span> ray.<span style="color: #007788;">z</span><span style="color: #008080;">;</span> &nbsp; &nbsp; &nbsp; <br />
<br />
&nbsp; &nbsp; origins<span style="color: #008000;">&#91;</span>i<span style="color: #008000;">&#93;</span> <span style="color: #000080;">=</span> nearPos <span style="color: #000040;">-</span> ray <span style="color: #000040;">*</span> nearPos.<span style="color: #007788;">z</span><span style="color: #008080;">;</span><br />
<span style="color: #008000;">&#125;</span></div></div>
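<p>To see what the loop body produces for each projection type, here is the per-corner step isolated as a small self-contained C++ sketch. It assumes the near and far corners have already been unprojected to view space; <code>Vec3</code>, <code>CornerRay</code> and <code>computeCornerRay</code> are hypothetical names used only for this illustration.</p>

```cpp
#include <cassert>

struct Vec3 { float x, y, z; };

struct CornerRay {
    Vec3 ray;    // z-normalized view vector D'
    Vec3 origin; // ray origin O_0, lying on the plane z = 0
};

// nearPos and farPos are the unprojected (view-space) frustum corners.
CornerRay computeCornerRay(const Vec3& nearPos, const Vec3& farPos) {
    Vec3 d = { farPos.x - nearPos.x, farPos.y - nearPos.y, farPos.z - nearPos.z };
    assert(d.z != 0.0f); // near and far planes must not coincide

    // z-normalize the direction so that its z component is 1
    Vec3 ray = { d.x / d.z, d.y / d.z, 1.0f };

    // O_0 = P_near - D' * z_near
    Vec3 origin = { nearPos.x - ray.x * nearPos.z,
                    nearPos.y - ray.y * nearPos.z,
                    nearPos.z - ray.z * nearPos.z };
    return { ray, origin };
}
```

<p>Feeding it a perspective corner pair such as (0.5, 0.5, 1) and (2.5, 2.5, 5) gives an origin of (0, 0, 0), while an orthographic pair such as (2, 1, 1) and (2, 1, 5) gives the ray (0, 0, 1) and the origin (2, 1, 0).</p>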
<h2>Working in world space</h2>
<p>You can switch to world space reconstruction simply by transforming both <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-9ae8526feb6b8a99678e1d7ce2841d22_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#68;&#39;" title="Rendered by QuickLaTeX.com" height="14" width="19" style="vertical-align: 0px;"/> and <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-039697931c388072c796f6752c1a142d_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#79;&#95;&#48;" title="Rendered by QuickLaTeX.com" height="15" width="21" style="vertical-align: -3px;"/> to world space (the former as a direction, the latter as a point). The rest of the math stays exactly the same:</p>
<p class="ql-center-displayed-equation" style="line-height: 20px;"><span class="ql-right-eqno"> &nbsp; </span><span class="ql-left-eqno"> &nbsp; </span><img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-4d5ed27f09fc94cc783fce0bc86bef8c_l3.png" height="20" width="246" class="ql-img-displayed-equation quicklatex-auto-format" alt="&#92;&#91; &#80;&#95;&#123;&#119;&#111;&#114;&#108;&#100;&#125;&#32;&#61;&#32;&#79;&#95;&#123;&#119;&#111;&#114;&#108;&#100;&#125;&#32;&#43;&#32;&#122;&#95;&#123;&#118;&#105;&#101;&#119;&#125;&#42;&#68;&#39;&#95;&#123;&#119;&#111;&#114;&#108;&#100;&#125; &#92;&#93;" title="Rendered by QuickLaTeX.com"/></p>
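<p>A small C++ sketch of this, assuming the camera&#8217;s view-to-world transform is a rotation plus a translation. <code>ViewToWorld</code> and the function names are invented for the example and do not refer to any particular math library. Note that the scalar is still the <em>view space</em> z value; only the ray and the origin change space.</p>

```cpp
struct Vec3 { float x, y, z; };

// Hypothetical view-to-world transform: the world-space images of the
// view-space axes, plus the camera position.
struct ViewToWorld {
    Vec3 right, up, forward; // images of the view-space x, y and z axes
    Vec3 position;           // camera position in world space
};

// Directions ignore the translation (w = 0).
Vec3 transformDirection(const ViewToWorld& m, const Vec3& d) {
    return { m.right.x * d.x + m.up.x * d.y + m.forward.x * d.z,
             m.right.y * d.x + m.up.y * d.y + m.forward.y * d.z,
             m.right.z * d.x + m.up.z * d.y + m.forward.z * d.z };
}

// Points include the translation (w = 1).
Vec3 transformPoint(const ViewToWorld& m, const Vec3& p) {
    Vec3 r = transformDirection(m, p);
    return { r.x + m.position.x, r.y + m.position.y, r.z + m.position.z };
}

// P_world = O_world + z_view * D'_world
Vec3 reconstructWorldPosition(const Vec3& originWorld, const Vec3& rayWorld, float zView) {
    return { originWorld.x + zView * rayWorld.x,
             originWorld.y + zView * rayWorld.y,
             originWorld.z + zView * rayWorld.z };
}
```

<p>Transforming the ray and origin once per corner and reconstructing with them should give the same result as reconstructing in view space and transforming the resulting point, but without a per-pixel matrix multiplication.</p>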
<h2>Conclusion</h2>
<p>One final note about the <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-039697931c388072c796f6752c1a142d_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#79;&#95;&#48;" title="Rendered by QuickLaTeX.com" height="15" width="21" style="vertical-align: -3px;"/> calculation above. Since there are only 2 projection types, you may consider checking for the projection type and simply passing in more easily calculated values: <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-55c404405500d004b91cd15a1edb434f_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#40;&#48;&#44;&#32;&#48;&#44;&#32;&#48;&#41;" title="Rendered by QuickLaTeX.com" height="18" width="54" style="vertical-align: -4px;"/> for perspective and the 4 combinations of <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-8dd56b66957f86a641689dec4b740808_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#40;&#92;&#112;&#109;&#32;&#92;&#102;&#114;&#97;&#99;&#123;&#112;&#114;&#111;&#106;&#101;&#99;&#116;&#105;&#111;&#110;&#87;&#105;&#100;&#116;&#104;&#125;&#123;&#50;&#125;&#44;&#32;&#92;&#112;&#109;&#32;&#92;&#102;&#114;&#97;&#99;&#123;&#112;&#114;&#111;&#106;&#101;&#99;&#116;&#105;&#111;&#110;&#72;&#101;&#105;&#103;&#104;&#116;&#125;&#123;&#50;&#125;&#44;&#32;&#48;&#41;" title="Rendered by QuickLaTeX.com" height="23" width="282" style="vertical-align: -6px;"/> for orthographic projections. That&#8217;s up to you. Personally, unless absolutely necessary, I&#8217;ll always prefer an elegant calculation to a horrible if-statement checking for types!</p>
<p>So finally, I got this post out. I hope it can be of use to anyone. As always, any questions or suggestions, feel free to drop a line in the comment box.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://www.derschmale.com/2014/03/19/reconstructing-positions-from-the-depth-buffer-pt-2-perspective-and-orthographic-general-case/feed/</wfw:commentRss>
			<slash:comments>14</slash:comments>
		
		
			</item>
		<item>
		<title>Reconstructing positions from the depth buffer</title>
		<link>https://www.derschmale.com/2014/01/26/reconstructing-positions-from-the-depth-buffer/</link>
					<comments>https://www.derschmale.com/2014/01/26/reconstructing-positions-from-the-depth-buffer/#comments</comments>
		
		<dc:creator><![CDATA[David]]></dc:creator>
		<pubDate>Sun, 26 Jan 2014 17:27:14 +0000</pubDate>
				<category><![CDATA[C++]]></category>
		<category><![CDATA[DirectX]]></category>
		<category><![CDATA[Graphics]]></category>
		<category><![CDATA[OpenGL]]></category>
		<category><![CDATA[3D]]></category>
		<category><![CDATA[directx]]></category>
		<category><![CDATA[DirectX 11]]></category>
		<category><![CDATA[gpu]]></category>
		<category><![CDATA[graphics]]></category>
		<category><![CDATA[math]]></category>
		<category><![CDATA[rendering]]></category>
		<category><![CDATA[Shaders]]></category>
		<guid isPermaLink="false">http://www.derschmale.com/?p=986</guid>

					<description><![CDATA[Introduction [edit] To start things off more easily, I decided to limit this post to perspective projections and move on to the generalization (including orthographic projections) in a next blog post. When doing deferred shading or some post-processing effects, we&#8217;ll need the 3D position of the pixel we&#8217;re currently shading at some point. Rather than&#8230;]]></description>
										<content:encoded><![CDATA[<h2>Introduction</h2>
<p>[edit] To start things off more easily, I decided to limit this post to perspective projections and move on to the generalization (including orthographic projections) in the next blog post.</p>
<p>When doing deferred shading or some post-processing effects, we&#8217;ll need the 3D position of the pixel we&#8217;re currently shading at some point. Rather than waste memory and bandwidth by storing the position vectors explicitly in a render target, the position can be reconstructed cheaply from the depth buffer alone. This is data we already have at our disposal. The techniques to do so are pretty commonplace, so this article will hardly be a major revelation. So why bother writing it at all? Well&#8230;</p>
<ul>
<li>Too often, you&#8217;ll stumble over code keeping a position render target anyway.</li>
<li>Often, articles explaining the technique are not entry-level and skip over derivations, making it hard for beginners to figure things out.</li>
<li>Many aspects of the implementation are scattered across many articles and forum posts. I&#8217;d like a single comprehensive article.</li>
<li>It made a largely unexplained appearance in the <a title="An alternative implementation for HBAO" href="http://www.derschmale.com/2013/12/20/an-alternative-implementation-for-hbao-2/">previous post&#8217;s</a> sample code, so I figured I might as well elaborate.</li>
<li>The therapeutic value of writing? ;)</li>
</ul>
<p>Since I&#8217;m trying to keep it at an entry level, the math will have a slightly step-by-step approach. Sorry if that&#8217;s too slow :)</p>
<p>Note: I remember an article by Crytek briefly mentioning similar material, but I can&#8217;t seem to find it anymore. If anyone can point me to it, let me know!</p>
<h2>Non-linearity</h2>
<p>As you (should) already know, the depth buffer contains values between 0 and 1, representing the depth at the near plane and far plane respectively. A ray goes from the camera through the &#8220;screen&#8221; into the world, and the depth defines where exactly on that ray the point lies. If you&#8217;ve never touched the depth buffer before, you might be tempted to simply linearly interpolate between where that ray intersects the near and far planes. However, the depth buffer&#8217;s values are <em>not</em> necessarily linear (they aren&#8217;t for perspective projections), so there goes that idea.</p>
<p style="text-align: center;"><img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-d5267a5ae8f43800a41fbe8234a32279_l3.png" height="250" width="396" class="ql-manual-mode quicklatex-auto-format" alt="Rendered by QuickLaTeX.com" title="Rendered by QuickLaTeX.com"/></p>
<p>Often, linear depth is stored explicitly in a render target to make this approach possible. Depending on your case, this might be a valuable option. If you expect to sample the depth along with the normals most of the time, you could throw it in there and have all the data in one texture fetch. This is, of course, provided your render target has enough precision for depth, which you might not want to spend on your normals.</p>
<h2>Reconstructing z</h2>
<p>So we clearly need to convert the depth buffer&#8217;s value to a linear depth representation. Instead of converting to another [0, 1]-based range, we&#8217;ll calculate the view position&#8217;s z value directly. As we&#8217;ll see later on, we can use this to very cheaply reconstruct the whole position vector. To do so, remember what the depth buffer&#8217;s value contains. In the vertex shader, we projected our vertices to homogeneous (clip) coordinates. These are eventually converted to normalized device coordinates (NDC) by the GPU, which divides the whole vector by its w component. The NDC essentially consist of XY coordinates from -1 to 1, which can be mapped to screen coordinates, and a Z coordinate that is compared with and stored in the depth buffer. It&#8217;s this value that we want. In other words, given projection matrix <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-10ebb71bad275c1815a8f2a8c5dea0be_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#77;" title="Rendered by QuickLaTeX.com" height="12" width="19" style="vertical-align: 0px;"/> and view space position <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-1494fd313bbcd7944b681e8ed0515066_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#80;&#95;&#123;&#118;&#105;&#101;&#119;&#125;" title="Rendered by QuickLaTeX.com" height="15" width="40" style="vertical-align: -3px;"/>:</p>
<p style="text-align: center;"><img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-44bf68052f983267a5cfa778e120f0c3_l3.png" height="40" width="142" class="ql-manual-mode quicklatex-auto-format" alt="Rendered by QuickLaTeX.com" title="Rendered by QuickLaTeX.com"/></p>
<p>Writing this out in full for <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-4586e340cb83d5b642972e97a288fec2_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#122;" title="Rendered by QuickLaTeX.com" height="8" width="9" style="vertical-align: 0px;"/> and <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-dfee5c980777976ae8cf6541893fb572_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#119;" title="Rendered by QuickLaTeX.com" height="8" width="13" style="vertical-align: 0px;"/>:</p>
<p style="text-align: center;"><img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-45981cf99eec894821b853fe62c1d8d9_l3.png" height="40" width="181" class="ql-manual-mode quicklatex-auto-format" alt="Rendered by QuickLaTeX.com" title="Rendered by QuickLaTeX.com"/></p>
<p class="ql-center-displayed-equation" style="line-height: 42px;"><span class="ql-right-eqno"> &nbsp; </span><span class="ql-left-eqno"> &nbsp; </span><img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-ccaa7590b67b5dc01fed7f0a8c23b59d_l3.png" height="42" width="242" class="ql-img-displayed-equation quicklatex-auto-format" alt="&#92;&#91; &#122;&#95;&#123;&#110;&#100;&#99;&#125;&#32;&#61;&#32;&#92;&#102;&#114;&#97;&#99;&#123;&#122;&#95;&#123;&#99;&#108;&#105;&#112;&#125;&#125;&#123;&#119;&#95;&#123;&#99;&#108;&#105;&#112;&#125;&#125;&#32;&#61;&#32;&#92;&#102;&#114;&#97;&#99;&#123;&#122;&#95;&#123;&#118;&#105;&#101;&#119;&#125;&#32;&#77;&#95;&#123;&#51;&#51;&#125;&#32;&#43;&#32;&#77;&#95;&#123;&#52;&#51;&#125;&#125;&#123;&#122;&#95;&#123;&#118;&#105;&#101;&#119;&#125;&#32;&#77;&#95;&#123;&#51;&#52;&#125;&#32;&#43;&#32;&#77;&#95;&#123;&#52;&#52;&#125;&#125; &#92;&#93;" title="Rendered by QuickLaTeX.com"/></p>
<p>Here, we assumed a regular projection matrix where the clip planes are parallel to the screen plane. No wonky <a title="Away3D 4.1 (dev) Dynamic Reflections" href="http://www.derschmale.com/2012/09/10/away3d-4-1-dev-dynamic-reflections/">oblique near planes</a>! This means <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-0ddf158001f81159d8a8de38f5b99c7e_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#77;&#95;&#123;&#49;&#51;&#125;&#32;&#61;&#32;&#77;&#95;&#123;&#50;&#51;&#125;&#32;&#61;&#32;&#77;&#95;&#123;&#49;&#52;&#125;&#32;&#61;&#32;&#77;&#95;&#123;&#50;&#52;&#125;&#32;&#61;&#32;&#48;" title="Rendered by QuickLaTeX.com" height="16" width="231" style="vertical-align: -4px;"/>. <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-1494fd313bbcd7944b681e8ed0515066_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#80;&#95;&#123;&#118;&#105;&#101;&#119;&#125;" title="Rendered by QuickLaTeX.com" height="15" width="40" style="vertical-align: -3px;"/> is a regular point, hence <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-bf775483beb4f08bfeb508b204e7afed_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#119;&#95;&#123;&#118;&#105;&#101;&#119;&#125;&#32;&#61;&#32;&#49;" title="Rendered by QuickLaTeX.com" height="15" width="74" style="vertical-align: -3px;"/>.<br />
Solving for <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-f8b329fc8c4c08124e8dc16a19e01b75_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#122;&#95;&#123;&#118;&#105;&#101;&#119;&#125;" title="Rendered by QuickLaTeX.com" height="11" width="36" style="vertical-align: -3px;"/>:</p>
<p class="ql-center-displayed-equation" style="line-height: 39px;"><span class="ql-right-eqno"> &nbsp; </span><span class="ql-left-eqno"> &nbsp; </span><img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-e2eb78ad79518feb5b6ff08190296e07_l3.png" height="39" width="192" class="ql-img-displayed-equation quicklatex-auto-format" alt="&#92;&#91; &#122;&#95;&#123;&#118;&#105;&#101;&#119;&#125;&#32;&#61;&#32;&#45;&#92;&#102;&#114;&#97;&#99;&#123;&#122;&#95;&#123;&#110;&#100;&#99;&#125;&#32;&#77;&#95;&#123;&#52;&#52;&#125;&#32;&#45;&#32;&#77;&#95;&#123;&#52;&#51;&#125;&#125;&#123;&#122;&#95;&#123;&#110;&#100;&#99;&#125;&#32;&#77;&#95;&#123;&#51;&#52;&#125;&#32;&#45;&#32;&#77;&#95;&#123;&#51;&#51;&#125;&#125; &#92;&#93;" title="Rendered by QuickLaTeX.com"/></p>
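<p>As a quick sanity check of this inversion, here is a minimal C++ sketch that round-trips a view space z value through the forward and inverse mappings. <code>DepthTerms</code>, <code>projectZ</code> and <code>viewZFromNdc</code> are hypothetical names for this example; the four fields follow the index convention used in the formulas above.</p>

```cpp
#include <cassert>

// Hypothetical helper: the four projection matrix entries that affect z and w,
// named after the indices used in the formulas above.
struct DepthTerms {
    float m33, m34, m43, m44;
};

// Forward mapping: z_ndc = (z_view * M33 + M43) / (z_view * M34 + M44)
float projectZ(const DepthTerms& m, float zView) {
    float wClip = zView * m.m34 + m.m44;
    assert(wClip != 0.0f);
    return (zView * m.m33 + m.m43) / wClip;
}

// Inverse mapping: z_view = -(z_ndc * M44 - M43) / (z_ndc * M34 - M33)
float viewZFromNdc(const DepthTerms& m, float zNdc) {
    float denom = zNdc * m.m34 - m.m33;
    assert(denom != 0.0f);
    return -(zNdc * m.m44 - m.m43) / denom;
}
```

<p>Because nothing here assumes particular values for the four entries, the same function should work for perspective and orthographic matrices alike, as long as the clip planes are parallel to the screen plane.</p>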
<p>If you know you&#8217;ll have a perspective projection, you can optimize by entering the values for <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-e8831897663574da67b56c5462c319d4_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#77;&#95;&#123;&#51;&#52;&#125;" title="Rendered by QuickLaTeX.com" height="15" width="31" style="vertical-align: -3px;"/> and <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-6f719d38a515bcc9e9a56920c1962ea6_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#77;&#95;&#123;&#52;&#52;&#125;" title="Rendered by QuickLaTeX.com" height="15" width="31" style="vertical-align: -3px;"/>.<br />
For DirectX (<img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-7b312b36843a5158b9138de964540322_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#77;&#95;&#123;&#51;&#52;&#125;&#32;&#61;&#32;&#49;" title="Rendered by QuickLaTeX.com" height="15" width="63" style="vertical-align: -3px;"/> and <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-5d6a4cd86d895ce0bc75b17f53edb9fe_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#77;&#95;&#123;&#52;&#52;&#125;&#32;&#61;&#32;&#48;" title="Rendered by QuickLaTeX.com" height="15" width="64" style="vertical-align: -3px;"/>):</p>
<p class="ql-center-displayed-equation" style="line-height: 39px;"><span class="ql-right-eqno"> &nbsp; </span><span class="ql-left-eqno"> &nbsp; </span><img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-a79100c52569d18dc7c35defcfd9da34_l3.png" height="39" width="147" class="ql-img-displayed-equation quicklatex-auto-format" alt="&#92;&#91; &#122;&#95;&#123;&#118;&#105;&#101;&#119;&#125;&#32;&#61;&#32;&#92;&#102;&#114;&#97;&#99;&#123;&#77;&#95;&#123;&#52;&#51;&#125;&#125;&#123;&#122;&#95;&#123;&#110;&#100;&#99;&#125;&#32;&#45;&#32;&#77;&#95;&#123;&#51;&#51;&#125;&#125; &#92;&#93;" title="Rendered by QuickLaTeX.com"/></p>
<p>For OpenGL (<img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-30f795e3659b4b363521e037f4f978a6_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#77;&#95;&#123;&#51;&#52;&#125;&#32;&#61;&#32;&#45;&#49;" title="Rendered by QuickLaTeX.com" height="15" width="77" style="vertical-align: -3px;"/> and <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-5d6a4cd86d895ce0bc75b17f53edb9fe_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#77;&#95;&#123;&#52;&#52;&#125;&#32;&#61;&#32;&#48;" title="Rendered by QuickLaTeX.com" height="15" width="64" style="vertical-align: -3px;"/>):</p>
<p class="ql-center-displayed-equation" style="line-height: 39px;"><span class="ql-right-eqno"> &nbsp; </span><span class="ql-left-eqno"> &nbsp; </span><img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-5b3c23f186d90cd47a50d76473074adf_l3.png" height="39" width="161" class="ql-img-displayed-equation quicklatex-auto-format" alt="&#92;&#91; &#122;&#95;&#123;&#118;&#105;&#101;&#119;&#125;&#32;&#61;&#32;&#45;&#92;&#102;&#114;&#97;&#99;&#123;&#77;&#95;&#123;&#52;&#51;&#125;&#125;&#123;&#122;&#95;&#123;&#110;&#100;&#99;&#125;&#32;&#43;&#32;&#77;&#95;&#123;&#51;&#51;&#125;&#125; &#92;&#93;" title="Rendered by QuickLaTeX.com"/></p>
<p>If you&#8217;re using DirectX, <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-a83e07f962e930fcd870d5bd7e388183_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#122;&#95;&#123;&#110;&#100;&#99;&#125;" title="Rendered by QuickLaTeX.com" height="11" width="29" style="vertical-align: -3px;"/> is simply the depth buffer value. OpenGL uses the convention that <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-a83e07f962e930fcd870d5bd7e388183_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#122;&#95;&#123;&#110;&#100;&#99;&#125;" title="Rendered by QuickLaTeX.com" height="11" width="29" style="vertical-align: -3px;"/> ranges from -1 to 1, so <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-f63cc8f4e183be74e0e6cd3f495bc8f0_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#122;&#95;&#123;&#110;&#100;&#99;&#125;&#32;&#61;&#32;&#50;&#32;&#100;&#101;&#112;&#116;&#104;&#32;&#45;&#32;&#49;" title="Rendered by QuickLaTeX.com" height="17" width="136" style="vertical-align: -4px;"/>.</p>
<p>Depending on your use case, you may want to precalculate this value into a lookup texture rather than performing the calculation for every shader that needs it.</p>
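<p>The two perspective shortcuts could be written as follows in C++ (a shader version would look much the same). The function names are made up for this sketch, and the OpenGL variant assumes the default depth range of [0, 1], so that the stored depth maps back to NDC as 2 * depth - 1.</p>

```cpp
#include <cassert>

// DirectX-style perspective (M34 = 1, M44 = 0):
// z_ndc is the depth buffer value itself.
float viewZFromDepthDirectX(float depth, float m33, float m43) {
    float denom = depth - m33;
    assert(denom != 0.0f);
    return m43 / denom;
}

// OpenGL-style perspective (M34 = -1, M44 = 0):
// remap the stored depth from [0, 1] to the NDC range [-1, 1] first.
// Assumes the default depth range.
float viewZFromDepthOpenGL(float depth, float m33, float m43) {
    float zNdc = 2.0f * depth - 1.0f;
    float denom = zNdc + m33;
    assert(denom != 0.0f);
    return -m43 / denom;
}
```

<p>In both cases only two matrix entries are needed, so they can be passed to the shader as a pair of constants instead of the whole projection matrix.</p>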
<h2>Calculating the position from the z-value for perspective matrices</h2>
<p>Now that we have the z coordinate of the position, we basically have everything we need to construct the position. For perspective projections, this is very simple. We know the point is somewhere on the view ray with direction <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-4b9ef1bbd23fd1b198de883813285620_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#68;" title="Rendered by QuickLaTeX.com" height="12" width="15" style="vertical-align: 0px;"/> (in view space, with origin = 0). For now, we assume nothing about <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-4b9ef1bbd23fd1b198de883813285620_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#68;" title="Rendered by QuickLaTeX.com" height="12" width="15" style="vertical-align: 0px;"/> (it&#8217;s not necessarily normalized or anything). We solve for <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-b4e3cbf5d4c5c6d9b702dd139f14c147_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#116;" title="Rendered by QuickLaTeX.com" height="12" width="6" style="vertical-align: 0px;"/> using <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-4586e340cb83d5b642972e97a288fec2_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#122;" title="Rendered by QuickLaTeX.com" height="8" width="9" style="vertical-align: 0px;"/>, the only component we know everything about, and substitute.</p>
<p class="ql-center-displayed-equation" style="line-height: 41px;"><span class="ql-right-eqno"> &nbsp; </span><span class="ql-left-eqno"> &nbsp; </span><img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-f90f0d85a7e41efc2734f056ad84e2eb_l3.png" height="41" width="328" class="ql-img-displayed-equation quicklatex-auto-format" alt="&#92;&#91; &#82;&#40;&#116;&#41;&#32;&#61;&#32;&#116;&#68;&#32;&#92;&#82;&#105;&#103;&#104;&#116;&#97;&#114;&#114;&#111;&#119;&#32;&#116;&#32;&#61;&#32;&#92;&#102;&#114;&#97;&#99;&#123;&#122;&#95;&#82;&#40;&#116;&#41;&#125;&#123;&#122;&#95;&#68;&#125;&#32;&#92;&#82;&#105;&#103;&#104;&#116;&#97;&#114;&#114;&#111;&#119;&#32;&#80;&#95;&#123;&#118;&#105;&#101;&#119;&#125;&#32;&#61;&#32;&#92;&#102;&#114;&#97;&#99;&#123;&#122;&#95;&#123;&#118;&#105;&#101;&#119;&#125;&#125;&#123;&#122;&#95;&#68;&#125;&#68; &#92;&#93;" title="Rendered by QuickLaTeX.com"/></p>
<p>(Yes, I hear you sighing, this is a simple intersection test.)<br />
With this formula, we can make an optimization by introducing a constraint on <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-4b9ef1bbd23fd1b198de883813285620_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#68;" title="Rendered by QuickLaTeX.com" height="12" width="15" style="vertical-align: 0px;"/>. If we resize D so that <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-2c5a341d645d6e066e31343ef69003f8_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#122;&#95;&#68;&#39;&#32;&#61;&#32;&#49;" title="Rendered by QuickLaTeX.com" height="19" width="52" style="vertical-align: -5px;"/> (let&#8217;s call this a z-normalization), then things simplify:</p>
<p class="ql-center-displayed-equation" style="line-height: 39px;"><span class="ql-right-eqno"> &nbsp; </span><span class="ql-left-eqno"> &nbsp; </span><img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-8202f947e5c90f6faf73ab9651ad998e_l3.png" height="39" width="217" class="ql-img-displayed-equation quicklatex-auto-format" alt="&#92;&#91; &#68;&#39;&#32;&#61;&#32;&#92;&#102;&#114;&#97;&#99;&#123;&#68;&#125;&#123;&#122;&#95;&#68;&#125;&#32;&#92;&#82;&#105;&#103;&#104;&#116;&#97;&#114;&#114;&#111;&#119;&#32;&#80;&#95;&#123;&#118;&#105;&#101;&#119;&#125;&#32;&#61;&#32;&#122;&#95;&#123;&#118;&#105;&#101;&#119;&#125;&#32;&#68;&#39; &#92;&#93;" title="Rendered by QuickLaTeX.com"/></p>
<p>We can precalculate <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-9ae8526feb6b8a99678e1d7ce2841d22_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#68;&#39;" title="Rendered by QuickLaTeX.com" height="14" width="19" style="vertical-align: 0px;"/> for each screen corner and pass it into the vertex shader. The vertex shader in turn can pass it on to the fragment shader as an interpolated value. Since interpolation is linear, we&#8217;ll always get a correct view ray with <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-994c688976ba1adc23567829d04912b0_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#122;&#32;&#61;&#32;&#49;" title="Rendered by QuickLaTeX.com" height="13" width="41" style="vertical-align: -1px;"/>. This way, the reconstruction happens with a single multiplication! Using a compute shader as with the HBAO example, the interpolation has to be performed manually.</p>
<h2>Calculating the view vectors</h2>
<p>There are various ways to go about calculating the view vectors for a perspective projection. Since it&#8217;s only done once every time the projection properties change, it&#8217;s not exactly a performance-critical piece of code. I&#8217;ll go for what I consider the &#8216;neatest&#8217; way. Since we use the projection matrix to map from view-space points to homogeneous coordinates and convert those to NDC, we can invert the process to go from NDC to view-space coordinates. The view directions for every quad corner correspond to the edges of the frustum linking near plane corners to far plane corners. These are simply formed by the NDC extents -1.0 and 1.0 (0.0 and 1.0 for z in DirectX), since the frustum in NDC forms a normalized cube.</p>
<p>Edit: More on unprojections <a title="Unprojections Explained" href="http://www.derschmale.com/2014/09/28/unprojections-explained/">here</a>.</p>
<div class="codecolorer-container cpp default" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;height:300px;"><div class="cpp codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap;"><span style="color: #666666;">// You could set z = -1.0f instead of 0.0f for OpenGL</span><br />
<span style="color: #666666;">// but it doesn't matter since any z value lies on the same ray anyway.</span><br />
<br />
Vector3D homogenousCorners<span style="color: #008000;">&#91;</span><span style="color: #0000dd;">4</span><span style="color: #008000;">&#93;</span> <span style="color: #000080;">=</span> <span style="color: #008000;">&#123;</span><br />
Vector3D<span style="color: #008000;">&#40;</span><span style="color: #000040;">-</span><span style="color:#800080;">1.0f</span>, <span style="color: #000040;">-</span><span style="color:#800080;">1.0f</span>, <span style="color:#800080;">0.0f</span>, <span style="color:#800080;">1.0f</span><span style="color: #008000;">&#41;</span>,<br />
Vector3D<span style="color: #008000;">&#40;</span><span style="color:#800080;">1.0f</span>, <span style="color: #000040;">-</span><span style="color:#800080;">1.0f</span>, <span style="color:#800080;">0.0f</span>, <span style="color:#800080;">1.0f</span><span style="color: #008000;">&#41;</span>,<br />
Vector3D<span style="color: #008000;">&#40;</span><span style="color:#800080;">1.0f</span>, <span style="color:#800080;">1.0f</span>, <span style="color:#800080;">0.0f</span>, <span style="color:#800080;">1.0f</span><span style="color: #008000;">&#41;</span>,<br />
Vector3D<span style="color: #008000;">&#40;</span><span style="color: #000040;">-</span><span style="color:#800080;">1.0f</span>, <span style="color:#800080;">1.0f</span>, <span style="color:#800080;">0.0f</span>, <span style="color:#800080;">1.0f</span><span style="color: #008000;">&#41;</span><br />
<span style="color: #008000;">&#125;</span><span style="color: #008080;">;</span><br />
<br />
Matrix3D inverseProjection <span style="color: #000080;">=</span> projectionMatrix.<span style="color: #007788;">Inverse</span><span style="color: #008000;">&#40;</span><span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span><br />
Vector3D rays<span style="color: #008000;">&#91;</span><span style="color: #0000dd;">4</span><span style="color: #008000;">&#93;</span><span style="color: #008080;">;</span><br />
<br />
<span style="color: #0000ff;">for</span> <span style="color: #008000;">&#40;</span><span style="color: #0000ff;">unsigned</span> <span style="color: #0000ff;">int</span> i <span style="color: #000080;">=</span> <span style="color: #0000dd;">0</span><span style="color: #008080;">;</span> i <span style="color: #000080;">&lt;</span> <span style="color: #0000dd;">4</span><span style="color: #008080;">;</span> <span style="color: #000040;">++</span>i<span style="color: #008000;">&#41;</span> <span style="color: #008000;">&#123;</span><br />
Vector3D<span style="color: #000040;">&amp;</span> ray <span style="color: #000080;">=</span> rays<span style="color: #008000;">&#91;</span>i<span style="color: #008000;">&#93;</span><span style="color: #008080;">;</span><br />
<br />
<span style="color: #666666;">// unproject the frustum corner from NDC to view space</span><br />
ray <span style="color: #000080;">=</span> inverseProjection <span style="color: #000040;">*</span> homogenousCorners<span style="color: #008000;">&#91;</span>i<span style="color: #008000;">&#93;</span><span style="color: #008080;">;</span><br />
ray <span style="color: #000040;">/</span><span style="color: #000080;">=</span> ray.<span style="color: #007788;">w</span><span style="color: #008080;">;</span><br />
<br />
<span style="color: #666666;">// z-normalize this vector</span><br />
ray <span style="color: #000040;">/</span><span style="color: #000080;">=</span> ray.<span style="color: #007788;">z</span><span style="color: #008080;">;</span><br />
<span style="color: #008000;">&#125;</span></div></div>
<p>Pass the rays into the vertex shader, either as a constant buffer indexed by vertex ID or as a vertex attribute, and Bob&#8217;s your uncle!</p>
<h2>Working in world space</h2>
<p>If you want to perform your lighting or whatever in world space, you can simply transform the z-normalized view rays to world space (as directions, so rotation only, without translation) and add the camera position. No need to perform matrix calculations in your fragment shader.</p>
<p class="ql-center-displayed-equation" style="line-height: 15px;"><span class="ql-right-eqno"> &nbsp; </span><span class="ql-left-eqno"> &nbsp; </span><img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-cb311683160886aa792ef3048c8bebe5_l3.png" height="15" width="298" class="ql-img-displayed-equation quicklatex-auto-format" alt="&#92;&#91; &#80;&#95;&#123;&#119;&#111;&#114;&#108;&#100;&#125;&#32;&#61;&#32;&#67;&#97;&#109;&#101;&#114;&#97;&#95;&#123;&#119;&#111;&#114;&#108;&#100;&#125;&#32;&#43;&#32;&#122;&#95;&#123;&#118;&#105;&#101;&#119;&#125;&#32;&#42;&#32;&#68;&#95;&#123;&#119;&#111;&#114;&#108;&#100;&#125; &#92;&#93;" title="Rendered by QuickLaTeX.com"/></p>
<h2>Conclusion</h2>
<p>There we have it! I think it should be straightforward enough to implement this in a shader. If not, let me know and I shall have to expand on this. Just stop storing your position vectors now, mkay? :)</p>
]]></content:encoded>
					
					<wfw:commentRss>https://www.derschmale.com/2014/01/26/reconstructing-positions-from-the-depth-buffer/feed/</wfw:commentRss>
			<slash:comments>11</slash:comments>
		
		
			</item>
		<item>
		<title>An alternative implementation for HBAO</title>
		<link>https://www.derschmale.com/2013/12/20/an-alternative-implementation-for-hbao-2/</link>
					<comments>https://www.derschmale.com/2013/12/20/an-alternative-implementation-for-hbao-2/#comments</comments>
		
		<dc:creator><![CDATA[David]]></dc:creator>
		<pubDate>Thu, 19 Dec 2013 23:13:13 +0000</pubDate>
				<category><![CDATA[C++]]></category>
		<category><![CDATA[DirectX]]></category>
		<category><![CDATA[Graphics]]></category>
		<category><![CDATA[Helix]]></category>
		<category><![CDATA[3D]]></category>
		<category><![CDATA[algorithm]]></category>
		<category><![CDATA[ambient occlusion]]></category>
		<category><![CDATA[directx]]></category>
		<category><![CDATA[DirectX 11]]></category>
		<category><![CDATA[hbao]]></category>
		<category><![CDATA[helix]]></category>
		<category><![CDATA[hlsl]]></category>
		<category><![CDATA[Shaders]]></category>
		<category><![CDATA[shading]]></category>
		<category><![CDATA[ssao]]></category>
		<guid isPermaLink="false">http://www.derschmale.com/?p=980</guid>

					<description><![CDATA[Introduction Image-space horizon-based ambient occlusion [HBAO] is a technique introduced by NVidia (Louis Bavoil et al.) in 2008. I recommend checking out the following resources to find out exactly how the algorithm works, this post will build on it further: ShaderX7 &#8211; Image-Space Horizon-Based Ambient Occlusion (by Louis Bavoil &#38; Miguel Sainz) The original Siggraph&#8230;]]></description>
										<content:encoded><![CDATA[<h2>Introduction</h2>
<p><a href="http://www.derschmale.com/blog/wp-content/hbao-overview.jpg"><img loading="lazy" decoding="async" class="alignright size-medium wp-image-968" alt="HBAO" src="http://www.derschmale.com/blog/wp-content/hbao-overview-300x168.jpg" width="300" height="168" /></a></p>
<p>Image-space horizon-based ambient occlusion [HBAO] is a technique introduced by NVidia (Louis Bavoil et al.) in 2008. I recommend checking out the following resources to find out exactly how the algorithm works, since this post builds on them:</p>
<ul>
<li>
<div><a title="ShaderX7 on Amazon" href="http://www.amazon.com/ShaderX7-Rendering-Techniques-Wolfgang-Engel/dp/1584505982/ref=sr_1_1?ie=UTF8&amp;qid=1387043804&amp;sr=8-1&amp;keywords=ShaderX7" target="_blank">ShaderX7 </a>&#8211; Image-Space Horizon-Based Ambient Occlusion (by Louis Bavoil &amp; Miguel Sainz)</div>
</li>
<li><a title="Image-Space Horizon-Based Ambient Occlusion Siggraph paper" href="http://www.twistedsanity.org/rdimitrov/HBAO_SIGGRAPH08.pdf" target="_blank">The original Siggraph paper</a> (by Louis Bavoil, Miguel Sainz &amp; Rouslan Dimitrov)</li>
<li>
<div><a title="Siggraph 2008: Image-Space Horizon-Based Ambient Occlusion" href="http://www.nvidia.com/object/siggraph-2008-HBAO.html" target="_blank">Siggraph presentation</a></div>
</li>
<li>
<div><a title="Horizon-Based Ambient Occlusion using Compute Shaders" href="https://developer.nvidia.com/sites/default/files/akamai/gamedev/files/sdk/11/SSAO11.pdf" target="_blank">Horizon-Based Ambient Occlusion using Compute Shaders</a></div>
</li>
</ul>
<p>The ShaderX7 book in particular offers a well-explained detailed approach to the problem.<br />
Before I continue, I&#8217;d like to extend my gratitude to Louis Bavoil for taking the time to review this article and giving some insightful tips on how to improve the shader.</p>
<h2>A recap</h2>
<p>Roughly, HBAO works by raymarching the depth buffer, and doing this in a number of equiangular directions across a circle in screen-space.</p>
<p style="text-align: center;"><img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-e860d0279c0ccffdf3a5da49da95c128_l3.png" height="134" width="315" class="ql-manual-mode quicklatex-auto-format" alt="Rendered by QuickLaTeX.com" title="Rendered by QuickLaTeX.com"/>
</p>
<p>For the derivations, all points and vectors are expressed in spherical coordinates relative to the <em>eye space</em> basis. The azimuth angle rotates about the negated eye space Z-axis and the elevation angle is relative to the XY plane. This is a bit different to the common way of defining the polar angle, which is usually relative to the main axis, but the approach serves us well. The elevation angle is the only angle that will be of much interest to us, since the azimuthal integration will be handled entirely by marching in the different directions.</p>
<p>With each raymarch step, the line between the sample point and the centre point <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-650eb7688af6737ac325425b5c9a5982_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#80;" title="Rendered by QuickLaTeX.com" height="12" width="14" style="vertical-align: 0px;"/> is considered. Each time the elevation angle is larger than the previous maximum, a new chunk of occluding geometry has been found. We add in the occlusion for the arc-segment between the last two found horizon vectors and weight it with a distance function.</p>
<p style="text-align: center;"><img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-2735f68780b5c5965c08fdeb8b2bad41_l3.png" height="259" width="471" class="ql-manual-mode quicklatex-auto-format" alt="Rendered by QuickLaTeX.com" title="Rendered by QuickLaTeX.com"/>
</p>
<p>Note that the horizon angle <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-711a54a6a1f4d6f0d766ebd954762db5_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#104;&#40;&#92;&#116;&#104;&#101;&#116;&#97;&#41;" title="Rendered by QuickLaTeX.com" height="18" width="32" style="vertical-align: -4px;"/> is measured with respect to the plane parallel to the XY view plane, and therefore a tangent vector <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-b4e3cbf5d4c5c6d9b702dd139f14c147_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#116;" title="Rendered by QuickLaTeX.com" height="12" width="6" style="vertical-align: 0px;"/> needs to be taken into consideration, with its angle <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-9412c238d056a7e88f1a7329013c153b_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#116;&#40;&#92;&#116;&#104;&#101;&#116;&#97;&#41;" title="Rendered by QuickLaTeX.com" height="18" width="28" style="vertical-align: -4px;"/> relative to the same plane. This yields the following equation, as noted in ShaderX7:</p>
<p class="ql-center-displayed-equation" style="line-height: 71px;"><span class="ql-right-eqno"> &nbsp; </span><span class="ql-left-eqno"> &nbsp; </span><img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-b03b1ae4780e6d9579dda1dda26d6880_l3.png" height="71" width="315" class="ql-img-displayed-equation quicklatex-auto-format" alt="&#92;&#091; &#65;&#32;&#61;&#32;&#49;&#32;&#45;&#32;&#92;&#102;&#114;&#97;&#99;&#123;&#49;&#125;&#123;&#50;&#92;&#112;&#105;&#125;&#92;&#105;&#110;&#116;&#92;&#100;&#105;&#115;&#112;&#108;&#97;&#121;&#108;&#105;&#109;&#105;&#116;&#115;&#95;&#123;&#92;&#116;&#104;&#101;&#116;&#97;&#61;&#45;&#92;&#112;&#105;&#125;&#94;&#123;&#92;&#112;&#105;&#125;&#92;&#105;&#110;&#116;&#92;&#100;&#105;&#115;&#112;&#108;&#97;&#121;&#108;&#105;&#109;&#105;&#116;&#115;&#95;&#123;&#92;&#112;&#104;&#105;&#61;&#116;&#40;&#92;&#116;&#104;&#101;&#116;&#97;&#41;&#125;&#94;&#123;&#104;&#40;&#92;&#116;&#104;&#101;&#116;&#97;&#41;&#125;&#87;&#40;&#92;&#118;&#101;&#99;&#123;&#92;&#111;&#109;&#101;&#103;&#97;&#125;&#41;&#99;&#111;&#115;&#40;&#92;&#112;&#104;&#105;&#41;&#92;&#44;&#100;&#92;&#112;&#104;&#105;&#92;&#44;&#100;&#92;&#116;&#104;&#101;&#116;&#97; &#92;&#093;" title="Rendered by QuickLaTeX.com"/></p>
<p>For the derivation and other implementation details, please refer to the source material. It will be useful as I&#8217;ll explain the differences in the implementation I did for Helix.</p>
<h2>The Helix approach</h2>
<p>With respect to this article, the most important aspect of the original paper is that it uses eye-space as a basis for spherical coordinates, which is where this post will differ. While we&#8217;ll still use eye-space as a basis for our positions and vectors in the shader, for the derivations we&#8217;ll express the spherical coordinates relative to the normal vector and the tangent plane. Both changes will result in a leaner and, more importantly, trig-less approach. Here&#8217;s the previous figure reimagined:</p>
<p style="text-align: center;"><img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-6a70c96df97363c150265dd603999737_l3.png" height="237" width="471" class="ql-manual-mode quicklatex-auto-format" alt="Rendered by QuickLaTeX.com" title="Rendered by QuickLaTeX.com"/>
</p>
<p>Which will result in a similar equation, without involving <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-9412c238d056a7e88f1a7329013c153b_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#116;&#40;&#92;&#116;&#104;&#101;&#116;&#97;&#41;" title="Rendered by QuickLaTeX.com" height="18" width="28" style="vertical-align: -4px;"/>:</p>
<p class="ql-center-displayed-equation" style="line-height: 70px;"><span class="ql-right-eqno"> &nbsp; </span><span class="ql-left-eqno"> &nbsp; </span><img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-b0b65ee230867d0ef709574d6163357a_l3.png" height="70" width="300" class="ql-img-displayed-equation quicklatex-auto-format" alt="&#92;&#091; &#65;&#32;&#61;&#32;&#49;&#32;&#45;&#32;&#92;&#102;&#114;&#97;&#99;&#123;&#49;&#125;&#123;&#50;&#92;&#112;&#105;&#125;&#92;&#105;&#110;&#116;&#92;&#100;&#105;&#115;&#112;&#108;&#97;&#121;&#108;&#105;&#109;&#105;&#116;&#115;&#95;&#123;&#92;&#116;&#104;&#101;&#116;&#97;&#61;&#45;&#92;&#112;&#105;&#125;&#94;&#123;&#92;&#112;&#105;&#125;&#92;&#105;&#110;&#116;&#92;&#100;&#105;&#115;&#112;&#108;&#97;&#121;&#108;&#105;&#109;&#105;&#116;&#115;&#95;&#123;&#92;&#112;&#104;&#105;&#61;&#48;&#125;&#94;&#123;&#104;&#40;&#92;&#116;&#104;&#101;&#116;&#97;&#41;&#125;&#87;&#40;&#92;&#118;&#101;&#99;&#123;&#92;&#111;&#109;&#101;&#103;&#97;&#125;&#41;&#99;&#111;&#115;&#40;&#92;&#112;&#104;&#105;&#41;&#92;&#44;&#100;&#92;&#112;&#104;&#105;&#92;&#44;&#100;&#92;&#116;&#104;&#101;&#116;&#97; &#92;&#093;" title="Rendered by QuickLaTeX.com"/></p>
<p>We&#8217;ll take some liberties with the attenuation function, approximating it piecewise (see original). In practice, this means we define it to be constant between two adjacent horizon vectors, using the value at the furthest sample point. Furthermore, since we don&#8217;t snap to texel centres, we don&#8217;t need to make tangent adjustments as in the original. Focusing on the inner integral, we get:</p>
<p class="ql-center-displayed-equation" style="line-height: 69px;"><span class="ql-right-eqno"> &nbsp; </span><span class="ql-left-eqno"> &nbsp; </span><img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-fa8dfc7c3e16c1ed53392fd403141642_l3.png" height="69" width="415" class="ql-img-displayed-equation quicklatex-auto-format" alt="&#92;&#091; &#92;&#105;&#110;&#116;&#92;&#100;&#105;&#115;&#112;&#108;&#97;&#121;&#108;&#105;&#109;&#105;&#116;&#115;&#95;&#123;&#92;&#112;&#104;&#105;&#61;&#92;&#112;&#104;&#105;&#95;&#105;&#125;&#94;&#123;&#92;&#112;&#104;&#105;&#95;&#123;&#105;&#43;&#49;&#125;&#125;&#87;&#40;&#92;&#118;&#101;&#99;&#123;&#92;&#111;&#109;&#101;&#103;&#97;&#125;&#41;&#99;&#111;&#115;&#40;&#92;&#112;&#104;&#105;&#41;&#92;&#44;&#100;&#92;&#112;&#104;&#105;&#92;&#32;&#92;&#97;&#112;&#112;&#114;&#111;&#120;&#32;&#87;&#40;&#92;&#118;&#101;&#99;&#123;&#92;&#111;&#109;&#101;&#103;&#97;&#95;&#123;&#105;&#43;&#49;&#125;&#125;&#41;&#40;&#115;&#105;&#110;&#40;&#92;&#112;&#104;&#105;&#95;&#123;&#105;&#43;&#49;&#125;&#41;&#32;&#45;&#32;&#115;&#105;&#110;&#40;&#92;&#112;&#104;&#105;&#95;&#105;&#41;&#41; &#92;&#093;" title="Rendered by QuickLaTeX.com"/></p>
<p>Making the following observation, we can make our lives a lot easier:</p>
<p style="text-align: center;"><img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-d66a115f9286d6794282cd0d26b991c0_l3.png" height="184" width="286" class="ql-manual-mode quicklatex-auto-format" alt="Rendered by QuickLaTeX.com" title="Rendered by QuickLaTeX.com"/>
</p>
<p>This allows us to rewrite any occurrence of <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-19930d1bacf2a15b32bb20cb4a831fc4_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#115;&#105;&#110;&#40;&#92;&#112;&#104;&#105;&#41;" title="Rendered by QuickLaTeX.com" height="18" width="49" style="vertical-align: -4px;"/> as a simple dot product. Using the piecewise approximation for <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-74287f47b5abe06f27e01b59bcfac8d9_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#87;&#40;&#92;&#118;&#101;&#99;&#123;&#92;&#111;&#109;&#101;&#103;&#97;&#125;&#41;" title="Rendered by QuickLaTeX.com" height="18" width="44" style="vertical-align: -4px;"/>, the equation then becomes:</p>
<p class="ql-center-displayed-equation" style="line-height: 61px;"><span class="ql-right-eqno"> &nbsp; </span><span class="ql-left-eqno"> &nbsp; </span><img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-2ce006f4fc874fb0cce107845584959c_l3.png" height="61" width="390" class="ql-img-displayed-equation quicklatex-auto-format" alt="&#92;&#091; &#65;&#32;&#61;&#32;&#49;&#32;&#45;&#32;&#92;&#102;&#114;&#97;&#99;&#123;&#49;&#125;&#123;&#50;&#92;&#112;&#105;&#125;&#92;&#105;&#110;&#116;&#92;&#100;&#105;&#115;&#112;&#108;&#97;&#121;&#108;&#105;&#109;&#105;&#116;&#115;&#95;&#123;&#92;&#116;&#104;&#101;&#116;&#97;&#61;&#45;&#92;&#112;&#105;&#125;&#94;&#123;&#92;&#112;&#105;&#125;&#92;&#115;&#117;&#109;&#92;&#108;&#105;&#109;&#105;&#116;&#115;&#95;&#123;&#105;&#61;&#49;&#125;&#94;&#123;&#78;&#95;&#115;&#125;&#32;&#87;&#40;&#92;&#118;&#101;&#99;&#123;&#119;&#95;&#105;&#125;&#41;&#40;&#115;&#105;&#110;&#40;&#92;&#112;&#104;&#105;&#95;&#105;&#41;&#32;&#45;&#32;&#115;&#105;&#110;&#40;&#92;&#112;&#104;&#105;&#95;&#123;&#105;&#45;&#49;&#125;&#41;&#92;&#44;&#100;&#92;&#116;&#104;&#101;&#116;&#97; &#92;&#093;" title="Rendered by QuickLaTeX.com"/></p>
<p class="ql-center-displayed-equation" style="line-height: 61px;"><span class="ql-right-eqno"> &nbsp; </span><span class="ql-left-eqno"> &nbsp; </span><img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-8d75704e90369cdb8c060cd610377317_l3.png" height="61" width="375" class="ql-img-displayed-equation quicklatex-auto-format" alt="&#92;&#091; &#61;&#32;&#49;&#32;&#45;&#32;&#92;&#102;&#114;&#97;&#99;&#123;&#49;&#125;&#123;&#50;&#92;&#112;&#105;&#125;&#92;&#105;&#110;&#116;&#92;&#100;&#105;&#115;&#112;&#108;&#97;&#121;&#108;&#105;&#109;&#105;&#116;&#115;&#95;&#123;&#92;&#116;&#104;&#101;&#116;&#97;&#61;&#45;&#92;&#112;&#105;&#125;&#94;&#123;&#92;&#112;&#105;&#125;&#92;&#115;&#117;&#109;&#92;&#108;&#105;&#109;&#105;&#116;&#115;&#95;&#123;&#105;&#61;&#49;&#125;&#94;&#123;&#78;&#95;&#115;&#125;&#32;&#87;&#40;&#92;&#118;&#101;&#99;&#123;&#119;&#95;&#105;&#125;&#41;&#40;&#92;&#102;&#114;&#97;&#99;&#123;&#78;&#32;&#92;&#99;&#100;&#111;&#116;&#32;&#72;&#95;&#105;&#125;&#123;&#124;&#72;&#95;&#105;&#124;&#125;&#32;&#45;&#32;&#92;&#102;&#114;&#97;&#99;&#123;&#78;&#32;&#92;&#99;&#100;&#111;&#116;&#32;&#72;&#95;&#123;&#105;&#45;&#49;&#125;&#125;&#123;&#124;&#72;&#95;&#123;&#105;&#45;&#49;&#125;&#124;&#125;&#41;&#92;&#44;&#100;&#92;&#116;&#104;&#101;&#116;&#97; &#92;&#093;" title="Rendered by QuickLaTeX.com"/></p>
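<p>To make the trig-less sum a bit more concrete, here is a small C++ sketch of the per-direction term. It is only an illustration under my own assumptions, not the actual Helix shader: <code>Vec3</code>, <code>HorizonSample</code>, <code>directionOcclusion</code> and <code>ambientAccess</code> are hypothetical names, the normal is assumed to be unit length, the horizon vectors are assumed non-zero, and the attenuation weights are assumed to be computed elsewhere. The outer integral is approximated by averaging over evenly spaced directions.</p>

```cpp
#include <cmath>
#include <vector>

// Hypothetical minimal vector type, for illustration only.
struct Vec3 { float x, y, z; };

static float dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// sin(phi) of the elevation angle of h above the tangent plane,
// written as (N . H) / |H|. Assumes n is unit length and h is non-zero.
static float sinElevation(const Vec3& n, const Vec3& h) {
    return dot(n, h) / std::sqrt(dot(h, h));
}

// One raymarch sample: the vector from the centre point P to the sample
// position, plus its attenuation weight W (assumed precomputed).
struct HorizonSample {
    Vec3 h;
    float weight;
};

// Inner sum for a single marching direction: every time a sample raises the
// horizon, add the weighted difference in sin(phi) to the occlusion.
float directionOcclusion(const Vec3& normal, const std::vector<HorizonSample>& samples) {
    float occlusion = 0.0f;
    float previousSin = 0.0f; // phi_0 = 0, i.e. the tangent plane itself
    for (const HorizonSample& sample : samples) {
        float currentSin = sinElevation(normal, sample.h);
        if (currentSin > previousSin) {
            occlusion += sample.weight * (currentSin - previousSin);
            previousSin = currentSin;
        }
    }
    return occlusion;
}

// Outer integral approximated as the mean over evenly spaced directions.
float ambientAccess(const std::vector<float>& occlusionPerDirection) {
    if (occlusionPerDirection.empty()) return 1.0f;
    float sum = 0.0f;
    for (float o : occlusionPerDirection) sum += o;
    return 1.0f - sum / static_cast<float>(occlusionPerDirection.size());
}
```

<p>Note that no angle is ever computed: each sample costs one dot product and one normalization.</p>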
<h3>[Update]</h3>
<p>One thing I forgot to mention that needs to be taken into account is the case when the angle exceeds 90 degrees:</p>
<p style="text-align: center;"><img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-8567d08a45386a18160e8613251663ed_l3.png" height="164" width="208" class="ql-manual-mode quicklatex-auto-format" alt="Rendered by QuickLaTeX.com" title="Rendered by QuickLaTeX.com"/>
</p>
<p>In this case, the occlusion should be total, but the dot product with the normal would result in the wrong angle. This is actually a limitation of working in screen-space, where marching in 2D does not match a marching in 3D due to overlap. What we need to do is test the angle with the tangent: if <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-1880c63a62dda0598235ec1e475b269d_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#116;&#32;&#92;&#99;&#100;&#111;&#116;&#32;&#104;&#32;&#60;&#32;&#48;" title="Rendered by QuickLaTeX.com" height="13" width="62" style="vertical-align: 0px;"/>, we simply do not have proper data in the direction we&#8217;re interested in: it could be completely occluded, or not at all. Personally, I find picking &#8220;half-occluded&#8221; works best in this case.</p>
<h2>Some more implementation details</h2>
<p>As with SSAO, to reduce banding artefacts, the orientations of all the sample directions should be rotated randomly per pixel. Helix takes an approach similar to Crytek&#8217;s original SSAO algorithm, using a 4&#215;4 &#8216;dither&#8217; texture containing the 2D rotation factors which is indexed so that it becomes tiled 1:1 over the screen. To ensure an even distribution, rather than going for random angles, I instead used 16 evenly spaced angles between 0 and <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-c1a12422810c77e8ed61df6969f37fd1_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#92;&#102;&#114;&#97;&#99;&#123;&#50;&#92;&#112;&#105;&#125;&#123;&#78;&#125;" title="Rendered by QuickLaTeX.com" height="22" width="16" style="vertical-align: -6px;"/> with <img loading="lazy" decoding="async" src="https://www.derschmale.com/blog/wp-content/ql-cache/quicklatex.com-5793832f979c2268e3694c246d53b1bb_l3.png" class="ql-img-inline-formula quicklatex-auto-format" alt="&#78;" title="Rendered by QuickLaTeX.com" height="12" width="16" style="vertical-align: 0px;"/> being the number of raymarching directions.<br />
Furthermore, the dither texture also contains a jitter factor, which we&#8217;ll use to offset the starting position to reduce banding artefacts. This is similar to the original, with some nuances to play nice with the way view space positions are reconstructed in Helix. As per Louis Bavoil&#8217;s suggestion, introducing an extra sample closer to the center of the kernel creates more interesting contact occlusions. In my approach, both are combined by jittering the ray&#8217;s start position between the closest neighbour and the first sample size.<br />
Using dithering obviously introduces a lot of noise. Performing a depth-dependent 4&#215;4 box blur afterwards will remove this while making sure every pixel always has contributions from the same sample directions. Some implementations rotate the directions entirely randomly (non-tiled) and blur heavily, but I feel this tends to create what I can only describe as &#8220;static AO clouding&#8221;: you can see a static AO pattern moving along with the camera. Other implementations like to combine 4&#215;4 dithering and heavy blurring, but personally I like to retain some of the higher frequency occlusions.</p>
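<p>For illustration, the rotation part of such a dither texture could be generated along these lines. This is a sketch under assumptions rather than the Helix code: <code>Rotation2D</code>, <code>makeDitherRotations</code> and <code>ditherIndex</code> are hypothetical names, the 16 angles are simply stored in sequential order (a real implementation may well shuffle them within the tile), and the jitter factor is left out.</p>

```cpp
#include <array>
#include <cmath>
#include <cstddef>

// Hypothetical texel layout: the cosine and sine of the per-pixel rotation.
struct Rotation2D {
    float cosAngle;
    float sinAngle;
};

// Builds 16 evenly spaced rotations in [0, 2*pi / numDirections).
// Rotating further than that would only repeat the same set of directions.
std::array<Rotation2D, 16> makeDitherRotations(int numDirections) {
    const float pi = 3.14159265358979f;
    const float maxAngle = 2.0f * pi / static_cast<float>(numDirections);
    std::array<Rotation2D, 16> texels{};
    for (std::size_t k = 0; k < texels.size(); ++k) {
        float angle = maxAngle * static_cast<float>(k) / 16.0f;
        texels[k] = { std::cos(angle), std::sin(angle) };
    }
    return texels;
}

// Maps a pixel coordinate to a texel so the 4x4 pattern tiles 1:1 over the screen.
int ditherIndex(int pixelX, int pixelY) {
    return (pixelY % 4) * 4 + (pixelX % 4);
}
```

<p>Since the pattern repeats every 4 pixels in both axes, a 4&#215;4 blur footprint always covers each of the 16 rotations exactly once.</p>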
<div id="attachment_979" style="width: 306px" class="wp-caption alignright"><a href="http://www.derschmale.com/blog/wp-content/normal-occlusions.jpg"><img loading="lazy" decoding="async" aria-describedby="caption-attachment-979" src="http://www.derschmale.com/blog/wp-content/normal-occlusions-300x168.jpg" alt="Subtle normal-based occlusion" width="300" height="168" class="size-medium wp-image-979" srcset="https://www.derschmale.com/blog/wp-content/normal-occlusions-300x168.jpg 300w, https://www.derschmale.com/blog/wp-content/normal-occlusions-1024x576.jpg 1024w, https://www.derschmale.com/blog/wp-content/normal-occlusions.jpg 1280w" sizes="auto, (max-width: 300px) 100vw, 300px" /></a><p id="caption-attachment-979" class="wp-caption-text">Subtle normal-based occlusion</p></div>
<p>Regarding the normals, the source recommends using face normals which can be derived from the depth buffer. However, I simply use the per-pixel normals that are stored in the deferred renderer&#8217;s normal buffer. While this can introduce artefacts, the use of a bias angle largely cancels them out while still adding some normal-based occlusion.</p>
<h2>Comparison</h2>
<p>Below, you can see a comparison between this HBAO approach and traditional SSAO using similar settings and sample count.</p>
<div id="attachment_974" style="width: 306px" class="wp-caption alignright"><a href="http://www.derschmale.com/blog/wp-content/hbao4x4.jpg"><img loading="lazy" decoding="async" aria-describedby="caption-attachment-974" class="size-medium wp-image-974" alt="HBAO (4x4 samples)" src="http://www.derschmale.com/blog/wp-content/hbao4x4-300x168.jpg" width="300" height="168" srcset="https://www.derschmale.com/blog/wp-content/hbao4x4-300x168.jpg 300w, https://www.derschmale.com/blog/wp-content/hbao4x4-1024x576.jpg 1024w, https://www.derschmale.com/blog/wp-content/hbao4x4.jpg 1280w" sizes="auto, (max-width: 300px) 100vw, 300px" /></a><p id="caption-attachment-974" class="wp-caption-text">HBAO (4&#215;4 samples)</p></div>
<div id="attachment_976" style="width: 306px" class="wp-caption alignright"><a href="http://www.derschmale.com/blog/wp-content/ssao-16.jpg"><img loading="lazy" decoding="async" aria-describedby="caption-attachment-976" class="size-medium wp-image-976 " alt="SSAO (16 samples)" src="http://www.derschmale.com/blog/wp-content/ssao-16-300x168.jpg" width="300" height="168" srcset="https://www.derschmale.com/blog/wp-content/ssao-16-300x168.jpg 300w, https://www.derschmale.com/blog/wp-content/ssao-16-1024x576.jpg 1024w, https://www.derschmale.com/blog/wp-content/ssao-16.jpg 1280w" sizes="auto, (max-width: 300px) 100vw, 300px" /></a><p id="caption-attachment-976" class="wp-caption-text">SSAO (16 samples)</p></div>
<div id="attachment_973" style="width: 306px" class="wp-caption alignright"><a href="http://www.derschmale.com/blog/wp-content/hbao6x5.jpg"><img loading="lazy" decoding="async" aria-describedby="caption-attachment-973" class="size-medium wp-image-973" alt="HBAO (6x5 samples)" src="http://www.derschmale.com/blog/wp-content/hbao6x5-300x168.jpg" width="300" height="168" srcset="https://www.derschmale.com/blog/wp-content/hbao6x5-300x168.jpg 300w, https://www.derschmale.com/blog/wp-content/hbao6x5-1024x576.jpg 1024w, https://www.derschmale.com/blog/wp-content/hbao6x5.jpg 1280w" sizes="auto, (max-width: 300px) 100vw, 300px" /></a><p id="caption-attachment-973" class="wp-caption-text">HBAO (6&#215;5 samples)</p></div>
<div id="attachment_975" style="width: 306px" class="wp-caption alignright"><a href="http://www.derschmale.com/blog/wp-content/ssao32.jpg"><img loading="lazy" decoding="async" aria-describedby="caption-attachment-975" class="size-medium wp-image-975" alt="SSAO (32 samples)" src="http://www.derschmale.com/blog/wp-content/ssao32-300x168.jpg" width="300" height="168" srcset="https://www.derschmale.com/blog/wp-content/ssao32-300x168.jpg 300w, https://www.derschmale.com/blog/wp-content/ssao32-1024x576.jpg 1024w, https://www.derschmale.com/blog/wp-content/ssao32.jpg 1280w" sizes="auto, (max-width: 300px) 100vw, 300px" /></a><p id="caption-attachment-975" class="wp-caption-text">SSAO (32 samples)</p></div>
<p>Notice how HBAO produces less over-occlusion, especially near discontinuities (for example between the curtain and the floor), while some details are handled more correctly (such as the creases in the curtain to the right). Even with a relatively small number of samples, the improvements are remarkable. I did boost the parameters on both, <a href="http://www.derschmale.com/2013/12/12/screen-space-ambient-occlusion-battling-your-contrast-bias/">violating my own rules</a>, but purely for illustrative purposes!</p>
<h2>Example shader</h2>
<p>I guess no one will be happy without some sample code. I can&#8217;t publish the entire Helix code (mainly because it&#8217;s a pretty inefficient mess right now), but I can show the shaders for the ambient occlusion step. The rest (blur shaders, etc.) is standard fare. All code is in HLSL Shader Model 5 for DirectX 11.</p>
<ul>
<li>
<div><a title="Vertex Shader" href="http://www.derschmale.com/source/hbao/HBAOVertexShader.hlsl" target="_blank">Vertex shader</a></div>
</li>
<li>
<div><a title="Fragment Shader" href="http://www.derschmale.com/source/hbao/HBAOFragmentShader.hlsl" target="_blank">Fragment shader</a></div>
</li>
</ul>
<h2>Conclusion</h2>
<p>And that&#8217;s it! I hope this post was a useful look into different AO techniques. Any questions, comments, corrections, &#8230; are more than welcome!</p>
]]></content:encoded>
					
					<wfw:commentRss>https://www.derschmale.com/2013/12/20/an-alternative-implementation-for-hbao-2/feed/</wfw:commentRss>
			<slash:comments>5</slash:comments>
		
		
			</item>
	</channel>
</rss>
