<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/rss2full.xsl"?><?xml-stylesheet type="text/css" media="screen" href="http://feeds.feedburner.com/~d/styles/itemcontent.css"?><rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:creativeCommons="http://backend.userland.com/creativeCommonsRssModule" xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0" version="2.0">

<channel>
	<title>AMD Developer Central</title>
	
	<link>http://blogs.amd.com/developer</link>
	<description>Your central resource for tools, technologies, best practices, and expert guidance to optimize your software solution performance on AMD platforms.</description>
	<lastBuildDate>Tue, 29 May 2012 21:15:21 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0</generator>
		<atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" type="application/rss+xml" href="http://feeds.feedburner.com/AmdDeveloperBlogs" /><feedburner:info uri="amddeveloperblogs" /><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com/" /><creativeCommons:license>http://creativecommons.org/licenses/by-nd/2.0/</creativeCommons:license><image><link>http://creativecommons.org/licenses/by-nd/2.0/</link><url>http://creativecommons.org/images/public/somerights20.gif</url><title>Some Rights Reserved</title></image><feedburner:emailServiceId>AmdDeveloperBlogs</feedburner:emailServiceId><feedburner:feedburnerHostname>http://feedburner.google.com</feedburner:feedburnerHostname><item>
		<title>MapReduce Optimization in Mahout Recommendation Engine</title>
		<link>http://feedproxy.google.com/~r/AmdDeveloperBlogs/~3/IwudjpqULyk/</link>
		<comments>http://blogs.amd.com/developer/2012/05/29/mapreduce-optimization-in-mahout-recommendation-engine/#comments</comments>
		<pubDate>Tue, 29 May 2012 21:15:20 +0000</pubDate>
		<dc:creator>AMD DeveloperCentral</dc:creator>
				<category><![CDATA[Hard-Core Software Optimization]]></category>

		<guid isPermaLink="false">http://blogs.amd.com/developer/?p=2406</guid>
		<description><![CDATA[We have been working with Apache™ Hadoop™ and Apache™ Mahout™ to improve performance of Mahout-based workloads.  Hadoop is an infrastructure that supports big data distributed applications and Mahout is a machine‑learning library.  Our test case runs Mahout recommendations with the &#8230; <a href="http://blogs.amd.com/developer/2012/05/29/mapreduce-optimization-in-mahout-recommendation-engine/">Continue reading</a>]]></description>
			<content:encoded><![CDATA[<p>We have been working with Apache™ Hadoop™ and Apache™ Mahout™ to improve performance of Mahout-based workloads.  <a href="http://hadoop.apache.org/">Hadoop</a> is an infrastructure that supports big data distributed applications and <a href="http://mahout.apache.org/">Mahout</a> is a machine‑learning library.  Our test case runs Mahout recommendations with the Apache Software Foundation (ASF) Email dataset, using example scripts provided by Mahout. This workload recommends emails to the user based on recipient responses to emails sent by the user. </p>
<p>While running the workload we noticed that the <em>unsymmetrify mapper</em> job in the Mahout item-based recommendations was taking a long time to execute. The item-based recommendation approach calculates user preferences for an item based on user preferences towards similar items. When analyzing the profiles we noticed that one of the data structures was being recreated, and its memory reallocated, for every key value pair in the record. </p>
<p>To fix this behavior, we developed a patch that initializes the data structure once per record instead of once per key value pair.  This pattern, which programmers can easily overlook, can seriously degrade the performance of Mahout™ MapReduce jobs.  In the workload we studied, we measured more than a 4.5X speed-up in job execution time on our AMD Opteron 4228 HE cluster with a two-line code change!</p>
<p>The following graph illustrates the improved execution time gained by using our optimization.</p>
<p><a rel="attachment wp-att-2408" href="http://blogs.amd.com/developer/2012/05/29/mapreduce-optimization-in-mahout-recommendation-engine/executiontime/"><img class="aligncenter size-full wp-image-2408" title="ExecutionTime" src="http://blogs.amd.com/developer/files/2012/05/ExecutionTime.jpg" alt="" width="645" height="409" /></a></p>
<p>Performance improvements achieved by using this technique will vary by workload.  Gains depend on the number of key value pairs in the input record and the amount of heap being allocated.</p>
<p>The <strong>context.write</strong> method writes the data into intermediate and output files.  In MapReduce programming, when <strong>context.write</strong> is called the specified key value pair is guaranteed to be serialized and written.  Because the pair is serialized at call time, the same object can safely be refilled for the next pair, so reusing it when possible avoids repeated allocation and increases performance. </p>
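<p>The reuse pattern can be sketched in plain Java without a Hadoop dependency. This is only an illustration under stated assumptions, not the actual Mahout patch: <strong>Pair</strong>, <strong>write</strong>, and both <strong>emit</strong> methods are hypothetical names, and <strong>write</strong> stands in for <strong>context.write</strong> by serializing its argument before returning.</p>

```java
public class ReuseSketch {
    /** Hypothetical mutable value type standing in for a Hadoop Writable. */
    static final class Pair {
        private int key;
        private double value;

        void set(int key, double value) {
            this.key = key;
            this.value = value;
        }

        String serialize() {
            return key + ":" + value;
        }
    }

    /** Stand-in for context.write: the pair is serialized before the call returns. */
    static void write(StringBuilder sink, Pair pair) {
        sink.append(pair.serialize()).append('\n');
    }

    /** Slow pattern: a fresh Pair is allocated for every key value pair. */
    static String emitAllocatingPerPair(int[] keys, double[] values) {
        StringBuilder sink = new StringBuilder();
        for (int i = 0; i < keys.length; i++) {
            Pair pair = new Pair();
            pair.set(keys[i], values[i]);
            write(sink, pair);
        }
        return sink.toString();
    }

    /** Faster pattern: one Pair is allocated per record and refilled for each pair. */
    static String emitReusingPerRecord(int[] keys, double[] values) {
        StringBuilder sink = new StringBuilder();
        Pair pair = new Pair();
        for (int i = 0; i < keys.length; i++) {
            pair.set(keys[i], values[i]);
            write(sink, pair);
        }
        return sink.toString();
    }

    public static void main(String[] args) {
        int[] keys = {1, 2, 3};
        double[] values = {0.5, 1.5, 2.5};
        String perPair = emitAllocatingPerPair(keys, values);
        String perRecord = emitReusingPerRecord(keys, values);
        System.out.println("identical output: " + perPair.equals(perRecord));
    }
}
```

<p>Both methods produce the same serialized output; the second simply performs one allocation per record rather than one per pair, which is the kind of saving described above.</p>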
<p>The ASF Email dataset is publicly available for download at <a href="http://aws.amazon.com/datasets/7791434387204566">http://aws.amazon.com/datasets/7791434387204566</a>.  Our test cluster used eight data nodes and one name node.  Each data node had twelve hard drives and 64GB of RAM.  The cluster ran RHEL 6.2, Java version 1.7.0 update 4, Hadoop version 1.0.2 and Mahout version 0.6.</p>
<p><em>Bhaskar Devireddy is a Member Technical Staff in the Runtimes Team at AMD. His postings are his own opinions and may not represent AMD’s positions, strategies or opinions. Links to third party sites, and references to third party trademarks, are provided for convenience and illustrative purposes only. Unless explicitly stated, AMD is not responsible for the contents of such links, and no third party endorsement of AMD or any of its products is implied.</em></p>
<img src="http://feeds.feedburner.com/~r/AmdDeveloperBlogs/~4/IwudjpqULyk" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://blogs.amd.com/developer/2012/05/29/mapreduce-optimization-in-mahout-recommendation-engine/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://blogs.amd.com/developer/2012/05/29/mapreduce-optimization-in-mahout-recommendation-engine/</feedburner:origLink></item>
		<item>
		<title>OpenCL™ 1.2 and C++ Static kernel language now available</title>
		<link>http://feedproxy.google.com/~r/AmdDeveloperBlogs/~3/GA6vwnX4G1M/</link>
		<comments>http://blogs.amd.com/developer/2012/05/21/opencl%e2%84%a2-1-2-and-c-static-kernel-language-now-available/#comments</comments>
		<pubDate>Mon, 21 May 2012 22:29:28 +0000</pubDate>
		<dc:creator>Mark Ireton</dc:creator>
				<category><![CDATA[AMD APP]]></category>
		<category><![CDATA[Inside Dev Central]]></category>
		<category><![CDATA[AMD Developer Inside Track]]></category>
		<category><![CDATA[APU]]></category>
		<category><![CDATA[Code Optimization]]></category>
		<category><![CDATA[Code Profiler]]></category>
		<category><![CDATA[code samples]]></category>
		<category><![CDATA[Fusion]]></category>
		<category><![CDATA[GPGPU]]></category>
		<category><![CDATA[heterogeneous computing]]></category>
		<category><![CDATA[OpenCL]]></category>
		<category><![CDATA[Parallel Computing]]></category>
		<category><![CDATA[Parallel Programming]]></category>
		<category><![CDATA[Sample Code]]></category>

		<guid isPermaLink="false">http://blogs.amd.com/developer/?p=2391</guid>
		<description><![CDATA[Beginning with the AMD OpenCL™ APP SDK 2.6 availability back in December of 2011 AMD has  been making available preview versions of both OpenCL™ 1.2 support and improved C++ support for both host side and kernel side coding.  With our &#8230; <a href="http://blogs.amd.com/developer/2012/05/21/opencl%e2%84%a2-1-2-and-c-static-kernel-language-now-available/">Continue reading</a>]]></description>
			<content:encoded><![CDATA[<p>Since the release of the <a href="http://www.amd.com/us/press-releases/Pages/amd-opencl-app-2012jan11.aspx">AMD OpenCL™ APP SDK 2.6</a> in December 2011, AMD has been making available preview versions of both OpenCL™ 1.2 support and improved C++ support for host side and kernel side coding.  With our recent release of the AMD OpenCL™ APP SDK 2.7 these capabilities are now fully supported in the SDK and fully integrated into the run-time support delivered via the AMD Catalyst™ software drivers.  AMD also continues to demonstrate leadership in OpenCL™ by being first to submit for ratification what we believe is a fully conformant<sup>1</sup> OpenCL 1.2 solution for both CPU and GPU.  I am also excited that AMD now supports both the C++ wrapper API and the AMD extension for the C++ kernel language. Together they enable complete application development in C++, removing the need for much of the OpenCL™ API boilerplate in the host code while improving type checking of kernel parameters.</p>
<p>Download AMD OpenCL™ APP SDK 2.7 now from <a href="http://developer.amd.com/appsdk">http://developer.amd.com/appsdk</a>.</p>
<p>In addition to the above we have updated gDEBugger, APP Profiler, Kernel Analyzer and APP ML, and there are numerous new and improved samples.  We are continuing to work on our samples, and new samples will be posted on <a href="http://developer.amd.com/sdks/AMDAPPSDK/samples/Pages/default.aspx">http://developer.amd.com/sdks/AMDAPPSDK/samples/Pages/default.aspx</a> as they become available over the next few months.</p>
<p>OpenCL™ 1.2 adds the following key capabilities:</p>
<ul>
<li>Host access flags for memory objects enable more efficient buffer handling and provide added protection. For example, a buffer that is created as “write only” cannot be read from the host.</li>
<li>Pattern-based GPU buffer and image initialization can help eliminate the need for certain buffer/image transfers</li>
<li>Memory object migration supports transferring buffers before they are needed</li>
<li>New generalized image creation API</li>
<li>Enhanced image/buffer map operations</li>
<li>OpenCL  1.2 CPU device partition including partition of a CPU after addition to a context</li>
<li>Generalized 1D and 2D images, image arrays,  and image&lt;-&gt; buffer interop</li>
<li>Libraries support including the separation of compile and link phases and the ability to compile</li>
</ul>
<p>The C++ Wrapper API provides the following new capabilities:</p>
<ul>
<li>Defaults for platform, queue, device, … helping to significantly reduce  the amount of boilerplate code required.</li>
<li>Improved simplified constructors for cl::Buffer and addition of cl::copy functions</li>
<li>Additional support for events to functors</li>
</ul>
<p>Notable C++ features supported by the OpenCL™ Static C++ Kernel language:</p>
<ul>
<li>Kernel and function overloading</li>
<li>Inheritance
<ul>
<li>Strict inheritance</li>
<li>Friend classes</li>
<li>Multiple inheritance</li>
</ul>
</li>
<li>Templates:
<ul>
<li>Kernel templates</li>
<li>Member templates</li>
<li>Template default argument</li>
<li>Limited class templates (the “virtual” keyword is not exposed)</li>
<li>Partial template specialization</li>
</ul>
</li>
<li>Namespaces</li>
<li>References</li>
<li>‘this’ operator</li>
<li>with external symbols</li>
<li>Kernel reflection, the ability to query a kernel’s arguments</li>
<li>Support for printf as a built in function</li>
</ul>
<p>Additional features supported in SDK 2.7 and the Catalyst 12.4 drivers include:</p>
<ul>
<li>Support for Asynchronous PCI transfers</li>
<li>Video encode using VCE Encode (Win7)</li>
<li>Open Encode update (12.4)</li>
<li>cl_khr_fp64 is now supported on AMD Radeon HD™ 7900 series devices (“Cayman”)</li>
<li>Added OpenGL™ interoperability under Linux for  AMD Radeon HD™ 7000 series devices</li>
<li>Stability Improvements</li>
<li>Performance improvements</li>
<li>Support for AMD Radeon HD™ 7000 series devices (“Southern Islands”) NPI</li>
<li>Support for AMD&#8217;s Second Generation APUs (“Trinity”)</li>
<li>Kernel Analyzer v1.12</li>
<li>APP Profiler  v2.5</li>
</ul>
<p>gDEBugger version 6.2 can be downloaded for use with this SDK from http://developer.amd.com/gDEBugger.</p>
<ul>
<li>Introducing Linux® Support</li>
<li>New standalone user interface for both Linux® and Windows®, with enhancements for better navigation and ease of use</li>
<li>Supports OpenCL™ kernel and API level debugging on AMD Radeon™ HD 7000 series graphics cards</li>
<li>Supports OpenCL™ 1.2 beta drivers</li>
<li>Automatic updater to notify and download new product updates</li>
<li>Feature enhancements including support for static arrays, union variables and Find feature</li>
<li>Stability improvements</li>
</ul>
<p>APP KernelAnalyzer v 1.12</p>
<ul>
<li>Support for Catalyst revisions 12.1 through 12.4.</li>
</ul>
<p>APP Profiler v2.5 includes several key new features, including:</p>
<ul>
<li>Stability improvements</li>
</ul>
<p>APP ML 1.8</p>
<ul>
<li>Support for real to complex FFT</li>
</ul>
<p>New and updated samples</p>
<ul>
<li>Nbody: optimized for improved performance</li>
<li>DeviceFission: a new version of this sample using OpenCL 1.2 Device Fission capabilities. The old version is still included but renamed as DeviceFission11Ext</li>
<li>ImageOverlap and GaussianNoiseGL are two new OpenCL™ 1.2 samples</li>
<li>DwtHaar1DCPPKernel: an additional version  of DwtHaar1D but modified to use the C++ kernel language</li>
<li>MatrixMultiplicationCPPKernel: an additional version  of MatrixMultiplication  but modified to use the C++ kernel language. This sample supports multiplication of both int and float matrices through use of a template.</li>
<li>TransferOverlapCPP: an additional version of TransferOverlap but modified to use the C++ wrapper API</li>
<li>The URNGNoiseGL and HistogramAtomics samples have been modified to use the C++ wrapper API</li>
<li>The FFT, MersenneTwister,  and EigenValue samples have been modified to use C++ kernel language</li>
<li>There have been incremental improvements to a number of additional samples</li>
</ul>
<ol>
<li><em>Information is based on a published Khronos Specification, and is expected to pass the Khronos Conformance Testing Process. Current conformance status can be found at www.khronos.org/conformance.</em></li>
</ol>
<p><em><strong>Mark Ireton is a Sr. Manager, Product Application Engineering at AMD.</strong></em><em> </em><em>His postings are his own opinions and may not represent AMD’s positions, strategies or opinions. Links to third party sites, and references to third party trademarks, are provided for convenience and illustrative purposes only. Unless explicitly stated, AMD is not responsible for the contents of such links, and no third party endorsement of AMD or any of its products is implied.</em></p>
<img src="http://feeds.feedburner.com/~r/AmdDeveloperBlogs/~4/GA6vwnX4G1M" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://blogs.amd.com/developer/2012/05/21/opencl%e2%84%a2-1-2-and-c-static-kernel-language-now-available/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://blogs.amd.com/developer/2012/05/21/opencl%e2%84%a2-1-2-and-c-static-kernel-language-now-available/</feedburner:origLink></item>
		<item>
		<title>The latest AMD CodeAnalyst v3.6 for Windows (Q1 2012 release) is now available.</title>
		<link>http://feedproxy.google.com/~r/AmdDeveloperBlogs/~3/G-D4EV8WuM4/</link>
		<comments>http://blogs.amd.com/developer/2012/05/02/the-latest-amd-codeanalyst-v3-6-for-windows-q1-2012-release-is-now-available/#comments</comments>
		<pubDate>Wed, 02 May 2012 05:27:30 +0000</pubDate>
		<dc:creator>Jegan</dc:creator>
				<category><![CDATA[Inside Dev Central]]></category>
		<category><![CDATA[Code Optimization]]></category>
		<category><![CDATA[Code Profiler]]></category>
		<category><![CDATA[CodeAnalyst]]></category>
		<category><![CDATA[OpenCL]]></category>
		<category><![CDATA[Parallel Computing]]></category>

		<guid isPermaLink="false">http://blogs.amd.com/developer/?p=2387</guid>
		<description><![CDATA[AMD CodeAnalyst v3.6 for Windows has been released and can be downloaded from AMD CodeAnalyst for Windows page at the following URL http://developer.amd.com/cpu/codeanalyst/codeanalystwindows 1. CodeAnalyst application runs as a native 64-bit application on 64-bit Windows OS flavors. 2. We have &#8230; <a href="http://blogs.amd.com/developer/2012/05/02/the-latest-amd-codeanalyst-v3-6-for-windows-q1-2012-release-is-now-available/">Continue reading</a>]]></description>
			<content:encoded><![CDATA[<p>AMD CodeAnalyst v3.6 for Windows has been released and can be downloaded from AMD CodeAnalyst for Windows page at the following URL <a href="http://developer.amd.com/cpu/codeanalyst/codeanalystwindows">http://developer.amd.com/cpu/codeanalyst/codeanalystwindows</a></p>
<p>1. The CodeAnalyst application runs as a native 64-bit application on 64-bit versions of Windows.</p>
<p>2. We have made several optimizations to improve the overall CodeAnalyst experience.</p>
<p>3. We have fixed many bugs to improve the overall stability of CodeAnalyst.</p>
<p>We hope that this release helps you work efficiently and pleasantly. If you encounter any bugs or see where AMD CodeAnalyst could be improved, please reach us through our forums or by replying to this blog post.</p>
<p>OpenCL and the OpenCL logo are trademarks of Apple Inc. used with permission by Khronos.</p>
<img src="http://feeds.feedburner.com/~r/AmdDeveloperBlogs/~4/G-D4EV8WuM4" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://blogs.amd.com/developer/2012/05/02/the-latest-amd-codeanalyst-v3-6-for-windows-q1-2012-release-is-now-available/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://blogs.amd.com/developer/2012/05/02/the-latest-amd-codeanalyst-v3-6-for-windows-q1-2012-release-is-now-available/</feedburner:origLink></item>
		<item>
		<title>Debug OpenCL™ on Linux® with gDEBugger 6.2</title>
		<link>http://feedproxy.google.com/~r/AmdDeveloperBlogs/~3/8VaojVRug_s/</link>
		<comments>http://blogs.amd.com/developer/2012/04/25/debug-opencl%e2%84%a2-on-linux%c2%ae-with-gdebugger-6-2/#comments</comments>
		<pubDate>Wed, 25 Apr 2012 22:10:59 +0000</pubDate>
		<dc:creator>Milind Kukanur</dc:creator>
				<category><![CDATA[AMD APP]]></category>
		<category><![CDATA[Inside Dev Central]]></category>
		<category><![CDATA[AMD Developer Inside Track]]></category>
		<category><![CDATA[APU]]></category>
		<category><![CDATA[Code Optimization]]></category>
		<category><![CDATA[Code Profiler]]></category>
		<category><![CDATA[code samples]]></category>
		<category><![CDATA[GPGPU]]></category>
		<category><![CDATA[GPU]]></category>
		<category><![CDATA[heterogeneous computing]]></category>
		<category><![CDATA[Linux]]></category>
		<category><![CDATA[Microsoft]]></category>
		<category><![CDATA[OpenCL]]></category>
		<category><![CDATA[Optimization]]></category>
		<category><![CDATA[Visual Studio]]></category>

		<guid isPermaLink="false">http://blogs.amd.com/developer/?p=2376</guid>
		<description><![CDATA[Thank you for using gDEBugger and helping the tool get better by providing your continued feedback and forum posts. As part of AMD’s commitment to developer tools for heterogeneous compute platforms, I am excited to introduce you to gDEBugger 6.2. &#8230; <a href="http://blogs.amd.com/developer/2012/04/25/debug-opencl%e2%84%a2-on-linux%c2%ae-with-gdebugger-6-2/">Continue reading</a>]]></description>
			<content:encoded><![CDATA[<p>Thank you for using gDEBugger and helping the tool get better by providing your continued feedback and forum posts.</p>
<p>As part of AMD’s commitment to developer tools for heterogeneous compute platforms, I am excited to introduce you to gDEBugger 6.2.</p>
<p>gDEBugger 6.2 is a key milestone that adds Linux® support and a new standalone user interface that is available for both Linux® and Windows®. The major highlights of this release are:</p>
<ul>
<li>Support for Red Hat®, Ubuntu® and OpenSUSE™ Linux® distributions</li>
<li>New standalone user interface with enhanced GUI for ease of use and better navigation</li>
<li>Support for OpenCL™ kernel and API level debugging on AMD Radeon HD 7000 series graphics cards</li>
<li>Support for OpenCL™ 1.2 beta drivers</li>
<li>Stability and feature enhancements along with updated Microsoft® Visual Studio® Plugin</li>
</ul>
<p>Visit the <a href="http://developer.amd.com/TOOLS/GDEBUGGER/Pages/default.aspx">gDEBugger landing page</a> for more details and to download it.</p>
<p>We value your input. If you have suggestions on how to improve our tools or if you experience any issues, let us know through our <a href="http://devgurus.amd.com/">forums</a> or comments to this blog.</p>
<p><em><strong>Milind Kukanur is a Sr. Manager, Product Management at AMD.</strong> His postings are his own opinions and may not represent AMD’s positions, strategies or opinions. Links to third party sites, and references to third party trademarks, are provided for convenience and illustrative purposes only. Unless explicitly stated, AMD is not responsible for the contents of such links, and no third party endorsement of AMD or any of its products is implied.</em></p>
<img src="http://feeds.feedburner.com/~r/AmdDeveloperBlogs/~4/8VaojVRug_s" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://blogs.amd.com/developer/2012/04/25/debug-opencl%e2%84%a2-on-linux%c2%ae-with-gdebugger-6-2/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://blogs.amd.com/developer/2012/04/25/debug-opencl%e2%84%a2-on-linux%c2%ae-with-gdebugger-6-2/</feedburner:origLink></item>
		<item>
		<title>Oracle Java and AllocatePrefetchStyle for “Bulldozer” Processors</title>
		<link>http://feedproxy.google.com/~r/AmdDeveloperBlogs/~3/PMoiCMaWdAA/</link>
		<comments>http://blogs.amd.com/developer/2012/04/25/oracle-java-and-allocateprefetchstyle-for-%e2%80%9cbulldozer%e2%80%9d-processors/#comments</comments>
		<pubDate>Wed, 25 Apr 2012 19:15:12 +0000</pubDate>
		<dc:creator>Tom Deneau</dc:creator>
				<category><![CDATA[AMD Java Labs]]></category>

		<guid isPermaLink="false">http://blogs.amd.com/developer/?p=2364</guid>
		<description><![CDATA[This blog describes an option, -XX:AllocatePrefetchStyle=0, that may help performance when using a Java 6 (or older) Oracle JVM and running on an AMD Bulldozer core processor. In our labs, this option was observed to give performance lift typically between &#8230; <a href="http://blogs.amd.com/developer/2012/04/25/oracle-java-and-allocateprefetchstyle-for-%e2%80%9cbulldozer%e2%80%9d-processors/">Continue reading</a>]]></description>
			<content:encoded><![CDATA[<p>This post describes an option, -XX:AllocatePrefetchStyle=0, that may help performance when using a Java 6 (or older) Oracle JVM on an AMD Bulldozer core processor. In our labs, this option typically gave a performance uplift of between 5 and 15% on many workloads.   As usual, the uplift on your own workload may vary.</p>
<p>AMD’s second generation core architecture, codenamed “Bulldozer”, was released in 2011 (products included AMD Opteron™ 6200, AMD Opteron 4200 and FX processors).  These processors include a more advanced hardware prefetcher, which can pick up more varied data access patterns than previous generation processors. </p>
<p>Oracle JVMs by default use software prefetch instructions to prefetch heap memory when allocating new objects on the heap (Java applications tend to do a lot of heap allocations).  Sometimes the software and hardware prefetching can get in each other’s way. We discovered that for most Java workloads the best performance on Bulldozer family processors could be achieved by using only the hardware prefetcher and disabling the software prefetching.</p>
<p>AMD worked with Oracle to make this no-software-prefetch strategy the default for Bulldozer processors in the Java 7 release (released in July 2011).  For those still using Java 6 or earlier releases, the same effect can be achieved by explicitly using the following option on the java command line: -XX:AllocatePrefetchStyle=0.</p>
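<p>To confirm that the option actually reached the JVM, a small check can read the startup arguments through the standard <strong>java.lang.management</strong> API. This is a minimal sketch: the class name <strong>PrefetchFlagCheck</strong> and the helper <strong>containsFlag</strong> are hypothetical, and the check only sees flags passed explicitly at startup, not the JVM's built-in defaults.</p>

```java
import java.lang.management.ManagementFactory;

public class PrefetchFlagCheck {
    // The option exactly as it must be typed, with a plain ASCII hyphen.
    static final String FLAG = "-XX:AllocatePrefetchStyle=0";

    /** Returns true if the given JVM arguments contain the exact flag. */
    static boolean containsFlag(String[] jvmArgs) {
        for (String arg : jvmArgs) {
            if (FLAG.equals(arg)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // getInputArguments() lists the options given to the JVM itself,
        // not the arguments passed to main().
        String[] jvmArgs = ManagementFactory.getRuntimeMXBean()
                .getInputArguments()
                .toArray(new String[0]);
        System.out.println(FLAG + " present: " + containsFlag(jvmArgs));
    }
}
```

<p>A run might look like <strong>java -XX:AllocatePrefetchStyle=0 PrefetchFlagCheck</strong>. Note that an option pasted with a typographic dash in place of the leading hyphen does not match and is not recognized as a JVM option.</p>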
<p><em>Tom Deneau is a Senior Member Technical Staff in the Runtimes Team </em><em>at AMD. His postings are his own opinions and may not represent AMD’s positions, strategies or opinions. Links to third party sites, and references to third party trademarks, are provided for convenience and illustrative purposes only. Unless explicitly stated, AMD is not responsible for the contents of such links, and no third party endorsement of AMD or any of its products is implied.</em></p>
<img src="http://feeds.feedburner.com/~r/AmdDeveloperBlogs/~4/PMoiCMaWdAA" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://blogs.amd.com/developer/2012/04/25/oracle-java-and-allocateprefetchstyle-for-%e2%80%9cbulldozer%e2%80%9d-processors/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://blogs.amd.com/developer/2012/04/25/oracle-java-and-allocateprefetchstyle-for-%e2%80%9cbulldozer%e2%80%9d-processors/</feedburner:origLink></item>
		<item>
		<title>GCC 4.7 is available with support for AMD Opteron™ 6200 series and AMD FX series processors</title>
		<link>http://feedproxy.google.com/~r/AmdDeveloperBlogs/~3/xVjdkUMZTgE/</link>
		<comments>http://blogs.amd.com/developer/2012/04/23/gcc-4-7-is-available-with-support-for-amd-opteron%e2%84%a2-6200-series-and-amd-fx-series-processors/#comments</comments>
		<pubDate>Mon, 23 Apr 2012 21:29:55 +0000</pubDate>
		<dc:creator>Milind Kukanur</dc:creator>
				<category><![CDATA[AMD Libraries]]></category>
		<category><![CDATA[Hard-Core Software Optimization]]></category>
		<category><![CDATA[Inside Dev Central]]></category>
		<category><![CDATA[Processor Software Visible Features]]></category>
		<category><![CDATA[AMD Developer Inside Track]]></category>
		<category><![CDATA[Code Optimization]]></category>
		<category><![CDATA[compiler]]></category>
		<category><![CDATA[multi-core]]></category>
		<category><![CDATA[Open64]]></category>
		<category><![CDATA[Optimization]]></category>
		<category><![CDATA[Parallel Computing]]></category>
		<category><![CDATA[Parallel Programming]]></category>
		<category><![CDATA[SIMD]]></category>
		<category><![CDATA[SSE]]></category>

		<guid isPermaLink="false">http://blogs.amd.com/developer/?p=2345</guid>
		<description><![CDATA[Have you tried the new GCC release 4.7 yet? The recent GCC release helps improve support for AMD Opteron™ 6200 series and AMD FX series processors, and adds support for upcoming AMD processors with the “Piledriver” core.  This is an &#8230; <a href="http://blogs.amd.com/developer/2012/04/23/gcc-4-7-is-available-with-support-for-amd-opteron%e2%84%a2-6200-series-and-amd-fx-series-processors/">Continue reading</a>]]></description>
			<content:encoded><![CDATA[<p>Have you tried the new GCC release 4.7 yet?</p>
<p>The <a href="http://gcc.gnu.org/gcc-4.7/changes.html">recent GCC release</a> helps improve support for <a href="http://www.amd.com/us/products/server/processors/6000-series-platform/6200/Pages/6200-series-processors.aspx">AMD Opteron™ 6200 series</a> and <a href="http://www.amd.com/us/products/desktop/processors/amdfx/Pages/amdfx.aspx">AMD FX series processors</a>, and adds support for upcoming AMD processors with the “Piledriver” core.  This is an important release for the developer community and brings significant performance improvements, new features and infrastructure enhancements over the previous versions.  A preview of the key highlights includes:</p>
<ul>
<li>Performance improvements in the compiler option for maximum optimization level (<em>-Ofast</em>)</li>
<li>ISA support including FMA3, F16C, TBM and BMI instruction sets (<em>-m[no-]fma, -m[no-]f16c, -m[no-]tbm, -m[no-]bmi</em>)</li>
<li>Improved robustness, scalability and memory usage for link-time optimization (<em>-flto</em>)</li>
<li>Support for OpenMP 3.1 (<em>-fopenmp</em>)</li>
<li>Option to store local arrays on stack memory for FORTRAN (<em>-fstack-arrays</em>)</li>
<li>Addition of C++11 (<em>-std=c++11</em>)</li>
</ul>
<p>GCC now has optimized performance settings and compile flags for the AMD processors. These include:</p>
<ul>
<li>AMD processors with “Piledriver” core (options: <em>-march=bdver2</em> and <em>-mtune=bdver2</em>)</li>
<li>AMD Opteron™ and AMD FX series processors with “Bulldozer” processor core (options: <em>-march=bdver1</em> and <em>-mtune=bdver1</em>)</li>
<li>AMD processors with “Bobcat” core (options: <em>-march=btver1</em> and <em>-mtune=btver1</em>)</li>
</ul>
<p>For a list of compiler options to use with AMD processors, check out our <a href="http://developer.amd.com/Assets/CompilerOptQuickRef-62004200.pdf">compiler quick reference guide</a>.</p>
<p>Overall, GCC 4.7 runtime performance is designed to be faster than previous versions including GCC 4.6 or default versions that come with commercial Linux distributions (e.g. RHEL or SLES), such as GCC 4.4.6. If your application is sensitive to runtime performance then you might consider getting the latest version of GCC. Check out <a href="http://gcc.gnu.org/">gcc.gnu.org</a> to learn about the new updates and upgrade your compiler to the latest version.</p>
<p>GCC 4.7 release notes can be found <a href="http://gcc.gnu.org/gcc-4.7/changes.html">here</a>.</p>
<p>By the way, did I mention GNU Compiler Collection turned 25 this year… Happy anniversary GCC!</p>
<p><em><strong>Milind Kukanur is a Sr. Manager, Product Management at AMD.</strong></em><em> </em><em>His postings are his own opinions and may not represent AMD’s positions, strategies or opinions. Links to third party sites, and references to third party trademarks, are provided for convenience and illustrative purposes only. Unless explicitly stated, AMD is not responsible for the contents of such links, and no third party endorsement of AMD or any of its products is implied.</em></p>
<img src="http://feeds.feedburner.com/~r/AmdDeveloperBlogs/~4/xVjdkUMZTgE" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://blogs.amd.com/developer/2012/04/23/gcc-4-7-is-available-with-support-for-amd-opteron%e2%84%a2-6200-series-and-amd-fx-series-processors/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://blogs.amd.com/developer/2012/04/23/gcc-4-7-is-available-with-support-for-amd-opteron%e2%84%a2-6200-series-and-amd-fx-series-processors/</feedburner:origLink></item>
		<item>
		<title>Rob Farber presents in the OpenCL™ Programming Webinar Series</title>
		<link>http://feedproxy.google.com/~r/AmdDeveloperBlogs/~3/Q4PR0Dp0maY/</link>
		<comments>http://blogs.amd.com/developer/2012/04/09/rob-farber-presents-in-the-opencl%e2%84%a2-programming-webinar-series/#comments</comments>
		<pubDate>Mon, 09 Apr 2012 23:08:52 +0000</pubDate>
		<dc:creator>Sharon Troia</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[OpenCL]]></category>
		<category><![CDATA[Parallel Programming]]></category>
		<category><![CDATA[Webinar]]></category>

		<guid isPermaLink="false">http://blogs.amd.com/developer/?p=2337</guid>
		<description><![CDATA[AMD offers an OpenCL™ Programming Webinar Series to help software developers become experts in the latest technologies, standards and best practices.  We are excited to bring a guest presenter and well known GPGPU guru, Rob Farber, to deliver a series &#8230; <a href="http://blogs.amd.com/developer/2012/04/09/rob-farber-presents-in-the-opencl%e2%84%a2-programming-webinar-series/">Continue reading</a>]]></description>
			<content:encoded><![CDATA[<p>AMD offers an <a href="http://developer.amd.com/zones/OpenCLZone/Events/pages/OpenCLWebinars.aspx">OpenCL™ Programming Webinar Series</a> to help software developers become experts in the latest technologies, standards and best practices. </p>
<p>We are excited to welcome guest presenter and well-known GPGPU guru Rob Farber, who will deliver a series of three OpenCL webinars accompanying his nine-part, AMD-sponsored OpenCL article series, referenced below:</p>
<p>1.  April 10th, 10AM PDT, <a title="Click here to register" href="https://www3.gotomeeting.com/register/198987358" target="_blank">Introducing Portable Parallelism</a></p>
<ul>
<li>C and C++ APIs</li>
<li>OpenCL Memory Spaces</li>
<li>The OpenCL Execution Model</li>
<li>Reference articles: <a href="http://www.codeproject.com/KB/showcase/Portable-Parallelism.aspx">1</a>, <a href="http://www.codeproject.com/KB/showcase/Memory-Spaces.aspx">2</a> and <a href="http://www.codeproject.com/KB/showcase/Work-Groups-Sync.aspx">3</a></li>
</ul>
<p>2.  April 24th, 10AM PDT, <a title="Click here to register" href="https://www3.gotomeeting.com/register/494132830" target="_blank">Coordinating OpenCL Computations on One or More Heterogeneous Devices</a></p>
<ul>
<li>How to Concisely Utilize Multiple Command Queues and Coordinate Tasks Across Multiple Heterogeneous Devices, Such as Two GPUs Plus a CPU</li>
<li>Code Sample Discussion: Massively Parallel Random Number Test Framework</li>
<li>Reference articles: <a href="http://www.codeproject.com/KB/showcase/OpenCL-Queues.aspx">4</a>, <a href="http://www.codeproject.com/KB/showcase/OpenCL-Buffers.aspx">5</a> and <a href="http://www.codeproject.com/Articles/329633/Part-8-Heterogeneous-workflows-using-OpenCL">8</a></li>
</ul>
<p>3.  May 1st, 10AM PDT, <a title="Click here to register" href="https://www3.gotomeeting.com/register/538470790" target="_blank">Accelerate Rendering by an Order of Magnitude with OpenCL, Plus a View to the Multi-core and Web-enabled Future</a></p>
<ul>
<li>How to use OpenCL to Provide High-Quality, Fast Rendering in Combination with Primitive Restart</li>
<li>Device Fission, Partitioning Hardware Capabilities for Optimal Resource Usage</li>
<li>Looking to the Future – WebCL</li>
<li>Reference articles: <a href="http://www.codeproject.com/Articles/201263/Part-6-Primitive-Restart-and-OpenGL-Interoperabili">6</a>, <a href="http://www.codeproject.com/Articles/329620/Part-7-OpenCL-plugins">7</a> and <a href="http://www.codeproject.com/Articles/330174/Part-9-OpenCL-Extensions-and-Device-Fission">9</a></li>
</ul>
<p>Registration is limited, so don&#8217;t wait: <strong><a href="http://developer.amd.com/zones/OpenCLZone/Events/pages/OpenCLWebinars.aspx">register for all three today</a>!</strong></p>
<p><em><strong>Who is Rob Farber, you ask?</strong></em></p>
<p>Rob Farber is a senior scientist and research consultant at the Irish Centre for High-End Computing in Dublin, Ireland and Chief Scientist for BlackDog Endeavors, LLC in the US.  Previously, he was on staff at several US national laboratories, including Los Alamos National Laboratory, Lawrence Berkeley National Laboratory, and Pacific Northwest National Laboratory. He also served as an external faculty member at the Santa Fe Institute, co-founded two successful start-ups, and has been a consultant to Fortune 100 companies. His articles have appeared in many venues including The Code Project, Dr. Dobb&#8217;s Journal, and Scientific Computing. Rob recently completed a book teaching massively parallel computing.</p>
<p><strong><em>Sharon Troia is a Sr. Developer Relations Engineer in the Developer Outreach Group at AMD.</em></strong><em> Her postings are her own opinions and may not represent AMD’s positions, strategies or opinions. Links to third party sites, and references to third party trademarks, are provided for convenience and illustrative purposes only. Unless explicitly stated, AMD is not responsible for the contents of such links, and no third party endorsement of AMD or any of its products is implied.</em></p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.amd.com/developer/2012/04/09/rob-farber-presents-in-the-opencl%e2%84%a2-programming-webinar-series/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://blogs.amd.com/developer/2012/04/09/rob-farber-presents-in-the-opencl%e2%84%a2-programming-webinar-series/</feedburner:origLink></item>
		<item>
		<title>Welcome to OpenCL™ 1.2:   check out these Beta drivers that contain a complete OpenCL™ 1.2 solution.</title>
		<link>http://feedproxy.google.com/~r/AmdDeveloperBlogs/~3/004YYPmFvmY/</link>
		<comments>http://blogs.amd.com/developer/2012/03/23/welcome-to-opencl%e2%84%a2-1-2-check-out-these-beta-drivers-that-contain-a-complete-opencl%e2%84%a2-1-2-solution/#comments</comments>
		<pubDate>Fri, 23 Mar 2012 14:36:36 +0000</pubDate>
		<dc:creator>Mark Ireton</dc:creator>
				<category><![CDATA[AMD APP]]></category>
		<category><![CDATA[Inside Dev Central]]></category>
		<category><![CDATA[APU]]></category>
		<category><![CDATA[Fusion]]></category>
		<category><![CDATA[GPGPU]]></category>
		<category><![CDATA[heterogeneous computing]]></category>
		<category><![CDATA[OpenCL]]></category>
		<category><![CDATA[Parallel Computing]]></category>
		<category><![CDATA[Parallel Programming]]></category>

		<guid isPermaLink="false">http://blogs.amd.com/developer/?p=2332</guid>
		<description><![CDATA[Last December we released preview drivers that contained much of the new functionality defined in the OpenCL™ 1.2 specification.  AMD is an ardent promoter of OpenCL™, and today, I am pleased to make available beta drivers that contain a complete &#8230; <a href="http://blogs.amd.com/developer/2012/03/23/welcome-to-opencl%e2%84%a2-1-2-check-out-these-beta-drivers-that-contain-a-complete-opencl%e2%84%a2-1-2-solution/">Continue reading</a>]]></description>
			<content:encoded><![CDATA[<p>Last December we released preview drivers that contained much of the new functionality defined in the OpenCL™ 1.2 specification.  AMD is an ardent promoter of OpenCL™, and today, I am pleased to make available beta drivers that contain a complete, beta-level implementation of the OpenCL™ 1.2 specification.  You can download the drivers from our <a href="http://developer.amd.com/sdks/AMDAPPSDK/downloads/Pages/default.aspx">APP SDK downloads page</a>.</p>
<p>The beta includes the following OpenCL™ 1.2 functionality:</p>
<ul>
<li>Host access flags for memory objects enable more efficient buffer handling and provide added protection. For example, a buffer created with the host write-only flag (CL_MEM_HOST_WRITE_ONLY) cannot be read from the host.</li>
<li>Pattern-based GPU buffer and image initialization can help eliminate the need for certain buffer/image transfers</li>
<li>Memory object migration supports transferring buffers to a device before they are needed</li>
<li>New generalized image creation API</li>
<li>Enhanced image/buffer map operations</li>
<li>OpenCL 1.2 CPU device partition including partition of a CPU after addition to a context</li>
<li>Generalized 1D and 2D images, image arrays, and image &lt;-&gt; buffer interop</li>
<li>Library support, including the separation of compile and link phases and the ability to compile with external symbols</li>
<li>Kernel reflection, the ability to query a kernel’s arguments</li>
<li>Support for printf as a built-in function</li>
</ul>
<p>OpenCL™ 1.2 support is scheduled to  be included in our official AMD Catalyst™ driver releases in the coming months.</p>
<p><strong>Mark Ireton is the Product Manager for Compute Solutions at AMD.</strong> His postings are his own opinions and may not represent AMD’s positions, strategies or opinions. Links to third party sites, and references to third party trademarks, are provided for convenience and illustrative purposes only. Unless explicitly stated, AMD is not responsible for the contents of such links, and no third party endorsement of AMD or any of its products is implied.</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.amd.com/developer/2012/03/23/welcome-to-opencl%e2%84%a2-1-2-check-out-these-beta-drivers-that-contain-a-complete-opencl%e2%84%a2-1-2-solution/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		<feedburner:origLink>http://blogs.amd.com/developer/2012/03/23/welcome-to-opencl%e2%84%a2-1-2-check-out-these-beta-drivers-that-contain-a-complete-opencl%e2%84%a2-1-2-solution/</feedburner:origLink></item>
		<item>
		<title>AMD and the Visual Studio 11 Beta</title>
		<link>http://feedproxy.google.com/~r/AmdDeveloperBlogs/~3/dBGWBxzDW-o/</link>
		<comments>http://blogs.amd.com/developer/2012/03/01/amd-and-the-visual-studio-11-beta/#comments</comments>
		<pubDate>Thu, 01 Mar 2012 22:35:20 +0000</pubDate>
		<dc:creator>Robin Maffeo</dc:creator>
				<category><![CDATA[Hard-Core Software Optimization]]></category>
		<category><![CDATA[Inside Dev Central]]></category>
		<category><![CDATA[Processor Software Visible Features]]></category>
		<category><![CDATA[APU]]></category>
		<category><![CDATA[C++ AMP]]></category>
		<category><![CDATA[compiler]]></category>
		<category><![CDATA[GPGPU]]></category>
		<category><![CDATA[heterogeneous computing]]></category>
		<category><![CDATA[Microsoft]]></category>
		<category><![CDATA[Piledriver]]></category>
		<category><![CDATA[SIMD]]></category>
		<category><![CDATA[SSE]]></category>
		<category><![CDATA[Visual Studio]]></category>

		<guid isPermaLink="false">http://blogs.amd.com/developer/?p=2327</guid>
		<description><![CDATA[AMD and the Visual Studio 11 Beta Today marks the release of the Beta of Visual Studio 11, an exciting new version of Visual Studio that contains a number of enhancements and improvements over prior versions.  While I won’t get &#8230; <a href="http://blogs.amd.com/developer/2012/03/01/amd-and-the-visual-studio-11-beta/">Continue reading</a>]]></description>
			<content:encoded><![CDATA[<p><strong>AMD and the Visual Studio 11 Beta</strong></p>
<p>Today marks the release of the Beta of Visual Studio 11, an exciting new version of Visual Studio that contains a number of enhancements and improvements over prior versions.  While I won’t get into a full feature breakdown (see my links to Microsoft blogs below for what’s new), there are a few specific areas that I’d like to touch on that I believe will help developers make the most of their AMD hardware platforms – across APU, CPU, and GPU.</p>
<p><strong>C++ AMP</strong></p>
<p>C++ AMP is new to Visual Studio and simplifies the development of data-parallel programs that can take advantage of underlying APU and GPU hardware supporting DirectX 11.  C++ AMP consists of an extension to the C++ language in the form of a new keyword (restrict), infrastructure plumbing, and libraries.  By executing data-parallel code directly on the APU/GPU – such as on an <a href="http://www.amd.com/us/products/notebook/apu/mainstream/Pages/mainstream.aspx#5">AMD A-Series APU</a> or <a href="http://www.amd.com/us/products/desktop/graphics/7000/7970/Pages/radeon-7970.aspx">AMD Radeon™ HD 7970</a> GPU – performance gains and power savings are possible over code executing on the CPU alone (even SSE vectorized code).  The tight integration of C++ AMP as a C++ language extension and its relative ease of use are key benefits that will help drive widespread adoption of heterogeneous computing and lower the barrier to entry for data-parallel computation.</p>
<p><strong>Support for new instructions</strong></p>
<p>Visual Studio 11 includes a number of new intrinsics to support AMD’s processors, including the upcoming 2nd generation APU codenamed “Trinity”.  The processor core in “Trinity” has new instructions for three-operand FMA, also known as FMA3, as well as instructions for bit manipulation (BMI and TBM) and half-float conversion (F16C), all of which are supported via C++ intrinsics in Visual Studio 11.  For more information on the new instructions, see the AMD Software Optimization Guide located <a href="http://support.amd.com/us/Processor_TechDocs/47414_15h_sw_opt_guide.pdf">here</a>.</p>
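<p>To illustrate why a fused multiply-add matters numerically, here is a minimal sketch in portable C++. It assumes the C++11 standard library function <code>std::fma</code> rather than the Visual Studio 11 intrinsics mentioned above, and the helper name <code>product_rounding_error</code> is hypothetical. Whether a compiler maps <code>std::fma</code> onto a hardware FMA3 instruction depends on the compiler and its target options, so treat this as an illustration of the operation's semantics, not of the generated code.</p>

```cpp
#include <cmath>

// Hypothetical helper: returns the rounding error of the double-precision
// product a * b, that is, the exact product minus the rounded product.
double product_rounding_error(double a, double b) {
    // Ordinary multiply: the exact product is rounded to the nearest double.
    const double rounded = a * b;
    // Fused multiply-add: a * b - rounded is evaluated with a single rounding
    // at the end, so the low-order bits lost above are recovered.
    return std::fma(a, b, -rounded);
}
```

<p>For example, with a = 1 + 2<sup>-27</sup> the product a * a rounds to 1 + 2<sup>-26</sup>, and the function returns the discarded term 2<sup>-54</sup>. A separate multiply followed by a subtract would round twice and lose that term.</p>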
<p><strong>Auto-vectorization </strong></p>
<p>Also new to C++ in Visual Studio 11 is an auto-vectorizer, which is on by default.  The compiler will vectorize loops where possible to improve performance using vector instructions on the processor (such as SSE2 and SSE4.x).  Consider the following code snippet:</p>
<p>float A[1000], B[1000], C[1000];</p>
<p>for (int i = 0; i &lt; 1000; i++) {</p>
<p>    A[i] = B[i] + C[i];</p>
<p>}</p>
<p>The C++ compiler can vectorize this loop in order to execute multiple iterations simultaneously, improving performance significantly with instructions available on modern processors.</p>
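<p>As a rough sketch of what "where possible" means, the two functions below contrast a loop whose iterations are independent with one that carries a dependency from one iteration to the next. The function names <code>add_arrays</code> and <code>running_sum</code> are hypothetical, and whether a particular compiler actually vectorizes either loop depends on the compiler and its settings.</p>

```cpp
#include <cstddef>
#include <vector>

// Independent iterations: each output element depends only on the inputs at
// the same index, so a vectorizing compiler is free to process several
// elements per instruction. Assumes c is at least as long as b.
std::vector<float> add_arrays(const std::vector<float>& b,
                              const std::vector<float>& c) {
    std::vector<float> a(b.size());
    for (std::size_t i = 0; i < a.size(); ++i) {
        a[i] = b[i] + c[i];
    }
    return a;
}

// Loop-carried dependency: each iteration needs the total produced by the
// previous one, which generally prevents straightforward vectorization.
std::vector<float> running_sum(const std::vector<float>& b) {
    std::vector<float> a(b.size());
    float total = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i) {
        total += b[i];
        a[i] = total;
    }
    return a;
}
```

<p>The first loop has the same shape as the snippet above. The second is a reminder that the order of iterations sometimes matters, in which case the compiler has to keep the loop scalar or restructure it.</p>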
<p>In addition to the vectorizer, the auto-parallelizer can execute loops across multiple CPU cores in the system, making better use of the underlying hardware.  The parallelizer requires input from the programmer to indicate which loops should be parallelized, and it can be used in conjunction with the auto-vectorizer.</p>
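<p>One way to give the parallelizer that input, sketched here under assumptions, is the Visual C++ <code>#pragma loop(hint_parallel(n))</code> directive combined with the <code>/Qpar</code> compiler option. The function name <code>scale_in_place</code> is hypothetical, the pragma is guarded so that other compilers simply build a serial loop, and whether the compiler honors the hint for a given loop is up to the compiler.</p>

```cpp
#include <vector>

// Hypothetical helper: multiplies every element by factor, in place.
void scale_in_place(std::vector<double>& values, double factor) {
    if (values.empty()) {
        return;
    }
    double* data = &values[0];
    const int n = static_cast<int>(values.size());

    // Hint for Visual C++ when building with /Qpar: the iterations are
    // independent, so the loop may be split across threads. An argument of 0
    // asks for as many threads as are available at run time.
#ifdef _MSC_VER
#pragma loop(hint_parallel(0))
#endif
    for (int i = 0; i < n; ++i) {
        data[i] = data[i] * factor;
    }
}
```

<p>The loop is written with a plain pointer and an integer counter to keep its shape simple. The result is the same whether the loop runs serially or in parallel, because no iteration reads a value written by another.</p>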
<p>Finally, the Visual C++ compiler includes targeted code generation improvements to help improve the overall performance of applications that are built with it.  For more on specific C++ improvements in Visual Studio 11, check out Diego Dagum’s <a href="http://blogs.msdn.com/b/vcblog/archive/2012/02/29/10272778.aspx">blog</a>.</p>
<p><strong>Support for Windows 8 Metro-style applications</strong></p>
<p>Of course, one of the significant highlights of Visual Studio 11 is support for Metro-style applications and WinRT.  With WinRT, developers can use C++, JavaScript, and managed languages (such as C# and VB) to build new Metro-style applications for Windows 8.  Metro-style applications are hardware accelerated and use the APU or GPU for rendering to enable a smooth user experience.  C++ AMP can also be used in a Metro-style application for even more performance benefits!</p>
<p>In short, Visual Studio 11 has a number of great features to get the most out of the underlying hardware.  This spans AMD’s product line – from APU to CPU to GPU.  We’re excited to see what kind of great new applications developers will build with Visual Studio 11!  To get started with Visual Studio 11, read about what’s new in Jason Zander’s <a href="http://blogs.msdn.com/b/jasonz/archive/2012/02/29/welcome-to-the-beta-of-visual-studio-11-and-net-framework-4-5.aspx">blog</a>, and <a href="http://go.microsoft.com/?linkid=9801609">download the beta</a>.</p>
<p><strong>Robin Maffeo is a Software Engineering Manager on the Microsoft team at AMD. </strong><em>His postings are his own opinions and may not represent AMD’s positions, strategies or opinions. </em><em>Links to third party sites, and references to third party trademarks, are provided for convenience and illustrative purposes only.  Unless explicitly stated, AMD is not responsible for the contents of such links, and no third party endorsement of AMD or any of its products is implied.</em></p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.amd.com/developer/2012/03/01/amd-and-the-visual-studio-11-beta/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://blogs.amd.com/developer/2012/03/01/amd-and-the-visual-studio-11-beta/</feedburner:origLink></item>
		<item>
		<title>AMD OpenCL™ Coding Competition Results</title>
		<link>http://feedproxy.google.com/~r/AmdDeveloperBlogs/~3/HL-v6THDYA4/</link>
		<comments>http://blogs.amd.com/developer/2012/02/23/amd-opencl%e2%84%a2-coding-competition-results/#comments</comments>
		<pubDate>Thu, 23 Feb 2012 20:05:02 +0000</pubDate>
		<dc:creator>Sharon Troia</dc:creator>
				<category><![CDATA[Inside Dev Central]]></category>
		<category><![CDATA[APU]]></category>
		<category><![CDATA[heterogeneous computing]]></category>
		<category><![CDATA[OpenCL]]></category>

		<guid isPermaLink="false">http://blogs.amd.com/developer/?p=2323</guid>
		<description><![CDATA[Last year at the AMD Fusion11 Developer Summit we announced an AMD OpenCL™ Coding Competition.  This competition received a record breaking response in registration numbers.  It had two main components, the Innovation Challenge and the Performance Challenge.  Performance Challenge The &#8230; <a href="http://blogs.amd.com/developer/2012/02/23/amd-opencl%e2%84%a2-coding-competition-results/">Continue reading</a>]]></description>
			<content:encoded><![CDATA[<p>Last year at the AMD Fusion11 Developer Summit we announced an <a href="http://blogs.amd.com/developer/2011/06/15/announcing-the-amd-opencl%E2%84%A2-coding-competition-with-50000-in-prizes/">AMD OpenCL™ Coding Competition</a>.  This competition received a record-breaking response in registration numbers.  It had two main components: the Innovation Challenge and the Performance Challenge.</p>
<p><strong>Performance Challenge</strong></p>
<p>The Performance Challenge surprised us with how quickly developers learned to implement advanced OpenCL™ techniques.  It took them only two weeks to work out the optimization techniques needed to get the best performance.  The majority of these participants were new to OpenCL technology; this quote from a challenge competitor sums up the typical contestant’s experience:</p>
<p><em>“</em><em>Honestly, I am new to OpenCL, except having a look at couple of CUDA programs before and little experience with OpenMP… I thought I could do it. It turned out to be not much difficult.”</em></p>
<p><em>-Topcoder Member: Pratap, aka: supercharger</em></p>
<p><strong>Innovation Challenge</strong></p>
<p>The submissions for this challenge were quite innovative in just about every category I could imagine and more.  People submitted OpenCL algorithms for games, financial modeling, video processing, image processing, artificial intelligence, gesture recognition, simulation, computer vision, robotics, and the list goes on.  Our engineers debated tirelessly over which entry should win and it took a while for us to close on a decision (as it should be, considering how significant the prizes were).</p>
<p>The Innovation Challenge prizes went to:</p>
<ol>
<li>First place: Went to the designer of an application that could enable more human-like AI in future video games.  This app “sees” what a human player would see when playing a first-person shooter game, and can track features so as to move around in the game.  The idea is to have a computer competitor that doesn’t have “privileged” information.</li>
<li>Second place:  Went to the designer of an application that used the Microsoft<sup>®</sup> Kinect camera along with the AMD APU processor to do real-time video effects.</li>
<li>Third place:  Went to the designer of an application that takes adaptive shadow detection to the next level and has it “teach” itself.</li>
<li>Fourth place:  Went to the designer of a numerical simulation of an x-ray generator. </li>
</ol>
<p>The end result is that we got some amazing apps that took advantage of the <a href="http://blogs.amd.com/developer/2011/08/01/cpu-to-gpu-data-transfers-exceed-15gbs-using-apu-zero-copy-path/">very low overhead</a> of data transfers between the CPU and GPU on the AMD APU architecture.  More importantly, it showed that the low-power, small-form-factor AMD APU has the power to be used as a personal, portable high-performance computer.</p>
<p>Visit the <a href="http://developer.amd.com/community/events/pages/AMDOpenCL%e2%84%a2CodingCompetition.aspx">AMD OpenCL™ Competition webpage</a> to learn more about the winning entries and watch video demos.</p>
<p><strong><em>Sharon Troia is a Developer Relations Engineer at AMD.</em></strong><em> Her postings are her own opinions and may not represent AMD’s positions, strategies or opinions. Links to third party sites, and references to third party trademarks, are provided for convenience and illustrative purposes only. Unless explicitly stated, AMD is not responsible for the contents of such links, and no third party endorsement of AMD or any of its products is implied.</em></p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.amd.com/developer/2012/02/23/amd-opencl%e2%84%a2-coding-competition-results/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://blogs.amd.com/developer/2012/02/23/amd-opencl%e2%84%a2-coding-competition-results/</feedburner:origLink></item>
	</channel>
</rss>

