<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/rss2full.xsl"?><?xml-stylesheet type="text/css" media="screen" href="http://feeds.feedburner.com/~d/styles/itemcontent.css"?><rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0" version="2.0">

<channel>
	<title>Adventures in HttpContext</title>
	
	<link>http://blog.michaelhamrah.com</link>
	<description>All the stuff after "Hello, World!"</description>
	<lastBuildDate>Sun, 05 May 2013 22:11:17 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.5.1</generator>
		<atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" type="application/rss+xml" href="http://feeds.feedburner.com/AdventuresInHttpcontext" /><feedburner:info uri="adventuresinhttpcontext" /><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com/" /><item>
		<title>Updating Flickr Photos with Gpx Data using Scala: Getting Started</title>
		<link>http://feedproxy.google.com/~r/AdventuresInHttpcontext/~3/K56v6U0QZ4U/</link>
		<comments>http://blog.michaelhamrah.com/2013/05/updating-flickr-photos-with-gpx-data-using-scala-getting-started/#comments</comments>
		<pubDate>Sun, 05 May 2013 22:11:17 +0000</pubDate>
		<dc:creator>Michael</dc:creator>
				<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://blog.michaelhamrah.com/?p=864</guid>
		<description><![CDATA[If you read this blog you know I&#8217;ve just returned from six months of travels around Asia, documented on our tumblr, The Great Big Adventure with photos on Flickr. Even though my camera doesn&#8217;t have a GPS, I realized toward the second half of the trip I could mark GPS waypoints and write a program [...]]]></description>
				<content:encoded><![CDATA[<p>If you read this blog you know I&#8217;ve just returned from six months of travels around Asia, documented on our tumblr, <a href="http://thegreatbigadventure.tumblr.com">The Great Big Adventure</a> with photos on <a href="http://flickr.com/hamrah">Flickr</a>.  Even though my camera doesn&#8217;t have a GPS, I realized toward the second half of the trip I could mark GPS waypoints and write a program to link that data later.  I decided to write this little app in Scala, a language I&#8217;ve been learning since my return.  The app is still a work in progress, but instead of one long post I&#8217;ll spread it out as I go along.</p>

<p><span id="more-864"></span></p>

<h2>The Workflow</h2>

<p>When I took a photo I usually marked the location with a waypoint in my GPS.  I accumulated a set of around 1000 of these points spread out over three gpx (xml) files.  My plan is to:</p>

<ol>
<li>Read in the three gpx files and combine them into a distinct list.</li>
<li>For each day with at least one gpx point, get all of my flickr images for that date.</li>
<li>For each image, find the waypoint timestamp with the least difference in time.</li>
<li>Update that image with the waypoint data on Flickr.</li>
</ol>
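
<p>Step 3 is the heart of the matching.  Here is a minimal sketch of it, written in Python for brevity rather than the Scala the app uses; the <code>Waypoint</code> fields, the <code>nearest_waypoint</code> helper, and the sample coordinates are all illustrative assumptions, not code from the app.</p>

```python
from datetime import datetime, timezone
from typing import NamedTuple


class Waypoint(NamedTuple):
    """One GPS waypoint; field names are illustrative, not the gpx schema."""
    lat: float
    lon: float
    time: datetime


def nearest_waypoint(taken_at, waypoints):
    """Return the waypoint whose timestamp is closest to taken_at."""
    return min(waypoints, key=lambda wp: abs((wp.time - taken_at).total_seconds()))


# Hypothetical sample data: two waypoints marked on the same day.
waypoints = [
    Waypoint(27.7172, 85.3240, datetime(2013, 1, 10, 9, 0, tzinfo=timezone.utc)),
    Waypoint(27.6710, 85.4298, datetime(2013, 1, 10, 14, 30, tzinfo=timezone.utc)),
]

# A photo taken at 13:45 is 45 minutes from the second waypoint
# and almost five hours from the first, so the second one is chosen.
photo_taken = datetime(2013, 1, 10, 13, 45, tzinfo=timezone.utc)
match = nearest_waypoint(photo_taken, waypoints)
print(match.lat, match.lon)
```

<p>A linear scan with <code>min</code> is plenty for around 1000 waypoints.  Note that it raises <code>ValueError</code> on an empty list, so days without any waypoints need to be filtered out first (step 2).</p>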

<h2>Getting Started</h2>

<p>If you&#8217;re going to be doing anything with Scala, learning <a href="http://scala-sbt.org">sbt</a> is essential.  Luckily, it&#8217;s pretty straightforward, but the documentation across the internet is somewhat inconsistent.  As of this writing, <a href="http://twitter.github.io/scala_school/sbt.html">Twitter&#8217;s Scala School SBT Documentation</a>, which I used as a reference to get started, incorrectly states that SBT creates a template for you.  It no longer does; the preferred approach is to use <a href="https://github.com/n8han/giter8">giter8</a>, an excellent templating tool.  I created <a href="https://github.com/mhamrah/sbt.g8">my own simplified version</a> which is based on the excellently documented <a href="https://github.com/ymasory/sbt.g8">template by Yuvi Masory</a>.  Some of the versions in build.sbt are outdated, but it&#8217;s worthwhile reading through the code to get a feel for the Scala and SBT ecosystem.  The g8 project also contains a good working example of custom sbt commands (like g8-test).  One gotcha with SBT: if you change your build.sbt file, you must call <em>reload</em> in the sbt console.  Otherwise, your new dependencies will not be picked up.  For rubyists this is similar to running <em>bundle update</em> after changing your gemfile.</p>

<h2>Testing</h2>

<p>I&#8217;m a big fan of TDD, and strive for a test-first approach.  It&#8217;s easy to get a feel for the small stuff in the scala repl, but orchestration is what programming is all about, and TDD allows you to design and thoroughly test functionality in a repeatable way.  The two main libraries are <a href="https://code.google.com/p/specs/">specs</a> (actually, it&#8217;s now <a href="http://etorreborre.github.io/specs2/">specs2</a>) and <a href="http://www.scalatest.org/">ScalaTest</a>.  I originally went with specs2.  It was fine, but I wasn&#8217;t too impressed with the output and not thrilled with the matchers.  I believe these are all customizable, but to get a better feel for the ecosystem I switched to ScalaTest.  I like ScalaTest&#8217;s default output better, and the flexible composition of testing styles (I&#8217;m using FreeSpec) and matchers (ShouldMatchers) provides a great platform for testing.  Luckily, both specs2 and ScalaTest integrate with SBT, which provides continuous testing and growl support, so you don&#8217;t need to fully commit to either one too early.</p>

]]></content:encoded>
			<wfw:commentRss>http://blog.michaelhamrah.com/2013/05/updating-flickr-photos-with-gpx-data-using-scala-getting-started/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://blog.michaelhamrah.com/2013/05/updating-flickr-photos-with-gpx-data-using-scala-getting-started/</feedburner:origLink></item>
		<item>
		<title>Six Months of Computer Science Without Computers</title>
		<link>http://feedproxy.google.com/~r/AdventuresInHttpcontext/~3/coj8EntI06U/</link>
		<comments>http://blog.michaelhamrah.com/2013/04/six-months-of-computer-science-without-computers/#comments</comments>
		<pubDate>Mon, 22 Apr 2013 11:00:02 +0000</pubDate>
		<dc:creator>Michael</dc:creator>
				<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://blog.michaelhamrah.com/?p=848</guid>
		<description><![CDATA[A few weeks ago I returned from a six month trip around Asia. I didn&#8217;t have a computer while abroad, but I was able to catch up on several tech books I never had time for previously. Reading about programming without actually programming was an interesting and rewarding circumstance. It provided a unique mental model: [...]]]></description>
				<content:encoded><![CDATA[<p>A few weeks ago I returned from <a href="http://thegreatbigadventure.tumblr.com">a six month trip around Asia</a>.  I didn&#8217;t have a computer while abroad, but I was able to catch up on several tech books I never had time for previously.  Reading about programming without actually programming was an interesting and rewarding circumstance.  It provided a unique mental model: it was no longer about &#8220;how you do this&#8221; but about &#8220;why would you do this&#8221;.  Accomplishment of a task via implementation was not an end goal.  The end goal was simply absorbing information; once read, it didn&#8217;t need to be applied.  It only needed to be reasoned about and hypothetically applied under a specific situation (which I usually did on a trek or on a beach).  Before I would have been eager to try it out, hacking away, but without a computer, I couldn&#8217;t.  It was liberating.  Given a problem, and a set of constraints, what&#8217;s the ideal solution?  I realize this is somewhat of an ivory-tower mentality, however, I also realized some of the best software has emerged from an idealism to solve problems in an opinionated way.  Sometimes we are too consumed by the here-and-now we fail to step back for the bigger picture.  Conversely, we hold onto our ideals and fail to adapt to changing circumstances.</p>

<p><span id="more-848"></span></p>

<p>My favorite aspect of learning technology while traveling abroad did not come from any book or video.  A large part of computer science is about optimizing systems under the pressure of constraints.  Efficient algorithms, clean code, improving performance.  The world is full of sub-optimal processes.  Burmese hotels, the Lao transportation system, and Nepalese immigration to name a few.  On a larger scale, sub-optimal problems are created by geographic, socio-economic, or political constraints.  People try the best they can to improve their way of life, and unfortunately, the processes are often &#8220;implemented&#8221; with a &#8220;naïve&#8221; solution.  Some are also inspiring.  It was powerful to see these systems up close, with cultural and historical factors so foreign.  One thing is certain: when you optimize for efficiency, everyone wins.</p>

<p>Below is a selection of books and resources I found particularly interesting.  I encourage you to check them out, hopefully away from a computer in a foreign land:</p>

<p><a href="http://programmer.97things.oreilly.com/wiki/index.php/97_Things_Every_Programmer_Should_Know">97 Things Every Programmer Should Know</a> : A great selection of tidbits from a variety of sources.  Nothing new for the experienced programmer, but reading through the sections is a great refresher to keep core principles fresh.  Worthwhile to randomly select a chapter now and again for those &#8220;oh yeah&#8221; moments.</p>

<p><a href="http://shop.oreilly.com/product/9780596510046.do">Beautiful Code</a> by Andy Oram and Greg Wilson: My favorite book.  Not so much about code, but the insight about solving problems makes it a great read.  I appreciate the intelligent thought process which went into some of the chapters.  Python&#8217;s hashtable implementation and debugging prioritization in the Linux kernel are two highlights.</p>

<p><a href="http://shop.oreilly.com/product/0636920022626.do">Exploring Everyday Things with R and Ruby</a> by Sau Sheong Chang:  This is a short book with great content.  You only need an elementary knowledge of programming and mathematics to appreciate the concepts.  It&#8217;s also a great way to get a taste of R.  The book covers a variety of topics, from statistics to machine learning and simulations.  My favorite aspect is how to use modeling to verify a hypothesis or create a simulation.  The chapters involving emergent behavior are particularly interesting.</p>

<p><a href="http://shop.oreilly.com/product/0636920018483.do">Machine Learning for Hackers</a> by Drew Conway and John Myles White: I&#8217;ve been interested in machine learning for a while, and I was very happy with this read.  Far more technical and mathematical than <em>Exploring Everyday Things</em>, this book digs into supervised and unsupervised learning and several aspects of statistics.  If you&#8217;re interested in data science and are comfortable with programming, this book is for you.</p>

<p><a href="http://www.manning.com/raychaudhuri/">Scala in Action</a> by Nilanjan Raychaudhuri:  Scala and Go have been on my radar for a while as new languages to learn.  It&#8217;s funny to learn a new programming language without being able to test-drive it, but I appreciated the separation.  My career has largely been focused on OOP: leveraging design patterns, class composition, SOLID principles, enterprise architecture.  After reading this book I realize I was missing out on great functional programming paradigms I was only unconsciously using.  Languages like Clojure and Haskell are gaining steam for a radically different approach to OOP, and Scala provides a nice balance between the two.  It&#8217;s also wonderfully expressive: traits, the type system, and for-comprehensions are beautiful building blocks for managing complex behavior.  Since returning I&#8217;ve been doing Scala full-time and couldn&#8217;t be happier.  It&#8217;s everything you need from a statically typed language with everything you want from a dynamic one (well, there&#8217;s still no method-missing, at least not yet).  I looked at a few Scala books and this is easily at the top of the list.  Nilanjan does an excellent job balancing language fundamentals with applied patterns.</p>

<p><a href="http://shop.oreilly.com/product/0636920014348.do">HBase: The Definitive Guide</a> by Lars George: I&#8217;ve been deeply interested in distributed databases and performance for some time.  I purchased this book a few years ago when first exploring NoSQL databases.  Since then, Cassandra has eclipsed the distributed hashtable family of databases (Riak, HBase, Voldemort) but I found this book a great read.  No matter what implementation you go with, this book will help you think in a column-orientated way, offering great insight into the architectural tradeoffs which went into HBase&#8217;s design.  At the very least, this book will give you a solid foundation to compare against other BigTable/Dynamo clones.</p>

<p><a href="http://www.aosabook.org/en/index.html">The Architecture of Open Source Applications</a>:  I was excited when I stumbled upon this website.  It offers a plethora of information from elite contributors.  The applied-practices and deep architectural insight are valuable lessons to learn from.  <a href="http://www.aosabook.org/en/nginx.html">Andrew Alexeev on Nginx</a>, <a href="http://www.aosabook.org/en/distsys.html">Kate Matsudaira on Scalable Web Architecture</a> and <a href="http://www.aosabook.org/en/zeromq.html">Martin Sústrik on ZeroMQ</a> are highlights.</p>

<h2>iTunes U</h2>

<p>I was also able to check out some courses on iTunes U while traveling.  <a href="http://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-172-performance-engineering-of-software-systems-fall-2010/index.htm">The MIT OCW Performance Engineering of Software Systems</a> was my favorite. Prof. Saman Amarasinghe and Prof. Charles Leiserson were both entertaining lecturers, and the course provided great insight into memory management, parallel programming, hardware architecture, and bit hacking.  I also watched several lectures on algorithms, giving me a newfound appreciation for Big-O notation (I wish I remembered more while on the job interview circuit).  I&#8217;ve been gradually neglecting the importance of algorithmic design since graduating ten years ago, but found revisiting sorting algorithms, dynamic programming, and graph algorithms refreshing.  Focusing on how well code runs is as important as how well it&#8217;s written.  Like most things, there&#8217;s a naïve brute-force solution and an elegant, efficient other solution.  You may not know what the other solution is, but knowing there&#8217;s one lurking behind the curtain will make you a better engineer.</p>

<p>So, if you can (and you definitely can!) take a break, grab a book, read it distraction free, gaze out into space and think.  You&#8217;ll like what you find!</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.michaelhamrah.com/2013/04/six-months-of-computer-science-without-computers/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://blog.michaelhamrah.com/2013/04/six-months-of-computer-science-without-computers/</feedburner:origLink></item>
		<item>
		<title>SPDY Slide Deck</title>
		<link>http://feedproxy.google.com/~r/AdventuresInHttpcontext/~3/vsU7f2Lf2DE/</link>
		<comments>http://blog.michaelhamrah.com/2013/04/spdy-slide-deck/#comments</comments>
		<pubDate>Sun, 14 Apr 2013 20:31:35 +0000</pubDate>
		<dc:creator>Michael</dc:creator>
				<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://blog.michaelhamrah.com/?p=838</guid>
		<description><![CDATA[I recently gave a talk on SPDY, the new protocol which will serve as the foundation for HTTP 2.0. SPDY introduces some interesting features to solve current limitations with how HTTP 1.1 sits on top of TCP. Check out the deck for a high-level overview, with links.]]></description>
				<content:encoded><![CDATA[<p>I recently gave a talk on <a href="http://www.chromium.org/spdy">SPDY</a>, the new protocol which will serve as the foundation for HTTP 2.0.  SPDY introduces some interesting features to solve current limitations with how HTTP 1.1 sits on top of TCP.  <a href="http://www.michaelhamrah.com/spdy/">Check out the deck for a high-level overview, with links.</a></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.michaelhamrah.com/2013/04/spdy-slide-deck/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://blog.michaelhamrah.com/2013/04/spdy-slide-deck/</feedburner:origLink></item>
		<item>
		<title>Choosing a Technology: You’re Asking the Wrong Question</title>
		<link>http://feedproxy.google.com/~r/AdventuresInHttpcontext/~3/Rp5QwVYXcLo/</link>
		<comments>http://blog.michaelhamrah.com/2013/03/choosing-a-technology-youre-asking-the-wrong-question/#comments</comments>
		<pubDate>Wed, 27 Mar 2013 19:51:49 +0000</pubDate>
		<dc:creator>Michael</dc:creator>
				<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://blog.michaelhamrah.com/?p=834</guid>
		<description><![CDATA[When making a choice in the tech world there are two wide-spread approaches: &#8220;What&#8217;s better, X or Y?&#8221; and &#8220;Should I use xyz?&#8221;. The &#8220;or&#8221; debate is always an entertaining topic usually ending in an absurdly hilarious flame war. The &#8220;Should I use xyz?&#8221; is a subtler, more prevalent question in the tech community leading [...]]]></description>
				<content:encoded><![CDATA[<p>When making a choice in the tech world there are two wide-spread approaches:  &#8220;What&#8217;s better, X or Y?&#8221; and &#8220;Should I use xyz?&#8221;. The &#8220;or&#8221; debate is always an entertaining topic usually ending in an absurdly hilarious flame war.  The &#8220;Should I use xyz?&#8221; is a subtler, more prevalent question in the tech community leading to an extensive amount of discourse.  Fairly rational, usually with some good insight, but still a time consuming task.  I&#8217;ve fallen victim to both approaches when exploring a technology decision.  What I realized is I&#8217;m asking the wrong question.  There are only two things I should ask:</p>

<p><span id="more-834"></span></p>

<p>1) What problem do I need to solve?
2) How do I want to solve it?</p>

<p>Once I take this approach I have an opinionated basis for decision-making and I have a clear direction in how to make that decision.  Frameworks&#8211;web or javascript&#8211;are excellent examples of taking this approach.  Most of these frameworks were born on the simple premise of solving a problem in an opinionated way.  Backbone takes a bare-bones approach to a front-end, event-driven structure; Ember offers a robust, &#8220;things just happen&#8221; framework.  Sinatra and co. offer an http-first approach to development.  Rails and variants are opinionated in web application structure.  Do you agree with that approach?  Yes, excellent! No? Find something else or roll your own.</p>

<p>Don&#8217;t know the answer?  That&#8217;s okay too.  Most beginners want to make the &#8220;right&#8221; choice on what to learn.  But the thing is there is no &#8220;right&#8221; answer.  For a beginner, choosing python vs. ruby vs. php vs. scala wastes effort.  Just build something using something: you&#8217;ll soon develop your own opinions, with &#8220;how easy is this to learn&#8221; probably the first.  Next, when your rails codebase is out of control and you&#8217;re drowning in method_missing issues, maybe you&#8217;ll want a more granular, service-orientated approach and the type-safety of Scala.  Maybe not&#8230; But you&#8217;ll have a valid problem to solve and a reasonable opinion to go with it.</p>

<p>I suggest reading <a href="http://www.aosabook.org/en/nginx.html">Andrew Alexeev&#8217;s account of why NGINX was built</a> and <a href="https://www.varnish-cache.org/trac/wiki/ArchitectNotes">Poul-Henning Kamp&#8217;s rationale on how you write a modern application</a>.  Like so many others, these incredible open-source systems were born from a problem and the way someone wanted it solved.  But those systems didn&#8217;t happen overnight and the authors didn&#8217;t start from scratch.  They spent years encountering, learning, and dealing with problems in their respective spaces.  They knew the problem domain well, they knew how they wanted the problem solved, and they solved it.</p>

<p>So put your choice in a context and don&#8217;t sweat the details which are irrelevant to the task at hand.  When you need to know those details you&#8217;ll know them, and when you hit problems you&#8217;ll know how you want them solved.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.michaelhamrah.com/2013/03/choosing-a-technology-youre-asking-the-wrong-question/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://blog.michaelhamrah.com/2013/03/choosing-a-technology-youre-asking-the-wrong-question/</feedburner:origLink></item>
		<item>
		<title>Markdown Powered Resume with CSS Print Styles</title>
		<link>http://feedproxy.google.com/~r/AdventuresInHttpcontext/~3/adsXP8PrewQ/</link>
		<comments>http://blog.michaelhamrah.com/2013/03/markdown-powered-resume-with-css-print-styles/#comments</comments>
		<pubDate>Sat, 23 Mar 2013 20:03:42 +0000</pubDate>
		<dc:creator>Michael</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[markdown]]></category>
		<category><![CDATA[resume]]></category>

		<guid isPermaLink="false">http://blog.michaelhamrah.com/?p=828</guid>
		<description><![CDATA[As much as I wish a LinkedIn profile could be a substitute for a resume, it&#8217;s not, and I needed an updated resume. My previous resume was done some time ago with InDesign when I was on a design-tools kick. It worked well, but InDesign isn&#8217;t the best choice for a straight forward approach to [...]]]></description>
				<content:encoded><![CDATA[<p>As much as I wish a LinkedIn profile could be a substitute for a resume, it&#8217;s not, and I needed an updated resume.  My previous resume was done some time ago with InDesign when I was on a design-tools kick.  It worked well, but InDesign isn&#8217;t the best choice for a straight forward approach to a resume and I was not interested in going back to word.  So in honor of my friend Karthik&#8217;s <a href="http://kufli.blogspot.com/2013/02/evolution-of-my-resume-karthik.html">programming themed resume</a> I had an idea: program my resume.  My requirements were simple:</p>

<p><span id="more-828"></span></p>

<ul>
<li>Easy to edit:  I should be able to update and output with minimal effort.</li>
<li>Easy to design: Something simple, but not boilerplate.</li>
<li>Export to Html and PDF: For easy distribution.</li>
</ul>

<p>I&#8217;m a big fan of <a href="http://daringfireball.net/projects/markdown/syntax">Markdown</a> and happy to see the prevalence of Markdown across the web, however fragmented.  I use Markdown to publish this blog and felt it would work well for writing a resume.  The only problem is layout: you have minimal control over structural html elements which can make aspects of design difficult.  For writing articles this isn&#8217;t a problem but when you need structural markup for CSS it can be limiting.  Luckily I found <a href="https://github.com/bhollis/maruku">Maruku</a>, a ruby-based markdown interpreter which supports <a href="http://michelf.ca/projects/php-markdown/extra/">PHP Markdown Extra</a> and a <a href="http://maruku.rubyforge.org/proposal.html">new meta-data syntax</a> for adding id, css, and div elements to a page.  It does take away from Markdown&#8217;s simplicity but adds enough structure for design.  Combined with CSS I had everything I needed to fulfill my requirements.</p>

<p>My <a href="https://github.com/mhamrah/mlh.com/blob/master/michael-hamrah-resume.md">markdown resume</a> is on GitHub.  I was surprised it rendered well with GitHub-Flavored Markdown despite the extraneous Maruku elements.  I knew I was on the right track.  Maruku lets you add your own stylesheets to the html output which I used for <a href="http://www.michaelhamrah.com/michael-hamrah-resume.html">posting online</a>.  One simple command gets me from markdown to ready-to-publish html.  Exactly what I wanted.</p>

<p>Maruku supports pdf output as well, but requires a heavy LaTeX install which I wasn&#8217;t happy with.  I also wasn&#8217;t impressed with the LaTeX PDF output.  Luckily there&#8217;s an easy alternative: printing to PDF.  I used some <a href="https://github.com/mhamrah/mlh.com/blob/master/scss/resume.scss">SASS media query overrides</a> on top of Html 5 Boilerplate&#8217;s default styles to control the print layout in the way I wanted.  You can even specify page breaks and print margins via CSS.  I favored Safari&#8217;s pdf output over Chrome&#8217;s for the sole reason that Safari automatically embedded custom fonts in the final PDF.</p>

<p>At the end of the day I realized I probably didn&#8217;t need to add explicit divs to Markdown; I could have gotten the layout I wanted with just vanilla Markdown and CSS3 queries.  I also could have had semantically better markup if I used HAML to add &lt;section&gt; tags instead of divs where appropriate, but HAML would have added a considerable amount of extraneous information to the markup.  I&#8217;m also not sure editing the raw HAML text would have been as easy as Markdown.</p>

<p>Ultimately, it&#8217;s all a tradeoff.  GitHub flavored markdown, Markdown Here and other interpreters support fenced code blocks; I like the idea of using fenced blocks to produce &lt;section&gt; elements, for both semantic correctness and layout in the html output.  Unfortunately there&#8217;s no official Markdown spec and support is somewhat fragmented across various implementations, but <a href="http://www.codinghorror.com/blog/2012/10/the-future-of-markdown.html">hopefully it will come together soon</a>.  Until then, if you need it, you can always fork.  Luckily I didn&#8217;t have to take it that far.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.michaelhamrah.com/2013/03/markdown-powered-resume-with-css-print-styles/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://blog.michaelhamrah.com/2013/03/markdown-powered-resume-with-css-print-styles/</feedburner:origLink></item>
		<item>
		<title>Scalability comparison of WordPress with NGINX/PHP-FPM and Apache on an ec2-micro instance.</title>
		<link>http://feedproxy.google.com/~r/AdventuresInHttpcontext/~3/9h9_vcz4XXI/</link>
		<comments>http://blog.michaelhamrah.com/2013/03/scalability-comparison-of-wordpress-with-nginxphp-fcm-and-apache-on-an-ec2-micro-instance/#comments</comments>
		<pubDate>Sun, 17 Mar 2013 21:59:52 +0000</pubDate>
		<dc:creator>Michael</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[apache]]></category>
		<category><![CDATA[architecture]]></category>
		<category><![CDATA[nginx]]></category>
		<category><![CDATA[performance]]></category>
		<category><![CDATA[php]]></category>
		<category><![CDATA[scalability]]></category>

		<guid isPermaLink="false">http://blog.michaelhamrah.com/?p=815</guid>
		<description><![CDATA[For the past few years this blog ran apache + mod_php on an ec2-micro instance. It was time for a change; I&#8217;ve enjoyed using nginx in other projects and thought I could get more out of my micro server. I went with a php-fpm/nginx combo and am very surprised with the results. The performance charts [...]]]></description>
				<content:encoded><![CDATA[<p>For the past few years this blog ran apache + mod_php on an ec2-micro instance.  It was time for a change; I&#8217;ve enjoyed using nginx in other projects and thought I could get more out of my micro server.  I went with a php-fpm/nginx combo and am very surprised with the results.  The performance charts are below; for php the response times varied little under minimal load, but nginx handled heavy load far better than apache.  Overall throughput with nginx was phenomenal from this tiny server.  The result for static content was even more impressive: apache effectively died after ~2000 concurrent connections and 35k total pages killing the server; nginx handled the load to 10,000 very well and delivered 160k successful responses.</p>

<p><span id="more-815"></span></p>

<p>Here are the <a href="http://loader.io">loader.io</a> results for static content from http://www.michaelhamrah.com, comparing apache with nginx.  I suggest clicking through and exploring the charts:</p>

<div style="width: 600px;">
<iframe width='600' height='300' frameborder='0' src='//share.loader.io/results/f1c357b13b1f554eef534b79866eb5ce/widget'></iframe>
<div style="width: 100%; text-align: right;">
<a href="http://loader.io/results/f1c357b13b1f554eef534b79866eb5ce" target="_blank"  style="padding: 0 10px 10px 0; font-family: Arial, 'Helvetica Neue', Helvetica, sans-serif; font-size: 14px;">View on loader.io</a>
</div></div>

<p>Apache only handled 33.5k successful responses up to about 1,300 concurrent connections, and died pretty quickly.  Nginx did far better:</p>

<div style="width: 600px;">
<iframe width='600' height='300' frameborder='0' src='//share.loader.io/results/9430bdfcab50f31dc66f3ea3014beb84/widget'></iframe>
<div style="width: 100%; text-align: right;">
<a href="http://loader.io/results/9430bdfcab50f31dc66f3ea3014beb84" target="_blank"  style="padding: 0 10px 10px 0; font-family: Arial, 'Helvetica Neue', Helvetica, sans-serif; font-size: 14px;">View on loader.io</a>
</div></div>

<p>160k successful responses with a 22% error rate and avg. response time of 142ms.  Not too shabby.  The apache run effectively killed the server and required a full reboot as ssh was unresponsive.  Nginx barely hiccuped.</p>

<p>The results of my wordpress/php performance test are also interesting.  I only did 1000 concurrent users hitting blog.michaelhamrah.com.  Here&#8217;s the apache result:</p>

<div style="width: 600px;">
<iframe width='600' height='300' frameborder='0' src='//share.loader.io/results/210867953c97cdd2dd4308dce17bcae3/widget'></iframe>
<div style="width: 100%; text-align: right;">
<a href="http://loader.io/results/210867953c97cdd2dd4308dce17bcae3" target="_blank"  style="padding: 0 10px 10px 0; font-family: Arial, 'Helvetica Neue', Helvetica, sans-serif; font-size: 14px;">View on loader.io</a>
</div></div>

<p>There was a 21% error rate with 13.7k requests served and a 237ms average response time (I believe the lower average is due to errors).  Overall not too bad for an ec2-micro instance, but the error rate was quite high and nginx again did far better:</p>

<div style="width: 600px;">
<iframe width='600' height='300' frameborder='0' src='//share.loader.io/results/631e11ff9206c6c7a3820c891380c9a3/widget'></iframe>
<div style="width: 100%; text-align: right;">
<a href="http://loader.io/results/631e11ff9206c6c7a3820c891380c9a3" target="_blank"  style="padding: 0 10px 10px 0; font-family: Arial, 'Helvetica Neue', Helvetica, sans-serif; font-size: 14px;">View on loader.io</a>
</div></div>

<p>A total of 19k successes with a 0% error rate.  The average response time was a little higher than apache, but nginx did serve far more responses.  I also get a kick out of the response time line between the two charts.  Apache is fairly choppy as it scales up, while nginx increases smoothly and evens out when the concurrent connections plateau.  That&#8217;s what scalability should look like!</p>

<p>There are plenty of guides online showing how to get set up with nginx/php-fpm.  <a href="http://codex.wordpress.org/Nginx">The Nginx guide on WordPress Codex</a> is the most thorough, but there&#8217;s a <a href="http://todsul.com/install-configure-php-fpm">straightforward nginx/php guide on Tod Sul</a>.  I also relied on an <a href="http://dak1n1.com/blog/12-nginx-performance-tuning">nginx tuning guide from Dakini</a> and <a href="http://calendar.perfplanet.com/2012/using-nginx-php-fpmapc-and-varnish-to-make-wordpress-websites-fly/">this nginx/wordpress tuning guide from perfplanet</a>.  They both have excellent information.  I also think you should check out the <a href="https://github.com/h5bp/server-configs/blob/master/nginx/nginx.conf">html5 boilerplate nginx conf files</a> which have great bits of information.</p>

<p>If you&#8217;re setting this up yourself, start simple and work your way up.  The guides above have varying degrees of information and various configuration options which may conflict with each other.  Here are some tips:</p>

<ol>
<li>Decide if you&#8217;re going with a socket or tcp/ip connection between nginx + php-fpm.  A socket connection is slightly faster and local to the system, but a tcp/ip connection is (marginally) easier to set up and good if you are spanning multiple nodes (you could create a php app farm to complement an nginx front-facing web farm).<br />
I chose to go with the socket approach between nginx/php-fpm.  It was relatively painless, but I did hit a snag passing nginx requests to php.  I kept getting a &#8220;no input file specified&#8221; error.  It turns out it was a simple permissions issue:  the default php-fpm user was different from the nginx user the webserver runs under.  Which leads me to:</li>
<li>Plan your users.  Security issues are annoying, so make sure file and app permissions are all in sync.</li>
<li>Check your settings!  Read through default configuration options so you know what&#8217;s going on.  For instance you may end up running more worker processes in your nginx instance than available CPUs, killing performance.  Well documented configuration files are essential to tuning.</li>
<li>Plan for access and error logging.  If things go wrong during the setup, you&#8217;ll want to know what&#8217;s going on and if your server is getting requests.  You can turn access logs off later.</li>
<li>Get your app running, test, and tune.  If you change too many configuration settings at once you&#8217;ll most likely hit a snag.  I only did a moderate amount of tuning; nginx configuration files vary considerably, so again it&#8217;s a good idea to read through the options and make your own call.  Ditto for php-fpm.</li>
</ol>

<p>I am really happy with the idea of running php as a separate process.  Running php as a daemon has many benefits: you have a dedicated process you can monitor and recycle for php without affecting your web server.  Pooling apps allows you to tune them individually.  You&#8217;re also not tying yourself to a particular web server; php-fpm can run fine with apache.  In TCP mode you can even offload your web server to a separate node.  At the very least, you can distinguish php usage from web server usage.</p>

<p>So my only question is why would anyone still use apache?</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.michaelhamrah.com/2013/03/scalability-comparison-of-wordpress-with-nginxphp-fcm-and-apache-on-an-ec2-micro-instance/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		<feedburner:origLink>http://blog.michaelhamrah.com/2013/03/scalability-comparison-of-wordpress-with-nginxphp-fcm-and-apache-on-an-ec2-micro-instance/</feedburner:origLink></item>
		<item>
		<title>How to Handle a Super Bowl Size Spike in Web Traffic</title>
		<link>http://feedproxy.google.com/~r/AdventuresInHttpcontext/~3/bF2723PY9_I/</link>
		<comments>http://blog.michaelhamrah.com/2013/02/how-to-handle-a-super-bowl-size-spike-in-web-traffic/#comments</comments>
		<pubDate>Wed, 06 Feb 2013 06:14:22 +0000</pubDate>
		<dc:creator>Michael</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[architecture]]></category>
		<category><![CDATA[Http]]></category>
		<category><![CDATA[performance]]></category>
		<category><![CDATA[scalability]]></category>
		<category><![CDATA[web]]></category>

		<guid isPermaLink="false">http://www.michaelhamrah.com/blog/?p=785</guid>
		<description><![CDATA[I was shocked to learn the number of sites which failed to handle the spike in web traffic during the Super Bowl. Most of these sites served static content and should have scaled easily with the use of CDNs. Scaling sites, even dynamic ones, is achievable with well known tools and techniques. The Problem is [...]]]></description>
				<content:encoded><![CDATA[<p>I was shocked to learn the number of <a href="http://www.yottaa.com/blog/bid/265815/Coke-SodaStream-the-13-Websites-That-Crashed-During-Super-Bowl-2013">sites which failed to handle the spike in web traffic during the Super Bowl</a>.  Most of these sites served static content and should have scaled easily with the use of CDNs.  Scaling sites, even dynamic ones, is achievable with well known tools and techniques.</p>

<p><span id="more-785"></span></p>

<h2>The Problem is Simple</h2>

<p>At a basic level accessing a web page is when one computer, the client, connects to a server and downloads some content.  A problem occurs when the number of people requesting content exceeds the ability to deliver content.  It&#8217;s just like a restaurant.  When there are too many customers people must wait to be served.  Staff becomes stressed and strained.  Computers are the same.  Excessive load causes things to break down.</p>

<h2>Optimization Comes in Three Forms</h2>

<p>To handle more requests there are three things you can do: produce (render) content faster, deliver (download) content faster and add more servers to handle more connections.  Each of these solutions has a limit.  Designing for these limits is architecting for scale.</p>

<p>A page is composed of different types of content: html, css and js.  This content is either dynamic (changes frequently) or static (changes infrequently).  Static content is easier to scale because you create it once and deliver it repeatedly.  The work of rendering is eliminated.  Static content can be pushed out to CDNs or cached locally to avoid redownloading.  Requests to origin servers are reduced or eliminated.  You can also download content faster with small payload sizes.  There is less to deliver if there is less markup and the content is compressed.  Less to deliver means faster download.</p>

<p>Dynamic content is trickier to cache because it is always changing.  Reuse is difficult because pages must be regenerated for specific users at specific times.  Scaling dynamic content involves database tuning, server side caching, and code optimization.  If you can render a page quickly you can deliver more pages because the server can move on to new requests.  Most often, at scale, you want to treat dynamic content like static content as best you can.</p>

<p>Adding more servers is usually the easiest way to scale but breaks down quickly.  The more servers you have the more you need to keep in sync and manage.  You may be able to add more web servers, but those web servers must connect to database servers.  Even powerful database servers can only handle so many connections and adding multiple database servers is complicated.  You may be able to add specific types of servers, like cache servers, to achieve the results you need without increasing your entire topology.</p>

<p>The more servers you have the harder it is to keep content fresh.  You may feel that adding servers will increase the load you can handle, but it will become expensive to both manage and run.  You may be able to achieve a similar result if you cut your response times, which also gives the end user a better experience.  If you understand the knobs and dials of your system you can tune properly.</p>

<h2>Make Assumptions</h2>

<p>Don&#8217;t be afraid to make assumptions about your traffic patterns.  This will help you optimize for your particular situation.  For most publicly facing websites traffic is anonymous.  This is particularly true during spikes like the Super Bowl.  Because you can deliver the same page to every anonymous user you effectively have static content for those users.  Cache controls determine how long content is valid and power HTTP accelerators and CDNs for distribution.  You don&#8217;t need to optimize for everyone; split your user base into groups and optimize for the majority.  Even relaxing cache rules on pages to a minute can shift the burden away from your application servers, freeing valuable resources.  Anonymous users will get the benefit of cached content with a quick download; dynamic users will have fast servers.</p>
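<p>As a rough illustration of splitting users into groups, here is a minimal Python sketch that picks a cache policy per request.  The function names and the one minute default are assumptions made for the example, not part of any particular framework:</p>

```python
def cache_control_for(is_anonymous, max_age_seconds=60):
    """Return a Cache-Control header value for a rendered page."""
    if is_anonymous:
        # Every anonymous user gets the same page, so shared caches
        # (CDNs, HTTP accelerators) may keep it for a short time.
        return f"public, max-age={max_age_seconds}"
    # Personalized pages must not be stored by shared caches.
    return "private, no-cache"


def response_headers(session_cookie):
    """Build response headers; session_cookie is None for anonymous users."""
    return {"Cache-Control": cache_control_for(session_cookie is None)}


print(response_headers(None))
print(response_headers("session=abc"))
```

<p>With a policy like this, anonymous requests can be answered by a cache for a minute at a time while signed-in users still reach the application servers.</p>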

<p>You can also create specific rendering pipelines for anonymous and known users for highly dynamic content.  If you can identify anonymous users early you may be able to avoid costly database queries, external API calls or page renders.</p>

<h2>Understand HTTP</h2>

<p>HTTP powers the web. The better you understand HTTP the better you can leverage tools for optimizing the web.  Specifically look at <a href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec13.html">http cache headers</a> which allow you to use web accelerators like Varnish and CDNs.  The Vary header will allow you to split anonymous and known users, giving you fine-grained control over who gets what.  Expiration headers determine content freshness.  The worst thing you can do is set cache headers to private on static content, which stops shared caches such as CDNs and accelerators from storing it.</p>
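<p>To make the Vary header concrete, here is a simplified Python sketch of how a cache might build its lookup key.  This is an assumption-laden illustration of the idea, not how Varnish or any specific CDN actually implements it:</p>

```python
def cache_key(method, url, request_headers, vary):
    """Build a cache key from the request line plus every header named in Vary."""
    # Header names are case-insensitive, so normalize before looking them up.
    normalized = {name.lower(): value for name, value in request_headers.items()}
    names = sorted(h.strip().lower() for h in vary.split(",") if h.strip())
    varied = tuple((name, normalized.get(name, "")) for name in names)
    return (method, url, varied)


# Two anonymous requests (no Cookie header) share one cached copy.
anonymous_a = cache_key("GET", "/", {"Accept": "text/html"}, "Cookie")
anonymous_b = cache_key("GET", "/", {"accept": "*/*"}, "Cookie")
# A request carrying a session cookie gets its own entry.
known_user = cache_key("GET", "/", {"Cookie": "session=abc"}, "Cookie")

print(anonymous_a == anonymous_b, anonymous_a == known_user)
```

<p>With <code>Vary: Cookie</code>, every request without a cookie maps to the same key, which is what lets a cache serve anonymous traffic without touching the application.</p>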

<h2>Try Varnish and ESI</h2>

<p><a href="http://www.varnish-cache.org">Varnish</a> is an HTTP accelerator.  It caches dynamic content produced from your website for efficient delivery.  Web frameworks usually have their own features for caching content, but Varnish allows you to bypass your application stack completely for faster response times.  You can deliver a pre-rendered dynamic page as if it were a static page sitting in memory for a greater number of connections.</p>

<p>Edge Side Includes allow you to mix static and dynamic content together.  If a page is 90% similar for everyone, you can cache the 90% in Varnish and have your application server deliver the other 10%.  This greatly reduces the work your app server needs to do.  ESIs are just emerging in web frameworks.  They will play a more prominent role in Rails 4.</p>

<h2>Use a CDN and Multiple Data Centers</h2>

<p>You don&#8217;t need to add more servers to your own data center.  You can leverage the web to fan work out across the Internet.  I talk more about CDN&#8217;s, the importance of edge locations and latency in my post <a href="http://www.michaelhamrah.com/blog/2012/01/building-for-the-web-understanding-the-network/">Building for the Web: Understanding the Network</a>.</p>

<p>Your application servers should be reserved for doing application-specific work which is unique to every request.  There are more efficient ways of delivering the same content to multiple people than processing a request top-to-bottom via a web framework.  Remember &#8220;the same&#8221; doesn&#8217;t mean the same indefinitely; it&#8217;s the same for whatever timeframe you specify.</p>

<p>If you run Varnish servers in multiple data centers you can effectively create your own CDN.  Your database and content may be on the east coast but if you run a Varnish server on the west coast an anonymous user in San Francisco will have the benefit of a fast response time and you&#8217;ve saved a connection to your app server.  Even if Varnish has to deliver 10% dynamic content via an ESI on the east coast it can leverage the fast connection between data centers.  This is much better than the end user hopping coast-to-coast themselves for an entire page.</p>

<p>Amazon&#8217;s Route 53 offers the ability to route requests to an optimal location.  There are other geo-aware DNS solutions.  If you have a multi-region setup you are not only building for resiliency, you are horizontally scaling your requests across data centers.  At massive scale even load balancers may become overloaded so round-robin via DNS becomes essential.  DNS may be a bottleneck as well.  If your DNS provider can&#8217;t handle the flood of requests trying to map your domain name to your IP address nobody can even get to your data center!</p>

<h2>Use Auto Scaling Groups or Alerting</h2>

<p>If you can take an action when things get rough you can better handle spikes.  Auto scaling groups are a great feature of AWS for adding capacity when some threshold is reached.  If you&#8217;re not on AWS, good monitoring tools will help you take action when things hit a danger zone.  If you design your application with auto-scaling in mind, leveraging load balancers for internal communication and avoiding state, you are in a better position to deal with traffic growth.  Scaling on demand saves money as you don&#8217;t need to run all your servers all the time.  Pinterest gave a talk explaining how it saves money by reducing its server farm at night when traffic is low.</p>

<h2>Compress and Serialize Data Across the Wire</h2>

<p>Page sizes can be greatly reduced if you enable compression.  Web traffic is mostly text which is easily compressible.  A 100kb page is a lot faster to download than a 1mb page.  Don&#8217;t forget about internal communication as well.  In today&#8217;s API driven world using efficient serialization protocols like protocol buffers can greatly reduce network traffic.  Most RPC tools support some form of optimal serialization.  SOAP was the rage in the early 2000s but XML is one of the worst ways to serialize data for speed.  Compressed content allows you to store more in cache and reduces network I/O as well.</p>
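<p>You can get a feel for how well text compresses with a few lines of Python.  The payload below is a made-up list of records, chosen only because it is repetitive in the way markup and JSON usually are; actual ratios depend entirely on your content:</p>

```python
import gzip
import json

# Hypothetical API payload: repetitive text, as most markup and JSON is.
records = [
    {"id": i, "title": "Super Bowl traffic spike", "status": "published"}
    for i in range(500)
]
raw = json.dumps(records).encode("utf-8")
compressed = gzip.compress(raw)

print(f"raw: {len(raw)} bytes, gzip: {len(compressed)} bytes")

# Decompression restores the payload exactly.
restored = json.loads(gzip.decompress(compressed))
```

<p>The same trade applies on the wire: the server spends a little cpu compressing and the client downloads far fewer bytes.</p>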

<h2>Shut Down Features</h2>

<p>A performance bottleneck may be caused by one particular feature.  When developing new features, especially on a high traffic site, the ability to shut down a misbehaving feature could be the quick solution to a bad problem.  Most high-traffic websites &#8220;leak&#8221; new features by deploying them to only 10% of their users to monitor behavior.  Once everything is okay they activate the feature everywhere.  Similar to determining page freshness for caches, determining available features under load can keep a site alive.  What&#8217;s more important: one specific feature or the entire system?</p>
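<p>A percentage rollout with a kill switch can be sketched in a few lines of Python.  The class and feature names here are hypothetical, and a real site would keep the flags in shared configuration rather than in memory:</p>

```python
import hashlib


class FeatureFlags:
    """In-memory flags; a real site would load these from shared config."""

    def __init__(self):
        self._rollout = {}  # feature name -> percentage of users (0-100)

    def set_rollout(self, feature, percent):
        self._rollout[feature] = max(0, min(100, percent))

    def kill(self, feature):
        # Kill switch: turn the feature off for everyone.
        self._rollout[feature] = 0

    def is_enabled(self, feature, user_id):
        percent = self._rollout.get(feature, 0)
        # Hash feature + user so each user lands in a stable bucket 0-99.
        digest = hashlib.sha256(f"{feature}:{user_id}".encode("utf-8")).hexdigest()
        bucket = int(digest, 16) % 100
        return bucket < percent


flags = FeatureFlags()
flags.set_rollout("new_search", 10)  # leak the feature to roughly 10% of users
enabled = sum(flags.is_enabled("new_search", user_id) for user_id in range(10000))
print(f"enabled for {enabled} of 10000 users")

flags.kill("new_search")  # the feature misbehaves under load: shut it down
```

<p>Because the bucket comes from a hash of the user id, a given user sees the same behavior on every request until the rollout percentage changes.</p>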

<h2>Non-Blocking I/O</h2>

<p>Asynchronous programming is a challenge and probably a last resort for scaling.  Sometimes servers break down without any visible threshold.  You may have seen a slow request but memory, cpu, and network levels are all okay.  This scenario is usually caused by blocking threads waiting on some form of I/O.  Blocked threads are plugs that clog your application.  They do nothing and prevent other things from happening.  If you call external web services, run long database queries or perform disk I/O beware of synchronous operations.  They are bottlenecks.  Asynchronous frameworks like node.js put asynchronous programming at the forefront of development, making them attractive for handling numerous concurrent connections.  Asynchronous programming also paves the way for queue-based architectures.  If every request is routed through a queue and processed by a worker the queue will help even out spikes in traffic.  The queue size will also determine how many workers you need.  It may be trickier to code but it&#8217;s how things scale.</p>
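<p>The queue and worker idea can be sketched with Python&#8217;s asyncio.  This is a toy, single-process illustration: <code>handle</code> stands in for real non-blocking I/O, and the worker count of four is an arbitrary assumption:</p>

```python
import asyncio


async def handle(request_id):
    # Stand-in for non-blocking I/O such as a database or web service call.
    await asyncio.sleep(0.01)
    return f"done:{request_id}"


async def worker(queue, results):
    while True:
        request_id = await queue.get()
        try:
            results.append(await handle(request_id))
        finally:
            queue.task_done()


async def serve(request_ids, worker_count=4):
    queue = asyncio.Queue()
    results = []
    workers = [asyncio.create_task(worker(queue, results)) for _ in range(worker_count)]
    for request_id in request_ids:
        queue.put_nowait(request_id)  # a spike simply makes the queue longer
    await queue.join()  # wait until every queued request is processed
    for task in workers:
        task.cancel()
    await asyncio.gather(*workers, return_exceptions=True)
    return results


results = asyncio.run(serve(range(20)))
print(len(results))
```

<p>While one request waits on I/O the workers pick up others, and a burst of requests grows the queue instead of tying up threads.</p>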

<h2>Think at Scale</h2>

<p>When dealing with a high-load environment nothing can be off the table.  What works for a few thousand users will grow out of control for a few million.  Even small issues will become exponentially problematic.</p>

<p>Scaling isn&#8217;t just about the tools to deal with load.  It&#8217;s about the decisions you make on how your application behaves.  The most important thing is determining page freshness for users.  The decisions for an up-to-the-second experience for every user are a lot different than an up-to-the-minute experience for anonymous users.  When dealing with millions of concurrent requests one will involve a lot of engineering complexity and the other can be solved quickly.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.michaelhamrah.com/2013/02/how-to-handle-a-super-bowl-size-spike-in-web-traffic/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		<feedburner:origLink>http://blog.michaelhamrah.com/2013/02/how-to-handle-a-super-bowl-size-spike-in-web-traffic/</feedburner:origLink></item>
		<item>
		<title>Embracing Test Driven Development for Speed</title>
		<link>http://feedproxy.google.com/~r/AdventuresInHttpcontext/~3/JxFk8JPaRdA/</link>
		<comments>http://blog.michaelhamrah.com/2013/02/embracing-test-driven-development-for-speed/#comments</comments>
		<pubDate>Mon, 04 Feb 2013 06:28:28 +0000</pubDate>
		<dc:creator>Michael</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[development]]></category>
		<category><![CDATA[TDD]]></category>
		<category><![CDATA[Testing]]></category>

		<guid isPermaLink="false">http://www.michaelhamrah.com/blog/?p=739</guid>
		<description><![CDATA[A few months ago I helped a developer looking to better embrace test driven development. The session was worthwhile and made me reflect on my journey with TDD. Writing tests is one thing. Striving for full test coverage, writing tests first and leveraging integration and unit tests is another. Some people find writing tests cumbersome [...]]]></description>
				<content:encoded><![CDATA[<p>A few months ago I helped a developer looking to better embrace test driven development.  The session was worthwhile and made me reflect on my journey with TDD.</p>

<p>Writing tests is one thing.  Striving for full test coverage, writing tests first and leveraging integration and unit tests is another.  Some people find writing tests cumbersome and slow.  Others may ignore tests for difficult scenarios or code spikes.  When first working with tests I felt the same way.  Over time I worked through issues and my feeling towards TDD changed.  The pain was gone and I worked more effectively.</p>

<p>TDD is about speed.  Speed of development and speed of maintenance.  Once you leverage TDD as a way to better produce code you&#8217;ve unlocked the promise of TDD: Code more, debug less.</p>

<p><span id="more-739"></span></p>

<h2>Stay In Your Editor</h2>

<p>How many times have you verified something works by firing up your browser in development?  Too many times.  You build, you wait for the app to start, you launch the browser, you click a link, you fill in forms, you hit submit.  Maybe there&#8217;s a breakpoint you step through or some trace statements you output.  How much time have you wasted going from coding to verifying your code works?  Too much time.</p>

<p>Stay in your editor.  It has everything you need to get stuff done.  Avoid the context switch.  Avoid repetitive typing.  Have one window for your code and another for your tests.  Even on small laptops you can split windows to have both open at once. Gary Bernhardt, in an excellent <a href="https://peepcode.com/products/play-by-play-bernhardt">Peepcode</a>, shows how he runs specs from within vim.  Ryan Bates, in his screencast <a href="http://railscasts.com/episodes/275-how-i-test">How I Test</a>, only uses the browser for UI design.  If you leave your editor you are wasting time and suffering a context switch.</p>

<p>Every language has some sort of continuous testing runtime.  Detect a file change, run applicable tests.  Take a look at <a href="https://github.com/guard/guard">Guard</a>.  Selenium and company are excellent browser testing tools.  Jasmine works great for Javascript.  Rspec and Capybara are a solid combination.  Growl works well for notifications.  By staying in your editor you are coding all that manual verification away.  Once coded you can repeat indefinitely.</p>

<h2>Start with Tests</h2>

<p>Test driven doesn&#8217;t mean test after.  This may be the hardest rule for newcomers to follow.  We&#8217;ve been so engrained to write code, to design classes, to focus on OOP.  We know what we need to do.  We just need to do it.  Once code works we&#8217;ll then write tests to ensure it always works.  I&#8217;ve done this bad practice myself.</p>

<p>When you test last you&#8217;re missing the <em>why</em>.  <em>Customer gets welcome email after signing up</em> means nothing without context.  If you know <em>why</em> this is needed you are in a better position to define your required tests and start shaping your code.  The notification could be a simple acknowledgement or part of some intricate flow.  If you know the <em>why</em> you are not driving blind.    The what you will build and the how you will build it will follow.  If you code the other way around, testing later, you&#8217;re molding the problem to your solution.  Define the problem first, then solve succinctly.</p>

<h2>Start with Failing Tests</h2>

<p>One of my favorite newbie mistakes is when a developer writes some code, then writes a test, watches the test pass, then is surprised when the code fails in the browser.  But the test passed!</p>

<p>Anyone can write a green test.  It is the action of going from red to green which gives the test meaning.  Something needs to work, it doesn&#8217;t.  Red state.  You change your code, you make it work.  Green state.  Without the red state first you have no idea how you got to a green state.  Was it a bug in your test? Did you test the right thing?  Did you forget to assert something?  Who knows.</p>
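<p>Here is what that cycle might look like in Python with the standard unittest module.  The <code>welcome_subject</code> function and its wording are invented for the example:</p>

```python
import unittest


# Step 2 (green): written only after the test below had been seen failing.
def welcome_subject(customer_name):
    return f"Welcome, {customer_name}!"


# Step 1 (red): this test came first. Before welcome_subject existed,
# running it failed with a NameError, which proved the test can fail.
class WelcomeEmailTest(unittest.TestCase):
    def test_subject_greets_customer_by_name(self):
        self.assertEqual(welcome_subject("Ada"), "Welcome, Ada!")


suite = unittest.defaultTestLoader.loadTestsFromTestCase(WelcomeEmailTest)
result = unittest.TextTestRunner().run(suite)
```

<p>Seeing the failure first is the only evidence that the assertion actually exercises the code you are about to write.</p>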

<p>Combined with the <em>why</em> going from red to green gives the code shape.  You don&#8217;t need to over-think class design.  The code you write has purpose: it implements a need to make something work that doesn&#8217;t.  As your functionality becomes more complex, your code becomes more nimble.  You deal with dependencies, spawning new tests and classes when cohesion breaks down.  You stay focused on your goal: make something work.  Combined with git commits you have a powerful history to branch and backtrack if necessary.  As always, don&#8217;t be afraid to refactor.</p>

<h2>Testing First Safeguards Agile Development</h2>

<p>Testing first also acts as a safeguard.  Too often developers will pull work from a backlog prematurely.  They&#8217;ll make assumptions, code to those assumptions, and have to make too many changes before release.  If the first thing you do after pulling a story is ask yourself &#8220;how can I verify this works&#8221; you&#8217;re thinking in terms of your end-user.  You&#8217;re writing acceptance tests.  You understand what you need to deliver.  BDD tools like <a href="http://cukes.info/">Cucumber</a> put this paradigm in the foreground.  You can achieve the same effect with vanilla integration tests.</p>

<h2>Always Test Difficult Code</h2>

<p>Most of the time not testing comes down to two reasons.  The code is too hard to test or the code is not worth testing.  There are other reasons, but they are all poor excuses.  If you want to test code you can test code.</p>

<p>Code shouldn&#8217;t be too hard to test.  Testing distributed, asynchronous systems is hard but still testable.  When code is too hard to test you have the wrong abstraction.  Your API isn&#8217;t working.  You aren&#8217;t adhering to SOLID principles.  Your testing toolkit isn&#8217;t sufficient.</p>

<p>Static languages can rely on dependency injection to handle mocking, dynamic languages can intercept methods.  Tools like <a href="https://www.relishapp.com/vcr/vcr">VCR</a> and Cassette can fake http requests for external dependencies.  Databases can be tested in isolation or <a href="https://github.com/nulldb/nulldb">faked</a>.  Asynchronous code can be tricky to test but becomes easier when separating pre and post conditions (you can also block in unit tests to handle synchronization).</p>
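<p>As a small sketch of the dependency injection approach, here is a Python example where a fake stands in for a real mail client.  <code>SignupService</code> and <code>FakeMailer</code> are hypothetical names used only for illustration:</p>

```python
class FakeMailer:
    """Test double that records messages instead of sending them."""

    def __init__(self):
        self.sent = []

    def send(self, to, subject):
        self.sent.append((to, subject))


class SignupService:
    def __init__(self, mailer):
        # The mailer is injected, so tests can pass a fake.
        self.mailer = mailer

    def sign_up(self, email):
        self.mailer.send(email, "Welcome!")
        return {"email": email}


mailer = FakeMailer()
customer = SignupService(mailer).sign_up("ada@example.com")
print(mailer.sent)
```

<p>Because the service never constructs its own mailer, the test needs no network and can assert on exactly what would have been sent.</p>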

<p>The code you don&#8217;t test, especially difficult code, will always bite you.  Taking the time to figure out how to test will clean up the code and will give you incredible insight into how your underlying framework works.</p>

<h2>Always Test Your Code</h2>

<p>I worked with a developer that didn&#8217;t write tests because the requirements, and thus code, were changing too much and dealing with the failing tests was tedious.  It actually signified a red flag exposing larger issues in the organization but the point is a common one.  Some developers don&#8217;t test because code may be thrown out or it&#8217;s just a spike and not worth testing.</p>

<p>If you&#8217;re not testing first because it&#8217;s a faster way to develop, realize that there is no such thing as throw away code (on the other hand, <a href="http://code.dblock.org/treat-every-line-of-code-as-if-its-going-to-be-thrown-away-one-day">all code is throw away code</a>).  Mixing good, tested code with untested code creates technical debt.  If you put a drop of sewage in a barrel of wine you will have a barrel of sewage.  The code has no <em>why</em>.  It may be just a spike but it could also turn out to be the next best thing.  Then you&#8217;re left retrofitting unit tests, fitting a square peg in a round hole.</p>

<h2>Balancing Integration and Unit Tests</h2>

<p>Once you start testing first a lot of pieces fall into place.  The balance between integration and unit tests is an interesting topic when dealing with code coverage.  There will be overlap in code coverage but not in terms of covered functionality.</p>

<p>Unit tests are the distinct pieces of your code.  Integration tests are how those pieces fit together.  You have a customer class and a customer page.  The unit tests are the rules around the customer model or the distinct actions around the customer controller.  The integration tests are how the end user interacts with those models top to bottom.  <a href="http://pivotallabs.com/cucumber-step-definitions-are-not-methods/">Pivotal Labs talks about changing state in cucumber steps</a> showing how integration tests monitor the flow of events in an application.  Unit tests are for the discrete methods and properties which drive those individual events.</p>

<h2>Automate</h2>

<p>Developing applications is much more than coding.  Focusing on tools and techniques at your disposal will help you write code more effectively.  Your IDE, command line skills, testing frameworks, libraries and development paradigms are as important as the code you write.  They are your tools and become more powerful when used correctly.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.michaelhamrah.com/2013/02/embracing-test-driven-development-for-speed/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://blog.michaelhamrah.com/2013/02/embracing-test-driven-development-for-speed/</feedburner:origLink></item>
		<item>
		<title>Bus travel tips in Turkey</title>
		<link>http://feedproxy.google.com/~r/AdventuresInHttpcontext/~3/rcRGUYZQjHs/</link>
		<comments>http://blog.michaelhamrah.com/2012/09/bus-travel-tips-in-turkey/#comments</comments>
		<pubDate>Thu, 13 Sep 2012 17:47:25 +0000</pubDate>
		<dc:creator>Michael</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[travel]]></category>
		<category><![CDATA[Turkey]]></category>

		<guid isPermaLink="false">http://www.michaelhamrah.com/blog/?p=742</guid>
		<description><![CDATA[We&#8217;ve been travelling around turkey for the past three weeks and have relied heavily on bus travel to get around. Travel books have great info, but there are a couple of more things to think about when dealing with buses in Turkey. First, there are several companies that serve various routes. Pamukkale, KamilKoc and Metro [...]]]></description>
				<content:encoded><![CDATA[<p><a href="http://www.michaelhamrah.com/blog/wp-content/uploads/2012/09/20120913-205957.jpg"><img src="http://www.michaelhamrah.com/blog/wp-content/uploads/2012/09/20120913-205957.jpg" alt="20120913-205957.jpg" class="alignnone size-full" /></a><br /><br /><span id="more-742"></span></p>

<p>We&#8217;ve been travelling around Turkey for the past three weeks and have relied heavily on bus travel to get around.  Travel books have great info, but there are a couple more things to think about when dealing with buses in Turkey.</p>

<p>First, there are several companies that serve various routes.  Pamukkale, KamilKoc and Metro are the big ones.  There are several more depending on where you are and where you&#8217;re going.  If you don&#8217;t see the bus time you want, or if a bus is full, check with another company.  Prices are fairly set so I don&#8217;t  think it&#8217;s worth negotiating down.  If you are booking through a tour operator, hotel or another reseller they will most likely book through another company, most likely one of the above.</p>

<p>It&#8217;s important to ask what type of bus you&#8217;ll be taking.  There are big coach buses, older buses, and minibuses.  Ideally you want to be on a big coach bus, often referred to as a big bus.  There is usually wifi and tv (turkish only) on big buses, but I haven&#8217;t been on one yet with power.  One did have USB outlets but was unable to charge the iPad.  Minibuses and older buses may not have air conditioning, so it&#8217;s important to ask.  Big buses have the smoothest ride and the most legroom.  If you don&#8217;t like the bus you&#8217;re getting at the time you want, see if there&#8217;s another time with a better bus or go to another company.  Always get your ticket from someone behind a desk.  There will be plenty of people trying to sherpa you here and there, but just go right to the desk.  At some otogars there are valets to help you. They may appear to be trying to sell you something.  Just ask the right questions and you&#8217;ll be fine.  Turkish people are very nice and very helpful.</p>

<p>The bus may make a lot of stops.  We were on a minibus from Denizli to Fethiye and the bus stopped for anybody along the road.  It was nuts!  People would be waiting on the road no more than 50 meters away from each other and the bus would pull up, slow down, see if anybody needed to get on.  Also, some buses will stop at rest areas every 45 minutes to an hour for breaks.  On our way to Selcuk we had to stop at a rest area for 15 minutes even though we were only five minutes out from our destination.  Ask if it is a direct bus and how many stops it will make.  Usually the big coaches are better than the minibuses in terms of stopping.  If possible, just avoid minibuses.  The ride will most likely be bumpy as well unless it is a newer minibus or a tourist minibus.</p>

<p>Seats are assigned on buses, so ask for a seat up in the front.  Some bus companies have seat maps so you can see where you&#8217;ll be seated.  You don&#8217;t need to rush onto the bus, just put your bags on, get on, and find your seat.  Everyone is very nice and will gladly help you out.  You&#8217;ll also get tea or coffee on the bus with a snack.  If the bus is really bumpy don&#8217;t get anything hot.  You&#8217;ll probably spill it or need to drink it really quickly, and then have to go to the bathroom.  Most buses don&#8217;t have bathrooms (they do make a lot of stops, so don&#8217;t worry, but some restrooms cost 1TL).  Another funny thing is that on minibuses in small towns there will be a guy walking up and down with lemon or rose oil for your hands.  A nice little refresher!</p>

<p><br /><br /><a href="http://www.michaelhamrah.com/blog/wp-content/uploads/2012/09/20120919-164620.jpg"><img src="http://www.michaelhamrah.com/blog/wp-content/uploads/2012/09/20120919-164620.jpg" alt="20120919-164620.jpg" class="alignnone size-full" /></a>
<br />
Dolmuses are fantastic.  These are little vans that go around towns to pick people up and drop them off along the way.  They are extremely cheap, extremely frequent and should be leveraged.  They are just as good as taxis and cost a lot less.  They are great within cities to get to more remote areas and to travel among smaller towns.  They are great on the Turquoise Coast to explore different beaches. Essentially, you just wait outside on the road in the direction you want and a van with people will pull up.  Hotel, pensyon and guest house operators are very helpful with Dolmus transport.  Depending on where you are on the Lycian Way you could even send your bags ahead to be picked up by your next stop.</p>

<p>If you stick with the big buses and know your options, bus travel in Turkey is a great and economical way to get around.  Always bring earplugs and an eyemask, especially on night buses.  There will always be a crying baby and someone reading with the light on.</p>
<img src="http://feeds.feedburner.com/~r/AdventuresInHttpcontext/~4/rcRGUYZQjHs" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://blog.michaelhamrah.com/2012/09/bus-travel-tips-in-turkey/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://blog.michaelhamrah.com/2012/09/bus-travel-tips-in-turkey/</feedburner:origLink></item>
		<item>
		<title>Effective Caching Strategies: Understanding HTTP, Fragment and Object Caching</title>
		<link>http://feedproxy.google.com/~r/AdventuresInHttpcontext/~3/nK6Aptyoga4/</link>
		<comments>http://blog.michaelhamrah.com/2012/08/effective-caching-strategies-understanding-http-fragment-and-object-caching/#comments</comments>
		<pubDate>Sat, 18 Aug 2012 18:41:24 +0000</pubDate>
		<dc:creator>Michael</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[architecture]]></category>
		<category><![CDATA[cache]]></category>
		<category><![CDATA[caching]]></category>
		<category><![CDATA[performance]]></category>
		<category><![CDATA[scaling]]></category>

		<guid isPermaLink="false">http://www.michaelhamrah.com/blog/?p=718</guid>
		<description><![CDATA[Caching is one of the most effective techniques to speed up a website and has become a staple of modern web architecture. Effective caching strategies will allow you to get the most out of your website, ease pressure on your database and offer a better experience for users. Yet as the old adage says caching&#8211;especially [...]]]></description>
				<content:encoded><![CDATA[<p>Caching is one of the most effective techniques to speed up a website and has become a staple of modern web architecture.  Effective caching strategies will allow you to get the most out of your website, ease pressure on your database and offer a better experience for users.  Yet as the old <a href="http://martinfowler.com/bliki/TwoHardThings.html">adage says</a>, caching&#8211;especially invalidation&#8211;is tricky.  Dealing with dynamic pages, deciding what to cache, handling per-user personalization and invalidating stale entries are some of the challenges which come along with caching.</p>

<p><span id="more-718"></span></p>

<h3>Caching Levels</h3>

<p>There are three broad levels of caching:</p>

<ul>
<li><p><em><a href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec13.html">HTTP Caching</a></em> allows for full-page caching via HTTP headers on URIs.  This must be enabled on all static content and should be added to dynamic content when possible.  It is the best form of caching, especially for dynamic pages, as you are serving generated html content and your application can effectively leverage reverse-proxies like <a href="http://www.squid-cache.org/">Squid</a> and <a href="https://www.varnish-cache.org/">Varnish</a>.  <a href="http://www.mnot.net/cache_docs/">Mark Nottingham&#8217;s great overview on HTTP Caching is worth a read</a>.</p></li>
<li><p><em>Fragment Caching</em> allows you to cache page fragments or partial templates.  When you cannot cache an entire http response, fragment caching is your next best bet.  You can quickly assemble your pages from pre-generated html snippets.  For a page involving disparate dynamic content you can build your result page from cached html fragments for each section.  For listing pages, like search results, you can build the page from html fragments for each id and not regenerate markup.  For detail pages you can separate less-volatile or common sections from high-volatile or per-user sections.</p></li>
<li><p><em>Object Caching</em> allows you to cache a full object (as in a model or viewmodel).  When you must generate html for each user/request, or when your objects are shared across various views, object caching can be extremely helpful.  It allows you to better deal with expensive queries and lessen hits to your database.</p></li>
</ul>

<p>The goal is to make your response times as fast as possible while lessening load.  The more html (or data) you can push closer to the end-user the better.  HTTP caching is better than fragment caching: you are ready to return the rendered page.  When combined with a CDN even dynamic pages can be pushed to edge locations for faster response times.  Fragment caching is better than object caching: you already have the rendered html to build the page.  Object caching is better than a database call: you already have the cached query result or denormalized object for your view.  The deeper you get in the stack (the closer to the datastore) the more options you have to vary the output, but the more expensive and slower the operation becomes.</p>
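<p>As a rough sketch of that ordering, the lookup below tries the fragment cache first, then the object cache, and only then the database.  The two dictionaries and the <em>load_from_db</em> and <em>render</em> callables are hypothetical stand-ins for a real cache store, query and template, not part of any particular framework.</p>

```python
# Hypothetical in-process stores standing in for a real cache server.
fragment_cache = {}
object_cache = {}


def get_widget_html(widget_id, load_from_db, render):
    """Return (html, source), trying the cheapest layer first."""
    key = f"widget:{widget_id}"
    if key in fragment_cache:
        # Rendered html is already available: no query, no rendering.
        return fragment_cache[key], "fragment"
    if key in object_cache:
        # The object is cached: skip the query but still render.
        widget, source = object_cache[key], "object"
    else:
        # Nothing cached: pay for the query and remember the result.
        widget, source = load_from_db(widget_id), "database"
        object_cache[key] = widget
    html = render(widget)
    fragment_cache[key] = html
    return html, source
```

<p>The second value in the returned pair is only there to show which layer answered; a real application would drop it.</p>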

<h3>Break Content Down; Cache for Views</h3>

<p>A cache strategy is dependent on breaking content down to store and reuse later.  The more granular you can get the more options you have to serve cached content.  There are two main dimensions: what to cache and whom to cache for.  It is difficult to HTTP cache a page with a &#8220;Hello, {{ username }}&#8221; in the header for all users.  However if you break your users down into logged-in users and anonymous users you can easily HTTP cache your homepage for just anonymous users using the <em>vary</em> http-header and defer to fragment caching for logged-in users.</p>

<p>Cache key naming strategies allow you to vary the <em>what</em> with the <em>who for</em> in a robust way by creating multiple versions of the same resource.  A cache key could include the role of the user and the page, such as <em>role:page:fragment:id</em>, as in <em>anon:widget_detail:widget:1234</em>, and serve the <em>widget detail</em> html fragment to anonymous users.  The same widget could be represented in a search detail list via <em>anon:widget_search:widget:1234</em>.  When widget 1234 updates, both keys are invalidated.  Most people opt for object caching for an easy win with dynamic pages, specifically by caching via a primary key or id.  This can be helpful, but if you break down your content into the <em>what</em> and <em>who for</em> with a good key naming strategy you can leverage fragment caching and save on rendering time.</p>
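<p>Here is a minimal sketch of that naming scheme, assuming a plain dictionary in place of a real cache store.  The role and view lists and the helper names are made up for illustration; the point is that a predictable key format lets one update invalidate every representation of a widget.</p>

```python
# Hypothetical in-memory store standing in for a real cache server.
cache = {}

# Assumed roles and views that render a widget fragment.
ROLES = ("anon", "member")
VIEWS = ("widget_detail", "widget_search")


def fragment_key(role, view, kind, obj_id):
    """Build a key in the role:page:fragment:id form."""
    return f"{role}:{view}:{kind}:{obj_id}"


def store_fragment(role, view, kind, obj_id, html):
    """Cache one rendered fragment under its composed key."""
    cache[fragment_key(role, view, kind, obj_id)] = html


def invalidate_widget(obj_id):
    """Drop every cached representation of one widget; return how many keys were removed."""
    removed = 0
    for role in ROLES:
        for view in VIEWS:
            if cache.pop(fragment_key(role, view, "widget", obj_id), None) is not None:
                removed += 1
    return removed
```

<p>Calling <em>invalidate_widget(1234)</em> from an update handler clears both the detail and search fragments for that widget and leaves other widgets alone.</p>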

<p>The <em>vary</em> http header is very helpful for dealing with HTTP caching and is not used widely enough.  By varying URIs based on certain headers (like authorization or a cookie value) you can cache different representations for the same resource in a similar way to creating multiple keys.  Think of the cache key as the URI plus whatever is set in the <em>vary</em> header.  This opens up the power of HTTP caching for dynamic or per-user content.</p>
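<p>To make the &#8220;URI plus whatever is set in <em>vary</em>&#8221; idea concrete, here is a simplified sketch of how a cache might derive its lookup key.  The function name and key format are assumptions for illustration; real proxies normalize header values in their own ways.</p>

```python
def http_cache_key(uri, request_headers, vary):
    """Combine the URI with the request's values for each header named in Vary."""
    # Header names are case-insensitive, so normalize before looking them up.
    headers = {name.lower(): value for name, value in request_headers.items()}
    names = sorted(name.strip().lower() for name in vary.split(",") if name.strip())
    # A header missing from the request contributes an empty value.
    parts = [f"{name}={headers.get(name, '')}" for name in names]
    return "|".join([uri] + parts)
```

<p>Two requests for the same URI share a cached response only when every header listed in <em>vary</em> has the same value, which is why varying on a header with many distinct values produces many separate cache entries.</p>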

<p>You are ready to deliver content quickly when you think about your cache in terms of views and not data.  Cache a denormalized object with child associations for easy rendering without extra lookups.  Store rendered html fragments for sections of a page that are common to users on otherwise specific content.  &#8220;Popular&#8221; and &#8220;Recent&#8221; may be expensive queries; storing rendered html saves on processing time and can be injected into the main page.  You can even reuse fragments across pages.  A good cache key naming strategy allows for different representations of the same data which can easily be invalidated.</p>

<h3>Cache Invalidation</h3>

<p>Nobody likes stale data.  As you think about caching, think about the circumstances under which to invalidate the cache.  Time-based expirations are convenient but can usually be avoided by invalidating caches on create and update commands.  A good cache key naming strategy helps.  Web frameworks usually have a notion of &#8220;callbacks&#8221; to perform secondary actions when a primary action takes place.  A set of fragment and object caches for a widget could be invalidated when a record is updated.  If cache values are granular enough you could invalidate sections of a page, like blog comments, when a comment is added and not expire the entire blog post.</p>

<p>HTTP ETags provide a great mechanism for dealing with stale HTTP requests.  ETags allow more invalidation options than the basic If-Modified-Since header.  When dealing with ETags the most important thing is to avoid processing the entire request simply to generate the ETag to validate against (this saves network bandwidth but does not save processing time).  Caching ETag values against URIs is a good way to check whether an ETag is still valid and send the proper 304 Not Modified response as early as possible in the request cycle.  Depending on your needs you can also cache sets of ETag values against URIs to handle various representations.</p>
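<p>A minimal sketch of that idea follows, assuming a dictionary that maps each URI to its current ETag.  The <em>respond</em> and <em>invalidate</em> helpers are hypothetical and not taken from any framework; the point is that a matching validator is answered before the page is rendered.</p>

```python
import hashlib

# Hypothetical store mapping a URI to the ETag of its current representation.
etag_cache = {}


def make_etag(body):
    """Derive a quoted ETag from the response body (bytes)."""
    return '"' + hashlib.sha1(body).hexdigest() + '"'


def respond(uri, if_none_match, render):
    """Return (status, etag, body), skipping render() when the client's copy is current."""
    cached = etag_cache.get(uri)
    if cached is not None and if_none_match == cached:
        return 304, cached, b""  # validated without rendering the page
    body = render()
    etag = make_etag(body)
    etag_cache[uri] = etag
    return 200, etag, body


def invalidate(uri):
    """Call from create and update handlers so a stale ETag is never confirmed."""
    etag_cache.pop(uri, None)
```

<p>This only stays correct if every write path calls <em>invalidate</em>; otherwise the cached ETag would keep confirming a representation that no longer exists.</p>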

<p>If you must rely on time-based expiration try to add expiration callbacks to keep the cache fresh, especially for expensive queries in high-load scenarios.</p>

<h3>Edge Side Includes: Fragment Caching for HTTP</h3>

<p>Edge Side Includes are a great way of pushing more dynamic content closer to users.  ESIs essentially give you the benefits of fragment caching with the performance of HTTP caching.  If you are considering using a tool like <a href="http://www.squid-cache.org/">Squid</a> or <a href="https://www.varnish-cache.org/">Varnish</a>, ESIs are essential and will allow you to add customized content to otherwise similar pages.  The <em>user panel</em> in the header of a page is a classic example of ESI usage.  If the user panel is the only variant of an otherwise common page for all users, the common elements could be pulled from the reverse-proxy within milliseconds and the &#8220;Welcome, {{USER}}&#8221; injected dynamically as a fragment from the application server before sending everything to the client.  This bypasses the application stack for most of the page, lightening load and decreasing processing time.</p>
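<p>The toy function below imitates what a reverse-proxy does with an ESI page: the cached page contains an &lt;esi:include src="..."/&gt; tag and the proxy swaps in the fragment fetched for that <em>src</em>.  It is only an illustration of the mechanism under simplified assumptions; <em>assemble</em> and <em>fetch_fragment</em> are invented names, and a real proxy handles far more of the ESI language.</p>

```python
import re

# Matches self-closing include tags such as <esi:include src="/user_panel"/>.
ESI_INCLUDE = re.compile(r'<esi:include\s+src="([^"]+)"\s*/>')


def assemble(cached_page, fetch_fragment):
    """Replace each esi:include tag with the fragment returned for its src."""
    return ESI_INCLUDE.sub(lambda match: fetch_fragment(match.group(1)), cached_page)
```

<p>Only <em>fetch_fragment</em> touches the application server; the rest of the page comes straight from the cached copy.</p>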

<h3>Distributed or Centralized Caches are Better</h3>

<p>Distributed and/or centralized caches are better than in-memory application server cache stores.  By using a distributed cache like <a href="http://memcached.org/">Memcached</a>, or a centralized cache store like <a href="http://redis.io">Redis</a>, you can drop duplicate data caches to make caching and invalidating objects easier.  Even though caching objects in a web app&#8217;s memory space is convenient and reduces network i/o, it soon becomes impractical in a web farm.  You do not want to build up caches per-server or steal memory space away from the web server.  Nor do you want to have to hunt and gather objects across a farm to invalidate caches.  If you do not want to support your own cache farm, there are plenty of SaaS services to deal with caching.</p>

<h3>Compress When Possible</h3>

<p>Compressing content helps.  Memory is a far more valuable resource for web apps than cpu cycles.  When possible, compress your serialized cache content.  This lowers the memory footprint so you can put more stuff in cache, and lightens the transfer load (and time) between your cache server and application server.  For HTTP caching the helpful <em>vary</em> http header can also be used to cache content for browsers supporting compression and those that don&#8217;t.  For object caching, only store what you need in the cache.  Even though compression helps reduce the footprint, not storing extraneous data further reduces the footprint and saves serialization time.</p>
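<p>A small sketch of both points, using only Python&#8217;s standard <em>json</em> and <em>zlib</em> modules.  The <em>pack</em> and <em>unpack</em> helpers and the field list are assumptions for illustration: the object is trimmed to the fields the view needs, then serialized and compressed before it goes into the cache.</p>

```python
import json
import zlib


def pack(obj, fields):
    """Keep only the fields the view needs, then serialize and compress."""
    trimmed = {name: obj[name] for name in fields if name in obj}
    return zlib.compress(json.dumps(trimmed, separators=(",", ":")).encode("utf-8"))


def unpack(blob):
    """Reverse pack(): decompress and deserialize."""
    return json.loads(zlib.decompress(blob).decode("utf-8"))
```

<p>Whether the cpu cost of compressing pays off depends on the size and shape of the values, so it is worth measuring with your own data.</p>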

<h3>NoSQL to the Rescue</h3>

<p>One of the interesting trends I am reading about is how certain NoSQL stores are eliminating the need for separate cache farms.  NoSQL solutions are beneficial for a variety of reasons even though they create significant data-modeling challenges.  What NoSQL solutions lack in the flexibility of representing and accessing data (i.e. no joins, minimal search) they can make up in their distributed nature, fault-tolerance, and access efficiency.  When you model your data for your views, putting the burden on storing data in the same way you want to get it out, you&#8217;re essentially replacing your denormalized memory-caching tier with a more durable solution.  Cassandra and other Dynamo/Bigtable type stores are key-value stores, similar to cache stores, with the value part offering some sort of structured data type (in the case of Cassandra, sorted lists via column families).  MongoDB and Redis (not Dynamo inspired) offer similar advantages: Redis&#8217; sorted sets and lists offer a variety of solutions for listing problems, and MongoDB allows you to query objects.</p>

<p>If you are okay with storing (and updating) multiple versions of your data (again, you are caching for views) you can cut the two-layer approach of separate cache and data stores. The trick is storing everything you need to render a view for a given key.  Searches could be handled by a search-server like Solr or ElasticSearch; listing results could be handled by maintaining your own index via a sorted-list value via another key.  When using Cassandra you&#8217;d get fast, masterless, and scalable persistent storage.  In general this approach is only worthwhile if your views are well-defined.  The last thing you want to do is refactor your entire data model when your views change!</p>

<h3>How Web Frameworks Help</h3>

<p>There is always debate on differences between frameworks and languages.  One of the things I always look for is how easy it is to add caching to your application.  Rails offers great support for caching, and the <a href="http://guides.rubyonrails.org/caching_with_rails.html">Caching with Rails</a> guide is worth a read no matter what framework or language you use.  It easily supports fragment caching in views via content blocks, offers behind-the-scenes action caching support, has a pluggable cache framework to use different stores, and most importantly has an extremely flexible invalidation framework via model observers and cache sweepers.  When choosing any type of framework, &#8220;how to cache&#8221; should be a bullet point at the top of the list.</p>
<img src="http://feeds.feedburner.com/~r/AdventuresInHttpcontext/~4/nK6Aptyoga4" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://blog.michaelhamrah.com/2012/08/effective-caching-strategies-understanding-http-fragment-and-object-caching/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://blog.michaelhamrah.com/2012/08/effective-caching-strategies-understanding-http-fragment-and-object-caching/</feedburner:origLink></item>
	</channel>
</rss>
