<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0" version="2.0">

<channel>
	<title>Basil Vandegriend: Professional Software Development</title>
	
	<link>http://www.basilv.com/psd</link>
	<description />
	<lastBuildDate>Fri, 17 May 2013 13:13:07 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3</generator>
		<atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" type="application/rss+xml" href="http://feeds.feedburner.com/ProfessionalSoftwareDevelopment" /><feedburner:info uri="professionalsoftwaredevelopment" /><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com/" /><image><link>http://www.basilv.com/psd/</link><url>http://www.basilv.com/psd/wp-content/themes/bvpsd/images/gears.gif</url><title>Professional Software Development</title></image><item>
		<title>Bad News Early</title>
		<link>http://feedproxy.google.com/~r/ProfessionalSoftwareDevelopment/~3/DqCJlEWrM_Y/bad-news-early</link>
		<comments>http://www.basilv.com/psd/blog/2013/bad-news-early#comments</comments>
		<pubDate>Fri, 17 May 2013 13:13:07 +0000</pubDate>
		<dc:creator>Basil Vandegriend</dc:creator>
				<category><![CDATA[professional]]></category>
		<category><![CDATA[communication]]></category>
		<category><![CDATA[corporate culture]]></category>

		<guid isPermaLink="false">http://www.basilv.com/psd/?p=885</guid>
		<description><![CDATA[For a software development project, when is the best time to communicate to stakeholders bad news like having insufficient budget or schedule? From the behaviors I have observed of some managers and leads, their answer would seem to be "never" - upon learning bad news they hope things will turn out in the end and [...]]]></description>
			<content:encoded><![CDATA[<p>For a software development project, when is the best time to communicate to stakeholders bad news like having insufficient budget or schedule? From the behaviors I have observed of some managers and leads, their answer would seem to be "never" - upon learning bad news they hope things will turn out in the end and never voluntarily communicate such news to avoid looking bad themselves. Unfortunately, denial and blind hope are not valid management practices and do not make the issues go away. In fact, I feel that this often makes the situation worse. Instead I believe that the best practice to follow is to communicate bad news early.</p>
<p>Early communication has many benefits:</p>
<ul>
<li>It demonstrates transparency with stakeholders, which helps build trust. As Stephen M.R. Covey writes in the book <a href="http://www.amazon.ca/gp/product/1416549005/ref=as_li_qf_sp_asin_tl?ie=UTF8&#038;camp=15121&#038;creative=330641&#038;creativeASIN=1416549005&#038;linkCode=as2&#038;tag=basilvandegri-20">The SPEED of Trust: The One Thing That Changes Everything</a><img src="http://www.assoc-amazon.ca/e/ir?t=basilvandegri-20&#038;l=as2&#038;o=15&#038;a=1416549005" width="1" height="1" border="0" alt="" style="border:none !important; margin:0px !important;" />, trust helps break down barriers that slow down business. In contrast, imagine stakeholders eventually finding out the bad news, and then learning that you knew for a while but kept it a secret. This can instantaneously destroy trust.
</li>
<li>The earlier people learn of an issue, the more time is available to deal with it. This is especially true for project management issues like insufficient budget or schedule. The options for dealing with this become extremely limited if you only find out when the budget is completely gone or the milestone date comes and passes unmet. Finding out early provides many more strategies for dealing with this.
</li>
<li>Awareness precedes action. Management attention and organizational resources can be focused on resolving the issue only after it has been communicated.
</li>
<li>When you initiate the communication, you have an easier time controlling the message that goes out, which in turn usually leads to better outcomes. Letting bad news 'escape' on its own usually ends up being much less pleasant.
</li>
</ul>
<p>One potential drawback of communicating bad news early is that management attention can end up consuming your time, for example by having to explain over and over again what the issue is, why it happened, and what options there are for dealing with it. Even without a culture of shooting the messenger, this can demotivate people from raising bad news in the future.</p>
<p>I had the opportunity to communicate bad news early on a recent enterprise development project. The project was funded based on an initial high-level estimate of high-level requirements. After these requirements were decomposed into a prioritized backlog of stories and the first iteration was developed, I calculated the effort required to implement the minimum viable product based on the team's velocity. Unfortunately the numbers weren't good: a budget shortfall was predicted. So I promptly communicated this news to the product owner, project sponsors, and my management along with an action plan for addressing the issue. I did suffer the drawback I mentioned above: many of the managers were surprised at the bad news, and everyone wanted to meet with me to discuss the issue. Apart from that, the outcome was positive: scope was reduced from the minimum viable product (it turned out to be not-quite-minimum originally), the team was kept lean, and additional funding was pursued, leading to the budget going back to green. Some of these actions would have been much less effective if pursued later in development.</p>
<p>The practice of communicating bad news early applies to every level of the organization. Individual developers communicating obstacles or impediments to their team at the daily Scrum is an example at the grassroots level. The communication can also go in multiple directions in the hierarchy. A ScrumMaster or manager will typically communicate up the hierarchy concerning obstacles external to their team, while managers at every level should be communicating a realistic picture to their teams. Corporate culture should explicitly endorse and encourage the communication of bad news, as otherwise the tendency is to end up with a reality-distortion field, especially for senior management, as news is tweaked to be more positive (sometimes unconsciously) each time it is passed along.</p>
<p>One caveat regarding this practice: the "early" aspect of communicating bad news does not mean as early as possible: it means communicating at the <em>earliest responsible moment</em> (an adaptation of the lean principle of deciding at the latest responsible moment). You do not want to appear to be crying wolf. Sometimes it is beneficial or necessary to initiate actions to address the bad news prior to communicating. For example, communication of security holes regarding software products is done privately and typically only publicly communicated after the fix has been distributed in order to minimize the risk of attackers exploiting the hole.</p>
<p>So I encourage you to reflect on any bad news you might be hiding and seize the courage to communicate it to the appropriate parties.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.basilv.com/psd/blog/2013/bad-news-early/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://www.basilv.com/psd/blog/2013/bad-news-early</feedburner:origLink></item>
		<item>
		<title>Architects Anonymous</title>
		<link>http://feedproxy.google.com/~r/ProfessionalSoftwareDevelopment/~3/QQ08WZu1j38/architects-anonymous</link>
		<comments>http://www.basilv.com/psd/blog/2013/architects-anonymous#comments</comments>
		<pubDate>Mon, 22 Apr 2013 12:38:24 +0000</pubDate>
		<dc:creator>Basil Vandegriend</dc:creator>
				<category><![CDATA[architecture]]></category>
		<category><![CDATA[continuous improvement]]></category>
		<category><![CDATA[corporate culture]]></category>
		<category><![CDATA[leadership]]></category>
		<category><![CDATA[quality]]></category>
		<category><![CDATA[software development]]></category>

		<guid isPermaLink="false">http://www.basilv.com/psd/?p=880</guid>
		<description><![CDATA[As an architect observing and helping multiple teams build and maintain enterprise software, sometimes I think I am living in an alternate reality. I see systems fail on a nearly daily basis, teams under intense schedule pressure, a lack of awareness of basic developer quality practices, repeated failures of communication, servers taking weeks to provision, [...]]]></description>
			<content:encoded><![CDATA[<p>As an architect observing and helping multiple teams build and maintain enterprise software, sometimes I think I am living in an alternate reality. I see systems fail on a nearly daily basis, teams under intense schedule pressure, a lack of awareness of basic developer quality practices, repeated failures of communication, servers taking weeks to provision, poor management oversight and planning, a lack of critical thinking - the list goes on and on. Yet many teams and managers act like this is normal or blame external factors, with no drive to continuously improve or tackle obstacles. Even acknowledging the issues would be progress in many cases.</p>
<p>When I talk about broadly used industry practices that could help, like agile development and continuous delivery, I don't know which reaction bothers me more: the blank stares from people who have not heard of them, or the common refrain that such practices will not work or are not possible in this environment. (Actually, what bothers me the most is when I get both reactions from the same person, at the same time.)</p>
<p>Are my standards too high? While perfection like zero defects or 100% up-time is impossible to attain, surely we should strive to push ourselves to get as close as we can to those platonic ideals, in a cost-effective manner. (And most people, it seems, under-appreciate the hidden costs of poor quality.)  </p>
<p>Yet sometimes in meetings I feel like I am attending the opposite of an intervention, where everyone reassures me and each other that things are going okay, and there are no serious problems. Like that old Nintendo Game Boy commercial, perhaps I just need to take a large stick and whack myself upside the head in order to see the beautiful colors that everyone else is seeing, instead of seeing problems in black-and-white.</p>
<p>Should I give up? Join the ranks of those in denial or ignorance? Abandon the vision of how things could be better?</p>
<p>Absolutely not! That would violate my personal values and sense of professionalism. Great software is hard and requires highly motivated individuals pushing for it. To paraphrase a quote from Edmund Burke, all that is necessary for mediocrity to triumph is for good men to do nothing.</p>
<p>So I will remain the one seeing problems and being a voice for change. I call on all architects to grasp that vision of a better tomorrow and strive towards it.</p>
<p>My name is Basil Vandegriend, and I am an architect.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.basilv.com/psd/blog/2013/architects-anonymous/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://www.basilv.com/psd/blog/2013/architects-anonymous</feedburner:origLink></item>
		<item>
		<title>Exposure to Extremes</title>
		<link>http://feedproxy.google.com/~r/ProfessionalSoftwareDevelopment/~3/bi0A3qsDUwo/exposure-to-extremes</link>
		<comments>http://www.basilv.com/psd/blog/2013/exposure-to-extremes#comments</comments>
		<pubDate>Mon, 15 Apr 2013 13:49:53 +0000</pubDate>
		<dc:creator>Basil Vandegriend</dc:creator>
				<category><![CDATA[learning]]></category>
		<category><![CDATA[corporate culture]]></category>
		<category><![CDATA[hiring]]></category>
		<category><![CDATA[interview]]></category>

		<guid isPermaLink="false">http://www.basilv.com/psd/?p=877</guid>
		<description><![CDATA[I love being exposed to different ways of doing things, especially when they are extremes that provide a sharp contrast with standard, commonly-accepted methods. I deliberately search out such examples because I feel they provide great learning opportunities to reflect on the true principles underlying successful endeavors. If someone can be successful while doing the [...]]]></description>
			<content:encoded><![CDATA[<p>I love being exposed to different ways of doing things, especially when they are extremes that provide a sharp contrast with standard, commonly-accepted methods. I deliberately search out such examples because I feel they provide great learning opportunities to reflect on the true principles underlying successful endeavors. If someone can be successful while doing the complete opposite of what many would consider necessary for success, then this should prompt a reevaluation of what is truly a necessity.</p>
<p>I recently encountered an excellent example of this when I read the book <a href="http://www.amazon.ca/gp/product/B000PDYVXE/ref=as_li_qf_sp_asin_tl?ie=UTF8&#038;camp=15121&#038;creative=330641&#038;creativeASIN=B000PDYVXE&#038;linkCode=as2&#038;tag=basilvandegri-20">The Seven-Day Weekend</a><img src="http://www.assoc-amazon.ca/e/ir?t=basilvandegri-20&#038;l=as2&#038;o=15&#038;a=B000PDYVXE" width="1" height="1" border="0" alt="" style="border:none !important; margin:0px !important;" /> by Ricardo Semler. The subtitle "Changing the Way Work Works" highlights the theme of the book: how the author's company Semco has managed to be a success according to traditional financial measures (such as revenue growth and headcount growth) while using management practices completely at odds with standard North American business practices.</p>
<p>How extreme is it? Here, in Ricardo's words, is what Semco does <em>not</em> do:</p>
<blockquote><p>
Semco has no official structure. It has no organizational chart. There's no business plan or company strategy, no two-year or five-year plan, no goal or mission statement, no long-term budget. The company often does not have a fixed CEO. There are no vice presidents or chief officers for information technology or operations. There are no standards or practices. There's no human resources department. There are no career plans, no job descriptions or employee contracts. No one approves reports or expense accounts. Supervision or monitoring of workers is rare indeed. </p>
<p>Most important, success is not measured only in profit and growth.
</p></blockquote>
<p>(The Seven-Day Weekend, page 8)</p>
<p>So what does Semco do? Ricardo summarizes the company's philosophy as follows:</p>
<blockquote><p>
It's our lack of formal structure, our willingness to let workers follow their interests and their instincts when choosing jobs or projects. </p>
<p>It's our insistence that workers seek personal challenges and satisfaction before trying to meet the company goals.</p>
<p>It's our commitment to encouraging employees to ramble through their day or week so that they will meander into new ideas and new business opportunities.</p>
<p>It's our philosophy of embracing democracy and open communication, and inciting questions and dissent in the workplace.
</p></blockquote>
<p>(The Seven-Day Weekend, page 9)</p>
<p>So is Semco completely unstructured, free-for-all chaos? Not at all. Ricardo lists a number of deliberate structures or processes that help keep the company going in alignment with its core values. As a specific example let's look at Semco's hiring practices, which Ricardo describes starting on page 147. Who participates in the interviews? Anyone who is interested. But the process is surprisingly structured. A template is drafted listing qualities sought, along with a numerical weighting for each. Any employee can contribute feedback to the creation of this template. Basic, must-have qualifications are left off the template and covered by specific tests. Past experience and schooling are explicitly ignored after initial screening to avoid too much uniformity.</p>
<p>After the template is completed, a few interested employees volunteer to coordinate the interviews, which are collective affairs involving multiple candidates brought into a big room face-to-face with employees interested in interviewing. If no employees show up for the collective interview, then the position is eliminated because it demonstrates that no one at the company cares about it. After speaking with candidates, interviewers rank them numerically according to the template, including their general impression of whether each candidate is the right person for the job. After a number of rounds of interviews (depending on the size of the applicant pool and employee interest in interviewing), the winner is selected using the numerical scores from the completed templates. Management has a little input into the process, but the final decision is up to the interviewing group collectively, based on their ratings.</p>
<p>Ricardo talks a lot about self-management, which is the eleventh principle of the <a href="http://agilemanifesto.org/principles.html">Agile Manifesto</a> and is one of the core tenets of Scrum. Both Scrum and Semco seem like they cannot possibly work according to traditionally-minded managers, yet they do. If a command-and-control management style is not a necessity for success, and fails to fully tap into and uplift each person's spirit, then shouldn't this style be abandoned?</p>
<p>The notion of extremes is based on the idea of significant deviations from the norm. Since cultural, organizational, or personal change is hard, there is a tendency to be mired in current norms and practices and lack vision of alternatives. Studying extremes is one way to lift yourself out of the mire, however briefly, and look at different ways of doing things.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.basilv.com/psd/blog/2013/exposure-to-extremes/feed</wfw:commentRss>
		<slash:comments>2</slash:comments>
		<feedburner:origLink>http://www.basilv.com/psd/blog/2013/exposure-to-extremes</feedburner:origLink></item>
		<item>
		<title>Hierarchy of Advice</title>
		<link>http://feedproxy.google.com/~r/ProfessionalSoftwareDevelopment/~3/pJnDMSNyA-Q/hierarchy-of-advice</link>
		<comments>http://www.basilv.com/psd/blog/2013/hierarchy-of-advice#comments</comments>
		<pubDate>Mon, 18 Feb 2013 14:12:04 +0000</pubDate>
		<dc:creator>Basil Vandegriend</dc:creator>
				<category><![CDATA[professional]]></category>
		<category><![CDATA[advice]]></category>
		<category><![CDATA[context]]></category>

		<guid isPermaLink="false">http://www.basilv.com/psd/?p=871</guid>
		<description><![CDATA[As an architect over the years I have given a lot of advice, some of it even asked for :) Over time, my choice of words when providing advice has evolved into a very precise hierarchy. Random Thought At this lowest level this is not advice but merely thoughts that are thrown out with very [...]]]></description>
			<content:encoded><![CDATA[<p>As an architect over the years I have given a lot of advice, some of it even asked for :)  Over time, my choice of words when providing advice has evolved into a very precise hierarchy.</p>
<p><a href="http://www.basilv.com/psd/wp-content/uploads/2013/02/Hierarchy-of-Advice.png"><img src="http://www.basilv.com/psd/wp-content/uploads/2013/02/Hierarchy-of-Advice.png" alt="Hierarchy of Advice" title="Hierarchy of Advice" width="186" height="560" class="alignright size-full wp-image-872" /></a></p>
<h3>Random Thought</h3>
<p>At this lowest level this is not advice but merely thoughts that are thrown out with very little, if any, evaluation of whether there is any merit to them. Brainstorming is the perfect time to throw out random thoughts, since by design the objective is to be creative rather than critical. I sometimes relay a random thought to raise a point that is radical given the organizational context in order to cause some cognitive dissonance and stimulate thinking.</p>
<h3>Idea</h3>
<p>Ideas have some merit to them, at least theoretically, but have not been evaluated as to whether they would indeed be advantageous and worth pursuing.</p>
<p>I find it interesting that ideas are so low on the hierarchy - they really are not that useful in their raw form. This runs counter to common perception that ideas are valuable. In reality, it is the analysis of ideas that converts them to higher levels of advice which adds value, and this analysis typically needs to be grounded in the organizational context and awareness of the other options ('ideas') that are also applicable.</p>
<h3>Good Idea</h3>
<p>Good ideas have a lot of merit and are generally useful in a broad set of contexts, but have still not been evaluated. Often what people call <em>best practices</em> fall into this category because they blindly tout the practice without analyzing its applicability.</p>
<h3>Suggest</h3>
<p>A suggestion is what I consider the first level of real advice. I have evaluated the benefits and believe acting on the suggestion will have a positive outcome. Unlike the higher levels, however, I am not confident of a large return on investment. Compared to other competing options I see only a slight advantage.</p>
<h3>Recommend</h3>
<p>A recommendation has a clearly positive outcome that I am confident is superior to competing alternatives, and that I have determined has a good return on investment compared to other ideas. Doing the analysis to justify making a recommendation is not always easy: I have written previously about <a href="http://www.basilv.com/psd/blog/2012/the-difficulty-of-making-good-recommendations">the difficulty of making a good recommendation</a>.</p>
<h3>Highly Recommend</h3>
<p>I highly recommend something when it has a clearly superior return on investment and is one of the best options compared to other ideas. Since I seldom like to be completely dogmatic, I sometimes use the phrase "highly recommend" to describe something that I feel should absolutely, positively be done.</p>
<h3>Conclusion</h3>
<p>The underlying message behind this hierarchy of advice is that good advice is so much more than just touting so-called "best practices" - it is grounded in deep thinking: brainstorming options, understanding the organizational context, evaluating return on investment, and prioritizing.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.basilv.com/psd/blog/2013/hierarchy-of-advice/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://www.basilv.com/psd/blog/2013/hierarchy-of-advice</feedburner:origLink></item>
		<item>
		<title>Faster Builds via Concurrency</title>
		<link>http://feedproxy.google.com/~r/ProfessionalSoftwareDevelopment/~3/n-Mu_cFj8M0/faster-builds-via-concurrency</link>
		<comments>http://www.basilv.com/psd/blog/2013/faster-builds-via-concurrency#comments</comments>
		<pubDate>Sat, 12 Jan 2013 13:54:21 +0000</pubDate>
		<dc:creator>Basil Vandegriend</dc:creator>
				<category><![CDATA[tools]]></category>
		<category><![CDATA[automated build]]></category>
		<category><![CDATA[concurrency]]></category>
		<category><![CDATA[Java]]></category>
		<category><![CDATA[performance]]></category>
		<category><![CDATA[software development]]></category>

		<guid isPermaLink="false">http://www.basilv.com/psd/?p=866</guid>
		<description><![CDATA[Recently I have been looking for ways to make a Java build run faster. This is something I seem to do at least once a year, typically as a result of the application’s production code base and automated test suite both growing in size over time. The build had previously already been split into multiple [...]]]></description>
			<content:encoded><![CDATA[<p>Recently I have been looking for ways to make a Java build run faster. This is something I seem to do at least once a year, typically as a result of the application’s production code base and automated test suite both growing in size over time. The build had previously already been split into multiple phases with the slower integration tests running in a later phase. My efforts this time were focused on the initial build phase used by developers prior to pushing their changes to the rest of the team. Speed matters for a routine process such as this that is executed multiple times a day: ideally this build would run instantaneously, giving feedback to the developer immediately whether their change was good to go or not.</p>
<p>So how could this build be made faster? It bothered me that despite having modern multi-core CPUs, the build still executed as a single serial process. Surely massive speed improvements could be obtained by having the build run concurrently over multiple threads or processes? </p>
<h3>Parallel Checks</h3>
<p>Since the build script used <a href="http://ant.apache.org/">ANT</a>, my first attempted optimization was to use ANT's <em>parallel</em> task. I mapped out the major activities that the build needed to perform and the dependencies between them (which were already defined within the ANT build script). I then implemented as much of the work in parallel as possible, which ended up requiring a four-level nesting of parallel and sequential tasks. The resulting build failed to work properly - it refused to run the tasks in the lower level nested parallel task. So I simplified the build down to a single parallel task. This worked, most of the time. On rare occasions the build would hang for no apparent reason. However, the build time improved significantly, with just under a 50% reduction in the time required. Here is a simplified version of the ANT code:</p>
<pre class="prettyprint">
&lt;target name="parallel-build" depends="compile-unit-tests"&gt;
	&lt;parallel failonany="true"&gt;
	&lt;sequential&gt;
		&lt;unit-test/&gt;
	&lt;/sequential&gt;
	&lt;sequential&gt;
		&lt;findbugs/&gt;
	&lt;/sequential&gt;
	&lt;sequential&gt;
		&lt;pmd/&gt;
	&lt;/sequential&gt;
	&lt;sequential&gt;
		&lt;antcall target="compile-integration-tests"/&gt;
	&lt;/sequential&gt;
	&lt;/parallel&gt;
	&lt;antcall target="parallel-build-result"/&gt;
&lt;/target&gt;
</pre>
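<p>As a sketch of one possible guard against the occasional hang (an assumption, not something taken from the actual build): the parallel task accepts a <em>timeout</em> attribute in milliseconds and a <em>threadCount</em> attribute that caps the number of worker threads. The target name and both values below are purely illustrative.</p>
<pre class="prettyprint">
&lt;!-- Hypothetical variant of the parallel-build target; values are illustrative only --&gt;
&lt;target name="parallel-build-guarded" depends="compile-unit-tests"&gt;
	&lt;!-- timeout is in milliseconds: stop waiting after 15 minutes instead of hanging forever --&gt;
	&lt;parallel failonany="true" threadCount="4" timeout="900000"&gt;
	&lt;sequential&gt;
		&lt;unit-test/&gt;
	&lt;/sequential&gt;
	&lt;sequential&gt;
		&lt;findbugs/&gt;
	&lt;/sequential&gt;
	&lt;sequential&gt;
		&lt;pmd/&gt;
	&lt;/sequential&gt;
	&lt;/parallel&gt;
&lt;/target&gt;
</pre>
<p>A timeout does not explain why a hang happens, but it can turn an indefinitely stuck build into a visible failure that a developer can rerun.</p>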
<p>One interesting problem I ran into was that I wanted to address the frequent scenario where a developer runs this build on their workstation, prior to check-in. I wanted the build to summarize the result of the various checks (unit tests, FindBugs, and PMD) at the end, and ensure that all the checks had been performed instead of failing when the first problem was found. My initial use of the antcall task within each sequential task proved problematic as ANT prevents any property settings from being passed back from the child antcall. So I switched these invocations to ANT macros which let me set a property for each check that the final parallel-build-result target is able to use to provide a summary report. Here is a simplified version of the ANT code to accomplish this:</p>
<pre class="prettyprint">
&lt;target name="parallel-build-result"
	depends="unit-test-result, pmd-result, findbugs-result"/&gt;

&lt;target name="unit-test-result" if="unit.tests.failed"&gt;
	&lt;antcall target="unit-test-html-report"/&gt;
	&lt;echo message="Unit tests failed!"/&gt;
&lt;/target&gt;

&lt;!--
pmd-result and findbugs-result targets omitted
as they are very similar to the unit-test-result target
--&gt;
 </pre>
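<p>The macro definitions themselves are not shown above. As a rough sketch of what such a macro might look like, assuming the standard JUnit task is used: the macro body runs in the calling project, so a property set there stays visible to later targets, unlike one set inside an antcall. The names test.classpath, test.classes.dir and test.report.dir are placeholders, not names from the real build.</p>
<pre class="prettyprint">
&lt;!-- Hypothetical definition of the unit-test macro; paths and property names are placeholders --&gt;
&lt;macrodef name="unit-test"&gt;
	&lt;sequential&gt;
		&lt;!-- Do not halt on failure: record it in a property so the other checks still run --&gt;
		&lt;junit fork="true" haltonfailure="false"
			failureproperty="unit.tests.failed"
			errorproperty="unit.tests.failed"&gt;
			&lt;classpath refid="test.classpath"/&gt;
			&lt;formatter type="xml"/&gt;
			&lt;batchtest todir="${test.report.dir}"&gt;
				&lt;fileset dir="${test.classes.dir}" includes="**/*Test.class"/&gt;
			&lt;/batchtest&gt;
		&lt;/junit&gt;
	&lt;/sequential&gt;
&lt;/macrodef&gt;
</pre>
<p>With a definition along these lines, the unit-test-result target's <em>if="unit.tests.failed"</em> condition fires only when at least one test failed or errored.</p>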
<h3>Parallel Tests</h3>
<p>Analyzing the performance of this new parallel build quickly revealed that it was the automated unit tests that were the bottleneck. Since unit tests are designed to run independently from one another, I was optimistic that I could run the unit tests in a highly parallel fashion and obtain further significant reductions in build time. My expectation was that it should be possible to accomplish this by simple changes to the build script without having to change individual unit tests or manually partition the tests into multiple suites.</p>
<p>When I investigated the options for accomplishing this via ANT and JUnit, I discovered, to my surprise, that this was not explicitly supported. The easiest option was running multiple tests within a single test class concurrently, which is not what I wanted and required modifying existing test code. I found experimental support in JUnit for running tests in parallel, but this was not easily invoked via ANT.</p>
<p>Having hit a brick wall, I considered other technologies. I discovered that TestNG (test framework alternative to JUnit), Maven (build tool), and Gradle (build tool) all provide built-in support for running test suites in parallel. Since I have played with Gradle in the past and like it as an up-and-coming build tool, I decided to give it a try. </p>
<p>The challenge with Gradle was getting it set up and working to replicate enough of the pre-existing ANT build in order to run the tests. Once the tests were running, making them run in parallel was extremely simple, as the following Gradle code snippet shows:</p>
<pre class="prettyprint">
test {
	maxParallelForks = 3

}
</pre>
<p>However, the performance gains turned out to be far poorer than I was expecting. I observed a 20% improvement only when maxParallelForks was set to three - at other values like 2 or 4, performance was the same or worse than the non-concurrent version. I did run these tests on a multi-core workstation, so CPU was not the limiting factor. What was going on? Why didn't I see something much closer to a threefold speed-up?</p>
<p>Further investigation revealed that others have <a href="http://incodewetrustinc.blogspot.ca/2010/01/run-your-junit-tests-concurrently-with.html">encountered similar results</a> when running unit tests. Running tests with significant I/O, such as integration-style tests or web tests, in parallel can yield significant performance improvements: while one test is stalled waiting for I/O, others can still run. Unit tests, however, typically have no such I/O waits.</p>
<p>Any parallelization effort must factor in the additional effort to launch new concurrent executions. In the case of Gradle, this means spinning up a new Java virtual machine and loading all the necessary classes. This is a lot of additional work duplicated across the parallel executions, which reduces the efficiency of parallelization. (Other technologies, such as Maven, use threads in the same VM to parallelize testing and thus have less start-up overhead, but a much higher risk of contention.) In the case of the unit test suite I was working with, we use a shared Spring test context for some of the tests. As a singleton test fixture it is only set up once for all the tests when run sequentially, but when running concurrently in multiple VMs it needs to be initialized for each VM. While our unit test suite is fairly large (~5000 tests), it is also quite fast, so all the start-up overhead quickly dwarfs the speed increase from running in parallel. Given all this, I am fortunate to have seen any performance improvement at all.</p>
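<p>The overhead argument above can be made concrete with a rough back-of-the-envelope model in Java. This is only a sketch: the class and method names are made up for illustration, the timings are hypothetical rather than measurements from the build described here, and it assumes every fork pays the same fixed start-up cost (JVM launch, class loading, test context) while the test work divides evenly. It also ignores contention for disk, memory, and cores.</p>
<pre class="prettyprint">
public class ParallelTestEstimate {

    // Estimated wall-clock seconds when the suite is split evenly across forks.
    // Each fork pays the full start-up cost; only the test work is divided.
    static double estimateSeconds(double testSeconds, double startupSeconds, int forks) {
        return startupSeconds + testSeconds / forks;
    }

    public static void main(String[] args) {
        double testSeconds = 30.0;    // hypothetical: time spent executing tests
        double startupSeconds = 30.0; // hypothetical: per-fork start-up cost
        double baseline = estimateSeconds(testSeconds, startupSeconds, 1);
        for (int forks : new int[] {1, 2, 3, 4}) {
            double total = estimateSeconds(testSeconds, startupSeconds, forks);
            System.out.printf("forks=%d estimated=%.1fs speedup=%.2fx%n",
                    forks, total, baseline / total);
        }
    }
}
</pre>
<p>With these made-up numbers, three forks take an estimated 40 seconds against 60 seconds for a single fork, a speed-up of 1.5x rather than 3x. The faster the tests are relative to the start-up cost, the smaller the gain from adding forks.</p>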
<h3>Conclusion</h3>
<p>Overall I am not satisfied with the current level of support for concurrency in Java build and test technologies. Some of the features I would like to see are:</p>
<ul>
<li>Build scripts already specify dependencies between tasks, so the build tool should be able to run unrelated tasks in parallel without a developer having to spell it out explicitly.</li>
<li>All tools involved in builds such as compilers, static code analysis, and test frameworks should take advantage of concurrency when possible when doing their own work.</li>
<li>Testing frameworks should support running test suites concurrently via a combination of multiple threads and multiple VMs, to allow developers control over the level of separation vs overhead.</li>
</ul>
<p>Given the trend towards more cores rather than faster clock speeds, I believe the use of concurrency in builds is inevitable.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.basilv.com/psd/blog/2013/faster-builds-via-concurrency/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://www.basilv.com/psd/blog/2013/faster-builds-via-concurrency</feedburner:origLink></item>
		<item>
		<title>Scaling Up From One Developer</title>
		<link>http://feedproxy.google.com/~r/ProfessionalSoftwareDevelopment/~3/NMJ9PBk_TAQ/scaling-up-from-one-developer</link>
		<comments>http://www.basilv.com/psd/blog/2012/scaling-up-from-one-developer#comments</comments>
		<pubDate>Mon, 15 Oct 2012 13:03:29 +0000</pubDate>
		<dc:creator>Basil Vandegriend</dc:creator>
				<category><![CDATA[agile]]></category>
		<category><![CDATA[definition of done]]></category>
		<category><![CDATA[maintenance]]></category>
		<category><![CDATA[process]]></category>
		<category><![CDATA[Scrum]]></category>
		<category><![CDATA[software development]]></category>
		<category><![CDATA[team]]></category>

		<guid isPermaLink="false">http://www.basilv.com/psd/?p=863</guid>
		<description><![CDATA[I have noticed a common problem afflicting small development teams formed to make significant enhancements to an application that was previously maintained by just one developer. Both the original maintenance developer and their management are accustomed to essentially solo development and this culture spills into the enhancement work. Development is treated as individual efforts rather [...]]]></description>
			<content:encoded><![CDATA[<p>I have noticed a common problem afflicting small development teams formed to make significant enhancements to an application that was previously maintained by just one developer. Both the original maintenance developer and their management are accustomed to essentially solo development and this culture spills into the enhancement work. Development is treated as individual efforts rather than a combined team, which results in issues like the following:</p>
<ul>
<li>Highly variable quality - some of it very poor - in delivered work.</li>
<li>Budget and schedule overruns due to inaccurate estimates and lack of scope control.</li>
<li>Communication breakdowns.</li>
<li>Insufficient monitoring and risk mitigation by management.</li>
</ul>
<p>These issues are all symptoms of the underlying problem of failing to adjust development practices to scale up from one developer to a team. One developer working alone on minor enhancements to an application can typically get by on their own ability, without explicit use of any development practices. The enhancements are often simple enough to fall within the <a href="http://www.basilv.com/psd/blog/2010/minimum-and-optimal-thresholds-of-competence">developer’s zone of competence</a> so issues are infrequent. Tackling a larger enhancement with a small team of three or more developers introduces the following scaling factors:</p>
<ul>
<li>Higher complexity of the changes (in addition to the larger magnitude of change).</li>
<li>Higher likelihood of requirement changes, additions, and refinements by the business.</li>
<li>Communication and coordination among the developers.</li>
<li>Dealing with variations in practices or abilities between the developers.</li>
<li>Ramp up needed for new developers to become familiar with the application and the business domain.</li>
<li>Less flexibility to recover from issues due to larger budgets, longer schedules, more people, and more management visibility.</li>
</ul>
<p>So what can these teams do to successfully scale? One idea that I often hear mentioned is to turn the work into a formal project and assign a project manager. However, this is seldom a viable option due to considerations like the following:</p>
<ul>
<li>Starting a formal project within the organization is a lengthy endeavor that the business wants to avoid since they want their changes made as soon as possible.</li>
<li>Adding a full-time project manager to a team of three developers represents too much overhead that the business (or I.T., depending on the organization) does not want to pay for.</li>
<li>The I.T. organization does not have sufficiently skilled project managers to staff these 'mini' projects full-time. (I have been burned by overly-junior project managers making things worse, rather than better.)</li>
</ul>
<p>So what options are available to teams needing to scale up? Based on some recent experiences, I believe that the <a href="http://www.scrumalliance.org/">Scrum</a> development method has a number of ways it can help. And I am not even talking about fully adopting Scrum (although that is an option). </p>
<p>I will start with the role of ScrumMaster which I have explored in a recent article titled <a href="http://www.basilv.com/psd/blog/2012/the-mindset-of-a-scrummaster">The Mindset of a ScrumMaster</a>. The role of the ScrumMaster is to help grow a high performance team. Some people view ScrumMasters as process coaches, which is part of what they do, but they are really team coaches. For small teams, ScrumMasters can help on a part-time basis, which helps mitigate concerns about high budget overhead or staffing. And rather than trying to directly manage the work as a traditional project manager would, a ScrumMaster helps the team learn how to self-manage, providing guidance when necessary. What exactly would the ScrumMaster help the team with? This is where some of the practices used in Scrum can help deal with scaling factors.</p>
<p>The first Scrum practice that can help a team scale is the daily standup, which develops communication between developers (which is surprisingly hard if they are used to working alone) and helps focus their efforts. The next Scrum practice that can help a team scale is a task board, which helps visualize tasks queued in the backlog versus in progress versus done. This provides a focus for the daily standup discussions and a basic structure for the team to track their current progress. To forecast future progress, the next Scrum practice that can help is the use of velocity and burndown charts to predict when milestone targets will be achieved. These practices help with scaling factors relating to communication and monitoring, but do not really help with quality issues. For that, the Scrum practice of <a href="http://www.basilv.com/psd/blog/2009/why-you-need-a-definition-of-done">definition of done</a> is helpful, although the team will likely need coaching in how to adapt the way they work in order to achieve what they have defined as done.</p>
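<p>The velocity-based forecast mentioned above boils down to simple arithmetic. Here is a minimal sketch in Java, assuming velocity is taken as the mean of the story points completed in past sprints and that at least one sprint of history exists. The class name, method names, and numbers are all hypothetical.</p>
<pre class="prettyprint">
public class VelocityForecast {

    // Average story points completed per sprint over the observed history.
    // Assumes the history array is non-empty.
    static double averageVelocity(int[] completedPointsPerSprint) {
        int sum = 0;
        for (int points : completedPointsPerSprint) {
            sum += points;
        }
        return (double) sum / completedPointsPerSprint.length;
    }

    // Whole sprints needed to burn down the remaining backlog at this velocity.
    static int sprintsRemaining(int remainingPoints, double velocity) {
        return (int) Math.ceil(remainingPoints / velocity);
    }

    public static void main(String[] args) {
        int[] history = {8, 12, 10}; // hypothetical points completed in past sprints
        double velocity = averageVelocity(history);
        System.out.println("Average velocity: " + velocity);
        System.out.println("Sprints remaining: " + sprintsRemaining(45, velocity));
    }
}
</pre>
<p>With this example history the average velocity is 10 points per sprint, so a 45-point backlog needs five more sprints. A burndown chart plots the same remaining-points figure sprint by sprint.</p>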
<p>The combination of these four practices is synergistic, imposes limited overhead, and is simple enough that developers can learn to do these practices on their own with some coaching from a ScrumMaster. Small teams of developers using these practices can therefore significantly mitigate the scaling factors that would otherwise cause issues.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.basilv.com/psd/blog/2012/scaling-up-from-one-developer/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://www.basilv.com/psd/blog/2012/scaling-up-from-one-developer</feedburner:origLink></item>
		<item>
		<title>Deficient Outage Communication by the Alberta Government</title>
		<link>http://feedproxy.google.com/~r/ProfessionalSoftwareDevelopment/~3/RVWM-cEQRxw/deficient-outage-communication-by-the-alberta-government</link>
		<comments>http://www.basilv.com/psd/blog/2012/deficient-outage-communication-by-the-alberta-government#comments</comments>
		<pubDate>Mon, 01 Oct 2012 12:58:10 +0000</pubDate>
		<dc:creator>Basil Vandegriend</dc:creator>
				<category><![CDATA[professional]]></category>
		<category><![CDATA[communication]]></category>
		<category><![CDATA[operations]]></category>
		<category><![CDATA[outage]]></category>

		<guid isPermaLink="false">http://www.basilv.com/psd/?p=854</guid>
		<description><![CDATA[Citizens of the province of Alberta, Canada experienced a rare event this summer when a major data center in the City of Calgary experienced a multi-day outage due to a fire. Many I.T. services provided by the Government of Alberta and taxpayer-funded organizations such as Alberta Health Services were hosted in this data center and [...]]]></description>
			<content:encoded><![CDATA[<p>Citizens of the province of Alberta, Canada experienced a rare event this summer when a major data center in the City of Calgary experienced a multi-day outage due to a fire. Many I.T. services provided by the Government of Alberta and taxpayer-funded organizations such as Alberta Health Services were hosted in this data center and thus suffered outages.</p>
<p>I have written previously about the <a href="http://www.basilv.com/psd/blog/2012/avoiding-outrage-over-outages">importance of good communication in avoiding outrage over outages</a>. In that article I decomposed outage communication into two pieces: first, reporting on the incident, and second reporting on the underlying problems or root causes. Today I want to focus on the latter category of problem communication and highlight major deficiencies in the relevant news releases published by the Alberta Government concerning this event.</p>
<p>To start, let us look at an example of excellent problem communication - <a href="https://aws.amazon.com/message/67457/">Amazon's report concerning their Eastern U.S. data center outage</a>, which occurred only a few weeks prior to the Alberta outage. The first thing to notice about this communication is that it is long - approximately 2700 words. The report starts by describing in detail the events leading up to the outage. Next Amazon provides their analysis of the faulty hardware and the steps they are taking in both the short and long term to address the issues they experienced with the hardware. Amazon then lists the various services provided by this data center and summarizes the impact to customers. This is a preamble to the rest of the report, which then goes through each service one by one, describing the timeline and events leading to restoration of the service. As part of this detailed account, Amazon describes various problems they encountered such as software bugs or design flaws and reports on what they are planning to do to correct or improve the situation. In aggregate across these service sections, Amazon provides details on two software bugs, two design flaws, and two other opportunities for improvement, and indicates they have work underway to address all of these issues. Amazon then concludes with an apology for the disruption and a commitment to learn from and make improvements based on this outage. Based on this detailed analysis, one might expect that Amazon needed a lot of time to prepare this report. However, surprisingly, the timeline reported by Amazon indicates otherwise. The data center outage occurred on a Friday night at 8:04pm. Restoration of services continued into Saturday morning. The staff working on the various issues likely worked all Friday night and either would have been exhausted from working an extended day, or would be coming off an eight-hour shift if they started just before the outage.
Plus it was the weekend, which I presume would lead to lower than normal staffing levels. Yet despite this, Amazon was able to come out with this detailed problem communication on Monday, immediately after this weekend event. To me, this is one of the most notable aspects because it highlights how quickly Amazon was able to do this thorough analysis involving multiple specialties / technologies (hardware plus each of the four services), pull it all together, and publish it only two days later despite the weekend. This says a lot about how seriously they took this situation.</p>
<p>Now let us examine what was published by the Alberta government for problem communication regarding the Calgary data center outage, and we will see the stark contrast. The government published four news releases regarding the outage: <a href="http://alberta.ca/NewsFrame.cfm?ReleaseID=/acn/201207/326577C38850F-D29F-736C-9B07029AC8DB778B.html">July 12</a>, <a href="http://alberta.ca/NewsFrame.cfm?ReleaseID=/acn/201207/32669813483BA-E32C-55AD-2D347BE751C49A3C.html">July 13</a>, <a href="http://alberta.ca/NewsFrame.cfm?ReleaseID=/acn/201207/326778CEDB568-E1B8-1D5F-3B8428E4A872B8EF.html">July 15</a>, and <a href="http://alberta.ca/NewsFrame.cfm?ReleaseID=/acn/201207/3268394F6A649-B7AA-163D-DC1DB8AB5874305F.html">July 17</a>. This last July 17 news release reported that all services were "now fully restored", so one would expect to see problem communication provided within this release. And in fact this is the case. Dissecting this news release paragraph by paragraph reveals the following:</p>
<ol>
<li>The first paragraph states that the remaining services have been restored. No issues here.
</li>
<li>The second paragraph contains an acknowledgement of impact to Albertans, using phrases like "... a frustrating inconvenience for many Albertans", and "I appreciate their patience and understanding...". This should have been an apology. In fact, reviewing all four news releases indicates that the government never once apologized for the disruption. In contrast here is the first sentence of Amazon's apology: "We apologize for the inconvenience and trouble this caused for affected customers." Reading the Amazon apology paragraph, I am left with the impression that they truly do care about their customers, take full responsibility for what happened, and are committed to improving. The impression I get from the government statement is quite different: they avoid accepting any responsibility.
</li>
<li>The third paragraph discusses the post-incident ramifications regarding temporary provisions the government put in place. This is excellent - no issues.
</li>
<li>
The fourth and final paragraph dashes our hopes for further details as it is only three sentences long. The first two sentences describe the initial incident and timeline to restoration, but provide absolutely no root cause analysis or indications of areas to improve, unlike the Amazon report. The government's last sentence talks about learning and improving from the event, which I was happy to see. However, a closer look at the language used compared to Amazon's reveals more disappointing contrasts. Here is a key sentence from Amazon, with emphasis added, "We will spend <em>many hours</em> over the coming days and weeks <em>improving our understanding</em> of the details of the various parts of this event and determining how to make <em>further changes</em> to improve our services and processes.". And here are the government statements, again with emphasis added (first sentence from the second paragraph, second sentence from the last paragraph) "Now we are focusing on what we <em>can learn</em> from this situation to improve our systems going forward.... Now that all the systems are restored, government will take the time to internally assess what happened and make <em>improvements if necessary</em>." Reading Amazon's statement, they give a concrete commitment to take action to improve in the short-term, and state that they will do <em>more</em> learning and improvement, which is impressive given how many details they had already determined and communicated. In contrast, the government's statements suggest that no learning has yet taken place and that improvements might not happen. Yet from examining the series of news releases it is apparent to those with an I.T. background that there are many opportunities for improvement. I have identified the following just from analyzing the information publicly available from the government's news releases:</p>
<ul>
<li>Services using mirrored data were restored more quickly than those having to recover from tape backups (see the July 13 and July 15 news releases). Why weren't all services using mirroring, in particular services like land titles, one of the last services restored, whose disruption had a much more significant impact on Albertans than some of the other services?
</li>
<li>Why did the land titles system and motor vehicle registry take two extra days to restore (see July 15 and July 17 news releases)? These were fairly critical services compared to others restored sooner such as fishing and hunting license sales that are clearly lower priority. This suggests that something went wrong in the attempt to restore these services.
</li>
<li>Even for services using mirrored data, it took at least one and a half days to report that these services were back up. (I cannot be more precise as to the time interval because the government news releases specify only dates and no times, unlike the Amazon report.) Perhaps some mirrored systems did fail over immediately and never suffered a service disruption and thus were never reported on - I cannot tell from the news releases. But for these mirrored services that were disrupted, there must be improvements that can be made to the time required to fail over.
</li>
<li>The government's I.T. disaster recovery plan certainly can be improved since any real disaster such as this one will provide lessons learned above and beyond what regular disaster recovery testing will identify.
</li>
</ul>
</li>
</ol>
<p>One potential critique of my use of Amazon to contrast with the Alberta government is that Amazon is a large, world-class organization specializing in providing I.T. services (as well as selling items online). Perhaps one cannot expect the Alberta government to have the same level of I.T. expertise. Fortunately for me, the government itself negated this critique by stating multiple times that they are using IBM as their service provider, and in particular making statements like "We will continue to work with our partner IBM, a world leader in information technology,..." and "A dedicated, broad team of IBM experts will continue to work non-stop..." (from the July 15 news release). So in my view, the government has no excuse for their poor communication.</p>
<p>As a citizen and taxpayer of Alberta, I conclude by calling on <em>my</em> government to take accountability and step up by providing proper problem communication detailing what has been learned regarding this outage and what improvements have and will be made.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.basilv.com/psd/blog/2012/deficient-outage-communication-by-the-alberta-government/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://www.basilv.com/psd/blog/2012/deficient-outage-communication-by-the-alberta-government</feedburner:origLink></item>
		<item>
		<title>Avoiding Outrage Over Outages</title>
		<link>http://feedproxy.google.com/~r/ProfessionalSoftwareDevelopment/~3/I_Y-Hg8HP70/avoiding-outrage-over-outages</link>
		<comments>http://www.basilv.com/psd/blog/2012/avoiding-outrage-over-outages#comments</comments>
		<pubDate>Tue, 25 Sep 2012 13:46:35 +0000</pubDate>
		<dc:creator>Basil Vandegriend</dc:creator>
				<category><![CDATA[professional]]></category>
		<category><![CDATA[communication]]></category>
		<category><![CDATA[operations]]></category>
		<category><![CDATA[outage]]></category>

		<guid isPermaLink="false">http://www.basilv.com/psd/?p=848</guid>
		<description><![CDATA[When outages - also known as service disruptions - happen to I.T. services, the response from the organization providing the service is the most important factor in determining the level of outrage felt by consumers of the service. Responses can be grouped into three main categories inspired by ITIL: Incident Communication: The information provided to [...]]]></description>
			<content:encoded><![CDATA[<p>When outages - also known as service disruptions - happen to I.T. services, the response from the organization providing the service is the most important factor in determining the level of outrage felt by consumers of the service. Responses can be grouped into three main categories inspired by <a href="http://en.wikipedia.org/wiki/Information_Technology_Infrastructure_Library">ITIL</a>:</p>
<ol>
<li><em>Incident Communication</em>: The information provided to consumers during the outage. Outrage is reduced by apologizing to consumers, acknowledging that the issue exists, confirming that staff are working on a resolution as a top priority, and communicating what steps have been and will be taken to mitigate or resolve the outage. For a very blunt discussion on how to word communications about outages, see <a href="http://37signals.com/svn/posts/1528-the-bullshit-of-outage-language">this article by 37 Signals</a>.
</li>
<li><em>Problem Communication</em>: The information provided to consumers after the outage is resolved about the underlying problems or root causes of the outage. This communication should repeat the apology to consumers, explain why the outage occurred, and identify the actions underway to prevent such outages in the future. This helps reconfirm to consumers that the issue has been taken seriously, provides reassurance that it will not reoccur in the future, and helps bring a sense of closure regarding the incident. For an excellent example of this type of communication, see <a href="https://aws.amazon.com/message/67457/">Amazon's communication regarding a data center outage</a>.
</li>
<li><em>Problem Resolution</em>: The actual resolution of underlying problems and causes of outages. This helps avoid what I call future outrage by improving the availability / reliability of services so that outages, and thus outrage, are less common. Skipping this step leads eventually to disillusionment and then rejection by consumers since in the long term they judge organizations by what they do rather than what they say.
</li>
</ol>
<p>This may sound easy to do, but it is amazing how many organizations fall dramatically short in one or more of these areas. I plan to write a follow-up post dissecting some case studies. In the meantime, feel free to post comments providing examples - both good and bad - of organizations dealing with outages.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.basilv.com/psd/blog/2012/avoiding-outrage-over-outages/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://www.basilv.com/psd/blog/2012/avoiding-outrage-over-outages</feedburner:origLink></item>
		<item>
		<title>The Difficulty of Making Good Recommendations</title>
		<link>http://feedproxy.google.com/~r/ProfessionalSoftwareDevelopment/~3/ABciECDTqcg/the-difficulty-of-making-good-recommendations</link>
		<comments>http://www.basilv.com/psd/blog/2012/the-difficulty-of-making-good-recommendations#comments</comments>
		<pubDate>Tue, 04 Sep 2012 13:06:27 +0000</pubDate>
		<dc:creator>Basil Vandegriend</dc:creator>
				<category><![CDATA[professional]]></category>
		<category><![CDATA[analysis]]></category>
		<category><![CDATA[architecture]]></category>
		<category><![CDATA[continuous improvement]]></category>

		<guid isPermaLink="false">http://www.basilv.com/psd/?p=845</guid>
		<description><![CDATA[Recently I have been doing an architectural assessment of an application for an organization I have not worked with before. As I have been writing up my findings, I have noticed that the portion of my analysis causing the most difficulty for me is in coming up with recommendations. Why is this? In this context [...]]]></description>
			<content:encoded><![CDATA[<p>Recently I have been doing an architectural assessment of an application for an organization I have not worked with before. As I have been writing up my findings, I have noticed that the portion of my analysis causing the most difficulty for me is in coming up with recommendations. Why is this? </p>
<p>In this context the word "recommendation" can be defined as "to advise as to the best course or choice". The process of analysis to arrive at a recommendation can be decomposed into the following three steps:</p>
<ol>
<li><em>Observation</em>: This first step involves learning about the application, the business need it fills, and the broader organization. A variety of methods need to be used to gather all the required information such as interviewing members of I.T. and business, examining documentation, and reviewing application code.
</li>
<li><em>Gap Analysis</em>: Once the current state is known, this step involves comparing it with the ideal state. Every deviation from the norm or from the optimal is a potential opportunity for improvement.
</li>
<li><em>Evaluation</em>: Each identified improvement is then evaluated. First a cost-benefit analysis must be done to determine whether it is even worthwhile to action the improvement. Second, a prioritization must be done to determine which are the highest priority and value to proceed with first. This very last part is what identifies "the best course or choice", which is the actual recommendation.
</li>
</ol>
<p>It is the <em>Evaluation</em> step that has been causing me the difficulty, especially the cost-benefit analysis of each improvement idea. Why would this be? Upon initial reflection it seems like the benefits side is easy to determine: it is often clear what non-functional characteristics or process would improve for a given idea. But evaluating the impact of this improvement is harder. For example, consider an improvement to the maintainability of a portion of the code base. What impact will this have? This depends in large part on how much that code base will need to change in the future. What business drivers might lead to change? How is this portion of code used by or interacting with the remainder of the application? Performing this analysis requires deep and broad knowledge about the application and the organization. And as I stated at the start, I am new to the organization and application. </p>
<p>Therefore fundamentally my difficulty with coming up with recommendations is a function of lack of contextual knowledge. Ironically this can be an asset during the <em>Observation</em> and <em>Gap Analysis</em> steps, since I am examining what is in place with a fresh pair of eyes. So is it worthwhile for me to perform the <em>Evaluation</em> step? There is the temptation to avoid the difficult thinking and just present various suggestions for improvement. While there is value in each of the first two steps on their own, the final step of making recommendations completes the value stream, and thus I believe is the most valuable. It turns a list of observations or improvement ideas into a prioritized, actionable list. I often see this as a major challenge for teams doing continuous improvement: lots of talk about the current state and ideas to improve, but no clear identification and actioning of the top priorities.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.basilv.com/psd/blog/2012/the-difficulty-of-making-good-recommendations/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://www.basilv.com/psd/blog/2012/the-difficulty-of-making-good-recommendations</feedburner:origLink></item>
		<item>
		<title>The Cost of Poor Quality</title>
		<link>http://feedproxy.google.com/~r/ProfessionalSoftwareDevelopment/~3/hhRnaTUPUoU/the-cost-of-poor-quality</link>
		<comments>http://www.basilv.com/psd/blog/2012/the-cost-of-poor-quality#comments</comments>
		<pubDate>Tue, 28 Aug 2012 12:51:43 +0000</pubDate>
		<dc:creator>Basil Vandegriend</dc:creator>
				<category><![CDATA[quality]]></category>
		<category><![CDATA[IT industry]]></category>

		<guid isPermaLink="false">http://www.basilv.com/psd/?p=839</guid>
		<description><![CDATA[Given my interest (some say obsession :) with producing high-quality software, I am always on the lookout for stories highlighting the need for quality. So I was intrigued to hear a few weeks ago that the U.S. global financial services firm Knight Capital Group lost over $440 million in under one hour [...]]]></description>
			<content:encoded><![CDATA[<p>Given my interest (some say obsession :) with producing high-quality software, I am always on the lookout for stories highlighting the need for quality. So I was intrigued to hear a few weeks ago that the U.S. global financial services firm <a href="http://www.knight.com/index.asp">Knight Capital Group</a> lost over $440 million in under one hour of trading due to a critical defect in newly installed trading software.</p>
<p>As bad as this sounds already, the broader ramifications were worse. This loss represented nearly four times Knight's 2011 profit, and instantaneously depleted Knight's capital, pushing it to the brink of bankruptcy. Since Knight is one of the largest providers of market making services in the U.S., Wall Street essentially was forced to bail out the company to avoid a potential financial disaster in the markets. The New York Stock Exchange even had to formally submit a rules change to the U.S. Securities and Exchange Commission to permit these emergency bailout actions. Existing shareholders of Knight experienced a bloodbath: their shares were diluted roughly 70% and the share price dropped to roughly one-quarter of its original value, as shown in the diagram below.</p>
<p><a href="http://www.basilv.com/psd/wp-content/uploads/2012/08/KnightStockPrice.png"><img src="http://www.basilv.com/psd/wp-content/uploads/2012/08/KnightStockPrice-300x148.png" alt="" title="Knight&#039;s Stock Price Drop" width="300" height="148" class="alignright size-medium wp-image-840" /></a></p>
<p>Knight's story highlights the trend of software becoming more and more critical to organizations. David Kirkpatrick of Forbes magazine coined the phrase "<a href="http://www.forbes.com/sites/techonomy/2011/11/30/now-every-company-is-a-software-company/">Now every company is a software company</a>" to represent this new reality. This means that the impact of poor quality software on organizations is greatly magnified. Alistair Cockburn has defined <a href="http://alistair.cockburn.us/Cockburn+Scale/v/slim">a scale of software criticality</a> comprising the following levels: Life, Essential money, Discretionary money, and Comfort. Alistair uses this scale (along with other factors) to tailor the method used to develop software.</p>
<p>Like Knight, many organizations now have software at the Essential money level of criticality, but have not adjusted their processes to compensate. I suspect that they will only come to appreciate the need for high quality after experiencing the high cost of poor quality.</p>
<p>References regarding Knight's software defect:</p>
<ul>
<li>Knight press releases covering the event: <a href="http://www.knight.com/investorRelations/pressReleases.asp?compid=105070&#038;releaseID=1721599">August 2, 2012</a> and <a href="http://www.knight.com/investorRelations/pressReleases.asp?compid=105070&#038;releaseID=1722656">August 6, 2012</a>
</li>
<li><a href="http://www.tecca.com/news/2012/08/03/trading-software-algorithm-glitch/">News article summarizing the impact</a>
</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.basilv.com/psd/blog/2012/the-cost-of-poor-quality/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://www.basilv.com/psd/blog/2012/the-cost-of-poor-quality</feedburner:origLink></item>
	</channel>
</rss>
