<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/rss2full.xsl"?><?xml-stylesheet type="text/css" media="screen" href="http://feeds.feedburner.com/~d/styles/itemcontent.css"?><rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0" version="2.0">

<channel>
	<title>Mark Needham</title>
	
	<link>http://www.markhneedham.com/blog</link>
	<description>Thoughts on Software Development</description>
	<lastBuildDate>Sun, 19 May 2013 22:45:10 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.5.1</generator>
		<atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" type="application/rss+xml" href="http://feeds.feedburner.com/MarkNeedham" /><feedburner:info uri="markneedham" /><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com/" /><feedburner:emailServiceId>MarkNeedham</feedburner:emailServiceId><feedburner:feedburnerHostname>http://feedburner.google.com</feedburner:feedburnerHostname><item>
		<title>Ruby/Python: Constructing a taxonomy from an array using zip</title>
		<link>http://feedproxy.google.com/~r/MarkNeedham/~3/GTlJllUEYus/</link>
		<comments>http://www.markhneedham.com/blog/2013/05/19/rubypython-constructing-a-taxonomy-from-an-array-using-zip/#comments</comments>
		<pubDate>Sun, 19 May 2013 22:44:40 +0000</pubDate>
		<dc:creator>Mark Needham</dc:creator>
				<category><![CDATA[Python]]></category>
		<category><![CDATA[Ruby]]></category>
		<category><![CDATA[python]]></category>

		<guid isPermaLink="false">http://www.markhneedham.com/blog/?p=5337</guid>
		<description><![CDATA[As I mentioned in my previous blog post I&#8217;ve been hacking on a product taxonomy and I wanted to create a &#8216;CHILD&#8217; relationship between a collection of categories. For example, I had the following array and I wanted to transform it into an array of &#8216;SubCategory, Category&#8217; pairs: taxonomy = &#91;&#34;Cat&#34;, &#34;SubCat&#34;, &#34;SubSubCat&#34;&#93; # I [...]]]></description>
				<content:encoded><![CDATA[<p>As I <a href="http://www.markhneedham.com/blog/2013/05/19/neo4jcypher-keep-longest-path-when-finding-taxonomy/">mentioned in my previous blog post</a> I&#8217;ve been hacking on a product taxonomy and I wanted to create a &#8216;CHILD&#8217; relationship between a collection of categories.</p>
<p>For example, I had the following array and I wanted to transform it into an array of &#8216;SubCategory, Category&#8217; pairs:</p>

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;">taxonomy = <span style="color: black;">&#91;</span><span style="color: #483d8b;">&quot;Cat&quot;</span>, <span style="color: #483d8b;">&quot;SubCat&quot;</span>, <span style="color: #483d8b;">&quot;SubSubCat&quot;</span><span style="color: black;">&#93;</span>
<span style="color: #808080; font-style: italic;"># I wanted this to become [(&quot;Cat&quot;, &quot;SubCat&quot;), (&quot;SubCat&quot;, &quot;SubSubCat&quot;)]</span></pre></div></div>

<p>In order to do this we need to zip all but the last item with all but the first, which I found reasonably easy to do using Python:</p>

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #008000;">zip</span><span style="color: black;">&#40;</span>taxonomy<span style="color: black;">&#91;</span>:-<span style="color: #ff4500;">1</span><span style="color: black;">&#93;</span>, taxonomy<span style="color: black;">&#91;</span><span style="color: #ff4500;">1</span>:<span style="color: black;">&#93;</span><span style="color: black;">&#41;</span>
<span style="color: black;">&#91;</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'Cat'</span>, <span style="color: #483d8b;">'SubCat'</span><span style="color: black;">&#41;</span>, <span style="color: black;">&#40;</span><span style="color: #483d8b;">'SubCat'</span>, <span style="color: #483d8b;">'SubSubCat'</span><span style="color: black;">&#41;</span><span style="color: black;">&#93;</span></pre></div></div>

<p>Here we&#8217;re using the <a href="http://stackoverflow.com/questions/509211/the-python-slice-notation">Python slice notation</a> to get all but the last item of &#8216;taxonomy&#8217; and then all but the first item of &#8216;taxonomy&#8217;, and zipping them together.</p>
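<p>One thing to watch out for: the session above is Python 2, where zip returns a list. In Python 3 zip returns a lazy iterator, so we&#8217;d need to wrap it in list to see the pairs. A minimal sketch, assuming Python 3 (the &#8216;pairs&#8217; name is just for illustration):</p>

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;">taxonomy = ["Cat", "SubCat", "SubSubCat"]

# taxonomy[:-1] is everything except the last item,
# taxonomy[1:] is everything except the first item
pairs = list(zip(taxonomy[:-1], taxonomy[1:]))
print(pairs)</pre></div></div>
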
<p>I wanted to achieve that effect in Ruby though because my import job was written in that!</p>
<p>As far as I can tell we can&#8217;t do open-ended slicing in Ruby (1.9.3 here), so the following gives us an error:</p>

<div class="wp_syntax"><div class="code"><pre class="ruby" style="font-family:monospace;"><span style="color:#006600; font-weight:bold;">&gt;</span> taxonomy<span style="color:#006600; font-weight:bold;">&#91;</span>..<span style="color:#006600; font-weight:bold;">-</span><span style="color:#006666;">1</span><span style="color:#006600; font-weight:bold;">&#93;</span>
<span style="color:#CC00FF; font-weight:bold;">SyntaxError</span>: <span style="color:#006600; font-weight:bold;">&#40;</span>irb<span style="color:#006600; font-weight:bold;">&#41;</span>:<span style="color:#006666;">10</span>: syntax error, unexpected tDOT2, expecting <span style="color:#996600;">']'</span>
taxonomy<span style="color:#006600; font-weight:bold;">&#91;</span>..<span style="color:#006600; font-weight:bold;">-</span><span style="color:#006666;">1</span><span style="color:#006600; font-weight:bold;">&#93;</span>
           ^
	from <span style="color:#006600; font-weight:bold;">/</span>Users<span style="color:#006600; font-weight:bold;">/</span>markhneedham<span style="color:#006600; font-weight:bold;">/</span>.<span style="color:#9900CC;">rbenv</span><span style="color:#006600; font-weight:bold;">/</span>versions<span style="color:#006600; font-weight:bold;">/</span>1.9.3<span style="color:#006600; font-weight:bold;">-</span>p327<span style="color:#006600; font-weight:bold;">/</span>bin<span style="color:#006600; font-weight:bold;">/</span>irb:<span style="color:#006666;">12</span>:<span style="color:#9966CC; font-weight:bold;">in</span> <span style="color:#996600;">`&lt;main&gt;'</span></pre></div></div>

<p>Ruby&#8217;s &#8216;..&#8217; ranges include their end point, so &#8216;-1&#8217; refers to the last item itself. To remove the last item of the array we therefore use &#8216;-2&#8217; rather than &#8216;-1&#8217;:</p>

<div class="wp_syntax"><div class="code"><pre class="ruby" style="font-family:monospace;"><span style="color:#006600; font-weight:bold;">&gt;</span> taxonomy<span style="color:#006600; font-weight:bold;">&#91;</span><span style="color:#006666;">0</span>..<span style="color:#006600; font-weight:bold;">-</span><span style="color:#006666;">2</span><span style="color:#006600; font-weight:bold;">&#93;</span>.<span style="color:#9900CC;">zip</span><span style="color:#006600; font-weight:bold;">&#40;</span>taxonomy<span style="color:#006600; font-weight:bold;">&#91;</span><span style="color:#006666;">1</span>..<span style="color:#006600; font-weight:bold;">-</span><span style="color:#006666;">1</span><span style="color:#006600; font-weight:bold;">&#93;</span><span style="color:#006600; font-weight:bold;">&#41;</span>
<span style="color:#006600; font-weight:bold;">=&gt;</span> <span style="color:#006600; font-weight:bold;">&#91;</span><span style="color:#006600; font-weight:bold;">&#91;</span><span style="color:#996600;">&quot;Cat&quot;</span>, <span style="color:#996600;">&quot;SubCat&quot;</span><span style="color:#006600; font-weight:bold;">&#93;</span>, <span style="color:#006600; font-weight:bold;">&#91;</span><span style="color:#996600;">&quot;SubCat&quot;</span>, <span style="color:#996600;">&quot;SubSubCat&quot;</span><span style="color:#006600; font-weight:bold;">&#93;</span><span style="color:#006600; font-weight:bold;">&#93;</span></pre></div></div>

]]></content:encoded>
			<wfw:commentRss>http://www.markhneedham.com/blog/2013/05/19/rubypython-constructing-a-taxonomy-from-an-array-using-zip/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		<feedburner:origLink>http://www.markhneedham.com/blog/2013/05/19/rubypython-constructing-a-taxonomy-from-an-array-using-zip/</feedburner:origLink></item>
		<item>
		<title>neo4j/cypher: Keep longest path when finding taxonomy</title>
		<link>http://feedproxy.google.com/~r/MarkNeedham/~3/B-80ZOgeC8U/</link>
		<comments>http://www.markhneedham.com/blog/2013/05/19/neo4jcypher-keep-longest-path-when-finding-taxonomy/#comments</comments>
		<pubDate>Sun, 19 May 2013 22:15:06 +0000</pubDate>
		<dc:creator>Mark Needham</dc:creator>
				<category><![CDATA[neo4j]]></category>
		<category><![CDATA[cypher]]></category>

		<guid isPermaLink="false">http://www.markhneedham.com/blog/?p=5334</guid>
		<description><![CDATA[I&#8217;ve been playing around with modelling a product taxonomy and one thing that I wanted to do was find out the full path where a product sits under the tree. I created a simple data set to show the problem: CREATE (cat { name: &#34;Cat&#34; }) CREATE (subcat1 { name: &#34;SubCat1&#34; }) CREATE (subcat2 { [...]]]></description>
				<content:encoded><![CDATA[<p>I&#8217;ve been playing around with modelling a product taxonomy and one thing that I wanted to do was find out the full path where a product sits under the tree.</p>
<p>I created a <a href="http://console.neo4j.org/?id=62rmy2">simple data set</a> to show the problem:</p>

<div class="wp_syntax"><div class="code"><pre class="cypher" style="font-family:monospace;">CREATE (cat { name: &quot;Cat&quot; })
CREATE (subcat1 { name: &quot;SubCat1&quot; })
CREATE (subcat2 { name: &quot;SubCat2&quot; })
CREATE (subsubcat1 { name: &quot;SubSubCat1&quot; })
CREATE (product1 { name: &quot;Product1&quot; })
CREATE (cat)-[:CHILD]-&gt;subcat1-[:CHILD]-&gt;subsubcat1
CREATE (product1)-[:HAS_CATEGORY]-&gt;(subsubcat1)</pre></div></div>

<p>I wanted to write a query which would return &#8216;product1&#8217; and the tree &#8216;Cat -&gt; SubCat1 -&gt; SubSubCat1&#8217; and initially wrote the following query:</p>

<div class="wp_syntax"><div class="code"><pre class="cypher" style="font-family:monospace;">START product=node:node_auto_index(name=&quot;Product1&quot;) 
MATCH product-[:HAS_CATEGORY]-category, taxonomy=category&lt;-[:CHILD*1..]-parent 
RETURN product, EXTRACT(n IN NODES(taxonomy): n.name)</pre></div></div>

<p>which returns:</p>

<div class="wp_syntax"><div class="code"><pre class="text" style="font-family:monospace;">==&gt; +--------------------------------------------------------------------+
==&gt; | product                    | EXTRACT(n IN NODES(taxonomy): n.name) |
==&gt; +--------------------------------------------------------------------+
==&gt; | Node[888]{name:&quot;Product1&quot;} | [&quot;SubSubCat1&quot;,&quot;SubCat1&quot;]              |
==&gt; | Node[888]{name:&quot;Product1&quot;} | [&quot;SubSubCat1&quot;,&quot;SubCat1&quot;,&quot;Cat&quot;]        |
==&gt; +--------------------------------------------------------------------+
==&gt; 2 rows</pre></div></div>

<p>I didn&#8217;t want to return the first row since that isn&#8217;t the full tree, and <a href="https://twitter.com/andres_taylor">Andres</a> suggested that looking for nodes which didn&#8217;t have any incoming &#8216;CHILD&#8217; relationships would help me do that:</p>

<div class="wp_syntax"><div class="code"><pre class="cypher" style="font-family:monospace;">START product=node:node_auto_index(name=&quot;Product1&quot;) 
MATCH product-[:HAS_CATEGORY]-category, 
      taxonomy=category&lt;-[:CHILD*1..]-parent 
WHERE NOT parent&lt;-[:CHILD]-() 
RETURN product, EXTRACT(n IN NODES(taxonomy): n.name)</pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="text" style="font-family:monospace;">==&gt; +--------------------------------------------------------------------+
==&gt; | product                    | EXTRACT(n IN NODES(taxonomy): n.name) |
==&gt; +--------------------------------------------------------------------+
==&gt; | Node[888]{name:&quot;Product1&quot;} | [&quot;SubSubCat1&quot;,&quot;SubCat1&quot;,&quot;Cat&quot;]        |
==&gt; +--------------------------------------------------------------------+
==&gt; 1 row</pre></div></div>
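<p>The WHERE clause is effectively keeping only the paths that end at a root category. A rough Python sketch of the same idea, with hypothetical &#8216;paths&#8217; and &#8216;parent_of&#8217; structures standing in for the graph:</p>

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"># Each candidate path runs from the product's category up towards the root
paths = [
    ["SubSubCat1", "SubCat1"],
    ["SubSubCat1", "SubCat1", "Cat"],
]

# Maps each child category to its parent, mirroring the CHILD relationships
parent_of = {"SubCat1": "Cat", "SubSubCat1": "SubCat1"}

# Keep only the paths whose last node has no parent, i.e. paths ending at the root
full_paths = [path for path in paths if path[-1] not in parent_of]
print(full_paths)</pre></div></div>
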

<p>If we want to reverse the taxonomy so it&#8217;s in the right order we can follow <a href="http://stackoverflow.com/questions/13024098/how-to-get-a-null-value-when-using-the-head-function-with-an-empty-list">Wes Freeman&#8217;s advice from the following Stack Overflow thread</a>:</p>

<div class="wp_syntax"><div class="code"><pre class="cypher" style="font-family:monospace;">START product=node:node_auto_index(name=&quot;Product1&quot;) 
MATCH product-[:HAS_CATEGORY]-category, taxonomy=category&lt;-[:CHILD*1..]-parent 
WHERE NOT parent&lt;-[:CHILD]-() 
RETURN product, 
       REDUCE(acc=[], cat IN EXTRACT(n IN NODES(taxonomy): n.name): cat + acc) AS taxonomy</pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="text" style="font-family:monospace;">==&gt; +-------------------------------------------------------------+
==&gt; | product                    | taxonomy                       |
==&gt; +-------------------------------------------------------------+
==&gt; | Node[888]{name:&quot;Product1&quot;} | [&quot;Cat&quot;,&quot;SubCat1&quot;,&quot;SubSubCat1&quot;] |
==&gt; +-------------------------------------------------------------+
==&gt; 1 row</pre></div></div>
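<p>The REDUCE call reverses the collection by prepending each item to the accumulator. As an illustration only, the same trick in Python might look like this (the &#8216;names&#8217; list is made up to match the data set above):</p>

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;">from functools import reduce

names = ["SubSubCat1", "SubCat1", "Cat"]

# Prepend each item to the accumulator, which reverses the list
taxonomy = reduce(lambda acc, cat: [cat] + acc, names, [])
print(taxonomy)</pre></div></div>

<p>In everyday Python we&#8217;d probably just write names[::-1], but the reduce version mirrors what the cypher query is doing.</p>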

]]></content:encoded>
			<wfw:commentRss>http://www.markhneedham.com/blog/2013/05/19/neo4jcypher-keep-longest-path-when-finding-taxonomy/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://www.markhneedham.com/blog/2013/05/19/neo4jcypher-keep-longest-path-when-finding-taxonomy/</feedburner:origLink></item>
		<item>
		<title>Unix: Working with parts of large files</title>
		<link>http://feedproxy.google.com/~r/MarkNeedham/~3/1FSSx8J9H7g/</link>
		<comments>http://www.markhneedham.com/blog/2013/05/19/unix-working-with-parts-of-large-files/#comments</comments>
		<pubDate>Sun, 19 May 2013 21:44:03 +0000</pubDate>
		<dc:creator>Mark Needham</dc:creator>
				<category><![CDATA[Shell Scripting]]></category>
		<category><![CDATA[unix]]></category>

		<guid isPermaLink="false">http://www.markhneedham.com/blog/?p=5331</guid>
		<description><![CDATA[Chris and I were looking at the neo4j log files of a client earlier in the week and wanted to do some processing of the file so we could ask the client to send us some further information. The log file was over 10,000 lines long but the bit of the file we were interested [...]]]></description>
				<content:encoded><![CDATA[<p><a href="https://twitter.com/digitalstain">Chris</a> and I were looking at the neo4j log files of a client earlier in the week and wanted to do some processing of the file so we could ask the client to send us some further information.</p>
<p>The log file was over 10,000 lines long but the bit of the file we were interested in was only a few hundred lines.</p>
<p>I usually use Vim with &#8216;:set number&#8217; when I want to refer to line numbers in a file but Chris showed me that we can achieve the same thing with e.g. &#8216;less -N data/log/neo4j.0.0.log&#8217;.</p>
<p>We can then operate on a range of lines, say 10-15, by passing the &#8216;-n&#8217; flag to sed along with a &#8216;10,15p&#8217; command, which prints only those lines:</p>
<blockquote><p>
-n      By default, each line of input is echoed to the standard output after all of the commands have been applied to it.  The -n option suppresses this behavior.
</p></blockquote>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;">$ <span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #660033;">-n</span> <span style="color: #ff0000;">'10,15p'</span> data<span style="color: #000000; font-weight: bold;">/</span>log<span style="color: #000000; font-weight: bold;">/</span>neo4j.0.0.log
INFO: Enabling HTTPS on port <span style="color: #7a0874; font-weight: bold;">&#91;</span><span style="color: #000000;">7473</span><span style="color: #7a0874; font-weight: bold;">&#93;</span>
May <span style="color: #000000;">19</span>, <span style="color: #000000;">2013</span> <span style="color: #000000;">11</span>:<span style="color: #000000;">11</span>:<span style="color: #000000;">52</span> AM org.neo4j.server.logging.Logger log
INFO: No SSL certificate found, generating a self-signed certificate..
May <span style="color: #000000;">19</span>, <span style="color: #000000;">2013</span> <span style="color: #000000;">11</span>:<span style="color: #000000;">11</span>:<span style="color: #000000;">53</span> AM org.neo4j.server.logging.Logger log
INFO: Mounted discovery module at <span style="color: #7a0874; font-weight: bold;">&#91;</span><span style="color: #000000; font-weight: bold;">/</span><span style="color: #7a0874; font-weight: bold;">&#93;</span>
May <span style="color: #000000;">19</span>, <span style="color: #000000;">2013</span> <span style="color: #000000;">11</span>:<span style="color: #000000;">11</span>:<span style="color: #000000;">53</span> AM org.neo4j.server.logging.Logger log</pre></div></div>

<p>We then used a combination of grep, awk and sort to work out which log files we needed.</p>
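<p>If we wanted to do the line range extraction from a script rather than the shell, a minimal Python sketch might look like the following. The &#8216;read_lines&#8217; function is a hypothetical helper, not something we actually used:</p>

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;">from itertools import islice


def read_lines(path, first, last):
    """Return lines first..last (1-based, inclusive), like sed -n 'first,lastp'."""
    with open(path) as log:
        # islice is 0-based with an exclusive end, hence first - 1
        return list(islice(log, first - 1, last))</pre></div></div>

<p>Calling read_lines with &#8216;data/log/neo4j.0.0.log&#8217;, 10 and 15 should give back the same six lines as the sed command, without reading the rest of the file.</p>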
]]></content:encoded>
			<wfw:commentRss>http://www.markhneedham.com/blog/2013/05/19/unix-working-with-parts-of-large-files/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://www.markhneedham.com/blog/2013/05/19/unix-working-with-parts-of-large-files/</feedburner:origLink></item>
		<item>
		<title>A/B Testing: User Experience vs Conversion</title>
		<link>http://feedproxy.google.com/~r/MarkNeedham/~3/kFcH-Ie7uKY/</link>
		<comments>http://www.markhneedham.com/blog/2013/05/18/ab-testing-user-experience-vs-conversion/#comments</comments>
		<pubDate>Sat, 18 May 2013 20:18:50 +0000</pubDate>
		<dc:creator>Mark Needham</dc:creator>
				<category><![CDATA[Software Development]]></category>
		<category><![CDATA[absplittesting]]></category>

		<guid isPermaLink="false">http://www.markhneedham.com/blog/?p=5328</guid>
		<description><![CDATA[I&#8217;ve written a couple of posts over the last few months about my experiences with A/B testing and one conversation we often used to have was around user experience vs conversion rate. Once you start running an A/B test it encourages you to focus more on the conversion rate of users in different parts of [...]]]></description>
				<content:encoded><![CDATA[<p>I&#8217;ve written <a href="http://www.markhneedham.com/blog/2013/01/27/ab-testing-thoughts-so-far/">a</a> <a href="http://www.markhneedham.com/blog/2013/04/28/ab-testing-reporting/">couple</a> of posts over the last few months about my experiences with A/B testing and one conversation we often used to have was around user experience vs conversion rate.</p>
<p>Once you start running an A/B test it encourages you to focus more on the conversion rate of users in different parts of the flow and <strong>your inclination is to make changes that increase that conversion rate</strong>.</p>
<p>Another one of our drivers is to provide the best user experience that we can to our customers, and since sometimes the best thing for them is not to switch, it seems that these two must be in conflict.</p>
<p>I found it particularly interesting seeing how the conversion rate could be <strong>impacted by the way that information was displayed to a user</strong>.</p>
<p>This was an idea that I first came across when reading about <a href="http://kylerush.net/blog/optimization-at-the-obama-campaign-ab-testing/">how the Obama campaign used A/B testing</a> where they noticed big changes in conversion rates by making small tweaks to sentences and imagery.</p>
<p>Our goal from a user experience perspective was to put all the information in front of the user so that they could make an informed choice about what to do.</p>
<p>Initially we made the negative features of the plans very prominent and had them in a large font which led to a drop in conversion.</p>
<p>We assumed that people were now giving more importance to the negative features than was warranted e.g. some plans had a cancellation fee but it typically only accounted for 5% of the saving they&#8217;d make by switching to the plan.</p>
<p>When the product is a bit more complicated we could argue that we improve the user experience by helping the user to make an appropriate choice.</p>
<p>On a website we do this through the way that we display information: the font size, font weight, positioning and a variety of other things.</p>
<p>It&#8217;s an interesting balance to find between the two drivers, but if we veer towards conversion at all costs then, although we&#8217;ll get a higher conversion rate, in the long term we&#8217;ll have some frustrated customers who won&#8217;t use our website again.</p>
<p>If we look at it that way then the two drivers don&#8217;t seem so opposed to each other.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.markhneedham.com/blog/2013/05/18/ab-testing-user-experience-vs-conversion/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://www.markhneedham.com/blog/2013/05/18/ab-testing-user-experience-vs-conversion/</feedburner:origLink></item>
		<item>
		<title>neo4j: When the web console returns nothing…use the data browser!</title>
		<link>http://feedproxy.google.com/~r/MarkNeedham/~3/2iDIcTdKp0A/</link>
		<comments>http://www.markhneedham.com/blog/2013/05/17/neo4j-when-the-web-console-returns-nothinguse-the-data-browser/#comments</comments>
		<pubDate>Fri, 17 May 2013 00:00:16 +0000</pubDate>
		<dc:creator>Mark Needham</dc:creator>
				<category><![CDATA[neo4j]]></category>

		<guid isPermaLink="false">http://www.markhneedham.com/blog/?p=5326</guid>
		<description><![CDATA[In my time playing around with neo4j I&#8217;ve run into a problem a few times where I executed a query using the web console (usually accessible @ http://localhost:7474/webadmin/#/console/) and have got absolutely no response. I noticed a similar thing today when Rickard and I were having a look at why a Lucene index query wasn&#8217;t [...]]]></description>
				<content:encoded><![CDATA[<p>In my time playing around with <a href="http://www.neo4j.org/">neo4j</a> I&#8217;ve run into a problem a few times where I executed a query using the web console (usually accessible @ <a href="http://localhost:7474/webadmin/#/console/">http://localhost:7474/webadmin/#/console/</a>) and have got absolutely no response.</p>
<p>I noticed a similar thing today when <a href="https://twitter.com/rickardoberg">Rickard</a> and I were having a look at why a Lucene index query wasn&#8217;t behaving as we expected.</p>
<p>I set up some data in a neo4j database using <a href="https://github.com/maxdemarzi/neography">neography</a> with the following code:</p>

<div class="wp_syntax"><div class="code"><pre class="ruby" style="font-family:monospace;"><span style="color:#CC0066; font-weight:bold;">require</span> <span style="color:#996600;">'neography'</span>
&nbsp;
<span style="color:#0066ff; font-weight:bold;">@neo</span> = <span style="color:#6666ff; font-weight:bold;">Neography::Rest</span>.<span style="color:#9900CC;">new</span>
&nbsp;
<span style="color:#0066ff; font-weight:bold;">@neo</span>.<span style="color:#9900CC;">create_node_index</span><span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#996600;">&quot;Id_Index&quot;</span>, <span style="color:#996600;">&quot;exact&quot;</span>, <span style="color:#996600;">&quot;lucene&quot;</span><span style="color:#006600; font-weight:bold;">&#41;</span>
&nbsp;
node1 = <span style="color:#0066ff; font-weight:bold;">@neo</span>.<span style="color:#9900CC;">create_node</span><span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#996600;">&quot;Hour&quot;</span> <span style="color:#006600; font-weight:bold;">=&gt;</span> <span style="color:#006666;">1</span>, <span style="color:#996600;">&quot;name&quot;</span> <span style="color:#006600; font-weight:bold;">=&gt;</span> <span style="color:#996600;">&quot;Max&quot;</span><span style="color:#006600; font-weight:bold;">&#41;</span>
node2 = <span style="color:#0066ff; font-weight:bold;">@neo</span>.<span style="color:#9900CC;">create_node</span><span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#996600;">&quot;Hour&quot;</span> <span style="color:#006600; font-weight:bold;">=&gt;</span> <span style="color:#006666;">2</span>, <span style="color:#996600;">&quot;name&quot;</span> <span style="color:#006600; font-weight:bold;">=&gt;</span> <span style="color:#996600;">&quot;Mark&quot;</span><span style="color:#006600; font-weight:bold;">&#41;</span>
node3 = <span style="color:#0066ff; font-weight:bold;">@neo</span>.<span style="color:#9900CC;">create_node</span><span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#996600;">&quot;Hour&quot;</span> <span style="color:#006600; font-weight:bold;">=&gt;</span> <span style="color:#006666;">3</span>, <span style="color:#996600;">&quot;name&quot;</span> <span style="color:#006600; font-weight:bold;">=&gt;</span> <span style="color:#996600;">&quot;Rickard&quot;</span><span style="color:#006600; font-weight:bold;">&#41;</span>
&nbsp;
<span style="color:#0066ff; font-weight:bold;">@neo</span>.<span style="color:#9900CC;">add_node_to_index</span><span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#996600;">&quot;Id_Index&quot;</span>, <span style="color:#996600;">&quot;Hour&quot;</span>, <span style="color:#006666;">1</span>, node1<span style="color:#006600; font-weight:bold;">&#41;</span>
<span style="color:#0066ff; font-weight:bold;">@neo</span>.<span style="color:#9900CC;">add_node_to_index</span><span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#996600;">&quot;Id_Index&quot;</span>, <span style="color:#996600;">&quot;Hour&quot;</span>, <span style="color:#006666;">2</span>, node2<span style="color:#006600; font-weight:bold;">&#41;</span> 
<span style="color:#0066ff; font-weight:bold;">@neo</span>.<span style="color:#9900CC;">add_node_to_index</span><span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#996600;">&quot;Id_Index&quot;</span>, <span style="color:#996600;">&quot;Hour&quot;</span>, <span style="color:#006666;">3</span>, node3<span style="color:#006600; font-weight:bold;">&#41;</span></pre></div></div>

<p>I then ran the following query which I was expecting to return all the nodes:</p>

<div class="wp_syntax"><div class="code"><pre class="cypher" style="font-family:monospace;">start hour=node:Id_Index(&quot;Hour:[00 TO 02] or Hour:[03 TO 05]&quot;) RETURN hour</pre></div></div>

<p>Instead it returned nothing and I couldn&#8217;t see anything being logged either.</p>
<p>Rickard pointed out that this was because the exception is only returned to the API caller, and that it would be better to run the query from the Data Browser, which is typically accessible from <a href="http://localhost:7474/webadmin/#/data/search/">http://localhost:7474/webadmin/#/data/search/</a>.</p>
<p>If we run the query from there then we can see what&#8217;s going wrong:</p>

<div class="wp_syntax"><div class="code"><pre class="text" style="font-family:monospace;">BadInputException
&nbsp;
StackTrace:
org.neo4j.server.rest.repr.RepresentationExceptionHandlingIterable.exceptionOnHasNext(RepresentationExceptionHandlingIterable.java:50)
org.neo4j.helpers.collection.ExceptionHandlingIterable$1.hasNext(ExceptionHandlingIterable.java:60)
org.neo4j.helpers.collection.IteratorWrapper.hasNext(IteratorWrapper.java:42)
org.neo4j.server.rest.repr.ListRepresentation.serialize(ListRepresentation.java:58)
org.neo4j.server.rest.repr.Serializer.serialize(Serializer.java:75)
org.neo4j.server.rest.repr.MappingSerializer.putList(MappingSerializer.java:61)
org.neo4j.server.rest.repr.CypherResultRepresentation.serialize(CypherResultRepresentation.java:57)
org.neo4j.server.rest.repr.MappingRepresentation.serialize(MappingRepresentation.java:42)
org.neo4j.server.rest.repr.OutputFormat.assemble(OutputFormat.java:179)
org.neo4j.server.rest.repr.OutputFormat.formatRepresentation(OutputFormat.java:131)
org.neo4j.server.rest.repr.OutputFormat.response(OutputFormat.java:117)
org.neo4j.server.rest.repr.OutputFormat.ok(OutputFormat.java:55)
org.neo4j.server.rest.web.CypherService.cypher(CypherService.java:94)
java.lang.reflect.Method.invoke(Method.java:597)</pre></div></div>

<p>There seemed to be some strangeness going on with how Lucene handles the query when a default search field isn&#8217;t provided but we noticed that it behaved as expected if we didn&#8217;t use an OR since Lucene has an implicit OR between statements anyway. </p>

<div class="wp_syntax"><div class="code"><pre class="cypher" style="font-family:monospace;">start hour=node:Id_Index(&quot;Hour:[00 TO 02] Hour:[03 TO 05]&quot;) RETURN hour</pre></div></div>

<p>Either way, the lesson for me was that if the console isn&#8217;t giving a result, run the query in the data browser to work out what&#8217;s going wrong!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.markhneedham.com/blog/2013/05/17/neo4j-when-the-web-console-returns-nothinguse-the-data-browser/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://www.markhneedham.com/blog/2013/05/17/neo4j-when-the-web-console-returns-nothinguse-the-data-browser/</feedburner:origLink></item>
		<item>
		<title>Book Review: The Signal and the Noise – Nate Silver</title>
		<link>http://feedproxy.google.com/~r/MarkNeedham/~3/M4menT7wNeU/</link>
		<comments>http://www.markhneedham.com/blog/2013/05/14/book-review-the-signal-and-the-noise-nate-silver/#comments</comments>
		<pubDate>Tue, 14 May 2013 00:16:56 +0000</pubDate>
		<dc:creator>Mark Needham</dc:creator>
				<category><![CDATA[Books]]></category>

		<guid isPermaLink="false">http://www.markhneedham.com/blog/?p=5323</guid>
		<description><![CDATA[Nate Silver is famous for having correctly predicted the winner of all 50 states in the 2012 United States elections and Sid recommended his book so I could learn more about statistics for the A/B tests that we were running. I thought the book was a really good introduction to applied statistics and by using [...]]]></description>
				<content:encoded><![CDATA[<p><a href="http://en.wikipedia.org/wiki/Nate_Silver">Nate Silver</a> is famous for having correctly predicted the winner of all 50 states in the 2012 United States elections and <a href="https://twitter.com/siddharthdawara">Sid</a> recommended <a href="http://www.amazon.co.uk/The-Signal-Noise-Science-Prediction/dp/0141975652/ref=sr_1_1?ie=UTF8&#038;qid=1368486945&#038;sr=8-1&#038;keywords=nate+silver">his book</a> so I could learn more about statistics for the <a href="http://www.markhneedham.com/blog/2013/04/28/ab-testing-reporting/">A/B tests</a> that we were running.</p>
<p>I thought the book was a really good introduction to applied statistics, and by using real life examples which most people would be able to relate to, it makes a potentially dull subject interesting.</p>
<p>Reasonably early on the author points out that there&#8217;s a difference between making a prediction and making a forecast:</p>
<ul>
<li><strong>Prediction</strong> &#8211; a definitive and specific statement about when and where something will happen e.g. a major earthquake will hit Kyoto, Japan, on June 28.
</li>
<li><strong>Forecast</strong> &#8211; a probabilistic statement over a longer time scale e.g. there is a 60% chance of an earthquake in Southern California over the next 30 years.
</li>
</ul>
<p>The book mainly focuses on the latter.
</p>
<p>We then move on to quite an interesting section about <strong>overfitting, which is where we mistake noise for signal in our data</strong>.</p>
<p>I first came across this term when <a href="https://twitter.com/jennifersmithco">Jen</a> and I were working through one of the <a href="http://www.markhneedham.com/blog/tag/kaggle/">Kaggle</a> problems and were using a <a href="http://www.markhneedham.com/blog/2012/10/27/kaggle-digit-recognizer-mahout-random-forest-attempt/">random forest</a> of deliberately overfitted decision trees to do digit recognition.</p>
<p>It&#8217;s not a problem when we combine lots of decision trees together and use a majority-wins vote to make our prediction, but if we use just one of them its predictions on new data are likely to be unreliable.</p>
<p>Later on in the book he points out that a lot of conspiracy theories arise <strong>when we look at data retrospectively</strong> and can easily separate signal from noise, when at the time it was much more difficult.</p>
<p>He also points out that sometimes there isn&#8217;t actually any signal, it&#8217;s all noise, and we can fall into the trap of looking for something that isn&#8217;t there. I think this &#8216;noise&#8217; is what we&#8217;d refer to as random variation in the context of an <a href="http://www.markhneedham.com/blog/2013/01/27/ab-testing-thoughts-so-far/">A/B test</a>.</p>
<p>Silver also encourages us to make sure that we understand the theory behind any inference we make:</p>
<blockquote><p>Statistical inferences are much stronger when backed up by theory or at least some deeper thinking about their root causes.</p></blockquote>
<p>When we were running A/B tests Sid encouraged people to <strong>think whether a theory about why conversion had changed made logical sense</strong> before assuming it was true which I think covers similar ground.</p>
<p>A big chunk of the book covers <a href="http://en.wikipedia.org/wiki/Bayes'_theorem">Bayes&#8217; theorem</a> and how, when we&#8217;re making forecasts, we often have prior beliefs which the theorem forces us to make explicit.</p>
<p>For example there is a section which talks about the probability a lady is being cheated on given that she&#8217;s found some underwear that she doesn&#8217;t recognise in her house.</p>
<p>In order to work out the probability she&#8217;s being cheated on we need to know the probability that she was being cheated on before she found the underwear. Silver suggests that since 4% of married partners cheat on their spouses that would be a good number to use.</p>
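<p>To make the role of the prior concrete, here is a minimal Python sketch of the calculation. The 4% prior is the figure mentioned above; the two likelihoods (how probable the underwear is if she is or isn't being cheated on) are assumed values picked purely for illustration, and the function name is my own.</p>

```python
def posterior(prior, p_evidence_if_true, p_evidence_if_false):
    """Bayes' theorem: P(H | E) from P(H), P(E | H) and P(E | not H)."""
    true_positive = prior * p_evidence_if_true
    false_positive = (1 - prior) * p_evidence_if_false
    return true_positive / (true_positive + false_positive)


# The 4% prior comes from the text above.
prior_cheating = 0.04
# Assumed likelihoods, for illustration only.
p_underwear_if_cheating = 0.50
p_underwear_if_not_cheating = 0.05

updated = posterior(prior_cheating, p_underwear_if_cheating, p_underwear_if_not_cheating)
print(round(updated, 3))
```

<p>With these assumed inputs the updated probability is a little under 30%, which shows how much a low prior holds the answer down even when the evidence looks damning.</p>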
<p>He then goes on to show multiple other problems throughout the book that we can apply Bayes&#8217; theorem to.</p>
<p>Some other interesting things I picked up are that <strong>if we&#8217;re good at forecasting then being given more information should make our forecast better</strong> and that <strong>when we don&#8217;t have any special information we&#8217;re better off following the opinion of the crowd</strong>.</p>
<div style="float:right">
<img src="http://www.markhneedham.com/blog/wp-content/uploads/2013/05/IMG_20130514_011256.jpg" alt="IMG 20130514 011256" title="IMG_20130514_011256.jpg" border="0" width="400" />
</div>
<p>Silver also showed a clever trick for inferring data points on a data set which follows a power law i.e. the long tail distribution where there are very few massive events but lots of really small ones.</p>
<p>We have a power law distribution when modelling the number of terrorist attacks vs the number of fatalities, but if we <strong>change both scales to be logarithmic</strong> the relationship becomes roughly a straight line, which we can use to estimate how likely more deadly attacks are.</p>
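<p>A rough sketch of the log-log trick in Python, assuming made-up numbers rather than anything from the book: a power law turns into a straight line once both axes are logarithmic, so we can fit that line with ordinary least squares and extend it to sizes of event we haven't observed. The function names and data are hypothetical.</p>

```python
import math


def fit_power_law(xs, ys):
    """Least-squares straight line through (log10 x, log10 y).

    Returns (slope, intercept) so that log10(y) = slope * log10(x) + intercept.
    """
    log_xs = [math.log10(x) for x in xs]
    log_ys = [math.log10(y) for y in ys]
    n = len(log_xs)
    mean_x = sum(log_xs) / n
    mean_y = sum(log_ys) / n
    covariance = sum((lx - mean_x) * (ly - mean_y) for lx, ly in zip(log_xs, log_ys))
    variance = sum((lx - mean_x) ** 2 for lx in log_xs)
    slope = covariance / variance
    return slope, mean_y - slope * mean_x


# Made-up data that follows frequency = 1000 / fatalities**2 exactly.
fatalities = [1, 10, 100, 1000]
frequency = [1000.0, 10.0, 0.1, 0.001]

slope, intercept = fit_power_law(fatalities, frequency)


def predict(x):
    """Extend the fitted straight line to an unseen value of x."""
    return 10 ** (slope * math.log10(x) + intercept)


estimate = predict(10000)
print(slope, intercept, estimate)
```

<p>Real data won't sit exactly on the line, so the extrapolated figure is only an estimate, but the approach is the same.</p>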
<p>There is then some discussion of how we can make changes in the way that we treat terrorism to try and impact the shape of the chart e.g. in Israel Silver suggests that they really want to avoid a very deadly attack but at the expense of there being more smaller attacks.</p>
<p>A lot of the book is spent discussing weather/earthquake forecasting which is very interesting to read about but I couldn&#8217;t quite see a link back to the software context.</p>
<p>Overall though I found it an interesting read although there are probably a few places that you can skim over the detail and still get the gist of what he&#8217;s saying.</p>
<img src="http://feeds.feedburner.com/~r/MarkNeedham/~4/M4menT7wNeU" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://www.markhneedham.com/blog/2013/05/14/book-review-the-signal-and-the-noise-nate-silver/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://www.markhneedham.com/blog/2013/05/14/book-review-the-signal-and-the-noise-nate-silver/</feedburner:origLink></item>
		<item>
		<title>Sublime: Overriding default file type/Assigning specific files to a file type</title>
		<link>http://feedproxy.google.com/~r/MarkNeedham/~3/ZR76OtqfSx0/</link>
		<comments>http://www.markhneedham.com/blog/2013/05/05/sublime-overriding-default-file-typeassigning-specific-files-to-a-file-type/#comments</comments>
		<pubDate>Sun, 05 May 2013 00:03:17 +0000</pubDate>
		<dc:creator>Mark Needham</dc:creator>
				<category><![CDATA[Software Development]]></category>
		<category><![CDATA[sublime]]></category>

		<guid isPermaLink="false">http://www.markhneedham.com/blog/?p=5319</guid>
		<description><![CDATA[I&#8217;ve been using Sublime a bit recently and one thing I wanted to do was put neo4j cypher queries into files with arbitrary extensions and have them recognised as cypher files every time I open them. I&#8217;m using the cypher Sublime plugin to get the syntax highlighting but since I&#8217;ve got my cypher in a [...]]]></description>
				<content:encoded><![CDATA[<p>I&#8217;ve been using <a href="">Sublime</a> a bit recently and one thing I wanted to do was put <a href="">neo4j cypher</a> queries into files with arbitrary extensions and have them recognised as cypher files every time I open them.</p>
<p>I&#8217;m using the <a href="">cypher Sublime plugin</a> to get the syntax highlighting but since I&#8217;ve got my cypher in a .haml file it only remembers that it should have cypher highlighting as long as the file is open.</p>
<p>As soon as I close and then re-open the file it goes back to being highlighted as HAML.</p>
<p>I initially thought that the way around this would be to write a plugin which kept track of files that you&#8217;d manually assigned a syntax to but then I came across the <a href="https://github.com/facelessuser/ApplySyntax">ApplySyntax</a> plugin which seems even better.</p>
<p>ApplySyntax allows you to assign syntaxes to files based on regular expression matching on the file name or on the first line of the file.</p>
<p>At the moment, the easiest way to detect that a file is a cypher query is that the first line will begin with &#8216;START&#8217; so I wrote the following in my user settings file:</p>
<p><em>~/Library/Application Support/Sublime Text 2/Packages/User/ApplySyntax.sublime-settings</em></p>

<div class="wp_syntax"><div class="code"><pre class="json" style="font-family:monospace;">{
	&quot;reraise_exceptions&quot;: false,
	&quot;new_file_syntax&quot;: false,
	&quot;syntaxes&quot;: [
		{			
			&quot;name&quot;: &quot;Cypher&quot;,
			&quot;rules&quot;: [
				{&quot;first_line&quot;: &quot;^START&quot;}
			]
		}	
	]
}</pre></div></div>
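<p>To see which files that rule would pick up, here is a small Python sketch which applies the same regular expression to the first line of some text. This is not ApplySyntax's own code, just an illustration of what the pattern matches; the function name is hypothetical.</p>

```python
import re

# Same pattern as the "first_line" rule in the settings above.
FIRST_LINE_RULE = re.compile(r"^START")


def looks_like_cypher(file_text):
    """Return True when the first line of the text matches the rule."""
    first_line = file_text.split("\n", 1)[0]
    return FIRST_LINE_RULE.search(first_line) is not None


print(looks_like_cypher("START n = node(1)\nRETURN n"))
print(looks_like_cypher("%p some haml"))
```

<p>Note that the pattern is case sensitive and anchored to the start of the line, so a query beginning with whitespace or with a lower case &#8216;start&#8217; wouldn&#8217;t match.</p>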

<p>ApplySyntax is a pretty neat plugin, worth having a look if you have this problem to solve!</p>
<img src="http://feeds.feedburner.com/~r/MarkNeedham/~4/ZR76OtqfSx0" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://www.markhneedham.com/blog/2013/05/05/sublime-overriding-default-file-typeassigning-specific-files-to-a-file-type/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://www.markhneedham.com/blog/2013/05/05/sublime-overriding-default-file-typeassigning-specific-files-to-a-file-type/</feedburner:origLink></item>
		<item>
		<title>Ruby 1.9.3 p0: Investigating weirdness with HTTP POST request in net/http</title>
		<link>http://feedproxy.google.com/~r/MarkNeedham/~3/cVCUqR6TSDw/</link>
		<comments>http://www.markhneedham.com/blog/2013/04/30/ruby-1-9-3-p0-investigating-weirdness-with-http-post-request-in-nethttp/#comments</comments>
		<pubDate>Tue, 30 Apr 2013 21:37:11 +0000</pubDate>
		<dc:creator>Mark Needham</dc:creator>
				<category><![CDATA[Ruby]]></category>

		<guid isPermaLink="false">http://www.markhneedham.com/blog/?p=5313</guid>
		<description><![CDATA[Thibaut and I spent the best part of the last couple of days trying to diagnose a problem we were having trying to make a POST request using rest-client to one of our services. We have nginx fronting the application server so the request passes through there first: The problem we were having was that [...]]]></description>
				<content:encoded><![CDATA[<p><a href="https://twitter.com/the_T_bot">Thibaut</a> and I spent the best part of the last couple of days trying to diagnose a problem we were having trying to make a POST request using <a href="https://www.ruby-toolbox.com/projects/rest-client">rest-client</a> to one of our services.</p>
<p>We have nginx fronting the application server so the request passes through there first:</p>
<div align="center">
<p><img src="http://www.markhneedham.com/blog/wp-content/uploads/2013/04/post.png" alt="Post" title="post.png" border="0" width="335" height="185" /></div>
<p>The problem we were having was that the request was timing out on the client side before it had been processed and the request wasn&#8217;t reaching the application server.</p>
<p>We initially thought there might be a problem with our nginx configuration because we don&#8217;t have many POST requests with largish (40kb) payloads so we initially tried tweaking the <a href="http://wiki.nginx.org/HttpProxyModule#proxy_buffer_size">proxy buffer size</a>.</p>
<p>It was a bit of a long shot because changing that setting only reduces the likelihood that nginx writes the request body to disc and then loads it later which shouldn&#8217;t impact performance that much.</p>
<p>The next thing we tried was replicating the request using <a href="http://curl.haxx.se/">cURL</a> with a smaller payload which worked fine. cURL had no problem with the bigger payload either.</p>
<p>We therefore thought there must be a difference in the request headers being sent by rest-client and our initial investigation suggested that it might be to do with the &#8216;<a href="http://stackoverflow.com/questions/2773396/whats-the-content-length-field-in-http-header">Content-Length</a>&#8217; header.</p>
<p>There was a 1 byte difference in the value being sent by cURL and the one being sent by rest-client which was to do with the last character of the payload being a <a href="http://homepage.smc.edu/morgan_david/CS41/lineterminators.htm">0A</a> (linefeed) character.</p>
<p>We changed the &#8216;Content-Length&#8217; header on our cURL request to match that of the rest-client request (i.e. 1 byte too large) and were able to replicate the timeout problem.</p>
<p>At this stage we thought that calling &#8216;strip&#8217; on the body of our rest-client request would solve the problem as the &#8216;Content-Length&#8217; header would now be set to the correct value. It did set the &#8216;Content-Length&#8217; header properly but unfortunately didn&#8217;t get rid of the timeout.</p>
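<p>The 1 byte difference is easy to reproduce. We were working in Ruby, but here is a minimal sketch in Python with a hypothetical payload: &#8216;Content-Length&#8217; is the number of bytes in the body, so a trailing linefeed adds one to it and stripping the body takes it away again.</p>

```python
# Hypothetical request body ending in a linefeed (0x0A).
payload = '{"name": "example"}\n'

# Content-Length counts bytes, so encode before measuring.
with_linefeed = len(payload.encode("utf-8"))
stripped = len(payload.strip().encode("utf-8"))

print(with_linefeed, stripped)
```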
<p>Our next step was to check whether or not we could get any request to work from rest-client so we tried using a smaller payload which worked fine.</p>
<p>At this stage <a href="https://twitter.com/jasonneylon">Jason</a> heard us discussing what to do next and said that he&#8217;d come across it earlier and that upgrading our Ruby Version from &#8217;1.9.3p0&#8242; would solve all our woes.</p>
<p>That Ruby version is a couple of years old and most of our servers are running &#8217;1.9.3p392&#8242; but somehow this one had slipped through the net.</p>
<p>We <a href="http://www.markhneedham.com/blog/2013/04/27/treat-servers-as-cattle-spin-them-up-tear-them-down/">spun up a new server</a> with that version of Ruby installed and it did indeed fix the problem.</p>
<p>However, we were curious what the fix was and had a look at the <a href="http://svn.ruby-lang.org/repos/ruby/tags/v1_9_3_125/ChangeLog">change log of the first patch release after &#8217;1.9.3p0&#8242;</a>. We noticed the following which seemed relevant:</p>
<blockquote><p>
Tue May 31 17:03:24 2011  Hiroshi Nakamura  &lt;nahi@ruby-lang.org&gt;</p>
<p>	* lib/net/http.rb, lib/net/protocol.rb: Allow to configure to wait<br />
	  server returning &#8216;100 continue&#8217; response before sending HTTP request<br />
	  body. See NEWS for more detail. See #3622.<br />
	  Original patch is made by Eric Hodel &lt;drbrain@segment7.net&gt;.</p>
<p>	* test/net/http/test_http.rb: test it.</p>
<p>	* NEWS: Add new feature.
</p></blockquote>
<p>One thing we noticed from looking at the requests with <a href="http://vccv.posterous.com/use-ngrep-to-inspect-http-headers">ngrep</a> was that cURL was setting the <a href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec8.html#sec8.2.3">&#8216;Expect: 100-continue&#8217; request header</a> and rest-client wasn&#8217;t.</p>
<p>When the payload size was small nginx didn&#8217;t seem to send a &#8216;100 Continue&#8217; response which was presumably why we weren&#8217;t seeing a problem with the small payloads.</p>
<p>I wasn&#8217;t sure how to go about finding out exactly what was going wrong but given how long it took us to get to this point I thought I&#8217;d summarise what we tried and see if anyone could explain it to me.</p>
<p>So if you&#8217;ve come across this problem (probably 2 years ago!) it&#8217;d be cool to know exactly what the problem was.</p>
<img src="http://feeds.feedburner.com/~r/MarkNeedham/~4/cVCUqR6TSDw" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://www.markhneedham.com/blog/2013/04/30/ruby-1-9-3-p0-investigating-weirdness-with-http-post-request-in-nethttp/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		<feedburner:origLink>http://www.markhneedham.com/blog/2013/04/30/ruby-1-9-3-p0-investigating-weirdness-with-http-post-request-in-nethttp/</feedburner:origLink></item>
		<item>
		<title>Mac OS X: A couple of neat tools</title>
		<link>http://feedproxy.google.com/~r/MarkNeedham/~3/wVRQ8C81CK8/</link>
		<comments>http://www.markhneedham.com/blog/2013/04/30/mac-os-x-a-couple-of-neat-tools/#comments</comments>
		<pubDate>Tue, 30 Apr 2013 20:07:57 +0000</pubDate>
		<dc:creator>Mark Needham</dc:creator>
				<category><![CDATA[Software Development]]></category>

		<guid isPermaLink="false">http://www.markhneedham.com/blog/?p=5306</guid>
		<description><![CDATA[When I first started working at uSwitch Sid installed a couple of &#8216;productivity applications&#8217; on my Mac which I&#8217;ve found pretty useful but from talking to others I realised they aren&#8217;t known/being used by everyone. Alfred Alfred is a Quicksilver replacement which allows you to quickly open applications, find files, search Google and more. [...]]]></description>
				<content:encoded><![CDATA[<p>When I first started working at <a href="http://www.uswitch.com/">uSwitch</a> <a href="https://twitter.com/siddharthdawara">Sid</a> installed a couple of &#8216;productivity applications&#8217; on my Mac which I&#8217;ve found pretty useful but from talking to others I realised they aren&#8217;t known/being used by everyone.</p>
<h4>Alfred</h4>
<p><a href="http://www.alfredapp.com/">Alfred</a> is a <a href="http://en.wikipedia.org/wiki/Quicksilver_(software)">Quick Silver</a> replacement which allows you to quickly open applications, find files, search Google and more. Even though we&#8217;re not using half of its features it&#8217;s still proved to be useful.</p>
<p>I quite like the calculator feature which we&#8217;ve been using for adhoc calculation like working out <a href="http://www.markhneedham.com/blog/2013/04/10/awk-parsing-free-m-output-to-get-memory-usageconsumption/">how much free memory there was on a server</a> or the <a href="http://www.markhneedham.com/blog/2013/04/28/ab-testing-reporting/">conversion rate on part of an A/B test</a>.</p>
<div align="center">
<img src="http://www.markhneedham.com/blog/wp-content/uploads/2013/04/calculator.png" alt="Calculator" title="calculator.png" border="0" width="600" height="169" />
</div>
<h4>Moom</h4>
<p>The other application is <a href="http://manytricks.com/moom/">Moom</a> which allows you to move/resize windows.</p>
<p>I didn&#8217;t see the point when I first saw it but it&#8217;s actually really useful when you&#8217;re working on a big monitor and want to put say the terminal alongside the browser.</p>
<p>We have the following shortcuts set up:</p>
<div align="center">
<img src="http://www.markhneedham.com/blog/wp-content/uploads/2013/04/moom1.png" alt="Moom1" title="moom1.png" border="0" width="474" height="443" />
</div>
<p>That allows us to press &#8216;Ctrl + Space&#8217; to make the window fill the left-hand side of the screen, &#8216;Alt + Space&#8217; to make it fill the right-hand side of the screen and &#8216;Alt + Ctrl + Space&#8217; to fill the whole screen.</p>
<p>You can also set up shortcuts to allow you to move a window between displays or to rearrange the windows based on certain events.</p>
<p>Highly recommended!</p>
<p>If anyone knows any other cool tools like this I&#8217;d love to hear about them.</p>
<img src="http://feeds.feedburner.com/~r/MarkNeedham/~4/wVRQ8C81CK8" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://www.markhneedham.com/blog/2013/04/30/mac-os-x-a-couple-of-neat-tools/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
		<feedburner:origLink>http://www.markhneedham.com/blog/2013/04/30/mac-os-x-a-couple-of-neat-tools/</feedburner:origLink></item>
		<item>
		<title>neo4j/cypher: Returning a row with zero count when no relationship exists</title>
		<link>http://feedproxy.google.com/~r/MarkNeedham/~3/u_kB1-aHa58/</link>
		<comments>http://www.markhneedham.com/blog/2013/04/30/neo4jcypher-returning-a-row-with-zero-count-when-no-relationship-exists/#comments</comments>
		<pubDate>Tue, 30 Apr 2013 07:02:09 +0000</pubDate>
		<dc:creator>Mark Needham</dc:creator>
				<category><![CDATA[neo4j]]></category>
		<category><![CDATA[cypher]]></category>

		<guid isPermaLink="false">http://www.markhneedham.com/blog/?p=5299</guid>
		<description><![CDATA[I&#8217;ve been trying to see if I can match some of the football stats that OptaJoe posts on twitter and one that I was looking at yesterday was around the number of red cards different teams have received. 1 &#8211; Sunderland have picked up their first PL red card of the season. The only team [...]]]></description>
				<content:encoded><![CDATA[<p>I&#8217;ve been trying to see if I can match some of the football stats that <a href="https://twitter.com/OptaJoe">OptaJoe</a> posts on twitter and one that I was looking at yesterday was around the <a href="https://twitter.com/OptaJoe/status/328969438361690113">number of red cards different teams have received</a>.</p>
<blockquote><p>
1 &#8211; Sunderland have picked up their first PL red card of the season. The only team without one now are Man Utd. Angels.
</p></blockquote>
<p>As a refresher, this is the subgraph that we&#8217;ll need to look at to work it out:</p>
<div align="center">
<img src="http://www.markhneedham.com/blog/wp-content/uploads/2013/04/sent_off.png" alt="Sent off" title="sent_off.png" border="0" width="242" height="262" />
</div>
<p>I started off with the following query which traverses out from each match, finds the players who were sent off in the match and then <a href="http://www.markhneedham.com/blog/2013/02/17/neo4jcypher-sql-style-group-by-functionality/">groups</a> the sendings off by the team they were playing for:</p>

<div class="wp_syntax"><div class="code"><pre class="cypher" style="font-family:monospace;">START game = node:matches('match_id:*')
MATCH game&lt;-[:sent_off_in]-player-[:played]-&gt;likeThis-[:in]-&gt;game, 
      likeThis-[:for]-&gt;team
RETURN team.name, COUNT(game) AS redCards
ORDER BY redCards
LIMIT 5</pre></div></div>

<p>When we run this we get the following results:</p>

<div class="wp_syntax"><div class="code"><pre class="text" style="font-family:monospace;">+------------------------------+
| team.name         | redCards |
+------------------------------+
| &quot;Sunderland&quot;      | 1        |
| &quot;West Ham United&quot; | 1        |
| &quot;Norwich City&quot;    | 1        |
| &quot;Reading&quot;         | 1        |
| &quot;Liverpool&quot;       | 2        |
+------------------------------+
5 rows</pre></div></div>

<p>The problem we have here is that it hasn&#8217;t returned Manchester United because they haven&#8217;t yet received any red cards and therefore none of their players match the &#8216;sent_off_in&#8217; relationship.</p>
<p>I ran into something similar in a post I wrote about a month ago where I was <a href="http://www.markhneedham.com/blog/2013/03/20/neo4jcypher-getting-the-hang-of-the-with-statement/">working out which day of the week players scored on</a>.</p>
<p>The first step towards getting Manchester United to return with a count of 0 is to make the &#8216;sent_off_in&#8217; relationship optional.</p>
<p>However, that on its own isn&#8217;t enough because it now returns a count of all the player performances for each team:</p>

<div class="wp_syntax"><div class="code"><pre class="cypher" style="font-family:monospace;">START game = node:matches('match_id:*')
MATCH game&lt;-[?:sent_off_in]-player-[:played]-&gt;likeThis-[:in]-&gt;game, 
      likeThis-[:for]-&gt;team
RETURN team.name, COUNT(game) AS redCards
ORDER BY redCards ASC
LIMIT 5</pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="text" style="font-family:monospace;">+-----------------------------+
| team.name        | redCards |
+-----------------------------+
| &quot;Chelsea&quot;        | 448      |
| &quot;Wigan Athletic&quot; | 459      |
| &quot;Fulham&quot;         | 460      |
| &quot;Liverpool&quot;      | 466      |
| &quot;Everton&quot;        | 467      |
+-----------------------------+
5 rows</pre></div></div>

<p>Instead what we need to do is collect up all the &#8216;sent_off_in&#8217; relationships for each team and count them.</p>
<p>We can use the <a href="http://www.markhneedham.com/blog/2013/03/20/neo4jcypher-with-collect-extract/">COLLECT</a> function to do that and the neat thing about COLLECT is that it skips the null values produced when the optional relationship doesn&#8217;t exist, so we end up with exactly what we need:</p>

<div class="wp_syntax"><div class="code"><pre class="cypher" style="font-family:monospace;">START game = node:matches('match_id:*')
MATCH game&lt;-[r?:sent_off_in]-player-[:played]-&gt;likeThis-[:in]-&gt;game, 
      likeThis-[:for]-&gt;team
RETURN team.name, COLLECT(r) AS redCards
LIMIT 5</pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="text" style="font-family:monospace;">+-----------------------------------------------------------------------------------------------------+
| team.name          | redCards                                                                       |
+-----------------------------------------------------------------------------------------------------+
| &quot;Wigan Athletic&quot;   | [:sent_off_in[26443] {},:sent_off_in[37785] {}]                                |
| &quot;Everton&quot;          | [:sent_off_in[6795] {minute:61},:sent_off_in[21735] {},:sent_off_in[34594] {}] |
| &quot;Newcastle United&quot; | [:sent_off_in[434] {minute:75},:sent_off_in[32389] {},:sent_off_in[34915] {}]  |
| &quot;Southampton&quot;      | [:sent_off_in[49393] {minute:70},:sent_off_in[49392] {minute:82}]              |
| &quot;West Ham United&quot;  | [:sent_off_in[21734] {minute:67}]                                              |
+-----------------------------------------------------------------------------------------------------+
5 rows</pre></div></div>

<p>We then just need to call the LENGTH function to work out how many red cards there are in each collection and then we&#8217;re done:</p>

<div class="wp_syntax"><div class="code"><pre class="cypher" style="font-family:monospace;">START game = node:matches('match_id:*')
MATCH game&lt;-[r?:sent_off_in]-player-[:played]-&gt;likeThis-[:in]-&gt;game, 
      likeThis-[:for]-&gt;team
RETURN team.name, LENGTH(COLLECT(r)) AS redCards
ORDER BY redCards
LIMIT 5</pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="text" style="font-family:monospace;">+--------------------------------+
| team.name           | redCards |
+--------------------------------+
| &quot;Manchester United&quot; | 0        |
| &quot;West Ham United&quot;   | 1        |
| &quot;Sunderland&quot;        | 1        |
| &quot;Norwich City&quot;      | 1        |
| &quot;Reading&quot;           | 1        |
+--------------------------------+
5 rows</pre></div></div>
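<p>The difference between the two approaches can be mimicked outside of cypher. This is a minimal Python sketch over hypothetical rows, not real data from the graph: counting a value which is always present (like &#8216;game&#8217;) counts every player performance, whereas counting only the non-null relationships gives the red cards, including a zero for a team which has none.</p>

```python
from collections import defaultdict

# Hypothetical rows as an optional match might return them: one row per
# player performance, with None where no sent_off_in relationship exists.
rows = [
    ("Manchester United", None),
    ("Manchester United", None),
    ("Sunderland", None),
    ("Sunderland", "sent_off_in[1]"),
    ("Liverpool", "sent_off_in[2]"),
    ("Liverpool", "sent_off_in[3]"),
]

performances = defaultdict(int)  # counting every row, like COUNT(game)
red_cards = defaultdict(int)     # counting only non-null values

for team, sent_off in rows:
    performances[team] += 1
    red_cards[team] += 0 if sent_off is None else 1

# Teams ordered by red cards, fewest first.
for team in sorted(red_cards, key=red_cards.get):
    print(team, performances[team], red_cards[team])
```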

<img src="http://feeds.feedburner.com/~r/MarkNeedham/~4/u_kB1-aHa58" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://www.markhneedham.com/blog/2013/04/30/neo4jcypher-returning-a-row-with-zero-count-when-no-relationship-exists/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		<feedburner:origLink>http://www.markhneedham.com/blog/2013/04/30/neo4jcypher-returning-a-row-with-zero-count-when-no-relationship-exists/</feedburner:origLink></item>
	</channel>
</rss>
