<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/rss2full.xsl"?><?xml-stylesheet type="text/css" media="screen" href="http://feeds.feedburner.com/~d/styles/itemcontent.css"?><rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0" version="2.0">

<channel>
	<title>Connecting Things - Ross Bates</title>
	
	<link>http://www.rossbates.com</link>
	<description />
	<lastBuildDate>Thu, 19 Jan 2012 20:06:19 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3</generator>
		<atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" type="application/rss+xml" href="http://feeds.feedburner.com/connectingthings" /><feedburner:info uri="connectingthings" /><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com/" /><item>
		<title>To Understand is to Perceive Patterns</title>
		<link>http://feedproxy.google.com/~r/connectingthings/~3/J6ttSgeA9n4/</link>
		<comments>http://www.rossbates.com/2012/01/to-understand-is-to-perceive-patterns/#comments</comments>
		<pubDate>Thu, 19 Jan 2012 18:19:27 +0000</pubDate>
		<dc:creator>Ross</dc:creator>
				<category><![CDATA[Collaboration]]></category>
		<category><![CDATA[Linked Data]]></category>
		<category><![CDATA[Misc]]></category>
		<category><![CDATA[Social Networks]]></category>
		<category><![CDATA[Visualization]]></category>

		<guid isPermaLink="false">http://www.rossbates.com/?p=356</guid>
		<description><![CDATA[Take a minute to watch this video made by @jason_silva and @notthisbody From cells to cities, visualizing the world as a series of recurring patterns which can be understood is awe-inspiring to me. To think that once we unlock these patterns we&#8217;ll find nothing is random, that makes me optimistic.]]></description>
			<content:encoded><![CDATA[<p>Take a minute to watch this video made by <a href="https://twitter.com/jason_silva" title="@jason_silva">@jason_silva</a> and <a href="https://twitter.com/notthisbody" title="@notthisbody">@notthisbody</a></p>
<p><iframe src="http://player.vimeo.com/video/34182381?title=0&amp;byline=0&amp;portrait=0" width="400" height="225" frameborder="0" webkitAllowFullScreen mozallowfullscreen allowFullScreen></iframe></p>
<p>From cells to cities, visualizing the world as a series of recurring patterns which can be understood is awe-inspiring to me.</p>
<p>To think that once we unlock these patterns we&#8217;ll find nothing is random, that makes me optimistic.</p>
<img src="http://feeds.feedburner.com/~r/connectingthings/~4/J6ttSgeA9n4" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://www.rossbates.com/2012/01/to-understand-is-to-perceive-patterns/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://www.rossbates.com/2012/01/to-understand-is-to-perceive-patterns/</feedburner:origLink></item>
		<item>
		<title>CherryPy Performance Management</title>
		<link>http://feedproxy.google.com/~r/connectingthings/~3/LPrXRLAg_p8/</link>
		<comments>http://www.rossbates.com/2011/12/cherrypy-new-relic/#comments</comments>
		<pubDate>Wed, 21 Dec 2011 22:24:43 +0000</pubDate>
		<dc:creator>Ross</dc:creator>
				<category><![CDATA[Development]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[Web]]></category>

		<guid isPermaLink="false">http://www.rossbates.com/?p=347</guid>
		<description><![CDATA[Being a Python guy I&#8217;m doing most of my web development these days using CherryPy. If you are not familiar it&#8217;s a  Python web framework that to me strikes just the right balance of simplicity and power. Behind the scenes at Key Ring we also have a large Rails footprint. For the past year we&#8217;ve been using New Relic [...]]]></description>
			<content:encoded><![CDATA[<p>Being a Python guy I&#8217;m doing most of my web development these days using <a title="CherryPy - A Minimalist Python Web Framework" href="http://cherrypy.org/">CherryPy</a>. If you are not familiar it&#8217;s a  Python web framework that to me strikes just the right balance of simplicity and power.</p>
<p>Behind the scenes at <a href="http://keyringapp.com">Key Ring</a> we also have a large Rails footprint. For the past year we&#8217;ve been using <a href="http://newrelic.com/">New Relic</a> to monitor application performance up and down the stack. When it comes to monitoring and troubleshooting a web app I can&#8217;t recommend these guys enough. They&#8217;re constantly making improvements, adding features, and the product has become indispensable to our team.</p>
<p>So I just finished building some data services in CherryPy and was psyched to hook up the New Relic agent to this app. The <a href="http://newrelic.com/docs/python/">docs</a> on the New Relic site are pretty thorough but I wasn&#8217;t able to piece together how to hook the agent into the CherryPy app. I also couldn&#8217;t find anything on the web. Through trial and error I was able to get it working&#8230; it&#8217;s actually extremely easy if you just follow these steps.</p>
<p>&nbsp;</p>
<p>1. Download the latest Python agent. Unpack.</p>
<pre style="padding-left: 30px;"> wget http://download.newrelic.com/python_agent/release/newrelic-1.0.5.156.tar.gz</pre>
<p>2. Run the setup</p>
<pre style="padding-left: 30px;">python setup.py install</pre>
<p>3. Generate a New Relic agent config file using your API key.</p>
<pre style="padding-left: 30px;">newrelic-admin generate-config $YOUR_API_KEY newrelic.ini</pre>
<p>4. Change your app name in the New Relic config. There are some other logging bells and whistles which are all self-explanatory</p>
<p>5.  Add these 2 lines to your site&#8217;s startup script. Make sure you do it early in the script before any database calls, etc&#8230;</p>
<pre style="padding-left: 30px;">import newrelic.agent</pre>
<pre style="padding-left: 30px;">newrelic.agent.initialize('/path/to/newrelic.ini')</pre>
<p>&nbsp;</p>
<p>And that&#8217;s it. In the next minute or so you&#8217;ll begin to see your stats streaming into your New Relic dashboard.</p>
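<p>If you&#8217;d rather script step 4 than edit the file by hand, here is a minimal sketch in Python 3. It assumes the generated file keeps its settings in a <code>[newrelic]</code> section with an <code>app_name</code> entry; <code>set_app_name</code> is just a name made up for this example, not part of the New Relic agent.</p>

```python
import configparser


def set_app_name(ini_path, app_name):
    """Set app_name in the [newrelic] section of an agent config file.

    Hypothetical helper for illustration; it is not part of the agent.
    """
    # interpolation=None leaves any literal % characters in values alone.
    config = configparser.ConfigParser(interpolation=None)
    if not config.read(ini_path):
        raise FileNotFoundError(ini_path)
    if not config.has_section("newrelic"):
        config.add_section("newrelic")
    config.set("newrelic", "app_name", app_name)
    # Caveat: configparser drops comments when it rewrites the file.
    with open(ini_path, "w") as handle:
        config.write(handle)
```

<p>Calling <code>set_app_name('/path/to/newrelic.ini', 'My CherryPy Services')</code> before the agent is initialized should do it. Because the rewrite strips the explanatory comments from the generated file, editing by hand is probably still the better choice if you want to keep them.</p>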
<img src="http://feeds.feedburner.com/~r/connectingthings/~4/LPrXRLAg_p8" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://www.rossbates.com/2011/12/cherrypy-new-relic/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://www.rossbates.com/2011/12/cherrypy-new-relic/</feedburner:origLink></item>
		<item>
		<title>ADB Timeout on Samsung Galaxy S</title>
		<link>http://feedproxy.google.com/~r/connectingthings/~3/35ni9ZPR_1o/</link>
		<comments>http://www.rossbates.com/2011/04/adb-timeout-on-samsung-galaxy-s/#comments</comments>
		<pubDate>Wed, 06 Apr 2011 20:15:33 +0000</pubDate>
		<dc:creator>Ross</dc:creator>
				<category><![CDATA[Misc]]></category>

		<guid isPermaLink="false">http://www.rossbates.com/?p=343</guid>
		<description><![CDATA[I was going crazy this morning trying to get a build of Key Ring running on a Samsung Galaxy S (Vibrant). Phone showed up under &#8216;adb devices&#8217;, I could push the .apk manually using &#8216;adb install&#8217;, but every time I would deploy from Eclipse I&#8217;d get a &#8220;Failed to install app.apk on device &#8216;phone&#8217;: timeout&#8221; I kept [...]]]></description>
			<content:encoded><![CDATA[<p>I was going crazy this morning trying to get a build of Key Ring running on a Samsung Galaxy S (Vibrant). Phone showed up under &#8216;adb devices&#8217;, I could push the .apk manually using &#8216;adb install&#8217;, but every time I would deploy from Eclipse I&#8217;d get a &#8220;Failed to install app.apk on device &#8216;phone&#8217;: timeout&#8221;</p>
<p>I kept searching for issues related to the actual phone, Samsung USB drivers,  etc &#8211; but as these things sometimes turn out it was a simple setting I finally stumbled upon in <a href="http://stackoverflow.com/questions/4775603/android-error-failed-to-install-apk-on-device-timeout">this post</a>. I just needed to increase the ADB timeout to 10000 and everything worked fine. Side note&#8230; these Galaxy S phones are painfully slow to work with, so in the end I wasn&#8217;t surprised by the solution.</p>
<p>Anyway, here&#8217;s hoping that anyone else running into this issue comes upon this post quickly and can get back to productive work. Happy hacking!</p>
<p>&nbsp;</p>
<img src="http://feeds.feedburner.com/~r/connectingthings/~4/35ni9ZPR_1o" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://www.rossbates.com/2011/04/adb-timeout-on-samsung-galaxy-s/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		<feedburner:origLink>http://www.rossbates.com/2011/04/adb-timeout-on-samsung-galaxy-s/</feedburner:origLink></item>
		<item>
		<title>Thoughts on the Real Time Web</title>
		<link>http://feedproxy.google.com/~r/connectingthings/~3/idVyvXWXkY8/</link>
		<comments>http://www.rossbates.com/2009/12/thoughts-on-the-real-time-web/#comments</comments>
		<pubDate>Tue, 08 Dec 2009 18:41:15 +0000</pubDate>
		<dc:creator>Ross</dc:creator>
				<category><![CDATA[Collaboration]]></category>
		<category><![CDATA[Misc]]></category>
		<category><![CDATA[Social Networks]]></category>
		<category><![CDATA[Web]]></category>
		<category><![CDATA[Facebook]]></category>
		<category><![CDATA[Twitter]]></category>

		<guid isPermaLink="false">http://www.rossbates.com/?p=321</guid>
		<description><![CDATA[With Google&#8217;s announcement that they are now including live updates from Twitter, Facebook, and MySpace into their search results I expect the term Real Time Web is going to become more familiar to the non TechCrunch public. While the term “Real Time” has taken off over the past 6 months most realize that our existing [...]]]></description>
			<content:encoded><![CDATA[<p>With Google&#8217;s <a href="http://googleblog.blogspot.com/2009/12/relevance-meets-real-time-web.html">announcement</a> that they are now including live updates from Twitter, Facebook, and MySpace into their search results I expect the term Real Time Web is going to become more familiar to the non <a href="http://www.google.com/search?hl=en&amp;q=site%3Atechcrunch.com+Real+Time&amp;aq=f&amp;oq=&amp;aqi=">TechCrunch</a> public.</p>
<p>While the term “Real Time” has taken off over the past 6 months most realize that our existing communications infrastructure already operates at near real time. You send an email, it arrives in seconds. You place a call, someone picks up. Blog posts, satellite television, GPS, IM, etc etc etc.</p>
<p>I’d say the fundamental shift in behavior we are seeing on the web today is related to “Always On”. It’s ubiquitous network connectivity that makes us feel the <em>already</em> real time nature of the web even more.</p>
<p>So what&#8217;s up with Real Time Search and the Real Time Web? Basically it’s about content being indexed and presented in search results as fast as it&#8217;s being produced. This is certainly a step in the right direction towards the larger goal of instant and ubiquitous human knowledge – “when I know, you know”.  The problem is there&#8217;s just too much noise when you turn on the stream and the only filters in place are keywords.</p>
<p>The technology is important though; data must be collected and indexed before it can be filtered/ranked. We’re getting there.</p>
<p>What gets me excited about the Real Time Web are the ways it can be used to augment existing methods for consumption of news and entertainment. Imagine the ways that the <a href="http://en.wikipedia.org/wiki/Publish/subscribe">PubSub</a> model combined with Real Time Search will allow people to &#8220;tune-in&#8221; to personalized data feeds during sporting events, tv shows, breaking news.</p>
<p>For example, when I am watching the Dallas Cowboys on TV I don&#8217;t want to type &#8220;Dallas Cowboys&#8221; into a search engine and be flooded by results. I want to tune-in to a list of people that I&#8217;ve selected (or have been recommended).  These people may be professionals, they might be my neighbor. It&#8217;s these people that will be providing insight, analysis, and commentary. Troy Aikman and Joe Buck? Nope. I want comedy. I want bias. I want camaraderie. Then when the game is over I want to tune out, I want it all to go away.</p>
<p>To me the Real Time Web is not about speed, it&#8217;s about moving past the period where Social Networks are persistent. The Real Time Web will introduce Social Networks that are dynamic.  Networks that emerge and disappear in short spans of time. These networks will be asynchronous &#8211; increasingly the Real Time Web will look more like the Real World.</p>
<img src="http://feeds.feedburner.com/~r/connectingthings/~4/idVyvXWXkY8" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://www.rossbates.com/2009/12/thoughts-on-the-real-time-web/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		<feedburner:origLink>http://www.rossbates.com/2009/12/thoughts-on-the-real-time-web/</feedburner:origLink></item>
		<item>
		<title>urllib2 With Multiple Network Interfaces</title>
		<link>http://feedproxy.google.com/~r/connectingthings/~3/k8fzUfjE-Xw/</link>
		<comments>http://www.rossbates.com/2009/10/urllib2-with-multiple-network-interfaces/#comments</comments>
		<pubDate>Mon, 26 Oct 2009 21:08:34 +0000</pubDate>
		<dc:creator>Ross</dc:creator>
				<category><![CDATA[Development]]></category>
		<category><![CDATA[Misc]]></category>
		<category><![CDATA[python]]></category>

		<guid isPermaLink="false">http://www.rossbates.com/?p=304</guid>
		<description><![CDATA[Normally if I have an issue which is answered in the first 2-3 results of a Google search I won&#8217;t create a post. On the other hand when I spend 2-3 hours trying to solve something which should be simple I like to take the opportunity to describe the issue &#38; resolution in hopes that [...]]]></description>
			<content:encoded><![CDATA[<p>Normally if I have an issue which is answered in the first 2-3 results of a Google search I won&#8217;t create a post. On the other hand when I spend 2-3 hours trying to solve something which should be simple I like to take the opportunity to describe the issue &amp; resolution in hopes that someone will find it quickly in the future.</p>
<p>So the task here was to find a way to specify the IP address, aka socket, aka network interface when making an HTTP request using Python&#8217;s urllib2. Why would you want to do this, you ask? Well for many web APIs the request rate is limited by whitelisting the IP address &#8211; such is the case with Twitter. In the event that you want to be able to use the same machine (with multiple network interfaces) to run jobs in parallel you need to be able to specify where the requests should be routed.</p>
<p>The problem is Python&#8217;s urllib2 is based on the <span>httplib library which doesn&#8217;t let you specify which address to bind to. This person <a href="http://www.opensubscriber.com/message/python-list@python.org/1463382.html">tried to get around the problem</a> in 2005 without any luck, another guy <a href="http://bugs.python.org/issue3972">created a patch</a> for httplib in 2008 which  hasn&#8217;t been accepted, and finally someone else created <a href="http://www.thegoldfish.org/2009/05/python-httpconnection-bound-to-network-interface/">a subclass for httplib</a> which unfortunately I couldn&#8217;t get hooked up to the urllib2 class.</span></p>
<p><span>The best solution I found was this &#8220;monkey patch&#8221; from Alex Martelli over on <a href="http://stackoverflow.com/questions/1150332/source-interface-with-python-and-urllib2">Stack Overflow</a>.  In his example he attacks the problem using the socket library instead of the httplib. By his own admission stuff like this is not ideal, but the solution is actually very simple and elegant. I like it.</span></p>
<p>I wrapped the snippet up into a function which can be called in a Python script anytime before you invoke a urllib2 request.</p>
<pre>import socket

def bind_alt_socket(alt_ip):
    # Keep the real constructor, then swap in a wrapper that binds every
    # new socket to alt_ip (port 0 lets the OS pick a free port).
    true_socket = socket.socket
    def bound_socket(*a, **k):
        sock = true_socket(*a, **k)
        sock.bind((alt_ip, 0))
        return sock
    socket.socket = bound_socket</pre>
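<p>One downside of the patch is that it stays in place for the rest of the process. A rough sketch of the same idea as a context manager, written for Python 3, puts the original constructor back when the block exits. <code>bound_to</code> is a name invented for this example, and the code assumes only client-side sockets are created while the patch is active.</p>

```python
import contextlib
import socket


@contextlib.contextmanager
def bound_to(alt_ip):
    """Temporarily bind every newly created socket to alt_ip.

    Hypothetical helper for illustration. It is process-wide and not
    thread-safe, just like the original monkey patch.
    """
    true_socket = socket.socket

    def bound_socket(*a, **k):
        sock = true_socket(*a, **k)
        sock.bind((alt_ip, 0))  # port 0: let the OS choose the source port
        return sock

    socket.socket = bound_socket
    try:
        yield
    finally:
        # Always restore the real constructor, even if the request fails.
        socket.socket = true_socket
```

<p>You would wrap just the request in <code>with bound_to(alt_ip):</code> and everything outside the block behaves normally. Worth knowing: since Python 2.7 and 3.2 the standard library&#8217;s <code>HTTPConnection</code> accepts a <code>source_address</code> argument, so when you control the connection object directly you may not need a monkey patch at all.</p>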
<p>Hope this can be of help to someone in the future who&#8217;s searching for the same thing I was.</p>
<img src="http://feeds.feedburner.com/~r/connectingthings/~4/k8fzUfjE-Xw" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://www.rossbates.com/2009/10/urllib2-with-multiple-network-interfaces/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		<feedburner:origLink>http://www.rossbates.com/2009/10/urllib2-with-multiple-network-interfaces/</feedburner:origLink></item>
		<item>
		<title>MySQL Upgrade Issue</title>
		<link>http://feedproxy.google.com/~r/connectingthings/~3/LNlcFjEuLZA/</link>
		<comments>http://www.rossbates.com/2009/10/mysql-upgrade-issue/#comments</comments>
		<pubDate>Tue, 13 Oct 2009 17:20:59 +0000</pubDate>
		<dc:creator>Ross</dc:creator>
				<category><![CDATA[Databases]]></category>
		<category><![CDATA[Misc]]></category>
		<category><![CDATA[mysql]]></category>

		<guid isPermaLink="false">http://www.rossbates.com/?p=300</guid>
		<description><![CDATA[I just spent more time than I should have troubleshooting why the upgrade of MySQL from 5.0 to 5.1 on a Debian box resulted in a MySQL instance that wouldn&#8217;t start. Not a lot out there on this so hopefully this will save someone a bit of time in the future. When upgrading from 5.0 [...]]]></description>
			<content:encoded><![CDATA[<p>I just spent more time than I should have troubleshooting why the upgrade of MySQL from 5.0 to 5.1 on a Debian box resulted in a MySQL instance that wouldn&#8217;t start. Not a lot out there on this so hopefully this will save someone a bit of time in the future.</p>
<p>When upgrading from 5.0 to 5.1 using apt everything will install normally. Then when the MySQL service tries to restart you&#8217;ll see an init.d error and an error that looks something like this:</p>
<p style="padding-left: 30px;"><em><strong>Errors were encountered while processing:<br /> mysql-server-5.1<br /> mysql-server</strong></em></p>
<p>Not a lot to go on here, but as it turns out there is a deprecated entry in the my.cnf file called <strong>skip-bdb</strong>.  Comment this line out and you should be good to go.</p>
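<p>If you have more than one box to fix, the edit is easy to script. This is only a sketch: <code>comment_out_option</code> is a made-up helper that works on the text of the config file, and it assumes the option sits on a line by itself. Reading and writing the real my.cnf (and backing it up first) is left to you.</p>

```python
def comment_out_option(cnf_text, option="skip-bdb"):
    """Return cnf_text with any line consisting only of `option` commented out.

    Hypothetical helper for illustration; other lines pass through untouched.
    """
    fixed = []
    for line in cnf_text.splitlines(keepends=True):
        # Match the bare option only, ignoring surrounding whitespace.
        if line.strip() == option:
            line = "# " + line
        fixed.append(line)
    return "".join(fixed)
```

<p>Lines that are already commented out don&#8217;t match, so running it twice is harmless.</p>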
<img src="http://feeds.feedburner.com/~r/connectingthings/~4/LNlcFjEuLZA" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://www.rossbates.com/2009/10/mysql-upgrade-issue/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://www.rossbates.com/2009/10/mysql-upgrade-issue/</feedburner:origLink></item>
		<item>
		<title>All Programming is Programming</title>
		<link>http://feedproxy.google.com/~r/connectingthings/~3/twJ0Nn5-C8I/</link>
		<comments>http://www.rossbates.com/2009/08/all-programming-is-programming/#comments</comments>
		<pubDate>Sun, 16 Aug 2009 16:08:28 +0000</pubDate>
		<dc:creator>Ross</dc:creator>
				<category><![CDATA[Collaboration]]></category>
		<category><![CDATA[Development]]></category>
		<category><![CDATA[javascript]]></category>

		<guid isPermaLink="false">http://www.rossbates.com/?p=291</guid>
		<description><![CDATA[Over on Coding Horror Jeff Atwood wrote a post which claimed that &#8220;All Programming is Web Programming&#8221;. Jeff made some good points about how the web provides programmers with the ability to reach an audience of a previously unimaginable size. Definitely agree. Also mentioned was that for better or for worse JavaScript is becoming the [...]]]></description>
			<content:encoded><![CDATA[<p>Over on Coding Horror Jeff Atwood wrote a <a href="http://www.codinghorror.com/blog/archives/001296.html">post</a> which claimed that &#8220;<em>All Programming is Web Programming</em>&#8221;. Jeff made some good points about how the web provides programmers with the ability to reach an audience of a previously unimaginable size. Definitely agree. Also mentioned was that for better or for worse JavaScript is becoming the most important language in the world of software development. I agree with this as well, though I would add that the significance of JavaScript is in user facing applications only at this point.</p>
<p>It was unfortunate that the post was written in such a polarizing manner and that the comment thread quickly eroded into a shouting match because there is another important point to be made here.</p>
<p>Something else I want to put out there to developers is that the evolution of programming should be focused less on desktop vs web vs embedded or  choice of language/platform, and more on how lowering the barriers to entry for new programmers is a positive and not a negative.</p>
<p>To someone writing device drivers or kernel patches the idea of writing a JavaScript function to manipulate the DOM may seem &#8220;uninteresting&#8221;, but the fact is that more and more people are getting started with programming this way. All they need is a text editor and a web browser and they are on their way. This is a very good thing.</p>
<p>Programming is about automation and automation is about improving efficiency. The more people we can somehow involve in this process the better because in the end the web provides not only the largest number of potential users, but also the largest number of potential programmers. The exciting thing is we are just getting started.</p>
<img src="http://feeds.feedburner.com/~r/connectingthings/~4/twJ0Nn5-C8I" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://www.rossbates.com/2009/08/all-programming-is-programming/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		<feedburner:origLink>http://www.rossbates.com/2009/08/all-programming-is-programming/</feedburner:origLink></item>
		<item>
		<title>Structuring the Unstructured</title>
		<link>http://feedproxy.google.com/~r/connectingthings/~3/ggo6B-cmnTY/</link>
		<comments>http://www.rossbates.com/2009/08/structuring-the-unstructured/#comments</comments>
		<pubDate>Thu, 06 Aug 2009 15:28:22 +0000</pubDate>
		<dc:creator>Ross</dc:creator>
				<category><![CDATA[Analytics]]></category>
		<category><![CDATA[Databases]]></category>

		<guid isPermaLink="false">http://www.rossbates.com/?p=281</guid>
		<description><![CDATA[Martin Willcox from Teradata wrote a couple of blog posts outlining the reasons why he feels the phrase &#8220;unstructured data&#8221; is marketing jargon and that &#8220;nontraditional data&#8221; is more appropriate. Let me start by saying that the examples Martin uses in the first post are technically accurate if we were all disk manufacturers. Whether bitmap [...]]]></description>
			<content:encoded><![CDATA[<p>Martin Willcox from Teradata wrote a couple of <a href="http://www.teradata.com/t/blogs/emea/its_data_Jim_but_not_as_we_know_it/">blog</a> <a href="http://www.teradata.com/t/blogs/emea/Its-data-jim-but-not-as-we-know-it-Part2/">posts</a> outlining the reasons why he feels the phrase &#8220;unstructured data&#8221; is marketing jargon and that &#8220;nontraditional data&#8221; is more appropriate.</p>
<p>Let me start by saying that the examples Martin uses in the first post are technically accurate if we were all disk manufacturers. Whether bitmap (audio, video) or text (email, html), it&#8217;s true all of these file types use a structured format when being processed by a computer. That being said, we are not all disk manufacturers.</p>
<p>As a data architect I&#8217;ve always felt the true spirit of the phrase &#8220;unstructured data&#8221; corresponds to the modeling and analysis of the data. If you have a collection of objects in an email, an image, or web page&#8230; then these things are unstructured. They tell you nothing without the context of the structured model.</p>
<p>If this were simply a preference in terminology then I wouldn&#8217;t think too much of it, but when a relational database vendor claims that &#8220;nontraditional&#8221; (unstructured) data is easily converted to &#8220;traditional&#8221; data by running fact/entity extraction routines and loading a table it makes me stop and question the true intent of the original message. It&#8217;s not as simple as pushing a button, and an RDBMS is most often not your best option. This isn&#8217;t something which should be glossed over.</p>
<p>The problem is that when using a relational database schema the relationships, attributes, and quantities must be defined before running any extraction routines. That&#8217;s ok when running against a fixed set of data looking for a known set of attributes/measures &#8211; but when you are mining millions of images or billions of web pages all of the edges don&#8217;t start to show up until you actually start to extract and analyze the data. In this situation a relational database actually makes it harder to consume unstructured data due to the high cost associated with schema changes.</p>
<p>To me the term unstructured makes sense&#8230; it&#8217;s simply the inverse of structured. Data without a model if you will.  And remember, the larger and more diverse the data set, the less you will know about its characteristics ahead of time.</p>
<img src="http://feeds.feedburner.com/~r/connectingthings/~4/ggo6B-cmnTY" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://www.rossbates.com/2009/08/structuring-the-unstructured/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://www.rossbates.com/2009/08/structuring-the-unstructured/</feedburner:origLink></item>
		<item>
		<title>Refactoring and Resistance</title>
		<link>http://feedproxy.google.com/~r/connectingthings/~3/MAlx0uCrotc/</link>
		<comments>http://www.rossbates.com/2009/07/refactoring-and-resistance/#comments</comments>
		<pubDate>Tue, 21 Jul 2009 22:29:50 +0000</pubDate>
		<dc:creator>Ross</dc:creator>
				<category><![CDATA[Development]]></category>
		<category><![CDATA[Linked Data]]></category>
		<category><![CDATA[software design]]></category>

		<guid isPermaLink="false">http://www.rossbates.com/?p=269</guid>
		<description><![CDATA[Yesterday Paul Miller wrote a blog post called &#8220;Does Linked Data need RDF?&#8221; In the comments I said something in support of Paul&#8217;s question which I want to expand upon here. My comment was not directly related to the topic (linked data), but rather the format in which it was asked. Let me trim it [...]]]></description>
			<content:encoded><![CDATA[<p>Yesterday Paul Miller wrote a blog post called &#8220;<a href="http://cloudofdata.com/2009/07/does-linked-data-need-rdf/">Does Linked Data need RDF</a>?&#8221; In the comments I said something in support of Paul&#8217;s question which I want to expand upon here. My comment was not directly related to the topic (linked data), but rather the format in which it was asked. Let me trim it down to this:</p>
<address style="padding-left: 30px;"><strong>&#8220;Could we do ${solution} without ${component}?&#8221;</strong></address>
<p>What I am thinking is this&#8230; as system architects or software engineers this is a critical question to ask not only during design, but also as our solutions age over time. When asked this way it challenges us to think about dependencies from a different perspective. Instead of &#8220;<em>are things working? If so, add new feature</em>&#8221;, we should equally be looking for ways to remove, replace, or consolidate components for the benefits of efficiency, simplicity and performance. Sounds simple, right? Then why is it so hard to do?</p>
<p>I don&#8217;t think we go through this exercise often enough because politically it can be very difficult. The sheer inertia of a large project can prevent people from even thinking about the question. The thing is that writing code, choosing a framework, establishing an API&#8230; these are all sunk costs. We often attach emotional value to the time and effort that went into doing the work, and have a hard time imagining what things would look like without them.</p>
<p>It&#8217;s certainly a challenge, but to build better systems we need to be able to let things go, to scrap code, and to replace components without emotions or personal bias.</p>
<p>Oh and for the record, I do think RDF and Linked Data are the right combination to build the Web of Data. I just hope that we can keep asking questions, challenging assumptions, and continue to have constructive debates about the future of the Web like the one which took place yesterday on Paul&#8217;s website. Great stuff.</p>
<img src="http://feeds.feedburner.com/~r/connectingthings/~4/MAlx0uCrotc" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://www.rossbates.com/2009/07/refactoring-and-resistance/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		<feedburner:origLink>http://www.rossbates.com/2009/07/refactoring-and-resistance/</feedburner:origLink></item>
		<item>
		<title>Getting Into Amazon EC2</title>
		<link>http://feedproxy.google.com/~r/connectingthings/~3/dcouwLGigDg/</link>
		<comments>http://www.rossbates.com/2009/07/getting-into-amazon-ec2/#comments</comments>
		<pubDate>Mon, 20 Jul 2009 03:45:47 +0000</pubDate>
		<dc:creator>Ross</dc:creator>
				<category><![CDATA[Cloud Computing]]></category>
		<category><![CDATA[Databases]]></category>
		<category><![CDATA[aws]]></category>

		<guid isPermaLink="false">http://www.rossbates.com/?p=262</guid>
		<description><![CDATA[I spent some time this weekend diving deeper into Amazon&#8217;s EC2 and all of the associated services. I&#8217;ve read about EC2, discussed it with colleagues, I pretty much thought I knew what it was all about&#8230;.. virtual hosting right? Yeah, I was wrong. It was going through the process of setting up an instance and [...]]]></description>
			<content:encoded><![CDATA[<p>I spent some time this weekend diving deeper into Amazon&#8217;s <a href="http://aws.amazon.com/ec2/">EC2</a> and all of the associated services. I&#8217;ve read about EC2, discussed it with colleagues, I pretty much thought I knew what it was all about&#8230;.. virtual hosting right? Yeah, I was wrong. It was going through the process of setting up an instance and fully configuring all the network and storage services that changed my perspective. EC2 is really, really cool.</p>
<p>What is really rocking my world is the whole concept of throw-away servers. The idea that a discrete process can spin up a new server that gets built at run time, does some work, then disappears is amazing.  I see this as turning the whole concept of linear scale on its head. You don&#8217;t scale an app, you scale individual threads. Powerful stuff, especially when dealing with data mining and event processing.</p>
<p>Much more coming soon&#8230;..</p>
<img src="http://feeds.feedburner.com/~r/connectingthings/~4/dcouwLGigDg" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://www.rossbates.com/2009/07/getting-into-amazon-ec2/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://www.rossbates.com/2009/07/getting-into-amazon-ec2/</feedburner:origLink></item>
	</channel>
</rss>
