<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/rss2full.xsl"?><?xml-stylesheet type="text/css" media="screen" href="http://feeds.feedburner.com/~d/styles/itemcontent.css"?><!-- generator="wordpress/2.1" --><rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0" version="2.0">

<channel>
	<title>Enabling the Distributed Family Tree</title>
	<link>http://www.dftproject.org/blog</link>
	<description />
	<pubDate>Wed, 05 Mar 2008 04:04:17 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.1</generator>
	<language>en</language>
			<atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" href="http://feeds.feedburner.com/dft" type="application/rss+xml" /><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com" /><item>
		<title>Oh, I Almost Forgot…</title>
		<link>http://feedproxy.google.com/~r/dft/~3/91gWfN5Rl-E/</link>
		<comments>http://www.dftproject.org/blog/2008/03/04/oh-i-almost-forgot/#comments</comments>
		<pubDate>Wed, 05 Mar 2008 04:04:17 +0000</pubDate>
		<dc:creator>Hilton</dc:creator>
		
		<category><![CDATA[Genesis]]></category>

		<guid isPermaLink="false">http://www.dftproject.org/blog/2008/03/04/oh-i-almost-forgot/</guid>
		<description><![CDATA[
]]></description>
			<content:encoded><![CDATA[<p><img src='http://www.dftproject.org/blog/wp-content/uploads/2008/03/duplicates.png' alt='Screenshot of Genesis automatically finding potential duplicates' /></p>
]]></content:encoded>
			<wfw:commentRss>http://www.dftproject.org/blog/2008/03/04/oh-i-almost-forgot/feed/</wfw:commentRss>
		<feedburner:origLink>http://www.dftproject.org/blog/2008/03/04/oh-i-almost-forgot/</feedburner:origLink></item>
		<item>
		<title>Not to Mention…</title>
		<link>http://feedproxy.google.com/~r/dft/~3/7ZBu8NfGSZU/</link>
		<comments>http://www.dftproject.org/blog/2008/02/26/not-to-mention/#comments</comments>
		<pubDate>Tue, 26 Feb 2008 21:28:32 +0000</pubDate>
		<dc:creator>Hilton</dc:creator>
		
		<category><![CDATA[Genesis]]></category>

		<guid isPermaLink="false">http://www.dftproject.org/blog/2008/02/26/not-to-mention/</guid>
		<description><![CDATA[Samuel Martin&#8217;s ancestors on one Web site:

Priscilla Layton&#8217;s ancestors on another Web site:

The two pedigrees seamlessly linked:

]]></description>
			<content:encoded><![CDATA[<p>Samuel Martin&#8217;s ancestors on one Web site:</p>
<p><img src='http://www.dftproject.org/blog/wp-content/uploads/2008/02/pedigree-site1.png' alt='Pedigree from Site 1' /></p>
<p>Priscilla Layton&#8217;s ancestors on another Web site:</p>
<p><img src='http://www.dftproject.org/blog/wp-content/uploads/2008/02/pedigree-site2.png' alt='Pedigree from Site 2' /></p>
<p>The two pedigrees seamlessly linked:</p>
<p><img src='http://www.dftproject.org/blog/wp-content/uploads/2008/02/pedigree-merged.png' alt='Virtually Merged Pedigrees' /></p>
]]></content:encoded>
			<wfw:commentRss>http://www.dftproject.org/blog/2008/02/26/not-to-mention/feed/</wfw:commentRss>
		<feedburner:origLink>http://www.dftproject.org/blog/2008/02/26/not-to-mention/</feedburner:origLink></item>
		<item>
		<title>Also Coming Soon…</title>
		<link>http://feedproxy.google.com/~r/dft/~3/7uh_TWejN5c/</link>
		<comments>http://www.dftproject.org/blog/2008/02/19/also-coming-soon/#comments</comments>
		<pubDate>Tue, 19 Feb 2008 18:59:37 +0000</pubDate>
		<dc:creator>Hilton</dc:creator>
		
		<category><![CDATA[Genesis]]></category>

		<guid isPermaLink="false">http://www.dftproject.org/blog/2008/02/19/also-coming-soon/</guid>
		<description><![CDATA[

]]></description>
			<content:encoded><![CDATA[<p><img src='http://www.dftproject.org/blog/wp-content/uploads/2008/02/pedigree-teaser.png' alt='Pedigree' /></p>
<p><img src='http://www.dftproject.org/blog/wp-content/uploads/2008/02/pedigree-zoom-teaser.png' alt='Zoomed-out Pedigree' /></p>
]]></content:encoded>
			<wfw:commentRss>http://www.dftproject.org/blog/2008/02/19/also-coming-soon/feed/</wfw:commentRss>
		<feedburner:origLink>http://www.dftproject.org/blog/2008/02/19/also-coming-soon/</feedburner:origLink></item>
		<item>
		<title>Coming Soon…</title>
		<link>http://feedproxy.google.com/~r/dft/~3/E4vmg0hx2l4/</link>
		<comments>http://www.dftproject.org/blog/2008/02/12/coming-soon/#comments</comments>
		<pubDate>Wed, 13 Feb 2008 04:02:47 +0000</pubDate>
		<dc:creator>Hilton</dc:creator>
		
		<category><![CDATA[Genesis]]></category>

		<guid isPermaLink="false">http://www.dftproject.org/blog/2008/02/12/coming-soon/</guid>
		<description><![CDATA[
]]></description>
			<content:encoded><![CDATA[<p><img src='http://www.dftproject.org/blog/wp-content/uploads/2008/02/circle-diagram-teaser.png' alt='Circle Diagram' /></p>
]]></content:encoded>
			<wfw:commentRss>http://www.dftproject.org/blog/2008/02/12/coming-soon/feed/</wfw:commentRss>
		<feedburner:origLink>http://www.dftproject.org/blog/2008/02/12/coming-soon/</feedburner:origLink></item>
		<item>
		<title>Goodbye Database!</title>
		<link>http://feedproxy.google.com/~r/dft/~3/EQ3zxJ2Y6Wg/</link>
		<comments>http://www.dftproject.org/blog/2007/06/19/goodbye-database/#comments</comments>
		<pubDate>Tue, 19 Jun 2007 20:39:32 +0000</pubDate>
		<dc:creator>Hilton</dc:creator>
		
		<category><![CDATA[Genesis]]></category>

		<category><![CDATA[PGVAgent]]></category>

		<guid isPermaLink="false">http://www.dftproject.org/blog/2007/06/19/goodbye-database/</guid>
		<description><![CDATA[In a completely unanticipated reversal, Genesis is now shedding its database.  &#8220;What!?&#8221;, you ask in blinking disbelief (as I seem to have attracted a particularly quiet readership, I get to put words in your mouth).  An excellent question, I&#8217;m glad you asked.  Permit me, if you will, to entertain it.
Way, way back in the beginning, long before I&#8217;d ever even [...]]]></description>
			<content:encoded><![CDATA[<p>In a completely unanticipated reversal, Genesis is now shedding its database.  &#8220;What!?&#8221;, you ask in blinking disbelief (as I seem to have attracted a particularly quiet readership, I get to put words in your mouth).  An excellent question, I&#8217;m glad you asked.  Permit me, if you will, to entertain it.</p>
<p>Way, way back in the beginning, long before I&#8217;d ever even heard of this semantic web thing, I was planning on Genesis being nothing more than a really good record manager (like PAF, only usable).  This of course necessitated a database for storing all the data on the user&#8217;s computer.  I always assumed that one would be there, even though the whole concept later evolved.  It never occurred to me that a database wasn&#8217;t really necessary anymore.</p>
<p>The flash of insight came yesterday morning as I was contemplating the next step in the replumbing/resurfacing effort.  I don&#8217;t recall the exact circumstances, but I do remember asking myself what would happen if I stopped caching data in the database.  Well, performance would go through the roof, for starters!  Startup and shutdown time would become negligible.  Disk space usage would fall dramatically.  And perhaps most important of all, I could take advantage of the <a href="http://jena.sourceforge.net/inference/index.html#owl">OWL inference support in Jena</a>!</p>
<p>This last point bears explanation.  A major part of this project is the ability for the user to indicate that Person A and Person B are in fact the same person.  This is done by creating an <code>owl:sameAs</code> relationship between the two.  Given this fact, Genesis should infer (using the <a href="http://www.w3.org/TR/owl-ref/">OWL inference rules</a>) that anything said about Person A is also true about Person B, and vice versa; the two are effectively one.  With some tricks this could be done efficiently using a database.  However, anything more complex would be next to impossible without bloating the size (and reducing the speed) of the database by several orders of magnitude; all inferences would need to be precomputed each time new data is added to the database.  But inferences like this can be done in-memory (without a database) <em>on demand!</em></p>
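<p>To make the on-demand idea concrete, here is a minimal sketch in plain Python.  It is not Jena&#8217;s reasoner and not Genesis code: the <code>SameAsGraph</code> class, the resource names, and the <code>birthYear</code> property are all hypothetical, and only the <code>owl:sameAs</code> equivalence is modelled.  Nothing is copied or precomputed when a link is added; the aliases are resolved only when someone asks about a person.</p>

```python
from collections import defaultdict


class SameAsGraph:
    """Toy in-memory store that resolves owl:sameAs links at query time."""

    def __init__(self):
        self._parent = {}               # union-find forest over resource names
        self._facts = defaultdict(set)  # resource -> {(predicate, value)}

    def _find(self, node):
        # Walk up to the representative of this node's equivalence class.
        self._parent.setdefault(node, node)
        while self._parent[node] != node:
            self._parent[node] = self._parent[self._parent[node]]  # path halving
            node = self._parent[node]
        return node

    def add(self, subject, predicate, value):
        if predicate == "owl:sameAs":
            # Merge the two classes; no statements are copied.
            self._parent[self._find(subject)] = self._find(value)
        else:
            self._find(subject)  # make sure the subject is known
            self._facts[subject].add((predicate, value))

    def describe(self, resource):
        # Collect the facts of every alias only when asked (on demand).
        root = self._find(resource)
        aliases = [n for n in list(self._parent) if self._find(n) == root]
        return {fact for alias in aliases for fact in self._facts[alias]}


graph = SameAsGraph()
graph.add("site1:samuel", "name", "Samuel Martin")
graph.add("site2:s_martin", "birthYear", "1820")  # hypothetical data
graph.add("site1:samuel", "owl:sameAs", "site2:s_martin")
print(graph.describe("site2:s_martin"))
```

<p>Asking about either identifier returns both statements, because the two resources now share one equivalence class.</p>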
<p>Well, there are obviously many positive aspects, but are there any downsides?  The most obvious drawback is the fact that it takes PGVAgent a long time to search each PhpGedView website one-by-one.  Having a database means that these search results can be cached for future searches.  If the database goes, so does the persistent caching.  In fact, this is why I had never considered dropping the database before.  Which raises the question, why did I suddenly start considering it now?</p>
]]></content:encoded>
			<wfw:commentRss>http://www.dftproject.org/blog/2007/06/19/goodbye-database/feed/</wfw:commentRss>
		<feedburner:origLink>http://www.dftproject.org/blog/2007/06/19/goodbye-database/</feedburner:origLink></item>
		<item>
		<title>Search Disclaimer</title>
		<link>http://feedproxy.google.com/~r/dft/~3/bFaFxZZx4zI/</link>
		<comments>http://www.dftproject.org/blog/2007/06/12/search-disclaimer/#comments</comments>
		<pubDate>Tue, 12 Jun 2007 12:22:05 +0000</pubDate>
		<dc:creator>Hilton</dc:creator>
		
		<category><![CDATA[Genesis]]></category>

		<guid isPermaLink="false">http://www.dftproject.org/blog/2007/06/12/search-disclaimer/</guid>
		<description><![CDATA[After the last few posts on the new search, I think it&#8217;s appropriate to mention that the purpose of all this is not to create the ultimate genealogy search engine.  Others are tackling that beast, and more power to them.  If I wanted to get in on that action, I would have written a search engine that [...]]]></description>
			<content:encoded><![CDATA[<p>After the last few posts on the new search, I think it&#8217;s appropriate to mention that the purpose of all this is not to create the ultimate genealogy search engine.  Others are tackling that beast, and more power to them.  If I wanted to get in on that action, I would have written a search engine that crawls PGV websites and lets users search from their web browser, rather than writing a plug-in for a client application that searches PGV websites one-by-one.  I hope that someone <em>does</em> put up a web server that does that (soon? please?), and that they or someone else provides a plug-in that accesses it so that Genesis can perform a single search instead of many.  But searching the PGV websites one-by-one serves my purpose just fine.</p>
<p>So what is that purpose?  To patch up all the independent genealogical trees out there into one distributed family tree (see the title of this blog, up there at the top of this page).  These improvements to search, particularly the Lucene index I mentioned yesterday, allow Genesis to take a given individual and ask, &#8220;Are there any other individuals out there that are similar to this one?&#8221;  Using the index, a list of candidates in order of decreasing likelihood can be quickly returned, the most likely of which can be scrutinized in greater detail to find potential matches.  The user then has the ability to confirm, reject, or punt on suggested matches, implicitly forging links between hitherto independent genealogical trees.  These links will be stored directly on participating websites (PGV already provides a mechanism for this), as well as on third-party servers where websites do not provide a mechanism.</p>
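<p>As a rough illustration of &#8220;candidates in order of decreasing likelihood&#8221;, the sketch below builds a tiny inverted index in plain Python and ranks other records by how many tokens they share with a given individual.  It does not use Lucene, whose scoring is far more sophisticated; the function names, record ids, and sample records are hypothetical.</p>

```python
from collections import Counter, defaultdict


def build_index(people):
    """Map each lowercase token to the ids of the records that contain it."""
    index = defaultdict(set)
    for person_id, text in people.items():
        for token in text.lower().split():
            index[token].add(person_id)
    return index


def candidates(index, people, query_id):
    """Return other record ids ordered by decreasing number of shared tokens."""
    scores = Counter()
    for token in set(people[query_id].lower().split()):
        for other_id in index[token]:
            if other_id != query_id:
                scores[other_id] += 1
    # Highest score first; ties are broken by id so the order is deterministic.
    return sorted(scores, key=lambda pid: (-scores[pid], pid))


# Hypothetical records from three different websites.
people = {
    "a:1": "Samuel Martin 1820 Ohio",
    "b:7": "Samuel Martin 1820",
    "b:9": "Priscilla Layton 1822 Ohio",
    "c:3": "John Doe",
}
index = build_index(people)
print(candidates(index, people, "a:1"))
```

<p>Only the top few candidates would then need the more expensive, detailed comparison.</p>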
<p>And what&#8217;s so great about that?  Users navigating participating websites through a web browser will see links on individual and family tree pages that lead to additional information and connections on other sites.  Those using Genesis or any other potential application that may come along (whether it be a client application, a website, a mashup, or what-have-you) will be able to navigate a seamless family tree, with information coming from any number of sites and aggregated according to a user-specified trust policy.  That will be <em>really</em> neat, but when autonomous agents get to work on that network&#8230; well, you ain&#8217;t seen nothin&#8217; yet.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.dftproject.org/blog/2007/06/12/search-disclaimer/feed/</wfw:commentRss>
		<feedburner:origLink>http://www.dftproject.org/blog/2007/06/12/search-disclaimer/</feedburner:origLink></item>
		<item>
		<title>Dynamic Search Results</title>
		<link>http://feedproxy.google.com/~r/dft/~3/qwvwbSUO5fQ/</link>
		<comments>http://www.dftproject.org/blog/2007/06/11/dynamic-search-results/#comments</comments>
		<pubDate>Mon, 11 Jun 2007 13:40:05 +0000</pubDate>
		<dc:creator>Hilton</dc:creator>
		
		<category><![CDATA[Genesis]]></category>

		<guid isPermaLink="false">http://www.dftproject.org/blog/2007/06/11/dynamic-search-results/</guid>
		<description><![CDATA[A week-and-a-half ago I promised further details on a resurfacing project I wanted to do after I finish replumbing.  Last Friday I alluded to this when I wrote about upcoming changes in search.  Today I&#8217;d like to outline how the new search interface and experience will work.
Search in Genesis is a tricky problem.  In most search [...]]]></description>
			<content:encoded><![CDATA[<p>A week-and-a-half ago I <a href="http://www.dftproject.org/blog/2007/06/01/more-replumbing/">promised</a> further details on a resurfacing project I wanted to do after I finish replumbing.  Last Friday I alluded to this when I wrote about upcoming changes in search.  Today I&#8217;d like to outline how the new search interface and experience will work.</p>
<p>Search in Genesis is a tricky problem.  In most search applications the entire corpus is available and indexed.  For example, when you do a search on Google, the search engine doesn&#8217;t go and find web pages that are related to your search.  The search engine already knows all the web pages that are out there (or at least all the web pages it has found), so it can quickly gather the most relevant results (supposedly) and present them, in order, almost immediately.  By contrast, Genesis may have some data cached and indexed from a previous search which it can show immediately, but there&#8217;s almost certainly a lot more out there, possibly better, which will take some time to locate.</p>
<p>Peer-to-peer applications such as Kazaa, Gnutella, and the original Napster face a similar problem.  Their approach, and the one used by Genesis until now, is to throw all the search results into one big, scrollable list.  This doesn&#8217;t work very well for two main reasons:</p>
<ol>
<li>The list can get really long, really fast, making it difficult to find what you&#8217;re looking for.  You could sort it, but&#8230;</li>
<li>When the list is sorted to help find something, new results continually displace old ones, forcing you to scroll down to find what you were looking at.  This is especially annoying if it happens just as you&#8217;re double-clicking on a result.</li>
</ol>
<p>Most web search engines solve the first problem by sorting the results according to relevance and paging them.  Not only does this help the user find the best results quickly, but it is also much more efficient than showing all the results at once.  I want to do something similar with Genesis, but unfortunately I don&#8217;t know what <em>the</em> most relevant results are.  Also, the list is sorted by relevance, so won&#8217;t results be continually displaced if I try to do this dynamically?</p>
<p>Well, I may not know what <em>the</em> most relevant results are, but I <em>do</em> know what the most relevant results are <em>at any given point in time</em>, which suggests a novel approach.  Without going into too much depth, it operates a little something like this:</p>
<p>The moment the user clicks the &#8220;Search&#8221; button, Genesis gets to work.  It starts by telling all the search agents (PGVAgent, for example) to start searching their respective domains.  The agents may take a while before they return any meaningful results, so Genesis performs a quick search over the local cache to see if it can give any results immediately.  If it finds any, it ranks them by relevance and displays them on the first page.  If there are more than 10 or so, it only shows the top 10 and puts links at the bottom of the page to access subsequent pages.  So far, nothing new.</p>
<p>While the agents continue to search in the background, the user looks over the first page of results and might go on to the second or even third page.  Genesis keeps track of which pages have been visited and doesn&#8217;t mess with them; if the user ever goes back to a page that has already been seen, the page will remain the same.  If, however, the user goes to a subsequent page and better results have become available through the work of the search agents in the background, then that page will show the current top results (excluding those that have already been seen).  Also, the user can at any time refresh the search, returning to the first page of results but with all the current results correctly ordered by relevance.</p>
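<p>The paging behaviour described above can be sketched in a few lines of plain Python.  This is only an illustration under simplifying assumptions, not the Genesis implementation: the <code>DynamicResults</code> class and its method names are hypothetical, pages are numbered from zero, and a page that was shown while only partly filled stays that way until the search is refreshed.</p>

```python
class DynamicResults:
    """Pages of search results that never change once they have been shown."""

    def __init__(self, page_size=10):
        self.page_size = page_size
        self._scores = {}  # every result reported so far -> relevance score
        self._pages = []   # pages already shown, frozen in display order

    def add(self, result, score):
        # Called whenever a background agent (or the local cache) finds a result.
        self._scores[result] = score

    def page(self, number):
        # A page that was already shown is returned unchanged.  A new page is
        # built from the best results known right now, minus those already seen.
        while number >= len(self._pages):
            shown = {r for page in self._pages for r in page}
            unseen = [r for r in self._scores if r not in shown]
            if not unseen:
                break
            unseen.sort(key=self._scores.get, reverse=True)
            self._pages.append(unseen[: self.page_size])
        return list(self._pages[number]) if len(self._pages) > number else []

    def refresh(self):
        # Forget which pages were seen and re-rank everything from the top.
        self._pages = []
        return self.page(0)


results = DynamicResults(page_size=2)
for name, score in [("a", 0.5), ("b", 0.9), ("c", 0.1)]:
    results.add(name, score)       # results from the local cache
first = results.page(0)            # the best two known so far
results.add("d", 0.95)             # agents report more in the background
results.add("e", 0.3)
print(first, results.page(0), results.page(1), results.refresh())
```

<p>In this run the first page stays the same when revisited, the second page picks up the late arrivals, and a refresh re-ranks everything so that the best late result moves to the top.</p>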
<p>I don&#8217;t know of any other system that does this, but I&#8217;d be interested to learn of any that exist.  I&#8217;d also be interested in learning about any other, possibly-better solutions.  Any ideas?</p>
]]></content:encoded>
			<wfw:commentRss>http://www.dftproject.org/blog/2007/06/11/dynamic-search-results/feed/</wfw:commentRss>
		<feedburner:origLink>http://www.dftproject.org/blog/2007/06/11/dynamic-search-results/</feedburner:origLink></item>
		<item>
		<title>Still Replumbing + New Search</title>
		<link>http://feedproxy.google.com/~r/dft/~3/eVvbxEx1ijY/</link>
		<comments>http://www.dftproject.org/blog/2007/06/08/still-replumbing-new-search/#comments</comments>
		<pubDate>Fri, 08 Jun 2007 16:19:38 +0000</pubDate>
		<dc:creator>Hilton</dc:creator>
		
		<category><![CDATA[Genesis]]></category>

		<category><![CDATA[SPARQL]]></category>

		<guid isPermaLink="false">http://www.dftproject.org/blog/2007/06/08/still-replumbing-new-search/</guid>
		<description><![CDATA[So far I&#8217;ve been largely successful with replacing my own RDF subsystem with the Jena, ARQ, SDB, and NG4J libraries.  Whereas before it took a minute or more to import the PGV website directory, thanks to bulk loading it now imports almost instantaneously.  Previously only a very limited class of queries on the data were possible, but [...]]]></description>
			<content:encoded><![CDATA[<p>So far I&#8217;ve been largely successful with replacing my own RDF subsystem with the <a href="http://jena.sourceforge.net/">Jena</a>, <a href="http://jena.sourceforge.net/ARQ/">ARQ</a>, <a href="http://seaborne.blogspot.com/2007/02/jena-sdb.html">SDB</a>, and <a href="http://sites.wiwiss.fu-berlin.de/suhl/bizer/ng4j/">NG4J</a> libraries.  Whereas before it took a minute or more to import the PGV website directory, thanks to bulk loading it now imports almost instantaneously.  Previously only a very limited class of queries on the data were possible, but now with full SPARQL support, anything goes.  I&#8217;m really happy with it.</p>
<p>I&#8217;ve decided to scrap the old search, however.  I was going to simply refactor it in the interest of getting on with the thesis, but it will take just as much work to create a new search that works much better.  Central to this new search is a recent addition to the plumbing: full-text search.  Over the last two days I integrated <a href="http://lucene.apache.org/">Lucene</a>, a full-text search engine, into the project.  Lucene makes it unbelievably easy to index content and then search it.  And it&#8217;s incredibly fast (at both)!</p>
<p>As proof of concept, I refactored the PGV website list to use a Lucene index (instead of a SPARQL query with caching).  Those of you who have used the PGV Websites view before will know very well that it was very slow.  You will be pleased to learn that it now takes less than a second to populate the list, even the first time it is opened!  True, the PGV Websites view is really not all that important (or at least not yet; in the future you&#8217;ll go here to add/remove sites and enter account details for non-anonymous access).  But it does suggest that using Lucene indexes for search will be very successful.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.dftproject.org/blog/2007/06/08/still-replumbing-new-search/feed/</wfw:commentRss>
		<feedburner:origLink>http://www.dftproject.org/blog/2007/06/08/still-replumbing-new-search/</feedburner:origLink></item>
		<item>
		<title>More Replumbing</title>
		<link>http://feedproxy.google.com/~r/dft/~3/H9YAO6JOe8U/</link>
		<comments>http://www.dftproject.org/blog/2007/06/01/more-replumbing/#comments</comments>
		<pubDate>Fri, 01 Jun 2007 18:21:15 +0000</pubDate>
		<dc:creator>Hilton</dc:creator>
		
		<category><![CDATA[Genesis]]></category>

		<category><![CDATA[SPARQL]]></category>

		<guid isPermaLink="false">http://www.dftproject.org/blog/2007/06/01/more-replumbing/</guid>
		<description><![CDATA[The replumbing effort is coming along very nicely.  I only have time to actually write code in short spurts, but that gives me time to think out the issues I&#8217;m tackling and address them carefully.
So far I&#8217;ve only had one problem with Jena SDB, and it&#8217;s already been fixed.  The results of SPARQL queries were being collected into [...]]]></description>
			<content:encoded><![CDATA[<p>The replumbing effort is coming along very nicely.  I only have time to actually write code in short spurts, but that gives me time to think out the issues I&#8217;m tackling and address them carefully.</p>
<p>So far I&#8217;ve only had one problem with Jena SDB, and it&#8217;s already been fixed.  The results of SPARQL queries were being collected into an array before being returned as a result set to my code.  This was a problem because Genesis relies on the results being streamed out as they are found.  I <a href="http://tech.groups.yahoo.com/group/jena-dev/message/29336">mentioned it</a> on the jena-dev discussion group and <em>within twenty-four hours</em> the code was altered so that result sets now stream!  All I had to do was perform an update on the code and Genesis immediately began performing <em>much</em> better.  I love open source.  Thanks Andy!</p>
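<p>The difference between collected and streamed results is easy to see in a small sketch.  The plain-Python example below is unrelated to the actual SDB code; the function names and the fake row source are hypothetical.  A generator hands over each match as soon as it is found, whereas building a list forces the whole source to be scanned first.</p>

```python
def matches_collected(rows, predicate):
    """Scan every row first, then hand back the complete list."""
    return [row for row in rows if predicate(row)]


def matches_streamed(rows, predicate):
    """Yield each matching row as soon as it is found."""
    for row in rows:
        if predicate(row):
            yield row


def counting_source(count, log):
    # Hypothetical row source that records how many rows were actually read.
    for i in range(count):
        log.append(i)
        yield i


def is_odd(row):
    return row % 2 == 1


read_streamed = []
first = next(matches_streamed(counting_source(1000, read_streamed), is_odd))

read_collected = []
first_again = matches_collected(counting_source(1000, read_collected), is_odd)[0]

print(first, len(read_streamed), first_again, len(read_collected))
```

<p>Both calls produce the same first match, but the streamed version stops reading as soon as it has one, which is what lets an interface show results while the rest of the query is still running.</p>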
<p>In other news, I&#8217;m contemplating a resurfacing project to follow the replumbing effort, which will make Genesis look and act a lot more like a web browser than an IDE.  I&#8217;ll tell you more later.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.dftproject.org/blog/2007/06/01/more-replumbing/feed/</wfw:commentRss>
		<feedburner:origLink>http://www.dftproject.org/blog/2007/06/01/more-replumbing/</feedburner:origLink></item>
		<item>
		<title>Replumbing</title>
		<link>http://feedproxy.google.com/~r/dft/~3/Bh0qHivXdAo/</link>
		<comments>http://www.dftproject.org/blog/2007/05/23/replumbing/#comments</comments>
		<pubDate>Wed, 23 May 2007 23:22:05 +0000</pubDate>
		<dc:creator>Hilton</dc:creator>
		
		<category><![CDATA[Genesis]]></category>

		<category><![CDATA[SPARQL]]></category>

		<guid isPermaLink="false">http://www.dftproject.org/blog/2007/05/23/replumbing/</guid>
		<description><![CDATA[I should probably be coding right now, but perhaps my sanity will be best preserved if I take a break to update any readers I may still have on what&#8217;s going on.
When I embarked on this project, the plan was to use Jena and NG4J for the data plumbing.  When I started programming, however, I found it [...]]]></description>
			<content:encoded><![CDATA[<p>I should probably be coding right now, but perhaps my sanity will be best preserved if I take a break to update any readers I may still have on what&#8217;s going on.</p>
<p>When I embarked on this project, the plan was to use Jena and NG4J for the data plumbing.  When I started programming, however, I found it much easier to just write my own custom plumbing.  It wasn&#8217;t as complete or as powerful as Jena/NG4J, but at least I understood how it worked and it did exactly what I wanted.  As time wore on I began to feel the constraining effects of that decision, though.  I vowed to myself that I would eventually replace my code with Jena and NG4J.  Well, as I can&#8217;t make any more progress on the thesis until I&#8217;ve fixed this mess, the time has come.  So that&#8217;s what I&#8217;ve been doing.  What&#8217;s really sick and twisted is that the parts I&#8217;ve already upgraded are so much more elegant and clean now.  What was I thinking!?</p>
<p>So I&#8217;ve come full circle.  When I finish this huge refactoring effort, I&#8217;ll be able to leverage all the improvements that have been happening in Jena and NG4J lately.  Particularly exciting is the new <a href="http://seaborne.blogspot.com/2007/02/jena-sdb.html">Jena SDB</a>, which is a very well written SPARQL to SQL translator.  The lack of such a translator is what originally <a href="http://www.dftproject.org/blog/2006/12/04/sparql-to-sql/">motivated me to do my own custom plumbing</a>.  It still lacks evaluation of SPARQL FILTERs in SQL, though, which perhaps can be my contribution to the community.  I&#8217;ve already extended it for named graphs.</p>
<p>In other news, the Genesis interface has really been grating on me lately.  It needs to be easier to use, fun, and, well, obvious.  Think Google.  We&#8217;ll see if I have enough self-restraint to leave it as is until I finish the thesis.  Then I can play all I want.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.dftproject.org/blog/2007/05/23/replumbing/feed/</wfw:commentRss>
		<feedburner:origLink>http://www.dftproject.org/blog/2007/05/23/replumbing/</feedburner:origLink></item>
	</channel>
</rss>
