<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/rss2full.xsl"?><?xml-stylesheet type="text/css" media="screen" href="http://feeds.feedburner.com/~d/styles/itemcontent.css"?><rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0" version="2.0">

<channel>
	<title>planetwater</title>
	
	<link>http://planetwater.org</link>
	<description>ground- water, engineering, science, geo- statistics</description>
	<lastBuildDate>Sat, 11 May 2013 20:32:05 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.5.1</generator>
		<atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" type="application/rss+xml" href="http://feeds.feedburner.com/planetwater" /><feedburner:info uri="planetwater" /><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com/" /><item>
		<title>Improved Individual Decision-Making: Weather</title>
		<link>http://feedproxy.google.com/~r/planetwater/~3/ceuzOjCcf7A/</link>
		<comments>http://planetwater.org/2013/05/11/improved-individual-decision-making-weather/#comments</comments>
		<pubDate>Sat, 11 May 2013 20:32:05 +0000</pubDate>
		<dc:creator>Claus</dc:creator>
				<category />

		<guid isPermaLink="false">http://planetwater.org/?p=2350</guid>
		<description><![CDATA[When I read Rufus Pollock&#8217;s editorial on &#8220;Forget big data, small data is the real revolution&#8221;, it occurred to me that everybody, probably even I, could take advantage of what Pollock calls the &#8220;democratization of the masses&#8221;. In this post I will show how information can be &#8220;pulled together&#8221; using only basic programming skills. This [...]]]></description>
				<content:encoded><![CDATA[<p>When I read <a href="http://m.guardian.co.uk/news/datablog/2013/apr/25/forget-big-data-small-data-revolution">Rufus Pollock&#8217;s editorial</a> on &#8220;Forget big data, small data is the real revolution&#8221;, it occurred to me that everybody, probably even I, could take advantage of what Pollock calls the &#8220;democratization of the masses&#8221;. In this post I will show how information can be &#8220;pulled together&#8221; using only basic programming skills. This information then can be used for improved decision making. The example that I decided to use to put this into practice might be the most universal conversation topic: weather <img src='http://planetwater.org/wp-includes/images/smilies/icon_wink.gif' alt=';-)' class='wp-smiley' /> </p>

<h2>Practicing &#8220;Small Data&#8221;</h2>

<p>Usually, I follow my interest in weather on a very basic level: I read the weather forecast. I try to go beyond the basic forecasts, though. Hence, I like to visit <a href="http://www.wetterzentrale.de/topkarten/fsavnmgeur.html">wetterzentrale.de</a> because of the fantastic amount of information they make available, and I like the fantastic visualizations that <a href="http://forecast.io">forecast.io</a> and <a href="http://forecast.io/lines/">forecast.io/lines</a> present.</p>

<p>The unfortunate thing is that you kind of have to believe those products. I hadn&#8217;t seen a good weather map in a long time, until I was sailing recently at  <a href="http://www.dhh.de/segelschule-yachtschule-chiemsee.html">DHH Chiemsee</a>, who make prints from the DWD analysis maps of air pressure at ground surface (together with annotations of observations) available on a daily basis.</p>

<p>The following ideas came to my mind:</p>

<ul>
<li>it would be very interesting to see the progression of these pressure maps over time</li>
<li>since they are analysis maps, commonly still hand drawn, it would be interesting to compare them to other analyses, done by somebody else</li>
<li>a description of the current situation associated with the pressure maps would be useful, so that an amateur like me gets some hints</li>
</ul>

<p>With this information at hand, everybody could form their own opinion of the current weather situation in an improved way!</p>

<p>After some research, which did not take very long at all, I found some other sources on the internet that allowed me to come up with the following map:</p>

<p><img style="display:block; margin-left:auto; margin-right:auto;" src="http://planetwater.org/wp-content/uploads/2013/05/grosswetterlage_overview_2013_05_07_07_30_08.png" alt="Grosswetterlage overview 2013 05 07 07 30 08" title="grosswetterlage_overview_2013_05_07_07_30_08.png" border="0" width="210" height="450" /></p>

<p>The code I wrote creates this plot at times that can be specified. The left column shows the current analyses performed by different institutions; the right column shows predictions by <a href="http://www.knmi.nl/waarschuwingen_en_verwachtingen/weerkaarten.php">KNMI</a>.</p>

<p>I wanted to do all this in Python, so I needed to figure out how to get images from the internet and learned about the packages <code>urllib2</code>, <code>HTMLParser</code>, <code>Image</code> (I didn&#8217;t know that there was a greyscale png), and <code>sched</code>. Although there are still some (minor?) things that need to be ironed out (plotting of text with matplotlib, style of the headings), I put the code up on <a href="https://github.com/clausTue/weather">github</a>.</p>
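
<p>To give an idea of how downloading an image and scheduling that download fit together, here is a minimal sketch. It is not the code from the repository: it assumes Python 3, where <code>urllib.request</code> takes the place of <code>urllib2</code>, and the function names <code>fetch_image</code> and <code>schedule_fetch</code>, the URL, and the file name are made up for illustration.</p>

<pre><code>import sched
import time
import urllib.request


def fetch_image(url, filename, opener=urllib.request.urlopen):
    """Download url and write the raw bytes to filename; return the byte count."""
    with opener(url) as response:
        data = response.read()
    with open(filename, "wb") as out:
        out.write(data)
    return len(data)


def schedule_fetch(scheduler, delay_seconds, url, filename, fetch=fetch_image):
    """Queue one download that runs delay_seconds after now."""
    return scheduler.enter(delay_seconds, 1, fetch, argument=(url, filename))


# Usage sketch (placeholder URL, not run here):
# scheduler = sched.scheduler(time.time, time.sleep)
# schedule_fetch(scheduler, 3600, "http://example.com/pressure_map.png", "pressure_map.png")
# scheduler.run()  # blocks until all queued downloads have run
</code></pre>

<p>Passing the opener and the fetch function as arguments is only a convenience of this sketch: it allows trying out the scheduling without touching the network.</p>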

<p>I&#8217;d be very happy to hear what you guys think! Happy birthday Ferdi! <img src='http://planetwater.org/wp-includes/images/smilies/icon_wink.gif' alt=';-)' class='wp-smiley' /> </p>
<img src="http://feeds.feedburner.com/~r/planetwater/~4/ceuzOjCcf7A" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://planetwater.org/2013/05/11/improved-individual-decision-making-weather/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://planetwater.org/2013/05/11/improved-individual-decision-making-weather/</feedburner:origLink></item>
		<item>
		<title>identi.ca updates</title>
		<link>http://feedproxy.google.com/~r/planetwater/~3/dYbvHBxNGLs/</link>
		<comments>http://planetwater.org/2013/05/09/identi-ca-updates/#comments</comments>
		<pubDate>Thu, 09 May 2013 20:30:10 +0000</pubDate>
		<dc:creator>Claus</dc:creator>
				<category><![CDATA[identi.ca]]></category>

		<guid isPermaLink="false">http://planetwater.org/2013/05/09/identi-ca-updates/</guid>
		<description><![CDATA[2013-05-03]]></description>
				<content:encoded><![CDATA[<ul class="ws_tweet_list">

<li class="ws_tweet">More than half of the world&#039;s population lives inside this circle: <a href="http://t.co/vs3E2pxaNB" rel="nofollow">http://t.co/vs3E2pxaNB</a> <a class="ws_tweet_time" href="http://twitter.com/@planetwater/statuses/331859903322914816">2013-05-07</a></li>


</ul>
<img src="http://feeds.feedburner.com/~r/planetwater/~4/dYbvHBxNGLs" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://planetwater.org/2013/05/09/identi-ca-updates/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://planetwater.org/2013/05/09/identi-ca-updates/</feedburner:origLink></item>
		<item>
		<title>More than half of the world’s population lives inside this circle</title>
		<link>http://feedproxy.google.com/~r/planetwater/~3/-F32ds5uf-I/</link>
		<comments>http://planetwater.org/2013/05/07/more-than-half-of-the-worlds-population-lives-inside-this-circle/#comments</comments>
		<pubDate>Tue, 07 May 2013 19:53:43 +0000</pubDate>
		<dc:creator>Claus</dc:creator>
				<category />

		<guid isPermaLink="false">http://planetwater.org/?p=2346</guid>
		<description><![CDATA[  … and there&#8217;s a lot of water in that circle too&#8230; via Very Spatial]]></description>
				<content:encoded><![CDATA[<p><img style="display: block; margin-left: auto; margin-right: auto;" title="NewImage.png" src="http://planetwater.org/wp-content/uploads/2013/05/NewImage.png" alt="NewImage" width="600" height="337" border="0" /></p>

<p> </p>

<p>… and there&#8217;s a lot of water in that circle too&#8230;</p>

<p>via <a href="http://veryspatial.com/2013/05/the-best-geographic-visualization-ive-seen-in-ages/">Very Spatial</a></p>
<img src="http://feeds.feedburner.com/~r/planetwater/~4/-F32ds5uf-I" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://planetwater.org/2013/05/07/more-than-half-of-the-worlds-population-lives-inside-this-circle/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://planetwater.org/2013/05/07/more-than-half-of-the-worlds-population-lives-inside-this-circle/</feedburner:origLink></item>
		<item>
		<title>Twitter Digests</title>
		<link>http://feedproxy.google.com/~r/planetwater/~3/7USwDQ8_vw0/</link>
		<comments>http://planetwater.org/2013/05/06/twitter-digests/#comments</comments>
		<pubDate>Mon, 06 May 2013 07:48:10 +0000</pubDate>
		<dc:creator>Claus</dc:creator>
				<category />

		<guid isPermaLink="false">http://planetwater.org/?p=2343</guid>
		<description><![CDATA[Folks, I just experimented with twitter digests on this blog &#8212;  a feature that has been broken since twitter made some changes to their API. I suspect that this might have led / might lead to some blog posts showing up in your RSS feed, which are actually &#8220;just&#8221; twitter posts. Sorry for that [...]]]></description>
				<content:encoded><![CDATA[<p>Folks,</p>

<p>I just experimented with twitter digests on this blog &#8212;  a feature that has been broken since twitter made some changes to their API. I suspect that this might have led / might lead to some blog posts showing up in your RSS feed, which are actually &#8220;just&#8221; twitter posts. Sorry for the inconvenience.</p>
<img src="http://feeds.feedburner.com/~r/planetwater/~4/7USwDQ8_vw0" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://planetwater.org/2013/05/06/twitter-digests/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://planetwater.org/2013/05/06/twitter-digests/</feedburner:origLink></item>
		<item>
		<title>Stimulating Presentation on Supposedly Boring Topic: Data Types</title>
		<link>http://feedproxy.google.com/~r/planetwater/~3/1Y9-Bvif23g/</link>
		<comments>http://planetwater.org/2013/05/01/stimulating-presentation-on-supposedly-boring-topic-data-types/#comments</comments>
		<pubDate>Wed, 01 May 2013 08:43:56 +0000</pubDate>
		<dc:creator>Claus</dc:creator>
				<category />

		<guid isPermaLink="false">http://planetwater.org/?p=1886</guid>
		<description><![CDATA[This is awesome, funny, shocking, and horrifying &#8212; all at the same time and for the entire four minutes! Quick demonstration about types in Ruby and JavaScript via Hilary Mason]]></description>
				<content:encoded><![CDATA[<p>This is awesome, funny, shocking, and horrifying &#8212; all at the same time and for the entire four minutes! Quick demonstration about types in Ruby and JavaScript</p>

<iframe width="420" height="315" src="http://www.youtube.com/embed/kXEgk1Hdze0" frameborder="0" allowfullscreen></iframe>

<p>via <a href="http://www.hilarymason.com/speaking/speaking-entertain-dont-teach/">Hilary Mason</a></p>
<img src="http://feeds.feedburner.com/~r/planetwater/~4/1Y9-Bvif23g" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://planetwater.org/2013/05/01/stimulating-presentation-on-supposedly-boring-topic-data-types/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://planetwater.org/2013/05/01/stimulating-presentation-on-supposedly-boring-topic-data-types/</feedburner:origLink></item>
		<item>
		<title>Research, Reproducibility, Data</title>
		<link>http://feedproxy.google.com/~r/planetwater/~3/uazfvTzoJag/</link>
		<comments>http://planetwater.org/2013/04/25/research-reproducibility-data/#comments</comments>
		<pubDate>Thu, 25 Apr 2013 06:50:18 +0000</pubDate>
		<dc:creator>Claus</dc:creator>
				<category />

		<guid isPermaLink="false">http://planetwater.org/?p=1875</guid>
		<description><![CDATA[Last week, Thomas Herndon, an economics grad student, published a paper that refuted a renowned economics paper authored by two Harvard professors on three counts: some data was excluded from the analysis without stating the reasons; during processing of the data, a debatable method for weighting the data was used; there was a “coding error” [...]]]></description>
				<content:encoded><![CDATA[<p>Last week, Thomas Herndon, an economics grad student, <a href="http://www.peri.umass.edu/236/hash/31e2ff374b6377b2ddec04deaa6388b1/publication/566/">published a paper</a> that refuted a renowned economics paper authored by two Harvard professors on three counts:</p>

<ol>
<li>some data was excluded from the analysis without stating the reasons;</li>
<li>during processing of the data, a debatable method for weighting the data was used;</li>
<li>there was a “coding error” &#8211; The authors had used MS Excel for their analysis and used a seemingly wrong range of cells for one calculation;</li>
</ol>
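
<p>To see why the second and third points matter, here is a small sketch in Python. The numbers are made up and are not the data from either paper; they only show how the choice of weighting, or a selection that stops too early, can change an average noticeably.</p>

<pre><code>from statistics import mean

# Made-up growth rates (percent) per country and year; NOT the data from the paper.
growth = {
    "country_a": [2.0, 2.5, 3.0, 2.5],  # four years in the sample
    "country_b": [-8.0],                # a single year in the sample
    "country_c": [1.0, 1.5],
}

# Weighting 1: average within each country first, then across countries,
# so every country counts equally, no matter how many years it contributes.
by_country = mean(mean(years) for years in growth.values())

# Weighting 2: pool all country-years, so every observation counts equally.
by_observation = mean(rate for years in growth.values() for rate in years)

# A selection that stops too early, comparable to a spreadsheet formula
# whose cell range ends a row short, silently drops the last country.
countries = list(growth)
truncated = mean(mean(growth[name]) for name in countries[:-1])

print(round(by_country, 2), round(by_observation, 2), round(truncated, 2))
</code></pre>

<p>With these invented numbers the average is negative under the first weighting and positive under the second, and the shortened range shifts it again.</p>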

<p>As far as I can tell, Mike Konczal was the first to write about the freshly published paper on <a href="http://www.nextnewdeal.net/rortybomb/researchers-finally-replicated-reinhart-rogoff-and-there-are-serious-problems">April 16th</a>.</p>

<blockquote>First, Reinhart and Rogoff selectively exclude years of high debt and average growth. Second, they use a debatable method to weight the countries. Third, there also appears to be a coding error that excludes high-debt and average-growth countries. All three bias in favor of their result, and without them you don&#8217;t get their controversial result.
</blockquote>

<p>On April 17th, Arindrajit Dube, assistant professor of economics at the University of Massachusetts, Amherst (the same school as Thomas Herndon), followed up with a guest post. He presents a short and concise analysis of the reasoning behind Herndon’s paper. One key analysis of his relates to the fact that different ranges of the data have a varying degree of dependence. In this case, the strength of the relationship [between growth and debt-to-GDP] is actually much stronger at low ratios of debt-to-GDP. From there he goes on to wonder about the causes of this changing relationship.</p>

<blockquote>Here is a simple question: does a high debt-to-GDP ratio better predict future growth rates, or past ones?  If the former is true, it would be consistent with the argument that higher debt levels cause growth to fall. On the other hand, if higher debt &#8220;predicts&#8221; past growth, that is a signature of reverse causality.</blockquote>

<p><img style="display: block; margin-left: auto; margin-right: auto;" title="growth.png" src="http://planetwater.org/wp-content/uploads/2013/04/growth.png" alt="NewImage" width="600" height="435" border="0" /></p>

<p><strong>Future and Past Growth Rates and Current Debt-to-GDP Ratio. </strong>Figure&#8217;s <a href="http://www.nextnewdeal.net/rortybomb/guest-post-reinhartrogoff-and-growth-time-debt">source</a>.</p>

<p>Looking at the data is one thing, but it should always go together with looking at causal relationships. A lot of people suggest that making data and analysis methods publicly available would prevent such errors. I agree to some extent. It is nice to see a re-analysis <a href="http://nbviewer.ipython.org/urls/raw.github.com/vincentarelbundock/Reinhart-Rogoff/master/reinhart-rogoff.ipynb">performed in python online</a>. However, why did the authors not see these causal relationships? Did they not have enough time for a rigorous analysis? And would a rigorous analysis not be necessary for research that forms the basis for (current) political decisions?</p>

<p><a href="http://jpktd.blogspot.de/2013/04/statistics-in-python-reproducing_24.html">Josef Perktold</a> frames it in slightly different words (and also <a href="http://blog.fperez.org/2013/04/literate-computing-and-computational.html">links to a post by Fernando Perez</a> that examines the role of ipython and literate programming on reproducibility):</p>

<blockquote> […] it&#8217;s just the usual (mis)use of economics research results. Politicians like the numbers that give them ammunition for their position </blockquote>

<p>and</p>

<blockquote>&#8220;Believable&#8221; research: If your results sound too good or too interesting to be true, maybe they are not, and you better check your calculations. Although mistakes are not uncommon, the business as usual part is that the results are often very sensitive to assumptions, and it takes time to figure out what results are robust. I have seen enough economic debates where there never was a clear answer that convinced more than half of all economists. A long time ago, when the Asian Tigers where still tigers, one question was: Did they grow because of or in spite of government intervention? </blockquote>

<p>Stephen Colbert, of course, has his own thoughts, and has invited Thomas Herndon to chat with him:</p>

<iframe width="640" height="360" src="http://www.youtube.com/embed/BFf0HgpJ-qc?feature=player_detailpage" frameborder="0" allowfullscreen></iframe>
<img src="http://feeds.feedburner.com/~r/planetwater/~4/uazfvTzoJag" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://planetwater.org/2013/04/25/research-reproducibility-data/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		<feedburner:origLink>http://planetwater.org/2013/04/25/research-reproducibility-data/</feedburner:origLink></item>
		<item>
		<title>Pythonanywhere Update</title>
		<link>http://feedproxy.google.com/~r/planetwater/~3/ZOEzCyb6dnQ/</link>
		<comments>http://planetwater.org/2013/01/03/pythonanywhere-update/#comments</comments>
		<pubDate>Thu, 03 Jan 2013 10:09:52 +0000</pubDate>
		<dc:creator>Claus</dc:creator>
				<category />

		<guid isPermaLink="false">http://planetwater.org/?p=1863</guid>
		<description><![CDATA[Pretty much one year ago, I had written about Pythonanywhere. I thought I&#8217;d try their servers out again. Here is what I noticed, after not having used it for a year: the connection to the dropbox folder still works it is faster (see updated chart) they updated python to 2.7.3 (and 3.2 if you like), [...]]]></description>
				<content:encoded><![CDATA[<p>Pretty much one year ago, <a href="http://planetwater.org/2011/12/30/pythonanywhere/">I had written</a> about <a href="https://www.pythonanywhere.com">Pythonanywhere</a>.
I thought I&#8217;d try their servers out again. Here is what I noticed, after not having used it for a year:</p>

<ul>
<li>the connection to the dropbox folder still works</li>
<li>it is faster (see updated chart)</li>
<li>they updated python to 2.7.3 (and 3.2 if you like), and numpy to  1.6.2</li>
<li>it is possible to run ipython and ipython notebooks!</li>
<li>it is possible to schedule tasks!</li>
<li>they extended possibilities for web servers and mysql</li>
<li><a href="http://blog.pythonanywhere.com/50/">latex</a>, <a href="http://blog.pythonanywhere.com/43/">git</a> integration</li>
</ul>

<p>Awesomeness! <img src='http://planetwater.org/wp-includes/images/smilies/icon_wink.gif' alt=';-)' class='wp-smiley' /> </p>

<p><img style="display:block; margin-left:auto; margin-right:auto;" src="http://planetwater.org/wp-content/uploads/2013/01/comp_any_local2.png" alt="comparison pythonanwhere vs. local machine" title="comp_any_local2.png" border="0" width="586" height="600" /></p>
<img src="http://feeds.feedburner.com/~r/planetwater/~4/ZOEzCyb6dnQ" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://planetwater.org/2013/01/03/pythonanywhere-update/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		<feedburner:origLink>http://planetwater.org/2013/01/03/pythonanywhere-update/</feedburner:origLink></item>
		<item>
		<title>New Books on Numpy, Pandas, Data Analysis</title>
		<link>http://feedproxy.google.com/~r/planetwater/~3/fRQK7de3MdY/</link>
		<comments>http://planetwater.org/2012/11/20/new-books-on-numpy-pandas-data-analysis/#comments</comments>
		<pubDate>Tue, 20 Nov 2012 11:01:31 +0000</pubDate>
		<dc:creator>Claus</dc:creator>
				<category />

		<guid isPermaLink="false">http://planetwater.org/?p=1856</guid>
		<description><![CDATA[I became aware recently of three books that are related to data-analysis, modelling, and statistics in a fairly broad sense. They are pictured below, from left to right: &#8220;Python for Data Analysis&#8221; by Wes McKinney (of pandas fame) published by O&#8217;Reilly &#8220;NumPy Cookbook&#8221; by Ivan Idris published by Packt Publishing &#8220;NumPy 1.5 Beginner&#8217;s Guide&#8221; by [...]]]></description>
				<content:encoded><![CDATA[<p>I became aware recently of three books that are related to data-analysis, modelling, and statistics in a fairly broad sense.</p>

<p>They are pictured below, from left to right:</p>

<ul>
<li>&#8220;<a href="http://www.amazon.de/gp/product/1449319793/ref=as_li_ss_tl?ie=UTF8&#038;camp=1638&#038;creative=19454&#038;creativeASIN=1449319793&#038;linkCode=as2&#038;tag=planetwateror-21">Python for Data Analysis</a><img src="http://www.assoc-amazon.de/e/ir?t=planetwateror-21&#038;l=as2&#038;o=3&#038;a=1449319793" width="1" height="1" border="0" alt="" style="border:none !important; margin:0px !important;" />&#8221; by Wes McKinney (of pandas fame) published by O&#8217;Reilly</li>
<li>&#8220;<a href="http://www.amazon.de/gp/product/1849518920/ref=as_li_ss_tl?ie=UTF8&#038;camp=1638&#038;creative=19454&#038;creativeASIN=1849518920&#038;linkCode=as2&#038;tag=planetwateror-21">NumPy Cookbook</a><img src="http://www.assoc-amazon.de/e/ir?t=planetwateror-21&#038;l=as2&#038;o=3&#038;a=1849518920" width="1" height="1" border="0" alt="" style="border:none !important; margin:0px !important;" />&#8221; by Ivan Idris published by Packt Publishing</li>
<li>&#8220;<a href="http://www.amazon.de/gp/product/1849515301/ref=as_li_ss_tl?ie=UTF8&#038;camp=1638&#038;creative=19454&#038;creativeASIN=1849515301&#038;linkCode=as2&#038;tag=planetwateror-21">NumPy 1.5 Beginner&#8217;s Guide</a><img src="http://www.assoc-amazon.de/e/ir?t=planetwateror-21&#038;l=as2&#038;o=3&#038;a=1849515301" width="1" height="1" border="0" alt="" style="border:none !important; margin:0px !important;" />&#8221; by Ivan Idris published by Packt Publishing</li>
</ul>

<p><img src="http://planetwater.org/wp-content/uploads/2012/11/NumpyBooks.jpg" alt="NumpyBooks" title="NumpyBooks.jpg" border="0" width="600" height="266" /></p>

<h2>Python for Data Analysis</h2>

<p>This is the most in-depth book. It covers the most important python modules: iPython, NumPy, Pandas, Matplotlib. Additionally, it has chapters containing examples on practical issues with data (aggregation, data with a time-stamp, sorting). I really just started diving into it. However, it already led me to upgrade to IPython 0.13.1. It seems like it is well suited for my level of understanding of programming: having some experience, and trying to learn more about existing tools.</p>

<h2>NumPy Cookbook</h2>

<p>The title already gives it away: The book is organised in sections with &#8220;recipes&#8221;. Mostly, these recipes are self-contained. The focus is clearly on NumPy, even though Matplotlib, iPython, and also Pandas are covered to some extent. I enjoyed browsing through it; most of the examples are interesting (for example resizing images or playing with PythonAnywhere, like <a href="http://planetwater.org/2011/12/30/pythonanywhere/">I&#8217;ve done before</a>). Generally, I think this is a great resource to have.</p>

<h2>NumPy 1.5</h2>

<p>Despite being by the same author (Ivan Idris), there is pleasantly little overlap between his two books. &#8220;NumPy 1.5&#8221; covers NumPy in great detail, and is as such mostly useful for beginners who try to use python for some numerical analysis. When I read this book, I was also reminded that the <a href="http://www.scipy.org/Numpy_Example_List_With_Doc/#head-88ade192dacf0c15e4f1377096134ee559df07a0">webpage that lists NumPy functions</a> is a very valuable resource (which I tend to spend too little time with).</p>

<p>It is interesting to see that people realise that there is a market for books explaining open source tools. And I do think those books complement available documentation nicely.</p>
<img src="http://feeds.feedburner.com/~r/planetwater/~4/fRQK7de3MdY" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://planetwater.org/2012/11/20/new-books-on-numpy-pandas-data-analysis/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://planetwater.org/2012/11/20/new-books-on-numpy-pandas-data-analysis/</feedburner:origLink></item>
		<item>
		<title>Pandas in Python: Use with Hydrological Time Series</title>
		<link>http://feedproxy.google.com/~r/planetwater/~3/jsh6QM87I48/</link>
		<comments>http://planetwater.org/2012/11/20/pandas-in-python-use-with-hydrological-time-series/#comments</comments>
		<pubDate>Tue, 20 Nov 2012 08:57:32 +0000</pubDate>
		<dc:creator>Claus</dc:creator>
				<category />

		<guid isPermaLink="false">http://planetwater.org/?p=1852</guid>
		<description><![CDATA[I recently had some time series analysis to do, and I decided to do this work in Pandas. Particularly, I had to deal with many timeseries, stretching from different startpoints in time to different endpoints in time. Each timeseries was available as an ASCII file. The data were daily data. The time information was given [...]]]></description>
				<content:encoded><![CDATA[<p>I recently had some time series analysis to do, and I decided to do this work in <a href="http://pandas.pydata.org">Pandas</a>. Particularly, I had to deal with many timeseries, stretching from different startpoints in time to different endpoints in time. Each timeseries was available as an ASCII file. The data were daily data. The time information was given in three columns, representing year, month, and day, respectively.</p>

<p><img src="http://planetwater.org/wp-content/uploads/2012/11/Looking-Out.jpg" alt="Looking Out" title="Looking Out.jpg" border="0" width="600" height="450" caption="Pandas are awesome" /></p>

<p>Image: http://flic.kr/p/9gK3ZH</p>

<p>I found these two posts by Randal S. Olson very useful resources:</p>

<ul>
<li>Using pandas DataFrames to process data from multiple replicate runs in Python (<a href="http://www.randalolson.com/2012/06/26/using-pandas-dataframes/">link</a>)</li>
<li>Statistical analysis made easy in Python with SciPy and pandas DataFrames (<a href="http://www.randalolson.com/2012/08/06/statistical-analysis-made-easy-in-python/">link</a>)</li>
</ul>

<p>Here is a cookbook style layout of what I did:</p>

<p>The following steps show how easy it was to deal with the data:</p>

<ol>
<li>Read the input data</li>
</ol>

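<pre><code>    import os
    import pandas as pa

    # data_path and result (the name of the input file) are defined elsewhere
    cur_sim_ts = pa.io.parsers.read_table(os.path.join(data_path, 'test', result)
        , header=None
        , sep='\s*'
        , parse_dates=[[0, 1, 2]]   # combine the year, month, and day columns into one date
        , names=['year','month', 'day', result[:-4]]
        , index_col=[0]
        )
</code></pre>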

<p>where <code>'\s*'</code> means any whitespace. The dates were given in three columns, one each for year, month, and day. Can it get simpler and nicer than that?</p>

<ol start="2">
<li><p>It is possible to repeat step 1 multiple times, each time extending the pandas data_frame. Unfortunately, this still looks a little ugly, but it works:</p>

<pre><code>if counter_cur_file &gt; 0:
    final_df = final_df.combine_first(cur_sim_ts)
else:
    final_df = cur_sim_ts.copy() / 10.0 
</code></pre>

<p>In the else part, the Pandas data_frame is initialised. It so happens that this and only this series of the loop has to be divided by ten. In all other cases the time_series that was read in step 1 is &#8220;appended&#8221; to (or rather combined with) the previously initialized data_frame. The wicked thing is that each time_series is put at the proper &#8220;place&#8221; in time within the data_frame. Dates are real dates. This is beautiful, but I had to be a little careful with the data I had at hand, in which every month has 30 days.</p></li>
<li><p>As soon as this data_frame is constructed, things are easy, for example</p></li>
</ol>

<ul>
<li><p>plotting, particularly plotting only a certain time-interval of the data.</p>

<pre><code> final_df['2070-01-01':'2100-12-30'].plot(ylim=[-10,45])
</code></pre></li>
<li><p>saving the data_frame</p>

<pre><code> final_df.save(os.path.join(out_path, pickle_filename))
</code></pre></li>
</ul>
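
<p>To illustrate how the combining and the date-based slicing behave, here is a minimal, self-contained sketch. The two short frames and the column names <code>station_a</code> and <code>station_b</code> are made up; they only stand in for the series read from the ASCII files.</p>

<pre><code>import pandas as pd

# Two hypothetical daily series covering different, partly overlapping periods.
first = pd.DataFrame(
    {"station_a": [1.0, 2.0, 3.0]},
    index=pd.date_range("2070-01-01", periods=3, freq="D"),
)
second = pd.DataFrame(
    {"station_b": [10.0, 20.0, 30.0]},
    index=pd.date_range("2070-01-03", periods=3, freq="D"),
)

# combine_first aligns both frames on the union of their dates and columns;
# dates that a series does not cover are filled with NaN.
combined = first.combine_first(second)

# Slicing with date strings selects a time interval, including both end points.
window = combined.loc["2070-01-02":"2070-01-04"]
print(window)
</code></pre>

<p>The combined frame spans all five days, and each column only holds values on the dates that its series actually covers.</p>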

<ol start="4">
<li><p>For me it was of particular interest to find out how many consecutive dry and wet days there are in each time series. I introduced a threshold of precipitation. If the daily amount of precipitation is above that threshold, this day is considered to be &#8220;wet&#8221;, else it&#8217;s considered to be &#8220;dry&#8221;. I wanted to count the number of consecutive dry and wet days, and remember them for a time series. This is the purpose of the function below. It is coded with a little brute force. Still, I was surprised that it performed reasonably well. If anybody has a better idea, please let me know. Maybe it can be of use for other Pandas users. Note: a time_series in the Pandas world is obtained by looping over the columns of a data_frame.</p>

<pre><code>def dry_wet_spells(ts, threshold):
 """
 returns the duration of spells below and above threshold

 input
 -----
 ts          a pandas timeseries
 threshold   threshold below and above which dates are counted

 output
 ------
 ntot_ts               total number of measurements in ts
 n_lt_threshold        number of measurements below threshold
 storage_n_cons_days   array that stores the lengths of sequences
                       storage_n_cons_days[0] for dry days
                       storage_n_cons_days[1] for wet days
 """
 # total number in ts
 ntot_ts = ts[~ ts.isnull()].count()
 # number lt threshold
 n_lt_threshold = ts[ts &lt;= threshold].count()

 # type_day = 0   # dry
 # type_day = 1   # wet

 # initialisation: what type is the first day (assumed dry here)
 type_prev_day = 0
 storage_n_cons_days = [[],[]]
 n_cons_days = 0

 for cur_day in ts[~ ts.isnull()]:
     # current day is dry
     if cur_day &lt;= threshold:
         type_cur_day = 0
         if type_cur_day == type_prev_day:
             n_cons_days += 1
         else:
             storage_n_cons_days[1].append(n_cons_days)
             n_cons_days = 1
         type_prev_day = type_cur_day
     else:
         type_cur_day = 1
         if type_cur_day == type_prev_day:
             n_cons_days += 1
         else:
             storage_n_cons_days[0].append(n_cons_days)
             n_cons_days = 1
         type_prev_day = type_cur_day

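 # store the last spell, which the loop above never appends
 if n_cons_days &gt; 0:
     storage_n_cons_days[type_prev_day].append(n_cons_days)

 return ntot_ts, n_lt_threshold, storage_n_cons_days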
</code></pre></li>
<li>With all of this, I can produce histograms like this:</li>
</ol>

<p><img src="http://planetwater.org/wp-content/uploads/2012/11/P_9224__MEAS_1950-01_1992-04_HistDryWetDays.png" alt="P 9224 MEAS 1950 01 1992 04 HistDryWetDays" title="P_9224__MEAS_1950-01_1992-04_HistDryWetDays.png" border="0" width="600" height="436" /></p>
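
<p>As one possible alternative to the explicit loop, the counting of consecutive days can be sketched with <code>itertools.groupby</code>. This is only a sketch under assumptions, not the function used for the histogram above: the name <code>spell_lengths</code> and the sample values are made up, and the comparison is written with <code>operator.gt</code>, which is the same as the usual greater-than test.</p>

<pre><code>from itertools import groupby
from operator import gt


def spell_lengths(values, threshold):
    """Return (dry_spells, wet_spells): lengths of consecutive runs of days.

    A day is wet if its value is above threshold, otherwise it is dry.
    Missing values (NaN) are skipped before counting.
    """
    # NaN is the only value that is not equal to itself
    valid = [v for v in values if v == v]
    spells = ([], [])  # index 0: dry spells, index 1: wet spells
    for is_wet, run in groupby(valid, key=lambda v: gt(v, threshold)):
        spells[int(is_wet)].append(sum(1 for _ in run))
    return spells


dry, wet = spell_lengths([0.0, 0.0, 5.2, 1.3, 0.0, 7.1, 7.4, 8.0], threshold=1.0)
print(dry)  # [2, 1]
print(wet)  # [2, 3]
</code></pre>

<p>Because <code>groupby</code> closes every run itself, the last spell of the series is counted without any extra bookkeeping.</p>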
<img src="http://feeds.feedburner.com/~r/planetwater/~4/jsh6QM87I48" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://planetwater.org/2012/11/20/pandas-in-python-use-with-hydrological-time-series/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://planetwater.org/2012/11/20/pandas-in-python-use-with-hydrological-time-series/</feedburner:origLink></item>
		<item>
		<title>Water Levels in New York City</title>
		<link>http://feedproxy.google.com/~r/planetwater/~3/yK5jQIt100s/</link>
		<comments>http://planetwater.org/2012/10/29/water-levels-in-new-york-city/#comments</comments>
		<pubDate>Mon, 29 Oct 2012 20:09:56 +0000</pubDate>
		<dc:creator>Claus</dc:creator>
				<category />

		<guid isPermaLink="false">http://planetwater.org/?p=1840</guid>
		<description><![CDATA[update Monday; October 29, 2012 3:18pm (CDT) Eric Holthaus at the Wall Street Journal has a more in-depth analysis for the coming hours I&#8217;ve been following the unfolding Sandy in the last couple of days. I&#8217;ve been relying on many online sources. Since I don&#8217;t have anything original to say, I decided to wait with [...]]]></description>
				<content:encoded><![CDATA[<p>update Monday; October 29, 2012 3:18pm (CDT)
Eric Holthaus at the Wall Street Journal has a more <a href="http://blogs.wsj.com/metropolis/2012/10/29/hurricane-sandy-what-to-expect-at-landfall-in-next-two-hours/">in-depth analysis</a> for the coming hours</p>

<hr />

<p>I&#8217;ve been following the unfolding Sandy in the last couple of days. I&#8217;ve been relying on many online sources. Since I don&#8217;t have anything original to say, I decided to wait with a post here until things have settled down a bit.</p>

<p>However, I wanted to share this one chart of the water levels at &#8220;The Battery NY&#8221; (original available <a href="http://hudson.dl.stevens-tech.edu/SSWS/d/index.shtml?station=N017">here</a>). It shows observed (red dots) vs. modelled (pink and green) water levels. Note that the current water level is (slightly) higher than predicted. This has been true for the last couple of high tides, but those had smaller peaks.</p>

<p><img src="http://planetwater.org/wp-content/uploads/2012/10/Water-Levels.jpg" alt="Water Levels in New York City" title="Water Levels.jpg" border="0" width="597" height="399" /></p>

<p>The coming hours will be critical! There is a high tide coming up, coinciding with the landfall of Sandy. Additionally there&#8217;s a mid-latitude trough just east of the Great Lakes that pulls Sandy onto the North American continent. Additionally, as if that wasn&#8217;t enough, the North Atlantic Oscillation is in a negative phase, pulling Sandy towards the North-East. Decent summaries can be found <a href="http://t.co/EkKcadA7">here</a> and <a href="http://all-geo.org/highlyallochthonous/2012/10/storm-comin/">here</a>.</p>
<img src="http://feeds.feedburner.com/~r/planetwater/~4/yK5jQIt100s" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://planetwater.org/2012/10/29/water-levels-in-new-york-city/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://planetwater.org/2012/10/29/water-levels-in-new-york-city/</feedburner:origLink></item>
	</channel>
</rss>
