tag:blogger.com,1999:blog-18143537322741009252023-03-14T10:25:15.127-07:00Murray's JournalMurrayhttp://www.blogger.com/profile/05615584529128840992noreply@blogger.comBlogger52125tag:blogger.com,1999:blog-1814353732274100925.post-13198012690686059642013-12-17T11:00:00.000-08:002013-12-17T11:23:21.677-08:00HistogramTools 0.3<p>Version 0.3 of <a href="http://cran.r-project.org/web/packages/HistogramTools/">HistogramTools</a> is now on <a href="http://cran.r-project.org/">CRAN</a>. HistogramTools provides a number of methods for manipulating histograms, measuring the distance between histograms, calculating the information loss due to binning aggregate data sets, and other tools useful for statistical analysis of binned/histogram data. It also uses <a href="http://cran.r-project.org/web/packages/RProtoBuf/index.html">RProtoBuf</a> to provide a protocol buffer representation of the default R histogram class, allowing histograms over large data sets to be computed and manipulated in a MapReduce environment with tools written in other languages.</p>
<p>The full list of updates includes:</p>
<ul>
<li>Moved 'Hmisc' from Depends to Imports.</li>
<li>Improved introduction vignette significantly.</li>
<li>Added <tt>ScaleHistograms</tt> function.</li>
<li>Added <tt>PlotRelativeFrequency</tt> function to plot relative frequency histograms.</li>
<li>Added <tt>minkowski.dist</tt>, <tt>intersect.dist</tt>, <tt>kl.divergence</tt>, <tt>jeffrey.divergence</tt> measures for two histograms.</li>
<li>Added <tt>PreBinnedHistogram</tt> for creating histogram objects from an already binned dataset (e.g. just a vector of bins and counts).</li>
</ul>
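<p>To give a flavor of what histogram distance measures like these compute, here is a minimal Python sketch of three of them using common textbook definitions. It is not the package's R code: the function names are hypothetical, the inputs are assumed to be plain lists of bin counts over identical bins, and the package's own definitions and normalizations may differ.</p>

```python
import math


def normalize(counts):
    """Convert raw bin counts to relative frequencies (assumed helper)."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("histogram must contain at least one observation")
    return [c / total for c in counts]


def minkowski_distance(h1, h2, p=2):
    """Minkowski distance of order p between two vectors of bin counts."""
    return sum(abs(a - b) ** p for a, b in zip(h1, h2)) ** (1.0 / p)


def intersection_distance(h1, h2):
    """One minus the overlap of the two relative-frequency histograms."""
    p, q = normalize(h1), normalize(h2)
    return 1.0 - sum(min(a, b) for a, b in zip(p, q))


def kl_divergence(h1, h2):
    """Kullback-Leibler divergence of h2 from h1, on relative frequencies."""
    p, q = normalize(h1), normalize(h2)
    total = 0.0
    for a, b in zip(p, q):
        if a == 0:
            continue  # empty bins in h1 contribute nothing
        if b == 0:
            return math.inf  # h1 has mass where h2 has none
        total += a * math.log(a / b)
    return total


# Two small histograms over the same three bins (made-up counts).
h1 = [10, 20, 30]
h2 = [15, 15, 30]
print(minkowski_distance(h1, h2), intersection_distance(h1, h2), kl_divergence(h1, h2))
```

<p>All three sketches return 0 for identical histograms; the KL divergence is asymmetric and becomes infinite when the second histogram has an empty bin where the first does not.</p>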
<p>Dirk's <a href="http://dirk.eddelbuettel.com/cranberries/">CRANberries</a> service provides a <a href="http://dirk.eddelbuettel.com/cranberries/2013/12/11/#HistogramTools_0.3">diff</a> to the previous release 0.2. More information is at the <a href="http://cran.r-project.org/web/packages/HistogramTools/">HistogramTools</a> page on CRAN which includes the 18-page package <a href="http://cran.r-project.org/web/packages/HistogramTools/vignettes/HistogramTools.pdf">vignette</a> and 1-page <a href="http://cran.r-project.org/web/packages/HistogramTools/vignettes/HistogramTools-quickref.pdf">Quick Reference Guide</a>. Please mail me directly with any questions or suggestions about this package.</p>
Murrayhttp://www.blogger.com/profile/05615584529128840992noreply@blogger.com0tag:blogger.com,1999:blog-1814353732274100925.post-88519693573031340732013-06-07T23:21:00.000-07:002013-06-07T23:21:46.512-07:00New Work on Flash Provisioning at USENIX ATC on June 26<p>Later this month I'll be at the <a href="http://www.usenix.org/conference/atc13">USENIX Annual Technical Conference</a> in San Jose with some coauthors on the Storage Analytics and Colossus teams at Google to present some of our recent work on optimizing flash provisioning for cloud storage workloads. Our paper is titled "Janus: Optimal Flash Provisioning for Cloud Storage Workloads", and a pre-print is available from <a href="http://research.google.com/pubs/pub41179.html">Google Research</a>.</p>
<a href="https://www.usenix.org/conference/atc13"><img src="https://www.usenix.org/sites/default/files/atc13_going.png" border="0" width="168" height="67" align="left" alt="I'm going to ATC '13"></a>
<p>This work is about using statistical samples of I/O patterns from a large distributed filesystem to formulate and solve an optimization problem that helps us allocate flash better in our datacenters. I'm looking forward to returning to USENIX ATC as it has been nearly 10 years since I've been to this conference. Shoot me a mail if you will be there and want to meet up.</p>
Murrayhttp://www.blogger.com/profile/05615584529128840992noreply@blogger.com0tag:blogger.com,1999:blog-1814353732274100925.post-46770764725623446302012-10-14T17:33:00.000-07:002012-10-14T17:33:24.249-07:00Two Recent Short Papers<p>My group at Google continues to grow, and we had the opportunity to publish two short workshop papers this year about some of the areas we've investigated.</p>
<ul>
<li><a href="http://research.google.com/pubs/pub37747.html">Projecting Disk Usage Based on Historical Trends in a Cloud Environment</a>,
Proceedings of the 3rd international workshop on Scientific cloud computing, ACM (2012)</li>
<li><a href="http://research.google.com/pubs/pub40378.html">Uncertainty in Aggregate Estimates from Sampled Distributed Traces</a>, 2012 Workshop on Managing Systems Automatically and Dynamically, USENIX</li>
</ul>
<p>The first paper describes some of the work we've done on forecasting storage growth in datacenters for capacity planning purposes using ensemble forecasting methods and trend-change detection. It builds on some of the earlier work we did for <a href="http://research.google.com/pubs/pub37483.html">websearch traffic forecasting</a> and, to a lesser extent, building a <a href="http://research.google.com/pubs/pub35115.html">market economy for datacenter resources</a>.</p>
<p>The second paper, to which I made only minor contributions, is a more mathematical description of a method of quantifying the uncertainty in aggregate metrics from a sampled RPC tracing system for large-scale distributed systems (e.g. Dapper).</p>
<p>Both papers address problems that usually arise in very large-scale distributed systems, and their applicability is somewhat limited in smaller contexts, but I would be very interested in feedback regardless.</p>
Murrayhttp://www.blogger.com/profile/05615584529128840992noreply@blogger.com0tag:blogger.com,1999:blog-1814353732274100925.post-50427491877890081232012-09-06T23:34:00.000-07:002012-09-06T23:34:35.821-07:00Cycling 320+ Miles Next Week for CharityNext week I will be cycling from Eureka to San Francisco for the <a href="http://www.climateride.org">California Climate Ride</a>. I'll be mostly out of touch for the week, but will try to post pictures and check in via email and mobile phone when possible. Please consider donating towards <a href="http://bike.climateride.org/index.cfm?fuseaction=donorDrive.participant&participantID=1862">my fundraising goal</a> to support the <a href="http://www.sfbike.org">San Francisco Bicycle Coalition.</a>
<iframe src="http://bike.climateride.org/index.cfm?fuseaction=widgets.200x420thermo&participantID=1862" width="202" height="422" frameborder="0" scrolling="no"><a href="http://bike.climateride.org/index.cfm?fuseaction=donorDrive.participant&participantID=1862">Make a Donation!</a></iframe>
Murrayhttp://www.blogger.com/profile/05615584529128840992noreply@blogger.com0tag:blogger.com,1999:blog-1814353732274100925.post-27302681775205679262010-10-28T23:20:00.000-07:002010-10-31T23:15:02.413-07:00What I've been up to..It's been nearly a year since I posted here and much has changed. The obvious and most important change is a second new addition to our family, which I've been blogging about elsewhere. On the work front I was able to publish a paper about some of my work studying the <a href="http://research.google.com/pubs/pub36737.html">Availability in Globally Distributed Storage Systems</a> at Google last year. This is an exciting space given the growth of cloud-based storage services and sophisticated distributed storage software.<br /><br />I've been blogging a little more regularly about work-related topics on Google company blogs, with four posts so far this year:<ul><br /><li><a href="http://google-opensource.blogspot.com/2010/10/integrating-r-with-c-rcpp-rinside-and.html">Integrating R with C++: Rcpp, RInside, and RProtobuf</a><br /><li><a href="http://googleresearch.blogspot.com/2010/10/google-at-usenix-symposium-on-operating.html">Google at USENIX Symposium on Operating Systems Design and Implementation (OSDI '10)</a><br /><li><a href="http://google-opensource.blogspot.com/2010/09/freebsds-summer-highlights.html">FreeBSD's Summer Highlights</a><br /><li><a href="http://google-opensource.blogspot.com/2010/07/notes-from-user-2010.html">Notes from useR! 2010</a><br /></ul><br />As you can see I've been working on data analysis, distributed cloud storage, and open source, along with some other projects I'm not yet able to talk about. 
I'll try to post more updates about some of my interests and side projects in the remainder of the year.Murrayhttp://www.blogger.com/profile/05615584529128840992noreply@blogger.com0tag:blogger.com,1999:blog-1814353732274100925.post-31954816844965846812010-01-10T17:25:00.000-08:002010-01-10T18:52:04.938-08:00Fun with Amazon Web Services<a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_eXmV_8Vp5GE/S0qM4jY-WyI/AAAAAAAAqXo/0YRvzZiyCS8/s1600-h/chart01_traditional_240x240.jpg"><img style="float:left; margin:0 10px 10px 0;cursor:pointer; cursor:hand;width: 200px; height: 200px;" src="http://1.bp.blogspot.com/_eXmV_8Vp5GE/S0qM4jY-WyI/AAAAAAAAqXo/0YRvzZiyCS8/s200/chart01_traditional_240x240.jpg" border="0" alt="" id="BLOGGER_PHOTO_ID_5425303604321737506" /></a>Amazon has been doing a really great job at selling excess compute capacity in their datacenters through products such as <a href="http://aws.amazon.com/ec2/">Amazon Elastic Compute Cloud (EC2)</a>, <a href="http://aws.amazon.com/elasticmapreduce/">Elastic MapReduce</a>, and their simple and structured distributed storage products. The economics of this kind of model, as represented in the two graphs here, are clearly compelling. Instead of buying large numbers of computers to mostly sit idle, new start-up companies, researchers, and individuals can rent the excess capacity from Amazon instead. Last year I worked on some <a href="http://research.google.com/pubs/pub35115.html">related ideas for internal pricing and provisioning of resources</a> at work. This was my first direct experience with the Amazon consumer offerings, however, and I was impressed. 
It took less than half an hour last night to sign up, start a few basic Linux instances, copy some application code over, compile it, and begin running it on the Linux Xen instances.<br /><br /><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_eXmV_8Vp5GE/S0qM_J-azJI/AAAAAAAAqXw/d8otlXB1qsQ/s1600-h/chart02_aws_240x240.jpg"><img style="float:right; margin:0 0 10px 10px;cursor:pointer; cursor:hand;width: 200px; height: 200px;" src="http://3.bp.blogspot.com/_eXmV_8Vp5GE/S0qM_J-azJI/AAAAAAAAqXw/d8otlXB1qsQ/s200/chart02_aws_240x240.jpg" border="0" alt="" id="BLOGGER_PHOTO_ID_5425303717758553234" /></a>Not everything is so easily scaled to run on more computers. Some tasks are more feasibly done with human involvement, and I've also been experimenting with <a href="http://www.mturk.com">Amazon Mechanical Turk</a>. This service is named after the <a href="http://en.wikipedia.org/wiki/The_Turk">18th century fake chess-playing machine</a> that actually used a hidden human operator to control the device. I have used this service recently to <a href="http://freebsd.stokely.org/2010/01/improved-conference-captions-from.html">improve the captions for FreeBSD technical conference videos</a> that I am involved with and the results have been stunning.<br /><br />The combination of cheap on-demand distributed computer clusters and a global English-language workforce that can be paid by the task engenders almost too many business ideas to contemplate. If only there were more hours in the day...Murrayhttp://www.blogger.com/profile/05615584529128840992noreply@blogger.com0tag:blogger.com,1999:blog-1814353732274100925.post-62162931958942144342009-06-07T23:12:00.000-07:002009-06-07T23:20:54.504-07:00Support Simon Singh and Scientific Debate<a href="http://www.simonsingh.net/">Simon Singh</a> has been sued for libel by the British Chiropractic Association. 
Simon is an author, journalist, and TV producer who works to popularize math and science. I had the opportunity to hear Simon speak about an earlier book on the <a href="http://www.amazon.com/Big-Bang-Origin-Universe-P-S/dp/0007162219/ref=sr_1_1?ie=UTF8&s=books&qid=1244441782&sr=8-1">Big Bang</a> at <a href="http://www.keble.ox.ac.uk/">Keble College</a>, Oxford. Simon wrote a more recent book on alternative medicine and has suggested that there is no evidence for the efficacy of chiropractic treatments for asthma, ear infections, and other infant conditions. British libel laws are stricter than those in the U.S., and this scientific debate has, unbelievably, been construed as a form of libel. Read more about the dispute and sign the <a href="http://www.senseaboutscience.org.uk/index.php/site/project/334">petition</a> here.Murrayhttp://www.blogger.com/profile/05615584529128840992noreply@blogger.com0tag:blogger.com,1999:blog-1814353732274100925.post-68561946482429193732009-06-01T01:10:00.000-07:002009-06-01T01:19:33.834-07:00Erdős NumberMy current <a href="http://www.oakland.edu/enp/">Erdős number</a> is 4. There are several paths of length 3 from my M.Sc. advisor, Joël Ouaknine, to Paul Erdős. The path currently returned by the <a href="http://www.ams.org/mathscinet/collaborationDistance.html">AMS Collaboration Distance Calculator</a> is:<br /><ul><br /><li>Murray Stokely coauthored with Joël Ouaknine<br /><li>Joël Ouaknine coauthored with A. W. Roscoe<br /><li>A. W. 
Roscoe coauthored with Mary Ellen Rudin<br /><li>Mary Ellen Rudin coauthored with Paul Erdős<br /></ul>Murrayhttp://www.blogger.com/profile/05615584529128840992noreply@blogger.com0tag:blogger.com,1999:blog-1814353732274100925.post-69425301282127645522009-05-09T17:06:00.000-07:002009-05-09T18:37:07.415-07:00Dice WarsI was almost completely unaware of the phenomenon of Flash Games until late last year when a friend of mine started working at a <a href="http://www.mochimedia.com/">company</a> that makes ads for them. I recently discovered the <a href="http://www.gamedesign.jp/flash/dice/dice.html">Dice Wars</a> game which is a typical example of the genre. It is simple yet addictive. The basic premise is similar to the classic board game <a href="http://en.wikipedia.org/wiki/Risk_(game)">Risk</a>. Unlike the board game, however, the board is smaller, you cannot move armies, army placement is random, and games are much, much quicker.<br /><br />Although the dice layout is completely random, the game is addictive because much of it involves placement strategy. New armies are awarded after each turn based on the largest connected set of game territories your armies control. As with Risk, you roll one die for each army, and the sum of the faces for all N attacking dice is compared to the sum of the faces for all M defending dice, with defenders winning ties. Some quick <a href="http://www.r-project.com">R</a> code can be used to compute the probability of winning an attack with N attacking armies against M defending armies. The left column represents the number of attacking dice and the first row represents the number of defending dice. 
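<br /><br />As a rough illustration of the calculation described above, here is a minimal Python sketch (not the R code mentioned in this post; the function names are hypothetical). It assumes fair six-sided dice, counts the number of ways each total can occur, and treats an attack as successful only when the attackers' total is strictly greater than the defenders' total.

```python
from fractions import Fraction
from functools import lru_cache


@lru_cache(maxsize=None)
def sum_distribution(num_dice, sides=6):
    """Return a tuple where index s is the number of ways num_dice dice total s."""
    counts = [1]  # with zero dice there is exactly one way to total 0
    for _ in range(num_dice):
        nxt = [0] * (len(counts) + sides)
        for total, ways in enumerate(counts):
            for face in range(1, sides + 1):
                nxt[total + face] += ways
        counts = nxt
    return tuple(counts)


def attack_win_probability(attackers, defenders, sides=6):
    """Exact probability that the attackers' total strictly exceeds the defenders'."""
    att = sum_distribution(attackers, sides)
    dfn = sum_distribution(defenders, sides)
    wins = 0
    for a_total, a_ways in enumerate(att):
        # Only defender totals strictly below a_total lose; ties go to the defender.
        wins += a_ways * sum(dfn[:a_total])
    return Fraction(wins, sides ** (attackers + defenders))


# Print a grid with attackers down the side and defenders across the top.
for n in range(1, 7):
    print(n, [round(float(attack_win_probability(n, m)), 4) for m in range(1, 7)])
```

Exact fractions avoid any floating-point drift in the tiny probabilities that arise when a few attacking dice face many defending dice; values are converted to floats only for display.<br /><br />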
Each cell represents the probability of a successful attack given N attacking vs. M defending fair dice.<table border=1><tr><th>Dice</th><th align="center">1</th><th align="center">2</th><th align="center">3</th><th align="center">4</th><th align="center">5</th><th align="center">6</th></tr><br /><tr><th>1</th><td>.4167</td><td>.0926</td><td>.0116</td><td>.0008</td><td>2e-05</td><td>0</td></tr><br /><tr><th>2</th><td>.8380</td><td>.4437</td><td>.1520</td><td>.0358</td><td>.0061</td><td>.0008</td></tr><br /><tr><th>3</th><td>.9730</td><td>.7786</td><td>.4536</td><td>.1917</td><td>.0607</td><td>.0149</td></tr><br /><tr><th>4</th><td>.9973</td><td>.9392</td><td>.7427</td><td>.4595</td><td>.2204</td><td>.0834</td></tr><br /><tr><th>5</th><td>.9998</td><td>.9879</td><td>.9093</td><td>.7181</td><td>.4637</td><td>.2424</td></tr><br /><tr><th>6</th><td>.999997</td><td>.9982</td><td>.9753</td><td>0.884</td><td>.6996</td><td>.4667</td></tr><br /></table>Murrayhttp://www.blogger.com/profile/05615584529128840992noreply@blogger.com0tag:blogger.com,1999:blog-1814353732274100925.post-13525693231543242412009-04-28T23:02:00.000-07:002009-04-29T01:08:07.689-07:00Back after a 4 year hiatus...I've imported a number of short posts I made on a previous personal blog from 2004-2005 and relaunched this as <a href="http://blog.stokely.org">blog.stokely.org</a>. My posts in the past tended to be short updates about travel plans or links to pictures. Those types of updates now go to something like Flickr, Facebook, or Twitter, so I'll be using this blog to post less frequent but longer musings about technology, math, computer science, travel, and life in the Bay Area. 
I try to keep things partitioned across three separate blogs so that friends and family are not inundated with posts that do not interest them:<br /><ul><br /> <li><a href="http://blog.stokely.org">blog.stokely.org</a> - This blog: general posts about my life, math, technology, and more.</li><br /> <li><a href="http://freebsd.stokely.org">freebsd.stokely.org</a> - Posts about my involvement with the open-source FreeBSD Operating System.</li><br /> <li><a href="http://ava.stokely.org">ava.stokely.org</a> - Posts about Ava and our life with a new baby girl.</li><br /></ul>Murrayhttp://www.blogger.com/profile/05615584529128840992noreply@blogger.com0tag:blogger.com,1999:blog-1814353732274100925.post-80831802174079415692005-09-21T23:38:00.000-07:002009-04-28T00:49:33.228-07:00Returning to IndustryI've decided to accept a position at <a href="http://www.google.com">Google</a> and postpone the Ph.D. for now. We are moving to Mountain View next week. Our new apartment is 1 mile from downtown and 2 miles from the Googleplex. Pictures are available <a href="http://www.stokely.org/20050916-apartment/">here</a>.Murrayhttp://www.blogger.com/profile/05615584529128840992noreply@blogger.com0tag:blogger.com,1999:blog-1814353732274100925.post-49779917125842448592005-06-24T23:37:00.000-07:002009-04-28T00:49:54.820-07:00Summer Events - Garden Party, Ball, etc.I've posted a few <a href="http://www.stokely.org/20050624-wadham-ball/">pictures</a> of summer events from the last few weeks of Trinity Term at Wadham. These include the Finalists/Graduates Garden Party with the Warden, and the Burlesque Ball.Murrayhttp://www.blogger.com/profile/05615584529128840992noreply@blogger.com0tag:blogger.com,1999:blog-1814353732274100925.post-59371917556806664542005-05-30T23:36:00.000-07:002009-04-27T23:37:27.887-07:00Camping in the Lake DistrictChristian and I went camping in the Lake District for an extended weekend. 
The pictures are posted <a href="http://www.stokely.org/20050529-lakesdistrict">here</a> and <a href="http://www.stokely.org/20050529-lakesdistrict-cgb">here</a>.Murrayhttp://www.blogger.com/profile/05615584529128840992noreply@blogger.com0tag:blogger.com,1999:blog-1814353732274100925.post-51411496889856370732005-05-20T23:35:00.000-07:002009-04-27T23:36:20.567-07:00Back from OttawaI have returned from BSDCan in Ottawa and my friend of 16 years, Christian, is visiting for 2 weeks. The pictures from Canada are <a href="http://www.stokely.org/20050515-ottawa/">here</a>.Murrayhttp://www.blogger.com/profile/05615584529128840992noreply@blogger.com0tag:blogger.com,1999:blog-1814353732274100925.post-41159132527964672392005-05-03T23:34:00.000-07:002009-04-27T23:35:20.843-07:00Pictures from MoscowI have returned from Moscow and posted the pictures <a href="http://www.stokely.org/20050501-moscow/">here</a>.Murrayhttp://www.blogger.com/profile/05615584529128840992noreply@blogger.com0tag:blogger.com,1999:blog-1814353732274100925.post-24994927680132158852005-04-23T23:33:00.000-07:002009-04-28T00:51:19.608-07:00Back to MoscowI'm flying to Moscow on Tuesday to give FreeBSD talks at <a href="http://www.outsourcing-russia.com/events/?18">Open Source Forum Russia</a> and at <a href="http://www.msu.ru/en">Moscow State University</a>. I will return on 2 May.Murrayhttp://www.blogger.com/profile/05615584529128840992noreply@blogger.com0tag:blogger.com,1999:blog-1814353732274100925.post-8200548351188870642005-04-16T23:32:00.000-07:002009-04-27T23:33:30.284-07:00Back from Hiking Trip to SnowdoniaI've just returned from a great week in Snowdonia with other Wadham College MCR members on the annual reading/hiking retreat. 
Pictures are <a href="http://www.stokely.org/uk.html">here</a>.Murrayhttp://www.blogger.com/profile/05615584529128840992noreply@blogger.com0tag:blogger.com,1999:blog-1814353732274100925.post-9213624406812765882005-04-06T23:31:00.000-07:002009-04-27T23:32:13.538-07:00Cambridge Exchange DinnerI just returned from a nice exchange dinner at Christ's College, Cambridge. We spent the night and had a nice dinner and visit with their MCR. The next morning I stayed later to have lunch with some friends. Pictures are posted <a href="http://www.stokely.org/20050406-cambridge">here</a>.Murrayhttp://www.blogger.com/profile/05615584529128840992noreply@blogger.com0tag:blogger.com,1999:blog-1814353732274100925.post-36036887260625375042005-03-22T23:30:00.000-08:002009-04-28T00:52:02.500-07:00End of CourseworkI submitted my final projects and thus the Hilary Term has ended for me, hurray! No more coursework, only a dissertation to write.Murrayhttp://www.blogger.com/profile/05615584529128840992noreply@blogger.com0tag:blogger.com,1999:blog-1814353732274100925.post-51173662106144200132005-03-11T23:30:00.000-08:002009-04-27T23:30:37.752-07:00MCR Guest DinnerNik and his girlfriend Lisa came up to Oxford for another MCR Guest Dinner. A few pictures have been posted <a href="http://www.stokely.org/200503-oxford/">here</a>.Murrayhttp://www.blogger.com/profile/05615584529128840992noreply@blogger.com0tag:blogger.com,1999:blog-1814353732274100925.post-56943964558949519512005-02-03T23:28:00.000-08:002009-04-28T00:52:30.205-07:00Seigo visits OxfordSeigo Tanimura arrived in Oxford today and we walked around town together. A few pictures (mostly from Christ Church) are <a href="http://www.stokely.org/20050203-oxford-seigo/">here</a>. 
This is the third continent we've met on after originally meeting at a FreeBSD meeting in Tokyo.Murrayhttp://www.blogger.com/profile/05615584529128840992noreply@blogger.com0tag:blogger.com,1999:blog-1814353732274100925.post-45689854749379235942005-01-21T23:27:00.000-08:002009-04-27T23:27:51.709-07:00Burns NightWe had haggis, whisky, poetry, and bagpipes in hall to celebrate Burns Night in Wadham. The pictures are <a href="http://www.stokely.org/20050121-wadham-burns">here</a>.Murrayhttp://www.blogger.com/profile/05615584529128840992noreply@blogger.com0tag:blogger.com,1999:blog-1814353732274100925.post-46141876261931308552005-01-19T23:26:00.000-08:002009-04-27T23:27:01.170-07:00Moved to Summertown (Finished)We've moved from Iffley to Summertown, which is a little closer to the Oxford city centre and a much better place to live for various reasons.Murrayhttp://www.blogger.com/profile/05615584529128840992noreply@blogger.com0tag:blogger.com,1999:blog-1814353732274100925.post-90804091957532581822005-01-17T23:25:00.000-08:002009-04-27T23:26:16.888-07:00Hilary TermHilary term has begun in Oxford so I won't have time to post again until early Spring. My<br /><a href="http://users.ox.ac.uk/~wadh2425">academic page</a> contains some information about lectures and activities this term.Murrayhttp://www.blogger.com/profile/05615584529128840992noreply@blogger.com0tag:blogger.com,1999:blog-1814353732274100925.post-50634785322579554882005-01-16T23:24:00.000-08:002009-04-27T23:25:08.089-07:00Moving to SummertownDeDe and I are back from our 10 day trip to Morocco. The pictures have been posted <a href="http://www.stokely.org/europe.html">here</a>. We had a great time. Term starts tomorrow and we are in the middle of moving from Iffley to Summertown in Oxford.Murrayhttp://www.blogger.com/profile/05615584529128840992noreply@blogger.com0