<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/atom10full.xsl"?><?xml-stylesheet type="text/css" media="screen" href="http://feeds.feedburner.com/~d/styles/itemcontent.css"?><feed xmlns="http://www.w3.org/2005/Atom">
  <title>Pat Shaughnessy</title>
  <id>http://patshaughnessy.net</id>
  <updated>2008-09-03T00:00:00Z</updated>
  <author>
    <name />
  </author>
  <atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" type="application/atom+xml" href="http://feeds.feedburner.com/patshaughnessy" /><feedburner:info xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0" uri="patshaughnessy" /><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com/" /><entry>
    <title>Ruby’s Missing Data Structure</title>
    <link href="http://patshaughnessy.net/2013/4/11/rubys-missing-data-structure" rel="alternate" />
    <id>http://patshaughnessy.net/2013/4/11/rubys-missing-data-structure</id>
    <published>2013-04-11T00:00:00Z</published>
    <updated>2013-04-11T00:00:00Z</updated>
    <category>ruby</category>
    <author>
      <name />
    </author>
    <summary type="html">&lt;div style="float: left; padding: 7px 30px 10px 0px"&gt;
  &lt;img src="http://patshaughnessy.net/assets/2013/4/11/linked-list2.png"&gt;
&lt;/div&gt;


&lt;p&gt;Have you ever noticed Ruby doesn’t include support for linked lists? Most
computer science textbooks are filled with algorithms, examples and exercises
based on linked lists: inserting or removing elements, sorting lists, reversing
lists, etc. Strangely, however, there is no linked list object in Ruby&amp;hellip;&lt;/p&gt;
</summary>
    <content type="html">&lt;div style="float: left; padding: 7px 30px 10px 0px"&gt;
  &lt;img src="http://patshaughnessy.net/assets/2013/4/11/linked-list2.png"&gt;
&lt;/div&gt;


&lt;p&gt;Have you ever noticed Ruby doesn’t include support for linked lists? Most
computer science textbooks are filled with algorithms, examples and exercises
based on linked lists: inserting or removing elements, sorting lists, reversing
lists, etc. Strangely, however, there is no linked list object in Ruby.&lt;/p&gt;

&lt;p&gt;Recently, after studying Haskell and Lisp for a couple of months, I returned to
Ruby and tried to use some of the functional programming ideas I had learned
about. But how do I create a list in Ruby? How do I add or remove an element
from a list in Ruby? Ruby contains fast, internal C implementations of the
Array and Hash classes, and in the standard library you can find Ruby code
implementing the Set, Matrix, and Vector classes among many other things. But
no linked lists – why?&lt;/p&gt;

&lt;p&gt;The answer is simple: Matz included features you would normally associate with
linked lists in the Ruby Array class. For example, you can use &lt;span
  class="code"&gt;Array#push&lt;/span&gt; and &lt;span class="code"&gt;Array#unshift&lt;/span&gt; to
add an element to an array, or &lt;span class="code"&gt;Array#pop&lt;/span&gt; and &lt;span
  class="code"&gt;Array#shift&lt;/span&gt; to remove an element from an array.&lt;/p&gt;
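
&lt;p&gt;As a quick illustration, here is a minimal sketch of how those four methods
let an array stand in for a list. The helper name &lt;span
  class="code"&gt;list_operations&lt;/span&gt; is made up for this example.&lt;/p&gt;

&lt;pre&gt;
```ruby
# Hypothetical helper: uses an Array the way you might use a linked list.
def list_operations
  list = [2, 3]
  list.push(4)          # add to the end:   [2, 3, 4]
  list.unshift(1)       # add to the front: [1, 2, 3, 4]
  last  = list.pop      # remove from the end, returns 4
  first = list.shift    # remove from the front, returns 1
  [list, first, last]
end

remaining, first, last = list_operations
puts "remaining=#{remaining.inspect} first=#{first} last=#{last}"
```
&lt;/pre&gt;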

&lt;p&gt;&lt;a href="http://rubysource.com/rubys-missing-data-structure/"&gt;Today on RubySource.com&lt;/a&gt;
I wrote about a few common operations you would normally use a linked list for,
and then took a look at how you would implement them using an array. I also
looked into how Ruby’s array object works internally.&lt;/p&gt;
</content>
  </entry>
  <entry>
    <title>Ruby 2.0 Works Hard So You Can Be Lazy</title>
    <link href="http://patshaughnessy.net/2013/4/3/ruby-2-0-works-hard-so-you-can-be-lazy" rel="alternate" />
    <id>http://patshaughnessy.net/2013/4/3/ruby-2-0-works-hard-so-you-can-be-lazy</id>
    <published>2013-04-03T00:00:00Z</published>
    <updated>2013-04-03T00:00:00Z</updated>
    <category>ruby</category>
    <author>
      <name />
    </author>
    <summary type="html">&lt;div style="float: left; padding: 8px 30px 10px 0px;
line-height:16px"&gt;
  &lt;table cellpadding="0" cellspacing="0" border="0"&gt;
    &lt;tr&gt;&lt;td align="center"&gt;&lt;img src="http://patshaughnessy.net/assets/2013/4/3/work1.jpg"&gt;&lt;/td&gt;&lt;/tr&gt;
    &lt;tr&gt;&lt;td align="center"&gt;&lt;i&gt;Lazy enumeration isn’t magic;&lt;br/&gt;it’s just a matter of hard work&lt;/i&gt;&lt;/td&gt;&lt;/tr&gt;
  &lt;/table&gt;
&lt;/div&gt;


&lt;p&gt;Ruby 2.0’s new lazy enumerator feature seems like magic. In case you haven’t&lt;/p&gt;
</summary>
    <content type="html">&lt;div style="float: left; padding: 8px 30px 10px 0px;
line-height:16px"&gt;
  &lt;table cellpadding="0" cellspacing="0" border="0"&gt;
    &lt;tr&gt;&lt;td align="center"&gt;&lt;img src="http://patshaughnessy.net/assets/2013/4/3/work1.jpg"&gt;&lt;/td&gt;&lt;/tr&gt;
    &lt;tr&gt;&lt;td align="center"&gt;&lt;i&gt;Lazy enumeration isn’t magic;&lt;br/&gt;it’s just a matter of hard work&lt;/i&gt;&lt;/td&gt;&lt;/tr&gt;
  &lt;/table&gt;
&lt;/div&gt;


&lt;p&gt;Ruby 2.0’s new lazy enumerator feature seems like magic. In case you haven’t
tried it yet, it allows you to iterate over an infinite series of values and
take just the values you want. It brings the functional programming concept of
lazy evaluation to Ruby &amp;ndash; at least for enumerations.&lt;/p&gt;

&lt;p&gt;For example, in Ruby 1.9 and earlier you would run into an endless loop if you
tried to iterate over an infinite range:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2013/4/3/code1.png"/&gt;&lt;/p&gt;

&lt;p&gt;Here the call to &lt;span class="code"&gt;collect&lt;/span&gt; starts an endless loop; the
call to &lt;span class="code"&gt;first&lt;/span&gt; never happens. However, if you upgrade
to Ruby 2.0 and use the new &lt;span class="code"&gt;Enumerable#lazy&lt;/span&gt; method,
you can avoid the endless loop and get just the values you need:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2013/4/3/code2.png"/&gt;&lt;/p&gt;
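
&lt;p&gt;The code in the screenshot is not available as text here, so the following
is only a rough sketch of the same idea. It assumes the infinite range is
written as &lt;span class="code"&gt;1..Float::INFINITY&lt;/span&gt; and that the block
squares each value; the method name &lt;span class="code"&gt;first_squares&lt;/span&gt;
is hypothetical.&lt;/p&gt;

&lt;pre&gt;
```ruby
# Sketch only: assumes the infinite range is 1..Float::INFINITY (Ruby 2.0+).
def first_squares(count)
  range = (1..Float::INFINITY)
  # Without .lazy, collect would loop forever and first would never run.
  range.lazy.collect { |x| x * x }.first(count)
end

p first_squares(10)
```
&lt;/pre&gt;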

&lt;p&gt;But how does lazy evaluation actually work? How does Ruby know I only want ten
values in this example? All I have to do is make the simple call to the &lt;span class="code"&gt;lazy&lt;/span&gt;
method and it just works.&lt;/p&gt;

&lt;p&gt;It seems like magic, but actually it’s just a matter of hard work. A lot
happens inside of Ruby when you call &lt;span class="code"&gt;lazy&lt;/span&gt;. To give
you just the values you need, Ruby automatically creates and uses many
different types of internal Ruby objects. Like heavy equipment at a work site,
these objects work together to process the input values from my infinite range
in just the right way. What are these objects? What do they do? How do they
work together? Let’s find out!&lt;/p&gt;

&lt;h2&gt;The Enumerable module: many different ways of calling “each”&lt;/h2&gt;

&lt;p&gt;When I call &lt;span class="code"&gt;collect&lt;/span&gt; on the range above I’m using
Ruby’s &lt;span class="code"&gt;Enumerable&lt;/span&gt; module.  As you probably know, this
module contains a series of methods, such as &lt;span class="code"&gt;select&lt;/span&gt;,
&lt;span class="code"&gt;detect&lt;/span&gt;, &lt;span class="code"&gt;any?&lt;/span&gt; and many more,
that process lists of values in different ways. Internally, all of these
methods work by calling &lt;span class="code"&gt;each&lt;/span&gt; on the target object or
receiver:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2013/4/3/collect1.png"/&gt;&lt;/p&gt;
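
&lt;p&gt;To see this for yourself, here is a small sketch using a made-up class,
&lt;span class="code"&gt;Countdown&lt;/span&gt;, that defines nothing but &lt;span
  class="code"&gt;each&lt;/span&gt;. Once it includes &lt;span
  class="code"&gt;Enumerable&lt;/span&gt;, the other methods work by calling that
one method.&lt;/p&gt;

&lt;pre&gt;
```ruby
# Hypothetical class: the only iteration method it defines is each.
class Countdown
  include Enumerable

  def initialize(start)
    @start = start
  end

  # Enumerable's methods (collect, select, any? and so on) all call this.
  def each
    n = @start
    while n > 0
      yield n
      n -= 1
    end
    self
  end
end

countdown = Countdown.new(3)
p countdown.collect { |x| x * 10 }
p countdown.select { |x| x.odd? }
p countdown.any? { |x| x > 2 }
```
&lt;/pre&gt;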

&lt;div style="float: right; padding: 17px 0px 10px 30px;
line-height:16px"&gt;
  &lt;img src="http://patshaughnessy.net/assets/2013/4/3/work2.jpg"&gt;
&lt;/div&gt;


&lt;p&gt;You can think of the &lt;span class="code"&gt;Enumerable&lt;/span&gt; methods as a series
of different types of machines that operate on data in different ways, all via
the &lt;span class="code"&gt;each&lt;/span&gt; method:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2013/4/3/select-any.png"/&gt;&lt;/p&gt;

&lt;h2&gt;Enumerable is eager&lt;/h2&gt;

&lt;p&gt;Many of the &lt;span class="code"&gt;Enumerable&lt;/span&gt; methods, including &lt;span
  class="code"&gt;collect&lt;/span&gt;, return an array of values.  Since the &lt;span
  class="code"&gt;Array&lt;/span&gt; class also includes the &lt;span
  class="code"&gt;Enumerable&lt;/span&gt; module and responds to &lt;span
  class="code"&gt;each&lt;/span&gt;, you can chain different &lt;span
  class="code"&gt;Enumerable&lt;/span&gt; methods together easily:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2013/4/3/collect-first.png"/&gt;&lt;/p&gt;

&lt;p&gt;In my code example above, the &lt;span class="code"&gt;Enumerable#first&lt;/span&gt; method
calls &lt;span class="code"&gt;each&lt;/span&gt; on the result of &lt;span
  class="code"&gt;Enumerable#collect&lt;/span&gt;, an array which was generated in turn
by another call to &lt;span class="code"&gt;each&lt;/span&gt; on the input range.&lt;/p&gt;

&lt;p&gt;One important detail to notice here is that &lt;span
  class="code"&gt;Enumerable#collect&lt;/span&gt; and &lt;span
  class="code"&gt;Enumerable#first&lt;/span&gt; are both eager: each one builds and
returns a complete new array before the next method in the chain gets a chance
to run. So in my example, first &lt;span class="code"&gt;collect&lt;/span&gt;
processes all the values from the range and saves the results into the first
array. Then in a second step &lt;span class="code"&gt;first&lt;/span&gt; reads values
from the first array, placing the ones I asked for into the second array:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2013/4/3/two-steps.png"/&gt;&lt;/p&gt;
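
&lt;p&gt;One way to watch the two steps happen is to count how often the block
runs. This is a minimal sketch with a finite range, and the method name &lt;span
  class="code"&gt;eager_chain&lt;/span&gt; is made up for the example.&lt;/p&gt;

&lt;pre&gt;
```ruby
# Counts how many times the collect block runs when only three results are needed.
def eager_chain
  calls = 0
  squares = (1..10).collect do |x|   # step 1: builds a full 10-element array
    calls += 1
    x * x
  end
  wanted = squares.first(3)          # step 2: copies three values into a new array
  [wanted, calls]
end

wanted, calls = eager_chain
puts "got #{wanted.inspect} after #{calls} block calls"
```
&lt;/pre&gt;

&lt;p&gt;Even though I only keep three values, the block runs for every element of
the range.&lt;/p&gt;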

&lt;p&gt;This is what leads to the endless loop for an infinite range; since &lt;span
  class="code"&gt;Range#each&lt;/span&gt; will never stop returning values, &lt;span
  class="code"&gt;Enumerable#collect&lt;/span&gt; will never finish, and &lt;span
  class="code"&gt;Enumerable#first&lt;/span&gt; will never get a chance to stop the
iteration.&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2013/4/3/endless-loop.png"/&gt;&lt;/p&gt;

&lt;div style="float: left; padding: 47px 30px 10px 0px;
line-height:16px"&gt;
  &lt;img src="http://patshaughnessy.net/assets/2013/4/3/work3.jpg"&gt;
&lt;/div&gt;


&lt;h2&gt;The Enumerator object: deferred enumeration&lt;/h2&gt;

&lt;p&gt;One interesting trick you can use with the &lt;span class="code"&gt;Enumerable&lt;/span&gt;
module’s methods is to call them without providing a block. For example,
suppose I call &lt;span class="code"&gt;collect&lt;/span&gt; on my range, but I don’t
provide a block:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2013/4/3/code3.png"/&gt;&lt;/p&gt;

&lt;p&gt;Here Ruby has prepared an object you can use later to actually enumerate over
the range, called an “Enumerator.” As you can see from the inspect string, Ruby
has saved a reference to the receiver (&lt;span class="code"&gt;1..10&lt;/span&gt;) along with the name of the
enumerable method I want to use (&lt;span
  class="code"&gt;collect&lt;/span&gt;) inside the enumerator object.&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2013/4/3/enumerator-collect.png"/&gt;&lt;/p&gt;

&lt;p&gt;Later when I want to actually iterate through the range and collect the values
in an array, I can just call &lt;span class="code"&gt;each&lt;/span&gt; on the enumerator:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2013/4/3/code4.png"/&gt;&lt;/p&gt;
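
&lt;p&gt;In text form, the two steps might look like the sketch below. The method
name &lt;span class="code"&gt;deferred_collect&lt;/span&gt; is hypothetical, and the
squaring block is an assumption carried over from my earlier example.&lt;/p&gt;

&lt;pre&gt;
```ruby
# No block given, so collect returns an Enumerator instead of an array.
def deferred_collect
  (1..10).collect
end

enum = deferred_collect
puts enum.class                      # Enumerator
squares = enum.each { |x| x * x }    # now collect actually runs, using this block
p squares
```
&lt;/pre&gt;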

&lt;p&gt;There are a few other ways of using enumerators, such as calling &lt;span class="code"&gt;next&lt;/span&gt;
repeatedly, which I don’t have time to discuss today.&lt;/p&gt;

&lt;h2&gt;Enumerator::Generator &amp;ndash; generating new values for enumeration&lt;/h2&gt;

&lt;p&gt;In my previous examples I used a &lt;span class="code"&gt;Range&lt;/span&gt; object to produce a series of values.
However, the &lt;span class="code"&gt;Enumerator&lt;/span&gt; class provides another more
flexible way of generating a series of values using a block. Here’s an example:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2013/4/3/enumerator-new.png"/&gt;&lt;/p&gt;

&lt;p&gt;Let’s take a look at what sort of enumerator this is:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2013/4/3/inspect-enum.png"/&gt;&lt;/p&gt;

&lt;p&gt;As you can see, Ruby has created a new enumerator object that contains a
reference to an internal object called &lt;span
  class="code"&gt;Enumerator::Generator&lt;/span&gt;, and has set it up to call the &lt;span
  class="code"&gt;each&lt;/span&gt; method on that generator. Internally, the generator
object converts the block I provided above into a &lt;span
  class="code"&gt;Proc&lt;/span&gt; object and saves it away:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2013/4/3/enum-generator.png"/&gt;&lt;/p&gt;

&lt;p&gt;Now when I use the &lt;span class="code"&gt;Enumerator&lt;/span&gt; object, Ruby will call
the &lt;span class="code"&gt;Proc&lt;/span&gt; saved inside the generator to get the values
for the enumeration:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2013/4/3/code5.png"/&gt;&lt;/p&gt;

&lt;p&gt;In other words, the &lt;span class="code"&gt;Enumerator::Generator&lt;/span&gt; object is a
source of data for an enumeration &amp;ndash; it “generates” the values and passes them
along.&lt;/p&gt;
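
&lt;p&gt;Since the screenshots are not available as text, here is a small sketch of
an enumerator built from a block. The values 1, 2 and 3 and the method name
&lt;span class="code"&gt;three_values&lt;/span&gt; are assumptions chosen for
illustration.&lt;/p&gt;

&lt;pre&gt;
```ruby
# Sketch: an enumerator whose values come from a block rather than a range.
def three_values
  Enumerator.new do |y|
    y.yield 1
    y.yield 2
    y.yield 3
  end
end

# Calling each runs the block above, which supplies 1, 2 and 3 in turn.
three_values.each { |x| puts x }
```
&lt;/pre&gt;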

&lt;h2&gt;Enumerator::Yielder &amp;ndash; allowing one block to yield to another&lt;/h2&gt;

&lt;p&gt;If you take a close look at the code above, there’s something strange about it.
I first created the &lt;span class="code"&gt;Enumerator&lt;/span&gt; object using a block:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2013/4/3/enumerator-new.png"/&gt;&lt;/p&gt;

&lt;p&gt;…which yields values to a second block I provide later when I call &lt;span class="code"&gt;each&lt;/span&gt;:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2013/4/3/code5.png"/&gt;&lt;/p&gt;

&lt;p&gt;In other words, the enumerator somehow allows you to yield values directly from
one block to another:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2013/4/3/two-blocks.png"/&gt;&lt;/p&gt;

&lt;p&gt;But of course this isn’t how Ruby works. Blocks can’t pass values directly to
each other like this. The trick to making this work is another internal object
called the &lt;span class="code"&gt;Enumerator::Yielder&lt;/span&gt; object, passed into
the block with the &lt;span class="code"&gt;y&lt;/span&gt; block parameter:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2013/4/3/enumerator-new.png"/&gt;&lt;/p&gt;

&lt;p&gt;The &lt;span class="code"&gt;y&lt;/span&gt; parameter is very easy to miss here. But if you
re-read the block’s code, you’ll notice I’m not actually yielding values at
all; I’m simply calling the &lt;span class="code"&gt;yield&lt;/span&gt; method on the &lt;span
  class="code"&gt;y&lt;/span&gt; object, which is an instance of the built-in &lt;span
  class="code"&gt;Enumerator::Yielder&lt;/span&gt; class. You can see and use this
class for yourself in IRB as follows:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2013/4/3/irb1.png"/&gt;&lt;/p&gt;

&lt;p&gt;The yielder catches values I want the enumerator to generate, using the &lt;span
  class="code"&gt;yield&lt;/span&gt; method, and then later actually yields them to the
target block. As a Ruby developer, aside from calling &lt;span
  class="code"&gt;yield&lt;/span&gt; I don’t normally ever need to interact with the
generator or the yielder; they are used internally by the enumerator. When I
call &lt;span class="code"&gt;each&lt;/span&gt; on the enumerator, it uses these two
objects to generate and yield the values I want:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2013/4/3/enumerator-yields.png"/&gt;&lt;/p&gt;
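
&lt;p&gt;If you want to try the yielder on its own, outside of any enumerator, a
sketch like this one should work. The method name &lt;span
  class="code"&gt;collect_through_yielder&lt;/span&gt; is made up; the only real API
used is &lt;span class="code"&gt;Enumerator::Yielder.new&lt;/span&gt; with a block and
its &lt;span class="code"&gt;yield&lt;/span&gt; method.&lt;/p&gt;

&lt;pre&gt;
```ruby
# Enumerator::Yielder wraps a block; calling yield on it calls that block.
def collect_through_yielder
  received = []
  y = Enumerator::Yielder.new { |value| received.push(value) }
  y.yield 1   # an ordinary method call, not the yield keyword
  y.yield 2
  received
end

p collect_through_yielder
```
&lt;/pre&gt;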

&lt;h2&gt;Enumerators generate data; Enumerable methods consume it&lt;/h2&gt;

&lt;p&gt;Stepping back for a moment, the pattern we’ve seen so far with enumerations in Ruby is:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Enumerator objects produce data.&lt;/li&gt;
&lt;li&gt;Enumerable methods consume data.&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2013/4/3/each-and-yield.png"/&gt;&lt;/p&gt;

&lt;p&gt;From right to left, the enumerable method calls &lt;span class="code"&gt;each&lt;/span&gt;
to request data; later from left to right the enumerator object provides the
data by yielding it to a block.&lt;/p&gt;

&lt;div style="float: right; padding: 47px 0px 10px 30px;
line-height:16px"&gt;
  &lt;img src="http://patshaughnessy.net/assets/2013/4/3/work4.jpg"&gt;
&lt;/div&gt;


&lt;h2&gt;Enumerator::Lazy &amp;ndash; putting it all together&lt;/h2&gt;

&lt;p&gt;Ruby 2.0 implements lazy evaluation using an object called &lt;span
  class="code"&gt;Enumerator::Lazy&lt;/span&gt;.  What makes this special is that it
plays both roles! It is an enumerator, and also contains a series of &lt;span
  class="code"&gt;Enumerable&lt;/span&gt; methods. It calls &lt;span
  class="code"&gt;each&lt;/span&gt; to obtain data from an enumeration source, and it
yields data to the rest of an enumeration:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2013/4/3/left-and-right.png"/&gt;&lt;/p&gt;

&lt;p&gt;Since &lt;span class="code"&gt;Enumerator::Lazy&lt;/span&gt; plays both roles, you can
chain them up together to produce a single enumeration. This is what happens in
my infinite range example:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2013/4/3/code2.png"/&gt;&lt;/p&gt;

&lt;p&gt;The call to &lt;span class="code"&gt;lazy&lt;/span&gt; produces one &lt;span
  class="code"&gt;Enumerator::Lazy&lt;/span&gt; object. Then when I call &lt;span
  class="code"&gt;collect&lt;/span&gt; on this first object, the &lt;span
  class="code"&gt;Enumerator::Lazy#collect&lt;/span&gt; method returns a second one:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2013/4/3/lazy-chain.png"/&gt;&lt;/p&gt;

&lt;p&gt;You can see here that the second &lt;span class="code"&gt;Enumerator::Lazy&lt;/span&gt;, created by the call to
&lt;span class="code"&gt;Enumerator::Lazy#collect&lt;/span&gt;, also calls my block, the &lt;span class="code"&gt;x*x&lt;/span&gt; code.&lt;/p&gt;

&lt;p&gt;How does all of this work? How does &lt;span class="code"&gt;Enumerator::Lazy&lt;/span&gt;
do all of this? To serve both as a data producer and consumer, &lt;span
  class="code"&gt;Enumerator::Lazy&lt;/span&gt; uses generator and yielder objects in a
special way. The generator first calls &lt;span class="code"&gt;each&lt;/span&gt; to obtain
data, and then it passes each value it obtains immediately into a special
block:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2013/4/3/lazy-details.png"/&gt;&lt;/p&gt;

&lt;p&gt;Let’s take a closer look at the block from the diagram &amp;ndash; this block implements
the &lt;span class="code"&gt;Enumerator::Lazy#collect&lt;/span&gt; method. (The other lazy
enumeration methods use slightly different blocks.) Ruby implements it
internally using C code, but this is the equivalent Ruby code:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2013/4/3/lazy-map.png"/&gt;&lt;/p&gt;

&lt;p&gt;Reading the code, we can see the block takes a yielder and a value. Then it
yields the value to another block &amp;ndash; this is actually the block I provide to
&lt;span class="code"&gt;Enumerator::Lazy#collect&lt;/span&gt; or &lt;span
  class="code"&gt;x*x&lt;/span&gt; in my example. Then the &lt;span
  class="code"&gt;Enumerator::Lazy#collect&lt;/span&gt; block calls the yielder, passing
the result of my block onto the rest of the enumeration.&lt;/p&gt;
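
&lt;p&gt;Here is a runnable sketch of that pattern, written with the public &lt;span
  class="code"&gt;Enumerator::Lazy.new&lt;/span&gt; method. It is not Ruby’s actual
implementation, which is C code; the names &lt;span
  class="code"&gt;lazy_collect&lt;/span&gt; and &lt;span
  class="code"&gt;transform&lt;/span&gt; are hypothetical, and I pass the
transformation as a lambda to keep the two blocks easy to tell apart.&lt;/p&gt;

&lt;pre&gt;
```ruby
# Sketch of a collect-like lazy method built from Enumerator::Lazy.new.
# lazy_collect and transform are hypothetical names.
def lazy_collect(source, transform)
  Enumerator::Lazy.new(source) do |yielder, value|
    result = transform.call(value)   # hand the value to the caller's code
    yielder.yield(result)            # pass the result down the chain
  end
end

square = lambda { |x| x * x }
p lazy_collect(1..Float::INFINITY, square).first(5)
```
&lt;/pre&gt;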

&lt;p&gt;This is the key to lazy evaluation in Ruby. Each value from the data source is
yielded to my block, and then the result is immediately passed along down the
enumeration chain. This enumeration is not eager &amp;ndash; the &lt;span class="code"&gt;Enumerator::Lazy#collect&lt;/span&gt;
method does not collect the values into an array. Instead, each value is passed
one at a time along the chain of &lt;span class="code"&gt;Enumerator::Lazy&lt;/span&gt; objects, via repeated yields.
If I had chained together a series of calls to &lt;span class="code"&gt;collect&lt;/span&gt; or other
&lt;span class="code"&gt;Enumerator::Lazy&lt;/span&gt; methods, each value would be passed
along the chain from one of my blocks to the next, one at a time:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2013/4/3/lazy-chain2.png"/&gt;&lt;/p&gt;
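
&lt;p&gt;You can check the one-at-a-time behavior by logging each block call. In
this sketch the method name &lt;span class="code"&gt;trace_lazy_chain&lt;/span&gt;
and the second block (adding one) are assumptions made up for the
demonstration.&lt;/p&gt;

&lt;pre&gt;
```ruby
# Records the order in which two chained lazy blocks run.
def trace_lazy_chain
  log = []
  result = (1..Float::INFINITY).lazy
    .collect { |x| log.push("square #{x}"); x * x }
    .collect { |x| log.push("add one to #{x}"); x + 1 }
    .first(2)
  [result, log]
end

result, log = trace_lazy_chain
p result
puts log
```
&lt;/pre&gt;

&lt;p&gt;The log alternates between the two blocks: each value travels through the
whole chain before the next value is requested.&lt;/p&gt;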

&lt;h2&gt;Lazy evaluation: executing code backwards&lt;/h2&gt;

&lt;p&gt;Why is this chain lazy evaluation? Why does this allow Ruby to avoid an endless loop
and provide me with just the values I need? The answer is that the code at the
end of the enumeration chain, in my example the &lt;span
  class="code"&gt;first(10)&lt;/span&gt; method call, controls how long the enumeration
runs:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2013/4/3/code2.png"/&gt;&lt;/p&gt;

&lt;p&gt;At the end of the enumeration chain the values are yielded to the &lt;span class="code"&gt;Enumerable#first&lt;/span&gt;
method:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2013/4/3/lazy-chain-end.png"/&gt;&lt;/p&gt;

&lt;p&gt;After the &lt;span class="code"&gt;Enumerable#first&lt;/span&gt; method receives enough
values, 10 in my example, it stops the iteration by raising an exception.&lt;/p&gt;

&lt;p&gt;In other words, the code at the right side of my enumeration chain, the code at
the end, actually controls the execution flow. The &lt;span
  class="code"&gt;Enumerable#first&lt;/span&gt; both starts the iteration by calling
&lt;span class="code"&gt;each&lt;/span&gt; on the lazy enumerators, and ends the iteration
by raising an exception when it has enough values.&lt;/p&gt;
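
&lt;p&gt;To make the control flow concrete, here is a simplified stand-in for &lt;span
  class="code"&gt;first&lt;/span&gt;. It is only a sketch: the name &lt;span
  class="code"&gt;first_n&lt;/span&gt; is hypothetical, and it stops the iteration
with Ruby’s &lt;span class="code"&gt;break&lt;/span&gt; keyword rather than reproducing
whatever Ruby’s C code does internally.&lt;/p&gt;

&lt;pre&gt;
```ruby
# Simplified stand-in for first(n): starts the iteration, then stops it early.
def first_n(source, n)
  taken = []
  return taken if n == 0
  source.each do |value|          # calling each starts the whole chain running
    taken.push(value)
    break if taken.size == n      # enough values: stop asking for more
  end
  taken
end

squares = (1..Float::INFINITY).lazy.collect { |x| x * x }
p first_n(squares, 10)
```
&lt;/pre&gt;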

&lt;p&gt;At the end of the day, this is the key idea behind lazy evaluation: the
function or method at the end of a calculation chain starts the execution
process, and the program’s flow works backwards through the chain of function
calls until it obtains just the data inputs it needs. Ruby achieves this using
a chain of &lt;span class="code"&gt;Enumerator::Lazy&lt;/span&gt; objects, as we’ve seen
above. However, functional languages such as Haskell implement this in a
deeper, more fundamental way, that encompasses all execution and not just
enumeration.&lt;/p&gt;
</content>
  </entry>
  <entry>
    <title>An Interview With Laurent Sansonetti</title>
    <link href="http://patshaughnessy.net/2013/2/12/an-interview-with-laurent-sansonetti" rel="alternate" />
    <id>http://patshaughnessy.net/2013/2/12/an-interview-with-laurent-sansonetti</id>
    <published>2013-02-12T00:00:00Z</published>
    <updated>2013-02-12T00:00:00Z</updated>
    <category>ruby</category>
    <author>
      <name />
    </author>
    <summary type="html">&lt;div style="float: left; padding: 7px 30px 10px 0px"&gt;
&lt;table cellpadding="0" cellspacing="0" border="0"&gt;
  &lt;tr&gt;&lt;td&gt;&lt;img src="http://patshaughnessy.net/assets/2013/2/12/laurent3.png"&gt;&lt;/td&gt;&lt;/tr&gt;
  &lt;tr&gt;&lt;td align="center"&gt;&lt;i&gt;Laurent Sansonetti created &lt;a href="http://www.rubymotion.com/"&gt;RubyMotion&lt;/a&gt;&lt;/i&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;
&lt;/div&gt;


&lt;p&gt;It was only last April when Laurent Sansonetti captured the imagination of the&lt;/p&gt;
</summary>
    <content type="html">&lt;div style="float: left; padding: 7px 30px 10px 0px"&gt;
&lt;table cellpadding="0" cellspacing="0" border="0"&gt;
  &lt;tr&gt;&lt;td&gt;&lt;img src="http://patshaughnessy.net/assets/2013/2/12/laurent3.png"&gt;&lt;/td&gt;&lt;/tr&gt;
  &lt;tr&gt;&lt;td align="center"&gt;&lt;i&gt;Laurent Sansonetti created &lt;a href="http://www.rubymotion.com/"&gt;RubyMotion&lt;/a&gt;&lt;/i&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;
&lt;/div&gt;


&lt;p&gt;It was only last April when Laurent Sansonetti captured the imagination of the
entire Ruby community with &lt;a href="http://www.rubymotion.com/"&gt;RubyMotion&lt;/a&gt;. For the first time Ruby
developers were able to write apps directly for the iOS mobile
platform using the language we all know and love. Now there is no
longer a need to learn the archaic, verbose Objective-C language, or
worse yet to learn to navigate the confusing myriad of windows and dialog boxes in
the Xcode IDE. We can just bring the magic of Ruby to iOS, using our favorite
tools: Vim, Sublime or Emacs with our usual command-line-based workflow.&lt;/p&gt;

&lt;p&gt;I was thrilled earlier this month when Laurent agreed to do an interview with
me for RubySource. I was able to learn more about RubyMotion directly from its
inventor. Since I had so many questions, I’ve divided the interview into two
halves. &lt;a href="http://rubysource.com/getting-to-know-rubymotion-with-laurent-sansonetti/"&gt;In the first
half&lt;/a&gt;,
I ask Laurent some basic questions: What is RubyMotion, exactly? What does it
do? How should we use it? Is writing Ruby for iOS any different from writing a
“normal” Ruby app?&lt;/p&gt;

&lt;p&gt;&lt;a href="http://rubysource.com/laurent-sansonetti-on-rubymotion-internals"&gt;The second half of the
interview&lt;/a&gt; is
a bit more technical: How does RubyMotion work, exactly? How does Laurent use
the LLVM framework to compile Ruby code into a static executable? How does it
differ from the way MacRuby and Rubinius work? I love learning about Ruby
internals, and RubyMotion is certainly a unique and fascinating implementation
of Ruby.&lt;/p&gt;
</content>
  </entry>
  <entry>
    <title>Ruby MRI Source Code Idioms #3: Embedded Objects</title>
    <link href="http://patshaughnessy.net/2013/2/8/ruby-mri-source-code-idioms-3-embedded-objects" rel="alternate" />
    <id>http://patshaughnessy.net/2013/2/8/ruby-mri-source-code-idioms-3-embedded-objects</id>
    <published>2013-02-08T00:00:00Z</published>
    <updated>2013-02-08T00:00:00Z</updated>
    <category>ruby</category>
    <author>
      <name />
    </author>
    <summary type="html">&lt;p&gt;&lt;a href="http://patshaughnessy.net/2012/1/4/never-create-ruby-strings-longer-than-23-characters"&gt;Last year I wrote a
post&lt;/a&gt;
about how the core team optimized Ruby to process shorter strings faster than
longer strings. I found that Ruby strings containing 23 or fewer
characters are much faster. Why am I bringing this up again now? Well, it
turns out this isn’t a single optimization that the core team has added for&lt;/p&gt;
</summary>
    <content type="html">&lt;p&gt;&lt;a href="http://patshaughnessy.net/2012/1/4/never-create-ruby-strings-longer-than-23-characters"&gt;Last year I wrote a
post&lt;/a&gt;
about how the core team optimized Ruby to process shorter strings faster than
longer strings. I found that Ruby strings containing 23 or fewer
characters are much faster. Why am I bringing this up again now? Well, it
turns out this isn’t a single optimization that the core team has added for
short strings. Instead, they’ve used the same technique in many other places as
well. For example, if you create an array with only one, two or three elements,
it’s much faster than if you create an array with four or more elements:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2013/2/8/array-chart.png"/&gt;&lt;/p&gt;
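
&lt;p&gt;If you would like to try a measurement like this yourself, the sketch
below shows one possible approach using the standard &lt;span
  class="code"&gt;Benchmark&lt;/span&gt; module. It is not the script I used for the
chart; the helper name &lt;span class="code"&gt;time_array_creation&lt;/span&gt; and the
iteration count are assumptions, and the timings will vary with your Ruby
version and machine.&lt;/p&gt;

&lt;pre&gt;
```ruby
require 'benchmark'

# Hypothetical helper: time how long it takes to build many arrays of one size.
def time_array_creation(size, iterations = 1_000_000)
  Benchmark.realtime do
    iterations.times { Array.new(size) }
  end
end

(1..6).each do |size|
  seconds = time_array_creation(size)
  puts "size #{size}: #{(seconds * 1000).round(1)} ms"
end
```
&lt;/pre&gt;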

&lt;p&gt;Or if you create a Struct object, it’s much faster when there are three or
fewer attributes:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2013/2/8/struct-setup.png"/&gt;
&lt;img src="http://patshaughnessy.net/assets/2013/2/8/struct-chart.png"/&gt;&lt;/p&gt;

&lt;p&gt;The same pattern also appears if you create large integer values using the
Bignum class:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2013/2/8/bignum-chart.png"/&gt;&lt;/p&gt;

&lt;p&gt;Even your own Ruby objects are faster if they contain three or fewer
instance variables:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2013/2/8/ivars.png"/&gt;&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2013/2/8/object-chart.png"/&gt;&lt;/p&gt;

&lt;p&gt;Finally, here’s the data showing the same optimization for Ruby strings that I
wrote about last year &amp;ndash; you can see strings containing 23 or fewer characters
are faster:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2013/2/8/string-chart.png"/&gt;&lt;/p&gt;

&lt;p&gt;So should you stop and refactor all of your code to use small arrays, short
strings, and objects with fewer than four instance variables? Of course not!
That’s obviously ridiculous.&lt;/p&gt;

&lt;p&gt;Also, I’ve exaggerated the performance difference by running these lines of
code in a tight loop, executing the same line of code over and over again. In a
more realistic Ruby program the speedup produced by using shorter strings or
small arrays would be mixed in with many other types of operations and code.
The speed of most Ruby applications has more to do with database connections,
network latency and other factors. And, of course, if you’re developing a Rails
web site &amp;ndash; or more generally using lots of different gems &amp;ndash; then your own Ruby
code is probably a small fraction of the Ruby code used across your entire
application. Bizarre refactoring to use fewer instance variables wouldn’t help
you much anyway.&lt;/p&gt;

&lt;p&gt;Instead, the reason I’m bringing these optimizations to your attention is:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&amp;hellip;to give you a small taste of all the hard work the Ruby core team has done
to speed up your code. The core team has put in countless hours of work to
squeeze every bit of performance out of Ruby they could to make your code
run faster. To make these optimizations work they had to add
many lines of complex, additional C code inside of Ruby.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&amp;hellip;to make it easier for you to follow the C source code inside of Ruby. If
you’re interested in learning how your language actually works internally then
you’ll need to understand the coding patterns behind these optimizations.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&amp;hellip;because it’s fun to see how these things actually work!&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;h2&gt;A verbose optimization&lt;/h2&gt;

&lt;p&gt;There are many places within Ruby’s C source code where small objects &amp;ndash; the
objects corresponding to the shorter bars in the charts above &amp;ndash; are handled
differently than larger objects. This is such a common pattern that I consider
it an &lt;em&gt;MRI idiom&lt;/em&gt;. In order to understand how many of the built-in functions in
the &lt;span class="code"&gt;String&lt;/span&gt;, &lt;span class="code"&gt;Array&lt;/span&gt;, &lt;span
  class="code"&gt;Struct&lt;/span&gt;, or &lt;span class="code"&gt;Bignum&lt;/span&gt; classes work
you need to understand the coding pattern behind this optimization. And, as we
saw above, Ruby also uses this pattern when handling instance variables in your
own classes.&lt;/p&gt;

&lt;p&gt;I call these smaller, faster objects “Embedded Objects,” based on the name of
certain C constants used inside of Ruby. For example, here’s the C code that
Ruby uses to create a new array of a certain size or “capacity”:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2013/2/8/ary_new.png"/&gt;&lt;/p&gt;

&lt;p&gt;As you can see, arrays longer than &lt;span class="code"&gt;RARRAY_EMBED_LEN_MAX&lt;/span&gt; are handled
differently than shorter arrays. What’s the value of &lt;span class="code"&gt;RARRAY_EMBED_LEN_MAX&lt;/span&gt;? It
turns out it is 3. This explains the behavior in the chart above.&lt;/p&gt;

&lt;p&gt;Here’s another example &amp;ndash; whenever you increase the size of a string, for
example by calling &lt;span class="code"&gt;String#&amp;lt;&amp;lt;&lt;/span&gt; or &lt;span
  class="code"&gt;String#insert&lt;/span&gt;, Ruby uses this code:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2013/2/8/rb_str_modify_expand.png"/&gt;&lt;/p&gt;

&lt;p&gt;Here again, we can see Ruby handles longer strings differently than shorter
strings, using the value &lt;span class="code"&gt;RSTRING_EMBED_LEN_MAX&lt;/span&gt;. What
is this set to? Well, from the performance chart above we know it must be 23.&lt;/p&gt;

&lt;p&gt;Finally, here’s the code Ruby uses to create new &lt;span
  class="code"&gt;Struct&lt;/span&gt; objects:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2013/2/8/struct_alloc.png"/&gt;&lt;/p&gt;

&lt;p&gt;Once again you can see structs with &lt;span
  class="code"&gt;RSTRUCT_EMBED_LEN_MAX&lt;/span&gt; or fewer members are handled differently
than structs with more members. What’s the value of &lt;span
  class="code"&gt;RSTRUCT_EMBED_LEN_MAX&lt;/span&gt;? It must be 3, based on the chart
above.&lt;/p&gt;

&lt;p&gt;These are just three simple examples; if you go and look you’ll find that these
“EMBED” constants appear in many places inside Ruby’s implementation of these 4
built-in classes, along with the code that handles instance variables in your
objects. Each time one of these constants appears, there will also be code the
Ruby core team had to write to handle embedded objects differently &amp;ndash; to make
your code run a few microseconds faster!&lt;/p&gt;

&lt;p&gt;To summarize, here are the 5 C constants Ruby uses as a threshold for embedded
objects, and their values:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2013/2/8/defines.png"/&gt;&lt;/p&gt;

&lt;p&gt;You can find these values in the include/ruby/ruby.h file. As you can see,
each of these corresponds to one of the performance patterns you see in the
charts above. For the Bignum class, Ruby uses the &lt;span
  class="code"&gt;RBIGNUM_EMBED_LEN_MAX&lt;/span&gt; value to keep track of how many
&lt;span class="code"&gt;BDIGIT&lt;/span&gt; structures will fit into a single &lt;span
  class="code"&gt;RBignum&lt;/span&gt; structure.  Ruby uses these &lt;span
  class="code"&gt;BDIGIT&lt;/span&gt; structures to hold large integer values, and
allocates more of them as necessary to represent very large integers.&lt;/p&gt;
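
&lt;p&gt;As a rough illustration, the threshold check these snippets share can be
modeled in a few lines of Ruby. This is only a sketch: the &lt;span
  class="code"&gt;EMBED_LEN_MAX&lt;/span&gt; hash and the &lt;span
  class="code"&gt;storage_for&lt;/span&gt; method are hypothetical names of mine, not
part of MRI, and the values are the ones quoted above for arrays, strings and
structs.&lt;/p&gt;

&lt;pre&gt;

```ruby
# Thresholds quoted in the article; MRI's real definitions live in
# include/ruby/ruby.h. This hash is a hypothetical stand-in for them.
EMBED_LEN_MAX = { array: 3, string: 23, struct: 3 }.freeze

# Hypothetical helper: decides where a value of the given length would be
# stored. Anything longer than the threshold goes to the heap.
def storage_for(type, length)
  length > EMBED_LEN_MAX.fetch(type) ? :heap : :embedded
end

puts storage_for(:array, 3)    # embedded
puts storage_for(:array, 4)    # heap
puts storage_for(:string, 23)  # embedded
puts storage_for(:string, 24)  # heap
```

&lt;/pre&gt;

&lt;p&gt;The real C code does more work on each side of the branch, of course, but
the comparison against an “EMBED” constant is the part to look for.&lt;/p&gt;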

&lt;h2&gt;The C “union” keyword&lt;/h2&gt;

&lt;p&gt;Above I showed a few places where these “EMBED” constants appear in Ruby’s
source code, but the most important place these constants appear is in the C
structure definitions for these object types. For example, &lt;a href="http://patshaughnessy.net/2013/1/23/ruby-mri-source-code-idioms-1-accessing-data-via-macros"&gt;as I explained two
weeks
ago&lt;/a&gt;,
Ruby represents every array object using the &lt;span class="code"&gt;RArray&lt;/span&gt;
structure:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2013/2/8/union.png"/&gt;&lt;/p&gt;

&lt;p&gt;Here I’ve shown the &lt;span class="code"&gt;RArray&lt;/span&gt; struct separated into two
pieces: the top rectangle shows how larger arrays with 4 or more elements save
their data, and the lower rectangle shows how shorter arrays with three or fewer
elements work.  The key to this is the &lt;span class="code"&gt;union&lt;/span&gt; keyword,
which is a trick you can use in the C language to indicate the same memory
segment can be used in more than one way:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2013/2/8/rarray-memory.png"/&gt;&lt;/p&gt;

&lt;p&gt;By using the &lt;span class="code"&gt;union&lt;/span&gt; keyword, the C compiler allows you
to access either the values on the top via the &lt;span class="code"&gt;heap&lt;/span&gt;
structure, the first member of the union, or the values on the bottom, inside
the &lt;span class="code"&gt;ary&lt;/span&gt; array, the second member of the union.&lt;/p&gt;
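
&lt;p&gt;Ruby itself has no &lt;span class="code"&gt;union&lt;/span&gt; keyword, but the idea
can be imitated with a packed string of bytes. In this sketch, which assumes
8-byte words, the same 24 bytes are written as three array elements and then
read back as three named fields. The field order I use for the “heap” view is
an assumption for illustration, not a statement about MRI’s exact layout.&lt;/p&gt;

&lt;pre&gt;

```ruby
# 24 bytes standing in for the memory the union occupies: three 8-byte words.
memory = [0, 0, 0].pack("Q3")

# Write through the "ary" view: three element values stored right in place.
memory[0, 24] = [10, 20, 30].pack("Q3")

# Read the very same bytes through the "heap" view. The order of these
# names is assumed here purely for illustration.
len, ptr, capa = memory.unpack("Q3")

puts memory.bytesize            # 24
puts [len, ptr, capa].inspect   # [10, 20, 30]
```

&lt;/pre&gt;

&lt;p&gt;Nothing in the bytes themselves says which view is the right one, which is
why Ruby needs a separate flag to remember how each array was stored.&lt;/p&gt;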

&lt;h2&gt;Accessing embedded objects via macros&lt;/h2&gt;

&lt;p&gt;&lt;a href="http://patshaughnessy.net/2013/1/23/ruby-mri-source-code-idioms-1-accessing-data-via-macros"&gt;As I also wrote about two weeks
ago&lt;/a&gt;,
Ruby uses a series of C macros to access the data inside an array, string or
most other built-in object types. The Ruby core team, fortunately, also uses
these macros to hide some of the complexity around embedded objects.&lt;/p&gt;

&lt;p&gt;To see what I mean, here’s the definition of the &lt;span
  class="code"&gt;RARRAY_PTR&lt;/span&gt; macro  &amp;ndash; Ruby uses this to get a pointer to an
array’s elements:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2013/2/8/rarray-ptr.png"/&gt;&lt;/p&gt;

&lt;p&gt;Every time Ruby needs access to the contents of an array, it runs the code
found inside this macro: First, it uses a second macro, &lt;span
  class="code"&gt;RBASIC&lt;/span&gt;, to get access to some internal flags stored
inside the inner &lt;span class="code"&gt;RBasic&lt;/span&gt; structure. One of these flags
is called &lt;span class="code"&gt;RARRAY_EMBED_FLAG&lt;/span&gt;. If &lt;span
  class="code"&gt;RARRAY_EMBED_FLAG&lt;/span&gt; is set, then Ruby knows this array is
an embedded object, and therefore looks for the array’s elements in &lt;span
  class="code"&gt;as.ary&lt;/span&gt; &amp;ndash; or the array located right inside the &lt;span
  class="code"&gt;RArray&lt;/span&gt; struct. If &lt;span
  class="code"&gt;RARRAY_EMBED_FLAG&lt;/span&gt; is not set, then Ruby looks for the
array’s elements in the usual way by following the &lt;span class="code"&gt;ptr&lt;/span&gt; pointer to another
memory block in the heap.&lt;/p&gt;
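
&lt;p&gt;The decision described above can be sketched in Ruby as follows. Everything
here is a model: &lt;span class="code"&gt;ModelArray&lt;/span&gt;, &lt;span
  class="code"&gt;EMBED_BIT&lt;/span&gt; and &lt;span
  class="code"&gt;model_rarray_ptr&lt;/span&gt; are hypothetical names, the bit position
is made up, and the two storage locations are separate fields rather than a
shared union.&lt;/p&gt;

&lt;pre&gt;

```ruby
# Hypothetical stand-in for RArray: a flags word plus the two places the
# elements can live.
ModelArray = Struct.new(:flags, :embedded, :heap)

EMBED_BIT = 0  # assumed bit position, for illustration only

# Mirrors the choice RARRAY_PTR makes: test the flag, then pick a location.
def model_rarray_ptr(ary)
  ary.flags[EMBED_BIT] == 1 ? ary.embedded : ary.heap
end

small = ModelArray.new(0b1, [1, 2, 3], nil)
large = ModelArray.new(0b0, nil, [1, 2, 3, 4, 5])

puts model_rarray_ptr(small).inspect  # [1, 2, 3]
puts model_rarray_ptr(large).inspect  # [1, 2, 3, 4, 5]
```

&lt;/pre&gt;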

&lt;h2&gt;Learn the idiom once and use it many times&lt;/h2&gt;

&lt;p&gt;As I said above, by learning just a few coding patterns, you can quickly start
to understand large parts of Ruby’s internal source code. Since the embedded
object pattern is used by five different types of objects, it makes a lot of
sense to spend some time learning how it works. By learning a few more MRI
idioms, you’ll start to think like a member of the Ruby core team! Stay tuned:
next time we’ll look at another common coding pattern used by Matz and his
colleagues&amp;hellip;&lt;/p&gt;
</content>
  </entry>
  <entry>
    <title>Ruby MRI Source Code Idioms #2: C That Resembles Ruby</title>
    <link href="http://patshaughnessy.net/2013/1/31/ruby-mri-source-code-idioms-2-c-that-resembles-ruby" rel="alternate" />
    <id>http://patshaughnessy.net/2013/1/31/ruby-mri-source-code-idioms-2-c-that-resembles-ruby</id>
    <published>2013-01-31T00:00:00Z</published>
    <updated>2013-01-31T00:00:00Z</updated>
    <category>ruby</category>
    <author>
      <name />
    </author>
    <summary type="html">&lt;div style="float: right; margin: 8px 5px 20px 25px; line-height:16px;"&gt;
  &lt;table cellpadding="0" cellspacing="0" border="0"&gt;
    &lt;tr&gt;&lt;td align="center" style="background-color: rgb(248, 248, 255);padding: 5px;"&gt;&lt;img
    src="http://patshaughnessy.net/assets/2013/1/31/c-ruby.png"&gt;&lt;/td&gt;&lt;/tr&gt;
    &lt;tr&gt;&lt;td align="center"&gt;&lt;i&gt;Reading Ruby’s C source code can be&lt;br/&gt;as easy as reading your own Ruby code&lt;/i&gt;&lt;/td&gt;&lt;/tr&gt;

</summary>
    <content type="html">&lt;div style="float: right; margin: 8px 5px 20px 25px; line-height:16px;"&gt;
  &lt;table cellpadding="0" cellspacing="0" border="0"&gt;
    &lt;tr&gt;&lt;td align="center" style="background-color: rgb(248, 248, 255);padding: 5px;"&gt;&lt;img
    src="http://patshaughnessy.net/assets/2013/1/31/c-ruby.png"&gt;&lt;/td&gt;&lt;/tr&gt;
    &lt;tr&gt;&lt;td align="center"&gt;&lt;i&gt;Reading Ruby’s C source code can be&lt;br/&gt;as easy as reading your own Ruby code&lt;/i&gt;&lt;/td&gt;&lt;/tr&gt;
  &lt;/table&gt;
&lt;/div&gt;


&lt;p&gt;&lt;a href="http://patshaughnessy.net/2013/1/23/ruby-mri-source-code-idioms-1-accessing-data-via-macros"&gt;Last
week&lt;/a&gt;
I discussed how Ruby’s C source code uses macros to access data values. I
explained that this “MRI Idiom” can make Ruby’s source a bit confusing for C
programmers to read, but at the same time can make it easier to follow for Ruby
developers who aren’t experienced with C. Today I want to continue this series
and talk about another MRI idiom: how Ruby’s C source code frequently resembles
Ruby code.&lt;/p&gt;

&lt;p&gt;Sounds hard to believe, doesn’t it? At first glance MRI’s C source code looks
nothing like Ruby. For example, take a look at the implementation of
&lt;span class="code"&gt;Array#collect&lt;/span&gt;:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2013/1/31/rb_ary_collect1.png"/&gt;&lt;/p&gt;

&lt;p&gt;This is typical C code: verbose, confusing and hard to understand. Ruby is
supposed to be elegant and concise! However, I honestly believe this C code
does resemble Ruby, and in fact implements &lt;span
  class="code"&gt;Array#collect&lt;/span&gt; just the way you would if you were to
implement this in Ruby.&lt;/p&gt;

&lt;p&gt;Of course, we don’t need to imagine how we would implement this in Ruby &amp;ndash; there
already is a Ruby version of &lt;span class="code"&gt;Array#collect&lt;/span&gt; (actually
&lt;span class="code"&gt;Enumerable#collect&lt;/span&gt;) in &lt;a href="http://rubini.us"&gt;Rubinius&lt;/a&gt;:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2013/1/31/rubinius1.png"/&gt;&lt;/p&gt;

&lt;p&gt;Now let’s take a second look at MRI’s C implementation &amp;ndash; if you have a vivid
imagination you can see how the MRI C code corresponds to the Ruby used by
Rubinius:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2013/1/31/rb_ary_collect2.png"/&gt;&lt;/p&gt;

&lt;p&gt;My point is that by learning a few of MRI’s idioms and coding patterns, you can
begin to read Ruby’s C source code just as easily as you can read Rubinius’s
Ruby implementation.&lt;/p&gt;

&lt;p&gt;Why would you want to do this? Why not refer to the Rubinius source code base
whenever you have a question about how something works? This is actually a
great idea; Rubinius is a particularly beautiful implementation of Ruby, and
reading its code can give you a lot of insight into what’s going on inside of
Ruby. However, most of us don’t run Rubinius in production. If you really need
to know how something works at a detailed level&amp;hellip; why it is slow, why it is
fast, how or when it allocates memory, etc., then there is no alternative
but to read the C code you are actually running in production.&lt;/p&gt;

&lt;p&gt;Today I’m going to examine how the C constructs in &lt;span
  class="code"&gt;rb_ary_collect&lt;/span&gt; work in detail, and show how we can
replace them &amp;ndash; in our minds at least &amp;ndash; with the corresponding Ruby code. By
investing a bit of time to learn just a few of MRI’s coding patterns you’ll be
able to understand not only &lt;span class="code"&gt;rb_ary_collect&lt;/span&gt;, but many
of the built-in methods from classes such as String, Array, or File that you
use every day in your code. Understanding MRI’s idioms will allow you to follow
much of Ruby’s internal source code without being an expert C developer.&lt;/p&gt;

&lt;h2&gt;First, a review of what Array#collect does&lt;/h2&gt;

&lt;p&gt;Let’s start by reviewing how the &lt;span class="code"&gt;collect&lt;/span&gt; method works in Ruby. Here’s the
example from the Ruby documentation:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2013/1/31/ruby-example1.png"/&gt;&lt;/p&gt;

&lt;p&gt;Most of the time we use &lt;span class="code"&gt;collect&lt;/span&gt; to iterate over an array (or some other
Enumerable) and call a block for each element. As it goes, &lt;span class="code"&gt;collect&lt;/span&gt; pushes the
return value from each call to the block into a single new array, and finally returns
it.&lt;/p&gt;

&lt;p&gt;However, if you call &lt;span class="code"&gt;collect&lt;/span&gt; without a block it returns an enumerator object
you can use later:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2013/1/31/ruby-example2.png"/&gt;&lt;/p&gt;
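
&lt;p&gt;In plain text, the two behaviors look roughly like this. The array contents
and variable names are my own, chosen only for illustration:&lt;/p&gt;

&lt;pre&gt;

```ruby
letters = ["a", "b", "c", "d"]

# With a block: collect returns a new array holding the block's return values.
shouted = letters.collect { |x| x + "!" }
puts shouted.inspect   # ["a!", "b!", "c!", "d!"]
puts letters.inspect   # ["a", "b", "c", "d"]

# Without a block: collect returns an Enumerator that can be used later.
enum = letters.collect
puts enum.class        # Enumerator

# For example, the enumerator can be combined with with_index.
numbered = enum.with_index { |x, i| "#{i}:#{x}" }
puts numbered.inspect  # ["0:a", "1:b", "2:c", "3:d"]
```

&lt;/pre&gt;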

&lt;p&gt;Now let’s see how Ruby implements &lt;span class="code"&gt;Array#collect&lt;/span&gt; internally. I’ll do this by
replacing bits of the confusing C code with Ruby, step by step. As you’ll see,
this isn’t that hard to do!&lt;/p&gt;

&lt;h2&gt;rb_ary_new2 = Array.new&lt;/h2&gt;

&lt;p&gt;This oddly named C function is actually quite simple: it just creates a new
Array with a length of zero, and the given “capacity.” As I explained last
week, internally Ruby saves a “capacity” value inside of each array, in the
&lt;span class="code"&gt;RArray&lt;/span&gt; structure, which keeps track of the size of
the memory actually allocated for the array. The confusing part of this
function is the name: &lt;span class="code"&gt;rb_ary_new2&lt;/span&gt; doesn’t create two
arrays, or call the “new2” method on the Array class. It’s simply equivalent to
calling &lt;span class="code"&gt;Array.new&lt;/span&gt; in Ruby. MRI uses the number “2” on
the function name to distinguish it from other functions that also create
arrays in slightly different ways, for example without the internal capacity
setting. Let’s take a look at what &lt;span class="code"&gt;rb_ary_new2&lt;/span&gt; does:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2013/1/31/rb_ary_new2.png"/&gt;&lt;/p&gt;

&lt;p&gt;You can see it just calls &lt;span class="code"&gt;ary_new&lt;/span&gt; with the given
capacity value, and by passing in &lt;span class="code"&gt;rb_cArray&lt;/span&gt; indicates
we want to create a new instance of the Array class (sometimes Ruby uses the
&lt;span class="code"&gt;RArray&lt;/span&gt; struct for instances of other classes):&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2013/1/31/ary_new.png"/&gt;&lt;/p&gt;

&lt;p&gt;I won’t explain this in detail, but you can see at the top Ruby checks the
capacity parameter is valid, and gets a new &lt;span class="code"&gt;RArray&lt;/span&gt;
struct using the &lt;span class="code"&gt;ary_alloc&lt;/span&gt; function. Finally it sets
up this new &lt;span class="code"&gt;RArray&lt;/span&gt; if necessary. I’ll explain the
details around &lt;span class="code"&gt;RARRAY_EMBED_LEN_MAX&lt;/span&gt; in my next post.&lt;/p&gt;

&lt;p&gt;Now let’s return to &lt;span class="code"&gt;rb_ary_collect&lt;/span&gt; and substitute the
Ruby &lt;span class="code"&gt;Array.new&lt;/span&gt; call into our C code &amp;ndash; just as a
thought experiment, of course!&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2013/1/31/substitute1.png"/&gt;&lt;/p&gt;

&lt;h2&gt;for-loop = Array#each&lt;/h2&gt;

&lt;p&gt;Next, let’s look at how Ruby internally iterates over the array’s elements:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2013/1/31/for-loop.png"/&gt;&lt;/p&gt;

&lt;p&gt;Of course, in Ruby I would never use a for-loop like this; instead I would
call &lt;span class="code"&gt;Array#each&lt;/span&gt; and pass each element to a block. But remember, in the C
language there is no concept of blocks or enumerators. Instead, you have to
code “closer to the metal” and explain to the C compiler exactly how it should
iterate through the array. Here’s how this works: first the code above creates
a loop and assigns the values 0, 1, 2… to the variable &lt;span
  class="code"&gt;i&lt;/span&gt;. Next, Ruby accesses each value of the array using this
syntax:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2013/1/31/c-array.png"/&gt;&lt;/p&gt;

&lt;p&gt;As I explained last week, the &lt;span class="code"&gt;RARRAY_PTR&lt;/span&gt; macro returns a
pointer to the array’s actual data, and &lt;span class="code"&gt;[i]&lt;/span&gt; uses C’s
array syntax to obtain the proper element of the array.&lt;/p&gt;

&lt;p&gt;Now, in our thought experiment, if we replace the for-loop with a call to
&lt;span class="code"&gt;Array#each&lt;/span&gt;, passing a block parameter, we get:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2013/1/31/substitute2.png"/&gt;&lt;/p&gt;

&lt;p&gt;Now this C code is starting to make more sense!&lt;/p&gt;

&lt;h2&gt;rb_yield = yield&lt;/h2&gt;

&lt;p&gt;Now let’s take a look at what happens inside the loop. As you can see, Ruby
takes each element of the array and passes it to a C function called &lt;span
  class="code"&gt;rb_yield&lt;/span&gt;. As you might guess, this is Ruby’s internal
implementation of the Ruby &lt;span class="code"&gt;yield&lt;/span&gt; keyword. I don’t
have time or space today to explain how &lt;span class="code"&gt;rb_yield&lt;/span&gt;
works in detail here &amp;ndash; it calls into the internal guts of the YARV virtual
machine that runs your Ruby program. For a good explanation of how YARV works
and of what Ruby does internally when you call a block, check out chapters 2
and 5 from &lt;a href="http://patshaughnessy.net/ruby-under-a-microscope"&gt;Ruby Under a
Microscope&lt;/a&gt;, my eBook on
Ruby internals.&lt;/p&gt;

&lt;p&gt;Let’s continue the thought experiment and substitute &lt;span
  class="code"&gt;rb_yield&lt;/span&gt; with a simple Ruby &lt;span
  class="code"&gt;yield&lt;/span&gt; keyword:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2013/1/31/substitute3.png"/&gt;&lt;/p&gt;

&lt;h2&gt;rb_ary_push = Array#&amp;lt;&amp;lt;&lt;/h2&gt;

&lt;p&gt;Next, let’s take a look at the &lt;span class="code"&gt;rb_ary_push&lt;/span&gt; function call. As you might guess,
this is the C function behind &lt;span class="code"&gt;Array#&amp;lt;&amp;lt;&lt;/span&gt;: it adds a new value to the end of the
array. Let’s take a quick look at the implementation of this, much farther
above in the same array.c MRI source code file:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2013/1/31/rb_ary_push.png"/&gt;&lt;/p&gt;

&lt;p&gt;I won’t explain this code carefully, but in a nutshell Ruby uses another C
function called &lt;span class="code"&gt;rb_ary_push_1&lt;/span&gt; as an optimization when
you push a single new element. The related &lt;span class="code"&gt;Array#push&lt;/span&gt;
method can possibly take more than one parameter, so it’s handled slightly
differently inside of MRI.&lt;/p&gt;

&lt;p&gt;An interesting detail here is how Ruby doubles the internal capacity of the array
when there’s no room for another value, based on the “capacity” value. This is
an optimization to avoid calling &lt;span class="code"&gt;malloc&lt;/span&gt; to allocate
memory over and over again as you push elements onto the array. Allocating (or
reallocating) memory can often be an expensive operation.&lt;/p&gt;
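
&lt;p&gt;To get a feel for why doubling helps, here is a small Ruby model that counts
how many reallocations a growing array would need under two policies. The &lt;span
  class="code"&gt;reallocations&lt;/span&gt; method is hypothetical, and the starting
capacity of 4 and the 1000 pushes are arbitrary numbers I picked; MRI’s real
growth logic has more details than this.&lt;/p&gt;

&lt;pre&gt;

```ruby
# Hypothetical model: count how often a growing array has to reallocate.
# The block receives the current capacity and returns the new one.
def reallocations(pushes, start_capa)
  capa = start_capa
  len = 0
  count = 0
  pushes.times do
    if len == capa           # no room left for another value
      capa = yield(capa)     # grow according to the chosen policy
      count += 1
    end
    len += 1
  end
  count
end

doubling   = reallocations(1000, 4) { |capa| capa * 2 }
one_by_one = reallocations(1000, 4) { |capa| capa + 1 }

puts doubling    # 8
puts one_by_one  # 996
```

&lt;/pre&gt;

&lt;p&gt;In this model, doubling needs only a handful of reallocations, while growing
one slot at a time reallocates on almost every push.&lt;/p&gt;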

&lt;p&gt;Following the same pattern, I’ll substitute a call to &lt;span
  class="code"&gt;Array#&amp;lt;&amp;lt;&lt;/span&gt; into my original C function &amp;ndash; now the C
code is looking more and more like the Rubinius implementation:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2013/1/31/substitute4.png"/&gt;&lt;/p&gt;

&lt;h2&gt;RETURN_ENUMERATOR = Kernel.to_enum&lt;/h2&gt;

&lt;p&gt;As you can see, there’s just one last bit of confusing C code left from the
original version of &lt;span class="code"&gt;rb_ary_collect&lt;/span&gt; &amp;ndash; the &lt;span
  class="code"&gt;RETURN_ENUMERATOR&lt;/span&gt; macro. Let’s take a look at how this
macro is written, from include/ruby/intern.h:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2013/1/31/return_enumerator.png"/&gt;&lt;/p&gt;

&lt;p&gt;Ah yes… typical C verbosity! We have a multiline macro with backslashes, and a
seemingly needless do&amp;hellip;while loop wrapped around the macro body to provide a safe
scope for substitution. What in the world does this mean?&lt;/p&gt;

&lt;p&gt;Don’t panic! This code isn’t that hard to understand if you take a moment to
read it carefully; it essentially means: if a block was not given then call the
&lt;span class="code"&gt;rb_enumeratorize&lt;/span&gt; function, passing in the name of the
current method as a parameter. Then return the result as the return value
for &lt;span class="code"&gt;rb_ary_collect&lt;/span&gt;.&lt;/p&gt;

&lt;p&gt;Ruby uses this &lt;span class="code"&gt;RETURN_ENUMERATOR&lt;/span&gt; macro quite often
while implementing methods related to enumeration, such as &lt;span
  class="code"&gt;collect&lt;/span&gt; or &lt;span class="code"&gt;each&lt;/span&gt;. You can find
the &lt;span class="code"&gt;rb_enumeratorize&lt;/span&gt; function in the enumerator.c MRI
source code file, but I won’t explain it here. There’s some complex code that
eventually does the same thing a call to &lt;span
  class="code"&gt;Kernel.to_enum&lt;/span&gt; does &amp;ndash; which is to return a new enumerator
object that is initialized with the values in the current array.&lt;/p&gt;

&lt;p&gt;Replacing this macro with the equivalent call to &lt;span
  class="code"&gt;Kernel.to_enum&lt;/span&gt;, I get:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2013/1/31/substitute5.png"/&gt;&lt;/p&gt;
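
&lt;p&gt;Written out as runnable Ruby, the end result of the thought experiment might
look something like the sketch below. The method name &lt;span
  class="code"&gt;collect_sketch&lt;/span&gt; is hypothetical, and I take the array as a
parameter rather than reopening the Array class. I also use the single-argument
form of &lt;span class="code"&gt;push&lt;/span&gt;, which appends one value just as &lt;span
  class="code"&gt;Array#&amp;lt;&amp;lt;&lt;/span&gt; does.&lt;/p&gt;

&lt;pre&gt;

```ruby
# A plain-Ruby reading of rb_ary_collect; collect_sketch is a hypothetical name.
def collect_sketch(ary)
  # RETURN_ENUMERATOR: with no block, hand back an enumerator instead.
  return ary.to_enum(:collect) unless block_given?

  # rb_ary_new2: a new, empty array (Ruby code cannot set the internal capacity).
  collected = Array.new

  # The for-loop over RARRAY_PTR(ary)[i] becomes a call to each.
  ary.each do |element|
    # rb_yield calls the block; rb_ary_push appends its return value.
    collected.push(yield(element))
  end

  collected
end

tens = collect_sketch([1, 2, 3]) { |n| n * 10 }
puts tens.inspect    # [10, 20, 30]

later = collect_sketch([1, 2, 3])
puts later.class     # Enumerator
```

&lt;/pre&gt;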

&lt;h2&gt;Conclusion&lt;/h2&gt;

&lt;p&gt;Let’s take a step back and review what I’ve done in this odd thought
experiment. Reading the original implementation of &lt;span
  class="code"&gt;rb_ary_collect&lt;/span&gt; I recognized some idiomatic C patterns
that I was familiar with. This allowed me &amp;ndash; in my own head at least &amp;ndash; to read
the C source code the same way I would read a Ruby function. Notice how similar
the code above is to Rubinius’s implementation:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2013/1/31/rubinius2.png"/&gt;&lt;/p&gt;

&lt;p&gt;My point today is that by learning a few of the C coding patterns the Ruby core
team uses, you can start to read Ruby’s source code just as easily as you can
read your own Ruby code. This is especially true for Ruby’s implementation of
built-in methods like &lt;span class="code"&gt;Array#collect&lt;/span&gt;.&lt;/p&gt;
</content>
  </entry>
  <entry>
    <title>Ruby MRI Source Code Idioms #1: Accessing Data Via Macros</title>
    <link href="http://patshaughnessy.net/2013/1/23/ruby-mri-source-code-idioms-1-accessing-data-via-macros" rel="alternate" />
    <id>http://patshaughnessy.net/2013/1/23/ruby-mri-source-code-idioms-1-accessing-data-via-macros</id>
    <published>2013-01-23T00:00:00Z</published>
    <updated>2013-01-23T00:00:00Z</updated>
    <category>ruby</category>
    <author>
      <name />
    </author>
    <summary type="html">&lt;div style="float: left; margin: 8px 25px 5px 0px; line-height:16px;"&gt;
  &lt;table cellpadding="0" cellspacing="0" border="0"&gt;
    &lt;tr&gt;&lt;td align="center" style="background-color: rgb(248, 248, 255);padding: 5px;"&gt;&lt;img
    src="http://patshaughnessy.net/assets/2013/1/23/definition.png"&gt;&lt;/td&gt;&lt;/tr&gt;
    &lt;tr&gt;&lt;td align="center"&gt;&lt;i&gt;From: &lt;a href="http://en.wiktionary.org/wiki/idiom"&gt;wiktionary.org&lt;/a&gt;&lt;/i&gt;&lt;/td&gt;&lt;/tr&gt;

</summary>
    <content type="html">&lt;div style="float: left; margin: 8px 25px 5px 0px; line-height:16px;"&gt;
  &lt;table cellpadding="0" cellspacing="0" border="0"&gt;
    &lt;tr&gt;&lt;td align="center" style="background-color: rgb(248, 248, 255);padding: 5px;"&gt;&lt;img
    src="http://patshaughnessy.net/assets/2013/1/23/definition.png"&gt;&lt;/td&gt;&lt;/tr&gt;
    &lt;tr&gt;&lt;td align="center"&gt;&lt;i&gt;From: &lt;a href="http://en.wiktionary.org/wiki/idiom"&gt;wiktionary.org&lt;/a&gt;&lt;/i&gt;&lt;/td&gt;&lt;/tr&gt;
  &lt;/table&gt;
&lt;/div&gt;


&lt;p&gt;Don’t be afraid of reading Ruby’s C source code. If you’re a Ruby developer, it
can be a lot of fun to see how things work “under the hood,” and studying Ruby
internals can give you a deeper understanding of what Ruby really is and how to
use it.  A good way to get started looking at Ruby’s source code, to get a “lay
of the land,” would be to watch Peter Cooper and me walk through some code in a
&lt;a href="http://www.rubyinside.com/ruby-mri-code-walk-tour-6020.html"&gt;screencast we recorded last
month&lt;/a&gt;. However,
you might be reluctant to read Ruby’s source code on your own since it’s
written in C, a verbose, confusing low-level language that most of us don’t
have time to learn.&lt;/p&gt;

&lt;p&gt;But is Ruby really written in C? I find Ruby’s C code to be very &lt;em&gt;idiomatic&lt;/em&gt;; at
times it almost resembles another dialect or language. To see what I mean, take
a look at this snippet from MRI’s array.c file, which implements the
&lt;span class="code"&gt;Array#compact!&lt;/span&gt; method:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2013/1/23/code1.png"/&gt;&lt;/p&gt;

&lt;p&gt;Most of the code in this function is composed of C macros; these appear in
capital letters. Macros are text formulas that C developers can use to make
their code more concise and easier to read. The Ruby core team very often uses
macros to access data, which is what most of the code in the function above is
doing. This is an example of what I call an “MRI idiom.”&lt;/p&gt;

&lt;p&gt;Today I’m going to take a close look at this method, &lt;span
  class="code"&gt;Array#compact!&lt;/span&gt;, and explain how it works at a C
programming level. I’ll do this by explaining what these different macros do.
Beyond understanding this one method, learning this MRI idiom of accessing data
via macros will help you understand many, many different functions in the Ruby
C source code. In a series of upcoming blog posts, I’ll look at some different
MRI idioms as well.&lt;/p&gt;

&lt;h2&gt;Array#compact!&lt;/h2&gt;

&lt;p&gt;Before we get to the C code, let’s review what the &lt;span
  class="code"&gt;compact!&lt;/span&gt; method does in Ruby. Here’s the example used in
the Ruby docs, taken from the C comment that appears just above this code in array.c:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2013/1/23/code-comment1.png"/&gt;&lt;/p&gt;

&lt;p&gt;The &lt;span class="code"&gt;rb_ary_compact_bang&lt;/span&gt; C function I showed above
actually implements this behavior. Whenever you use the &lt;span
  class="code"&gt;compact!&lt;/span&gt; method in your code, Ruby internally calls this
function and passes in the target array. Somehow it has to identify and remove
the &lt;span class="code"&gt;nil&lt;/span&gt; values. Also, it has to update the target
array, or the receiver of the &lt;span class="code"&gt;compact!&lt;/span&gt; message &amp;ndash; the
normal &lt;span class="code"&gt;compact&lt;/span&gt; method would return a new array
instead and leave the original unchanged. Finally, it should return nil if
there were no changes made to the array.&lt;/p&gt;
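
&lt;p&gt;Here is a short, runnable summary of those three requirements. The arrays
and variable names are my own, for illustration only:&lt;/p&gt;

&lt;pre&gt;

```ruby
letters = ["a", nil, "b", nil, "c", nil]

# compact! removes the nil values from the receiver itself and returns it.
result = letters.compact!
puts result.inspect            # ["a", "b", "c"]
puts result.equal?(letters)    # true

# With nothing to remove, compact! returns nil.
clean = ["a", "b", "c"]
unchanged = clean.compact!
puts unchanged.inspect         # nil

# compact (no bang) leaves the original untouched and returns a new array.
original = [1, nil, 2]
copy = original.compact
puts original.inspect          # [1, nil, 2]
puts copy.inspect              # [1, 2]
```

&lt;/pre&gt;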

&lt;h2&gt;Array data in MRI&lt;/h2&gt;

&lt;p&gt;To understand how MRI accesses data via macros, let’s first look at how it
stores data, at least for array objects. Ruby stores all arrays and their
contents using the &lt;span class="code"&gt;RArray&lt;/span&gt; C struct, like this:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2013/1/23/rarray1.png"/&gt;&lt;/p&gt;

&lt;p&gt;I won’t cover all the details here today; in fact, you can see some other MRI
idioms at work here, such as the &lt;span
  class="code"&gt;ary[RARRAY_EMBED_LEN_MAX]&lt;/span&gt; struct member which is used for
space optimization, or the &lt;span class="code"&gt;shared&lt;/span&gt; value, which is
used for copy-on-write optimization. I wrote about how these idioms work
in the String class last year; see: &lt;a href="http://patshaughnessy.net/2012/1/4/never-create-ruby-strings-longer-than-23-characters"&gt;Never create Ruby strings longer than 23
characters&lt;/a&gt;,
and &lt;a href="http://patshaughnessy.net/2012/1/18/seeing-double-how-ruby-shares-string-values"&gt;Seeing double: how Ruby shares string
values&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The details to learn for today are these: Ruby stores the Array data values in a
C memory array, tracked by the &lt;span class="code"&gt;VALUE *ptr&lt;/span&gt; pointer. Ruby (usually) tracks the
length of the array in &lt;span class="code"&gt;len&lt;/span&gt;, and Ruby keeps a capacity
value in &lt;span class="code"&gt;capa&lt;/span&gt;. The capacity records the size of the
allocated C memory array (as a count of VALUEs, not as bytes). Ruby frequently
allocates more memory for an array than what is actually needed.&lt;/p&gt;

&lt;p&gt;By taking some time to first study the &lt;span class="code"&gt;RArray&lt;/span&gt; C structure, you can quite easily
understand large parts of the Ruby C source code in the array.c file… and
understand how Ruby arrays actually work!&lt;/p&gt;

&lt;h2&gt;Accessing array data via macros&lt;/h2&gt;

&lt;p&gt;However, as I explained above, if you read array.c you’ll immediately notice
that Ruby doesn’t use the &lt;span class="code"&gt;RArray&lt;/span&gt; struct directly. Instead, it accesses values
such as &lt;span class="code"&gt;ptr&lt;/span&gt;, &lt;span class="code"&gt;len&lt;/span&gt; and &lt;span
  class="code"&gt;capa&lt;/span&gt; using macros. If you’re not a C programmer, macros
are formulas that the C “pre-processor” evaluates and substitutes into the C
code just before the C compiler runs.&lt;/p&gt;

&lt;p&gt;So let’s just take a look at these macros and see what they do &amp;ndash; should be simple,
right?&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2013/1/23/gibberish.png"/&gt;&lt;/p&gt;

&lt;p&gt;Oops! C programming isn’t that simple! One of the most challenging parts of
reading and understanding Ruby’s code is figuring out what these macros mean
and do. But this is essential, since they are used so frequently across the
code base. Most of the complexity in these particular formulas has to do with
Ruby’s embedded data idiom, which I’ll cover in one of my upcoming blog
posts.&lt;/p&gt;

&lt;p&gt;But don’t despair, normally these macros just boil down to something very
simple:&lt;/p&gt;

&lt;ol&gt;

&lt;li&gt;&lt;span class="code"&gt;RARRAY_PTR(ary)&lt;/span&gt; - this returns a pointer to the array’s actual data,
normally the same as the &lt;span class="code"&gt;ptr&lt;/span&gt; value. In &lt;span
  class="code"&gt;rb_ary_compact_bang&lt;/span&gt;, Ruby initializes the &lt;span
  class="code"&gt;p&lt;/span&gt; and &lt;span class="code"&gt;t&lt;/span&gt; pointers using &lt;span
  class="code"&gt;RARRAY_PTR&lt;/span&gt;:

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2013/1/23/rarray2.png"/&gt;&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;span class="code"&gt;RARRAY_LEN(ary)&lt;/span&gt; - this returns the length of the array, normally just the
&lt;span class="code"&gt;len&lt;/span&gt; value. &lt;span class="code"&gt;rb_ary_compact_bang&lt;/span&gt; initializes the &lt;span class="code"&gt;end&lt;/span&gt; pointer using
&lt;span class="code"&gt;RARRAY_LEN&lt;/span&gt;, by adding the length to &lt;span class="code"&gt;p&lt;/span&gt;:

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2013/1/23/rarray3.png"/&gt;&lt;/p&gt;

&lt;p&gt;In this diagram, I assume the length of the array is 3, and the
capacity of the array is 5.&lt;/p&gt;&lt;br/&gt;&lt;/li&gt;

&lt;li&gt;&lt;span class="code"&gt;ARY_CAPA(ary)&lt;/span&gt; - this returns the capacity of the array, or the amount of memory Ruby
actually allocated for the array’s elements. Ruby allocates more memory than
necessary to avoid repeated memory allocations when an array changes size. This
normally just returns &lt;span class="code"&gt;capa&lt;/span&gt; (except when Ruby is using certain optimizations):

&lt;p/&gt;

&lt;img src="http://patshaughnessy.net/assets/2013/1/23/rarray4.png"/&gt;&lt;/li&gt;

&lt;li&gt;&lt;span class="code"&gt;ARY_SET_LEN(ary, n)&lt;/span&gt; - this updates the array length, which is normally the &lt;span class="code"&gt;len&lt;/span&gt;
value:

&lt;p/&gt;

&lt;img src="http://patshaughnessy.net/assets/2013/1/23/rarray5.png"/&gt;&lt;/li&gt;

&lt;/ol&gt;


&lt;h2&gt;Putting it all together&lt;/h2&gt;

&lt;p&gt;Now that we understand how MRI accesses data values via macros, it’s not hard
to follow most of the code in &lt;span class="code"&gt;rb_ary_compact_bang&lt;/span&gt;:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2013/1/23/rb_ary_compact_bang.png"/&gt;&lt;/p&gt;

&lt;p&gt;Let’s walk through it, starting with this loop:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2013/1/23/compact-loop.png"/&gt;&lt;/p&gt;

&lt;p&gt;This C pointer arithmetic loop actually does the compact operation &amp;ndash; if
you’re not familiar with C, here’s a five-second lesson on how pointers work:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;If &lt;span class="code"&gt;p&lt;/span&gt; is a pointer, then &lt;span
class="code"&gt;*p&lt;/span&gt; means to return the value that &lt;span
class="code"&gt;p&lt;/span&gt; points to, and:&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;span class="code"&gt;*p++&lt;/span&gt; means return the value &lt;span
class="code"&gt;p&lt;/span&gt; points to, but also increment &lt;span
class="code"&gt;p&lt;/span&gt; by one after obtaining this value, so it points to the
next value.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;Let’s walk through this loop visually, to get a sense of how it works. I’ll use
the example from the Ruby docs:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2013/1/23/ruby-example.png"/&gt;&lt;/p&gt;

&lt;p&gt;First, Ruby checks whether the first value in the array is nil, using another
macro: &lt;span class="code"&gt;NIL_P(*t)&lt;/span&gt;:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2013/1/23/compacting1.png"/&gt;&lt;/p&gt;

&lt;p&gt;Since it is not nil, Ruby copies the “a” onto itself:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2013/1/23/else.png"/&gt;&lt;/p&gt;

&lt;p&gt;This has no effect, but both &lt;span class="code"&gt;p&lt;/span&gt; and &lt;span class="code"&gt;t&lt;/span&gt; move forward to the next element:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2013/1/23/compacting2.png"/&gt;&lt;/p&gt;

&lt;p&gt;Now &lt;span class="code"&gt;NIL_P(*t)&lt;/span&gt; is true, so Ruby just increments &lt;span class="code"&gt;t&lt;/span&gt; and not &lt;span class="code"&gt;p&lt;/span&gt;:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2013/1/23/if.png"/&gt;&lt;/p&gt;

&lt;p&gt;Now &lt;span class="code"&gt;t&lt;/span&gt; points to the “b”, while &lt;span class="code"&gt;p&lt;/span&gt; remains the same:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2013/1/23/compacting3.png"/&gt;&lt;/p&gt;

&lt;p&gt;This time, &lt;span class="code"&gt;NIL_P(*t)&lt;/span&gt; is false, so Ruby copies the value “b” back, and
increments both pointers:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2013/1/23/else.png"/&gt;&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2013/1/23/compacting4.png"/&gt;&lt;/p&gt;

&lt;p&gt;Continuing through the loop again, &lt;span class="code"&gt;NIL_P(*t)&lt;/span&gt; will be true this time:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2013/1/23/if.png"/&gt;&lt;/p&gt;

&lt;p&gt;And Ruby will again only increment &lt;span class="code"&gt;t&lt;/span&gt;:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2013/1/23/compacting5.png"/&gt;&lt;/p&gt;

&lt;p&gt;Iterating again, &lt;span class="code"&gt;t&lt;/span&gt; points to the “c”, and so Ruby will copy it back:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2013/1/23/else.png"/&gt;&lt;/p&gt;

&lt;p&gt;And again now both pointers will be incremented:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2013/1/23/compacting6.png"/&gt;&lt;/p&gt;

&lt;p&gt;Finally, Ruby increments &lt;span class="code"&gt;t&lt;/span&gt; past the last nil value,
and exits the loop when &lt;span class="code"&gt;t == end&lt;/span&gt;. This leaves us with
the compacted array, which ends at the current location of &lt;span
  class="code"&gt;p&lt;/span&gt;.&lt;/p&gt;
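
&lt;p&gt;If C pointer arithmetic is hard to follow, here’s a rough Ruby sketch of the same loop. This is only a simulation, not MRI’s code: the &lt;span class="code"&gt;compact_in_place&lt;/span&gt; method and its &lt;span class="code"&gt;read&lt;/span&gt; and &lt;span class="code"&gt;write&lt;/span&gt; indexes are names I made up, standing in for the &lt;span class="code"&gt;t&lt;/span&gt; and &lt;span class="code"&gt;p&lt;/span&gt; pointers:&lt;/p&gt;

```ruby
# Simulates the two-pointer compact loop, using array indexes
# in place of the C pointers t (read) and p (write).
def compact_in_place(values)
  write = 0                # next slot to fill, like the C pointer p
  read = 0                 # next slot to inspect, like the C pointer t
  finish = values.length   # one past the last element, like end

  until read == finish
    if values[read].nil?
      read += 1            # NIL_P(*t) is true: skip the nil, leave write alone
    else
      values[write] = values[read]   # *p++ = *t++
      write += 1
      read += 1
    end
  end

  write                    # the new length: how many non-nil values were kept
end

ary = ["a", nil, "b", nil, "c", nil]
new_len = compact_in_place(ary)
puts new_len                       # prints 3
puts ary.first(new_len).inspect    # prints ["a", "b", "c"]
```

&lt;p&gt;Notice that the slots past the new length still hold stale values; the loop never erases them. The array is only correct once its length has been updated to match.&lt;/p&gt;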

&lt;h2&gt;Wrapping up&lt;/h2&gt;

&lt;p&gt;Now the compacting operation is done, and Ruby just needs to wrap things up and
leave the array’s internal values in a self-consistent state.&lt;/p&gt;

&lt;p&gt;First, Ruby calculates the new length, using the &lt;span class="code"&gt;p&lt;/span&gt;
pointer and &lt;span class="code"&gt;RARRAY_PTR&lt;/span&gt; which returns the start of the
array again:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2013/1/23/calc-length.png"/&gt;&lt;/p&gt;

&lt;p&gt;You can see that if the new length is the same as the original length, Ruby will
return &lt;span class="code"&gt;nil&lt;/span&gt; and exit immediately. Otherwise, Ruby uses
&lt;span class="code"&gt;ARY_SET_LEN&lt;/span&gt; to save the new length back in the &lt;span
  class="code"&gt;RArray&lt;/span&gt; struct. In the example above, the new length would
be 3.&lt;/p&gt;

&lt;p&gt;The last bit of confusing code in &lt;span class="code"&gt;rb_ary_compact_bang&lt;/span&gt;
updates the array’s capacity, using the &lt;span class="code"&gt;ARY_CAPA&lt;/span&gt;
macro:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2013/1/23/reset-capacity.png"/&gt;&lt;/p&gt;

&lt;p&gt;This code is still very confusing, but at least we know now that &lt;span
  class="code"&gt;ARY_CAPA(ary)&lt;/span&gt; returns the current capacity of the array.
Remember the capacity is the actual size of the memory allocated to hold the
array data, measured as an element count. Here Ruby calls the &lt;span
  class="code"&gt;ary_resize_capa&lt;/span&gt; function if the length of the
compacted array is less than half of the current capacity, which will free up
some memory. The condition about &lt;span class="code"&gt;ARY_DEFAULT_SIZE&lt;/span&gt;
enforces a minimum capacity &amp;ndash; this constant is set to 16 at the top of array.c:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2013/1/23/default-size.png"/&gt;&lt;/p&gt;

&lt;p&gt;Note: this doesn’t mean that all new, empty arrays allocate enough memory to have a
capacity of at least 16; things aren’t so simple. I’ll explain how new arrays
look in my next post.&lt;/p&gt;
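
&lt;p&gt;To make the shrinking rule concrete, here’s a small Ruby sketch of the decision as I read it. The &lt;span class="code"&gt;shrink_capacity?&lt;/span&gt; method is hypothetical, and I’m assuming both comparisons are made against twice the value (twice the new length and twice the default size); the screenshot of the C code above has the exact expression:&lt;/p&gt;

```ruby
DEFAULT_SIZE = 16   # stands in for ARY_DEFAULT_SIZE

# Hypothetical helper: should the array give memory back after compacting?
# Assumes the capacity must exceed both twice the new length and
# twice the default size before a resize is worthwhile.
def shrink_capacity?(new_len, capa)
  capa > [new_len, DEFAULT_SIZE].max * 2
end

puts shrink_capacity?(3, 5)      # prints false: a capacity of 5 is too small to bother
puts shrink_capacity?(3, 100)    # prints true: mostly empty and well above the minimum
puts shrink_capacity?(60, 100)   # prints false: the array is still more than half full
```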

&lt;h2&gt;Loose ends&lt;/h2&gt;

&lt;p&gt;I glossed over a few details here. First of all, as I said above,
&lt;span class="code"&gt;RARRAY_PTR&lt;/span&gt; and &lt;span class="code"&gt;RARRAY_LEN&lt;/span&gt;
sometimes work differently. I’ll cover this in my next blog post, on Ruby’s
“embedded data” idiom. Second, I didn’t explain the call to &lt;span
  class="code"&gt;rb_ary_modify&lt;/span&gt;, which is used for Ruby’s copy-on-write
optimization, another MRI idiom. While these are optimizations Ruby uses
internally to speed up your programs, I consider them to be idioms also since
they have a broad, widespread impact on the way MRI’s C code was
written.&lt;/p&gt;
</content>
  </entry>
  <entry>
    <title>Ruby, Smalltalk and Class Variables</title>
    <link href="http://patshaughnessy.net/2012/12/17/ruby-smalltalk-and-class-variables" rel="alternate" />
    <id>http://patshaughnessy.net/2012/12/17/ruby-smalltalk-and-class-variables</id>
    <published>2012-12-17T00:00:00Z</published>
    <updated>2012-12-17T00:00:00Z</updated>
    <category>ruby</category>
    <author>
      <name />
    </author>
    <summary type="html">&lt;div style="float: left; padding: 17px 30px 10px 0px;
line-height:16px"&gt;
  &lt;table cellpadding="0" cellspacing="0" border="0"&gt;
    &lt;tr&gt;&lt;td align="center"&gt;&lt;img src="http://patshaughnessy.net/assets/2012/12/17/bluebook-and-ruby.png"&gt;&lt;/td&gt;&lt;/tr&gt;
    &lt;tr&gt;&lt;td align="center"&gt;&lt;i&gt;Many of the ideas behind Ruby’s object model&lt;br/&gt;were developed for Smalltalk in the 1970s.&lt;/i&gt;&lt;/td&gt;&lt;/tr&gt;
  &lt;/table&gt;
&lt;/div&gt;


&lt;p&gt;A couple weeks ago an article by Ernie Miller got me interested in how class variables work in Ruby&amp;hellip;&lt;/p&gt;
</summary>
    <content type="html">&lt;div style="float: left; padding: 17px 30px 10px 0px;
line-height:16px"&gt;
  &lt;table cellpadding="0" cellspacing="0" border="0"&gt;
    &lt;tr&gt;&lt;td align="center"&gt;&lt;img src="http://patshaughnessy.net/assets/2012/12/17/bluebook-and-ruby.png"&gt;&lt;/td&gt;&lt;/tr&gt;
    &lt;tr&gt;&lt;td align="center"&gt;&lt;i&gt;Many of the ideas behind Ruby’s object model&lt;br/&gt;were developed for Smalltalk in the 1970s.&lt;/i&gt;&lt;/td&gt;&lt;/tr&gt;
  &lt;/table&gt;
&lt;/div&gt;


&lt;p&gt;A couple weeks ago &lt;a href="http://erniemiller.org/2012/11/29/ruby-tidbit-include-vs-extend-with-module-class-variables/"&gt;an article by Ernie
Miller&lt;/a&gt;
got me interested in how class variables work in Ruby. After doing a bit of
research, I found that class variables have been a perennial source of
confusion. In fact, John Nunemaker wrote an article called &lt;a href="http://railstips.org/blog/archives/2006/11/18/class-and-instance-variables-in-ruby/"&gt;Class and Instance
Variables
In Ruby&lt;/a&gt;
way back in 2006 that still applies today. The fundamental problem with class
variables in Ruby is that they are shared among a class and all of its
subclasses &amp;ndash; as John explained six years ago, this can lead to confusion and
unexpected behavior.&lt;/p&gt;

&lt;p&gt;But for me the interesting question here is: “Why?” Why does Ruby share a single
value across all of the subclasses? Why have a distinction between “class
variables” and “class instance variables?” Where do these ideas come from? It
turns out the answer is simple: class variables in Ruby work the same way class
variables work in a much older language called
&lt;a href="http://en.wikipedia.org/wiki/Smalltalk"&gt;Smalltalk&lt;/a&gt;. Smalltalk was invented in
the early 1970s by the renowned computer scientist &lt;a href="http://en.wikipedia.org/wiki/Alan_Kay"&gt;Alan
Kay&lt;/a&gt; and a group of his colleagues
working at the &lt;a href="http://www.parc.com"&gt;Xerox PARC&lt;/a&gt; laboratory. With
Smalltalk, Alan Kay didn’t just invent a programming language; he also
coined the term “object oriented programming” (OOP) and built one of the first
languages designed entirely around it. While not in very widespread use now,
Smalltalk has influenced many other object oriented programming languages that
are used widely today &amp;ndash; most importantly Objective C and Ruby.&lt;/p&gt;

&lt;p&gt;Today I’m going to look at how class variables work in Smalltalk, and compare and
contrast that against how they work in Ruby. As you’ll see, I found that class
variables aren’t the only idea Ruby took from Smalltalk. Much of Ruby’s object
model design was taken from Smalltalk as well.&lt;/p&gt;

&lt;h2&gt;Class variables in Ruby&lt;/h2&gt;

&lt;p&gt;First, let’s quickly review what a class variable is, and how they work in Ruby. Using
&lt;a href="http://railstips.org/blog/archives/2006/11/18/class-and-instance-variables-in-ruby/"&gt;John Nunemaker’s example from
2006&lt;/a&gt;,
here’s a simple Ruby class, &lt;span class="code"&gt;Polygon&lt;/span&gt;, that contains a single class variable,
&lt;span class="code"&gt;@@sides&lt;/span&gt;:&lt;/p&gt;

&lt;div class="CodeRay"&gt;
  &lt;div class="code"&gt;&lt;pre&gt;&lt;span class="r"&gt;class&lt;/span&gt; &lt;span class="cl"&gt;Polygon&lt;/span&gt;
  &lt;span class="cv"&gt;@@sides&lt;/span&gt; = &lt;span class="i"&gt;10&lt;/span&gt;
  &lt;span class="r"&gt;def&lt;/span&gt; &lt;span class="pc"&gt;self&lt;/span&gt;.&lt;span class="fu"&gt;sides&lt;/span&gt;
    &lt;span class="cv"&gt;@@sides&lt;/span&gt;
  &lt;span class="r"&gt;end&lt;/span&gt;
&lt;span class="r"&gt;end&lt;/span&gt;

puts &lt;span class="co"&gt;Polygon&lt;/span&gt;.sides
=&amp;gt; 10
&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;


&lt;p&gt;This is simple enough: &lt;span class="code"&gt;@@sides&lt;/span&gt; is a variable that any class or instance
method of &lt;span class="code"&gt;Polygon&lt;/span&gt; can access. Here the &lt;span class="code"&gt;sides&lt;/span&gt; class method returns it. At a
conceptual level, internally Ruby associates the &lt;span class="code"&gt;@@sides&lt;/span&gt; variable with the
same memory structure used to represent the &lt;span class="code"&gt;Polygon&lt;/span&gt; class:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2012/12/17/polygon.png"/&gt;&lt;/p&gt;

&lt;p&gt;The confusion comes in when you define a subclass; again here is another one of
John Nunemaker’s examples:&lt;/p&gt;

&lt;div class="CodeRay"&gt;
  &lt;div class="code"&gt;&lt;pre&gt;&lt;span class="r"&gt;class&lt;/span&gt; &lt;span class="cl"&gt;Triangle&lt;/span&gt; &amp;lt; &lt;span class="co"&gt;Polygon&lt;/span&gt;
  &lt;span class="cv"&gt;@@sides&lt;/span&gt; = &lt;span class="i"&gt;3&lt;/span&gt;
&lt;span class="r"&gt;end&lt;/span&gt;

puts &lt;span class="co"&gt;Triangle&lt;/span&gt;.sides
=&amp;gt; 3
puts &lt;span class="co"&gt;Polygon&lt;/span&gt;.sides
=&amp;gt; 3
&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;


&lt;p&gt;Notice that both &lt;span class="code"&gt;Triangle.sides&lt;/span&gt; and &lt;span class="code"&gt;Polygon.sides&lt;/span&gt; now return
3. In fact, internally Ruby creates a single variable that both classes share:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2012/12/17/polygon-and-triangle.png"/&gt;&lt;/p&gt;
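
&lt;p&gt;You can check this from Ruby itself using reflection. This sketch is my own addition, not one of John’s examples: it builds the subclass with &lt;span class="code"&gt;Class.new(Polygon)&lt;/span&gt; and writes the variable with &lt;span class="code"&gt;class_variable_set&lt;/span&gt;, which I’m using as stand-ins for the subclass definition above, and then asks each class which class variables it owns directly:&lt;/p&gt;

```ruby
class Polygon
  @@sides = 10
  def self.sides
    @@sides
  end
end

# Class.new(Polygon) creates a subclass of Polygon.
Triangle = Class.new(Polygon)

# Writing through the subclass finds the variable Polygon already
# defined and updates that one; no new variable is created.
Triangle.class_variable_set(:@@sides, 3)

puts Polygon.sides                             # prints 3
puts Polygon.class_variables(false).inspect    # prints [:@@sides]
puts Triangle.class_variables(false).inspect   # prints []
```

&lt;p&gt;Passing &lt;span class="code"&gt;false&lt;/span&gt; to &lt;span class="code"&gt;class_variables&lt;/span&gt; leaves out inherited variables, so the empty array shows that &lt;span class="code"&gt;Triangle&lt;/span&gt; has no variable of its own.&lt;/p&gt;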

&lt;p&gt;I may write in more detail about Ruby’s internal implementation
of class variables in an upcoming blog post, but for today I’ll just use these
very simple diagrams. Instead, now let’s switch gears and learn more about
Smalltalk….&lt;/p&gt;

&lt;h2&gt;What is Smalltalk?&lt;/h2&gt;

&lt;p&gt;As I said above, Alan Kay invented Smalltalk along with object oriented
programming while working at Xerox PARC in the early 1970s. This is the same
laboratory that also invented the personal computer, the graphical user
interface, and the Ethernet among many other things. Object oriented
programming actually seems to be one of their less important inventions!&lt;/p&gt;

&lt;p&gt;In Smalltalk, Kay introduced terminology and ideas that we all take for granted
today. Every value in Smalltalk, including language constructs such as code
blocks, is an object. A Smalltalk program consists of these objects and the
way they interact; to call a particular Smalltalk function, you “send a
message” to the object that implements that function. In Smalltalk, functions
are known as “methods.” An object implements a series of methods. All of this
should sound very familiar, of course.&lt;/p&gt;

&lt;div style="float: left; padding: 17px 30px 10px 0px;
line-height:16px"&gt;
  &lt;table cellpadding="0" cellspacing="0" border="0"&gt;
    &lt;tr&gt;&lt;td align="center"&gt;&lt;img src="http://patshaughnessy.net/assets/2012/12/17/children.png"&gt;&lt;/td&gt;&lt;/tr&gt;
    &lt;tr&gt;&lt;td align="center"&gt;&lt;i&gt;

In the 1970s, Alan Kay envisioned a laptop/tablet he called
the&lt;br/&gt;
“Dynabook” would run Smalltalk. He and his team actually built
a&lt;br/&gt;
computer called the “Interim Dynabook” and used it to teach&lt;br/&gt;
programming to middle school children.

    &lt;/i&gt;&lt;/td&gt;&lt;/tr&gt;
  &lt;/table&gt;
&lt;/div&gt;


&lt;p&gt;From the very beginning, Kay’s conception of OOP included the idea of an
object’s “class.” An object’s class described a series of behaviors (methods)
each instance of that class exhibited. Smalltalk also implemented the concept
of inheritance, which allows the developer to define “subclasses” that share
the behaviors of their “superclass.” Many of these terms we use often today were
coined or popularized by Kay and his colleagues 40 years ago.&lt;/p&gt;

&lt;p&gt;Smalltalk, however, is more than just a programming language; it’s an entire
graphical development environment. I think of Smalltalk as a precursor to
Visual Studio or Xcode, invented before Microsoft or Apple even existed, in a
world where computers were found only in academic or government settings! One
other impressive goal Alan Kay and the Smalltalk team had from the beginning
was to use their visual environment as a teaching tool for school children.
It’s a truly amazing story.&lt;/p&gt;

&lt;p&gt;To learn more about the history and origin of Smalltalk, I would highly
recommend reading &lt;i&gt;The Early History Of Smalltalk&lt;/i&gt;
(&lt;a href="http://www.smalltalk.org/smalltalk/TheEarlyHistoryOfSmalltalk_Abstract.html"&gt;html&lt;/a&gt;
or &lt;a href="http://www.smalltalk.org/downloads/papers/SmalltalkHistoryHOPL.pdf"&gt;original
pdf&lt;/a&gt; or
&lt;a href="http://samizdat.cc/shelf/documents/2004/08.02-historyOfSmalltalk/historyOfSmalltalk.pdf"&gt;easier to read pdf, but missing some
diagrams&lt;/a&gt;),
a retrospective account Kay wrote later in the 1990s. It’s a fascinating
narrative of how Kay and his colleagues borrowed ideas from even earlier, but
with the combination of hard work, creativity and pure talent managed to take a
large step forward and revolutionize the computer science world of their day,
and ours.&lt;/p&gt;

&lt;p&gt;Alan Kay created the first working version of Smalltalk in 1972 &amp;ndash; in his own
words, here is how it happened:&lt;/p&gt;

&lt;blockquote style="line-height:16px"&gt;
I had expected that the new Smalltalk would be an iconic language and would
take at least two years to invent, but fate intervened. One day, in a typical
PARC hallway bullsession, Ted Kaehler, Dan Ingalls, and I were standing around
talking about programming languages. The subject of power came up and the two
of them wondered how large a language one would have to make to get great
power. With as much panache as I could muster, I asserted that you could define
the "most powerful language in the world" in "a page of code." They said, "Put
up or shut up." Ted went back to CMU but Dan was still around egging me on. For
the next two weeks I got to PARC every morning at four o'clock and worked on
the problem until eight, when Dan, joined by Henry Fuchs, John Shoch, and Steve
Purcell showed up to kibbitz the morning's work.  I had originally made the
boast because McCarthy's self-describing LISP interpreter was written in
itself. It was about "a page", and as far as power goes, LISP was the whole
nine-yards for functional languages. I was quite sure I could do the same for
object-oriented languages….
&lt;/blockquote&gt;


&lt;p&gt;Here Kay referred to &lt;a href="http://en.wikipedia.org/wiki/John_McCarthy_(computer_scientist)"&gt;John
McCarthy&lt;/a&gt;, who
invented LISP about 10 years earlier.  It took Kay only eight early mornings of
work to finish the first version of Smalltalk:&lt;/p&gt;

&lt;blockquote style="line-height:16px"&gt;
The first few versions had flaws that were soundly criticized by the group. But
by morning 8 or so, a version appeared that seemed to work….
&lt;/blockquote&gt;


&lt;p&gt;I wish I could be as creative, dedicated and productive as Alan Kay and his
Xerox PARC colleagues were 40 years ago!&lt;/p&gt;

&lt;h2&gt;Class variables in Smalltalk&lt;/h2&gt;

&lt;p&gt;To find out how class variables actually work in Smalltalk, I
installed &lt;a href="http://smalltalk.gnu.org"&gt;GNU
Smalltalk&lt;/a&gt;, a command line based
version of the language which is easy to download and run on a
Linux box. Initially I found Smalltalk to be very strange and
unfamiliar; its syntax seems a bit odd at first
glance. For example, you need to remember to end each command
with a period, and also to define a method you only need to
specify a list of arguments… without a method name! I suppose the
first argument is the method name or vice-versa. But after a
couple of days I became accustomed to the idiosyncratic syntax,
and the language began to make more sense to me.&lt;/p&gt;

&lt;p&gt;Here is the same &lt;span class="code"&gt;Polygon&lt;/span&gt; class again &amp;ndash; now I have Smalltalk on the left, and
Ruby on the right:&lt;/p&gt;

&lt;div class="CodeRay"&gt;
  &lt;div class="code"&gt;&lt;pre&gt;&lt;div style="float: left; width: 350px;"&gt;Object &lt;span class="r"&gt;subclass:&lt;/span&gt; Polygon [
  Sides := &lt;span class="cl"&gt;10&lt;/span&gt;.
]

Polygon &lt;span class="r"&gt;class&lt;/span&gt; extend [
  sides [ ^Sides ]
]

Polygon sides printNl.
=&amp;gt; 10&lt;/div&gt;&lt;div style="float: left;"&gt;&lt;span class="r"&gt;class&lt;/span&gt; &lt;span class="cl"&gt;Polygon&lt;/span&gt;
  &lt;span class="cv"&gt;@@sides&lt;/span&gt; = &lt;span class="i"&gt;10&lt;/span&gt;
  &lt;span class="r"&gt;def&lt;/span&gt; &lt;span class="pc"&gt;self&lt;/span&gt;.&lt;span class="fu"&gt;sides&lt;/span&gt;
    &lt;span class="cv"&gt;@@sides&lt;/span&gt;
  &lt;span class="r"&gt;end&lt;/span&gt;
&lt;span class="r"&gt;end&lt;/span&gt;


puts &lt;span class="co"&gt;Polygon&lt;/span&gt;.sides
=&amp;gt; 10&lt;/div&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;


&lt;p&gt;Here’s a quick explanation of what the Smalltalk code does:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;span class="code"&gt;Object subclass: Polygon&lt;/span&gt; &amp;ndash; this means send the &lt;span class="code"&gt;subclass:&lt;/span&gt;
message to the &lt;span class="code"&gt;Object&lt;/span&gt; class and pass in the name &lt;span class="code"&gt;Polygon&lt;/span&gt;. It
creates a new class, which is a subclass of the &lt;span class="code"&gt;Object&lt;/span&gt; class.
This is analogous to &lt;span class="code"&gt;class Polygon &amp;lt; Object&lt;/span&gt; in Ruby. Of
course, in Ruby specifying &lt;span class="code"&gt;Object&lt;/span&gt; as the superclass is
unnecessary.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;span class="code"&gt;Sides := 10.&lt;/span&gt; &amp;ndash; this declares a class variable &lt;span class="code"&gt;Sides&lt;/span&gt;, and
assigns it a value. Ruby instead uses the &lt;span class="code"&gt;@@sides&lt;/span&gt; syntax.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;span class="code"&gt;Polygon class extend&lt;/span&gt; &amp;ndash; this “extends” the &lt;span class="code"&gt;Polygon&lt;/span&gt; class;
i.e., it opens up the &lt;span class="code"&gt;Polygon&lt;/span&gt; class and allows me to add a class
method.  In Ruby I use &lt;span class="code"&gt;class Polygon; def self.sides&lt;/span&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The &lt;span class="code"&gt;printNl&lt;/span&gt; method prints a value to the console; it works
the same way as &lt;span class="code"&gt;puts&lt;/span&gt; in Ruby, except &lt;span class="code"&gt;printNl&lt;/span&gt; is a method of
the &lt;span class="code"&gt;Sides&lt;/span&gt; object. Imagine calling &lt;span class="code"&gt;@@sides.puts&lt;/span&gt; in Ruby!&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;Aside from the superficial syntax differences, if you take a step back and
think about this, it’s striking how similar Smalltalk and Ruby really are! Not
only do both languages share the same class variable concept, but I wrote the
&lt;span class="code"&gt;Polygon&lt;/span&gt; class, declared a class variable and printed it out exactly the same
way in both languages. In fact, you can think of Ruby as a newer version of
Smalltalk with a simpler, easier to use syntax!&lt;/p&gt;

&lt;p&gt;As I said at the top, Smalltalk shares class variables among subclasses the
same way Ruby does. Here’s how I would declare the Triangle subclass in
Smalltalk and Ruby:&lt;/p&gt;

&lt;div class="CodeRay"&gt;
  &lt;div class="code"&gt;&lt;pre&gt;&lt;div style="float: left; width: 350px;"&gt;Polygon &lt;span class="r"&gt;subclass:&lt;/span&gt; Triangle [
]
Triangle &lt;span class="r"&gt;class&lt;/span&gt; extend [
  set_sides: num [ Sides := num ]
]

Polygon sides printNl.
=&amp;gt; 10 &lt;/div&gt;&lt;div style="float: left;"&gt;&lt;span class="r"&gt;class&lt;/span&gt; &lt;span class="cl"&gt;Triangle&lt;/span&gt; &amp;lt; &lt;span class="co"&gt;Polygon&lt;/span&gt;
  &lt;span class="r"&gt;def&lt;/span&gt; &lt;span class="pc"&gt;self&lt;/span&gt;.&lt;span class="fu"&gt;sides=&lt;/span&gt;(num)
    &lt;span class="cv"&gt;@@sides&lt;/span&gt; = num
  &lt;span class="r"&gt;end&lt;/span&gt;
&lt;span class="r"&gt;end&lt;/span&gt;

puts &lt;span class="co"&gt;Triangle&lt;/span&gt;.sides
=&amp;gt; 10&lt;/div&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;


&lt;p&gt;Here I declare the &lt;span class="code"&gt;Triangle&lt;/span&gt; subclass and a method to set the class variable’s
value. Now let’s try changing the value of the class variable from the
subclass:&lt;/p&gt;

&lt;div class="CodeRay"&gt;
  &lt;div class="code"&gt;&lt;pre&gt;&lt;div style="float: left; width: 350px;"&gt;Triangle set_sides: &lt;span class="cl"&gt;3&lt;/span&gt;.
Triangle sides printNl.
=&amp;gt; 3&lt;/div&gt;&lt;div style="float: left;"&gt;&lt;span class="co"&gt;Triangle&lt;/span&gt;.sides = &lt;span class="i"&gt;3&lt;/span&gt;
puts &lt;span class="co"&gt;Triangle&lt;/span&gt;.sides
=&amp;gt; 3&lt;/div&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;


&lt;p&gt;No surprise; by calling the &lt;span class="code"&gt;set_sides:&lt;/span&gt; class method (&lt;span class="code"&gt;sides=&lt;/span&gt; in Ruby) I can
update the value. But notice that since both &lt;span class="code"&gt;Polygon&lt;/span&gt; and &lt;span class="code"&gt;Triangle&lt;/span&gt; share the same
class variable, it’s changed for &lt;span class="code"&gt;Polygon&lt;/span&gt; also:&lt;/p&gt;

&lt;div class="CodeRay"&gt;
  &lt;div class="code"&gt;&lt;pre&gt;&lt;div style="float: left; width: 350px;"&gt;Polygon sides printNl.
=&amp;gt; 3&lt;/div&gt;&lt;div style="float: left;"&gt;puts &lt;span class="co"&gt;Polygon&lt;/span&gt;.sides
=&amp;gt; 3&lt;/div&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;


&lt;p&gt;Again, we’ve seen Ruby and Smalltalk behave in exactly the same way.&lt;/p&gt;

&lt;p&gt;One way the two languages differ is that Smalltalk does allow you to create a
separate class variable for each subclass, if you want. By repeating the class
variable definition and the accessor class method in both classes they become
separate variables, at least in GNU Smalltalk which I was using:&lt;/p&gt;

&lt;div class="CodeRay"&gt;
  &lt;div class="code"&gt;&lt;pre&gt;Object &lt;span class="r"&gt;subclass:&lt;/span&gt; Polygon [
  Sides := &lt;span class="cl"&gt;10&lt;/span&gt;.
]

Polygon &lt;span class="r"&gt;class&lt;/span&gt; extend [
  sides [ ^Sides ]
]

Polygon &lt;span class="r"&gt;subclass:&lt;/span&gt; Triangle [
  Sides := &lt;span class="cl"&gt;3&lt;/span&gt;.
]

Triangle &lt;span class="r"&gt;class&lt;/span&gt; extend [
  sides [ ^Sides ]
]

Polygon sides printNl.
=&amp;gt; 10

Triangle sides printNl.
=&amp;gt; 3
&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;


&lt;p&gt;This isn’t true in Ruby; as we saw above &lt;span class="code"&gt;@@sides&lt;/span&gt; will always refer to the
same value.&lt;/p&gt;

&lt;h2&gt;Class instance variables&lt;/h2&gt;

&lt;p&gt;In Ruby if you want to keep a separate value for each class, then you need to
use a class instance variable instead of a class variable. What does this mean?
Let’s take a look at another one of John Nunemaker’s examples:&lt;/p&gt;

&lt;div class="CodeRay"&gt;
  &lt;div class="code"&gt;&lt;pre&gt;&lt;span class="r"&gt;class&lt;/span&gt; &lt;span class="cl"&gt;Polygon&lt;/span&gt;
  &lt;span class="r"&gt;def&lt;/span&gt; &lt;span class="pc"&gt;self&lt;/span&gt;.&lt;span class="fu"&gt;sides&lt;/span&gt;
    &lt;span class="iv"&gt;@sides&lt;/span&gt;
  &lt;span class="r"&gt;end&lt;/span&gt;
  &lt;span class="iv"&gt;@sides&lt;/span&gt; = &lt;span class="i"&gt;8&lt;/span&gt;
&lt;span class="r"&gt;end&lt;/span&gt;

puts &lt;span class="co"&gt;Polygon&lt;/span&gt;.sides
=&amp;gt; 8
&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;


&lt;p&gt;Now since I used the &lt;span class="code"&gt;@sides&lt;/span&gt; notation instead of &lt;span class="code"&gt;@@sides&lt;/span&gt;, Ruby created a
class instance variable instead of a class variable:&lt;/p&gt;

&lt;p&gt;&lt;img
src="http://patshaughnessy.net/assets/2012/12/17/polygon-instance.png"/&gt;&lt;/p&gt;

&lt;p&gt;Conceptually there’s no difference, until I create the &lt;span class="code"&gt;Triangle&lt;/span&gt; subclass again:&lt;/p&gt;

&lt;div class="CodeRay"&gt;
  &lt;div class="code"&gt;&lt;pre&gt;
&lt;span class="r"&gt;class&lt;/span&gt; &lt;span class="cl"&gt;Triangle&lt;/span&gt; &amp;lt; &lt;span class="co"&gt;Polygon&lt;/span&gt;
  &lt;span class="iv"&gt;@sides&lt;/span&gt; = &lt;span class="i"&gt;3&lt;/span&gt;
&lt;span class="r"&gt;end&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;



&lt;p&gt;Now each class has its own copy of the value &lt;span class="code"&gt;@sides&lt;/span&gt;:&lt;/p&gt;

&lt;p&gt;&lt;img
src="http://patshaughnessy.net/assets/2012/12/17/polygon-and-triangle-instance.png"/&gt;&lt;/p&gt;
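
&lt;p&gt;Here’s a quick way to confirm that each class keeps its own value. Again this is just a sketch of mine: I build the subclasses with &lt;span class="code"&gt;Class.new(Polygon)&lt;/span&gt;, which should behave like the subclass definition above, and &lt;span class="code"&gt;Square&lt;/span&gt; is a hypothetical extra class that never sets &lt;span class="code"&gt;@sides&lt;/span&gt;:&lt;/p&gt;

```ruby
class Polygon
  @sides = 8
  def self.sides
    @sides
  end
end

# The block runs with self set to the new class, so @sides
# here belongs to Triangle and not to Polygon.
Triangle = Class.new(Polygon) { @sides = 3 }

# Hypothetical subclass that never assigns @sides.
Square = Class.new(Polygon)

puts Polygon.sides          # prints 8
puts Triangle.sides         # prints 3
puts Square.sides.inspect   # prints nil
```

&lt;p&gt;The &lt;span class="code"&gt;sides&lt;/span&gt; class method is inherited, but the value is not: a subclass that never sets &lt;span class="code"&gt;@sides&lt;/span&gt; gets &lt;span class="code"&gt;nil&lt;/span&gt; back.&lt;/p&gt;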

&lt;p&gt;Now let’s try the same thing in Smalltalk. In Smalltalk, to declare an instance
variable you send the &lt;span class="code"&gt;instanceVariableNames:&lt;/span&gt; message to a class:&lt;/p&gt;

&lt;div class="CodeRay"&gt;
  &lt;div class="code"&gt;&lt;pre&gt;&lt;div style="float: left; width: 350px;"&gt;Object &lt;span class="r"&gt;subclass:&lt;/span&gt; Polygon [
]

Polygon &lt;span class="r"&gt;instanceVariableNames:&lt;/span&gt; &lt;span class="pc"&gt;'Sides '&lt;/span&gt;!

Polygon extend [
  sides [ ^Sides ]
]&lt;/div&gt;&lt;div style="float: left;"&gt;&lt;span class="r"&gt;class&lt;/span&gt; &lt;span class="cl"&gt;Polygon&lt;/span&gt;
  &lt;span class="r"&gt;def&lt;/span&gt; &lt;span class="fu"&gt;sides&lt;/span&gt;
    &lt;span class="iv"&gt;@sides&lt;/span&gt;
  &lt;span class="r"&gt;end&lt;/span&gt;
&lt;span class="r"&gt;end&lt;/span&gt;&lt;/div&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;


&lt;p&gt;Here I’ve created a new class &lt;span class="code"&gt;Polygon&lt;/span&gt;, a subclass of &lt;span class="code"&gt;Object&lt;/span&gt;. Then I send
the &lt;span class="code"&gt;instanceVariableNames&lt;/span&gt; message to this new class, telling Smalltalk to
create a new instance variable called &lt;span class="code"&gt;Sides&lt;/span&gt;. Finally, I reopen the &lt;span class="code"&gt;Polygon&lt;/span&gt;
class and add the &lt;span class="code"&gt;sides&lt;/span&gt; method to it. Again I show the corresponding Ruby
code on the right.&lt;/p&gt;

&lt;p&gt;However, here &lt;span class="code"&gt;Sides&lt;/span&gt; and &lt;span class="code"&gt;@sides&lt;/span&gt; are instance variables of &lt;span class="code"&gt;Polygon&lt;/span&gt; objects,
and not of the &lt;span class="code"&gt;Polygon&lt;/span&gt; class. To create a class instance variable in Smalltalk,
you instead have to send the &lt;span class="code"&gt;class&lt;/span&gt; message to &lt;span class="code"&gt;Polygon&lt;/span&gt; first before calling
&lt;span class="code"&gt;instanceVariableNames&lt;/span&gt; or &lt;span class="code"&gt;extend&lt;/span&gt;, like this:&lt;/p&gt;

&lt;div class="CodeRay"&gt;
  &lt;div class="code"&gt;&lt;pre&gt;&lt;div style="float: left; width: 350px;"&gt;Object &lt;span class="r"&gt;subclass:&lt;/span&gt; Polygon [
]

Polygon class &lt;span class="r"&gt;instanceVariableNames:&lt;/span&gt; &lt;span class="pc"&gt;'Sides '&lt;/span&gt;!

Polygon class extend [
  sides [ ^Sides ]
]&lt;/div&gt;&lt;div style="float: left;"&gt;&lt;span class="r"&gt;class&lt;/span&gt; &lt;span class="cl"&gt;Polygon&lt;/span&gt;
  &lt;span class="r"&gt;def&lt;/span&gt; &lt;span class="pc"&gt;self&lt;/span&gt;.&lt;span class="fu"&gt;sides&lt;/span&gt;
    &lt;span class="iv"&gt;@sides&lt;/span&gt;
  &lt;span class="r"&gt;end&lt;/span&gt;
&lt;span class="r"&gt;end&lt;/span&gt;&lt;/div&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;


&lt;p&gt;Again, notice that the Smalltalk and Ruby code snippets are really just two
different ways of expressing the same commands. In Smalltalk you say &lt;span class="code"&gt;Polygon
class extend [ sides&amp;hellip;&lt;/span&gt; while in Ruby you say &lt;span class="code"&gt;class Polygon; def self.sides&lt;/span&gt;.
To me Ruby seems to be a more succinct version of Smalltalk.&lt;/p&gt;
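
&lt;p&gt;To make the difference between the two kinds of variable concrete, here is a
minimal Ruby sketch. The initial value of zero and the &lt;span class="code"&gt;initialize&lt;/span&gt; method are
assumptions added for illustration; they are not part of the snippets above. The
two &lt;span class="code"&gt;@sides&lt;/span&gt; variables share a name but live in different objects: one in the
&lt;span class="code"&gt;Polygon&lt;/span&gt; class object, the other in each &lt;span class="code"&gt;Polygon&lt;/span&gt; instance.&lt;/p&gt;

&lt;div class="CodeRay"&gt;
  &lt;div class="code"&gt;&lt;pre&gt;
```ruby
class Polygon
  # Class instance variable: stored on the Polygon class object itself.
  # (The starting value 0 is an assumption for this sketch.)
  @sides = 0

  # Class method: reads the class instance variable.
  def self.sides
    @sides
  end

  # Instance variable: stored separately on each Polygon object.
  def initialize(sides)
    @sides = sides
  end

  # Instance method: reads the object's own instance variable.
  def sides
    @sides
  end
end

triangle = Polygon.new(3)
puts triangle.sides   # 3, from the triangle object
puts Polygon.sides    # 0, from the Polygon class object
```
&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;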

&lt;h2&gt;Metaclasses in Smalltalk and Ruby&lt;/h2&gt;

&lt;div style="float: right; padding: 17px 0px 10px 30px;
line-height:16px"&gt;
  &lt;table cellpadding="0" cellspacing="0" border="0"&gt;
    &lt;tr&gt;&lt;td align="center"&gt;&lt;img
    src="http://patshaughnessy.net/assets/2012/12/17/metaphysics.png"&gt;&lt;/td&gt;&lt;/tr&gt;
    &lt;tr&gt;&lt;td align="center"&gt;&lt;i&gt;
This diagram, taken from Alan Kay’s fascinating article &lt;a
href="http://www.smalltalk.org/downloads/papers/SmalltalkHistoryHOPL.pdf"&gt;The
Early&lt;br/&gt;
History Of Smalltalk&lt;/a&gt;, resembles the class hierarchy Ruby
would&lt;br/&gt;
use 20 years later! 
    &lt;/i&gt;&lt;/td&gt;&lt;/tr&gt;
  &lt;/table&gt;
&lt;/div&gt;


&lt;p&gt;Let’s take another look at the line of code I used above to create an instance
variable in Smalltalk:&lt;/p&gt;

&lt;div class="CodeRay"&gt;
  &lt;div class="code"&gt;&lt;pre&gt;Polygon &lt;span class="r"&gt;instanceVariableNames:&lt;/span&gt; &lt;span class="pc"&gt;'Sides '&lt;/span&gt;!
&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;


&lt;p&gt;Translating from Smalltalk into English, this means:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Take the &lt;span class="code"&gt;Polygon&lt;/span&gt; class,&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;send it a message called &lt;span class="code"&gt;instanceVariableNames&lt;/span&gt;,&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;and pass the string &lt;span class="code"&gt;Sides&lt;/span&gt; as a parameter.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;Again, this is how you define instance variables in Smalltalk. That is, now
when I create instances of the &lt;span class="code"&gt;Polygon&lt;/span&gt; class, they will each have a &lt;span class="code"&gt;Sides&lt;/span&gt;
instance variable. Saying the same thing in a different way, to give all
polygon instances an instance variable, I call a method on the &lt;span class="code"&gt;Polygon&lt;/span&gt; class.&lt;/p&gt;

&lt;p&gt;As I explained above, to create a class instance variable in Smalltalk you have
to send the &lt;span class="code"&gt;class&lt;/span&gt; message first, like this:&lt;/p&gt;

&lt;div class="CodeRay"&gt;
  &lt;div class="code"&gt;&lt;pre&gt;Polygon class &lt;span class="r"&gt;instanceVariableNames:&lt;/span&gt; &lt;span class="pc"&gt;'Sides '&lt;/span&gt;!
&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;


&lt;p&gt;This code literally means: call the &lt;span class="code"&gt;instanceVariableNames&lt;/span&gt; method on the
&lt;span class="code"&gt;Polygon&lt;/span&gt; class’s class. Following the same pattern, the instance of the
&lt;span class="code"&gt;Polygon&lt;/span&gt; class’s class, that is the &lt;span class="code"&gt;Polygon&lt;/span&gt; class itself, now has a
&lt;span class="code"&gt;Sides&lt;/span&gt; instance variable: a class instance variable. But what is the “class of
the &lt;span class="code"&gt;Polygon&lt;/span&gt; class” in Smalltalk? What does this mean? Spending just a moment at
the GNU Smalltalk REPL we can find out:&lt;/p&gt;

&lt;div class="CodeRay"&gt;
  &lt;div class="code"&gt;&lt;pre&gt;$ gst
GNU Smalltalk ready

st&gt; Polygon printNl.
=&gt; Polygon

st&gt; Polygon class printNl.
=&gt; Polygon class
&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;


&lt;p&gt;I first display the &lt;span class="code"&gt;Polygon&lt;/span&gt; class object, and I get “Polygon”. Displaying the
class of the &lt;span class="code"&gt;Polygon&lt;/span&gt; class, I get “Polygon class.” But what type of object is
this? Let’s call &lt;span class="code"&gt;class&lt;/span&gt; on it:&lt;/p&gt;

&lt;div class="CodeRay"&gt;
  &lt;div class="code"&gt;&lt;pre&gt;st&gt; Polygon class class printNl.
=&gt; Metaclass
&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;


&lt;p&gt;Ah… so the class of a class is a &lt;span class="code"&gt;Metaclass&lt;/span&gt;. Above, when I called
&lt;span class="code"&gt;instanceVariableNames&lt;/span&gt; to create a class instance variable, I was actually
using the &lt;span class="code"&gt;Polygon&lt;/span&gt; metaclass, an instance of the &lt;span class="code"&gt;Metaclass&lt;/span&gt; class.&lt;/p&gt;

&lt;p&gt;Here’s a diagram showing how these classes are all related in Smalltalk:&lt;/p&gt;

&lt;p&gt;&lt;img
src="http://patshaughnessy.net/assets/2012/12/17/metaclasses-smalltalk.png"/&gt;&lt;/p&gt;

&lt;p&gt;By now, it should be no surprise if I tell you internally Ruby uses the same
model. Here’s how classes work inside of Ruby:&lt;/p&gt;

&lt;p&gt;&lt;img
src="http://patshaughnessy.net/assets/2012/12/17/metaclasses-ruby.png"/&gt;&lt;/p&gt;

&lt;p&gt;In Ruby whenever you create a class, Ruby internally creates a corresponding
new class called the “metaclass.” Unlike Smalltalk, Ruby doesn’t use this for
class instance variables, but only to keep track of class methods. Also, Ruby
doesn’t have a &lt;span class="code"&gt;Metaclass&lt;/span&gt; class, but instead all metaclasses are simply
instances of the &lt;span class="code"&gt;Class&lt;/span&gt; class.&lt;/p&gt;

&lt;p&gt;In Ruby the metaclass is a hidden, mysterious concept. Ruby silently creates it
without telling you and doesn’t expose the metaclass in the language directly.
In Smalltalk, however, the metaclasses are not hidden at all and instead play a
large role in the language. Creating a class instance variable, as I did above,
is just one example of using a metaclass in Smalltalk. Another good example is
the way you add class methods by calling &lt;span class="code"&gt;extend&lt;/span&gt;.&lt;/p&gt;

&lt;p&gt;When you ask for a class’s class in Ruby, you simply get &lt;span class="code"&gt;Class&lt;/span&gt;. Ruby doesn’t
tell you about the metaclass:&lt;/p&gt;

&lt;div class="CodeRay"&gt;
  &lt;div class="code"&gt;&lt;pre&gt;
$ irb
&amp;gt; class Polygon; end
&amp;gt; Polygon.class
Class
&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;



&lt;p&gt;To see a Ruby metaclass, you can use this trick instead:&lt;/p&gt;

&lt;div class="CodeRay"&gt;
  &lt;div class="code"&gt;&lt;pre&gt;
$ irb
&amp;gt; class Polygon
&amp;gt;   def self.metaclass
&amp;gt;     class &amp;lt;&amp;lt; self
&amp;gt;       self
&amp;gt;     end
&amp;gt;   end
&amp;gt; end
=&amp;gt; nil
&amp;gt; Polygon.metaclass
=&amp;gt; #&amp;lt;Class:Polygon&amp;gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;



&lt;p&gt;“#&amp;lt;Class:Polygon&amp;gt;” is the metaclass of &lt;span class="code"&gt;Polygon&lt;/span&gt;. This syntax means
the metaclass is “an instance of &lt;span class="code"&gt;Class&lt;/span&gt; for the &lt;span class="code"&gt;Polygon&lt;/span&gt; class,” or
the metaclass for &lt;span class="code"&gt;Polygon&lt;/span&gt;.&lt;/p&gt;
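
&lt;p&gt;On Ruby 1.9.2 and later, the built-in &lt;span class="code"&gt;singleton_class&lt;/span&gt; method returns the
same object without defining a helper. The sketch below assumes a &lt;span class="code"&gt;Polygon&lt;/span&gt;
class with the &lt;span class="code"&gt;sides&lt;/span&gt; class method from earlier, and suggests that the class
method is kept in the metaclass rather than in &lt;span class="code"&gt;Polygon&lt;/span&gt; itself:&lt;/p&gt;

&lt;div class="CodeRay"&gt;
  &lt;div class="code"&gt;&lt;pre&gt;
```ruby
class Polygon
  # A class method: Ruby keeps it in Polygon's metaclass.
  def self.sides
    @sides
  end
end

# Built-in accessor for the metaclass (Ruby 1.9.2 and later).
metaclass = Polygon.singleton_class

puts metaclass.class                             # Class
puts metaclass.instance_methods(false).inspect   # [:sides]
puts Polygon.instance_methods(false).inspect     # []
puts Polygon.singleton_methods.inspect           # [:sides]
```
&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;

&lt;p&gt;The metaclass is an ordinary instance of &lt;span class="code"&gt;Class&lt;/span&gt;, which matches the diagram
above: Ruby has no separate &lt;span class="code"&gt;Metaclass&lt;/span&gt; class.&lt;/p&gt;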

&lt;p&gt;&lt;/p&gt;


&lt;blockquote&gt;
* Quoted text and images from: The Early History Of Smalltalk, by Alan Kay, © 1993 Association for Computing Machinery
&lt;/blockquote&gt;

</content>
  </entry>
  <entry>
    <title>An Interview With Jim Weirich</title>
    <link href="http://patshaughnessy.net/2012/12/13/an-interview-with-jim-weirich" rel="alternate" />
    <id>http://patshaughnessy.net/2012/12/13/an-interview-with-jim-weirich</id>
    <published>2012-12-13T00:00:00Z</published>
    <updated>2012-12-13T00:00:00Z</updated>
    <category>ruby</category>
    <author>
      <name />
    </author>
    <summary type="html">&lt;p&gt;This appeared &lt;a href="http://rubysource.com/an-interview-with-jim-weirich/"&gt;on
RubySource.com&lt;/a&gt; last
week; just getting around to posting a link to the interview here also&amp;hellip;.&lt;/p&gt;

&lt;p&gt;&lt;/p&gt;




&lt;div style="float: left; padding: 7px 30px 10px 0px"&gt;
&lt;table cellpadding="0" cellspacing="0" border="0"&gt;
  &lt;tr&gt;&lt;td&gt;&lt;img src="http://patshaughnessy.net/assets/2012/12/13/jim.jpg"&gt;&lt;/td&gt;&lt;/tr&gt;
  &lt;tr&gt;&lt;td align="center"&gt;&lt;small&gt;&lt;i&gt;Jim Weirich is the Chief Scientist at &lt;a href="http://neo.com"&gt;Neo&lt;/a&gt;&lt;/i&gt;&lt;/small&gt;&lt;/td&gt;&lt;/tr&gt;

</summary>
    <content type="html">&lt;p&gt;This appeared &lt;a href="http://rubysource.com/an-interview-with-jim-weirich/"&gt;on
RubySource.com&lt;/a&gt; last
week; just getting around to posting a link to the interview here also&amp;hellip;.&lt;/p&gt;

&lt;p&gt;&lt;/p&gt;




&lt;div style="float: left; padding: 7px 30px 10px 0px"&gt;
&lt;table cellpadding="0" cellspacing="0" border="0"&gt;
  &lt;tr&gt;&lt;td&gt;&lt;img src="http://patshaughnessy.net/assets/2012/12/13/jim.jpg"&gt;&lt;/td&gt;&lt;/tr&gt;
  &lt;tr&gt;&lt;td align="center"&gt;&lt;small&gt;&lt;i&gt;Jim Weirich is the Chief Scientist at &lt;a href="http://neo.com"&gt;Neo&lt;/a&gt;&lt;/i&gt;&lt;/small&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;
&lt;/div&gt;


&lt;p&gt;I’ve been familiar with Jim Weirich’s name for a while; among other things he
wrote the “rake” tool which most of us use on a daily basis. Then I was lucky
enough to see Jim do &lt;a href="http://www.confreaks.com/videos/988-goruco2012-power-rake"&gt;a presentation at GoRuCo this
June&lt;/a&gt;, when he
explained some of the more advanced features of rake. I was immediately struck
by how Jim was able to explain a very complex topic in a natural,
straightforward way. Later this Fall at RubyConf I saw Jim give &lt;a href="http://www.confreaks.com/videos/1287-rubyconf2012-y-not-adventures-in-functional-programming"&gt;an amazing
keynote
address&lt;/a&gt;
that derived the Y-Combinator from first principles, explaining some of the
basic ideas behind Lambda Calculus along the way. This time he not only clearly
explained an even more difficult topic, but was able to make what would
normally be a dry, mathematical subject very entertaining.&lt;/p&gt;

&lt;p&gt;This month I was thrilled to have the chance to interview Jim for RubySource; it
was a great opportunity for me to learn more about him and how he approaches
public speaking. We also had a chance to talk about how he got started as a
computer programmer, why he learned Ruby, functional programming, Ruby’s
threading model and also his new
&lt;a href="https://github.com/jimweirich/rspec-given"&gt;RSpec-Given&lt;/a&gt; framework. I’ve typed
in the interesting parts of our conversation and &lt;a href="http://rubysource.com/an-interview-with-jim-weirich"&gt;posted them on RubySource.com&lt;/a&gt;.&lt;/p&gt;
</content>
  </entry>
  <entry>
    <title>A High Level Code Walk Through Ruby MRI</title>
    <link href="http://patshaughnessy.net/2012/12/3/a-high-level-code-walk-through-ruby-mri" rel="alternate" />
    <id>http://patshaughnessy.net/2012/12/3/a-high-level-code-walk-through-ruby-mri</id>
    <published>2012-12-03T00:00:00Z</published>
    <updated>2012-12-03T00:00:00Z</updated>
    <category>ruby</category>
    <author>
      <name />
    </author>
    <summary type="html">&lt;div style="float: left; padding: 17px 30px 10px 0px"&gt;
  &lt;table cellpadding="0" cellspacing="0" border="0"&gt;
    &lt;tr&gt;&lt;td align="center"&gt;&lt;img src="http://patshaughnessy.net/assets/2012/12/3/screencast.png"&gt;&lt;/td&gt;&lt;/tr&gt;
    &lt;tr&gt;&lt;td align="center"&gt;&lt;i&gt;&lt;a href="http://www.rubyinside.com/ruby-mri-code-walk-tour-6020.html"&gt;Watch it now on rubyinside.com!&lt;/a&gt;&lt;/i&gt;&lt;/td&gt;&lt;/tr&gt;
  &lt;/table&gt;
&lt;/div&gt;


&lt;p&gt;Have you ever thought about taking a look at Ruby’s internal C source code,&lt;/p&gt;
</summary>
    <content type="html">&lt;div style="float: left; padding: 17px 30px 10px 0px"&gt;
  &lt;table cellpadding="0" cellspacing="0" border="0"&gt;
    &lt;tr&gt;&lt;td align="center"&gt;&lt;img src="http://patshaughnessy.net/assets/2012/12/3/screencast.png"&gt;&lt;/td&gt;&lt;/tr&gt;
    &lt;tr&gt;&lt;td align="center"&gt;&lt;i&gt;&lt;a href="http://www.rubyinside.com/ruby-mri-code-walk-tour-6020.html"&gt;Watch it now on rubyinside.com!&lt;/a&gt;&lt;/i&gt;&lt;/td&gt;&lt;/tr&gt;
  &lt;/table&gt;
&lt;/div&gt;


&lt;p&gt;Have you ever thought about taking a look at Ruby’s internal C source code,
but weren’t quite sure where to get started? Well, this weekend &lt;a href="http://peterc.org"&gt;Peter Cooper&lt;/a&gt;
and I recorded a screencast that is just for you. We went on a tour through the
MRI source code tree, and recorded it in a 33 minute screencast called: &lt;a href="http://www.rubyinside.com/ruby-mri-code-walk-tour-6020.html"&gt;A High
Level Code Walk Through Ruby
MRI&lt;/a&gt;. We’ll give
you a “lay of the land,” a sense of what is where inside the MRI C source code
tree and suggest places to get started reading the C source code yourself.&lt;/p&gt;

&lt;p&gt;Special thanks to Peter for taking time out of his busy schedule to record and
edit the video, and for being such a big fan of my book, &lt;a href="http://patshaughnessy.net/ruby-under-a-microscope"&gt;Ruby Under a
Microscope&lt;/a&gt;! I’m sure you
all know Peter as the editor of &lt;a href="http://www.rubyinside.com"&gt;Ruby Inside&lt;/a&gt; and
&lt;a href="http://rubyweekly.com"&gt;Ruby Weekly&lt;/a&gt;, but he’s also recorded a number of great
screencasts. Just this June he produced the fantastic &lt;a href="https://cooperpress.com/19walkthrough"&gt;Ruby 1.9
Walkthrough&lt;/a&gt;, and earlier in February he
did the fun &lt;a href="http://rubyreloaded.com/trickshots/"&gt;Ruby Trick Shots&lt;/a&gt; piece. In
fact, Peter’s new Kickstarter project, &lt;a href="http://www.kickstarter.com/projects/1225193080/the-ruby-20-walkthrough"&gt;The Ruby 2.0
Walkthrough&lt;/a&gt;,
was backed just last week &amp;ndash; keep an eye out for that in the coming months!
This is the first screencast I’ve ever done; having the chance to do it with
Peter was a bit like stepping onto a tennis court for the first time and
getting a private lesson from John McEnroe &amp;ndash; or maybe Andy Murray would be a
better analogy.&lt;/p&gt;

&lt;p&gt;Let us know if you like this; if enough people are interested we may do some
more screencasts on Ruby internals, diving into more detail than we had time
for this week. Enjoy!&lt;/p&gt;
</content>
  </entry>
  <entry>
    <title>My eBook build process and some PDF, EPUB and MOBI tips</title>
    <link href="http://patshaughnessy.net/2012/11/27/my-ebook-build-process-and-some-pdf-epub-and-mobi-tips" rel="alternate" />
    <id>http://patshaughnessy.net/2012/11/27/my-ebook-build-process-and-some-pdf-epub-and-mobi-tips</id>
    <published>2012-11-27T00:00:00Z</published>
    <updated>2012-11-27T00:00:00Z</updated>
    <category>ruby</category>
    <author>
      <name />
    </author>
    <summary type="html">&lt;div style="float: left; padding: 17px 30px 10px 0px"&gt;
  &lt;table cellpadding="0" cellspacing="0" border="0"&gt;
    &lt;tr&gt;&lt;td align="center"&gt;&lt;img src="http://patshaughnessy.net/assets/2012/11/27/cover-ana.png"&gt;&lt;/td&gt;&lt;/tr&gt;
    &lt;tr&gt;&lt;td align="center"&gt;&lt;i&gt;&lt;a href="http://patshaughnessy.net/ruby-under-a-microscope"&gt;Ruby Under a Microscope&lt;/a&gt; is an illustrated guide to&lt;br/&gt; Ruby internals. No C programming required!&lt;/i&gt;&lt;/td&gt;&lt;/tr&gt;

</summary>
    <content type="html">&lt;div style="float: left; padding: 17px 30px 10px 0px"&gt;
  &lt;table cellpadding="0" cellspacing="0" border="0"&gt;
    &lt;tr&gt;&lt;td align="center"&gt;&lt;img src="http://patshaughnessy.net/assets/2012/11/27/cover-ana.png"&gt;&lt;/td&gt;&lt;/tr&gt;
    &lt;tr&gt;&lt;td align="center"&gt;&lt;i&gt;&lt;a href="http://patshaughnessy.net/ruby-under-a-microscope"&gt;Ruby Under a Microscope&lt;/a&gt; is an illustrated guide to&lt;br/&gt; Ruby internals. No C programming required!&lt;/i&gt;&lt;/td&gt;&lt;/tr&gt;
  &lt;/table&gt;
&lt;/div&gt;


&lt;p&gt;In case you missed it, last month I self-published an eBook about Ruby
internals called &lt;a href="http://patshaughnessy.net/ruby-under-a-microscope"&gt;Ruby Under a
Microscope&lt;/a&gt;. To date I’ve
sold over 600 copies &amp;ndash; thanks everyone! I’ve never written anything so
ambitious before, and I’m grateful for your support. I hope it’s been a fun
read, and that you come away with a better appreciation for the amazing work
Matz and the rest of the core team have done to create such a beautiful
language. It was certainly a lot of fun to write!&lt;/p&gt;

&lt;p&gt;In the spirit of “sharing is caring,” and to follow in the footsteps of some
Ruby self-publishers like Jesse Storimer (see &lt;a href="http://jstorimer.com/2012/04/20/4-months-of-ebook-sales.html"&gt;4 Months of ebook
Sales&lt;/a&gt; and
&lt;a href="http://jstorimer.com/2012/10/15/getting-others-to-sell-my-ebook.html"&gt;Lessons Learned Getting Other People to Sell My
Ebook&lt;/a&gt;)
and Avdi Grimm (see &lt;a href="http://devblog.avdi.org/2012/01/12/my-authoring-tools/"&gt;My authoring
tools&lt;/a&gt;) who have been
generous with their knowledge, I’d like to try to pass along whatever
information I can about how to publish an eBook. This post contains a high
level description of my build process for &lt;a href="http://patshaughnessy.net/ruby-under-a-microscope"&gt;Ruby Under a Microscope&lt;/a&gt;, and also a
few tips for dealing with the PDF, EPUB and MOBI file formats that I learned
along the way. If anyone would like actual code or more technical detail about
my build process, or if there’s any other way I can help you self-publish
something, please let me know!&lt;/p&gt;

&lt;h2&gt;My authoring tools&lt;/h2&gt;

&lt;p&gt;To write &lt;a href="http://patshaughnessy.net/ruby-under-a-microscope"&gt;Ruby Under a
Microscope&lt;/a&gt; I used Apple
Pages on a Mac laptop. While I love VIM for writing code, I find it easier and
more natural to use a traditional word processor to write English text. Also,
Apple’s spelling autocorrect feature works nicely; I prefer to use the
“Automatically use spell checker suggestions” option in Preferences &amp;ndash;&amp;gt;
Auto-Correction.&lt;/p&gt;

&lt;p&gt;As you might know, &lt;a href="http://patshaughnessy.net/ruby-under-a-microscope"&gt;Ruby Under a
Microscope&lt;/a&gt; contains a large
number of diagrams &amp;ndash; a picture is worth 1000 words, and often I find visual
aids are the only way to communicate the complex ideas, algorithms and data
structures Ruby uses internally. To draw these I used the
&lt;a href="http://www.omnigroup.com/products/omnigraffle/"&gt;Omnigraffle&lt;/a&gt; product from The
Omni Group. This is an amazing tool that allows you to produce professional
looking diagrams very quickly. It also integrates very nicely with Apple Pages;
I can copy/paste a diagram from Omnigraffle right into Pages and immediately
see something that will closely resemble my final document. As I’ll explain
below, I also ended up purchasing a license for Omnigraffle Pro, in order to be
able to export my diagrams as SVG vector image files, a feature the standard
version does not contain.&lt;/p&gt;

&lt;h2&gt;The Bookshop gem&lt;/h2&gt;

&lt;p&gt;Once I had the bulk of my writing done, I copied my text into a series of
HTML/ERB files and used a gem called
&lt;a href="https://github.com/blueheadpublishing/bookshop/"&gt;Bookshop&lt;/a&gt; to manage the
process of creating the PDF, EPUB and MOBI target files from them. Bookshop, in
turn, uses a tool called &lt;a href="http://www.princexml.com"&gt;PrinceXML&lt;/a&gt; to create the
PDF file:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2012/11/27/pdf-build.png"/&gt;&lt;/p&gt;

&lt;p&gt;As you can see, Bookshop is essentially a Ruby static site generator like
&lt;a href="http://jekyllrb.com"&gt;Jekyll&lt;/a&gt; or &lt;a href="http://middlemanapp.com"&gt;Middleman&lt;/a&gt;: you use
it to run a series of ERB transformations that produce static HTML and CSS.
Then Bookshop launches PrinceXML to convert the HTML to PDF. PrinceXML is not
cheap ($495 US), but has been worth every penny to me. It does a remarkable job
rendering the PDF, allowing you to use really any CSS styling/design code you
would like to. With PrinceXML, creating a beautiful PDF file is as simple
as &amp;ndash; or as hard as &amp;ndash; creating a beautiful web site.&lt;/p&gt;

&lt;p&gt;PrinceXML also supports a number of print-only CSS directives you may
not be familiar with from web development, such as the “page-break-before” property
or the @page directive. For example, I used this CSS code to indicate I wanted
page numbers in the lower left corner for left side pages:&lt;/p&gt;

&lt;div class="CodeRay"&gt;
  &lt;div class="code"&gt;&lt;pre&gt;
&lt;span class="di"&gt;@page&lt;/span&gt; &lt;span class="ps"&gt;:left&lt;/span&gt; {
  &lt;span class="di"&gt;@bottom-left&lt;/span&gt; {
    &lt;span class="ke"&gt;font&lt;/span&gt;: &lt;span class="fl"&gt;8pt&lt;/span&gt; &lt;span class="s"&gt;&lt;span class="dl"&gt;&amp;quot;&lt;/span&gt;&lt;span class="k"&gt;Helvetica Neue&lt;/span&gt;&lt;span class="dl"&gt;&amp;quot;&lt;/span&gt;&lt;/span&gt;, &lt;span class="vl"&gt;serif&lt;/span&gt;;
    &lt;span class="ke"&gt;content&lt;/span&gt;: counter(&lt;span class="vl"&gt;page&lt;/span&gt;);
    &lt;span class="ke"&gt;padding-top&lt;/span&gt;: &lt;span class="fl"&gt;2em&lt;/span&gt;;
    &lt;span class="ke"&gt;vertical-align&lt;/span&gt;: &lt;span class="vl"&gt;top&lt;/span&gt;;
    &lt;span class="ke"&gt;color&lt;/span&gt;: &lt;span class="cr"&gt;#880000&lt;/span&gt;;
  }
}
&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;



&lt;p&gt;One of the nicest features of Bookshop was that out of the box it provided a
set of CSS templates to get me started, containing example CSS
styles I could use for page numbers, the table of contents and more.&lt;/p&gt;

&lt;p&gt;Next, here’s how Bookshop creates the EPUB file:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2012/11/27/epub-build.png"/&gt;&lt;/p&gt;

&lt;p&gt;You can see here that the EPUB is essentially a ZIP file containing the static
HTML/CSS source code, along with a few mysterious XML files (content.opf,
toc.ncx, etc.) that contain the book’s table of contents, a list of your HTML
and other source files with their MIME types, and other metadata.
Bookshop also automatically runs the
&lt;a href="http://code.google.com/p/epubcheck/"&gt;EpubCheck&lt;/a&gt; utility on the finished EPUB
file to help you be sure you got everything correct in those XML files.&lt;/p&gt;

&lt;p&gt;If you’re really interested in learning the details of the EPUB format, this
fun, practical video by Paul Salvette might be worth watching:
&lt;a href="http://www.youtube.com/watch?v=RSc7XZzMkq0"&gt;http://www.youtube.com/watch?v=RSc7XZzMkq0&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Finally, Bookshop launches Amazon’s &lt;a href="http://www.amazon.com/gp/feature.html?ie=UTF8&amp;amp;docId=1000765211"&gt;KindleGen
utility&lt;/a&gt; to
generate the MOBI file from the EPUB file:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2012/11/27/mobi-build.png"/&gt;&lt;/p&gt;

&lt;h2&gt;Customizing Bookshop&lt;/h2&gt;

&lt;p&gt;To make my life easier, I ended up customizing Bookshop to support code syntax
highlighting (using &lt;a href="http://coderay.rubychan.de"&gt;Coderay&lt;/a&gt;) and to allow Markdown as the source file format
(using &lt;a href="https://github.com/vmg/redcarpet"&gt;Redcarpet&lt;/a&gt;). I also customized it to
be able to manage the large number of diagram image files I created in special
ways. If anyone is interested in these details, let me know. In a nutshell, I
used SVG to represent the diagrams for the PDF, but PNG for the diagrams in the
EPUB/MOBI files; using two different file formats for the same images required
some code changes to Bookshop.&lt;/p&gt;
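
&lt;p&gt;As a rough idea of what choosing an image format per target can look like, here
is a minimal Ruby sketch. The &lt;span class="code"&gt;diagram_path&lt;/span&gt; helper, the
&lt;span class="code"&gt;IMAGE_EXTENSIONS&lt;/span&gt; table and the directory layout are hypothetical names for
illustration only; this is not Bookshop’s API and not the actual change I made to it.&lt;/p&gt;

&lt;div class="CodeRay"&gt;
  &lt;div class="code"&gt;&lt;pre&gt;
```ruby
# Hypothetical mapping from build target to diagram file format:
# vector SVG for the PDF, bitmap PNG for EPUB and MOBI.
IMAGE_EXTENSIONS = { pdf: "svg", epub: "png", mobi: "png" }.freeze

# Hypothetical helper: return the path of a diagram for one build target.
# Raises KeyError for a target that is not in the table.
def diagram_path(name, target)
  extension = IMAGE_EXTENSIONS.fetch(target)
  "assets/images/#{name}.#{extension}"
end

puts diagram_path("ch01/process", :pdf)    # assets/images/ch01/process.svg
puts diagram_path("ch01/process", :epub)   # assets/images/ch01/process.png
```
&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;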

&lt;p&gt;I’ve also heard good things about the
&lt;a href="https://github.com/fnando/kitabu"&gt;Kitabu&lt;/a&gt; gem, which seems to function in
largely the same way as Bookshop. I haven’t tried it, but if I write another
eBook I’ll probably look at using Kitabu instead, since it already contains
support for code syntax highlighting and Markdown. It’s also encouraging that
Jesse Storimer, a prolific eBook author, is a maintainer on that project.&lt;/p&gt;

&lt;h2&gt;PDF tip: use vector image file formats&lt;/h2&gt;

&lt;p&gt;PDF is the Mercedes of eBook file formats. I’m guessing most people who
read technical eBooks use a PDF file on a laptop: it just works, it looks the
same on every platform, and we all have the confidence that the rendered
document will appear the same as the original document did. After spending
months crafting a PDF file, I’m amazed at what Adobe’s software can do &amp;ndash; why
read a book using any other file format?&lt;/p&gt;

&lt;p&gt;One important but easily overlooked property of the PDF file format is that it
is vector based. That means that, like SVG and related image file formats, all
of the text and other rendering information in a PDF document is saved as a
series of vector commands: “go here, draw this, move there, draw that, etc.”
But since Adobe’s PDF file format is more or less a black box, this doesn’t
normally matter: who cares how the PDF file format works internally?&lt;/p&gt;

&lt;p&gt;I certainly didn’t, until I tried to render some of the many diagrams in &lt;a href="http://patshaughnessy.net/ruby-under-a-microscope"&gt;Ruby
Under a Microscope&lt;/a&gt; in a PDF
file. For my first attempt, I rendered my diagrams as PNG images and then
included them in my HTML using &amp;lt;img&amp;gt; tags, as I normally do for blog posts.
PrinceXML then produced a PDF file that looked like this:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2012/11/27/blurry-text.png"/&gt;&lt;/p&gt;

&lt;p&gt;While this isn’t horrible, you can see the “YARV internal stack,”
“rb_control_frame_t” and other text in my diagram at the bottom is blurry and
somewhat disappointing to look at. The text above, which originally came from
the HTML source code of my book, looks fine. It almost looks as if you were
viewing the diagram through a thick piece of glass, while viewing the rest of
the document directly with the naked eye.&lt;/p&gt;

&lt;p&gt;The problem here is that the end user PDF viewer software, for me Apple’s
Preview app, has to resize the original PNG image for the diagram, scaling the
text as necessary along the way. This produces the blurriness. It’s also
possible the scaling is performed by PrinceXML when the PDF file is generated.&lt;/p&gt;

&lt;p&gt;It took me a few days of research, trial and error to stumble upon the
solution: since PDF is internally a vector based format, why not render my
diagrams using a vector based image file format? After paying some extra money
to upgrade to Omnigraffle Pro, I was able to use the “File &amp;ndash;&amp;gt; Export” command:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2012/11/27/file-export.png"/&gt;&lt;/p&gt;

&lt;p&gt;… and then select the “SVG vector drawing” file type (the standard version of
Omnigraffle doesn’t support this):&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2012/11/27/file-type.png"/&gt;&lt;/p&gt;

&lt;p&gt;Now the real magic happens: when I include an SVG image in my HTML using a
standard &amp;lt;img&amp;gt; tag, PrinceXML generates a PDF file that includes the vector
commands to render the diagram! In fact, I also noticed that PrinceXML includes
whatever font I used in the Omnigraffle diagram right inside the PDF file:&lt;/p&gt;

&lt;div class="CodeRay"&gt;
  &lt;div class="code"&gt;&lt;pre&gt;
prince: loading image: builds/pdf/assets/images/ch01/process.svg
prince: used font: Menlo, Bold
prince: used font: Menlo, Regular
&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;



&lt;p&gt;Here’s what the same diagram looks like in the final version of &lt;a href="http://patshaughnessy.net/ruby-under-a-microscope"&gt;Ruby Under a
Microscope&lt;/a&gt;:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://patshaughnessy.net/assets/2012/11/27/correct-text.png"/&gt;&lt;/p&gt;

&lt;p&gt;The reason this looks better is that the end user PDF viewer app, Apple Preview,
Adobe Reader, iBooks or whatever software your reader has, draws the diagram
text at the proper size at the moment the reader opens the page, using the font
that PrinceXML embedded into the PDF file.&lt;/p&gt;

&lt;h2&gt;EPUB tip: package many small HTML files, not one large one&lt;/h2&gt;

&lt;p&gt;While writing &lt;a href="http://patshaughnessy.net/ruby-under-a-microscope"&gt;Ruby Under a
Microscope&lt;/a&gt; I also became
familiar with the EPUB file format. It’s a useful file format because it’s
become an open standard shared by a large number of eBook reader devices. You
can read EPUBs on a high end device like an iPad, on a variety of Android
tablets, and also on cheaper eBook readers.&lt;/p&gt;

&lt;p&gt;An EPUB file is really just a ZIP file that contains a HTML/CSS web site, along
with a few confusing, bizarre XML files. In fact, in hindsight I’ve realized
that an EPUB file closely resembles a J2EE web application &amp;ndash; but without the
Java! You might remember J2EE from the late 1990s and early 2000s &amp;ndash; it was how we
created web sites before Rails came along in 2004 and made all our lives
easier.&lt;/p&gt;

&lt;p&gt;Unzipping my EPUB file, here’s what you’ll see:&lt;/p&gt;

&lt;div class="CodeRay"&gt;
  &lt;div class="code"&gt;&lt;pre&gt;
$ unzip ruby-under-a-microscope.epub 
Archive:  ruby-under-a-microscope.epub
 extracting: mimetype                
  inflating: META-INF/container.xml  
  inflating: OEBPS/assets/css/page-template.xpgt  
  inflating: OEBPS/assets/css/stylesheet.epub.css  
  inflating: OEBPS/assets/images/canvas-ana.png  
  inflating: OEBPS/assets/images/ch01/ast-compile1.png  
  inflating: OEBPS/assets/images/ch01/ast-compile2.png  
  inflating: OEBPS/assets/images/ch01/ast-compile3.png  
  inflating: OEBPS/assets/images/ch01/ast1.png  

… etc …

  inflating: OEBPS/ch04-part6.html   
  inflating: OEBPS/ch05-part1.html   
  inflating: OEBPS/ch05-part2.html   
  inflating: OEBPS/ch05-part3.html   
  inflating: OEBPS/ch05-part4.html   
  inflating: OEBPS/ch05-part5.html   
  inflating: OEBPS/ch05-part6.html   
  inflating: OEBPS/conclusion.html   
  inflating: OEBPS/content.opf       
  inflating: OEBPS/cover.html        
  inflating: OEBPS/preface.html      
  inflating: OEBPS/title.html        
  inflating: OEBPS/toc.html          
  inflating: OEBPS/toc.ncx          
&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;



&lt;p&gt;It really does look like a committee of Java programmers designed the EPUB
spec! We have something called “META-INF/container.xml,” a directory called
“OEBPS,” and there are lots of confusing XML files with names like content.opf
and toc.ncx. These all need to contain the proper information in precisely the
correct format. Fortunately, as I mentioned above, the Bookshop gem runs the
&lt;a href="http://code.google.com/p/epubcheck/"&gt;EpubCheck&lt;/a&gt; tool to validate that all the
information in these files is correct.&lt;/p&gt;

&lt;p&gt;One of the drawbacks of the Bookshop gem was that I had to write some Ruby code
myself to generate the table of contents information stored in the “toc.ncx”
file, and also the list of HTML and image files contained in the “content.opf”
file. In hindsight the Kitabu gem might have taken care of this detail for me.
Ideally the TOC info should be autogenerated from the &amp;lt;h1&amp;gt; or &amp;lt;h2&amp;gt;
tags I use in my text.&lt;/p&gt;
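
&lt;p&gt;Here’s a minimal sketch of that idea, an illustration rather than the code
used with Bookshop. It assumes the headings have already been pulled out of the
text, in reading order, into simple hashes; the &lt;span class="code"&gt;build_toc&lt;/span&gt;
name and the hash keys are hypothetical. It nests section headings under chapter
headings and numbers them sequentially, which is the information each navPoint
entry in “toc.ncx” needs: an id, a play order, a label and a target file.&lt;/p&gt;

&lt;div class="CodeRay"&gt;
  &lt;div class="code"&gt;&lt;pre&gt;
```ruby
# Hypothetical input: headings in reading order, as hashes with
# :level (1 for a chapter, 2 for a section), :title and :file.
# How the headings are extracted from the text is not shown here.
def build_toc(headings)
  toc = []
  headings.each_with_index do |heading, index|
    entry = {
      id: "navpoint-#{index + 1}",
      play_order: index + 1,   # sequential reading-order number
      title: heading[:title],
      src: heading[:file],
      children: []
    }
    if heading[:level] == 1 || toc.empty?
      toc.push(entry)                   # chapter-level entry
    else
      toc.last[:children].push(entry)   # nest the section under its chapter
    end
  end
  toc
end
```
&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;

&lt;p&gt;Writing the nested entries out as XML is then a matter of walking this
structure and emitting one navPoint per entry.&lt;/p&gt;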

&lt;p&gt;One other important detail here: you’ll notice my HTML content is split up into
a series of different HTML files: “ch04-part6,” “ch05-part1,” etc. Originally
Bookshop created a single, large HTML file containing the entire text of the
book and included that in the EPUB file. However, then &lt;a href="https://twitter.com/avdi/status/266323794812080128"&gt;I heard from Avdi
Grimm&lt;/a&gt; and other readers
that it was slow on some EPUB reader devices. The problem was that the reader
device would hang while opening up the large HTML file, trying to render hundreds
of pages all at once.&lt;/p&gt;

&lt;p&gt;With some help from &lt;a href="https://twitter.com/mcouk"&gt;Mike Cook&lt;/a&gt; I was able to fix
this by splitting my text up into smaller HTML files. I did this by further
customizing Bookshop, but I suspect Kitabu or other tools out there can do this
automatically for you.&lt;/p&gt;
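
&lt;p&gt;The splitting step might look roughly like the Ruby below. This is a sketch
under stated assumptions, not the actual Bookshop customization: it assumes the
book is already available as an array of chapters, each an array of section
bodies, and the &lt;span class="code"&gt;split_into_parts&lt;/span&gt; and
&lt;span class="code"&gt;write_parts&lt;/span&gt; helpers are hypothetical names. The
file names follow the “ch05-part1” pattern shown in the listing above.&lt;/p&gt;

&lt;div class="CodeRay"&gt;
  &lt;div class="code"&gt;&lt;pre&gt;
```ruby
# Hypothetical helper: give every section of every chapter its own file name.
# `chapters` is an array of chapters; each chapter is an array of section
# bodies (strings of HTML). This input shape is assumed for illustration.
def split_into_parts(chapters)
  chapters.each_with_index.flat_map do |sections, chapter_index|
    sections.each_with_index.map do |body, part_index|
      name = format("ch%02d-part%d.html", chapter_index + 1, part_index + 1)
      [name, body]
    end
  end.to_h
end

# Write each part to its own small file inside the given directory.
def write_parts(parts, dir)
  parts.each do |name, body|
    File.write(File.join(dir, name), body)
  end
end
```
&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;

&lt;p&gt;Each of the smaller files also has to be listed in “content.opf,” so the
same list of file names can feed that step.&lt;/p&gt;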

&lt;h2&gt;MOBI tip: use relative font sizes and Amazon CSS media types&lt;/h2&gt;

&lt;p&gt;Finally, to produce a book you can read on all of the different Kindle devices,
the most popular eBook reader in the U.S., you need to create a MOBI file.
Ironically, the MOBI format wasn’t even invented by Amazon, but now Amazon is
the only reason this format continues to be relevant. As I explained above,
Bookshop does this by running the &lt;a href="http://www.amazon.com/gp/feature.html?ie=UTF8&amp;amp;docId=1000765211"&gt;KindleGen
utility&lt;/a&gt; from
Amazon. I’ve heard rumors Amazon might directly support EPUB in the future,
which would make all of our lives easier.&lt;/p&gt;

&lt;p&gt;Be sure to download and use the &lt;a href="http://www.amazon.com/gp/feature.html?ie=UTF8&amp;amp;docId=1000765261"&gt;Kindle
Previewer&lt;/a&gt; app.
This allows you to see how your MOBI file will render on seven or more
different varieties of the Kindle. eBooks appear somewhat differently on each
version of the Kindle and without this app there’s no way to know what will
happen.&lt;/p&gt;

&lt;p&gt;Using the Kindle Previewer with &lt;a href="http://patshaughnessy.net/ruby-under-a-microscope"&gt;Ruby Under a
Microscope&lt;/a&gt; I ran into
trouble rendering fonts on some versions of the Kindle. Newer versions of
Kindle, like the Kindle Fire, worked well, while older Kindles had trouble when
I used certain fonts. More trial and error revealed that I could use only
relative font sizes in my CSS code: &lt;span class="code"&gt;font-size: 50%&lt;/span&gt;
instead of &lt;span class="code"&gt;font-size: 10pt&lt;/span&gt;, for example. I was also
able to distinguish between newer and older Kindles using this special CSS
media type directive invented by Amazon:&lt;/p&gt;

&lt;div class="CodeRay"&gt;
  &lt;div class="code"&gt;&lt;pre&gt;
&lt;span class="di"&gt;@media&lt;/span&gt;  &lt;span class="ty"&gt;amzn-mobi&lt;/span&gt;
{

&lt;span class="cl"&gt;.section-caption&lt;/span&gt; {
  &lt;span class="ke"&gt;text-align&lt;/span&gt;: &lt;span class="vl"&gt;left&lt;/span&gt;;
  &lt;span class="ke"&gt;font-style&lt;/span&gt;: &lt;span class="vl"&gt;italic&lt;/span&gt;;
  &lt;span class="ke"&gt;font-size&lt;/span&gt;: &lt;span class="fl"&gt;50%&lt;/span&gt;;
}

}

&lt;span class="di"&gt;@media&lt;/span&gt;  &lt;span class="ty"&gt;amzn-kf8&lt;/span&gt;
{

&lt;span class="cl"&gt;.section-caption&lt;/span&gt; {
  &lt;span class="ke"&gt;text-align&lt;/span&gt;: &lt;span class="vl"&gt;center&lt;/span&gt;;
  &lt;span class="ke"&gt;font-style&lt;/span&gt;: &lt;span class="vl"&gt;italic&lt;/span&gt;;
  &lt;span class="ke"&gt;font-size&lt;/span&gt;: &lt;span class="fl"&gt;75%&lt;/span&gt;;
}

}
&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;



&lt;h2&gt;Why go to all this trouble?&lt;/h2&gt;

&lt;p&gt;If you made it this far, then you must really be interested in self-publishing.
I have to admit that the technical work involved with producing an eBook myself
was far more tedious and difficult than I had ever thought it would be. I can
imagine that partnering with a publishing company would make this all a lot
easier &amp;ndash; you would probably be able to work within an established, proven
technical publishing pipeline. But one of the joys of self-publishing is being
able to have full control over the final product. There’s something thrilling
and fun about controlling every last detail yourself. I hope this post saves
you a few hours of frustration when it comes time to produce your eBook, or
that it provides you a bit of encouragement to take the first step!&lt;/p&gt;
</content>
  </entry>
</feed>
