<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0">
<channel>
  <title><![CDATA[ Max Chernyak]]></title>
  <link>https://max.engineer</link>
  <atom:link href="https://max.engineer/feed.rss" rel="self" type="application/rss+xml"/>
  <description><![CDATA[ The feed of updates to Max Chernyak ]]></description>
  <item>
    <title><![CDATA[ Don’t Build a General Purpose API (4 Years Later) ]]></title>
    <link>https://max.engineer/server-informed-ui-p2</link>
    <guid>https://max.engineer/server-informed-ui-p2</guid>
    <pubDate>Thu, 11 Dec 2025 00:00:00 -0500</pubDate>
    <dc:creator><![CDATA[ Max Chernyak ]]></dc:creator>
    <description><![CDATA[  

<p>In 2021 I wrote <a href="https://max.engineer/server-informed-ui">an article</a> encouraging people not to build general purpose APIs for their own front-ends. (You should probably read it before reading this one.) It <a href="https://hn.algolia.com/?dateRange=all&amp;page=0&amp;prefix=true&amp;query=story%3A37197257%2Cstory%3A28511570&amp;sort=byDate&amp;type=story">got featured</a> on Hacker News twice, albeit with a worse reception (and more heated discussion) the second time around. My guess is, more front-enders showed up. <span class="small-caps">😛</span></p>
<p>Having observed this approach for 6 years, I’ve only grown more confident in its success. Part of that confidence comes from seeing it play out. Our team vastly simplified maintenance, reduced bugs, and boosted performance by making the jump. (The jump was made possible by <a href="https://max.engineer/long-term-refactors">Long Term Refactors</a> btw.) Another reason for my confidence is the many comments I’ve received over the years confirming that it works for other teams too. Finally, I’ve received a number of challenges and condemnations, although they were either entirely theoretical, or based on a misunderstanding of the article. Since I consider it somewhat my fault for not being clear enough, I want to address these misunderstandings. So here they are, organized into topics, each one addressed in turn.</p>
<h2 id="you-reinvented-html">1. You reinvented <span class="small-caps">HTML</span>!</h2>
<p><a href="https://news.ycombinator.com/item?id=37197591">Multiple</a> <a href="https://news.ycombinator.com/item?id=37200603">readers</a> <a href="https://max.engineer/server-informed-ui#fast-comments-jt=vm2XeOGTDO2ti">have</a> informed me that I can apparently serve <span class="small-caps">HTML</span> directly from the server, instead of doing all that <span class="small-caps">JSON</span> payload nonsense. Having started building websites in 2003, I get it. However, the reality is that today we work with front-end teams, they work with React, and React comes with certain practices. More than a decade of them in fact, in addition to the baggage of pre-built components. That’s worth some respect. If a <span class="small-caps">JSON</span> payload is preferred, and I have no trouble delivering it, why would I mind? Personally, I much prefer the good old Ruby on Rails-esque stack with server-side <span class="small-caps">HTML</span> rendering, but we work in teams, and should probably play to each other’s strengths.</p>
<p>That said, the bigger issue with the<span class="push-double"></span> <span class="pull-double">“</span>you rediscovered <span class="small-caps">HTML</span><span class="push-double"></span><span class="pull-double">”</span> crowd is that they’ve misunderstood the advice. <span class="small-caps">HTML</span> involves content, structure, and style (esp.&nbsp;in the case of Tailwind) — which is basically everything you can serve, aside from assets. In our <span class="small-caps">JSON</span> world we only serve content, a faint hint of structure, and no style at all. That’s way higher level than <span class="small-caps">HTML</span>. It’s important to understand that I’m not asking the back-end to serve <code>JSON.dump(html)</code> to the front-end, only the gaps the page requires to be filled. The rest of the page’s static content should just be hardcoded on the front-end.</p>
<p>I do, however, need to correct one mistake. My old advice went:<span class="push-double"></span> <span class="pull-double">“</span>content and structure come from the back-end”, but I didn’t realize that folks would interpret it as literally as<span class="push-double"></span> <span class="pull-double">“</span>serialize the entire <span class="small-caps">HTML</span> into <span class="small-caps">JSON</span><span class="push-double"></span><span class="pull-double">”</span>. That’s not at all what I meant. You only need the bare minimum <span class="small-caps">JSON</span> structure to help the front-end engineer understand which values should go into which parts of the page. For example, if your <span class="small-caps">HTML</span> is something like this:</p>
<pre><code class="hljs html"><span class="hljs-tag">&lt;<span class="hljs-name">div</span>&gt;</span>
  <span class="hljs-tag">&lt;<span class="hljs-name">article</span>&gt;</span>
    <span class="hljs-tag">&lt;<span class="hljs-name">h1</span>&gt;</span>Title<span class="hljs-tag">&lt;/<span class="hljs-name">h1</span>&gt;</span>
    <span class="hljs-tag">&lt;<span class="hljs-name">p</span>&gt;</span>Body<span class="hljs-tag">&lt;/<span class="hljs-name">p</span>&gt;</span>
  <span class="hljs-tag">&lt;/<span class="hljs-name">article</span>&gt;</span>
<span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span></code></pre>
<p>You don’t need this: <code>{ "div": { "article": { "h1": "Title", "p": "Body" } } }</code>. You simply supply the keyword arguments to this otherwise hardcoded <code>ArticlePage</code><span class="push-single"></span><span class="pull-single">’</span>s constructor:</p>
<pre><code class="hljs json">{
  <span class="hljs-attr">"title"</span>: <span class="hljs-string">"Title"</span>,
  <span class="hljs-attr">"body"</span>: <span class="hljs-string">"Body"</span>
}</code></pre>
<p>If there are multiple articles, you make an array of these. As simple as that. Cater to the needs of the page.</p>
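<p>Here is a minimal Ruby sketch of what that could look like on the back-end. The <code>Article</code> struct and both method names are made up for illustration; the only point is that the payload carries the page’s constructor arguments and nothing else from the record.</p>

```ruby
require 'json'

# Hypothetical stand-in for a database record; the field names are assumptions.
Article = Struct.new(:id, :title, :body, :author_id, :created_at, keyword_init: true)

# Builds only what the hardcoded ArticlePage needs: a title and a body.
def article_page_payload(article)
  { title: article.title, body: article.body }
end

# For a page listing several articles, the payload is an array of the same shape.
def articles_page_payload(articles)
  articles.map { |article| article_page_payload(article) }
end

article = Article.new(id: 1, title: 'Title', body: 'Body', author_id: 7, created_at: '2025-12-11')
puts JSON.generate(article_page_payload(article))
# prints {"title":"Title","body":"Body"}
```

<p>Fields like <code>author_id</code> stay on the server unless this particular page needs them.</p>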
<h2 id="pages-will-load-slower-without-async">2. Pages will load slower without async!</h2>
<p><a href="https://news.ycombinator.com/item?id=28520608">There exists</a> a concern that with my approach you can’t load parts of pages asynchronously, and that this would result in bad performance. If you are a front-end engineer who has only ever worked with generic <span class="small-caps">API</span> endpoints, I get why you have this impression. You needed 10 endpoints to render a single page, you parallelized the requests, you saw that they can be flaky and slow at random, you prioritized some over others. Right? I’m sorry, but that pain was self-inflicted. If you asked a back-end engineer to give you all of that data in one bundle, the server would most likely produce it in under 30 milliseconds and deliver it in a single streamlined 200ms roundtrip, rather than juggling 10x200ms roundtrips competing with one another.</p>
<p>The whole idea of asynchronously loading pages is a truism of the front-end world that only feels correct in theory. In practice, it’s like hiring 10 trucks to deliver 10 <span class="small-caps">USB</span> drives in parallel, realizing how slow it is to manage 10 trucks, and concluding<span class="push-double"></span> <span class="pull-double">“</span>maybe I need some more trucks”. Async loading only makes sense when you are actually dealing with slow, heavy, or streaming data sources. If it’s your own back-end, attached to your own database, spitting out kilobytes of content, you are making things hundreds if not thousands of times slower by running parallel requests. On top of that, you’re making it harder for the back-end engineering team to bundle, optimize, and cache data, because it must be sent piecemeal.</p>
<p>Then there’s the reverse concern as exemplified by <a href="https://news.ycombinator.com/item?id=37198059">this whole thread</a>, and <a href="https://news.ycombinator.com/item?id=37203579">this comment</a>. It argues that you should be able to submit individual form fields to the server, rather than entire forms. Is that something you should do? The real answer here is: sure, if you want to. Nothing about my approach prevents you from submitting forms in any way you like. But just like in the situation above, performance isn’t really a good reason to split data (in most cases).</p>
<h2 id="this-makes-no-sense-in-a-single-page-application">3. This makes no sense in a Single-Page Application!</h2>
<p>Some <a href="https://news.ycombinator.com/item?id=37198059">folks</a> have <a href="https://max.engineer/server-informed-ui#fast-comments-jt=L1JAcxvov_MJ">struggled</a> to see how serving <em>pages</em> can work in the context of a <em>single-page</em> application. The answer is: just rename<span class="push-double"></span> <span class="pull-double">“</span>page” to<span class="push-double"></span> <span class="pull-double">“</span>screen” in your mind and you’re good to go. When transitioning between them, ship the next screen’s worth of data to the front-end in one bundle. If you need to fetch a specific screen’s sections individually, feel free to provide special endpoints for those. Although, in most real-world cases, reloading the whole screen’s worth of data only to swap out a single section would still be insanely fast and simple. Only watch out that you don’t provide raw data from the database. For example, if the front-end is showing a table of <code>items</code>, give it a <code>table_items</code> endpoint for this specific screen, where all the data is already pre-arranged for display in this
specific table. Don’t make the front-end do the extra legwork of accessing multiple endpoints to wire this table together.</p>
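<p>As a rough Ruby sketch of such a <code>table_items</code> payload (the records, the category lookup, and the method name are all hypothetical), the server does the joining and formatting, so the front-end only renders rows:</p>

```ruby
require 'json'

# Hypothetical raw records, as they might sit in the database.
ITEMS = [
  { id: 1, name: 'Widget', price_cents: 1999, category_id: 10 },
  { id: 2, name: 'Gadget', price_cents: 500, category_id: 20 }
].freeze

CATEGORIES = { 10 => 'Tools', 20 => 'Toys' }.freeze

# Pre-arranges rows for one specific table on one specific screen:
# category names are joined in and prices are formatted on the server.
def table_items_payload(items, categories)
  rows = items.map do |item|
    {
      name: item[:name],
      category: categories.fetch(item[:category_id]),
      price: format('$%.2f', item[:price_cents] / 100.0)
    }
  end
  { rows: rows }
end

puts JSON.pretty_generate(table_items_payload(ITEMS, CATEGORIES))
```

<p>Note that ids and foreign keys don’t leave the server here, because this imaginary table doesn’t display them.</p>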
<h2 id="why-not-use-graphql-or-an-aggregation-layer">4. Why not use GraphQL? Or an aggregation layer?</h2>
<p>A <a href="https://news.ycombinator.com/item?id=28520051">couple</a> of <a href="https://news.ycombinator.com/item?id=28529811">readers</a> have wondered: why not use an aggregation layer over a generic <span class="small-caps">API</span>, or switch to GraphQL.</p>
<p>Not gonna lie, an aggregation layer sounds absurd to me. It might be a symptom of front-end-centric thinking. We first build a generic <span class="small-caps">API</span>, thereby creating the problem. Then we build another layer that hides the problem without solving it. This leaves us with twice as much back-end code, double/triple the back-end complexity, and all the same performance issues, while requiring more time and effort. Why?</p>
<p>As far as GraphQL goes, the reason you might want to avoid it is that it comes with an enormous complexity and development style trade-off. GraphQL is a web of infinite possibilities that back-end developers must be able to manage and secure. Supporting such an infinite maze starts to make a lot of sense when you consider how many types of clients Facebook has. Their clients span from computers and phones to TVs, fridges, and washing machines. For them, producing an insanely flexible layer that allows thousands of unique clients to fetch unique sets of data is probably worth it. How many unique types of clients do you have? Let me guess: one website and one mobile app (for most of you). Maybe not even. Should you be supporting an infinitely flexible query language for this?</p>
<p>Remember the problem you’re trying to solve: providing the data your front-end needs to display your software. Just go ahead and provide what’s needed. Problem solved.</p>
<h2 id="you-took-away-flexibility-from-the-front-end">5. You took away flexibility from the front-end!</h2>
<p>Perhaps I didn’t communicate this clearly enough in the original article, but <a href="https://news.ycombinator.com/item?id=28520038">people</a> <a href="https://news.ycombinator.com/item?id=37201700">are</a> <a href="https://max.engineer/server-informed-ui#fast-comments-jt=reG9_68AsQSt">still</a> <a href="https://max.engineer/server-informed-ui#fast-comments-jt=suGe_FUFk2ve">telling</a> me that having a generic <span class="small-caps">API</span> enables front-end flexibility. They talk about losing the ability of the front-end to build new features and do redesigns without communicating with the back-end.</p>
<p>There is no world where the front-end can build new features or redesigns without involving the back-end. Yes, they can use old endpoints in new, unexpected ways. But then, the back-end will have to catch up to the mess of unpredictable patterns of server bombardment that the front-end introduced, lose all track of what endpoints got used for what purpose, and try to post-hoc optimize inefficiencies, creating a half-baked version of the solution I proposed in the original article.</p>
<p>The cost of this is that the back-end will indefinitely have to support all unexpected usage patterns, and be afraid to change or remove anything, because it’s very hard to trace exactly how and where the front-end relies on specific endpoints. You will need to deal with <span class="small-caps">API</span> versioning, and a forever growing (i.e.&nbsp;never-shrinking) codebase. To make matters worse, there isn’t even that much flexibility gained, because for redesigns and new features, the front-end is still at the mercy of what the back-end provides, and will still probably require new endpoints built for it. Meanwhile, old endpoints will never be removed<span class="push-double"></span> <span class="pull-double">“</span>just in case”.</p>
<p>On the other hand, you could vastly simplify your redesigns when the back-end serves the exact data needed for each page. You can redesign each page separately, and never wonder if something is being used in unexpected ways.</p>
<p>Bottom line is, there is no real flexibility to be gained from a naive <span class="small-caps">CRUD</span> <span class="small-caps">API</span> based on database records. The cost of making the back-end actually flexible for diverse use cases is much heavier than many teams realize. It only sounds nice in theory.</p>
<h2 id="what-do-i-put-into-the-payload">6. What do I put into the payload?</h2>
<p><a href="https://max.engineer/server-informed-ui#fast-comments-jt=L1JAcxvov_MJ">One</a> of the <a href="https://news.ycombinator.com/item?id=37201700">most</a> important <a href="https://news.ycombinator.com/item?id=28520355">questions</a> I keep <a href="https://news.ycombinator.com/item?id=37200262">getting</a> is exactly what data should be provided to the front-end, and how it should be structured and named. As mentioned before, some people erroneously think that I was suggesting sending essentially the entire <span class="small-caps">HTML</span> auto-translated into <span class="small-caps">JSON</span>. Others thought I was suggesting creating some sort of JSON-based UI-construction protocol. And yet others were just not sure what to do with state and static content that exists entirely on the front-end. Should it come from the back-end too?</p>
<p>The answer to all three is: you’re overthinking it. The front-end developer should hardcode a page as much as possible, and only leave gaps for what makes sense to come from the back-end. The data to fill those gaps should be arranged in a neat little <span class="small-caps">JSON</span> payload. End of story. Hardcode all static content on the front-end, keep all front-end state on the front-end, ship things that the back-end controls from the back-end.</p>
<p>Furthermore, do not try to make your <span class="small-caps">JSON</span> formalized and consistent across different pages. Each page should have its own custom code that takes <span class="small-caps">JSON</span> values and puts them in the right places in the right way for this one page. You may have certain very similar components on multiple pages that you might want to normalize arguments for, but don’t try to normalize overall page structure. It will only make things difficult. Just do what one specific page requires. Your <span class="small-caps">JSON</span> is nothing more than arguments into the constructor for this one page. A different page should have a different <span class="small-caps">JSON</span> structure. Any similarities between overall page structures should be accidental.</p>
<h2 id="crud-makes-back-end-easier-to-maintain">7. <span class="small-caps">CRUD</span> makes back-end easier to maintain!</h2>
<p><a href="https://news.ycombinator.com/item?id=37209686">Some</a> <a href="https://max.engineer/server-informed-ui#fast-comments-jt=suGe_FUFk2ve">comments</a> insist that <span class="small-caps">CRUD</span> is easier to build, document and test than custom pages. This is a misunderstanding of the term. <span class="small-caps">CRUD</span> isn’t supposed to let the front-end CREATE/READ/UPDATE/DELETE your database records, it’s meant for doing that with <em>resources</em>. Figuring out what those resources are is the key to architecting your web app correctly. They may occasionally map directly to database records, but more often they won’t.</p>
<p>A page is a resource with a <span class="small-caps">READ</span> endpoint. The CREATE/UPDATE endpoints are rarely useful for pages, but they are useful for more granular resources, like various items and relationships. Sometimes they map to <span class="small-caps">DB</span> records, but more often you need a resource at a higher level of abstraction.<span class="push-double"></span> <span class="pull-double">“</span>Form objects” can play that role. They can represent an end-user entity that<span class="push-double"></span> <span class="pull-double">“</span>changes together”, served as a single form on the front-end. Your controller would then take the data from this object, and transactionally commit it to however many records in the <span class="small-caps">DB</span> are backing it.</p>
<p>All that is to say, the idea that<span class="push-double"></span> <span class="pull-double">“</span><span class="small-caps">CRUD</span> is easier to build, document, and test” is really saying<span class="push-double"></span> <span class="pull-double">“</span>exposing my database records directly to the front-end is easier to build, document, and test”. By doing this, you skip the whole app, and expose low-level storage directly to your front-end. No wonder it’s easier. This shifts your entire app to the front-end, where composing data comes at a huge cost of network failure modes. In a way, I guess you did make development<span class="push-double"></span> <span class="pull-double">“</span>easier”, because the front-end often gets a pass for omitting tests and docs.</p>
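<p>To make the form object idea from above a bit more concrete, here is a plain-Ruby sketch. <code>FakeDb</code> is a made-up in-memory stand-in for a real database with transactions, and <code>SignupForm</code> is a hypothetical form that the front-end sees as one resource while two records back it.</p>

```ruby
# Hypothetical in-memory stand-in for a database with all-or-nothing writes.
class FakeDb
  attr_reader :tables

  def initialize
    @tables = { users: [], addresses: [] }
  end

  # Restores the previous state if the block raises, mimicking a transaction.
  def transaction
    snapshot = Marshal.load(Marshal.dump(@tables))
    yield
  rescue StandardError
    @tables = snapshot
    raise
  end

  def insert(table, row)
    @tables.fetch(table).push(row)
  end
end

# A form object: one resource for the front-end, two records underneath.
class SignupForm
  def initialize(db, name:, city:)
    @db = db
    @name = name
    @city = city
  end

  # Returns true when both records were written, false when nothing was.
  def save
    @db.transaction do
      @db.insert(:users, { name: @name })
      raise ArgumentError, 'city is required' if @city.to_s.empty?
      @db.insert(:addresses, { user_name: @name, city: @city })
    end
    true
  rescue ArgumentError
    false
  end
end

db = FakeDb.new
puts SignupForm.new(db, name: 'Ann', city: 'Oslo').save # prints true
puts SignupForm.new(db, name: 'Bob', city: '').save     # prints false
puts db.tables[:users].length                           # prints 1
```

<p>The second save fails halfway through, and the user row it had already inserted is rolled back with it. In a real app the database’s own transactions would do that job.</p>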
<h2 id="what-is-general-purpose-api">8. What is<span class="push-double"></span> <span class="pull-double">“</span>General Purpose API”?</h2>
<p><a href="https://max.engineer/server-informed-ui#fast-comments-jt=cizb0UAv2SsZ">Someone</a> asked me for a definition.</p>
<p>When I say<span class="push-double"></span> <span class="pull-double">“</span>don’t build a general purpose <span class="small-caps">API</span><span class="push-double"></span><span class="pull-double">”</span>, I mean an <span class="small-caps">API</span> for a wide variety of public use cases, available for use by your customers. I’m instead advocating for a <span class="small-caps">BFF</span> (back-end for front-end) <span class="small-caps">API</span>, where you only cater to your front-end team’s requirements, and don’t make this <span class="small-caps">API</span> generally available. If you actually do need a general <span class="small-caps">API</span>, build it separately from your <span class="small-caps">BFF</span> <span class="small-caps">API</span> to avoid clashing requirements, and unnecessary release management + documentation overhead.</p>
<h2 id="how-is-this-applicable-in-the-ai-era">9. How is this applicable in the <span class="small-caps">AI</span> era?</h2>
<p>This is a question I’m asking myself about all of my programming-related writing. Perhaps you can feed my articles into an <span class="small-caps">AI</span>, so it can follow my advice for you? Who am I kidding, they’ve already been ingested by LLMs and diluted in the ocean of other blog content. I don’t know. Be positive and have fun, I guess, everything is going to be okay.</p>  ]]></description>
  </item>
  <item>
    <title><![CDATA[ Failover to Human Intelligence ]]></title>
    <link>https://max.engineer/failover-to-hi</link>
    <guid>https://max.engineer/failover-to-hi</guid>
    <pubDate>Mon, 11 Aug 2025 00:00:00 -0400</pubDate>
    <dc:creator><![CDATA[ Max Chernyak ]]></dc:creator>
    <description><![CDATA[  

<p>There’s no denying that <span class="small-caps">AI</span> is getting very capable, but one thing keeps bothering me: what happens if something goes wrong?</p>
<p>Right now, self-driving cars still require human monitoring and intervention (outside of specially-designated areas). Isn’t this also true of a sufficiently complex system where you might need to intervene quickly in case <span class="small-caps">AI</span> fails to resolve an issue? Worth considering, right?</p>
<p>You might say — so what? AI-written code is arguably better (or will eventually be better), often with more comments and docs, so humans would understand it faster anyway. And that may be true, but with human-written code you can usually find a human who wrote it and ask them questions. If <span class="small-caps">AI</span> gets mixed up in too much context, and can neither fix nor successfully explain what’s going on, there might be nobody else familiar with the codebase. Are we saying that we should try to maintain some level of familiarity just in case?</p>
<p>You might say — well, we already have AIs capable of storing large permanent context, so it will know the codebase better than any human would. At some point <span class="small-caps">AI</span> will just become strictly better in every way. And that may be true, but I will keep asking the same question:</p>
<p>Can we forego human intervention? What if <span class="small-caps">AI</span> servers are down? Can we ever completely rely on a technology caring for itself?</p>
<p>If the answer is a<span class="push-double"></span> <span class="pull-double">“</span>no”, even a tiny<span class="push-double"></span> <span class="pull-double">“</span>no”, then doesn’t it kind of negate the entire<span class="push-double"></span> <span class="pull-double">“</span>full <span class="small-caps">AI</span> takeover” narrative? As we start to unpack this chain of thought back down from<span class="push-double"></span> <span class="pull-double">“</span><span class="small-caps">AI</span> perfection” to<span class="push-double"></span> <span class="pull-double">“</span>but human intervention might still be needed in rare cases”, aren’t we inevitably back at the question: What should we do to help humans intervene?</p>
<p>Once you start answering this question, you might find yourself back at square one:</p>
<ul>
<li>Humans will need to be reading and reviewing code at the very least.</li>
<li>Humans will need to maintain good understanding of the codebase.</li>
<li>The best way to learn is by doing. (I.e. writing the code.)</li>
<li>If humans are expected to jump in to fix something, they should probably have the final say in its implementation.</li>
</ul>
<p>It’s like one of those little<span class="push-double"></span> <span class="pull-double">“</span>snags” that you hit, which seems insignificant at first, but as you drill down on it, it may have way bigger implications than you realized.</p>
<p>You might say — well, most projects out there are not that critical. That’s true, but what if one of them grows into a bigger, more critical one later?</p>
<p>Even the tiniest possibility of human intervention leads me to think that it’s always going to be better for software developers to work together with <span class="small-caps">AI</span>, and never simply be replaced by it. Otherwise, failover is going to fail in a situation where it’s needed most.</p>
<p>Where am I wrong?</p>  ]]></description>
  </item>
  <item>
    <title><![CDATA[ Getting Answers from a Big PDF with RubyLLM ]]></title>
    <link>https://max.engineer/giant-pdf-llm</link>
    <guid>https://max.engineer/giant-pdf-llm</guid>
    <pubDate>Fri, 16 May 2025 00:00:00 -0400</pubDate>
    <dc:creator><![CDATA[ Max Chernyak ]]></dc:creator>
    <description><![CDATA[  

<p>Some <span class="small-caps">API</span> vendors give you an <span class="small-caps">API</span> doc in a giant custom-edited <span class="small-caps">PDF</span> file. In my case it’s &gt;1200 pages, with a<span class="push-double"></span> <span class="pull-double">“</span>helpful” table of contents that itself spans about 20 pages.</p>
<p>Well, I dislike reading giant <span class="small-caps">PDF</span> docs, love writing Ruby, and there’s an awesome <a href="https://github.com/crmne/ruby_llm">RubyLLM gem</a>, and Gemini supports <span class="small-caps">PDF</span> parsing, so maybe I can just throw together a quick <span class="small-caps">CLI</span> tool that can answer questions for me? Alas, Gemini is limited to 1000 pages. Either way it would probably be too wasteful to send the entire doc every time. RubyLLM supports <strong>tools</strong>, so I decided to try that out.</p>
<h2 id="reading-pdf-text-locally">Reading <span class="small-caps">PDF</span> Text Locally</h2>
<p>My doc is mostly text, and there aren’t any pics in there I care about, so this part is easy. A quick search later, there’s a gem called <a href="https://rubygems.org/gems/pdf-reader">pdf-reader</a>. Perfect for a tool.</p>
<p><code>bin/ask_api_doc</code></p>
<pre><code class="hljs ruby"><span class="hljs-meta">#!/usr/bin/env ruby</span>

<span class="hljs-keyword">require</span> <span class="hljs-string">'ruby_llm'</span>
<span class="hljs-keyword">require</span> <span class="hljs-string">'pdf-reader'</span>

<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">PdfPageReader</span> &lt; RubyLLM::Tool</span>
  DOC = PDF::Reader.new(<span class="hljs-string">'docs/big-doc.pdf'</span>)

  description <span class="hljs-string">'Read the text of any set of pages from the doc.'</span>
  param <span class="hljs-symbol">:page_numbers</span>,
    <span class="hljs-symbol">desc:</span> <span class="hljs-string">'Comma-separated page numbers (first page: 1). (e.g. "12, 14, 15")'</span>

  <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">execute</span><span class="hljs-params">(<span class="hljs-symbol">page_numbers:</span>)</span></span>
    puts <span class="hljs-string">"\n-- Reading pages: <span class="hljs-subst">#{page_numbers}</span>\n\n"</span>
    page_numbers = page_numbers.split(<span class="hljs-string">','</span>).map { _1.strip.to_i }
    pages = page_numbers.map { [_1, DOC.pages[_1.to_i - <span class="hljs-number">1</span>]] }
    {
      <span class="hljs-symbol">pages:</span> pages.map { <span class="hljs-params">|num, p|</span>
        <span class="hljs-comment"># There are lines drawn with dots in my doc.</span>
        <span class="hljs-comment"># So I squeeze them to save tokens.</span>
        { <span class="hljs-symbol">page:</span> num, <span class="hljs-symbol">text:</span> p&amp;.text&amp;.squeeze(<span class="hljs-string">'.'</span>) }
      }
    }
  <span class="hljs-keyword">rescue</span> =&gt; e
    { <span class="hljs-symbol">error:</span> e.message }
  <span class="hljs-keyword">end</span>
<span class="hljs-keyword">end</span></code></pre>
<p>Now my <span class="small-caps">LLM</span> can use the tool to extract text from any page.</p>
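<p>One thing to watch in the tool above: a page number of <code>0</code> (which is also what <code>to_i</code> returns for non-numeric text) turns into index <code>-1</code>, and Ruby arrays treat that as the last element. Below is a sketch of stricter parsing, tried in isolation. The <code>parse_page_numbers</code> helper and its <code>total_pages</code> argument are my assumptions here, not part of RubyLLM or pdf-reader.</p>

```ruby
# Parses a string like "12, 14, 15" into page numbers, keeping only
# values between 1 and total_pages (total_pages is an assumed argument).
def parse_page_numbers(input, total_pages:)
  input
    .split(',')
    .map { |part| part.strip.to_i }
    .select { |num| num.between?(1, total_pages) }
    .uniq
end

p parse_page_numbers('12, 14, 15', total_pages: 1249)         # prints [12, 14, 15]
p parse_page_numbers('0, 3, abc, 2000, 3', total_pages: 1249) # prints [3]
```

<p>Whether to silently drop bad numbers or report them back to the model as an error is a judgment call; dropping is just the shorter option to show.</p>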
<h2 id="and-were-basically-done">And We’re Basically Done</h2>
<p>Unlike <a href="https://imgur.com/super-easy-owl-drawing-tutorial-rCr9A"><span class="push-double"></span><span class="pull-double">“</span>draw the rest of the owl”</a>, the rest of the code is actually pretty straightforward (goes after the above):</p>
<pre><code class="hljs ruby"><span class="hljs-comment"># Grab key from my 1Password.</span>
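<span class="hljs-comment"># Backticks keep the command's trailing newline, so strip it.</span>
GEMINI_API_KEY = <span class="hljs-string">`op read "op://Private/Google Gemini API Personal/credential"`</span>.strip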

RubyLLM.configure <span class="hljs-keyword">do</span> <span class="hljs-params">|config|</span>
  config.gemini_api_key = GEMINI_API_KEY
<span class="hljs-keyword">end</span>

chat =
  RubyLLM
    .chat(<span class="hljs-symbol">model:</span> <span class="hljs-string">'gemini-2.5-pro-preview-03-25'</span>) <span class="hljs-comment"># Pick a model.</span>
    .with_tool(PdfPageReader.new) <span class="hljs-comment"># Add the tool.</span>
    .with_instructions(&lt;&lt;~TEXT) <span class="hljs-comment"># Add general instructions.</span>
      Use provided tool to find requested info in the multi-page doc. Ask for
      multiple pages at a time to avoid roundtrips.

      Respond only with results of your findings. Don't do ascii tables, I prefer
      text and bullet points.

      To find info, use table of contents. Make sure you scan the full table of
      contents before you give up. Don't go to irrelevant parts of the doc unless
      absolutely needed.

      Total number of pages: 1249
      Table of contents is on pages: 31-49
    TEXT

response = chat.ask(ARGV.join(<span class="hljs-string">' '</span>)) { <span class="hljs-params">|chunk|</span>
  print chunk.content
}

<span class="hljs-comment"># Some stats at the end</span>
puts <span class="hljs-string">"\n\n-----------\n"</span>
puts <span class="hljs-string">"Input tokens: <span class="hljs-subst">#{response.input_tokens}</span>"</span>
puts <span class="hljs-string">"Output tokens: <span class="hljs-subst">#{response.output_tokens}</span>"</span>
puts <span class="hljs-string">"Total tokens: <span class="hljs-subst">#{response.input_tokens.to_i + response.output_tokens.to_i}</span>"</span></code></pre>
<p>That’s it.</p>
<p>Now I can ask a question and sit back, watching the <span class="small-caps">LLM</span> scan the table of contents, read relevant pages, and spit out a tailored response. Pretty nice!</p>
<p>(Below is just sample output, not what’s really in my doc.)</p>
<pre><code class="hljs shell">❯ bin/ask_api_doc "what are all available statuses?"

-- Reading pages: 31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49

-- Reading pages: 1123

The available statuses are:
- `ACTIVE`: The default status for a new object.
- `INACTIVE`: The object is inactive and cannot be used.
- `PENDING`: The object is pending approval or activation.
- `ARCHIVED`: The object has been archived and is no longer active.
- `DELETED`: The object has been deleted and cannot be recovered.
- `SUSPENDED`: The object has been suspended and cannot be used.
- `EXPIRED`: The object has expired and is no longer valid.

-----------
Input tokens: 95288
Output tokens: 643
Total tokens: 95931</code></pre>
<p>I bet there are more involved<span class="push-double"></span> <span class="pull-double">“</span>talk to your docs” solutions out there, but this was quick and easy, and I can tweak it as needed. Speaking of which, let me know if you have any ideas for improving this.</p>
<hr>
<p><strong>Update (2025-05-26):</strong> Since I wrote this, I slightly extended it with a search tool based on <code>pdfgrep</code>:</p>
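<pre><code class="hljs ruby"><span class="hljs-keyword">require</span> <span class="hljs-string">'shellwords'</span> <span class="hljs-comment"># Provides String#shellescape, used below.</span>

<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">PdfPageSearch</span> &lt; RubyLLM::Tool</span>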
  DOC_PATH = <span class="hljs-string">'docs/big-doc.pdf'</span>
  description <span class="hljs-string">'Get page numbers by a PCRE regular expression.'</span>
  param <span class="hljs-symbol">:regex</span>, <span class="hljs-symbol">desc:</span> <span class="hljs-string">'PCRE Regular expression to search by, case insensitive.'</span>

  <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">execute</span><span class="hljs-params">(<span class="hljs-symbol">regex:</span>)</span></span>
    command = <span class="hljs-string">"pdfgrep --color never -inP <span class="hljs-subst">#{regex.shellescape}</span> <span class="hljs-subst">#{DOC_PATH}</span>"</span>
    puts <span class="hljs-string">"\n-- Running: <span class="hljs-subst">#{command}</span>\n\n"</span>
    output = <span class="hljs-string">`<span class="hljs-subst">#{command}</span>`</span>
    pages = output.split(<span class="hljs-string">"\n"</span>).map { _1.split(<span class="hljs-string">':'</span>).first.to_i }.uniq
    puts <span class="hljs-string">"\n-- Found results on: <span class="hljs-subst">#{pages.size}</span> page(s)\n\n"</span>
    { <span class="hljs-symbol">pages:</span> pages }
  <span class="hljs-keyword">rescue</span> =&gt; e
    { <span class="hljs-symbol">error:</span> e.message }
  <span class="hljs-keyword">end</span>
<span class="hljs-keyword">end</span></code></pre>
<p>and added it to RubyLLM like this:</p>
<pre><code class="hljs ruby">chat =
  RubyLLM
    .chat(<span class="hljs-symbol">model:</span> <span class="hljs-string">'gemini-2.5-pro-preview-03-25'</span>)
    .with_tool(PdfPageReader.new)
    .with_tool(PdfPageSearch.new) <span class="hljs-comment"># &lt;------- HERE</span>
    .with_instructions(&lt;&lt;~TEXT)
      ...
    TEXT</code></pre>
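<p>For the curious, the page extraction inside <code>execute</code> can be tried on its own. This is only a sketch: it assumes <code>pdfgrep -n</code> prefixes every match with its page number and a colon, and both the <code>SAMPLE_OUTPUT</code> text and the <code>pages_from</code> helper are made up for illustration.</p>
<pre><code class="hljs ruby"># Made-up sample of `pdfgrep -n` output (assumed format: "page:matched text").
SAMPLE_OUTPUT = "31:Available statuses\n31:ACTIVE is the default status\n1123:Status reference\n"

# Hypothetical helper: same parsing as in PdfPageSearch#execute.
def pages_from(output)
  output.split("\n").map { |line| line.split(':').first.to_i }.uniq
end

p pages_from(SAMPLE_OUTPUT) # prints [31, 1123]</code></pre>
<p>Several matches on the same page collapse into one page number, so the tool reports each page only once.</p>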
<p>I also switched from Google Gemini to OpenAI o3, and together these changes considerably improved the search performance.</p>  ]]></description>
  </item>
  <item>
    <title><![CDATA[ Long Term Refactors ]]></title>
    <link>https://max.engineer/long-term-refactors</link>
    <guid>https://max.engineer/long-term-refactors</guid>
    <pubDate>Mon, 20 Nov 2023 00:00:00 -0500</pubDate>
    <dc:creator><![CDATA[ Max Chernyak ]]></dc:creator>
    <description><![CDATA[  

<p>Big (or<span class="push-double"></span> <span class="pull-double">“</span>Long Term”) refactors are hard to pull off in a busy company. To succeed, we must:</p>
<ul>
<li>Convince business that it’s worth the delay.</li>
<li>Decide what features will have to wait.</li>
<li>Produce regular status updates and ETAs.</li>
<li>Justify the refactor as we go. Is it the right approach?</li>
<li>Keep ourselves from burning out.</li>
<li>Allow time for the team to digest and review the huge diff.</li>
<li>Fix a bombardment of <span class="small-caps">QA</span> issues.</li>
</ul>
<p>And we better do this all quickly, because god forbid original and refactored code coexist!</p>
<p>Is this really the only way? Feature freeze, a rush, a buggy rollout, and likely burnout?</p>
<h2 id="the-other-way">The Other Way</h2>
<p>I have a theory that long refactors get a bad rap because most of them take far longer than we expect. The length leads to stress, an awkward codebase, a confused team, and often no end in sight. Instead, what if we <em>prepared</em> an intentional long term refactor? A few years ago, I began trying this method, and it has led to some surprisingly successful results:</p>
<ul>
<li>We didn’t need to negotiate business timelines.</li>
<li>We didn’t need to compete against business priorities.</li>
<li>The team quickly understood and even took ownership of the refactor over time.</li>
<li>There was no increase in stress or risk of burnout.</li>
<li>PRs were easy to review, no huge diffs.</li>
<li>The refactor was consistently and collaboratively re-evaluated by the entire team.</li>
<li>We never wasted time refactoring code that didn’t need it.</li>
<li>Our feature development remained unblocked.</li>
<li>The team expanded their architectural knowledge.</li>
<li>The new engineers had a great source of first tasks.</li>
<li>We rolled out the refactor gradually, making it easier to <span class="small-caps">QA</span>, and reducing bugs.</li>
</ul>
<p>Long-term refactors involve the whole team from the beginning, which is one of their most powerful aspects. So far, I’ve participated in ~10 big refactors using this method across 2 companies with at least 3 different teams, and I’ve yet to see it go wrong. Here was our approach.</p>
<h2 id="prerequisites">Prerequisites</h2>
<p>To start, you should have the following:</p>
<ol type="1">
<li>An experienced software engineer with a vision for the refactor.</li>
<li>A team of software engineers at various levels of expertise.</li>
<li>An internal knowledge base. (Any of GitHub Wiki, Notion, Confluence, Markdown files, etc.)</li>
<li>Fewer than ~5-10 long term refactors already in progress, depending on their scope.</li>
</ol>
<h2 id="process">Process</h2>
<p>Almost every big refactor I’ve encountered follows a semi-consistent pattern. What makes a refactor big is the sheer number of times you must apply the pattern. In an ideal world, this labor is divided. Unfortunately, the refactor often requires case-by-case decision making. My proposed process is centered around explaining the refactoring idea to your colleagues, so that they can also make decisions.</p>
<p><span class="small-caps">NOTE</span>: The process is for the<span class="push-double"></span> <span class="pull-double">“</span>experienced engineer” from prerequisite #1.</p>
<ol type="1">
<li><strong>Identify code that should be refactored.</strong></li>
<li><strong>Identify the refactoring pattern.</strong><br>
Explore the codebase to identify a common pattern of required changes. A rough idea is fine for now. It’s okay to ignore special cases and focus on the commonalities.</li>
<li><strong>Implement an example of the refactor.</strong><br>
Find the smallest representative sample that you can apply your rough pattern to, and refactor it. This is where you want to be extremely diligent. Experiment and thoroughly refine your pattern. Don’t skimp on best practices. Follow <a href="https://max.engineer/reasons-to-leave-comment">4 reasons to leave a code comment</a>. Convey <a href="https://max.engineer/maintainable-code">how, what, and why</a>. Make it your best work, because it’s going to become the primary reference for the rest of your team. Submit a merge request.</li>
<li><strong>Prepare the codebase for the refactor.</strong><br>
Now that you’ve tried the refactor yourself, made decisions, and fought friction, you have an idea of what your colleagues are going to be dealing with. Use your experience to pave a smoother path for them. Go through the codebase and do minor preparations: reshuffle code, fill gaps, rename things for clarity, resolve ambiguities, create new dir structures. Keep your changes minor. Don’t refactor everything yourself. It’s critical that the bulk of the work is shared. Submit a merge request.</li>
<li><strong>Name your refactor.</strong><br>
Give your refactor a convenient name for use in discussions and docs. Make sure the name is concise, clear, and descriptive. For example,<span class="push-double"></span> <span class="pull-double">“</span>Remove dependency on [package X]”.</li>
<li><strong>Write up refactoring instructions.</strong><br>
Create a document in your internal knowledge base and title it with the name from step 5. Some tips:
<ul>
<li>State exactly what to do, and how to do it. Be brief and specific.
<ul>
<li><em>Do <span class="small-caps">NOT</span> tell stories</em>:<span class="push-double"></span> <span class="pull-double">“</span>Over the years we’ve realized that the method we’ve been using …”</li>
<li><em>Do list specific steps</em>:<span class="push-double"></span> <span class="pull-double">“</span>Find a class that has function X. Create new class named Y. Move function X into class Y.” The steps can’t all be plain of course, but challenge yourself to see how brief and specific you can make them.</li>
</ul></li>
<li>Link your example merge request from step 3. People should see the code before and after.</li>
<li>Finally, feel free to add some context at the end. Here, you’re welcome to provide the background, tell stories, link to relevant resources and discussions. That said, do hide this part under an expandable element, like <a href="https://developer.mozilla.org/en-US/docs/Web/HTML/Element/details">&lt;details&gt;</a>. We want the doc focused on the pattern itself, with an option to expand the context as needed.</li>
</ul></li>
<li><strong>Add this refactor to the list of long term refactors.</strong><br>
Make sure that there is a page in your knowledge base that lists all long term refactors. The document from step 6 should be added to that list.</li>
<li><strong>Introduce this refactor to your team.</strong><br>
Use either a written announcement or a meeting. Explain how to pick a chunk that needs refactoring, walk them through your example. Don’t forget to link the instructions you wrote in step 6. It’s very important that your engineering team is aware of every long term refactor currently in progress. That’s why you want to stick to just a few at a time, and properly introduce each one. Adding a new long term refactor should be a big deal.</li>
<li><strong>Assign refactoring tasks.</strong><br>
Your refactor is now ready to be done gradually over time, but I advise against creating all the tasks up front in the task tracker. That would destroy one of the main benefits: not wasting time on unimportant or soon-to-be-deleted code. Instead, create tasks as they naturally come up in planning.<span class="push-double"></span> <span class="pull-double">“</span>Hey, since you’re going to be changing that, maybe remove dependency on package X while you’re in there?” Moreover, I advise keeping the whole umbrella-refactor away from the task tracker, or at least from the areas where business can see it. A successful long term refactor should be tracked by engineers, not the company management. As long as it’s written in the knowledge base, and is always present in engineers’ minds, you should be good. It shouldn’t matter to the business whether the refactor is completed, or how long it takes.</li>
<li><strong>Stay aware of long term refactors.</strong><br>
Get every new engineer joining the team to read the list from step 7. Make sure you have this step in your onboarding process. This is also a great source of first tasks for them, to help them understand both the existing code, and the new direction. It’s easy to refer back to the list anytime (remember, the list must remain short), but engineers also tend to remind each other of these refactors when planning.</li>
<li><strong>Complete the refactor.</strong><br>
A long term refactor doesn’t need to be 100% completed. Instead, one day you will find that your doc is redundant, because the codebase already speaks for itself. If all the major parts are refactored, and there is no more confusion about your direction, feel free to mark it done. This creates space for the next one.</li>
</ol>
<p>Having followed this process carefully, I’ve seen something awesome happen. The team got into the habit of self-assigning refactors as needed. When they had questions, they’d initiate discussions and meetings. This got everyone on the same page around decisions that might’ve been controversial if made alone. With each completed refactor task, we’d all gain new examples to draw from in upcoming tasks.</p>
<p>Compare that to working on your own for weeks or months, and blindsiding your team with a huge diff.</p>
<h2 id="drawbacks">Drawbacks</h2>
<p>Here are some that I can think of.</p>
<ul>
<li>Albeit rare, some big refactors don’t have a common pattern. It’s possible that you’re actually dealing with multiple refactors that shouldn’t be under the same umbrella. Try to split them instead.</li>
<li>You need patience to get through these refactors. They can span a year, two years, who knows. During that time, the old and the new code will coexist, and might cause some confusion if the list from step 7 is not on everyone’s mind. I personally haven’t encountered this drawback in practice, because the process constantly keeps everyone on the same page. Due to organization and communication, nobody is confused about where we’re coming from, and where we are headed.</li>
<li>Certain parts of the code may never get refactored. There’s probably a good reason why. It could be that this part is easy to maintain as is, and doesn’t need to change. Or perhaps this code is on its way out. Think of it as a win — you saved time and didn’t introduce bugs unnecessarily.</li>
<li>If you like doing everything alone, this ain’t it. This approach is designed to get everyone on the same page. You will have to agree on solutions and articulate your reasoning. If you don’t like doing that, you won’t like long term refactors.</li>
</ul>
<p>Try it, let me know how it goes!</p>  ]]></description>
  </item>
  <item>
    <title><![CDATA[ 4 Reasons to Leave a Code Comment ]]></title>
    <link>https://max.engineer/reasons-to-leave-comment</link>
    <guid>https://max.engineer/reasons-to-leave-comment</guid>
    <pubDate>Thu, 20 Jul 2023 00:00:00 -0400</pubDate>
    <dc:creator><![CDATA[ Max Chernyak ]]></dc:creator>
    <description><![CDATA[  

<p>I originally wrote this list as a part of <a href="https://max.engineer/maintainable-code">Writing Maintainable Code is a Communication Skill</a>, then made a <a href="https://twitter.com/hakunin/status/1613955102401765376">tweet</a>. Since then, I’ve had to link this list multiple times. This post makes it easier to link.</p>
<h2 id="reasons">Reasons</h2>
<ol type="1">
<li><strong>An odd business requirement (share the origin story)</strong></li>
<li><strong>It took research (summarize with links)</strong></li>
<li><strong>Multiple options were considered (justify decision)</strong></li>
<li><strong>Question in a code review (answer in a comment)</strong></li>
</ol>
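<p>To make the list concrete, here is a sketch with one comment per reason. Everything in it is hypothetical: the business rule, the method names, and the review question are invented purely to show what each kind of comment might look like.</p>
<pre><code class="hljs ruby"># Hypothetical code: every rule and name below is invented for illustration.

# (1) Odd business requirement: share the origin story.
# Invented story: weekend orders skip the handling fee, because the warehouse
# partner batches them at no cost.
def handling_fee(weekday, base_fee)
  %i[saturday sunday].include?(weekday) ? 0 : base_fee
end

# (2) It took research: summarize the finding (and link the sources).
# Integer#round accepts a negative digit count, so -2 rounds to hundreds.
def round_to_hundreds(amount)
  amount.round(-2)
end

# (3) Multiple options were considered: justify the decision.
# A Hash lookup was chosen over a case statement, so that new tiers can be
# added as data.
DISCOUNTS = { basic: 0, plus: 5, pro: 10 }.freeze

# (4) Question in a code review: answer it in a comment.
# "Why fetch with a default?" Unknown tiers should get no discount rather
# than raise KeyError.
def discount_for(tier)
  DISCOUNTS.fetch(tier, 0)
end</code></pre>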
<p>Important caveat for number 4: if your code can be restructured in a way that answers the question without a comment, do that instead.</p>  ]]></description>
  </item>
  <item>
    <title><![CDATA[ Adventures in Ruby-esque type enforcement ]]></title>
    <link>https://max.engineer/portrayal-guards-poc</link>
    <guid>https://max.engineer/portrayal-guards-poc</guid>
    <pubDate>Sat, 13 May 2023 00:00:00 -0400</pubDate>
    <dc:creator><![CDATA[ Max Chernyak ]]></dc:creator>
    <description><![CDATA[  

<p>In Ruby you can kinda pretend that you have type enforcement at runtime, because Ruby is very flexible. This can be a useful way to organize and formalize the<span class="push-double"></span> <span class="pull-double">“</span>guarding” of your data. As a disclaimer, I’m not actually a huge fan of this practice, because I think that if you’re going to enforce types at runtime, you may as well achieve the same result by learning how to write good constructors and immutable objects. I believe the focus should be on controlling the flow of data from source to destination, not declaring types to guard against every generic use case. Nevertheless, for many existing codebases out there, runtime-level types might be the right way to improve maintainability, so I decided to experiment with my own approach.</p>
<p>Before I start, there are already libraries out there that let you declare types to be checked at runtime. They offer a bunch of fancy-named classes and methods that let you construct your own types. I disagree with their approach, because it introduces a lot of cognitive overhead. They expect me to learn an extensive vocabulary only to describe simple boolean expressions. Why not just let me write those boolean expressions in the first place? This is the whole premise of my experiment: it seems easier to write a plain Ruby value check than to figure out how to build it with fancy type libraries.</p>
<p>A while ago, I wrote a little library called <a href="https://github.com/maxim/portrayal">portrayal</a>, which is a simple Struct-like object builder. It lets you declare keywords, which are just <code>attr_accessor</code>s and a default <code>initialize</code>, plus some extra convenience. Using this lib as the basis, I wrote a proof of concept extension called <code>Portrayal::Guards</code>. In this article I show you how it works.</p>
<h2 id="leaning-into-boolean-expressions">Leaning into boolean expressions</h2>
<p>Let’s say we have a <code>class Person</code>, which has <code>age</code> and <code>favorite_beer</code>.</p>
<pre><code class="hljs ruby"><span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">Person</span></span>
  extend Portrayal

  keyword <span class="hljs-symbol">:age</span>
  keyword <span class="hljs-symbol">:favorite_beer</span>, <span class="hljs-symbol">default:</span> <span class="hljs-literal">nil</span>
  public <span class="hljs-symbol">:age=</span>, <span class="hljs-symbol">:favorite_beer=</span>
<span class="hljs-keyword">end</span></code></pre>
<p>Note: Normally setters are protected, but I’m making them public above to illustrate how guards work.</p>
<p>Imagine that our data type requirements are as follows:</p>
<ul>
<li>Age must be an integer between 0 and 130</li>
<li>Favorite beer must be nil or any string</li>
<li>If favorite beer is not nil, then age must be &gt;=21</li>
</ul>
<p>Here is one simple way to do this with <code>Portrayal::Guards</code>.</p>
<pre><code class="hljs ruby"><span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">Person</span></span>
  extend Portrayal

  keyword <span class="hljs-symbol">:age</span>
  keyword <span class="hljs-symbol">:favorite_beer</span>, <span class="hljs-symbol">default:</span> <span class="hljs-literal">nil</span>
  public <span class="hljs-symbol">:age=</span>, <span class="hljs-symbol">:favorite_beer=</span>

  guard(<span class="hljs-string">'age must be human and beer is only for &gt;=21yo'</span>) {
    age.is_a?(Integer) &amp;&amp; (<span class="hljs-number">0</span>..<span class="hljs-number">130</span>).cover?(age) &amp;&amp;
      (favorite_beer.<span class="hljs-literal">nil</span>? <span class="hljs-params">||</span> (favorite_beer.is_a?(String) &amp;&amp; age &gt;= <span class="hljs-number">21</span>))
  }
<span class="hljs-keyword">end</span></code></pre>
<p>This guard can be declared <em>anywhere in the class body</em>. It has a single boolean expression in it. If it returns anything truthy, the guard passes. If it returns <code>false</code> or <code>nil</code>, the guard fails. The string argument serves as the error message in case it fails. With this single guard we actually solved the whole problem.</p>
<p>Check out how this guard protects our object:</p>
<pre><code class="hljs irb"><span class="hljs-comment"># Trying to init a person with invalid age</span>
&gt; Person.new(<span class="hljs-symbol">age:</span> <span class="hljs-number">200</span>)
<span class="hljs-symbol">ArgumentError:</span> age must be human <span class="hljs-keyword">and</span> beer is only <span class="hljs-keyword">for</span> &gt;=21yo

<span class="hljs-comment"># Making a valid person</span>
&gt; person = Person.new(<span class="hljs-symbol">age:</span> <span class="hljs-number">5</span>)
=&gt; #&lt;Person @age=5, @favorite_beer=nil&gt;

<span class="hljs-comment"># Trying to assign a beer to &lt;21yo with a setter method</span>
&gt; person.favorite_beer = <span class="hljs-string">'corona'</span>
<span class="hljs-symbol">ArgumentError:</span> age must be human <span class="hljs-keyword">and</span> beer is only <span class="hljs-keyword">for</span> &gt;=21yo

<span class="hljs-comment"># Method `update` lets you apply multiple changes at once, in this case invalid</span>
&gt; person.update(<span class="hljs-symbol">age:</span> <span class="hljs-number">200</span>, <span class="hljs-symbol">favorite_beer:</span> <span class="hljs-number">9</span>)
=&gt; {<span class="hljs-symbol">:base=&gt;</span>[<span class="hljs-string">"age must be human and beer is only for &gt;=21yo"</span>]}

<span class="hljs-comment"># Valid `update`</span>
&gt; person.update(<span class="hljs-symbol">age:</span> <span class="hljs-number">30</span>, <span class="hljs-symbol">favorite_beer:</span> <span class="hljs-string">'corona'</span>)
=&gt; nil

&gt; person
=&gt; #&lt;Person @age=30, @favorite_beer="corona"&gt;</code></pre>
<p>Three things to notice here:</p>
<ol type="1">
<li>This guard is guarding both <code>initialize</code> (<code>.new</code>), and writer methods.</li>
<li>We have a special method <code>update</code>, which lets you update multiple values at the same time. This helps resolve situations when you can’t assign attributes one at a time, because guards cross-check them.</li>
<li>Notice that the error we got from <code>update</code> is under a key <code>:base</code>. Keep it in mind for now, I will explain this later.</li>
</ol>
<p>This was easy: it’s just a plain boolean expression that now completely guards our attributes. However, the expression is a little bit unwieldy, and the error message is not super useful for telling us what exactly is wrong. That’s okay. We can rewrite the guard into 3 separate guards.</p>
<pre><code class="hljs ruby">guard(<span class="hljs-string">'age must be an integer in human range'</span>) {
  age.is_a?(Integer) &amp;&amp; (<span class="hljs-number">0</span>..<span class="hljs-number">130</span>).cover?(age)
}

guard(<span class="hljs-string">'favorite_beer must be string or nil'</span>) {
  favorite_beer.<span class="hljs-literal">nil</span>? <span class="hljs-params">||</span> favorite_beer.is_a?(String)
}

guard(<span class="hljs-string">'favorite_beer is only allowed for age &gt;=21'</span>) {
  favorite_beer.<span class="hljs-literal">nil</span>? <span class="hljs-params">||</span> age &gt;= <span class="hljs-number">21</span>
}</code></pre>
<p>Much neater. Let’s try running the same code:</p>
<pre><code class="hljs irb">&gt; Person.new(<span class="hljs-symbol">age:</span> <span class="hljs-number">200</span>)
<span class="hljs-symbol">ArgumentError:</span> age must be an integer <span class="hljs-keyword">in</span> human range

&gt; person = Person.new(<span class="hljs-symbol">age:</span> <span class="hljs-number">5</span>)
=&gt; #&lt;Person @age=5, @favorite_beer=nil&gt;

&gt; person.favorite_beer = <span class="hljs-string">'corona'</span>
<span class="hljs-symbol">ArgumentError:</span> favorite_beer is only allowed <span class="hljs-keyword">for</span> age &gt;=<span class="hljs-number">21</span>

&gt; person.update(<span class="hljs-symbol">age:</span> <span class="hljs-number">200</span>, <span class="hljs-symbol">favorite_beer:</span> <span class="hljs-number">9</span>)
=&gt; {<span class="hljs-symbol">:base=&gt;</span>[<span class="hljs-string">"age must be an integer in human range"</span>, <span class="hljs-string">"favorite_beer must be string or nil"</span>]}

&gt; person.update(<span class="hljs-symbol">age:</span> <span class="hljs-number">30</span>, <span class="hljs-symbol">favorite_beer:</span> <span class="hljs-string">'corona'</span>)
=&gt; nil

&gt; person
=&gt; #&lt;Person @age=30, @favorite_beer="corona"&gt;</code></pre>
<p>Nice, error messages are now more specific.</p>
<p>Just to recap, with <code>guard</code> and plain Ruby we can accomplish… everything.</p>
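<p>If you’re wondering how little machinery an idea like this needs, here is a toy sketch in plain Ruby. To be clear, this is not how <code>Portrayal::Guards</code> is implemented: <code>TinyGuards</code>, <code>ToyPerson</code>, and <code>guard_errors</code> are made-up names, and the sketch only guards <code>initialize</code>, not writers or <code>update</code>.</p>
<pre><code class="hljs ruby"># Toy stand-in for the guard idea. NOT the real Portrayal::Guards code.
module TinyGuards
  def guards
    @guards ||= []
  end

  # Remember an error message together with its boolean block.
  def guard(message, &amp;check)
    guards.push([message, check])
  end
end

class ToyPerson
  extend TinyGuards

  attr_reader :age, :favorite_beer

  guard('age must be an integer in human range') {
    age.is_a?(Integer) &amp;&amp; (0..130).cover?(age)
  }

  guard('favorite_beer is only allowed for age &gt;=21') {
    favorite_beer.nil? || age &gt;= 21
  }

  def initialize(age:, favorite_beer: nil)
    @age = age
    @favorite_beer = favorite_beer
    errors = guard_errors
    raise ArgumentError, errors.join(', ') if errors.any?
  end

  # Messages of all guards whose block returned false or nil.
  def guard_errors
    self.class.guards.reject { |_, check| instance_exec(&amp;check) }.map(&amp;:first)
  end
end

ToyPerson.new(age: 30, favorite_beer: 'corona') # passes both guards
ToyPerson.new(age: 5)                           # passes both guards</code></pre>
<p>Each guard block runs in the context of the instance via <code>instance_exec</code>, which is why it can call <code>age</code> and <code>favorite_beer</code> directly. Calling <code>ToyPerson.new(age: 200)</code> raises an <code>ArgumentError</code> with the first guard’s message.</p>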
<h2 id="but-what-about-reuse">But what about reuse?</h2>
<p>Ah. Reuse is already here by default. We can have a module like this.</p>
<pre><code class="hljs ruby"><span class="hljs-class"><span class="hljs-keyword">module</span> <span class="hljs-title">ReusableTypes</span></span>
  <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">int</span><span class="hljs-params">(name)</span></span>
    guard(<span class="hljs-string">"<span class="hljs-subst">#{name}</span> must be an integer"</span>) { send(name).is_a?(Integer) }
  <span class="hljs-keyword">end</span>

  <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">age</span><span class="hljs-params">(name)</span></span>
    int(name)
    guard(<span class="hljs-string">"<span class="hljs-subst">#{name}</span> must be within 0-130"</span>) { (<span class="hljs-number">0</span>..<span class="hljs-number">130</span>).cover?(send(name)) }
  <span class="hljs-keyword">end</span>

  <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">nullable_string</span><span class="hljs-params">(name)</span></span>
    guard(<span class="hljs-string">"<span class="hljs-subst">#{name}</span> must be nil or a string"</span>) {
      value = send(name)
      value.<span class="hljs-literal">nil</span>? <span class="hljs-params">||</span> value.is_a?(String)
    }
  <span class="hljs-keyword">end</span>
<span class="hljs-keyword">end</span></code></pre>
<pre><code class="hljs ruby"><span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">Person</span></span>
  extend Portrayal
  extend ReusableTypes

  keyword <span class="hljs-symbol">:age</span>
  keyword <span class="hljs-symbol">:favorite_beer</span>, <span class="hljs-symbol">default:</span> <span class="hljs-literal">nil</span>
  public <span class="hljs-symbol">:age=</span>, <span class="hljs-symbol">:favorite_beer=</span>

  <span class="hljs-comment"># Calling the guards!</span>
  age <span class="hljs-symbol">:age</span>
  nullable_string <span class="hljs-symbol">:favorite_beer</span>

  guard(<span class="hljs-string">'favorite_beer is only allowed for age &gt;=21'</span>) {
    favorite_beer.<span class="hljs-literal">nil</span>? <span class="hljs-params">||</span> age &gt;= <span class="hljs-number">21</span>
  }
<span class="hljs-keyword">end</span></code></pre>
<p>We put guards in module methods and call them. Nothing really changed, but we suddenly have reusable types.</p>
<p>In Ruby it’s a common tradition to return the name of what’s being declared. Portrayal’s <code>keyword</code> follows this tradition, returning the name of the keyword. If you’d like, you can put our type methods in front of <code>keyword</code>, and it works the same.</p>
<pre><code class="hljs ruby"><span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">Person</span></span>
  extend Portrayal
  extend ReusableTypes

  <span class="hljs-comment"># Calling the guards inline with keywords!</span>
  age keyword <span class="hljs-symbol">:age</span>
  nullable_string keyword <span class="hljs-symbol">:favorite_beer</span>, <span class="hljs-symbol">default:</span> <span class="hljs-literal">nil</span>
  public <span class="hljs-symbol">:age=</span>, <span class="hljs-symbol">:favorite_beer=</span>

  guard(<span class="hljs-string">'favorite_beer is only allowed for age &gt;=21'</span>) {
    favorite_beer.<span class="hljs-literal">nil</span>? <span class="hljs-params">||</span> age &gt;= <span class="hljs-number">21</span>
  }
<span class="hljs-keyword">end</span></code></pre>
<p>If you don’t like the above style, you could do something else. For example, you could return <code>name</code> from methods in our module, and wrap the keyword names in them. Let’s also capitalize method names while at it:</p>
<pre><code class="hljs ruby"><span class="hljs-class"><span class="hljs-keyword">module</span> <span class="hljs-title">ReusableTypes</span></span>
  <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">Int</span><span class="hljs-params">(name)</span></span>
    guard(<span class="hljs-string">"<span class="hljs-subst">#{name}</span> must be an integer"</span>) { send(name).is_a?(Integer) }
    name
  <span class="hljs-keyword">end</span>

  <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">Age</span><span class="hljs-params">(name)</span></span>
    Int(name)
    guard(<span class="hljs-string">"<span class="hljs-subst">#{name}</span> must be within 0-130"</span>) { (<span class="hljs-number">0</span>..<span class="hljs-number">130</span>).cover?(send(name)) }
    name
  <span class="hljs-keyword">end</span>

  <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">NullableString</span><span class="hljs-params">(name)</span></span>
    guard(<span class="hljs-string">"<span class="hljs-subst">#{name}</span> must be nil or a string"</span>) {
      value = send(name)
      value.<span class="hljs-literal">nil</span>? <span class="hljs-params">||</span> value.is_a?(String)
    }
    name
  <span class="hljs-keyword">end</span>
<span class="hljs-keyword">end</span></code></pre>
<p>Which makes this possible:</p>
<pre><code class="hljs ruby"><span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">Person</span></span>
  extend Portrayal
  extend ReusableTypes

  keyword Age(<span class="hljs-symbol">:age</span>)
  keyword NullableString(<span class="hljs-symbol">:favorite_beer</span>), <span class="hljs-symbol">default:</span> <span class="hljs-literal">nil</span>
  public <span class="hljs-symbol">:age=</span>, <span class="hljs-symbol">:favorite_beer=</span>

  guard(<span class="hljs-string">'favorite_beer is only allowed for age &gt;=21'</span>) {
    favorite_beer.<span class="hljs-literal">nil</span>? <span class="hljs-params">||</span> age &gt;= <span class="hljs-number">21</span>
  }
<span class="hljs-keyword">end</span></code></pre>
<p>When I said earlier that guards can be declared <em>anywhere in the class body</em>, I really meant it. This still works. I’m sure there are more ways you can come up with for using these guards. These are just a couple off the top of my head.</p>
<p>Looking at the above, you can probably already imagine how you’d be able to easily implement a type of any complexity, and <code>Portrayal::Guards</code> will make sure to guard your initializers and writers for you.</p>
<h2 id="but-what-about-composition">But what about composition?</h2>
<p>Right, we actually might need some extra features to make composition nice. After toying around with some ideas, I decided to include the following additional features in the proof of concept.</p>
<h3 id="guard-chaining">Guard chaining</h3>
<p>One way to compose guards could be to make sure that our reusable methods return the passed-in <code>name</code>, like we already did above. If every declaration returns the name that it received, then we could chain guards like this:</p>
<pre><code class="hljs ruby"><span class="hljs-comment"># Type methods</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">Odd</span><span class="hljs-params">(name)</span></span>
  guard(<span class="hljs-string">"<span class="hljs-subst">#{name}</span> must be odd"</span>) { value = send(name); value.respond_to?(<span class="hljs-symbol">:odd?</span>) &amp;&amp; value.odd? }
  name
<span class="hljs-keyword">end</span>

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">Int</span><span class="hljs-params">(name)</span></span>
  guard(<span class="hljs-string">"<span class="hljs-subst">#{name}</span> must be an integer"</span>) { send(name).is_a?(Integer) }
  name
<span class="hljs-keyword">end</span>

<span class="hljs-comment"># Chaining example:</span>
Odd Int keyword <span class="hljs-symbol">:odd_number</span></code></pre>
<p>This could be especially nice for something like <code>Nullable</code>, where we don’t want to create <code>NullableString</code>, <code>NullableInt</code>, etc for every possible type. So maybe if we had</p>
<pre><code class="hljs ruby"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">Nullable</span><span class="hljs-params">(name)</span></span>
  guard(<span class="hljs-string">"<span class="hljs-subst">#{name}</span> can be nil"</span>) { send(name).<span class="hljs-literal">nil</span>? }
  name
<span class="hljs-keyword">end</span></code></pre>
<p>Then maybe we could write <code>Nullable Int keyword :number</code>?</p>
<p>Unfortunately, we cannot. It won’t work, because <code>Nullable</code> will fail anything that isn’t nil, and <code>Int</code> will fail anything that isn’t an integer. They don’t mesh, because every guard must pass and we don’t have full <code>&amp;&amp;</code>/<code>||</code> capabilities across guards. The good news is that perhaps we don’t actually need them.</p>
<p>I’ve thought about a few ways to enable this sort of composition, and came up with what I find to be a simple solution: a <code>pass!</code> guard.</p>
<h3 id="special-pass-guard">Special <code>pass!</code> guard</h3>
<p>A <code>pass!</code> is declared just like a regular <code>guard</code>. You can have as many as you want (though you’ll probably never need more than one), and they always run first. If a <code>pass!</code> returns anything truthy, we’re done: the object is valid, and no further guards are called. With this new capability we can write <code>Nullable</code> like this:</p>
<pre><code class="hljs ruby"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">Nullable</span><span class="hljs-params">(name)</span></span>
  pass!(<span class="hljs-string">"<span class="hljs-subst">#{name}</span> can be nil"</span>) { send(name).<span class="hljs-literal">nil</span>? }
  name
<span class="hljs-keyword">end</span></code></pre>
<p>And this kind of composition works now:</p>
<pre><code class="hljs ruby">Nullable Int keyword <span class="hljs-symbol">:number</span>, <span class="hljs-symbol">default:</span> <span class="hljs-literal">nil</span>
Nullable String keyword <span class="hljs-symbol">:text</span>, <span class="hljs-symbol">default:</span> <span class="hljs-literal">nil</span></code></pre>
<p>Yay!</p>
<p>Because a <code>pass!</code> always runs first, the order doesn’t matter. If a <code>pass!</code> sees <code>nil</code>, other guards won’t run. If it sees non-nil, then we proceed into int/string guards.</p>
<p>Unfortunately, there’s still a problem here. All the guards are mixed together, so the <code>Nullable</code> check for <code>number</code> will stop all guards from executing, even the <code>String</code> guard for <code>text</code>. That’s because we add guards into the class, but we aren’t grouping them with each other.</p>
<p>To solve this, I added guard grouping. But don’t worry, it’s basically nothing.</p>
<h3 id="guard-grouping">Guard grouping</h3>
<p>Remember that <code>:base</code> key in the error hash you saw earlier? Here’s a reminder:</p>
<pre><code class="hljs irb">{<span class="hljs-symbol">:base=&gt;</span>[<span class="hljs-string">"age must be human and beer is only for &gt;=21yo"</span>]}</code></pre>
<p>The <code>:base</code> is actually a default topic for guards. And it’s super simple to group guards into other topics. Just add one more argument to the guard:</p>
<pre><code class="hljs ruby">guard(<span class="hljs-symbol">:topic_name</span>, <span class="hljs-string">'error message'</span>) { boolean expression }</code></pre>
<p>The new first argument <code>:topic_name</code> (it could be anything really) is the topic. So all guards are actually per topic. A failure or a <code>pass!</code> in one topic won’t stop guards in another topic. This is just a more generic way to let you make guards<span class="push-double"></span> <span class="pull-double">“</span>per attribute”. And of course it’s just what the doctor ordered for the <code>ReusableTypes</code> module. We can now do this:</p>
<pre><code class="hljs ruby"><span class="hljs-class"><span class="hljs-keyword">module</span> <span class="hljs-title">ReusableTypes</span></span>
  <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">int</span><span class="hljs-params">(name)</span></span>
    guard(name, <span class="hljs-string">"<span class="hljs-subst">#{name}</span> must be an integer"</span>) { send(name).is_a?(Integer) }
  <span class="hljs-keyword">end</span>

  <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">age</span><span class="hljs-params">(name)</span></span>
    int(name)
    guard(name, <span class="hljs-string">"<span class="hljs-subst">#{name}</span> must be between 0 and 130"</span>) { (<span class="hljs-number">0</span>..<span class="hljs-number">130</span>).cover?(send(name)) }
  <span class="hljs-keyword">end</span>

  <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">string</span><span class="hljs-params">(name)</span></span>
    guard(name, <span class="hljs-string">"<span class="hljs-subst">#{name}</span> must be string"</span>) { send(name).is_a?(String) }
  <span class="hljs-keyword">end</span>

  <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">nullable</span><span class="hljs-params">(name)</span></span>
    pass!(name, <span class="hljs-string">"<span class="hljs-subst">#{name}</span> can be nil"</span>) { send(name).<span class="hljs-literal">nil</span>? }
  <span class="hljs-keyword">end</span>
<span class="hljs-keyword">end</span></code></pre>
<p>By the way, notice how we’re no longer returning <code>name</code> from each method. That’s because each guard already returns its topic, so we don’t have to do that anymore. Another small win.</p>
<p>With these in place we can now declare our <code>Person</code> this way:</p>
<pre><code class="hljs ruby"><span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">Person</span></span>
  extend Portrayal
  extend ReusableTypes

  age keyword <span class="hljs-symbol">:age</span>
  nullable string keyword <span class="hljs-symbol">:favorite_beer</span>, <span class="hljs-symbol">default:</span> <span class="hljs-literal">nil</span>

  guard(<span class="hljs-string">'favorite_beer is only allowed for age &gt;=21'</span>) {
    favorite_beer.<span class="hljs-literal">nil</span>? <span class="hljs-params">||</span> age &gt;= <span class="hljs-number">21</span>
  }
<span class="hljs-keyword">end</span></code></pre>
<p>Or this way if you made methods capitalized:</p>
<pre><code class="hljs ruby">Age keyword <span class="hljs-symbol">:age</span>
Nullable String keyword <span class="hljs-symbol">:favorite_beer</span>, <span class="hljs-symbol">default:</span> <span class="hljs-literal">nil</span></code></pre>
<p>Or this way, if you like to keep keyword on the left:</p>
<pre><code class="hljs ruby">keyword Age(<span class="hljs-symbol">:age</span>)
keyword Nullable(String <span class="hljs-symbol">:favorite_beer</span>), <span class="hljs-symbol">default:</span> <span class="hljs-literal">nil</span></code></pre>
<p>Or this way, if you don’t want to interfere with keywords:</p>
<pre><code class="hljs ruby">Age <span class="hljs-symbol">:age</span>
Nullable String <span class="hljs-symbol">:favorite_beer</span>

keyword <span class="hljs-symbol">:age</span>
keyword <span class="hljs-symbol">:favorite_beer</span>, <span class="hljs-symbol">default:</span> <span class="hljs-literal">nil</span></code></pre>
<p>Or go back to plain guard declarations. Whatever you fancy.</p>
<p>Keep in mind, we only learned 2 methods so far: <code>guard</code> and <code>pass!</code> (well, maybe also <code>update</code> if you’re pedantic). The rest is just plain Ruby.</p>
<h3 id="listing-guards">Listing guards</h3>
<p>Just for fun, I wanted to be able to list guards declared on a class. It’s possible with <code>Person.portrayal.list_guards</code>, which returns the following:</p>
<pre><code class="hljs irb">&gt; Person.portrayal.list_guards
=&gt; {<span class="hljs-symbol">:age=&gt;</span>[<span class="hljs-string">"age must be an integer"</span>, <span class="hljs-string">"age must be between 0 and 130"</span>],
 <span class="hljs-symbol">:favorite_beer=&gt;</span>[<span class="hljs-string">"favorite_beer can be nil"</span>, <span class="hljs-string">"favorite_beer must be string"</span>],
 <span class="hljs-symbol">:base=&gt;</span>[<span class="hljs-string">"favorite_beer is only allowed for age &gt;=21"</span>]}</code></pre>
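<p>To make the rules above concrete, here’s a toy model of topics, <code>pass!</code>, and <code>guard</code> in plain Ruby. It is not the code from the gist: <code>ToyGuards</code> and <code>errors_for</code> are made-up names, checks receive the object as an argument instead of running inside it, and I’m assuming that the first failing guard ends its own topic.</p>
<pre><code class="hljs ruby"># Toy model only, not the gist implementation.
class ToyGuards
  def initialize
    @passes = Hash.new { |hash, topic| hash[topic] = [] }
    @guards = Hash.new { |hash, topic| hash[topic] = [] }
  end

  # Both declarations return the topic, so calls could be chained.
  def pass!(topic, message, check)
    @passes[topic].push([message, check])
    topic
  end

  def guard(topic, message, check)
    @guards[topic].push([message, check])
    topic
  end

  # Returns a hash of topic to error messages. Empty means valid.
  def errors_for(object)
    errors = {}
    @guards.each do |topic, checks|
      # A truthy pass! settles this topic only; other topics still run.
      next if @passes[topic].any? { |_message, check| check.call(object) }

      # Assumption: the first failing guard ends its own topic.
      failed = checks.find { |_message, check| !check.call(object) }
      errors[topic] = [failed.first] if failed
    end
    errors
  end
end

rules = ToyGuards.new
rules.pass!(:favorite_beer, 'favorite_beer can be nil',
            lambda { |person| person[:favorite_beer].nil? })
rules.guard(:favorite_beer, 'favorite_beer must be string',
            lambda { |person| person[:favorite_beer].is_a?(String) })
rules.guard(:age, 'age must be an integer',
            lambda { |person| person[:age].is_a?(Integer) })

# The nil beer passes its own topic, yet the :age topic is still checked.
p rules.errors_for({ age: 'thirty', favorite_beer: nil })
# No errors at all: prints an empty hash.
p rules.errors_for({ age: 30, favorite_beer: nil })</code></pre>
<p>The first call reports an error only under <code>:age</code>, which is the whole point of grouping: the <code>pass!</code> for <code>favorite_beer</code> doesn’t silence guards in other topics.</p>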
<h2 id="where-is-this-lib">Where is this lib?</h2>
<p>At the time of this writing the implementation is <a href="https://gist.github.com/maxim/12e086f23f7ae9d230a2895bbb519483">just a gist</a>. I’m curious what people think about this before I make it into a proper gem. Let me know your thoughts. Too crazy? Or not crazy enough?&nbsp;:)</p>  ]]></description>
  </item>
  <item>
    <title><![CDATA[ Rails — narrative vs model centric approach ]]></title>
    <link>https://max.engineer/rails-narratives-vs-models</link>
    <guid>https://max.engineer/rails-narratives-vs-models</guid>
    <pubDate>Tue, 22 Nov 2022 00:00:00 -0500</pubDate>
    <dc:creator><![CDATA[ Max Chernyak ]]></dc:creator>
    <description><![CDATA[  

<p>I’ve explored <span class="small-caps">DHH</span><span class="push-single"></span><span class="pull-single">’</span>s way of writing Rails applications. His introduction of <a href="https://api.rubyonrails.org/classes/ActiveSupport/CurrentAttributes.html">CurrentAttributes</a> and <a href="https://api.rubyonrails.org/classes/ActiveRecord/Suppressor.html">suppressors of callbacks</a> a few years ago made me want to revisit his <a href="https://www.youtube.com/watch?v=H5i1gdwe1Ls">YouTube screencasts on Basecamp 3</a> and try to really appreciate this approach with an open mind.</p>
<p>I soon understood our fundamental difference in thinking. <span class="small-caps">DHH</span> approaches application code as an interconnected web of rich models. Each model is chock-full of concerns (modules with generalized functionality). Each model’s methods produce ripple effects far and wide across associated models, via numerous callbacks spanning many modules. This approach is somewhat graph-like. You have a graph of rich nodes (by<span class="push-double"></span> <span class="pull-double">“</span>rich” I mean that they provide the highest level of business functionality), and as you activate one, it activates other nodes at various distances in all directions. These ripple<span class="push-double"></span> <span class="pull-double">“</span>activations” could be anything, from additional database interactions, to 3rd party <span class="small-caps">API</span> calls, emails and text messages sent out, logging, and lots of other stuff.</p>
<p>This finally made me realize that with such a dynamic way of viewing application behavior, where all business needs are embedded at the very core, it makes sense why one might want to suppress entire classes and categories of callbacks. Those ripple effects are nearly untraceable, and require blanket suppressors from the top. Instead of directing logic, we’re constraining logic that would otherwise spread in all kinds of surprising ways.</p>
<p>This also explains why it’s so difficult to avoid globals in this world. You don’t want to impede your vast ripple effects with such minutiae as passing the same data across associations over and over. It’s a waste of time.</p>
<p>I want to say that this is a highly unusual approach, but to be frank, any consistent thought-out approach is unusual in our industry. Thoughtful codebases are unusual. So yes, it’s unusual to have a well-established approach consistently applied to the entire codebase.</p>
<p>I know many people disagree with <span class="small-caps">DHH</span>, but I have not heard them provide a good alternative way of architecting the entire application. I’ve seen rebuttals to individual features of Rails, as well as OOP-obsessed approaches that (to be frank) were even more difficult to follow, but still no holistic explanation on how Rails apps should be written for <a href="https://max.engineer/maintainable-code">maximum maintainability</a>.</p>
<p>My biggest problem with <span class="small-caps">DHH</span><span class="push-single"></span><span class="pull-single">’</span>s approach is that in his videos it was really hard for me to follow the story that his code is telling. Since all ripple effects are unapologetically triggered via chains of callbacks, it was difficult to follow so many paths into so many directions (or shall we say, indirections), ending up with so many outcomes. I find that this doesn’t work well with how I think.</p>
<p>My alternative way of building apps is narrative centric. Instead of creating a web of nodes with ripple effects, I want to write short stories, each living in an entry point (whether that’s a controller action, bg job, rake task, or test). Each story has a beginning (the initiating request/call) and an end (the response/output), with side effects in between. I love seeing a complete story where every major plot point is clearly visible at the controller level. If plot points are often repeated in the same order, we can bundle them under a laconically-named function. I love that I can look at every entry point, see exactly what it does, and decide how best to optimize it. I can decide to put various calls into a transaction, group multiple calls into one for a more efficient <span class="small-caps">SQL</span> interaction. I can kick something off into background processing, or introduce an exponential backoff retry for any of the steps involved. I find that with the entry point narrative-centric approach I have
the most clarity and flexibility to achieve these optimizations. In contrast, codebases where<span class="push-double"></span> <span class="pull-double">“</span>stories” fan out widely through callback chains don’t lend themselves well to these tweaks. You are never sure what might trigger a particular code path, or whether a particular optimization can be applied for all possible triggers.</p>
<p>To write beautiful short stories, I prefer to focus my attention on building myself the<span class="push-double"></span> <span class="pull-double">“</span>domain language”. I don’t mean the literal <span class="small-caps">DSL</span>. Rather, a bunch of libraries needed for my entry points to call into, such that entry routines appear clear and concise, yet still tell the complete story. Both Ruby and Rails allow for very expressive code of this kind, as long as your business logic is neatly packed away beneath proper interfaces. This is where I prefer to spend the most time. Write clean domain-driven interfaces, then write beautiful short stories with them.</p>
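<p>Here’s a rough sketch of what such a short story could look like. Everything in it is hypothetical: <code>Accounts</code>, <code>Mailers</code>, <code>Jobs</code>, and <code>sign_up</code> are plain-Ruby stand-ins invented for illustration so the example runs without Rails, and <code>LOG</code> just records the plot points.</p>
<pre><code class="hljs ruby"># Hypothetical stand-ins for the domain language, not real Rails APIs.
LOG = []

module Accounts
  def self.create(email)
    LOG.push("account created for #{email}")
    { id: 1, email: email }
  end
end

module Mailers
  def self.send_welcome(account)
    LOG.push("welcome email sent to #{account[:email]}")
  end
end

module Jobs
  def self.enqueue(name, account)
    LOG.push("#{name} enqueued for account #{account[:id]}")
  end
end

# The entry point reads top to bottom as the whole story of a signup.
# Nothing happens that isn't spelled out here.
def sign_up(params)
  account = Accounts.create(params[:email])
  Mailers.send_welcome(account)
  Jobs.enqueue(:import_contacts, account)
  { status: 201, account_id: account[:id] }
end

sign_up({ email: 'ada@example.com' })
puts LOG</code></pre>
<p>Because every plot point is visible in <code>sign_up</code>, it’s easy to see where a transaction, a retry, or a move to background processing would go.</p>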
<p>I wrote this post to explain my <a href="https://notes.max.engineer/should-i-use-rails-currentattributes-callbacks-and-suppressors">decision</a> not to use certain features of Rails.</p>
<p>If you’re curious about this approach, check out the <a href="https://github.com/maxim/narrative">narrative app template for Ruby on Rails</a>.</p>  ]]></description>
  </item>
  <item>
    <title><![CDATA[ Good Engineering is not Premature Optimization ]]></title>
    <link>https://max.engineer/premature-optimization</link>
    <guid>https://max.engineer/premature-optimization</guid>
    <pubDate>Mon, 10 Oct 2022 00:00:00 -0400</pubDate>
    <dc:creator><![CDATA[ Max Chernyak ]]></dc:creator>
    <description><![CDATA[  

<p>The term<span class="push-double"></span> <span class="pull-double">“</span>premature optimization” is often misused. It’s supposed to be about trading simplicity for unnecessary performance gains. Instead, it’s used as a blanket dismissal of anything unfamiliar. That’s both inaccurate, and hostile to good engineering. Throughout my career, I’ve heard every one of these situations referred to as<span class="push-double"></span> <span class="pull-double">“</span>premature optimization”. None of them are.</p>
<p>It’s not premature optimization when:</p>
<ul>
<li>They solved a problem elegantly, in a way that you didn’t think of</li>
<li>They solved a problem elegantly by deviating from the beaten path</li>
<li>They designed a clear and fitting code pattern that you didn’t come up with</li>
<li>They used a fitting data structure that you weren’t aware of</li>
<li>They used a fitting algorithm that you weren’t aware of</li>
<li>They achieved extra performance without sacrificing clarity, with an approach that you didn’t think of</li>
<li>They configured a piece of infrastructure for the required use case</li>
<li>They let an appropriate existing system handle the work that it’s good at handling</li>
</ul>
<p>…and they did that within the allotted time.</p>
<p>Likewise, it’s not premature optimization when:</p>
<ul>
<li>You propose a clean and elegant solution that they didn’t think of</li>
<li>You challenge an existing practice with a simpler/cleaner alternative</li>
<li>You introduce a new code pattern going forward, which improves codebase clarity and consistency</li>
<li>You recommend a minor code change to use a more fitting data structure that they weren’t aware of</li>
<li>You note that there’s a more fitting algorithm that they may not have seen</li>
<li>You explain how a small change can make the code more performant without sacrificing its clarity</li>
<li>You suggest an infrastructure config appropriate for the required use case</li>
<li>You push for letting an appropriate existing system handle the work that it’s good at</li>
</ul>
<p>…and it takes an acceptable amount of time to accomplish.</p>
<p>Lumping good engineering with premature optimization is a sure way to discourage engineers, and push the codebase quality towards the lowest common denominator. Don’t do that.</p>  ]]></description>
  </item>
  <item>
    <title><![CDATA[ Ruby Enumerator.new(size) ]]></title>
    <link>https://max.engineer/ruby-enumerator-size</link>
    <guid>https://max.engineer/ruby-enumerator-size</guid>
    <pubDate>Sun, 07 Aug 2022 00:00:00 -0400</pubDate>
    <dc:creator><![CDATA[ Max Chernyak ]]></dc:creator>
    <description><![CDATA[  

<p>Every Ruby enumerator supports <code>count</code>. It’s a method that iterates over every item and returns the total.</p>
<pre><code class="hljs ruby">irb&gt; enum = Enumerator.new { <span class="hljs-params">|yielder|</span>
  (<span class="hljs-number">1</span>..<span class="hljs-number">100</span>).each <span class="hljs-keyword">do</span> <span class="hljs-params">|i|</span>
    puts <span class="hljs-string">"counting item: <span class="hljs-subst">#{i}</span>"</span>
    yielder &lt;&lt; i
  <span class="hljs-keyword">end</span>
}

irb&gt; enum.count
counting <span class="hljs-symbol">item:</span> <span class="hljs-number">1</span>
counting <span class="hljs-symbol">item:</span> <span class="hljs-number">2</span>
…
counting <span class="hljs-symbol">item:</span> <span class="hljs-number">100</span>
=&gt; <span class="hljs-number">100</span></code></pre>
<p>However, Enumerator also has <code>size</code>. Except, for an enumerator built with <code>Enumerator.new</code>, it’s just <code>nil</code> by default.</p>
<pre><code class="hljs ruby">irb&gt; enum.size
=&gt; nil</code></pre>
<p>A little-known feature in Ruby is that you can pass a parameter to <code>Enumerator.new</code> to give it a shortcut<span class="push-double"></span> <span class="pull-double">“</span>answer” to the size question.</p>
<pre><code class="hljs ruby">irb&gt; enum = Enumerator.new(<span class="hljs-number">100</span>) { <span class="hljs-params">|yielder|</span>
  (<span class="hljs-number">1</span>..<span class="hljs-number">100</span>).each <span class="hljs-keyword">do</span> <span class="hljs-params">|i|</span>
    puts <span class="hljs-string">"counting item: <span class="hljs-subst">#{i}</span>"</span>
    yielder &lt;&lt; i
  <span class="hljs-keyword">end</span>
}

irb&gt; enum.size
=&gt; <span class="hljs-number">100</span></code></pre>
<p>No more iterating to get the count. However, there’s an even lesser-known feature. You can pass a lambda to determine the size lazily, which is still faster than iterating. Let’s say that you’re enumerating over products in some kind of ecommerce <span class="small-caps">API</span>.</p>
<pre><code class="hljs ruby">irb&gt; api = EcommerceApi.new(<span class="hljs-string">'connection config'</span>)
irb&gt; enum = Enumerator.new { <span class="hljs-params">|yielder|</span>
  api.products.each.with_index <span class="hljs-keyword">do</span> <span class="hljs-params">|product, index|</span>
    puts <span class="hljs-string">"fetching product: <span class="hljs-subst">#{index}</span>"</span>
    yielder &lt;&lt; product
  <span class="hljs-keyword">end</span>
}
irb&gt; enum.count
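fetching <span class="hljs-symbol">product:</span> <span class="hljs-number">0</span>
fetching <span class="hljs-symbol">product:</span> <span class="hljs-number">1</span>
…
fetching <span class="hljs-symbol">product:</span> <span class="hljs-number">235</span>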
=&gt; <span class="hljs-number">236</span></code></pre>
<p>Let’s say our <span class="small-caps">API</span> has a more efficient way of obtaining the count: a <code>total_count</code> endpoint.</p>
<pre><code class="hljs ruby">irb&gt; api = EcommerceApi.new(<span class="hljs-string">'connection config'</span>)
irb&gt; enum = Enumerator.new(api.products.total_count) { <span class="hljs-params">|yielder|</span>
  api.products.each.with_index <span class="hljs-keyword">do</span> <span class="hljs-params">|product, index|</span>
    puts <span class="hljs-string">"fetching product: <span class="hljs-subst">#{index}</span>"</span>
    yielder &lt;&lt; product
  <span class="hljs-keyword">end</span>
}
irb&gt; enum.size
=&gt; <span class="hljs-number">236</span></code></pre>
<p>We no longer have to iterate over products to get the total count, but notice a new problem: we now always call <code>total_count</code>, even if the user of our <code>enum</code> never calls <code>size</code>. Seems like a waste. Moreover, if products are added to the <span class="small-caps">API</span> later, our size will not change. A lambda lets us make the <span class="small-caps">API</span> call only when requested, and always get a fresh count.</p>
<pre><code class="hljs ruby">irb&gt; api = EcommerceApi.new(<span class="hljs-string">'connection config'</span>)
irb&gt; enum = Enumerator.new(-&gt; { api.products.total_count }) { <span class="hljs-params">|yielder|</span>
  api.products.each.with_index <span class="hljs-keyword">do</span> <span class="hljs-params">|product, index|</span>
    puts <span class="hljs-string">"fetching product: <span class="hljs-subst">#{index}</span>"</span>
    yielder &lt;&lt; product
  <span class="hljs-keyword">end</span>
}
irb&gt; enum.size <span class="hljs-comment"># Calls -&gt; { api.products.total_count } lambda.</span>
=&gt; <span class="hljs-number">236</span></code></pre>
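<p>Since <code>EcommerceApi</code> above is made up, here’s a version you can paste and run as is. It’s only a sketch: a plain array stands in for the API, and <code>size_calls</code> is a counter I added to show when the lambda actually runs.</p>
<pre><code class="hljs ruby"># A plain array stands in for the hypothetical API.
products = %w[tea coffee]
size_calls = 0

enum = Enumerator.new(lambda { size_calls += 1; products.size }) { |yielder|
  products.each { |product| yielder.yield(product) }
}

puts size_calls   # 0, nobody has asked for the size yet
puts enum.size    # 2
products.push('cocoa')
puts enum.size    # 3, the lambda ran again and saw the new product
puts size_calls   # 2, one call per size request</code></pre>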
<p>This feature also exists when using <code>enum_for</code>/<code>to_enum</code> to create the enumerator. In that case, the size is whatever the block passed into <code>enum_for</code> returns. The block receives any additional arguments passed to <code>enum_for</code>.</p>
<pre><code class="hljs ruby">irb&gt; <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">each_number</span><span class="hljs-params">(max = <span class="hljs-number">100</span>)</span></span>
  <span class="hljs-keyword">return</span> enum_for(__method__, max) { <span class="hljs-params">|max|</span> max } <span class="hljs-keyword">unless</span> block_given?
  (<span class="hljs-number">1</span>..max).each { <span class="hljs-params">|n|</span> <span class="hljs-keyword">yield</span> n }
<span class="hljs-keyword">end</span>
irb&gt; each_number(<span class="hljs-number">200</span>).size
=&gt; <span class="hljs-number">200</span></code></pre>
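<p>One thing to keep in mind: as far as I can tell, Ruby treats the size as a hint and never checks it against what the enumerator really yields. A small sketch with a deliberately wrong size shows that <code>size</code> and <code>count</code> can disagree:</p>
<pre><code class="hljs ruby"># The size hint (3) is deliberately wrong: the block yields 5 items.
enum = Enumerator.new(3) { |yielder|
  5.times { |i| yielder.yield(i) }
}

puts enum.size    # 3, the hint is returned as given
puts enum.count   # 5, count ignores the hint and iterates</code></pre>
<p>So it’s on you to make sure the size you hand over is accurate.</p>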
<p>P.S. I often forget how this Ruby feature works, and searching never brings up quick examples, so hopefully this article will help when in need of a quick reminder.</p>  ]]></description>
  </item>
  <item>
    <title><![CDATA[ Writing Maintainable Code is a Communication Skill ]]></title>
    <link>https://max.engineer/maintainable-code</link>
    <guid>https://max.engineer/maintainable-code</guid>
    <pubDate>Wed, 24 Nov 2021 00:00:00 -0500</pubDate>
    <dc:creator><![CDATA[ Max Chernyak ]]></dc:creator>
    <description><![CDATA[  

<p>Writing maintainable code is easy. Just keep methods and argument lists short, names and comments long, and follow a styleguide. Boom! Done. Unfortunately, as one famous journalist once wrote:</p>
<blockquote>
<p><span class="pull-double">“</span>For every complex problem there is an answer that is clear, simple, and wrong.”<br>
— H. L. Mencken</p>
</blockquote>
<p>It’s not style and shape that make code hard to maintain. It’s the lack of clarity on <em>how</em> the code works, <em>what</em> it represents, and/or <em>why</em> it was written (this way). I’ll refer to these questions as<span class="push-double"></span> <span class="pull-double">“</span>how?”,<span class="push-double"></span> <span class="pull-double">“</span>what?”, and<span class="push-double"></span> <span class="pull-double">“</span>why?” for short. The questions are straightforward, but there’s nothing straightforward about answering them. You may feel that short method bodies help with understanding the <em><span class="push-double"></span><span class="pull-double">“</span>how?”</em>, but sometimes they make the program hard to follow. You may think that long, descriptive names always answer the <em><span class="push-double"></span><span class="pull-double">“</span>what?”</em>, but often they add too much noise. You may feel that<span class="push-double"></span> <span class="pull-double">“</span>wall-of-text” comments address any <em><span class="push-double"></span><span class="pull-double">“</span>why?”</em> concerns, but now your readers TL;DR them. Every situation is different. It’s up to you, the programmer, to find an eloquent and considerate way to address the <em>how?</em>, <em>what?</em>, and <em>why?</em> in each particular case.</p>
<p><strong>Maintainable code is code that <em>eloquently</em> and <em>considerately</em> communicates to its reader how, what, and why it implements.</strong></p>
<h2 id="how">How?</h2>
<blockquote>
<p><span class="pull-double">“</span>I have only made this letter longer because I have not had the time to make it shorter.”<br>
— Blaise Pascal</p>
</blockquote>
<p><span class="pull-double">“</span>How?” refers to the degree of expressiveness with which a routine or an algorithm is written.</p>
<p>The good news is that it’s hard to fail at answering<span class="push-double"></span> <span class="pull-double">“</span>how?”. You’d have to write utter gibberish. The bad news is that it’s equally hard to succeed. You must break up complex algorithms into clear steps. You must seek out good metaphors that help people make sense of your abstract code. In other words, you must write code that continuously guides fellow engineers. That level of clear communication is rare, but so are great codebases. How often have you seen an algorithm expressed with such grace that it appears boringly obvious?</p>
<p>Another component in a successful answer of<span class="push-double"></span> <span class="pull-double">“</span>how?” is the programming language itself. A flexible language allows you to write incredibly expressive codebases. However, your level of writing skill is all that stands between a magnum opus and a major oops. Make a few wrong moves, and your codebase is a total mess. This is why some engineering teams opt to commit to a strict programming language with guardrails. A codebase written in such a language won’t win you any poetry awards, but neither will it leave you with a magical ball of mud. Well, you might still end up with a ball of mud (engineers do have a boundless capacity to shoot themselves in the foot), but at least it won’t be magical.</p>
<p>As you can probably tell, there are practical business trade-offs with both types of language. An expressive language better serves a small, experienced team, or a team with strong senior guidance. A strict language can support a larger, less senior team. In the short term, both teams could accomplish the same amount of work. However, in the long term, a larger team will likely produce more code. That’s more code to support and maintain, which is certainly not ideal.</p>
<p><strong>Failing at<span class="push-double"></span> <span class="pull-double">“</span>how?”</strong></p>
<p>When code fails at answering<span class="push-double"></span> <span class="pull-double">“</span>how?”, it is often verbose, convoluted, or, again, simply utter gibberish. Much like the example below:</p>
<pre><code class="hljs ruby"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">r</span><span class="hljs-params">(s1, s2, s3)</span></span>
  [s3.bytes, [<span class="hljs-number">32</span>], s1.bytes, [<span class="hljs-number">10</span>]*<span class="hljs-number">2</span>, s2.bytes].map { <span class="hljs-params">|ba|</span>
    ba.flat_map(&amp;<span class="hljs-symbol">:chr</span>).inject { <span class="hljs-params">|v, a|</span> <span class="hljs-string">"<span class="hljs-subst">#{a}</span><span class="hljs-subst">#{v}</span>"</span> }.reverse
  }.inject(&amp;<span class="hljs-symbol">:+</span>)
<span class="hljs-keyword">end</span></code></pre>
<p>In the above, we didn’t take the time to find a more considerate representation of the desired behavior.</p>
<p><strong>Succeeding at<span class="push-double"></span> <span class="pull-double">“</span>how?”</strong></p>
<p>In the example below, we’ve put in the effort to make the code easy to follow. You can see that strings are being concatenated.</p>
<pre><code class="hljs ruby"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">r</span><span class="hljs-params">(s1, s2, s3)</span></span>
  s3 + <span class="hljs-string">" "</span> + s1 + <span class="hljs-string">"\n\n"</span> + s2
<span class="hljs-keyword">end</span></code></pre>
<p>Now, while the above code clearly answers<span class="push-double"></span> <span class="pull-double">“</span>how?”, we still don’t know <em>what</em> business function it accomplishes.</p>
<h2 id="what">What?</h2>
<blockquote>
<p><span class="pull-double">“</span>Isolating complexity in a place where it will never be seen is almost as good as eliminating the complexity entirely.”<br>
— John Ousterhout</p>
</blockquote>
<p>If you have succeeded at <em>what?</em> it means that a new maintainer understands the goal of every piece of your code. In order to ensure that those goals are clear, you must figure out 1) what to abstract and encapsulate and 2) what to name it.</p>
<p>On one hand, the<span class="push-double"></span> <span class="pull-double">“</span>what?” can be used to cover up the problems of<span class="push-double"></span> <span class="pull-double">“</span>how?”. You can write awful code as long as your function is well isolated, well tested, and well named. Once the goal of your function is clear, nobody will ever need to look inside of it. Congrats, you saved some time now, and someone could simply swap out the whole thing later. Sounds like a win-win, but there is a catch. You better make damn sure you get the abstraction right, because the stakes are high. If you get it wrong, then someone will have to dig into your messy function and tease it apart. They will not enjoy that. The moral of the story is, if you have any doubts about your choice of abstraction, then definitely put some extra time towards a clean implementation.</p>
<p>On the other hand, it’s possible to take<span class="push-double"></span> <span class="pull-double">“</span>what?” too far. For example, you might feel the need to blindly fixate on consistency in naming, or include the greater context in every name, length be damned. Maintainable code is not about communicating consistently or exhaustively. It’s about communicating the right amount of information at the right time (i.e.&nbsp;eloquently). Long names work well in high-level interfaces that mimic business terminology. However, they can be distracting in low level code, where it’s easy to lose sight of data transformations in a forest of names.</p>
<p>Much like a novelist’s prose, it takes years to develop good taste for eloquent and considerate naming. My advice is to get comfortable reading other people’s code. Put yourself in the shoes of your audience.</p>
<p><strong>Failing at<span class="push-double"></span> <span class="pull-double">“</span>what?”</strong></p>
<p>Short names typically send a signal that they are contextually self-explanatory. Long names signal that we’re breaking away from current context, and we should pay special attention. Moreover, long names are harder to tell apart. Use them sparingly.</p>
<p>In the below example, the names are too long and redundant given their context.</p>
<pre><code class="hljs ruby"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">render_email_with_a_greeting</span><span class="hljs-params">(email_recipient_name_string_for_rendering_email, email_body_string_for_rendering_email, email_greeting_string_for_rendering_email)</span></span>
  email_greeting_string_for_rendering_email + <span class="hljs-string">" "</span> + email_recipient_name_string_for_rendering_email + <span class="hljs-string">"\n\n"</span> + email_body_string_for_rendering_email
<span class="hljs-keyword">end</span></code></pre>
<p><strong>Succeeding at<span class="push-double"></span> <span class="pull-double">“</span>what?”</strong></p>
<pre><code class="hljs ruby"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">render_email</span><span class="hljs-params">(recipient_name, body, <span class="hljs-symbol">greeting:</span> <span class="hljs-string">'Hello,'</span>)</span></span>
  greeting + <span class="hljs-string">" "</span> + recipient_name + <span class="hljs-string">"\n\n"</span> + body
<span class="hljs-keyword">end</span></code></pre>
<p>Here we understand what’s being done, and begin to form some valid<span class="push-double"></span> <span class="pull-double">“</span>why?” questions.</p>
<h2 id="why">Why?</h2>
<blockquote>
<p><span class="pull-double">“</span>Give light and people will find the way.”<br>
— Ella Baker</p>
</blockquote>
<p>Some schools of thought consider all code comments to be failures of code expression. I tend to agree with this for<span class="push-double"></span> <span class="pull-double">“</span>how?” and<span class="push-double"></span> <span class="pull-double">“</span>what?”, but not<span class="push-double"></span> <span class="pull-double">“</span>why?”. Trying to cram all business context into names of variables and functions is bound to make the code more confusing. The code already has more than enough to deal with answering the <em><span class="push-double"></span><span class="pull-double">“</span>how?”</em> and <em><span class="push-double"></span><span class="pull-double">“</span>what?”</em>. Let’s give it a break by answering<span class="push-double"></span> <span class="pull-double">“</span>why?” in the comments.</p>
<p>That said, code is the most dependable source of truth, and unfortunately comments are a distant second. <a href="https://www.google.com/search?rls=en&amp;q=code+comments+lie&amp;ie=UTF-8&amp;oe=UTF-8">They tend to lie</a>. This means that we should not overuse them. To excel at<span class="push-double"></span> <span class="pull-double">“</span>why?”, it’s important to learn to:</p>
<ol type="1">
<li><p>Pinpoint which decisions actually need context.<br>
Usually, decisions that need to be explained derive from one of four circumstances: 1) there is a non-obvious business reason for your decision 2) you did a significant amount of research to arrive at a decision 3) you were on the fence about your chosen solution or 4) you were asked a question in a code review. In each of these situations, it is probably a good idea to leave a clarifying comment.</p></li>
<li><p>Identify the level of detail needed for your audience.<br>
The people who will read your code comments are likely to be experienced programmers who are familiar with your company’s internal terminology and processes. Lean on your shared knowledge to communicate efficiently.</p></li>
</ol>
<p><strong>Failing at<span class="push-double"></span> <span class="pull-double">“</span>why?”</strong></p>
<pre><code class="hljs ruby"><span class="hljs-comment"># Keyword argument `greeting` has a default value.</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">render_email_to_send</span><span class="hljs-params">(recipient_name, body, <span class="hljs-symbol">greeting:</span> <span class="hljs-string">'Hello,'</span>)</span></span>
  <span class="hljs-comment"># Emails can be plain and html, and while most email</span>
  <span class="hljs-comment"># clients support html, it's a good practice to add plain</span>
  <span class="hljs-comment"># text versions as a fallback.</span>
  greeting + <span class="hljs-string">" "</span> + recipient_name + <span class="hljs-string">"\n\n"</span> + body
<span class="hljs-keyword">end</span></code></pre>
<p>Here we see multiple pitfalls: addressing an unlikely audience, going into arbitrary levels of detail, adding redundant information, and failing to answer likely questions. The outer comment is redundant. The inner comment neither seems relevant to the code, nor does it consider the audience it’s most likely addressing.</p>
<p><strong>Succeeding at<span class="push-double"></span> <span class="pull-double">“</span>why?”</strong></p>
<pre><code class="hljs ruby"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">render_email</span><span class="hljs-params">(recipient_name, body, <span class="hljs-symbol">greeting:</span> <span class="hljs-string">'Hello,'</span>)</span></span>
  greeting + <span class="hljs-string">" "</span> + recipient_name + <span class="hljs-string">"\n\n"</span> + body
<span class="hljs-keyword">end</span></code></pre>
<p>When looking at the above code we can assume familiarity with the basics and identify a couple of potential questions:</p>
<ul>
<li>Why do we need to support a custom greeting?</li>
<li>Since we use <code>\n</code>, is this function only used for plain text emails?</li>
<li>(For rubyists out there) why are we concatenating with <code>+</code> instead of <code>"#{interpolation}"</code>?</li>
</ul>
<p>Here’s one way to address them.</p>
<pre><code class="hljs ruby"><span class="hljs-comment"># We allow custom greetings because marketing wants to be able to</span>
<span class="hljs-comment"># personalize them by time of day, e.g. "Good Afternoon, Person".</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">render_plain_text_email</span><span class="hljs-params">(recipient_name, body, <span class="hljs-symbol">greeting:</span> <span class="hljs-string">'Hello,'</span>)</span></span>
  <span class="hljs-comment"># We avoid interpolation because we want nil values to error out.</span>
  <span class="hljs-comment"># Helps prevent missing content in sent emails.</span>
  greeting + <span class="hljs-string">" "</span> + recipient_name + <span class="hljs-string">"\n\n"</span> + body
<span class="hljs-keyword">end</span></code></pre>
<p>There are now comments explaining why we allow custom greetings and avoid interpolation. We also clarified our use of <code>\n</code> by adding <code>_plain_text_</code> into the method name.</p>
<p>Alternatively, we could consider eliminating the top comment by renaming <code>greeting</code> to <code>personalized_greeting</code> as follows:</p>
<pre><code class="hljs ruby"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">render_plain_text_email</span><span class="hljs-params">(recipient_name, body, <span class="hljs-symbol">personalized_greeting:</span> <span class="hljs-string">'Hello,'</span>)</span></span>
  <span class="hljs-comment"># We avoid interpolation because we want nil values to error out.</span>
  <span class="hljs-comment"># Helps prevent missing content in sent emails.</span>
  personalized_greeting + <span class="hljs-string">" "</span> + recipient_name + <span class="hljs-string">"\n\n"</span> + body
<span class="hljs-keyword">end</span></code></pre>
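<p>For readers less familiar with Ruby, the interpolation comment relies on a difference in how the two styles treat <code>nil</code>. The sketch below shows it in isolation. The two helper names are hypothetical and exist only for this comparison.</p>

```ruby
# Hypothetical helpers that differ only in how they build the string.
def greet_by_concatenation(greeting, recipient_name)
  greeting + " " + recipient_name
end

def greet_by_interpolation(greeting, recipient_name)
  "#{greeting} #{recipient_name}"
end

# Interpolation calls to_s on nil, so the missing name silently disappears.
silent = greet_by_interpolation("Hello,", nil)

# Concatenation refuses to combine a String with nil and raises instead.
error = begin
  greet_by_concatenation("Hello,", nil)
  nil
rescue TypeError => e
  e
end

puts silent.inspect # prints "Hello, "
puts error.class    # prints TypeError
```

<p>With concatenation, a missing name fails loudly before the email goes out, rather than producing a greeting addressed to nobody.</p>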
<h2 id="useful-framing">Useful Framing</h2>
<blockquote>
<p><span class="pull-double">“</span>If I had an hour to solve a problem and my life depended on the solution, I would spend the first 55 minutes determining the proper question to ask, for once I know the proper question, I could solve the problem in less than five minutes.”<br>
— Albert Einstein</p>
</blockquote>
<p>When we work with fellow engineers and stakeholders, we engage in three of the most difficult kinds of communication: 1) giving feedback (in code reviews) 2) negotiating (in estimations) and 3) conveying abstract concepts (in code). These conversations can be anxiety inducing and we have them multiple times a day! The<span class="push-double"></span> <span class="pull-double">“</span>how?”,<span class="push-double"></span> <span class="pull-double">“</span>what?” and<span class="push-double"></span> <span class="pull-double">“</span>why?” framework can help us organize our thoughts.</p>
<ol type="1">
<li>When conducting code reviews, you could be more specific in pointing out a problem:
<ul>
<li><span class="pull-double">“</span>I see what you’re doing but have trouble understanding <em>how</em> it works under the hood.”</li>
<li><span class="pull-double">“</span>I see how this works and why we need this, but extracting a method would make it easier to understand <em>what</em> this piece is doing.”</li>
<li><span class="pull-double">“</span>I see what is being accomplished, and how it’s done, but I am unclear <em>why</em> we made this particular choice.”</li>
</ul></li>
<li>When negotiating refactoring deadlines, you now have language that can help stakeholders understand exactly what you’re trying to achieve:
<ul>
<li><span class="pull-double">“</span>It’s hard to understand <em>how</em> this code works under the hood. We need to do a refactor before we can confidently change it.”</li>
<li><span class="pull-double">“</span>This code needs to be broken up so we can more easily follow <em>what</em> it’s doing.”</li>
<li><span class="pull-double">“</span>A lot of our reasons <em>why</em> were never written down, so we’d like to try and add some context to the codebase before we forget.”</li>
</ul></li>
<li>And finally, it provides a checklist to reflect on your own work before you share it with the team:
<ul>
<li><span class="pull-double">“</span>Will a reader easily understand <em>how</em> my code works?”</li>
<li><span class="pull-double">“</span>Do my names clearly convey <em>what</em> my code accomplishes?”</li>
<li><span class="pull-double">“</span>Have I given the proper amount of context to convey <em>why</em> I wrote the code this way?”</li>
</ul></li>
</ol>
<p>It would be interesting to adopt a code quality standard along the lines of:<span class="push-double"></span> <span class="pull-double">“</span>all new code must successfully convey how, what, and why, to at least 2 of your colleagues.” If you were to conduct such an experiment, I would love to know how it goes.</p>  ]]></description>
  </item>
  <item>
    <title><![CDATA[ Mindful Code Reviews ]]></title>
    <link>https://max.engineer/mindful-code-reviews</link>
    <guid>https://max.engineer/mindful-code-reviews</guid>
    <pubDate>Mon, 04 Oct 2021 00:00:00 -0400</pubDate>
    <dc:creator><![CDATA[ Max Chernyak ]]></dc:creator>
    <description><![CDATA[  

<h2 id="code-reviews-are-first-class-citizens">Code Reviews are First-Class Citizens</h2>
<p>Code reviews are an integral part of our daily work as engineers. They help us reduce bugs, share knowledge, collaborate asynchronously, build rapport, feel recognized, and most importantly, keep software maintainable. Diligent code reviews can save the team from insidious architectural mistakes that may hinder all future development. So why do we often treat them as second-rate citizens, a distraction in the way of shipping? Why is it wrong for a good day of work to consist entirely of leaving <span class="small-caps">PR</span> feedback? Why do we try to sneak reviews past stakeholders, and outright skip them in the face of movable deadlines? There are definitely reasons for it, but whatever they are, it might help to take a deeper look at your engineering culture. I believe that in a growing (and especially geographically distributed) company, engineering success is predicated on embracing code reviews as first-class citizens, with full stakeholder buy-in.</p>
<h2 id="writing-under-pressure">Writing Under Pressure <span class="small-caps">📣👂</span></h2>
<p>To members of the computer generation it’s no surprise that it is easy to accidentally come off dry, dismissive, judgmental, or worse in text. This is exacerbated by constantly putting out fires and missing deadlines while trying to leave feedback. On the receiving end, people most likely take pride in their work, and are attuned to listen carefully. Reading <span class="small-caps">PR</span> comments then becomes sort of like putting your ear directly onto an active megaphone. This is why I believe that rushed code reviews are harmful to engineering culture. All this pressure makes being kind and considerate a more difficult challenge than it has to be. If your company is a burning tornado, fixing that could be a crucial step towards mindful… everything, let alone code reviews.</p>
<h2 id="practices">Practices</h2>
<p>Over the years in the industry I’ve compiled a list of my favorite code review practices. Here they are in no particular order.</p>
<h3 id="advocate-for-the-reviewee">1. Advocate for the Reviewee</h3>
<p><em>Approach each comment from the position of respect for the author’s work and decisions</em></p>
<p>Even when some of the author’s decisions appear to be<span class="push-double"></span> <span class="pull-double">“</span>clearly suboptimal”, or straight up mistakes, assume the best intentions on their part. Spend some time advocating in your mind for the code you’re reading, challenging your own assumptions. If you understand where the author is coming from, acknowledge it before providing counterarguments.</p>
<h3 id="objectivity-subjectivity">2. Objectivity &gt; Subjectivity</h3>
<p><em>Seek out objectivity in all arguments</em></p>
<p>A comment asking for a change should make an objective case for it. When making a case, dig past personal preferences all the way down to objective underpinnings of your argument. A tiny nugget of strict objectivity is miles more effective than a 500-word opinion piece.</p>
<p>There’s one good kind of subjective comment: the code confuses you. Since the most important measure of maintainability is that code is clear for people on your team, a<span class="push-double"></span> <span class="pull-double">“</span>confusing” comment gets a special pass.</p>
<h3 id="conversation-silence">3. Conversation &gt; Silence</h3>
<p><em>Subjectivity is welcome as long as it’s a discussion</em></p>
<p>Sometimes we can’t help but voice a subjective opinion. In doing so, we must acknowledge that we have been unsuccessful in finding an objective argument, and are asking the author to indulge us for a moment. This is ok, as long as we present these opinions as topics for discussion, and not as something we insist on being implemented. Use the discussion as a tool to figure out the objective underpinnings behind your opinion. The team should support this exploration, and try to learn from it. Of course, be willing to accept that the discussion will not always result in your opinion making its way into the code.</p>
<h3 id="assume-competence">4. Assume Competence</h3>
<p><em>Use question form when suggesting something seemingly obvious</em></p>
<p>When you are suggesting something that appears obvious to you, it’s possible that you’re missing a problem that the author may have already discovered. If you don’t invite an explanation, the author may feel compelled to make the requested change, and work around the problem in some other way.</p>
<p>Instead, switch your statement into a genuine question. Let’s say you think code should be moved to another function.</p>
<ul>
<li>The original thought:<span class="push-double"></span> <span class="pull-double">“</span>This code should be moved to function X due to [reasons].”</li>
<li>The fake question form (don’t do this):<span class="push-double"></span> <span class="pull-double">“</span>Could you move this code to function X due to [reasons]?”</li>
<li>The genuine question form:<span class="push-double"></span> <span class="pull-double">“</span>Have you considered moving this code to function X to avoid [reasons]?”</li>
</ul>
<p>Notice how we are still being concise, and are still providing our solution. Except, now the author gets to choose. They can either explain why they did what they did (and maybe you’ll end up agreeing), or they can follow the request without wasting another round.</p>
<p>P.S. In my experience, this is the most powerful<span class="push-double"></span> <span class="pull-double">“</span>hack” in this whole list. It’s incredibly easy to switch to a question form, not obscure any valuable info, and yet completely remove any sense of bitter judgement from your comment.</p>
<h3 id="care-about-details">5. Care About Details</h3>
<p><em>It is not a waste of time to discuss a detail in depth</em></p>
<p>Details can matter because technical debt tends to be a<span class="push-double"></span> <span class="pull-double">“</span>death by a thousand cuts”. Besides, a discussion over a small detail can often be useful for other things, like establishing a rapport with someone. A self-conscious fear of wasting time could end up wasting more time than actually staying on topic.</p>
<h3 id="specific-examples-generalizations">6. Specific Examples &gt; Generalizations</h3>
<p><em>Try to propose a concrete solution</em></p>
<p>If possible, use pseudo-code or real code (untested is ok) to illustrate your points. If writing the code is not feasible, take time to make your comment easy to follow. This is especially important when collaborating across time zones.</p>
<h3 id="working-code-no-code">7. Working Code &gt; No Code</h3>
<p><em>Always respect working code</em></p>
<p>If a fellow engineer submits a <span class="small-caps">PR</span> with a working and tested implementation, but you find that it could use a better architectural approach, this is a great problem to have. Now we can focus on refactoring this <span class="small-caps">PR</span> without worrying about implementation details, since they are already working and tested. This actually frees us to collaborate on reshaping the code’s architecture while maintaining the same logic.</p>
<h3 id="advocate-for-the-reviewer">8. Advocate for the Reviewer</h3>
<p><em>A code review itself is an original work</em></p>
<p>When you are on the receiving end of a code review, treat the review itself as its author’s work. Even though they’re reviewing <em>your</em> code, their review is <em>their</em> original work. Instead of only focusing on the changes you’ve been asked to make, express some appreciation for comments that you found useful, or their effort to understand your code.</p>
<h3 id="use-complete-thoughts">9. Use Complete Thoughts</h3>
<p><em>Fight the instinct to leave a quick one-liner</em></p>
<p>It’s okay to use one-liners in a considerate way (i.e.&nbsp;as per point 4). However, if your one-liner is a short and dry change instruction, you are sending some bad signals, like:</p>
<ul>
<li>I don’t care whether you agree or disagree</li>
<li>I don’t see you as my peer</li>
<li>I don’t take code reviews (or reviewing <em>your</em> code) seriously</li>
<li>My time (writing) is worth more than your time (unpacking what I mean)</li>
<li>Your mistake was obvious to me</li>
</ul>
<p>Practices in this article will help you avoid sending these signals.</p>
<h3 id="practice">10. Practice</h3>
<p><em>These aren’t rules to be followed perfectly from day one</em></p>
<p>These practices aren’t meant to be a checklist. As long as you follow these practices in spirit, it’s ok to make your own judgment calls based on specific situations. The more you practice, the easier it gets.</p>
<h3 id="have-fun">11. Have fun!</h3>
<p><em>Enjoy geeking out on technical discussions with your colleagues</em></p>
<p>Code reviews are places where we get to unapologetically talk deep programming, so let’s take advantage of it, and have fun!</p>
<hr>
<p>Special thanks to the awesome <a href="https://twitter.com/pszals">Philip Szalwinski</a> for suggestions and contributions.</p>  ]]></description>
  </item>
  <item>
    <title><![CDATA[ Don’t Build A General Purpose API To Power Your Own Front End ]]></title>
    <link>https://max.engineer/server-informed-ui</link>
    <guid>https://max.engineer/server-informed-ui</guid>
    <pubDate>Mon, 13 Sep 2021 00:00:00 -0400</pubDate>
    <dc:creator><![CDATA[ Max Chernyak ]]></dc:creator>
    <description><![CDATA[  

<p><strong>Update 2025-12-11</strong>: There is now a <a href="https://max.engineer/server-informed-ui-p2">follow up article (4 years later)</a>.</p>
<p><strong>TL;DR</strong> <span class="small-caps">YAGNI</span>, unless you’re working in a big company with federated front-ends or GraphQL.</p>
<p>It’s popular in web dev nowadays to build a backend that serves <span class="small-caps">JSON</span>, and a frontend that renders the app. This is fine. I’m not the biggest fan, but it’s really okay. Except it’s not okay if you think that your backend needs to be designed like a generic public <span class="small-caps">API</span>. This will not save you time.</p>
<h2 id="why-not">Why not?</h2>
<p>When you design a general purpose <span class="small-caps">API</span>, you have to figure out a bunch of annoying stuff.</p>
<ol type="1">
<li>How to predict and enable all possible workflows</li>
<li>How to avoid N+1 requests for awkward workflows</li>
<li>How to test functionality, performance, and security of every possible request</li>
<li>How to change the <span class="small-caps">API</span> without breaking the existing workflows</li>
<li>How to prioritize <span class="small-caps">API</span> changes between internal and community requirements</li>
<li>How to document everything so that all parties can get stuff done</li>
</ol>
<p>And on the front-end side, there’s a bunch more:</p>
<ol type="1">
<li>How to collect all the data needed to render a page</li>
<li>How to optimize requests to multiple endpoints</li>
<li>How to avoid using <span class="small-caps">API</span> data fields in unintended ways</li>
<li>How to weigh the benefit of new features against the cost of new <span class="small-caps">API</span> requests</li>
</ol>
<p>Do these really have to be your problems if you’re just making a backend for your frontend? Do you have to imagine every possible workflow, avoid N+1 request issues, test every request configuration, or deny yourself features when you know exactly what each page needs to look like? You can probably see where I’m going with this.</p>
<h2 id="so-what-do-you-suggest">So what do you suggest?</h2>
<p>I suggest you stop treating your frontend as some generic <span class="small-caps">API</span> client, and start treating it as a half of your app.</p>
<p>Imagine if you could just send it the whole<span class="push-double"></span> <span class="pull-double">“</span>page” worth of <span class="small-caps">JSON</span>. Make an endpoint for <code>/page/a</code> and render the whole <span class="small-caps">JSON</span> for <code>/page/a</code> there. Do this for every page. Don’t force your front-end developers to send a bunch of individual requests to render a complex page. Stop annoying them with contrived limitations. Align yourselves. <span class="small-caps">🧘‍♂️</span></p>
<p>And in that <span class="small-caps">JSON</span>, actually render the page. Don’t render abstract models and collections. Render concrete boxes, sections, paragraphs, lists. Render the visual page structure.</p>
<pre><code class="hljs json">{
  "section1": {
    "topBoxTitle": "Foo",
    "leftBoxTitle": "Bar",
    "linkToClose": "https://…"
  },
  "section2": {
    …
  }
}</code></pre>
<p>This is similar but not quite the same as Server Driven <span class="small-caps">UI</span><a href="#footnote-1AYG" class="footnote-ref" id="ref-1AYG" role="doc-noteref"><sup>1</sup></a>. Perhaps we could call it Server Informed <span class="small-caps">UI</span>.</p>
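<p>On the backend, this could look roughly like the sketch below: one builder per page, each returning page-shaped data. Everything here is an assumption for illustration, including the builder table, the field names, and the plain callable that follows the Rack convention of returning <code>[status, headers, body]</code>. Your framework’s routing would take the place of the hash lookup.</p>

```ruby
require "json"

# Hypothetical page builders: one per page, each returning page-shaped data.
PAGE_BUILDERS = {
  "/page/a" => lambda {
    {
      header: { title: "Account overview" },
      summaryBox: { label: "Open invoices", value: 3 }
    }
  }
}

# A Rack-style callable: takes a request env, returns [status, headers, body].
PAGE_APP = lambda do |env|
  builder = PAGE_BUILDERS[env["PATH_INFO"]]
  headers = { "content-type" => "application/json" }

  if builder
    [200, headers, [JSON.generate(builder.call)]]
  else
    [404, headers, [JSON.generate({ error: "unknown page" })]]
  end
end

status, _headers, body = PAGE_APP.call({ "PATH_INFO" => "/page/a" })
puts status
puts body.first
```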
<h2 id="how-is-that-better-exactly">How is that better exactly?</h2>
<p>Have you seen that list of annoying decisions up there? For one, they are gone now. <span class="small-caps">💨</span></p>
<p>For two, you are now free to decide<span class="push-double"></span> <span class="pull-double">“</span>I want page a” and then implement<span class="push-double"></span> <span class="pull-double">“</span>page a” in the backend, and in the frontend. Super straightforward. ✅</p>
<p>No more<span class="push-double"></span> <span class="pull-double">“</span>what <span class="small-caps">API</span> workflows do we need to introduce to sort of make this page possible almost? <span class="small-caps">🤔</span><span class="push-double"></span><span class="pull-double">”</span>. You can keep<span class="push-double"></span> <span class="pull-double">“</span>page a” dumb to only do what it needs to do. You test the crap out of<span class="push-double"></span> <span class="pull-double">“</span>page a” for bugs, security, performance. You can even fetch everything for<span class="push-double"></span> <span class="pull-double">“</span>page a” in a single big <span class="small-caps">SQL</span> query. You can cache the entire <span class="small-caps">JSON</span> payload of<span class="push-double"></span> <span class="pull-double">“</span>page a”.</p>
<p>Frontend knows exactly what each field in<span class="push-double"></span> <span class="pull-double">“</span>page a” payload is for. There are no discrepancies in field meanings. They represent exactly what frontend needs.</p>
<p>When a stakeholder tells you to change<span class="push-double"></span> <span class="pull-double">“</span>page a” you will be able to literally go ahead and change<span class="push-double"></span> <span class="pull-double">“</span>page a”, instead of spending meetings figuring out how your backend <span class="small-caps">API</span> could accommodate the change in<span class="push-double"></span> <span class="pull-double">“</span>page a”. It’s not a choreographed conglomeration of <span class="small-caps">API</span> requests. It’s just<span class="push-double"></span> <span class="pull-double">“</span>page a”. You have freed yourself from self-imposed limitations of your <span class="small-caps">API</span>.</p>
<p>Your business logic has now moved from being haphazardly split between frontend and backend into just backend. Your frontend can finally focus on presentation and <span class="small-caps">UI</span>. Your backend can finally focus on implementing exactly what’s needed. Kinda the goal, no?</p>
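<p>The caching point mentioned above is easy to picture once a page is a single payload. Here’s a minimal sketch, assuming an in-memory hash and a version stamp that changes whenever the page’s data changes. The class name, the <code>fetch</code> signature, and the versioning scheme are all hypothetical, and a real app would more likely lean on its framework’s cache store.</p>

```ruby
require "json"

# Hypothetical in-memory cache keyed by page name and a version stamp.
class PagePayloadCache
  def initialize
    @payloads = {}
  end

  # Returns the cached JSON string, building it only on a cache miss.
  def fetch(page, version, &build)
    key = [page, version]
    @payloads.fetch(key) { @payloads[key] = JSON.generate(build.call) }
  end
end

cache = PagePayloadCache.new
builds = 0
build_page_a = lambda do
  builds += 1
  { header: { title: "Page A" } }
end

cache.fetch("page_a", 1, &build_page_a) # builds the payload
cache.fetch("page_a", 1, &build_page_a) # served from the cache
cache.fetch("page_a", 2, &build_page_a) # new version, so it is rebuilt
puts builds # prints 2
```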
<h2 id="have-you-actually-tried-this">Have you actually tried this?</h2>
<p>Yes, I have tried this on a couple of production projects so far. One of them was personal, the other was a consistent multi-year refactoring effort in an existing company. The whole team was bought in, and it worked out well. The only problem we’ve encountered was that the front-end team has gotten increasingly bored. Nearly all business logic was taken away from them. At the same time, no<span class="push-double"></span> <span class="pull-double">“</span>excitement” was added to the back-end team. It’s just gotten kinda boring all around. Somehow we all ended up talking more about the business than the code.</p>
<p>Feel free to stop reading here if you’re convinced. Next part is just responding to various rebuttals I keep hearing.</p>
<h3 id="but-i-want-my-front-end-team-to-have-freedom-or-i-want-my-front-end-to-be-decoupled">But I want my front-end team to have freedom! (Or, I want my front-end to be decoupled!)</h3>
<p>Let’s be honest, your frontend doesn’t really have freedom. When they send you 7 requests to render a single page, that’s not freedom. It’s jumping through hoops to meet basic requirements. As soon as requirements change, you’re probably going to need to change the backend anyway to accommodate them. The freedom is all accidental and mostly in the wrong places.</p>
<p>If you really want to give your front end team freedom, install them a GraphQL wrapper directly on top of Postgres and quit. <span class="small-caps">😛</span></p>
<h3 id="but-we-actually-want-a-general-purpose-api-anyway-so-this-is-2-birds-with-1-stone-no">But we actually want a general purpose <span class="small-caps">API</span> anyway, so this is 2 birds with 1 stone, no?</h3>
<p>No, you would not actually want to make this <span class="small-caps">API</span> public. You think you would, but when the time comes, you’d be like<span class="push-double"></span> <span class="pull-double">“</span>crap, maybe I shouldn’t”. These 2 APIs have very different reasons to change. A public <span class="small-caps">API</span> needs to enable the workflows of your clients. A private backend needs to enable the next whim of your product manager. Stop jamming sticks into your own bicycle wheels.</p>
<h3 id="but-how-will-i-reuse-the-logic-when-building-json-for-pages-i-reused-so-much-logic-in-my-crud-controllers">But how will I reuse the logic when building <span class="small-caps">JSON</span> for pages? I reused so much logic in my <span class="small-caps">CRUD</span> controllers!</h3>
<p>If your programming language lets you reuse logic (it does), then you can reuse logic. Use mixins, composition, inheritance, whatever you got to work with. If you make yourself some good abstractions, then you will have an amazing time putting together pages from your <span class="small-caps">LEGO</span> blocks.</p>
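<p>As a rough sketch of what that reuse could look like in Ruby, here’s a mixin shared by two page builders. The module, the classes, and the field names are hypothetical; the point is only that page-specific payloads can still be assembled from shared pieces.</p>

```ruby
require "json"

# Hypothetical reusable building block shared by several pages.
module UserBoxes
  def user_box(user)
    { title: user[:name], subtitle: user[:email] }
  end
end

class ProfilePage
  include UserBoxes

  def initialize(user)
    @user = user
  end

  # Page-shaped payload: a header box plus a page-specific link.
  def to_h
    { header: user_box(@user), settingsLink: "/users/#{@user[:id]}/settings" }
  end
end

class TeamPage
  include UserBoxes

  def initialize(members)
    @members = members
  end

  # A different page reusing the same box for every member.
  def to_h
    { memberList: @members.map { |member| user_box(member) } }
  end
end

ada = { id: 1, name: "Ada", email: "ada@example.com" }
puts JSON.generate(ProfilePage.new(ada).to_h)
puts JSON.generate(TeamPage.new([ada]).to_h)
```

<p>Each page keeps its own shape and its own reasons to change, while <code>user_box</code> is written once.</p>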
<h3 id="but-we-can-reuse-this-api-for-the-mobile-app-too">But we can reuse this <span class="small-caps">API</span> for the mobile app too!</h3>
<p>Your mobile app has a different set of pages with different info, structures, and reasons to change. You’ll save more time and sanity making another backend specifically for it. But hey, you can reuse a lot of your logic (see the previous paragraph).</p>
<h3 id="but-what-if-a-page-needs-a-partial-xhr-update-am-i-supposed-to-always-return-an-entire-page">But what if a page needs a partial <span class="small-caps">XHR</span> update? Am I supposed to always return an entire page?</h3>
<p>No, it’s okay to make an endpoint that returns just something specific. You have my permission. Make endpoints for snippets of data for specific page sections or whatever. It’s okay. Render your React components from initial payload, then update them from <span class="small-caps">XHR</span> calls to these endpoints. But only introduce these endpoints when you need them on certain pages. These are exceptions, not the default.</p>
<h3 id="but-my-frontend-is-a-spa-so-it-almost-always-needs-data-snippets-not-entire-pages">But my frontend is a <span class="small-caps">SPA</span>, so it almost always needs data snippets, not entire pages</h3>
<p>Those data snippets could still be provided as partial page structures, not generic resources. As long as your backend only serves the exact needs of your frontend, you’re good. <span class="small-caps">😇</span></p>
<h3 id="but-im-building-a-site-builder-so-my-frontend-is-dogfooding-the-site-builder-api">But I’m building a site builder, so my frontend is dogfooding the site builder <span class="small-caps">API</span></h3>
<p><span class="small-caps">🗡</span> I dub thee a legitimate use case haver, congratulations!</p>
<h3 id="do-you-have-data-to-support-your-claims">Do you have data to support your claims?</h3>
<p>I wish. It’s pretty hard to measure these kinds of things in our industry. Who’s gonna maintain 2 architectures for the same software for 3 years, and compare productivity between them? All I got is a mixed bag of personal experiences. Feels inductively justifiable. <span class="small-caps">🤷‍♂️</span></p>
<p><strong>Update 2025-12-11</strong>: There is now a <a href="https://max.engineer/server-informed-ui-p2">follow up article (4 years later)</a>.</p>
<section id="footnotes" class="footnotes footnotes-end-of-document" role="doc-endnotes">
<hr>
<ol>
<li id="footnote-1AYG"><p>There has already been some experimentation with this approach. A Server Driven <span class="small-caps">UI</span> is when the <span class="small-caps">API</span> tells the client which components to display and with which content. That said, most <a href="https://github.com/Lona/Lona"><span class="small-caps">SDUI</span></a> <a href="https://www.infoq.com/news/2021/07/airbnb-server-driven-ui/">implementations</a> take this idea all the way. They treat <span class="small-caps">API</span> payloads as a kind of declarative <span class="small-caps">UI</span> language. The front-end then acts as an interpreter, and dynamically renders the declared components. I don’t think this level of generalization is necessary for most apps, but it’s a fun approach to explore.<a href="#ref-1AYG" class="footnote-back" role="doc-backlink"><span class="small-caps">↩︎</span></a></p></li>
</ol>
</section>  ]]></description>
  </item>
  <item>
    <title><![CDATA[ 3 Reasons Not To Implicitly Memoize ]]></title>
    <link>https://max.engineer/3-reasons-not-to-memoize</link>
    <guid>https://max.engineer/3-reasons-not-to-memoize</guid>
    <pubDate>Sat, 21 Mar 2020 00:00:00 -0400</pubDate>
    <dc:creator><![CDATA[ Max Chernyak ]]></dc:creator>
    <description><![CDATA[  

<p>The other day I was listening to this <a href="https://www.bikeshed.fm/237">Bikeshed podcast episode</a>, where the hosts were discussing when it is a good idea to memoize values using the <code>||=</code> Ruby idiom. Since this is a common question even among seasoned developers, I decided to write up my take on it. The short answer is: <em>never</em>.</p>
<h2 id="problem">Problem</h2>
<p>Let’s take a look at this example. We query the database to find the user by id, then use their email to make an <span class="small-caps">API</span> call to download a profile and grab the name. While this example is indeed contrived, it’s fairly common to see variations on this theme in the wild.</p>
<pre><code class="hljs ruby"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">name</span></span>
  <span class="hljs-variable">@name</span> <span class="hljs-params">||</span>= <span class="hljs-variable">@api</span>.fetch_profile(User.find(<span class="hljs-variable">@id</span>).email).name
<span class="hljs-keyword">end</span></code></pre>
<p>Now, just to get it out of the way, there are various problems with this code. However, in this post, let’s just view it from the angle of memoization. So, what are the 3 reasons not to memoize like this?</p>
<h3 id="reason-1-caller-is-misled-about-the-real-impact-of-making-this-call.">Reason 1: Caller is misled about the real impact of making this call.</h3>
<p>Typically, doing this sort of memoization goes hand-in-hand with naming your method with a noun. Since the method is named so inconspicuously (<code>name</code>), we’re signalling that a caller doesn’t have to worry what happens under the hood. We perpetuate the practice of calling this method mindlessly, with no regard for the fragile sequence of interdependent network operations that it takes to fulfill the request. I get it, we want to encapsulate the plumbing, but couldn’t we do it without misleading the caller?</p>
<h3 id="reason-2-caller-has-no-say-in-cache-invalidation.">Reason 2: Caller has no say in cache invalidation.</h3>
<p>This memoization style assumes that the caller will never want a fresh value. For web apps, it probably comes out of another assumption that we’re always living within a web request, and we never want to fetch any data twice. Unfortunately, each such memoization slowly eats away at our understanding of how data flows through our application, making it much harder to debug problems, or implement anything else on top of the same codebase.</p>
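<p>A tiny sketch of the staleness, using a made-up <code>FakeApi</code> in place of the real network call (both <code>FakeApi</code> and <code>Profile</code> are hypothetical names). Once the value is memoized, the caller keeps getting the old one even after the source changes.</p>

```ruby
# Hypothetical stand-in for a remote profile API whose data changes over time.
class FakeApi
  attr_accessor :current_name

  def initialize(name)
    @current_name = name
  end

  def fetch_name
    @current_name
  end
end

class Profile
  def initialize(api)
    @api = api
  end

  # Same memoization style as the example above.
  def name
    @name ||= @api.fetch_name
  end
end

api = FakeApi.new("Ann")
profile = Profile.new(api)
first = profile.name

api.current_name = "Anna" # the remote value changes
second = profile.name     # the memoized value comes back anyway

puts first
puts second
```

<p>Both lines print the original name. The only way to get a fresh one is to throw away the whole <code>Profile</code> object.</p>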
<h3 id="reason-3-caller-has-no-way-of-stopping-redundant-work.">Reason 3: Caller has no way of stopping redundant work.</h3>
<p>In our example, if a caller already has a <code>user</code> available, the method will fetch it again anyway. In a well architected system we should be able to inject that dependency, especially if it took something as error-prone as network or database roundtrips to obtain it.</p>
<h2 id="solution">Solution</h2>
<p>How would we avoid all 3 of the above problems? It’s not that difficult, but with a caveat that you didn’t already overcommit to bigger architectural mistakes. Still, it’s never too late to stop making things worse. So without further ado, here’s the code free of all of the above problems.</p>
<pre><code class="hljs ruby">def retrieve_name <span class="hljs-symbol">email:</span> User.find(<span class="hljs-variable">@id</span>).email, <span class="hljs-symbol">api:</span> <span class="hljs-variable">@api</span>
  api.fetch_profile(email).name
<span class="hljs-keyword">end</span></code></pre>
<p>You might’ve just done a double-take: wait, how is this the solution? We just removed caching and added some useless arguments. Bear with me, let’s talk through this real quick.</p>
<p>Note that arguments are optional, so the method can still be called without passing anything. Let’s go back and see if we’ve addressed the problems with the original code.</p>
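<p>If you want to convince yourself that a default is only computed when the argument is left out, here’s a minimal sketch. <code>expensive_email_lookup</code> and <code>FakeApi</code> are made-up stand-ins for the database lookup and the api object.</p>

```ruby
# Records every time the "expensive" default gets computed.
LOOKUPS = []

# Hypothetical stand-in for User.find(@id).email.
def expensive_email_lookup
  LOOKUPS << :hit
  "stored@example.com"
end

# Hypothetical stand-in for the api object.
class FakeApi
  def fetch_name(email)
    "name for #{email}"
  end
end

def retrieve_name(email: expensive_email_lookup, api: FakeApi.new)
  api.fetch_name(email)
end

retrieve_name                             # default runs, lookup happens
retrieve_name(email: "given@example.com") # default skipped, no lookup
puts LOOKUPS.size
```

<p>Ruby evaluates a default expression at call time, and only for arguments the caller didn’t pass, so the second call never touches the lookup.</p>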
<h3 id="is-caller-still-misled-about-the-real-impact-of-calling-this">1. Is caller still misled about the real impact of calling this?</h3>
<p>No.&nbsp;The fact that this method name is now a verb <code>retrieve_name</code> makes it clear that when you call it, it will do things. That’s all it takes to send the correct signal.</p>
<h3 id="can-the-caller-control-cache-invalidation">2. Can the caller control cache invalidation?</h3>
<p>Yes.</p>
<pre><code class="hljs ruby">name = retrieve_name

<span class="hljs-comment"># Name is now cached, feel free to reuse it.</span>
do_something_with(name)
do_something_else_with(name)

<span class="hljs-comment"># Get a fresh name whenever you want.</span>
fresh_name = retrieve_name</code></pre>
<h3 id="can-the-caller-stop-redundant-work-from-happening">3. Can the caller stop redundant work from happening?</h3>
<p>Totally.</p>
<pre><code class="hljs ruby">my_user = User.find(<span class="hljs-number">123</span>)
name = retrieve_name(<span class="hljs-symbol">email:</span> my_user.email) <span class="hljs-comment"># Saves a database call.</span></code></pre>
<p>In case it’s not obvious, we couldn’t accept arguments the same way in the original version, because we’re only caching one value, and even if we then passed a different user, we would still get back the first cached value.</p>
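<p>Here’s that failure mode as a deliberately broken sketch (<code>NameLookup</code> is a hypothetical class, and the string stands in for the real lookup).</p>

```ruby
class NameLookup
  # Deliberately flawed: memoizes a single value but accepts an argument.
  def name_for(email)
    @name ||= "profile of #{email}"
  end
end

lookup = NameLookup.new
a = lookup.name_for("a@example.com")
b = lookup.name_for("b@example.com")

puts a
puts b
```

<p>The second call silently returns the profile for the first email, because the argument is ignored as soon as <code>@name</code> is set.</p>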
<p>Ultimately, with very little effort, we just gained 3 significant advantages in maintainability, reusability, and performance of our code.</p>
<h2 id="faq"><span class="small-caps">FAQ</span></h2>
<h3 id="what-if-i-need-to-call-this-method-from-different-places-so-i-dont-have-a-variable-to-reuse">What if I need to call this method from different places, so I don’t have a variable to reuse?</h3>
<p>I feel your pain. Unfortunately, if you must depend on this caching technique because you cannot assign a variable once and pass it around, I have some bad news for you. Your abstractions need rethinking. There should be a top level routine in your code that tells the story of a particular transaction. Values that are reused need to be floated up into that context and passed into whatever needs them. In a vanilla Rails world, a place like this would be your controller actions. If doing this makes your actions too long, you’re missing intermediary objects that give you a clean abstraction to write your routine. That said, this is a pretty big topic best left for future blog posts.</p>  ]]></description>
  </item>
  <item>
    <title><![CDATA[ Don’t use docker to run your app in development ]]></title>
    <link>https://max.engineer/docker-in-dev</link>
    <guid>https://max.engineer/docker-in-dev</guid>
    <pubDate>Sat, 04 Aug 2018 00:00:00 -0400</pubDate>
    <dc:creator><![CDATA[ Max Chernyak ]]></dc:creator>
    <description><![CDATA[  

<p>Using docker in development can be very convenient, but running your actual app (you know, the one you’re coding) in docker introduces various headaches.</p>
<ol type="1">
<li>Mounted volumes are slow and error-prone</li>
<li>You need hacks and shortcuts to run any console/debug commands in containers</li>
<li>Live updates on code changes are unreliable in docker</li>
<li>Runtime is slower in docker</li>
<li>Dependency updates are slower in docker</li>
<li>Networking is more complicated with docker</li>
</ol>
<p>What genuinely surprises me is that teams often don’t consider the obvious: just run your app directly on your machine. I find it to be the sweet spot of dev setup. Let’s see what it would look like.</p>
<h2 id="use-docker-compose-for-databases-and-external-services">1. Use docker-compose for databases and external services</h2>
<p>Create a <code>docker-compose.yml</code> file in your app’s root and only declare your databases in it. For example, this file gives you</p>
<ul>
<li>Postgres on <code>localhost:5432</code></li>
<li>Redis on <code>localhost:6379</code></li>
<li>Fake <span class="small-caps">S3</span> on <code>localhost:9000</code></li>
</ul>
<pre><code class="hljs yaml"><span class="hljs-attr">version:</span> <span class="hljs-string">'3'</span>

<span class="hljs-attr">services:</span>
  <span class="hljs-attr">postgres:</span>
    <span class="hljs-attr">image:</span> <span class="hljs-string">postgres:10.3-alpine</span>
    <span class="hljs-attr">ports:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">"5432:5432"</span>
    <span class="hljs-attr">volumes:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">postgres-data:/var/lib/postgresql/data</span>
  <span class="hljs-attr">redis:</span>
    <span class="hljs-attr">image:</span> <span class="hljs-string">redis:3.2.11-alpine</span>
    <span class="hljs-attr">ports:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">"6379:6379"</span>
    <span class="hljs-attr">volumes:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">redis-data:/data</span>
  <span class="hljs-attr">minio:</span>
    <span class="hljs-attr">image:</span> <span class="hljs-string">minio/minio</span>
    <span class="hljs-attr">volumes:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">minio-data:/data</span>
    <span class="hljs-attr">ports:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">"9000:9000"</span>
    <span class="hljs-attr">entrypoint:</span> <span class="hljs-string">sh</span>
    <span class="hljs-attr">command:</span> <span class="hljs-string">-c</span> <span class="hljs-string">"mkdir -p /data/dev /data/test &amp;&amp; /usr/bin/minio server /data"</span>
    <span class="hljs-attr">environment:</span>
      <span class="hljs-attr">MINIO_ACCESS_KEY:</span> <span class="hljs-string">access_key</span>
      <span class="hljs-attr">MINIO_SECRET_KEY:</span> <span class="hljs-string">secret_key</span>

<span class="hljs-attr">volumes:</span>
  <span class="hljs-attr">postgres-data:</span>
  <span class="hljs-attr">redis-data:</span>
  <span class="hljs-attr">minio-data:</span></code></pre>
<h2 id="use-asdf-for-language-runtimes">2. Use <a href="https://github.com/asdf-vm/asdf">asdf</a> for language runtimes</h2>
<p>Create a <code>.tool-versions</code> file in your app’s root. Here’s an example for an Elixir and Node setup.</p>
<pre><code>elixir 1.6.4-otp-20
erlang 20.2.4
nodejs 10.8.0</code></pre>
<p><a href="https://github.com/asdf-vm/asdf">Asdf</a> is like rvm, nvm, and other version managers combined. It has <a href="https://github.com/asdf-vm/asdf-plugins#plugin-list">an extensive list</a> of things it can manage.</p>
<h2 id="setup-everything">3. Set up everything</h2>
<p>Now you can bootstrap the application by running</p>
<pre><code class="hljs shell">asdf install
docker-compose up</code></pre>
<p>and in another terminal you run the app itself:</p>
<pre><code class="hljs shell">mix phx.server</code></pre>
<p>That’s it. Now you have the benefit of a quick and simple dev setup without giving up the convenience of interacting with your app directly, with no containers in the middle.</p>
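<p>If you’d like a quick sanity check that the dockerized services are reachable from your machine, something like this sketch could work. The <code>port_open?</code> helper is my own invention, and the ports are the ones from the compose file above. It only checks that something accepts TCP connections, not that the service is healthy.</p>

```ruby
require "socket"

# Returns true if something accepts TCP connections on host:port.
def port_open?(host, port, timeout: 1)
  Socket.tcp(host, port, connect_timeout: timeout) { true }
rescue SystemCallError, SocketError, IOError
  false
end

# Ports taken from the docker-compose.yml above.
SERVICES = { "postgres" => 5432, "redis" => 6379, "minio" => 9000 }

SERVICES.each do |name, port|
  status = port_open?("localhost", port) ? "up" : "down"
  puts "#{name} (localhost:#{port}): #{status}"
end
```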
<h2 id="bonus-how-to-make-local-rails-work-with-dockerized-postgres">Bonus: How to make local Rails work with dockerized Postgres?</h2>
<p>The cool part is that your <code>database.yml</code> can be committed to the repo, because it will always look the same:</p>
<pre><code class="hljs yaml"><span class="hljs-attr">default:</span> <span class="hljs-meta">&amp;default</span>
  <span class="hljs-attr">adapter:</span> <span class="hljs-string">postgresql</span>
  <span class="hljs-attr">username:</span> <span class="hljs-string">postgres</span>
  <span class="hljs-attr">host:</span> <span class="hljs-string">localhost</span>

<span class="hljs-attr">development:</span>
  <span class="hljs-string">&lt;&lt;:</span> <span class="hljs-meta">*default</span>
  <span class="hljs-attr">database:</span> <span class="hljs-string">myapp_dev</span>

<span class="hljs-attr">test:</span>
  <span class="hljs-string">&lt;&lt;:</span> <span class="hljs-meta">*default</span>
  <span class="hljs-attr">database:</span> <span class="hljs-string">myapp_test</span>

<span class="hljs-attr">production:</span>
  <span class="hljs-string">&lt;&lt;:</span> <span class="hljs-meta">*default</span>
  <span class="hljs-attr">database:</span> <span class="hljs-string">myapp</span></code></pre>
<p>However, there’s a minor issue when using this setup in Rails. You might get an error when trying to install the pg gem or run a <code>rake db:structure:dump</code> command. Both of these actions rely on postgres being installed locally. To work around it simply add postgres to your <code>.tool-versions</code> — asdf supports it. You will not be actually running this postgres, only using its cli as a client, and satisfying pg’s dependencies.</p>
<!--meta
  toc: true
  date: "2018-08-04"
  tldr: Use docker-compose and asdf for local dev
-->  ]]></description>
  </item>
  <item>
    <title><![CDATA[ Elasticsearch gems and modules, clearly explained ]]></title>
    <link>https://max.engineer/elasticsearch-gems</link>
    <guid>https://max.engineer/elasticsearch-gems</guid>
    <pubDate>Tue, 15 Sep 2015 00:00:00 -0400</pubDate>
    <dc:creator><![CDATA[ Max Chernyak ]]></dc:creator>
    <description><![CDATA[  

<h2 id="non-rails-specific">Non Rails-specific</h2>
<h3 id="gem-elasticsearch-transport">Gem elasticsearch-transport</h3>
<p>Provides a bare-bones <span class="small-caps">HTTP</span> client that doesn’t have any Elasticsearch-specific api methods, but knows how to discover and connect to multiple servers, rotate connections, and log things.</p>
<ul>
<li>readme: <a href="https://github.com/elastic/elasticsearch-ruby/tree/master/elasticsearch-transport">elastic/elasticsearch-ruby/elasticsearch-transport</a></li>
</ul>
<h3 id="gem-elasticsearch-api">Gem elasticsearch-api</h3>
<p>Provides a module that adds elasticsearch-specific methods such as <code>search</code>, <code>cluster</code>, <code>index</code> to a generic <span class="small-caps">HTTP</span> client. Can be included in any class that implements method <code>perform_request</code> which returns an object responding to <code>status</code>, <code>body</code>, <code>headers</code>.</p>
<ul>
<li>readme: <a href="https://github.com/elastic/elasticsearch-ruby/tree/master/elasticsearch-api">elastic/elasticsearch-ruby/elasticsearch-api</a></li>
</ul>
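<p>To make the <code>perform_request</code> contract more concrete, here’s a sketch that runs without the gem. <code>ToySearchAPI</code> is a made-up stand-in for the real <code>Elasticsearch::API</code> module, reduced to a single method, and the argument list of <code>perform_request</code> here is an assumption, so check the readme for the real signature.</p>

```ruby
# Stand-in for the real Elasticsearch::API module, reduced to one method
# so this sketch runs without the gem. This is not the gem's actual code.
module ToySearchAPI
  def search(index:, body:)
    perform_request("GET", "#{index}/_search", {}, body)
  end
end

# The response only has to respond to status, body, and headers.
Response = Struct.new(:status, :body, :headers)

class MySimpleClient
  include ToySearchAPI

  # A real client would send an HTTP request here; this one echoes its input.
  def perform_request(method, path, params, body)
    Response.new(200, { "method" => method, "path" => path, "sent" => body }, {})
  end
end

response = MySimpleClient.new.search(index: "articles", body: { query: { match_all: {} } })
puts response.status
puts response.body["path"]
```

<p>The point is the shape: the module supplies the convenient methods, and your class supplies the one method that actually talks to the server.</p>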
<h3 id="gem-elasticsearch">Gem elasticsearch</h3>
<p>Depends on:</p>
<ul>
<li><a href="#gem-elasticsearch-transport">elasticsearch-transport</a></li>
<li><a href="#gem-elasticsearch-api">elasticsearch-api</a></li>
</ul>
<p>All it does is take an <span class="small-caps">HTTP</span> client from elasticsearch-transport and include the <code>Elasticsearch::API</code> module from elasticsearch-api into it, providing a more convenient client as a result.</p>
<ul>
<li>readme: <a href="https://github.com/elastic/elasticsearch-ruby/tree/master/elasticsearch">elastic/elasticsearch-ruby/elasticsearch</a></li>
</ul>
<h3 id="gem-elasticsearch-dsl">Gem elasticsearch-dsl</h3>
<p>Provides a <a href="https://github.com/karmi/retire">tire</a>-like syntax for defining queries. The resulting query object is useless on its own, but it supports <code>to_hash</code>, and therefore can easily be fed into any <span class="small-caps">HTTP</span> client by encoding Hash as <span class="small-caps">JSON</span>. If you feed this object to the client from elasticsearch-transport, it will be automatically dumped as <span class="small-caps">JSON</span> using the default MultiJson serializer.</p>
<ul>
<li>readme: <a href="https://github.com/elastic/elasticsearch-ruby/tree/master/elasticsearch-dsl">elastic/elasticsearch-ruby/elasticsearch-dsl</a></li>
</ul>
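<p>The <code>to_hash</code> part is easy to picture with plain Ruby. In this sketch <code>MatchQuery</code> is a hypothetical class, not something from the gem, and it uses the standard <code>json</code> library rather than MultiJson. The real <span class="small-caps">DSL</span> builds richer objects, but the contract is the same.</p>

```ruby
require "json"

# Hypothetical query object; the only contract used here is to_hash.
class MatchQuery
  def initialize(field, text)
    @field = field
    @text = text
  end

  def to_hash
    { query: { match: { @field => @text } } }
  end
end

query = MatchQuery.new(:title, "ruby")
payload = JSON.generate(query.to_hash)
puts payload
```

<p>Anything that can send a string body over <span class="small-caps">HTTP</span> can use <code>payload</code> as is.</p>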
<h3 id="gem-elasticsearch-watcher">Gem elasticsearch-watcher</h3>
<p>Depends on:</p>
<ul>
<li><a href="#gem-elasticsearch-api">elasticsearch-api</a></li>
</ul>
<p>Extends <code>Elasticsearch::API</code> with an extra method <code>watcher</code>, which in turn provides methods specific to the <a href="https://www.elastic.co/guide/en/watcher/current/introduction.html">Watcher</a> plugin, such as <code>put_watch</code>, <code>get_watch</code>, and others.</p>
<ul>
<li>readme: <a href="https://github.com/elastic/elasticsearch-ruby/tree/master/elasticsearch-watcher">elastic/elasticsearch-ruby/elasticsearch-watcher</a></li>
</ul>
<h3 id="gem-elasticsearch-extensions">Gem elasticsearch-extensions</h3>
<p>Depends on:</p>
<ul>
<li><a href="#gem-elasticsearch">elasticsearch</a></li>
</ul>
<p>Adds contributor-friendly features like terminal colorizers and formatters for Elasticsearch responses, cluster start/stop for testing, and profiling features for testing.</p>
<ul>
<li>readme: <a href="https://github.com/elastic/elasticsearch-ruby/tree/master/elasticsearch-extensions">elastic/elasticsearch-ruby/elasticsearch-extensions</a></li>
</ul>
<h2 id="rails-specific">Rails-specific</h2>
<h3 id="gem-elasticsearch-model">Gem elasticsearch-model</h3>
<p>Depends on:</p>
<ul>
<li><a href="#gem-elasticsearch">elasticsearch</a></li>
</ul>
<p>This gem contains various modules to be included into models. It does nothing without explicit includes.</p>
<ul>
<li>readme: <a href="https://github.com/elastic/elasticsearch-rails/tree/master/elasticsearch-model">elastic/elasticsearch-rails/elasticsearch-model</a></li>
</ul>
<h4 id="module-elasticsearchmodelproxy">Module Elasticsearch::Model::Proxy</h4>
<p>This module is useless on its own. It adds the <code>__elasticsearch__</code> method to a model at class and instance levels, which is supposed to isolate all elasticsearch functionality underneath it. However, the proxy object is actually empty: it has no methods, and it expects that all other modules will be manually included into it.</p>
<ul>
<li>module source/docs: <a href="https://github.com/elastic/elasticsearch-rails/blob/master/elasticsearch-model/lib/elasticsearch/model/proxy.rb">elasticsearch-rails/elasticsearch-model/lib/elasticsearch/model/proxy.rb</a></li>
</ul>
<h4 id="module-elasticsearchmodelclient">Module Elasticsearch::Model::Client</h4>
<p>Adds accessor <code>client</code> to the model at both class and instance level. Default client comes from <a href="#gem-elasticsearch-transport">elasticsearch-transport</a>.</p>
<ul>
<li>module source/docs: <a href="https://github.com/elastic/elasticsearch-rails/blob/master/elasticsearch-model/lib/elasticsearch/model/client.rb">elasticsearch-rails/elasticsearch-model/lib/elasticsearch/model/client.rb</a></li>
</ul>
<h4 id="module-elasticsearchmodelnaming">Module Elasticsearch::Model::Naming</h4>
<p>Adds accessors <code>index_name</code> and <code>document_type</code> to the model at both class and instance level. Defaults are inferred from the model name.</p>
<ul>
<li>module source/docs: <a href="https://github.com/elastic/elasticsearch-rails/blob/master/elasticsearch-model/lib/elasticsearch/model/naming.rb">elasticsearch-rails/elasticsearch-model/lib/elasticsearch/model/naming.rb</a></li>
</ul>
<h4 id="module-elasticsearchmodelindexing">Module Elasticsearch::Model::Indexing</h4>
<p>Adds class methods <code>settings</code>, <code>mapping</code>, <code>create_index!</code>, <code>index_exists?</code>, <code>delete_index!</code>, and <code>refresh_index!</code>, which can be used to define and manage field mappings. Index methods implicitly use the previously defined mapping, as well as the implicitly inferred index/document_type names. The module also adds instance methods <code>index_document</code>, <code>update_document</code>, and <code>delete_document</code>, which depend on the model having <code>as_indexed_json</code> and <code>id</code> to work.</p>
<ul>
<li>module source/docs: <a href="https://github.com/elastic/elasticsearch-rails/blob/master/elasticsearch-model/lib/elasticsearch/model/indexing.rb">elasticsearch-rails/elasticsearch-model/lib/elasticsearch/model/indexing.rb</a></li>
</ul>
<h4 id="module-elasticsearchmodelsearching">Module Elasticsearch::Model::Searching</h4>
<p>Adds a class-level <code>search</code> method that accepts a to_hash-compatible object and delegates to the <code>search</code> method on the client. By default the client comes from <a href="#gem-elasticsearch-transport">elasticsearch-transport</a>.</p>
<ul>
<li>module source/docs: <a href="https://github.com/elastic/elasticsearch-rails/blob/master/elasticsearch-model/lib/elasticsearch/model/searching.rb">elasticsearch-rails/elasticsearch-model/lib/elasticsearch/model/searching.rb</a></li>
</ul>
<h4 id="module-elasticsearchmodelserializing">Module Elasticsearch::Model::Serializing</h4>
<p>Adds an instance method <code>as_indexed_json</code> to a model, which by default delegates to <code>as_json</code> with option <code>root: false</code>.</p>
<ul>
<li>module source/docs: <a href="https://github.com/elastic/elasticsearch-rails/blob/master/elasticsearch-model/lib/elasticsearch/model/serializing.rb">elasticsearch-rails/elasticsearch-model/lib/elasticsearch/model/serializing.rb</a></li>
</ul>
<h4 id="module-elasticsearchmodelimporting">Module Elasticsearch::Model::Importing</h4>
<p>Provides class-level method <code>import</code> allowing batches of records to be efficiently imported into Elasticsearch. This module automatically adapts for ActiveRecord and Mongoid.</p>
<ul>
<li>module source/docs: <a href="https://github.com/elastic/elasticsearch-rails/blob/master/elasticsearch-model/lib/elasticsearch/model/importing.rb">elasticsearch-rails/elasticsearch-model/lib/elasticsearch/model/importing.rb</a></li>
</ul>
<h4 id="module-elasticsearchmodel">Module Elasticsearch::Model</h4>
<p>When included, this module does 3 things.</p>
<ol type="1">
<li>Includes <a href="#module-elasticsearchmodelproxy">Elasticsearch::Model::Proxy</a> into the model.</li>
<li>Includes the following modules into the <code>__elasticsearch__</code> proxy object.
<ul>
<li><a href="#module-elasticsearchmodelclient">Elasticsearch::Model::Client</a></li>
<li><a href="#module-elasticsearchmodelnaming">Elasticsearch::Model::Naming</a></li>
<li><a href="#module-elasticsearchmodelindexing">Elasticsearch::Model::Indexing</a></li>
<li><a href="#module-elasticsearchmodelsearching">Elasticsearch::Model::Searching</a></li>
<li><a href="#module-elasticsearchmodelserializing">Elasticsearch::Model::Serializing</a></li>
<li><a href="#module-elasticsearchmodelimporting">Elasticsearch::Model::Importing</a></li>
</ul>
</li>
<li>Delegates some important methods from model class/instance to the <code>__elasticsearch__</code> proxy object, namely <code>search</code>, <code>mapping</code>, <code>settings</code>, <code>index_name</code>, <code>document_type</code>, <code>import</code>.</li>
</ol>
<ul>
<li>module source/docs: <a href="https://github.com/elastic/elasticsearch-rails/blob/master/elasticsearch-model/lib/elasticsearch/model.rb">elasticsearch-rails/elasticsearch-model/lib/elasticsearch/model.rb</a></li>
</ul>
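<p>The proxy-plus-delegation idea is easier to see in miniature. This is a toy sketch of the pattern only, not the gem’s code: <code>ToyProxy</code>, <code>ToyNaming</code>, <code>ToySearching</code>, <code>ToyModel</code>, and <code>__toysearch__</code> are all made-up names, and it covers just the class level.</p>

```ruby
# Hypothetical capability module, standing in for Naming.
module ToyNaming
  def index_name
    @target.name.downcase + "s"
  end
end

# Hypothetical capability module, standing in for Searching.
module ToySearching
  def search(term)
    "searching #{index_name} for #{term}"
  end
end

# The proxy has no behavior of its own; capabilities are included into it.
class ToyProxy
  include ToyNaming
  include ToySearching

  def initialize(target)
    @target = target
  end
end

# Including this into a model adds the proxy and delegates chosen methods.
module ToyModel
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    def __toysearch__
      ToyProxy.new(self)
    end

    def index_name
      __toysearch__.index_name
    end

    def search(term)
      __toysearch__.search(term)
    end
  end
end

class Article
  include ToyModel
end

puts Article.index_name
puts Article.search("ruby")
puts Article.__toysearch__.search("ruby")
```

<p>The model only gains a handful of top-level methods, and everything else stays tucked away behind the proxy.</p>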
<h4 id="module-elasticsearchmodelcallbacks">Module Elasticsearch::Model::Callbacks</h4>
<p>Adds callbacks that sync model and Elasticsearch representation on create/update/delete. The callbacks are blocking. If the syncing must be asynchronous, it’s suggested to implement your own callbacks and not use this module. This module automatically adapts for ActiveRecord and Mongoid.</p>
<ul>
<li>module source/docs: <a href="https://github.com/elastic/elasticsearch-rails/blob/master/elasticsearch-model/lib/elasticsearch/model/callbacks.rb">elasticsearch-rails/elasticsearch-model/lib/elasticsearch/model/callbacks.rb</a></li>
</ul>
<h3 id="gem-elasticsearch-persistence">Gem elasticsearch-persistence</h3>
<p>Depends on:</p>
<ul>
<li><a href="#gem-elasticsearch">elasticsearch</a></li>
<li><a href="#gem-elasticsearch-model">elasticsearch-model</a></li>
</ul>
<p>Provides a way to build models backed by an Elasticsearch database, similar to ActiveRecord models being backed by an <span class="small-caps">SQL</span> database. Additionally, provides a way of using the Repository pattern to the same effect.</p>
<ul>
<li>readme: <a href="https://github.com/elastic/elasticsearch-rails/tree/master/elasticsearch-persistence">elastic/elasticsearch-rails/elasticsearch-persistence</a></li>
</ul>
<h3 id="gem-elasticsearch-rails">Gem elasticsearch-rails</h3>
<p>Provides rake tasks for importing data from Rails models into Elasticsearch, as well as instrumentation for displaying search requests and their stats in logs. Includes special support for <a href="https://github.com/roidrage/lograge">Lograge</a>. Both rake tasks and instrumentation features must be manually required to function (no railtie support). Comes with multiple Rails application templates which allow the user to generate example applications locally, starting from a very simple integration, to a full-blown Elasticsearch-powered application, to demonstrate the gem capabilities and common usage patterns.</p>
<ul>
<li>readme: <a href="https://github.com/elastic/elasticsearch-rails/tree/master/elasticsearch-rails">elastic/elasticsearch-rails/elasticsearch-rails</a></li>
</ul>
<!-- meta
  date: "2015-09-15"
  toc: true
  tldr: Learn what each Elasticsearch gem and module does
-->  ]]></description>
  </item>
  <item>
    <title><![CDATA[ 6 practices for super smooth Ansible experience ]]></title>
    <link>https://max.engineer/six-ansible-practices</link>
    <guid>https://max.engineer/six-ansible-practices</guid>
    <pubDate>Wed, 18 Jun 2014 00:00:00 -0400</pubDate>
    <dc:creator><![CDATA[ Max Chernyak ]]></dc:creator>
    <description><![CDATA[  

<p>I started porting my setup from <a href="https://www.chef.io">Chef</a> to <a href="https://www.ansible.com">Ansible</a> a few weeks ago. Having had plenty of experience with Chef gave me a pretty good idea of what I wanted to achieve. One of the main advantages I see in Ansible is the ability to drive your server setup via ssh from your own machine. If you don’t have 100s of servers (<strong>update:</strong> actually more like tens of thousands, see the <a href="https://max.engineer/six-ansible-practices#comment-1444175554">comment by mpdehaan</a>), this agentless<span class="push-double"></span> <span class="pull-double">“</span>push” approach is very powerful. You get to simplify things tremendously in ways like</p>
<ul>
<li>deterministic order of operations across hosts</li>
<li>centralized configuration (no immediate need for the likes of <a href="https://github.com/coreos/etcd">etcd</a>/<a href="https://www.consul.io">consul</a>)</li>
<li>agent forwarding</li>
<li>better control over host resources (no unnecessary periodic runs)</li>
</ul>
<p>In essence, you have an entity that can see and orchestrate all the pieces in the system rather than having each piece trying to maintain itself by catching up to its surroundings.</p>
<p>Given the above points, this article is about running Ansible from your local machine. It assumes that the target hosts are only accessible via ssh, and helps set up Vagrant in the same way, as if it were a <span class="small-caps">VPS</span>.</p>
<p>Nevertheless, during my venture into Ansible I immediately ran into some sticking points, which I knew had to have elegant solutions, yet they were hard to search for online, or easy to miss in the docs. Naturally, they ate away my time and now I’d like to help you save yours.</p>
<h2 id="build-a-convenient-local-playground">1. Build a convenient local playground</h2>
<p>Just a few servers that can talk to each other is all you want. Your multiple production machines, their interactions, their firewalls and dns config should all just be reproduced on a smaller scale. Is that really so hard? If your hosting provider is kind of like <a href="https://www.digitalocean.com/?refcode=c59a2fd651b1">Digital Ocean</a> it’s especially useful to get it all thoroughly mimicked, since you don’t get any security groups or virtual private clouds there, so all your ipconfig and dns stuff has to be configured by hand.</p>
<p>Well, turns out it’s easy, after some screwing around.</p>
<h3 id="path-to-failure">Path to failure</h3>
<p>You start hooking up ansible provisioner in Vagrant. Don’t. It’s not even a good approximation of how you will run Ansible in production.</p>
<h3 id="path-to-success">Path to success</h3>
<p>There are 4 quick steps to having a very convenient setup.</p>
<ol type="1">
<li>Make it easy to sync your hosts file with your VMs</li>
<li>Automate adding your pub key to VMs</li>
<li>Configure your ssh client</li>
<li>Write your Vagrantfile</li>
</ol>
<h4 id="make-it-easy-to-sync-your-hosts-file-with-your-vms">1. Make it easy to sync your hosts file with your VMs</h4>
<p>This assumes you have vagrant installed. A very convenient vagrant plugin can automatically add and remove hosts every time you add or destroy VMs. Install as follows.</p>
<pre><code class="hljs shell"><span class="hljs-meta">$</span><span class="bash"> vagrant plugin install vagrant-hostsupdater</span></code></pre>
<p>Now every time you boot or destroy a <span class="small-caps">VM</span> your <code>/etc/hosts</code> will have the hostname added/removed automatically. You will notice it asking you for your sudo password every time it tries to do that.</p>
<h4 id="automate-adding-your-pub-key-to-vms">2. Automate adding your pub key to VMs</h4>
<p>I wrote a <a href="https://gist.github.com/maxim/dafc3b6da5754419babb">small ruby script</a> for Vagrant which lets you conveniently put your pub key into a <span class="small-caps">VM</span>, akin to how <a href="https://www.digitalocean.com/?refcode=c59a2fd651b1">Digital Ocean</a> would bootstrap your machine with your key. Assuming you’ve made a root dir for your Ansible project (I called mine <code>stack</code>), do this while in it.</p>
<pre><code class="hljs shell"><span class="hljs-meta">$</span><span class="bash"> mkdir vagrant</span>
<span class="hljs-meta">$</span><span class="bash"> <span class="hljs-built_in">cd</span> vagrant</span>
<span class="hljs-meta">$</span><span class="bash"> curl -O https://gist.githubusercontent.com/maxim/dafc3b6da5754419babb/raw/7789793ed7e799dc22e6222c30c6130f34a055e7/key_authorization.rb</span>
<span class="hljs-meta">$</span><span class="bash"> <span class="hljs-built_in">cd</span> ..</span></code></pre>
<p>Now you have a <code>vagrant/key_authorization.rb</code> file in there. I’ll show you how to use it in just a bit.</p>
<h4 id="configure-your-ssh-client">3. Configure your ssh client</h4>
<p><strong><span class="small-caps">SECURITY</span> <span class="small-caps">NOTICE</span>:</strong> Absolutely do not do this for your production servers. This is only safe on a private vagrant network with your own VMs.</p>
<p>We will set up our machines on a certain <span class="small-caps">IP</span> range, and I’d like them to be accessible just like <a href="https://www.digitalocean.com/?refcode=c59a2fd651b1">Digital Ocean</a> machines, directly as root. So this <code>~/.ssh/config</code> makes it much more convenient.</p>
<pre><code># For vagrant virtual machines
Host 192.168.33.* *.myapp.dev
  StrictHostKeyChecking no
  UserKnownHostsFile=/dev/null
  User root
  LogLevel ERROR</code></pre>
<p>With this one config you murdered a whole bunch of birds. Specifically,</p>
<ul>
<li><span class="small-caps">SSH</span> won’t complain about non-matching keys for your ever-changing vagrant VMs</li>
<li><span class="small-caps">SSH</span> won’t try to remember and manage those keys via known_hosts</li>
<li>You won’t have to specify <code>root@…</code> every time</li>
<li><span class="small-caps">SSH</span> will shut up about how you’re making it do such awful things</li>
</ul>
<p>Just make sure you replace <code>myapp</code> with whatever local hostname you’d like for your app, and the ip address with your desired vagrant ip range.</p>
<h4 id="write-your-vagrantfile">4. Write your Vagrantfile</h4>
<p>Now that you have everything else in place, let’s add the <code>Vagrantfile</code> into your ansible dir.</p>
<pre><code class="hljs ruby">require_relative <span class="hljs-string">'./vagrant/key_authorization'</span>

Vagrant.configure(<span class="hljs-string">'2'</span>) <span class="hljs-keyword">do</span> <span class="hljs-params">|config|</span>
  config.vm.box = <span class="hljs-string">'ubuntu/trusty64'</span>
  authorize_key_for_root config, <span class="hljs-string">'~/.ssh/id_dsa.pub'</span>, <span class="hljs-string">'~/.ssh/id_rsa.pub'</span>

  {
    <span class="hljs-string">'db1'</span>    =&gt; <span class="hljs-string">'192.168.33.10'</span>,
    <span class="hljs-string">'app1'</span>   =&gt; <span class="hljs-string">'192.168.33.11'</span>,
    <span class="hljs-string">'redis1'</span> =&gt; <span class="hljs-string">'192.168.33.12'</span>,
  }.each <span class="hljs-keyword">do</span> <span class="hljs-params">|short_name, ip|</span>
    config.vm.define short_name <span class="hljs-keyword">do</span> <span class="hljs-params">|host|</span>
      host.vm.network <span class="hljs-string">'private_network'</span>, <span class="hljs-symbol">ip:</span> ip
      host.vm.hostname = <span class="hljs-string">"<span class="hljs-subst">#{short_name}</span>.myapp.dev"</span>
    <span class="hljs-keyword">end</span>
  <span class="hljs-keyword">end</span>
<span class="hljs-keyword">end</span></code></pre>
<p>This makes it super easy to add more machines into the ruby hash, specify their exact ips, and bring your whole stack up and down with <code>vagrant up</code> and <code>vagrant suspend</code>.</p>
<p>Also notice the require line on top, and the <code>authorize_key_for_root</code> command. This is a reference to my script you downloaded earlier. With this in place the first key it finds among the ones listed will go into the <span class="small-caps">VM</span> as one of root user’s <code>authorized_keys</code>. This way you can ssh as root without a password.</p>
<p>Also thanks to our ssh config, you now get to run the following, and it’ll just work.</p>
<pre><code class="hljs shell"><span class="hljs-meta">$</span><span class="bash"> vagrant up db1</span>
<span class="hljs-meta">$</span><span class="bash"> ssh db1.myapp.dev</span>
root@db1:~#</code></pre>
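<p>For a feel of how the machine table drives everything, here is a minimal plain-Ruby sketch that runs outside Vagrant. It reuses the hash from the Vagrantfile above; the <code>MACHINES</code> constant and the <code>hosts_entries</code> helper are names made up for this illustration, and the output is only a rough picture of the hostname-to-IP pairs you would expect to end up in <code>/etc/hosts</code>, not the plugin’s exact format.</p>

```ruby
# Same name-to-IP table as in the Vagrantfile (values copied from it).
MACHINES = {
  'db1'    => '192.168.33.10',
  'app1'   => '192.168.33.11',
  'redis1' => '192.168.33.12'
}.freeze

# Hypothetical helper: builds one "ip  hostname" line per machine.
def hosts_entries(machines, domain)
  machines.map { |short_name, ip| "#{ip}  #{short_name}.#{domain}" }
end

puts hosts_entries(MACHINES, 'myapp.dev')
```

<p>Adding a fourth machine is then a one-line change to the hash, which is the whole appeal of this layout.</p>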
<p>This might make you wonder, why not simply let Ansible set up a non-root user for you, and do everything via sudo? Based on my conversations with friendly neighborhood sysadmins, passwordless sudo gives you no more security than bootstrapping via root does. All it does is add an extra useless step to every operation. As far as using Ansible as Vagrant provisioner: as I mentioned in the intro, my goal is a very production-like environment. Vagrant shouldn’t play any role in it except leave me with a few blank machines similar to the ones my <span class="small-caps">VPS</span> provider would build for me. In essence, I want my starting point on Vagrant to be almost exactly as if I used an actual live <span class="small-caps">VPS</span>, and I like to keep it simple by making good use of the default config. In my case it means a machine with a hostname, <span class="small-caps">IP</span>, and a root user with my key authorized. That’s exactly what we’re doing here.</p>
<h2 id="teach-ansible-to-talk-to-github-on-your-behalf">2. Teach Ansible to talk to Github on your behalf</h2>
<p>In an effort to keep things simple, I avoid having to create extra ssh keys on my servers and add them to Github. Instead there is a way to let servers access Github on your behalf without creating any extra identities. Ansible takes the identity of the user who initiated the playbook run and forwards it to the host, which in turn uses it to talk to Github.</p>
<p>This mechanism is called agent forwarding. You might not want this if you have a complex deploy pipeline, where a deploy server acts autonomously and has its own identity, but Ansible makes it so easy to orchestrate various processes, that I decided not to build one for my setup.</p>
<p>So there is a setting for this. Create a file right here in the root dir called <code>ansible.cfg</code> with the following contents, and it will be automatically picked up when you run Ansible.</p>
<pre><code class="hljs ini"><span class="hljs-section">[ssh_connection]</span>
<span class="hljs-attr">ssh_args</span> = -o ForwardAgent=<span class="hljs-literal">yes</span></code></pre>
<p>That’s it. No need to add new keys to github.</p>
<h2 id="add-github-to-known_hosts-properly-and-securely">3. Add Github to known_hosts properly and securely</h2>
<p>For those who are not sure what this is: a server like github can give you a key which your ssh client will use to ensure that you have a secure ssh connection. That key is easily obtained by using the following command.</p>
<pre><code class="hljs shell">ssh-keyscan -t rsa github.com</code></pre>
<h3 id="path-to-failure-1">Path to failure</h3>
<p>People out there suggest that you should run that command on your remote hosts in your Ansible playbooks to set the key dynamically. Don’t. That defeats the purpose of having the key. A <a href="https://en.wikipedia.org/wiki/Man-in-the-middle_attack">man-in-the-middle</a> attack could compromise the result you get, leaving you in the exact situation this measure was meant to prevent.</p>
<h3 id="path-to-success-1">Path to success</h3>
<p>Use the Ansible feature called <code>lookup</code>. Here’s an example Ansible task that will set the key in a secure way.</p>
<pre><code class="hljs yaml"><span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">ensure</span> <span class="hljs-string">github.com</span> <span class="hljs-string">is</span> <span class="hljs-string">a</span> <span class="hljs-string">known</span> <span class="hljs-string">host</span>
  <span class="hljs-attr">lineinfile:</span>
    <span class="hljs-attr">dest:</span> <span class="hljs-string">/root/.ssh/known_hosts</span>
    <span class="hljs-attr">create:</span> <span class="hljs-literal">yes</span>
    <span class="hljs-attr">state:</span> <span class="hljs-string">present</span>
    <span class="hljs-attr">line:</span> <span class="hljs-string">"<span class="hljs-template-variable">{{ lookup('pipe', 'ssh-keyscan -t rsa github.com') }}</span>"</span>
    <span class="hljs-attr">regexp:</span> <span class="hljs-string">"^github\\.com"</span></code></pre>
<p><strong>Careful:</strong> If you do this while having a large number of target servers, you’re gonna have a bad time. This might cause some serious bombardment of your control machine. In that case use <code>accept_hostkeys=yes</code> in your git task. I only have about 10-20 machines, so this isn’t a problem for me. (from the <a href="https://max.engineer/six-ansible-practices#comment-1444175554">comment by mpdehaan</a>)</p>
<p>You might wonder how that is different from the fail path above. First of all, this doesn’t run on a remote host, it runs on your control machine. Second of all, it only sets this key once per host. If github decides to change it you would have to write another play to update it, or modify this one. This is good because we don’t want a <span class="small-caps">MITM</span> attack to trigger a change of the real key.</p>
<p>Another piece of advice out there is to actually hardcode this key in a variable. That’s also a good way to do it, but I don’t like having ugly strings pollute my var files.</p>
<h2 id="keep-your-secret-vars-separate">4. Keep your secret vars separate</h2>
<p>I’m personally not a fan of shit-work involved in placing variables in many different files. It’s more convenient to see the whole picture in one place. However, I do believe secret variables should be either git-ignored or encrypted, and for that you need to put them into their own file.</p>
<p>In my setup I use <code>group_vars/all</code> to keep all non-secret things. So now that <strong>this file is taken</strong>, how can you share secrets among all your hosts?</p>
<h3 id="path-to-failure-2">Path to failure</h3>
<p>I spent a long time trying to figure this one out. I was advised to use lookups to fetch each individual variable from its own file elsewhere on my machine. I was also advised to place these variables into the vars file for each individual host, repeatedly. Both are fail. When I discovered the way, I admit I was kind of kicking myself.</p>
<h3 id="path-to-success-2">Path to success</h3>
<p>I simply didn’t know one little fact. Your <code>group_vars/all</code> can be a directory. All files in there can contain variables for all hosts. So I created 2 files in there, <code>config</code> and <code>secrets</code>. I also added <code>group_vars/all/secrets</code> to <code>.gitignore</code> and solved all my issues. Another approach would be to encrypt that file with <a href="https://docs.ansible.com/ansible/latest/playbooks_vault.html">ansible-vault</a> and let it stay in your repo. I didn’t need that.</p>
<h2 id="avoid-perpetually-changed-and-skipping-tasks">5. Avoid perpetually<span class="push-double"></span> <span class="pull-double">“</span>changed” and<span class="push-double"></span> <span class="pull-double">“</span>skipping” tasks</h2>
<p>As a slightly obsessive-compulsive person, I didn’t like the fact that some tasks kept showing me<span class="push-double"></span> <span class="pull-double">“</span>changed” or<span class="push-double"></span> <span class="pull-double">“</span>skipping” status. Besides the fact that it feels wrong, various notification tools might end up bothering you about things changing while they actually aren’t. One such offender was the way to create a postgres extension in your database.</p>
<h3 id="path-to-failure-3">Path to failure</h3>
<p>This is the way a typical postgres create extension task looks.</p>
<pre><code class="hljs yaml"><span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">ensure</span> <span class="hljs-string">postgresql</span> <span class="hljs-string">hstore</span> <span class="hljs-string">extension</span> <span class="hljs-string">is</span> <span class="hljs-string">created</span>
  <span class="hljs-attr">sudo:</span> <span class="hljs-literal">yes</span>
  <span class="hljs-attr">sudo_user:</span> <span class="hljs-string">postgres</span>
  <span class="hljs-attr">shell:</span> <span class="hljs-string">"psql my_database -c 'CREATE EXTENSION IF NOT EXISTS hstore;'"</span></code></pre>
<p>Every time you run it, it will be detected as<span class="push-double"></span> <span class="pull-double">“</span>changed” even though nothing actually changes.</p>
<h3 id="path-to-success-3">Path to success</h3>
<p>Instead we can leverage Ansible’s <code>register</code>, <code>changed_when</code> and <code>failed_when</code> to make this task report <code>ok</code>, as it should. Take a look at this version.</p>
<pre><code class="hljs yaml"><span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">ensure</span> <span class="hljs-string">postgresql</span> <span class="hljs-string">hstore</span> <span class="hljs-string">extension</span> <span class="hljs-string">is</span> <span class="hljs-string">created</span>
  <span class="hljs-attr">sudo:</span> <span class="hljs-literal">yes</span>
  <span class="hljs-attr">sudo_user:</span> <span class="hljs-string">postgres</span>
  <span class="hljs-attr">shell:</span> <span class="hljs-string">"psql my_database -c 'CREATE EXTENSION hstore;'"</span>
  <span class="hljs-attr">register:</span> <span class="hljs-string">psql_result</span>
  <span class="hljs-attr">failed_when:</span> <span class="hljs-string">&gt;
    psql_result.rc != 0 and ("already exists" not in psql_result.stderr)
</span>  <span class="hljs-attr">changed_when:</span> <span class="hljs-string">"psql_result.rc == 0"</span></code></pre>
<p>This clever trick takes advantage of psql exit codes and stderr output. Notice also that we removed <code>IF NOT EXISTS</code> part from the <span class="small-caps">SQL</span> to make sure we get an error if extension is already there. This is done on purpose, because we only consider the task failed if the exit code is not zero and the error is something other than<span class="push-double"></span> <span class="pull-double">“</span>already exists”. If the error is actually<span class="push-double"></span> <span class="pull-double">“</span>already exists”, then it’s not really a failure, it’s exactly what we want. The <code>changed_when</code> piece indicates that if there is no error and we exited successfully, then it means psql actually added the extension, and therefore changed. All neat now.</p>
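<p>If the two conditions feel hard to hold in your head, here is the same decision written out as a tiny Ruby function. This is only an illustration of the logic, not something Ansible runs; <code>task_status</code> is a made-up name, and the sample stderr string is an assumption about what psql prints.</p>

```ruby
# Hypothetical helper mirroring the failed_when / changed_when pair above.
# return_code and stderr stand in for psql_result.rc and psql_result.stderr.
def task_status(return_code, stderr)
  # Exit code 0: psql really created the extension, so the task changed something.
  return :changed if return_code.zero?
  # Non-zero exit, but only because the extension is already there: nothing to do.
  return :ok if stderr.include?('already exists')
  # Any other non-zero exit is a genuine failure.
  :failed
end

p task_status(0, '')
p task_status(1, 'ERROR:  extension "hstore" already exists')
p task_status(2, 'psql: could not connect to server')
```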
<p>It’s worth noting that while it hurts an obsessive person like me, sometimes it’s hard to achieve an <code>ok</code> report on some tasks. For example, if you use the <code>shell</code> module with <code>creates</code> option, it might generate <code>skipping</code> instead of <code>ok</code>, and you should let it go. Instead focus on getting rid of <code>changed</code> reports, and leave <code>skipping</code> alone.</p>
<h2 id="separate-your-setup-and-deploy-playbooks">6. Separate your setup and deploy playbooks</h2>
<p>Every time you use a package module in Ansible (like apt or npm) you have a choice between <code>state=present</code> and <code>state=latest</code>. The former will simply ensure that a desired package is installed, while the latter will, in addition to that, go ahead and update it if it’s not of the latest available version. When you are building your stack, my advice is to always prefer <code>present</code>. This also means that when using <span class="small-caps">VCS</span> modules like <code>git</code>, you should set <code>update: no</code>. This is important because you need to be able to converge your server configuration without actually deploying and changing your software. A software update, whether it’s your app’s deploy, or a dependency version bump, has nothing to do with your server configuration, and could really break your production. Your updates have to be strict, purposeful, and well thought out, which is why I suggest writing separate playbooks for them. In those playbooks it would be acceptable to use <code>state=latest</code>, since you’d only run them when you’re ready to deal with the consequences. Chances are you would need to choreograph some data and configuration to get all the updated pieces working anyway, so having a different<span class="push-double"></span> <span class="pull-double">“</span>convergence vector” for it is a much simpler approach.</p>
<p>Well, time to grab some coffee and dive back into building an awesome stack.</p>
<!--meta
  toc: true
  date: "2014-06-18"
  tldr: Vagrant, ssh, secrets, setup/deploy and more
-->  ]]></description>
  </item>
  <item>
    <title><![CDATA[ Linux permissions cheatsheet ]]></title>
    <link>https://max.engineer/permissions</link>
    <guid>https://max.engineer/permissions</guid>
    <pubDate>Sun, 15 Jun 2014 00:00:00 -0400</pubDate>
    <dc:creator><![CDATA[ Max Chernyak ]]></dc:creator>
    <description><![CDATA[  

<h2 id="chmod-abcd">chmod [a]bcd</h2>
<table>
<thead>
<tr>
<th>bit</th>
<th>scope</th>
<th>description</th>
</tr>
</thead>
<tbody>
<tr>
<td>a</td>
<td></td>
<td>sticky:1, setgid:2, setuid:4 (optional, default: 0)</td>
</tr>
<tr>
<td>b</td>
<td>owner</td>
<td>x:1/w:2/r:4 - xw:3/xr:5/wr:6/xwr:7</td>
</tr>
<tr>
<td>c</td>
<td>group</td>
<td>x:1/w:2/r:4 - xw:3/xr:5/wr:6/xwr:7</td>
</tr>
<tr>
<td>d</td>
<td>everyone</td>
<td>x:1/w:2/r:4 - xw:3/xr:5/wr:6/xwr:7</td>
</tr>
</tbody>
</table>
<ul>
<li><em>Note: only file/dir owner can chmod it</em></li>
<li><em>Note: scripts need both <code>x</code> and <code>r</code> permissions to execute</em> <em>(that’s because scripts are <strong>read</strong> into interpreter)</em><br>
<em>(only <code>r</code> is enough if run via <code>ruby script.rb</code>, <code>sh script.sh</code>)</em></li>
</ul>
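<p>To make the table concrete, here is a small Ruby sketch that decodes an octal mode string into the pieces described above. The names <code>SPECIAL_BITS</code>, <code>PERMISSION_BITS</code>, <code>bits_in</code> and <code>describe_mode</code> are invented for this example.</p>

```ruby
# Bit values from the table: the optional first digit, then owner/group/everyone.
SPECIAL_BITS    = { 4 => 'setuid', 2 => 'setgid', 1 => 'sticky' }.freeze
PERMISSION_BITS = { 4 => 'r', 2 => 'w', 1 => 'x' }.freeze

# Returns the names of every bit that is set in a single octal digit.
def bits_in(digit, names)
  names.select { |bit, _name| digit.anybits?(bit) }.values
end

# Accepts "755" or "2755"; a missing first digit defaults to 0.
def describe_mode(mode)
  a, b, c, d = mode.rjust(4, '0').chars.map { |char| char.to_i }
  {
    special:  bits_in(a, SPECIAL_BITS),
    owner:    bits_in(b, PERMISSION_BITS).join,
    group:    bits_in(c, PERMISSION_BITS).join,
    everyone: bits_in(d, PERMISSION_BITS).join
  }
end

p describe_mode('2755')
p describe_mode('644')
```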
<h2 id="files">files</h2>
<table>
<thead>
<tr>
<th>bit setting</th>
<th>meaning</th>
</tr>
</thead>
<tbody>
<tr>
<td>sticky on files</td>
<td>no effect</td>
</tr>
<tr>
<td>setgid on execable binaries</td>
<td>no matter who executes, process runs as file’s group</td>
</tr>
<tr>
<td>setuid on execable binaries</td>
<td>no matter who executes, process runs as file’s owner</td>
</tr>
<tr>
<td>setuid/setgid on scripts</td>
<td>ignored due to security issues</td>
</tr>
<tr>
<td>setuid/setgid on non-execables</td>
<td>no effect<a href="#footnote-15PN" class="footnote-ref" id="ref-15PN" role="doc-noteref"><sup>1</sup></a></td>
</tr>
</tbody>
</table>
<p><strong>Warning:</strong> <em>setuid</em> is dangerous</p>
<h2 id="directories">directories</h2>
<table>
<thead>
<tr>
<th>bit setting</th>
<th>meaning</th>
</tr>
</thead>
<tbody>
<tr>
<td>x on dirs</td>
<td><code>cd</code>, <code>stat</code> (e.g.&nbsp;<code>ls -l</code>), inode lookup (access files)</td>
</tr>
<tr>
<td>w on dirs</td>
<td>add/delete/rename files (requires <code>x</code> for inode lookup)</td>
</tr>
<tr>
<td>r on dirs</td>
<td><code>ls</code></td>
</tr>
</tbody>
</table>
<ul>
<li><em>Note: having <code>xw</code> on a dir is enough to delete any file in it</em> <em>(unless it has sticky bit)</em></li>
</ul>
<h3 id="sticky-on-dirs">sticky on dirs</h3>
<ul>
<li>only used when writable by group/everyone</li>
<li>files in dir can only be edited/deleted by their owner (think <code>/tmp</code>)</li>
<li>symlinks only work if target is within this dir</li>
</ul>
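<p>A quick way to see the sticky bit in action is to set it on a throwaway directory and ask Ruby what it sees. This is a minimal sketch, assuming a Unix-like system where you own the temp directory; <code>sticky_demo</code> is a made-up name.</p>

```ruby
require 'tmpdir'

# Creates a temporary directory, makes it world-writable with the sticky bit
# (the same mode /tmp usually has), and reports what File.stat says about it.
# The directory is removed again when the block finishes.
def sticky_demo
  Dir.mktmpdir do |dir|
    File.chmod(0o1777, dir)
    stat = File.stat(dir)
    # Keep only the four permission digits by dropping the file-type bits.
    { octal: format('%o', stat.mode % 0o10000), sticky: stat.sticky? }
  end
end

p sticky_demo
```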
<h3 id="setgid-on-dirs">setgid on dirs</h3>
<ul>
<li>all files/subdirs created by anyone in this dir inherit its group</li>
<li>all subdirs inherit this bit when created</li>
</ul>
<h3 id="setuid-on-dirs">setuid on dirs</h3>
<ul>
<li>no effect</li>
</ul>
<h2 id="sources">sources</h2>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Chmod">https://en.wikipedia.org/wiki/Chmod</a></li>
<li><a href="https://wpollock.com/AUnix1/FilePermissions.htm">https://wpollock.com/AUnix1/FilePermissions.htm</a></li>
<li><a href="https://major.io/2007/02/13/chmod-and-the-mysterious-first-octet/">https://major.io/2007/02/13/chmod-and-the-mysterious-first-octet/</a></li>
</ul>
<section id="footnotes" class="footnotes footnotes-end-of-document" role="doc-endnotes">
<hr>
<ol>
<li id="footnote-15PN"><p>There is an exception. See<span class="push-double"></span> <span class="pull-double">“</span><span class="small-caps">SUID</span> and <span class="small-caps">SGID</span> on non-executable files” on <a href="https://wpollock.com/AUnix1/FilePermissions.htm">this page</a>.<a href="#ref-15PN" class="footnote-back" role="doc-backlink"><span class="small-caps">↩︎</span></a></p></li>
</ol>
</section>  ]]></description>
  </item>
  <item>
    <title><![CDATA[ CMS Trap ]]></title>
    <link>https://max.engineer/cms-trap</link>
    <guid>https://max.engineer/cms-trap</guid>
    <pubDate>Tue, 26 Nov 2013 00:00:00 -0500</pubDate>
    <dc:creator><![CDATA[ Max Chernyak ]]></dc:creator>
    <description><![CDATA[  

<p>Many years ago there was a Rails app. It started with things. These things were actually blueprints for other things. The other things needed many associated parts, and parts of parts. How many? The blueprints knew. The blueprints absolutely had to have an admin interface, but changing the blueprints would cause a chain reaction on things and parts. Every modification to the things and their blueprints permeated throughout the coupled network of various models. The admin <span class="small-caps">UI</span> complexity quickly skyrocketed as parts continued to branch out into more entities. It got to the point where blueprints had to have serializable, persistable snippets of logic. At that point every feature had become subject to a very difficult implementation, and thus the app degraded into the state of utter unmaintainability. It felt as if there was a <em>content management system</em> standing in the way of getting things done, imposing itself as the middle man between the feature and its implementation. It was like the system actually forced all the business logic to be reframed in terms of this higher level of abstraction.</p>
<p>The worst part? This was a minimum viable product for a newly-born startup.</p>
<h2 id="accidentally-cms">Accidentally <span class="small-caps">CMS</span></h2>
<div class="frame">
<img src="https://cdn.blot.im/blog_ae6687516f4a44fb8370df95cf526289/_image_cache/490c213f-c111-4f32-ac5b-7dc5b87a8c8c.png" alt="“Complexity” by nerovivo" width="832" height="219"><span class="caption">“Complexity” by nerovivo</span>
</div>
<p>The programmer’s nature encourages us to indulge ourselves in solving puzzles and modelling abstract concepts. It’s the passion that makes us lose sight of the danger looming ahead, the trap we’re edging towards thanks to our subjective assumptions and vague speculation, the trap of building an overdesigned and overcomplicated system for its own sake. A <span class="small-caps">CMS</span> trap. We suffer various consequences, ranging from burnouts, and loss of enthusiasm, to missed deadlines, and failed businesses, yet we never seem to speak of this mistake directly. Somewhere by a water cooler, an experienced colleague casually points out that you might be overcomplicating things. Somewhere in an <span class="small-caps">IRC</span> chat you get ridiculed for asking questions about a complex object model for a project that will most likely never see the light of day. Yet nobody can clearly explain exactly what the underlying thought process is. These casual remarks are all the education we get on the subject, and people end up learning this the hard way. That’s why I’d like to shine some light on this phenomenon. To start, here is my best shot at defining the <span class="small-caps">CMS</span> trap the way I see it.</p>
<p><strong><em>A <span class="small-caps">CMS</span> Trap is a state of a web-application in which the development of content management systems is obstructing the development of the content.</em></strong></p>
<p>If you are building a startup like me, you should know that this trap is especially dangerous in the early stage. Only a small percentage of companies get to play the long game, and by that time their problems have shifted onto an entirely different plane of existence. While these companies may also be subject to falling into the <span class="small-caps">CMS</span> trap, they would probably be able to afford it, if not even pursue this direction intentionally. Here, I’d like to focus on the much more abundant variety: the small companies. The problem would become apparent as soon as you’ve opened your doors to an influx of customers, who’d begin using your project, and providing you with real analytics and feedback. At this point your project would no longer be driven by your gut feeling, rather you’d have real data suggesting how to proceed, dictating which features to implement next. This would be the time when all of your initial architectural assumptions are being tested, and reality is beginning to set in.
Reality has no tact, it doesn’t spare you any painful truths when dawning upon your hopeful application design. You’d wish you could refactor, but it’d be too late, as you’d be forced to keep up with new features instead, and implementing them would only be getting harder in this downwards spiral of dwindling productivity.</p>
<h2 id="ill-thank-me-later">I’ll thank me later</h2>
<blockquote>
<p><span class="pull-double">“</span>Most of our assumptions have outlived their uselessness.”<br>
— Marshall McLuhan</p>
</blockquote>
<p>Simply put, we love designing systems. As soon as we form some understanding of a problem, we rush to our <code>/(?:whiteboards|moleskines|mindmaps|editors)/</code> and start passionately defining entities and their interactions. It feels good, it’s what we do best. We tackle some of the most fundamental decisions about the project. Then, having carefully outlined our assumptions, we commit to them. We like to think that we sow wisdom and flexibility with our early decisions, and <em>we will thank ourselves later</em>. With all of those useful points of extension and well-represented entities, what could possibly go wrong? The reality is, that most likely these early assumptions will restrict our future, not expand it. The day comes when we meet our old friend, the innocent<span class="push-double"></span> <span class="pull-double">“</span>past self” staring back at us from the editor, smiling proudly. This well-meaning person spent hours, days, and weeks diverting our efforts into the abyss of <em>speculative architecture</em>, while having
barely any idea about the real problems we’ll be facing. We are now stuck with all of that<span class="push-double"></span> <span class="pull-double">“</span>helpful” code. It’s as if you decided to cook some salad, but instead of having separate ingredients laid out in front of you, all you have is another fully cooked salad given to you by a stranger, which you are now forced to dig through in hopes of fetching some of the pieces you need.</p>
<p><strong>In programming, your past self is nothing but a stranger with boundary issues.</strong></p>
<p>In the same spirit, imagine you come back to your computer only to find your app reorganized by some clueless stranger in ways that have little to do with reality. This isn’t very different to how we find ourselves looking at a system we’ve over-modelled in the past. Wouldn’t you wish that you didn’t have to deal with any of this garbage, and could instead simply greenfield your way ahead as dictated by your business needs?</p>
<p>To bring this back to my personal story, I eventually realized that with every new business feature I spent more time figuring out how to fit it into the existing framework I imposed on myself than actually designing the feature. As you may have guessed, I was thanking myself profusely for being so considerate.</p>
<h2 id="c-rud">C &lt; <span class="small-caps">RUD</span></h2>
<blockquote>
<p><span class="pull-double">“</span>It’s harder to read code than to write it.”<br>
— Joel Spolsky</p>
</blockquote>
<p>Talking about architecture is a lot like talking about code itself. It’s not exactly that we can never be mindful of the future, it’s just that the odds are not in our favor. Code is easy to write and hard to change or remove. Every line we light-heartedly throw into the mix will eventually be taunting us with the timeless question:<span class="push-double"></span> <span class="pull-double">“</span>guess what’s going to break if you touch me? <img src="https://cdn.blot.im/blog_ae6687516f4a44fb8370df95cf526289/_image_cache/13b6f170-b144-4530-b653-dceb19701d77.png" alt="trollface" width="20" height="16"><span class="pull-double">”</span>. Architectural decisions, just like code, are easy to make and very hard to unmake. While in code this problem is alleviated with testing, in architecture we don’t have testing. The only measure of quality we have is really the measure of pain we feel when working on a new feature, and by that time it’s often too late. Bad architecture can suffocate your business even while your code is sporting 100% test coverage.</p>
<h2 id="alarm-triggers">Alarm triggers</h2>
<div class="frame">
<img src="https://cdn.blot.im/blog_ae6687516f4a44fb8370df95cf526289/_image_cache/a8971e2b-792c-4ead-8c13-7c47b5b50e40.png" alt="“Fire Alarm” by Fey Ilyas" width="832" height="184"><span class="caption">“Fire Alarm” by Fey Ilyas</span>
</div>
<p>As with most traps, there is no specific way of knowing when you are walking into one. The best you can hope for is to have some sort of<span class="push-double"></span> <span class="pull-double">“</span>tells” that warn you of an upcoming danger. Below I list some of these tells from my own experience. Seeing these things in your early stage project should at least make you suspicious.</p>
<h3 id="the-early-onset-conservatism">The early onset conservatism</h3>
<blockquote>
<p><span class="pull-double">“</span>A state without the means of some change is without the means of its conservation.”<br>
— Edmund Burke</p>
</blockquote>
<p>Say you are faced with a new feature, and you find it to be a real <a href="https://programmers.stackexchange.com/questions/34775/correct-definition-of-the-term-yak-shaving">yak shave</a>. You realize that it will take a huge refactor, and you are arguing for ways to <em>just avoid it</em>. There is a fine line between negotiating feature requirements for reasons of efficiency, and negotiating them because you are stuck in the accidental buildup of legacy architecture. Could it be that your early speculative design decisions are starting to get in the way of today’s real business needs? Have you perhaps built too much too soon, and is it only a matter of time before the trap snaps, leaving your project effectively paralyzed? Don’t get me wrong, more often than not being conservative is a healthy defense against unnecessary complexity, it’s a standard practice of an experienced developer. The problem is when there is too much defense too early in the project’s life. It should
definitely cause some suspicion.</p>
<h3 id="the-drupal-syndrome">The Drupal syndrome</h3>
<blockquote>
<p><span class="pull-double">“</span>Hm, our pricing rules are different for various products, so we’ll need to find a way for an admin to define these rules in the admin panel. Maybe we should store code in the database and eval it?”</p>
</blockquote>
<p>This is a classic sign of walking into the <span class="small-caps">CMS</span> trap. You are trying to come up with ways to let an admin program some logic, which should then be saved to the database. If it wasn’t for the admin interface, it could’ve been done with only a few lines of code. However, now we are talking about creating price models associated with rules, and all the complexity emerging from this. Any further extensions to pricing capabilities, which could’ve been implemented with a line or two of code, would now have to take the shape of database migrations, forms, validations, and everything else down this rabbit hole. Do you really need an admin <span class="small-caps">UI</span> for pricing rules at this point?</p>
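<p>To give a sense of scale, here is roughly what “a few lines of code” could look like. This is a minimal sketch with invented product types and an invented discount; <code>PRICING_RULES</code> and <code>price_for</code> are hypothetical names, and prices are kept in integer cents to avoid rounding surprises.</p>

```ruby
# Hypothetical pricing rules kept in plain code instead of admin-editable records.
PRICING_RULES = {
  # Flat price per unit.
  'ebook'    => ->(cents, quantity) { cents * quantity },
  # 10% off when ordering ten or more units.
  'hardware' => ->(cents, quantity) { quantity >= 10 ? cents * quantity * 90 / 100 : cents * quantity }
}.freeze

# Looks up the rule for a product type; an unknown type raises KeyError.
def price_for(product_type, cents, quantity)
  PRICING_RULES.fetch(product_type).call(cents, quantity)
end

p price_for('ebook', 500, 3)
p price_for('hardware', 1000, 10)
```

<p>Changing a rule here is a code change and a deploy, with no migrations, forms, or validations involved.</p>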
<h3 id="the-seed-is-weak">The seed is weak</h3>
<p>How do you implement 10 categories to place your products into? Typical answer involves creating a <code>Category</code> model and then writing a script that will seed the 10 prescribed categories, which will be assigned to products. Then you’d make sure every developer runs this seed file. Also, don’t forget about running it in production of course. On every deploy. And on every pull. And when setting up a new machine. And when running tests. And naturally, if you change something in the seed file.</p>
<p>If your early-stage application relies on a lot of seed data, you are on a slippery slope. Things that can be assumed constant don’t need to be modelled as database-backed entities at this point, but I’ll get back to this later.</p>
<h3 id="every-road-leads-to-mordor">Every road leads to Mordor</h3>
<blockquote>
<p><span class="pull-double">“</span>One does not simply implement business logic”<br>
— Boromir</p>
</blockquote>
<p>This is somewhat similar to the early onset conservatism, yet there is a difference. Ever found yourself intimidated by a trivial task? Ask yourself this: would this task be intimidating if it was to be implemented in isolation, without the rest of the app surrounding it? If the answer is yes, look at your feet, because you might be caught in the trap. Implementing a feature in a well architected system shouldn’t be any more difficult than implementing it in isolation.</p>
<h3 id="the-phantom-pain">The phantom pain</h3>
<p>Sometimes a <span class="small-caps">CMS</span> trap can be recognized by the presence of phantom pains that stem from hidden implications of an emerging <span class="small-caps">CMS</span>. For example, in reality you would never need to delete your categories, but because you built them as admin-editable database-backed records, suddenly you are thinking about the non-existent scenario of having them deleted. Your architecture took the liberty of making you contemplate a scenario that isn’t real. You end up dealing with fake pains, the phantom pains.</p>
<h2 id="prevention">Prevention</h2>
<div class="frame">
<img src="https://cdn.blot.im/blog_ae6687516f4a44fb8370df95cf526289/_image_cache/d5d71781-3552-4648-b403-55a79b387bd4.png" alt="“Drawing A Line In The Sand” by Henry Burrows" width="832" height="219"><span class="caption">“Drawing A Line In The Sand” by Henry Burrows</span>
</div>
<p>All of the above symptoms have something in common. They are all a product of early assumptions that lead to a complex system. At this point it’s useful to answer two questions:<span class="push-double"></span> <span class="pull-double">“</span>what’s a complex system?” and<span class="push-double"></span> <span class="pull-double">“</span>how do you program without making assumptions?”.</p>
<p>Well, for the purposes of this essay let’s say that a complex system is a system of networked nodes which consists of more nodes and connections than you can generally track in your head. Obviously, to get a low-complexity system you need to reduce the number of nodes and connections. As for the latter question, that’s what brings me to the main point. In order to program without making early assumptions, you must <strong>avoid doing things at runtime</strong>.</p>
<p>Let me elaborate. Having been scarred by over-modelling, I found that there is a principle that should become fundamental in all decision making. Let’s call it: <em>keep it static, stupid</em>, which seems appropriate because it’s really nothing more than a slightly more architecturally-aware riff on keeping things simple.</p>
<p><strong>Making things static is the architectural equivalent of avoiding premature optimization.</strong></p>
<p>The beauty of this principle is that it’s applicable on every abstraction level, regardless of whether you are talking about views, database, or code. The idea itself is simple: if in doubt, do it statically. It’s easier to understand what this means by looking at some concrete examples on various levels of a typical Rails app.</p>
<h3 id="can-it-be-solved-with-a-class">Can it be solved with a class?</h3>
<p>Earlier in the post I mentioned pricing rules. This is a common problem where each product might abide by a different pricing algorithm. Price could depend on quantity, current user (think loyalty programs), order history, coupons, and various other things. To avoid the <span class="small-caps">CMS</span> trap I urge you not to allow constructing these kinds of algorithms at runtime at an early stage. Write a pricing scheme class. <em>Use the <a href="https://en.wikipedia.org/wiki/Strategy_pattern">strategy pattern</a></em>. Make the pricing algorithm swappable at the code level. Define pricing rules via your programming language; this way the complexities of this logic can be mapped directly to code, and not warrant a whole layer of abstraction.</p>
<p>Programming languages already come with many wonderful tools, such as conditions and loops. Why reimplement them at a higher level of abstraction? These tools are more than enough to allow you to build complex pricing logic by writing code, directly. Once you have multiple pricing algorithms written as pluggable objects, feel free to let admin choose one, perhaps even<span class="push-double"></span> <span class="pull-double">“</span>fill in the blanks” by plugging in factors and key values into your algorithm, but evolve this functionality gradually, as needed. Build out your admin <span class="small-caps">UI</span> with time, injecting more and more runtime flexibility into your strategy objects. Remember, you can always make static/hardcoded things dynamic, but not so much the other way. Everything that you make adjustable at runtime introduces complexity into every decision you make from that point on, and increases chances of bugs you cannot foresee, even in seemingly unrelated parts of your app.</p>
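<p>To make this concrete, here is a minimal sketch of swappable pricing objects in plain Ruby. Every name in it (<code>FlatPricing</code>, <code>BulkDiscountPricing</code>, <code>PRICING</code>, <code>price_for</code>) is hypothetical, and so are the prices and the discount rule; treat it as an illustration of the idea rather than a recommended design.</p>

```ruby
# Hypothetical pricing strategies. Each one answers the same question,
# total(quantity), so they can be swapped freely at the code level.
class FlatPricing
  def initialize(unit_price_cents)
    @unit_price_cents = unit_price_cents
  end

  def total(quantity)
    @unit_price_cents * quantity
  end
end

class BulkDiscountPricing
  def initialize(unit_price_cents, threshold:, discount_percent:)
    @unit_price_cents = unit_price_cents
    @threshold = threshold
    @discount_percent = discount_percent
  end

  def total(quantity)
    subtotal = @unit_price_cents * quantity
    # The discount only applies once the quantity reaches the threshold.
    return subtotal unless quantity >= @threshold

    subtotal - subtotal * @discount_percent / 100
  end
end

# Which product uses which scheme is decided in code, not in the database.
# The keys and numbers are made up for this example.
PRICING = {
  "tee" => BulkDiscountPricing.new(1000, threshold: 10, discount_percent: 20),
  "pen" => FlatPricing.new(500)
}.freeze

def price_for(product_key, quantity)
  PRICING.fetch(product_key).total(quantity)
end
```

<p>If an admin later needs to tweak the threshold or the discount, those two numbers could be read from a record and passed into the same constructor, which is the “fill in the blanks” step described above.</p>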
<h3 id="can-it-be-solved-with-a-static-page">Can it be solved with a static page?</h3>
<p>Say you are listing things on a page for customers to see. These things may very well be products, photos, or files, whatever else it is that you are doing. Now, you have probably decided that there would be a title, a description, a picture, perhaps author or brand on each of those elements. You’ve split up your entities into these data fields and you decided to build database-backed models. This is where I’d suggest stopping to consider whether you have any good reason for why it can’t be a static page. A static templated view means that in order to change things, you have to edit the view and deploy, yes, but it also means that you don’t have controllers, models, migrations, forms, admin <span class="small-caps">UI</span>, or anything else. In fact, you might kind of still have an admin <span class="small-caps">UI</span> if you’re using GitHub. It’s not as real time as it could’ve been, but decent nonetheless. People can edit views on GitHub directly without much issue.</p>
<p>This becomes more of an issue if the things you list are categorized and otherwise laid out based on certain rules. In the dynamic approach, this would immediately force you to create a network of associated models just to render this sort of a page. Consider how little you know at this point about your future needs, and how constrained you will be having speculated your way towards that future. Consider also how quick and easy it would be to just sit down and hardcode this page. Just like the case with strategy pattern, you can always inject dynamic content into this page going forward, when the real needs arise. If you build out a dynamic system right away, you will likely end up constrained by it. Err on the side of static.</p>
<h3 id="can-it-be-solved-with-a-constant">Can it be solved with a constant?</h3>
<p>Getting back to the seed data issue, this example is fairly simple. You are creating categories. These categories are predefined. Instead of adding models, tables, and seed data, why not simply make a constant with an array? Code allows you to carry static data without involvement of a database. Use that, and wait until you really need the editing of categories at runtime. When that time comes, you could always extract data from the constant into the seed file, without any issues. Moreover, even that isn’t necessary. If you have some categories that never change, and some that should be manipulated in admin panel, you shouldn’t even seed the former ones. You could leave them in the constant, always read them from there, and this way avoid seed data altogether. It’s actually a little secret of mine. I don’t like seed data. It’s been years now, and our app works right out of the box on any new developer’s machine. If you pull our code into your dev machine, the app will just run. This
is why I say: when in doubt, hardcode.</p>
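<p>A rough sketch of the constant approach, in plain Ruby so it runs without a database. The module name, the category names, and the <code>CatalogItem</code> class are all assumptions made up for the example.</p>

```ruby
# Hypothetical category list kept in code instead of a seeded table.
module Categories
  ALL = %w[apparel stationery accessories].freeze

  def self.valid?(name)
    ALL.include?(name)
  end
end

# A hypothetical plain-Ruby item that checks its category against the
# constant. In a Rails app the same check could live in a model validation.
class CatalogItem
  attr_reader :title, :category

  def initialize(title, category)
    unless Categories.valid?(category)
      raise ArgumentError, "unknown category: #{category}"
    end

    @title = title
    @category = category
  end
end
```

<p>There is nothing to seed here: a fresh checkout already knows every category, and adding one is a one-line change followed by a deploy.</p>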
<h3 id="can-it-be-solved-with-a-string-in-the-database">Can it be solved with a string in the database?</h3>
<p>Say, at this point the app is working just fine, and you have your necessary database-backed models. You need to display a free-form text that may differ from one entity to another (e.g.&nbsp;different per product), yet it might contain certain values interpolated from elsewhere. As per the principle, you should not think about modelling this text via classes. First, ask whether you could simply get by with letting admin type the text as a string. But wait, you’d say. If this text has values plugged in from elsewhere, why should admin be typing them by hand? Would she have to look them up every time? Seems wrong. Well, relax a bit and consider canned snippets. That’s right, perhaps you can simply set up a free-form text field while providing some pre-written text for admin, which has appropriate values already plugged in. When you think you need structured data for storing something highly flexible, consider instead using a plain string with canned snippets.</p>
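<p>One possible reading of “canned snippets”, sketched in plain Ruby. The helper name, its arguments, and the wording of the snippet are hypothetical; the point is only that the values are plugged in once, up front, and what gets stored afterwards is an ordinary string.</p>

```ruby
# Hypothetical canned snippet: pre-written text with the current values
# already interpolated, offered to the admin as a starting point.
def canned_shipping_note(product_title, ship_days)
  "#{product_title} usually ships within #{ship_days} business days."
end

# The admin sees this draft in a free-form text field and can edit it.
# Whatever is saved is stored as a plain string on the record.
draft = canned_shipping_note("Tie-dye t-shirt", 3)
```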
<h2 id="wrapping-up">Wrapping up</h2>
<blockquote>
<p><span class="pull-double">“</span>For every complex problem there is an answer that is clear, simple, and wrong.”<br>
— H. L. Mencken</p>
</blockquote>
<p>While the above text is a good general principle, it cannot exactly apply to problems that are clearly asking for CMS-like solutions. When you are tasked with building a highly-flexible <span class="small-caps">CMS</span>, that’s what you do, naturally. When you are asked to build an app with something like <a href="https://drupal.org">Drupal</a>, you are in a whole different realm, where the <span class="small-caps">CMS</span> trap is pretty much your perpetual state of being. However, even in those special cases questions will arise whether to make something more or less dynamic, and I encourage every developer to always lean towards static. You will be doing a service not only to your future self, but also to the next developer, who would much rather slice up a piece of static html and inject some dynamic content than attempt to understand a steaming pile of speculative architecture with many moving parts.</p>
<p>It’s also important to note that I’m not advocating entirely against architecting up front. It’s good to a healthy extent, yet there is a line we draw in the sand on a case by case basis. I encourage you to think carefully about where to draw that line every time you implement something.</p>
<p>Speaking of my story, it ended with a year-long stagnation and a very reluctant revamp of the entire app. In the end, the aforementioned<span class="push-double"></span> <span class="pull-double">“</span>blueprints” have been downgraded to hardcoded classes, and over time they have become very declarative, thanks to a naturally-evolving internal <span class="small-caps">DSL</span>. Seeing these files today and imagining how I’d proceed implementing runtime admin <span class="small-caps">UI</span> for all the moving parts is nightmarish. Even though it ate away a year, I’m still glad that we bit the bullet and refactored. It was painful, but now this mistake is far behind us.</p>
<p>Unless you know exactly what you’re doing (which is unlikely), stay static. Try to put extra effort into determining which parts of your business can be left hardcoded. If in doubt, hardcode. While doing that, make sure you follow best practices: never put the same conditions in two places, never repeat constant data, and use composition, dependency injection, inheritance, or whatever else you need to abide by the single responsibility principle and maintain singular authority.</p>
<p>Most importantly, don’t get yourself tangled in too much speculation, let the story unfold naturally.</p>
<!--meta
  date: "2013-11-26"
  toc: true
  hn: 6809990
  tldr: Avoid architectural premature optimization
-->  ]]></description>
  </item>
  <item>
    <title><![CDATA[ Tips on Rails 3 load paths ]]></title>
    <link>https://max.engineer/rails3-load-paths</link>
    <guid>https://max.engineer/rails3-load-paths</guid>
    <pubDate>Mon, 09 Sep 2013 20:49:14 -0400</pubDate>
    <dc:creator><![CDATA[ Max Chernyak ]]></dc:creator>
    <description><![CDATA[  

<h2 id="if-you-add-a-dir-directly-under-app">If you add a dir directly under app/</h2>
<p>Do nothing. All files in this dir are eager loaded in production and lazy loaded in development by default.</p>
<h2 id="if-you-add-a-dir-under-appsomething">If you add a dir under app/something/</h2>
<p>(e.g.&nbsp;<code>app/models/concerns/</code>, <code>app/models/products/</code>)</p>
<p>Ask: do I want to namespace modules and classes inside my new dir? For example in app/models/products/ you would need to wrap your class in <code>module Products</code>.</p>
<ul>
<li><p>If the answer is yes, do nothing. It will just work.</p></li>
<li><p>If the answer is no, append the exact path in your application.rb.</p>
<pre><code class="hljs ruby">config.autoload_paths += <span class="hljs-string">%W( <span class="hljs-subst">#{config.root}</span>/app/models/products )</span></code></pre></li>
</ul>
<p>In either case, everything will be eager loaded in production.</p>
<h2 id="if-you-add-code-in-your-lib-directory">If you add code in your lib/ directory</h2>
<h3 id="option-1">Option 1</h3>
<p>If you put something in the lib/ dir, what you are saying is:<span class="push-double"></span> <span class="pull-double">“</span>I wrote this library, and I want to depend on it where I decide.” This means that if you use your library in a rake task, but not in a rails app, you just <code>require</code> it in your rake task. If you need this library to always be loaded for your rails app, you <code>require</code> it in an initializer. If you need this library for some of your models or controllers, you <code>require_dependency</code> it in those files (see below for why), and since everything under your <code>app/</code> dir is already auto- and eager- loaded as needed, your library will only be<span class="push-double"></span> <span class="pull-double">“</span>pulled-in” if something that requires it from <code>app/</code>, rake, or your custom script actually gets loaded.</p>
<h3 id="option-2-bad">Option 2 (bad)</h3>
<p>Another option is to add your whole lib dir into autoload_paths.</p>
<pre><code class="hljs ruby">config.autoload_paths += <span class="hljs-string">%W( <span class="hljs-subst">#{config.root}</span>/lib )</span></code></pre>
<p>This means you shouldn’t explicitly require your lib anywhere. As soon as you hit the namespace of your dir in other classes, rails will require it. The problem with this is that in Rails 3 if you just add something to your autoload paths it won’t get eager loaded in production. You would need to add it to <code>eager_load_paths</code> instead, which causes a different problem (see below). And in ruby 1.9 autoload is not threadsafe. You probably want eager loading in production. Requiring your lib explicitly, like in option 1, is akin to eager loading it, which is threadsafe.</p>
<h3 id="option-3-meh">Option 3 (meh)</h3>
<p>All the different things under your lib dir should be placed into their own directories, and those directories should be individually added to <code>eager_load_paths</code>.</p>
<pre><code class="hljs ruby">config.eager_load_paths += <span class="hljs-string">%W(
  <span class="hljs-subst">#{config.root}</span>/lib/my_lib1
  <span class="hljs-subst">#{config.root}</span>/lib/my_lib2
)</span></code></pre>
<p>This means that you can’t just throw files into your lib dir. If you have <code>my_lib1.rb</code>, you must put it under <code>my_lib1/my_lib1.rb</code> and <code>my_lib1</code> should be added to eager load paths. This means that if you have more files in <code>my_lib1</code>, you should create a dir <code>my_lib1/my_lib1/extra.rb</code>. This is a bit annoying.</p>
<h3 id="so-why-not-just-add-lib-into-eager_load_paths">So why not just add lib/ into eager_load_paths?</h3>
<p>If you add lib/ into <code>eager_load_paths</code>, everything will work great. Your files will be autoloaded in development, and eager-loaded in production. Except the problem is that <code>eager_load_paths</code> uses globbing like <code>lib/**/*.rb</code>, meaning that everything in your lib dir will try to get loaded. Your tasks, your generators, everything. This is not what you want.</p>
<h3 id="organizing-lib">Organizing lib</h3>
<p>Regardless of which option you pick (option 1, hint hint), in your lib/ dir you should structure your code as you would structure a gem. If you need more than 1 file, you could for example add a same-named directory where everything is properly namespaced, and let your 1 file relatively require files in that directory.</p>
<h3 id="why-use-require_dependency-auto-reloading">Why use require_dependency (auto-reloading)</h3>
<p>If you use <code>require_dependency</code>, you are enabling auto-reload of your files in development across requests. <code>require</code> alone won’t do it. I suggested using it in your rails app, but not in initializers or rake tasks, because rake tasks only run once, and changing initializers always requires a restart.</p>
<p>However, it won’t work without one additional piece of configuration. In application.rb you should add this:</p>
<pre><code class="hljs ruby">config.watchable_dirs[<span class="hljs-string">'lib'</span>] = [<span class="hljs-symbol">:rb</span>]</code></pre>
<p>P.S. I originally posted this article in a <a href="https://gist.github.com/maxim/6503591">gist</a>.</p>
<!--meta
  date: "2013-09-09T20:49:14-07:00"
  tldr: How to use autoload_paths, eager_load_paths, lib
  toc: true
-->  ]]></description>
  </item>
  <item>
    <title><![CDATA[ Multiple Table Inheritance With ActiveRecord ]]></title>
    <link>https://max.engineer/mti</link>
    <guid>https://max.engineer/mti</guid>
    <pubDate>Thu, 21 Jan 2010 04:01:40 -0500</pubDate>
    <dc:creator><![CDATA[ Max Chernyak ]]></dc:creator>
    <description><![CDATA[  

<p>Imagine writing an online shop with different types of products. Normally all products would have common attributes such as <code>title</code> and <code>price</code>. Some attributes will likely differ. <code>Tee</code> may have <code>size</code> such as S, M, or L, while a <code>Pen</code> could have an <code>ink_color</code>. It’s easy to see that <code>Tee</code> is a <code>Product</code>, and so is <code>Pen</code>. We are looking at an <em>is_a</em> relationship. When I program this type of relationship I usually use inheritance.</p>
<pre><code class="hljs ruby"><span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">Product</span> &lt; ActiveRecord::Base</span>
<span class="hljs-keyword">end</span>

<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">Tee</span> &lt; Product</span>
<span class="hljs-keyword">end</span>

<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">Pen</span> &lt; Product</span>
<span class="hljs-keyword">end</span></code></pre>
<p>This inheritance looks reasonable, but now we have to come up with relational database structure. We need to find a way to store tee’s own attributes, pen’s own attributes, as well as their common (product’s) attributes without duplication. Some databases (PostgreSQL) provide support for table inheritance, but it’s a specialized feature which ties you down to the given db.</p>
<h2 id="single-table-inheritance">Single table inheritance</h2>
<p>ActiveRecord provides only one way to handle an <em>is_a</em> relationship, which is <a href="https://en.wikipedia.org/wiki/Single_Table_Inheritance">Single Table Inheritance</a>. You’d have to create a table looking somewhat like the following.</p>
<table>
<thead>
<tr>
<th>id</th>
<th>type</th>
<th>price</th>
<th>title</th>
<th>size</th>
<th>ink_color</th>
</tr>
</thead>
<tbody>
<tr>
<td>1</td>
<td>Tee</td>
<td>1000</td>
<td>tie-dye t-shirt</td>
<td>M</td>
<td></td>
</tr>
<tr>
<td>2</td>
<td>Pen</td>
<td>500</td>
<td>ball pen</td>
<td></td>
<td>blue</td>
</tr>
</tbody>
</table>
<p>The problem here is that all attributes are stored in the same table. It’s likely that soon the number of attributes will grow unmanageable, and most of them will always stay <code>NULL</code> since they’ll be specific to only one <code>type</code>.</p>
<h2 id="polymorphic-has_one-association">Polymorphic has_one association</h2>
<p>A <code>has_one</code> association allows us to split out tees, pens, and products into three different tables. In fact — as you’re about to see — this is the only way to get what we want. The problem is that it creates a <code>has_a</code> relationship, and we want <code>is_a</code>. Since there isn’t much choice, we can make it <em>look</em> like we have an is_a relationship, which I’m about to show.</p>
<h2 id="multiple-table-inheritance-simulated">Multiple table inheritance (simulated)</h2>
<p>I was speaking with <a href="https://twitter.com/fowlduck">the awesome @fowlduck</a> over at the #railsbridge <span class="small-caps">IRC</span> channel about ways to achieve something like <span class="small-caps">MTI</span> with Active Record. He pointed me to a pastie where he implemented an MTI-like behavior and called it a<span class="push-double"></span> <span class="pull-double">“</span>hydra” pattern, which I subsequently cleaned up a bit.</p>
<p>So we want to have 3 tables in the database.</p>
<ul>
<li>product_properties</li>
<li>tees</li>
<li>pens</li>
</ul>
<pre><code class="hljs ruby"><span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">ProductProperties</span> &lt; ActiveRecord::Base</span>
  belongs_to <span class="hljs-symbol">:sellable</span>, <span class="hljs-symbol">:polymorphic</span> =&gt; <span class="hljs-literal">true</span>, <span class="hljs-symbol">:dependent</span> =&gt; <span class="hljs-symbol">:destroy</span>
<span class="hljs-keyword">end</span>

<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">Tee</span> &lt; ActiveRecord::Base</span>
  has_one <span class="hljs-symbol">:product_properties</span>, <span class="hljs-symbol">:as</span> =&gt; <span class="hljs-symbol">:sellable</span>, <span class="hljs-symbol">:autosave</span> =&gt; <span class="hljs-literal">true</span>
<span class="hljs-keyword">end</span>

<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">Pen</span> &lt; ActiveRecord::Base</span>
  has_one <span class="hljs-symbol">:product_properties</span>, <span class="hljs-symbol">:as</span> =&gt; <span class="hljs-symbol">:sellable</span>, <span class="hljs-symbol">:autosave</span> =&gt; <span class="hljs-literal">true</span>
<span class="hljs-keyword">end</span></code></pre>
<p>Immediately we can see duplicated code between Tee and Pen. This can be easily solved with a mixin.</p>
<pre><code class="hljs ruby"><span class="hljs-class"><span class="hljs-keyword">module</span> <span class="hljs-title">Sellable</span></span>
  def <span class="hljs-keyword">self</span>.included(base)
    base.has_one <span class="hljs-symbol">:product_properties</span>, <span class="hljs-symbol">:as</span> =&gt; <span class="hljs-symbol">:sellable</span>, <span class="hljs-symbol">:autosave</span> =&gt; <span class="hljs-literal">true</span>
  <span class="hljs-keyword">end</span>
<span class="hljs-keyword">end</span>

<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">Tee</span> &lt; ActiveRecord::Base</span>
  <span class="hljs-keyword">include</span> Sellable
<span class="hljs-keyword">end</span>

<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">Pen</span> &lt; ActiveRecord::Base</span>
  <span class="hljs-keyword">include</span> Sellable
<span class="hljs-keyword">end</span></code></pre>
<p>Now comes another issue. Every time we want to access price or title attributes (stored in product_properties) we have to call <code>@tee.product_properties.price</code>. This isn’t very convenient, especially considering that product_properties has to be built first in case it doesn’t exist. So let’s ensure it’s always built by updating the module.</p>
<pre><code class="hljs ruby"><span class="hljs-class"><span class="hljs-keyword">module</span> <span class="hljs-title">Sellable</span></span>
  def <span class="hljs-keyword">self</span>.included(base)
    base.has_one <span class="hljs-symbol">:product_properties</span>, <span class="hljs-symbol">:as</span> =&gt; <span class="hljs-symbol">:sellable</span>, <span class="hljs-symbol">:autosave</span> =&gt; <span class="hljs-literal">true</span>
    base.alias_method_chain <span class="hljs-symbol">:product_properties</span>, <span class="hljs-symbol">:autobuild</span>
  <span class="hljs-keyword">end</span>
  
  <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">product_properties_with_autobuild</span></span>
    product_properties_without_autobuild <span class="hljs-params">||</span> build_product_properties
  <span class="hljs-keyword">end</span>
<span class="hljs-keyword">end</span></code></pre>
<p>Awesome, now <code>product_properties</code> is built automatically in case it doesn’t exist. We still have the method accessing issue though. For that I used <code>method_missing</code>.</p>
<pre><code class="hljs ruby"><span class="hljs-class"><span class="hljs-keyword">module</span> <span class="hljs-title">Sellable</span></span>
  def <span class="hljs-keyword">self</span>.included(base)
    base.has_one <span class="hljs-symbol">:product_properties</span>, <span class="hljs-symbol">:as</span> =&gt; <span class="hljs-symbol">:sellable</span>, <span class="hljs-symbol">:autosave</span> =&gt; <span class="hljs-literal">true</span>
    base.alias_method_chain <span class="hljs-symbol">:product_properties</span>, <span class="hljs-symbol">:autobuild</span>
  <span class="hljs-keyword">end</span>
  
  <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">product_properties_with_autobuild</span></span>
    product_properties_without_autobuild <span class="hljs-params">||</span> build_product_properties
  <span class="hljs-keyword">end</span>
  
  <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">method_missing</span><span class="hljs-params">(meth, *args, &amp;blk)</span></span>
    <span class="hljs-keyword">if</span> product_properties.public_methods.<span class="hljs-keyword">include</span>?(meth.to_s)
      product_properties.send(meth, *args, &amp;blk)
    <span class="hljs-keyword">else</span>
      <span class="hljs-keyword">super</span>
    <span class="hljs-keyword">end</span>
  <span class="hljs-keyword">end</span>
<span class="hljs-keyword">end</span></code></pre>
<p>Now if a method is missing from Tee or Pen instance it will be delegated to <code>product_properties</code>, which enables us to use <code>@tee.price</code> and <code>@tee.title</code>.</p>
<p>However, what about validations? Let’s say we want all products to always have a title, and we want to see an error appear on a Tee instance when <code>ProductProperties#title</code> is missing. Basically I want to completely remove product_properties from my sight as if it doesn’t exist, make it absolutely transparent. Let’s add the necessary validation in ProductProperties.</p>
<pre><code class="hljs ruby"><span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">ProductProperties</span> &lt; ActiveRecord::Base</span>
  belongs_to <span class="hljs-symbol">:sellable</span>, <span class="hljs-symbol">:polymorphic</span> =&gt; <span class="hljs-literal">true</span>, <span class="hljs-symbol">:dependent</span> =&gt; <span class="hljs-symbol">:destroy</span>
  validates_presence_of <span class="hljs-symbol">:title</span>
<span class="hljs-keyword">end</span></code></pre>
<p>And now let’s make all Sellable models respect the validation as if it’s their own.</p>
<pre><code class="hljs ruby"><span class="hljs-class"><span class="hljs-keyword">module</span> <span class="hljs-title">Sellable</span></span>
  def <span class="hljs-keyword">self</span>.included(base)
    base.has_one <span class="hljs-symbol">:product_properties</span>, <span class="hljs-symbol">:as</span> =&gt; <span class="hljs-symbol">:sellable</span>, <span class="hljs-symbol">:autosave</span> =&gt; <span class="hljs-literal">true</span>
    base.validate <span class="hljs-symbol">:product_properties_must_be_valid</span>
    base.alias_method_chain <span class="hljs-symbol">:product_properties</span>, <span class="hljs-symbol">:autobuild</span>
  <span class="hljs-keyword">end</span>

  <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">product_properties_with_autobuild</span></span>
    product_properties_without_autobuild <span class="hljs-params">||</span> build_product_properties
  <span class="hljs-keyword">end</span>

  <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">method_missing</span><span class="hljs-params">(meth, *args, &amp;blk)</span></span>
    <span class="hljs-keyword">if</span> product_properties.public_methods.<span class="hljs-keyword">include</span>?(meth.to_s)
      product_properties.send(meth, *args, &amp;blk)
    <span class="hljs-keyword">else</span>
      <span class="hljs-keyword">super</span>
    <span class="hljs-keyword">end</span>
  <span class="hljs-keyword">end</span>

  protected

  <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">product_properties_must_be_valid</span></span>
    <span class="hljs-keyword">unless</span> product_properties.valid?
      product_properties.errors.each <span class="hljs-keyword">do</span> <span class="hljs-params">|attr, message|</span>
        errors.add(attr, message)
      <span class="hljs-keyword">end</span>
    <span class="hljs-keyword">end</span>
  <span class="hljs-keyword">end</span>
<span class="hljs-keyword">end</span></code></pre>
<p>Notice that I’m including an additional validator with the Sellable module. The validator collects all the errors on ProductProperties and adds them to parent class as if the errors are on a Tee or Pen itself.</p>
<p>As a nice finishing touch we can put this snippet into a Rails initializer.</p>
<pre><code class="hljs ruby"><span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">ActiveRecord::Base</span></span>
  def <span class="hljs-keyword">self</span>.acts_as_product
    <span class="hljs-keyword">include</span> Sellable
  <span class="hljs-keyword">end</span>
<span class="hljs-keyword">end</span>

<span class="hljs-comment"># now we can say</span>

<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">Tee</span> &lt; ActiveRecord::Base</span>
  acts_as_product
<span class="hljs-keyword">end</span></code></pre>
<p>Although that’s a matter of taste.</p>
<h2 id="fixing-method_missing">Fixing method_missing</h2>
<p>There is a problem with method_missing. It checks the array of public_methods on product_properties to find out if delegation should occur. This check will fail in cases like <code>@tee.title_changed?</code>. That’s a magic method and therefore will not be part of static method array. Well, this is an easy fix.</p>
<pre><code class="hljs ruby"><span class="hljs-comment"># Replace old method_missing with this one:</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">method_missing</span><span class="hljs-params">(meth, *args, &amp;blk)</span></span>
  product_properties.send(meth, *args, &amp;blk)
<span class="hljs-keyword">rescue</span> NoMethodError
  <span class="hljs-keyword">super</span>
<span class="hljs-keyword">end</span></code></pre>
<p>As you can see, even magic methods will work this way. Only if a <code>NoMethodError</code> is raised do we fall back to <code>super</code>.</p>
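<p>If you want to poke at this fallback outside of Rails, here is a minimal plain-Ruby sketch. <code>FakeProperties</code> and <code>FakeTee</code> are hypothetical stand-ins (they are not part of the code above), and the fake <code>title_changed?</code> only imitates a magic method that never shows up in <code>public_methods</code>.</p>

```ruby
# Hypothetical stand-in for ProductProperties; no Rails required.
class FakeProperties
  attr_accessor :title

  # Imitates a "magic" method: title_changed? is answered dynamically,
  # so it is absent from public_methods.
  def method_missing(name, *args)
    return true if name == :title_changed?
    super
  end

  def respond_to_missing?(name, include_private = false)
    name == :title_changed? || super
  end
end

# Hypothetical stand-in for a product model such as Tee.
class FakeTee
  def product_properties
    @product_properties ||= FakeProperties.new
  end

  # Same shape as the fix above: try the delegate first,
  # and only fall back to super when it raises NoMethodError.
  def method_missing(meth, *args)
    product_properties.send(meth, *args)
  rescue NoMethodError
    super
  end
end

tee = FakeTee.new
tee.title = "Plain white tee"
puts tee.title           # prints "Plain white tee"
puts tee.title_changed?  # prints "true"

# The old public_methods check would have missed the magic method:
puts tee.product_properties.public_methods.include?(:title_changed?) # prints "false"
```

<p>One caveat with this shape: the <code>rescue</code> also catches a <code>NoMethodError</code> raised from inside a delegated method, so a genuine bug in there would surface as a missing method on the product instead.</p>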
<h2 id="handling-attributes-hash">Handling attributes hash</h2>
<p>In the comments, Austin brought up a case where initializing new models like <code>Tee.new(:title =&gt; "foo")</code> raises an unknown attribute error. That’s expected, since we rely on <code>method_missing</code> for accessing ProductProperties attributes. Instead, we should define accessor methods explicitly in our individual products. Thankfully, it’s not too hard to accomplish with our Sellable mixin. First we need to add a submodule <code>ClassMethods</code> with a class method that uses class_eval to magically generate the missing accessors.</p>
<pre><code class="hljs ruby"><span class="hljs-class"><span class="hljs-keyword">module</span> <span class="hljs-title">ClassMethods</span></span>
  <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">define_product_properties_accessors</span></span>
    all_attributes = ProductProperties.content_columns.map(&amp;<span class="hljs-symbol">:name</span>)
    ignored_attributes = [<span class="hljs-string">"created_at"</span>, <span class="hljs-string">"updated_at"</span>, <span class="hljs-string">"sellable_type"</span>]
    attributes_to_delegate = all_attributes - ignored_attributes
    attributes_to_delegate.each <span class="hljs-keyword">do</span> <span class="hljs-params">|attrib|</span>
      class_eval <span class="hljs-string">&lt;&lt;-RUBY
        def <span class="hljs-subst">#{attrib}</span>
          product_properties.<span class="hljs-subst">#{attrib}</span>
        end
        
        def <span class="hljs-subst">#{attrib}</span>=(value)
          self.product_properties.<span class="hljs-subst">#{attrib}</span> = value
        end
        
        def <span class="hljs-subst">#{attrib}</span>?
          self.product_properties.<span class="hljs-subst">#{attrib}</span>?
        end
      RUBY</span>
    <span class="hljs-keyword">end</span>
  <span class="hljs-keyword">end</span>
<span class="hljs-keyword">end</span></code></pre>
<p>I’ll walk through this code quickly. First we’re extracting only the columns that we want to access. When we call <code>content_columns</code> in the first line of the method, it already excludes a bunch of special columns such as <code>id</code> and <code>type</code>. We then manually subtract more columns we’d like to ignore, such as the timestamps and the polymorphic type column.</p>
<p>Next we iterate over each remaining attribute and create instance methods for it, such as <code>title</code>, <code>title=</code> and (for completeness) <code>title?</code>. Having these accessors defined explicitly is enough for ActiveRecord to see them when performing mass assignment, etc. We can now do something like <code>Tee.new(:title =&gt; "foo")</code> without any problems. The extra cases such as <code>@tee.title_changed?</code> are still handled by <code>method_missing</code>, so we’re good.</p>
<p>One more thing left. We need to run this method on the base class into which we include Sellable. Just need to add a couple of lines to the <code>self.included</code> hook.</p>
<pre><code class="hljs ruby">def <span class="hljs-keyword">self</span>.included(base)
  base.has_one <span class="hljs-symbol">:product_properties</span>, <span class="hljs-symbol">:as</span> =&gt; <span class="hljs-symbol">:sellable</span>, <span class="hljs-symbol">:autosave</span> =&gt; <span class="hljs-literal">true</span>
  base.validate <span class="hljs-symbol">:product_properties_must_be_valid</span>
  base.alias_method_chain <span class="hljs-symbol">:product_properties</span>, <span class="hljs-symbol">:autobuild</span>
  
  <span class="hljs-comment"># Add these two lines:</span>
  base.extend ClassMethods
  base.define_product_properties_accessors
<span class="hljs-keyword">end</span></code></pre>
<p>And we’re all set.</p>
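<p>To see why explicit accessors make the difference, here is a rough plain-Ruby sketch of the same idea. Everything in it is an assumption made for illustration: <code>FakeProperties</code>, <code>FakeSellable</code> and <code>FakeTee</code> are hypothetical names, the column list is hard-coded instead of coming from <code>content_columns</code>, and the <code>initialize</code> method only loosely imitates mass assignment. It also swaps the string <code>class_eval</code> for <code>define_method</code>, which does the same job here.</p>

```ruby
# Hypothetical stand-in for ProductProperties with a hard-coded column list.
class FakeProperties
  COLUMNS = %w[title price created_at updated_at sellable_type].freeze
  attr_accessor(*COLUMNS)
end

module FakeSellable
  IGNORED = %w[created_at updated_at sellable_type].freeze

  # Runs when the module is included, just like the self.included hook above.
  def self.included(base)
    base.extend(ClassMethods)
    base.define_product_properties_accessors
  end

  def product_properties
    @product_properties ||= FakeProperties.new
  end

  module ClassMethods
    # Defines a reader, a writer and a query method per delegated attribute.
    def define_product_properties_accessors
      (FakeProperties::COLUMNS - IGNORED).each do |attrib|
        define_method(attrib) { product_properties.public_send(attrib) }
        define_method("#{attrib}=") do |value|
          product_properties.public_send("#{attrib}=", value)
        end
        # Simplified: only checks for nil; ActiveRecord's query methods are stricter.
        define_method("#{attrib}?") { !product_properties.public_send(attrib).nil? }
      end
    end
  end
end

class FakeTee
  include FakeSellable

  # Loose imitation of mass assignment: only explicitly defined writers are accepted.
  def initialize(attrs = {})
    attrs.each do |name, value|
      writer = "#{name}="
      raise ArgumentError, "unknown attribute: #{name}" unless respond_to?(writer)
      public_send(writer, value)
    end
  end
end

tee = FakeTee.new(:title => "foo")
puts tee.title   # prints "foo"
puts tee.title?  # prints "true"
puts tee.price?  # prints "false"
```

<p>Because the writers exist as real methods, the <code>respond_to?</code> check passes for <code>title</code>, while an ignored column such as <code>sellable_type</code> is still rejected.</p>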
<h2 id="all-together-now">All together now</h2>
<p>Here’s the full picture of everything we just did.</p>
<pre><code class="hljs ruby"><span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">ActiveRecord::Base</span></span>
  def <span class="hljs-keyword">self</span>.acts_as_product
    <span class="hljs-keyword">include</span> Sellable
  <span class="hljs-keyword">end</span>
<span class="hljs-keyword">end</span>

<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">ProductProperties</span> &lt; ActiveRecord::Base</span>
  belongs_to <span class="hljs-symbol">:sellable</span>, <span class="hljs-symbol">:polymorphic</span> =&gt; <span class="hljs-literal">true</span>, <span class="hljs-symbol">:dependent</span> =&gt; <span class="hljs-symbol">:destroy</span>
  validates_presence_of <span class="hljs-symbol">:title</span> <span class="hljs-comment"># for example</span>
<span class="hljs-keyword">end</span>

<span class="hljs-class"><span class="hljs-keyword">module</span> <span class="hljs-title">Sellable</span></span>
  def <span class="hljs-keyword">self</span>.included(base)
    base.has_one <span class="hljs-symbol">:product_properties</span>, <span class="hljs-symbol">:as</span> =&gt; <span class="hljs-symbol">:sellable</span>, <span class="hljs-symbol">:autosave</span> =&gt; <span class="hljs-literal">true</span>
    base.validate <span class="hljs-symbol">:product_properties_must_be_valid</span>
    base.alias_method_chain <span class="hljs-symbol">:product_properties</span>, <span class="hljs-symbol">:autobuild</span>
    base.extend ClassMethods
    base.define_product_properties_accessors
  <span class="hljs-keyword">end</span>

  <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">product_properties_with_autobuild</span></span>
    product_properties_without_autobuild <span class="hljs-params">||</span> build_product_properties
  <span class="hljs-keyword">end</span>

  <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">method_missing</span><span class="hljs-params">(meth, *args, &amp;blk)</span></span>
    product_properties.send(meth, *args, &amp;blk)
  <span class="hljs-keyword">rescue</span> NoMethodError
    <span class="hljs-keyword">super</span>
  <span class="hljs-keyword">end</span>

  <span class="hljs-class"><span class="hljs-keyword">module</span> <span class="hljs-title">ClassMethods</span></span>
    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">define_product_properties_accessors</span></span>
      all_attributes = ProductProperties.content_columns.map(&amp;<span class="hljs-symbol">:name</span>)
      ignored_attributes = [<span class="hljs-string">"created_at"</span>, <span class="hljs-string">"updated_at"</span>, <span class="hljs-string">"sellable_type"</span>]
      attributes_to_delegate = all_attributes - ignored_attributes
      attributes_to_delegate.each <span class="hljs-keyword">do</span> <span class="hljs-params">|attrib|</span>
        class_eval <span class="hljs-string">&lt;&lt;-RUBY
          def <span class="hljs-subst">#{attrib}</span>
            product_properties.<span class="hljs-subst">#{attrib}</span>
          end

          def <span class="hljs-subst">#{attrib}</span>=(value)
            self.product_properties.<span class="hljs-subst">#{attrib}</span> = value
          end

          def <span class="hljs-subst">#{attrib}</span>?
            self.product_properties.<span class="hljs-subst">#{attrib}</span>?
          end
        RUBY</span>
      <span class="hljs-keyword">end</span>
    <span class="hljs-keyword">end</span>
  <span class="hljs-keyword">end</span>

  protected

  <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">product_properties_must_be_valid</span></span>
    <span class="hljs-keyword">unless</span> product_properties.valid?
      product_properties.errors.each <span class="hljs-keyword">do</span> <span class="hljs-params">|attr, message|</span>
        errors.add(attr, message)
      <span class="hljs-keyword">end</span>
    <span class="hljs-keyword">end</span>
  <span class="hljs-keyword">end</span>
<span class="hljs-keyword">end</span>

<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">Tee</span> &lt; ActiveRecord::Base</span>
  acts_as_product
<span class="hljs-keyword">end</span>

<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">Pen</span> &lt; ActiveRecord::Base</span>
  acts_as_product
<span class="hljs-keyword">end</span></code></pre>
<p>This can be easily adapted to use cases other than products in a store. In fact, with some meta magic or code generation it could be turned into a plugin. I encourage you to try, and to send me the link when you’re done.&nbsp;:)</p>
<!--meta
  date: Thu, 21 Jan 2010 04:01:40 +0000
  toc: true
  tldr: Proof of concept MTI implementation in Rails
-->  ]]></description>
  </item>
</channel>
</rss>
