<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:blogger='http://schemas.google.com/blogger/2008' xmlns:georss='http://www.georss.org/georss' xmlns:gd="http://schemas.google.com/g/2005" xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-11265228</id><updated>2026-04-08T19:50:13.528-07:00</updated><title type='text'>tech</title><subtitle type='html'></subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://thushw.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/11265228/posts/default?redirect=false'/><link rel='alternate' type='text/html' href='http://thushw.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><link rel='next' type='application/atom+xml' href='http://www.blogger.com/feeds/11265228/posts/default?start-index=26&amp;max-results=25&amp;redirect=false'/><author><name>Unknown</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>181</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>25</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-11265228.post-8810329584496972211</id><published>2026-04-08T19:49:00.000-07:00</published><updated>2026-04-08T19:50:13.476-07:00</updated><title type='text'></title><content type='html'>&lt;html lang=&quot;en&quot;&gt;
&lt;head&gt;
&lt;meta charset=&quot;UTF-8&quot;&gt;
&lt;meta name=&quot;viewport&quot; content=&quot;width=device-width, initial-scale=1.0&quot;&gt;
&lt;title&gt;The Prompt That Crossed Two Organizations&lt;/title&gt;
&lt;link rel=&quot;preconnect&quot; href=&quot;https://fonts.googleapis.com&quot;&gt;
&lt;link href=&quot;https://fonts.googleapis.com/css2?family=Playfair+Display:ital,wght@0,400;0,700;1,400&amp;family=DM+Sans:wght@300;400;500&amp;family=DM+Mono:wght@400&amp;display=swap&quot; rel=&quot;stylesheet&quot;&gt;
&lt;style&gt;
  *, *::before, *::after { box-sizing: border-box; margin: 0; padding: 0; }

  :root {
    --ink: #1a1814;
    --ink-mid: #3d3830;
    --ink-light: #7a7060;
    --paper: #f7f4ef;
    --paper-warm: #ede9e0;
    --accent: #c4401a;
    --accent-light: #e8d5cc;
    --rule: #d4cfc5;
    --mono: &#39;DM Mono&#39;, monospace;
    --serif: &#39;Playfair Display&#39;, Georgia, serif;
    --sans: &#39;DM Sans&#39;, sans-serif;
  }

  html { font-size: 18px; scroll-behavior: smooth; }

  body {
    background: var(--paper);
    color: var(--ink);
    font-family: var(--sans);
    font-weight: 300;
    line-height: 1.75;
    -webkit-font-smoothing: antialiased;
  }

  /* ── Header ── */
  header {
    background: var(--ink);
    color: var(--paper);
    padding: 80px 40px 60px;
    position: relative;
    overflow: hidden;
  }

  header::before {
    content: &#39;&#39;;
    position: absolute;
    top: -60px; right: -60px;
    width: 320px; height: 320px;
    border-radius: 50%;
    background: var(--accent);
    opacity: 0.12;
  }

  header::after {
    content: &#39;&#39;;
    position: absolute;
    bottom: 0; left: 0; right: 0;
    height: 3px;
    background: linear-gradient(90deg, var(--accent) 0%, transparent 70%);
  }

  .header-inner {
    max-width: 760px;
    margin: 0 auto;
    position: relative;
  }

  .kicker {
    font-family: var(--mono);
    font-size: 0.65rem;
    letter-spacing: 0.18em;
    text-transform: uppercase;
    color: var(--accent);
    margin-bottom: 24px;
    display: flex;
    align-items: center;
    gap: 12px;
  }

  .kicker::before {
    content: &#39;&#39;;
    display: block;
    width: 32px;
    height: 1px;
    background: var(--accent);
  }

  h1 {
    font-family: var(--serif);
    font-size: clamp(2rem, 5vw, 3.2rem);
    font-weight: 700;
    line-height: 1.15;
    color: var(--paper);
    margin-bottom: 28px;
    max-width: 640px;
  }

  h1 em {
    font-style: italic;
    color: #c8bfb0;
  }

  .deck {
    font-size: 1.05rem;
    font-weight: 300;
    color: #9b9080;
    max-width: 560px;
    line-height: 1.6;
  }

  .byline {
    margin-top: 28px;
    display: flex;
    align-items: center;
    gap: 10px;
    flex-wrap: wrap;
  }

  .byline-author {
    font-family: var(--mono);
    font-size: 0.7rem;
    letter-spacing: 0.1em;
    color: var(--paper);
    opacity: 0.85;
  }

  .byline-sep {
    color: var(--accent);
    font-size: 0.8rem;
  }

  .byline-collab {
    font-family: var(--mono);
    font-size: 0.7rem;
    letter-spacing: 0.08em;
    color: #9b9080;
  }

  /* ── Article body ── */
  article {
    max-width: 760px;
    margin: 0 auto;
    padding: 64px 40px 100px;
  }

  p {
    margin-bottom: 1.4em;
    font-size: 1rem;
    color: var(--ink-mid);
  }

  /* ── Section headings ── */
  h2 {
    font-family: var(--serif);
    font-size: 1.55rem;
    font-weight: 400;
    font-style: italic;
    color: var(--ink);
    margin: 56px 0 20px;
    padding-bottom: 16px;
    border-bottom: 1px solid var(--rule);
    position: relative;
  }

  h2::after {
    content: &#39;&#39;;
    position: absolute;
    bottom: -1px; left: 0;
    width: 48px; height: 2px;
    background: var(--accent);
  }

  /* ── Pull quote ── */
  .pull {
    margin: 48px -24px;
    padding: 32px 40px;
    border-left: 3px solid var(--accent);
    background: var(--paper-warm);
    font-family: var(--serif);
    font-size: 1.25rem;
    font-style: italic;
    color: var(--ink);
    line-height: 1.5;
    position: relative;
  }

  /* ── Finding blocks ── */
  .finding {
    margin: 20px 0;
    padding: 20px 24px;
    background: white;
    border: 1px solid var(--rule);
    border-left: 3px solid var(--accent);
    border-radius: 0 4px 4px 0;
  }

  .finding-label {
    font-family: var(--mono);
    font-size: 0.62rem;
    letter-spacing: 0.14em;
    text-transform: uppercase;
    color: var(--accent);
    margin-bottom: 8px;
  }

  .finding p {
    margin: 0;
    font-size: 0.95rem;
    color: var(--ink-mid);
  }

  .finding strong {
    color: var(--ink);
    font-weight: 500;
  }

  /* ── Option cards ── */
  .options {
    display: grid;
    gap: 16px;
    margin: 28px 0;
  }

  .option {
    padding: 24px 28px;
    background: white;
    border: 1px solid var(--rule);
    border-radius: 4px;
  }

  .option-label {
    font-family: var(--mono);
    font-size: 0.62rem;
    letter-spacing: 0.14em;
    text-transform: uppercase;
    color: var(--ink-light);
    margin-bottom: 6px;
  }

  .option h3 {
    font-family: var(--sans);
    font-size: 0.95rem;
    font-weight: 500;
    color: var(--ink);
    margin-bottom: 10px;
  }

  .option p {
    font-size: 0.9rem;
    margin: 0 0 10px;
    color: var(--ink-mid);
  }

  .option-meta {
    display: flex;
    gap: 24px;
    margin-top: 12px;
    padding-top: 12px;
    border-top: 1px solid var(--rule);
    font-size: 0.82rem;
  }

  .improves { color: #2a6e4a; }
  .risk     { color: #8b3a1a; }

  .option-meta span::before {
    font-family: var(--mono);
    font-size: 0.65rem;
    display: block;
    letter-spacing: 0.1em;
    text-transform: uppercase;
    margin-bottom: 3px;
    opacity: 0.6;
  }

  .improves::before { content: &#39;Improves&#39;; }
  .risk::before     { content: &#39;Risk&#39;; }

  /* ── Bottom line ── */
  .bottom-line {
    margin: 40px 0;
    padding: 28px 32px;
    background: var(--ink);
    color: var(--paper);
    border-radius: 4px;
    font-size: 0.95rem;
    line-height: 1.7;
    font-weight: 300;
  }

  .bottom-line strong {
    color: #e0d8cc;
    font-weight: 500;
  }

  /* ── Divider ── */
  .divider {
    display: flex;
    align-items: center;
    gap: 16px;
    margin: 56px 0;
    color: var(--rule);
  }

  .divider::before,
  .divider::after {
    content: &#39;&#39;;
    flex: 1;
    height: 1px;
    background: var(--rule);
  }

  .divider-icon {
    font-family: var(--mono);
    font-size: 0.7rem;
    letter-spacing: 0.2em;
    color: var(--ink-light);
  }

  /* ── Closing note ── */
  .closing-note {
    margin-top: 64px;
    padding: 32px 0 0;
    border-top: 1px solid var(--rule);
    font-size: 0.88rem;
    color: var(--ink-light);
    font-style: italic;
    line-height: 1.7;
  }

  /* ── Footer ── */
  footer {
    background: var(--ink);
    color: #6b6358;
    text-align: center;
    padding: 32px;
    font-family: var(--mono);
    font-size: 0.65rem;
    letter-spacing: 0.1em;
  }

  /* ── Responsive ── */
  @media (max-width: 600px) {
    header { padding: 48px 24px 40px; }
    article { padding: 40px 24px 64px; }
    .pull { margin: 32px -8px; padding: 24px 24px; }
  }

  /* ── Animations ── */
  @keyframes fadeUp {
    from { opacity: 0; transform: translateY(20px); }
    to   { opacity: 1; transform: translateY(0); }
  }

  header .header-inner &gt; * {
    animation: fadeUp 0.6s ease both;
  }
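  /* The shorthand above resets animation-delay to 0s and out-ranks bare class
     or element selectors, so the delays need selectors of higher specificity. */
  header .header-inner &gt; .kicker { animation-delay: 0.1s; }
  header .header-inner &gt; h1      { animation-delay: 0.25s; }
  header .header-inner &gt; .deck   { animation-delay: 0.4s; }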
&lt;/style&gt;
&lt;/head&gt;
&lt;body&gt;

&lt;header&gt;
  &lt;div class=&quot;header-inner&quot;&gt;
    &lt;div class=&quot;kicker&quot;&gt;Engineering × AI&lt;/div&gt;
    &lt;h1&gt;The Prompt That Crossed Two Organizations — &lt;em&gt;And Got Sharper Each Time&lt;/em&gt;&lt;/h1&gt;
    &lt;p class=&quot;deck&quot;&gt;How a product executive&#39;s pressure-testing framework traveled to systems engineering, and what happened when we pointed it at a real AWS workflow.&lt;/p&gt;
    &lt;div class=&quot;byline&quot;&gt;
      &lt;span class=&quot;byline-author&quot;&gt;By Thushara Wijeratna&lt;/span&gt;
      &lt;span class=&quot;byline-sep&quot;&gt;·&lt;/span&gt;
      &lt;span class=&quot;byline-collab&quot;&gt;with Scott Johnson &amp;amp; Ryan Polley&lt;/span&gt;
    &lt;/div&gt;
  &lt;/div&gt;
&lt;/header&gt;

&lt;article&gt;

  &lt;p&gt;There&#39;s a quiet revolution happening in how smart teams use AI — and it has nothing to do with the model. It has everything to do with the instructions.&lt;/p&gt;

  &lt;p&gt;A few weeks ago I borrowed a prompt template from a product owner at a large enterprise. He had built it in ChatGPT to give his leadership team a private space to pressure-test ideas before they ever entered a room. When executives were developing roadmaps, he&#39;d run them through the model first: surfacing assumptions, stress-testing the logic, anticipating the hard questions a CFO or GM might raise. The result was a win on both sides of the table. The CEO could arrive at conversations with a sharper, more fully formed point of view. And the product manager got to execute against a plan that had already survived serious scrutiny, with no half-baked pivots and no surprises mid-flight.&lt;/p&gt;

  &lt;p&gt;Think of it less as critique and more as a rehearsal room. The tool doesn&#39;t challenge people — it challenges ideas, privately, before the stakes are high.&lt;/p&gt;

  &lt;p&gt;I took the same framework and ran it in Claude — the model we use at Solo. It worked just as well. Which raises something worth sitting with: the framework didn&#39;t just travel across organizations and domains. It traveled across AI models entirely. That&#39;s the tell. When the same set of instructions produces sharp, useful output regardless of which model is running them, the instructions are the asset. The model is increasingly the commodity.&lt;/p&gt;

  &lt;p&gt;I read those instructions and thought: &lt;em&gt;this exact mental model applies to systems architecture.&lt;/em&gt;&lt;/p&gt;

  &lt;p&gt;So I adapted them. Same seven-step skeleton — identify the thesis, stress-test the portfolio balance, expose assumptions, map risk concentration, name the opportunity costs, simulate the leadership challenge, propose alternative shapes. I changed the vocabulary and the lens. Instead of asking &lt;em&gt;what will Finance push back on&lt;/em&gt;, I asked &lt;em&gt;what will the CEO and engineering team challenge&lt;/em&gt;. Instead of scoring for revenue potential, I scored for cost and time savings. The goal shifted from sharpening an executive&#39;s boardroom instincts to sharpening an engineer&#39;s thinking in peer and leadership conversations.&lt;/p&gt;

  &lt;p&gt;Then I pointed it at something real: the workflow our team at Solo uses to create new EventBridge scheduled rules in AWS.&lt;/p&gt;

  &lt;div class=&quot;divider&quot;&gt;&lt;span class=&quot;divider-icon&quot;&gt;— —&lt;/span&gt;&lt;/div&gt;

  &lt;h2&gt;What the workflow looked like&lt;/h2&gt;

  &lt;p&gt;The process isn&#39;t complicated on its face. You take a YAML template, swap in five dynamic variables, insert the block into a CloudFormation file, open a PR, get infra review, run a change set in the AWS console, verify, execute, merge. Clear enough. Solid IaC discipline — no console drift, changes tracked in git, peer review required.&lt;/p&gt;

  &lt;p&gt;What happened next is what makes this worth writing about.&lt;/p&gt;

  &lt;div class=&quot;divider&quot;&gt;&lt;span class=&quot;divider-icon&quot;&gt;— —&lt;/span&gt;&lt;/div&gt;

  &lt;h2&gt;What Claude found&lt;/h2&gt;

  &lt;div class=&quot;finding&quot;&gt;
    &lt;div class=&quot;finding-label&quot;&gt;Step 1 — Strategic thesis&lt;/div&gt;
    &lt;p&gt;&lt;strong&gt;The thesis was absent.&lt;/strong&gt; Claude&#39;s first move was to name this: &lt;em&gt;defensive / constraint-driven, no clear thesis.&lt;/em&gt; The process enforces the right principle — infrastructure as code over console changes — but it&#39;s encoded as a manual human checklist rather than a system with guardrails. The implied thesis should be &quot;infrastructure changes are safe, auditable, and low-friction.&quot; Claude&#39;s verdict: it&#39;s only partially achieving that.&lt;/p&gt;
  &lt;/div&gt;

  &lt;div class=&quot;finding&quot;&gt;
    &lt;div class=&quot;finding-label&quot;&gt;Step 2 — Portfolio balance&lt;/div&gt;
    &lt;p&gt;&lt;strong&gt;The portfolio is completely unbalanced.&lt;/strong&gt; Claude broke the workflow down across four dimensions: IaC discipline (present), automation and tooling (absent), validation guardrails (absent), developer autonomy (weak). Its summary was blunt: &lt;em&gt;all weight is on human process, zero investment in tooling that would make this self-service and safe simultaneously.&lt;/em&gt;&lt;/p&gt;
  &lt;/div&gt;

  &lt;div class=&quot;finding&quot;&gt;
    &lt;div class=&quot;finding-label&quot;&gt;Step 3 — Assumption exposure&lt;/div&gt;
    &lt;p&gt;&lt;strong&gt;The assumptions don&#39;t hold up.&lt;/strong&gt; Claude identified four things that must be true for this process to succeed — and challenged each one:&lt;/p&gt;
    &lt;p style=&quot;margin-top:12px&quot;&gt;&lt;em&gt;&quot;Engineers will follow the checklist correctly every time.&quot;&lt;/em&gt; There&#39;s no validation that catches a malformed cron expression, wrong snake_case, or a misaligned variable before it hits CloudFormation.&lt;/p&gt;
    &lt;p&gt;&lt;em&gt;&quot;Infra team review is a meaningful gate.&quot;&lt;/em&gt; Unclear. If it&#39;s checking syntactic correctness, a linter does this faster and more reliably. If it&#39;s checking strategic intent, that&#39;s not documented anywhere.&lt;/p&gt;
    &lt;p&gt;&lt;em&gt;&quot;The change set review step prevents mistakes.&quot;&lt;/em&gt; Partially — but it requires the engineer to know what to look for. No checklist defines what &quot;looks good&quot; actually means.&lt;/p&gt;
    &lt;p&gt;&lt;em&gt;&quot;Console changes never happen.&quot;&lt;/em&gt; The last line of the process says &lt;em&gt;never change the rule from the AWS Console&lt;/em&gt; — but there&#39;s no enforcement mechanism. That&#39;s policy, not a guardrail.&lt;/p&gt;
  &lt;/div&gt;

  &lt;div class=&quot;finding&quot;&gt;
    &lt;div class=&quot;finding-label&quot;&gt;Step 4 — Risk concentration&lt;/div&gt;
    &lt;p&gt;&lt;strong&gt;The risk concentration is real.&lt;/strong&gt; One engineer executing the process incorrectly can cause a production scheduling outage or a silent missed execution. The template has six-plus &lt;code style=&quot;font-family:var(--mono);font-size:0.85em;background:var(--paper-warm);padding:1px 5px;border-radius:3px&quot;&gt;Fn::ImportValue&lt;/code&gt; calls, so a single upstream stack name change can silently break every rule, and no cross-reference validation is documented. The process also assumes the engineer simultaneously knows valid cron syntax, CloudFormation change set semantics, ECS task override structure, and rake task naming conventions. That&#39;s a high knowledge bar with no scaffolding.&lt;/p&gt;
  &lt;/div&gt;
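
  &lt;p&gt;To make the missing cross-reference check concrete, here is a minimal Python sketch, not something from our repo. It assumes imports are written as a plain name directly after &lt;code style=&quot;font-family:var(--mono);font-size:0.85em;background:var(--paper-warm);padding:1px 5px;border-radius:3px&quot;&gt;Fn::ImportValue&lt;/code&gt; or its short form, and that you already hold a list of export names (for example from &lt;code style=&quot;font-family:var(--mono);font-size:0.85em;background:var(--paper-warm);padding:1px 5px;border-radius:3px&quot;&gt;aws cloudformation list-exports&lt;/code&gt;). The function name and the export names are hypothetical, and imports built with nested functions are simply skipped.&lt;/p&gt;

&lt;pre style=&quot;font-family:var(--mono);font-size:0.8rem;line-height:1.6;background:var(--paper-warm);padding:20px 24px;border-radius:4px;overflow-x:auto;margin-bottom:1.4em&quot;&gt;
```python
import re

# Matches the long form (Fn::ImportValue: name) and the short form (!ImportValue name).
# Assumption: the export name is a plain literal, optionally quoted.
IMPORT_VALUE = re.compile(r"""(?:Fn::ImportValue:|!ImportValue)\s+["']?([A-Za-z0-9:_.-]+)""")


def missing_imports(template_text, known_exports):
    """Return imported export names that are not in known_exports, sorted."""
    referenced = set(IMPORT_VALUE.findall(template_text))
    return sorted(referenced - set(known_exports))


if __name__ == "__main__":
    # Hypothetical template fragment and export names, for illustration only.
    template = """
    Targets:
      - Arn:
          Fn::ImportValue: core-ecs-ClusterArn
        RoleArn: !ImportValue core-iam-EventsRoleArn
    """
    print(missing_imports(template, {"core-ecs-ClusterArn"}))
```
&lt;/pre&gt;

  &lt;p&gt;A check like this could run in CI against the rules file, so a renamed or removed export shows up in the PR rather than at deploy time.&lt;/p&gt;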

  &lt;div class=&quot;pull&quot;&gt;
    The most underappreciated risk lives in the last line: &lt;em&gt;&quot;Never change the rule from the AWS Console.&quot;&lt;/em&gt; A critical constraint buried where engineers are least likely to retain it.
  &lt;/div&gt;

  &lt;p&gt;If someone misses it, especially under time pressure, they introduce configuration drift that CloudFormation won&#39;t flag on the next deploy unless someone explicitly runs drift detection. The instruction exists. The enforcement doesn&#39;t.&lt;/p&gt;

  &lt;div class=&quot;divider&quot;&gt;&lt;span class=&quot;divider-icon&quot;&gt;— —&lt;/span&gt;&lt;/div&gt;

  &lt;h2&gt;The leadership challenge simulation&lt;/h2&gt;

  &lt;p&gt;Claude anticipated the questions likely to land hardest in an executive or peer review:&lt;/p&gt;

  &lt;div class=&quot;finding&quot;&gt;
    &lt;div class=&quot;finding-label&quot;&gt;CEO / Engineering lead&lt;/div&gt;
    &lt;p&gt;&lt;em&gt;&quot;How many engineer-hours does this take per rule addition, end to end? What&#39;s the error rate? Have we had production scheduling failures from this process?&quot;&lt;/em&gt;&lt;/p&gt;
    &lt;p style=&quot;margin-top:10px&quot;&gt;Probably no clean answers. That&#39;s the problem.&lt;/p&gt;
  &lt;/div&gt;

  &lt;div class=&quot;finding&quot;&gt;
    &lt;div class=&quot;finding-label&quot;&gt;Finance&lt;/div&gt;
    &lt;p&gt;&lt;em&gt;&quot;If a scheduled job silently fails to be created or runs at the wrong time, what&#39;s the business impact?&quot;&lt;/em&gt;&lt;/p&gt;
    &lt;p style=&quot;margin-top:10px&quot;&gt;The process has no alerting or confirmation that a newly added rule is actually firing. Execution success is not the same as operational correctness.&lt;/p&gt;
  &lt;/div&gt;

  &lt;div class=&quot;finding&quot;&gt;
    &lt;div class=&quot;finding-label&quot;&gt;Engineering&lt;/div&gt;
    &lt;p&gt;&lt;em&gt;&quot;Why do I need infra team review for a cron job? Why can&#39;t I validate this locally? Why is there no test environment path?&quot;&lt;/em&gt;&lt;/p&gt;
    &lt;p style=&quot;margin-top:10px&quot;&gt;Claude&#39;s assessment: these are legitimate objections. The current process treats every engineer as a potential misconfigurer rather than building systems that make misconfiguration hard.&lt;/p&gt;
  &lt;/div&gt;

  &lt;div class=&quot;divider&quot;&gt;&lt;span class=&quot;divider-icon&quot;&gt;— —&lt;/span&gt;&lt;/div&gt;

  &lt;h2&gt;Three alternative shapes&lt;/h2&gt;

  &lt;p&gt;Claude proposed three different strategic directions, each with honest tradeoffs:&lt;/p&gt;

  &lt;div class=&quot;options&quot;&gt;
    &lt;div class=&quot;option&quot;&gt;
      &lt;div class=&quot;option-label&quot;&gt;Option A — Recommended near-term&lt;/div&gt;
      &lt;h3&gt;Script-first&lt;/h3&gt;
      &lt;p&gt;Build a small CLI that prompts for inputs, validates cron syntax, generates the YAML block, and inserts it correctly into the file. This eliminates the entire class of template variable substitution errors and removes the knowledge burden from the engineer.&lt;/p&gt;
      &lt;div class=&quot;option-meta&quot;&gt;
        &lt;span class=&quot;improves&quot;&gt;Eliminates substitution error class entirely&lt;/span&gt;
        &lt;span class=&quot;risk&quot;&gt;One-time build and maintenance investment&lt;/span&gt;
      &lt;/div&gt;
    &lt;/div&gt;

    &lt;div class=&quot;option&quot;&gt;
      &lt;div class=&quot;option-label&quot;&gt;Option B&lt;/div&gt;
      &lt;h3&gt;Separate stack per rule&lt;/h3&gt;
      &lt;p&gt;Instead of one monolithic YAML file, each rule gets its own CloudFormation stack. Engineers own their rule&#39;s lifecycle. Merge conflicts disappear. Blast radius is isolated.&lt;/p&gt;
      &lt;div class=&quot;option-meta&quot;&gt;
        &lt;span class=&quot;improves&quot;&gt;Eliminates merge conflicts, isolates blast radius&lt;/span&gt;
        &lt;span class=&quot;risk&quot;&gt;Stack proliferation requires naming discipline&lt;/span&gt;
      &lt;/div&gt;
    &lt;/div&gt;

    &lt;div class=&quot;option&quot;&gt;
      &lt;div class=&quot;option-label&quot;&gt;Option C — Longer horizon&lt;/div&gt;
      &lt;h3&gt;Migrate to EventBridge Scheduler + CDK / Terraform&lt;/h3&gt;
      &lt;p&gt;Replace CloudFormation-managed EventBridge Rules with the purpose-built newer service, managed through CDK or Terraform modules. Better DX, built-in retry policies, templated constructs reduce copy-paste risk significantly.&lt;/p&gt;
      &lt;div class=&quot;option-meta&quot;&gt;
        &lt;span class=&quot;improves&quot;&gt;Better DX, built-in retries, less copy-paste risk&lt;/span&gt;
        &lt;span class=&quot;risk&quot;&gt;Migration cost, team upskilling, short-term disruption&lt;/span&gt;
      &lt;/div&gt;
    &lt;/div&gt;
  &lt;/div&gt;

  &lt;div class=&quot;bottom-line&quot;&gt;
    &lt;strong&gt;Claude&#39;s bottom line:&lt;/strong&gt; This process enforces the right principle with the wrong mechanism. The risk isn&#39;t that engineers are careless — it&#39;s that the process provides no structural resistance to errors. A linter, a generator script, and a validation step in CI would eliminate the majority of failure modes at low cost. &lt;strong&gt;The highest-leverage immediate action: a script that generates the YAML block from inputs and validates cron syntax before the PR is opened. Everything else can wait.&lt;/strong&gt;
  &lt;/div&gt;
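
  &lt;p&gt;As a rough illustration of that script-first idea, here is a minimal Python sketch, not our actual tooling. The function names, the three inputs, and the generated block are all hypothetical: the real template, its five variables, and its ECS target configuration are not reproduced in this post, so targets are left out. The cron check covers only the basic shape of an EventBridge schedule expression as I understand it (six fields, with a &lt;code style=&quot;font-family:var(--mono);font-size:0.85em;background:var(--paper-warm);padding:1px 5px;border-radius:3px&quot;&gt;?&lt;/code&gt; in exactly one of the two day fields).&lt;/p&gt;

&lt;pre style=&quot;font-family:var(--mono);font-size:0.8rem;line-height:1.6;background:var(--paper-warm);padding:20px 24px;border-radius:4px;overflow-x:auto;margin-bottom:1.4em&quot;&gt;
```python
import json
import re

# Hypothetical conventions; the real template and its variables are not shown here.
SNAKE_CASE = re.compile(r"[a-z][a-z0-9]*(_[a-z0-9]+)*")
CRON_FIELD = re.compile(r"[0-9A-Za-z*?,/#-]+")


def validate_cron(expression):
    """Return a list of problems with an EventBridge-style cron(...) expression."""
    if not (expression.startswith("cron(") and expression.endswith(")")):
        return ["schedule must be written as cron(...)"]
    fields = expression[len("cron("):-1].split()
    if len(fields) != 6:
        return [f"expected 6 fields, found {len(fields)}"]
    problems = [f"unexpected characters in {field!r}"
                for field in fields if not CRON_FIELD.fullmatch(field)]
    day_of_month, day_of_week = fields[2], fields[4]
    if (day_of_month == "?") == (day_of_week == "?"):
        problems.append("use ? in exactly one of day-of-month and day-of-week")
    return problems


def render_rule(rule_name, schedule, description):
    """Validate the inputs, then return a YAML block for one scheduled rule."""
    problems = validate_cron(schedule)
    if not SNAKE_CASE.fullmatch(rule_name):
        problems.append(f"rule name {rule_name!r} is not snake_case")
    if problems:
        raise ValueError("; ".join(problems))
    # CloudFormation logical IDs are alphanumeric, so nightly_report becomes NightlyReportRule.
    logical_id = "".join(part.capitalize() for part in rule_name.split("_")) + "Rule"
    return "\n".join([
        f"  {logical_id}:",
        "    Type: AWS::Events::Rule",
        "    Properties:",
        f"      Name: {rule_name}",
        f"      Description: {json.dumps(description)}",
        f"      ScheduleExpression: {json.dumps(schedule)}",
        "      State: ENABLED",
    ])


if __name__ == "__main__":
    print(render_rule("nightly_report", "cron(0 12 * * ? *)", "Nightly report"))
```
&lt;/pre&gt;

  &lt;p&gt;The design choice that matters is that the generator refuses to emit anything when validation fails, so a malformed schedule or a badly cased name never reaches the PR.&lt;/p&gt;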

  &lt;div class=&quot;divider&quot;&gt;&lt;span class=&quot;divider-icon&quot;&gt;— —&lt;/span&gt;&lt;/div&gt;

  &lt;h2&gt;What this is really about&lt;/h2&gt;

  &lt;p&gt;The genealogy of this critique is what I keep coming back to. A product owner at an enterprise company built a framework in ChatGPT to make product leaders sharper. I adapted it for Claude to make systems engineers sharper. The seven-step skeleton traveled across two organizations, two domains, two AI models, and two completely different problems — and produced something genuinely useful every time.&lt;/p&gt;

  &lt;p&gt;The cross-model part matters more than it might seem. We&#39;re entering a moment where the major AI models are converging in capability. The choice between them is increasingly a matter of workflow preference, not raw power. What doesn&#39;t come for free, and has to be deliberately designed, is how you instruct them. The same prompt that works in ChatGPT works in Claude. The same framework that sharpens a product roadmap sharpens an engineering workflow. The instructions are the portable, reusable, compounding asset. The model is the infrastructure underneath.&lt;/p&gt;

  &lt;p&gt;We spend a lot of time evaluating which AI model to use and almost no time designing &lt;em&gt;how we instruct it.&lt;/em&gt; The difference between an AI that validates your thinking and one that challenges it isn&#39;t the model version — it&#39;s the instruction set. One framing decision, encoded in a project&#39;s system prompt, shifts the output from agreeable to adversarial, from a mirror to a pressure test.&lt;/p&gt;

  &lt;p&gt;The insight the enterprise product owner had — that you can force structured, sequential reasoning by encoding a multi-step framework as the operating instruction — turns out to be domain-agnostic and model-agnostic. The same architecture works on roadmaps, on engineering workflows, on financial models, on hiring processes. You change the vocabulary. The &lt;em&gt;sharpness&lt;/em&gt; is the point.&lt;/p&gt;

  &lt;p&gt;The prompt that lives in our Claude project now means any engineer can walk in with a workflow, a design doc, or an architectural decision and get back something that will make them think harder — not feel better.&lt;/p&gt;

  &lt;p&gt;That&#39;s the unlock. And it cost nothing but the willingness to borrow a smart idea from someone doing a completely different job, on a completely different platform, solving a completely different problem.&lt;/p&gt;

  &lt;p&gt;The best prompts, it turns out, travel well.&lt;/p&gt;

  &lt;div class=&quot;closing-note&quot;&gt;
    &lt;p&gt;Want to build your own pressure-testing project? The pattern is straightforward: pick an adversarial advisor persona, write a multi-step reasoning framework that forces each analytical lens to run in sequence, and explicitly ban validation as a default behavior. It works in Claude. It works in ChatGPT. The framework above has seven steps — but what makes it work isn&#39;t the number, and it isn&#39;t the model. It&#39;s the instruction not to let weak reasoning slide.&lt;/p&gt;
  &lt;/div&gt;

&lt;/article&gt;

&lt;footer&gt;
  Solo &amp;nbsp;·&amp;nbsp; Engineering &amp;amp; AI &amp;nbsp;·&amp;nbsp; 2025
&lt;/footer&gt;

&lt;/body&gt;
&lt;/html&gt;</content><link rel='replies' type='application/atom+xml' href='http://thushw.blogspot.com/feeds/8810329584496972211/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/11265228/8810329584496972211' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/11265228/posts/default/8810329584496972211'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/11265228/posts/default/8810329584496972211'/><link rel='alternate' type='text/html' href='http://thushw.blogspot.com/2026/04/the-prompt-that-crossed-two.html' title=''/><author><name>Unknown</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-11265228.post-7034466967061592555</id><published>2026-03-26T20:35:00.000-07:00</published><updated>2026-03-26T20:47:37.719-07:00</updated><title type='text'>How We Migrated Sidekiq&#39;s Redis Without Losing a Single Job</title><content type='html'>&lt;!DOCTYPE html&gt;
&lt;html lang=&quot;en&quot;&gt;
&lt;head&gt;
&lt;meta charset=&quot;UTF-8&quot; /&gt;
&lt;meta name=&quot;viewport&quot; content=&quot;width=device-width, initial-scale=1.0&quot;/&gt;
&lt;title&gt;How We Migrated Sidekiq&#39;s Redis Without Losing a Single Job&lt;/title&gt;
&lt;link href=&quot;https://fonts.googleapis.com/css2?family=Playfair+Display:ital,wght@0,700;0,900;1,700&amp;family=IBM+Plex+Mono:wght@400;500&amp;family=Source+Serif+4:ital,opsz,wght@0,8..60,400;0,8..60,600;1,8..60,400&amp;display=swap&quot; rel=&quot;stylesheet&quot;/&gt;
&lt;style&gt;
  :root {
    --bg: #f7f4ef;
    --ink: #1a1714;
    --muted: #6b6259;
    --accent: #c0392b;
    --accent-soft: #f0dbd9;
    --rule: #d6cfc5;
    --code-bg: #1e1b18;
    --code-fg: #e8dfd4;
    --code-keyword: #e07b54;
    --code-string: #89c27a;
    --code-comment: #7a6f65;
    --code-num: #d4a84b;
  }

  *, *::before, *::after { box-sizing: border-box; margin: 0; padding: 0; }

  body {
    background: var(--bg);
    color: var(--ink);
    font-family: &#39;Source Serif 4&#39;, Georgia, serif;
    font-size: 19px;
    line-height: 1.75;
    -webkit-font-smoothing: antialiased;
  }

  /* ── HEADER / MASTHEAD ── */
  .masthead {
    border-bottom: 1px solid var(--rule);
    padding: 18px 0 14px;
    text-align: center;
    letter-spacing: 0.18em;
    font-size: 11px;
    font-family: &#39;IBM Plex Mono&#39;, monospace;
    color: var(--muted);
    text-transform: uppercase;
  }
  .masthead a { color: var(--accent); text-decoration: none; }
  .masthead .sep { margin: 0 10px; }

  /* ── HERO ── */
  .hero {
    max-width: 780px;
    margin: 0 auto;
    padding: 64px 24px 48px;
    border-bottom: 1px solid var(--rule);
  }
  .hero-kicker {
    font-family: &#39;IBM Plex Mono&#39;, monospace;
    font-size: 11px;
    letter-spacing: 0.2em;
    text-transform: uppercase;
    color: var(--accent);
    margin-bottom: 20px;
  }
  .hero h1 {
    font-family: &#39;Playfair Display&#39;, Georgia, serif;
    font-size: clamp(32px, 5.5vw, 56px);
    font-weight: 900;
    line-height: 1.1;
    letter-spacing: -0.01em;
    color: var(--ink);
    margin-bottom: 28px;
  }
  .hero h1 em {
    font-style: italic;
    color: var(--accent);
  }
  .hero-meta {
    display: flex;
    align-items: center;
    gap: 20px;
    font-family: &#39;IBM Plex Mono&#39;, monospace;
    font-size: 12px;
    color: var(--muted);
    letter-spacing: 0.05em;
    flex-wrap: wrap;
  }
  .hero-meta .dot { color: var(--rule); font-size: 16px; }
  .hero-lede {
    margin-top: 28px;
    font-size: 20px;
    color: #3a342e;
    line-height: 1.7;
    font-style: italic;
    border-left: 3px solid var(--accent);
    padding-left: 20px;
  }

  /* ── ARTICLE BODY ── */
  .article {
    max-width: 780px;
    margin: 0 auto;
    padding: 52px 24px 80px;
  }

  .article p {
    margin-bottom: 1.5em;
    color: #2a2420;
  }

  /* ── SECTION HEADINGS ── */
  .article h2 {
    font-family: &#39;Playfair Display&#39;, Georgia, serif;
    font-size: 28px;
    font-weight: 700;
    color: var(--ink);
    margin: 2.8em 0 0.8em;
    padding-top: 1em;
    border-top: 1px solid var(--rule);
    letter-spacing: -0.01em;
  }

  /* ── PULL QUOTE / AI QUOTE ── */
  .ai-said {
    background: var(--accent-soft);
    border-left: 4px solid var(--accent);
    border-radius: 0 6px 6px 0;
    padding: 20px 24px;
    margin: 2em 0;
    font-style: italic;
    color: #5a2820;
    font-size: 17px;
  }
  .ai-said::before {
    content: &#39;&quot;AI tools said:&quot;&#39;;
    display: block;
    font-family: &#39;IBM Plex Mono&#39;, monospace;
    font-style: normal;
    font-size: 10px;
    letter-spacing: 0.15em;
    text-transform: uppercase;
    color: var(--accent);
    margin-bottom: 8px;
  }

  /* ── CODE BLOCKS ── */
  .code-wrap {
    margin: 2em 0;
    border-radius: 8px;
    overflow: hidden;
    box-shadow: 0 4px 20px rgba(0,0,0,0.18);
  }
  .code-header {
    background: #2c2723;
    padding: 10px 18px;
    font-family: &#39;IBM Plex Mono&#39;, monospace;
    font-size: 11px;
    color: #8a7d73;
    letter-spacing: 0.1em;
    display: flex;
    align-items: center;
    gap: 8px;
  }
  .code-header .dots span {
    display: inline-block;
    width: 10px; height: 10px;
    border-radius: 50%;
    margin-right: 5px;
  }
  .code-header .dots span:nth-child(1) { background: #ff5f57; }
  .code-header .dots span:nth-child(2) { background: #febc2e; }
  .code-header .dots span:nth-child(3) { background: #28c840; }
  pre {
    background: var(--code-bg);
    color: var(--code-fg);
    padding: 24px 24px;
    overflow-x: auto;
    font-family: &#39;IBM Plex Mono&#39;, monospace;
    font-size: 14px;
    line-height: 1.7;
    margin: 0;
  }
  .kw  { color: var(--code-keyword); }
  .str { color: var(--code-string); }
  .cm  { color: var(--code-comment); font-style: italic; }
  .cn  { color: #e07b54; font-weight: 500; }
  .mb  { color: #7ec8e3; }

  /* ── HIGHLIGHT CALLOUT ── */
  .callout {
    background: var(--ink);
    color: var(--bg);
    border-radius: 8px;
    padding: 28px 32px;
    margin: 2.5em 0;
    font-size: 18px;
    line-height: 1.65;
  }
  .callout strong { color: #f0c070; }

  /* ── TAKEAWAYS LIST ── */
  .takeaways {
    list-style: none;
    padding: 0;
    margin: 1.2em 0 2em;
  }
  .takeaways li {
    padding: 14px 0 14px 0;
    border-bottom: 1px solid var(--rule);
    display: grid;
    grid-template-columns: 28px 1fr;
    gap: 12px;
    align-items: start;
    font-size: 17px;
  }
  .takeaways li:first-child { border-top: 1px solid var(--rule); }
  .takeaways li .num {
    font-family: &#39;IBM Plex Mono&#39;, monospace;
    font-size: 11px;
    color: var(--accent);
    font-weight: 500;
    padding-top: 3px;
  }
  .takeaways li strong { display: block; margin-bottom: 2px; }

  /* ── FOOTER ── */
  .article-footer {
    max-width: 780px;
    margin: 0 auto;
    padding: 32px 24px 48px;
    border-top: 1px solid var(--rule);
    display: flex;
    align-items: center;
    justify-content: space-between;
    flex-wrap: wrap;
    gap: 16px;
  }
  .article-footer .byline {
    font-family: &#39;IBM Plex Mono&#39;, monospace;
    font-size: 12px;
    color: var(--muted);
    letter-spacing: 0.06em;
  }
  .article-footer .tags {
    display: flex;
    gap: 8px;
    flex-wrap: wrap;
  }
  .tag {
    font-family: &#39;IBM Plex Mono&#39;, monospace;
    font-size: 10px;
    letter-spacing: 0.12em;
    text-transform: uppercase;
    background: var(--accent-soft);
    color: var(--accent);
    padding: 4px 10px;
    border-radius: 3px;
  }

  /* ── DIVIDER ── */
  .rule-ornament {
    text-align: center;
    margin: 3em 0 1em;
    color: var(--rule);
    font-size: 22px;
    letter-spacing: 0.4em;
  }

  /* ── RESPONSIVE ── */
  @media (max-width: 600px) {
    body { font-size: 17px; }
    .hero { padding: 40px 18px 36px; }
    .article { padding: 36px 18px 60px; }
    pre { font-size: 13px; padding: 18px; }
  }
&lt;/style&gt;
&lt;/head&gt;
&lt;body&gt;

&lt;!-- MASTHEAD --&gt;
&lt;nav class=&quot;masthead&quot;&gt;
  &lt;a href=&quot;https://thushw.blogspot.com&quot;&gt;thushw&lt;/a&gt;
  &lt;span class=&quot;sep&quot;&gt;·&lt;/span&gt;
  Engineering Notes
  &lt;span class=&quot;sep&quot;&gt;·&lt;/span&gt;
  March 2026
&lt;/nav&gt;

&lt;!-- HERO --&gt;
&lt;header class=&quot;hero&quot;&gt;
  &lt;p class=&quot;hero-kicker&quot;&gt;Infrastructure · Redis · Sidekiq&lt;/p&gt;
  &lt;h1&gt;How We Migrated Sidekiq&#39;s Redis Without Losing a Single Job &lt;em&gt;(and Without Listening to AI)&lt;/em&gt;&lt;/h1&gt;
  &lt;div class=&quot;hero-meta&quot;&gt;
    &lt;span&gt;Solo Engineering Team&lt;/span&gt;
    &lt;span class=&quot;dot&quot;&gt;·&lt;/span&gt;
    &lt;span&gt;March 2026&lt;/span&gt;
    &lt;span class=&quot;dot&quot;&gt;·&lt;/span&gt;
    &lt;span&gt;8 min read&lt;/span&gt;
  &lt;/div&gt;
  &lt;p class=&quot;hero-lede&quot;&gt;We moved our Sidekiq backend from Redis Enterprise to AWS ElastiCache. The AI tools recommended a careful, expensive approach. We did something simpler — and it worked perfectly.&lt;/p&gt;
&lt;/header&gt;

&lt;!-- ARTICLE --&gt;
&lt;main class=&quot;article&quot;&gt;

  &lt;h2&gt;The Setup&lt;/h2&gt;
  &lt;p&gt;Our app runs Sidekiq workers on ECS. Each process connects to Redis on startup to read and process jobs. We were moving from Redis Enterprise to ElastiCache — different host, different connection string, same protocol.&lt;/p&gt;
  &lt;p&gt;New jobs would start going to the new Redis as soon as we deployed. But existing jobs queued in the old Redis? They&#39;d be orphaned the moment every worker switched over.&lt;/p&gt;

  &lt;h2&gt;What the AI Tools Said&lt;/h2&gt;
  &lt;p&gt;We asked around — Claude, ChatGPT, Gemini, Grok. They all landed in roughly the same place:&lt;/p&gt;

  &lt;div class=&quot;ai-said&quot;&gt;
    You should deploy a separate environment connected to the old Redis. Let it drain the queue over time, then decommission.
  &lt;/div&gt;

  &lt;p&gt;It&#39;s not wrong. But it&#39;s heavy. That approach meant new ECS task definitions, environment variable management across two sets of infra, coordinating the decommission, and extra cost while two clusters run in parallel.&lt;/p&gt;
  &lt;p&gt;When we pushed back, one tool offered an alternative: run two Sidekiq processes per Docker container — one pointed at old Redis, one at new. That would have required changes to CloudFormation templates, process supervision config inside the container, and careful cleanup afterward. Trading one complex migration for another.&lt;/p&gt;

  &lt;div class=&quot;callout&quot;&gt;
    But they missed something important: &lt;strong&gt;Sidekiq&#39;s backing store is completely external to the process.&lt;/strong&gt; A job scheduled on Redis Enterprise doesn&#39;t &lt;em&gt;belong&lt;/em&gt; to any particular Sidekiq process — it just sits there until a worker with a connection to that Redis comes along. The worker is stateless.
  &lt;/div&gt;

  &lt;p&gt;So the &quot;debugging nightmare&quot; scenario the AI tools described... wouldn&#39;t actually happen.&lt;/p&gt;

  &lt;h2&gt;The Actual Solution&lt;/h2&gt;
  &lt;p&gt;Our team came up with something much simpler. In &lt;code style=&quot;font-family:&#39;IBM Plex Mono&#39;,monospace;font-size:15px;background:#e8e0d8;padding:2px 6px;border-radius:3px;&quot;&gt;config/initializers/sidekiq.rb&lt;/code&gt;, at startup, each Sidekiq process decides which Redis to connect to. We added one line:&lt;/p&gt;

  &lt;div class=&quot;code-wrap&quot;&gt;
    &lt;div class=&quot;code-header&quot;&gt;
      &lt;div class=&quot;dots&quot;&gt;&lt;span&gt;&lt;/span&gt;&lt;span&gt;&lt;/span&gt;&lt;span&gt;&lt;/span&gt;&lt;/div&gt;
      config/initializers/sidekiq.rb — the one-liner
    &lt;/div&gt;
    &lt;pre&gt;&lt;span class=&quot;cm&quot;&gt;# Coin toss at startup — connects this process to one Redis for its entire lifetime&lt;/span&gt;
redis_url = &lt;span class=&quot;mb&quot;&gt;rand&lt;/span&gt; &amp;lt; &lt;span class=&quot;cn&quot;&gt;0.5&lt;/span&gt; ? &lt;span class=&quot;cn&quot;&gt;LYMO_SIDEKIQ_NEW_REDIS_URL&lt;/span&gt; : &lt;span class=&quot;cn&quot;&gt;LYMO_SIDEKIQ_OLD_REDIS_URL&lt;/span&gt;&lt;/pre&gt;
  &lt;/div&gt;

  &lt;p&gt;That&#39;s it. On startup, each worker tosses a coin. Heads → new ElastiCache. Tails → old Redis Enterprise.&lt;/p&gt;
  &lt;p&gt;The result: roughly half the cluster continued draining the old queue, while the other half processed new jobs on ElastiCache. No new infra. No task definition changes. No separate environment to coordinate.&lt;/p&gt;
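  &lt;p&gt;One thing worth checking before relying on a coin toss is how likely it is that every process lands on the same side, which would leave one Redis with no readers until the next restart. The arithmetic can be sketched as below, assuming processes choose independently. The helper name and the fleet sizes are illustrative and are not figures from this migration.&lt;/p&gt;

```ruby
# Sketch: chance that a fleet of independently started Sidekiq processes
# leaves one Redis with no readers at all. `old_share` is the probability
# that a single process picks the old Redis (0.5 for a fair coin).
# The helper name and the fleet sizes below are hypothetical.
def unserved_probabilities(process_count, old_share = 0.5)
  {
    old_unserved: (1 - old_share)**process_count, # every process picked new
    new_unserved: old_share**process_count        # every process picked old
  }
end

[2, 4, 8, 16].each do |n|
  probs = unserved_probabilities(n)
  puts format("%2d processes: old unserved %.5f, new unserved %.5f",
              n, probs[:old_unserved], probs[:new_unserved])
end
```

  &lt;p&gt;With a fair coin and two processes, the chance that nobody reads the old Redis is 25%. With eight processes it falls to about 0.4%, so the trick is more comfortable on larger fleets, and any restart re-tosses the coin anyway.&lt;/p&gt;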
  &lt;p&gt;We also pointed all &lt;em&gt;job producers&lt;/em&gt; (the code that enqueues jobs) at the new Redis immediately. So new work only ever went to ElastiCache. The old Redis just needed to drain.&lt;/p&gt;
  &lt;p&gt;This is where Sidekiq&#39;s initializer structure becomes the key enabler. &lt;code style=&quot;font-family:&#39;IBM Plex Mono&#39;,monospace;font-size:15px;background:#e8e0d8;padding:2px 6px;border-radius:3px;&quot;&gt;configure_server&lt;/code&gt; and &lt;code style=&quot;font-family:&#39;IBM Plex Mono&#39;,monospace;font-size:15px;background:#e8e0d8;padding:2px 6px;border-radius:3px;&quot;&gt;configure_client&lt;/code&gt; can be wired separately: the server (the side that reads jobs) uses the &lt;code style=&quot;font-family:&#39;IBM Plex Mono&#39;,monospace;font-size:15px;background:#e8e0d8;padding:2px 6px;border-radius:3px;&quot;&gt;redis_url&lt;/code&gt; resolved at startup, while the client (the side that pushes jobs) always points at the new Redis:&lt;/p&gt;

  &lt;div class=&quot;code-wrap&quot;&gt;
    &lt;div class=&quot;code-header&quot;&gt;
      &lt;div class=&quot;dots&quot;&gt;&lt;span&gt;&lt;/span&gt;&lt;span&gt;&lt;/span&gt;&lt;span&gt;&lt;/span&gt;&lt;/div&gt;
      config/initializers/sidekiq.rb — full initializer
    &lt;/div&gt;
    &lt;pre&gt;redis_url = &lt;span class=&quot;mb&quot;&gt;rand&lt;/span&gt; &amp;lt; &lt;span class=&quot;cn&quot;&gt;0.5&lt;/span&gt; ? &lt;span class=&quot;cn&quot;&gt;LYMO_SIDEKIQ_NEW_REDIS_URL&lt;/span&gt; : &lt;span class=&quot;cn&quot;&gt;LYMO_SIDEKIQ_OLD_REDIS_URL&lt;/span&gt;

&lt;span class=&quot;cn&quot;&gt;Sidekiq&lt;/span&gt;.&lt;span class=&quot;mb&quot;&gt;configure_server&lt;/span&gt; &lt;span class=&quot;kw&quot;&gt;do&lt;/span&gt; |config|
  config.redis = { url: redis_url }
&lt;span class=&quot;kw&quot;&gt;end&lt;/span&gt;

&lt;span class=&quot;cn&quot;&gt;Sidekiq&lt;/span&gt;.&lt;span class=&quot;mb&quot;&gt;configure_client&lt;/span&gt; &lt;span class=&quot;kw&quot;&gt;do&lt;/span&gt; |config|
  config.redis = { url: &lt;span class=&quot;cn&quot;&gt;LYMO_SIDEKIQ_NEW_REDIS_URL&lt;/span&gt; }
&lt;span class=&quot;kw&quot;&gt;end&lt;/span&gt;&lt;/pre&gt;
  &lt;/div&gt;

  &lt;p&gt;One coin toss. One URL to pull from. Each server process reads from the same Redis for its entire lifetime.&lt;/p&gt;
  &lt;p&gt;The clients (which push jobs) always use the new URL, while reads are split between the old and new URLs. Because the old Redis receives no further jobs, its queue drains over time. The processes that landed on the old Redis were simply left to drain it, and as they cycled out, the cluster converged fully on the new setup with no further intervention.&lt;/p&gt;
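  &lt;p&gt;If the 50/50 split needs adjusting as the old queue empties, the coin toss can be pulled into a small function with a configurable share. This is a sketch under assumptions rather than what we ran: the pick_redis_url helper, the SIDEKIQ_OLD_REDIS_SHARE variable, and the example URLs are all hypothetical.&lt;/p&gt;

```ruby
# Hypothetical helper: choose which Redis a Sidekiq server process reads from.
# old_share is the probability of landing on the old Redis; 0.0 sends every
# process to the new one, which is effectively the post-migration state.
def pick_redis_url(new_url:, old_url:, old_share: 0.5, rng: Random.new)
  unless (0.0..1.0).cover?(old_share)
    raise ArgumentError, "old_share must be between 0.0 and 1.0"
  end

  # rng.rand returns a float in [0.0, 1.0), so old_share = 0.0 never picks
  # the old Redis and old_share = 1.0 always does.
  old_share > rng.rand ? old_url : new_url
end

# Assumed environment variable, defaulting to the fair coin from the post.
share = Float(ENV.fetch("SIDEKIQ_OLD_REDIS_SHARE", "0.5"))

redis_url = pick_redis_url(
  new_url: "redis://new.example.internal:6379/0", # placeholder URL
  old_url: "redis://old.example.internal:6379/0", # placeholder URL
  old_share: share
)
puts "this process will read from #{redis_url}"
```

  &lt;p&gt;Under these assumptions, setting the share to 0.0 on a later deploy would take the old Redis out of the read path without touching the initializer code, and passing a seeded Random makes the choice reproducible in tests.&lt;/p&gt;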

  &lt;h2&gt;How It Went&lt;/h2&gt;
  &lt;p&gt;It worked exactly as expected. Within a day, roughly 90% of the old queue had drained naturally. Workers reading from old Redis gradually found less and less work, while ElastiCache handled all the new throughput.&lt;/p&gt;
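  &lt;p&gt;To put a number on how much of the old queue is left, the relevant counts can be read straight from Redis. The sketch below assumes Sidekiq&#39;s usual key layout with no namespace (a queues set, one queue:NAME list per queue, and the schedule and retry sorted sets) and a client that responds to redis-rb style smembers, llen, and zcard. The helper names are made up for illustration.&lt;/p&gt;

```ruby
# Sketch: count what is still waiting in a Sidekiq Redis.
# `redis` is assumed to respond to smembers, llen and zcard the way a
# redis-rb client does, e.g. Redis.new(url: LYMO_SIDEKIQ_OLD_REDIS_URL).
# remaining_jobs and drained? are hypothetical helper names.
def remaining_jobs(redis)
  queue_names = redis.smembers("queues")
  {
    queued: queue_names.sum { |name| redis.llen("queue:#{name}") },
    scheduled: redis.zcard("schedule"),
    retries: redis.zcard("retry")
  }
end

# True once nothing is queued, scheduled, or awaiting retry.
# Jobs that a worker is executing right now are not counted here.
def drained?(redis)
  remaining_jobs(redis).values.sum.zero?
end
```

  &lt;p&gt;Running remaining_jobs against the old Redis every few minutes gives a simple drain curve. Sidekiq&#39;s own Sidekiq::Stats API reports similar figures, but only for the Redis the current process is configured against.&lt;/p&gt;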
  &lt;p&gt;The remaining jobs were a different story: &lt;strong&gt;scheduled jobs&lt;/strong&gt;. These live in a Sidekiq sorted set and don&#39;t get picked up until their execution time arrives, which could be hours away. Waiting wasn&#39;t ideal, so we wrote a small script to move them from the old Redis to the new one manually: a few lines to iterate the scheduled and retry sets, re-enqueue each job on ElastiCache, and delete it from the old Redis. Clean cutover.&lt;/p&gt;
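  &lt;p&gt;The script itself is not reproduced here, but the idea can be sketched as follows. This is an illustration under assumptions, not the code we ran: it relies on Sidekiq keeping scheduled and retry jobs in the schedule and retry sorted sets (scored by due time), assumes no Redis namespace, and expects two clients that behave like redis-rb. The names move_sorted_set and migrate_time_based_jobs are hypothetical.&lt;/p&gt;

```ruby
# Sketch: move Sidekiq's time-based sets from one Redis to another.
# Both arguments are assumed to respond to the redis-rb style calls used
# below: zrange(key, start, stop, with_scores: true), zadd and zrem.
SIDEKIQ_ZSETS = %w[schedule retry].freeze

# Hypothetical helper: copies every member of one sorted set, keeping its score.
def move_sorted_set(old_redis, new_redis, key)
  moved = 0
  # Each entry is [job_json, score]; the score is the epoch time the job is due.
  old_redis.zrange(key, 0, -1, with_scores: true).each do |payload, score|
    new_redis.zadd(key, score, payload) # copy first...
    old_redis.zrem(key, payload)        # ...then delete, so a crash duplicates rather than loses
    moved += 1
  end
  moved
end

# Returns a hash of set name to number of jobs moved.
def migrate_time_based_jobs(old_redis, new_redis)
  SIDEKIQ_ZSETS.to_h { |key| [key, move_sorted_set(old_redis, new_redis, key)] }
end
```

  &lt;p&gt;In practice the two arguments would be something like Redis.new(url: LYMO_SIDEKIQ_OLD_REDIS_URL) and Redis.new(url: LYMO_SIDEKIQ_NEW_REDIS_URL). Copying before deleting means an interrupted run can be repeated safely, since adding the same member twice only updates its score. If workers are still polling the old Redis while this runs, a job that comes due mid-copy could execute on both sides, so it is safest with idempotent jobs or with the old readers stopped.&lt;/p&gt;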
  &lt;p&gt;Once that was done, we deployed the cleanup — removed the conditional and all references to the old Redis connection. Four lines of code deleted. Done.&lt;/p&gt;
  &lt;p&gt;Oh, and while all of this was happening? The rest of the team made a dozen normal deployments — which restarted servers, reshuffled which Redis each process landed on, and generally did everything the AI tools said would cause a debugging nightmare. Nothing broke. No jobs lost. The doom and gloom theories were disproven in the most practical way possible: by live testing.&lt;/p&gt;

  &lt;h2&gt;Why the AI Advice Missed the Mark&lt;/h2&gt;
  &lt;p&gt;The AI tools were technically cautious but operationally naive. They modeled the problem as &quot;jobs are tied to a running process&quot; — which isn&#39;t how Sidekiq works. Redis is the source of truth, not the worker. The worker is stateless.&lt;/p&gt;
  &lt;p&gt;They also defaulted to the safest, most conservative architecture: full environment isolation. That&#39;s sensible for high-stakes migrations. But for a queue drain, it&#39;s significant overengineering.&lt;/p&gt;
  &lt;p&gt;The human insight — &lt;em&gt;the DB is external, the workers are stateless, so we can split them probabilistically&lt;/em&gt; — is the kind of lateral thinking that comes from actually understanding the system rather than pattern-matching to a template.&lt;/p&gt;

  &lt;div class=&quot;rule-ornament&quot;&gt;— ✦ —&lt;/div&gt;

  &lt;h2&gt;Takeaways&lt;/h2&gt;
  &lt;ul class=&quot;takeaways&quot;&gt;
    &lt;li&gt;
      &lt;span class=&quot;num&quot;&gt;01&lt;/span&gt;
      &lt;div&gt;&lt;strong&gt;Sidekiq workers are stateless.&lt;/strong&gt; Redis is the state. This gives you more migration flexibility than you might think.&lt;/div&gt;
    &lt;/li&gt;
    &lt;li&gt;
      &lt;span class=&quot;num&quot;&gt;02&lt;/span&gt;
      &lt;div&gt;&lt;strong&gt;Probabilistic splits are underrated.&lt;/strong&gt; You don&#39;t always need clean cutoffs. A coin toss at startup is simple, observable, and reversible.&lt;/div&gt;
    &lt;/li&gt;
    &lt;li&gt;
      &lt;span class=&quot;num&quot;&gt;03&lt;/span&gt;
      &lt;div&gt;&lt;strong&gt;AI tools are good at safe answers, not always good at efficient ones.&lt;/strong&gt; They&#39;ll often recommend the conservative solution even when a simpler one exists. Treat their output as a starting point, not a final answer.&lt;/div&gt;
    &lt;/li&gt;
    &lt;li&gt;
      &lt;span class=&quot;num&quot;&gt;04&lt;/span&gt;
      &lt;div&gt;&lt;strong&gt;The cleanup should be as simple as the migration.&lt;/strong&gt; If your migration leaves behind complex infra, you&#39;ve done too much. Ours cleaned up with four deleted lines.&lt;/div&gt;
    &lt;/li&gt;
  &lt;/ul&gt;

&lt;/main&gt;

&lt;!-- FOOTER --&gt;
&lt;footer class=&quot;article-footer&quot;&gt;
  &lt;div class=&quot;byline&quot;&gt;Solo Engineering Team · March 2026&lt;/div&gt;
  &lt;div class=&quot;tags&quot;&gt;
    &lt;span class=&quot;tag&quot;&gt;Redis&lt;/span&gt;
    &lt;span class=&quot;tag&quot;&gt;Sidekiq&lt;/span&gt;
    &lt;span class=&quot;tag&quot;&gt;AWS ElastiCache&lt;/span&gt;
    &lt;span class=&quot;tag&quot;&gt;Migration&lt;/span&gt;
    &lt;span class=&quot;tag&quot;&gt;Ruby&lt;/span&gt;
    &lt;span class=&quot;tag&quot;&gt;ECS&lt;/span&gt;
  &lt;/div&gt;
&lt;/footer&gt;

&lt;/body&gt;
&lt;/html&gt;</content><link rel='replies' type='application/atom+xml' href='http://thushw.blogspot.com/feeds/7034466967061592555/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/11265228/7034466967061592555' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/11265228/posts/default/7034466967061592555'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/11265228/posts/default/7034466967061592555'/><link rel='alternate' type='text/html' href='http://thushw.blogspot.com/2026/03/how-we-migrated-sidekiqs-redis-without.html' title='How We Migrated Sidekiq&#39;s Redis Without Losing a Single Job'/><author><name>Unknown</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-11265228.post-1476326629750694464</id><published>2026-03-15T13:23:00.000-07:00</published><updated>2026-03-15T13:23:59.301-07:00</updated><title type='text'>Dead Code Is a Cognitive Tax — Here&#39;s How AI Helps You Stop Paying It</title><content type='html'>&lt;style&gt; .bp-post{font-family:Georgia,&#39;Times New Roman&#39;,serif;font-size:17px;line-height:1.8;color:#222;max-width:780px;margin:0 auto} .bp-post h1{font-family:Arial,sans-serif;font-size:2em;font-weight:800;line-height:1.25;color:#111;margin-bottom:6px} .bp-post .byline{font-family:Arial,sans-serif;font-size:.85em;color:#888;margin-bottom:36px;border-bottom:1px solid #eee;padding-bottom:16px} .bp-post h2{font-family:Arial,sans-serif;font-size:1.3em;font-weight:700;color:#111;margin-top:48px;margin-bottom:8px} .bp-post h3{font-family:Arial,sans-serif;font-size:1.05em;font-weight:700;color:#333;margin-top:32px;margin-bottom:6px} .bp-post p{margin:0 0 20px} .bp-post ul,.bp-post ol{margin:0 
0 20px 24px;padding:0} .bp-post li{margin-bottom:10px} .bp-post hr{border:none;border-top:1px solid #ddd;margin:44px 0} .bp-post blockquote{border-left:4px solid #4a90d9;margin:28px 0;padding:12px 20px;background:#f4f8fd;font-style:italic;color:#444} .bp-post table{width:100%;border-collapse:collapse;margin:24px 0;font-family:Arial,sans-serif;font-size:.88em} .bp-post th{background:#f0f0f0;text-align:left;padding:10px 14px;border:1px solid #ddd;font-weight:700} .bp-post td{padding:9px 14px;border:1px solid #ddd;vertical-align:top} .bp-post tr:nth-child(even) td{background:#fafafa} .bp-post .stat-box{display:flex;gap:16px;flex-wrap:wrap;margin:28px 0} .bp-post .stat{background:#f4f8fd;border:1px solid #d0e3f5;border-radius:8px;padding:16px 22px;text-align:center;flex:1;min-width:120px} .bp-post .stat .number{font-family:Arial,sans-serif;font-size:1.8em;font-weight:800;color:#2563a8} .bp-post .stat .label{font-family:Arial,sans-serif;font-size:.75em;color:#666;margin-top:4px} .bp-post .callout{background:#fffbea;border-left:4px solid #f0c040;padding:14px 20px;margin:28px 0;font-family:Arial,sans-serif;font-size:.95em;color:#444} .bp-post code{font-family:&#39;Courier New&#39;,monospace;background:#f2f2f2;padding:2px 6px;border-radius:4px;font-size:.88em;color:#c0392b} .bp-post .footer-note{font-family:Arial,sans-serif;font-size:.85em;color:#888;font-style:italic;margin-top:48px;border-top:1px solid #eee;padding-top:16px} &lt;/style&gt; &lt;div class=&quot;bp-post&quot;&gt; &lt;h1&gt;Dead Code Is a Cognitive Tax — Here&#39;s How AI Helps You Stop Paying It&lt;/h1&gt; &lt;div class=&quot;byline&quot;&gt;Posted to Engineering &amp;nbsp;·&amp;nbsp; [Your Name] &amp;nbsp;·&amp;nbsp; [Date]&lt;/div&gt; &lt;p&gt;Every engineer knows the feeling. 
You open an unfamiliar part of the codebase, and you&#39;re immediately staring down a tangle of services, workers, models, and task entries — none of which come with a label saying &lt;em&gt;&quot;still matters&quot;&lt;/em&gt; or &lt;em&gt;&quot;abandoned in 2023.&quot;&lt;/em&gt; You read the code carefully, try to trace the call graph, maybe even grep for usages — and only after 30 minutes do you realize: &lt;strong&gt;this thing hasn&#39;t run in production for over a year.&lt;/strong&gt;&lt;/p&gt; &lt;p&gt;That tax on your attention has a name: &lt;strong&gt;cognitive load&lt;/strong&gt;. And dead code is one of its most insidious sources.&lt;/p&gt; &lt;hr&gt; &lt;h2&gt;What Is Cognitive Load in a Codebase?&lt;/h2&gt; &lt;p&gt;Cognitive load, in the context of software engineering, is the total mental effort required to understand a system well enough to work in it safely. Every class, method, model, and background job you encounter is a unit of context you have to hold in your head.&lt;/p&gt; &lt;p&gt;The problem is that your brain doesn&#39;t automatically know which of those units are &lt;em&gt;live&lt;/em&gt; and which are &lt;em&gt;ghosts&lt;/em&gt;. If an &lt;code&gt;EstimateWorker&lt;/code&gt; class exists in your repo, you have to assume it matters — until you prove otherwise. That proof takes time, attention, and often a distracting detour away from the actual work you sat down to do.&lt;/p&gt; &lt;blockquote&gt;Dead code doesn&#39;t just waste disk space. 
It actively misleads you.&lt;/blockquote&gt; &lt;hr&gt; &lt;h2&gt;A Real-World Example: The Estimation Pipeline Cleanup&lt;/h2&gt; &lt;p&gt;Recently, our team completed a cleanup effort across seven pull requests targeting a legacy estimation infrastructure — a suite of services originally built around Prophet forecasts and a Clair analysis pipeline — that had gone completely dark since late 2023.&lt;/p&gt; &lt;p&gt;Here&#39;s what was still sitting in the codebase, doing nothing:&lt;/p&gt; &lt;ul&gt; &lt;li&gt;&lt;strong&gt;&lt;code&gt;EstimateService&lt;/code&gt;&lt;/strong&gt; — fetched a CSV over HTTP, upserted records into the database, and refreshed an estimation cache. Silent for months.&lt;/li&gt; &lt;li&gt;&lt;strong&gt;&lt;code&gt;EstimateWorker&lt;/code&gt;&lt;/strong&gt; — a Sidekiq background job that uploaded files to S3, triggered the estimation flow, and posted Slack notifications. Long dead.&lt;/li&gt; &lt;li&gt;&lt;strong&gt;&lt;code&gt;Estimation::Prophet::DownloadWorker&lt;/code&gt;&lt;/strong&gt; — downloaded forecast CSVs from S3 and upserted them into a &lt;code&gt;Prophet&lt;/code&gt; table. Never called.&lt;/li&gt; &lt;li&gt;&lt;strong&gt;&lt;code&gt;Estimators::ClairAnalysis&lt;/code&gt;&lt;/strong&gt; — computed hourly analysis records for a brief window in late 2023, then stopped.&lt;/li&gt; &lt;li&gt;&lt;strong&gt;&lt;code&gt;ClairAnalysis&lt;/code&gt; model and its backing database table&lt;/strong&gt; — zero writes since the pipeline went quiet.&lt;/li&gt; &lt;li&gt;&lt;strong&gt;Three &lt;code&gt;SwitchBoard&lt;/code&gt; dispatch entries&lt;/strong&gt; — &lt;code&gt;events_collect_for_next_week&lt;/code&gt;, &lt;code&gt;generate_weekly_user_report&lt;/code&gt;, &lt;code&gt;estimate_v2&lt;/code&gt; — all orphaned task names in a routing map.&lt;/li&gt; &lt;/ul&gt; &lt;p&gt;Any engineer — or AI assistant — reading this codebase would reasonably assume all of the above was &lt;em&gt;active production infrastructure&lt;/em&gt;. 
None of it was.&lt;/p&gt; &lt;h3&gt;The Numbers&lt;/h3&gt; &lt;div class=&quot;stat-box&quot;&gt; &lt;div class=&quot;stat&quot;&gt;&lt;div class=&quot;number&quot;&gt;7&lt;/div&gt;&lt;div class=&quot;label&quot;&gt;Pull Requests&lt;/div&gt;&lt;/div&gt; &lt;div class=&quot;stat&quot;&gt;&lt;div class=&quot;number&quot;&gt;31&lt;/div&gt;&lt;div class=&quot;label&quot;&gt;Files Changed&lt;/div&gt;&lt;/div&gt; &lt;div class=&quot;stat&quot;&gt;&lt;div class=&quot;number&quot;&gt;943&lt;/div&gt;&lt;div class=&quot;label&quot;&gt;Lines Deleted&lt;/div&gt;&lt;/div&gt; &lt;div class=&quot;stat&quot;&gt;&lt;div class=&quot;number&quot;&gt;−816&lt;/div&gt;&lt;div class=&quot;label&quot;&gt;Net Lines Removed&lt;/div&gt;&lt;/div&gt; &lt;/div&gt; &lt;table&gt; &lt;thead&gt;&lt;tr&gt;&lt;th&gt;PR&lt;/th&gt;&lt;th&gt;Branch&lt;/th&gt;&lt;th&gt;+Added&lt;/th&gt;&lt;th&gt;−Deleted&lt;/th&gt;&lt;th&gt;Files&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt; &lt;tbody&gt; &lt;tr&gt;&lt;td&gt;#1&lt;/td&gt;&lt;td&gt;cleanup-tasks&lt;/td&gt;&lt;td&gt;13&lt;/td&gt;&lt;td&gt;16&lt;/td&gt;&lt;td&gt;2&lt;/td&gt;&lt;/tr&gt; &lt;tr&gt;&lt;td&gt;#2&lt;/td&gt;&lt;td&gt;cleanup-unused-estimate&lt;/td&gt;&lt;td&gt;0&lt;/td&gt;&lt;td&gt;74&lt;/td&gt;&lt;td&gt;4&lt;/td&gt;&lt;/tr&gt; &lt;tr&gt;&lt;td&gt;#3&lt;/td&gt;&lt;td&gt;remove-clair-analysis&lt;/td&gt;&lt;td&gt;0&lt;/td&gt;&lt;td&gt;314&lt;/td&gt;&lt;td&gt;2&lt;/td&gt;&lt;/tr&gt; &lt;tr&gt;&lt;td&gt;#4&lt;/td&gt;&lt;td&gt;remove-prophet&lt;/td&gt;&lt;td&gt;0&lt;/td&gt;&lt;td&gt;210&lt;/td&gt;&lt;td&gt;5&lt;/td&gt;&lt;/tr&gt; &lt;tr&gt;&lt;td&gt;#5&lt;/td&gt;&lt;td&gt;remove-clair-analysis-model&lt;/td&gt;&lt;td&gt;20&lt;/td&gt;&lt;td&gt;57&lt;/td&gt;&lt;td&gt;3&lt;/td&gt;&lt;/tr&gt; &lt;tr&gt;&lt;td&gt;#6&lt;/td&gt;&lt;td&gt;rename-clair-v2s&lt;/td&gt;&lt;td&gt;94&lt;/td&gt;&lt;td&gt;68&lt;/td&gt;&lt;td&gt;13&lt;/td&gt;&lt;/tr&gt; 
&lt;tr&gt;&lt;td&gt;#7&lt;/td&gt;&lt;td&gt;remove-estimate-unused&lt;/td&gt;&lt;td&gt;0&lt;/td&gt;&lt;td&gt;204&lt;/td&gt;&lt;td&gt;2&lt;/td&gt;&lt;/tr&gt; &lt;tr&gt;&lt;td&gt;&lt;strong&gt;Total&lt;/strong&gt;&lt;/td&gt;&lt;td&gt;&lt;/td&gt;&lt;td&gt;&lt;strong&gt;127&lt;/strong&gt;&lt;/td&gt;&lt;td&gt;&lt;strong&gt;943&lt;/strong&gt;&lt;/td&gt;&lt;td&gt;&lt;strong&gt;31&lt;/strong&gt;&lt;/td&gt;&lt;/tr&gt; &lt;/tbody&gt; &lt;/table&gt; &lt;p&gt;The 127 additions are almost entirely the rename PR (#6) — migrations, updated references, and renamed specs. Every other PR was pure deletion.&lt;/p&gt; &lt;hr&gt; &lt;h2&gt;The Cognitive Impact of the Cleanup&lt;/h2&gt; &lt;p&gt;&lt;strong&gt;Cleaner model surface.&lt;/strong&gt; Once &lt;code&gt;EstimateService&lt;/code&gt;, &lt;code&gt;EstimateWorker&lt;/code&gt;, and &lt;code&gt;ClairAnalysis&lt;/code&gt; were gone, the remaining models — &lt;code&gt;Clair&lt;/code&gt;, &lt;code&gt;ClairDailyInterimResult&lt;/code&gt;, &lt;code&gt;ClairSetting&lt;/code&gt; — actually reflected how the system works today.&lt;/p&gt; &lt;p&gt;&lt;strong&gt;Naming that signals intent.&lt;/strong&gt; &lt;code&gt;ClairV2&lt;/code&gt; implies a versioning scheme. &lt;code&gt;ClairDailyInterimResult&lt;/code&gt; tells you exactly what the thing is and why it exists.&lt;/p&gt; &lt;p&gt;&lt;strong&gt;A smaller &lt;code&gt;SwitchBoard&lt;/code&gt; dispatch map.&lt;/strong&gt; Removing the three orphaned entries made the dispatch map honest again.&lt;/p&gt; &lt;p&gt;&lt;strong&gt;A shorter test suite that still covers everything that matters.&lt;/strong&gt; Several spec files covering deleted code were removed. The test suite got faster without losing any meaningful coverage.&lt;/p&gt; &lt;hr&gt; &lt;h2&gt;Where AI Fits In: Finding Dead Code You Can&#39;t See&lt;/h2&gt; &lt;p&gt;Here&#39;s the uncomfortable truth about dead code: it&#39;s often invisible to the people closest to it. 
If you wrote &lt;code&gt;EstimateWorker&lt;/code&gt; two years ago and the team that decommissioned the upstream service never filed a ticket, you might not even know it&#39;s dead. The code looks fine. The tests pass. Nothing alerts you.&lt;/p&gt; &lt;h3&gt;A Telling Real-World Example: Claude Gets Confused, Then Catches Itself&lt;/h3&gt; &lt;p&gt;We recently asked Claude to generate a flow diagram of our pay guarantee process. Claude produced a diagram that looked plausible, tracing through services, models, and workers in a way that made logical sense.&lt;/p&gt; &lt;p&gt;The problem? Part of that diagram was wrong, because Claude had incorporated a module that was no longer active into its understanding of the flow. The dead code was so well-structured and apparently coherent that the AI read it as live infrastructure and wove it into the diagram without hesitation.&lt;/p&gt; &lt;div class=&quot;callout&quot;&gt;&lt;strong&gt;But here&#39;s what makes this story instructive rather than just cautionary:&lt;/strong&gt; When an engineer removed what was hopefully the last bit of dead code, Claude immediately recognized that the diagram it had drawn earlier relied on this bad signal, revised its understanding, and corrected the diagram.&lt;/div&gt; &lt;p&gt;That sequence, &lt;em&gt;confidently wrong, then self-correcting&lt;/em&gt;, is a useful frame for thinking about AI and dead code. 
&lt;strong&gt;It fooled the AI for the same reason it fools engineers: it looks like it belongs.&lt;/strong&gt;&lt;/p&gt; &lt;h3&gt;What AI Can Do&lt;/h3&gt; &lt;p&gt;&lt;strong&gt;Tracing call graphs at scale.&lt;/strong&gt; AI can trace the full call graph of a function or class across an entire monorepo — answering not just with direct callers, but with the &lt;em&gt;absence&lt;/em&gt; of callers.&lt;/p&gt; &lt;p&gt;&lt;strong&gt;Cross-referencing runtime signals with static code.&lt;/strong&gt; When connected to observability data — logs, APM traces, queue metrics — an AI can compare what the code &lt;em&gt;says it does&lt;/em&gt; with what &lt;em&gt;actually runs in production&lt;/em&gt;.&lt;/p&gt; &lt;p&gt;&lt;strong&gt;Flagging stale patterns.&lt;/strong&gt; Dead code has fingerprints: models with no recent migrations, task names absent from any scheduler config, service classes with no callers outside their own spec files.&lt;/p&gt; &lt;p&gt;&lt;strong&gt;Drafting cleanup PRs.&lt;/strong&gt; Once dead code is identified, AI can help draft the actual removal — proposing what to delete, what to rename, and what specs to clean up.&lt;/p&gt; &lt;h3&gt;What AI Can&#39;t Do (Yet)&lt;/h3&gt; &lt;p&gt;AI isn&#39;t a replacement for engineering judgment. A worker might be &quot;dead&quot; in CI but still referenced by a cron job in an ops runbook nobody&#39;s touched in three years.&lt;/p&gt; &lt;blockquote&gt;The right model is AI as a scout, engineer as the decision-maker. AI surfaces candidates. 
Engineers verify, contextualise, and own the deletion.&lt;/blockquote&gt; &lt;hr&gt; &lt;h2&gt;Making Dead Code Cleanup a Habit&lt;/h2&gt; &lt;ol&gt; &lt;li&gt;&lt;strong&gt;Timestamp your decommissions.&lt;/strong&gt; When you turn off a pipeline, leave a comment in the code with the date.&lt;/li&gt; &lt;li&gt;&lt;strong&gt;Review your task dispatch maps regularly.&lt;/strong&gt; A quarterly review catches orphaned entries before they fossilise.&lt;/li&gt; &lt;li&gt;&lt;strong&gt;Use AI during onboarding and code review.&lt;/strong&gt; AI tools can help new engineers quickly validate whether something is live — and surface it for cleanup if it isn&#39;t.&lt;/li&gt; &lt;li&gt;&lt;strong&gt;Treat deletion as a first-class deliverable.&lt;/strong&gt; 816 lines removed is a meaningful engineering contribution. Make it visible in sprint planning, changelogs, and retros.&lt;/li&gt; &lt;/ol&gt; &lt;hr&gt; &lt;h2&gt;Conclusion&lt;/h2&gt; &lt;p&gt;Large codebases accumulate cognitive debt quietly, continuously, and with compounding interest. Dead code is one of the most expensive line items: it misleads engineers, bloats test suites, and turns routine code reading into archaeology.&lt;/p&gt; &lt;p&gt;As we saw first-hand, it even misleads AI. Claude confidently incorporated a dead module into a flow diagram of our pay guarantee process — because the code looked live. That moment of confusion, and the self-correction that followed, is a perfect metaphor for where we are with AI-assisted engineering today: powerful, promising, and most effective when paired with good runtime context and human judgment.&lt;/p&gt; &lt;p&gt;The goal isn&#39;t a perfect codebase. It&#39;s a codebase where the code you&#39;re reading is the code that&#39;s actually running. That&#39;s a goal worth shipping toward.&lt;/p&gt; &lt;div class=&quot;footer-note&quot;&gt;Have you used AI tooling to identify dead code in your own codebase? 
We&#39;d love to hear how in the comments.&lt;/div&gt; &lt;/div&gt;
</content><link rel='replies' type='application/atom+xml' href='http://thushw.blogspot.com/feeds/1476326629750694464/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/11265228/1476326629750694464' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/11265228/posts/default/1476326629750694464'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/11265228/posts/default/1476326629750694464'/><link rel='alternate' type='text/html' href='http://thushw.blogspot.com/2026/03/dead-code-is-cognitive-tax-heres-how-ai.html' title='Dead Code Is a Cognitive Tax — Here&#39;s How AI Helps You Stop Paying It'/><author><name>Unknown</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-11265228.post-5583919736087698689</id><published>2026-02-26T16:39:00.000-08:00</published><updated>2026-02-26T16:43:16.789-08:00</updated><title type='text'>When AI Sounds Right But Isn&#39;t: A Sidekiq Story</title><content type='html'>&lt;p&gt;
  I recently ran into a Redis migration problem with Sidekiq. I needed to point our workers at a new Redis instance while safely draining jobs still sitting in the old one. I turned to Claude for advice — and what followed is a good reminder of why GenAI output should always be treated as a &lt;em&gt;starting point&lt;/em&gt;, not a final answer.
&lt;/p&gt;

&lt;hr /&gt;

&lt;h3&gt;Step 1: My Initial Idea (and Why Claude Correctly Pushed Back)&lt;/h3&gt;

&lt;p&gt;
  My first instinct was a &quot;coin toss&quot; approach: configure the Sidekiq server to point to the old Redis with 50% probability and the new Redis otherwise, while the client always pushes to the new Redis. Once the old queue drained, I&#39;d do a clean cutover.
&lt;/p&gt;

&lt;p&gt;Claude&#39;s response here was actually good:&lt;/p&gt;

&lt;blockquote style=&quot;background:#f4f4f4; border-left:4px solid #ccc; padding:12px 16px; margin:16px 0; font-family:monospace; font-size:0.95em;&quot;&gt;
  &lt;strong&gt;Claude:&lt;/strong&gt; The coin toss approach has some real problems... Non-deterministic behavior is hard to reason about in production. You lose retry guarantees. A job that fails while a worker is connected to the old Redis will be retried there — but if on the next attempt the worker happens to connect to the new Redis, it won&#39;t see that job. The job could get stuck or lost. Sidekiq server connects to one Redis at startup, not per-job...
&lt;/blockquote&gt;

&lt;p&gt;
  Fair enough. The critique was accurate. Claude then suggested a cleaner alternative: use &lt;strong&gt;Sidekiq 7&#39;s &lt;code&gt;config.capsules&lt;/code&gt;&lt;/strong&gt; to run multiple Redis connections within a single process — one capsule for the new Redis, one to drain the old.
&lt;/p&gt;

&lt;hr /&gt;

&lt;h3&gt;Step 2: A Confident, Detailed, Completely Wrong Answer&lt;/h3&gt;

&lt;p&gt;
  I was on Sidekiq Pro 7.3.9, so I asked Claude to show me how to configure capsules for this use case. It gave me this:
&lt;/p&gt;

&lt;pre style=&quot;background:#f4f4f4; border:1px solid #ddd; padding:16px; overflow-x:auto; font-size:0.9em;&quot;&gt;&lt;code&gt;Sidekiq.configure_server do |config|
  # New Redis - primary capsule (default)
  config.redis = { url: new_redis_url }

  # Old Redis - drain capsule
  config.capsule(&quot;legacy&quot;) do |cap|
    cap.redis = { url: old_redis_url }
    cap.concurrency = 5
    cap.queues = %w[default mailers critical]
  end
end

Sidekiq.configure_client do |config|
  config.redis = { url: new_redis_url }
end&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;
  The response was detailed. It used the right vocabulary. It even included helpful caveats about thread pool sizing, checking the retry and scheduled sets before teardown, and queue enumeration via &lt;code&gt;redis-cli SMEMBERS queues&lt;/code&gt;. It &lt;em&gt;felt&lt;/em&gt; authoritative.
&lt;/p&gt;

&lt;p&gt;There was just one problem: &lt;strong&gt;it was wrong.&lt;/strong&gt;&lt;/p&gt;

&lt;hr /&gt;

&lt;h3&gt;Step 3: Thirty Seconds in the Docs&lt;/h3&gt;

&lt;p&gt;
  Before implementing anything, I did a quick check of the official Sidekiq documentation. Here&#39;s what it says, plainly:
&lt;/p&gt;

&lt;blockquote style=&quot;background:#fff8e1; border-left:4px solid #f0b429; padding:12px 16px; margin:16px 0; font-size:0.95em;&quot;&gt;
  &lt;strong&gt;Sidekiq Docs:&lt;/strong&gt; &quot;There is still one iron-clad rule: a Sidekiq process only executes jobs from one Redis instance; all Capsules within a process must use the same Redis instance. If you want to process jobs from two separate Redis instances, you need to start two separate Sidekiq processes.&quot;
&lt;/blockquote&gt;

&lt;p&gt;
  Capsules are for organizing queues and concurrency within a &lt;em&gt;single&lt;/em&gt; Redis instance. They cannot span multiple Redis instances. The entire premise of Claude&#39;s solution was invalid.
&lt;/p&gt;

&lt;p&gt;When I pointed this out, Claude acknowledged it immediately:&lt;/p&gt;

&lt;blockquote style=&quot;background:#f4f4f4; border-left:4px solid #ccc; padding:12px 16px; margin:16px 0; font-size:0.95em;&quot;&gt;
  &lt;strong&gt;Claude:&lt;/strong&gt; You&#39;re right, I was wrong — capsules cannot span multiple Redis instances. They&#39;re for organizing queues/concurrency within a single Redis. So for your drain scenario, you actually need two separate Sidekiq processes with different config files.
&lt;/blockquote&gt;

&lt;p&gt;No pushback. No hedging. Just a clean admission that the confident, detailed answer was incorrect.&lt;/p&gt;

&lt;hr /&gt;

&lt;h3&gt;What Makes This Failure Mode Dangerous&lt;/h3&gt;

&lt;p&gt;
  This isn&#39;t a story about AI being obviously broken. It&#39;s about AI being &lt;em&gt;convincingly wrong&lt;/em&gt; — and that&#39;s a harder problem to guard against.
&lt;/p&gt;

&lt;p&gt;A few things made this particularly easy to fall for:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;The answer was structurally sound.&lt;/strong&gt; Capsules &lt;em&gt;are&lt;/em&gt; a real Sidekiq 7 feature. The code was syntactically valid Ruby. The caveats about concurrency and retry sets were genuinely useful. Only the core assumption — that capsules can target different Redis instances — was wrong.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;The fluency signals trust.&lt;/strong&gt; When an answer uses the right terminology, references the right version numbers, and anticipates edge cases, it reads as expert. That fluency is a product of training on large amounts of text, not of verified understanding.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;The model doesn&#39;t know what it doesn&#39;t know.&lt;/strong&gt; Claude didn&#39;t say &quot;I&#39;m not certain about the multi-Redis constraint — check the docs.&quot; It presented the solution as if it were established fact.&lt;/li&gt;
&lt;/ul&gt;

&lt;hr /&gt;

&lt;h3&gt;A Simple Rule of Thumb&lt;/h3&gt;

&lt;p&gt;
  If you wouldn&#39;t ship code based solely on a Stack Overflow answer from 2019 without reading the docs, don&#39;t ship code based solely on a GenAI answer either. The bar should be the same — or higher, because at least the Stack Overflow answer has upvotes, comments, and a date stamp.
&lt;/p&gt;

&lt;p&gt;
  GenAI is genuinely useful for orientation: understanding an unfamiliar API surface, exploring options, getting unstuck. But any answer that involves a specific documented behavior — especially version-specific constraints — needs at least one authoritative source check before you act on it.
&lt;/p&gt;

&lt;p&gt;
  In this case, thirty seconds in the Sidekiq docs saved what could have been hours of debugging a fundamentally broken architecture. That&#39;s a pretty good return on thirty seconds.
&lt;/p&gt;

&lt;hr /&gt;

&lt;p&gt;&lt;em&gt;The actual solution, if you&#39;re curious: two separate Sidekiq processes with separate config files, each pointing at a different Redis instance. One processes new work, one drains the old queues. When the old queue, retry set, and scheduled set are all empty, shut the old process down.&lt;/em&gt;&lt;/p&gt;
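
&lt;p&gt;A minimal sketch of how that two-process setup could be wired, under some assumptions: the &lt;code&gt;SIDEKIQ_ROLE&lt;/code&gt;, &lt;code&gt;OLD_REDIS_URL&lt;/code&gt;, and &lt;code&gt;NEW_REDIS_URL&lt;/code&gt; environment variables and both helper names are hypothetical, made up for illustration. The helpers are plain Ruby so they run without Sidekiq; the comments show where the real Sidekiq calls would go.&lt;/p&gt;

```ruby
# Hypothetical helper: choose the Redis URL for this process from its role.
# SIDEKIQ_ROLE, OLD_REDIS_URL and NEW_REDIS_URL are assumed names, not
# Sidekiq conventions.
def redis_url_for(env)
  role = env.fetch("SIDEKIQ_ROLE", "primary")
  case role
  when "drain"   then env.fetch("OLD_REDIS_URL")
  when "primary" then env.fetch("NEW_REDIS_URL")
  else raise ArgumentError, "unknown SIDEKIQ_ROLE: #{role}"
  end
end

# Hypothetical helper: the old Redis counts as drained once every queue,
# the retry set, and the scheduled set are all empty.
def drained?(queue_sizes:, retry_size:, scheduled_size:)
  [*queue_sizes.values, retry_size, scheduled_size].all? { |n| n.zero? }
end

# In each process's initializer (with the sidekiq gem loaded) this could be:
#   Sidekiq.configure_server { |config| config.redis = { url: redis_url_for(ENV) } }
# and the counts could come from sidekiq/api, run against the old Redis:
#   Sidekiq::Queue.all.to_h { |q| [q.name, q.size] }
#   Sidekiq::RetrySet.new.size
#   Sidekiq::ScheduledSet.new.size
```

&lt;p&gt;The drain process would start with &lt;code&gt;SIDEKIQ_ROLE=drain&lt;/code&gt; and every other process with the default role. Once &lt;code&gt;drained?&lt;/code&gt; returns true for the old instance, the drain process can be stopped.&lt;/p&gt;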

</content><link rel='replies' type='application/atom+xml' href='http://thushw.blogspot.com/feeds/5583919736087698689/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/11265228/5583919736087698689' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/11265228/posts/default/5583919736087698689'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/11265228/posts/default/5583919736087698689'/><link rel='alternate' type='text/html' href='http://thushw.blogspot.com/2026/02/when-ai-sounds-right-but-isnt-sidekiq.html' title='When AI Sounds Right But Isn&#39;t: A Sidekiq Story'/><author><name>Unknown</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-11265228.post-6493568766900669276</id><published>2025-11-28T19:23:00.000-08:00</published><updated>2025-11-28T19:46:34.321-08:00</updated><title type='text'>Death-Defying Sidekiq Jobs: Part 2</title><content type='html'>&lt;div style=&quot;color: #333333; font-family: -apple-system, BlinkMacSystemFont, &amp;quot;Segoe UI&amp;quot;, Roboto, &amp;quot;Helvetica Neue&amp;quot;, Arial, sans-serif; line-height: 1.6; margin: 0px auto; max-width: 800px; padding: 20px;&quot;&gt;

&lt;svg viewBox=&quot;0 0 1200 400&quot; xmlns=&quot;http://www.w3.org/2000/svg&quot;&gt;
  &lt;!-- Background gradient --&gt;
  &lt;defs&gt;
    &lt;linearGradient id=&quot;bg&quot; x1=&quot;0%&quot; y1=&quot;0%&quot; x2=&quot;100%&quot; y2=&quot;100%&quot;&gt;
      &lt;stop offset=&quot;0%&quot; style=&quot;stop-color:#1a1a2e;stop-opacity:1&quot; /&gt;
      &lt;stop offset=&quot;100%&quot; style=&quot;stop-color:#16213e;stop-opacity:1&quot; /&gt;
    &lt;/linearGradient&gt;
    
    &lt;linearGradient id=&quot;kickGlow&quot; x1=&quot;0%&quot; y1=&quot;0%&quot; x2=&quot;100%&quot; y2=&quot;0%&quot;&gt;
      &lt;stop offset=&quot;0%&quot; style=&quot;stop-color:#ff6b6b;stop-opacity:0&quot; /&gt;
      &lt;stop offset=&quot;50%&quot; style=&quot;stop-color:#ff6b6b;stop-opacity:0.3&quot; /&gt;
      &lt;stop offset=&quot;100%&quot; style=&quot;stop-color:#ff6b6b;stop-opacity:0&quot; /&gt;
    &lt;/linearGradient&gt;
  &lt;/defs&gt;
  
  &lt;!-- Background --&gt;
  &lt;rect width=&quot;1200&quot; height=&quot;400&quot; fill=&quot;url(#bg)&quot;/&gt;
  
  &lt;!-- Impact lines radiating from kick --&gt;
  &lt;line x1=&quot;500&quot; y1=&quot;200&quot; x2=&quot;650&quot; y2=&quot;150&quot; stroke=&quot;#ff6b6b&quot; stroke-width=&quot;3&quot; opacity=&quot;0.6&quot;/&gt;
  &lt;line x1=&quot;500&quot; y1=&quot;200&quot; x2=&quot;680&quot; y2=&quot;180&quot; stroke=&quot;#ff6b6b&quot; stroke-width=&quot;3&quot; opacity=&quot;0.6&quot;/&gt;
  &lt;line x1=&quot;500&quot; y1=&quot;200&quot; x2=&quot;690&quot; y2=&quot;220&quot; stroke=&quot;#ff6b6b&quot; stroke-width=&quot;3&quot; opacity=&quot;0.6&quot;/&gt;
  &lt;line x1=&quot;500&quot; y1=&quot;200&quot; x2=&quot;670&quot; y2=&quot;250&quot; stroke=&quot;#ff6b6b&quot; stroke-width=&quot;3&quot; opacity=&quot;0.6&quot;/&gt;
  
  &lt;!-- Kick impact glow --&gt;
  &lt;ellipse cx=&quot;550&quot; cy=&quot;200&quot; rx=&quot;80&quot; ry=&quot;60&quot; fill=&quot;url(#kickGlow)&quot;/&gt;
  
  &lt;!-- Martial artist figure (simplified, Sidekiq-style) --&gt;
  &lt;g id=&quot;martial-artist&quot;&gt;
    &lt;!-- Head --&gt;
    &lt;circle cx=&quot;350&quot; cy=&quot;120&quot; r=&quot;25&quot; fill=&quot;#f39c12&quot; stroke=&quot;#e67e22&quot; stroke-width=&quot;3&quot;/&gt;
    
    &lt;!-- Body --&gt;
    &lt;rect x=&quot;335&quot; y=&quot;145&quot; width=&quot;30&quot; height=&quot;60&quot; fill=&quot;#e74c3c&quot; stroke=&quot;#c0392b&quot; stroke-width=&quot;3&quot; rx=&quot;5&quot;/&gt;
    
    &lt;!-- Back arm (punching forward) --&gt;
    &lt;line x1=&quot;335&quot; y1=&quot;160&quot; x2=&quot;280&quot; y2=&quot;140&quot; stroke=&quot;#e74c3c&quot; stroke-width=&quot;12&quot; stroke-linecap=&quot;round&quot;/&gt;
    &lt;circle cx=&quot;280&quot; cy=&quot;140&quot; r=&quot;8&quot; fill=&quot;#f39c12&quot;/&gt;
    
    &lt;!-- Front arm (pulled back) --&gt;
    &lt;line x1=&quot;365&quot; y1=&quot;160&quot; x2=&quot;390&quot; y2=&quot;180&quot; stroke=&quot;#e74c3c&quot; stroke-width=&quot;12&quot; stroke-linecap=&quot;round&quot;/&gt;
    &lt;circle cx=&quot;390&quot; cy=&quot;180&quot; r=&quot;8&quot; fill=&quot;#f39c12&quot;/&gt;
    
    &lt;!-- Support leg --&gt;
    &lt;line x1=&quot;345&quot; y1=&quot;205&quot; x2=&quot;330&quot; y2=&quot;260&quot; stroke=&quot;#2c3e50&quot; stroke-width=&quot;12&quot; stroke-linecap=&quot;round&quot;/&gt;
    &lt;line x1=&quot;330&quot; y1=&quot;260&quot; x2=&quot;320&quot; y2=&quot;280&quot; stroke=&quot;#34495e&quot; stroke-width=&quot;10&quot; stroke-linecap=&quot;round&quot;/&gt;
    
    &lt;!-- Kicking leg (extended high kick) --&gt;
    &lt;line x1=&quot;355&quot; y1=&quot;205&quot; x2=&quot;450&quot; y2=&quot;180&quot; stroke=&quot;#2c3e50&quot; stroke-width=&quot;14&quot; stroke-linecap=&quot;round&quot;/&gt;
    &lt;line x1=&quot;450&quot; y1=&quot;180&quot; x2=&quot;520&quot; y2=&quot;190&quot; stroke=&quot;#34495e&quot; stroke-width=&quot;12&quot; stroke-linecap=&quot;round&quot;/&gt;
    &lt;!-- Foot --&gt;
    &lt;ellipse cx=&quot;535&quot; cy=&quot;195&quot; rx=&quot;20&quot; ry=&quot;12&quot; fill=&quot;#e74c3c&quot; stroke=&quot;#c0392b&quot; stroke-width=&quot;2&quot;/&gt;
  &lt;/g&gt;
  
  &lt;!-- Bugs scurrying away --&gt;
  &lt;!-- Bug 1 - flying up-right --&gt;
  &lt;g id=&quot;bug1&quot; opacity=&quot;0.9&quot;&gt;
    &lt;ellipse cx=&quot;720&quot; cy=&quot;120&quot; rx=&quot;18&quot; ry=&quot;22&quot; fill=&quot;#27ae60&quot; stroke=&quot;#229954&quot; stroke-width=&quot;2&quot;/&gt;
    &lt;ellipse cx=&quot;715&quot; cy=&quot;115&quot; rx=&quot;5&quot; ry=&quot;5&quot; fill=&quot;#fff&quot;/&gt;
    &lt;ellipse cx=&quot;725&quot; cy=&quot;115&quot; rx=&quot;5&quot; ry=&quot;5&quot; fill=&quot;#fff&quot;/&gt;
    &lt;line x1=&quot;705&quot; y1=&quot;125&quot; x2=&quot;690&quot; y2=&quot;120&quot; stroke=&quot;#229954&quot; stroke-width=&quot;2&quot;/&gt;
    &lt;line x1=&quot;705&quot; y1=&quot;130&quot; x2=&quot;690&quot; y2=&quot;135&quot; stroke=&quot;#229954&quot; stroke-width=&quot;2&quot;/&gt;
    &lt;line x1=&quot;735&quot; y1=&quot;125&quot; x2=&quot;750&quot; y2=&quot;120&quot; stroke=&quot;#229954&quot; stroke-width=&quot;2&quot;/&gt;
    &lt;line x1=&quot;735&quot; y1=&quot;130&quot; x2=&quot;750&quot; y2=&quot;135&quot; stroke=&quot;#229954&quot; stroke-width=&quot;2&quot;/&gt;
    &lt;!-- Motion lines --&gt;
    &lt;line x1=&quot;700&quot; y1=&quot;125&quot; x2=&quot;680&quot; y2=&quot;130&quot; stroke=&quot;#27ae60&quot; stroke-width=&quot;2&quot; opacity=&quot;0.4&quot;/&gt;
    &lt;line x1=&quot;705&quot; y1=&quot;135&quot; x2=&quot;685&quot; y2=&quot;140&quot; stroke=&quot;#27ae60&quot; stroke-width=&quot;2&quot; opacity=&quot;0.4&quot;/&gt;
  &lt;/g&gt;
  
  &lt;!-- Bug 2 - flying right --&gt;
  &lt;g id=&quot;bug2&quot; opacity=&quot;0.9&quot;&gt;
    &lt;ellipse cx=&quot;800&quot; cy=&quot;180&quot; rx=&quot;18&quot; ry=&quot;22&quot; fill=&quot;#e74c3c&quot; stroke=&quot;#c0392b&quot; stroke-width=&quot;2&quot;/&gt;
    &lt;ellipse cx=&quot;795&quot; cy=&quot;175&quot; rx=&quot;5&quot; ry=&quot;5&quot; fill=&quot;#fff&quot;/&gt;
    &lt;ellipse cx=&quot;805&quot; cy=&quot;175&quot; rx=&quot;5&quot; ry=&quot;5&quot; fill=&quot;#fff&quot;/&gt;
    &lt;line x1=&quot;785&quot; y1=&quot;185&quot; x2=&quot;770&quot; y2=&quot;180&quot; stroke=&quot;#c0392b&quot; stroke-width=&quot;2&quot;/&gt;
    &lt;line x1=&quot;785&quot; y1=&quot;190&quot; x2=&quot;770&quot; y2=&quot;195&quot; stroke=&quot;#c0392b&quot; stroke-width=&quot;2&quot;/&gt;
    &lt;line x1=&quot;815&quot; y1=&quot;185&quot; x2=&quot;830&quot; y2=&quot;180&quot; stroke=&quot;#c0392b&quot; stroke-width=&quot;2&quot;/&gt;
    &lt;line x1=&quot;815&quot; y1=&quot;190&quot; x2=&quot;830&quot; y2=&quot;195&quot; stroke=&quot;#c0392b&quot; stroke-width=&quot;2&quot;/&gt;
    &lt;!-- Motion lines --&gt;
    &lt;line x1=&quot;780&quot; y1=&quot;180&quot; x2=&quot;760&quot; y2=&quot;180&quot; stroke=&quot;#e74c3c&quot; stroke-width=&quot;2&quot; opacity=&quot;0.4&quot;/&gt;
    &lt;line x1=&quot;780&quot; y1=&quot;190&quot; x2=&quot;760&quot; y2=&quot;190&quot; stroke=&quot;#e74c3c&quot; stroke-width=&quot;2&quot; opacity=&quot;0.4&quot;/&gt;
  &lt;/g&gt;
  
  &lt;!-- Bug 3 - flying down-right --&gt;
  &lt;g id=&quot;bug3&quot; opacity=&quot;0.9&quot;&gt;
    &lt;ellipse cx=&quot;820&quot; cy=&quot;260&quot; rx=&quot;18&quot; ry=&quot;22&quot; fill=&quot;#9b59b6&quot; stroke=&quot;#8e44ad&quot; stroke-width=&quot;2&quot;/&gt;
    &lt;ellipse cx=&quot;815&quot; cy=&quot;255&quot; rx=&quot;5&quot; ry=&quot;5&quot; fill=&quot;#fff&quot;/&gt;
    &lt;ellipse cx=&quot;825&quot; cy=&quot;255&quot; rx=&quot;5&quot; ry=&quot;5&quot; fill=&quot;#fff&quot;/&gt;
    &lt;line x1=&quot;805&quot; y1=&quot;265&quot; x2=&quot;790&quot; y2=&quot;260&quot; stroke=&quot;#8e44ad&quot; stroke-width=&quot;2&quot;/&gt;
    &lt;line x1=&quot;805&quot; y1=&quot;270&quot; x2=&quot;790&quot; y2=&quot;275&quot; stroke=&quot;#8e44ad&quot; stroke-width=&quot;2&quot;/&gt;
    &lt;line x1=&quot;835&quot; y1=&quot;265&quot; x2=&quot;850&quot; y2=&quot;260&quot; stroke=&quot;#8e44ad&quot; stroke-width=&quot;2&quot;/&gt;
    &lt;line x1=&quot;835&quot; y1=&quot;270&quot; x2=&quot;850&quot; y2=&quot;275&quot; stroke=&quot;#8e44ad&quot; stroke-width=&quot;2&quot;/&gt;
    &lt;!-- Motion lines --&gt;
    &lt;line x1=&quot;800&quot; y1=&quot;265&quot; x2=&quot;780&quot; y2=&quot;270&quot; stroke=&quot;#9b59b6&quot; stroke-width=&quot;2&quot; opacity=&quot;0.4&quot;/&gt;
    &lt;line x1=&quot;805&quot; y1=&quot;275&quot; x2=&quot;785&quot; y2=&quot;280&quot; stroke=&quot;#9b59b6&quot; stroke-width=&quot;2&quot; opacity=&quot;0.4&quot;/&gt;
  &lt;/g&gt;
  
  &lt;!-- Bug 4 - flying right-up --&gt;
  &lt;g id=&quot;bug4&quot; opacity=&quot;0.9&quot;&gt;
    &lt;ellipse cx=&quot;900&quot; cy=&quot;140&quot; rx=&quot;16&quot; ry=&quot;20&quot; fill=&quot;#f39c12&quot; stroke=&quot;#e67e22&quot; stroke-width=&quot;2&quot;/&gt;
    &lt;ellipse cx=&quot;896&quot; cy=&quot;136&quot; rx=&quot;4&quot; ry=&quot;4&quot; fill=&quot;#fff&quot;/&gt;
    &lt;ellipse cx=&quot;904&quot; cy=&quot;136&quot; rx=&quot;4&quot; ry=&quot;4&quot; fill=&quot;#fff&quot;/&gt;
    &lt;line x1=&quot;888&quot; y1=&quot;145&quot; x2=&quot;875&quot; y2=&quot;142&quot; stroke=&quot;#e67e22&quot; stroke-width=&quot;2&quot;/&gt;
    &lt;line x1=&quot;888&quot; y1=&quot;150&quot; x2=&quot;875&quot; y2=&quot;153&quot; stroke=&quot;#e67e22&quot; stroke-width=&quot;2&quot;/&gt;
    &lt;line x1=&quot;912&quot; y1=&quot;145&quot; x2=&quot;925&quot; y2=&quot;142&quot; stroke=&quot;#e67e22&quot; stroke-width=&quot;2&quot;/&gt;
    &lt;line x1=&quot;912&quot; y1=&quot;150&quot; x2=&quot;925&quot; y2=&quot;153&quot; stroke=&quot;#e67e22&quot; stroke-width=&quot;2&quot;/&gt;
    &lt;!-- Motion lines --&gt;
    &lt;line x1=&quot;885&quot; y1=&quot;145&quot; x2=&quot;868&quot; y2=&quot;150&quot; stroke=&quot;#f39c12&quot; stroke-width=&quot;2&quot; opacity=&quot;0.4&quot;/&gt;
  &lt;/g&gt;
  
  &lt;!-- Bug 5 - flying far right --&gt;
  &lt;g id=&quot;bug5&quot; opacity=&quot;0.9&quot;&gt;
    &lt;ellipse cx=&quot;980&quot; cy=&quot;200&quot; rx=&quot;16&quot; ry=&quot;20&quot; fill=&quot;#3498db&quot; stroke=&quot;#2980b9&quot; stroke-width=&quot;2&quot;/&gt;
    &lt;ellipse cx=&quot;976&quot; cy=&quot;196&quot; rx=&quot;4&quot; ry=&quot;4&quot; fill=&quot;#fff&quot;/&gt;
    &lt;ellipse cx=&quot;984&quot; cy=&quot;196&quot; rx=&quot;4&quot; ry=&quot;4&quot; fill=&quot;#fff&quot;/&gt;
    &lt;line x1=&quot;968&quot; y1=&quot;205&quot; x2=&quot;955&quot; y2=&quot;202&quot; stroke=&quot;#2980b9&quot; stroke-width=&quot;2&quot;/&gt;
    &lt;line x1=&quot;968&quot; y1=&quot;210&quot; x2=&quot;955&quot; y2=&quot;213&quot; stroke=&quot;#2980b9&quot; stroke-width=&quot;2&quot;/&gt;
    &lt;line x1=&quot;992&quot; y1=&quot;205&quot; x2=&quot;1005&quot; y2=&quot;202&quot; stroke=&quot;#2980b9&quot; stroke-width=&quot;2&quot;/&gt;
    &lt;line x1=&quot;992&quot; y1=&quot;210&quot; x2=&quot;1005&quot; y2=&quot;213&quot; stroke=&quot;#2980b9&quot; stroke-width=&quot;2&quot;/&gt;
    &lt;!-- Motion lines --&gt;
    &lt;line x1=&quot;965&quot; y1=&quot;200&quot; x2=&quot;948&quot; y2=&quot;200&quot; stroke=&quot;#3498db&quot; stroke-width=&quot;2&quot; opacity=&quot;0.4&quot;/&gt;
  &lt;/g&gt;
  
  &lt;!-- Bug 6 - small bug flying away --&gt;
  &lt;g id=&quot;bug6&quot; opacity=&quot;0.85&quot;&gt;
    &lt;ellipse cx=&quot;1050&quot; cy=&quot;230&quot; rx=&quot;14&quot; ry=&quot;18&quot; fill=&quot;#1abc9c&quot; stroke=&quot;#16a085&quot; stroke-width=&quot;2&quot;/&gt;
    &lt;ellipse cx=&quot;1047&quot; cy=&quot;226&quot; rx=&quot;3&quot; ry=&quot;3&quot; fill=&quot;#fff&quot;/&gt;
    &lt;ellipse cx=&quot;1053&quot; cy=&quot;226&quot; rx=&quot;3&quot; ry=&quot;3&quot; fill=&quot;#fff&quot;/&gt;
    &lt;line x1=&quot;1040&quot; y1=&quot;235&quot; x2=&quot;1030&quot; y2=&quot;233&quot; stroke=&quot;#16a085&quot; stroke-width=&quot;1.5&quot;/&gt;
    &lt;line x1=&quot;1060&quot; y1=&quot;235&quot; x2=&quot;1070&quot; y2=&quot;233&quot; stroke=&quot;#16a085&quot; stroke-width=&quot;1.5&quot;/&gt;
  &lt;/g&gt;
  

&lt;/svg&gt;  

&lt;p style=&quot;margin-bottom: 1em;&quot;&gt;In my &lt;a href=&quot;https://thushw.blogspot.com/2025/11/death-defying-sidekiq-jobs.html&quot; style=&quot;color: #3498db; text-decoration: none;&quot;&gt;previous post&lt;/a&gt;, I outlined the problem of parent jobs getting killed during Sidekiq shutdowns because they took too long to enqueue child jobs. We implemented a solution that used an active driver index instead of the expensive Redis iterator, but the story doesn&#39;t end there.&lt;/p&gt;

&lt;h2 style=&quot;color: #34495e; font-size: 1.5em; margin-bottom: 0.5em; margin-top: 1.5em;&quot;&gt;The Data Revealed More&lt;/h2&gt;

&lt;p style=&quot;margin-bottom: 1em;&quot;&gt;After deploying the active driver index, I gathered metrics on the parent job execution times. The good news: runtime dropped significantly. The bad news: even with the new index, the higher percentile execution times still hovered around 40 seconds.&lt;/p&gt;&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: center;&quot;&gt;&lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEipc-fEIw2C5ohjqOOFTD8lgcpD0We9dr3srO6PgAUdP3_7ALYO_6l1nEeNlwS4PzcB85dQJvnbw3Y05JbCsEHDVxAvpGnDUCBBj_Fx7UJLfPFa9C6t73lOB_Ta8xq3cQh9IBZ8p_bej9ldIXB8oOZM0e9ebF726jQIyJ2T2c6ruce5gugEqG5vFg/s2998/Screenshot%202025-11-28%20at%207.11.38%E2%80%AFPM.png&quot; style=&quot;margin-left: 1em; margin-right: 1em;&quot;&gt;&lt;img border=&quot;0&quot; data-original-height=&quot;438&quot; data-original-width=&quot;2998&quot; height=&quot;94&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEipc-fEIw2C5ohjqOOFTD8lgcpD0We9dr3srO6PgAUdP3_7ALYO_6l1nEeNlwS4PzcB85dQJvnbw3Y05JbCsEHDVxAvpGnDUCBBj_Fx7UJLfPFa9C6t73lOB_Ta8xq3cQh9IBZ8p_bej9ldIXB8oOZM0e9ebF726jQIyJ2T2c6ruce5gugEqG5vFg/w603-h94/Screenshot%202025-11-28%20at%207.11.38%E2%80%AFPM.png&quot; width=&quot;603&quot; /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;p style=&quot;margin-bottom: 1em;&quot;&gt;&lt;span face=&quot;-apple-system, BlinkMacSystemFont, &amp;quot;Segoe UI&amp;quot;, Roboto, &amp;quot;Helvetica Neue&amp;quot;, Arial, sans-serif&quot;&gt;That 40-second ceiling was a problem. Sidekiq&#39;s shutdown grace period is 25 seconds by default, and while we could extend it, we&#39;d just be postponing the inevitable. Jobs that take 40 seconds to enqueue children are still vulnerable to being killed mid-execution during deployments or restarts.&lt;/span&gt;&lt;/p&gt;

&lt;h2 style=&quot;color: #34495e; font-size: 1.5em; margin-bottom: 0.5em; margin-top: 1.5em;&quot;&gt;Enter &lt;code style=&quot;background-color: #f8f9fa; border-radius: 3px; font-family: Consolas, Monaco, &amp;quot;Courier New&amp;quot;, monospace; font-size: 0.9em; padding: 2px 6px;&quot;&gt;perform_bulk&lt;/code&gt;&lt;/h2&gt;

&lt;p style=&quot;margin-bottom: 1em;&quot;&gt;The problem was that we had 100,000 jobs to push to Sidekiq. Each push took only a fraction of a millisecond, but the math adds up: with 100,000 separate round trips to Redis, we were waiting close to a minute for all the jobs to be sent. This is a common problem in I/O-bound systems, where a &quot;bulk&quot; operation usually comes to the rescue. Database writes work the same way: to write a thousand records we use a bulk insert, where the client sends 1,000 prepared statements over a single connection/call and the database server (e.g. Postgres) executes them as a single batch. A quick GenAI search turned up &lt;code style=&quot;background-color: #f8f9fa; border-radius: 3px; font-family: Consolas, Monaco, &amp;quot;Courier New&amp;quot;, monospace; font-size: 0.9em; padding: 2px 6px;&quot;&gt;perform_bulk&lt;/code&gt;, a method designed for exactly this scenario in the Sidekiq world. Instead of enqueuing jobs one at a time, &lt;code style=&quot;background-color: #f8f9fa; border-radius: 3px; font-family: Consolas, Monaco, &amp;quot;Courier New&amp;quot;, monospace; font-size: 0.9em; padding: 2px 6px;&quot;&gt;perform_bulk&lt;/code&gt; submits them to Sidekiq in batches, up to 1,000 jobs at a time by default.&lt;/p&gt;

&lt;p style=&quot;margin-bottom: 1em;&quot;&gt;Here&#39;s what the refactored code looked like:&lt;/p&gt;

&lt;pre style=&quot;background-color: #f8f9fa; border-left: 4px solid rgb(52, 152, 219); border-radius: 4px; font-family: Consolas, Monaco, &amp;quot;Courier New&amp;quot;, monospace; font-size: 0.9em; line-height: 1.4; margin: 1.5em 0px; overflow-x: auto; padding: 15px 20px;&quot;&gt;&lt;code style=&quot;background-color: transparent; padding: 0px;&quot;&gt;class ParentJob
  include Sidekiq::Job

  def perform(work_item_ids)
    # Prepare all job arguments
    job_args = work_item_ids.map { |id| [id] }
    
    ChildJob.perform_bulk(job_args)
  end
end&lt;/code&gt;&lt;/pre&gt;

&lt;p style=&quot;margin-bottom: 1em;&quot;&gt;The key difference: &lt;code style=&quot;background-color: #f8f9fa; border-radius: 3px; font-family: Consolas, Monaco, &amp;quot;Courier New&amp;quot;, monospace; font-size: 0.9em; padding: 2px 6px;&quot;&gt;perform_bulk&lt;/code&gt; pushes the jobs to Redis in a single pipelined operation rather than individual Redis calls. This dramatically reduces the network overhead that was causing our bottleneck.&lt;/p&gt;
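
&lt;p style=&quot;margin-bottom: 1em;&quot;&gt;To make the round-trip arithmetic concrete, here is a plain-Ruby sketch that only counts network calls. &lt;code style=&quot;background-color: #f8f9fa; border-radius: 3px; font-family: Consolas, Monaco, &amp;quot;Courier New&amp;quot;, monospace; font-size: 0.9em; padding: 2px 6px;&quot;&gt;CountingRedis&lt;/code&gt; and both enqueue helpers are hypothetical stand-ins, not Sidekiq code, and the batch size of 1,000 is an assumption chosen to match the batching described above.&lt;/p&gt;

```ruby
# Hypothetical stand-in for a Redis connection: it only counts round trips.
class CountingRedis
  attr_reader :round_trips

  def initialize
    @round_trips = 0
  end

  # One network round trip, whatever the number of payloads it carries.
  def push(payloads)
    @round_trips += 1
    payloads.size
  end
end

# One round trip per job, as when enqueuing each child job in a loop.
def enqueue_one_by_one(redis, job_args)
  job_args.each { |args| redis.push([args]) }
end

# One round trip per batch, mirroring batched bulk enqueueing.
def enqueue_in_batches(redis, job_args, batch_size: 1_000)
  job_args.each_slice(batch_size) { |batch| redis.push(batch) }
end

job_args = (1..100_000).map { |id| [id] }

naive = CountingRedis.new
enqueue_one_by_one(naive, job_args)

bulk = CountingRedis.new
enqueue_in_batches(bulk, job_args)

puts "one by one: #{naive.round_trips} round trips"
puts "in batches: #{bulk.round_trips} round trips"
```

&lt;p style=&quot;margin-bottom: 1em;&quot;&gt;For 100,000 jobs the loop makes 100,000 round trips and the batched version makes 100. Real latency depends on your network and your Redis, so treat the counts as an illustration of the ratio, not a benchmark.&lt;/p&gt;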

&lt;h2 style=&quot;color: #34495e; font-size: 1.5em; margin-bottom: 0.5em; margin-top: 1.5em;&quot;&gt;The Results&lt;/h2&gt;

&lt;p style=&quot;margin-bottom: 1em;&quot;&gt;The impact was immediate and dramatic. Parent job execution times dropped to just a few seconds, even for large batches. The 99th percentile went from 40 seconds down to under 5 seconds.&lt;/p&gt;&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: center;&quot;&gt;&lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgxk2xwi0M7mogkOJglCcNsqg_nyfigbRtvKQTLlVs9TxWwIbzDXgBeYzIqAMS85seD8z-jaupTVDmaYCXkAbo9K2brZbwqWPOkKqdt81TwZcyz35Qw1u_IHuNJwB7Bn2MPwAXkBrHBwbw1KYBMpMR5BkX6vyyp4RfLQv6xzoJqiFMa4STgEkjm8g/s3020/Screenshot%202025-11-28%20at%206.49.58%E2%80%AFPM.png&quot; style=&quot;margin-left: 1em; margin-right: 1em;&quot;&gt;&lt;img border=&quot;0&quot; data-original-height=&quot;504&quot; data-original-width=&quot;3020&quot; height=&quot;106&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgxk2xwi0M7mogkOJglCcNsqg_nyfigbRtvKQTLlVs9TxWwIbzDXgBeYzIqAMS85seD8z-jaupTVDmaYCXkAbo9K2brZbwqWPOkKqdt81TwZcyz35Qw1u_IHuNJwB7Bn2MPwAXkBrHBwbw1KYBMpMR5BkX6vyyp4RfLQv6xzoJqiFMa4STgEkjm8g/w640-h106/Screenshot%202025-11-28%20at%206.49.58%E2%80%AFPM.png&quot; width=&quot;640&quot; /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;p style=&quot;margin-bottom: 1em;&quot;&gt;This shows the results of our incremental optimizations:&lt;/p&gt;&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: center;&quot;&gt;&lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhL9o-zLvTILBoK2BY3-wKt8e7InTHllmvj73aoyCKE5N0MXxV2w8ghBTPg5FQBAd4jR-lOLz7EtHAPL2J76hZ7MP_m_6F6unIEZkNecSYj1j0WM9hyFIo9WUkrlQsWdqOi2FMD21yPTGw5tlZCYlGmWeSmW1dNSHhNXYXFhJuFmxfVKPLVn-s80Q/s3002/Screenshot%202025-11-28%20at%206.49.30%E2%80%AFPM.png&quot; style=&quot;margin-left: 1em; margin-right: 1em;&quot;&gt;&lt;img border=&quot;0&quot; data-original-height=&quot;484&quot; data-original-width=&quot;3002&quot; height=&quot;104&quot; 
src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhL9o-zLvTILBoK2BY3-wKt8e7InTHllmvj73aoyCKE5N0MXxV2w8ghBTPg5FQBAd4jR-lOLz7EtHAPL2J76hZ7MP_m_6F6unIEZkNecSYj1j0WM9hyFIo9WUkrlQsWdqOi2FMD21yPTGw5tlZCYlGmWeSmW1dNSHhNXYXFhJuFmxfVKPLVn-s80Q/w640-h104/Screenshot%202025-11-28%20at%206.49.30%E2%80%AFPM.png&quot; width=&quot;640&quot; /&gt;&lt;/a&gt;&lt;/div&gt;

&lt;p style=&quot;margin-bottom: 1em;&quot;&gt;More importantly, the job now &lt;strong style=&quot;color: #2c3e50; font-weight: 600;&quot;&gt;always&lt;/strong&gt; finishes gracefully during a Sidekiq-initiated shutdown. No more interrupted enqueuing, no more orphaned work items, no more race conditions.&lt;/p&gt;&lt;p style=&quot;margin-bottom: 1em;&quot;&gt;The overall time for job processing was reduced significantly, allowing for more efficient use of the cluster:&lt;/p&gt;&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: center;&quot;&gt;&lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEinG84bsyNhBXhuLsrNsb52co0WyzDBhj9buBBSW87jP_tyv3RNO4pcmTSvlcNUEX0NbVArnWSr60kY6JY3CuGU9n4E1P-dEF8pe7qFVgGVzdsKqg3QbU5Yu-2LN00k6xOXHCTn01p2cj7j6LhdNxi_zz86mTGk-GDfr-U8BSvRZ4eh4efCJdRquA/s3320/Screenshot%202025-11-28%20at%206.46.42%E2%80%AFPM.png&quot; style=&quot;margin-left: 1em; margin-right: 1em;&quot;&gt;&lt;img border=&quot;0&quot; data-original-height=&quot;1078&quot; data-original-width=&quot;3320&quot; height=&quot;208&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEinG84bsyNhBXhuLsrNsb52co0WyzDBhj9buBBSW87jP_tyv3RNO4pcmTSvlcNUEX0NbVArnWSr60kY6JY3CuGU9n4E1P-dEF8pe7qFVgGVzdsKqg3QbU5Yu-2LN00k6xOXHCTn01p2cj7j6LhdNxi_zz86mTGk-GDfr-U8BSvRZ4eh4efCJdRquA/w640-h208/Screenshot%202025-11-28%20at%206.46.42%E2%80%AFPM.png&quot; width=&quot;640&quot; /&gt;&lt;/a&gt;&lt;/div&gt;

&lt;h2 style=&quot;color: #34495e; font-size: 1.5em; margin-bottom: 0.5em; margin-top: 1.5em;&quot;&gt;Lessons Learned&lt;/h2&gt;

&lt;ol style=&quot;margin-bottom: 1em; margin-left: 20px;&quot;&gt;
&lt;li style=&quot;margin-bottom: 0.5em;&quot;&gt;&lt;strong style=&quot;color: #2c3e50; font-weight: 600;&quot;&gt;Measure first, optimize second&lt;/strong&gt;: Premature optimization is still the root of at least &lt;i&gt;some&lt;/i&gt; evil. Our goal here was to get the task to run in under 20 seconds so that it would not be interrupted by Sidekiq. If our first optimization had gotten us there, we would not have needed &lt;code style=&quot;background-color: #f8f9fa; border-radius: 3px; font-family: Consolas, Monaco, &amp;quot;Courier New&amp;quot;, monospace; font-size: 0.9em; padding: 2px 6px;&quot;&gt;perform_bulk&lt;/code&gt;. And &lt;code style=&quot;background-color: #f8f9fa; border-radius: 3px; font-family: Consolas, Monaco, &amp;quot;Courier New&amp;quot;, monospace; font-size: 0.9em; padding: 2px 6px;&quot;&gt;perform_bulk&lt;/code&gt; is not a slam dunk: since all the arguments for the jobs are marshaled at once, it can overwhelm your Redis instance if it is already running high on memory.&lt;/li&gt;

&lt;li style=&quot;margin-bottom: 0.5em;&quot;&gt;&lt;strong style=&quot;color: #2c3e50; font-weight: 600;&quot;&gt;Deep dive when the situation demands it&lt;/strong&gt;: &lt;code style=&quot;background-color: #f8f9fa; border-radius: 3px; font-family: Consolas, Monaco, &amp;quot;Courier New&amp;quot;, monospace; font-size: 0.9em; padding: 2px 6px;&quot;&gt;perform_bulk&lt;/code&gt; has been in Sidekiq for years, but I&#39;d never needed it until this specific use case pushed me to look deeper. Where else might this technique fix silent inefficiencies? Time will tell.&lt;/li&gt;

&lt;li style=&quot;margin-bottom: 0.5em;&quot;&gt;&lt;strong style=&quot;color: #2c3e50; font-weight: 600;&quot;&gt;Network calls are expensive&lt;/strong&gt;: The difference between 1,000 individual Redis calls and one pipelined bulk operation was the difference between 40 seconds and 3 seconds.&lt;/li&gt;

&lt;li style=&quot;margin-bottom: 0.5em;&quot;&gt;&lt;strong style=&quot;color: #2c3e50; font-weight: 600;&quot;&gt;Graceful shutdowns matter&lt;/strong&gt;: Taking the time to handle shutdowns properly means deployments are smoother and data integrity is maintained.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2 style=&quot;color: #34495e; font-size: 1.5em; margin-bottom: 0.5em; margin-top: 1.5em;&quot;&gt;Conclusion&lt;/h2&gt;

&lt;p style=&quot;margin-bottom: 1em;&quot;&gt;What started as a critical bug during deployments became an opportunity to understand Sidekiq&#39;s internals more deeply. The journey from &quot;jobs getting killed&quot; to &quot;graceful shutdowns every time&quot; involved measuring performance, understanding bottlenecks, and discovering the right tool for the job.&lt;/p&gt;

&lt;p style=&quot;margin-bottom: 1em;&quot;&gt;If you&#39;re enqueuing large numbers of child jobs from a parent job, &lt;code style=&quot;background-color: #f8f9fa; border-radius: 3px; font-family: Consolas, Monaco, &amp;quot;Courier New&amp;quot;, monospace; font-size: 0.9em; padding: 2px 6px;&quot;&gt;perform_bulk&lt;/code&gt; may just be the ticket.&lt;/p&gt;

&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://thushw.blogspot.com/feeds/6493568766900669276/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/11265228/6493568766900669276' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/11265228/posts/default/6493568766900669276'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/11265228/posts/default/6493568766900669276'/><link rel='alternate' type='text/html' href='http://thushw.blogspot.com/2025/11/death-defying-sidekiq-jobs-part-2-in-my.html' title='Death-Defying Sidekiq Jobs: Part 2'/><author><name>Unknown</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEipc-fEIw2C5ohjqOOFTD8lgcpD0We9dr3srO6PgAUdP3_7ALYO_6l1nEeNlwS4PzcB85dQJvnbw3Y05JbCsEHDVxAvpGnDUCBBj_Fx7UJLfPFa9C6t73lOB_Ta8xq3cQh9IBZ8p_bej9ldIXB8oOZM0e9ebF726jQIyJ2T2c6ruce5gugEqG5vFg/s72-w603-h94-c/Screenshot%202025-11-28%20at%207.11.38%E2%80%AFPM.png" height="72" width="72"/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-11265228.post-620563521154796584</id><published>2025-11-13T15:10:00.000-08:00</published><updated>2025-11-14T08:52:06.213-08:00</updated><title type='text'>Death-defying sidekiq jobs</title><content type='html'>
&lt;svg viewBox=&quot;0 0 1200 300&quot; xmlns=&quot;http://www.w3.org/2000/svg&quot;&gt;
  &lt;!-- Background --&gt;
  &lt;defs&gt;
    &lt;linearGradient id=&quot;bg&quot; x1=&quot;0%&quot; y1=&quot;0%&quot; x2=&quot;100%&quot; y2=&quot;100%&quot;&gt;
      &lt;stop offset=&quot;0%&quot; style=&quot;stop-color:#0a0a1a;stop-opacity:1&quot; /&gt;
      &lt;stop offset=&quot;100%&quot; style=&quot;stop-color:#1a1a3a;stop-opacity:1&quot; /&gt;
    &lt;/linearGradient&gt;
    
    &lt;!-- Spotlight effect --&gt;
    &lt;radialGradient id=&quot;spotlight&quot; cx=&quot;50%&quot; cy=&quot;50%&quot;&gt;
      &lt;stop offset=&quot;0%&quot; style=&quot;stop-color:#00ff88;stop-opacity:0.15&quot; /&gt;
      &lt;stop offset=&quot;100%&quot; style=&quot;stop-color:#00ff88;stop-opacity:0&quot; /&gt;
    &lt;/radialGradient&gt;
  &lt;/defs&gt;
  
  &lt;rect width=&quot;1200&quot; height=&quot;300&quot; fill=&quot;url(#bg)&quot;/&gt;
  &lt;circle cx=&quot;600&quot; cy=&quot;150&quot; r=&quot;300&quot; fill=&quot;url(#spotlight)&quot;/&gt;
  
  &lt;!-- Binary numbers scattered across background --&gt;
  &lt;g opacity=&quot;0.25&quot; fill=&quot;#00ff88&quot; font-family=&quot;Courier New, monospace&quot; font-size=&quot;14&quot;&gt;
    &lt;text x=&quot;80&quot; y=&quot;40&quot;&gt;1011001&lt;/text&gt;
    &lt;text x=&quot;220&quot; y=&quot;60&quot;&gt;0101110&lt;/text&gt;
    &lt;text x=&quot;380&quot; y=&quot;35&quot;&gt;1100101&lt;/text&gt;
    &lt;text x=&quot;540&quot; y=&quot;55&quot;&gt;0011010&lt;/text&gt;
    &lt;text x=&quot;700&quot; y=&quot;45&quot;&gt;1010111&lt;/text&gt;
    &lt;text x=&quot;860&quot; y=&quot;38&quot;&gt;0110100&lt;/text&gt;
    &lt;text x=&quot;1020&quot; y=&quot;52&quot;&gt;1001011&lt;/text&gt;
    
    &lt;text x=&quot;50&quot; y=&quot;270&quot;&gt;0101011&lt;/text&gt;
    &lt;text x=&quot;190&quot; y=&quot;285&quot;&gt;1100110&lt;/text&gt;
    &lt;text x=&quot;350&quot; y=&quot;275&quot;&gt;0010101&lt;/text&gt;
    &lt;text x=&quot;510&quot; y=&quot;290&quot;&gt;1011100&lt;/text&gt;
    &lt;text x=&quot;670&quot; y=&quot;280&quot;&gt;0111001&lt;/text&gt;
    &lt;text x=&quot;830&quot; y=&quot;288&quot;&gt;1000110&lt;/text&gt;
    &lt;text x=&quot;990&quot; y=&quot;272&quot;&gt;0101111&lt;/text&gt;
    
    &lt;text x=&quot;100&quot; y=&quot;140&quot;&gt;101&lt;/text&gt;
    &lt;text x=&quot;950&quot; y=&quot;160&quot;&gt;010&lt;/text&gt;
    &lt;text x=&quot;1100&quot; y=&quot;120&quot;&gt;110&lt;/text&gt;
    &lt;text x=&quot;40&quot; y=&quot;180&quot;&gt;001&lt;/text&gt;
  &lt;/g&gt;
  
  &lt;!-- Time series line chart with circuitry --&gt;
  &lt;g opacity=&quot;0.4&quot;&gt;
    &lt;!-- Main data line --&gt;
    &lt;polyline points=&quot;80,180 150,160 220,190 290,140 360,170 430,120 500,150 570,110 640,160 710,130 780,180 850,150 920,165 990,145 1060,175 1120,155&quot; 
              fill=&quot;none&quot; stroke=&quot;#00ccff&quot; stroke-width=&quot;2.5&quot; stroke-linecap=&quot;round&quot;/&gt;
    
    &lt;!-- Circuit nodes at data points --&gt;
    &lt;circle cx=&quot;150&quot; cy=&quot;160&quot; r=&quot;5&quot; fill=&quot;none&quot; stroke=&quot;#00ccff&quot; stroke-width=&quot;2&quot;/&gt;
    &lt;circle cx=&quot;290&quot; cy=&quot;140&quot; r=&quot;5&quot; fill=&quot;none&quot; stroke=&quot;#00ccff&quot; stroke-width=&quot;2&quot;/&gt;
    &lt;circle cx=&quot;430&quot; cy=&quot;120&quot; r=&quot;5&quot; fill=&quot;none&quot; stroke=&quot;#ff4444&quot; stroke-width=&quot;2&quot;/&gt;
    &lt;circle cx=&quot;570&quot; cy=&quot;110&quot; r=&quot;5&quot; fill=&quot;none&quot; stroke=&quot;#00ccff&quot; stroke-width=&quot;2&quot;/&gt;
    &lt;circle cx=&quot;710&quot; cy=&quot;130&quot; r=&quot;5&quot; fill=&quot;none&quot; stroke=&quot;#00ccff&quot; stroke-width=&quot;2&quot;/&gt;
    &lt;circle cx=&quot;850&quot; cy=&quot;150&quot; r=&quot;5&quot; fill=&quot;none&quot; stroke=&quot;#00ccff&quot; stroke-width=&quot;2&quot;/&gt;
    &lt;circle cx=&quot;990&quot; cy=&quot;145&quot; r=&quot;5&quot; fill=&quot;none&quot; stroke=&quot;#00ccff&quot; stroke-width=&quot;2&quot;/&gt;
    
    &lt;!-- Anomaly highlight --&gt;
    &lt;circle cx=&quot;430&quot; cy=&quot;120&quot; r=&quot;8&quot; fill=&quot;none&quot; stroke=&quot;#ff4444&quot; stroke-width=&quot;1.5&quot; opacity=&quot;0.6&quot;/&gt;
    &lt;circle cx=&quot;430&quot; cy=&quot;120&quot; r=&quot;12&quot; fill=&quot;none&quot; stroke=&quot;#ff4444&quot; stroke-width=&quot;1&quot; opacity=&quot;0.4&quot;/&gt;
  &lt;/g&gt;
  
  &lt;!-- Circuit-like connections --&gt;
  &lt;g opacity=&quot;0.25&quot; stroke=&quot;#00ff88&quot; stroke-width=&quot;1.5&quot; fill=&quot;none&quot;&gt;
    &lt;line x1=&quot;150&quot; y1=&quot;80&quot; x2=&quot;150&quot; y2=&quot;120&quot;/&gt;
    &lt;line x1=&quot;130&quot; y1=&quot;100&quot; x2=&quot;170&quot; y2=&quot;100&quot;/&gt;
    &lt;circle cx=&quot;150&quot; cy=&quot;100&quot; r=&quot;6&quot;/&gt;
    
    &lt;line x1=&quot;900&quot; y1=&quot;200&quot; x2=&quot;900&quot; y2=&quot;240&quot;/&gt;
    &lt;line x1=&quot;880&quot; y1=&quot;220&quot; x2=&quot;920&quot; y2=&quot;220&quot;/&gt;
    &lt;circle cx=&quot;900&quot; cy=&quot;220&quot; r=&quot;6&quot;/&gt;
    
    &lt;path d=&quot;M 250,240 L 250,250 L 280,250 L 280,260&quot; stroke-linecap=&quot;square&quot;/&gt;
    &lt;circle cx=&quot;280&quot; cy=&quot;260&quot; r=&quot;5&quot;/&gt;
    
    &lt;path d=&quot;M 1050,70 L 1050,90 L 1080,90&quot; stroke-linecap=&quot;square&quot;/&gt;
    &lt;rect x=&quot;1078&quot; y=&quot;85&quot; width=&quot;10&quot; height=&quot;10&quot; stroke=&quot;#00ff88&quot; fill=&quot;none&quot;/&gt;
  &lt;/g&gt;
  
  &lt;!-- Binary stream in middle --&gt;
  &lt;g opacity=&quot;0.3&quot; fill=&quot;#00ff88&quot; font-family=&quot;Courier New, monospace&quot; font-size=&quot;16&quot; font-weight=&quot;bold&quot;&gt;
    &lt;text x=&quot;150&quot; y=&quot;150&quot;&gt;10110&lt;/text&gt;
    &lt;text x=&quot;320&quot; y=&quot;155&quot;&gt;01101&lt;/text&gt;
    &lt;text x=&quot;750&quot; y=&quot;145&quot;&gt;11010&lt;/text&gt;
    &lt;text x=&quot;920&quot; y=&quot;150&quot;&gt;00101&lt;/text&gt;
  &lt;/g&gt;
  
  &lt;!-- More binary scattered --&gt;
  &lt;g opacity=&quot;0.2&quot; fill=&quot;#00ccff&quot; font-family=&quot;Courier New, monospace&quot; font-size=&quot;12&quot;&gt;
    &lt;text x=&quot;200&quot; y=&quot;110&quot;&gt;1&lt;/text&gt;
    &lt;text x=&quot;240&quot; y=&quot;95&quot;&gt;0&lt;/text&gt;
    &lt;text x=&quot;280&quot; y=&quot;115&quot;&gt;1&lt;/text&gt;
    &lt;text x=&quot;460&quot; y=&quot;200&quot;&gt;0&lt;/text&gt;
    &lt;text x=&quot;500&quot; y=&quot;220&quot;&gt;1&lt;/text&gt;
    &lt;text x=&quot;540&quot; y=&quot;210&quot;&gt;1&lt;/text&gt;
    &lt;text x=&quot;680&quot; y=&quot;90&quot;&gt;1&lt;/text&gt;
    &lt;text x=&quot;720&quot; y=&quot;105&quot;&gt;0&lt;/text&gt;
    &lt;text x=&quot;810&quot; y=&quot;195&quot;&gt;1&lt;/text&gt;
    &lt;text x=&quot;850&quot; y=&quot;210&quot;&gt;0&lt;/text&gt;
  &lt;/g&gt;
  
  &lt;!-- Magnifying glass examining the anomaly --&gt;
  &lt;g transform=&quot;translate(430, 120)&quot;&gt;
    &lt;!-- Handle --&gt;
    &lt;line x1=&quot;50&quot; y1=&quot;50&quot; x2=&quot;85&quot; y2=&quot;85&quot; stroke=&quot;#8b7355&quot; stroke-width=&quot;8&quot; stroke-linecap=&quot;round&quot;/&gt;
    &lt;!-- Rim --&gt;
    &lt;circle cx=&quot;0&quot; cy=&quot;0&quot; r=&quot;45&quot; fill=&quot;none&quot; stroke=&quot;#8b7355&quot; stroke-width=&quot;7&quot;/&gt;
    &lt;!-- Glass with shine --&gt;
    &lt;circle cx=&quot;0&quot; cy=&quot;0&quot; r=&quot;40&quot; fill=&quot;#e8f4f8&quot; opacity=&quot;0.15&quot;/&gt;
    &lt;circle cx=&quot;0&quot; cy=&quot;0&quot; r=&quot;40&quot; fill=&quot;none&quot; stroke=&quot;#b8d4e0&quot; stroke-width=&quot;2&quot;/&gt;
    &lt;!-- Reflection --&gt;
    &lt;ellipse cx=&quot;-12&quot; cy=&quot;-12&quot; rx=&quot;15&quot; ry=&quot;22&quot; fill=&quot;white&quot; opacity=&quot;0.4&quot;/&gt;
    
    &lt;!-- Magnified view inside glass --&gt;
    &lt;g clip-path=&quot;circle(40px at 0px 0px)&quot;&gt;
      &lt;!-- Zoomed circuit detail --&gt;
      &lt;line x1=&quot;-30&quot; y1=&quot;0&quot; x2=&quot;30&quot; y2=&quot;0&quot; stroke=&quot;#00ccff&quot; stroke-width=&quot;3&quot;/&gt;
      &lt;line x1=&quot;0&quot; y1=&quot;-30&quot; x2=&quot;0&quot; y2=&quot;30&quot; stroke=&quot;#00ccff&quot; stroke-width=&quot;3&quot;/&gt;
      &lt;circle cx=&quot;0&quot; cy=&quot;0&quot; r=&quot;8&quot; fill=&quot;none&quot; stroke=&quot;#ff4444&quot; stroke-width=&quot;3&quot;/&gt;
      &lt;circle cx=&quot;0&quot; cy=&quot;0&quot; r=&quot;12&quot; fill=&quot;none&quot; stroke=&quot;#ff4444&quot; stroke-width=&quot;2&quot; opacity=&quot;0.5&quot;/&gt;
      
      &lt;!-- Binary inside --&gt;
      &lt;text x=&quot;-8&quot; y=&quot;-18&quot; font-family=&quot;Courier New&quot; font-size=&quot;10&quot; fill=&quot;#00ff88&quot; font-weight=&quot;bold&quot;&gt;1&lt;/text&gt;
      &lt;text x=&quot;12&quot; y=&quot;-18&quot; font-family=&quot;Courier New&quot; font-size=&quot;10&quot; fill=&quot;#00ff88&quot; font-weight=&quot;bold&quot;&gt;0&lt;/text&gt;
      &lt;text x=&quot;-8&quot; y=&quot;25&quot; font-family=&quot;Courier New&quot; font-size=&quot;10&quot; fill=&quot;#00ff88&quot; font-weight=&quot;bold&quot;&gt;1&lt;/text&gt;
      &lt;text x=&quot;12&quot; y=&quot;25&quot; font-family=&quot;Courier New&quot; font-size=&quot;10&quot; fill=&quot;#00ff88&quot; font-weight=&quot;bold&quot;&gt;1&lt;/text&gt;
    &lt;/g&gt;
  &lt;/g&gt;
  
  &lt;!-- Additional circuit symbols --&gt;
  &lt;g opacity=&quot;0.3&quot; stroke=&quot;#00ff88&quot; fill=&quot;none&quot;&gt;
    &lt;!-- Resistor symbol --&gt;
    &lt;path d=&quot;M 60,120 l 5,-5 l 5,10 l 5,-10 l 5,10 l 5,-10 l 5,10 l 5,-5&quot; stroke-width=&quot;2&quot;/&gt;
    
    &lt;!-- Capacitor symbol --&gt;
    &lt;line x1=&quot;1120&quot; y1=&quot;210&quot; x2=&quot;1120&quot; y2=&quot;230&quot; stroke-width=&quot;2&quot;/&gt;
    &lt;line x1=&quot;1130&quot; y1=&quot;210&quot; x2=&quot;1130&quot; y2=&quot;230&quot; stroke-width=&quot;2&quot;/&gt;
    
    &lt;!-- Ground symbol --&gt;
    &lt;line x1=&quot;350&quot; y1=&quot;35&quot; x2=&quot;350&quot; y2=&quot;45&quot; stroke-width=&quot;2&quot;/&gt;
    &lt;line x1=&quot;345&quot; y1=&quot;45&quot; x2=&quot;355&quot; y2=&quot;45&quot; stroke-width=&quot;2&quot;/&gt;
    &lt;line x1=&quot;347&quot; y1=&quot;48&quot; x2=&quot;353&quot; y2=&quot;48&quot; stroke-width=&quot;1.5&quot;/&gt;
  &lt;/g&gt;
&lt;/svg&gt;
&lt;div style=&quot;color: #333333; font-family: -apple-system, BlinkMacSystemFont, &amp;quot;Segoe UI&amp;quot;, Roboto, Oxygen, Ubuntu, Cantarell, sans-serif; line-height: 1.6; margin: 0px auto; max-width: 800px; padding: 20px;&quot;&gt;

&lt;p style=&quot;font-size: 16px; margin-bottom: 15px;&quot;&gt;As promised in my earlier post, I&#39;m thrilled to announce that the changes to prevent Sidekiq job termination have been successfully deployed, and the results look promising!&lt;/p&gt;

&lt;p style=&quot;font-size: 16px; margin-bottom: 15px;&quot;&gt;But before I get ahead of myself, let me break down the problem again. &lt;span style=&quot;font-style: italic;&quot;&gt;(If you haven&#39;t read the &lt;a href=&quot;https://thushw.blogspot.com/2025/10/when-aws-went-down-our-users-didnt-lose.html&quot;&gt;previous&lt;/a&gt; &lt;a href=&quot;https://thushw.blogspot.com/2025/11/when-your-fix-becomes-problem-aws.html&quot;&gt;posts&lt;/a&gt;, you might want to check them out for context.)&lt;/span&gt;&lt;/p&gt;

&lt;h2 style=&quot;border-bottom: 2px solid rgb(52, 152, 219); color: #2c3e50; font-size: 24px; margin-bottom: 15px; margin-top: 30px; padding-bottom: 8px;&quot;&gt;The Problem&lt;/h2&gt;

&lt;ol style=&quot;margin-bottom: 20px; padding-left: 25px;&quot;&gt;
  &lt;li style=&quot;line-height: 1.6; margin-bottom: 10px;&quot;&gt;We have a parent job that spawns child jobs for mileage calculation for each user&lt;/li&gt;
  &lt;li style=&quot;line-height: 1.6; margin-bottom: 10px;&quot;&gt;The parent job runs longer than 30 seconds and occasionally gets killed by Sidekiq&lt;/li&gt;
  &lt;li style=&quot;line-height: 1.6; margin-bottom: 10px;&quot;&gt;Why does this happen? Sidekiq restarts every time we deploy new code (several times a day—we are a startup, after all!). Auto-scaling rules on the cluster can also reboot Sidekiq&lt;/li&gt;
  &lt;li style=&quot;line-height: 1.6; margin-bottom: 10px;&quot;&gt;Generally, this parent job is idempotent when interrupted during the time series iteration (where 99% of the time is spent), so it doesn&#39;t usually cause data corruption—just an annoying inefficiency&lt;/li&gt;
  &lt;li style=&quot;line-height: 1.6; margin-bottom: 10px;&quot;&gt;In the unlucky 1% of cases, we could spawn two jobs for each user, causing each to compute mileage independently and doubling the count&lt;/li&gt;
  &lt;li style=&quot;line-height: 1.6; margin-bottom: 10px;&quot;&gt;We can&#39;t handle concurrent invocations (which happen at the end of an outage) because it&#39;s hard to differentiate between a scheduled invocation and one triggered by a service restart&lt;/li&gt;
&lt;/ol&gt;

&lt;h2 style=&quot;border-bottom: 2px solid rgb(52, 152, 219); color: #2c3e50; font-size: 24px; margin-bottom: 15px; margin-top: 30px; padding-bottom: 8px;&quot;&gt;The Solution (Deployed Methodically)&lt;/h2&gt;

&lt;p style=&quot;font-size: 16px; margin-bottom: 15px;&quot;&gt;First, I tackled these steps:&lt;/p&gt;

&lt;ol style=&quot;margin-bottom: 20px; padding-left: 25px;&quot;&gt;
  &lt;li style=&quot;line-height: 1.6; margin-bottom: 10px;&quot;&gt;&lt;strong style=&quot;color: #2c3e50; font-weight: 600;&quot;&gt;Deployed metrics&lt;/strong&gt; to track how long the parent job takes. We now have over a day&#39;s worth of data. Note that the job takes far longer than 30 seconds; if the new approach succeeds, this graph should flatten out in the coming days&lt;/li&gt;
  &lt;li style=&quot;line-height: 1.6; margin-bottom: 10px;&quot;&gt;&lt;strong style=&quot;color: #2c3e50; font-weight: 600;&quot;&gt;Deployed code&lt;/strong&gt; that builds a parallel data structure to hold driver IDs&lt;/li&gt;
  &lt;li style=&quot;line-height: 1.6; margin-bottom: 10px;&quot;&gt;&lt;strong style=&quot;color: #2c3e50; font-weight: 600;&quot;&gt;Tested&lt;/strong&gt; to ensure both the old and new approaches return the same set of users&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: center;&quot;&gt;&lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhay5Rg9d85HIQIF8txXwe_FN1lHsFVqEs2uDS8J4EV5qtcoJSS_lT_CSy2sEDqDzHRJKfhxs78JZCsHfAlt-JZJxffkGnzEQf0lXniGM0oxtsJobdLIGEJcS1VmZZCtVk99ux4gOgR6D2HnuxhQxVALsODyWSVzIF7C5_tvi8zmvAJNYpcitjCcQ/s1668/Screenshot%202025-11-12%20at%207.31.44%E2%80%AFPM.png&quot; style=&quot;margin-left: 1em; margin-right: 1em;&quot;&gt;&lt;img border=&quot;0&quot; data-original-height=&quot;374&quot; data-original-width=&quot;1668&quot; height=&quot;144&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhay5Rg9d85HIQIF8txXwe_FN1lHsFVqEs2uDS8J4EV5qtcoJSS_lT_CSy2sEDqDzHRJKfhxs78JZCsHfAlt-JZJxffkGnzEQf0lXniGM0oxtsJobdLIGEJcS1VmZZCtVk99ux4gOgR6D2HnuxhQxVALsODyWSVzIF7C5_tvi8zmvAJNYpcitjCcQ/w640-h144/Screenshot%202025-11-12%20at%207.31.44%E2%80%AFPM.png&quot; width=&quot;640&quot; /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;/li&gt;&lt;/ol&gt;

&lt;h2 style=&quot;border-bottom: 2px solid rgb(52, 152, 219); color: #2c3e50; font-size: 24px; margin-bottom: 15px; margin-top: 30px; padding-bottom: 8px;&quot;&gt;Testing Challenges&lt;/h2&gt;

&lt;p style=&quot;font-size: 16px; margin-bottom: 15px;&quot;&gt;Step #3 proved harder than expected. Testing against a live system means the numbers never match exactly. I wrote code to examine the differences, built a hypothesis about why/how the numbers would differ, and tested it against the data.&lt;/p&gt;

&lt;div style=&quot;background-color: #f4f4f4; border-bottom: 1px solid #ddd; border-color: rgb(221, 221, 221) rgb(221, 221, 221) rgb(221, 221, 221) rgb(52, 152, 219); border-left: 4px solid #3498db; border-right: 1px solid #ddd; border-style: solid; border-top: 1px solid #ddd; border-width: 1px 1px 1px 4px; font-family: &amp;quot;Courier New&amp;quot;, Consolas, Monaco, monospace; font-size: 14px; line-height: 1.8; margin: 20px 0px; overflow-x: auto; padding: 15px;&quot;&gt;
&lt;pre style=&quot;margin: 0px; overflow-wrap: break-word; white-space: pre-wrap; word-wrap: break-word;&quot;&gt;# Old approach: walk the whole time series and keep recently active users.
users = []
GeoTimeseries.iterate do |user_id|
  users &amp;lt;&amp;lt; user_id if GeoTimeseries.recently_driven?(user_id)
end
orig_set = Set.new(users)

# New approach: union the members of the six most recent Redis buckets.
current_time = Time.current
new_set =
  6.times.reduce(Set.new) do |user_ids, i|
    bucket_time = current_time - (i * Geo::LastHourSink::BUCKET_DURATION)
    bucket_key = Geo::LastHourSink.bucket_key_for(bucket_time)
    members = $redis_aws.smembers(bucket_key).map(&amp;amp;:to_i)
    user_ids.merge(members)
  end&lt;/pre&gt;
&lt;/div&gt;
  
&lt;h2 style=&quot;border-bottom: 2px solid rgb(52, 152, 219); color: #2c3e50; font-size: 24px; margin-bottom: 15px; margin-top: 30px; padding-bottom: 8px;&quot;&gt;Analyzing the Differences&lt;/h2&gt;

&lt;p style=&quot;font-size: 16px; margin-bottom: 10px;&quot;&gt;To understand the discrepancies:&lt;/p&gt;

&lt;ul style=&quot;list-style-type: none; margin-bottom: 20px; padding-left: 20px;&quot;&gt;
  &lt;li style=&quot;margin-bottom: 8px;&quot;&gt;• &lt;span style=&quot;background-color: #f4f4f4; border-radius: 3px; font-family: &#39;Courier New&#39;, monospace; padding: 2px 6px;&quot;&gt;orig_set - new_set&lt;/span&gt; shows users our new technique missed&lt;/li&gt;
  &lt;li style=&quot;margin-bottom: 8px;&quot;&gt;• &lt;span style=&quot;background-color: #f4f4f4; border-radius: 3px; font-family: &#39;Courier New&#39;, monospace; padding: 2px 6px;&quot;&gt;new_set - orig_set&lt;/span&gt; shows users who appear with the new technique but were absent before&lt;/li&gt;
&lt;/ul&gt;
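
&lt;p style=&quot;font-size: 16px; margin-bottom: 15px;&quot;&gt;As a minimal sketch of the two set differences above, here they are on made-up user IDs. The IDs are hypothetical and not taken from our data; only the Set operations mirror the comparison.&lt;/p&gt;

&lt;pre style=&quot;margin: 0px; overflow-wrap: break-word; white-space: pre-wrap; word-wrap: break-word;&quot;&gt;
```ruby
require "set"

# Hypothetical IDs, for illustration only.
orig_set = Set[101, 102, 103, 104] # users found by the time series iteration
new_set  = Set[103, 104, 105]      # users found in the Redis buckets

missed = orig_set - new_set # in the old result, absent from the new one
added  = new_set - orig_set # in the new result, absent from the old one

puts "missed: #{missed.to_a.inspect}"
puts "added:  #{added.to_a.inspect}"
```
&lt;/pre&gt;

&lt;p style=&quot;font-size: 16px; margin-bottom: 15px;&quot;&gt;If both differences were empty, the two approaches would agree exactly. On a live system, small non-empty differences are expected, so the useful question is whether each one can be explained.&lt;/p&gt;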

&lt;h2 style=&quot;border-bottom: 2px solid rgb(52, 152, 219); color: #2c3e50; font-size: 24px; margin-bottom: 15px; margin-top: 30px; padding-bottom: 8px;&quot;&gt;Users We Missed (orig_set - new_set)&lt;/h2&gt;

&lt;p style=&quot;font-size: 16px; margin-bottom: 15px;&quot;&gt;Spot-checking the last timestamp of several users showed they&#39;d last driven slightly over an hour ago. This makes sense—our new technique runs about a minute after the time series iteration, by which point we&#39;d already expired some early drivers.&lt;/p&gt;

&lt;p style=&quot;font-size: 16px; margin-bottom: 15px;&quot;&gt;Running the time delta across the complete set revealed two patterns:&lt;/p&gt;

&lt;ol style=&quot;margin-bottom: 20px; padding-left: 25px;&quot;&gt;
  &lt;li style=&quot;line-height: 1.6; margin-bottom: 10px;&quot;&gt;Users who stopped driving slightly before the 1-hour mark&lt;/li&gt;
  &lt;li style=&quot;line-height: 1.6; margin-bottom: 10px;&quot;&gt;Users who started driving a few seconds ago&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: center;&quot;&gt;&lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh0o3Av0tJhmWeZsohUju0vxY0QDHdLlu18RhmEg0gHaRgqXO7XJkLSd1MJn8YOIDD336i8DZKNGVrJ-99DA5PJGOupF0GkG2uwPnISV1Jxcv6E6bOWp2rI4aM6Lgyvgid4BDOIpgB0P6aivdnyJ00KbxZQOCoyS4NUl0a0WNtKCUXDSOyfRu1C8w/s2004/Screenshot%202025-11-13%20at%201.02.28%E2%80%AFPM.png&quot; style=&quot;margin-left: 1em; margin-right: 1em;&quot;&gt;&lt;img border=&quot;0&quot; data-original-height=&quot;1844&quot; data-original-width=&quot;2004&quot; height=&quot;588&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh0o3Av0tJhmWeZsohUju0vxY0QDHdLlu18RhmEg0gHaRgqXO7XJkLSd1MJn8YOIDD336i8DZKNGVrJ-99DA5PJGOupF0GkG2uwPnISV1Jxcv6E6bOWp2rI4aM6Lgyvgid4BDOIpgB0P6aivdnyJ00KbxZQOCoyS4NUl0a0WNtKCUXDSOyfRu1C8w/w640-h588/Screenshot%202025-11-13%20at%201.02.28%E2%80%AFPM.png&quot; width=&quot;640&quot; /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;I hypothesized that users who hadn&#39;t driven for the past hour must have just started driving. 
If correct, these users should now be present in our new data structure—which I validated.&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: center;&quot;&gt;&lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgjLH2_DsmhXdYtYHZ1MZB7YNJQjfkO4aK9rV4-F-NeFvFoBBcZ1T9HgGLYenNr3y11v9qgXLQY4arCOYmRU-2G9TAM3-z7_b6LDzjBo6AJVG7r_CttlRA62CJ3eVEY3N93ksqyxiIWTLnAFOU4a6L1c8pAfXYhsrBiwQOvwavogTaJlp0LUHShQA/s2132/Screenshot%202025-11-13%20at%201.40.03%E2%80%AFPM.png&quot; style=&quot;clear: left; float: left; margin-bottom: 1em; margin-right: 1em;&quot;&gt;&lt;img border=&quot;0&quot; data-original-height=&quot;682&quot; data-original-width=&quot;2132&quot; height=&quot;204&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgjLH2_DsmhXdYtYHZ1MZB7YNJQjfkO4aK9rV4-F-NeFvFoBBcZ1T9HgGLYenNr3y11v9qgXLQY4arCOYmRU-2G9TAM3-z7_b6LDzjBo6AJVG7r_CttlRA62CJ3eVEY3N93ksqyxiIWTLnAFOU4a6L1c8pAfXYhsrBiwQOvwavogTaJlp0LUHShQA/w640-h204/Screenshot%202025-11-13%20at%201.40.03%E2%80%AFPM.png&quot; width=&quot;640&quot; /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h2 style=&quot;border-bottom: 2px solid rgb(52, 152, 219); color: #2c3e50; font-size: 24px; margin-bottom: 15px; margin-top: 30px; padding-bottom: 8px;&quot;&gt;New Drivers (new_set - orig_set)&lt;/h2&gt;

&lt;p style=&quot;font-size: 16px; margin-bottom: 15px;&quot;&gt;Everyone in this set had just started driving, so it made sense we missed them during the iteration that happened a minute earlier. (The second column of this screenshot shows how long each user has been driving; most values are under 60 seconds.)&lt;/p&gt;&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: center;&quot;&gt;&lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg13x20ERjWfxr_FyuBsNg9iw7McCFhbC-xb2sbjxdmXbJ32lv4Uy6f-HKF1Xywzxu9xOgMiVpLtuaT9AtmPp86Sd1Xmft0_qTLNU-acY2mgBQBpxOauoBinw_ofT82onKxK4mbzc76YZtbhxVxOrid23ffJoda_OvntdPA9TSs_GnxkFpiULgnbg/s2132/Screenshot%202025-11-13%20at%202.29.11%E2%80%AFPM.png&quot; style=&quot;margin-left: 1em; margin-right: 1em;&quot;&gt;&lt;img border=&quot;0&quot; data-original-height=&quot;1624&quot; data-original-width=&quot;2132&quot; height=&quot;488&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg13x20ERjWfxr_FyuBsNg9iw7McCFhbC-xb2sbjxdmXbJ32lv4Uy6f-HKF1Xywzxu9xOgMiVpLtuaT9AtmPp86Sd1Xmft0_qTLNU-acY2mgBQBpxOauoBinw_ofT82onKxK4mbzc76YZtbhxVxOrid23ffJoda_OvntdPA9TSs_GnxkFpiULgnbg/w640-h488/Screenshot%202025-11-13%20at%202.29.11%E2%80%AFPM.png&quot; width=&quot;640&quot; /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;p style=&quot;font-size: 16px; margin-bottom: 15px;&quot;&gt;&lt;span face=&quot;-apple-system, BlinkMacSystemFont, &amp;quot;Segoe UI&amp;quot;, Roboto, Oxygen, Ubuntu, Cantarell, sans-serif&quot; style=&quot;color: #666666; font-style: italic;&quot;&gt;With these validations complete, I&#39;m confident in the new approach. Stay tuned for follow-up metrics showing the flattened execution times!&lt;/span&gt;&lt;/p&gt;

&lt;/div&gt;

</content><link rel='replies' type='application/atom+xml' href='http://thushw.blogspot.com/feeds/620563521154796584/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/11265228/620563521154796584' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/11265228/posts/default/620563521154796584'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/11265228/posts/default/620563521154796584'/><link rel='alternate' type='text/html' href='http://thushw.blogspot.com/2025/11/death-defying-sidekiq-jobs.html' title='Death-defying sidekiq jobs'/><author><name>Unknown</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhay5Rg9d85HIQIF8txXwe_FN1lHsFVqEs2uDS8J4EV5qtcoJSS_lT_CSy2sEDqDzHRJKfhxs78JZCsHfAlt-JZJxffkGnzEQf0lXniGM0oxtsJobdLIGEJcS1VmZZCtVk99ux4gOgR6D2HnuxhQxVALsODyWSVzIF7C5_tvi8zmvAJNYpcitjCcQ/s72-w640-h144-c/Screenshot%202025-11-12%20at%207.31.44%E2%80%AFPM.png" height="72" width="72"/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-11265228.post-1792388757843934734</id><published>2025-11-07T22:02:00.000-08:00</published><updated>2025-11-07T22:09:09.736-08:00</updated><title type='text'>When Your Fix Becomes the Problem: A Tale of AWS Outages, Redis Flags, and Performance Scaling</title><content type='html'>&lt;svg viewBox=&quot;0 0 1200 630&quot; xmlns=&quot;http://www.w3.org/2000/svg&quot;&gt;
  &lt;defs&gt;
    &lt;linearGradient id=&quot;bg&quot; x1=&quot;0%&quot; y1=&quot;0%&quot; x2=&quot;100%&quot; y2=&quot;100%&quot;&gt;
      &lt;stop offset=&quot;0%&quot; style=&quot;stop-color:#1a1a2e;stop-opacity:1&quot; /&gt;
      &lt;stop offset=&quot;100%&quot; style=&quot;stop-color:#16213e;stop-opacity:1&quot; /&gt;
    &lt;/linearGradient&gt;
    &lt;filter id=&quot;glow&quot;&gt;
      &lt;feGaussianBlur stdDeviation=&quot;3&quot; result=&quot;coloredBlur&quot;/&gt;
      &lt;feMerge&gt;
        &lt;feMergeNode in=&quot;coloredBlur&quot;/&gt;
        &lt;feMergeNode in=&quot;SourceGraphic&quot;/&gt;
      &lt;/feMerge&gt;
    &lt;/filter&gt;
  &lt;/defs&gt;
  
  &lt;!-- Background --&gt;
  &lt;rect width=&quot;1200&quot; height=&quot;630&quot; fill=&quot;url(#bg)&quot;/&gt;
  
  &lt;!-- Circuit pattern background --&gt;
  &lt;g opacity=&quot;0.1&quot; stroke=&quot;#3498db&quot; stroke-width=&quot;1&quot; fill=&quot;none&quot;&gt;
    &lt;line x1=&quot;0&quot; y1=&quot;100&quot; x2=&quot;200&quot; y2=&quot;100&quot;/&gt;
    &lt;line x1=&quot;200&quot; y1=&quot;100&quot; x2=&quot;200&quot; y2=&quot;200&quot;/&gt;
    &lt;line x1=&quot;200&quot; y1=&quot;200&quot; x2=&quot;400&quot; y2=&quot;200&quot;/&gt;
    &lt;circle cx=&quot;200&quot; cy=&quot;100&quot; r=&quot;4&quot; fill=&quot;#3498db&quot;/&gt;
    &lt;circle cx=&quot;200&quot; cy=&quot;200&quot; r=&quot;4&quot; fill=&quot;#3498db&quot;/&gt;
    
    &lt;line x1=&quot;1000&quot; y1=&quot;400&quot; x2=&quot;1200&quot; y2=&quot;400&quot;/&gt;
    &lt;line x1=&quot;1000&quot; y1=&quot;400&quot; x2=&quot;1000&quot; y2=&quot;500&quot;/&gt;
    &lt;line x1=&quot;800&quot; y1=&quot;500&quot; x2=&quot;1000&quot; y2=&quot;500&quot;/&gt;
    &lt;circle cx=&quot;1000&quot; cy=&quot;400&quot; r=&quot;4&quot; fill=&quot;#3498db&quot;/&gt;
    &lt;circle cx=&quot;1000&quot; cy=&quot;500&quot; r=&quot;4&quot; fill=&quot;#3498db&quot;/&gt;
  &lt;/g&gt;
  
  &lt;!-- Main icons --&gt;
  &lt;!-- Redis icon (left) --&gt;
  &lt;g transform=&quot;translate(150, 200)&quot;&gt;
    &lt;circle cx=&quot;0&quot; cy=&quot;0&quot; r=&quot;60&quot; fill=&quot;#DC382D&quot; opacity=&quot;0.9&quot;/&gt;
    &lt;path d=&quot;M -30,-10 L -15,-5 L 0,-10 L 15,-5 L 30,-10 L 30,10 L 15,15 L 0,10 L -15,15 L -30,10 Z&quot; 
          fill=&quot;white&quot; opacity=&quot;0.9&quot;/&gt;
    &lt;ellipse cx=&quot;0&quot; cy=&quot;0&quot; rx=&quot;30&quot; ry=&quot;8&quot; fill=&quot;none&quot; stroke=&quot;white&quot; stroke-width=&quot;2&quot; opacity=&quot;0.7&quot;/&gt;
  &lt;/g&gt;
  
  &lt;!-- Warning/Clock icon (center) --&gt;
  &lt;g transform=&quot;translate(600, 315)&quot;&gt;
    &lt;circle cx=&quot;0&quot; cy=&quot;0&quot; r=&quot;80&quot; fill=&quot;#f39c12&quot; opacity=&quot;0.2&quot; filter=&quot;url(#glow)&quot;/&gt;
    &lt;circle cx=&quot;0&quot; cy=&quot;0&quot; r=&quot;50&quot; fill=&quot;none&quot; stroke=&quot;#f39c12&quot; stroke-width=&quot;4&quot;/&gt;
    &lt;path d=&quot;M 0,-30 L 0,0 L 20,20&quot; fill=&quot;none&quot; stroke=&quot;#f39c12&quot; stroke-width=&quot;4&quot; stroke-linecap=&quot;round&quot;/&gt;
    &lt;circle cx=&quot;0&quot; cy=&quot;0&quot; r=&quot;4&quot; fill=&quot;#f39c12&quot;/&gt;
  &lt;/g&gt;
  
  &lt;!-- Sidekiq/Worker icon (right) --&gt;
  &lt;g transform=&quot;translate(1050, 200)&quot;&gt;
    &lt;rect x=&quot;-50&quot; y=&quot;-50&quot; width=&quot;100&quot; height=&quot;100&quot; rx=&quot;10&quot; fill=&quot;#B52F3A&quot; opacity=&quot;0.9&quot;/&gt;
    &lt;circle cx=&quot;-20&quot; cy=&quot;-20&quot; r=&quot;8&quot; fill=&quot;white&quot;/&gt;
    &lt;circle cx=&quot;20&quot; cy=&quot;-20&quot; r=&quot;8&quot; fill=&quot;white&quot;/&gt;
    &lt;path d=&quot;M -30,10 Q 0,30 30,10&quot; fill=&quot;none&quot; stroke=&quot;white&quot; stroke-width=&quot;4&quot; stroke-linecap=&quot;round&quot;/&gt;
  &lt;/g&gt;
  
  &lt;!-- Connecting lines with animation feel --&gt;
  &lt;g stroke=&quot;#3498db&quot; stroke-width=&quot;2&quot; fill=&quot;none&quot; opacity=&quot;0.5&quot;&gt;
    &lt;line x1=&quot;210&quot; y1=&quot;200&quot; x2=&quot;520&quot; y2=&quot;315&quot; stroke-dasharray=&quot;5,5&quot;/&gt;
    &lt;line x1=&quot;680&quot; y1=&quot;315&quot; x2=&quot;1000&quot; y2=&quot;200&quot; stroke-dasharray=&quot;5,5&quot;/&gt;
  &lt;/g&gt;
  
  &lt;!-- Title --&gt;
  &lt;text x=&quot;600&quot; y=&quot;480&quot; font-family=&quot;Arial, sans-serif&quot; font-size=&quot;48&quot; font-weight=&quot;bold&quot; 
        fill=&quot;#ecf0f1&quot; text-anchor=&quot;middle&quot;&gt;
    When Your Fix Becomes the Problem
  &lt;/text&gt;
  
  &lt;!-- Subtitle --&gt;
  &lt;text x=&quot;600&quot; y=&quot;530&quot; font-family=&quot;Arial, sans-serif&quot; font-size=&quot;28&quot; 
        fill=&quot;#95a5a6&quot; text-anchor=&quot;middle&quot;&gt;
    AWS Outages, Redis Flags &amp;amp; Performance Scaling
  &lt;/text&gt;
  
  &lt;!-- Small tech badges --&gt;
  &lt;g transform=&quot;translate(400, 570)&quot;&gt;
    &lt;rect x=&quot;0&quot; y=&quot;0&quot; width=&quot;80&quot; height=&quot;30&quot; rx=&quot;15&quot; fill=&quot;#3498db&quot; opacity=&quot;0.8&quot;/&gt;
    &lt;text x=&quot;40&quot; y=&quot;20&quot; font-family=&quot;Arial, sans-serif&quot; font-size=&quot;14&quot; 
          fill=&quot;white&quot; text-anchor=&quot;middle&quot; font-weight=&quot;bold&quot;&gt;Redis&lt;/text&gt;
  &lt;/g&gt;
  
  &lt;g transform=&quot;translate(500, 570)&quot;&gt;
    &lt;rect x=&quot;0&quot; y=&quot;0&quot; width=&quot;100&quot; height=&quot;30&quot; rx=&quot;15&quot; fill=&quot;#B52F3A&quot; opacity=&quot;0.8&quot;/&gt;
    &lt;text x=&quot;50&quot; y=&quot;20&quot; font-family=&quot;Arial, sans-serif&quot; font-size=&quot;14&quot; 
          fill=&quot;white&quot; text-anchor=&quot;middle&quot; font-weight=&quot;bold&quot;&gt;Sidekiq&lt;/text&gt;
  &lt;/g&gt;
  
  &lt;g transform=&quot;translate(620, 570)&quot;&gt;
    &lt;rect x=&quot;0&quot; y=&quot;0&quot; width=&quot;80&quot; height=&quot;30&quot; rx=&quot;15&quot; fill=&quot;#f39c12&quot; opacity=&quot;0.8&quot;/&gt;
    &lt;text x=&quot;40&quot; y=&quot;20&quot; font-family=&quot;Arial, sans-serif&quot; font-size=&quot;14&quot; 
          fill=&quot;white&quot; text-anchor=&quot;middle&quot; font-weight=&quot;bold&quot;&gt;AWS&lt;/text&gt;
  &lt;/g&gt;
&lt;/svg&gt;

&lt;div style=&quot;color: #333333; font-family: Georgia, &amp;quot;Times New Roman&amp;quot;, serif; line-height: 1.7; margin: 0px auto; max-width: 800px;&quot;&gt;


&lt;h2 style=&quot;border-bottom: 2px solid rgb(52, 152, 219); color: #34495e; font-size: 1.5em; margin-bottom: 0.5em; margin-top: 1.5em; padding-bottom: 0.3em;&quot;&gt;The Original Problem: AWS Outage Chaos&lt;/h2&gt;

&lt;p style=&quot;margin-bottom: 1em;&quot;&gt;During the AWS outage of October 20, 2025, our team discovered an uncomfortable truth about our scheduled jobs. We had jobs configured to run exactly once per schedule via AWS EventBridge Scheduler. Simple enough, right?&lt;/p&gt;

&lt;p style=&quot;margin-bottom: 1em;&quot;&gt;Wrong.&lt;/p&gt;

&lt;p style=&quot;margin-bottom: 1em;&quot;&gt;When AWS came back online after an extended outage, EventBridge released a flood of queued job triggers that had accumulated during the downtime. Our &quot;run once&quot; job suddenly ran multiple times in rapid succession, causing data inconsistencies and duplicate operations. Check out my previous post on how we recovered user data after the outage &lt;a href=&quot;https://thushw.blogspot.com/2025/10/when-aws-went-down-our-users-didnt-lose.html&quot; target=&quot;_blank&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;h2 style=&quot;border-bottom: 2px solid rgb(52, 152, 219); color: #34495e; font-size: 1.5em; margin-bottom: 0.5em; margin-top: 1.5em; padding-bottom: 0.3em;&quot;&gt;The Solution That Worked (Too Well)&lt;/h2&gt;

&lt;p style=&quot;margin-bottom: 1em;&quot;&gt;The fix seemed straightforward: implement a Redis-based distributed lock to prevent concurrent executions. Before each job execution, we&#39;d set a flag in Redis. If the flag was already set, the job would recognize a concurrent execution was in progress and gracefully bail out.&lt;/p&gt;

&lt;pre style=&quot;background: rgb(248, 248, 248); border-color: rgb(221, 221, 221) rgb(221, 221, 221) rgb(221, 221, 221) rgb(52, 152, 219); border-image: initial; border-left: 3px solid #3498db; border-style: solid; border-width: 1px 1px 1px 3px; border: 1px solid rgb(221, 221, 221); font-family: &amp;quot;Courier New&amp;quot;, Courier, monospace; font-size: 0.9em; line-height: 1.5; margin: 1.5em 0px; overflow-x: auto; padding: 15px;&quot;&gt;&lt;code&gt;def perform
  return if concurrent_execution_detected?
  
  set_execution_flag
  
  begin
    # Iterate over driver TimeSeries data to find active drivers
    process_active_drivers
  ensure
    clear_execution_flag
  end
end

def concurrent_execution_detected?
  !REDIS.set(&quot;job:#{job_id}:running&quot;, &quot;1&quot;, nx: true, ex: 300)
end&lt;/code&gt;&lt;/pre&gt;

&lt;p style=&quot;margin-bottom: 1em;&quot;&gt;We deployed this with confidence. Problem solved!&lt;/p&gt;

&lt;h2 style=&quot;border-bottom: 2px solid rgb(52, 152, 219); color: #34495e; font-size: 1.5em; margin-bottom: 0.5em; margin-top: 1.5em; padding-bottom: 0.3em;&quot;&gt;The Problem With the Solution&lt;/h2&gt;

&lt;p style=&quot;margin-bottom: 1em;&quot;&gt;Except... it wasn&#39;t.&lt;/p&gt;

&lt;p style=&quot;margin-bottom: 1em;&quot;&gt;Shortly after deployment, we noticed something odd: some scheduled slots had &lt;strong style=&quot;color: #2c3e50;&quot;&gt;no job execution at all&lt;/strong&gt;. The job simply didn&#39;t run when it was supposed to. This was arguably worse than running multiple times—at least duplicate runs were noisy and obvious.&lt;/p&gt;

&lt;h2 style=&quot;border-bottom: 2px solid rgb(52, 152, 219); color: #34495e; font-size: 1.5em; margin-bottom: 0.5em; margin-top: 1.5em; padding-bottom: 0.3em;&quot;&gt;The Real Culprit: Death by a Thousand Drivers&lt;/h2&gt;

&lt;p style=&quot;margin-bottom: 1em;&quot;&gt;After digging through logs and tracing job lifecycles, we found the smoking gun: Sidekiq&#39;s graceful shutdown mechanism combined with our job&#39;s growing execution time.&lt;/p&gt;

&lt;p style=&quot;margin-bottom: 1em;&quot;&gt;Here&#39;s what was happening:&lt;/p&gt;

&lt;ol style=&quot;margin: 1em 0px; padding-left: 2em;&quot;&gt;
    &lt;li style=&quot;margin-bottom: 0.5em;&quot;&gt;A scheduled job starts executing&lt;/li&gt;
    &lt;li style=&quot;margin-bottom: 0.5em;&quot;&gt;The job iterates over the geospatial TimeSeries data for all our drivers&lt;/li&gt;
    &lt;li style=&quot;margin-bottom: 0.5em;&quot;&gt;Kubernetes scales down our Sidekiq cluster (or a pod gets replaced during deployment)&lt;/li&gt;
    &lt;li style=&quot;margin-bottom: 0.5em;&quot;&gt;Sidekiq begins its graceful shutdown, giving jobs 30 seconds to complete&lt;/li&gt;
    &lt;li style=&quot;margin-bottom: 0.5em;&quot;&gt;Our job takes longer than 30 seconds (sometimes over a minute!)&lt;/li&gt;
    &lt;li style=&quot;margin-bottom: 0.5em;&quot;&gt;Sidekiq hard-kills the job&lt;/li&gt;
    &lt;li style=&quot;margin-bottom: 0.5em;&quot;&gt;The Redis flag remains set (because the &lt;code style=&quot;background: rgb(244, 244, 244); border-radius: 3px; font-family: &amp;quot;Courier New&amp;quot;, Courier, monospace; font-size: 0.9em; padding: 2px 6px;&quot;&gt;ensure&lt;/code&gt; block never runs)&lt;/li&gt;
    &lt;li style=&quot;margin-bottom: 0.5em;&quot;&gt;Sidekiq automatically retries the job on another worker&lt;/li&gt;
    &lt;li style=&quot;margin-bottom: 0.5em;&quot;&gt;The retry sees the Redis flag and thinks &quot;concurrent execution detected!&quot;&lt;/li&gt;
    &lt;li style=&quot;margin-bottom: 0.5em;&quot;&gt;The retry bails out immediately&lt;/li&gt;
    &lt;li style=&quot;margin-bottom: 0.5em;&quot;&gt;&lt;strong style=&quot;color: #2c3e50;&quot;&gt;No job completes for that scheduled slot&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;
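
&lt;p style=&quot;margin-bottom: 1em;&quot;&gt;The sequence above can be reproduced without Redis or Sidekiq. The sketch below is only an illustration, not our production code: &lt;code&gt;FakeLockStore&lt;/code&gt; is a hypothetical in-memory stand-in for &lt;code&gt;SET key value NX EX ttl&lt;/code&gt;, the key name is made up, and the hard kill is modeled by simply never releasing the flag.&lt;/p&gt;

```ruby
# Hypothetical in-memory stand-in for Redis SET key value NX EX ttl.
class FakeLockStore
  def initialize
    @expires_at = {} # key => time at which the flag lapses
  end

  # Returns true when the flag was acquired, false when a live flag exists.
  def set_nx_ex(key, ttl, now)
    return false if @expires_at.fetch(key, now) > now

    @expires_at[key] = now + ttl
    true
  end

  def del(key)
    @expires_at.delete(key)
  end
end

LOCK_KEY = "job:slot-42:running" # hypothetical key name
LOCK_TTL = 300                   # seconds, mirroring ex: 300 in the lock code

# Runs one attempt at time `now` (seconds). `killed: true` models a hard
# kill: the work never finishes and the flag is never cleared.
def attempt(store, now, killed:)
  return :skipped unless store.set_nx_ex(LOCK_KEY, LOCK_TTL, now)
  return :killed if killed

  store.del(LOCK_KEY)
  :completed
end

store  = FakeLockStore.new
first  = attempt(store, 0, killed: true)    # pod terminated mid-run
retry1 = attempt(store, 45, killed: false)  # retry arrives while the flag is live
retry2 = attempt(store, 301, killed: false) # retry arrives after the TTL lapsed

puts [first, retry1, retry2].inspect
```

&lt;p style=&quot;margin-bottom: 1em;&quot;&gt;In this model, a retry that arrives before the TTL lapses is indistinguishable from a truly concurrent run, so it skips; only a retry that arrives after the flag has expired gets through.&lt;/p&gt;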

&lt;h3 style=&quot;color: #555555; font-size: 1.2em; margin-bottom: 0.5em; margin-top: 1.3em;&quot;&gt;The Hidden Performance Regression&lt;/h3&gt;

&lt;p style=&quot;margin-bottom: 1em;&quot;&gt;What made this particularly insidious was that our job used to be fast. When we first launched, iterating through driver TimeSeries data took milliseconds. But as our traffic surged and our driver count grew, the Redis keyspace for the TimeSeries structure expanded significantly.&lt;/p&gt;

&lt;p style=&quot;margin-bottom: 1em;&quot;&gt;What was once a quick scan became a slow crawl through thousands of driver records, filtering for those who had driven in the last hour. We only actually needed the geospatial data from the last 10 minutes, but we were scanning &lt;strong style=&quot;color: #2c3e50;&quot;&gt;everything&lt;/strong&gt;.&lt;/p&gt;

&lt;p style=&quot;margin-bottom: 1em;&quot;&gt;The job had slowly, imperceptibly degraded from sub-second execution to over a minute—crossing that critical 30-second Sidekiq shutdown threshold.&lt;/p&gt;

&lt;h2 style=&quot;border-bottom: 2px solid rgb(52, 152, 219); color: #34495e; font-size: 1.5em; margin-bottom: 0.5em; margin-top: 1.5em; padding-bottom: 0.3em;&quot;&gt;The Real Fix: Performance First, Then Locking&lt;/h2&gt;

&lt;p style=&quot;margin-bottom: 1em;&quot;&gt;We realized the Redis lock wasn&#39;t wrong—it was just unable to work correctly with a slow job. The real problem was that we couldn&#39;t distinguish between two scenarios:&lt;/p&gt;

&lt;ol style=&quot;margin: 1em 0px; padding-left: 2em;&quot;&gt;
    &lt;li style=&quot;margin-bottom: 0.5em;&quot;&gt;&lt;strong style=&quot;color: #2c3e50;&quot;&gt;Truly concurrent jobs&lt;/strong&gt; (from AWS outage flooding) → Should be blocked&lt;/li&gt;
    &lt;li style=&quot;margin-bottom: 0.5em;&quot;&gt;&lt;strong style=&quot;color: #2c3e50;&quot;&gt;Retry after Sidekiq kill&lt;/strong&gt; (legitimate recovery) → Should proceed&lt;/li&gt;
&lt;/ol&gt;

&lt;p style=&quot;margin-bottom: 1em;&quot;&gt;When the job took 60+ seconds, Sidekiq would kill it and spawn a retry. But the Redis lock was still held, so the retry would see it as a concurrent execution and bail out. The lock was working as designed; the job was just too slow to survive Sidekiq&#39;s shutdown process.&lt;/p&gt;

&lt;p style=&quot;margin-bottom: 1em;&quot;&gt;The solution wasn&#39;t to remove the lock—we still need it to handle AWS outage scenarios. The solution was to make the job fast enough that it would never be killed mid-execution.&lt;/p&gt;

&lt;h3 style=&quot;color: #555555; font-size: 1.2em; margin-bottom: 0.5em; margin-top: 1.3em;&quot;&gt;The Performance Bottleneck&lt;/h3&gt;

&lt;p style=&quot;margin-bottom: 1em;&quot;&gt;Our original implementation looked something like this:&lt;/p&gt;

&lt;pre style=&quot;background: rgb(248, 248, 248); border-color: rgb(221, 221, 221) rgb(221, 221, 221) rgb(221, 221, 221) rgb(52, 152, 219); border-image: initial; border-left: 3px solid #3498db; border-style: solid; border-width: 1px 1px 1px 3px; border: 1px solid rgb(221, 221, 221); font-family: &amp;quot;Courier New&amp;quot;, Courier, monospace; font-size: 0.9em; line-height: 1.5; margin: 1.5em 0px; overflow-x: auto; padding: 15px;&quot;&gt;&lt;code&gt;def process_active_drivers
  # Driver.all loads every driver row, active or not
  Driver.all.each do |driver|
    # Fetch the entire TimeSeries for this driver as [member, score] pairs
    timeseries = REDIS.zrange(&quot;driver:#{driver.id}:locations&quot;, 0, -1, with_scores: true)
    
    # Filter in Ruby for entries from the last hour (score = Unix timestamp)
    hour_cutoff = 1.hour.ago.to_i
    last_hour = timeseries.select { |_member, score| score &amp;gt; hour_cutoff }
    
    # We only needed the last 10 minutes anyway!
    recent_cutoff = 10.minutes.ago.to_i
    recent_locations = last_hour.select { |_member, score| score &amp;gt; recent_cutoff }
    process_recent_activity(recent_locations.map { |member, _score| member })
  end
end&lt;/code&gt;&lt;/pre&gt;

&lt;p style=&quot;margin-bottom: 1em;&quot;&gt;This meant:&lt;/p&gt;
&lt;ul style=&quot;margin: 1em 0px; padding-left: 2em;&quot;&gt;
    &lt;li style=&quot;margin-bottom: 0.5em;&quot;&gt;Fetching potentially thousands of driver records from the database&lt;/li&gt;
    &lt;li style=&quot;margin-bottom: 0.5em;&quot;&gt;For each driver, pulling their entire geospatial TimeSeries from Redis&lt;/li&gt;
    &lt;li style=&quot;margin-bottom: 0.5em;&quot;&gt;Filtering in Ruby to find recent activity&lt;/li&gt;
    &lt;li style=&quot;margin-bottom: 0.5em;&quot;&gt;All to find maybe a few dozen drivers who were actually active&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 style=&quot;color: #555555; font-size: 1.2em; margin-bottom: 0.5em; margin-top: 1.3em;&quot;&gt;The Solution: An Active Driver Index&lt;/h3&gt;

&lt;p style=&quot;margin-bottom: 1em;&quot;&gt;Instead of scanning all drivers and their complete history, we built a lightweight index structure in Redis that tracked &lt;strong style=&quot;color: #2c3e50;&quot;&gt;only the drivers who had been active in the last hour&lt;/strong&gt;:&lt;/p&gt;

&lt;pre style=&quot;background: rgb(248, 248, 248); border-color: rgb(221, 221, 221) rgb(221, 221, 221) rgb(221, 221, 221) rgb(52, 152, 219); border-image: initial; border-left: 3px solid #3498db; border-style: solid; border-width: 1px 1px 1px 3px; border: 1px solid rgb(221, 221, 221); font-family: &amp;quot;Courier New&amp;quot;, Courier, monospace; font-size: 0.9em; line-height: 1.5; margin: 1.5em 0px; overflow-x: auto; padding: 15px;&quot;&gt;&lt;code&gt;# When a driver&#39;s location is recorded (happens frequently)
def record_driver_location(driver_id, location_data)
  timestamp = Time.now.to_i
  
  # Store in the main TimeSeries (as before), scored by Unix timestamp
  REDIS.zadd(&quot;driver:#{driver_id}:locations&quot;, timestamp, location_data)
  
  # NEW: Add (or refresh) the driver in the active drivers index, scored by last-seen time
  REDIS.zadd(&quot;active_drivers&quot;, timestamp, driver_id)
  
  # Sorted-set members have no TTL of their own, so trim entries older than 1 hour
  REDIS.zremrangebyscore(&quot;active_drivers&quot;, 0, 1.hour.ago.to_i)
end&lt;/code&gt;&lt;/pre&gt;

&lt;p style=&quot;margin-bottom: 1em;&quot;&gt;Now our scheduled job became:&lt;/p&gt;

&lt;pre style=&quot;background: rgb(248, 248, 248); border-color: rgb(221, 221, 221) rgb(221, 221, 221) rgb(221, 221, 221) rgb(52, 152, 219); border-image: initial; border-left: 3px solid #3498db; border-style: solid; border-width: 1px 1px 1px 3px; border: 1px solid rgb(221, 221, 221); font-family: &amp;quot;Courier New&amp;quot;, Courier, monospace; font-size: 0.9em; line-height: 1.5; margin: 1.5em 0px; overflow-x: auto; padding: 15px;&quot;&gt;&lt;code&gt;def process_active_drivers
  # Get only drivers active in the last hour (ZRANGEBYSCORE is O(log N + M) for M matches)
  cutoff = 1.hour.ago.to_i
  active_driver_ids = REDIS.zrangebyscore(&quot;active_drivers&quot;, cutoff, &quot;+inf&quot;)
  
  # Only fetch the data we need
  active_driver_ids.each do |driver_id|
    # Get just the last 10 minutes of data using ZRANGEBYSCORE
    recent_locations = REDIS.zrangebyscore(
      &quot;driver:#{driver_id}:locations&quot;,
      10.minutes.ago.to_i,
      &quot;+inf&quot;
    )
    
    process_recent_activity(recent_locations)
  end
end&lt;/code&gt;&lt;/pre&gt;
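
&lt;p style=&quot;margin-bottom: 1em;&quot;&gt;To see the index behave without a Redis server, here is a minimal sketch. &lt;code&gt;TinySortedSet&lt;/code&gt; is a hypothetical in-memory stand-in that mimics only the three sorted-set commands used above (ZADD, ZREMRANGEBYSCORE and ZRANGEBYSCORE with an open upper bound); the driver IDs and timestamps are made up.&lt;/p&gt;

```ruby
# Hypothetical in-memory stand-in for a single Redis sorted set.
class TinySortedSet
  def initialize
    @scores = {} # member => score
  end

  # ZADD: insert the member, or overwrite its score if already present.
  def zadd(score, member)
    @scores[member] = score
  end

  # ZREMRANGEBYSCORE: drop members whose score lies in min..max inclusive.
  def zremrangebyscore(min, max)
    @scores.reject! { |_member, score| score.between?(min, max) }
  end

  # ZRANGEBYSCORE min +inf: members with score >= min, ordered by score.
  def zrangebyscore_from(min)
    @scores.select { |_member, score| score >= min }
           .sort_by { |member, score| [score, member] }
           .map { |member, _score| member }
  end
end

HOUR   = 3600
now    = 10_000 # made-up timestamp in seconds
active = TinySortedSet.new

active.zadd(now - 2 * HOUR, "driver:1") # last seen two hours ago
active.zadd(now - 1800, "driver:2")     # last seen 30 minutes ago
active.zadd(now - 60, "driver:3")       # last seen a minute ago
active.zadd(now - 30, "driver:2")       # driver:2 reports again; score is refreshed

active.zremrangebyscore(0, now - HOUR)  # trim anything older than an hour
puts active.zrangebyscore_from(now - HOUR).inspect
```

&lt;p style=&quot;margin-bottom: 1em;&quot;&gt;Because each driver appears at most once in the index, its size is bounded by the number of recently active drivers rather than by the amount of location history.&lt;/p&gt;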

&lt;h3 style=&quot;color: #555555; font-size: 1.2em; margin-bottom: 0.5em; margin-top: 1.3em;&quot;&gt;The Expected Results&lt;/h3&gt;

&lt;p style=&quot;margin-bottom: 1em;&quot;&gt;Based on benchmarking, the performance improvement should be dramatic:&lt;/p&gt;
&lt;ul style=&quot;margin: 1em 0px; padding-left: 2em;&quot;&gt;
    &lt;li style=&quot;margin-bottom: 0.5em;&quot;&gt;&lt;strong style=&quot;color: #2c3e50;&quot;&gt;Before&lt;/strong&gt;: 60+ seconds (and growing with scale)&lt;/li&gt;
    &lt;li style=&quot;margin-bottom: 0.5em;&quot;&gt;&lt;strong style=&quot;color: #2c3e50;&quot;&gt;After&lt;/strong&gt;: &amp;lt;1 second consistently (in testing)&lt;/li&gt;
&lt;/ul&gt;

&lt;p style=&quot;margin-bottom: 1em;&quot;&gt;By maintaining a parallel index of active drivers, we:&lt;/p&gt;
&lt;ul style=&quot;margin: 1em 0px; padding-left: 2em;&quot;&gt;
    &lt;li style=&quot;margin-bottom: 0.5em;&quot;&gt;Eliminated the need to scan all drivers&lt;/li&gt;
    &lt;li style=&quot;margin-bottom: 0.5em;&quot;&gt;Eliminated the need to fetch and filter complete TimeSeries data&lt;/li&gt;
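    &lt;li style=&quot;margin-bottom: 0.5em;&quot;&gt;Reduced the job from O(N×M) to O(A×K), where N is the total number of drivers, M is the length of each driver&#39;s full TimeSeries, A is the number of active drivers (tiny compared to N), and K is the number of recent locations per active driver&lt;/li&gt;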
&lt;/ul&gt;
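
&lt;p style=&quot;margin-bottom: 1em;&quot;&gt;A quick back-of-the-envelope calculation shows the shape of that reduction. Every figure below is hypothetical and chosen only for illustration; these are not measurements from our system, and the sketch counts entries read, not wall-clock time.&lt;/p&gt;

```ruby
# All figures are hypothetical, chosen only to show the shape of the saving.
N = 5_000 # every driver in the database
M = 2_000 # entries in one driver's full TimeSeries
A = 40    # drivers seen in the last hour
K = 60    # entries per active driver in the last 10 minutes

# Entries pulled out of Redis in one run: entities visited times entries each.
def entries_read(entities, entries_per_entity)
  entities * entries_per_entity
end

before = entries_read(N, M) # full scan: O(N x M)
after  = entries_read(A, K) # indexed:   O(A x K)

# The full scan issues one range query per driver; the indexed version
# issues one query against the index plus one per active driver.
puts "before: #{before} entries over #{N} range queries"
puts "after:  #{after} entries over #{A + 1} range queries"
puts "roughly #{before / after}x fewer entries read"
```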

&lt;p style=&quot;margin-bottom: 1em;&quot;&gt;If the benchmarks hold in production, with the job completing in under a second:&lt;/p&gt;
&lt;ul style=&quot;margin: 1em 0px; padding-left: 2em;&quot;&gt;
    &lt;li style=&quot;margin-bottom: 0.5em;&quot;&gt;Sidekiq&#39;s 30-second shutdown window will no longer be a concern&lt;/li&gt;
    &lt;li style=&quot;margin-bottom: 0.5em;&quot;&gt;The Redis lock will finally work as intended—preventing duplicate jobs from AWS outages without blocking legitimate retries&lt;/li&gt;
    &lt;li style=&quot;margin-bottom: 0.5em;&quot;&gt;We can distinguish between truly concurrent jobs (which should be blocked) and retry jobs (which should proceed)&lt;/li&gt;
&lt;/ul&gt;

&lt;div style=&quot;background: rgb(232, 244, 248); border-left: 4px solid rgb(52, 152, 219); margin: 1.5em 0px; padding: 15px;&quot;&gt;
    &lt;p style=&quot;margin: 0px;&quot;&gt;&lt;strong style=&quot;color: #2c3e50;&quot;&gt;Update&lt;/strong&gt;: I&#39;ll be deploying this solution soon and will follow up with a part 2 covering the actual production results and any surprises we encounter along the way.&lt;/p&gt;
&lt;/div&gt;

&lt;h2 style=&quot;border-bottom: 2px solid rgb(52, 152, 219); color: #34495e; font-size: 1.5em; margin-bottom: 0.5em; margin-top: 1.5em; padding-bottom: 0.3em;&quot;&gt;Lessons Learned&lt;/h2&gt;

&lt;ol style=&quot;margin: 1em 0px; padding-left: 2em;&quot;&gt;
    &lt;li style=&quot;margin-bottom: 0.5em;&quot;&gt;&lt;strong style=&quot;color: #2c3e50;&quot;&gt;Performance problems masquerade as concurrency problems&lt;/strong&gt; - Our Redis lock was correct, but it couldn&#39;t work with a job that took longer than Sidekiq&#39;s shutdown window. We couldn&#39;t distinguish between &quot;truly concurrent&quot; and &quot;legitimate retry.&quot;&lt;/li&gt;
    
    &lt;li style=&quot;margin-bottom: 0.5em;&quot;&gt;&lt;strong style=&quot;color: #2c3e50;&quot;&gt;What works at 10x doesn&#39;t work at 100x&lt;/strong&gt; - Our original implementation was fine for dozens of drivers. With thousands, it became a bottleneck that made our concurrency control unworkable.&lt;/li&gt;
    
    &lt;li style=&quot;margin-bottom: 0.5em;&quot;&gt;&lt;strong style=&quot;color: #2c3e50;&quot;&gt;Maintain the right indices&lt;/strong&gt; - Scanning complete datasets to find recent activity is a code smell. Build lightweight indices that track what you actually need.&lt;/li&gt;
    
    &lt;li style=&quot;margin-bottom: 0.5em;&quot;&gt;&lt;strong style=&quot;color: #2c3e50;&quot;&gt;Use Redis data structures wisely&lt;/strong&gt; - Sorted sets (ZSET) with time-based scores are a natural fit for &quot;recently active&quot; tracking: a score-range query returns only the members inside the time window, and a score-range delete trims the rest.&lt;/li&gt;
    
    &lt;li style=&quot;margin-bottom: 0.5em;&quot;&gt;&lt;strong style=&quot;color: #2c3e50;&quot;&gt;Measure, don&#39;t assume&lt;/strong&gt; - We didn&#39;t notice the job slowing down because it happened gradually. Better monitoring would have caught this before it became critical.&lt;/li&gt;
    
    &lt;li style=&quot;margin-bottom: 0.5em;&quot;&gt;&lt;strong style=&quot;color: #2c3e50;&quot;&gt;Fix root causes, not symptoms&lt;/strong&gt; - The Redis lock wasn&#39;t the problem—it was exactly what we needed for AWS outages. The problem was the job being too slow to work with the lock correctly.&lt;/li&gt;
&lt;/ol&gt;
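
&lt;p style=&quot;margin-bottom: 1em;&quot;&gt;One way to act on the &quot;measure&quot; lesson is to time every run and flag runs that approach the shutdown window. This is a minimal sketch under stated assumptions: &lt;code&gt;SHUTDOWN_WINDOW&lt;/code&gt;, &lt;code&gt;WARN_RATIO&lt;/code&gt;, &lt;code&gt;classify&lt;/code&gt; and &lt;code&gt;timed_run&lt;/code&gt; are hypothetical names, and a real setup would publish the duration to a metrics system instead of printing it.&lt;/p&gt;

```ruby
SHUTDOWN_WINDOW = 30.0 # seconds; the Sidekiq shutdown window described in this post
WARN_RATIO      = 0.5  # hypothetical: warn once a run uses half the window

# Classifies a run duration (in seconds) against the shutdown window.
def classify(duration)
  return :over_budget if duration >= SHUTDOWN_WINDOW
  return :warning if duration >= SHUTDOWN_WINDOW * WARN_RATIO

  :ok
end

# Times the block with a monotonic clock and returns [status, seconds].
def timed_run
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  [classify(elapsed), elapsed]
end

# The sleep is a stand-in for the real job body.
status, seconds = timed_run { sleep 0.01 }
puts "status=#{status} seconds=#{seconds.round(3)}"
```

&lt;p style=&quot;margin-bottom: 1em;&quot;&gt;Alerting on the warning state, rather than on the failure itself, is what would have surfaced the gradual slowdown before the job crossed the threshold.&lt;/p&gt;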

&lt;h2 style=&quot;border-bottom: 2px solid rgb(52, 152, 219); color: #34495e; font-size: 1.5em; margin-bottom: 0.5em; margin-top: 1.5em; padding-bottom: 0.3em;&quot;&gt;The Architecture Pattern&lt;/h2&gt;

&lt;p style=&quot;margin-bottom: 1em;&quot;&gt;This pattern of maintaining a &quot;recently active&quot; index alongside your main data structure is broadly applicable:&lt;/p&gt;

&lt;pre style=&quot;background: rgb(248, 248, 248); border-color: rgb(221, 221, 221) rgb(221, 221, 221) rgb(221, 221, 221) rgb(52, 152, 219); border-image: initial; border-left: 3px solid #3498db; border-style: solid; border-width: 1px 1px 1px 3px; border: 1px solid rgb(221, 221, 221); font-family: &amp;quot;Courier New&amp;quot;, Courier, monospace; font-size: 0.9em; line-height: 1.5; margin: 1.5em 0px; overflow-x: auto; padding: 15px;&quot;&gt;&lt;code&gt;# Pattern: Active Entity Index
# Main data: Complete history for each entity
# Index: Set of entities active in time window

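class ActivityTracker
  attr_reader :entity_type, :time_window

  def initialize(entity_type, time_window: 1.hour)
    @entity_type = entity_type
    @time_window = time_window
  end

  def record_activity(entity_id, data)
    timestamp = Time.now.to_i
    
    # Store complete data
    REDIS.zadd(&quot;#{entity_type}:#{entity_id}:history&quot;, timestamp, data)
    
    # Update active index
    REDIS.zadd(&quot;active_#{entity_type}&quot;, timestamp, entity_id)
    
    # Probabilistic cleanup: trim on roughly 1% of writes
    cleanup_old_entries if rand &amp;lt; 0.01
  end
  
  def get_recently_active(window = time_window)
    cutoff = window.ago.to_i
    REDIS.zrangebyscore(&quot;active_#{entity_type}&quot;, cutoff, &quot;+inf&quot;)
  end

  private

  # Sorted-set members cannot expire individually, so trim by score instead
  def cleanup_old_entries
    REDIS.zremrangebyscore(&quot;active_#{entity_type}&quot;, 0, time_window.ago.to_i)
  end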
end&lt;/code&gt;&lt;/pre&gt;

&lt;p style=&quot;margin-bottom: 1em;&quot;&gt;This trades a small amount of additional write overhead for massive read performance gains when you need to find &quot;what&#39;s active right now?&quot;&lt;/p&gt;

&lt;h2 style=&quot;border-bottom: 2px solid rgb(52, 152, 219); color: #34495e; font-size: 1.5em; margin-bottom: 0.5em; margin-top: 1.5em; padding-bottom: 0.3em;&quot;&gt;Conclusion&lt;/h2&gt;

&lt;p style=&quot;margin-bottom: 1em;&quot;&gt;Distributed systems problems often look like they require coordination primitives when they really require performance optimization. Our Redis lock was the right solution for preventing duplicate jobs during AWS outages—but it could only work correctly once the job was fast enough to complete before Sidekiq&#39;s shutdown timeout.&lt;/p&gt;

&lt;p style=&quot;margin-bottom: 1em;&quot;&gt;The key insight: &lt;strong style=&quot;color: #2c3e50;&quot;&gt;you can&#39;t distinguish between concurrent execution and legitimate retry if your job doesn&#39;t finish before the system kills it&lt;/strong&gt;. In testing, the active driver index made the job roughly 60× faster (from 60+ seconds to under a second), which should let our concurrency control work as designed once it is deployed.&lt;/p&gt;

&lt;p style=&quot;margin-bottom: 1em;&quot;&gt;Sometimes the best fix for a distributed systems problem isn&#39;t better coordination—it&#39;s making operations fast enough that edge cases become rare and recoverable.&lt;/p&gt;

&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://thushw.blogspot.com/feeds/1792388757843934734/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/11265228/1792388757843934734' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/11265228/posts/default/1792388757843934734'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/11265228/posts/default/1792388757843934734'/><link rel='alternate' type='text/html' href='http://thushw.blogspot.com/2025/11/when-your-fix-becomes-problem-aws.html' title='When Your Fix Becomes the Problem: A Tale of AWS Outages, Redis Flags, and Performance Scaling'/><author><name>Unknown</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-11265228.post-7740553086760774633</id><published>2025-10-26T15:43:00.000-07:00</published><updated>2025-10-26T17:52:53.662-07:00</updated><title type='text'>When AWS Went Down, Our Users Didn’t Lose Their Miles</title><content type='html'>&lt;!--Blogspot-formatted HTML post with working light diagram--&gt;

&lt;p style=&quot;text-align: center;&quot;&gt;&lt;em style=&quot;text-align: left;&quot;&gt;On Oct 20, 2025 UTC, AWS experienced a significant regional service disruption that affected several of our core components — specifically the &lt;strong&gt;EventBridge&lt;/strong&gt; scheduling layer that powers our mileage pipeline.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;For several hours, our data ingestion flow couldn’t persist new trip events from the Redis TimeSeries. This temporarily paused mileage calculations, leading to incorrect user summaries in the app.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;But here’s what didn’t happen:&lt;/strong&gt;&lt;br /&gt;
We didn’t lose a single record, and no user lost a mile.&lt;/p&gt;

&lt;hr /&gt;

&lt;h3&gt;How Our System Works&lt;/h3&gt;

&lt;p&gt;Our pipeline captures anonymized telemetry events from users’ mobile devices, processes them through AWS infrastructure, and stores aggregated trip summaries in a relational database.&lt;/p&gt;

&lt;!--✅ Light-themed data flow diagram (renders correctly in Blogspot)--&gt;
&lt;p style=&quot;text-align: center;&quot;&gt;
  &lt;img alt=&quot;Data Flow Diagram&quot; src=&quot;https://quickchart.io/graphviz?graph=digraph%20G%20%7Brankdir%3DLR%3Bnode%20%5Bshape%3Dbox%2Cstyle%3Dfilled%5D%3B%22Mobile%20App%22%20%5Bfillcolor%3D%22%23b3e5fc%22%5D%3B%22Ingestion%20API%22%20%5Bfillcolor%3D%22%23c8e6c9%22%5D%3B%22Redis%20TimeSeries%22%20%5Bfillcolor%3D%22%23fff9c4%22%5D%3B%22EventBridge%20Scheduler%22%20%5Bfillcolor%3D%22%23ffe0b2%22%5D%3BRDS%20%5Bfillcolor%3D%22%23d1c4e9%22%5D%3B%22Mobile%20App%22-%3E%22Ingestion%20API%22-%3E%22Redis%20TimeSeries%22-%3E%22EventBridge%20Scheduler%22-%3ERDS%3B%7D&quot; style=&quot;border-radius: 8px; box-shadow: rgba(0, 0, 0, 0.15) 0px 0px 6px; max-width: 800px; width: 95%;&quot; /&gt;
  &lt;br /&gt;&lt;small&gt;&lt;em&gt;Data Flow: Mobile App → Ingestion API → Redis TimeSeries → EventBridge Scheduler → RDS&lt;/em&gt;&lt;/small&gt;
&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Event flow:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Mobile App&lt;/strong&gt; captures GPS telemetry and trip start/stop events.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Ingestion API&lt;/strong&gt; authenticates and sanitizes the data before writing to &lt;strong&gt;Redis TimeSeries&lt;/strong&gt;.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Redis TimeSeries&lt;/strong&gt; stores short-term data points with fine-grained timestamps for quick replay (a background job backs up the stream to S3 for longer term storage).&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;EventBridge Scheduler&lt;/strong&gt; triggers aggregation and processing jobs every few minutes.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;RDS&lt;/strong&gt; stores validated, aggregated trip records for long-term analytics and reporting.&lt;/li&gt;
&lt;/ul&gt;

&lt;hr /&gt;

&lt;h3&gt;What Happened During the Outage&lt;/h3&gt;

&lt;p&gt;When AWS degraded services in our primary region (&lt;em&gt;us-east-1&lt;/em&gt;), the EventBridge Scheduler stopped firing events from 6:50 to 10:07 UTC, halting our pipeline smack in the middle. While we were still capturing and ingesting the user geo data, the Redis TimeSeries was not being processed, because this job was not scheduled for a little over 3 hours.&lt;/p&gt;

&lt;p&gt;Remarkably, the Redis Timeseries held up. Our Redis cloud is with Redis Enterprise, but the instances are hosted in the AWS cloud. Even though Redis Enterprise &lt;a href=&quot;https://status.redis.io/&quot;&gt;noted&lt;/a&gt; that some customers would be impacted, we did not see a significant degradation.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: center;&quot;&gt;&lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgf_V03sgxGUJdC1qLH2WqbrtL3WB7rZQbPOH3MQwwglzjkxLJTdSla1aX5tPqayTMPF6txQX-4Ze-wl9R-j5qq3WajQkinIqPf39MaFmUhaoQB1jEJTCAGbORRA7AKcwL4TKJpM_LQwOuZ2lQNUZhbzQ22B-uoHYsqBghrGFfoo2ZnzRk9-7FXfA/s1694/Screenshot%202025-10-23%20at%202.28.38%E2%80%AFPM.png&quot; style=&quot;margin-left: 1em; margin-right: 1em;&quot;&gt;&lt;img border=&quot;0&quot; data-original-height=&quot;1138&quot; data-original-width=&quot;1694&quot; height=&quot;430&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgf_V03sgxGUJdC1qLH2WqbrtL3WB7rZQbPOH3MQwwglzjkxLJTdSla1aX5tPqayTMPF6txQX-4Ze-wl9R-j5qq3WajQkinIqPf39MaFmUhaoQB1jEJTCAGbORRA7AKcwL4TKJpM_LQwOuZ2lQNUZhbzQ22B-uoHYsqBghrGFfoo2ZnzRk9-7FXfA/w640-h430/Screenshot%202025-10-23%20at%202.28.38%E2%80%AFPM.png&quot; width=&quot;640&quot; /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;Understanding the AWS architecture helps explain why this was the case.&lt;p&gt;&lt;/p&gt;&lt;p&gt;AWS separates each service into a control plane and a data plane. 
The control plane is responsible for resource monitoring, allocation, and scaling. Once resources have been allocated, the data plane takes over. The data plane is the more resilient of the two, while the control plane is known to fail.&lt;/p&gt;&lt;p&gt;Here it is, straight from the horse&#39;s mouth:&lt;/p&gt;&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: center;&quot;&gt;&lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEis6V60cJJ8ofuAEKsNDXPqP3VqVw-2pPwlt1H7ophXHuq8bg3G91NYubM6ItYxrPW9uBDUi1BBTay1zsLMc-eUEcUAJChs2HJliFxjZeYSXClaW_1Mm62n2o9wpqk2HoD3fkArJA8Zp2cWjGOiEl7kJp-rS5vbvkF8gHPOvRCF-8mYRISoMC6y9g/s2026/Screenshot%202025-10-23%20at%2012.58.03%E2%80%AFPM.png&quot; style=&quot;margin-left: 1em; margin-right: 1em;&quot;&gt;&lt;img border=&quot;0&quot; data-original-height=&quot;326&quot; data-original-width=&quot;2026&quot; height=&quot;102&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEis6V60cJJ8ofuAEKsNDXPqP3VqVw-2pPwlt1H7ophXHuq8bg3G91NYubM6ItYxrPW9uBDUi1BBTay1zsLMc-eUEcUAJChs2HJliFxjZeYSXClaW_1Mm62n2o9wpqk2HoD3fkArJA8Zp2cWjGOiEl7kJp-rS5vbvkF8gHPOvRCF-8mYRISoMC6y9g/w640-h102/Screenshot%202025-10-23%20at%2012.58.03%E2%80%AFPM.png&quot; width=&quot;640&quot; /&gt;&lt;/a&gt;&lt;/div&gt;&lt;p&gt;Since we provision our Redis cluster for expected usage, manually adding more nodes/memory as our volumes increase, we were not relying on the AWS control plane for scaling, so our instances continued to hum. We saw the same with RDS: again, it is customary to provision RDS for your needs and perform upgrades manually as traffic increases.&lt;/p&gt;&lt;blockquote style=&quot;border-left: 4px solid rgb(59, 130, 246); color: #4b5563; font-style: italic; margin: 1.5rem 0px; padding-left: 1rem;&quot;&gt;
  This was not the case for our web/job servers, which were configured with Auto Scaling rules. We had set a lower limit on the number of machines for each cluster, and we were running hard on this reduced capacity until recovery.
&lt;/blockquote&gt;&lt;p&gt;When services recovered, we started processing events from the TimeSeries, creating trips for users. But since we generate incremental trips for the last few minutes, we were still missing trips for the last 3 hours and 7 minutes.&lt;/p&gt;&lt;p&gt;We could easily tell how many trips we missed, as we track this closely using a CloudWatch metric. Each bar shows a completion of the job that is responsible for incrementally processing the TimeSeries.&lt;/p&gt;&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: center;&quot;&gt;&lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg9j6cxXZ-aRsAiapCn9gkxboiN2fBXKZSXLV4MMsGqaUT2yq8qdAUfrIJ1_6WCWFWCoJIz5_RIWvuc8AipKf7yf_FE_tWJgu-xgWBrpAht2rH0WlkeoVSatiK9C2W9dnHpxaSPcBwmpMIkuq7efms2XHXshUyJCqBUp4_62napSCLW5SDGioCi6A/s2912/Screenshot%202025-10-23%20at%201.08.11%E2%80%AFPM.png&quot; style=&quot;margin-left: 1em; margin-right: 1em;&quot;&gt;&lt;img border=&quot;0&quot; data-original-height=&quot;208&quot; data-original-width=&quot;2912&quot; height=&quot;46&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg9j6cxXZ-aRsAiapCn9gkxboiN2fBXKZSXLV4MMsGqaUT2yq8qdAUfrIJ1_6WCWFWCoJIz5_RIWvuc8AipKf7yf_FE_tWJgu-xgWBrpAht2rH0WlkeoVSatiK9C2W9dnHpxaSPcBwmpMIkuq7efms2XHXshUyJCqBUp4_62napSCLW5SDGioCi6A/w640-h46/Screenshot%202025-10-23%20at%201.08.11%E2%80%AFPM.png&quot; width=&quot;640&quot; /&gt;&lt;/a&gt;&lt;/div&gt;&lt;p&gt;&amp;nbsp;&amp;nbsp;&lt;/p&gt;&lt;p&gt;When services recovered, EventBridge Scheduler fired all the events in the backlog.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: center;&quot;&gt;&lt;a 
href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhwOUDK6s2Uj_US3k-jbbu1hrwrFqJrjCagnvwxKJ0ufqIZStA5aL6Ty7O3ZaqwcqkVwDEDtbyu9ExHwIoXObi-NHyA0ZFB6n42kfhbJkD59wpkXcxJ1jI6_CNKS5-cAYYzzQgXvp6OyqgM27dn9jl7hiM1bYnJotngVidgcXSBajXs1Jl3d6HWbw/s2902/Screenshot%202025-10-23%20at%202.30.12%E2%80%AFPM.png&quot; style=&quot;margin-left: 1em; margin-right: 1em;&quot;&gt;&lt;img border=&quot;0&quot; data-original-height=&quot;220&quot; data-original-width=&quot;2902&quot; height=&quot;48&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhwOUDK6s2Uj_US3k-jbbu1hrwrFqJrjCagnvwxKJ0ufqIZStA5aL6Ty7O3ZaqwcqkVwDEDtbyu9ExHwIoXObi-NHyA0ZFB6n42kfhbJkD59wpkXcxJ1jI6_CNKS5-cAYYzzQgXvp6OyqgM27dn9jl7hiM1bYnJotngVidgcXSBajXs1Jl3d6HWbw/w640-h48/Screenshot%202025-10-23%20at%202.30.12%E2%80%AFPM.png&quot; width=&quot;640&quot; /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;p&gt;This caused a different problem as our trips processor was designed to handle the time series data in real time. We did not anticipate serving more than a single event during a ten minute window. So we got 21 late-fired triggers but effectively could process just one, for the last ten minutes. More on this later!&lt;/p&gt;

&lt;p&gt;The critical task was to update the user data for the missing three hours. I had written a script to patch trips, which I had used earlier for a less severe outage (5 minutes). With some minor modifications to account for the partial data towards the tail, I was able to correct mileage for all our users who happened to be driving during the outage (luckily they were not in self-driving cars powered by AWS. OK, bad joke).&lt;/p&gt;&lt;p&gt;There was still something I couldn&#39;t explain: CloudWatch told me that about 2,000 jobs had completed after jobs started flowing again. I expected 21 jobs, so I was puzzled by the much larger volume that ran at the tail. What accounted for that, and would it cause a different type of miscalculation? Indeed, some interesting things did take place with those 21 EventBridge triggers. Let me explain.&lt;/p&gt;&lt;p&gt;When a trigger fires on the tenth minute, we launch a job for each user who has likely been driving recently. These jobs run concurrently, and we need to track the completion of the last job to know that all users have been processed and the window can be marked &quot;complete&quot;.&lt;/p&gt;&lt;p&gt;This is done with a Redis Set that keeps track of users who are still being processed. 
So when the trigger fires, it first determines all recent drivers, adds them to the set, before spawning a worker per user.&amp;nbsp; Then each worker removes an element from the set, and if it is the last item, notifies the completion of the run.&lt;/p&gt;&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: center;&quot;&gt;&lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEik7Th_o-LVJ47r6w_kV8VgfVPA1fylIPFxW0SoxeEx2zsEc3RwCIwrKsTx0uDgM2j8375iz2i8HWbSRpXR3nnRKmsufk6t2btE3l7M5tlfEhBMxHGTEYH91vhL8D4m1T6Bt92Uqf17Y60FzptRGxeHiMRpskLCfA1mbBS_vBCtpBJlpBYrSB-b2Q/s1490/Screenshot%202025-10-26%20at%202.49.15%E2%80%AFPM.png&quot; style=&quot;margin-left: 1em; margin-right: 1em;&quot;&gt;&lt;img border=&quot;0&quot; data-original-height=&quot;614&quot; data-original-width=&quot;1490&quot; height=&quot;264&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEik7Th_o-LVJ47r6w_kV8VgfVPA1fylIPFxW0SoxeEx2zsEc3RwCIwrKsTx0uDgM2j8375iz2i8HWbSRpXR3nnRKmsufk6t2btE3l7M5tlfEhBMxHGTEYH91vhL8D4m1T6Bt92Uqf17Y60FzptRGxeHiMRpskLCfA1mbBS_vBCtpBJlpBYrSB-b2Q/w640-h264/Screenshot%202025-10-26%20at%202.49.15%E2%80%AFPM.png&quot; width=&quot;640&quot; /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: center;&quot;&gt;&lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh4LRjYiiLfFQE25mPfcbsjml1MZ7nEis6SfP29NgvRqcpTMgxfLxL4QkoBXQfbE1Wf9wo4xGbtI2dfKq4nSdJ13SozQtavS4NC2G8jF_hjluQFtQJs37NizN7NTYpIhqBW_9gSLRcm9CqQ1Te4NMcW711NUYNZjlMnsEWCto_WMs98vzAJAtDXfw/s1346/Screenshot%202025-10-26%20at%202.49.30%E2%80%AFPM.png&quot; style=&quot;margin-left: 1em; margin-right: 1em;&quot;&gt;&lt;img border=&quot;0&quot; data-original-height=&quot;512&quot; data-original-width=&quot;1346&quot; height=&quot;244&quot; 
src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh4LRjYiiLfFQE25mPfcbsjml1MZ7nEis6SfP29NgvRqcpTMgxfLxL4QkoBXQfbE1Wf9wo4xGbtI2dfKq4nSdJ13SozQtavS4NC2G8jF_hjluQFtQJs37NizN7NTYpIhqBW_9gSLRcm9CqQ1Te4NMcW711NUYNZjlMnsEWCto_WMs98vzAJAtDXfw/w640-h244/Screenshot%202025-10-26%20at%202.49.30%E2%80%AFPM.png&quot; width=&quot;640&quot; /&gt;&lt;/a&gt;&lt;/div&gt;&lt;p&gt;When 21 triggers fired in rapid succession, each of them forked a job per user, so many workers raced to process the same user for the same window, and many workers hit an already-empty set. That accounts for the much larger job count, and of course it meant that these jobs &quot;up-counted&quot; miles for the drivers in that time window.&lt;/p&gt;&lt;p&gt;So the last data cleanup was to figure out where we had added extra miles for users at the end of the outage. I first thought this might be really hard, as we had already updated these records, which then kept getting updated further as the users kept driving. Fortunately, we store both start and end times for each trip in the record, so it was possible to compute the miles driven in this specific range for each user from the raw time series data.&lt;/p&gt;&lt;blockquote style=&quot;border-left: 4px solid rgb(59, 130, 246); color: #4b5563; font-style: italic; margin: 1.5rem 0px; padding-left: 1rem;&quot;&gt;
  To verify that we had been up-counting, I queried the records in descending order of speed. I saw speeds of 600 mph, which confirmed the hypothesis quite fast (no pun intended).
&lt;/blockquote&gt;&lt;p&gt;I could re-use a method from the earlier script for patching the data, and write a bit of code for the update. So finally, after a very long day, our users&#39; data was fully corrected.&lt;/p&gt;&lt;h3&gt;Improvements made:&lt;/h3&gt;&lt;p&gt;&lt;/p&gt;
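&lt;p&gt;A minimal sketch of the single-runner lock this section describes (Redis SET with NX and a TTL below the ten-minute window). It is an illustration under assumptions, not our production code: &lt;code&gt;FakeRedis&lt;/code&gt; is an in-memory stand-in so the example runs without a server, and the key name &lt;code&gt;trips:window-lock&lt;/code&gt; and the helpers &lt;code&gt;acquire_window_lock&lt;/code&gt; and &lt;code&gt;handle_trigger&lt;/code&gt; are hypothetical names chosen for the example.&lt;/p&gt;

```ruby
# In-memory stand-in for the one Redis call used here, so the sketch runs
# without a server. With the redis-rb gem the equivalent call is
# redis.set(key, value, nx: true, ex: seconds), which returns true or false.
class FakeRedis
  def initialize(clock)
    @clock = clock # callable returning the current Time (keeps the demo deterministic)
    @store = {}    # key => [value, expires_at]
  end

  # Writes the key and returns true. With nx: true it returns false instead
  # when an unexpired value is already present.
  def set(key, value, nx: false, ex: nil)
    now = @clock.call
    blocked = nx ? live?(key, now) : false
    return false if blocked

    @store[key] = [value, ex ? now + ex : nil]
    true
  end

  private

  def live?(key, now)
    return false unless @store.key?(key)

    expires_at = @store[key][1]
    expires_at.nil? || expires_at > now
  end
end

WINDOW_SECONDS = 600   # ten-minute trigger interval
LOCK_TTL_SECONDS = 420 # 7 minutes, so the lock expires before the next trigger

# Hypothetical helper: true only for the first caller within the TTL.
def acquire_window_lock(redis, owner)
  redis.set("trips:window-lock", owner, nx: true, ex: LOCK_TTL_SECONDS)
end

# Hypothetical trigger handler: fan out per-user jobs only if the lock was won.
def handle_trigger(redis, trigger_id, recent_drivers)
  return [] unless acquire_window_lock(redis, trigger_id)

  recent_drivers.map { |user_id| "job:#{user_id}" } # stand-in for spawning workers
end

# Simulate 21 late triggers firing at the same instant, then the next window.
now = Time.at(0)
redis = FakeRedis.new(-> { now })
drivers = %w[u1 u2 u3]

burst = (1..21).map { |i| handle_trigger(redis, "trigger-#{i}", drivers) }
puts "jobs spawned by the burst: #{burst.flatten.size}"

now += WINDOW_SECONDS
puts "jobs spawned in the next window: #{handle_trigger(redis, 'trigger-22', drivers).size}"
```

&lt;p&gt;In this sketch, triggers that lose the race return without fanning out, so a burst of late triggers produces one batch of jobs instead of one batch per trigger.&lt;/p&gt;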

&lt;p style=&quot;text-align: left;&quot;&gt;We are improving how we handle the tail of an outage going forward. The idea is to allow at most one processor to run in a given time window. This can be done with a Redis SET command with the &quot;NX&quot; option, which sets a flag (the lock) only if it is not already set, guaranteeing that only a single process can acquire the lock. We set the TTL below the ten-minute window (7 minutes in this case) so that the lock expires naturally before the next trigger.&lt;/p&gt;&lt;h3&gt;Our Approach: Fairness and Fidelity&lt;/h3&gt;

&lt;p&gt;Our principle is simple: &lt;strong&gt;if you drove it, we count it.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;We don’t approximate, and we don’t drop data due to transient infrastructure issues. Each pipeline component is designed for &lt;strong&gt;eventual consistency&lt;/strong&gt; and &lt;strong&gt;idempotent replay&lt;/strong&gt;, so every record can be reconstructed safely and accurately.&lt;/p&gt;

&lt;hr /&gt;

&lt;h3&gt;What We’re Building Next&lt;/h3&gt;

&lt;p&gt;Resilience isn’t just uptime — it’s graceful recovery. We’re implementing several next steps to strengthen this architecture:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Buffer device data:&lt;/strong&gt; As users go through areas with low mobile reception, we want to buffer the location data and deliver it when reception improves.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Adjust inaccurate device signal:&lt;/strong&gt; Use techniques like Kalman filtering to correct the location when the device-reported accuracy is low.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;De-couple RDS from real-time updates:&lt;/strong&gt; We will store running trips in Redis and archive them to RDS later. This keeps us running when RDS is unresponsive, as we only need it at the later archiving step.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Monitoring for anomalies:&lt;/strong&gt;&amp;nbsp;Add speed as a tracked metric and alert over 200 mph.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Chaos testing &amp;amp; fault injection:&lt;/strong&gt; Monthly simulated outages to validate that our recovery flow remains reliable.&lt;/li&gt;
&lt;/ul&gt;
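&lt;p&gt;The speed check in the list above could look roughly like the sketch below. The 200 mph threshold comes from the list; the &lt;code&gt;Trip&lt;/code&gt; struct, its field names, and the helpers &lt;code&gt;average_mph&lt;/code&gt; and &lt;code&gt;suspicious_trips&lt;/code&gt; are hypothetical and only serve the illustration.&lt;/p&gt;

```ruby
MAX_PLAUSIBLE_MPH = 200.0 # alert threshold named in the list above

# Hypothetical trip shape: miles driven plus start and end times.
Trip = Struct.new(:user_id, :miles, :started_at, :ended_at)

# Average speed in miles per hour; nil when the duration is not positive.
def average_mph(trip)
  hours = (trip.ended_at - trip.started_at) / 3600.0
  return nil unless hours > 0

  trip.miles / hours
end

# Trips whose average speed exceeds the limit, fastest first.
def suspicious_trips(trips, limit: MAX_PLAUSIBLE_MPH)
  trips
    .map { |trip| [trip, average_mph(trip)] }
    .reject { |_, mph| mph.nil? }
    .select { |_, mph| mph > limit }
    .sort_by { |_, mph| -mph }
    .map { |trip, _| trip }
end

t0 = Time.at(0)
trips = [
  Trip.new("u1", 5.0, t0, t0 + 600),   # a plausible ten-minute drive
  Trip.new("u2", 100.0, t0, t0 + 600)  # far too many miles for ten minutes
]

suspicious_trips(trips).each do |trip|
  puts "#{trip.user_id}: #{average_mph(trip).round} mph exceeds #{MAX_PLAUSIBLE_MPH} mph"
end
```

&lt;p&gt;Sorting fastest first mirrors the query we used to confirm the up-counting, where the worst records surfaced at the top.&lt;/p&gt;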

&lt;hr /&gt;

&lt;h3&gt;What It Means for Our Users&lt;/h3&gt;

&lt;p&gt;When something breaks in the cloud, we don’t panic — we verify, replay, and reconcile.&lt;br /&gt;
Because behind every data point is a real person trusting us to get it right.&lt;/p&gt;

&lt;p&gt;Outages happen, but trust shouldn’t. And that’s what we’re building for.&lt;/p&gt;

&lt;hr /&gt;

&lt;p style=&quot;color: #777777; font-size: 0.9em;&quot;&gt;&lt;em&gt;Posted by Thushara Wijeratna, Head of Eng at WorkSolo&lt;/em&gt;&lt;/p&gt;
</content><link rel='replies' type='application/atom+xml' href='http://thushw.blogspot.com/feeds/7740553086760774633/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/11265228/7740553086760774633' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/11265228/posts/default/7740553086760774633'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/11265228/posts/default/7740553086760774633'/><link rel='alternate' type='text/html' href='http://thushw.blogspot.com/2025/10/when-aws-went-down-our-users-didnt-lose.html' title='When AWS Went Down, Our Users Didn’t Lose Their Miles'/><author><name>Unknown</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgf_V03sgxGUJdC1qLH2WqbrtL3WB7rZQbPOH3MQwwglzjkxLJTdSla1aX5tPqayTMPF6txQX-4Ze-wl9R-j5qq3WajQkinIqPf39MaFmUhaoQB1jEJTCAGbORRA7AKcwL4TKJpM_LQwOuZ2lQNUZhbzQ22B-uoHYsqBghrGFfoo2ZnzRk9-7FXfA/s72-w640-h430-c/Screenshot%202025-10-23%20at%202.28.38%E2%80%AFPM.png" height="72" width="72"/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-11265228.post-5894031168028601760</id><published>2025-10-16T15:39:00.000-07:00</published><updated>2025-10-16T17:21:10.934-07:00</updated><title type='text'>Rails 8: - can&#39;t cast RSpec::Mocks::Double</title><content type='html'>
&lt;p&gt;&amp;nbsp;One of the first unit test failures I encountered on a Rails 8.0 upgrade was:&lt;/p&gt;&lt;p&gt;&lt;span&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;&lt;/span&gt;&lt;code&gt;can&#39;t cast RSpec::Mocks::Double&lt;/code&gt;&lt;/p&gt;&lt;p&gt;The error happens on saving an ActiveRecord object to the database.&lt;/p&gt;&lt;div style=&quot;background-color: #1e1e1e; color: #d4d4d4; font-family: Menlo, Monaco, &amp;quot;Courier New&amp;quot;, monospace; font-size: 12px; line-height: 18px; white-space: pre;&quot;&gt;&lt;div style=&quot;line-height: 18px;&quot;&gt;&lt;span style=&quot;color: #6a9955;&quot;&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div style=&quot;line-height: 18px;&quot;&gt;&lt;span style=&quot;color: #6a9955;&quot;&gt;#  connected                   :boolean          default(FALSE)&lt;/span&gt;&lt;/div&gt;&lt;div style=&quot;line-height: 18px;&quot;&gt;&lt;div style=&quot;line-height: 18px;&quot;&gt;&lt;span style=&quot;color: #6a9955;&quot;&gt;#  last_connected_at           :datetime&lt;/span&gt;&lt;/div&gt;&lt;div style=&quot;line-height: 18px;&quot;&gt;&lt;span style=&quot;color: #6a9955;&quot;&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;div style=&quot;background-color: #1e1e1e; color: #d4d4d4; font-family: Menlo, Monaco, &amp;quot;Courier New&amp;quot;, monospace; font-size: 12px; line-height: 18px; white-space: pre;&quot;&gt;&lt;span style=&quot;color: #569cd6;&quot;&gt;class&lt;/span&gt; LymoAccount &amp;lt; ApplicationRecord&lt;/div&gt;&lt;div style=&quot;background-color: #1e1e1e; color: #d4d4d4; font-family: Menlo, Monaco, &amp;quot;Courier New&amp;quot;, monospace; font-size: 12px; line-height: 18px; white-space: pre;&quot;&gt;  after_commit &lt;span style=&quot;color: #569cd6;&quot;&gt;:update_last_connected_at&lt;/span&gt;, &lt;span style=&quot;color: #569cd6;&quot;&gt;if:&lt;/span&gt; &lt;span style=&quot;color: #569cd6;&quot;&gt;:saved_change_to_connected?&lt;/span&gt;&lt;/div&gt;&lt;div 
style=&quot;background-color: #1e1e1e; color: #d4d4d4; font-family: Menlo, Monaco, &amp;quot;Courier New&amp;quot;, monospace; font-size: 12px; line-height: 18px; white-space: pre;&quot;&gt;&lt;span style=&quot;color: #569cd6;&quot;&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div style=&quot;background-color: #1e1e1e; color: #d4d4d4; font-family: Menlo, Monaco, &amp;quot;Courier New&amp;quot;, monospace; font-size: 12px; line-height: 18px; white-space: pre;&quot;&gt;&lt;div&gt;  &lt;span style=&quot;color: #569cd6;&quot;&gt;def&lt;/span&gt; update_last_connected_at&lt;/div&gt;&lt;div&gt;    update!(&lt;span style=&quot;color: #569cd6;&quot;&gt;last_connected_at:&lt;/span&gt; Time.current) &lt;span style=&quot;color: #569cd6;&quot;&gt;if&lt;/span&gt; connected?&lt;/div&gt;&lt;div&gt;  &lt;span style=&quot;color: #569cd6;&quot;&gt;end&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;color: #569cd6;&quot;&gt;end&lt;/span&gt;&lt;/div&gt;&lt;br /&gt;&lt;/div&gt;&lt;p&gt;It turns out this is related to stricter validation in Active Record, which refuses to save a mock object.&lt;/p&gt;&lt;p&gt;Generally speaking, you should be using FactoryBot methods to create real ActiveRecord objects in your unit tests, and we were. So I was puzzled by this error, as we did not appear to be storing anything that was a mock.&lt;/p&gt;&lt;p&gt;ChatGPT got pretty confused as well. Because the exception was thrown from an &lt;code&gt;after_commit&lt;/code&gt; hook, it assumed that attributes already set on the model were being re-serialized and that this was causing the issue.&lt;/p&gt;&lt;p&gt;We went through all the attributes of the record, examining the class of each, and none of them was a mock.&lt;/p&gt;&lt;p&gt;This was the point when I gave up on Gen AI and took another look at the model.&lt;/p&gt;&lt;p&gt;I quickly ruled out any involvement of the &lt;code&gt;connected&lt;/code&gt; attribute that we are checking, by updating the test to save without the check. 
It didn&#39;t help, so I knew that the update itself was throwing.&lt;/p&gt;&lt;p&gt;Then I wondered if updating &lt;b&gt;any&lt;/b&gt; column would throw the same error.&lt;/p&gt;&lt;p&gt;I updated a text column, and it worked. Progress. Now I knew this had something to do with either:&lt;/p&gt;&lt;p&gt;1. the specific column: maybe it has some validation that is misbehaving&lt;/p&gt;&lt;p&gt;2. the column type: maybe datetime columns behave differently on update&lt;/p&gt;&lt;p&gt;Luckily there were a few other datetime columns, and updating any one of them with &lt;code&gt;Time.current&lt;/code&gt; threw the same error. We were getting somewhere.&lt;/p&gt;&lt;p&gt;On a hunch that this had to do with &lt;code&gt;Time.current&lt;/code&gt;, I took a look at the unit test to see if Time was being mocked. Normally we use Timecop as the best practice, but it was always possible that an engineer had decided to be &quot;creative&quot;.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;This is what I found in the unit test:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;
  let(:data_sync_at) { Time.current.to_i }
  before do
    allow(Time).to receive_message_chain(:current, :to_i).and_return(data_sync_at)
  end
&lt;/code&gt;&lt;/pre&gt;  
  So what this does is stub the intermediate &lt;code&gt;Time.current&lt;/code&gt; call, and it is not that obvious! So when we used &lt;code&gt;Time.current&lt;/code&gt; to update the record, it tried to use a Mock object and failed.&lt;p&gt;A simple &lt;code&gt;Timecop.travel(data_sync_at)&lt;/code&gt; fixed the issue.&lt;/p&gt;&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: center;&quot;&gt;&lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiAF__JNPzESOZvUGtXQGPRKBtajrOpGavyzSrO1WRlIpLv6RJJ6AO9KCpcFJxulaBuL0-rGHUwdwS_X1ApshxG2a0fLqiKxKwg79HysS1JVmSq6LR0TMlBYIC-js9jRjyK4lEpWjltGKWk4l2vwoxxtsh3R_Ivb4Cdna7NwhyphenhyphenZfTOiSSwGlx1xJQ/s2986/Screenshot%202025-10-16%20at%203.31.45%E2%80%AFPM.png&quot; style=&quot;margin-left: 1em; margin-right: 1em;&quot;&gt;&lt;img border=&quot;0&quot; data-original-height=&quot;500&quot; data-original-width=&quot;2986&quot; height=&quot;109&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiAF__JNPzESOZvUGtXQGPRKBtajrOpGavyzSrO1WRlIpLv6RJJ6AO9KCpcFJxulaBuL0-rGHUwdwS_X1ApshxG2a0fLqiKxKwg79HysS1JVmSq6LR0TMlBYIC-js9jRjyK4lEpWjltGKWk4l2vwoxxtsh3R_Ivb4Cdna7NwhyphenhyphenZfTOiSSwGlx1xJQ/w640-h109/Screenshot%202025-10-16%20at%203.31.45%E2%80%AFPM.png&quot; width=&quot;640&quot; /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;p&gt;Here is the Claude transcript, which happened later, when I was better able to guide the conversation along a more productive route!&lt;/p&gt;
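&lt;p&gt;Before the transcript, here is a plain-Ruby sketch of why the stub above breaks the save. It is a hand-rolled analogue, not RSpec&#39;s implementation: &lt;code&gt;Clock&lt;/code&gt; stands in for &lt;code&gt;Time.current&lt;/code&gt; so the example needs no gems, &lt;code&gt;cast_datetime&lt;/code&gt; is a hypothetical stand-in for a datetime column that only accepts &lt;code&gt;Time&lt;/code&gt; values, and &lt;code&gt;stub_current_to_i&lt;/code&gt; and &lt;code&gt;freeze_current_at&lt;/code&gt; are illustrative helper names.&lt;/p&gt;

```ruby
# Stand-in for ActiveSupport's Time.current, so the sketch needs no gems.
module Clock
  def self.current
    Time.now
  end
end

# Roughly what a message-chain stub on (:current, :to_i) amounts to:
# `current` is replaced by a method returning a placeholder object,
# and only `to_i` is defined on that placeholder.
def stub_current_to_i(returning:)
  placeholder = Object.new
  placeholder.define_singleton_method(:to_i) { returning }
  Clock.define_singleton_method(:current) { placeholder }
end

# Analogue of freezing time: `current` still returns a real Time.
def freeze_current_at(epoch_seconds)
  frozen = Time.at(epoch_seconds)
  Clock.define_singleton_method(:current) { frozen }
end

# Hypothetical stand-in for a datetime column that only accepts Time values.
def cast_datetime(value)
  raise TypeError, "can't cast #{value.class}" unless value.is_a?(Time)

  value
end

stub_current_to_i(returning: 1_760_000_000)
puts Clock.current.to_i        # the chained call works, as the test intended
puts Clock.current.is_a?(Time) # but the intermediate object is not a Time
begin
  cast_datetime(Clock.current) # what the model callback effectively does
rescue TypeError => e
  puts "save failed: #{e.message}"
end

freeze_current_at(1_760_000_000)
puts cast_datetime(Clock.current).to_i # a real Time casts without error
```

&lt;p&gt;The point of the second half is that any approach which keeps &lt;code&gt;current&lt;/code&gt; returning a real &lt;code&gt;Time&lt;/code&gt; lets both the test&#39;s &lt;code&gt;to_i&lt;/code&gt; call and the model&#39;s update work.&lt;/p&gt;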

&lt;div style=&quot;max-width: 900px; margin: 20px auto; font-family: Arial, sans-serif; line-height: 1.6; color: #333;&quot;&gt;

&lt;h1 style=&quot;color: #2c3e50; margin-bottom: 30px;&quot;&gt;Debugging Rails RSpec Error: RSpec::Mocks::Double Casting Issue&lt;/h1&gt;

&lt;div style=&quot;margin-bottom: 40px; padding-bottom: 20px; border-bottom: 1px solid #e0e0e0;&quot;&gt;
&lt;div style=&quot;background-color: #4a90e2; color: white; padding: 14px 16px; border-radius: 8px; margin: 12px 0;&quot;&gt;
&lt;strong&gt;Q: On a Rails model, why would this code throw an error on a unit test saying: can&#39;t cast RSpec::Mocks::Double?&lt;/strong&gt;
&lt;pre style=&quot;background-color: #2c3e50; color: #ecf0f1; padding: 16px; border-radius: 6px; overflow-x: auto; font-family: &#39;Monaco&#39;, &#39;Courier New&#39;, monospace; font-size: 13px; margin-top: 10px;&quot;&gt;def update_last_connected_at
    update!(last_connected_at: Time.current) if connected?
end&lt;/pre&gt;
&lt;/div&gt;

&lt;div style=&quot;background-color: #f0f0f0; color: #333; padding: 14px 16px; border-radius: 8px; margin: 12px 0; border-left: 4px solid #4a90e2;&quot;&gt;
&lt;strong&gt;Answer:&lt;/strong&gt; The error occurs because &lt;code&gt;connected?&lt;/code&gt; is returning an RSpec mock double instead of a boolean value, and Rails is trying to cast it to a boolean in the &lt;code&gt;if&lt;/code&gt; conditional.

&lt;p&gt;&lt;strong&gt;Common causes:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Unmocked association&lt;/strong&gt; — &lt;code&gt;connected?&lt;/code&gt; might depend on an association that&#39;s mocked but not configured properly&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Missing stub&lt;/strong&gt; — The &lt;code&gt;connected?&lt;/code&gt; method itself might be mocked but not given a return value&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Solutions:&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;1. Stub the method explicitly with a boolean:&lt;/p&gt;
&lt;pre style=&quot;background-color: #2c3e50; color: #ecf0f1; padding: 16px; border-radius: 6px; overflow-x: auto; font-family: &#39;Monaco&#39;, &#39;Courier New&#39;, monospace; font-size: 13px; margin-top: 10px;&quot;&gt;allow(model).to receive(:connected?).and_return(true)&lt;/pre&gt;

&lt;p&gt;2. Build a proper test object instead of relying on mocks:&lt;/p&gt;
&lt;pre style=&quot;background-color: #2c3e50; color: #ecf0f1; padding: 16px; border-radius: 6px; overflow-x: auto; font-family: &#39;Monaco&#39;, &#39;Courier New&#39;, monospace; font-size: 13px; margin-top: 10px;&quot;&gt;let(:model) { build(:your_model, connected: true) }&lt;/pre&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;div style=&quot;margin-bottom: 40px; padding-bottom: 20px; border-bottom: 1px solid #e0e0e0;&quot;&gt;
&lt;div style=&quot;background-color: #4a90e2; color: white; padding: 14px 16px; border-radius: 8px; margin: 12px 0;&quot;&gt;
&lt;strong&gt;Q: Happens without connected?&lt;/strong&gt;
&lt;/div&gt;

&lt;div style=&quot;background-color: #f0f0f0; color: #333; padding: 14px 16px; border-radius: 8px; margin: 12px 0; border-left: 4px solid #4a90e2;&quot;&gt;
&lt;strong&gt;Answer:&lt;/strong&gt; If you&#39;re getting the error without even calling &lt;code&gt;connected?&lt;/code&gt;, the problem is likely that &lt;code&gt;update!&lt;/code&gt; itself is being called on a mock.

&lt;p&gt;&lt;strong&gt;Most common cause:&lt;/strong&gt; You&#39;re probably testing this method on a mocked or partially mocked model instance. When you stub the model without fully setting it up, calling &lt;code&gt;update!&lt;/code&gt; on it returns a double instead of actually updating the record.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Solutions:&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;1. Use &lt;code&gt;create&lt;/code&gt; or &lt;code&gt;build&lt;/code&gt; instead of &lt;code&gt;double&lt;/code&gt;:&lt;/p&gt;
&lt;pre style=&quot;background-color: #2c3e50; color: #ecf0f1; padding: 16px; border-radius: 6px; overflow-x: auto; font-family: &#39;Monaco&#39;, &#39;Courier New&#39;, monospace; font-size: 13px; margin-top: 10px;&quot;&gt;let(:model) { create(:your_model) }
model.update_last_connected_at&lt;/pre&gt;

&lt;p&gt;2. If you must use a double, stub &lt;code&gt;update!&lt;/code&gt;:&lt;/p&gt;
&lt;pre style=&quot;background-color: #2c3e50; color: #ecf0f1; padding: 16px; border-radius: 6px; overflow-x: auto; font-family: &#39;Monaco&#39;, &#39;Courier New&#39;, monospace; font-size: 13px; margin-top: 10px;&quot;&gt;let(:model) { double(:your_model) }
allow(model).to receive(:update!).and_return(true)&lt;/pre&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;div style=&quot;margin-bottom: 40px; padding-bottom: 20px; border-bottom: 1px solid #e0e0e0;&quot;&gt;
&lt;div style=&quot;background-color: #4a90e2; color: white; padding: 14px 16px; border-radius: 8px; margin: 12px 0;&quot;&gt;
&lt;strong&gt;Q: Model is not stubbed, it is a FactoryBot created one&lt;/strong&gt;
&lt;/div&gt;

&lt;div style=&quot;background-color: #f0f0f0; color: #333; padding: 14px 16px; border-radius: 8px; margin: 12px 0; border-left: 4px solid #4a90e2;&quot;&gt;
&lt;strong&gt;Answer:&lt;/strong&gt; If it&#39;s a real FactoryBot instance, then the issue is different. The error suggests that &lt;code&gt;update!&lt;/code&gt; is being called but something in the conditional logic is going wrong.

&lt;p&gt;&lt;strong&gt;Most likely causes:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;connected?&lt;/code&gt; method doesn&#39;t exist or returns a mock&lt;/li&gt;
&lt;li&gt;The model instance itself is getting wrapped in a double&lt;/li&gt;
&lt;li&gt;Database transaction/rollback issues&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Debug steps:&lt;/strong&gt; Add some debugging to see what&#39;s actually happening:&lt;/p&gt;
&lt;pre style=&quot;background-color: #2c3e50; color: #ecf0f1; padding: 16px; border-radius: 6px; overflow-x: auto; font-family: &#39;Monaco&#39;, &#39;Courier New&#39;, monospace; font-size: 13px; margin-top: 10px;&quot;&gt;def update_last_connected_at
    puts &quot;connected? returns: #{connected?.inspect}&quot;
    puts &quot;connected?.class: #{connected?.class}&quot;
    update!(last_connected_at: Time.current) if connected?
end&lt;/pre&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;div style=&quot;margin-bottom: 40px; padding-bottom: 20px; border-bottom: 1px solid #e0e0e0;&quot;&gt;
&lt;div style=&quot;background-color: #4a90e2; color: white; padding: 14px 16px; border-radius: 8px; margin: 12px 0;&quot;&gt;
&lt;strong&gt;Q: Connected is a column in the table (model). Here&#39;s my test and model:&lt;/strong&gt;
&lt;pre style=&quot;background-color: #2c3e50; color: #ecf0f1; padding: 16px; border-radius: 6px; overflow-x: auto; font-family: &#39;Monaco&#39;, &#39;Courier New&#39;, monospace; font-size: 13px; margin-top: 10px;&quot;&gt;let(:data_sync_at) { Time.current.to_i }
allow(Time).to receive_message_chain(:current, :to_i).and_return(data_sync_at)
// call method that ends up saving the record

# Model:
after_commit :update_last_connected_at

def update_last_connected_at     
  update!(last_connected_at: Time.current)   
end&lt;/pre&gt;
&lt;/div&gt;

&lt;div style=&quot;background-color: #f0f0f0; color: #333; padding: 14px 16px; border-radius: 8px; margin: 12px 0; border-left: 4px solid #4a90e2;&quot;&gt;
&lt;strong&gt;Answer:&lt;/strong&gt; Ah, there&#39;s the issue! You have an infinite loop of callbacks:

&lt;ol&gt;
&lt;li&gt;&lt;code&gt;update!&lt;/code&gt; is called in &lt;code&gt;update_last_connected_at&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;update!&lt;/code&gt; triggers &lt;code&gt;after_commit&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;after_commit&lt;/code&gt; calls &lt;code&gt;update_last_connected_at&lt;/code&gt; again&lt;/li&gt;
&lt;li&gt;Which calls &lt;code&gt;update!&lt;/code&gt; again... and so on&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This creates a recursive callback that RSpec is having trouble with, and the mock you set up on &lt;code&gt;Time.current&lt;/code&gt; might be interfering.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The real problem:&lt;/strong&gt; You&#39;re missing the &lt;code&gt;connected?&lt;/code&gt; check that should prevent this recursion:&lt;/p&gt;
&lt;pre style=&quot;background-color: #2c3e50; color: #ecf0f1; padding: 16px; border-radius: 6px; overflow-x: auto; font-family: &#39;Monaco&#39;, &#39;Courier New&#39;, monospace; font-size: 13px; margin-top: 10px;&quot;&gt;def update_last_connected_at
    update!(last_connected_at: Time.current) if connected?
end&lt;/pre&gt;

&lt;p&gt;&lt;strong&gt;Or use &lt;code&gt;update_column&lt;/code&gt; instead&lt;/strong&gt; (skips callbacks):&lt;/p&gt;
&lt;pre style=&quot;background-color: #2c3e50; color: #ecf0f1; padding: 16px; border-radius: 6px; overflow-x: auto; font-family: &#39;Monaco&#39;, &#39;Courier New&#39;, monospace; font-size: 13px; margin-top: 10px;&quot;&gt;def update_last_connected_at
    update_column(:last_connected_at, Time.current) if connected?
end&lt;/pre&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;div style=&quot;margin-bottom: 40px; padding-bottom: 20px;&quot;&gt;
&lt;div style=&quot;background-color: #4a90e2; color: white; padding: 14px 16px; border-radius: 8px; margin: 12px 0;&quot;&gt;
&lt;strong&gt;Q: My after_commit is like this so should not recurse:&lt;/strong&gt;
&lt;pre style=&quot;background-color: #2c3e50; color: #ecf0f1; padding: 16px; border-radius: 6px; overflow-x: auto; font-family: &#39;Monaco&#39;, &#39;Courier New&#39;, monospace; font-size: 13px; margin-top: 10px;&quot;&gt;after_commit :update_last_connected_at, if: :saved_change_to_connected?&lt;/pre&gt;
&lt;/div&gt;

&lt;div style=&quot;background-color: #f0f0f0; color: #333; padding: 14px 16px; border-radius: 8px; margin: 12px 0; border-left: 4px solid #4a90e2;&quot;&gt;
&lt;strong&gt;Answer:&lt;/strong&gt; Good catch! With that guard clause, it shouldn&#39;t recurse. But there&#39;s still an issue:

&lt;p&gt;&lt;strong&gt;The problem:&lt;/strong&gt; When &lt;code&gt;update_last_connected_at&lt;/code&gt; calls &lt;code&gt;update!&lt;/code&gt;, it triggers another &lt;code&gt;after_commit&lt;/code&gt;. Even though the guard &lt;code&gt;if: :saved_change_to_connected?&lt;/code&gt; prevents it from running again, RSpec is still having issues.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The real culprit is likely your Time mock:&lt;/strong&gt;&lt;/p&gt;
&lt;pre style=&quot;background-color: #2c3e50; color: #ecf0f1; padding: 16px; border-radius: 6px; overflow-x: auto; font-family: &#39;Monaco&#39;, &#39;Courier New&#39;, monospace; font-size: 13px; margin-top: 10px;&quot;&gt;allow(Time).to receive_message_chain(:current, :to_i).and_return(data_sync_at)&lt;/pre&gt;

&lt;p&gt;This is mocking &lt;code&gt;Time.current&lt;/code&gt; in a way that might be interfering with the &lt;code&gt;after_commit&lt;/code&gt; callback execution in your test. When you call &lt;code&gt;update!(last_connected_at: Time.current)&lt;/code&gt; inside the callback, the mocked &lt;code&gt;Time.current&lt;/code&gt; might be returning a double.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Try this instead:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Option 1:&lt;/strong&gt; Mock just the .to_i part, not the whole chain:&lt;/p&gt;
&lt;pre style=&quot;background-color: #2c3e50; color: #ecf0f1; padding: 16px; border-radius: 6px; overflow-x: auto; font-family: &#39;Monaco&#39;, &#39;Courier New&#39;, monospace; font-size: 13px; margin-top: 10px;&quot;&gt;allow_any_instance_of(Time).to receive(:to_i).and_return(data_sync_at)&lt;/pre&gt;

&lt;p&gt;&lt;strong&gt;Option 2:&lt;/strong&gt; Use travel_to:&lt;/p&gt;
&lt;pre style=&quot;background-color: #2c3e50; color: #ecf0f1; padding: 16px; border-radius: 6px; overflow-x: auto; font-family: &#39;Monaco&#39;, &#39;Courier New&#39;, monospace; font-size: 13px; margin-top: 10px;&quot;&gt;travel_to(Time.at(data_sync_at)) do
  # your test code
end&lt;/pre&gt;

&lt;p&gt;&lt;strong&gt;Option 3:&lt;/strong&gt; Simplify by not mocking Time at all:&lt;/p&gt;
&lt;pre style=&quot;background-color: #2c3e50; color: #ecf0f1; padding: 16px; border-radius: 6px; overflow-x: auto; font-family: &#39;Monaco&#39;, &#39;Courier New&#39;, monospace; font-size: 13px; margin-top: 10px;&quot;&gt;let(:data_sync_at) { Time.current.to_i }
# Just use the actual Time.current in your model&lt;/pre&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://thushw.blogspot.com/feeds/5894031168028601760/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/11265228/5894031168028601760' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/11265228/posts/default/5894031168028601760'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/11265228/posts/default/5894031168028601760'/><link rel='alternate' type='text/html' href='http://thushw.blogspot.com/2025/10/rails-8-cant-cast-rspecmocksdouble.html' title='Rails 8: - can&#39;t cast RSpec::Mocks::Double'/><author><name>Unknown</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiAF__JNPzESOZvUGtXQGPRKBtajrOpGavyzSrO1WRlIpLv6RJJ6AO9KCpcFJxulaBuL0-rGHUwdwS_X1ApshxG2a0fLqiKxKwg79HysS1JVmSq6LR0TMlBYIC-js9jRjyK4lEpWjltGKWk4l2vwoxxtsh3R_Ivb4Cdna7NwhyphenhyphenZfTOiSSwGlx1xJQ/s72-w640-h109-c/Screenshot%202025-10-16%20at%203.31.45%E2%80%AFPM.png" height="72" width="72"/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-11265228.post-9128483795048195528</id><published>2025-05-17T07:04:00.000-07:00</published><updated>2025-05-23T14:23:55.709-07:00</updated><title type='text'>Real time location of drivers : a tale of repurposing a Jupyter Notebook</title><content type='html'>&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: justify;&quot;&gt;&lt;a 
href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgqEOXw16I9veN-zTMIlukYLnDE-sP5xI0n1vfudPn_1tzhzpqhcmaH3tFubdENTd7oC4_s4Jr-6LWLg9uX_RkoaU7Tgpr6501nuau6ltdOChHdhyphenhyphenWuxETACL-vVsRT-buhxK-2qGrP7I81b8iInmiyJ0-kUdo4IBZiyLSl-dxjgTQokSAYyAyUsA/s3191/drivehustle.png&quot; style=&quot;margin-left: 1em; margin-right: 1em;&quot;&gt;&lt;img border=&quot;0&quot; data-original-height=&quot;1765&quot; data-original-width=&quot;3191&quot; height=&quot;320&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgqEOXw16I9veN-zTMIlukYLnDE-sP5xI0n1vfudPn_1tzhzpqhcmaH3tFubdENTd7oC4_s4Jr-6LWLg9uX_RkoaU7Tgpr6501nuau6ltdOChHdhyphenhyphenWuxETACL-vVsRT-buhxK-2qGrP7I81b8iInmiyJ0-kUdo4IBZiyLSl-dxjgTQokSAYyAyUsA/w579-h320/drivehustle.png&quot; width=&quot;579&quot; /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div style=&quot;text-align: justify;&quot;&gt;&lt;br /&gt;&lt;/div&gt;&lt;p style=&quot;text-align: justify;&quot;&gt;&lt;br /&gt;&lt;/p&gt;&lt;p style=&quot;text-align: justify;&quot;&gt;At Solo, we automatically track driver location to calculate mileage—making it easy for our users to deduct travel expenses when filing taxes with the IRS. But we go far beyond basic mileage tracking. Our app breaks each day into individual “trips,” so drivers can see their full driving route in detail. Wondering where you lost the most time in traffic? Trying to remember that tricky apartment complex with no parking? Or the day you circled a golf course for 30 minutes? 
We capture all of it—and turn those moments into insights, delivered through an interactive map that helps you drive smarter.&lt;/p&gt;&lt;p data-end=&quot;361&quot; data-start=&quot;179&quot; style=&quot;text-align: justify;&quot;&gt;To visualize driving routes, we use &lt;strong data-end=&quot;232&quot; data-start=&quot;215&quot;&gt;OpenStreetMap&lt;/strong&gt; rendered through a &lt;strong data-end=&quot;272&quot; data-start=&quot;252&quot;&gt;Jupyter Notebook&lt;/strong&gt;, all powered by a lightweight &lt;strong data-end=&quot;319&quot; data-start=&quot;303&quot;&gt;Flask server&lt;/strong&gt;. The Flask server handles two core tasks:&lt;/p&gt;
&lt;ul data-end=&quot;562&quot; data-start=&quot;363&quot;&gt;
&lt;li data-end=&quot;464&quot; data-start=&quot;363&quot;&gt;
&lt;p data-end=&quot;464&quot; data-start=&quot;365&quot; style=&quot;text-align: justify;&quot;&gt;Given a list of &lt;code data-end=&quot;404&quot; data-start=&quot;381&quot;&gt;[latitude, longitude]&lt;/code&gt; coordinates, it plots the route on interactive map tiles.&lt;/p&gt;
&lt;/li&gt;
&lt;li data-end=&quot;562&quot; data-start=&quot;465&quot;&gt;
&lt;p data-end=&quot;562&quot; data-start=&quot;467&quot; style=&quot;text-align: justify;&quot;&gt;It animates the route by syncing movement with timestamps associated with each coordinate pair.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-end=&quot;671&quot; data-start=&quot;564&quot; style=&quot;text-align: justify;&quot;&gt;We chose &lt;strong data-end=&quot;590&quot; data-start=&quot;573&quot;&gt;OpenStreetMap&lt;/strong&gt; over Google Maps for a few key reasons that make it especially startup-friendly:&lt;/p&gt;
&lt;ul data-end=&quot;1014&quot; data-start=&quot;673&quot;&gt;
&lt;li data-end=&quot;786&quot; data-start=&quot;673&quot;&gt;
&lt;p data-end=&quot;786&quot; data-start=&quot;675&quot; style=&quot;text-align: justify;&quot;&gt;&lt;strong data-end=&quot;693&quot; data-start=&quot;675&quot;&gt;Cost-effective&lt;/strong&gt;: OpenStreetMap is significantly more affordable than Google Maps, with no steep API pricing.&lt;/p&gt;
&lt;/li&gt;
&lt;li data-end=&quot;908&quot; data-start=&quot;787&quot;&gt;
&lt;p data-end=&quot;908&quot; data-start=&quot;789&quot; style=&quot;text-align: justify;&quot;&gt;&lt;strong data-end=&quot;812&quot; data-start=&quot;789&quot;&gt;Highly customizable&lt;/strong&gt;: From tile colors and custom markers to layer controls, the map styling is incredibly flexible.&lt;/p&gt;
&lt;/li&gt;
&lt;li data-end=&quot;1014&quot; data-start=&quot;909&quot;&gt;
&lt;p data-end=&quot;1014&quot; data-start=&quot;911&quot; style=&quot;text-align: justify;&quot;&gt;&lt;strong data-end=&quot;933&quot; data-start=&quot;911&quot;&gt;Frequently updated&lt;/strong&gt;: The map data is refreshed several times a day, ensuring accuracy and relevance.&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt;&lt;div&gt;&lt;p data-end=&quot;621&quot; data-start=&quot;270&quot; style=&quot;text-align: justify;&quot;&gt;On the backend, our &lt;strong data-end=&quot;306&quot; data-start=&quot;290&quot;&gt;Flask server&lt;/strong&gt; handles dynamic map rendering. The &lt;code data-end=&quot;356&quot; data-start=&quot;342&quot;&gt;render_map()&lt;/code&gt; function below takes in location data, timestamps, and speeds, then visualizes the route using &lt;a class=&quot;cursor-pointer&quot; data-end=&quot;508&quot; data-start=&quot;452&quot; rel=&quot;noopener&quot; target=&quot;_new&quot;&gt;Folium&lt;/a&gt; and &lt;a class=&quot;cursor-pointer&quot; data-end=&quot;569&quot; data-start=&quot;513&quot; rel=&quot;noopener&quot; target=&quot;_new&quot;&gt;branca&lt;/a&gt;—a powerful combo for interactive mapping in Python.&lt;/p&gt;
&lt;p data-end=&quot;643&quot; data-start=&quot;623&quot; style=&quot;text-align: justify;&quot;&gt;Here&#39;s how it works:&lt;/p&gt;
&lt;ul data-end=&quot;879&quot; data-start=&quot;645&quot;&gt;
&lt;li data-end=&quot;770&quot; data-start=&quot;645&quot;&gt;
&lt;p data-end=&quot;770&quot; data-start=&quot;647&quot; style=&quot;text-align: justify;&quot;&gt;If a &lt;code data-end=&quot;673&quot; data-start=&quot;652&quot;&gt;transition_duration&lt;/code&gt; is set, the function animates the trip using &lt;strong data-end=&quot;741&quot; data-start=&quot;719&quot;&gt;TimestampedGeoJson&lt;/strong&gt;, syncing movement with time.&lt;/p&gt;
&lt;/li&gt;
&lt;li data-end=&quot;879&quot; data-start=&quot;771&quot;&gt;
&lt;p data-end=&quot;879&quot; data-start=&quot;773&quot; style=&quot;text-align: justify;&quot;&gt;If no animation is requested, it renders a &lt;strong data-end=&quot;837&quot; data-start=&quot;816&quot;&gt;color-coded route&lt;/strong&gt; based on speed, using &lt;code data-end=&quot;878&quot; data-start=&quot;860&quot;&gt;folium.ColorLine&lt;/code&gt;.&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt;&lt;/div&gt;&lt;p style=&quot;text-align: justify;&quot;&gt;&lt;br /&gt;&lt;/p&gt;&lt;pre&gt;&lt;div style=&quot;text-align: justify;&quot;&gt;  def render_map(locations, epochs, speeds, transition_duration):&lt;/div&gt;&lt;code&gt;&lt;div style=&quot;text-align: justify;&quot;&gt;    if transition_duration:&lt;/div&gt;&lt;div style=&quot;text-align: justify;&quot;&gt;        print(&quot;animating the path!&quot;)&lt;/div&gt;&lt;div style=&quot;text-align: justify;&quot;&gt;        return TimestampedGeoJson(&lt;/div&gt;&lt;div style=&quot;text-align: justify;&quot;&gt;            {&lt;/div&gt;&lt;div style=&quot;text-align: justify;&quot;&gt;                &quot;type&quot;: &quot;FeatureCollection&quot;,&lt;/div&gt;&lt;div style=&quot;text-align: justify;&quot;&gt;                &quot;features&quot;: [&lt;/div&gt;&lt;div style=&quot;text-align: justify;&quot;&gt;                    {&lt;/div&gt;&lt;div style=&quot;text-align: justify;&quot;&gt;                        &quot;type&quot;: &quot;Feature&quot;,&lt;/div&gt;&lt;div style=&quot;text-align: justify;&quot;&gt;                        &quot;geometry&quot;: {&lt;/div&gt;&lt;div style=&quot;text-align: justify;&quot;&gt;                            &quot;type&quot;: &quot;LineString&quot;,&lt;/div&gt;&lt;div style=&quot;text-align: justify;&quot;&gt;                            &quot;coordinates&quot;: [&lt;/div&gt;&lt;div style=&quot;text-align: justify;&quot;&gt;                                [point[1], point[0]] for point in locations&lt;/div&gt;&lt;div style=&quot;text-align: justify;&quot;&gt;                            
],&lt;/div&gt;&lt;div style=&quot;text-align: justify;&quot;&gt;                        },&lt;/div&gt;&lt;div style=&quot;text-align: justify;&quot;&gt;                        &quot;properties&quot;: {&quot;times&quot;: epochs},&lt;/div&gt;&lt;div style=&quot;text-align: justify;&quot;&gt;                    }&lt;/div&gt;&lt;div style=&quot;text-align: justify;&quot;&gt;                ],&lt;/div&gt;&lt;div style=&quot;text-align: justify;&quot;&gt;            },&lt;/div&gt;&lt;div style=&quot;text-align: justify;&quot;&gt;            period=&quot;PT1M&quot;,&lt;/div&gt;&lt;div style=&quot;text-align: justify;&quot;&gt;            transition_time=int(transition_duration),&lt;/div&gt;&lt;div style=&quot;text-align: justify;&quot;&gt;        )&lt;/div&gt;&lt;div style=&quot;text-align: justify;&quot;&gt;    else:&lt;/div&gt;&lt;div style=&quot;text-align: justify;&quot;&gt;        colormap = branca.colormap.StepColormap(&lt;/div&gt;&lt;div style=&quot;text-align: justify;&quot;&gt;            [&quot;black&quot;, &quot;#DB3A2D&quot;, &quot;#003057&quot;, &quot;#00BE6E&quot;],&lt;/div&gt;&lt;div style=&quot;text-align: justify;&quot;&gt;            vmin=0,&lt;/div&gt;&lt;div style=&quot;text-align: justify;&quot;&gt;            vmax=120,&lt;/div&gt;&lt;div style=&quot;text-align: justify;&quot;&gt;            index=[0, 1, 5, 30, 1000],&lt;/div&gt;&lt;div style=&quot;text-align: justify;&quot;&gt;            caption=&quot;Speed (mph)&quot;,&lt;/div&gt;&lt;div style=&quot;text-align: justify;&quot;&gt;        )&lt;/div&gt;&lt;div style=&quot;text-align: justify;&quot;&gt;&lt;/div&gt;&lt;div style=&quot;text-align: justify;&quot;&gt;        return folium.ColorLine(&lt;/div&gt;&lt;div style=&quot;text-align: justify;&quot;&gt;            positions=locations,&lt;/div&gt;&lt;div style=&quot;text-align: justify;&quot;&gt;            colormap=colormap,&lt;/div&gt;&lt;div style=&quot;text-align: justify;&quot;&gt;            weight=4,&lt;/div&gt;&lt;div style=&quot;text-align: 
justify;&quot;&gt;            colors=speeds,&lt;/div&gt;&lt;div style=&quot;text-align: justify;&quot;&gt;        )&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;div style=&quot;text-align: justify;&quot;&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style=&quot;text-align: justify;&quot;&gt;This is what an animation looks like:&lt;/div&gt;&lt;div style=&quot;text-align: justify;&quot;&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: center;&quot;&gt;&lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhnz8JoIIWwoj66XVIv6GTDZVgA1r8p4bDk_5E863GGL_A4vJpIegHbHOFvEpDbEcgWc9hdtoBQzKu6XWGeuQFBuSncT9IAYxS8i-42tvIdCM0GNl5fS0tiKJy5HWl7e9PDsGlFOwu2N_efFKS4y57myTlh2xT0cpwfJ33vbbgWdQk8TUshcMeV3Q/s1045/loot.gif&quot; imageanchor=&quot;1&quot; style=&quot;margin-left: 1em; margin-right: 1em;&quot;&gt;&lt;img border=&quot;0&quot; data-original-height=&quot;608&quot; data-original-width=&quot;1045&quot; height=&quot;186&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhnz8JoIIWwoj66XVIv6GTDZVgA1r8p4bDk_5E863GGL_A4vJpIegHbHOFvEpDbEcgWc9hdtoBQzKu6XWGeuQFBuSncT9IAYxS8i-42tvIdCM0GNl5fS0tiKJy5HWl7e9PDsGlFOwu2N_efFKS4y57myTlh2xT0cpwfJ33vbbgWdQk8TUshcMeV3Q/s320/loot.gif&quot; width=&quot;320&quot; /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;div style=&quot;text-align: justify;&quot;&gt;&lt;br /&gt;&lt;/div&gt;&lt;p style=&quot;text-align: justify;&quot;&gt;I didn’t set out to build production features in Jupyter notebooks. In fact, this all started three years ago with a much simpler goal: to understand our traffic patterns. As we expanded the product into new cities, and into different segments within larger metro areas, we needed answers. Where is traffic volume increasing? Where are drivers earning more in tips? These kinds of questions required a flexible geospatial analytics setup. 
Jupyter notebooks turned out to be the perfect environment to explore this growing volume of location-based data.&lt;/p&gt;&lt;p style=&quot;text-align: justify;&quot;&gt;This is a rough look at our early days as we launched in Seattle:&lt;/p&gt;&lt;p&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: justify;&quot;&gt;&lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjr5z0OMuO2xgGnft9FmkpZYZn0Gd_W_QD9eNCHgJ58ZJjNGWEQLrJDAddX-Pjwq5jznIZLVQLcxH_rqRzyH2oqcZw27x4YpYS6r-PK4jVXfw35MgwsZy82HFtaPnkbagtmyUftRDvf6HaO0ghJxU7mROcsR5WoOIwhuA60Zt6n5k3XNX3YZB-rgw/s2690/Screenshot%202025-05-15%20at%201.36.58%E2%80%AFPM.png&quot; style=&quot;margin-left: 1em; margin-right: 1em;&quot;&gt;&lt;img border=&quot;0&quot; data-original-height=&quot;1670&quot; data-original-width=&quot;2690&quot; height=&quot;357&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjr5z0OMuO2xgGnft9FmkpZYZn0Gd_W_QD9eNCHgJ58ZJjNGWEQLrJDAddX-Pjwq5jznIZLVQLcxH_rqRzyH2oqcZw27x4YpYS6r-PK4jVXfw35MgwsZy82HFtaPnkbagtmyUftRDvf6HaO0ghJxU7mROcsR5WoOIwhuA60Zt6n5k3XNX3YZB-rgw/w572-h357/Screenshot%202025-05-15%20at%201.36.58%E2%80%AFPM.png&quot; width=&quot;572&quot; /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div style=&quot;text-align: justify;&quot;&gt;&lt;br /&gt;&lt;/div&gt;&lt;span&gt;&lt;div style=&quot;text-align: justify;&quot;&gt;&lt;br /&gt;&lt;/div&gt;&lt;/span&gt;&lt;p&gt;&lt;/p&gt;&lt;p style=&quot;text-align: justify;&quot;&gt;That early exploration eventually evolved into a lightweight geospatial analytics pipeline—one that could handle real driver data at scale. Using Jupyter notebooks gave us the flexibility to prototype quickly, visualize patterns, and iterate. But as the insights proved useful, we started formalizing parts of that workflow. 
What began as an experiment matured into a production-grade service: powered by a Flask backend, drawing from location check-ins, and rendering driver routes with OpenStreetMap tiles—all orchestrated from within the same notebook-driven environment we started with.&lt;/p&gt;&lt;p style=&quot;text-align: justify;&quot;&gt;This is exactly what makes working at a startup so much fun. At a smaller scale, we can take something like a Jupyter notebook—a tool meant for exploration—and ship a real feature to users through the mobile app. I know some of you engineers at Amazon or Meta might be shaking your heads, but that’s the beauty of it: tools that would never even be considered in a big-company tech stack become fair game at a startup. And sometimes, that unconventional choice turns out to be the right one.&lt;/p&gt;&lt;p style=&quot;text-align: justify;&quot;&gt;This is the route of a driver that is plotted using folium on a React-native web-view:&lt;/p&gt;&lt;p&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: justify;&quot;&gt;&lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh-oVjme7YNaj-NTwZqffE6QjjlTbg0vBhxwzURreBGC3iTwguXcoA97Ji75UoLy7kSfu94V2RL0uV2OBvtGm5qT0d0garphOPBVdSU7JWENotQJIs-07jt25DtMu7oj9xtwDupI2oMrxLZY8wxgaMmzYFmMZaps7rkth1SJMbyEVXW7McH0M9vhQ/s2372/Screenshot%202025-05-15%20at%201.04.25%E2%80%AFPM.png&quot; style=&quot;margin-left: 1em; margin-right: 1em;&quot;&gt;&lt;img border=&quot;0&quot; data-original-height=&quot;1000&quot; data-original-width=&quot;2372&quot; height=&quot;239&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh-oVjme7YNaj-NTwZqffE6QjjlTbg0vBhxwzURreBGC3iTwguXcoA97Ji75UoLy7kSfu94V2RL0uV2OBvtGm5qT0d0garphOPBVdSU7JWENotQJIs-07jt25DtMu7oj9xtwDupI2oMrxLZY8wxgaMmzYFmMZaps7rkth1SJMbyEVXW7McH0M9vhQ/w565-h239/Screenshot%202025-05-15%20at%201.04.25%E2%80%AFPM.png&quot; 
width=&quot;565&quot; /&gt;&lt;/a&gt;&lt;/div&gt;&lt;span&gt;&lt;div style=&quot;text-align: justify;&quot;&gt;&lt;br /&gt;&lt;/div&gt;&lt;/span&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;div style=&quot;text-align: justify;&quot;&gt;And what happens when we &lt;em data-end=&quot;180&quot; data-start=&quot;176&quot;&gt;do&lt;/em&gt; have millions of eyeballs on these maps? That’s not a crisis—it’s an opportunity. There are several clear paths to optimize for lower latency and scalability (hello, Leaflet and tile caching). But the key difference is this: we’ll be solving a real, validated need—not one we only &lt;em data-end=&quot;471&quot; data-start=&quot;462&quot;&gt;thought&lt;/em&gt; users might have. That’s the advantage of moving fast at a startup. We don’t prematurely optimize—we ship, we listen, and we scale when it actually matters.&lt;/div&gt;&lt;p&gt;&lt;/p&gt;&lt;p data-end=&quot;486&quot; data-start=&quot;139&quot; style=&quot;text-align: justify;&quot;&gt;Bringing it back to maps—our rendering is handled by the Folium library, running within a Python Flask server. What’s nice about Folium is that it provides the same visual output in a Jupyter notebook as it does in production. This lets us prototype and test the map layout directly in the notebook before moving the code over to a Flask endpoint.&lt;/p&gt;
&lt;p data-end=&quot;725&quot; data-start=&quot;488&quot; style=&quot;text-align: justify;&quot;&gt;Here’s how it works: a web server sends the Flask server a list of GPS points to plot. The Flask server renders the route with Folium and returns the resulting HTML map to the web server, which passes it along to the mobile app.&lt;/p&gt;
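&lt;p style=&quot;text-align: justify;&quot;&gt;The request flow above can be sketched without committing to a framework. This is a minimal sketch under stated assumptions, not the actual endpoint: the function name &lt;code&gt;handle_route_request&lt;/code&gt;, the &lt;code&gt;locations&lt;/code&gt; field, and the &lt;code&gt;stub_render&lt;/code&gt; stand-in are all hypothetical. In the Flask server, the body would come from the incoming request and the renderer would be the Folium code.&lt;/p&gt;

```python
import json


def handle_route_request(body, render):
    """Turn a JSON request body into a (status, content_type, payload) reply.

    `render` is a caller-supplied function that maps a list of
    [lat, lon] points to an HTML string (Folium-based in the post).
    The 'locations' field name is an assumption made for this sketch.
    """
    try:
        points = json.loads(body)["locations"]
    except (ValueError, KeyError, TypeError):
        return 400, "text/plain", "expected JSON with a 'locations' list"
    if not isinstance(points, list) or not points:
        return 400, "text/plain", "'locations' must be a non-empty list"
    return 200, "text/html", render(points)


def stub_render(points):
    # Stand-in for the real map rendering so the sketch runs on its own.
    return f"map with {len(points)} points"


status, content_type, payload = handle_route_request(
    '{"locations": [[47.6062, -122.3321], [47.6097, -122.3331]]}',
    stub_render,
)
```

&lt;p style=&quot;text-align: justify;&quot;&gt;Passing the renderer in as an argument keeps the validation logic testable on a machine without Folium installed.&lt;/p&gt;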
&lt;p style=&quot;text-align: justify;&quot;&gt;For individual routes, this approach works surprisingly well. It’s not the fastest setup—since the full map is rendered server-side and sent to the client—but for shorter routes, the latency is acceptable and the experience is smooth enough.&lt;/p&gt;&lt;p data-end=&quot;336&quot; data-start=&quot;135&quot; style=&quot;text-align: justify;&quot;&gt;Eventually, we wanted to display a real-time map of all our active drivers on a flat panel in our Seattle office. The simplest (and fastest) way to do that? Leverage the same system we’d already built.&lt;/p&gt;
&lt;p style=&quot;text-align: justify;&quot;&gt;So I added a new endpoint to the Flask server—one that accepts a list of GPS points and renders a small icon at each location. Different driver events are visualized using different icons. For example: a driver&#39;s current location appears as a yellow circle with a red outline; a new sign-up shows up as a gold star; and when a driver swipes a Solo cash card, a dollar sign icon pops into view.&lt;/p&gt;&lt;p data-end=&quot;374&quot; data-start=&quot;207&quot; style=&quot;text-align: justify;&quot;&gt;Well, that was the easy part. The real challenge was figuring out how to track all these events in real time so we could continuously update the map every few minutes.&lt;/p&gt;&lt;p data-end=&quot;448&quot; data-start=&quot;376&quot; style=&quot;text-align: justify;&quot;&gt;To manage this, I used several Redis sorted sets, grouped by event type:&lt;/p&gt;&lt;ul data-end=&quot;500&quot; data-start=&quot;450&quot;&gt;
&lt;li data-end=&quot;467&quot; data-start=&quot;450&quot;&gt;
&lt;p data-end=&quot;467&quot; data-start=&quot;452&quot; style=&quot;text-align: justify;&quot;&gt;&lt;code data-end=&quot;467&quot; data-start=&quot;452&quot;&gt;EVENT_DRIVING&lt;/code&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li data-end=&quot;484&quot; data-start=&quot;468&quot;&gt;
&lt;p data-end=&quot;484&quot; data-start=&quot;470&quot; style=&quot;text-align: justify;&quot;&gt;&lt;code data-end=&quot;484&quot; data-start=&quot;470&quot;&gt;EVENT_SIGNUP&lt;/code&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li data-end=&quot;500&quot; data-start=&quot;485&quot;&gt;
&lt;p data-end=&quot;500&quot; data-start=&quot;487&quot; style=&quot;text-align: justify;&quot;&gt;&lt;code data-end=&quot;500&quot; data-start=&quot;487&quot;&gt;EVENT_SWIPE&lt;/code&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;&lt;p data-end=&quot;818&quot; data-start=&quot;502&quot; style=&quot;text-align: justify;&quot;&gt;Each set holds user IDs as members, with their current &lt;code data-end=&quot;580&quot; data-start=&quot;557&quot;&gt;&amp;lt;latitude, longitude&amp;gt;&lt;/code&gt; stored using Redis&#39; &lt;code data-end=&quot;609&quot; data-start=&quot;601&quot;&gt;GEOADD&lt;/code&gt; command. These sets guarantee that each user has only one location entry, so as we receive location updates, we simply overwrite the previous value—giving us the user&#39;s most recent location at any given time.&lt;/p&gt;&lt;p data-end=&quot;1034&quot; data-start=&quot;820&quot; style=&quot;text-align: justify;&quot;&gt;But there&#39;s a catch: if a driver stops moving or goes offline, their entry becomes stale. Redis does support TTLs (time-to-live), but it doesn&#39;t allow expiring individual members of a set. So I had to get creative.&lt;/p&gt;&lt;p&gt;
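&lt;/p&gt;&lt;p style=&quot;text-align: justify;&quot;&gt;The one-entry-per-user behaviour described above can be sketched as follows. To keep the example runnable without a Redis server, it uses a tiny in-memory stand-in; the class &lt;code&gt;InMemoryGeo&lt;/code&gt;, the function &lt;code&gt;record_location&lt;/code&gt;, and the user ID are hypothetical. The method shapes are assumed to follow the redis-py 4+ client, where &lt;code&gt;geoadd&lt;/code&gt; takes a flat longitude, latitude, member triple.&lt;/p&gt;

```python
class InMemoryGeo:
    """Tiny stand-in for a Redis client so the sketch runs without a server.

    It mimics only the behaviour relied on here: one position per
    member, overwritten on every update.
    """

    def __init__(self):
        self._sets = {}

    def geoadd(self, name, values):
        # Assumed redis-py style: values is (longitude, latitude, member).
        lon, lat, member = values
        self._sets.setdefault(name, {})[member] = (lon, lat)

    def geopos(self, name, member):
        return [self._sets.get(name, {}).get(member)]

    def zcard(self, name):
        return len(self._sets.get(name, {}))


def record_location(client, event, user_id, lat, lon):
    # GEOADD takes longitude first, then latitude.
    client.geoadd(event, (lon, lat, user_id))


client = InMemoryGeo()
record_location(client, "EVENT_DRIVING", "user-42", 47.6062, -122.3321)
record_location(client, "EVENT_DRIVING", "user-42", 47.6097, -122.3331)
```

&lt;p style=&quot;text-align: justify;&quot;&gt;After the second call the set still holds a single entry for the user, now at the newer position. A real Redis server returns positions with a small rounding error, because coordinates are stored as a geohash.&lt;/p&gt;&lt;p&gt;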




&lt;/p&gt;&lt;p data-end=&quot;1364&quot; data-start=&quot;1036&quot; style=&quot;text-align: justify;&quot;&gt;To work around this, I store a separate key for each active user using a naming pattern like &lt;code data-end=&quot;1152&quot; data-start=&quot;1129&quot;&gt;LIVE_EVENTS_&amp;lt;user_id&amp;gt;&lt;/code&gt;, and assign a 5-minute TTL to each. Then, every 10 minutes, I scan through the geo sets and prune out any user IDs that no longer have a corresponding &lt;code data-end=&quot;1319&quot; data-start=&quot;1304&quot;&gt;LIVE_EVENTS_*&lt;/code&gt; key—effectively cleaning up stale locations.&lt;/p&gt;&lt;p data-end=&quot;1364&quot; data-start=&quot;1036&quot; style=&quot;text-align: justify;&quot;&gt;And that map you see at the top of this post? It was built exactly this way—stitched together from Redis geo sets, rendered by a Flask server, and piped straight from a Jupyter notebook prototype into production.&lt;/p&gt;&lt;p style=&quot;text-align: justify;&quot;&gt;&lt;br /&gt;&lt;/p&gt;&lt;p style=&quot;text-align: justify;&quot;&gt;&lt;br /&gt;&lt;/p&gt;&lt;p style=&quot;text-align: justify;&quot;&gt;&lt;br /&gt;&lt;/p&gt;&lt;p style=&quot;text-align: justify;&quot;&gt;&lt;br /&gt;&lt;/p&gt;&lt;p style=&quot;text-align: justify;&quot;&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;p style=&quot;text-align: justify;&quot;&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;p style=&quot;text-align: justify;&quot;&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;&lt;/span&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;p style=&quot;text-align: justify;&quot;&gt;&lt;span&gt;&lt;br /&gt;&lt;/span&gt;&lt;/p&gt;&lt;p style=&quot;text-align: justify;&quot;&gt;&amp;nbsp;&lt;/p&gt;&lt;p style=&quot;text-align: justify;&quot;&gt;&lt;br 
/&gt;&lt;/p&gt;</content><link rel='replies' type='application/atom+xml' href='http://thushw.blogspot.com/feeds/9128483795048195528/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/11265228/9128483795048195528' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/11265228/posts/default/9128483795048195528'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/11265228/posts/default/9128483795048195528'/><link rel='alternate' type='text/html' href='http://thushw.blogspot.com/2025/05/real-time-location-of-drivers-tale-of.html' title='Real time location of drivers : a tale of repurposing a Jupyter Notebook'/><author><name>Unknown</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgqEOXw16I9veN-zTMIlukYLnDE-sP5xI0n1vfudPn_1tzhzpqhcmaH3tFubdENTd7oC4_s4Jr-6LWLg9uX_RkoaU7Tgpr6501nuau6ltdOChHdhyphenhyphenWuxETACL-vVsRT-buhxK-2qGrP7I81b8iInmiyJ0-kUdo4IBZiyLSl-dxjgTQokSAYyAyUsA/s72-w579-h320-c/drivehustle.png" height="72" width="72"/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-11265228.post-4340518051915073196</id><published>2024-03-13T14:55:00.000-07:00</published><updated>2024-03-13T14:55:03.716-07:00</updated><title type='text'>Postgres indices on large tables : gotchas</title><content type='html'>&lt;p&gt;When you run a migration to create an index on Postgres, to allow queries to run, we create the index using the `CONCURRENTLY` flag.&lt;/p&gt;&lt;p&gt;But if the migration fails for any reason, the index will be created but unusable. Say, you were adding a UNIQUE index, and the migration fails since it encounters a duplicate. 
So the migration fails with something like:&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: center;&quot;&gt;&lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjN6gTIU0Wk4ZfJC5PuGSo9WJeEkndhvEcAGPFe5Mf4KCSflsI8FZKvtjn86NECdIW7vcFkAXntW4NlTm36RgB2TdNqFGCUe2yXx0PcHsSIf2FxCMR5MiAcWbdtQ34fc0uFE8iGVSMp80TcPZXJBrL8KlKyOEaKAP82wgceN5hfAx4RHcIrT8d4FQ/s1678/Screen%20Shot%202024-03-13%20at%202.49.33%20PM.png&quot; imageanchor=&quot;1&quot; style=&quot;margin-left: 1em; margin-right: 1em;&quot;&gt;&lt;img border=&quot;0&quot; data-original-height=&quot;232&quot; data-original-width=&quot;1678&quot; height=&quot;88&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjN6gTIU0Wk4ZfJC5PuGSo9WJeEkndhvEcAGPFe5Mf4KCSflsI8FZKvtjn86NECdIW7vcFkAXntW4NlTm36RgB2TdNqFGCUe2yXx0PcHsSIf2FxCMR5MiAcWbdtQ34fc0uFE8iGVSMp80TcPZXJBrL8KlKyOEaKAP82wgceN5hfAx4RHcIrT8d4FQ/w640-h88/Screen%20Shot%202024-03-13%20at%202.49.33%20PM.png&quot; width=&quot;640&quot; /&gt;&lt;/a&gt;&lt;/div&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;Now imagine that your migration needed to make an existing index UNIQUE. It is usually done by first removing the index, and adding it back with the UNIQUE constraint.&lt;/p&gt;&lt;p&gt;Since the migration is aborted at the point of creating the new index with the constraint, now there is no index. The app will be running without an index.&lt;/p&gt;&lt;p&gt;If you were to check the db, you might get confused as the index *seems* to be there - it is just not operable.&lt;/p&gt;&lt;p&gt;To verify, run an `EXPLAIN` and you will see it does not use the index.&lt;/p&gt;&lt;p&gt;Now the queries that regularly use the index will be running quite slowly, as the index is not quite ready. 
This could add a lot of pressure to the db and make the app pretty much unusable / inoperable.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;</content><link rel='replies' type='application/atom+xml' href='http://thushw.blogspot.com/feeds/4340518051915073196/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/11265228/4340518051915073196' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/11265228/posts/default/4340518051915073196'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/11265228/posts/default/4340518051915073196'/><link rel='alternate' type='text/html' href='http://thushw.blogspot.com/2024/03/postgres-indices-on-large-tables-gotchas.html' title='Postgres indices on large tables : gotchas'/><author><name>Unknown</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjN6gTIU0Wk4ZfJC5PuGSo9WJeEkndhvEcAGPFe5Mf4KCSflsI8FZKvtjn86NECdIW7vcFkAXntW4NlTm36RgB2TdNqFGCUe2yXx0PcHsSIf2FxCMR5MiAcWbdtQ34fc0uFE8iGVSMp80TcPZXJBrL8KlKyOEaKAP82wgceN5hfAx4RHcIrT8d4FQ/s72-w640-h88-c/Screen%20Shot%202024-03-13%20at%202.49.33%20PM.png" height="72" width="72"/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-11265228.post-3295864781946105094</id><published>2024-01-10T14:39:00.000-08:00</published><updated>2024-01-10T14:39:06.117-08:00</updated><title type='text'>Ruby lazy collections and with_index</title><content type='html'>&lt;p&gt;&lt;/p&gt;&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: center;&quot;&gt;&lt;a 
href=&quot;https://blogger.googleusercontent.com/img/a/AVvXsEjNSFBlP7iXwQ997zFEdTLMCaxF_KuuNPKfm3_-RbGzCpZGCRr7hjNMhpc-glAp5WQjUKH-Z9QhUWOLhoGy7-0xPUiJy50nFGAY2TTyjHa_VTEN9bV_KmKkghxPJLXmg05nwARVtJaayViE-cO9UtAPHKzkfGQ2Y2LgKFGr7zLwczYAEn73YL6SRw&quot; style=&quot;clear: left; float: left; margin-bottom: 1em; margin-right: 1em;&quot;&gt;&lt;img alt=&quot;&quot; data-original-height=&quot;588&quot; data-original-width=&quot;940&quot; height=&quot;200&quot; src=&quot;https://blogger.googleusercontent.com/img/a/AVvXsEjNSFBlP7iXwQ997zFEdTLMCaxF_KuuNPKfm3_-RbGzCpZGCRr7hjNMhpc-glAp5WQjUKH-Z9QhUWOLhoGy7-0xPUiJy50nFGAY2TTyjHa_VTEN9bV_KmKkghxPJLXmg05nwARVtJaayViE-cO9UtAPHKzkfGQ2Y2LgKFGr7zLwczYAEn73YL6SRw&quot; width=&quot;320&quot; /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;The lazy collections don&#39;t quite give us a way to use `with_index` , but, we can use `each_with_index`.&lt;p&gt;&lt;/p&gt;&lt;p&gt;It took me a moment to figure out the problem was with the collection being lazy.&lt;/p&gt;</content><link rel='replies' type='application/atom+xml' href='http://thushw.blogspot.com/feeds/3295864781946105094/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/11265228/3295864781946105094' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/11265228/posts/default/3295864781946105094'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/11265228/posts/default/3295864781946105094'/><link rel='alternate' type='text/html' href='http://thushw.blogspot.com/2024/01/ruby-lazy-collections-and-withindex.html' title='Ruby lazy collections and with_index'/><author><name>Unknown</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" 
url="https://blogger.googleusercontent.com/img/a/AVvXsEjNSFBlP7iXwQ997zFEdTLMCaxF_KuuNPKfm3_-RbGzCpZGCRr7hjNMhpc-glAp5WQjUKH-Z9QhUWOLhoGy7-0xPUiJy50nFGAY2TTyjHa_VTEN9bV_KmKkghxPJLXmg05nwARVtJaayViE-cO9UtAPHKzkfGQ2Y2LgKFGr7zLwczYAEn73YL6SRw=s72-c" height="72" width="72"/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-11265228.post-5830703354515209078</id><published>2022-08-25T11:33:00.001-07:00</published><updated>2022-08-25T11:40:46.226-07:00</updated><title type='text'>Using Redis for web request counting</title><content type='html'>&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: center;&quot;&gt;&lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjliVxPSESbGhLTMl1lK7amAZAnDBr3wg2LohY5F4Ncih5G3EnSYAozaU3u_J0jnY4N0T9jh4nPM-4w4UE3ReRAgDKPap3ZYLMQRZ3CH2FPK3Mt82XkluWr_N7d_5eE5suJ7vOY5LnMyJqUYOtZjIbqx6uTr16ir-H3EL79adtoaBfzuXygRLQ/s960/Screen%20Shot%202022-08-25%20at%2011.38.58%20AM.png&quot; imageanchor=&quot;1&quot; style=&quot;clear: left; float: left; margin-bottom: 1em; margin-right: 1em;&quot;&gt;&lt;img border=&quot;0&quot; data-original-height=&quot;486&quot; data-original-width=&quot;960&quot; height=&quot;162&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjliVxPSESbGhLTMl1lK7amAZAnDBr3wg2LohY5F4Ncih5G3EnSYAozaU3u_J0jnY4N0T9jh4nPM-4w4UE3ReRAgDKPap3ZYLMQRZ3CH2FPK3Mt82XkluWr_N7d_5eE5suJ7vOY5LnMyJqUYOtZjIbqx6uTr16ir-H3EL79adtoaBfzuXygRLQ/s320/Screen%20Shot%202022-08-25%20at%2011.38.58%20AM.png&quot; width=&quot;320&quot; /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;h2 style=&quot;text-align: left;&quot;&gt;&lt;br /&gt;&lt;/h2&gt;&lt;h2 style=&quot;text-align: left;&quot;&gt;Problem Statement&lt;/h2&gt;&lt;p&gt;With the growth of our user base and increasing traffic to the web servers, we wanted to come up with some realtime counters to measure traffic.&lt;/p&gt;&lt;p&gt;We wanted the system to give us an idea of:&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;ul 
style=&quot;text-align: left;&quot;&gt;&lt;li&gt;The total web requests the site is receiving each minute&lt;/li&gt;&lt;li&gt;These totals aggregated hourly and displayed for the day (24 entries)&lt;/li&gt;&lt;li&gt;The counts broken up by the specific request path&lt;/li&gt;&lt;li&gt;The number of unique visitors broken up by the minute, hour and day&lt;/li&gt;&lt;/ul&gt;&lt;h2 style=&quot;text-align: left;&quot;&gt;Limitations of available solutions&lt;/h2&gt;&lt;div&gt;We use Heroku for deploying the app. The Heroku dashboard provides us a rough idea of the request counts, but it is difficult to incorporate that into our own operational dashboards. Also Heroku doesn&#39;t break down the counts by the request path, which was important to us in understanding the source of traffic and how we could scale resources based on the source. Unique visitor counts are similarly an important metric not readily available.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;h2 style=&quot;text-align: left;&quot;&gt;Counters help scale the app&lt;/h2&gt;&lt;div&gt;Breaking down the traffic by the source was important in coming up with a way to efficiently scale our service. Currently most of our application sits in a single service. 
So scaling the app means adding machines to this single fleet, even though only requests from one or two sources really benefit from the added capacity.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;We have 3 main sources of traffic:&lt;/div&gt;&lt;div&gt;&lt;ol style=&quot;text-align: left;&quot;&gt;&lt;li&gt;Our users hit the service for the various features of the mobile app&lt;/li&gt;&lt;li&gt;Our users send us their geolocation from the mobile app&lt;/li&gt;&lt;li&gt;We receive near real-time data about our users&#39; gig work from a third party&lt;/li&gt;&lt;/ol&gt;&lt;div&gt;Our primary performance goal is a sub-second experience for users of the mobile app, so we mainly want to optimize backend resources for source 1.&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;However, we get far more traffic from sources 2 and 3, which consume most of the server capacity. Serving all three sources from a single service degrades the experience for the user.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Mind you, sources 2 and 3 do not have a real-time processing requirement. 
While a web server is needed to accept these requests, the actual processing is handled by an asynchronous worker outside of the web server.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;But still, because there are so many of these requests, the few milliseconds each one spends on the web server take capacity away from user requests.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;h2 style=&quot;text-align: left;&quot;&gt;Why Redis?&lt;/h2&gt;&lt;div&gt;Redis provides the ideal data structures for counting requests with minimal overhead. (And need I say that it is fast, as in really, really fast?) A few keys can be used to keep a running total for each minute per request type, and a &lt;a href=&quot;https://redis.io/docs/data-types/sorted-sets/&quot; target=&quot;_blank&quot;&gt;sorted set&lt;/a&gt; can then be used to aggregate the last N minutes of requests. (For our purposes, we decided to keep 60 counts, giving a picture of activity for the last hour, but you can choose to keep counts for longer than that.)&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The same idea can be extended to measure days&#39; worth of traffic broken down by the hour.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;h2 style=&quot;text-align: left;&quot;&gt;Choice of the Sorted Set&lt;/h2&gt;&lt;div&gt;Why did we decide on the sorted set for aggregating the counters? Well, the sorted set allows us to keep the counters sorted by time. This way, we can quickly get the list of counts for the last hour, ordered by minute. To be fair, it is a bit of overkill to use a set for this, as we are never going to mutate an element (since the timestamp is always increasing), but it does suit our purposes just fine!&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Before going any further, let us briefly recap the salient features of the sorted set. 
It allows us to insert elements along with a score, and the elements are kept sorted by the score as they are inserted. It &lt;a href=&quot;https://redis.io/docs/data-types/sorted-sets/&quot; target=&quot;_blank&quot;&gt;scales&lt;/a&gt; really well for even millions (or more) of elements as each insert operation takes O(log(n)) time, much like a balanced binary tree. While we do not need that level of scale, one can think of keeping extremely granular counts for a long period of time, which could come in handy for debugging bizarre performance problems after the fact!&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;We can use the timestamp as the score. Redis will then always sort the set by the timestamp. This has the advantage that if you wanted to change the counter later (imagine a design where you quickly provide a rough estimate of the count, but later do a second pass to provide exact counts), you can simply add a new count to the set with the same timestamp and the position of the element will not change.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The counters need to be reset at the start of each minute. I first made the mistake of giving the counter key an expiry time of 1 minute, but realized that this introduces a race at the point of aggregating the count onto the sorted set: if Redis expires the key just before the aggregation runs, the set records a substantial undercount. (This was a difficult bug to track down, and of course the cause was the most obvious one. I had a face-palm moment, as you can imagine.)&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;There is a slight difficulty we need to work around here w.r.t. the sorted set. If we keep the bare count as the element in the set, a count that happens to equal one already stored will not be added as a new member; Redis only updates the score of the existing member. 
Since we are using the timestamp as the score, this will essentially remove the entry we had for the earlier timestamp. This is how sets work after all - it is a data structure suited for keeping unique members. But we can easily overcome this by appending the timestamp of the count to the count and storing that (&lt;i&gt;count:timestamp&lt;/i&gt;) as the element of the set. To read the count, we merely have to split the element on the delimiter (we used the colon here, which is something of a convention in Redis for separating the parts of a key) and use the first part.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;h2 style=&quot;text-align: left;&quot;&gt;A look at the running counters&lt;/h2&gt;&lt;div&gt;A ZRANGE command can retrieve the counts like so:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: center;&quot;&gt;&lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEijjMv2GgHx0j6c_ySyW5LQNkLC4iXgcPkGnzdIYRiy6EdXK-pgb0h0BMF6WLZqStNno09SrRkpR24iQdndI0pUc4Md6wOSZD-WQa4m1nPTeNU4dhkiJBQFWGRUJhrOLVtbljg69qVHs70V72Dx4yoJnT8hBvhARODiO4rSL4ygqngJyPAXNP8/s1342/Screen%20Shot%202022-08-21%20at%206.59.19%20PM.png&quot; style=&quot;clear: left; float: left; margin-bottom: 1em; margin-right: 1em;&quot;&gt;&lt;img border=&quot;0&quot; data-original-height=&quot;900&quot; data-original-width=&quot;1342&quot; height=&quot;429&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEijjMv2GgHx0j6c_ySyW5LQNkLC4iXgcPkGnzdIYRiy6EdXK-pgb0h0BMF6WLZqStNno09SrRkpR24iQdndI0pUc4Md6wOSZD-WQa4m1nPTeNU4dhkiJBQFWGRUJhrOLVtbljg69qVHs70V72Dx4yoJnT8hBvhARODiO4rSL4ygqngJyPAXNP8/w640-h429/Screen%20Shot%202022-08-21%20at%206.59.19%20PM.png&quot; width=&quot;640&quot; /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;p&gt;&lt;/p&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br 
/&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Counting the unique visitors is only slightly more involved. First we need to keep a regular Redis set and update it for each request. In our case, the user id is encoded in the request header, we decode it and add it to the Redis set. Now if the same user visits the site again, since we are using a set, we will not add it again, and we still have just one element in the set. This way, we can take the length of the set at any point and know how many unique visitors we have from the time we started writing to the set.&lt;/div&gt;&lt;/div&gt;&lt;p&gt;&lt;/p&gt;&lt;div&gt;The only thing left to do is, create the set at the start of the time interval we need to measure the count, and reset it at the end of the time interval. So we can set this up to reset every minute for a minute by minute count of unique visitors. Then we can use the infrastructure we built above to aggregate the count over to the sorted set, so we have a running count of unique visitors for the past N minutes.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;(You may have a different technique for figuring out the ID of the user making the request. 
Once the ID is extracted, you can use a Redis set to keep track of the visit.)&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Here is how we see the unique visitor count changing dynamically:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: center;&quot;&gt;&lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjj_SLxZ36R-gmmRu4ulSIIPvGF2ITghSjo4p6aCpWZZ02Aq8OFuXFoAvv6NK95qhV9kN2qincb0RGQ145zIS_j3qK8Ke76gjbDDR6KsNVcVJ6TgZZ72XzsgpeZHp4xpmI7FAFGnOOXfJt0VsSeal9L4NEzKT1KIsDVGO3usj64pA2Ikk18SWc/s1290/Screen%20Shot%202022-08-21%20at%207.07.19%20PM.png&quot; style=&quot;clear: left; float: left; margin-bottom: 1em; margin-right: 1em;&quot;&gt;&lt;img border=&quot;0&quot; data-original-height=&quot;262&quot; data-original-width=&quot;1290&quot; height=&quot;81&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjj_SLxZ36R-gmmRu4ulSIIPvGF2ITghSjo4p6aCpWZZ02Aq8OFuXFoAvv6NK95qhV9kN2qincb0RGQ145zIS_j3qK8Ke76gjbDDR6KsNVcVJ6TgZZ72XzsgpeZHp4xpmI7FAFGnOOXfJt0VsSeal9L4NEzKT1KIsDVGO3usj64pA2Ikk18SWc/w400-h81/Screen%20Shot%202022-08-21%20at%207.07.19%20PM.png&quot; width=&quot;400&quot; /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;We can just as easily use another excellent Redis command to see all the user ids in this set. 
Here is a snippet in our case :&lt;/div&gt;&lt;p&gt;&lt;/p&gt;&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: center;&quot;&gt;&lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi6H4aUUMTEzpmu8LuP8AeaO4DuP9AOGT65vKORhk7oeTFrQfTSHJ6cI18AW1xkvNw8un5lnRdfY8WQqCMa1a2c2CCkZGrA4EVH4fEaWsV-RZaCsdpHtNbxu5FmUePLqvlk53Nl6Spg35eMfgyi8xXBRfvHrnNavgDtafouWq3bYJBgZ2olRTI/s1304/Screen%20Shot%202022-08-21%20at%207.11.38%20PM.png&quot; style=&quot;clear: left; float: left; margin-bottom: 1em; margin-right: 1em;&quot;&gt;&lt;img border=&quot;0&quot; data-original-height=&quot;288&quot; data-original-width=&quot;1304&quot; height=&quot;71&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi6H4aUUMTEzpmu8LuP8AeaO4DuP9AOGT65vKORhk7oeTFrQfTSHJ6cI18AW1xkvNw8un5lnRdfY8WQqCMa1a2c2CCkZGrA4EVH4fEaWsV-RZaCsdpHtNbxu5FmUePLqvlk53Nl6Spg35eMfgyi8xXBRfvHrnNavgDtafouWq3bYJBgZ2olRTI/s320/Screen%20Shot%202022-08-21%20at%207.11.38%20PM.png&quot; width=&quot;320&quot; /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;h2 style=&quot;text-align: left;&quot;&gt;&lt;br /&gt;&lt;/h2&gt;&lt;h2 style=&quot;text-align: left;&quot;&gt;Implementation&lt;/h2&gt;&lt;div style=&quot;text-align: left;&quot;&gt;&lt;span style=&quot;font-weight: normal;&quot;&gt;&lt;span style=&quot;font-family: inherit;&quot;&gt;&lt;span style=&quot;white-space: pre-wrap;&quot;&gt;We implemented the counters using Ruby, with the &lt;/span&gt;&lt;a href=&quot;https://github.com/redis/redis-rb&quot; style=&quot;white-space: pre-wrap;&quot; target=&quot;_blank&quot;&gt;redis gem&lt;/a&gt;&lt;span style=&quot;white-space: pre-wrap;&quot;&gt; as our client. 
This involves several steps:&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;ol style=&quot;text-align: left;&quot;&gt;&lt;li&gt;&lt;span style=&quot;white-space: pre-wrap;&quot;&gt;&lt;span style=&quot;font-family: inherit;&quot;&gt;Initializing the counters&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style=&quot;white-space: pre-wrap;&quot;&gt;&lt;span style=&quot;font-family: inherit;&quot;&gt;Resetting the counters at minute, hour, day intervals&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style=&quot;white-space: pre-wrap;&quot;&gt;&lt;span style=&quot;font-family: inherit;&quot;&gt;Incrementing the appropriate counters for each request&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style=&quot;white-space: pre-wrap;&quot;&gt;&lt;span style=&quot;font-family: inherit;&quot;&gt;Aggregating the count onto the set&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;&lt;/ol&gt;&lt;div&gt;&lt;span style=&quot;white-space: pre-wrap;&quot;&gt;&lt;span style=&quot;font-family: inherit;&quot;&gt;The first two steps can be combined. We used a scheduler that sits within the app via the &lt;a href=&quot;https://github.com/jjb/ruby-clock&quot; target=&quot;_blank&quot;&gt;ruby clock gem&lt;/a&gt;. Heroku allows provisioning a single process that runs a scheduler based on the schedule we set via the ruby clock. 
This is pretty similar to how one would use &lt;a href=&quot;https://www.geeksforgeeks.org/crontab-in-linux-with-examples/&quot; target=&quot;_blank&quot;&gt;cron&lt;/a&gt; on a Unix machine to schedule a task.&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;white-space: pre-wrap;&quot;&gt;&lt;span style=&quot;font-family: inherit;&quot;&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;white-space: pre-wrap;&quot;&gt;&lt;span style=&quot;font-family: inherit;&quot;&gt;Heroku does provide a &lt;a href=&quot;https://devcenter.heroku.com/articles/scheduler&quot; target=&quot;_blank&quot;&gt;scheduler&lt;/a&gt; as well. We did not use it as it does not have the same reliability guarantees as the ruby clock gem. I have seen cases where the Heroku scheduler does not fire or fires very late, both &lt;a href=&quot;https://devcenter.heroku.com/articles/scheduler#known-issues-and-alternatives&quot; target=&quot;_blank&quot;&gt;documented&lt;/a&gt; behaviors.&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;white-space: pre-wrap;&quot;&gt;&lt;span style=&quot;font-family: inherit;&quot;&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;white-space: pre-wrap;&quot;&gt;&lt;span style=&quot;font-family: inherit;&quot;&gt;Since we use Rails for our app, we used its controller framework to track request counts.&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;white-space: pre-wrap;&quot;&gt;&lt;span style=&quot;font-family: inherit;&quot;&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;font-family: inherit;&quot;&gt;&lt;span style=&quot;white-space: pre-wrap;&quot;&gt;A controller encapsulates serving requests for a specific web route (think of this as having one controller for &lt;i&gt;yoursite.com/user/login&lt;/i&gt; and another controller for 
&lt;i&gt;yoursite.com/reports/account&lt;/i&gt;). Now each controller is a subclass of a class we implement called &lt;/span&gt;&lt;span style=&quot;background-color: #1e1e1e; color: #d4d4d4; white-space: pre;&quot;&gt;ApplicationController&lt;/span&gt;&lt;span style=&quot;white-space: pre-wrap;&quot;&gt;  which itself is a subclass of the Rails class  &lt;/span&gt;&lt;span style=&quot;background-color: #1e1e1e; color: #d4d4d4; white-space: pre;&quot;&gt;ActionController::Base&lt;/span&gt;&lt;span style=&quot;white-space: pre-wrap;&quot;&gt;.&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;white-space: pre-wrap;&quot;&gt;&lt;span style=&quot;font-family: inherit;&quot;&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;white-space: pre-wrap;&quot;&gt;&lt;span style=&quot;font-family: inherit;&quot;&gt;Rails allows us to hook all requests at the ApplicationController level with a simple &lt;i&gt;before_action&lt;/i&gt; hook. We implemented the request counting using this hook, and it looks something like this:&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span face=&quot;arial, sans-serif&quot;&gt;&lt;span style=&quot;white-space: pre-wrap;&quot;&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;blockquote&gt;&lt;span style=&quot;white-space: pre-wrap;&quot;&gt;&lt;span style=&quot;font-family: courier;&quot;&gt;class ApplicationController &amp;lt; ActionController::Base

  before_action :update_usage_counters

  def update_usage_counters
    PerfCounters.increment(user_id: current_user&amp;amp;.id, request_path: request&amp;amp;.fullpath)
  end
end
&lt;/span&gt;&lt;/span&gt;&lt;/blockquote&gt;&lt;div style=&quot;font-family: arial, sans-serif;&quot;&gt;&lt;span style=&quot;white-space: pre-wrap;&quot;&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;white-space: pre-wrap;&quot;&gt;&lt;span style=&quot;font-family: inherit;&quot;&gt;Now each request goes through update_usage_counters, which delegates the work to the PerfCounters class we wrote. &lt;i&gt;request&lt;/i&gt; is the request object that Rails makes available in every controller, and &lt;i&gt;request.fullpath&lt;/i&gt; contains the request path including the query string. The method &lt;i&gt;current_user&lt;/i&gt; (not shown) identifies the logged-in user from the request headers.&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;white-space: pre-wrap;&quot;&gt;&lt;span style=&quot;font-family: inherit;&quot;&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;white-space: pre-wrap;&quot;&gt;&lt;span style=&quot;font-family: inherit;&quot;&gt;I will reproduce pieces of a simplified version of &lt;i&gt;PerfCounters&lt;/i&gt; that illustrate the logic.&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;white-space: pre-wrap;&quot;&gt;&lt;span style=&quot;font-family: inherit;&quot;&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;white-space: pre-wrap;&quot;&gt;&lt;span style=&quot;font-family: inherit;&quot;&gt;The incrementing logic looks like this:&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;blockquote&gt;&lt;span style=&quot;white-space: pre-wrap;&quot;&gt;&lt;span style=&quot;font-family: courier;&quot;&gt;class PerfCounters
  def self.increment(user_id:, request_path:)
    $redis.pipelined do |pipeline|
      if user_id.present?
        pipeline.incr(&#39;USER_REQUESTS_MINUTE&#39;)
        pipeline.sadd(&#39;UNIQUE_VISITORS_MINUTE&#39;, user_id)
        if request_path&amp;amp;.include?(&quot;/geo/send&quot;)
          pipeline.incr(&#39;GEO_REQUESTS_MINUTE&#39;)
        end
      else
        pipeline.incr(&#39;OTHER_REQUESTS_MINUTE&#39;)
      end
    end
  end
end
&lt;/span&gt;&lt;/span&gt;&lt;/blockquote&gt;&lt;/div&gt;&lt;div style=&quot;font-family: arial, sans-serif;&quot;&gt;&lt;span style=&quot;white-space: pre-wrap;&quot;&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;white-space: pre-wrap;&quot;&gt;&lt;span style=&quot;font-family: inherit;&quot;&gt;Notice that a request made on behalf of a logged in user will have the &lt;i&gt;user_id&lt;/i&gt; parameter set. The &lt;i&gt;request_path&lt;/i&gt; is the path of the URL, and we use it here to count the location-tracking requests separately.&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;white-space: pre-wrap;&quot;&gt;&lt;span style=&quot;font-family: inherit;&quot;&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;white-space: pre-wrap;&quot;&gt;&lt;span style=&quot;font-family: inherit;&quot;&gt;Another neat Redis feature we use here is &lt;a href=&quot;https://redis.io/docs/manual/pipelining/&quot; target=&quot;_blank&quot;&gt;pipelining&lt;/a&gt;. The idea is that if we need to send a number of independent commands to Redis, the client can write them all to the connection without waiting for each reply, and then read the replies together. The Redis server returns the replies in order once it has processed the commands. This is a powerful feature that saves a network round trip for each command. It is not without cost, as the server has to queue the replies in memory while it works through the batch, and a large batch can delay commands from other clients. The rule of thumb is to make sure that each command is pretty fast - O(1) would be ideal - and to not pipeline too many commands in a single call. 
As with everything, you must test this against all the other traffic you serve and compromise if you must!&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;white-space: pre-wrap;&quot;&gt;&lt;span style=&quot;font-family: inherit;&quot;&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;font-family: inherit;&quot;&gt;&lt;span style=&quot;white-space: pre-wrap;&quot;&gt;Also notice that we are demonstrating the use of three counters &lt;/span&gt;&lt;span style=&quot;white-space: pre-wrap;&quot;&gt;USER_REQUESTS_MINUTE,  &lt;/span&gt;&lt;span style=&quot;white-space: pre-wrap;&quot;&gt;GEO_REQUESTS_MINUTE and &lt;/span&gt;&lt;span style=&quot;white-space: pre-wrap;&quot;&gt;OTHER_REQUESTS_MINUTE, alongside a set called &lt;/span&gt;&lt;span style=&quot;white-space: pre-wrap;&quot;&gt;UNIQUE_VISITORS_MINUTE. This last one actually keeps the user ids of all visitors. The &lt;a href=&quot;https://redis.io/commands/sadd/&quot; target=&quot;_blank&quot;&gt;sadd&lt;/a&gt; command adds the visitor id to the set the first time we see them; repeat visits leave the set unchanged.&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;white-space: pre-wrap;&quot;&gt;&lt;span style=&quot;font-family: inherit;&quot;&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;white-space: pre-wrap;&quot;&gt;&lt;span style=&quot;font-family: inherit;&quot;&gt;The ruby clock gem takes its inputs via a file named &lt;i&gt;Clockfile&lt;/i&gt;. This is in fact a file that uses Ruby syntax, i.e. it is evaluated by the Ruby interpreter. 
All we do is define the aggregator to run every minute, like so:&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span face=&quot;arial, sans-serif&quot; style=&quot;white-space: pre-wrap;&quot;&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;blockquote&gt;&lt;span style=&quot;white-space: pre-wrap;&quot;&gt;&lt;span style=&quot;font-family: courier;&quot;&gt;schedule.cron &#39;* * * * *&#39; do
  PerfCounters.aggregate_minute
end&lt;/span&gt;&lt;/span&gt;&lt;span face=&quot;arial, sans-serif&quot; style=&quot;white-space: pre-wrap;&quot;&gt;
&lt;/span&gt;&lt;/blockquote&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;white-space: pre-wrap;&quot;&gt;&lt;span style=&quot;font-family: inherit;&quot;&gt;This is what the minute aggregation looks like:&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;blockquote&gt;&lt;span style=&quot;white-space: pre-wrap;&quot;&gt;&lt;span style=&quot;font-family: courier;&quot;&gt;def self.aggregate_minute
    tm_obj = Time.current - 1.minute # aggregate last minute&#39;s stats
    tm = tm_obj.to_i

    # Read all current-minute counters, then add them to the hour sets before deleting them.
    # Increments that arrive between these reads and the DELs below are not counted.
    user_rpm, other_rpm, geo_rpm, unique_visitors_last_minute =
      $redis.pipelined do |pipeline|
        pipeline.get(&#39;USER_REQUESTS_MINUTE&#39;)
        pipeline.get(&#39;OTHER_REQUESTS_MINUTE&#39;)
        pipeline.get(&#39;GEO_REQUESTS_MINUTE&#39;)
        pipeline.scard(&#39;UNIQUE_VISITORS_MINUTE&#39;)
      end

    $redis.pipelined do |pipeline|
      # ZADD key score member: keep the timestamp as the score so the counters stay sorted by time.
      # Append the timestamp to the count so that equal counts from different minutes don&#39;t overwrite each other.
      # GET returns nil when a counter saw no requests this minute; to_i records that as 0.
      pipeline.zadd(&#39;USER_REQUESTS_LAST_HOUR&#39;, tm, &quot;#{user_rpm.to_i}:#{tm}&quot;)
      pipeline.zadd(&#39;OTHER_REQUESTS_LAST_HOUR&#39;, tm, &quot;#{other_rpm.to_i}:#{tm}&quot;)
      pipeline.zadd(&#39;GEO_REQUESTS_LAST_HOUR&#39;, tm, &quot;#{geo_rpm.to_i}:#{tm}&quot;)
      pipeline.zadd(&#39;UNIQUE_VISITORS_LAST_HOUR&#39;, tm, &quot;#{unique_visitors_last_minute}:#{tm}&quot;)

      pipeline.del(&#39;USER_REQUESTS_MINUTE&#39;)
      pipeline.del(&#39;OTHER_REQUESTS_MINUTE&#39;)
      pipeline.del(&#39;GEO_REQUESTS_MINUTE&#39;)
      pipeline.del(&#39;UNIQUE_VISITORS_MINUTE&#39;)
    end
  end&lt;/span&gt;&lt;/span&gt;&lt;span face=&quot;arial, sans-serif&quot; style=&quot;white-space: pre-wrap;&quot;&gt;
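
&lt;span style=&quot;font-family: courier;&quot;&gt;  # Illustrative sketch only, not the app&#39;s actual code. The method names and the
  # one-hour window are assumptions; both methods rely on the timestamp scores and
  # the &quot;count:timestamp&quot; members written by aggregate_minute above.

  # Trim the hourly sorted sets so they only hold the last hour of entries.
  def self.trim_last_hour
    cutoff = (Time.current - 1.hour).to_i

    $redis.pipelined do |pipeline|
      %w[USER_REQUESTS_LAST_HOUR OTHER_REQUESTS_LAST_HOUR
         GEO_REQUESTS_LAST_HOUR UNIQUE_VISITORS_LAST_HOUR].each do |key|
        # ZREMRANGEBYSCORE key min max: drop members scored at or before the cutoff
        pipeline.zremrangebyscore(key, &#39;-inf&#39;, cutoff)
      end
    end
  end

  # Read the per-minute counts back, oldest first. The count is the part of
  # each member before the colon.
  def self.last_hour_counts(key)
    $redis.zrange(key, 0, -1).map { |member| member.split(&#39;:&#39;).first.to_i }
  end&lt;/span&gt;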
&lt;/span&gt;&lt;/blockquote&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;font-family: inherit;&quot;&gt;&lt;span style=&quot;white-space: pre-wrap;&quot;&gt;As you can see, there are two types of counters. One tracks the count for each minute, the other aggregates it for the hour. So for example, take &lt;/span&gt;&lt;span style=&quot;white-space: pre-wrap;&quot;&gt;&lt;i&gt;USER_REQUESTS_MINUTE&lt;/i&gt;. This is incremented for each request made on behalf of a logged in user. Then at the start of the next minute, this counter is added to the sorted set &lt;/span&gt;&lt;span style=&quot;white-space: pre-wrap;&quot;&gt;&lt;i&gt;USER_REQUESTS_LAST_HOUR&lt;/i&gt; and then immediately deleted.&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;white-space: pre-wrap;&quot;&gt;&lt;span style=&quot;font-family: inherit;&quot;&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;white-space: pre-wrap;&quot;&gt;&lt;span style=&quot;font-family: inherit;&quot;&gt;You should trim the aggregated sets every M hours, since otherwise they will keep growing and eventually take up all of Redis memory! Removing the members whose score is older than the window you care about (the ZREMRANGEBYSCORE command does this) is fairly straightforward to implement.&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;white-space: pre-wrap;&quot;&gt;&lt;span style=&quot;font-family: inherit;&quot;&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;white-space: pre-wrap;&quot;&gt;&lt;span style=&quot;font-family: inherit;&quot;&gt;After having implemented this solution and writing this article, I have come to see that there are other ways to implement counting using Redis. 
Redis provides such a versatile set of data structures and algorithms that there is always a simpler or more elegant technique somewhere!&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;white-space: pre-wrap;&quot;&gt;&lt;span style=&quot;font-family: inherit;&quot;&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;white-space: pre-wrap;&quot;&gt;&lt;span style=&quot;font-family: inherit;&quot;&gt;For example, when we use a container like a sorted set or a list, we must set its bounds, clearing it at certain time intervals and thus restricting its memory usage. But if you use the &lt;a href=&quot;https://redis.io/docs/stack/&quot; target=&quot;_blank&quot;&gt;Redis stack&lt;/a&gt;, there is an excellent data structure - the Redis &lt;a href=&quot;https://redis.io/docs/stack/timeseries/&quot; target=&quot;_blank&quot;&gt;Timeseries&lt;/a&gt; that does much of this bookkeeping for you. Basically, you can configure the time series to expire old entries (something that most other Redis data structures do not do for you - you can expire the complete key or nothing at all). Besides that it has commands very similar to the set or the sorted set.&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;white-space: pre-wrap;&quot;&gt;&lt;span style=&quot;font-family: inherit;&quot;&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;white-space: pre-wrap;&quot;&gt;&lt;span style=&quot;font-family: inherit;&quot;&gt;Another advantage of a time series vs a sorted set would be the trivially simple management of a &quot;rolling&quot; window of counts. 
In performance monitoring you typically want the &quot;last 72 hours&quot; or the &quot;last 30 days&quot; of data, which is more useful than data &quot;for all of today&quot; or &quot;for the current hour&quot;.&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;white-space: pre-wrap;&quot;&gt;&lt;span style=&quot;font-family: inherit;&quot;&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;white-space: pre-wrap;&quot;&gt;&lt;span style=&quot;font-family: inherit;&quot;&gt;I leave this as an exercise to the reader. Maybe I can talk about this in greater detail in a future article as well!&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;span style=&quot;font-family: inherit;&quot;&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;span&gt;&lt;a name=&#39;more&#39;&gt;&lt;/a&gt;&lt;/span&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;p&gt;&lt;span face=&quot;arial, sans-serif&quot; style=&quot;font-size: small; white-space: pre-wrap;&quot;&gt;This post is in collaboration with Redis.&lt;/span&gt;&lt;/p&gt;&lt;p&gt;&lt;span face=&quot;arial, sans-serif&quot; style=&quot;font-size: small; font-weight: 700; white-space: pre-wrap;&quot;&gt;Learn more:&lt;/span&gt;&lt;/p&gt;&lt;div style=&quot;background-color: white; color: #222222; font-family: Arial, Helvetica, sans-serif; font-size: small;&quot;&gt;&lt;span id=&quot;m_728717110910178134m_5025674836303415239gmail-m_7284779258426388204gmail-m_-3702633914473850678gmail-m_8952040696209478705gmail-m_-1813321305228359964gmail-m_1262458826033144780gmail-m_-7682981764392618003gmail-docs-internal-guid-ffb92f76-7fff-b50f-4503-7982e6d77dea&quot;&gt;&lt;span face=&quot;arial, sans-serif&quot;&gt;&lt;p dir=&quot;ltr&quot; style=&quot;line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;&quot;&gt;&lt;br /&gt;&lt;/p&gt;&lt;ul style=&quot;margin-bottom: 0px; margin-top: 0px;&quot;&gt;&lt;li dir=&quot;ltr&quot; style=&quot;background-color: 
transparent; color: black; font-variant-east-asian: normal; font-variant-numeric: normal; list-style-type: disc; margin-left: 36pt; vertical-align: baseline; white-space: pre-wrap;&quot;&gt;&lt;p dir=&quot;ltr&quot; role=&quot;presentation&quot; style=&quot;line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;&quot;&gt;&lt;a data-saferedirecturl=&quot;https://www.google.com/url?q=https://redis.info/3NBGJRT&amp;amp;source=gmail&amp;amp;ust=1660847741884000&amp;amp;usg=AOvVaw3OvlJHE4walg8cPTDs6_EZ&quot; href=&quot;https://redis.info/3NBGJRT&quot; style=&quot;color: #1155cc; text-decoration-line: none;&quot; target=&quot;_blank&quot;&gt;&lt;span style=&quot;background-color: transparent; font-variant-east-asian: normal; font-variant-numeric: normal; text-decoration-line: underline; vertical-align: baseline;&quot;&gt;Try Redis Cloud for free&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;&lt;li dir=&quot;ltr&quot; style=&quot;background-color: transparent; color: black; font-variant-east-asian: normal; font-variant-numeric: normal; list-style-type: disc; margin-left: 36pt; vertical-align: baseline; white-space: pre-wrap;&quot;&gt;&lt;p dir=&quot;ltr&quot; role=&quot;presentation&quot; style=&quot;line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;&quot;&gt;&lt;a data-saferedirecturl=&quot;https://www.google.com/url?q=https://redis.info/3Ga9YII&amp;amp;source=gmail&amp;amp;ust=1660847741884000&amp;amp;usg=AOvVaw3WFQ5HF6OUmX0bUxd401Fk&quot; href=&quot;https://redis.info/3Ga9YII&quot; style=&quot;color: #1155cc; text-decoration-line: none;&quot; target=&quot;_blank&quot;&gt;&lt;span style=&quot;background-color: transparent; font-variant-east-asian: normal; font-variant-numeric: normal; text-decoration-line: underline; vertical-align: baseline;&quot;&gt;Watch this video on the benefits of Redis Cloud over other Redis providers&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;&lt;li dir=&quot;ltr&quot; style=&quot;background-color: transparent; color: black; font-variant-east-asian: normal; font-variant-numeric: normal; list-style-type: disc; margin-left: 36pt; vertical-align: baseline; white-space: pre-wrap;&quot;&gt;&lt;p dir=&quot;ltr&quot; role=&quot;presentation&quot; style=&quot;line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;&quot;&gt;&lt;a data-saferedirecturl=&quot;https://www.google.com/url?q=https://redis.info/3LC4GqB&amp;amp;source=gmail&amp;amp;ust=1660847741884000&amp;amp;usg=AOvVaw2kSKn9ughOQAfpSuz-kRW7&quot; href=&quot;https://redis.info/3LC4GqB&quot; style=&quot;color: #1155cc; text-decoration-line: none;&quot; target=&quot;_blank&quot;&gt;&lt;span style=&quot;background-color: transparent; font-variant-east-asian: normal; font-variant-numeric: normal; text-decoration-line: underline; vertical-align: baseline;&quot;&gt;Redis Developer Hub - tools, guides, and tutorials about Redis&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;&lt;li dir=&quot;ltr&quot; style=&quot;background-color: transparent; color: black; font-variant-east-asian: normal; font-variant-numeric: normal; list-style-type: disc; margin-left: 36pt; vertical-align: baseline; white-space: pre-wrap;&quot;&gt;&lt;p dir=&quot;ltr&quot; role=&quot;presentation&quot; style=&quot;line-height: 1.38; margin-bottom: 0pt; margin-top: 0pt;&quot;&gt;&lt;a 
data-saferedirecturl=&quot;https://www.google.com/url?q=https://redis.info/3wMR7PR&amp;amp;source=gmail&amp;amp;ust=1660847741884000&amp;amp;usg=AOvVaw1yy4M9iSMCYouDeG8n7XDv&quot; href=&quot;https://redis.info/3wMR7PR&quot; style=&quot;color: #1155cc; text-decoration-line: none;&quot; target=&quot;_blank&quot;&gt;&lt;span style=&quot;background-color: transparent; font-variant-east-asian: normal; font-variant-numeric: normal; text-decoration-line: underline; vertical-align: baseline;&quot;&gt;RedisInsight Desktop GUI&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://thushw.blogspot.com/feeds/5830703354515209078/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/11265228/5830703354515209078' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/11265228/posts/default/5830703354515209078'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/11265228/posts/default/5830703354515209078'/><link rel='alternate' type='text/html' href='http://thushw.blogspot.com/2022/08/using-redis-for-web-request-counting.html' title='Using Redis for web request counting'/><author><name>Unknown</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjliVxPSESbGhLTMl1lK7amAZAnDBr3wg2LohY5F4Ncih5G3EnSYAozaU3u_J0jnY4N0T9jh4nPM-4w4UE3ReRAgDKPap3ZYLMQRZ3CH2FPK3Mt82XkluWr_N7d_5eE5suJ7vOY5LnMyJqUYOtZjIbqx6uTr16ir-H3EL79adtoaBfzuXygRLQ/s72-c/Screen%20Shot%202022-08-25%20at%2011.38.58%20AM.png" height="72" 
width="72"/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-11265228.post-4428572095461200047</id><published>2020-12-18T11:27:00.001-08:00</published><updated>2020-12-18T11:27:26.555-08:00</updated><title type='text'>Set GIT Hash on Heroku Deploys</title><content type='html'>&lt;p&gt;&lt;/p&gt;&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: center;&quot;&gt;&lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEihUrjGpVTNwUw1pPsA_ituKYe877LSFWs9uAzaHZ4DKQPdOa8BWIPMS0ZsDsnC08h3iSMIoHh_xkbmkKK0jOrPWRaLkCk8LJErWeZ713hbHidm7KcfCxcGGsPZ2R4HMzOtL-HXtg/s200/git.png&quot; imageanchor=&quot;1&quot; style=&quot;clear: left; float: left; margin-bottom: 1em; margin-right: 1em;&quot;&gt;&lt;img border=&quot;0&quot; data-original-height=&quot;200&quot; data-original-width=&quot;200&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEihUrjGpVTNwUw1pPsA_ituKYe877LSFWs9uAzaHZ4DKQPdOa8BWIPMS0ZsDsnC08h3iSMIoHh_xkbmkKK0jOrPWRaLkCk8LJErWeZ713hbHidm7KcfCxcGGsPZ2R4HMzOtL-HXtg/s0/git.png&quot; /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;There are &lt;a href=&quot;https://stackoverflow.com/questions/14583282/heroku-display-hash-of-current-commit-in-browser&quot; target=&quot;_blank&quot;&gt;several ways to set the GIT hash for heroku&lt;/a&gt;. I prefer setting this just before pushing the main branch to the heroku remote.&lt;p&gt;&lt;/p&gt;&lt;p&gt;Also, I prefer to automate this with a `pre-push` hook.&lt;/p&gt;&lt;p&gt;One issue I ran into was that it was not straightforward to know if there is anything to push. Meaning, if the remote was up-to-date or not. 
This is important, as we don&#39;t want heroku deploying a version when there are no changes to be deployed (which it will if we set the APP_VERSION).&lt;/p&gt;&lt;p&gt;The simplest way to know if the remote was already up-to-date seemed to be to do a `--dry-run` of `git push`.&lt;/p&gt;&lt;p&gt;But of course that runs the hook again, so this is an endless loop situation.&lt;/p&gt;&lt;p&gt;It does not seem that git allows us to see the git argument on the hook. If it did, we could explicitly not run the hook on a dry run.&lt;/p&gt;&lt;p&gt;But there is a `--no-verify` option that bypasses the hooks, and can be used when we do our dry run.&lt;/p&gt;&lt;p&gt;Here is the complete script:&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: center;&quot;&gt;&lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhjtP2LLBZSxndHrlVZgPns8Itsjn_WPu9I3kyaXdlHmcX7k6f4K-yqUIeifDMZD-2yvNZONS7KAv0quseK-27B2XaziPLqdfxJtthlWdmGCRkySwuZOxZTQ19W32sIe630V0hZ1w/s1422/Screen+Shot+2020-12-18+at+11.22.18+AM.png&quot; imageanchor=&quot;1&quot; style=&quot;margin-left: 1em; margin-right: 1em;&quot;&gt;&lt;img border=&quot;0&quot; data-original-height=&quot;1422&quot; data-original-width=&quot;1370&quot; height=&quot;320&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhjtP2LLBZSxndHrlVZgPns8Itsjn_WPu9I3kyaXdlHmcX7k6f4K-yqUIeifDMZD-2yvNZONS7KAv0quseK-27B2XaziPLqdfxJtthlWdmGCRkySwuZOxZTQ19W32sIe630V0hZ1w/s320/Screen+Shot+2020-12-18+at+11.22.18+AM.png&quot; /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;</content><link rel='replies' type='application/atom+xml' href='http://thushw.blogspot.com/feeds/4428572095461200047/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/11265228/4428572095461200047' title='0 Comments'/><link rel='edit' type='application/atom+xml' 
href='http://www.blogger.com/feeds/11265228/posts/default/4428572095461200047'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/11265228/posts/default/4428572095461200047'/><link rel='alternate' type='text/html' href='http://thushw.blogspot.com/2020/12/set-git-hash-on-heroku-deploys.html' title='Set GIT Hash on Heroku Deploys'/><author><name>Unknown</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEihUrjGpVTNwUw1pPsA_ituKYe877LSFWs9uAzaHZ4DKQPdOa8BWIPMS0ZsDsnC08h3iSMIoHh_xkbmkKK0jOrPWRaLkCk8LJErWeZ713hbHidm7KcfCxcGGsPZ2R4HMzOtL-HXtg/s72-c/git.png" height="72" width="72"/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-11265228.post-6351292119167452852</id><published>2019-11-30T17:47:00.001-08:00</published><updated>2019-12-01T10:33:05.950-08:00</updated><title type='text'>Rails database migrations on a cluster with AWS CodeDeploy</title><content type='html'>&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: center;&quot;&gt;
&lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhb-_M9fqGYAv-_EX6NDqE0ISYioT3Lba2SYcpGClDHf_2k5Wmnz1l7cUMVFSMifqcFUb0pslOlg08BiED0QZvYQ74RzUKY2u2zNyP7fanYYDvlYk-4xP9g0XwWMyRtwaUi9XNi9w/s1600/ror.png&quot; imageanchor=&quot;1&quot; style=&quot;clear: left; float: left; margin-bottom: 1em; margin-right: 1em;&quot;&gt;&lt;img border=&quot;0&quot; data-original-height=&quot;200&quot; data-original-width=&quot;200&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhb-_M9fqGYAv-_EX6NDqE0ISYioT3Lba2SYcpGClDHf_2k5Wmnz1l7cUMVFSMifqcFUb0pslOlg08BiED0QZvYQ74RzUKY2u2zNyP7fanYYDvlYk-4xP9g0XwWMyRtwaUi9XNi9w/s1600/ror.png&quot; /&gt;&lt;/a&gt;&lt;/div&gt;
A typical Ruby on Rails deployment comprises a number of web servers behind a load balancer. Each web server reads from and writes to a central database. As new features are added to the application, the database schema goes through changes, which we refer to as &quot;migrations&quot;.&lt;br /&gt;
&lt;br /&gt;
When new code is deployed, the migrations that the new code needs must be run first. If AWS CodeDeploy is used to deploy the new code, we can set up the&amp;nbsp;&lt;i&gt;AfterInstall&lt;/i&gt;&amp;nbsp;hooks to run the migrations before restarting the web server.&lt;br /&gt;
&lt;br /&gt;
So the usual flow in a deployment goes something like this:&lt;br /&gt;
&lt;br /&gt;
&lt;ol&gt;
&lt;li&gt;Stop the web server&lt;/li&gt;
&lt;li&gt;Migrate the database&lt;/li&gt;
&lt;li&gt;Start the web server&lt;/li&gt;
&lt;/ol&gt;
&lt;br /&gt;
However, our application is hosted on a number of web servers. We don&#39;t want to bring down all servers at once. A typical blue/green deployment will have us deploy to just one third of the server fleet at once.&lt;br /&gt;
&lt;br /&gt;
So if we have 27 web servers, we will be running the above steps on 9 of them at the same time. The main problem with this is that when the Rails migration runs on multiple servers at once, it is likely to fail on a number of them. This is because Rails takes an advisory lock on the database and throws an exception on concurrent migrations. You can read more about the advisory locking, as well as a way to work around the problem, &lt;a href=&quot;https://nebulab.it/blog/the-strange-case-of-activerecord-concurrentmigrationerror/&quot; target=&quot;_blank&quot;&gt;here&lt;/a&gt;.&lt;br /&gt;
&lt;br /&gt;
But that workaround is not without its drawbacks. If you prevent the migrations from running on all but one machine, the new code may go live on the other machines before the migration has finished. This is especially true for long-running migrations. New code would then be running against an old database schema, and new features that depend on the new schema will likely fail.&lt;br /&gt;
&lt;br /&gt;
A better solution would be:&lt;br /&gt;
&lt;ol&gt;
&lt;li&gt;Run the migration on a single instance - this could be one of the web servers, or a dynamically provisioned EC2 instance that can access the database.&lt;/li&gt;
&lt;li&gt;For all web servers:
&lt;ol&gt;
&lt;li&gt;Stop the server&lt;/li&gt;
&lt;li&gt;Deploy new code to it&lt;/li&gt;
&lt;li&gt;Start the server&lt;/li&gt;
&lt;/ol&gt;
&lt;/li&gt;
&lt;/ol&gt;
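&lt;div&gt;
The ordering in these steps can be sketched in a few lines of Python. The function names plan_rollout and deploy, and the callbacks passed in, are hypothetical placeholders rather than CodeDeploy APIs. For simplicity the sketch walks the servers in a batch one after another, whereas a real deployment would handle a batch concurrently.&lt;/div&gt;
&lt;pre&gt;
```python
from math import ceil


def plan_rollout(servers, parts=3):
    """Split the fleet into batches of roughly one third each."""
    size = max(1, ceil(len(servers) / parts))
    return [servers[i:i + size] for i in range(0, len(servers), size)]


def deploy(servers, migrate, stop, install, start):
    """Migrate once on a single instance, then roll out batch by batch."""
    migrate()  # step 1: the schema changes before any new code goes live
    for batch in plan_rollout(servers):
        for server in batch:  # step 2: stop, deploy, start
            stop(server)
            install(server)
            start(server)
```
&lt;/pre&gt;
&lt;div&gt;
With 27 servers, plan_rollout yields three batches of 9, matching the example above.&lt;/div&gt;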
&lt;div&gt;
The advantages of this solution are:&lt;/div&gt;
&lt;div&gt;
&lt;ol&gt;
&lt;li&gt;We side-step the concurrent migration issue. We run the migrations on a single instance and then do the rest of the deployment without incurring any database issues.&lt;/li&gt;
&lt;li&gt;We bring up the new code only after the database is migrated, so the new features work reliably from the start.&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;
&lt;div&gt;
This does require the database schema changes to be backward compatible. But that is a constraint we have anyway, since during a blue/green deployment some servers are still running old code against the new database.&lt;/div&gt;
&lt;div&gt;
&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
While this solution is pretty straightforward, it requires some effort to implement this in the AWS CodeDeploy environment.&lt;/div&gt;
&lt;div&gt;
&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
What I ended up doing was to create a new Deployment Group (called &lt;i&gt;staging&lt;/i&gt;) that brings up a single EC2 instance, and to change the startup code so that the migration runs on that deployment group rather than on the production web servers. Then I hooked this deployment group into the pipeline right after the deployment to a test instance, but before the code is deployed to the production servers.&lt;/div&gt;
&lt;div&gt;
&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
In the startup code, we can check the current deployment group via &lt;i&gt;ENV[&#39;DEPLOYMENT_GROUP_NAME&#39;]&lt;/i&gt;. In our scripts, we set &lt;i&gt;RAILS_ENV&lt;/i&gt; equal to the Deployment Group. This allows code to take different paths based on where it runs (in a local dev environment, on a staging server or, as in this case, on a &lt;i&gt;migrator&lt;/i&gt; server).&lt;br /&gt;
&lt;br /&gt;
This is what our migrate script now looks like:&lt;br /&gt;
&lt;br /&gt;
&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: center;&quot;&gt;
&lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiI0gkC9mgjj7MvyTw113T3dzqdEW1mEyZIVZkLx6fZnO34uc0n1BImv_5q8f7GaQ9ZH0ptAQHfmv-bVDpDnJfeR6-4DvN0Be07Z-_mEmRh1pd2iGj4O2RyPhTHzE5AfHYgt_RCDQ/s1600/Screen+Shot+2019-11-30+at+5.30.44+PM.png&quot; imageanchor=&quot;1&quot; style=&quot;margin-left: 1em; margin-right: 1em;&quot;&gt;&lt;img border=&quot;0&quot; data-original-height=&quot;186&quot; data-original-width=&quot;756&quot; height=&quot;78&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiI0gkC9mgjj7MvyTw113T3dzqdEW1mEyZIVZkLx6fZnO34uc0n1BImv_5q8f7GaQ9ZH0ptAQHfmv-bVDpDnJfeR6-4DvN0Be07Z-_mEmRh1pd2iGj4O2RyPhTHzE5AfHYgt_RCDQ/s320/Screen+Shot+2019-11-30+at+5.30.44+PM.png&quot; width=&quot;320&quot; /&gt;&lt;/a&gt;&lt;/div&gt;
&lt;br /&gt;
It is important to use an inequality here, as we still want the migrations to run on our test servers - we just don&#39;t want them running on the production web servers.&lt;br /&gt;
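&lt;br /&gt;
The check itself can be sketched in Python. The actual script is the shell snippet in the screenshot above, so this is only an illustration: the function name should_run_migrations and the group name &quot;production&quot; are assumptions. The point is that every group except production runs the migrations.&lt;br /&gt;
&lt;pre&gt;
```python
import os


def should_run_migrations(env=None, production_group="production"):
    """Return True for every deployment group except production.

    DEPLOYMENT_GROUP_NAME is read from the environment; the default group
    name "production" is an assumption made for this sketch.
    """
    env = os.environ if env is None else env
    return env.get("DEPLOYMENT_GROUP_NAME", "") != production_group
```
&lt;/pre&gt;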
&lt;br /&gt;
We add this to our database.yml. Notice that the environment is &lt;i&gt;staging&lt;/i&gt;, to match the deployment group, and that the database is the production instance.&lt;br /&gt;
&lt;br /&gt;
&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: center;&quot;&gt;
&lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjjXGSyGQ4OuYhxOM9Ecabomvna9yD4B__zQktXSmWjD2q_XOm7lz_kVx5JjCYo62WlUlP6H_TQo3gkED-j3CdDiQD8cZssDz50rE7e7qELdUJR_rCKH-ji4JiJdxACn9oNYUnKIw/s1600/Screen+Shot+2019-11-30+at+5.32.57+PM.png&quot; imageanchor=&quot;1&quot; style=&quot;margin-left: 1em; margin-right: 1em;&quot;&gt;&lt;img border=&quot;0&quot; data-original-height=&quot;360&quot; data-original-width=&quot;1520&quot; height=&quot;75&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjjXGSyGQ4OuYhxOM9Ecabomvna9yD4B__zQktXSmWjD2q_XOm7lz_kVx5JjCYo62WlUlP6H_TQo3gkED-j3CdDiQD8cZssDz50rE7e7qELdUJR_rCKH-ji4JiJdxACn9oNYUnKIw/s320/Screen+Shot+2019-11-30+at+5.32.57+PM.png&quot; width=&quot;320&quot; /&gt;&lt;/a&gt;&lt;/div&gt;
&lt;br /&gt;
In our case, we read credentials from AWS Secrets Manager. You don&#39;t have to.&lt;br /&gt;
&lt;br /&gt;
This is what our &lt;i&gt;staging&lt;/i&gt; step in CodePipeline looks like:&lt;br /&gt;
&lt;br /&gt;
&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: center;&quot;&gt;
&lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgbOEQysrBpmCPMkrQq3vIwF15MvfBumuT0CfJuhVn8RHouTPXYWnpKWuntqkx-9WCfHd-02NTPR563FOyZjI1A1yHSmbgGCcsgt3DR2aQzej0Rs3C91Dcs2PnLUaqyvxzXTi_88Q/s1600/Screen+Shot+2019-11-30+at+5.40.12+PM.png&quot; imageanchor=&quot;1&quot; style=&quot;margin-left: 1em; margin-right: 1em;&quot;&gt;&lt;img border=&quot;0&quot; data-original-height=&quot;1154&quot; data-original-width=&quot;1196&quot; height=&quot;308&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgbOEQysrBpmCPMkrQq3vIwF15MvfBumuT0CfJuhVn8RHouTPXYWnpKWuntqkx-9WCfHd-02NTPR563FOyZjI1A1yHSmbgGCcsgt3DR2aQzej0Rs3C91Dcs2PnLUaqyvxzXTi_88Q/s320/Screen+Shot+2019-11-30+at+5.40.12+PM.png&quot; width=&quot;320&quot; /&gt;&lt;/a&gt;&lt;/div&gt;
&lt;br /&gt;
On the last CodeDeploy step, hit the edit button and set the Application Name and the Deployment Group correctly.&lt;br /&gt;
&lt;br /&gt;
&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: center;&quot;&gt;
&lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgQ0SGvDImaVkI0gpIPrphiFf28jRBgB72JYzVlsTTJo6Oy54wKhRs5cVl7oLSL6i6uxRp0zbyvetnixYhNBZnet6msCbh47QbHD8oCU9koO1ornGKlRa-tQllILdJhuQWj_sgQQA/s1600/Screen+Shot+2019-11-30+at+5.42.44+PM.png&quot; imageanchor=&quot;1&quot; style=&quot;margin-left: 1em; margin-right: 1em;&quot;&gt;&lt;img border=&quot;0&quot; data-original-height=&quot;223&quot; data-original-width=&quot;1600&quot; height=&quot;44&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgQ0SGvDImaVkI0gpIPrphiFf28jRBgB72JYzVlsTTJo6Oy54wKhRs5cVl7oLSL6i6uxRp0zbyvetnixYhNBZnet6msCbh47QbHD8oCU9koO1ornGKlRa-tQllILdJhuQWj_sgQQA/s320/Screen+Shot+2019-11-30+at+5.42.44+PM.png&quot; width=&quot;320&quot; /&gt;&lt;/a&gt;&lt;/div&gt;
&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
&lt;br /&gt;
Now before the code is deployed to the production servers, the database migration has been completed on the &lt;i&gt;staging&lt;/i&gt; instance. If the migration fails, CodeDeploy won&#39;t advance to the deploy stage. When the production servers start with new code, they will all use the new database schema.&lt;br /&gt;
&lt;br /&gt;
After the migration has finished and before code is deployed, the old code will start using the new database schema. As long as the new schema is backward compatible, this will not cause a problem.&lt;br /&gt;
&lt;br /&gt;
You may have to run the release pipeline a few times until AWS co-operates with the new changes. But it should eventually start working.&lt;/div&gt;
&lt;div&gt;
&lt;/div&gt;
</content><link rel='replies' type='application/atom+xml' href='http://thushw.blogspot.com/feeds/6351292119167452852/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/11265228/6351292119167452852' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/11265228/posts/default/6351292119167452852'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/11265228/posts/default/6351292119167452852'/><link rel='alternate' type='text/html' href='http://thushw.blogspot.com/2019/11/rails-database-migrations-with-aws.html' title='Rails database migrations on a cluster with AWS CodeDeploy'/><author><name>Unknown</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhb-_M9fqGYAv-_EX6NDqE0ISYioT3Lba2SYcpGClDHf_2k5Wmnz1l7cUMVFSMifqcFUb0pslOlg08BiED0QZvYQ74RzUKY2u2zNyP7fanYYDvlYk-4xP9g0XwWMyRtwaUi9XNi9w/s72-c/ror.png" height="72" width="72"/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-11265228.post-3369669319434414676</id><published>2019-07-08T23:33:00.003-07:00</published><updated>2019-07-08T23:45:48.510-07:00</updated><title type='text'>Kinesis/Firehose/Athena - Creating queryable data from logs</title><content type='html'>&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: center;&quot;&gt;
&lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgHxwpuO2WL04IJh0csWxvM4oV-oa-N2DvPyJV6GehuxlF_WA_bUHR7FTeI_9ax1JWAsMEEefsnYMKFvUJe5144KsJY7OLwa0b7NeUyjtplu8fz0h-lsFhuXfEbOaCduk3s7SjOpA/s1600/firehose.jpg&quot; imageanchor=&quot;1&quot; style=&quot;clear: left; float: left; margin-bottom: 1em; margin-right: 1em;&quot;&gt;&lt;img border=&quot;0&quot; data-original-height=&quot;500&quot; data-original-width=&quot;1000&quot; height=&quot;160&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgHxwpuO2WL04IJh0csWxvM4oV-oa-N2DvPyJV6GehuxlF_WA_bUHR7FTeI_9ax1JWAsMEEefsnYMKFvUJe5144KsJY7OLwa0b7NeUyjtplu8fz0h-lsFhuXfEbOaCduk3s7SjOpA/s320/firehose.jpg&quot; width=&quot;320&quot; /&gt;&lt;/a&gt;&lt;/div&gt;
Amazon has a data transformation pipeline that allows log data to be queried with a SQL-like syntax. This can be useful for gaining insight that is buried in log data generally thought of as temporary. When was the last time you went over 6 months of logs? Right, just what I thought.&lt;br /&gt;
&lt;div&gt;
&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
Wading through logs is painful and with the growth of data all the more so. No wonder that when confronted with the task of gleaning information from past data, engineers build specialized table structures with relational queries in mind or provision specialized map/reduce jobs to crunch over the data for detailed answers to specific questions.&lt;/div&gt;
&lt;div&gt;
&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
But this time-consuming exercise can be done away with by using the Amazon Kinesis pipeline. The flow looks something like this: the application writes a JSON-formatted record that captures a particular item of interest to a Kinesis data stream.&amp;nbsp; A Firehose instance is attached to the output of the data stream. This Firehose instance converts the data to a &quot;JSON like&quot; format and writes it into an S3 bucket at a specified folder. Another Amazon service, Glue, provides a crawler that can then process new files that get uploaded to the S3 bucket. The Glue crawler infers the schema from the JSON it finds in the S3 files and creates a Glue table. To query the data, Amazon provides yet another service, Athena, which sports a SQL syntax and a user-friendly query console. Phew, yeah, it is quite the mother of all pipelines.&lt;/div&gt;
&lt;div&gt;
&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
This is all pretty straightforward to set up, starting from the Kinesis console itself. Start with the Data streams tab in Kinesis and create a data stream, then create a Kinesis Firehose with its source set to the data stream you just created. Specify that Firehose data will be written with the API like so:&lt;/div&gt;
&lt;div&gt;
&lt;br /&gt;&lt;/div&gt;
&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: center;&quot;&gt;
&lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhOkoiKGILB3F2Waar5DDBsyh_vKt3mK-o3-TbORfEObsz8MEjvsB8SXLyGIT9jOQZ0HER46cTBer0uuYYhNLqbmOwiZT8_mNQ2OqJRSa9NV-PhB4D6rhBfAHVjdihih8uEMWbD7A/s1600/Screen+Shot+2019-07-08+at+10.43.24+PM.png&quot; imageanchor=&quot;1&quot; style=&quot;margin-left: 1em; margin-right: 1em;&quot;&gt;&lt;img border=&quot;0&quot; data-original-height=&quot;198&quot; data-original-width=&quot;1600&quot; height=&quot;48&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhOkoiKGILB3F2Waar5DDBsyh_vKt3mK-o3-TbORfEObsz8MEjvsB8SXLyGIT9jOQZ0HER46cTBer0uuYYhNLqbmOwiZT8_mNQ2OqJRSa9NV-PhB4D6rhBfAHVjdihih8uEMWbD7A/s400/Screen+Shot+2019-07-08+at+10.43.24+PM.png&quot; width=&quot;400&quot; /&gt;&lt;/a&gt;&lt;/div&gt;
&lt;div&gt;
&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
Since we are writing JSON to Kinesis, there is no need to convert the record format, and we will pass the data to Firehose as is, without transformation (more on this later). So we can leave the default settings for&amp;nbsp;&lt;span style=&quot;background-color: white; color: #444444; font-family: &amp;quot;helvetica neue&amp;quot; , &amp;quot;roboto&amp;quot; , &amp;quot;arial&amp;quot; , sans-serif; font-size: 14px; font-weight: 700; text-align: right;&quot;&gt;Source record transformation&lt;/span&gt;&amp;nbsp;and&amp;nbsp;&lt;span style=&quot;background-color: white; color: #444444; font-family: &amp;quot;helvetica neue&amp;quot; , &amp;quot;roboto&amp;quot; , &amp;quot;arial&amp;quot; , sans-serif; font-size: 14px; font-weight: 700; text-align: right;&quot;&gt;Record format conversion&lt;/span&gt;.&amp;nbsp;&lt;/div&gt;
&lt;div&gt;
&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
Finally you need to specify where you want this data to live, in our case S3:&lt;/div&gt;
&lt;div&gt;
&lt;br /&gt;&lt;/div&gt;
&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: center;&quot;&gt;
&lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhIh4xqvUjaeR-7iCmw88elNSBS_U7Qit0s_IdjrZM5z3LNh1uTZ3xQM3QYGHWYw6PSb_B6lHePM_YqT1SUtHpm_NuXit987d2iSNeeliG5gYhrOqqFN6yAQ2P3BS0sCSI0GaKURQ/s1600/Screen+Shot+2019-07-08+at+10.50.06+PM.png&quot; imageanchor=&quot;1&quot; style=&quot;margin-left: 1em; margin-right: 1em;&quot;&gt;&lt;img border=&quot;0&quot; data-original-height=&quot;325&quot; data-original-width=&quot;1600&quot; height=&quot;81&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhIh4xqvUjaeR-7iCmw88elNSBS_U7Qit0s_IdjrZM5z3LNh1uTZ3xQM3QYGHWYw6PSb_B6lHePM_YqT1SUtHpm_NuXit987d2iSNeeliG5gYhrOqqFN6yAQ2P3BS0sCSI0GaKURQ/s400/Screen+Shot+2019-07-08+at+10.50.06+PM.png&quot; width=&quot;400&quot; /&gt;&lt;/a&gt;&lt;/div&gt;
&lt;div&gt;
&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
Now head over to Glue and add a crawler. Specify the S3 folder that you used above for the &quot;include path&quot;. For simplicity, I would start with a single schema for all S3 records, under &quot;Grouping Behaviors&quot;.&lt;/div&gt;
&lt;div&gt;
&lt;br /&gt;&lt;/div&gt;
&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: center;&quot;&gt;
&lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgdzDY4T9vtCEOUwMABdJ-CJQ3K6JyiYUVCIiXtdhyhEmQhYalte8f0Snpx7zU_2C8jdCe-zFbysZRYbFAfI3i4s3Pi9KbFhi7iKwJPj8BLi7HlJhAMN_BOX5mrUIflZrQqiDodMw/s1600/Screen+Shot+2019-07-08+at+10.54.05+PM.png&quot; imageanchor=&quot;1&quot; style=&quot;margin-left: 1em; margin-right: 1em;&quot;&gt;&lt;img border=&quot;0&quot; data-original-height=&quot;312&quot; data-original-width=&quot;1600&quot; height=&quot;77&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgdzDY4T9vtCEOUwMABdJ-CJQ3K6JyiYUVCIiXtdhyhEmQhYalte8f0Snpx7zU_2C8jdCe-zFbysZRYbFAfI3i4s3Pi9KbFhi7iKwJPj8BLi7HlJhAMN_BOX5mrUIflZrQqiDodMw/s400/Screen+Shot+2019-07-08+at+10.54.05+PM.png&quot; width=&quot;400&quot; /&gt;&lt;/a&gt;&lt;/div&gt;
&lt;div&gt;
&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
Now head over to your favorite editor and let&#39;s write some code - finally!&lt;/div&gt;
&lt;div&gt;
It&#39;s up to you how you want to structure the code to do this. In the application I&#39;m building, it is literally the logs that I want sent over to Kinesis. Anticipating this, I wrote a function that the app calls for writing logs, and this function was the ideal place to add in the write to Kinesis. It looks something like this:&lt;/div&gt;
&lt;div&gt;
&lt;br /&gt;&lt;/div&gt;
&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: center;&quot;&gt;
&lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg2Dd76sHLpwmIABiN1Ndw-FgIfn_yNG7j7_fdcbfXuaHFePBNDb4RvCijRIxfrvLByYO58oUZHxx2N8NWjqyiybDLAnqcs8E-1ODH7UbDze37WgPOY1XTmEQpredy-abAkoiT76g/s1600/Screen+Shot+2019-07-08+at+11.01.57+PM.png&quot; imageanchor=&quot;1&quot; style=&quot;margin-left: 1em; margin-right: 1em;&quot;&gt;&lt;img border=&quot;0&quot; data-original-height=&quot;562&quot; data-original-width=&quot;1600&quot; height=&quot;224&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg2Dd76sHLpwmIABiN1Ndw-FgIfn_yNG7j7_fdcbfXuaHFePBNDb4RvCijRIxfrvLByYO58oUZHxx2N8NWjqyiybDLAnqcs8E-1ODH7UbDze37WgPOY1XTmEQpredy-abAkoiT76g/s640/Screen+Shot+2019-07-08+at+11.01.57+PM.png&quot; width=&quot;640&quot; /&gt;&lt;/a&gt;&lt;/div&gt;
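&lt;div&gt;
For readers who cannot see the screenshot, here is a rough Python sketch of the same idea. It is not the code in the image. The function name write_log and the &quot;source&quot; field used as the partition key are hypothetical, and client is assumed to be a boto3 Kinesis client (or any object with a compatible put_record method).&lt;/div&gt;
&lt;pre&gt;
```python
import json


def write_log(client, stream_name, record):
    """Serialize one log record as JSON and put it on a Kinesis data stream.

    `client` is assumed to behave like boto3.client("kinesis").
    """
    payload = json.dumps(record, sort_keys=True)
    return client.put_record(
        StreamName=stream_name,
        Data=payload.encode("utf-8"),
        # The partition key decides which shard receives the record.
        PartitionKey=str(record.get("source", "app")),
    )
```
&lt;/pre&gt;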
&lt;div&gt;
&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
That would be all there is to it, except there is an annoying &lt;a href=&quot;https://forums.aws.amazon.com/thread.jspa?threadID=244858&quot; target=&quot;_blank&quot;&gt;bug in the pipeline that we need to work around&lt;/a&gt;. The issue is that Firehose writes &quot;JSON like&quot; data to S3 with all the records on a single line. The Glue crawler expects each record to be on its own line. So when all the records are squished into a single line in S3, the crawler processes the first and throws away the rest. Imagine my surprise when only 1 out of 17 of my log records appeared in the Athena queries.&lt;/div&gt;
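&lt;div&gt;
The effect is easy to reproduce in miniature. The Python sketch below is only a model of a line-oriented JSON reader, not Glue&#39;s actual implementation: it reads one record per line and ignores anything after the first JSON value on that line. The function name read_line_delimited is made up for the illustration.&lt;/div&gt;
&lt;pre&gt;
```python
import json


def read_line_delimited(blob):
    """Parse a blob as one JSON record per line, like a line-oriented reader."""
    decoder = json.JSONDecoder()
    records = []
    for line in blob.splitlines():
        if not line.strip():
            continue
        # raw_decode parses the first JSON value and reports where it ended;
        # whatever follows on the same line is dropped in this model.
        record, _end = decoder.raw_decode(line)
        records.append(record)
    return records


squished = '{"id": 1}{"id": 2}{"id": 3}'
delimited = '{"id": 1}\n{"id": 2}\n{"id": 3}\n'
```
&lt;/pre&gt;
&lt;div&gt;
Reading squished yields only the first record, while reading delimited yields all three.&lt;/div&gt;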
&lt;div&gt;
&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
The workaround is to write a Lambda function with a Kinesis trigger, so that the Lambda is invoked whenever records are written to the Kinesis stream. It is not invoked once per record, though: Kinesis batches a number of records and invokes the Lambda once per batch. The batch size (or time for trigger) can be specified from the console.&lt;/div&gt;
&lt;div&gt;
&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
Or if you are using &lt;a href=&quot;https://serverless.com/framework/docs/providers/aws/guide/quick-start/&quot; target=&quot;_blank&quot;&gt;serverless&lt;/a&gt;, this can be specified in the serverless.yml like so:&lt;/div&gt;
&lt;div&gt;
&lt;br /&gt;&lt;/div&gt;
&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: center;&quot;&gt;
&lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiGvrzLOHv6ETU_5W-_XY1vKiMbvYRCZK7lQTlxYyR_dwnZMH57mY6gusu1gpFoWEh8PRKmPHhgm0k6iDhxO3IduQhLQ1q3sd8hUCw4WERMZc1QOQwjbhywPbWdPmqLzSZHOBBMCQ/s1600/Screen+Shot+2019-07-08+at+11.21.08+PM.png&quot; imageanchor=&quot;1&quot; style=&quot;margin-left: 1em; margin-right: 1em;&quot;&gt;&lt;img border=&quot;0&quot; data-original-height=&quot;370&quot; data-original-width=&quot;1004&quot; height=&quot;234&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiGvrzLOHv6ETU_5W-_XY1vKiMbvYRCZK7lQTlxYyR_dwnZMH57mY6gusu1gpFoWEh8PRKmPHhgm0k6iDhxO3IduQhLQ1q3sd8hUCw4WERMZc1QOQwjbhywPbWdPmqLzSZHOBBMCQ/s640/Screen+Shot+2019-07-08+at+11.21.08+PM.png&quot; width=&quot;640&quot; /&gt;&lt;/a&gt;&lt;/div&gt;
&lt;div&gt;
&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
Without further ado, here&#39;s the Lambda that adds the newline:&lt;/div&gt;
&lt;div&gt;
&lt;br /&gt;&lt;/div&gt;
&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: center;&quot;&gt;
&lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgRHooF4L7sDlZCFuSmCuiUrJirwPE6nQFyzLgcayqSTJ02kNLqsOAavChwNtn_JQ_4Wpn0mRb62LGQOS9dRyJSNNabhsQz4JkAuOmYKc1TN-Lpn23LdchzG_P7vMJBiCUCo0NxQg/s1600/Screen+Shot+2019-07-08+at+11.24.38+PM.png&quot; imageanchor=&quot;1&quot; style=&quot;margin-left: 1em; margin-right: 1em;&quot;&gt;&lt;img border=&quot;0&quot; data-original-height=&quot;1282&quot; data-original-width=&quot;1008&quot; height=&quot;640&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgRHooF4L7sDlZCFuSmCuiUrJirwPE6nQFyzLgcayqSTJ02kNLqsOAavChwNtn_JQ_4Wpn0mRb62LGQOS9dRyJSNNabhsQz4JkAuOmYKc1TN-Lpn23LdchzG_P7vMJBiCUCo0NxQg/s640/Screen+Shot+2019-07-08+at+11.24.38+PM.png&quot; width=&quot;502&quot; /&gt;&lt;/a&gt;&lt;/div&gt;
&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: center;&quot;&gt;
&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
This is written in node.js, and I used the serverless framework with the node.js template to write it. I&#39;m exporting a single function named newlines. This is triggered when there is a batch of records in the Kinesis data stream. We map over the records, transforming each record by appending a newline. This is done in the add_new_line function.&lt;/div&gt;
&lt;div&gt;
&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
To let the Lambda runtime know how it went, we use the &lt;i&gt;callback&lt;/i&gt;. It is standard node.js practice to pass an error object for errors and null when there are no errors (we succeeded).&lt;/div&gt;
&lt;div&gt;
&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
&lt;i&gt;firehose.putRecordBatch&lt;/i&gt; is for efficiency - we could just as well have used &lt;i&gt;firehose.putRecord&lt;/i&gt; and the results would be the same apart from throughput.&lt;/div&gt;
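&lt;div&gt;
&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
For comparison, the same transformation can be sketched in Python. This is not the node.js Lambda shown above. It assumes the standard Kinesis event shape, where each record&#39;s payload arrives base64-encoded under kinesis.data, and it takes the Firehose client and the delivery stream name as parameters so it can be exercised without AWS. With boto3, the client would be boto3.client(&quot;firehose&quot;).&lt;/div&gt;
&lt;pre&gt;
```python
import base64


def add_new_line(kinesis_record):
    """Decode one Kinesis record and append a newline to its payload."""
    data = base64.b64decode(kinesis_record["kinesis"]["data"])
    return {"Data": data + b"\n"}


def newlines(event, firehose, delivery_stream):
    """Forward a batch of Kinesis records to Firehose, one record per line.

    `firehose` is assumed to behave like boto3.client("firehose").
    """
    records = [add_new_line(record) for record in event["Records"]]
    return firehose.put_record_batch(
        DeliveryStreamName=delivery_stream,
        Records=records,
    )
```
&lt;/pre&gt;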
</content><link rel='replies' type='application/atom+xml' href='http://thushw.blogspot.com/feeds/3369669319434414676/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/11265228/3369669319434414676' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/11265228/posts/default/3369669319434414676'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/11265228/posts/default/3369669319434414676'/><link rel='alternate' type='text/html' href='http://thushw.blogspot.com/2019/07/kinesisfirehoseathena-creating.html' title='Kinesis/Firehose/Athena - Creating queryable data from logs'/><author><name>Unknown</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgHxwpuO2WL04IJh0csWxvM4oV-oa-N2DvPyJV6GehuxlF_WA_bUHR7FTeI_9ax1JWAsMEEefsnYMKFvUJe5144KsJY7OLwa0b7NeUyjtplu8fz0h-lsFhuXfEbOaCduk3s7SjOpA/s72-c/firehose.jpg" height="72" width="72"/><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-11265228.post-6197355768604941237</id><published>2018-06-04T15:26:00.000-07:00</published><updated>2018-06-04T15:26:36.200-07:00</updated><title type='text'>JSON to objects in a few languages</title><content type='html'>&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: center;&quot;&gt;
&lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh-Q7U81enBGlb78IWF_ov6B05gIulYFKf7skEiQlYgV5-ACQVA93ZMo4KW46ohYc1I6DIztIap6KUaEZcbURWYgKVse1IS8O3O6ucnhtf1oEd_WyNUWhMUFLCc0oJ1xNLnNkIPIg/s1600/json.png&quot; imageanchor=&quot;1&quot; style=&quot;clear: left; float: left; margin-bottom: 1em; margin-right: 1em;&quot;&gt;&lt;img border=&quot;0&quot; data-original-height=&quot;120&quot; data-original-width=&quot;120&quot; height=&quot;200&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh-Q7U81enBGlb78IWF_ov6B05gIulYFKf7skEiQlYgV5-ACQVA93ZMo4KW46ohYc1I6DIztIap6KUaEZcbURWYgKVse1IS8O3O6ucnhtf1oEd_WyNUWhMUFLCc0oJ1xNLnNkIPIg/s200/json.png&quot; width=&quot;200&quot; /&gt;&lt;/a&gt;&lt;/div&gt;
When working with data services, we often have a need to convert JSON strings to objects that model our data. Here is a list of code snippets in different languages that convert this &lt;a href=&quot;https://developers.facebook.com/docs/graph-api/using-graph-api/&quot;&gt;Facebook Graph JSON data&lt;/a&gt;.&lt;br /&gt;
&lt;br /&gt;
The list is presented in ascending order of verbosity. Predictably, JavaScript&amp;nbsp;most succinctly expresses its wishes, whereas Java uses a copious amount of code. Scala avoids some of that verbosity by using case classes that remove boilerplate code for constructors. With its default settings, Jackson (Java) relies on getters and setters to identify which attributes of the object are to be serialized, which bloats the code.&lt;br /&gt;
&lt;br /&gt;
JSON:&lt;br /&gt;
&lt;script src=&quot;https://gist.github.com/thushw/e0415703669d53b7ba8527e96d636067.js&quot;&gt;&lt;/script&gt; JavaScript:&lt;br /&gt;
&lt;script src=&quot;https://gist.github.com/thushw/34a6ebdaa71ee9b493442f6bb97a0684.js&quot;&gt;&lt;/script&gt; Ruby:&lt;br /&gt;
&lt;script src=&quot;https://gist.github.com/thushw/40e425e31c83a04718b0e90e936a048f.js&quot;&gt;&lt;/script&gt; Python:&lt;br /&gt;
&lt;script src=&quot;https://gist.github.com/thushw/c29003865097dfa8f9b14d56e2edd0ef.js&quot;&gt;&lt;/script&gt; Scala:&lt;br /&gt;
&lt;script src=&quot;https://gist.github.com/thushw/407303dd34bca921132a70eb5bb45ca3.js&quot;&gt;&lt;/script&gt; Java:&lt;br /&gt;
&lt;script src=&quot;https://gist.github.com/thushw/ff665100f2577e13e4a43a71e2e17b8b.js&quot;&gt;&lt;/script&gt;
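&lt;br /&gt;
As a rough illustration of the pattern (this is not the content of the gists above), here is a minimal Python sketch that turns a JSON string into typed objects. The classes User and Friend, the helper user_from_json, and the sample payload are all hypothetical and do not reflect the actual Facebook Graph response.&lt;br /&gt;
&lt;pre&gt;
```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class Friend:
    id: str
    name: str


@dataclass
class User:
    id: str
    name: str
    friends: List[Friend]


def user_from_json(text):
    """Parse a JSON string and build a User with nested Friend objects."""
    raw = json.loads(text)
    # Each nested dict becomes a Friend; a missing key yields an empty list.
    friends = [Friend(**f) for f in raw.get("friends", [])]
    return User(id=raw["id"], name=raw["name"], friends=friends)


# Hypothetical payload, made up for this sketch.
sample = '{"id": "1", "name": "Ada", "friends": [{"id": "2", "name": "Grace"}]}'
user = user_from_json(sample)
print(user.friends[0].name)  # prints Grace
```
&lt;/pre&gt;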


</content><link rel='replies' type='application/atom+xml' href='http://thushw.blogspot.com/feeds/6197355768604941237/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/11265228/6197355768604941237' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/11265228/posts/default/6197355768604941237'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/11265228/posts/default/6197355768604941237'/><link rel='alternate' type='text/html' href='http://thushw.blogspot.com/2018/06/json-to-objects-in-few-languages.html' title='JSON to objects in a few languages'/><author><name>Unknown</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh-Q7U81enBGlb78IWF_ov6B05gIulYFKf7skEiQlYgV5-ACQVA93ZMo4KW46ohYc1I6DIztIap6KUaEZcbURWYgKVse1IS8O3O6ucnhtf1oEd_WyNUWhMUFLCc0oJ1xNLnNkIPIg/s72-c/json.png" height="72" width="72"/><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-11265228.post-6768296040898438313</id><published>2018-04-20T13:10:00.000-07:00</published><updated>2018-04-20T17:24:17.791-07:00</updated><title type='text'>Goldbach Conjecture</title><content type='html'>In the 18th century, two mathematicians came up with a conjecture - known by its original creator - named Goldbach conjecture. It says that any even number greater than 2 can be expressed as a sum of two primes. There is no theoretical proof for this yet, but it is said to &lt;a href=&quot;https://artofproblemsolving.com/wiki/index.php?title=Goldbach_Conjecture&quot; target=&quot;_blank&quot;&gt;hold up to 400 trillion&lt;/a&gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
A program to test the Goldbach&amp;nbsp;conjecture for a given integer:&lt;br /&gt;
&lt;br /&gt;
&lt;script src=&quot;https://gist.github.com/thushw/fdd2248c5f1c03d579d51e2c923f8241.js&quot;&gt;&lt;/script&gt; This program demonstrates two well-known algorithms.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;ol&gt;
&lt;li&gt;The &lt;a href=&quot;https://en.wikipedia.org/wiki/Sieve_of_Eratosthenes&quot;&gt;sieve of Eratosthenes&lt;/a&gt; to calculate all primes up to a given number&amp;nbsp;&lt;/li&gt;
&lt;li&gt;A linear-time algorithm to find whether two numbers in a list sum to a given number.&lt;/li&gt;
&lt;/ol&gt;
&lt;br /&gt;
&lt;br /&gt;
To check the Goldbach conjecture for a given &lt;i&gt;n&lt;/i&gt;, we use the sieve to find all prime numbers up to &lt;i&gt;n&lt;/i&gt;, then use the linear algorithm to find two primes from this list that sum to &lt;i&gt;n&lt;/i&gt;.&lt;br /&gt;
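&lt;br /&gt;
The two steps can be sketched in Python as follows. This is a minimal sketch and may differ from the embedded gist: the function names primes_up_to and goldbach_pair are made up here, and the linear step is assumed to be a two-pointer scan over the sorted list of primes.&lt;br /&gt;
&lt;pre&gt;
```python
def primes_up_to(n):
    """Sieve of Eratosthenes: all primes from 2 through n, in ascending order."""
    if 2 > n:
        return []
    is_prime = [True] * (n + 1)
    is_prime[0] = is_prime[1] = False
    p = 2
    while n >= p * p:
        if is_prime[p]:
            # Cross off the multiples of p, starting at p * p.
            for multiple in range(p * p, n + 1, p):
                is_prime[multiple] = False
        p += 1
    return [i for i, flag in enumerate(is_prime) if flag]


def goldbach_pair(n):
    """Return primes (p, q) with p + q == n, or None if there is no such pair."""
    primes = primes_up_to(n)
    lo, hi = 0, len(primes) - 1
    # Two-pointer scan over the sorted list: a single linear pass.
    while hi >= lo:
        total = primes[lo] + primes[hi]
        if total == n:
            return primes[lo], primes[hi]
        if total > n:
            hi -= 1  # sum too large: drop the largest candidate
        else:
            lo += 1  # sum too small: drop the smallest candidate
    return None


print(goldbach_pair(100))  # prints (3, 97)
```
&lt;/pre&gt;
Letting the two pointers meet allows a prime to be paired with itself, which is what makes 4 = 2 + 2 work.&lt;br /&gt;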
&lt;br /&gt;
&lt;br /&gt;</content><link rel='replies' type='application/atom+xml' href='http://thushw.blogspot.com/feeds/6768296040898438313/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/11265228/6768296040898438313' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/11265228/posts/default/6768296040898438313'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/11265228/posts/default/6768296040898438313'/><link rel='alternate' type='text/html' href='http://thushw.blogspot.com/2018/04/goldbach-conjecture.html' title='Goldbach Conjecture'/><author><name>Unknown</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-11265228.post-5086517635782699136</id><published>2018-04-06T15:58:00.001-07:00</published><updated>2018-04-06T16:02:54.924-07:00</updated><title type='text'>Timing with Jupyter notebook</title><content type='html'>Pieces of code can be timed within the Jupyter notebook using the &lt;a href=&quot;http://pynash.org/2013/03/06/timing-and-profiling/&quot;&gt;%timeit magic&lt;/a&gt;.&lt;br /&gt;
&lt;br /&gt;
Here is an example where a grid walk algorithm is implemented three times with progressively better run time, timed with %timeit and graphed using bokeh:&lt;br /&gt;
&lt;br /&gt;
Code:&lt;br /&gt;
&lt;br /&gt;
&lt;div style=&quot;background: #ffffff; border-width: 0.1em 0.1em 0.1em 0.8em; border: solid gray; overflow: auto; padding: 0.2em 0.6em; width: auto;&quot;&gt;
&lt;table&gt;&lt;tbody&gt;
&lt;tr&gt;&lt;td&gt;&lt;pre style=&quot;line-height: 125%; margin: 0;&quot;&gt;&lt;span style=&quot;color: #008800; font-weight: bold;&quot;&gt;def&lt;/span&gt; &lt;span style=&quot;color: #0066bb; font-weight: bold;&quot;&gt;num_paths&lt;/span&gt;(n):
    M &lt;span style=&quot;color: #333333;&quot;&gt;=&lt;/span&gt; [[&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;0&lt;/span&gt;] &lt;span style=&quot;color: #333333;&quot;&gt;*&lt;/span&gt; n &lt;span style=&quot;color: #008800; font-weight: bold;&quot;&gt;for&lt;/span&gt; i &lt;span style=&quot;color: black; font-weight: bold;&quot;&gt;in&lt;/span&gt; &lt;span style=&quot;color: #007020;&quot;&gt;range&lt;/span&gt;(n)]
    &lt;span style=&quot;color: #008800; font-weight: bold;&quot;&gt;for&lt;/span&gt; i &lt;span style=&quot;color: black; font-weight: bold;&quot;&gt;in&lt;/span&gt; &lt;span style=&quot;color: #007020;&quot;&gt;range&lt;/span&gt;(n):
        M[n&lt;span style=&quot;color: #333333;&quot;&gt;-&lt;/span&gt;&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;1&lt;/span&gt;][i] &lt;span style=&quot;color: #333333;&quot;&gt;=&lt;/span&gt; &lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;1&lt;/span&gt;

    &lt;span style=&quot;color: #008800; font-weight: bold;&quot;&gt;for&lt;/span&gt; r &lt;span style=&quot;color: black; font-weight: bold;&quot;&gt;in&lt;/span&gt; &lt;span style=&quot;color: #007020;&quot;&gt;range&lt;/span&gt;(n&lt;span style=&quot;color: #333333;&quot;&gt;-&lt;/span&gt;&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;2&lt;/span&gt;, &lt;span style=&quot;color: #333333;&quot;&gt;-&lt;/span&gt;&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;1&lt;/span&gt;, &lt;span style=&quot;color: #333333;&quot;&gt;-&lt;/span&gt;&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;1&lt;/span&gt;):
        &lt;span style=&quot;color: #008800; font-weight: bold;&quot;&gt;for&lt;/span&gt; c &lt;span style=&quot;color: black; font-weight: bold;&quot;&gt;in&lt;/span&gt; &lt;span style=&quot;color: #007020;&quot;&gt;range&lt;/span&gt;(n&lt;span style=&quot;color: #333333;&quot;&gt;-&lt;/span&gt;r&lt;span style=&quot;color: #333333;&quot;&gt;-&lt;/span&gt;&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;1&lt;/span&gt;, n):
            M[r][c] &lt;span style=&quot;color: #333333;&quot;&gt;=&lt;/span&gt; M[r][c&lt;span style=&quot;color: #333333;&quot;&gt;-&lt;/span&gt;&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;1&lt;/span&gt;] &lt;span style=&quot;color: #333333;&quot;&gt;+&lt;/span&gt; M[r&lt;span style=&quot;color: #333333;&quot;&gt;+&lt;/span&gt;&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;1&lt;/span&gt;][c]
    &lt;span style=&quot;color: #008800; font-weight: bold;&quot;&gt;return&lt;/span&gt; M[&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;0&lt;/span&gt;][n&lt;span style=&quot;color: #333333;&quot;&gt;-&lt;/span&gt;&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;1&lt;/span&gt;]

&lt;span style=&quot;color: #008800; font-weight: bold;&quot;&gt;def&lt;/span&gt; &lt;span style=&quot;color: #0066bb; font-weight: bold;&quot;&gt;num_paths_from&lt;/span&gt;(r, c, n, M):
    &lt;span style=&quot;color: #008800; font-weight: bold;&quot;&gt;if&lt;/span&gt; M[r][c] &lt;span style=&quot;color: #333333;&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;0&lt;/span&gt;:
        &lt;span style=&quot;color: #008800; font-weight: bold;&quot;&gt;return&lt;/span&gt; M[r][c]
    &lt;span style=&quot;color: #008800; font-weight: bold;&quot;&gt;if&lt;/span&gt; r &lt;span style=&quot;color: #333333;&quot;&gt;==&lt;/span&gt; &lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;0&lt;/span&gt; &lt;span style=&quot;color: black; font-weight: bold;&quot;&gt;and&lt;/span&gt; c &lt;span style=&quot;color: #333333;&quot;&gt;==&lt;/span&gt; n&lt;span style=&quot;color: #333333;&quot;&gt;-&lt;/span&gt;&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;1&lt;/span&gt;:
        &lt;span style=&quot;color: #008800; font-weight: bold;&quot;&gt;return&lt;/span&gt; &lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;1&lt;/span&gt;
    paths &lt;span style=&quot;color: #333333;&quot;&gt;=&lt;/span&gt; ([(x,y) &lt;span style=&quot;color: #008800; font-weight: bold;&quot;&gt;for&lt;/span&gt; (x,y) &lt;span style=&quot;color: black; font-weight: bold;&quot;&gt;in&lt;/span&gt; 
              [(r&lt;span style=&quot;color: #333333;&quot;&gt;-&lt;/span&gt;&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;1&lt;/span&gt;, c), (r, c&lt;span style=&quot;color: #333333;&quot;&gt;+&lt;/span&gt;&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;1&lt;/span&gt;)] &lt;span style=&quot;color: #008800; font-weight: bold;&quot;&gt;if&lt;/span&gt; y &lt;span style=&quot;color: #333333;&quot;&gt;&amp;gt;=&lt;/span&gt; n&lt;span style=&quot;color: #333333;&quot;&gt;-&lt;/span&gt;x&lt;span style=&quot;color: #333333;&quot;&gt;-&lt;/span&gt;&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;1&lt;/span&gt; 
                                   &lt;span style=&quot;color: black; font-weight: bold;&quot;&gt;and&lt;/span&gt; y&lt;span style=&quot;color: #333333;&quot;&gt;&amp;lt;&lt;/span&gt;n])
    npaths &lt;span style=&quot;color: #333333;&quot;&gt;=&lt;/span&gt; &lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;0&lt;/span&gt;
    &lt;span style=&quot;color: #008800; font-weight: bold;&quot;&gt;for&lt;/span&gt; x,y &lt;span style=&quot;color: black; font-weight: bold;&quot;&gt;in&lt;/span&gt; paths:
        npaths &lt;span style=&quot;color: #333333;&quot;&gt;+=&lt;/span&gt; num_paths_from(x,y,n,M)
    M[r][c] &lt;span style=&quot;color: #333333;&quot;&gt;=&lt;/span&gt; npaths
    &lt;span style=&quot;color: #008800; font-weight: bold;&quot;&gt;return&lt;/span&gt; npaths

&lt;span style=&quot;color: #008800; font-weight: bold;&quot;&gt;def&lt;/span&gt; &lt;span style=&quot;color: #0066bb; font-weight: bold;&quot;&gt;num_pathz_from&lt;/span&gt;(r, c, n):
    &lt;span style=&quot;color: #008800; font-weight: bold;&quot;&gt;if&lt;/span&gt; r &lt;span style=&quot;color: #333333;&quot;&gt;==&lt;/span&gt; &lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;0&lt;/span&gt; &lt;span style=&quot;color: black; font-weight: bold;&quot;&gt;and&lt;/span&gt; c &lt;span style=&quot;color: #333333;&quot;&gt;==&lt;/span&gt; n&lt;span style=&quot;color: #333333;&quot;&gt;-&lt;/span&gt;&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;1&lt;/span&gt;:
        &lt;span style=&quot;color: #008800; font-weight: bold;&quot;&gt;return&lt;/span&gt; &lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;1&lt;/span&gt;
    paths &lt;span style=&quot;color: #333333;&quot;&gt;=&lt;/span&gt; ([(x,y) &lt;span style=&quot;color: #008800; font-weight: bold;&quot;&gt;for&lt;/span&gt; (x,y) &lt;span style=&quot;color: black; font-weight: bold;&quot;&gt;in&lt;/span&gt; 
              [(r&lt;span style=&quot;color: #333333;&quot;&gt;-&lt;/span&gt;&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;1&lt;/span&gt;, c), (r, c&lt;span style=&quot;color: #333333;&quot;&gt;+&lt;/span&gt;&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;1&lt;/span&gt;)] &lt;span style=&quot;color: #008800; font-weight: bold;&quot;&gt;if&lt;/span&gt; y &lt;span style=&quot;color: #333333;&quot;&gt;&amp;gt;=&lt;/span&gt; n&lt;span style=&quot;color: #333333;&quot;&gt;-&lt;/span&gt;x&lt;span style=&quot;color: #333333;&quot;&gt;-&lt;/span&gt;&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;1&lt;/span&gt; 
                                   &lt;span style=&quot;color: black; font-weight: bold;&quot;&gt;and&lt;/span&gt; y&lt;span style=&quot;color: #333333;&quot;&gt;&amp;lt;&lt;/span&gt;n])
    npaths &lt;span style=&quot;color: #333333;&quot;&gt;=&lt;/span&gt; &lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;0&lt;/span&gt;
    &lt;span style=&quot;color: #008800; font-weight: bold;&quot;&gt;for&lt;/span&gt; x,y &lt;span style=&quot;color: black; font-weight: bold;&quot;&gt;in&lt;/span&gt; paths:
        npaths &lt;span style=&quot;color: #333333;&quot;&gt;+=&lt;/span&gt; num_pathz_from(x,y,n)
    &lt;span style=&quot;color: #008800; font-weight: bold;&quot;&gt;return&lt;/span&gt; npaths

&lt;span style=&quot;color: #008800; font-weight: bold;&quot;&gt;def&lt;/span&gt; &lt;span style=&quot;color: #0066bb; font-weight: bold;&quot;&gt;num_paths_slow&lt;/span&gt;(n):
    M &lt;span style=&quot;color: #333333;&quot;&gt;=&lt;/span&gt; [[&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;0&lt;/span&gt;] &lt;span style=&quot;color: #333333;&quot;&gt;*&lt;/span&gt; n &lt;span style=&quot;color: #008800; font-weight: bold;&quot;&gt;for&lt;/span&gt; i &lt;span style=&quot;color: black; font-weight: bold;&quot;&gt;in&lt;/span&gt; &lt;span style=&quot;color: #007020;&quot;&gt;range&lt;/span&gt;(n)]
    &lt;span style=&quot;color: #008800; font-weight: bold;&quot;&gt;return&lt;/span&gt; num_paths_from(n&lt;span style=&quot;color: #333333;&quot;&gt;-&lt;/span&gt;&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;1&lt;/span&gt;, &lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;0&lt;/span&gt;, n, M)

&lt;span style=&quot;color: #008800; font-weight: bold;&quot;&gt;def&lt;/span&gt; &lt;span style=&quot;color: #0066bb; font-weight: bold;&quot;&gt;num_paths_super_slow&lt;/span&gt;(n):
    &lt;span style=&quot;color: #008800; font-weight: bold;&quot;&gt;return&lt;/span&gt; num_pathz_from(n&lt;span style=&quot;color: #333333;&quot;&gt;-&lt;/span&gt;&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;1&lt;/span&gt;, &lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;0&lt;/span&gt;, n)


&lt;span style=&quot;color: #008800; font-weight: bold;&quot;&gt;for&lt;/span&gt; sz &lt;span style=&quot;color: black; font-weight: bold;&quot;&gt;in&lt;/span&gt; &lt;span style=&quot;color: #007020;&quot;&gt;range&lt;/span&gt;(&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;5&lt;/span&gt;,&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;15&lt;/span&gt;):
    &lt;span style=&quot;color: #333333;&quot;&gt;%&lt;/span&gt;timeit num_paths(sz)
    &lt;span style=&quot;color: #333333;&quot;&gt;%&lt;/span&gt;timeit num_paths_slow(sz)
    &lt;span style=&quot;color: #333333;&quot;&gt;%&lt;/span&gt;timeit num_paths_super_slow(sz)
&lt;/pre&gt;
&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;br /&gt;
Timing:&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;div style=&quot;background: #ffffff; border-width: 0.1em 0.1em 0.1em 0.8em; border: solid gray; overflow: auto; padding: 0.2em 0.6em; width: auto;&quot;&gt;
&lt;table&gt;&lt;tbody&gt;
&lt;tr&gt;&lt;td&gt;&lt;pre style=&quot;line-height: 125%; margin: 0;&quot;&gt;&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;100000&lt;/span&gt; loops, best of &lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;3&lt;/span&gt;: &lt;span style=&quot;color: #6600ee; font-weight: bold;&quot;&gt;7.74&lt;/span&gt; µs per loop
&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;10000&lt;/span&gt; loops, best of &lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;3&lt;/span&gt;: &lt;span style=&quot;color: #6600ee; font-weight: bold;&quot;&gt;26.2&lt;/span&gt; µs per loop
&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;10000&lt;/span&gt; loops, best of &lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;3&lt;/span&gt;: &lt;span style=&quot;color: #6600ee; font-weight: bold;&quot;&gt;62.1&lt;/span&gt; µs per loop
&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;100000&lt;/span&gt; loops, best of &lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;3&lt;/span&gt;: &lt;span style=&quot;color: #6600ee; font-weight: bold;&quot;&gt;9.27&lt;/span&gt; µs per loop
&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;10000&lt;/span&gt; loops, best of &lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;3&lt;/span&gt;: &lt;span style=&quot;color: #6600ee; font-weight: bold;&quot;&gt;32.9&lt;/span&gt; µs per loop
&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;10000&lt;/span&gt; loops, best of &lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;3&lt;/span&gt;: &lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;200&lt;/span&gt; µs per loop
&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;100000&lt;/span&gt; loops, best of &lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;3&lt;/span&gt;: &lt;span style=&quot;color: #6600ee; font-weight: bold;&quot;&gt;11.3&lt;/span&gt; µs per loop
&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;10000&lt;/span&gt; loops, best of &lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;3&lt;/span&gt;: &lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;43&lt;/span&gt; µs per loop
&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;1000&lt;/span&gt; loops, best of &lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;3&lt;/span&gt;: &lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;615&lt;/span&gt; µs per loop
&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;100000&lt;/span&gt; loops, best of &lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;3&lt;/span&gt;: &lt;span style=&quot;color: #6600ee; font-weight: bold;&quot;&gt;13.9&lt;/span&gt; µs per loop
&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;10000&lt;/span&gt; loops, best of &lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;3&lt;/span&gt;: &lt;span style=&quot;color: #6600ee; font-weight: bold;&quot;&gt;56.9&lt;/span&gt; µs per loop
&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;100&lt;/span&gt; loops, best of &lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;3&lt;/span&gt;: &lt;span style=&quot;color: #6600ee; font-weight: bold;&quot;&gt;2.05&lt;/span&gt; ms per loop
&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;100000&lt;/span&gt; loops, best of &lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;3&lt;/span&gt;: &lt;span style=&quot;color: #6600ee; font-weight: bold;&quot;&gt;16.6&lt;/span&gt; µs per loop
&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;10000&lt;/span&gt; loops, best of &lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;3&lt;/span&gt;: &lt;span style=&quot;color: #6600ee; font-weight: bold;&quot;&gt;70.9&lt;/span&gt; µs per loop
&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;100&lt;/span&gt; loops, best of &lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;3&lt;/span&gt;: &lt;span style=&quot;color: #6600ee; font-weight: bold;&quot;&gt;6.67&lt;/span&gt; ms per loop
&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;100000&lt;/span&gt; loops, best of &lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;3&lt;/span&gt;: &lt;span style=&quot;color: #6600ee; font-weight: bold;&quot;&gt;19.4&lt;/span&gt; µs per loop
&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;10000&lt;/span&gt; loops, best of &lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;3&lt;/span&gt;: &lt;span style=&quot;color: #6600ee; font-weight: bold;&quot;&gt;97.4&lt;/span&gt; µs per loop
&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;10&lt;/span&gt; loops, best of &lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;3&lt;/span&gt;: &lt;span style=&quot;color: #6600ee; font-weight: bold;&quot;&gt;23.7&lt;/span&gt; ms per loop
&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;10000&lt;/span&gt; loops, best of &lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;3&lt;/span&gt;: &lt;span style=&quot;color: #6600ee; font-weight: bold;&quot;&gt;22.1&lt;/span&gt; µs per loop
&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;10000&lt;/span&gt; loops, best of &lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;3&lt;/span&gt;: &lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;105&lt;/span&gt; µs per loop
&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;10&lt;/span&gt; loops, best of &lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;3&lt;/span&gt;: &lt;span style=&quot;color: #6600ee; font-weight: bold;&quot;&gt;80.2&lt;/span&gt; ms per loop
&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;10000&lt;/span&gt; loops, best of &lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;3&lt;/span&gt;: &lt;span style=&quot;color: #6600ee; font-weight: bold;&quot;&gt;25.6&lt;/span&gt; µs per loop
&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;10000&lt;/span&gt; loops, best of &lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;3&lt;/span&gt;: &lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;135&lt;/span&gt; µs per loop
&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;1&lt;/span&gt; loop, best of &lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;3&lt;/span&gt;: &lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;287&lt;/span&gt; ms per loop
&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;10000&lt;/span&gt; loops, best of &lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;3&lt;/span&gt;: &lt;span style=&quot;color: #6600ee; font-weight: bold;&quot;&gt;29.8&lt;/span&gt; µs per loop
&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;10000&lt;/span&gt; loops, best of &lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;3&lt;/span&gt;: &lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;149&lt;/span&gt; µs per loop
&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;1&lt;/span&gt; loop, best of &lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;3&lt;/span&gt;: &lt;span style=&quot;color: #6600ee; font-weight: bold;&quot;&gt;1.05&lt;/span&gt; s per loop
&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;10000&lt;/span&gt; loops, best of &lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;3&lt;/span&gt;: &lt;span style=&quot;color: #6600ee; font-weight: bold;&quot;&gt;32.7&lt;/span&gt; µs per loop
&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;10000&lt;/span&gt; loops, best of &lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;3&lt;/span&gt;: &lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;171&lt;/span&gt; µs per loop
&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;1&lt;/span&gt; loop, best of &lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;3&lt;/span&gt;: &lt;span style=&quot;color: #6600ee; font-weight: bold;&quot;&gt;3.78&lt;/span&gt; s per loop
&lt;/pre&gt;
&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;/div&gt;
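&lt;br /&gt;
The numbers above come from the %timeit magic inside a notebook. Outside a notebook, comparable figures can be collected with the standard-library timeit module. The sketch below is one possible way to do it: the helper best_time_us is a made-up name, and its repeat and number defaults are arbitrary choices rather than what %timeit would pick. It times only the bottom-up num_paths from the listing above.&lt;br /&gt;
&lt;pre&gt;
```python
import timeit


def num_paths(n):
    # Bottom-up version, copied from the listing above.
    M = [[0] * n for i in range(n)]
    for i in range(n):
        M[n-1][i] = 1
    for r in range(n-2, -1, -1):
        for c in range(n-r-1, n):
            M[r][c] = M[r][c-1] + M[r+1][c]
    return M[0][n-1]


def best_time_us(func, arg, repeat=3, number=1000):
    """Best per-call time in microseconds over `repeat` runs of `number` calls."""
    runs = timeit.repeat(lambda: func(arg), repeat=repeat, number=number)
    return min(runs) / number * 1e6


sizes = list(range(5, 15))
fast = [best_time_us(num_paths, sz) for sz in sizes]
for sz, t in zip(sizes, fast):
    print(sz, round(t, 2), "us per call")
```
&lt;/pre&gt;
A list such as fast could then be passed straight to the plotting code instead of being copied by hand.&lt;br /&gt;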
&lt;br /&gt;
Chart:&lt;br /&gt;
&lt;br /&gt;
&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: center;&quot;&gt;
&lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhV-7rM6T8-VqwQLzOwYT4CQScbeaIH0K_N6DuuR38LFeM-5qrjdhju7lZBd1U9_agvE2c0sVv4OJHLsgUXiKzD95fj3aHqRFYLL5_NMSY5iUWXGRtLnDCcmquG8bkpviJ_6W20rw/s1600/Screen+Shot+2018-04-06+at+3.36.28+PM.png&quot; imageanchor=&quot;1&quot; style=&quot;margin-left: 1em; margin-right: 1em;&quot;&gt;&lt;img border=&quot;0&quot; data-original-height=&quot;614&quot; data-original-width=&quot;592&quot; height=&quot;320&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhV-7rM6T8-VqwQLzOwYT4CQScbeaIH0K_N6DuuR38LFeM-5qrjdhju7lZBd1U9_agvE2c0sVv4OJHLsgUXiKzD95fj3aHqRFYLL5_NMSY5iUWXGRtLnDCcmquG8bkpviJ_6W20rw/s320/Screen+Shot+2018-04-06+at+3.36.28+PM.png&quot; width=&quot;308&quot; /&gt;&lt;/a&gt;&lt;/div&gt;
&lt;br /&gt;
Code for the plot:&lt;br /&gt;
&lt;br /&gt;
&lt;div style=&quot;background: #ffffff; border-width: 0.1em 0.1em 0.1em 0.8em; border: solid gray; overflow: auto; padding: 0.2em 0.6em; width: auto;&quot;&gt;
&lt;table&gt;&lt;tbody&gt;
&lt;tr&gt;&lt;td&gt;&lt;pre style=&quot;line-height: 125%; margin: 0;&quot;&gt;&lt;span style=&quot;color: #008800; font-weight: bold;&quot;&gt;from&lt;/span&gt; &lt;span style=&quot;color: #0e84b5; font-weight: bold;&quot;&gt;bokeh.palettes&lt;/span&gt; &lt;span style=&quot;color: #008800; font-weight: bold;&quot;&gt;import&lt;/span&gt; Spectral11
&lt;span style=&quot;color: #008800; font-weight: bold;&quot;&gt;from&lt;/span&gt; &lt;span style=&quot;color: #0e84b5; font-weight: bold;&quot;&gt;bokeh.plotting&lt;/span&gt; &lt;span style=&quot;color: #008800; font-weight: bold;&quot;&gt;import&lt;/span&gt; figure, show, output_file

p &lt;span style=&quot;color: #333333;&quot;&gt;=&lt;/span&gt; figure(plot_width&lt;span style=&quot;color: #333333;&quot;&gt;=&lt;/span&gt;&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;300&lt;/span&gt;, plot_height&lt;span style=&quot;color: #333333;&quot;&gt;=&lt;/span&gt;&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;300&lt;/span&gt;)
slowest &lt;span style=&quot;color: #333333;&quot;&gt;=&lt;/span&gt; [&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;62&lt;/span&gt;,&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;200&lt;/span&gt;,&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;615&lt;/span&gt;,&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;2050&lt;/span&gt;,&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;6670&lt;/span&gt;,&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;23700&lt;/span&gt;,&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;80200&lt;/span&gt;,&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;287000&lt;/span&gt;,&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;1050000&lt;/span&gt;,&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;3780000&lt;/span&gt;]
slower &lt;span style=&quot;color: #333333;&quot;&gt;=&lt;/span&gt; [&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;26&lt;/span&gt;,&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;32&lt;/span&gt;,&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;43&lt;/span&gt;,&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;56&lt;/span&gt;,&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;70&lt;/span&gt;,&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;97&lt;/span&gt;,&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;105&lt;/span&gt;,&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;135&lt;/span&gt;,&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;149&lt;/span&gt;,&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;171&lt;/span&gt;]
fast &lt;span style=&quot;color: #333333;&quot;&gt;=&lt;/span&gt; [&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;7&lt;/span&gt;,&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;9&lt;/span&gt;,&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;11&lt;/span&gt;,&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;13&lt;/span&gt;,&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;16&lt;/span&gt;,&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;19&lt;/span&gt;,&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;22&lt;/span&gt;,&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;25&lt;/span&gt;,&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;29&lt;/span&gt;,&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;32&lt;/span&gt;]
st &lt;span style=&quot;color: #333333;&quot;&gt;=&lt;/span&gt; &lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;5&lt;/span&gt;
end &lt;span style=&quot;color: #333333;&quot;&gt;=&lt;/span&gt; &lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;8&lt;/span&gt;
mypalette&lt;span style=&quot;color: #333333;&quot;&gt;=&lt;/span&gt;Spectral11[&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;0&lt;/span&gt;:&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;3&lt;/span&gt;]
p&lt;span style=&quot;color: #333333;&quot;&gt;.&lt;/span&gt;multi_line(xs&lt;span style=&quot;color: #333333;&quot;&gt;=&lt;/span&gt;[&lt;span style=&quot;color: #007020;&quot;&gt;list&lt;/span&gt;(&lt;span style=&quot;color: #007020;&quot;&gt;range&lt;/span&gt;(st,end)), &lt;span style=&quot;color: #007020;&quot;&gt;list&lt;/span&gt;(&lt;span style=&quot;color: #007020;&quot;&gt;range&lt;/span&gt;(st,end)), &lt;span style=&quot;color: #007020;&quot;&gt;list&lt;/span&gt;(&lt;span style=&quot;color: #007020;&quot;&gt;range&lt;/span&gt;(st,end))], 
             ys&lt;span style=&quot;color: #333333;&quot;&gt;=&lt;/span&gt;[slowest[:end&lt;span style=&quot;color: #333333;&quot;&gt;-&lt;/span&gt;st], 
                 slower[:end&lt;span style=&quot;color: #333333;&quot;&gt;-&lt;/span&gt;st],
                 fast[:end&lt;span style=&quot;color: #333333;&quot;&gt;-&lt;/span&gt;st]
                ],
             line_color&lt;span style=&quot;color: #333333;&quot;&gt;=&lt;/span&gt;mypalette,
             line_width&lt;span style=&quot;color: #333333;&quot;&gt;=&lt;/span&gt;&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;5&lt;/span&gt;
             )

show(p)
&lt;/pre&gt;
&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;br /&gt;
This shows how the algorithm with exponential time complexity deteriorates for higher values of n:&lt;br /&gt;
&lt;br /&gt;
&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: center;&quot;&gt;
&lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiPk4mZ6Tiu2mUGSXN9zSgvV1HZdfsiSM50DMQhMnJA1RbsGrkTEo3RuH37Z5Vnvk7uF8UDn0fHgA9-cd_0NA09q9mO1RnVDoNE2ffHWQXfBnwsuarkVTPGzHih_isA-tQtFx69pA/s1600/Screen+Shot+2018-04-06+at+3.43.22+PM.png&quot; imageanchor=&quot;1&quot; style=&quot;margin-left: 1em; margin-right: 1em;&quot;&gt;&lt;img border=&quot;0&quot; data-original-height=&quot;628&quot; data-original-width=&quot;612&quot; height=&quot;320&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiPk4mZ6Tiu2mUGSXN9zSgvV1HZdfsiSM50DMQhMnJA1RbsGrkTEo3RuH37Z5Vnvk7uF8UDn0fHgA9-cd_0NA09q9mO1RnVDoNE2ffHWQXfBnwsuarkVTPGzHih_isA-tQtFx69pA/s320/Screen+Shot+2018-04-06+at+3.43.22+PM.png&quot; width=&quot;311&quot; /&gt;&lt;/a&gt;&lt;/div&gt;
Now that I&#39;ve shown you a bunch of performance numbers and visualizations, here is the algorithm for the curious. It is a contrived example: count the number of paths from one corner of a grid to another, where the squares to the north of the diagonal running from top right to bottom left are out of bounds. That is, the path is restricted to the right of the diagonal. The image below shows the problem for n = 5.&lt;br /&gt;
&lt;br /&gt;
&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: center;&quot;&gt;
&lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg37Pfz5oZxXD17wMu37Sek8SXypUlmEtAxrw01FSbQ8du2P9ENFXpJ0erFnpoLZUzlk-t7brs_XpQ_rtkm9sQ9R-r5z2ScMEKJ1rEoqgcsNEx9oEGc-SLOvFq8zogxxQ94OunaOw/s1600/img07.png&quot; imageanchor=&quot;1&quot; style=&quot;margin-left: 1em; margin-right: 1em;&quot;&gt;&lt;img border=&quot;0&quot; data-original-height=&quot;1600&quot; data-original-width=&quot;1600&quot; height=&quot;320&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg37Pfz5oZxXD17wMu37Sek8SXypUlmEtAxrw01FSbQ8du2P9ENFXpJ0erFnpoLZUzlk-t7brs_XpQ_rtkm9sQ9R-r5z2ScMEKJ1rEoqgcsNEx9oEGc-SLOvFq8zogxxQ94OunaOw/s320/img07.png&quot; width=&quot;319&quot; /&gt;&lt;/a&gt;&lt;/div&gt;
&lt;br /&gt;
&lt;br /&gt;
The exponential algorithm recursively finds the number of paths from each point to the end point (the top right corner). But since you can reach a single point by a number of paths (and this number increases exponentially with n), the same computation of finding the number of paths from this point to the grid corner is repeated, causing the slowdown.&lt;br /&gt;
&lt;br /&gt;
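A minimal Python sketch of this recursion, under some assumptions of mine: row 0 is the top row, the walk starts in the bottom left cell (n-1, 0), each step goes one cell right or one cell up, and cells on the diagonal itself are allowed. The names allowed, paths_from and count_paths are made up for illustration.&lt;br /&gt;

```python
def allowed(n, i, j):
    """True if (i, j) is in the n x n grid and not north of the diagonal."""
    # Row i may only use columns n-1-i .. n-1 (assumption: diagonal cells are allowed).
    return i in range(n) and j in range(n - 1 - i, n)


def paths_from(n, i, j):
    """Count right/up paths from (i, j) to the top right cell (0, n-1)."""
    if (i, j) == (0, n - 1):
        return 1
    total = 0
    for ni, nj in ((i, j + 1), (i - 1, j)):  # one step right, one step up
        if allowed(n, ni, nj):
            total += paths_from(n, ni, nj)   # recomputed on every visit
    return total


def count_paths(n):
    # Assumption: the walk starts in the bottom left cell (n-1, 0).
    return paths_from(n, n - 1, 0)


print(count_paths(5))
```

Under these assumptions the call for n = 5 counts 14 paths.&lt;br /&gt;
&lt;br /&gt;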
The next improvement is to remember the number of paths once it has been calculated. Say we are on [4,2]: we calculate the number of paths from here to the grid end and store it in M[4][2]. The next time we are at [4,2], we no longer need to calculate it again, as the result can be looked up from M[4][2].&lt;br /&gt;
&lt;br /&gt;
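The lookup table M can be sketched as follows. The same assumptions apply as for any version of this walk (row 0 on top, start at the bottom left cell, steps go right or up, diagonal cells allowed), and the function names are hypothetical.&lt;br /&gt;

```python
def allowed(n, i, j):
    """True if (i, j) is in the n x n grid and not north of the diagonal."""
    return i in range(n) and j in range(n - 1 - i, n)


def count_paths_memo(n):
    # M[i][j] caches the number of paths from (i, j) to the top right cell.
    M = [[None] * n for _ in range(n)]

    def paths_from(i, j):
        if (i, j) == (0, n - 1):
            return 1
        if M[i][j] is None:                      # first visit: compute and store
            total = 0
            for ni, nj in ((i, j + 1), (i - 1, j)):
                if allowed(n, ni, nj):
                    total += paths_from(ni, nj)
            M[i][j] = total
        return M[i][j]                           # later visits: plain lookup

    return paths_from(n - 1, 0)


print(count_paths_memo(5))
```

Each cell is now computed at most once, so the work grows with the number of cells rather than with the number of paths.&lt;br /&gt;
&lt;br /&gt;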
The last algorithm uses dynamic programming to do even less work. It is based on the simple observation that a cell (i,j) can be reached from only 2 cells: the cell to its immediate left, (i,j-1), and the cell right below it, (i+1,j). There is a single step from each of these to (i,j), so if we know the number of paths to those two cells, we can add them up to find the number of paths to (i,j). We keep calculating the number of paths to each cell, walking from the bottom row up and going right along the columns, until we fill the cell at the top right, (0, n-1).&lt;br /&gt;
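&lt;br /&gt;
A sketch of this table fill, assuming the walk starts in the bottom left cell (n-1, 0) and that cells on the diagonal are allowed. The name count_paths_dp and the table P are mine.&lt;br /&gt;

```python
def count_paths_dp(n):
    # P[i][j] = number of paths from the start cell to (i, j).
    # Cells north of the diagonal are never written and stay 0.
    P = [[0] * n for _ in range(n)]
    P[n - 1][0] = 1                          # one way to be at the start
    for i in range(n - 1, -1, -1):           # bottom row up
        for j in range(n - 1 - i, n):        # allowed columns, left to right
            if (i, j) == (n - 1, 0):
                continue                     # start cell is already set
            from_left = P[i][j - 1] if j != 0 else 0
            from_below = P[i + 1][j] if i != n - 1 else 0
            P[i][j] = from_left + from_below
    return P[0][n - 1]


print(count_paths_dp(5))
```

For n = 5 the bottom row fills with 1, 1, 1, 1, 1, the row above with 1, 2, 3, 4 (starting at its diagonal cell), and the top right cell ends up with 14.&lt;br /&gt;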
&lt;br /&gt;</content><link rel='replies' type='application/atom+xml' href='http://thushw.blogspot.com/feeds/5086517635782699136/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/11265228/5086517635782699136' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/11265228/posts/default/5086517635782699136'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/11265228/posts/default/5086517635782699136'/><link rel='alternate' type='text/html' href='http://thushw.blogspot.com/2018/04/timing-with-jupyter-notebook.html' title='Timing with Jupyter notebook'/><author><name>Unknown</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhV-7rM6T8-VqwQLzOwYT4CQScbeaIH0K_N6DuuR38LFeM-5qrjdhju7lZBd1U9_agvE2c0sVv4OJHLsgUXiKzD95fj3aHqRFYLL5_NMSY5iUWXGRtLnDCcmquG8bkpviJ_6W20rw/s72-c/Screen+Shot+2018-04-06+at+3.36.28+PM.png" height="72" width="72"/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-11265228.post-8376247330446204838</id><published>2018-04-04T17:08:00.000-07:00</published><updated>2018-04-05T08:57:15.909-07:00</updated><title type='text'>Pandas snippets</title><content type='html'>&lt;div class=&quot;tr_bq&quot;&gt;
Here are some snippets that come in handy when cleaning data with pandas. They were useful to me in completing the coursework for a &lt;a href=&quot;https://www.coursera.org/learn/python-data-analysis/home/welcome&quot; target=&quot;_blank&quot;&gt;Python data science course&lt;/a&gt;.&lt;/div&gt;
&lt;br /&gt;
&lt;u&gt;Extract a subset of columns from the dataframe based on a regular expression:&lt;/u&gt;&lt;br /&gt;
Code:&lt;br/&gt;
&lt;!-- HTML generated using hilite.me --&gt;&lt;div style=&quot;background: #ffffff; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;&quot;&gt;&lt;table&gt;&lt;tr&gt;&lt;td&gt;&lt;pre style=&quot;margin: 0; line-height: 125%&quot;&gt;&lt;span style=&quot;color: #008800; font-weight: bold&quot;&gt;import&lt;/span&gt; &lt;span style=&quot;color: #0e84b5; font-weight: bold&quot;&gt;pandas&lt;/span&gt; &lt;span style=&quot;color: #008800; font-weight: bold&quot;&gt;as&lt;/span&gt; &lt;span style=&quot;color: #0e84b5; font-weight: bold&quot;&gt;pd&lt;/span&gt;

persona1 &lt;span style=&quot;color: #333333&quot;&gt;=&lt;/span&gt; pd&lt;span style=&quot;color: #333333&quot;&gt;.&lt;/span&gt;Series({
                        &lt;span style=&quot;background-color: #fff0f0&quot;&gt;&amp;#39;Last Post On&amp;#39;&lt;/span&gt;: &lt;span style=&quot;background-color: #fff0f0&quot;&gt;&amp;#39;02/04/2017&amp;#39;&lt;/span&gt;,
                        &lt;span style=&quot;background-color: #fff0f0&quot;&gt;&amp;#39;Friends-2015&amp;#39;&lt;/span&gt;: &lt;span style=&quot;color: #0000DD; font-weight: bold&quot;&gt;10&lt;/span&gt;,
                        &lt;span style=&quot;background-color: #fff0f0&quot;&gt;&amp;#39;Friends-2016&amp;#39;&lt;/span&gt;: &lt;span style=&quot;color: #0000DD; font-weight: bold&quot;&gt;20&lt;/span&gt;,
                        &lt;span style=&quot;background-color: #fff0f0&quot;&gt;&amp;#39;Friends-2017&amp;#39;&lt;/span&gt;: &lt;span style=&quot;color: #0000DD; font-weight: bold&quot;&gt;300&lt;/span&gt;
})

persona2 &lt;span style=&quot;color: #333333&quot;&gt;=&lt;/span&gt; pd&lt;span style=&quot;color: #333333&quot;&gt;.&lt;/span&gt;Series({
                        &lt;span style=&quot;background-color: #fff0f0&quot;&gt;&amp;#39;Last Post On&amp;#39;&lt;/span&gt;: &lt;span style=&quot;background-color: #fff0f0&quot;&gt;&amp;#39;02/04/2018&amp;#39;&lt;/span&gt;,
                        &lt;span style=&quot;background-color: #fff0f0&quot;&gt;&amp;#39;Friends-2015&amp;#39;&lt;/span&gt;: &lt;span style=&quot;color: #0000DD; font-weight: bold&quot;&gt;100&lt;/span&gt;,
                        &lt;span style=&quot;background-color: #fff0f0&quot;&gt;&amp;#39;Friends-2016&amp;#39;&lt;/span&gt;: &lt;span style=&quot;color: #0000DD; font-weight: bold&quot;&gt;240&lt;/span&gt;,
                        &lt;span style=&quot;background-color: #fff0f0&quot;&gt;&amp;#39;Friends-2017&amp;#39;&lt;/span&gt;: &lt;span style=&quot;color: #0000DD; font-weight: bold&quot;&gt;560&lt;/span&gt;
})

persona3 &lt;span style=&quot;color: #333333&quot;&gt;=&lt;/span&gt; pd&lt;span style=&quot;color: #333333&quot;&gt;.&lt;/span&gt;Series({
                        &lt;span style=&quot;background-color: #fff0f0&quot;&gt;&amp;#39;Last Post On&amp;#39;&lt;/span&gt;: &lt;span style=&quot;background-color: #fff0f0&quot;&gt;&amp;#39;02/04/2014&amp;#39;&lt;/span&gt;,
                        &lt;span style=&quot;background-color: #fff0f0&quot;&gt;&amp;#39;Friends-2015&amp;#39;&lt;/span&gt;: &lt;span style=&quot;color: #0000DD; font-weight: bold&quot;&gt;120&lt;/span&gt;,
                        &lt;span style=&quot;background-color: #fff0f0&quot;&gt;&amp;#39;Friends-2016&amp;#39;&lt;/span&gt;: &lt;span style=&quot;color: #0000DD; font-weight: bold&quot;&gt;120&lt;/span&gt;,
                        &lt;span style=&quot;background-color: #fff0f0&quot;&gt;&amp;#39;Friends-2017&amp;#39;&lt;/span&gt;: &lt;span style=&quot;color: #0000DD; font-weight: bold&quot;&gt;120&lt;/span&gt;
})

df &lt;span style=&quot;color: #333333&quot;&gt;=&lt;/span&gt; pd&lt;span style=&quot;color: #333333&quot;&gt;.&lt;/span&gt;DataFrame([persona1, persona2, persona3], 
                  index&lt;span style=&quot;color: #333333&quot;&gt;=&lt;/span&gt;[&lt;span style=&quot;background-color: #fff0f0&quot;&gt;&amp;#39;Chris&amp;#39;&lt;/span&gt;, &lt;span style=&quot;background-color: #fff0f0&quot;&gt;&amp;#39;Bella&amp;#39;&lt;/span&gt;, &lt;span style=&quot;background-color: #fff0f0&quot;&gt;&amp;#39;Laura&amp;#39;&lt;/span&gt;])
df&lt;span style=&quot;color: #333333&quot;&gt;.&lt;/span&gt;filter(regex&lt;span style=&quot;color: #333333&quot;&gt;=&lt;/span&gt;&lt;span style=&quot;background-color: #fff0f0&quot;&gt;r&amp;quot;Friends-\d{4}&amp;quot;&lt;/span&gt;)
&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;/div&gt;

&lt;br /&gt;
Output:&lt;br /&gt;
&lt;table border=&quot;1&quot; class=&quot;dataframe&quot;&gt;
  &lt;thead&gt;
&lt;tr style=&quot;text-align: right;&quot;&gt;
      &lt;th&gt;&lt;/th&gt;
      &lt;th&gt;Friends-2015&lt;/th&gt;
      &lt;th&gt;Friends-2016&lt;/th&gt;
      &lt;th&gt;Friends-2017&lt;/th&gt;
    &lt;/tr&gt;
&lt;/thead&gt;
  &lt;tbody&gt;
&lt;tr&gt;
      &lt;th&gt;Chris&lt;/th&gt;
      &lt;td&gt;10&lt;/td&gt;
      &lt;td&gt;20&lt;/td&gt;
      &lt;td&gt;300&lt;/td&gt;
    &lt;/tr&gt;
&lt;tr&gt;
      &lt;th&gt;Bella&lt;/th&gt;
      &lt;td&gt;100&lt;/td&gt;
      &lt;td&gt;240&lt;/td&gt;
      &lt;td&gt;560&lt;/td&gt;
    &lt;/tr&gt;
&lt;tr&gt;
      &lt;th&gt;Laura&lt;/th&gt;
      &lt;td&gt;120&lt;/td&gt;
      &lt;td&gt;120&lt;/td&gt;
      &lt;td&gt;120&lt;/td&gt;
    &lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;blockquote&gt;
&lt;br /&gt;&lt;/blockquote&gt;
&lt;u&gt;Set a column based on the value of both the current row and adjacent rows:&lt;/u&gt;&lt;br /&gt;
&lt;br /&gt;
For this example, we define gym regulars as those who went to the gym at least 3 months in a row last year:&lt;br /&gt;
Code:&lt;br/&gt;
&lt;!-- HTML generated using hilite.me --&gt;&lt;div style=&quot;background: #ffffff; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;&quot;&gt;&lt;table&gt;&lt;tr&gt;&lt;td&gt;&lt;pre style=&quot;margin: 0; line-height: 125%&quot;&gt;&lt;span style=&quot;color: #008800; font-weight: bold&quot;&gt;import&lt;/span&gt; &lt;span style=&quot;color: #0e84b5; font-weight: bold&quot;&gt;datetime&lt;/span&gt;
&lt;span style=&quot;color: #008800; font-weight: bold&quot;&gt;import&lt;/span&gt; &lt;span style=&quot;color: #0e84b5; font-weight: bold&quot;&gt;pandas&lt;/span&gt; &lt;span style=&quot;color: #008800; font-weight: bold&quot;&gt;as&lt;/span&gt; &lt;span style=&quot;color: #0e84b5; font-weight: bold&quot;&gt;pd&lt;/span&gt;
df &lt;span style=&quot;color: #333333&quot;&gt;=&lt;/span&gt; pd&lt;span style=&quot;color: #333333&quot;&gt;.&lt;/span&gt;DataFrame({&lt;span style=&quot;background-color: #fff0f0&quot;&gt;&amp;#39;Month&amp;#39;&lt;/span&gt;: 
                   [datetime&lt;span style=&quot;color: #333333&quot;&gt;.&lt;/span&gt;date(&lt;span style=&quot;color: #0000DD; font-weight: bold&quot;&gt;2008&lt;/span&gt;, i, &lt;span style=&quot;color: #0000DD; font-weight: bold&quot;&gt;1&lt;/span&gt;)&lt;span style=&quot;color: #333333&quot;&gt;.&lt;/span&gt;strftime(&lt;span style=&quot;background-color: #fff0f0&quot;&gt;&amp;#39;%B&amp;#39;&lt;/span&gt;)
                             &lt;span style=&quot;color: #008800; font-weight: bold&quot;&gt;for&lt;/span&gt; i &lt;span style=&quot;color: #000000; font-weight: bold&quot;&gt;in&lt;/span&gt; &lt;span style=&quot;color: #007020&quot;&gt;range&lt;/span&gt;(&lt;span style=&quot;color: #0000DD; font-weight: bold&quot;&gt;1&lt;/span&gt;,&lt;span style=&quot;color: #0000DD; font-weight: bold&quot;&gt;13&lt;/span&gt;)] &lt;span style=&quot;color: #333333&quot;&gt;*&lt;/span&gt; &lt;span style=&quot;color: #0000DD; font-weight: bold&quot;&gt;3&lt;/span&gt;, 
                   &lt;span style=&quot;background-color: #fff0f0&quot;&gt;&amp;#39;visited&amp;#39;&lt;/span&gt;: [&lt;span style=&quot;color: #008800; font-weight: bold&quot;&gt;False&lt;/span&gt;]&lt;span style=&quot;color: #333333&quot;&gt;*&lt;/span&gt;&lt;span style=&quot;color: #0000DD; font-weight: bold&quot;&gt;36&lt;/span&gt;},
                   index&lt;span style=&quot;color: #333333&quot;&gt;=&lt;/span&gt;[&lt;span style=&quot;background-color: #fff0f0&quot;&gt;&amp;#39;Alice&amp;#39;&lt;/span&gt;]&lt;span style=&quot;color: #333333&quot;&gt;*&lt;/span&gt;&lt;span style=&quot;color: #0000DD; font-weight: bold&quot;&gt;12&lt;/span&gt; &lt;span style=&quot;color: #333333&quot;&gt;+&lt;/span&gt; 
                         [&lt;span style=&quot;background-color: #fff0f0&quot;&gt;&amp;#39;Bob&amp;#39;&lt;/span&gt;]&lt;span style=&quot;color: #333333&quot;&gt;*&lt;/span&gt;&lt;span style=&quot;color: #0000DD; font-weight: bold&quot;&gt;12&lt;/span&gt; &lt;span style=&quot;color: #333333&quot;&gt;+&lt;/span&gt; 
                         [&lt;span style=&quot;background-color: #fff0f0&quot;&gt;&amp;#39;Bridgett&amp;#39;&lt;/span&gt;]&lt;span style=&quot;color: #333333&quot;&gt;*&lt;/span&gt;&lt;span style=&quot;color: #0000DD; font-weight: bold&quot;&gt;12&lt;/span&gt;)

df &lt;span style=&quot;color: #333333&quot;&gt;=&lt;/span&gt; df&lt;span style=&quot;color: #333333&quot;&gt;.&lt;/span&gt;reset_index()

&lt;span style=&quot;color: #008800; font-weight: bold&quot;&gt;def&lt;/span&gt; &lt;span style=&quot;color: #0066BB; font-weight: bold&quot;&gt;make_regular&lt;/span&gt;(r, name):
    r[&lt;span style=&quot;background-color: #fff0f0&quot;&gt;&amp;#39;visited&amp;#39;&lt;/span&gt;] &lt;span style=&quot;color: #333333&quot;&gt;=&lt;/span&gt; (r[&lt;span style=&quot;background-color: #fff0f0&quot;&gt;&amp;#39;visited&amp;#39;&lt;/span&gt;] &lt;span style=&quot;color: #000000; font-weight: bold&quot;&gt;or&lt;/span&gt; (r[&lt;span style=&quot;background-color: #fff0f0&quot;&gt;&amp;#39;index&amp;#39;&lt;/span&gt;] &lt;span style=&quot;color: #333333&quot;&gt;==&lt;/span&gt; name) &lt;span style=&quot;color: #000000; font-weight: bold&quot;&gt;and&lt;/span&gt; 
                  ((r[&lt;span style=&quot;background-color: #fff0f0&quot;&gt;&amp;#39;Month&amp;#39;&lt;/span&gt;] &lt;span style=&quot;color: #333333&quot;&gt;==&lt;/span&gt; &lt;span style=&quot;background-color: #fff0f0&quot;&gt;&amp;#39;February&amp;#39;&lt;/span&gt;) &lt;span style=&quot;color: #000000; font-weight: bold&quot;&gt;or&lt;/span&gt;
                   (r[&lt;span style=&quot;background-color: #fff0f0&quot;&gt;&amp;#39;Month&amp;#39;&lt;/span&gt;] &lt;span style=&quot;color: #333333&quot;&gt;==&lt;/span&gt; &lt;span style=&quot;background-color: #fff0f0&quot;&gt;&amp;#39;March&amp;#39;&lt;/span&gt;) &lt;span style=&quot;color: #000000; font-weight: bold&quot;&gt;or&lt;/span&gt;
                   (r[&lt;span style=&quot;background-color: #fff0f0&quot;&gt;&amp;#39;Month&amp;#39;&lt;/span&gt;] &lt;span style=&quot;color: #333333&quot;&gt;==&lt;/span&gt; &lt;span style=&quot;background-color: #fff0f0&quot;&gt;&amp;#39;April&amp;#39;&lt;/span&gt;)))
    &lt;span style=&quot;color: #008800; font-weight: bold&quot;&gt;return&lt;/span&gt; r

df &lt;span style=&quot;color: #333333&quot;&gt;=&lt;/span&gt; df&lt;span style=&quot;color: #333333&quot;&gt;.&lt;/span&gt;apply(make_regular, axis&lt;span style=&quot;color: #333333&quot;&gt;=&lt;/span&gt;&lt;span style=&quot;color: #0000DD; font-weight: bold&quot;&gt;1&lt;/span&gt;, args&lt;span style=&quot;color: #333333&quot;&gt;=&lt;/span&gt;(&lt;span style=&quot;background-color: #fff0f0&quot;&gt;&amp;#39;Alice&amp;#39;&lt;/span&gt;,))
df &lt;span style=&quot;color: #333333&quot;&gt;=&lt;/span&gt; df&lt;span style=&quot;color: #333333&quot;&gt;.&lt;/span&gt;apply(make_regular, axis&lt;span style=&quot;color: #333333&quot;&gt;=&lt;/span&gt;&lt;span style=&quot;color: #0000DD; font-weight: bold&quot;&gt;1&lt;/span&gt;, args&lt;span style=&quot;color: #333333&quot;&gt;=&lt;/span&gt;(&lt;span style=&quot;background-color: #fff0f0&quot;&gt;&amp;#39;Bob&amp;#39;&lt;/span&gt;,))
regular &lt;span style=&quot;color: #333333&quot;&gt;=&lt;/span&gt; ((df[&lt;span style=&quot;background-color: #fff0f0&quot;&gt;&amp;#39;visited&amp;#39;&lt;/span&gt;] &lt;span style=&quot;color: #333333&quot;&gt;==&lt;/span&gt; &lt;span style=&quot;color: #008800; font-weight: bold&quot;&gt;True&lt;/span&gt;) &lt;span style=&quot;color: #333333&quot;&gt;&amp;amp;&lt;/span&gt; 
          (df[&lt;span style=&quot;background-color: #fff0f0&quot;&gt;&amp;#39;visited&amp;#39;&lt;/span&gt;]&lt;span style=&quot;color: #333333&quot;&gt;.&lt;/span&gt;shift(&lt;span style=&quot;color: #333333&quot;&gt;-&lt;/span&gt;&lt;span style=&quot;color: #0000DD; font-weight: bold&quot;&gt;1&lt;/span&gt;) &lt;span style=&quot;color: #333333&quot;&gt;==&lt;/span&gt; &lt;span style=&quot;color: #008800; font-weight: bold&quot;&gt;True&lt;/span&gt;) &lt;span style=&quot;color: #333333&quot;&gt;&amp;amp;&lt;/span&gt; 
          (df[&lt;span style=&quot;background-color: #fff0f0&quot;&gt;&amp;#39;visited&amp;#39;&lt;/span&gt;]&lt;span style=&quot;color: #333333&quot;&gt;.&lt;/span&gt;shift(&lt;span style=&quot;color: #333333&quot;&gt;-&lt;/span&gt;&lt;span style=&quot;color: #0000DD; font-weight: bold&quot;&gt;2&lt;/span&gt;) &lt;span style=&quot;color: #333333&quot;&gt;==&lt;/span&gt; &lt;span style=&quot;color: #008800; font-weight: bold&quot;&gt;True&lt;/span&gt;))
df[regular][&lt;span style=&quot;background-color: #fff0f0&quot;&gt;&amp;#39;index&amp;#39;&lt;/span&gt;]&lt;span style=&quot;color: #333333&quot;&gt;.&lt;/span&gt;values&lt;span style=&quot;color: #333333&quot;&gt;.&lt;/span&gt;tolist()
&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;/div&gt;

&lt;br /&gt;
Output:&lt;br/&gt;
&lt;!-- HTML generated using hilite.me --&gt;&lt;div style=&quot;background: #ffffff; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;&quot;&gt;&lt;table&gt;&lt;tr&gt;&lt;td&gt;&lt;pre style=&quot;margin: 0; line-height: 125%&quot;&gt;[&lt;span style=&quot;background-color: #fff0f0&quot;&gt;&amp;#39;Alice&amp;#39;&lt;/span&gt;, &lt;span style=&quot;background-color: #fff0f0&quot;&gt;&amp;#39;Bob&amp;#39;&lt;/span&gt;]
&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;/div&gt;

&lt;br /&gt;
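One caveat with shifting the whole column: the shift also looks across the boundary between two people, so Alice&#39;s November and December followed by Bob&#39;s January would count as three months in a row. A possible variation, sketched here with made-up data and a hypothetical helper starts_run_of_three, applies the same shift inside each person&#39;s group. It assumes each person&#39;s rows are in month order.&lt;br /&gt;

```python
import calendar

import numpy as np
import pandas as pd

months = list(calendar.month_name)[1:]   # "January" .. "December"
people = ["Alice", "Bob", "Bridgett"]

# Made-up visit data for illustration only.
visits = {
    "Alice": {"November", "December"},
    "Bob": {"January"},
    "Bridgett": {"March", "April", "May"},
}
rows = [(p, m, m in visits[p]) for p in people for m in months]
df = pd.DataFrame(rows, columns=["name", "Month", "visited"])


def starts_run_of_three(visited):
    """True where this row and the next two rows are all visits."""
    nxt1 = visited.shift(-1, fill_value=False)
    nxt2 = visited.shift(-2, fill_value=False)
    hit = np.logical_and.reduce(
        [visited.to_numpy(), nxt1.to_numpy(), nxt2.to_numpy()]
    )
    return pd.Series(hit, index=visited.index)


# Shift over the whole column: runs can spill from one person into the next.
global_mask = starts_run_of_three(df["visited"])
global_regulars = df.loc[global_mask, "name"].unique().tolist()

# Shift inside each person's rows only.
per_person_mask = df.groupby("name")["visited"].transform(starts_run_of_three)
per_person_regulars = df.loc[per_person_mask, "name"].unique().tolist()

print(global_regulars)
print(per_person_regulars)
```

With this data the whole-column version reports Alice and Bridgett, while the grouped version reports only Bridgett.&lt;br /&gt;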
&lt;blockquote&gt;
&lt;br /&gt;&lt;/blockquote&gt;
</content><link rel='replies' type='application/atom+xml' href='http://thushw.blogspot.com/feeds/8376247330446204838/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/11265228/8376247330446204838' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/11265228/posts/default/8376247330446204838'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/11265228/posts/default/8376247330446204838'/><link rel='alternate' type='text/html' href='http://thushw.blogspot.com/2018/04/pandas-snippets.html' title='Pandas snippets'/><author><name>Unknown</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-11265228.post-9105221548962062601</id><published>2018-03-23T19:32:00.001-07:00</published><updated>2018-03-23T19:33:29.677-07:00</updated><title type='text'>Pushing your code to pypi</title><content type='html'>&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: center;&quot;&gt;
&lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEixnR_XGFLHOl5SOkVh7X49w9gOtWarRV-a6EKgH9euivLAOAtf3e3hDe6naGOKcTYEMnsV8TI0-2ahySjVaoA-uubPRKeiVNrOtb1zSghlM4xxTNbDyxvdsD6CTUuYOkSq3f2dtQ/s1600/pylogo.jpeg&quot; imageanchor=&quot;1&quot; style=&quot;clear: left; float: left; margin-bottom: 1em; margin-right: 1em;&quot;&gt;&lt;img border=&quot;0&quot; data-original-height=&quot;116&quot; data-original-width=&quot;116&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEixnR_XGFLHOl5SOkVh7X49w9gOtWarRV-a6EKgH9euivLAOAtf3e3hDe6naGOKcTYEMnsV8TI0-2ahySjVaoA-uubPRKeiVNrOtb1zSghlM4xxTNbDyxvdsD6CTUuYOkSq3f2dtQ/s1600/pylogo.jpeg&quot; /&gt;&lt;/a&gt;&lt;/div&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;a href=&quot;http://peterdowns.com/posts/first-time-with-pypi.html&quot; target=&quot;_blank&quot;&gt;Here is a good document&lt;/a&gt; that describes how to push your code to the PyPI repository.&lt;br /&gt;
&lt;div&gt;
&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
One URL has changed slightly since that document was written. In your ~/.pypirc, set the test repository URL as follows:&lt;/div&gt;
&lt;div&gt;
&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
&lt;style type=&quot;text/css&quot;&gt;
p.p1 {margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo; color: #000000; background-color: #ffffff}
span.s1 {font-variant-ligatures: no-common-ligatures}
&lt;/style&gt;


&lt;br /&gt;
&lt;div class=&quot;p1&quot;&gt;
&lt;span class=&quot;s1&quot;&gt;[pypitest]&lt;/span&gt;&lt;/div&gt;
&lt;div class=&quot;p1&quot;&gt;
&lt;span class=&quot;s1&quot;&gt;repository=https://test.pypi.org/legacy/&lt;/span&gt;&lt;/div&gt;
&lt;div class=&quot;p1&quot;&gt;
&lt;span class=&quot;s1&quot;&gt;&lt;span style=&quot;font-family: &amp;quot;times&amp;quot;; font-size: small; font-variant-ligatures: normal;&quot;&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;
&lt;div class=&quot;p1&quot;&gt;
&lt;span class=&quot;s1&quot;&gt;&lt;span style=&quot;font-family: &amp;quot;times&amp;quot;; font-size: small; font-variant-ligatures: normal;&quot;&gt;The register step is no longer required. All you need to do is upload the files.&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;
&lt;div class=&quot;p1&quot;&gt;
&lt;span class=&quot;s1&quot;&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;
&lt;div class=&quot;p1&quot;&gt;
&lt;span class=&quot;s1&quot;&gt;python setup.py sdist upload -r pypitest&lt;/span&gt;&lt;/div&gt;
&lt;div class=&quot;p1&quot;&gt;
&lt;br /&gt;&lt;/div&gt;
&lt;div class=&quot;p1&quot;&gt;
&lt;span style=&quot;font-family: &amp;quot;times&amp;quot;; font-size: small;&quot;&gt;Each time you initiate an upload, you&#39;d need to change the version number and the URL.&lt;/span&gt;&lt;br /&gt;
&lt;span style=&quot;font-family: &amp;quot;times&amp;quot;; font-size: small;&quot;&gt;&lt;br /&gt;&lt;/span&gt;
&lt;span style=&quot;font-family: &amp;quot;times&amp;quot;; font-size: small;&quot;&gt;This uploaded the package to test.pypi.org, but the same command failed against pypi.org, where the upload steps had changed:&lt;/span&gt;&lt;br /&gt;
&lt;span style=&quot;font-family: &amp;quot;times&amp;quot;; font-size: small;&quot;&gt;&lt;br /&gt;&lt;/span&gt;
&lt;span style=&quot;font-family: &amp;quot;times&amp;quot;; font-size: small;&quot;&gt;




&lt;style type=&quot;text/css&quot;&gt;
p.p1 {margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo; color: #000000; background-color: #ffffff}
p.p2 {margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo; color: #000000; background-color: #ffffff; min-height: 13.0px}
span.s1 {font-variant-ligatures: no-common-ligatures}
span.s2 {font-variant-ligatures: no-common-ligatures; color: #33bbc8}
&lt;/style&gt;


&lt;/span&gt;&lt;br /&gt;
&lt;div class=&quot;p1&quot;&gt;
&lt;span style=&quot;font-family: &amp;quot;times&amp;quot;; font-size: small;&quot;&gt;&lt;span class=&quot;s1&quot;&gt;thushara@ figleaf &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;(master)&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;$ python setup.py sdist upload -r pypi&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;
&lt;span style=&quot;font-family: &amp;quot;times&amp;quot;; font-size: small;&quot;&gt;
&lt;/span&gt;
&lt;div class=&quot;p1&quot;&gt;
&lt;span style=&quot;font-family: &amp;quot;times&amp;quot;; font-size: small;&quot;&gt;&lt;span class=&quot;s1&quot;&gt;/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/distutils/dist.py:267: UserWarning: Unknown distribution option: &#39;install_requires&#39;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;
&lt;span style=&quot;font-family: &amp;quot;times&amp;quot;; font-size: small;&quot;&gt;
&lt;div class=&quot;p1&quot;&gt;
&lt;span class=&quot;s1&quot;&gt;&lt;span class=&quot;Apple-converted-space&quot;&gt;&amp;nbsp; &lt;/span&gt;warnings.warn(msg)&lt;/span&gt;&lt;/div&gt;
&lt;div class=&quot;p1&quot;&gt;
&lt;span class=&quot;s1&quot;&gt;running sdist&lt;/span&gt;&lt;/div&gt;
&lt;div class=&quot;p1&quot;&gt;
&lt;span class=&quot;s1&quot;&gt;running check&lt;/span&gt;&lt;/div&gt;
&lt;div class=&quot;p1&quot;&gt;
&lt;span class=&quot;s1&quot;&gt;warning: sdist: manifest template &#39;MANIFEST.in&#39; does not exist (using default file list)&lt;/span&gt;&lt;/div&gt;
&lt;div class=&quot;p2&quot;&gt;
&lt;span class=&quot;s1&quot;&gt;&lt;/span&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div class=&quot;p1&quot;&gt;
&lt;span class=&quot;s1&quot;&gt;warning: sdist: standard file not found: should have one of README, README.txt&lt;/span&gt;&lt;/div&gt;
&lt;div class=&quot;p2&quot;&gt;
&lt;span class=&quot;s1&quot;&gt;&lt;/span&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div class=&quot;p1&quot;&gt;
&lt;span class=&quot;s1&quot;&gt;writing manifest file &#39;MANIFEST&#39;&lt;/span&gt;&lt;/div&gt;
&lt;div class=&quot;p1&quot;&gt;
&lt;span class=&quot;s1&quot;&gt;creating figleaf-0.2&lt;/span&gt;&lt;/div&gt;
&lt;div class=&quot;p1&quot;&gt;
&lt;span class=&quot;s1&quot;&gt;creating figleaf-0.2/figleaf&lt;/span&gt;&lt;/div&gt;
&lt;div class=&quot;p1&quot;&gt;
&lt;span class=&quot;s1&quot;&gt;making hard links in figleaf-0.2...&lt;/span&gt;&lt;/div&gt;
&lt;div class=&quot;p1&quot;&gt;
&lt;span class=&quot;s1&quot;&gt;hard linking setup.cfg -&amp;gt; figleaf-0.2&lt;/span&gt;&lt;/div&gt;
&lt;div class=&quot;p1&quot;&gt;
&lt;span class=&quot;s1&quot;&gt;hard linking setup.py -&amp;gt; figleaf-0.2&lt;/span&gt;&lt;/div&gt;
&lt;div class=&quot;p1&quot;&gt;
&lt;span class=&quot;s1&quot;&gt;hard linking figleaf/__init__.py -&amp;gt; figleaf-0.2/figleaf&lt;/span&gt;&lt;/div&gt;
&lt;div class=&quot;p1&quot;&gt;
&lt;span class=&quot;s1&quot;&gt;hard linking figleaf/graph.py -&amp;gt; figleaf-0.2/figleaf&lt;/span&gt;&lt;/div&gt;
&lt;div class=&quot;p1&quot;&gt;
&lt;span class=&quot;s1&quot;&gt;Creating tar archive&lt;/span&gt;&lt;/div&gt;
&lt;div class=&quot;p1&quot;&gt;
&lt;span class=&quot;s1&quot;&gt;removing &#39;figleaf-0.2&#39; (and everything under it)&lt;/span&gt;&lt;/div&gt;
&lt;div class=&quot;p1&quot;&gt;
&lt;span class=&quot;s1&quot;&gt;running upload&lt;/span&gt;&lt;/div&gt;
&lt;div class=&quot;p1&quot;&gt;
&lt;span class=&quot;s1&quot;&gt;Submitting dist/figleaf-0.2.tar.gz to https://pypi.python.org/pypi&lt;/span&gt;&lt;/div&gt;
&lt;div class=&quot;p1&quot;&gt;
&lt;span class=&quot;s1&quot;&gt;Upload failed (410): Gone (This API has been deprecated and removed from legacy PyPI in favor of using the APIs available in the new PyPI.org implementation of PyPI (located at https://pypi.org/). For more information about migrating your use of this API to PyPI.org, please see https://packaging.python.org/guides/migrating-to-pypi-org/#uploading. For more information about the sunsetting of this API, please see https://mail.python.org/pipermail/distutils-sig/2017-June/030766.html)&lt;/span&gt;&lt;/div&gt;
&lt;div class=&quot;p1&quot;&gt;
&lt;span class=&quot;s1&quot;&gt;error: Upload failed (410): Gone (This API has been deprecated and removed from legacy PyPI in favor of using the APIs available in the new PyPI.org implementation of PyPI (located at https://pypi.org/). For more information about migrating your use of this API to PyPI.org, please see https://packaging.python.org/guides/migrating-to-pypi-org/#uploading. For more information about the sunsetting of this API, please see https://mail.python.org/pipermail/distutils-sig/2017-June/030766.html)&lt;/span&gt;&lt;/div&gt;
&lt;div class=&quot;p1&quot;&gt;
&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
&lt;span style=&quot;font-family: &amp;quot;times&amp;quot;; font-size: small;&quot;&gt;To upload to PyPI I used &lt;a href=&quot;https://pypi.python.org/pypi/twine&quot; target=&quot;_blank&quot;&gt;twine&lt;/a&gt;. Installing it on macOS High Sierra required disabling SIP (System Integrity Protection).&lt;/span&gt;&lt;/div&gt;
&lt;div&gt;
&lt;span style=&quot;font-family: &amp;quot;times&amp;quot;; font-size: small;&quot;&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;
&lt;div&gt;
In ~/.pypirc, I removed the repository line under [pypi], then built the source distribution:&lt;/div&gt;
&lt;div&gt;
&lt;br /&gt;&lt;/div&gt;
&lt;/span&gt;&lt;br /&gt;
&lt;blockquote class=&quot;tr_bq&quot;&gt;
python setup.py sdist&lt;/blockquote&gt;
&lt;/div&gt;
&lt;div class=&quot;p1&quot;&gt;
&lt;br /&gt;&lt;/div&gt;
&lt;div class=&quot;p1&quot;&gt;
Remove any old tarballs under dist, then upload:&lt;/div&gt;
&lt;div class=&quot;p1&quot;&gt;
&lt;br /&gt;&lt;/div&gt;
&lt;blockquote class=&quot;tr_bq&quot;&gt;
twine upload dist/*&lt;/blockquote&gt;
&lt;/div&gt;
&lt;div&gt;
&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
Now I could see the &lt;a href=&quot;https://pypi.python.org/pypi/wildhops/0.2&quot; target=&quot;_blank&quot;&gt;project on PyPI&lt;/a&gt;.&lt;/div&gt;
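&lt;br /&gt;
As a sanity check on the two ~/.pypirc changes described in this post, the file can be parsed with Python&#39;s configparser. This is only a sketch: the sample text follows the usual .pypirc INI layout, the username is a placeholder, and check_pypirc is a hypothetical helper, not part of twine or setuptools.&lt;br /&gt;

```python
import configparser

# Hypothetical ~/.pypirc contents after the changes described in this post.
# The username values are placeholders.
PYPIRC = """
[distutils]
index-servers =
    pypi
    pypitest

[pypi]
username = your_username

[pypitest]
repository = https://test.pypi.org/legacy/
username = your_username
"""

TEST_URL = "https://test.pypi.org/legacy/"


def check_pypirc(text):
    """Return a list of problems found in a .pypirc-style INI string."""
    cfg = configparser.ConfigParser()
    cfg.read_string(text)
    problems = []
    if cfg.has_option("pypi", "repository"):
        problems.append("[pypi] still has a repository line")
    if cfg.get("pypitest", "repository", fallback=None) != TEST_URL:
        problems.append("[pypitest] repository is not the test.pypi.org URL")
    return problems


print(check_pypirc(PYPIRC))
```

An empty list means both changes are in place. To check a real file, pass its contents to check_pypirc instead of the sample string.&lt;br /&gt;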
</content><link rel='replies' type='application/atom+xml' href='http://thushw.blogspot.com/feeds/9105221548962062601/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/11265228/9105221548962062601' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/11265228/posts/default/9105221548962062601'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/11265228/posts/default/9105221548962062601'/><link rel='alternate' type='text/html' href='http://thushw.blogspot.com/2018/03/pushing-your-code-to-pypi.html' title='Pushing your code to pypi'/><author><name>Unknown</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEixnR_XGFLHOl5SOkVh7X49w9gOtWarRV-a6EKgH9euivLAOAtf3e3hDe6naGOKcTYEMnsV8TI0-2ahySjVaoA-uubPRKeiVNrOtb1zSghlM4xxTNbDyxvdsD6CTUuYOkSq3f2dtQ/s72-c/pylogo.jpeg" height="72" width="72"/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-11265228.post-4374706422187144834</id><published>2018-03-23T18:47:00.002-07:00</published><updated>2018-03-23T18:48:19.568-07:00</updated><title type='text'>Installing Twine on MacOS High Sierra</title><content type='html'>&lt;style type=&quot;text/css&quot;&gt;
p.p1 {margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo; color: #000000; background-color: #ffffff}
p.p2 {margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo; color: #c33720; background-color: #ffffff}
span.s1 {font-variant-ligatures: no-common-ligatures}
span.s2 {font-variant-ligatures: no-common-ligatures; color: #33bbc8}
span.s3 {font-variant-ligatures: no-common-ligatures; color: #c33720}
&lt;/style&gt;


&lt;br /&gt;
&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: center;&quot;&gt;
&lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEixnR_XGFLHOl5SOkVh7X49w9gOtWarRV-a6EKgH9euivLAOAtf3e3hDe6naGOKcTYEMnsV8TI0-2ahySjVaoA-uubPRKeiVNrOtb1zSghlM4xxTNbDyxvdsD6CTUuYOkSq3f2dtQ/s1600/pylogo.jpeg&quot; imageanchor=&quot;1&quot; style=&quot;clear: left; float: left; margin-bottom: 1em; margin-right: 1em;&quot;&gt;&lt;img border=&quot;0&quot; data-original-height=&quot;116&quot; data-original-width=&quot;116&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEixnR_XGFLHOl5SOkVh7X49w9gOtWarRV-a6EKgH9euivLAOAtf3e3hDe6naGOKcTYEMnsV8TI0-2ahySjVaoA-uubPRKeiVNrOtb1zSghlM4xxTNbDyxvdsD6CTUuYOkSq3f2dtQ/s1600/pylogo.jpeg&quot; /&gt;&lt;/a&gt;&lt;/div&gt;
&lt;div class=&quot;p1&quot;&gt;
&lt;span class=&quot;s1&quot;&gt;thushara@ wildhops &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;(master)&lt;/span&gt;&lt;span class=&quot;s3&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;$ sudo -H pip install twine&lt;/span&gt;&lt;/div&gt;
&lt;div class=&quot;p1&quot;&gt;
&lt;span class=&quot;s1&quot;&gt;Password:&lt;/span&gt;&lt;/div&gt;
&lt;div class=&quot;p1&quot;&gt;
&lt;span class=&quot;s1&quot;&gt;Collecting twine&lt;/span&gt;&lt;/div&gt;
&lt;div class=&quot;p1&quot;&gt;
&lt;span class=&quot;s1&quot;&gt;&lt;span class=&quot;Apple-converted-space&quot;&gt;&amp;nbsp; &lt;/span&gt;Downloading twine-1.11.0-py2.py3-none-any.whl&lt;/span&gt;&lt;/div&gt;
&lt;div class=&quot;p1&quot;&gt;
&lt;span class=&quot;s1&quot;&gt;Collecting pkginfo&amp;gt;=1.4.2 (from twine)&lt;/span&gt;&lt;/div&gt;
&lt;div class=&quot;p1&quot;&gt;
&lt;span class=&quot;s1&quot;&gt;&lt;span class=&quot;Apple-converted-space&quot;&gt;&amp;nbsp; &lt;/span&gt;Downloading pkginfo-1.4.2-py2.py3-none-any.whl&lt;/span&gt;&lt;/div&gt;
&lt;div class=&quot;p1&quot;&gt;
&lt;span class=&quot;s1&quot;&gt;Requirement already satisfied: setuptools&amp;gt;=0.7.0 in /System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python (from twine)&lt;/span&gt;&lt;/div&gt;
&lt;div class=&quot;p1&quot;&gt;
&lt;span class=&quot;s1&quot;&gt;Collecting tqdm&amp;gt;=4.14 (from twine)&lt;/span&gt;&lt;/div&gt;
&lt;div class=&quot;p1&quot;&gt;
&lt;span class=&quot;s1&quot;&gt;&lt;span class=&quot;Apple-converted-space&quot;&gt;&amp;nbsp; &lt;/span&gt;Downloading tqdm-4.19.8-py2.py3-none-any.whl (52kB)&lt;/span&gt;&lt;/div&gt;
&lt;div class=&quot;p1&quot;&gt;
&lt;span class=&quot;s1&quot;&gt;&lt;span class=&quot;Apple-converted-space&quot;&gt;&amp;nbsp; &amp;nbsp; &lt;/span&gt;100% |████████████████████████████████| 61kB 2.1MB/s&lt;span class=&quot;Apple-converted-space&quot;&gt;&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;
&lt;div class=&quot;p1&quot;&gt;
&lt;span class=&quot;s1&quot;&gt;Collecting requests-toolbelt&amp;gt;=0.8.0 (from twine)&lt;/span&gt;&lt;/div&gt;
&lt;div class=&quot;p1&quot;&gt;
&lt;span class=&quot;s1&quot;&gt;&lt;span class=&quot;Apple-converted-space&quot;&gt;&amp;nbsp; &lt;/span&gt;Downloading requests_toolbelt-0.8.0-py2.py3-none-any.whl (54kB)&lt;/span&gt;&lt;/div&gt;
&lt;div class=&quot;p1&quot;&gt;
&lt;span class=&quot;s1&quot;&gt;&lt;span class=&quot;Apple-converted-space&quot;&gt;&amp;nbsp; &amp;nbsp; &lt;/span&gt;100% |████████████████████████████████| 61kB 1.6MB/s&lt;span class=&quot;Apple-converted-space&quot;&gt;&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;
&lt;div class=&quot;p1&quot;&gt;
&lt;span class=&quot;s1&quot;&gt;Requirement already satisfied: requests!=2.15,!=2.16,&amp;gt;=2.5.0 in /Library/Python/2.7/site-packages (from twine)&lt;/span&gt;&lt;/div&gt;
&lt;div class=&quot;p1&quot;&gt;
&lt;span class=&quot;s1&quot;&gt;Installing collected packages: pkginfo, tqdm, requests-toolbelt, twine&lt;/span&gt;&lt;/div&gt;
&lt;div class=&quot;p2&quot;&gt;
&lt;span class=&quot;s1&quot;&gt;Exception:&lt;/span&gt;&lt;/div&gt;
&lt;div class=&quot;p2&quot;&gt;
&lt;span class=&quot;s1&quot;&gt;Traceback (most recent call last):&lt;/span&gt;&lt;/div&gt;
&lt;div class=&quot;p2&quot;&gt;
&lt;span class=&quot;s1&quot;&gt;&lt;span class=&quot;Apple-converted-space&quot;&gt;&amp;nbsp; &lt;/span&gt;File &quot;/Library/Python/2.7/site-packages/pip/basecommand.py&quot;, line 215, in main&lt;/span&gt;&lt;/div&gt;
&lt;div class=&quot;p2&quot;&gt;
&lt;span class=&quot;s1&quot;&gt;&lt;span class=&quot;Apple-converted-space&quot;&gt;&amp;nbsp; &amp;nbsp; &lt;/span&gt;status = self.run(options, args)&lt;/span&gt;&lt;/div&gt;
&lt;div class=&quot;p2&quot;&gt;
&lt;span class=&quot;s1&quot;&gt;&lt;span class=&quot;Apple-converted-space&quot;&gt;&amp;nbsp; &lt;/span&gt;File &quot;/Library/Python/2.7/site-packages/pip/commands/install.py&quot;, line 342, in run&lt;/span&gt;&lt;/div&gt;
&lt;div class=&quot;p2&quot;&gt;
&lt;span class=&quot;s1&quot;&gt;&lt;span class=&quot;Apple-converted-space&quot;&gt;&amp;nbsp; &amp;nbsp; &lt;/span&gt;prefix=options.prefix_path,&lt;/span&gt;&lt;/div&gt;
&lt;div class=&quot;p2&quot;&gt;
&lt;span class=&quot;s1&quot;&gt;&lt;span class=&quot;Apple-converted-space&quot;&gt;&amp;nbsp; &lt;/span&gt;File &quot;/Library/Python/2.7/site-packages/pip/req/req_set.py&quot;, line 784, in install&lt;/span&gt;&lt;/div&gt;
&lt;div class=&quot;p2&quot;&gt;
&lt;span class=&quot;s1&quot;&gt;&lt;span class=&quot;Apple-converted-space&quot;&gt;&amp;nbsp; &amp;nbsp; &lt;/span&gt;**kwargs&lt;/span&gt;&lt;/div&gt;
&lt;div class=&quot;p2&quot;&gt;
&lt;span class=&quot;s1&quot;&gt;&lt;span class=&quot;Apple-converted-space&quot;&gt;&amp;nbsp; &lt;/span&gt;File &quot;/Library/Python/2.7/site-packages/pip/req/req_install.py&quot;, line 851, in install&lt;/span&gt;&lt;/div&gt;
&lt;div class=&quot;p2&quot;&gt;
&lt;span class=&quot;s1&quot;&gt;&lt;span class=&quot;Apple-converted-space&quot;&gt;&amp;nbsp; &amp;nbsp; &lt;/span&gt;self.move_wheel_files(self.source_dir, root=root, prefix=prefix)&lt;/span&gt;&lt;/div&gt;
&lt;div class=&quot;p2&quot;&gt;
&lt;span class=&quot;s1&quot;&gt;&lt;span class=&quot;Apple-converted-space&quot;&gt;&amp;nbsp; &lt;/span&gt;File &quot;/Library/Python/2.7/site-packages/pip/req/req_install.py&quot;, line 1064, in move_wheel_files&lt;/span&gt;&lt;/div&gt;
&lt;div class=&quot;p2&quot;&gt;
&lt;span class=&quot;s1&quot;&gt;&lt;span class=&quot;Apple-converted-space&quot;&gt;&amp;nbsp; &amp;nbsp; &lt;/span&gt;isolated=self.isolated,&lt;/span&gt;&lt;/div&gt;
&lt;div class=&quot;p2&quot;&gt;
&lt;span class=&quot;s1&quot;&gt;&lt;span class=&quot;Apple-converted-space&quot;&gt;&amp;nbsp; &lt;/span&gt;File &quot;/Library/Python/2.7/site-packages/pip/wheel.py&quot;, line 377, in move_wheel_files&lt;/span&gt;&lt;/div&gt;
&lt;div class=&quot;p2&quot;&gt;
&lt;span class=&quot;s1&quot;&gt;&lt;span class=&quot;Apple-converted-space&quot;&gt;&amp;nbsp; &amp;nbsp; &lt;/span&gt;clobber(source, dest, False, fixer=fixer, filter=filter)&lt;/span&gt;&lt;/div&gt;
&lt;div class=&quot;p2&quot;&gt;
&lt;span class=&quot;s1&quot;&gt;&lt;span class=&quot;Apple-converted-space&quot;&gt;&amp;nbsp; &lt;/span&gt;File &quot;/Library/Python/2.7/site-packages/pip/wheel.py&quot;, line 316, in clobber&lt;/span&gt;&lt;/div&gt;
&lt;div class=&quot;p2&quot;&gt;
&lt;span class=&quot;s1&quot;&gt;&lt;span class=&quot;Apple-converted-space&quot;&gt;&amp;nbsp; &amp;nbsp; &lt;/span&gt;ensure_dir(destdir)&lt;/span&gt;&lt;/div&gt;
&lt;div class=&quot;p2&quot;&gt;
&lt;span class=&quot;s1&quot;&gt;&lt;span class=&quot;Apple-converted-space&quot;&gt;&amp;nbsp; &lt;/span&gt;File &quot;/Library/Python/2.7/site-packages/pip/utils/__init__.py&quot;, line 83, in ensure_dir&lt;/span&gt;&lt;/div&gt;
&lt;div class=&quot;p2&quot;&gt;
&lt;span class=&quot;s1&quot;&gt;&lt;span class=&quot;Apple-converted-space&quot;&gt;&amp;nbsp; &amp;nbsp; &lt;/span&gt;os.makedirs(path)&lt;/span&gt;&lt;/div&gt;
&lt;div class=&quot;p2&quot;&gt;
&lt;span class=&quot;s1&quot;&gt;&lt;span class=&quot;Apple-converted-space&quot;&gt;&amp;nbsp; &lt;/span&gt;File &quot;/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/os.py&quot;, line 150, in makedirs&lt;/span&gt;&lt;/div&gt;
&lt;div class=&quot;p2&quot;&gt;
&lt;span class=&quot;s1&quot;&gt;&lt;span class=&quot;Apple-converted-space&quot;&gt;&amp;nbsp; &amp;nbsp; &lt;/span&gt;makedirs(head, mode)&lt;/span&gt;&lt;/div&gt;
&lt;div class=&quot;p2&quot;&gt;
&lt;span class=&quot;s1&quot;&gt;&lt;span class=&quot;Apple-converted-space&quot;&gt;&amp;nbsp; &lt;/span&gt;File &quot;/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/os.py&quot;, line 157, in makedirs&lt;/span&gt;&lt;/div&gt;
&lt;div class=&quot;p2&quot;&gt;
&lt;span class=&quot;s1&quot;&gt;&lt;span class=&quot;Apple-converted-space&quot;&gt;&amp;nbsp; &amp;nbsp; &lt;/span&gt;mkdir(name, mode)&lt;/span&gt;&lt;/div&gt;
&lt;div class=&quot;p2&quot;&gt;
&lt;span class=&quot;s1&quot;&gt;OSError: [Errno 1] Operation not permitted: &#39;/System/Library/Frameworks/Python.framework/Versions/2.7/man&#39;&lt;/span&gt;&lt;/div&gt;
&lt;br /&gt;
The install fails because /System is protected by System Integrity Protection (SIP). The only way to get write access under /System is to boot into Recovery Mode and run this command in Terminal:&lt;br /&gt;
&lt;br /&gt;
&lt;span style=&quot;background-color: #eff0f1; color: #242729; font-family: &amp;quot;consolas&amp;quot; , &amp;quot;menlo&amp;quot; , &amp;quot;monaco&amp;quot; , &amp;quot;lucida console&amp;quot; , &amp;quot;liberation mono&amp;quot; , &amp;quot;dejavu sans mono&amp;quot; , &amp;quot;bitstream vera sans mono&amp;quot; , &amp;quot;courier new&amp;quot; , monospace , sans-serif; font-size: 13px; white-space: pre-wrap;&quot;&gt;csrutil disable&lt;/span&gt;&lt;br /&gt;
&lt;span style=&quot;background-color: #eff0f1; color: #242729; font-family: &amp;quot;consolas&amp;quot; , &amp;quot;menlo&amp;quot; , &amp;quot;monaco&amp;quot; , &amp;quot;lucida console&amp;quot; , &amp;quot;liberation mono&amp;quot; , &amp;quot;dejavu sans mono&amp;quot; , &amp;quot;bitstream vera sans mono&amp;quot; , &amp;quot;courier new&amp;quot; , monospace , sans-serif; font-size: 13px; white-space: pre-wrap;&quot;&gt;&lt;br /&gt;&lt;/span&gt;
&lt;span style=&quot;background-color: #eff0f1; color: #242729; font-family: &amp;quot;consolas&amp;quot; , &amp;quot;menlo&amp;quot; , &amp;quot;monaco&amp;quot; , &amp;quot;lucida console&amp;quot; , &amp;quot;liberation mono&amp;quot; , &amp;quot;dejavu sans mono&amp;quot; , &amp;quot;bitstream vera sans mono&amp;quot; , &amp;quot;courier new&amp;quot; , monospace , sans-serif; font-size: 13px; white-space: pre-wrap;&quot;&gt;&lt;br /&gt;&lt;/span&gt;
Reboot, then run the install again.&lt;br /&gt;
&lt;br /&gt;</content><link rel='replies' type='application/atom+xml' href='http://thushw.blogspot.com/feeds/4374706422187144834/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/11265228/4374706422187144834' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/11265228/posts/default/4374706422187144834'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/11265228/posts/default/4374706422187144834'/><link rel='alternate' type='text/html' href='http://thushw.blogspot.com/2018/03/installing-twine-on-macos-high-sierra.html' title='Installing Twine on MacOS High Sierra'/><author><name>Unknown</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEixnR_XGFLHOl5SOkVh7X49w9gOtWarRV-a6EKgH9euivLAOAtf3e3hDe6naGOKcTYEMnsV8TI0-2ahySjVaoA-uubPRKeiVNrOtb1zSghlM4xxTNbDyxvdsD6CTUuYOkSq3f2dtQ/s72-c/pylogo.jpeg" height="72" width="72"/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-11265228.post-7505790132490204793</id><published>2018-03-22T16:03:00.000-07:00</published><updated>2018-03-22T16:03:13.517-07:00</updated><title type='text'>A Graph in Python - and pythonic surprises</title><content type='html'>&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: center;&quot;&gt;
&lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhQgjHt-IWjPZvTuqZVDPtNjhJcXWQGGZ3P1MRvZoUfakAOxeF-JaAn3JuP1TUOYZcbfZJJOhNoDc3-BpB04mICVjPJ-pMmKbnHyjzx2-U2EWH-GywCU40F2XkGdhbCCzDEnvUb8w/s1600/graph.png&quot; imageanchor=&quot;1&quot; style=&quot;clear: left; float: left; margin-bottom: 1em; margin-right: 1em;&quot;&gt;&lt;img border=&quot;0&quot; data-original-height=&quot;300&quot; data-original-width=&quot;300&quot; height=&quot;200&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhQgjHt-IWjPZvTuqZVDPtNjhJcXWQGGZ3P1MRvZoUfakAOxeF-JaAn3JuP1TUOYZcbfZJJOhNoDc3-BpB04mICVjPJ-pMmKbnHyjzx2-U2EWH-GywCU40F2XkGdhbCCzDEnvUb8w/s200/graph.png&quot; width=&quot;200&quot; /&gt;&lt;/a&gt;&lt;/div&gt;
I started implementing a graph in Python for a project and ran into some unexpected behavior. See if you can spot the problem.&lt;br /&gt;
&lt;br /&gt;
Code for the graph is &lt;a href=&quot;https://gist.github.com/thushw/e495a494512e280d583c4d2333080444&quot; target=&quot;_blank&quot;&gt;here&lt;/a&gt;:&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
However, this is buggy. Each time an edge is added to one node, it also shows up on other nodes. Adding an edge from &#39;bellevue&#39; to &#39;lynwood&#39; added the edge to both vertices, &#39;bellevue&#39; and &#39;lynwood&#39;. Only &#39;seattle&#39;, which was created with its own edge list, is unaffected.&lt;br /&gt;
&lt;br /&gt;
Code/Output:&lt;br /&gt;
&lt;br /&gt;
&lt;div style=&quot;background: #ffffff; border-width: 0.1em 0.1em 0.1em 0.8em; border: solid gray; overflow: auto; padding: 0.2em 0.6em; width: auto;&quot;&gt;
&lt;pre style=&quot;line-height: 125%; margin: 0;&quot;&gt;g.add_node&lt;span style=&quot;color: #333333;&quot;&gt;(&lt;/span&gt;GraphNode&lt;span style=&quot;color: #333333;&quot;&gt;(&lt;/span&gt;&lt;span style=&quot;background-color: #fff0f0;&quot;&gt;&#39;seattle&#39;&lt;/span&gt;, &lt;span style=&quot;color: #333333;&quot;&gt;[&lt;/span&gt;Edge&lt;span style=&quot;color: #333333;&quot;&gt;(&lt;/span&gt;&lt;span style=&quot;background-color: #fff0f0;&quot;&gt;&#39;seattle&#39;&lt;/span&gt;, &lt;span style=&quot;background-color: #fff0f0;&quot;&gt;&#39;bellevue&#39;&lt;/span&gt;, &lt;span style=&quot;background-color: #fff0f0;&quot;&gt;&#39;dist&#39;&lt;/span&gt;, 10&lt;span style=&quot;color: #333333;&quot;&gt;)&lt;/span&gt;, Edge&lt;span style=&quot;color: #333333;&quot;&gt;(&lt;/span&gt;&lt;span style=&quot;background-color: #fff0f0;&quot;&gt;&#39;seattle&#39;&lt;/span&gt;, &lt;span style=&quot;background-color: #fff0f0;&quot;&gt;&#39;lynwood&#39;&lt;/span&gt;, &lt;span style=&quot;background-color: #fff0f0;&quot;&gt;&#39;dist&#39;&lt;/span&gt;, 20&lt;span style=&quot;color: #333333;&quot;&gt;)]))&lt;/span&gt;

g.add_edge&lt;span style=&quot;color: #333333;&quot;&gt;((&lt;/span&gt;&lt;span style=&quot;background-color: #fff0f0;&quot;&gt;&#39;bellevue&#39;&lt;/span&gt;, &lt;span style=&quot;background-color: #fff0f0;&quot;&gt;&#39;lynwood&#39;&lt;/span&gt;, &lt;span style=&quot;background-color: #fff0f0;&quot;&gt;&#39;dist&#39;&lt;/span&gt;, 5&lt;span style=&quot;color: #333333;&quot;&gt;))&lt;/span&gt;

print &lt;span style=&quot;color: #333333;&quot;&gt;(&lt;/span&gt;g&lt;span style=&quot;color: #333333;&quot;&gt;)&lt;/span&gt;

bellevue -&amp;gt; bellevue:lynwood:dist:5
lynwood -&amp;gt; bellevue:lynwood:dist:5
seattle -&amp;gt; seattle:bellevue:dist:10 seattle:lynwood:dist:20
&lt;/pre&gt;
&lt;/div&gt;
&lt;br /&gt;
After a lengthy debugging stint, the culprit turned out to be the way &lt;a href=&quot;http://docs.python-guide.org/en/latest/writing/gotchas/&quot; target=&quot;_blank&quot;&gt;Python evaluates default argument values to functions&lt;/a&gt;: once, when the function is defined, so a mutable default such as a list is shared by every call that relies on it.
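&lt;br /&gt;
&lt;br /&gt;
A minimal sketch of the usual fix, assuming a node class roughly like the one above. The GraphNode here is hypothetical and much simpler than the class in the gist; the only point is the None default, which gives every node its own edge list:&lt;br /&gt;
&lt;pre&gt;
```python
class GraphNode:
    """Hypothetical, simplified node; not the class from the gist."""

    def __init__(self, name, edges=None):
        self.name = name
        # Build a fresh list per node. A default of edges=[] would be created
        # once, at definition time, and shared by every node that relies on it.
        self.edges = [] if edges is None else list(edges)


bellevue = GraphNode('bellevue')
lynwood = GraphNode('lynwood')

# Edges are plain tuples here, purely for illustration.
bellevue.edges.append(('bellevue', 'lynwood', 'dist', 5))

print(bellevue.edges)  # [('bellevue', 'lynwood', 'dist', 5)]
print(lynwood.edges)   # []
```
&lt;/pre&gt;
With this change the edge stays on the node it was added to.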
</content><link rel='replies' type='application/atom+xml' href='http://thushw.blogspot.com/feeds/7505790132490204793/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/11265228/7505790132490204793' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/11265228/posts/default/7505790132490204793'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/11265228/posts/default/7505790132490204793'/><link rel='alternate' type='text/html' href='http://thushw.blogspot.com/2018/03/a-graph-in-python-and-pythonic-surprises.html' title='A Graph in Python - and pythonic surprises'/><author><name>Unknown</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhQgjHt-IWjPZvTuqZVDPtNjhJcXWQGGZ3P1MRvZoUfakAOxeF-JaAn3JuP1TUOYZcbfZJJOhNoDc3-BpB04mICVjPJ-pMmKbnHyjzx2-U2EWH-GywCU40F2XkGdhbCCzDEnvUb8w/s72-c/graph.png" height="72" width="72"/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-11265228.post-142951622434781866</id><published>2017-12-07T20:36:00.001-08:00</published><updated>2018-03-22T15:34:54.257-07:00</updated><title type='text'>Python : Common pitfalls</title><content type='html'>&lt;!-- HTML generated using hilite.me --&gt;&lt;br /&gt;
&lt;div style=&quot;background: #ffffff; border-width: 0.1em 0.1em 0.1em 0.8em; border: solid gray; overflow: auto; padding: 0.2em 0.6em; width: auto;&quot;&gt;
&lt;pre style=&quot;line-height: 125%; margin: 0;&quot;&gt;&lt;b&gt;Join a list if and only if all values in the list are strings:&lt;/b&gt;
Code:

print &lt;span style=&quot;color: #333333;&quot;&gt;(&lt;/span&gt;&lt;span style=&quot;background-color: #fff0f0;&quot;&gt;&quot;running %s&quot;&lt;/span&gt; % &lt;span style=&quot;background-color: #fff0f0;&quot;&gt;&#39; &#39;&lt;/span&gt;.join&lt;span style=&quot;color: #333333;&quot;&gt;(&lt;/span&gt;cmd&lt;span style=&quot;color: #333333;&quot;&gt;))&lt;/span&gt;

Error:

Traceback &lt;span style=&quot;color: #333333;&quot;&gt;(&lt;/span&gt;most recent call last&lt;span style=&quot;color: #333333;&quot;&gt;)&lt;/span&gt;:

  File &lt;span style=&quot;background-color: #fff0f0;&quot;&gt;&quot;test.py&quot;&lt;/span&gt;, line 29, in &amp;lt;module&amp;gt;

    print &lt;span style=&quot;color: #333333;&quot;&gt;(&lt;/span&gt;&lt;span style=&quot;background-color: #fff0f0;&quot;&gt;&quot;running %s&quot;&lt;/span&gt; % &lt;span style=&quot;background-color: #fff0f0;&quot;&gt;&#39; &#39;&lt;/span&gt;.join&lt;span style=&quot;color: #333333;&quot;&gt;(&lt;/span&gt;cmd&lt;span style=&quot;color: #333333;&quot;&gt;))&lt;/span&gt;

TypeError: sequence item 4: expected string, int found


Cause:

There are non-strings in the list cmd.


Ex:

&lt;span style=&quot;color: #996633;&quot;&gt;cmd&lt;/span&gt; &lt;span style=&quot;color: #333333;&quot;&gt;=&lt;/span&gt; &lt;span style=&quot;color: #333333;&quot;&gt;[&lt;/span&gt;&lt;span style=&quot;background-color: #fff0f0;&quot;&gt;&quot;runthis.py&quot;&lt;/span&gt;, &lt;span style=&quot;background-color: #fff0f0;&quot;&gt;&quot;--host&quot;&lt;/span&gt;, host, &lt;span style=&quot;background-color: #fff0f0;&quot;&gt;&quot;--port&quot;&lt;/span&gt;, port&lt;span style=&quot;color: #333333;&quot;&gt;]&lt;/span&gt;



To make join happy:

&lt;span style=&quot;color: #996633;&quot;&gt;cmd&lt;/span&gt; &lt;span style=&quot;color: #333333;&quot;&gt;=&lt;/span&gt; &lt;span style=&quot;color: #333333;&quot;&gt;[&lt;/span&gt;&lt;span style=&quot;background-color: #fff0f0;&quot;&gt;&quot;runthis.py&quot;&lt;/span&gt;, &lt;span style=&quot;background-color: #fff0f0;&quot;&gt;&quot;--host&quot;&lt;/span&gt;, host, &lt;span style=&quot;background-color: #fff0f0;&quot;&gt;&quot;--port&quot;&lt;/span&gt;, str&lt;span style=&quot;color: #333333;&quot;&gt;(&lt;/span&gt;port&lt;span style=&quot;color: #333333;&quot;&gt;)]&lt;/span&gt;



&lt;b&gt;Avoid mutable default values for arguments:&lt;/b&gt;

Code:

class A:

    def __init__&lt;span style=&quot;color: #333333;&quot;&gt;(&lt;/span&gt;self, &lt;span style=&quot;color: #996633;&quot;&gt;lst&lt;/span&gt;&lt;span style=&quot;color: #333333;&quot;&gt;=[])&lt;/span&gt;:

        self.lst &lt;span style=&quot;color: #333333;&quot;&gt;=&lt;/span&gt; lst



&lt;span style=&quot;color: #996633;&quot;&gt;a&lt;/span&gt; &lt;span style=&quot;color: #333333;&quot;&gt;=&lt;/span&gt; A&lt;span style=&quot;color: #333333;&quot;&gt;()&lt;/span&gt;

&lt;span style=&quot;color: #996633;&quot;&gt;b&lt;/span&gt; &lt;span style=&quot;color: #333333;&quot;&gt;=&lt;/span&gt; A&lt;span style=&quot;color: #333333;&quot;&gt;()&lt;/span&gt;

a.lst.append&lt;span style=&quot;color: #333333;&quot;&gt;(&lt;/span&gt;&lt;span style=&quot;background-color: #fff0f0;&quot;&gt;&#39;crocs&#39;&lt;/span&gt;&lt;span style=&quot;color: #333333;&quot;&gt;)&lt;/span&gt;


print &lt;span style=&quot;color: #333333;&quot;&gt;(&lt;/span&gt;b.lst&lt;span style=&quot;color: #333333;&quot;&gt;)&lt;/span&gt;


Output:

&lt;span style=&quot;color: #333333;&quot;&gt;[&lt;/span&gt;&lt;span style=&quot;background-color: #fff0f0;&quot;&gt;&#39;crocs&#39;&lt;/span&gt;&lt;span style=&quot;color: #333333;&quot;&gt;]&lt;/span&gt;


Cause:

Python evaluates default arguments to a function at the time the function&lt;/pre&gt;
&lt;pre style=&quot;line-height: 125%; margin: 0;&quot;&gt;is defined, not each time it is called. So all instances created without &lt;/pre&gt;
&lt;pre style=&quot;line-height: 125%; margin: 0;&quot;&gt;an explicit lst share, and mutate, a single list.&lt;/pre&gt;
&lt;/div&gt;
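&lt;div&gt;
Both pitfalls have a common workaround, sketched here under a few assumptions: format_command is a hypothetical helper name, and the host and port values are made up for the example.&lt;/div&gt;
&lt;pre&gt;
```python
def format_command(cmd):
    """Join a command list for display, converting every item to str first."""
    return ' '.join(str(part) for part in cmd)


class A:
    def __init__(self, lst=None):
        # None acts as a sentinel: a new list is built on every call instead
        # of sharing one default list created when the function was defined.
        self.lst = [] if lst is None else lst


# The port is an int, yet the join no longer raises a TypeError.
cmd = ["runthis.py", "--host", "localhost", "--port", 8080]
print("running %s" % format_command(cmd))  # running runthis.py --host localhost --port 8080

a = A()
b = A()
a.lst.append('crocs')
print(b.lst)  # []
```
&lt;/pre&gt;
&lt;div&gt;
Converting inside the join keeps the original list untouched, which matters if the same list is later handed to something that expects the int.&lt;/div&gt;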
</content><link rel='replies' type='application/atom+xml' href='http://thushw.blogspot.com/feeds/142951622434781866/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/11265228/142951622434781866' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/11265228/posts/default/142951622434781866'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/11265228/posts/default/142951622434781866'/><link rel='alternate' type='text/html' href='http://thushw.blogspot.com/2017/12/python-join-with-care.html' title='Python : Common pitfalls'/><author><name>Unknown</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-11265228.post-4935113659990877418</id><published>2017-11-05T12:19:00.000-08:00</published><updated>2017-11-08T16:53:02.881-08:00</updated><title type='text'>A Beautiful Soupy Exercise in Scraping Interesting Integers</title><content type='html'>Integers can be very interesting at least if you are a mathematician, and even for a lay person like me, interesting integers can be used to spice up some data that is of interest. My interest here being the bike counter installed on the Fremont Bridge sidewalk that counts the number of bicycles crossing the bridge in both directions.&lt;br /&gt;
&lt;div&gt;
&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
Inspired by this idea of &lt;a href=&quot;http://rooreynolds.com/2008/04/24/blogjects-and-tweetjects/&quot;&gt;blogjects&lt;/a&gt; and &lt;a href=&quot;http://stanford-clark.com/andy_house.html&quot;&gt;twittering houses&lt;/a&gt;, I wanted to send out an early morning tweet of the number of cyclists who braved the streets across Fremont the day before. SDOT uploads the data early each morning, and a cron task would fetch it and tweet it.&lt;/div&gt;
&lt;div&gt;
&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
Simple and a tad boring. Now what if I could map the count to something interesting? Researching interesting integers, I came upon the paradoxical argument that there are no &lt;a href=&quot;https://en.wikipedia.org/wiki/Interesting_number_paradox&quot;&gt;uninteresting integers&lt;/a&gt;: after all, if there were a bunch of these, one of them would have to be the smallest of the lot, and that fact itself makes the number interesting.&lt;/div&gt;
&lt;div&gt;
&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
Energized in no small measure by this revelation, I sallied forth to find a list of interesting integers in the thousands range, since roughly 3,000 cyclists were being logged every day. It didn&#39;t take long to find a comprehensive page of &lt;a href=&quot;http://www2.stetson.edu/~efriedma/numbers.html&quot;&gt;integers&lt;/a&gt;. On that page, each integer sits inside a &amp;lt;font&amp;gt; tag and is followed by a phrase that describes it.&lt;/div&gt;
&lt;div&gt;
&lt;br /&gt;&lt;/div&gt;
&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: center;&quot;&gt;
&lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjwuEKrDHEq5X3wVixDUMVlfnxunzQmDbcZZvhi7hGZkIWPgPrSRODjxChSSvUGa2LZkjLvwWlw83FsAUbBH1N6MBgQg-GUWTFAw6hANv_izLKWplVNP8dIe4MtoA_AkCvxTmsxKg/s1600/numbers.png&quot; imageanchor=&quot;1&quot; style=&quot;clear: left; float: left; margin-bottom: 1em; margin-right: 1em;&quot;&gt;&lt;img border=&quot;0&quot; data-original-height=&quot;238&quot; data-original-width=&quot;504&quot; height=&quot;151&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjwuEKrDHEq5X3wVixDUMVlfnxunzQmDbcZZvhi7hGZkIWPgPrSRODjxChSSvUGa2LZkjLvwWlw83FsAUbBH1N6MBgQg-GUWTFAw6hANv_izLKWplVNP8dIe4MtoA_AkCvxTmsxKg/s320/numbers.png&quot; width=&quot;320&quot; /&gt;&lt;/a&gt;&lt;/div&gt;
&lt;div&gt;
&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
&lt;span style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;html-tag&quot; style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;&amp;lt;font &lt;span class=&quot;html-attribute-name&quot;&gt;size&lt;/span&gt;=&lt;span class=&quot;html-attribute-value&quot;&gt;+3&lt;/span&gt; &lt;span class=&quot;html-attribute-name&quot;&gt;color&lt;/span&gt;=&lt;span class=&quot;html-attribute-value&quot;&gt;gray&lt;/span&gt;&amp;gt;&lt;/span&gt;&lt;span style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;html-tag&quot; style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;&amp;lt;/font&amp;gt;&lt;/span&gt;&lt;span style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt; is the &lt;/span&gt;&lt;span class=&quot;html-tag&quot; style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;&amp;lt;a &lt;span class=&quot;html-attribute-name&quot;&gt;href&lt;/span&gt;=&quot;&lt;a class=&quot;html-attribute-value html-external-link&quot; href=&quot;http://mathworld.wolfram.com/AdditiveIdentity.html&quot; target=&quot;_blank&quot;&gt;http://mathworld.wolfram.com/AdditiveIdentity.html&lt;/a&gt;&quot;&amp;gt;&lt;/span&gt;&lt;span style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;additive identity&lt;/span&gt;&lt;span class=&quot;html-tag&quot; style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;&amp;lt;/a&amp;gt;&lt;/span&gt;&lt;span style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;html-tag&quot; style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;&amp;lt;br&amp;gt;&lt;/span&gt;&lt;span style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;html-tag&quot; style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;&amp;lt;font &lt;span class=&quot;html-attribute-name&quot;&gt;size&lt;/span&gt;=&lt;span class=&quot;html-attribute-value&quot;&gt;+3&lt;/span&gt; &lt;span class=&quot;html-attribute-name&quot;&gt;color&lt;/span&gt;=&lt;span class=&quot;html-attribute-value&quot;&gt;gray&lt;/span&gt;&amp;gt;&lt;/span&gt;&lt;span style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;html-tag&quot; style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;&amp;lt;/font&amp;gt;&lt;/span&gt;&lt;span style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt; is the &lt;/span&gt;&lt;span class=&quot;html-tag&quot; style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;&amp;lt;a &lt;span class=&quot;html-attribute-name&quot;&gt;href&lt;/span&gt;=&quot;&lt;a class=&quot;html-attribute-value html-external-link&quot; href=&quot;http://mathworld.wolfram.com/MultiplicativeIdentity.html&quot; target=&quot;_blank&quot;&gt;http://mathworld.wolfram.com/MultiplicativeIdentity.html&lt;/a&gt;&quot;&amp;gt;&lt;/span&gt;&lt;span style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;multiplicative identity&lt;/span&gt;&lt;span class=&quot;html-tag&quot; style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;&amp;lt;/a&amp;gt;&lt;/span&gt;&lt;span style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;html-tag&quot; style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;&amp;lt;br&amp;gt;&lt;/span&gt;&lt;span style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;html-tag&quot; style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;&amp;lt;font &lt;span class=&quot;html-attribute-name&quot;&gt;size&lt;/span&gt;=&lt;span class=&quot;html-attribute-value&quot;&gt;+3&lt;/span&gt; &lt;span class=&quot;html-attribute-name&quot;&gt;color&lt;/span&gt;=&lt;span class=&quot;html-attribute-value&quot;&gt;darkblue&lt;/span&gt;&amp;gt;&lt;/span&gt;&lt;span style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;html-tag&quot; style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;&amp;lt;/font&amp;gt;&lt;/span&gt;&lt;span style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt; is the only even &lt;/span&gt;&lt;span class=&quot;html-tag&quot; style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;&amp;lt;a &lt;span class=&quot;html-attribute-name&quot;&gt;href&lt;/span&gt;=&quot;&lt;a class=&quot;html-attribute-value html-external-link&quot; href=&quot;http://mathworld.wolfram.com/PrimeNumber.html&quot; target=&quot;_blank&quot;&gt;http://mathworld.wolfram.com/PrimeNumber.html&lt;/a&gt;&quot;&amp;gt;&lt;/span&gt;&lt;span style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;prime&lt;/span&gt;&lt;span class=&quot;html-tag&quot; style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;&amp;lt;/a&amp;gt;&lt;/span&gt;&lt;span style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;html-tag&quot; style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;&amp;lt;br&amp;gt;&lt;/span&gt;&lt;/div&gt;
&lt;div&gt;
&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
At first glance, it seemed a simple matter of using &lt;a href=&quot;https://www.crummy.com/software/BeautifulSoup/&quot;&gt;Beautiful Soup&lt;/a&gt; to get each font tag, extract its text, then look for the font&#39;s sibling to extract the phrase.&lt;/div&gt;
&lt;div&gt;
&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
However, the font tag has multiple siblings that make up the complete phrase. In the soup these are represented as NavigableString objects. It&#39;s a matter of moving across the document until we hit a &amp;lt;br&amp;gt; tag, collecting all the text as we go along.&lt;/div&gt;
&lt;div&gt;
&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
Now since all of this needs to be part of the tweet, I quickly realized that not much can be said in 140 characters. So I didn&#39;t bother keeping the URLs. I used a Jupyter notebook to quickly prototype the outline, and I can&#39;t stress enough how useful this is, especially when you are dealing with an unfamiliar API (which Beautiful Soup was to me).&lt;/div&gt;
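&lt;div&gt;
As a rough illustration of the 140-character constraint, here is a minimal sketch of fitting a number and its phrase into a single tweet. The helper name fit_tweet and the choice to truncate with an ellipsis are assumptions made for illustration; they are not taken from the original notebook.&lt;/div&gt;

```python
TWEET_LIMIT = 140  # the character limit mentioned in the post


def fit_tweet(number, phrase, limit=TWEET_LIMIT):
    """Join the number and its phrase, trimming with an ellipsis if too long.

    Hypothetical helper: the name and the truncation policy are assumptions.
    """
    text = f"{number} {phrase.strip()}"
    if len(text) > limit:
        # Reserve one character for the ellipsis so the result fits the limit.
        text = text[:limit - 1].rstrip() + "…"
    return text


print(fit_tweet(2, "is the only even prime."))
```

&lt;div&gt;
Short phrases pass through unchanged; only phrases that overflow the limit are cut.&lt;/div&gt;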
&lt;div&gt;
&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
Here is how I used the notebook to understand the basic structure of the page:&lt;/div&gt;
&lt;div&gt;
&lt;br /&gt;&lt;/div&gt;
&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: center;&quot;&gt;
&lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhF6hOfJvEo_Nph4cTtLU4KDv1EWAM0LW5xLZYzIPHp2OlFeFuBwyCpHncBivY6bl7zCKwWIk2-0zvGPYBu27gA5z3hXxmLKyCE9tQTb69it_51VnUu8aGX5nGYS12QJX0-FmAYFg/s1600/fiddlesoup.png&quot; imageanchor=&quot;1&quot; style=&quot;margin-left: 1em; margin-right: 1em;&quot;&gt;&lt;img border=&quot;0&quot; data-original-height=&quot;1000&quot; data-original-width=&quot;1600&quot; height=&quot;396&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhF6hOfJvEo_Nph4cTtLU4KDv1EWAM0LW5xLZYzIPHp2OlFeFuBwyCpHncBivY6bl7zCKwWIk2-0zvGPYBu27gA5z3hXxmLKyCE9tQTb69it_51VnUu8aGX5nGYS12QJX0-FmAYFg/s640/fiddlesoup.png&quot; width=&quot;640&quot; /&gt;&lt;/a&gt;&lt;/div&gt;
&lt;div&gt;
&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
So getting to the integers was quite trivial, as Beautiful Soup provides a way to search for a specific tag (font) with a specific value for a given attribute (size=+3). Since the phrase for the integer is spread over a number of contiguous elements, we need to construct it by visiting siblings of the font tag until we hit a &amp;lt;br&amp;gt; tag.&lt;/div&gt;
&lt;div&gt;
&lt;br /&gt;&lt;/div&gt;
&lt;!-- HTML generated using hilite.me --&gt;&lt;br /&gt;
&lt;div style=&quot;background: #ffffff; border-width: 0.1em 0.1em 0.1em 0.8em; border: solid gray; overflow: auto; padding: 0.2em 0.6em; width: auto;&quot;&gt;
&lt;pre style=&quot;line-height: 125%; margin: 0;&quot;&gt;unexpected_tags &lt;span style=&quot;color: #333333;&quot;&gt;=&lt;/span&gt; {}
&lt;span style=&quot;color: #008800; font-weight: bold;&quot;&gt;def&lt;/span&gt; &lt;span style=&quot;color: #0066bb; font-weight: bold;&quot;&gt;get_text_to_eol&lt;/span&gt;(font_section):
    text_parts &lt;span style=&quot;color: #333333;&quot;&gt;=&lt;/span&gt; []
    section &lt;span style=&quot;color: #333333;&quot;&gt;=&lt;/span&gt; font_section&lt;span style=&quot;color: #333333;&quot;&gt;.&lt;/span&gt;next_sibling
    &lt;span style=&quot;color: #008800; font-weight: bold;&quot;&gt;while&lt;/span&gt; section&lt;span style=&quot;color: #333333;&quot;&gt;.&lt;/span&gt;name &lt;span style=&quot;color: #333333;&quot;&gt;!=&lt;/span&gt; &lt;span style=&quot;background-color: #fff0f0;&quot;&gt;&#39;br&#39;&lt;/span&gt;:
        &lt;span style=&quot;color: #008800; font-weight: bold;&quot;&gt;if&lt;/span&gt; section&lt;span style=&quot;color: #333333;&quot;&gt;.&lt;/span&gt;name &lt;span style=&quot;color: #333333;&quot;&gt;==&lt;/span&gt; &lt;span style=&quot;background-color: #fff0f0;&quot;&gt;&#39;a&#39;&lt;/span&gt;:
            text_parts&lt;span style=&quot;color: #333333;&quot;&gt;.&lt;/span&gt;append(section&lt;span style=&quot;color: #333333;&quot;&gt;.&lt;/span&gt;string)
        &lt;span style=&quot;color: #008800; font-weight: bold;&quot;&gt;elif&lt;/span&gt; section&lt;span style=&quot;color: #333333;&quot;&gt;.&lt;/span&gt;name &lt;span style=&quot;color: black; font-weight: bold;&quot;&gt;is&lt;/span&gt; &lt;span style=&quot;color: #007020;&quot;&gt;None&lt;/span&gt;:
            text_parts&lt;span style=&quot;color: #333333;&quot;&gt;.&lt;/span&gt;append(&lt;span style=&quot;color: #007020;&quot;&gt;str&lt;/span&gt;(section))
        &lt;span style=&quot;color: #008800; font-weight: bold;&quot;&gt;else&lt;/span&gt;:
            &lt;span style=&quot;color: #008800; font-weight: bold;&quot;&gt;print&lt;/span&gt; (&lt;span style=&quot;background-color: #fff0f0;&quot;&gt;&quot;found &lt;/span&gt;&lt;span style=&quot;background-color: #eeeeee;&quot;&gt;%s&lt;/span&gt;&lt;span style=&quot;background-color: #fff0f0;&quot;&gt; tag&quot;&lt;/span&gt; &lt;span style=&quot;color: #333333;&quot;&gt;%&lt;/span&gt; section&lt;span style=&quot;color: #333333;&quot;&gt;.&lt;/span&gt;name)
            unexpected_tags[section&lt;span style=&quot;color: #333333;&quot;&gt;.&lt;/span&gt;name] &lt;span style=&quot;color: #333333;&quot;&gt;=&lt;/span&gt; unexpected_tags&lt;span style=&quot;color: #333333;&quot;&gt;.&lt;/span&gt;get(section&lt;span style=&quot;color: #333333;&quot;&gt;.&lt;/span&gt;name, &lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;0&lt;/span&gt;)&lt;span style=&quot;color: #333333;&quot;&gt;+&lt;/span&gt;&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;1&lt;/span&gt;
        section &lt;span style=&quot;color: #333333;&quot;&gt;=&lt;/span&gt; section&lt;span style=&quot;color: #333333;&quot;&gt;.&lt;/span&gt;next_sibling    
    &lt;span style=&quot;color: #008800; font-weight: bold;&quot;&gt;return&lt;/span&gt; &lt;span style=&quot;background-color: #fff0f0;&quot;&gt;&#39; &#39;&lt;/span&gt;&lt;span style=&quot;color: #333333;&quot;&gt;.&lt;/span&gt;join(text_parts)  
&lt;/pre&gt;
&lt;/div&gt;
&lt;br /&gt;
I then ran through the results in Jupyter; the first few are shown below:&lt;br /&gt;
&lt;br /&gt;
&lt;div style=&quot;background: #ffffff; border-width: 0.1em 0.1em 0.1em 0.8em; border: solid gray; overflow: auto; padding: 0.2em 0.6em; width: auto;&quot;&gt;
&lt;pre style=&quot;line-height: 125%; margin: 0;&quot;&gt;&lt;span style=&quot;color: #008800; font-weight: bold;&quot;&gt;for&lt;/span&gt; number, text &lt;span style=&quot;color: black; font-weight: bold;&quot;&gt;in&lt;/span&gt; &lt;span style=&quot;color: #007020;&quot;&gt;map&lt;/span&gt;(&lt;span style=&quot;color: #008800; font-weight: bold;&quot;&gt;lambda&lt;/span&gt; section: (section&lt;span style=&quot;color: #333333;&quot;&gt;.&lt;/span&gt;get_text(), get_text_to_eol(section)), integer_sections):
    &lt;span style=&quot;color: #008800; font-weight: bold;&quot;&gt;print&lt;/span&gt; (number, text)
&lt;/pre&gt;
&lt;/div&gt;
&lt;!-- HTML generated using hilite.me --&gt;&lt;br /&gt;
&lt;div style=&quot;background: #ffffff; border-width: 0.1em 0.1em 0.1em 0.8em; border: solid gray; overflow: auto; padding: 0.2em 0.6em; width: auto;&quot;&gt;
&lt;pre style=&quot;line-height: 125%; margin: 0;&quot;&gt;0  is the  additive identity .
1  is the  multiplicative identity .
2  is the only even  prime .
3  is the number of spatial dimensions we live in.
4  is the smallest number of colors sufficient to color all planar maps.
5  is the number of  Platonic solids .
6  is the smallest  perfect number .
7  is the smallest number of sides of a  regular  polygon that is not  constructible  by straightedge and compass.
8  is the largest  cube  in the  Fibonacci sequence .
9  is the maximum number of  cubes  that are needed to sum to any positive  integer .
10  is the base of our number system.
11  is the largest known  multiplicative persistence .
12  is the smallest  abundant number .
13  is the number of  Archimedean solids .
14  is the smallest even number n with no solutions to  φ (m) = n.
15  is the smallest  composite number  n with the property that there is only one  group  of order n.
found sup tag
found sup tag
16  is the only number of the form x  = y  with x and y being different  integers .
17  is the number of  wallpaper groups .
18  is the only positive number that is twice the sum of its digits.
found sup tag
19  is the maximum number of 4  powers needed to sum to any number.
20  is the number of  rooted trees  with 6 vertices.
21  is the smallest number of distinct  squares  needed to tile a  square .
22  is the number of  partitions  of 8.
&lt;/pre&gt;
&lt;/div&gt;
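&lt;br /&gt;
The doubled spaces and the gap before the final period in the output above come from joining parts that already carry their own whitespace with a single space. A minimal sketch of tidying the joined text afterwards; the helper name tidy is an assumption for illustration and is not part of the original notebook:&lt;br /&gt;

```python
import re


def tidy(text):
    """Collapse runs of whitespace and drop the space before punctuation.

    Hypothetical post-processing step, applied to the joined phrase.
    """
    # split() with no arguments splits on any whitespace run and drops empties.
    text = " ".join(text.split())
    # Remove whitespace that ended up directly before . , ; or :
    return re.sub(r"\s+([.,;:])", r"\1", text)


print(tidy("2  is the only even  prime ."))
```

&lt;br /&gt;
This only cleans up spacing; it does not change which siblings are collected.&lt;br /&gt;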
&lt;br /&gt;
Since I collected the unknown tags, I could see what they were:


&lt;br /&gt;
&lt;br /&gt;
&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: center;&quot;&gt;
&lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjEfAyvIB6Zuh0EDKuoCfbGKBb6HichmzGeO-LUoue7GnihYBWV5QgpafNUOpkRMzUUdnGz3lIbUvzhTybH65NF2VqBdhUtPHfzGLQBX7OqdlSLeG6xmuLbKKY7WbyI-SS4mvmGuw/s1600/tags.png&quot; imageanchor=&quot;1&quot; style=&quot;clear: left; float: left; margin-bottom: 1em; margin-right: 1em;&quot;&gt;&lt;img border=&quot;0&quot; data-original-height=&quot;107&quot; data-original-width=&quot;1600&quot; height=&quot;42&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjEfAyvIB6Zuh0EDKuoCfbGKBb6HichmzGeO-LUoue7GnihYBWV5QgpafNUOpkRMzUUdnGz3lIbUvzhTybH65NF2VqBdhUtPHfzGLQBX7OqdlSLeG6xmuLbKKY7WbyI-SS4mvmGuw/s640/tags.png&quot; width=&quot;640&quot; /&gt;&lt;/a&gt;&lt;/div&gt;
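&lt;br /&gt;
The tally kept in unexpected_tags can also be expressed with collections.Counter from the standard library. This is a minimal sketch; the list of tag names is made-up sample data standing in for whatever the sibling walk encounters, so the counts are not the real ones from the page:&lt;br /&gt;

```python
from collections import Counter

# Counter returns 0 for missing keys, so no .get(name, 0) is needed.
unexpected_tags = Counter()

# Made-up sample of tag names, purely for illustration.
for name in ["sup", "sup", "sup", "sub"]:
    unexpected_tags[name] += 1

# most_common() lists the tags from most to least frequent.
print(unexpected_tags.most_common())
```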
&lt;br /&gt;
Here is an example of a superscript being used:&lt;br /&gt;
&lt;span style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;html-tag&quot; style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;&amp;lt;font &lt;span class=&quot;html-attribute-name&quot;&gt;size&lt;/span&gt;=&lt;span class=&quot;html-attribute-value&quot;&gt;+3&lt;/span&gt; &lt;span class=&quot;html-attribute-name&quot;&gt;color&lt;/span&gt;=&lt;span class=&quot;html-attribute-value&quot;&gt;FF6699&lt;/span&gt;&amp;gt;&lt;/span&gt;&lt;span style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;16&lt;/span&gt;&lt;span class=&quot;html-tag&quot; style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;&amp;lt;/font&amp;gt;&lt;/span&gt;&lt;span style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt; is the only number of the form x&lt;/span&gt;&lt;span class=&quot;html-tag&quot; style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;&amp;lt;sup&amp;gt;&lt;/span&gt;&lt;span style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;html-tag&quot; style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;&amp;lt;/sup&amp;gt;&lt;/span&gt;&lt;span style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt; = y&lt;/span&gt;&lt;span class=&quot;html-tag&quot; style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;&amp;lt;sup&amp;gt;&lt;/span&gt;&lt;span style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;html-tag&quot; style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;&amp;lt;/sup&amp;gt;&lt;/span&gt;&lt;span style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt; with x and y being different &lt;/span&gt;&lt;span class=&quot;html-tag&quot; style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;&amp;lt;a &lt;span class=&quot;html-attribute-name&quot;&gt;href&lt;/span&gt;=&quot;&lt;a class=&quot;html-attribute-value html-external-link&quot; 
href=&quot;http://mathworld.wolfram.com/Integer.html&quot; target=&quot;_blank&quot;&gt;http://mathworld.wolfram.com/Integer.html&lt;/a&gt;&quot;&amp;gt;&lt;/span&gt;&lt;span style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;integers&lt;/span&gt;&lt;span class=&quot;html-tag&quot; style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;&amp;lt;/a&amp;gt;&lt;/span&gt;&lt;span style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;html-tag&quot; style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;&amp;lt;br&amp;gt;&lt;/span&gt;&lt;br /&gt;
&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: center;&quot;&gt;
&lt;br /&gt;&lt;/div&gt;
&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: center;&quot;&gt;
&lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhgMty4FEIv7sCIwKVxDNATfHzC__OSB7EPwsf3s7kWQvs1XVMQzr6JyMernGrqRvpQQcvMJmPjmamyLiucF_YUrYu2T7Ahxj2KX4ohwxnXxUnRmr0t29axyw4d3X-vkVStY2DC7g/s1600/tags.png&quot; imageanchor=&quot;1&quot; style=&quot;clear: left; float: left; margin-bottom: 1em; margin-right: 1em;&quot;&gt;&lt;img border=&quot;0&quot; data-original-height=&quot;76&quot; data-original-width=&quot;1172&quot; height=&quot;25&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhgMty4FEIv7sCIwKVxDNATfHzC__OSB7EPwsf3s7kWQvs1XVMQzr6JyMernGrqRvpQQcvMJmPjmamyLiucF_YUrYu2T7Ahxj2KX4ohwxnXxUnRmr0t29axyw4d3X-vkVStY2DC7g/s400/tags.png&quot; width=&quot;400&quot; /&gt;&lt;/a&gt;&lt;/div&gt;
&lt;span class=&quot;html-tag&quot; style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;&lt;br /&gt;&lt;/span&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Now isn&#39;t it interesting that, out of all the numbers, this is the only one?&lt;br /&gt;
&lt;br /&gt;
Here is how the subscript is being used:&lt;br /&gt;
&lt;br /&gt;
&lt;span class=&quot;html-tag&quot; style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;&amp;lt;font &lt;span class=&quot;html-attribute-name&quot;&gt;size&lt;/span&gt;=&lt;span class=&quot;html-attribute-value&quot;&gt;+3&lt;/span&gt; &lt;span class=&quot;html-attribute-name&quot;&gt;color&lt;/span&gt;=&lt;span class=&quot;html-attribute-value&quot;&gt;brown&lt;/span&gt;&amp;gt;&lt;/span&gt;&lt;span style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;126&lt;/span&gt;&lt;span class=&quot;html-tag&quot; style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;&amp;lt;/font&amp;gt;&lt;/span&gt;&lt;span style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt; = &lt;/span&gt;&lt;span class=&quot;html-tag&quot; style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;&amp;lt;sub&amp;gt;&lt;/span&gt;&lt;span style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;9&lt;/span&gt;&lt;span class=&quot;html-tag&quot; style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;&amp;lt;/sub&amp;gt;&lt;/span&gt;&lt;span class=&quot;html-tag&quot; style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;&amp;lt;a &lt;span class=&quot;html-attribute-name&quot;&gt;href&lt;/span&gt;=&quot;&lt;a class=&quot;html-attribute-value html-external-link&quot; href=&quot;http://mathworld.wolfram.com/Combination.html&quot; target=&quot;_blank&quot;&gt;http://mathworld.wolfram.com/Combination.html&lt;/a&gt;&quot;&amp;gt;&lt;/span&gt;&lt;span style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;C&lt;/span&gt;&lt;span class=&quot;html-tag&quot; style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;&amp;lt;/a&amp;gt;&lt;/span&gt;&lt;span class=&quot;html-tag&quot; style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;&amp;lt;sub&amp;gt;&lt;/span&gt;&lt;span style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;4&lt;/span&gt;&lt;span 
class=&quot;html-tag&quot; style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;&amp;lt;/sub&amp;gt;&lt;/span&gt;&lt;span style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;html-tag&quot; style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;&amp;lt;br&amp;gt;&lt;/span&gt;&lt;br /&gt;
&lt;span class=&quot;html-tag&quot; style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;&lt;br /&gt;&lt;/span&gt;
&lt;br /&gt;
&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: center;&quot;&gt;
&lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjW2diBcP7eIXAH48ukdRO6SNythyDfgLP8OZxQT7Wsg1w7e9UM5UxiWm1_gSTFfCUe6kp_dDvM1SDEEi-GqxU_FODft_mnp5sEk2TYrBc0qEA6ziT-FAQAaxsSh6Q4MXQJ3trP7Q/s1600/sub.png&quot; imageanchor=&quot;1&quot; style=&quot;clear: left; float: left; margin-bottom: 1em; margin-right: 1em;&quot;&gt;&lt;img border=&quot;0&quot; data-original-height=&quot;78&quot; data-original-width=&quot;222&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjW2diBcP7eIXAH48ukdRO6SNythyDfgLP8OZxQT7Wsg1w7e9UM5UxiWm1_gSTFfCUe6kp_dDvM1SDEEi-GqxU_FODft_mnp5sEk2TYrBc0qEA6ziT-FAQAaxsSh6Q4MXQJ3trP7Q/s1600/sub.png&quot; /&gt;&lt;/a&gt;&lt;/div&gt;
&lt;span class=&quot;html-tag&quot; style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;&lt;br /&gt;&lt;/span&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
With this insight, I augmented the superscripts with the ^ symbol, and left the subscripts as is.&lt;br /&gt;
&lt;br /&gt;
&lt;!-- HTML generated using hilite.me --&gt;&lt;br /&gt;
&lt;div style=&quot;background: #ffffff; border-width: 0.1em 0.1em 0.1em 0.8em; border: solid gray; overflow: auto; padding: 0.2em 0.6em; width: auto;&quot;&gt;
&lt;pre style=&quot;line-height: 125%; margin: 0;&quot;&gt;&lt;span style=&quot;color: #008800; font-weight: bold;&quot;&gt;def&lt;/span&gt; &lt;span style=&quot;color: #0066bb; font-weight: bold;&quot;&gt;get_text_to_eol&lt;/span&gt;(font_section):
    text_parts &lt;span style=&quot;color: #333333;&quot;&gt;=&lt;/span&gt; []
    section &lt;span style=&quot;color: #333333;&quot;&gt;=&lt;/span&gt; font_section&lt;span style=&quot;color: #333333;&quot;&gt;.&lt;/span&gt;next_sibling
    &lt;span style=&quot;color: #008800; font-weight: bold;&quot;&gt;while&lt;/span&gt; section&lt;span style=&quot;color: #333333;&quot;&gt;.&lt;/span&gt;name &lt;span style=&quot;color: #333333;&quot;&gt;!=&lt;/span&gt; &lt;span style=&quot;background-color: #fff0f0;&quot;&gt;&#39;br&#39;&lt;/span&gt;:
        &lt;span style=&quot;color: #008800; font-weight: bold;&quot;&gt;if&lt;/span&gt; section&lt;span style=&quot;color: #333333;&quot;&gt;.&lt;/span&gt;name &lt;span style=&quot;color: #333333;&quot;&gt;==&lt;/span&gt; &lt;span style=&quot;background-color: #fff0f0;&quot;&gt;&#39;a&#39;&lt;/span&gt;:
            text_parts&lt;span style=&quot;color: #333333;&quot;&gt;.&lt;/span&gt;append(section&lt;span style=&quot;color: #333333;&quot;&gt;.&lt;/span&gt;string)
        &lt;span style=&quot;color: #008800; font-weight: bold;&quot;&gt;elif&lt;/span&gt; section&lt;span style=&quot;color: #333333;&quot;&gt;.&lt;/span&gt;name &lt;span style=&quot;color: black; font-weight: bold;&quot;&gt;is&lt;/span&gt; &lt;span style=&quot;color: #007020;&quot;&gt;None&lt;/span&gt;:
            text_parts&lt;span style=&quot;color: #333333;&quot;&gt;.&lt;/span&gt;append(&lt;span style=&quot;color: #007020;&quot;&gt;str&lt;/span&gt;(section))
        &lt;span style=&quot;color: #008800; font-weight: bold;&quot;&gt;else&lt;/span&gt;:
            &lt;span style=&quot;color: #008800; font-weight: bold;&quot;&gt;if&lt;/span&gt; section&lt;span style=&quot;color: #333333;&quot;&gt;.&lt;/span&gt;name &lt;span style=&quot;color: #333333;&quot;&gt;==&lt;/span&gt; &lt;span style=&quot;background-color: #fff0f0;&quot;&gt;&#39;sup&#39;&lt;/span&gt;:
                text_parts&lt;span style=&quot;color: #333333;&quot;&gt;.&lt;/span&gt;append(&lt;span style=&quot;background-color: #fff0f0;&quot;&gt;&#39;^&#39;&lt;/span&gt;)
            text_parts&lt;span style=&quot;color: #333333;&quot;&gt;.&lt;/span&gt;append(section&lt;span style=&quot;color: #333333;&quot;&gt;.&lt;/span&gt;string)
        section &lt;span style=&quot;color: #333333;&quot;&gt;=&lt;/span&gt; section&lt;span style=&quot;color: #333333;&quot;&gt;.&lt;/span&gt;next_sibling    
    &lt;span style=&quot;color: #008800; font-weight: bold;&quot;&gt;return&lt;/span&gt; &lt;span style=&quot;background-color: #fff0f0;&quot;&gt;&#39; &#39;&lt;/span&gt;&lt;span style=&quot;color: #333333;&quot;&gt;.&lt;/span&gt;join(text_parts)       
&lt;/pre&gt;
&lt;/div&gt;
&lt;br /&gt;
And I again digressed on a merry tangent, where people were using &lt;a href=&quot;https://www.buzzfeed.com/jwherrman/9-tweets-that-break-twitter?utm_term=.bqN6oEQpL#.pr1Gk1lbA&quot;&gt;non-ASCII&lt;/a&gt; characters to tweet &lt;a href=&quot;https://lingojam.com/TwitterFonts&quot;&gt;subscripts&lt;/a&gt; and superscripts.&lt;br /&gt;
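&lt;br /&gt;
For the curious, here is a minimal sketch of that trick using Unicode superscript characters and str.translate. It is an alternative to the ^ prefix used above, not what the code in this post does; the table covers only digits and a few symbols, and the name to_superscript is an assumption for illustration:&lt;br /&gt;

```python
# Map ASCII digits and a few symbols to their Unicode superscript forms.
# Characters without an entry in the table pass through unchanged.
SUPERSCRIPTS = str.maketrans("0123456789+-=()n", "⁰¹²³⁴⁵⁶⁷⁸⁹⁺⁻⁼⁽⁾ⁿ")


def to_superscript(text):
    """Return text with every mapped character replaced by its superscript."""
    return text.translate(SUPERSCRIPTS)


print("e" + to_superscript("1351"))
```

&lt;br /&gt;
Most letters have no entry in this table, so an exponent such as the y in the entry for 16 would be left as is.&lt;br /&gt;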
&lt;br /&gt;
However, somewhat surprisingly, the sibling list did not always end in a &amp;lt;br&amp;gt;. I hit a None for a section at integer 248, which has this HTML:&lt;br /&gt;
&lt;br /&gt;
&lt;span class=&quot;html-tag&quot; style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;&amp;lt;font &lt;span class=&quot;html-attribute-name&quot;&gt;size&lt;/span&gt;=&lt;span class=&quot;html-attribute-value&quot;&gt;+3&lt;/span&gt; &lt;span class=&quot;html-attribute-name&quot;&gt;color&lt;/span&gt;=&lt;span class=&quot;html-attribute-value&quot;&gt;006600&lt;/span&gt;&amp;gt;&lt;/span&gt;&lt;span style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;248&lt;/span&gt;&lt;span class=&quot;html-tag&quot; style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;&amp;lt;/font&amp;gt;&lt;/span&gt;&lt;span style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt; is the smallest number n&amp;gt;1 for which the &lt;/span&gt;&lt;span class=&quot;html-tag&quot; style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;&amp;lt;a &lt;span class=&quot;html-attribute-name&quot;&gt;href&lt;/span&gt;=&quot;&lt;a class=&quot;html-attribute-value html-external-link&quot; href=&quot;http://mathworld.wolfram.com/ArithmeticMean.html&quot; target=&quot;_blank&quot;&gt;http://mathworld.wolfram.com/ArithmeticMean.html&lt;/a&gt;&quot;&amp;gt;&lt;/span&gt;&lt;span style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;arithmetic&lt;/span&gt;&lt;span class=&quot;html-tag&quot; style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;&amp;lt;/a&amp;gt;&lt;/span&gt;&lt;span style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;, &lt;/span&gt;&lt;span class=&quot;html-tag&quot; style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;&amp;lt;a &lt;span class=&quot;html-attribute-name&quot;&gt;href&lt;/span&gt;=&quot;&lt;a class=&quot;html-attribute-value html-external-link&quot; href=&quot;http://mathworld.wolfram.com/GeometricMean.html&quot; target=&quot;_blank&quot;&gt;http://mathworld.wolfram.com/GeometricMean.html&lt;/a&gt;&quot;&amp;gt;&lt;/span&gt;&lt;span 
style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;geometric&lt;/span&gt;&lt;span class=&quot;html-tag&quot; style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;&amp;lt;/a&amp;gt;&lt;/span&gt;&lt;span style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;, and &lt;/span&gt;&lt;span class=&quot;html-tag&quot; style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;&amp;lt;a &lt;span class=&quot;html-attribute-name&quot;&gt;href&lt;/span&gt;=&quot;&lt;a class=&quot;html-attribute-value html-external-link&quot; href=&quot;http://mathworld.wolfram.com/HarmonicMean.html&quot; target=&quot;_blank&quot;&gt;http://mathworld.wolfram.com/HarmonicMean.html&lt;/a&gt;&quot;&amp;gt;&lt;/span&gt;&lt;span style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;harmonic means&lt;/span&gt;&lt;span class=&quot;html-tag&quot; style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;&amp;lt;a/&amp;gt;&lt;/span&gt;&lt;span style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt; of &lt;/span&gt;&lt;span class=&quot;html-tag&quot; style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;&amp;lt;a &lt;span class=&quot;html-attribute-name&quot;&gt;href&lt;/span&gt;=&quot;&lt;a class=&quot;html-attribute-value html-external-link&quot; href=&quot;http://mathworld.wolfram.com/TotientFunction.html&quot; target=&quot;_blank&quot;&gt;http://mathworld.wolfram.com/TotientFunction.html&lt;/a&gt;&quot;&amp;gt;&lt;/span&gt;&lt;span style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;&amp;amp;phi;&lt;/span&gt;&lt;span class=&quot;html-tag&quot; style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;&amp;lt;/a&amp;gt;&lt;/span&gt;&lt;span style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;(n) and &lt;/span&gt;&lt;span class=&quot;html-tag&quot; style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;&amp;lt;a &lt;span 
class=&quot;html-attribute-name&quot;&gt;href&lt;/span&gt;=&quot;&lt;a class=&quot;html-attribute-value html-external-link&quot; href=&quot;http://mathworld.wolfram.com/DivisorFunction.html&quot; target=&quot;_blank&quot;&gt;http://mathworld.wolfram.com/DivisorFunction.html&lt;/a&gt;&quot;&amp;gt;&lt;/span&gt;&lt;span style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;&amp;amp;sigma;&lt;/span&gt;&lt;span class=&quot;html-tag&quot; style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;&amp;lt;/a&amp;gt;&lt;/span&gt;&lt;span style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;(n) are all &lt;/span&gt;&lt;span class=&quot;html-tag&quot; style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;&amp;lt;a &lt;span class=&quot;html-attribute-name&quot;&gt;href&lt;/span&gt;=&quot;&lt;a class=&quot;html-attribute-value html-external-link&quot; href=&quot;http://mathworld.wolfram.com/Integer.html&quot; target=&quot;_blank&quot;&gt;http://mathworld.wolfram.com/Integer.html&lt;/a&gt;&quot;&amp;gt;&lt;/span&gt;&lt;span style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;integers&lt;/span&gt;&lt;span class=&quot;html-tag&quot; style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;&amp;lt;/a&amp;gt;&lt;/span&gt;&lt;span style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;html-tag&quot; style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;&amp;lt;br&amp;gt;&lt;/span&gt;&lt;br /&gt;
&lt;span class=&quot;html-tag&quot; style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;&lt;br /&gt;&lt;/span&gt;
Can you spot the problem? It is subtle.&lt;br /&gt;
&lt;br /&gt;
Notice that&amp;nbsp;&lt;span style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;harmonic means&lt;/span&gt;&lt;span class=&quot;html-tag&quot; style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;&amp;lt;a/&amp;gt;&lt;/span&gt;&amp;nbsp;is malformed: the closing tag should be &amp;lt;/a&amp;gt;, not &amp;lt;a/&amp;gt;. Beautiful Soup replaces this dangling tag with a beautiful, but empty, pair&amp;nbsp;&lt;span style=&quot;background-color: white; font-size: 14px; white-space: pre-wrap;&quot;&gt;&amp;lt;a&amp;gt;&amp;lt;/a&amp;gt;&lt;/span&gt;, which leaves the original anchor unclosed:&lt;br /&gt;
&lt;br /&gt;
&lt;pre style=&quot;background-color: white; border-radius: 0px; border: 0px; box-sizing: border-box; font-size: 14px; line-height: inherit; overflow: auto; padding: 0px; vertical-align: baseline; white-space: pre-wrap; word-break: break-all; word-wrap: break-word;&quot;&gt;&amp;lt;a href=&quot;http://mathworld.wolfram.com/HarmonicMean.html&quot;&amp;gt;harmonic means&amp;lt;a&amp;gt;&amp;lt;/a&amp;gt; of &amp;lt;a href=&quot;http://mathworld.wolfram.com/TotientFunction.html&quot;&amp;gt;φ&amp;lt;/a&amp;gt;(n) and &amp;lt;a href=&quot;http://mathworld.wolfram.com/DivisorFunction.html&quot;&amp;gt;σ&amp;lt;/a&amp;gt;(n) are all &amp;lt;a href=&quot;http://mathworld.wolfram.com/Integer.html&quot;&amp;gt;integers&amp;lt;/a&amp;gt;.&amp;lt;br/&amp;gt;&amp;lt;/a&amp;gt;&lt;/pre&gt;
&lt;div&gt;
&lt;br /&gt;&lt;/div&gt;
This is all very nice, except that we were relying on a &amp;lt;br&amp;gt; tag being an eventual sibling. Instead, Beautiful Soup wades on through the soup looking for the &amp;lt;/a&amp;gt; that matches the tag it started with, finally &lt;b&gt;&lt;span style=&quot;color: purple;&quot;&gt;finding it at&lt;/span&gt;:&lt;/b&gt;&lt;br /&gt;
&lt;br /&gt;
&lt;span class=&quot;html-tag&quot; style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;&amp;lt;font &lt;span class=&quot;html-attribute-name&quot;&gt;size&lt;/span&gt;=&lt;span class=&quot;html-attribute-value&quot;&gt;+3&lt;/span&gt; &lt;span class=&quot;html-attribute-name&quot;&gt;color&lt;/span&gt;=&lt;span class=&quot;html-attribute-value&quot;&gt;FF6699&lt;/span&gt;&amp;gt;&lt;/span&gt;&lt;span style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;1351&lt;/span&gt;&lt;span class=&quot;html-tag&quot; style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;&amp;lt;/font&amp;gt;&lt;/span&gt;&lt;span style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt; has the property that &lt;/span&gt;&lt;span class=&quot;html-tag&quot; style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;&amp;lt;a &lt;span class=&quot;html-attribute-name&quot;&gt;href&lt;/span&gt;=&quot;&lt;a class=&quot;html-attribute-value html-external-link&quot; href=&quot;http://mathworld.wolfram.com/e.html&quot; target=&quot;_blank&quot;&gt;http://mathworld.wolfram.com/e.html&lt;/a&gt;&quot;&amp;gt;&lt;/span&gt;&lt;span style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;e&lt;/span&gt;&lt;span class=&quot;html-tag&quot; style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;&amp;lt;/a&amp;gt;&lt;/span&gt;&lt;span class=&quot;html-tag&quot; style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;&amp;lt;sup&amp;gt;&lt;/span&gt;&lt;span style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;1351&lt;/span&gt;&lt;span class=&quot;html-tag&quot; style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;&lt;span style=&quot;color: purple;&quot;&gt;&lt;b&gt;&amp;lt;/a&amp;gt;&lt;/b&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;html-tag&quot; style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;&amp;lt;/sup&amp;gt;&lt;/span&gt;&lt;span 
style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt; is within .0009 of an &lt;/span&gt;&lt;span class=&quot;html-tag&quot; style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;&amp;lt;a &lt;span class=&quot;html-attribute-name&quot;&gt;href&lt;/span&gt;=&quot;&lt;a class=&quot;html-attribute-value html-external-link&quot; href=&quot;http://mathworld.wolfram.com/Integer.html&quot; target=&quot;_blank&quot;&gt;http://mathworld.wolfram.com/Integer.html&lt;/a&gt;&quot;&amp;gt;&lt;/span&gt;&lt;span style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;integer&lt;/span&gt;&lt;span class=&quot;html-tag&quot; style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;&amp;lt;/a&amp;gt;&lt;/span&gt;&lt;span style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;html-tag&quot; style=&quot;font-family: monospace; white-space: pre-wrap;&quot;&gt;&amp;lt;br&amp;gt;&lt;/span&gt;&lt;br /&gt;
&lt;br /&gt;
As a result, everything from the entry for 248 through the entry for 1351 comes back as one unbroken block.&lt;br /&gt;
&lt;br /&gt;
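A runaway entry like this can also be caught mechanically, because its phrase is far longer than any genuine one. Here is a minimal sketch, assuming the scraped results are held as (integer, phrase) pairs; find_runaway_entries, its factor threshold, and the sample data are made up for illustration and are not part of the original script:&lt;br /&gt;
&lt;br /&gt;
&lt;pre style=&quot;line-height: 125%; margin: 0;&quot;&gt;
```python
from statistics import median


def find_runaway_entries(pairs, factor=10):
    """Return the integers whose phrase is more than `factor` times the median length."""
    lengths = [len(phrase) for _, phrase in pairs]
    if not lengths:
        return []
    typical = median(lengths)
    return [n for n, phrase in pairs if len(phrase) > factor * typical]


# Made-up sample: two normal entries and one that swallowed its neighbours.
sample = [
    (5, " is the number of Platonic solids."),
    (6, " is the smallest perfect number."),
    (7, " is a phrase that swallowed its neighbours. " * 20),
]
suspects = find_runaway_entries(sample)
```
&lt;/pre&gt;
&lt;br /&gt;
The threshold is arbitrary; the point is only that a phrase many times longer than its neighbours deserves a second look.&lt;br /&gt;
&lt;br /&gt;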
Is there an easier way to solve this problem? It&#39;s tempting to reach for regular expressions when HTML parsing goes wrong like this. What if we use a regular expression to split the page into sections, each containing one integer and its phrase? The top-down parser snagged on a mismatched tag, but a regular expression might give us smaller, better-behaved chunks of HTML that we can then parse individually with Beautiful Soup.&lt;br /&gt;
&lt;br /&gt;
We can get the HTML lines with a regular expression split on &amp;lt;br&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Since we decided to forgo the URLs, we can construct a BeautifulSoup instance for each line and call its get_text() method to strip all the tags.&lt;br /&gt;
&lt;!-- HTML generated using hilite.me --&gt;&lt;br /&gt;
&lt;div style=&quot;background: #ffffff; border-width: 0.1em 0.1em 0.1em 0.8em; border: solid gray; overflow: auto; padding: 0.2em 0.6em; width: auto;&quot;&gt;
&lt;pre style=&quot;line-height: 125%; margin: 0;&quot;&gt;&lt;span style=&quot;color: #008800; font-weight: bold;&quot;&gt;import&lt;/span&gt; &lt;span style=&quot;color: #0e84b5; font-weight: bold;&quot;&gt;re&lt;/span&gt;
&lt;span style=&quot;color: #008800; font-weight: bold;&quot;&gt;from&lt;/span&gt; &lt;span style=&quot;color: #0e84b5; font-weight: bold;&quot;&gt;bs4&lt;/span&gt; &lt;span style=&quot;color: #008800; font-weight: bold;&quot;&gt;import&lt;/span&gt; BeautifulSoup
lines &lt;span style=&quot;color: #333333;&quot;&gt;=&lt;/span&gt; re&lt;span style=&quot;color: #333333;&quot;&gt;.&lt;/span&gt;split(&lt;span style=&quot;background-color: #fff0f0;&quot;&gt;r&quot;&amp;lt;br&amp;gt;\s*&quot;&lt;/span&gt;, html)
list_lines &lt;span style=&quot;color: #333333;&quot;&gt;=&lt;/span&gt; &lt;span style=&quot;color: #007020;&quot;&gt;list&lt;/span&gt;(&lt;span style=&quot;color: #007020;&quot;&gt;filter&lt;/span&gt;(&lt;span style=&quot;color: #008800; font-weight: bold;&quot;&gt;lambda&lt;/span&gt; x: x &lt;span style=&quot;color: black; font-weight: bold;&quot;&gt;is&lt;/span&gt; &lt;span style=&quot;color: black; font-weight: bold;&quot;&gt;not&lt;/span&gt; &lt;span style=&quot;color: #007020;&quot;&gt;None&lt;/span&gt;, [re&lt;span style=&quot;color: #333333;&quot;&gt;.&lt;/span&gt;search(&lt;span style=&quot;background-color: #fff0f0;&quot;&gt;r&quot;&amp;lt;font size=\+3 .*&quot;&lt;/span&gt;, line) &lt;span style=&quot;color: #008800; font-weight: bold;&quot;&gt;for&lt;/span&gt; line &lt;span style=&quot;color: black; font-weight: bold;&quot;&gt;in&lt;/span&gt; lines]))
text_lines &lt;span style=&quot;color: #333333;&quot;&gt;=&lt;/span&gt; [BeautifulSoup(l&lt;span style=&quot;color: #333333;&quot;&gt;.&lt;/span&gt;group(&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;0&lt;/span&gt;), &lt;span style=&quot;background-color: #fff0f0;&quot;&gt;&#39;html.parser&#39;&lt;/span&gt;)&lt;span style=&quot;color: #333333;&quot;&gt;.&lt;/span&gt;get_text() &lt;span style=&quot;color: #008800; font-weight: bold;&quot;&gt;for&lt;/span&gt; l &lt;span style=&quot;color: black; font-weight: bold;&quot;&gt;in&lt;/span&gt; list_lines]
&lt;/pre&gt;
&lt;/div&gt;
&lt;br /&gt;
&lt;!-- HTML generated using hilite.me --&gt;&lt;br /&gt;
&lt;div style=&quot;background: #ffffff; border-width: 0.1em 0.1em 0.1em 0.8em; border: solid gray; overflow: auto; padding: 0.2em 0.6em; width: auto;&quot;&gt;
&lt;pre style=&quot;line-height: 125%; margin: 0;&quot;&gt;[&lt;span style=&quot;background-color: #fff0f0;&quot;&gt;&#39;0 is the additive identity.&#39;&lt;/span&gt;,
 &lt;span style=&quot;background-color: #fff0f0;&quot;&gt;&#39;1 is the multiplicative identity.&#39;&lt;/span&gt;,
 &lt;span style=&quot;background-color: #fff0f0;&quot;&gt;&#39;2 is the only even prime.&#39;&lt;/span&gt;,
 &lt;span style=&quot;background-color: #fff0f0;&quot;&gt;&#39;3 is the number of spatial dimensions we live in.&#39;&lt;/span&gt;,
 &lt;span style=&quot;background-color: #fff0f0;&quot;&gt;&#39;4 is the smallest number of colors sufficient to color all planar maps.&#39;&lt;/span&gt;,
 &lt;span style=&quot;background-color: #fff0f0;&quot;&gt;&#39;5 is the number of Platonic solids.&#39;&lt;/span&gt;,
 &lt;span style=&quot;background-color: #fff0f0;&quot;&gt;&#39;6 is the smallest perfect number.&#39;&lt;/span&gt;,
 &lt;span style=&quot;background-color: #fff0f0;&quot;&gt;&#39;7 is the smallest number of sides of a regular polygon that is not constructible by straightedge and compass.&#39;&lt;/span&gt;,
 &lt;span style=&quot;background-color: #fff0f0;&quot;&gt;&#39;8 is the largest cube in the Fibonacci sequence.&#39;&lt;/span&gt;,
 &lt;span style=&quot;background-color: #fff0f0;&quot;&gt;&#39;9 is the maximum number of cubes that are needed to sum to any positive integer.&#39;&lt;/span&gt;,
 &lt;span style=&quot;background-color: #fff0f0;&quot;&gt;&#39;10 is the base of our number system.&#39;&lt;/span&gt;,
 &lt;span style=&quot;background-color: #fff0f0;&quot;&gt;&#39;11 is the largest known multiplicative persistence.&#39;&lt;/span&gt;,
 &lt;span style=&quot;background-color: #fff0f0;&quot;&gt;&#39;12 is the smallest abundant number.&#39;&lt;/span&gt;,
 &lt;span style=&quot;background-color: #fff0f0;&quot;&gt;&#39;13 is the number of Archimedean solids.&#39;&lt;/span&gt;,
 &lt;span style=&quot;background-color: #fff0f0;&quot;&gt;&#39;14 is the smallest even number n with no solutions to φ(m) = n.&#39;&lt;/span&gt;,
 &lt;span style=&quot;background-color: #fff0f0;&quot;&gt;&#39;15 is the smallest composite number n with the property that there is only one group of order n.&#39;&lt;/span&gt;,
 &lt;span style=&quot;background-color: #fff0f0;&quot;&gt;&#39;16 is the only number of the form xy = yx with x and y being different integers.&#39;&lt;/span&gt;,
 &lt;span style=&quot;background-color: #fff0f0;&quot;&gt;&#39;17 is the number of wallpaper groups.&#39;&lt;/span&gt;,
&lt;/pre&gt;
&lt;/div&gt;
&lt;br /&gt;
But now we lose the superscripts and subscripts, as you can see in the output for integer 16, where x^y = y^x has been flattened to xy = yx.&lt;br /&gt;
&lt;br /&gt;
What we should do instead is use the regular expression to get the lines, run the parser on each line, and then use the function we wrote earlier to extract the text within each line. Since the parser now never sees past a &amp;lt;br&amp;gt;, a mismatched tag can only damage its own line, which should give us a cleaner extraction of the phrases.&lt;br /&gt;
&lt;br /&gt;
&lt;!-- HTML generated using hilite.me --&gt;&lt;br /&gt;
&lt;div style=&quot;background: #ffffff; border-width: 0.1em 0.1em 0.1em 0.8em; border: solid gray; overflow: auto; padding: 0.2em 0.6em; width: auto;&quot;&gt;
&lt;pre style=&quot;line-height: 125%; margin: 0;&quot;&gt;&lt;span style=&quot;color: #008800; font-weight: bold;&quot;&gt;import&lt;/span&gt; &lt;span style=&quot;color: #0e84b5; font-weight: bold;&quot;&gt;re&lt;/span&gt;
lines &lt;span style=&quot;color: #333333;&quot;&gt;=&lt;/span&gt; re&lt;span style=&quot;color: #333333;&quot;&gt;.&lt;/span&gt;split(&lt;span style=&quot;background-color: #fff0f0;&quot;&gt;r&quot;&amp;lt;br&amp;gt;\s*&quot;&lt;/span&gt;, html)
list_lines &lt;span style=&quot;color: #333333;&quot;&gt;=&lt;/span&gt; &lt;span style=&quot;color: #007020;&quot;&gt;list&lt;/span&gt;(&lt;span style=&quot;color: #007020;&quot;&gt;filter&lt;/span&gt;(&lt;span style=&quot;color: #008800; font-weight: bold;&quot;&gt;lambda&lt;/span&gt; x: x &lt;span style=&quot;color: black; font-weight: bold;&quot;&gt;is&lt;/span&gt; &lt;span style=&quot;color: black; font-weight: bold;&quot;&gt;not&lt;/span&gt; &lt;span style=&quot;color: #007020;&quot;&gt;None&lt;/span&gt;, [re&lt;span style=&quot;color: #333333;&quot;&gt;.&lt;/span&gt;search(&lt;span style=&quot;background-color: #fff0f0;&quot;&gt;r&quot;&amp;lt;font size=\+3 .*&quot;&lt;/span&gt;, line) &lt;span style=&quot;color: #008800; font-weight: bold;&quot;&gt;for&lt;/span&gt; line &lt;span style=&quot;color: black; font-weight: bold;&quot;&gt;in&lt;/span&gt; lines]))
soups &lt;span style=&quot;color: #333333;&quot;&gt;=&lt;/span&gt; [BeautifulSoup(l&lt;span style=&quot;color: #333333;&quot;&gt;.&lt;/span&gt;group(&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;0&lt;/span&gt;), &lt;span style=&quot;background-color: #fff0f0;&quot;&gt;&#39;html.parser&#39;&lt;/span&gt;)&lt;span style=&quot;color: #333333;&quot;&gt;.&lt;/span&gt;font &lt;span style=&quot;color: #008800; font-weight: bold;&quot;&gt;for&lt;/span&gt; l &lt;span style=&quot;color: black; font-weight: bold;&quot;&gt;in&lt;/span&gt; list_lines]
np_list &lt;span style=&quot;color: #333333;&quot;&gt;=&lt;/span&gt; [(&lt;span style=&quot;color: #007020;&quot;&gt;int&lt;/span&gt;(s&lt;span style=&quot;color: #333333;&quot;&gt;.&lt;/span&gt;get_text()), get_text_to_eol(s)) &lt;span style=&quot;color: #008800; font-weight: bold;&quot;&gt;for&lt;/span&gt; s &lt;span style=&quot;color: black; font-weight: bold;&quot;&gt;in&lt;/span&gt; soups]
&lt;/pre&gt;
&lt;/div&gt;
&lt;br /&gt;
&lt;!-- HTML generated using hilite.me --&gt;&lt;br /&gt;
&lt;div style=&quot;background: #ffffff; border-width: 0.1em 0.1em 0.1em 0.8em; border: solid gray; overflow: auto; padding: 0.2em 0.6em; width: auto;&quot;&gt;
&lt;pre style=&quot;line-height: 125%; margin: 0;&quot;&gt;(&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;7&lt;/span&gt;,
  &lt;span style=&quot;background-color: #fff0f0;&quot;&gt;&#39; is the smallest number of sides of a  regular  polygon that is not  constructible  by straightedge and compass.&#39;&lt;/span&gt;),
 (&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;8&lt;/span&gt;, &lt;span style=&quot;background-color: #fff0f0;&quot;&gt;&#39; is the largest  cube  in the  Fibonacci sequence .&#39;&lt;/span&gt;),
 (&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;9&lt;/span&gt;,
  &lt;span style=&quot;background-color: #fff0f0;&quot;&gt;&#39; is the maximum number of  cubes  that are needed to sum to any positive  integer .&#39;&lt;/span&gt;),
 (&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;10&lt;/span&gt;, &lt;span style=&quot;background-color: #fff0f0;&quot;&gt;&#39; is the base of our number system.&#39;&lt;/span&gt;),
 (&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;11&lt;/span&gt;, &lt;span style=&quot;background-color: #fff0f0;&quot;&gt;&#39; is the largest known  multiplicative persistence .&#39;&lt;/span&gt;),
 (&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;12&lt;/span&gt;, &lt;span style=&quot;background-color: #fff0f0;&quot;&gt;&#39; is the smallest  abundant number .&#39;&lt;/span&gt;),
 (&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;13&lt;/span&gt;, &lt;span style=&quot;background-color: #fff0f0;&quot;&gt;&#39; is the number of  Archimedean solids .&#39;&lt;/span&gt;),
 (&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;14&lt;/span&gt;, &lt;span style=&quot;background-color: #fff0f0;&quot;&gt;&#39; is the smallest even number n with no solutions to  φ (m) = n.&#39;&lt;/span&gt;),
 (&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;15&lt;/span&gt;,
  &lt;span style=&quot;background-color: #fff0f0;&quot;&gt;&#39; is the smallest  composite number  n with the property that there is only one  group  of order n.&#39;&lt;/span&gt;),
 (&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;16&lt;/span&gt;,
  &lt;span style=&quot;background-color: #fff0f0;&quot;&gt;&#39; is the only number of the form x ^ y  = y ^ x  with x and y being different  integers .&#39;&lt;/span&gt;),
 (&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;17&lt;/span&gt;, &lt;span style=&quot;background-color: #fff0f0;&quot;&gt;&#39; is the number of  wallpaper groups .&#39;&lt;/span&gt;),
&lt;/pre&gt;
&lt;/div&gt;
&lt;br /&gt;
Now we convert this list of pairs to a dictionary, so we can quickly look up the integer:&lt;br /&gt;
&lt;br /&gt;
&lt;div style=&quot;background: #ffffff; border-width: 0.1em 0.1em 0.1em 0.8em; border: solid gray; overflow: auto; padding: 0.2em 0.6em; width: auto;&quot;&gt;
&lt;pre style=&quot;line-height: 125%; margin: 0;&quot;&gt;&lt;span style=&quot;color: #007020;&quot;&gt;hash&lt;/span&gt; &lt;span style=&quot;color: #333333;&quot;&gt;=&lt;/span&gt; &lt;span style=&quot;color: #007020;&quot;&gt;dict&lt;/span&gt;(np_list)
&lt;/pre&gt;
&lt;/div&gt;
&lt;br /&gt;
&lt;!-- HTML generated using hilite.me --&gt;&lt;br /&gt;
&lt;div style=&quot;background: #ffffff; border-width: 0.1em 0.1em 0.1em 0.8em; border: solid gray; overflow: auto; padding: 0.2em 0.6em; width: auto;&quot;&gt;
&lt;pre style=&quot;line-height: 125%; margin: 0;&quot;&gt;{&lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;0&lt;/span&gt;: &lt;span style=&quot;background-color: #fff0f0;&quot;&gt;&#39; is the  additive identity .&#39;&lt;/span&gt;,
 &lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;1&lt;/span&gt;: &lt;span style=&quot;background-color: #fff0f0;&quot;&gt;&#39; is the  multiplicative identity .&#39;&lt;/span&gt;,
 &lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;2&lt;/span&gt;: &lt;span style=&quot;background-color: #fff0f0;&quot;&gt;&#39; is the only even  prime .&#39;&lt;/span&gt;,
 &lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;3&lt;/span&gt;: &lt;span style=&quot;background-color: #fff0f0;&quot;&gt;&#39; is the number of spatial dimensions we live in.&#39;&lt;/span&gt;,
 &lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;4&lt;/span&gt;: &lt;span style=&quot;background-color: #fff0f0;&quot;&gt;&#39; is the smallest number of colors sufficient to color all planar maps.&#39;&lt;/span&gt;,
 &lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;5&lt;/span&gt;: &lt;span style=&quot;background-color: #fff0f0;&quot;&gt;&#39; is the number of  Platonic solids .&#39;&lt;/span&gt;,
 &lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;6&lt;/span&gt;: &lt;span style=&quot;background-color: #fff0f0;&quot;&gt;&#39; is the smallest  perfect number .&#39;&lt;/span&gt;,
 &lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;7&lt;/span&gt;: &lt;span style=&quot;background-color: #fff0f0;&quot;&gt;&#39; is the smallest number of sides of a  regular  polygon that is not  constructible  by straightedge and compass.&#39;&lt;/span&gt;,
 &lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;8&lt;/span&gt;: &lt;span style=&quot;background-color: #fff0f0;&quot;&gt;&#39; is the largest  cube  in the  Fibonacci sequence .&#39;&lt;/span&gt;,
 &lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;9&lt;/span&gt;: &lt;span style=&quot;background-color: #fff0f0;&quot;&gt;&#39; is the maximum number of  cubes  that are needed to sum to any positive  integer .&#39;&lt;/span&gt;,
 &lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;10&lt;/span&gt;: &lt;span style=&quot;background-color: #fff0f0;&quot;&gt;&#39; is the base of our number system.&#39;&lt;/span&gt;,
 &lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;11&lt;/span&gt;: &lt;span style=&quot;background-color: #fff0f0;&quot;&gt;&#39; is the largest known  multiplicative persistence .&#39;&lt;/span&gt;,
 &lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;12&lt;/span&gt;: &lt;span style=&quot;background-color: #fff0f0;&quot;&gt;&#39; is the smallest  abundant number .&#39;&lt;/span&gt;,
 &lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;13&lt;/span&gt;: &lt;span style=&quot;background-color: #fff0f0;&quot;&gt;&#39; is the number of  Archimedean solids .&#39;&lt;/span&gt;,
 &lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;14&lt;/span&gt;: &lt;span style=&quot;background-color: #fff0f0;&quot;&gt;&#39; is the smallest even number n with no solutions to  φ (m) = n.&#39;&lt;/span&gt;,
 &lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;15&lt;/span&gt;: &lt;span style=&quot;background-color: #fff0f0;&quot;&gt;&#39; is the smallest  composite number  n with the property that there is only one  group  of order n.&#39;&lt;/span&gt;,
 &lt;span style=&quot;color: #0000dd; font-weight: bold;&quot;&gt;16&lt;/span&gt;: &lt;span style=&quot;background-color: #fff0f0;&quot;&gt;&#39; is the only number of the form x ^ y  = y ^ x  with x and y being different  integers .&#39;&lt;/span&gt;,
&lt;/pre&gt;
&lt;/div&gt;
&lt;br /&gt;
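The phrases carry doubled spaces and a stray space before the full stop because get_text_to_eol() joins its parts with a single space while the text nodes already contain their own. They can be tidied at lookup time. Here is a minimal sketch; tidy_phrase and describe are hypothetical helper names that do not appear in the original script, and the two entries are copied from the output above:&lt;br /&gt;
&lt;br /&gt;
&lt;pre style=&quot;line-height: 125%; margin: 0;&quot;&gt;
```python
import re


def tidy_phrase(phrase):
    """Collapse runs of whitespace and drop the space left before punctuation."""
    phrase = re.sub(r"\s+", " ", phrase).strip()
    return re.sub(r"\s+([.,;:])", r"\1", phrase)


def describe(n, table):
    """Return 'n phrase' for a known integer, or None when n is not in the table."""
    phrase = table.get(n)
    if phrase is None:
        return None
    return "{} {}".format(n, tidy_phrase(phrase))


# Two entries copied from the dictionary output above.
facts = {
    0: " is the  additive identity .",
    8: " is the largest  cube  in the  Fibonacci sequence .",
}

print(describe(8, facts))
```
&lt;/pre&gt;
&lt;br /&gt;
Using dict.get() rather than indexing means an integer that the page says nothing about yields None instead of raising KeyError.&lt;br /&gt;
&lt;br /&gt;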
get_text_to_eol() was modified to cope with reaching the end of the sibling list without finding a &amp;lt;br&amp;gt;, which is now the expected case since the split has already consumed the &amp;lt;br&amp;gt; tags. Also, we keep all strings as Unicode until they need to be output, at which point they are encoded as UTF-8.&lt;br /&gt;
&lt;br /&gt;
&lt;div style=&quot;background: #ffffff; border-width: 0.1em 0.1em 0.1em 0.8em; border: solid gray; overflow: auto; padding: 0.2em 0.6em; width: auto;&quot;&gt;
&lt;pre style=&quot;line-height: 125%; margin: 0;&quot;&gt;&lt;span style=&quot;color: #008800; font-weight: bold;&quot;&gt;def&lt;/span&gt; &lt;span style=&quot;color: #0066bb; font-weight: bold;&quot;&gt;get_text_to_eol&lt;/span&gt;(font_section):
    text_parts &lt;span style=&quot;color: #333333;&quot;&gt;=&lt;/span&gt; []
    section &lt;span style=&quot;color: #333333;&quot;&gt;=&lt;/span&gt; font_section&lt;span style=&quot;color: #333333;&quot;&gt;.&lt;/span&gt;next_sibling
    &lt;span style=&quot;color: #008800; font-weight: bold;&quot;&gt;while&lt;/span&gt; section &lt;span style=&quot;color: black; font-weight: bold;&quot;&gt;is&lt;/span&gt; &lt;span style=&quot;color: black; font-weight: bold;&quot;&gt;not&lt;/span&gt; &lt;span style=&quot;color: #007020;&quot;&gt;None&lt;/span&gt;:
        &lt;span style=&quot;color: #008800; font-weight: bold;&quot;&gt;if&lt;/span&gt; section&lt;span style=&quot;color: #333333;&quot;&gt;.&lt;/span&gt;name &lt;span style=&quot;color: #333333;&quot;&gt;==&lt;/span&gt; &lt;span style=&quot;background-color: #fff0f0;&quot;&gt;&#39;a&#39;&lt;/span&gt;:
            text_parts&lt;span style=&quot;color: #333333;&quot;&gt;.&lt;/span&gt;append(section&lt;span style=&quot;color: #333333;&quot;&gt;.&lt;/span&gt;get_text())
        &lt;span style=&quot;color: #008800; font-weight: bold;&quot;&gt;elif&lt;/span&gt; section&lt;span style=&quot;color: #333333;&quot;&gt;.&lt;/span&gt;name &lt;span style=&quot;color: black; font-weight: bold;&quot;&gt;is&lt;/span&gt; &lt;span style=&quot;color: #007020;&quot;&gt;None&lt;/span&gt;:
            text_parts&lt;span style=&quot;color: #333333;&quot;&gt;.&lt;/span&gt;append(&lt;span style=&quot;color: #007020;&quot;&gt;unicode&lt;/span&gt;(section))
        &lt;span style=&quot;color: #008800; font-weight: bold;&quot;&gt;else&lt;/span&gt;:
            &lt;span style=&quot;color: #008800; font-weight: bold;&quot;&gt;if&lt;/span&gt; section&lt;span style=&quot;color: #333333;&quot;&gt;.&lt;/span&gt;name &lt;span style=&quot;color: #333333;&quot;&gt;==&lt;/span&gt; &lt;span style=&quot;background-color: #fff0f0;&quot;&gt;&#39;sup&#39;&lt;/span&gt;:
                text_parts&lt;span style=&quot;color: #333333;&quot;&gt;.&lt;/span&gt;append(&lt;span style=&quot;background-color: #fff0f0;&quot;&gt;&#39;^&#39;&lt;/span&gt;)
            text_parts&lt;span style=&quot;color: #333333;&quot;&gt;.&lt;/span&gt;append(section&lt;span style=&quot;color: #333333;&quot;&gt;.&lt;/span&gt;get_text())
        section &lt;span style=&quot;color: #333333;&quot;&gt;=&lt;/span&gt; section&lt;span style=&quot;color: #333333;&quot;&gt;.&lt;/span&gt;next_sibling
    &lt;span style=&quot;color: #008800; font-weight: bold;&quot;&gt;return&lt;/span&gt; &lt;span style=&quot;background-color: #fff0f0;&quot;&gt;&#39; &#39;&lt;/span&gt;&lt;span style=&quot;color: #333333;&quot;&gt;.&lt;/span&gt;join(text_parts)
&lt;/pre&gt;
&lt;/div&gt;
&lt;br /&gt;
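The function above is written for Python 2, where unicode() is a builtin. In Python 3, str is already Unicode, so the same "Unicode inside, UTF-8 at the edge" rule might look like the following. This is a minimal sketch and not the original script: the phrase is copied from the earlier output, and the in-memory buffer merely stands in for whatever the real output destination is:&lt;br /&gt;
&lt;br /&gt;
&lt;pre style=&quot;line-height: 125%; margin: 0;&quot;&gt;
```python
import io

# Inside the program the phrase stays as text (Unicode).
phrase = "14 is the smallest even number n with no solutions to φ(m) = n."
assert isinstance(phrase, str)

# Encode only at the point of output; BytesIO stands in for a file or socket.
buffer = io.BytesIO()
buffer.write(phrase.encode("utf-8"))
encoded = buffer.getvalue()

# φ takes two bytes in UTF-8, so the byte string is one longer than the text.
extra_bytes = len(encoded) - len(phrase)
```
&lt;/pre&gt;
&lt;br /&gt;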
I think this will do for our purposes. In the next post I will show how this was used, along with the bike counter uploads, to post early-morning updates to Twitter.&lt;br /&gt;
&lt;br /&gt;
The full source code, along with the Twitter updates, can be found &lt;a href=&quot;https://github.com/thushw/fremont&quot;&gt;here&lt;/a&gt;.&lt;br /&gt;
&lt;br /&gt;</content><link rel='replies' type='application/atom+xml' href='http://thushw.blogspot.com/feeds/4935113659990877418/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/11265228/4935113659990877418' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/11265228/posts/default/4935113659990877418'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/11265228/posts/default/4935113659990877418'/><link rel='alternate' type='text/html' href='http://thushw.blogspot.com/2017/11/a-beautiful-soupy-exercise-in-scraping.html' title='A Beautiful Soupy Exercise in Scraping Interesting Integers'/><author><name>Unknown</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjwuEKrDHEq5X3wVixDUMVlfnxunzQmDbcZZvhi7hGZkIWPgPrSRODjxChSSvUGa2LZkjLvwWlw83FsAUbBH1N6MBgQg-GUWTFAw6hANv_izLKWplVNP8dIe4MtoA_AkCvxTmsxKg/s72-c/numbers.png" height="72" width="72"/><thr:total>1</thr:total></entry></feed>