<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>jenaiz.com &#8211; I think.</title>
	<atom:link href="https://jenaiz.com/feed/" rel="self" type="application/rss+xml" />
	<link>https://jenaiz.com/</link>
	<description>by Jesús Navarrete</description>
	<lastBuildDate>Mon, 30 Jan 2023 10:38:12 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.8.5</generator>
	<item>
		<title>Django i18n without Unicode issues</title>
		<link>https://jenaiz.com/2023/01/django-i18n-without-unicode-issues/</link>
		
		<dc:creator><![CDATA[jenaiz]]></dc:creator>
		<pubDate>Mon, 30 Jan 2023 10:38:10 +0000</pubDate>
				<category><![CDATA[programming]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[django]]></category>
		<category><![CDATA[i18n]]></category>
		<guid isPermaLink="false">https://jenaiz.com/?p=680</guid>

					<description><![CDATA[<p>A guide to solve Unicode issues when using i18n with Django. Use your special characters when translating to Spanish, German or Chinese.</p>
<p>The post <a href="https://jenaiz.com/2023/01/django-i18n-without-unicode-issues/">Django i18n without Unicode issues</a> appeared first on <a href="https://jenaiz.com">jenaiz.com - I think.</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<p class="has-small-font-size">There are hundreds of good tutorials written about how to add i18n to a Django application. Some of them are straightforward to follow and get the translations working quickly.</p>



<p>I have read tutorials for adding <a href="https://phrase.com/blog/posts/quick-guide-django-i18n/">German</a>, <a href="https://joaoventura.net/blog/2016/django-translation-2/">Portuguese</a>, Chinese, and Spanish. In all of them, the special characters are used directly in the translation files without any issues. In my case, that didn&#8217;t work at first.</p>



<p>I couldn&#8217;t use the special Spanish characters directly in the translation files. I always got an error similar to &#8220;<strong>&#8216;ascii&#8217; codec can&#8217;t decode byte 0xc3 in position 18: ordinal not in range(128)</strong>&#8221;.</p>



<p>Are you having issues with accented or special characters in your translation files? Are characters like ß, ü, or &#8220;你好&#8221; not showing up? Let me help you with the problem.</p>



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<h2 class="wp-block-heading">My Project Setup</h2>



<p>I have a Django project with multiple applications, and for each application, I had to add its translation files. Because of the numerous templates, views, and APIs, I wanted to translate one application at a time: test the translations slowly for every feature and then jump to the next one.</p>



<p>When I read the tutorials, I didn&#8217;t generate the .po files or the folder structure automatically. I tried, but the generation was giving me some issues, and because of my approach, I decided to create the folder structure and the files by hand.</p>



<p>If you look at the tutorials carefully, they usually create the files using <strong>makemessages</strong>. This utility creates the folder structure in all the applications and looks for the keys that need to be added depending on your application.</p>



<p>It is pretty simple to use: execute the command below with the language you would like to add, in this case German.</p>



<pre class="wp-block-code"><code class="">django-admin makemessages -l de -i venv</code></pre>



<p>And it will create the folder structure for all the applications, together with all the keys that need to be translated. The result will look like this:</p>


<div class="wp-block-image is-style-default">
<figure class="aligncenter size-medium"><a href="https://jenaiz.com/wp-content/uploads/2023/01/Screenshot-2023-01-21-at-12.14.25.png"><img fetchpriority="high" decoding="async" width="300" height="282" src="https://jenaiz.com/wp-content/uploads/2023/01/Screenshot-2023-01-21-at-12.14.25-300x282.png" alt="Folders for i18n in Django" class="wp-image-691" title="Folder structure for German" srcset="https://jenaiz.com/wp-content/uploads/2023/01/Screenshot-2023-01-21-at-12.14.25-300x282.png 300w, https://jenaiz.com/wp-content/uploads/2023/01/Screenshot-2023-01-21-at-12.14.25.png 562w" sizes="(max-width: 300px) 100vw, 300px" /></a></figure></div>


<p>English and Spanish are present, and their translation files are compiled in this case. After the execution of the above command, the structure for the German language was created. The new file and folders are the ones within the highlighted yellow square.</p>



<h2 class="wp-block-heading">My step-by-step technique</h2>



<p>As I said before, my project has multiple applications, eight to be precise. It is a side project I use for myself, and it is not yet publicly available to other users. I set up the internationalization and started with the home page. I added the files only for the main application, translated the templates to Spanish, and then jumped to the views.</p>



<p>In the templates, I use <strong>trans</strong>, so it is pretty simple to get it working. Just load the i18n library at the top of the template and start creating your translations:</p>



<pre class="wp-block-code"><code lang="django" class="language-django">{% extends 'base.html' %}

{% load i18n %}

...

&lt;h1 class="display-5 fw-bold"&gt;{% trans "bukios_home_welcome_title" %}&lt;/h1&gt;
...</code></pre>



<p>Don&#8217;t forget to add <strong>load i18n</strong> to every template where you use the translation tags. It is not inherited from other templates, even if you <strong>extend</strong> a base template or <strong>include</strong> other templates.</p>



<p>After I added a few keys, I tested the feature in isolation, without looking at texts from other applications or features. One feature and one application at a time. The command you need so that your templates show the translations instead of the keys is:</p>



<pre class="wp-block-code"><code lang="bash" class="language-bash">django-admin compilemessages -i venv</code></pre>



<p>This allows you to have both languages working, but how? I moved the English texts to the translation file, checked that it worked, and added the Spanish translation. Two steps: first checking that everything continued to work in English, and then checking that everything appeared in Spanish.</p>



<h3 class="wp-block-heading">First Issue: the templates</h3>



<p>In the templates, I found my first problem. The Spanish special characters, like the accented vowels, did not work when I used them directly. I looked into the different tutorials, and although everybody was using them directly in the .po files, it didn&#8217;t work for me. Each time I added an accented vowel or another Spanish special character, the compilation worked, but when I opened the page in the browser to see the changes I got an error similar to this one:</p>



<figure class="wp-block-image size-large"><a href="https://jenaiz.com/wp-content/uploads/2023/01/image-1.png"><img decoding="async" width="1024" height="286" src="https://jenaiz.com/wp-content/uploads/2023/01/image-1-1024x286.png" alt="Unicode error in Django" class="wp-image-696" srcset="https://jenaiz.com/wp-content/uploads/2023/01/image-1-1024x286.png 1024w, https://jenaiz.com/wp-content/uploads/2023/01/image-1-300x84.png 300w, https://jenaiz.com/wp-content/uploads/2023/01/image-1-768x214.png 768w, https://jenaiz.com/wp-content/uploads/2023/01/image-1-1536x429.png 1536w, https://jenaiz.com/wp-content/uploads/2023/01/image-1.png 1856w" sizes="(max-width: 1024px) 100vw, 1024px" /></a></figure>



<p>The hint in the above error:</p>


<div class="wp-block-image">
<figure class="aligncenter size-medium"><a href="https://jenaiz.com/wp-content/uploads/2023/01/image-2.png"><img decoding="async" width="300" height="64" src="https://jenaiz.com/wp-content/uploads/2023/01/image-2-300x64.png" alt="Unicode issue with Spanish characters" class="wp-image-697" srcset="https://jenaiz.com/wp-content/uploads/2023/01/image-2-300x64.png 300w, https://jenaiz.com/wp-content/uploads/2023/01/image-2-768x164.png 768w, https://jenaiz.com/wp-content/uploads/2023/01/image-2.png 834w" sizes="(max-width: 300px) 100vw, 300px" /></a></figure></div>


<p>a Unicode error with some text, probably something like &#8220;&#8230;etición in&#8230;&#8221;</p>
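<p>To see where a message like this comes from, here is a minimal sketch in plain Python. It assumes the translation text is stored as UTF-8 bytes and that something later tries to read those bytes as ASCII. The sample word is only an illustration, not the actual text from my project.</p>

<pre class="wp-block-code"><code lang="python" class="language-python"># Illustrative text only; any word with an accented vowel behaves the same way.
text = "petición"
raw = text.encode("utf-8")  # "ó" becomes the two bytes 0xc3 0xb3

try:
    raw.decode("ascii")  # ASCII only covers byte values 0 to 127
    message = "decoded without errors"
except UnicodeDecodeError as err:
    message = str(err)
    position = err.start  # index of the first byte that ASCII cannot decode

print(message)
print(raw.decode("utf-8"))  # decoding with the right charset gives the text back</code></pre>

<p>The byte 0xc3 is the first UTF-8 byte of letters such as á, é, ó, or ñ, which is why it shows up in the error.</p>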



<p>After looking around a bit, I found that &#8220;ó&#8221; was not working but &#8220;&amp;oacute;&#8221; was. So, I decided to encode all the special characters as HTML entities. It was not a big deal for Spanish, but it may be impossible for Chinese.</p>



<h3 class="wp-block-heading">Second issue: the messages</h3>



<p>After I finished with the templates, I continued with views.py. Views don&#8217;t use the same tag; there you use <strong>gettext(&#8230;)</strong> or similar.</p>



<p>But it took me some time to figure out that <strong>gettext</strong> was not able to translate &amp;aacute; to á: it does not decode the HTML entities in the translation files.</p>



<h2 class="wp-block-heading">The solution, or how to set the charset</h2>



<p>The problem is clear, but let me briefly summarize it. In Spanish, German, or other languages, the special characters are not shown as expected in the views. Encoding them as HTML entities is not a good solution: it only partially solves the problem, because the gettext function does not decode HTML entities at all. Why partially? Because, for example, I tried something like &#8216;你好&#8217; (which means &#8216;hello&#8217;) and it didn&#8217;t show up either.</p>



<p>The problem does not happen at compilation time; it happens at run time. In the browser, you will see something similar to the screenshot below.</p>



<figure class="wp-block-image size-large"><a href="https://jenaiz.com/wp-content/uploads/2023/01/image.png"><img loading="lazy" decoding="async" width="1024" height="831" src="https://jenaiz.com/wp-content/uploads/2023/01/image-1024x831.png" alt="Unicode error in Django, complete stacktrace" class="wp-image-684" srcset="https://jenaiz.com/wp-content/uploads/2023/01/image-1024x831.png 1024w, https://jenaiz.com/wp-content/uploads/2023/01/image-300x244.png 300w, https://jenaiz.com/wp-content/uploads/2023/01/image-768x624.png 768w, https://jenaiz.com/wp-content/uploads/2023/01/image-1536x1247.png 1536w, https://jenaiz.com/wp-content/uploads/2023/01/image.png 1872w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></a></figure>



<p>The deep analysis gave me some insights: the encoding was wrong at some point. I tried at the application level, at the HTML level, and at the translation files level. I failed at all the levels with all my changes. My setup was correct, the same as in every tutorial, with no differences. Well, almost no differences.</p>



<p>How did I find the error? After reading, testing, and failing so many times around Unicode and the encoding and decoding done by the <strong>gettext</strong> function, I decided to debug the library and find out how it worked, where it was failing, and what the code was doing at that level.</p>



<h3 class="wp-block-heading">The clue</h3>



<p>I put a breakpoint at line 458 in the gettext.py file, stopped the application there, and ran some tests when the key had some accented vowels.</p>



<pre class="wp-block-code"><code class="">/usr/local/Cellar/python@3.9/3.9.16/Frameworks/Python.framework/Versions/3.9/lib/python3.9/gettext.py</code></pre>


<div class="wp-block-image">
<figure class="aligncenter size-large is-resized"><a href="https://jenaiz.com/wp-content/uploads/2023/01/image-3.png"><img loading="lazy" decoding="async" src="https://jenaiz.com/wp-content/uploads/2023/01/image-3-1024x269.png" alt="Debugging gettext library" class="wp-image-699" width="512" height="135" srcset="https://jenaiz.com/wp-content/uploads/2023/01/image-3-1024x269.png 1024w, https://jenaiz.com/wp-content/uploads/2023/01/image-3-300x79.png 300w, https://jenaiz.com/wp-content/uploads/2023/01/image-3-768x202.png 768w, https://jenaiz.com/wp-content/uploads/2023/01/image-3.png 1202w" sizes="auto, (max-width: 512px) 100vw, 512px" /></a></figure></div>


<p>The key was the <strong>charset</strong> value. While debugging, I found out that it was &#8216;ascii&#8217;, not UTF-8 as I was expecting and as I was setting in a lot of places. The value that I had added after reading some similar cases on Stack Overflow was not reaching this point. I tried with UTF-8.</p>


<div class="wp-block-image">
<figure class="aligncenter size-large is-resized"><a href="https://jenaiz.com/wp-content/uploads/2023/01/charset-utf8.jpg"><img loading="lazy" decoding="async" src="https://jenaiz.com/wp-content/uploads/2023/01/charset-utf8-1024x278.jpg" alt="" class="wp-image-714" width="512" height="139" srcset="https://jenaiz.com/wp-content/uploads/2023/01/charset-utf8-1024x278.jpg 1024w, https://jenaiz.com/wp-content/uploads/2023/01/charset-utf8-300x82.jpg 300w, https://jenaiz.com/wp-content/uploads/2023/01/charset-utf8-768x209.jpg 768w, https://jenaiz.com/wp-content/uploads/2023/01/charset-utf8.jpg 1218w" sizes="auto, (max-width: 512px) 100vw, 512px" /></a></figure></div>


<p>Eureka! It worked!</p>



<p>At this point, I had some success, but still no solution.</p>



<h3 class="wp-block-heading">The solution</h3>



<p>I added UTF-8 in multiple ways and in multiple files, but nothing seemed to work. I analyzed the tutorials one by one, reviewing again the setup, the views, the imports, using <strong>ugettext</strong> (deprecated in Django 3.0 and removed in Django 4.0), u&#8217;your text&#8217;, &#8230;</p>



<p>Nothing worked.</p>



<p>I took <a href="https://docs.djangoproject.com/en/4.1/topics/i18n/translation/">the official documentation</a> and read it from the beginning, carefully, but I wasn&#8217;t able to find how to fix it.</p>



<p>At this point, I had created the English and Spanish files for the complete application. All the translations worked fine for the templates, but not the translations coming from the views using <strong>gettext</strong>. I wasn&#8217;t translating the JavaScript files yet, but those use gettext too. They could wait until I fixed the issue.</p>



<p>And then I found a <a href="https://medium.com/analytics-vidhya/django-translations-working-example-70457372bd72">post on Medium</a>. Giorgi was doing the same as the others, using English, Russian, and Georgian. But one thing was different: he showed a .po file with something like a header, similar to the code below.</p>



<pre class="wp-block-code"><code class="">msgid ""
msgstr ""
"Project-Id-Version: PACKAGE VERSION\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2020-01-28 14:45+0400\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME &lt;EMAIL@ADDRESS&gt;\n"
"Language-Team: LANGUAGE &lt;LL@li.org&gt;\n"
"Language: \n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=UTF-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Plural-Forms: nplurals=2; plural=(n!=1);\n"</code></pre>



<p>Look at the Content-Type line: the charset is set to UTF-8 there.</p>



<p>I tried it and the magic happened. The accented characters, the opening exclamation marks, our famous ñ&#8230; all of them started to show up.</p>



<p>The key is the line where the charset is set. I tested removing the others and the translations still worked.</p>
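<p>Here is a sketch of why that one line matters. It assumes that the standard-library <strong>gettext.GNUTranslations</strong> class (from the gettext.py file I was debugging) is what loads the compiled catalog. The helpers <strong>build_mo</strong> and <strong>load</strong> are hypothetical names written only for this example, and the key and text are made up; in a real project, compilemessages produces the .mo file for you.</p>

<pre class="wp-block-code"><code lang="python" class="language-python">import gettext
import io
import struct


def build_mo(catalog):
    """Pack a {msgid: msgstr} dict into the binary layout of a GNU .mo file."""
    keys = sorted(catalog)  # msgids are stored sorted; the header "" comes first
    ids = b""
    strs = b""
    spans = []
    for key in keys:
        msgid = key.encode("utf-8")
        msgstr = catalog[key].encode("utf-8")
        spans.append((len(ids), len(msgid), len(strs), len(msgstr)))
        ids += msgid + b"\0"
        strs += msgstr + b"\0"
    table_size = 7 * 4 + 16 * len(keys)  # fixed header plus two index tables
    key_index = []
    value_index = []
    for id_pos, id_len, str_pos, str_len in spans:
        key_index += [id_len, table_size + id_pos]
        value_index += [str_len, table_size + len(ids) + str_pos]
    header = struct.pack(
        "Iiiiiii",
        0x950412DE,             # magic number of the .mo format
        0,                      # format version
        len(keys),              # number of entries
        7 * 4,                  # where the msgid index starts
        7 * 4 + 8 * len(keys),  # where the msgstr index starts
        0, 0,                   # no hash table
    )
    index = struct.pack(f"{4 * len(keys)}i", *(key_index + value_index))
    return header + index + ids + strs


def load(catalog):
    """Parse the in-memory .mo data the way gettext parses a compiled file."""
    return gettext.GNUTranslations(io.BytesIO(build_mo(catalog)))


messages = {"request_sent": "¡Petición enviada!"}  # hypothetical key and text

try:
    load(messages)  # no header entry, so no charset is declared
    failed_without_header = False
except UnicodeDecodeError:
    failed_without_header = True

with_header = dict(messages)
with_header[""] = "Content-Type: text/plain; charset=UTF-8\n"
translated = load(with_header).gettext("request_sent")

print(failed_without_header)
print(translated)</code></pre>

<p>The entry with the empty msgid is the header of the catalog. Without it, the parser has no charset to use, which matches the &#8216;ascii&#8217; value I saw in the debugger.</p>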



<h2 class="wp-block-heading">Conclusion</h2>



<p>Look for a tutorial that fits you well and start setting up i18n in Django. Also keep the <a href="https://docs.djangoproject.com/en/4.1/topics/i18n/translation/">documentation</a> close and read it. My recommendation is to use Django&#8217;s utility commands at the beginning, until you learn all the details, which will save you problems. As you have read, I have described how to set the charset to UTF-8 for your translation files and how to dive deep to understand what is going on in the libraries.</p>



<p>In my case, the issue was not using the <strong>makemessages</strong> utility at the beginning and letting it set those values automatically. It didn&#8217;t help that all the tutorials I was reading clean up the files and remove these header values completely. In some tutorials, you can see some dots (&#8216;&#8230;&#8217;), but it is difficult to figure out what they mean.</p>



<p>Remember to add the charset in the way that I have described. It was not easy to find out how to add it to the translation files.</p>



<p>There are more values that I have kept for now; I will fine-tune them later, as soon as I see the impact that some of them have on the different libraries. For the time being, they serve as documentation of what is possible to set up in the translation files.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Understanding how the Python garbage collector works</title>
		<link>https://jenaiz.com/2022/02/understanding-how-the-python-garbage-collector-works/</link>
		
		<dc:creator><![CDATA[jenaiz]]></dc:creator>
		<pubDate>Fri, 25 Feb 2022 15:32:16 +0000</pubDate>
				<category><![CDATA[programming]]></category>
		<category><![CDATA[garbage collector]]></category>
		<category><![CDATA[performance]]></category>
		<category><![CDATA[python]]></category>
		<guid isPermaLink="false">https://jenaiz.com/?p=504</guid>

					<description><![CDATA[<p>Understanding more about Reference Counting and Generational Garbage Collector, the garbage collector algorithms used by CPython.</p>
<p>The post <a href="https://jenaiz.com/2022/02/understanding-how-the-python-garbage-collector-works/">Understanding how the Python garbage collector works</a> appeared first on <a href="https://jenaiz.com">jenaiz.com - I think.</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<p>As most of you probably know, Python is a dynamic programming language with different implementations. The CPython implementation manages memory using reference counting and a generational garbage collector. It is important to mention that other implementations of Python, like PyPy or IronPython, can use different strategies.</p>



<p>Did you know those strategies were used for memory management? Do you know whether it&#8217;s possible to disable the garbage collector or not?</p>



<hr class="wp-block-separator is-style-dots"/>



<p>Since version 2.0, Python has used two different strategies for memory management: reference counting and generational garbage collection. Prior to that, the only strategy used was reference counting.</p>



<h2 class="wp-block-heading" id="reference-counting">Reference Counting</h2>



<p>With this technique, the interpreter keeps a count of the references to each object. When a new reference is created, the counter is incremented by one; when a reference is removed, the counter is decremented by one.</p>



<p>Of course, every object created in Python needs to keep its counter updated continuously. When the reference count reaches 0, the object is no longer reachable and its memory can be freed.</p>



<p>Let&#8217;s create three references to the object &#8220;my object&#8221; and check the reference count of the object.</p>



<pre class="wp-block-preformatted has-cyan-bluish-gray-background-color has-background"><strong>&gt;&gt;&gt; import sys
&gt;&gt;&gt; a = "my object"
&gt;&gt;&gt; b = a
&gt;&gt;&gt; c = a
&gt;&gt;&gt; id(a)</strong>
4377801904
<strong>&gt;&gt;&gt; id(b)
</strong>4377801904
<strong>&gt;&gt;&gt; id(c)
</strong>4377801904
<strong>&gt;&gt;&gt; sys.getrefcount(a)
</strong>4</pre>



<p><strong>id(&#8230;)</strong> returns the unique integer that identifies the object behind the reference, and <strong>sys.getrefcount(a)</strong> returns the reference count of the object (&#8220;my object&#8221;). The count returned is generally one higher than you might expect, because it includes the temporary reference created by passing the object as an argument to getrefcount().</p>



<p>Below there is a representation of the above code.</p>


<div class="wp-block-image">
<figure class="aligncenter size-full is-resized"><a href="https://jenaiz.com/wp-content/uploads/2021/12/image-1.png"><img loading="lazy" decoding="async" src="https://jenaiz.com/wp-content/uploads/2021/12/image-1.png" alt="" class="wp-image-516" width="455" height="246" srcset="https://jenaiz.com/wp-content/uploads/2021/12/image-1.png 910w, https://jenaiz.com/wp-content/uploads/2021/12/image-1-300x162.png 300w, https://jenaiz.com/wp-content/uploads/2021/12/image-1-768x415.png 768w" sizes="auto, (max-width: 455px) 100vw, 455px" /></a><figcaption>Three references are linked to the object</figcaption></figure></div>


<p>If we remove one reference, the counter gets decremented by one.</p>



<pre class="wp-block-preformatted has-cyan-bluish-gray-background-color has-background"><strong>&gt;&gt;&gt; del(c)
&gt;&gt;&gt; sys.getrefcount(a)</strong>
3</pre>


<div class="wp-block-image">
<figure class="aligncenter size-full is-resized"><a href="https://jenaiz.com/wp-content/uploads/2021/12/image-2.png"><img loading="lazy" decoding="async" src="https://jenaiz.com/wp-content/uploads/2021/12/image-2.png" alt="" class="wp-image-518" width="441" height="247" srcset="https://jenaiz.com/wp-content/uploads/2021/12/image-2.png 882w, https://jenaiz.com/wp-content/uploads/2021/12/image-2-300x168.png 300w, https://jenaiz.com/wp-content/uploads/2021/12/image-2-768x430.png 768w" sizes="auto, (max-width: 441px) 100vw, 441px" /></a><figcaption>One reference removed</figcaption></figure></div>


<p>If we remove all the references, the counter drops to 0 and the object can be erased from memory.</p>


<div class="wp-block-image">
<figure class="aligncenter size-large is-resized"><a href="https://jenaiz.com/wp-content/uploads/2021/12/image-3.png"><img loading="lazy" decoding="async" src="https://jenaiz.com/wp-content/uploads/2021/12/image-3-1024x413.png" alt="" class="wp-image-519" width="512" height="207" srcset="https://jenaiz.com/wp-content/uploads/2021/12/image-3-1024x413.png 1024w, https://jenaiz.com/wp-content/uploads/2021/12/image-3-300x121.png 300w, https://jenaiz.com/wp-content/uploads/2021/12/image-3-768x309.png 768w, https://jenaiz.com/wp-content/uploads/2021/12/image-3.png 1266w" sizes="auto, (max-width: 512px) 100vw, 512px" /></a><figcaption>No more references exist</figcaption></figure></div>
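<p>The sequence above can be sketched in a small script. This is only an illustration for CPython: the class name <strong>Tracked</strong> is made up for the example, and <strong>weakref.finalize</strong> is used just to observe the moment the object goes away.</p>

<pre class="wp-block-preformatted has-cyan-bluish-gray-background-color has-background">import sys
import weakref


class Tracked:
    """Plain class used only for this demo; instances support weak references."""


events = []
obj = Tracked()
# Register a callback that runs when the object is finalized.
weakref.finalize(obj, events.append, "deallocated")

alias = obj
count_with_alias = sys.getrefcount(obj)
del alias
count_without_alias = sys.getrefcount(obj)

del obj  # last reference removed; in CPython the callback runs right here
print(count_with_alias - count_without_alias)
print(events)</pre>

<p>The script compares two counts instead of printing them directly, because the absolute numbers reported by getrefcount() include temporary references.</p>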


<p>Something curious is that common values had a higher reference count than I expected. This is because other code already references them when the interpreter starts up. For example, I created a reference to <strong>1</strong> and found a reference count of a few hundred for that object. My recommendation is to create an unusual number or string, which will help you understand the <strong>getrefcount(&#8230;)</strong> function.</p>



<pre class="wp-block-preformatted has-cyan-bluish-gray-background-color has-background"><strong>&gt;&gt;&gt; h = 1
&gt;&gt;&gt; sys.getrefcount(h)
</strong>601
<strong>&gt;&gt;&gt; h = 3.14151692
&gt;&gt;&gt; sys.getrefcount(h)
</strong>2</pre>



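<p>Also, if you create two objects with the same value, they usually don&#8217;t get the same ID, because they are not the same object. CPython does reuse some objects, such as small integers, which is why this example uses a larger number. You can check their unique IDs and the reference count of the objects.</p>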



<pre class="wp-block-preformatted has-cyan-bluish-gray-background-color has-background"><strong>&gt;&gt;&gt; a = 1234
&gt;&gt;&gt; b = 1234
&gt;&gt;&gt; id(a)
</strong>4484904240
<strong>&gt;&gt;&gt; id(b)
</strong>4484904080
<strong>&gt;&gt;&gt; sys.getrefcount(b)
</strong>2
<strong>&gt;&gt;&gt; sys.getrefcount(a)
</strong>2</pre>



<p>A benefit of reference counting is that an object can be erased from memory as soon as it has no references.</p>



<p>It also has some drawbacks. It can be really inefficient, particularly in a naive multi-threaded implementation. And it is not able to handle objects with circular references. For those cases, Python applies a second algorithm called generational garbage collection.</p>
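<p>The circular reference case can be sketched as follows. The <strong>Node</strong> class is a made-up example, and the automatic collector is switched off during the demo only so that it cannot run in the middle of it.</p>

<pre class="wp-block-preformatted has-cyan-bluish-gray-background-color has-background">import gc
import weakref


class Node:
    """Made-up class whose instances can point at each other."""

    def __init__(self):
        self.partner = None


gc.disable()  # keep automatic collections from interfering with the demo

a = Node()
b = Node()
a.partner = b  # a references b ...
b.partner = a  # ... and b references a, which closes the cycle

probe = weakref.ref(a)  # a weak reference does not keep the object alive
del a, b

# The names are gone, but each object is still referenced by the other one,
# so reference counting alone never reaches 0.
alive_after_del = probe() is not None

gc.collect()  # the cycle detector finds and frees the unreachable pair
alive_after_collect = probe() is not None

gc.enable()
print(alive_after_del, alive_after_collect)</pre>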



<h2 class="wp-block-heading" id="generational-garbage-collection">Generational Garbage Collection</h2>



<p>This algorithm divides the objects into different generations based on how long they have been alive, and it can apply different policies to each generation.</p>



<p>Python creates three generations when the interpreter starts. New objects go to the first generation; if they survive a collection, the algorithm moves them to the second generation. The same happens there: the objects are either collected or moved to the third generation, where the survivors stay until the program ends.</p>



<p>Each generation has a threshold. When the count for a generation exceeds its threshold, Python runs the garbage collection process.</p>


<div class="wp-block-image">
<figure class="aligncenter size-large"><a href="https://jenaiz.com/wp-content/uploads/2021/12/image-4.png"><img loading="lazy" decoding="async" width="1024" height="333" src="https://jenaiz.com/wp-content/uploads/2021/12/image-4-1024x333.png" alt="" class="wp-image-520" srcset="https://jenaiz.com/wp-content/uploads/2021/12/image-4-1024x333.png 1024w, https://jenaiz.com/wp-content/uploads/2021/12/image-4-300x98.png 300w, https://jenaiz.com/wp-content/uploads/2021/12/image-4-768x250.png 768w, https://jenaiz.com/wp-content/uploads/2021/12/image-4-1536x500.png 1536w, https://jenaiz.com/wp-content/uploads/2021/12/image-4-2048x667.png 2048w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></a><figcaption>Generational Garbage Collection in three steps</figcaption></figure></div>
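<p>The thresholds and the current counters can be inspected with the gc module. This is a small sketch; the value 1000 is an arbitrary example and not a recommendation, and the numbers printed depend on your Python version.</p>

<pre class="wp-block-preformatted has-cyan-bluish-gray-background-color has-background">import gc

thresholds = gc.get_threshold()  # one threshold per generation
counts = gc.get_count()          # current counters, one per generation

print("thresholds:", thresholds)
print("counts:", counts)

# Raising the first threshold makes collections of the youngest generation
# less frequent.
gc.set_threshold(1000, thresholds[1], thresholds[2])
raised = gc.get_threshold()

gc.set_threshold(*thresholds)  # restore the original settings</pre>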


<p>One of the drawbacks of this technique is that it usually fails to remove long-lived garbage, although it does a good job with the newest objects.</p>



<h3 class="wp-block-heading" id="is-it-possible-to-disable-the-garbage-collector-in-python">Is it possible to disable the garbage collector in Python?</h3>



<p>It is possible to disable the second algorithm, the generational garbage collector, but it is not possible to disable the reference count algorithm. </p>



<p>Below there are a few methods from the gc module that can help you.</p>



<pre class="wp-block-preformatted has-cyan-bluish-gray-background-color has-background"><strong>&gt;&gt;&gt; import gc
&gt;&gt;&gt; gc.isenabled()
</strong>True
<strong>&gt;&gt;&gt; gc.disable()
&gt;&gt;&gt; gc.isenabled()
</strong>False</pre>



<p>Disabling Python&#8217;s generational garbage collector will not reduce the memory use of your application, because Python generally doesn’t release memory back to the underlying operating system.</p>



<p>In case you want to dive deep into disabling the garbage collector, I recommend taking a look at the post from the <a href="https://instagram-engineering.com/dismissing-python-garbage-collection-at-instagram-4dca40b29172" target="_blank" rel="noreferrer noopener">Instagram Engineering team</a>. They ran some experiments with the garbage collector and discovered some side effects of the <strong>disable()</strong> method with some third-party libraries.</p>



<h2 class="wp-block-heading" id="conclusion">Conclusion</h2>



<p>Python uses two strategies for memory management: reference counting, and a generational garbage collector for cyclic references. The second one is optional and can be disabled. You can inspect the reference count of objects, change the thresholds of the generations, and a few more things. I recommend taking a look at the <a href="https://docs.python.org/3/library/gc.html">gc module</a>, the <a href="https://docs.python.org/3/library/sys.html">sys module</a>, or the garbage collector <a href="https://devguide.python.org/garbage_collector/">design documentation</a>.</p>



]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>How to fix a network connection error in a Droplet</title>
		<link>https://jenaiz.com/2022/01/how-to-fix-a-network-connection-error-in-a-droplet/</link>
		
		<dc:creator><![CDATA[jenaiz]]></dc:creator>
		<pubDate>Mon, 31 Jan 2022 09:51:24 +0000</pubDate>
				<category><![CDATA[linux]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[digital ocean]]></category>
		<category><![CDATA[droplet]]></category>
		<category><![CDATA[server]]></category>
		<guid isPermaLink="false">https://jenaiz.com/?p=550</guid>

					<description><![CDATA[<p>After losing the network connection in a Droplet, here are some notes about how to restore the system.</p>
<p>The post <a href="https://jenaiz.com/2022/01/how-to-fix-a-network-connection-error-in-a-droplet/">How to fix a network connection error in a Droplet</a> appeared first on <a href="https://jenaiz.com">jenaiz.com - I think.</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<p>Last Friday, my <a href="https://docs.digitalocean.com/products/droplets/" target="_blank" rel="noreferrer noopener">Droplet</a> lost the connection to the internet after a reboot. I tried to connect multiple times with SSH and it didn’t work. I checked the monitoring in DigitalOcean but I wasn’t able to see what was going on. Then, after a few tries and a reset of the root password, I was able to log into the droplet using the Recovery Console.&nbsp;</p>



<p>There was no internet connection: the <strong>eth0</strong> network interface was gone from the list of network interfaces, and I discovered that some network-related packages were missing.</p>



<p>I am writing this post from a post-mortem perspective. I have not reproduced the problem to verify that every step is exactly right, so this is more a “how I solved it, and I hope it helps you” than a step-by-step recipe. Believe it or not, after hours of trying things, the solution was easier than expected.</p>



<hr class="wp-block-separator"/>



<h2 class="wp-block-heading" id="what-was-happening-what-was-the-error-that-you-would-see"><strong>What was happening? What was the error that you would see?</strong></h2>



<p>The droplet lost its connection to the network and was not responding to any request. The public IP was still configured in DigitalOcean, but neither my domains nor the IP address responded. Identifying the scenario wasn’t a big deal; you don’t need to be an expert sysadmin to notice it.</p>



<p>The only way to connect to my droplet was the “Recovery Console”; nothing else worked. So I reset my root password and logged into the droplet. The console only works with a US keyboard layout and it was really slow, so typing, copying and pasting, or editing a file was a pain. Besides that, when I tried to change the keyboard layout, I found out that some programs were missing.</p>



<h2 class="wp-block-heading" id="how-did-it-happen"><strong>How did it&nbsp;happen?</strong></h2>



<p>My best guess is that I accidentally deleted some packages, or they got deleted some other way. Most people who reported the same problem described an apt-get purge that accidentally removed the network tools. Others said the issue appeared after a reboot because some packages had been deleted, without ever finding out what deleted them.</p>



<p>I thought about the last things I had done in the droplet and remembered upgrading Python from version 3.8 to 3.9 and cleaning up other versions. I remembered it because I had some trouble with Apache, mod_wsgi, pip, and a Django application not picking up the correct version of Python. That story is for another post, but I did fix the installation in the end, and I remember running an <strong>apt-get purge</strong>. I guess that was the root of my Friday nightmare.</p>



<p>Anyway, the issue only appeared after the machine rebooted; I didn&#8217;t notice anything before that.</p>



<h2 class="wp-block-heading" id="why-was-it-so-painful-and-why-did-i-decide-to-write-it-down"><strong>Why was it so painful? And why did I decide to write it&nbsp;down?</strong></h2>



<p>I thought I had lost all the data in the Droplet. There was no easy way to copy the data out without first fixing the network interface. Then I realized that most of the tools that could help me were missing, which made it more difficult. All the posts that I was able to find solved the problem only partially for me, because the final step always required a tool that was missing from my system.</p>



<blockquote class="wp-block-quote is-layout-flow wp-block-quote-is-layout-flow"><p>Missing tools: netplan, networking, ifconfig, cloud-init,&nbsp;…</p></blockquote>



<p>The only tool available to me was <a href="https://www.cyberciti.biz/faq/linux-ip-command-examples-usage-syntax/" target="_blank" rel="noreferrer noopener">ip</a>.</p>
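


<p>If you land in a similar situation, a quick way to see which tools are still present is to ask the shell for each one. This is only a sketch and not something taken from the original recovery session: the <strong>missing_tools</strong> function name is made up for illustration, and “networking” is left out of the list because it is a service rather than a command.</p>



<pre class="wp-block-code"><code lang="bash" class="language-bash"># Print every command from the argument list that is NOT available in the PATH.
# "missing_tools" is a hypothetical helper name, not a standard tool.
missing_tools() {
    for tool in "$@"; do
        # "command -v" prints the path of a command, or nothing if it is not found.
        if [ -z "$(command -v "$tool")" ]; then
            echo "$tool"
        fi
    done
}

# The tool names come from the list of missing tools above.
missing_tools netplan ifconfig cloud-init ip</code></pre>



<p>It only relies on shell built-ins, so it should still work on a system where most packages are gone.</p>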



<hr class="wp-block-separator is-style-dots"/>



<h2 class="wp-block-heading" id="nothing-worked"><strong>Nothing worked…</strong></h2>



<p>The first thing was to discover what was happening. I checked that the network interface <strong>eth0</strong> was not showing up, and I found <a href="https://www.digitalocean.com/community/questions/the-network-configuration-is-not-working-on-a-droplet-after-reboot" target="_blank" rel="noreferrer noopener">a post</a> in the forums where they described an issue similar to mine.</p>



<pre class="wp-block-code"><code lang="bash" class="language-bash">sudo ip link set eth0 up</code></pre>



<p><strong>Cannot find device “eth0”</strong></p>



<pre class="wp-block-code"><code lang="bash" class="language-bash">dmesg | grep -i eth</code></pre>



<p>It gave me messages similar to these:</p>



<p><strong>virtio_net virtio0 ens3: renamed from eth0<br>virtio_net virtio0 eth0: renamed from ens3</strong></p>



<p>It is the same thing that another person described <a href="https://www.digitalocean.com/community/questions/no-internet-connection-after-droplet-reboot" target="_blank" rel="noreferrer noopener">here</a>.</p>



<p>My system is an Ubuntu Server 20.04 LTS, so the details in that question could help me set up the eth0 interface, but bringing the interface up that way didn’t fix the issue. Later, I compared it with a working setup and some details were different.&nbsp;</p>



<p>In all cases, the solutions described only worked temporarily, because the changes disappeared after the next reboot.&nbsp;</p>



<p>All the solutions that I tried had to be translated to use the “ip” tool, since it looks like “ifconfig” stopped being a default package a long time ago.</p>



<p>One thing that I tried and failed:</p>



<pre class="wp-block-code"><code lang="bash" class="language-bash">ip a add &lt;PUBLIC_IP_ADDRESS&gt;/&lt;NETMASK&gt; dev eth0
ip link set dev eth0 up</code></pre>



<p>I also modified the <strong>/etc/network/interfaces</strong> file, adding the below information:</p>



<pre class="wp-block-code"><code lang="" class="">iface eth0 inet static
     address &lt;PUBLIC_IP_ADDRESS&gt;
     netmask &lt;NETMASK&gt;
     gateway &lt;GATEWAY&gt;</code></pre>



<p>but I wasn’t able to execute:</p>



<pre class="wp-block-code"><code lang="bash" class="language-bash">sudo systemctl restart networking.service</code></pre>



<p>so I wasn’t able to apply the configuration. That is when I discovered that “networking” was not installed.</p>



<p>Nothing worked for me.&nbsp;</p>



<p>Something strange was that the configuration files <strong>/etc/udev/rules.d/70-persistent-net.rules</strong> and <strong>/etc/netplan/50-cloud-init.yaml</strong> were there and well configured.</p>



<hr class="wp-block-separator is-style-dots"/>



<h2 class="wp-block-heading" id="how-did-i-finally-solve-it"><strong>How did I finally solve&nbsp;it?</strong></h2>



<p>At some point in the long process, I found something interesting in the DigitalOcean <a href="https://docs.digitalocean.com/products/droplets/resources/recovery-iso/">documentation</a>: you can boot your Droplet from a Recovery ISO and then access your Droplet’s hard disk. That was the key to finding the solution.</p>



<p>I followed the process to boot the Droplet from the Recovery ISO. Then I connected to the Droplet via SSH and started to work remotely.&nbsp;</p>



<p>I added a <a href="https://stackoverflow.com/a/54460886/211149" target="_blank" rel="noreferrer noopener">nameserver</a> by editing the following file:</p>



<pre class="wp-block-code"><code lang="bash" class="language-bash">sudo vim /etc/resolv.conf</code></pre>



<p>I used one of Google’s well-known DNS servers:</p>



<pre class="wp-block-code"><code lang="vim" class="language-vim">nameserver 8.8.8.8</code></pre>



<p>And then, I mounted my droplet hard drive, as described on <a href="https://stackoverflow.com/a/54460886/211149" rel="noreferrer noopener" target="_blank">StackOverflow</a>:</p>



<pre class="wp-block-code"><code lang="bash" class="language-bash">sudo mount --bind /dev /&lt;chrootlocation&gt;/dev
sudo mount --bind /proc /&lt;chrootlocation&gt;/proc
sudo mount --bind /sys /&lt;chrootlocation&gt;/sys
sudo cp /etc/resolv.conf /&lt;chrootlocation&gt;/etc/resolv.conf
sudo chroot /&lt;chrootlocation&gt;</code></pre>



<p>In my case &lt;chrootlocation&gt; was “mnt”.</p>
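


<p>The same preparation steps can be wrapped in a small function that first prints what it would run. This is a sketch under assumptions, not the exact procedure used here: the names <strong>run</strong>, <strong>prepare_chroot</strong> and <strong>DRY_RUN</strong> are made up for illustration, and it assumes the Droplet disk is already mounted at the location you pass in.</p>



<pre class="wp-block-code"><code lang="bash" class="language-bash"># Run a command, or only print it while DRY_RUN is 1 (the default here).
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Bind-mount /dev, /proc and /sys into the mounted root and copy the DNS config.
# Assumes the Droplet disk is already mounted at the path given as first argument.
prepare_chroot() {
    root="$1"
    for dir in dev proc sys; do
        run mount --bind "/$dir" "$root/$dir"
    done
    run cp /etc/resolv.conf "$root/etc/resolv.conf"
}

# With the default DRY_RUN=1 this only prints the four commands for /mnt.
prepare_chroot /mnt</code></pre>



<p>With DRY_RUN=0, and running as root, the commands are executed instead of printed; the chroot itself is still the last manual step.</p>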



<p>After that, I updated the package lists and upgraded the installed packages with apt-get:</p>



<pre class="wp-block-code"><code lang="bash" class="language-bash">apt-get update
apt-get upgrade</code></pre>



<p>I am not sure if I would recommend the latter (the upgrade).</p>



<p>After that, I decided to install all the tools that I had found missing and that could help me fix the issue:</p>



<pre class="wp-block-code"><code lang="bash" class="language-bash">apt-get install netplan cloud-init ufw landscape-common</code></pre>



<p>When I felt confident enough, I stopped the Droplet, removed the Recovery ISO, and set it to boot from my hard drive again.</p>



<p>When the droplet started up, the network was restored and all was working normally. I was able to connect via SSH and my domains were working as before.</p>



<p>My thought here is that the configuration files were properly configured for all the tools, but the tools were missing. When the tools were restored, the system started to work again.</p>



<h2 class="wp-block-heading" id="lessons-learned">Lessons learned</h2>



<p>Back up your server often. This is a cheap &#8220;production&#8221; server, where I have my blog and some Python applications running, and my strong recommendation is not to have this kind of setup. </p>



<p>One server with all the stuff is not a good idea. I do it because it is an easy way to play with things, but from time to time these funny stories happen.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>MongoDB IO performance</title>
		<link>https://jenaiz.com/2015/12/mongodb-io-performance/</link>
		
		<dc:creator><![CDATA[jenaiz]]></dc:creator>
		<pubDate>Tue, 01 Dec 2015 14:21:51 +0000</pubDate>
				<category><![CDATA[NoSQL]]></category>
		<category><![CDATA[mongodb]]></category>
		<category><![CDATA[nosql]]></category>
		<category><![CDATA[performance]]></category>
		<guid isPermaLink="false">http://jenaiz.com/?p=214</guid>

					<description><![CDATA[<p>It was not the first time that we saw the problem, and it was not the first time that we could not figure out what was</p>
<p>The post <a href="https://jenaiz.com/2015/12/mongodb-io-performance/">MongoDB IO performance</a> appeared first on <a href="https://jenaiz.com">jenaiz.com - I think.</a>.</p>
]]></description>
										<content:encoded><![CDATA[<p>It was not the first time that we saw the problem, and it was not the first time that we could not figure out what was causing the error. But today, I want to share the experience.</p>
<p>A little bit of history: our production system is a <a href="https://docs.mongodb.org/manual/replication/">replica set</a> of three nodes. When the issue happens, our system responds slowly. In that situation it feels as if the website is <em>down</em>, because some requests get no answer for quite some time.</p>
<p>The system is easy to fix, because after restarting the affected node everything is up and running normally again. The problem is that you have to delete all the data in the affected replica and start a synchronisation from scratch. The first time it happened on the master node; the last time it happened on a slave node (a different machine).</p>
<p>One good thing is that the last time it happened we were testing <a href="http://www.newrelic.com">New Relic</a>, so we got some new information: not much, but something more.&nbsp;Here are some pictures that I extracted from our New Relic monitoring:</p>
<p>CPU usage grows a little at the peaks:</p>
<p><a href="https://jenaiz.com/wp-content/uploads/2015/11/Screen-Shot-2015-11-27-at-15.50.56.png"><img loading="lazy" decoding="async" class="wp-image-217 size-full aligncenter" src="https://jenaiz.com/wp-content/uploads/2015/11/Screen-Shot-2015-11-27-at-15.50.56.png" alt="Screen Shot 2015-11-27 at 15.50.56" width="757" height="185" srcset="https://jenaiz.com/wp-content/uploads/2015/11/Screen-Shot-2015-11-27-at-15.50.56.png 757w, https://jenaiz.com/wp-content/uploads/2015/11/Screen-Shot-2015-11-27-at-15.50.56-300x73.png 300w" sizes="auto, (max-width: 757px) 100vw, 757px" /></a></p>
<p>&nbsp;</p>
<p><a href="https://jenaiz.com/wp-content/uploads/2015/11/Screen-Shot-2015-11-27-at-15.51.06.png"><img loading="lazy" decoding="async" class="wp-image-218 size-full aligncenter" src="https://jenaiz.com/wp-content/uploads/2015/11/Screen-Shot-2015-11-27-at-15.51.06.png" alt="Screen Shot 2015-11-27 at 15.51.06" width="536" height="191" srcset="https://jenaiz.com/wp-content/uploads/2015/11/Screen-Shot-2015-11-27-at-15.51.06.png 536w, https://jenaiz.com/wp-content/uploads/2015/11/Screen-Shot-2015-11-27-at-15.51.06-300x107.png 300w" sizes="auto, (max-width: 536px) 100vw, 536px" /></a></p>
<p>But the real problem is the disk IO (not the network IO, as you can see in the pictures):</p>
<p><a href="https://jenaiz.com/wp-content/uploads/2015/11/Screen-Shot-2015-11-27-at-15.50.46.png"><img loading="lazy" decoding="async" class="wp-image-216 size-full aligncenter" src="https://jenaiz.com/wp-content/uploads/2015/11/Screen-Shot-2015-11-27-at-15.50.46.png" alt="Screen Shot 2015-11-27 at 15.50.46" width="768" height="203" srcset="https://jenaiz.com/wp-content/uploads/2015/11/Screen-Shot-2015-11-27-at-15.50.46.png 768w, https://jenaiz.com/wp-content/uploads/2015/11/Screen-Shot-2015-11-27-at-15.50.46-300x79.png 300w" sizes="auto, (max-width: 768px) 100vw, 768px" /></a></p>
<p>Here is a better graph of the IO problem. You can see that writes increased enormously during that time, using 100% of the disk IO.</p>
<p><a href="https://jenaiz.com/wp-content/uploads/2015/11/Screen-Shot-2015-11-27-at-15.50.15.png"><img loading="lazy" decoding="async" class="aligncenter wp-image-215 size-large" src="https://jenaiz.com/wp-content/uploads/2015/11/Screen-Shot-2015-11-27-at-15.50.15-1024x634.png" alt="Screen Shot 2015-11-27 at 15.50.15" width="440" height="272" srcset="https://jenaiz.com/wp-content/uploads/2015/11/Screen-Shot-2015-11-27-at-15.50.15-1024x634.png 1024w, https://jenaiz.com/wp-content/uploads/2015/11/Screen-Shot-2015-11-27-at-15.50.15-300x186.png 300w, https://jenaiz.com/wp-content/uploads/2015/11/Screen-Shot-2015-11-27-at-15.50.15.png 1324w" sizes="auto, (max-width: 440px) 100vw, 440px" /></a></p>
<p>We don&#8217;t know the real reason at the time of writing this post. There is, however, an <a href="https://jira.mongodb.org/browse/SERVER-2771">old issue</a>&nbsp;(fixed in 2.5.0) describing this particular problem when mongod&nbsp;flushes the working memory to disk.</p>
<p>I would like to write down everything we know at this point. Maybe it will help us find a solution at some point and clarify our <strong>&#8220;flushing data&#8221;</strong> theory:</p>
<ul>
<li>There were&nbsp;not many users on the website at that time. As far as we can see, neither the traffic nor its consequences are causing the issue.</li>
<li>The issue appeared when the system was not doing much.&nbsp;It is not the quietest time for our&nbsp;MongoDB infrastructure, but there is not much going on at that particular time.</li>
<li>The number of connections grew from ~150 to 2000.</li>
<li>The response times of the queries grew to ~20s or more.</li>
<li>If you graph the response times of the collections, there are log entries about &#8220;flushing data&#8221; around the slow periods (we use <a href="https://docs.mongodb.org/manual/core/mmapv1/">MMAPv1</a>):</li>
</ul>
<blockquote><p><em>2015-11-22T10:07:45.645+0100 I STORAGE [DataFileSync] flushing mmaps took 426819ms for 615 files</em><br />
<em>2015-11-22T11:13:31.755+0100 I STORAGE [DataFileSync] flushing mmaps took 429132ms for 615 files</em></p></blockquote>
<ul>
<li>The node keeps working as far as&nbsp;the master is concerned, but it is really slow for queries. This blocks the website, because the driver doesn&#8217;t see the node as down and keeps sending requests to it.</li>
<li>One of the web nodes was using all its heap memory during&nbsp;that period, and it went back to normal after we fixed the problem with the node.</li>
<li>A bad thing about the monitoring is that the instrumentation currently doesn&#8217;t work for our version of MongoDB. That is why I cannot connect the web nodes with the database&nbsp;and add more information about the issue.</li>
</ul>
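<p>The flush durations in log entries like the ones quoted above can be pulled out with a small shell function, which makes it easier to see how long each flush took. This is just a sketch: the <strong>flush_minutes</strong> name is made up, and it assumes the log format shown above (&#8220;flushing mmaps took &lt;N&gt;ms&#8221;).</p>
<pre class="wp-block-code"><code lang="bash" class="language-bash"># Print the duration, in minutes, of every "flushing mmaps" entry.
# Reads the log files given as arguments, or standard input if there are none.
# "flush_minutes" is a hypothetical helper name.
flush_minutes() {
    sed -n 's/.*flushing mmaps took \([0-9][0-9]*\)ms.*/\1/p' "$@" |
        awk '{ printf "%.1f\n", $1 / 60000 }'
}

# Example with the two entries quoted above.
printf '%s\n' \
    '2015-11-22T10:07:45.645+0100 I STORAGE [DataFileSync] flushing mmaps took 426819ms for 615 files' \
    '2015-11-22T11:13:31.755+0100 I STORAGE [DataFileSync] flushing mmaps took 429132ms for 615 files' |
    flush_minutes</code></pre>
<p>Both entries come out at a bit more than seven minutes per flush. To run it against a real log, pass the path of the mongod log file as an argument.</p>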
<p>We have tested the performance of the hard disk from the Linux side and from the mongo side, and we didn&#8217;t find any clue about what is wrong. The results were fine in both cases.</p>
<p>For the time being, we have switched the storage engine from <a href="https://docs.mongodb.org/manual/core/mmapv1/">MMAPv1</a> to <a href="https://docs.mongodb.org/manual/core/wiredtiger/">WiredTiger</a> to see if it behaves differently. In theory it should not flush data from memory to disk in that way.</p>
<p>If you&nbsp;have any suggestions, we have posted the same information in the&nbsp;<a href="https://groups.google.com/forum/?utm_medium=email&amp;utm_source=footer#!msg/mongodb-user/8I0Vbx-a8eU/-bTcjYtoGwAJ">MongoDB user group</a>, or you can write a comment here&nbsp;:).</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>A programming language for AI</title>
		<link>https://jenaiz.com/2015/11/a-progamming-language-for-ai/</link>
		
		<dc:creator><![CDATA[jenaiz]]></dc:creator>
		<pubDate>Sun, 22 Nov 2015 09:27:01 +0000</pubDate>
				<category><![CDATA[artificial intelligence]]></category>
		<category><![CDATA[machine learning]]></category>
		<category><![CDATA[deep learning]]></category>
		<category><![CDATA[ideas]]></category>
		<category><![CDATA[knowledge base]]></category>
		<guid isPermaLink="false">http://jenaiz.com/?p=204</guid>

					<description><![CDATA[<p>I am curious which programming language is more useful for Artificial Intelligence. &#8220;Choose the language that you are more proficient in&#8221;, it is not an option</p>
<p>The post <a href="https://jenaiz.com/2015/11/a-progamming-language-for-ai/">A programming language for AI</a> appeared first on <a href="https://jenaiz.com">jenaiz.com - I think.</a>.</p>
]]></description>
										<content:encoded><![CDATA[<p>I am curious which programming language is most useful for Artificial Intelligence. <em>&#8220;Choose the language that you are most proficient in&#8221;</em> is not an option for me. Choosing the right tool for the right problem is better in my case.</p>
<p>I looked on Quora and found some results. <a href="https://www.quora.com/What-is-best-programming-language-for-Artificial-Intelligence-projects">&#8220;What is best programming language for Artificial Intelligence projects?&#8221;</a> is one of the most interesting questions, and I read the answers there. The conclusion among them is: Python (because it is fast to develop with and has interesting libraries), C/C++ (because of speed and performance), or Java.</p>
<p>Searching on Google, I found a tutorial written by Günter Neumann, from the German Research Center for Artificial Intelligence, entitled <a href="http://www.dfki.de/~neumann/publications/new-ps/ai-pgr.pdf">Programming Languages in Artificial Intelligence</a>. In the tutorial, you can read why functional programming languages and symbolic languages are more useful for AI, followed by an introduction to Lisp and a shorter section on Prolog.</p>
<p>It is a simple introduction to Lisp, but I couldn&#8217;t help remembering <a href="https://github.com/jenaiz/micro-lisp">µlisp</a> (a small Lisp interpreter that I wrote, based on the book <a href="http://buildyourownlisp.com/">Build Your Own Lisp</a>). I built a simple version of Lisp in C. At that point there were no standard libraries, so you had to build them yourself, and I wondered whether at that point you could start to create a language that helps you represent the world.</p>
<p>As always, that was a crazy idea: create a programming language that experts consider useful for artificial intelligence, and build the standard libraries to represent the part of the world that the system has to work with. The further you go with the idea, the clearer it becomes that you cannot represent the complete real world with that approach, but&#8230; could you somehow combine implementing part of the world in the programming language with learning part of it from experience? That was my thought; maybe there is a way to do it.</p>
<p>That train of thought led me to start reading the new <a href="http://goodfeli.github.io/dlbook/">Deep Learning</a> book by Ian Goodfellow, Aaron Courville and Yoshua Bengio. In the introduction, there is a reference to Cyc (<a href="http://dl.acm.org/citation.cfm?id=575523">Lenat and Guha, 1989</a>) and the <em>knowledge base</em> approach.</p>
<blockquote><p>A computer can reason about statements in these formal languages automatically using logical inference rules. This is known as the <em>knowledge base</em> approach to artificial intelligence. None of these projects has lead to a major success. One of the most famous such projects is Cyc (Lenat and Guha, 1989)  <em>[<a href="http://goodfeli.github.io/dlbook/version-2015-11-18/contents/intro.html">extracted from the draft</a>]</em></p></blockquote>
<p>I still think that it could work, because my approach is not to write every single rule of the world in the programming language, but rather to have a base written in a language prepared for that specific problem, like a DSL, but going further and <em>without any limitation from the language itself</em>. Either way, it is just an idea. I will continue reading the book by <a href="http://www.iro.umontreal.ca/~bengioy/yoshua_en/index.html">Yoshua Bengio</a> about deep learning, which looks really promising, and I will take a look at the <a href="http://www.jfsowa.com/pubs/CycRev93.pdf">review</a> of the Lenat &amp; Guha book; maybe I can figure out more.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Deep learning in a large scale distributed system</title>
		<link>https://jenaiz.com/2015/06/deep-learning-in-a-large-scale-distributed-systems/</link>
		
		<dc:creator><![CDATA[jenaiz]]></dc:creator>
		<pubDate>Sun, 14 Jun 2015 10:39:42 +0000</pubDate>
				<category><![CDATA[machine learning]]></category>
		<category><![CDATA[papers]]></category>
		<category><![CDATA[deep learning]]></category>
		<category><![CDATA[DistBelief]]></category>
		<category><![CDATA[distributed systems]]></category>
		<category><![CDATA[google]]></category>
		<guid isPermaLink="false">http://jenaiz.com/?p=137</guid>

					<description><![CDATA[<p>Deep learning is interesting in many ways. But when you consider to do it in thousands of cores that can process millions of parameters, then the problem</p>
<p>The post <a href="https://jenaiz.com/2015/06/deep-learning-in-a-large-scale-distributed-systems/">Deep learning in a large scale distributed system</a> appeared first on <a href="https://jenaiz.com">jenaiz.com - I think.</a>.</p>
]]></description>
										<content:encoded><![CDATA[<p>Deep learning is interesting in many ways. But when you consider doing it on thousands of cores that can process millions of parameters, the problem becomes both more interesting and more complex.</p>
<p><figure style="width: 980px" class="wp-caption aligncenter"><img loading="lazy" decoding="async" src="http://ep00.epimg.net/tecnologia/imagenes/2015/02/27/actualidad/1425053335_288538_1425142647_noticia_grande.jpg" alt="" width="980" height="653" /><figcaption class="wp-caption-text">Google Datacenter (via Google)</figcaption></figure></p>
<p>Google ran an interesting experiment: training a deep network with millions of parameters on thousands of CPUs. The goal was to train on very large datasets without limiting the form of the model.</p>
<p>The <a title="Large Scale Distributed Deep Networks" href="http://research.google.com/archive/large_deep_networks_nips2012.html" target="_blank" rel="noopener noreferrer">paper</a> describes the use of DistBelief, a framework created for distributed parallel computing applied to deep learning training. Some of the things that the framework manages by itself are:</p>
<blockquote><p>The framework automatically parallelises computation in each machine using all available cores, and manages communication, synchronisation and data transfer between machines during both training and inference.</p></blockquote>
<p>I couldn&#8217;t find much information about it beyond what is written in the paper.</p>
<p>They applied two algorithms: <a href="http://en.wikipedia.org/wiki/Stochastic_gradient_descent" target="_blank" rel="noopener noreferrer">SGD</a> (Stochastic Gradient Descent) and <a href="http://en.wikipedia.org/wiki/Limited-memory_BFGS" target="_blank" rel="noopener noreferrer">L-BFGS</a>. These algorithms usually work well, but they don&#8217;t scale to very large data sets. That is why the authors introduce some modifications to them. The paper gives more details about the optimisations in both algorithms, which you may find interesting.</p>
<p>I found the idea of distributed parallel computing applied to such algorithms on very large datasets really interesting.</p>
<p>You can read <a href="http://research.google.com/archive/large_deep_networks_nips2012.html" target="_blank" rel="noopener noreferrer">&#8220;Large Scale Distributed Deep Networks&#8221;</a> online or, if you prefer, the <a href="http://static.googleusercontent.com/media/research.google.com/en//archive/large_deep_networks_nips2012.pdf" target="_blank" rel="noopener noreferrer">pdf version</a>. Have fun!</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Classifying documents using Apache Mahout</title>
		<link>https://jenaiz.com/2015/02/classifying-documents-using-apache-mahout/</link>
		
		<dc:creator><![CDATA[jenaiz]]></dc:creator>
		<pubDate>Sat, 21 Feb 2015 18:35:20 +0000</pubDate>
				<category><![CDATA[machine learning]]></category>
		<category><![CDATA[apache lucene]]></category>
		<category><![CDATA[apache mahout]]></category>
		<category><![CDATA[text classification]]></category>
		<guid isPermaLink="false">http://jenaiz.com/?p=121</guid>

					<description><![CDATA[<p>I wondered how to do some text classification with Java and Apache Mahout. Isabel Drost-Fromm gave a talk at the LuceneSolrRevolution Conference (Dublin &#8211; 2013) where she</p>
<p>The post <a href="https://jenaiz.com/2015/02/classifying-documents-using-apache-mahout/">Classifying documents using Apache Mahout</a> appeared first on <a href="https://jenaiz.com">jenaiz.com - I think.</a>.</p>
]]></description>
										<content:encoded><![CDATA[<p>I wondered how to do some text classification with Java and <a href="http://mahout.apache.org/" target="_blank" rel="noopener noreferrer">Apache Mahout</a>. <a href="http://drost-fromm.de/" target="_blank" rel="noopener noreferrer">Isabel Drost-Fromm</a> gave a talk at the LuceneSolrRevolution Conference (Dublin &#8211; 2013) about this topic and how Apache Mahout and Lucene can help you.</p>
<p>It is a good introduction to the topic. I really enjoyed what was presented in the talk.</p>
<p>Lucene, Mahout, and Hadoop (only a little bit) sound really great for a talk about how to do text classification.</p>
<p>The general process to classify documents follows these steps:</p>
<blockquote><p>HTML &gt;&gt; <strong>Apache Tika</strong></p>
<p>Fulltext &gt;&gt; <strong>Lucene Analyzer</strong></p>
<p>Tokenstream &gt;&gt; <strong>FeatureVectorEncoder</strong></p>
<p>Vector &gt;&gt; <strong>Online Learner</strong></p></blockquote>
<p>Of course, Isabel advised reusing the libraries that you already have at hand, looking inside at the algorithms they use, and improving them if you need to. As a first approach, it is really good for me to see how things work.</p>
<p>Mahout is a great library for machine learning. It used MapReduce to integrate with Hadoop (v1.0), although in April 2014 the project decided to move forward:</p>
<blockquote><p>The Mahout community decided to move its codebase onto modern data processing systems that offer a richer programming model and more efficient execution than Hadoop MapReduce. (You can read that on their website.)</p></blockquote>
<p>At the end of the video, there is a recommendation for everyone to participate in the project: fixing bugs, writing documentation, and reporting bugs. There are always a lot of things to do in open-source projects. If you are using the libraries and are interested in the project, I recommend subscribing to the mailing lists.</p>
<p><iframe loading="lazy" title="Text Classification Powered by Apache Mahout and Lucene, Isabel Drost-Fromm, ASF/Nokia Gate 5" width="640" height="360" src="https://www.youtube.com/embed/tA9YMlafUyw?feature=oembed" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen></iframe></p>
<p>I really recommend watching the video if you are interested in the field; it is a good talk about a good topic. You can take a look at the <a href="http://www.slideshare.net/lucenerevolution/lucene-mahout-drostfromm-copy">slides</a> too.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Two interesting books to start with Machine Learning</title>
		<link>https://jenaiz.com/2015/02/two-interesting-books-to-start-with-machine-learning/</link>
		
		<dc:creator><![CDATA[jenaiz]]></dc:creator>
		<pubDate>Wed, 18 Feb 2015 10:00:03 +0000</pubDate>
				<category><![CDATA[books]]></category>
		<category><![CDATA[machine learning]]></category>
		<category><![CDATA[pattern recognition]]></category>
		<guid isPermaLink="false">http://jenaiz.com/?p=89</guid>

					<description><![CDATA[<p>There are a lot of books in the field of Machine Learning, just a fast search in&#160;Amazon&#160;gives you more than 25,000 books. I wanted to</p>
<p>The post <a href="https://jenaiz.com/2015/02/two-interesting-books-to-start-with-machine-learning/">Two interesting books to start with Machine Learning</a> appeared first on <a href="https://jenaiz.com">jenaiz.com - I think.</a>.</p>
]]></description>
										<content:encoded><![CDATA[<p>There are a lot of books in the field of Machine Learning; a quick search on&nbsp;<a title="Machine learning books" href="http://www.amazon.com/s/ref=nb_sb_noss_1?url=search-alias%3Daps&amp;field-keywords=machine+learning" target="_blank" rel="noopener noreferrer">Amazon</a>&nbsp;gives you more than 25,000 books. I wanted to filter all those books and choose the most useful ones. I searched on Google and Quora and read some posts that I found around the internet. There are a lot of people giving lists of 10 &#8211; 20 books about machine learning, statistical learning, reinforcement learning&#8230; I just wanted to find two interesting books to get into&nbsp;the field.</p>
<p>With these books, it is possible to&nbsp;learn the general aspects of the topic and later go deeper into the parts that sound more interesting.</p>
<p>&nbsp;</p>
<hr>
<p>&nbsp;</p>
<p><a href="https://jenaiz.com/wp-content/uploads/2015/02/Machine-Learning.jpg"><img loading="lazy" decoding="async" class="wp-image-200 size-full alignleft" src="https://jenaiz.com/wp-content/uploads/2015/02/Machine-Learning.jpg" alt="" width="160" height="160" srcset="https://jenaiz.com/wp-content/uploads/2015/02/Machine-Learning.jpg 160w, https://jenaiz.com/wp-content/uploads/2015/02/Machine-Learning-150x150.jpg 150w" sizes="auto, (max-width: 160px) 100vw, 160px" /></a></p>
<p><a href="http://www.amazon.com/gp/product/0070428077/ref=as_li_tl?ie=UTF8&amp;camp=1789&amp;creative=9325&amp;creativeASIN=0070428077&amp;linkCode=as2&amp;tag=smethecod-20&amp;linkId=EJH5JJ6IWKSXZ4AE" rel="nofollow">Machine Learning</a><img loading="lazy" decoding="async" style="border: none !important; margin: 0px !important;" src="http://ir-na.amazon-adsystem.com/e/ir?t=smethecod-20&amp;l=as2&amp;o=1&amp;a=0070428077" alt="" width="1" height="1" border="0"></p>
<p>The <strong>&#8220;book&#8221;</strong> that everyone recommends as a good starting point, written by <a href="http://www.cs.cmu.edu/~tom/" target="_blank" rel="noopener noreferrer">Tom M. Mitchell</a>&nbsp;(professor at Carnegie Mellon University).</p>
<p>This is an introductory book for the field. You don&#8217;t need any previous knowledge of Machine Learning.</p>
<p>Some topics that you will find in the book: decision tree learning, artificial neural networks, Bayesian learning, computational learning theory, genetic algorithms, reinforcement learning and more.</p>
<p>&nbsp;</p>
<hr>
<p>&nbsp;</p>
<p><a href="https://jenaiz.com/wp-content/uploads/2015/02/pattern-recognition.jpg"><img loading="lazy" decoding="async" class="wp-image-201 size-full alignright" src="https://jenaiz.com/wp-content/uploads/2015/02/pattern-recognition.jpg" alt="" width="160" height="160" srcset="https://jenaiz.com/wp-content/uploads/2015/02/pattern-recognition.jpg 160w, https://jenaiz.com/wp-content/uploads/2015/02/pattern-recognition-150x150.jpg 150w" sizes="auto, (max-width: 160px) 100vw, 160px" /></a><a href="http://www.amazon.com/gp/product/0387310738/ref=as_li_qf_sp_asin_il_tl?ie=UTF8&amp;camp=1789&amp;creative=9325&amp;creativeASIN=0387310738&amp;linkCode=as2&amp;tag=smethecod-20&amp;linkId=CF3H4IESKKYZO3JX" target="_blank" rel="noopener noreferrer">Pattern Recognition and Machine Learning (Information Science and Statistics)</a></p>
<p>The author is <a href="http://research.microsoft.com/en-us/um/people/cmbishop/" target="_blank" rel="noopener noreferrer">Christopher M. Bishop</a>,&nbsp;a Distinguished Scientist at Microsoft Research Cambridge, where he leads the Machine Learning and Perception group.</p>
<p>This book will give you a really good introduction to the algorithms commonly used in Machine Learning.</p>
<p>&nbsp;</p>
<hr>
<p>&nbsp;</p>
<p>Both books are theoretical and will give you a good introduction. Of course, there are many other books in the area, some of them more practical, some about statistical learning&#8230; But I think it is good to have a simple starting point.</p>
<p>I have started with Tom M. Mitchell&#8217;s book. I will give you my impression when I have finished it.</p>
<p>The post <a href="https://jenaiz.com/2015/02/two-interesting-books-to-start-with-machine-learning/">Two interesting books to start with Machine Learning</a> appeared first on <a href="https://jenaiz.com">jenaiz.com - I think.</a>.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>The agile samurai</title>
		<link>https://jenaiz.com/2014/10/the-agile-samurai/</link>
		
		<dc:creator><![CDATA[jenaiz]]></dc:creator>
		<pubDate>Tue, 21 Oct 2014 08:00:13 +0000</pubDate>
				<category><![CDATA[books]]></category>
		<category><![CDATA[agile]]></category>
		<category><![CDATA[book]]></category>
		<category><![CDATA[project management]]></category>
		<guid isPermaLink="false">http://jenaiz.com/?p=7</guid>

					<description><![CDATA[<p>When you have done so many projects from scratch, using legacy code&#8230; for different types of customers: banks, telecoms, retail, &#8230; in different types of companies</p>
<p>The post <a href="https://jenaiz.com/2014/10/the-agile-samurai/">The agile samurai</a> appeared first on <a href="https://jenaiz.com">jenaiz.com - I think.</a>.</p>
]]></description>
										<content:encoded><![CDATA[<p>When you have done so many projects from scratch, using legacy code&#8230; for different types of customers: banks, telecoms, retail, &#8230; in different types of companies (big, small, startups&#8230;), you always move to the next project thinking that you will do better the next time. But how?</p>
<p>For me, Scrum changed how things could be done better. This <a href="https://pragprog.com/book/jtrap/the-agile-samurai" target="_blank">book</a> is about Agile, but I personally feel that the two are closely connected. The names of the meetings or events are different, but how they are organized looks similar.</p>
<p>The book is about how to execute your projects in a way that makes your customer feel more confident about the job that you are doing. It is not only about <strong>agile</strong>; it is about how to execute projects so that we can deal with changes and still have quality, get immediate feedback about the current status, and be ready for production from the beginning.</p>
<p>Not all customers are the same, not all product owners are the same, not all companies are the same; in conclusion, not all the XXX are the same.</p>
<p>I like the idea of the Inception Deck. It is really good to have everyone on the team working on the big picture as a way to start. It works like a mirror in which everybody sees how the project looks to them and how things are going to be. After that you can start, and change things later if you need to.</p>
<p>In general, the book is good for getting a feel for how it could be if you organize the project in an agile way: what the problems are going to be; how you could engage the customer/product owner; how the team should work; how testing and continuous deployment should work; how transparent the status of the project is going to be; how you deal with changes from the beginning&#8230; A lot of things together in a few pages :).</p>
<p>To know more, you should <a href="https://pragprog.com/book/jtrap/the-agile-samurai" target="_blank">read it</a>.</p>
<p>Thanks to <a href="https://twitter.com/cdiezgil" target="_blank">Carlos Díez</a> for lending me the book.</p>
<p>The post <a href="https://jenaiz.com/2014/10/the-agile-samurai/">The agile samurai</a> appeared first on <a href="https://jenaiz.com">jenaiz.com - I think.</a>.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Re-launch</title>
		<link>https://jenaiz.com/2014/10/re-launch/</link>
		
		<dc:creator><![CDATA[jenaiz]]></dc:creator>
		<pubDate>Mon, 06 Oct 2014 08:00:51 +0000</pubDate>
				<category><![CDATA[opinion]]></category>
		<category><![CDATA[hello]]></category>
		<guid isPermaLink="false">http://jenaiz.com/?p=4</guid>

					<description><![CDATA[<p>So many times, I find myself writing a list of articles to read, writing notes about them or about books&#8230; but I do not always write</p>
<p>The post <a href="https://jenaiz.com/2014/10/re-launch/">Re-launch</a> appeared first on <a href="https://jenaiz.com">jenaiz.com - I think.</a>.</p>
]]></description>
										<content:encoded><![CDATA[<p>So many times, I find myself writing a list of articles to read, writing notes about them or about books&#8230; but I do not always write those notes in the same place. I decided it would be good to put all of those notes together in one place.</p>
<p>Some time ago, I used to write on <a href="http://smellthecode.tumblr.com" target="_blank">Tumblr</a>, but it has not been updated in a long time.</p>
<p>I needed to put all those interesting notes, comments, ideas, and investigations together. I think the evolution of those ideas, the experiences, the articles read&#8230; each of those could be interesting to share &amp; collect here.</p>
<p>The post <a href="https://jenaiz.com/2014/10/re-launch/">Re-launch</a> appeared first on <a href="https://jenaiz.com">jenaiz.com - I think.</a>.</p>
]]></content:encoded>
					
		
		
			</item>
	</channel>
</rss>
