<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:media="http://search.yahoo.com/mrss/"><channel><title><![CDATA[Helyx.org]]></title><description><![CDATA[Helyx.org helps you to master languages (Rust, Go & Python), container orchestration (Kubernetes), data and cloud providers (AWS & GCP) by providing tips & tricks and best practices]]></description><link>https://helyx.org/</link><image><url>https://helyx.org/favicon.png</url><title>Helyx.org</title><link>https://helyx.org/</link></image><generator>Ghost 5.36</generator><lastBuildDate>Fri, 17 Mar 2023 21:19:49 GMT</lastBuildDate><atom:link href="https://helyx.org/rss/" rel="self" type="application/rss+xml"/><ttl>60</ttl><item><title><![CDATA[Generate S3 presigned URLs with Boto3]]></title><description><![CDATA[Boto3, the AWS SDK for Python, allows to interact with Amazon S3 service and generate presigned URLs.]]></description><link>https://helyx.org/generate-s3-presigned-urls-with-boto3/</link><guid isPermaLink="false">5ee2a48a04a06e04d218ede7</guid><category><![CDATA[AWS]]></category><category><![CDATA[S3]]></category><dc:creator><![CDATA[Alexis Kinsella]]></dc:creator><pubDate>Fri, 01 Jan 2021 15:01:21 GMT</pubDate><media:content url="https://helyx.org/content/images/2021/01/bucket.jpg" medium="image"/><content:encoded><![CDATA[<img src="https://helyx.org/content/images/2021/01/bucket.jpg" alt="Generate S3 presigned URLs with Boto3"><p>AWS SDKs are available for different languages. However Python is a language of choice to write serverless code and process data. 
For this reason, we will use it to showcase how to create a presigned URL for an S3 object.</p><blockquote>Whenever you want to give a restricted audience access to a specific object stored in Amazon S3 without configuring a set of permissions associated with an access key, presigned URLs are the tool to use.</blockquote><p>You don&apos;t want to create a specific user and an associated access key each time you want to make a restricted resource accessible. It is not manageable. </p><p>Also, you don&apos;t want to make the resource public, as you want to keep control over who is accessing your restricted resource.</p><p>For this reason, a good option is to use presigned URLs. You can automate their creation in your code, and you can set them to expire after the period of your choice.</p><p>The downside is that you cannot revoke an individual presigned URL once it has been issued. To cut off access before it expires, you will have to remove the resource from its location.</p><p>Also, you cannot prevent a URL from being shared. It means that the use of presigned URLs must be compatible with your needs. For example, if you need to generate links to some content only for a limited period of time, it will be a great fit.</p><h3 id="create-a-presigned-url-with-boto3">Create a presigned URL with Boto3</h3><p><a href="https://boto3.amazonaws.com/v1/documentation/api/latest/index.html?ref=helyx-org">Boto3</a>, the AWS SDK for Python, allows you to interact with the Amazon S3 service and generate presigned URLs. The example below generates a presigned URL for an object (here, a JSON file) stored in a bucket under some prefix, with a validity period of 1 day:</p><figure class="kg-card kg-code-card"><pre><code class="language-python">import boto3
from botocore.client import Config

# Get the service client with sigv4 configured
s3 = boto3.client(&apos;s3&apos;, config=Config(signature_version=&apos;s3v4&apos;))

# Generate the URL to get &apos;some-path/file-to-access.json&apos; from &apos;my-bucket&apos;
# URL expires in 86400 seconds (one day)
url = s3.generate_presigned_url(
    ClientMethod=&apos;get_object&apos;,
    Params={
        &apos;Bucket&apos;: &apos;my-bucket&apos;,
        &apos;Key&apos;: &apos;some-path/file-to-access.json&apos;
    },
    ExpiresIn=86400
)

print(url)
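
# Hedged follow-up sketch (not part of the original script): a SigV4 presigned
# URL carries its own signing date and validity period, so its nominal expiry
# can be recomputed from the query string alone. presigned_url_expiry is a
# hypothetical helper name; it only uses the standard library and assumes a
# SigV4 URL. With temporary credentials, access may stop earlier than this.
from datetime import datetime, timedelta, timezone
from urllib.parse import parse_qs, urlparse


def presigned_url_expiry(presigned_url):
    """Return the UTC datetime at which a SigV4 presigned URL nominally expires."""
    query = parse_qs(urlparse(presigned_url).query)
    # X-Amz-Date is the signing time, e.g. 20210101T134645Z
    signed_at = datetime.strptime(query["X-Amz-Date"][0], "%Y%m%dT%H%M%SZ")
    # X-Amz-Expires is the validity period in seconds
    validity = timedelta(seconds=int(query["X-Amz-Expires"][0]))
    return signed_at.replace(tzinfo=timezone.utc) + validity


# Usage with the URL generated above: print(presigned_url_expiry(url))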
</code></pre><figcaption>Generate a presigned URL with Boto3</figcaption></figure><p>Assuming you save that script into a Python file named <code>generate_presigned_url.py</code>, you will be able to run it with the following command:</p><pre><code class="language-bash">AWS_PROFILE=my_profile python generate_presigned_url.py</code></pre><p>Here, we are using an already configured AWS profile called <code>my_profile</code>.</p><p>The result will look something like this: </p><pre><code class="language-bash">https://my-bucket.s3.amazonaws.com/some-path/file-to-access.json?X-Amz-Algorithm=AWS4-HMAC-SHA256&amp;X-Amz-Credential=AKIAYSWZVJ32FGBAXBNB%2F20210101%2Feu-west-1%2Fs3%2Faws4_request&amp;X-Amz-Date=20210101T134645Z&amp;X-Amz-Expires=86400&amp;X-Amz-SignedHeaders=host&amp;X-Amz-Signature=92e1ff94b2eb8541d9f4b1ea058c02d69717383fa39daec91b5bb31e2f90f4d4</code></pre><p>Looking at the result, we can observe that the URL points to the configured bucket &amp; key, and that query string parameters were generated: </p><ul><li>The <strong>algorithm </strong>used: X-Amz-Algorithm=AWS4-HMAC-SHA256</li><li>The <strong>credential</strong> scope, made of the access key ID followed by the date, region, service and request type: X-Amz-Credential=AKIAYSWZVJ32FGBAXBNB%2F20210101%2Feu-west-1%2Fs3%2Faws4_request</li><li>The <strong>generation date </strong>in ISO 8601 basic format: X-Amz-Date=20210101T134645Z</li><li>The <strong>validity period</strong> in seconds: X-Amz-Expires=86400</li><li>The <strong>headers</strong> used for the signature: X-Amz-SignedHeaders=host</li><li>and finally, the <strong>signature</strong>, which makes it possible to check that the URL has not been modified: X-Amz-Signature=92e1ff94b2eb8541d9f4b1ea058c02d69717383fa39daec91b5bb31e2f90f4d4</li></ul><p>To improve the usability of the script, it might be a good idea to parse command line arguments and use them to configure the Boto3 call that generates the presigned URL:</p><figure class="kg-card kg-code-card"><pre><code class="language-python">import boto3
from botocore.client import Config
import argparse
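
# Hedged sketch (not in the original script): an optional argparse type that
# rejects validity periods outside the range of 1 second to 7 days, the upper
# bound for presigned URLs signed with AWS Signature Version 4.
# MAX_EXPIRES_IN and expires_in_seconds are illustrative names; to use the
# check, pass type=expires_in_seconds instead of type=int for expires_in.
MAX_EXPIRES_IN = 7 * 24 * 60 * 60  # 604800 seconds


def expires_in_seconds(value):
    """Convert a command line value to a validity period in seconds."""
    try:
        seconds = int(value)
    except ValueError:
        raise argparse.ArgumentTypeError("expires_in must be an integer number of seconds") from None
    if seconds not in range(1, MAX_EXPIRES_IN + 1):
        raise argparse.ArgumentTypeError("expires_in must be between 1 and 604800 seconds")
    return seconds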

parser = argparse.ArgumentParser(&quot;generate_signed_url&quot;)
parser.add_argument(&quot;bucket&quot;, help=&quot;S3 Bucket&quot;, type=str)
parser.add_argument(&quot;key&quot;, help=&quot;S3 key&quot;, type=str)
parser.add_argument(&quot;expires_in&quot;, help=&quot;Expire in&quot;, type=int)
args = parser.parse_args()


# Get the service client with sigv4 configured
s3 = boto3.client(&apos;s3&apos;, config=Config(signature_version=&apos;s3v4&apos;))

# Generate the URL to get the object args.key from the bucket args.bucket
# URL expires after args.expires_in seconds
url = s3.generate_presigned_url(
    ClientMethod=&apos;get_object&apos;,
    Params={
        &apos;Bucket&apos;: args.bucket,
        &apos;Key&apos;: args.key
    },
    ExpiresIn=args.expires_in
)

print(url)</code></pre><figcaption>Generate a presigned URL with Boto3 with command line arguments</figcaption></figure><p>The previous script can be used by passing the bucket, the key and the validity period on the command line:</p><pre><code class="language-bash">AWS_PROFILE=my_profile python generate_signed_url.py my_bucket my_prefix/my_object.json 3600</code></pre><h3 id="s3v4-signatures">S3v4 signatures</h3><p>The previous examples have been configured to use S3v4 signatures to generate presigned URLs. Calling the <code>generate_presigned_url</code> function without configuring the Boto3 client to use S3v4 signatures may result in a different signature format: </p><pre><code class="language-bash">https://s3.eu-west-1.amazonaws.com/my_bucket/my_prefix/my_object.json?AWSAccessKeyId=AKIAYSWZVJ32FGBAXBNB&amp;Signature=LYlMYi2LMr4dQK4ivSGVUiF5Yqo%3D&amp;Expires=1609513255</code></pre><p>This detail might not seem important. However, if you try to provide access to a file encrypted with an AWS KMS managed key, the presigned URL will only work if AWS Signature Version 4 is configured on the Boto3 client. Using another signature format will result in the following error: </p><pre><code class="language-xml">&lt;Error&gt;
&lt;Code&gt;InvalidArgument&lt;/Code&gt;
&lt;Message&gt;Requests specifying Server Side Encryption with AWS KMS managed keys require AWS Signature Version 4.&lt;/Message&gt;
&lt;ArgumentName&gt;Authorization&lt;/ArgumentName&gt;
&lt;ArgumentValue&gt;null&lt;/ArgumentValue&gt;
&lt;RequestId&gt;80149C77623B9D45&lt;/RequestId&gt;
&lt;HostId&gt;HZBvZonRGHTPRI51YQYQZIuRqsclhb1RddrM2F7jbvKMVTUBbfhEq9N9HJhj4sRngjTRlbrxYyi=&lt;/HostId&gt;
&lt;/Error&gt;</code></pre><h3 id="temporary-credentials">Temporary credentials</h3><p>As presigned URLs inherit the permissions of the IAM principal that signs them, a URL signed with temporary credentials, for example an STS session of 1 hour, stops working as soon as those credentials expire, even if you set the expiration to 1 day. In the given example, the presigned URL would become invalid after 1 hour. </p><h3 id="presigned-urls-limitations">Presigned URLs limitations</h3><p>The maximum validity period depends on the credentials used to create the presigned URL:</p><ul><li><em>IAM instance profile</em> (valid up to 6 hours)</li><li><em>AWS Security Token Service</em> (valid up to 36 hours)</li><li><em>IAM user</em> (valid up to 7 days with AWS Signature Version 4).</li></ul><h3 id="presigned-urls-for-file-upload">Presigned URLs for file upload</h3><p>Presigned URLs can be used in many situations to access resources already stored in S3. However, you should know that you can also use presigned URLs to upload objects to S3.</p><p>It is useful when you want your users or customers to be able to upload a specific object to your S3 storage without giving them AWS security credentials.</p><p>As presigned URLs inherit the permissions of the IAM principal that signs them, you should carefully design the associated permissions to avoid security issues. 
It is possible, for example, to limit use to specific network paths (with <code>aws:SourceIp</code>, <code>aws:SourceVpc</code> or <code>aws:SourceVpce</code> conditions in policy definitions).</p><h3 id="additional-resources">Additional resources</h3><p>You can find more detailed explanations in the AWS documentation, on sharing objects at this page: <a href="https://docs.aws.amazon.com/AmazonS3/latest/dev/ShareObjectPreSignedURL.html?ref=helyx-org">https://docs.aws.amazon.com/AmazonS3/latest/dev/ShareObjectPreSignedURL.html</a>, and on uploading objects here: <a href="https://docs.aws.amazon.com/AmazonS3/latest/dev/PresignedUrlUploadObject.html?ref=helyx-org">https://docs.aws.amazon.com/AmazonS3/latest/dev/PresignedUrlUploadObject.html</a>.</p>]]></content:encoded></item><item><title><![CDATA[Introduction to AWS CloudShell]]></title><description><![CDATA[AWS CloudShell is a new service aimed at facilitating interactions with AWS from the command line without having to install & configure a full set of tools]]></description><link>https://helyx.org/introduction-to-aws-cloud-shell/</link><guid isPermaLink="false">5fdf65eaaf778b04cce53ad5</guid><category><![CDATA[AWS]]></category><category><![CDATA[re:Invent]]></category><category><![CDATA[CloudShell]]></category><dc:creator><![CDATA[Alexis Kinsella]]></dc:creator><pubDate>Sat, 26 Dec 2020 21:22:32 GMT</pubDate><media:content url="https://helyx.org/content/images/2020/12/aws-cloud-shell.png" medium="image"/><content:encoded><![CDATA[<img src="https://helyx.org/content/images/2020/12/aws-cloud-shell.png" alt="Introduction to AWS CloudShell"><p>During the <strong>re:Invent 2020 Developer Keynote,</strong> presented by <strong>Dr. 
Werner Vogels</strong>, was introduced a new handy service named <strong>AWS CloudShell</strong>.</p><blockquote>AWS CloudShell is aimed at providing an AWS-enabled shell prompt in the browser that is simple and secure with as little friction as possible.</blockquote><p>AWS CloudShell is generally available in <strong>us-east-1</strong> (N. Virginia), <strong>us-east-2</strong> (Ohio), <strong>us-west-2</strong> (Oregon), <strong>ap-northeast-1</strong> (Tokyo), and <strong>eu-west-1</strong> (Ireland) at launch. </p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://helyx.org/content/images/2020/12/product-page-diagram-AWS-CloudShell@2x.png" class="kg-image" alt="Introduction to AWS CloudShell" loading="lazy" width="1580" height="730" srcset="https://helyx.org/content/images/size/w600/2020/12/product-page-diagram-AWS-CloudShell@2x.png 600w, https://helyx.org/content/images/size/w1000/2020/12/product-page-diagram-AWS-CloudShell@2x.png 1000w, https://helyx.org/content/images/2020/12/product-page-diagram-AWS-CloudShell@2x.png 1580w" sizes="(min-width: 720px) 720px"><figcaption>AWS CloudShell in a nutshell</figcaption></figure><p>By announcing this new service, AWS fills a gap that has been present for years, and where competition has been providing solutions for a long time, starting with <a href="https://cloud.google.com/shell?ref=helyx-org">GCP Cloud Shell</a>.</p><p>You can see on YouTube an introduction of the service during Werner Vogels Keynote:</p><figure class="kg-card kg-embed-card kg-card-hascaption"><iframe width="356" height="200" src="https://www.youtube.com/embed/jt-gV1YwmnI?start=1330&amp;feature=oembed" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe><figcaption>AWS CloudShell introduction by Werner Vogels</figcaption></figure><h3 id="accessing-aws-cloudshell">Accessing AWS CloudShell</h3><p>To access the AWS CloudShell, you just have to 
connect to the <a href="https://aws.amazon.com/console/?ref=helyx-org">AWS Console</a> and click to the icon available in top-right navigation menu.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://helyx.org/content/images/2020/12/aws-console-top-right-navigation-menu.png" class="kg-image" alt="Introduction to AWS CloudShell" loading="lazy" width="2000" height="1246" srcset="https://helyx.org/content/images/size/w600/2020/12/aws-console-top-right-navigation-menu.png 600w, https://helyx.org/content/images/size/w1000/2020/12/aws-console-top-right-navigation-menu.png 1000w, https://helyx.org/content/images/size/w1600/2020/12/aws-console-top-right-navigation-menu.png 1600w, https://helyx.org/content/images/size/w2400/2020/12/aws-console-top-right-navigation-menu.png 2400w" sizes="(min-width: 720px) 720px"><figcaption>AWS CloudShell button</figcaption></figure><p>By clicking on the icon, a new page will open to the AWS CloudShell home page and a new AWS CloudShell instance will start:</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://helyx.org/content/images/2020/12/aws-cloud-shell-instance.png" class="kg-image" alt="Introduction to AWS CloudShell" loading="lazy" width="2000" height="1250" srcset="https://helyx.org/content/images/size/w600/2020/12/aws-cloud-shell-instance.png 600w, https://helyx.org/content/images/size/w1000/2020/12/aws-cloud-shell-instance.png 1000w, https://helyx.org/content/images/size/w1600/2020/12/aws-cloud-shell-instance.png 1600w, https://helyx.org/content/images/size/w2400/2020/12/aws-cloud-shell-instance.png 2400w" sizes="(min-width: 720px) 720px"><figcaption>AWS CloudShell</figcaption></figure><p>The command-line provided has the <a href="https://aws.amazon.com/cli/?ref=helyx-org">AWS Command Line Interface (CLI)</a> (v2) installed and configured so that you can run AWS commands without requiring any additional setup or configuration.</p><p>The environment is providing 
<strong>pre-installed</strong> <em>Python</em> &amp; <em>Node</em> <strong>runtimes</strong> and <strong>tools</strong> such as <code>jq</code>.</p><p>AWS CloudShell is based on <a href="https://aws.amazon.com/amazon-linux-2/?ref=helyx-org">Amazon Linux 2</a>.</p><h3 id="shells">Shells</h3><p><strong>3 shells are pre-installed</strong>: <em>Bash</em>, which is the default shell, <em>Z Shell</em>, also known as zsh, which provides customization with themes and plugins, and <em>PowerShell</em>. </p><p>If you are a Microsoft user, the availability of PowerShell, built on top of Microsoft&apos;s .NET Common Language Runtime, will make you happy and will let you take advantage of its deep integration with .NET.</p><p><strong>The shell in use can be identified by the command prompt</strong>: <code>$</code> corresponds to Bash, <code>PS&gt;</code> corresponds to PowerShell and <code>%</code> corresponds to zsh.</p><p>The default user is <code>cloudshell-user</code>, which is not the default user that you will find on Amazon Linux EC2 instances (<code>ec2-user</code>). Scripts designed for EC2 may run into issues if they are not adapted to run on AWS CloudShell.</p><h3 id="additional-aws-command-line-interfaces-cli-">Additional AWS command line interfaces (CLI)</h3><p>In addition to the default AWS CLI, <strong>additional CLIs are provided pre-installed</strong>. This is handy, as installing one of them usually takes time: you first have to find the related instructions. Provided CLIs are:</p><ul><li>AWS Elastic Beanstalk CLI (<em>eb</em>),</li><li>Amazon Elastic Container Service (Amazon ECS) CLI (<em>ecs-cli</em>),</li><li>AWS SAM CLI (<em>sam</em>).</li></ul><p>It is always time-consuming to set up a shell when you want to interact with your account resources. 
Moreover, as you don&apos;t do this kind of installation every other day, you have to remember how to set up your tooling each time.</p><p>With AWS CloudShell, you always have a working environment at hand, without spending time installing tooling on a system that you don&apos;t own, whether you are on a Linux, Windows or Mac machine.</p><p>Also, you don&apos;t have much to worry about regarding the cleanup of the machine after use, as AWS CloudShell is accessed from the browser.</p><p>A simple history cleanup of the browser or accessing the service via private browsing should be enough <em>(given that the computer is not compromised)</em>.</p><h3 id="development-tools-and-shell-utilities">Development tools and shell utilities</h3><p><strong>Many tools and shell utilities are also pre-installed: </strong><code>git</code>, <code>iputils</code>, <code>jq</code>, <code>tmux</code>, <code>vim</code>, <code>wget</code>, or the CodeCommit utility for Git (<code>git-remote-codecommit</code>), which provides a simple method for pushing and pulling code from CodeCommit repositories by extending Git.</p><p>By default, <strong>AWS CloudShell users have sudo privileges</strong>. Therefore, <strong>it is possible to use the sudo command to install additional software</strong>. As AWS CloudShell is based on Amazon Linux 2, you will have to use <code>yum</code> to install software.</p><p>However, additional software has to be installed again in each session, as environments are recycled between sessions.</p><p><strong>It is possible to customize the initialization of AWS CloudShell sessions</strong> by customizing the <em>.bashrc</em>. 
In case of access loss to the session due to any error, it is still possible to delete the home directory <em>(Action is available from Action Menu)</em>.</p><p>In case of advanced customization needs, it can be preferable to rely on code versioning for example with Git.</p><p>Here is a full list of programs available in the <code>/usr/bin</code> directory:</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://helyx.org/content/images/2020/12/Screenshot-20201226-184052@2x.png" class="kg-image" alt="Introduction to AWS CloudShell" loading="lazy" width="2000" height="1250" srcset="https://helyx.org/content/images/size/w600/2020/12/Screenshot-20201226-184052@2x.png 600w, https://helyx.org/content/images/size/w1000/2020/12/Screenshot-20201226-184052@2x.png 1000w, https://helyx.org/content/images/size/w1600/2020/12/Screenshot-20201226-184052@2x.png 1600w, https://helyx.org/content/images/size/w2400/2020/12/Screenshot-20201226-184052@2x.png 2400w" sizes="(min-width: 720px) 720px"><figcaption>/usr/bin programs</figcaption></figure><p><code>amazon-linux-extras</code> command is available as part of the standard installation. It means that many additional software can be installed with ease.</p><p>For example, to install <code>java-openjdk11</code>, you just have to execute the following commands:</p><figure class="kg-card kg-code-card"><pre><code class="language-shell">sudo amazon-linux-extras enable java-openjdk11
sudo yum install java-11-openjdk</code></pre><figcaption>Install java-openjdk11</figcaption></figure><p>After installation, executing <code>java -version</code> will return the following result: </p><figure class="kg-card kg-code-card"><pre><code>openjdk version &quot;11.0.7&quot; 2020-04-14 LTS
OpenJDK Runtime Environment 18.9 (build 11.0.7+10-LTS)
OpenJDK 64-Bit Server VM 18.9 (build 11.0.7+10-LTS, mixed mode, sharing)</code></pre><figcaption>`java -version` information</figcaption></figure><h3 id="deleting-home-directory">Deleting home directory</h3><p>Deleting data stored in the home directory is permanent. It cannot be reversed, but it can be useful either to recover from an issue or to simply remove all data.</p><h3 id="limits-of-persistent-storage">Limits of persistent storage</h3><p>AWS CloudShell allows you to store <strong>1 GB</strong> of data in each region at no cost. <strong>Only data stored in the home directory <em>($HOME)</em> is persisted between 2 sessions</strong>. Data stored in other locations is automatically wiped at the end of a session.</p><p>Data is retained for a maximum of <strong>120 days</strong> after the end of the last session for a given region.</p><p>AWS CloudShell has been implemented using cryptographic keys provided by AWS KMS. The service generates and manages the cryptographic keys used for encrypting data.</p><h3 id="other-shell-limits">Other shell limits</h3><p>It is possible to run a maximum of <strong>10 shells</strong> at the same time for each region at no charge.</p><p>After <strong>20 to 30 minutes</strong> of inactivity the session will end. </p><p>Background processes do not count as activity. Only keyboard &amp; mouse interactions count as activity and extend sessions. However, <strong>there is a hard limit of 12 hours of activity</strong>. 
After this period of time, the session will automatically end.</p><p>When the session times out, it is possible to reconnect simply by clicking on the reconnect button.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://helyx.org/content/images/2020/12/Screenshot-20201226-182143@2x.png" class="kg-image" alt="Introduction to AWS CloudShell" loading="lazy" width="1234" height="442" srcset="https://helyx.org/content/images/size/w600/2020/12/Screenshot-20201226-182143@2x.png 600w, https://helyx.org/content/images/size/w1000/2020/12/Screenshot-20201226-182143@2x.png 1000w, https://helyx.org/content/images/2020/12/Screenshot-20201226-182143@2x.png 1234w" sizes="(min-width: 720px) 720px"><figcaption>Reconnect popup</figcaption></figure><h3 id="instance-metadata">Instance metadata</h3><p>It is worth noting that instance metadata are not available from AWS CloudShell as opposed to EC2 instances. Trying to call the magic URL results in the following error message: <em>&quot;curl: (7) Couldn&apos;t connect to server&quot;.</em></p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://helyx.org/content/images/2020/12/Screenshot-20201226-183508@2x.png" class="kg-image" alt="Introduction to AWS CloudShell" loading="lazy" width="2000" height="382" srcset="https://helyx.org/content/images/size/w600/2020/12/Screenshot-20201226-183508@2x.png 600w, https://helyx.org/content/images/size/w1000/2020/12/Screenshot-20201226-183508@2x.png 1000w, https://helyx.org/content/images/size/w1600/2020/12/Screenshot-20201226-183508@2x.png 1600w, https://helyx.org/content/images/size/w2400/2020/12/Screenshot-20201226-183508@2x.png 2400w" sizes="(min-width: 720px) 720px"><figcaption>Instance metadata</figcaption></figure><h3 id="network-access-data-transfer">Network Access &amp; Data Transfer</h3><p>AWS CloudShell session users can access the public internet, however it is not possible to reach inbound ports from outside. 
<strong>No public IP address is available</strong>.</p><p>As download &amp; upload can be slow, the preferred way to handle large files will be to use S3 storage from the command line interface.</p><p><strong>Download</strong> &amp; <strong>Upload</strong> features are accessible from the <strong>Action menu</strong>:</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://helyx.org/content/images/2020/12/Screenshot-20201226-170823@2x.png" class="kg-image" alt="Introduction to AWS CloudShell" loading="lazy" width="2000" height="1250" srcset="https://helyx.org/content/images/size/w600/2020/12/Screenshot-20201226-170823@2x.png 600w, https://helyx.org/content/images/size/w1000/2020/12/Screenshot-20201226-170823@2x.png 1000w, https://helyx.org/content/images/size/w1600/2020/12/Screenshot-20201226-170823@2x.png 1600w, https://helyx.org/content/images/size/w2400/2020/12/Screenshot-20201226-170823@2x.png 2400w" sizes="(min-width: 720px) 720px"><figcaption>Action Menu</figcaption></figure><h3 id="shell-layouts">Shell Layouts</h3><p>It is possible to split horizontally &amp; vertically the main window as well as to create tabs to organize efficiently the workspace.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://helyx.org/content/images/2020/12/Screenshot-20201226-171213@2x.png" class="kg-image" alt="Introduction to AWS CloudShell" loading="lazy" width="2000" height="1250" srcset="https://helyx.org/content/images/size/w600/2020/12/Screenshot-20201226-171213@2x.png 600w, https://helyx.org/content/images/size/w1000/2020/12/Screenshot-20201226-171213@2x.png 1000w, https://helyx.org/content/images/size/w1600/2020/12/Screenshot-20201226-171213@2x.png 1600w, https://helyx.org/content/images/size/w2400/2020/12/Screenshot-20201226-171213@2x.png 2400w" sizes="(min-width: 720px) 720px"><figcaption>Shell layout</figcaption></figure><p>In addition, as preference pane will give access to additional customization parameters such as font 
size or theme used:</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://helyx.org/content/images/2020/12/Screenshot-20201226-172424@2x.png" class="kg-image" alt="Introduction to AWS CloudShell" loading="lazy" width="1246" height="1082" srcset="https://helyx.org/content/images/size/w600/2020/12/Screenshot-20201226-172424@2x.png 600w, https://helyx.org/content/images/size/w1000/2020/12/Screenshot-20201226-172424@2x.png 1000w, https://helyx.org/content/images/2020/12/Screenshot-20201226-172424@2x.png 1246w" sizes="(min-width: 720px) 720px"><figcaption>AWS CloudShell Preferences</figcaption></figure><p>The <code>Enable Safe Paste</code> option available in the preference pane is a security feature that asks you to verify that the <em>multi-line text</em> you are about to paste does not contain malicious scripts.</p><h3 id="compute-environment-resources">Compute environment resources</h3><p>Each AWS CloudShell environment is assigned CPU &amp; memory resources. More specifically, 1 vCPU &amp; 2 GiB of RAM are provided for free.</p><p>It is worth noting that <strong>the AWS CloudShell service does not provide support for Docker</strong>.</p><p>Trying to install <code>docker</code> with <code>amazon-linux-extras</code> will fail. Executing the <code>docker ps</code> command returns the following error: </p><figure class="kg-card kg-code-card"><pre><code class="language-shell">Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?</code></pre><figcaption>`docker ps` command error</figcaption></figure><p>It should still be possible to configure the client to connect to a remote Docker daemon.</p><h3 id="security-compliance">Security &amp; compliance</h3><p>By default, <strong>AWS CloudShell automatically installs security patches for the system packages</strong>. 
It means that you don&apos;t have to worry about it.</p><p>Regarding compliance, AWS CloudShell is not in scope of any specific compliance programs.</p><p>If you are interested in monitoring the activity of the service, you can do so through the CloudTrail integration, which reports a number of events related either to the activity of the user in the console or to API interactions.</p><p>It is also possible to leverage EventBridge rules to react to AWS CloudShell events.</p><h3 id="permissions">Permissions</h3><p>When it comes to refining the permissions given to a specific user, IAM policies allow you to customize access to the level you expect. </p><p>The <code>AWSCloudShellFullAccess</code> managed policy grants permission to use AWS CloudShell with full access to all features.</p><p>However, it is also possible, as usual, to restrict permissions through custom policies.</p><p>The permission prefix for the AWS CloudShell service is <code>cloudshell</code>.</p><p>3 permissions specific to the service are available:</p><ul><li><code>cloudshell:CreateSession</code>, which allows starting a shell session</li><li><code>cloudshell:GetFileDownloadUrls</code>, which allows downloading files from the shell environment to a local machine</li><li><code>cloudshell:GetFileUploadUrls</code>, which allows uploading files from a local machine to the shell environment</li></ul><p>It is possible, for example, to restrict access to AWS CloudShell by blocking file uploads &amp; downloads in the shell environment with a policy such as the following:</p><figure class="kg-card kg-code-card"><pre><code class="language-json">{
    &quot;Version&quot;: &quot;2012-10-17&quot;,
    &quot;Statement&quot;: [{
        &quot;Sid&quot;: &quot;CloudShellUser&quot;,
        &quot;Effect&quot;: &quot;Allow&quot;,
        &quot;Action&quot;: [
            &quot;cloudshell:*&quot;
        ],
        &quot;Resource&quot;: &quot;*&quot;
    }, {
        &quot;Sid&quot;: &quot;DenyUploadDownload&quot;,
        &quot;Effect&quot;: &quot;Deny&quot;,
        &quot;Action&quot;: [
            &quot;cloudshell:GetFileDownloadUrls&quot;,
            &quot;cloudshell:GetFileUploadUrls&quot;
        ],
        &quot;Resource&quot;: &quot;*&quot;
    }]
}</code></pre><figcaption>Custom AWS CloudShell policy</figcaption></figure><p>The greatness of AWS CloudShell resides in inheritance of permissions from the user connected to AWS Console. <strong>AWS CloudShell assumes the identity of the connected user</strong>.</p><h3 id="pricing">Pricing</h3><p><strong>Users are not charged when using AWS CloudShell</strong>. It means that you don&apos;t have to worry about pricing. Also, there is no minimum fees or required upfront commitments. Only data transfer is billed at standard rates.</p><h3 id="aws-cloudshell-plugin-for-vscode">AWS CloudShell plugin for VSCode</h3><p>An unofficial plugin for VSCode has been built to integrate <strong>VSCode</strong> with AWS CloudShell. It will allow to open multiple AWS CloudShell terminals within VSCode on demand.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://helyx.org/content/images/2020/12/screenshot.png" class="kg-image" alt="Introduction to AWS CloudShell" loading="lazy" width="2000" height="1092" srcset="https://helyx.org/content/images/size/w600/2020/12/screenshot.png 600w, https://helyx.org/content/images/size/w1000/2020/12/screenshot.png 1000w, https://helyx.org/content/images/size/w1600/2020/12/screenshot.png 1600w, https://helyx.org/content/images/size/w2400/2020/12/screenshot.png 2400w" sizes="(min-width: 720px) 720px"><figcaption>AWS CloudShell plugin for VSCode</figcaption></figure><p>More information available on the GitHub page of the plugin: <a href="https://github.com/iann0036/vscode-aws-cloudshell?ref=helyx-org">https://github.com/iann0036/vscode-aws-cloudshell</a>.</p><p>To get it work, AWS CLI must be installed as well as the S<a href="https://docs.aws.amazon.com/systems-manager/latest/userguide/session-manager-working-with-install-plugin.html?ref=helyx-org">ession Manager plugin</a> for VSCode.</p><p>It is also required to configure properly an <em>AWS Profile</em> and configure VSCode plugin with it.</p><h3 
id="conclusion">Conclusion</h3><p>Sure, AWS CloudShell is not a technological revolution, but it fills a gap that remained open for a long time. The service still lacks some features compared to equivalent solutions available, for example, in GCP, but it is a first step in the right direction.</p><h3 id="useful-link">Useful links</h3><ul><li>Page of the service: <a href="http://aws.amazon.com/cloudshell?ref=helyx-org">https://aws.amazon.com/cloudshell</a></li><li>AWS Blog announcement article: <a href="https://aws.amazon.com/fr/blogs/aws/aws-cloudshell-command-line-access-to-aws-resources/?ref=helyx-org">https://aws.amazon.com/fr/blogs/aws/aws-cloudshell-command-line-access-to-aws-resources/</a></li></ul>]]></content:encoded></item><item><title><![CDATA[Pandas on AWS with AWS Data Wrangler]]></title><description><![CDATA[The GitHub page of the project describes the library as Pandas on AWS. Pandas is an open source data analysis and manipulation tool, built on top of the Python programming language. Pandas is designed to be fast, powerful, flexible and easy to use.]]></description><link>https://helyx.org/aws-data-wrangler/</link><guid isPermaLink="false">5ec830db04a06e04d218e7f2</guid><category><![CDATA[AWS]]></category><category><![CDATA[Pandas]]></category><category><![CDATA[Python]]></category><category><![CDATA[Data]]></category><category><![CDATA[S3]]></category><category><![CDATA[Athena]]></category><category><![CDATA[Glue]]></category><dc:creator><![CDATA[Alexis Kinsella]]></dc:creator><pubDate>Tue, 09 Jun 2020 21:55:36 GMT</pubDate><media:content url="https://helyx.org/content/images/2020/05/aws-data-wrangler-logo-3.png" medium="image"/><content:encoded><![CDATA[<img src="https://helyx.org/content/images/2020/05/aws-data-wrangler-logo-3.png" alt="Pandas on AWS with AWS Data Wrangler"><p>What is the AWS Data Wrangler library? 
The <a href="https://github.com/awslabs/aws-data-wrangler?ref=helyx-org">GitHub page</a> of the project describes the library as Pandas on AWS.</p><p>In case you have been living in a cave for a long time, <a href="https://pandas.pydata.org/?ref=helyx-org">Pandas</a> is an open source data analysis and manipulation tool, built on top of the <a href="https://www.python.org/?ref=helyx-org">Python</a> programming language. Pandas is designed to be fast, powerful, flexible and easy to use.</p><p>Positioning itself as <em>&#x201C;Pandas on AWS&#x201D;</em> immediately raises the bar.</p><p>It is a project hosted by the GitHub organization <a href="https://github.com/awslabs?ref=helyx-org">awslabs</a>. On the organization page, you can find a bunch of projects open sourced by AWS, with varying levels of adoption and maturity. The <a href="https://github.com/awslabs/s2n?ref=helyx-org">s2n</a> project, an implementation of the TLS/SSL protocols, is a good example of the mature projects available.</p><p>The <a href="https://github.com/awslabs/aws-data-wrangler?ref=helyx-org">AWS Data Wrangler</a> module represents, to date, more than 771 commits, 20 contributors, and 52 releases. Versions are currently released at a sustained pace, and the Python module is currently available in version <code>1.4.0</code>.</p><h2 id="installation">Installation</h2><p>There are two ways to install the module: either with <em><a href="https://pypi.org/project/pip/?ref=helyx-org">pip</a></em> or with <a href="https://docs.conda.io/en/latest/?ref=helyx-org"><em>Conda</em></a>. 
</p><h3 id="pip-install">Pip install</h3><p>To install the module with <em>pip</em>, you can use the following command: </p><pre><code class="language-shell">pip install awswrangler</code></pre><h3 id="conda-install">Conda install</h3><p>If you are a <em>Conda</em> user, instead, you can install the module with the following command:</p><pre><code class="language-shell">conda install -c conda-forge awswrangler</code></pre><h2 id="basic-usage">Basic usage</h2><p>Following the GitHub <em><a href="https://github.com/awslabs/aws-data-wrangler/blob/master/README.md?ref=helyx-org">readme</a></em> introduction, here is the way to create a basic DataFrame with Pandas:</p><pre><code class="language-python">import pandas as pd

df = pd.DataFrame({&quot;id&quot;: [1, 2], &quot;value&quot;: [&quot;foo&quot;, &quot;bar&quot;]})</code></pre><p>Then import the AWS Data Wrangler module:</p><pre><code class="language-python">import awswrangler as wr</code></pre><h3 id="write-data-to-amazon-s3">Write data to Amazon S3</h3><p>Now, let&apos;s serialize the <em>DataFrame</em> into a Parquet dataset stored in an S3 bucket:</p><pre><code class="language-python"># Storing data on Data Lake
wr.s3.to_parquet(
    df=df,
    path=&quot;s3://bucket/dataset/&quot;,
    dataset=True,
    database=&quot;my_db&quot;,
    table=&quot;my_table&quot;
)</code></pre><p>Easy! An <code>s3</code> variable at the root of the <em>AWS Data Wrangler</em> module gives access to functions that interact with <em>S3</em>, in this case to write the <em>DataFrame</em> to <em>S3</em>.</p><h3 id="read-data-from-amazon-s3">Read data from Amazon S3</h3><p>The reverse function is also available to read the data back from <em>S3</em>:</p><pre><code class="language-python"># Retrieving the data directly from Amazon S3
df = wr.s3.read_parquet(&quot;s3://bucket/dataset/&quot;, dataset=True)</code></pre><p>You may wonder what you can do with the <em>AWS Data Wrangler</em> package apart from interacting with <em>S3</em>. Let&apos;s take a quick tour of some of the library&apos;s features to discover its capabilities.</p><h2 id="definition">Definition</h2><p>Here is the definition of the library as given in the documentation: </p><blockquote>An <a href="https://github.com/awslabs/aws-data-wrangler?ref=helyx-org">open-source</a> Python package that extends the power of <a href="https://github.com/pandas-dev/pandas?ref=helyx-org">Pandas</a> library to AWS connecting <strong>DataFrames</strong> and AWS data related services (<strong>Amazon Redshift</strong>, <strong>AWS Glue</strong>, <strong>Amazon Athena</strong>, <strong>Amazon EMR</strong>, etc).<br><br>Built on top of other open-source projects like <a href="https://github.com/pandas-dev/pandas?ref=helyx-org">Pandas</a>, <a href="https://github.com/apache/arrow?ref=helyx-org">Apache Arrow</a>, <a href="https://github.com/boto/boto3?ref=helyx-org">Boto3</a>, <a href="https://github.com/dask/s3fs?ref=helyx-org">s3fs</a>, <a href="https://github.com/sqlalchemy/sqlalchemy?ref=helyx-org">SQLAlchemy</a>, <a href="https://github.com/psycopg/psycopg2?ref=helyx-org">Psycopg2</a> and <a href="https://github.com/PyMySQL/PyMySQL?ref=helyx-org">PyMySQL</a>, it offers abstracted functions to execute usual ETL tasks like load/unload data from <strong>Data Lakes</strong>, <strong>Data Warehouses</strong> and <strong>Databases</strong>.</blockquote><h2 id="supported-services">Supported services</h2><p>The aim of the library is to simplify interaction with the data across the supported AWS services. 
Basically, the AWS Data Wrangler library supports the following AWS services:</p><ul><li>Amazon S3</li><li>AWS Glue Catalog</li><li>Amazon Athena</li><li>Databases (Redshift, PostgreSQL &amp; MySQL)</li><li>EMR</li><li>CloudWatch Logs</li></ul><p>As the library extends the power of the <a href="https://github.com/pandas-dev/pandas?ref=helyx-org">Pandas</a> library to AWS by connecting <em>DataFrames</em> and AWS data-related services, most of the available operations that deal directly with loading or writing data rely on Pandas <em>DataFrames</em>. </p><h3 id="simplifying-interactions">Simplifying interactions</h3><p>However, the package is not only about loading and unloading data. It is also meant to simplify interactions with the services themselves.</p><p>The library provides, for example, functions to:</p><ul><li>Load and unload data for Redshift</li><li>Generate a Redshift copy manifest instead of having to write it yourself</li></ul><p>but also to:</p><ul><li>Simplify the creation of EMR clusters, or the definition and submission of steps.</li></ul><h4 id="interacting-with-aws-athena">Interacting with Amazon Athena</h4><p>Interacting with <em>Amazon Athena</em> can be cumbersome. To reduce the burden, the library provides functions that make things easier, for example to start a query, stop it, or wait for its completion.</p><p>The benefits do not stop at simplified <em>Amazon Athena</em> interactions. 
You will also find improvements in interacting with the <em>AWS Glue Data Catalog</em>, making code writing straightforward.</p><h4 id="aws-data-wrangler-as-default-way-to-interact">AWS Data Wrangler as default way to interact?</h4><p>Given all these improvements over the standard APIs, <strong>it should be a no-brainer to use it as your default way to interact with the supported services</strong> in a data processing context with Python.</p><p>Let&apos;s now go deeper with more detailed examples and notions around the <em>AWS Data Wrangler</em> package. To do that, let&apos;s start with sessions.</p><h3 id="sessions">Sessions</h3><p><em>AWS Data Wrangler</em> interacts with AWS services using a default <a href="https://boto3.amazonaws.com/v1/documentation/api/latest/reference/core/session.html?ref=helyx-org"><em>Boto3 Session</em></a>. That&apos;s why, most of the time, you won&apos;t have to provide any session information. However, if you need to customize the session the module is working with, it is possible to reconfigure the default <em>boto3</em> session:</p><pre><code class="language-python">import boto3

boto3.setup_default_session(region_name=&quot;eu-west-1&quot;)</code></pre><p>or even to instantiate a new <em>boto3</em> session and pass it as a named parameter to function calls:</p><pre><code class="language-python">session = boto3.Session(region_name=&quot;us-east-2&quot;)
wr.s3.does_object_exist(&quot;s3://foo/bar&quot;, boto3_session=session)</code></pre><h3 id="amazon-s3">Amazon S3</h3><p>As mentioned previously, an <code>s3</code> variable is available at the root of the AWS Data Wrangler module. It essentially lets you interact with the <em>Amazon S3</em> service to work with <em>CSV</em>, <em>JSON</em>, <em>Parquet</em> and <em>fixed-width formatted</em> files, and gives access to some handy functions purely related to file manipulation.</p><p>Let&apos;s first define two <em>DataFrames</em>: </p><pre><code class="language-python">import awswrangler as wr
import pandas as pd
import boto3

df1 = pd.DataFrame({
    &quot;id&quot;: [1, 2],
    &quot;name&quot;: [&quot;foo&quot;, &quot;boo&quot;]
})

df2 = pd.DataFrame({
    &quot;id&quot;: [3],
    &quot;name&quot;: [&quot;bar&quot;]
})</code></pre><p>With those two <em>DataFrames</em> created, they can simply be written to <em>S3</em> this way:</p><pre><code class="language-python">bucket = &quot;my-bucket&quot;

path1 = f&quot;s3://{bucket}/csv/file1.csv&quot;
path2 = f&quot;s3://{bucket}/csv/file2.csv&quot;

wr.s3.to_csv(df1, path1, index=False)
wr.s3.to_csv(df2, path2, index=False)</code></pre><p>The previously written files can then be read back in a similar fashion:</p><pre><code class="language-python">df1_bis = wr.s3.read_csv(path1)</code></pre><p><code>df1_bis</code> and <code>df1</code> should contain the exact same data.</p><p>Finally, it is also possible to read multiple CSV files at once by listing explicitly which files have to be read:</p><pre><code class="language-python">wr.s3.read_csv([path1, path2])</code></pre><p>Things can be made even easier by providing only the prefix to read data from:</p><pre><code class="language-python">wr.s3.read_csv(f&quot;s3://{bucket}/csv/&quot;)</code></pre><p>As seen in the previous examples, it is very easy to interact with <em>S3</em> without having to deal with code complexity or boilerplate. </p><h3 id="aws-glue-data-catalog">AWS Glue Data Catalog</h3><p>After this demo of the library interacting with <em>Amazon S3</em>, the next step is to interact directly with the AWS Glue Data Catalog. </p><p>To do so, just use the <code>catalog</code> variable of the module:</p><pre><code class="language-python">wr.catalog.databases()</code></pre><p>The previous command should return the list of databases as follows:</p><!--kg-card-begin: markdown--><table>
<thead>
<tr>
<th style="text-align:left"></th>
<th style="text-align:left">Database</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align:left">0</td>
<td style="text-align:left">awswrangler_test</td>
<td>AWS Data Wrangler Test Arena - Glue Database</td>
</tr>
<tr>
<td style="text-align:left">1</td>
<td style="text-align:left">default</td>
<td>Default Hive database</td>
</tr>
<tr>
<td style="text-align:left">2</td>
<td style="text-align:left">sampledb</td>
<td>Sample database</td>
</tr>
</tbody>
</table>
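<p>For comparison, a rough equivalent of the listing above with plain <em>boto3</em> could look like the sketch below. It assumes the Glue <code>get_databases</code> paginator and the <code>DatabaseList</code>, <code>Name</code> and <code>Description</code> response fields. The helper names <code>flatten_databases</code> and <code>list_databases</code> are made up for this illustration, and <em>boto3</em> is imported lazily so that the flattening logic can be tried without AWS credentials.</p>

```python
from typing import Any, Dict, Iterable, List, Optional


def flatten_databases(pages: Iterable[Dict[str, Any]]) -> List[Dict[str, str]]:
    """Turn paginated Glue get_databases responses into simple rows."""
    rows = []
    for page in pages:
        for database in page.get("DatabaseList", []):
            rows.append({
                "Database": database["Name"],
                # Description is optional in the Glue response.
                "Description": database.get("Description", ""),
            })
    return rows


def list_databases(session: Optional[Any] = None) -> List[Dict[str, str]]:
    """Hypothetical helper: list Glue databases with plain boto3."""
    import boto3  # imported here so flatten_databases works without boto3

    client = (session or boto3).client("glue")
    paginator = client.get_paginator("get_databases")
    return flatten_databases(paginator.paginate())


# Offline check with a hand-written page shaped like a Glue response.
sample_pages = [{"DatabaseList": [
    {"Name": "default", "Description": "Default Hive database"},
    {"Name": "sampledb"},
]}]
print(flatten_databases(sample_pages))
```

<p>With valid credentials, calling <code>list_databases()</code> would return the same kind of rows for the real catalog, at the cost of handling pagination and optional fields by hand.</p>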
<!--kg-card-end: markdown--><p>Getting the same result with the <em>boto3</em> API directly is not that simple. Listing the tables available in a specific database is just as easy:</p><pre><code class="language-python">wr.catalog.tables(database=&quot;awswrangler_test&quot;)</code></pre><p>The command should return the following result:</p><!--kg-card-begin: markdown--><table>
<thead>
<tr>
<th style="text-align:left"></th>
<th style="text-align:left">Database</th>
<th style="text-align:left">Table</th>
<th style="text-align:left">Description</th>
<th style="text-align:left">Columns</th>
<th>Partitions</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align:left">0</td>
<td style="text-align:left">awswrangler_test</td>
<td style="text-align:left">lambda</td>
<td style="text-align:left"></td>
<td style="text-align:left">col1, col2</td>
<td></td>
</tr>
<tr>
<td style="text-align:left">1</td>
<td style="text-align:left">awswrangler_test</td>
<td style="text-align:left">noaa</td>
<td style="text-align:left"></td>
<td style="text-align:left">id, dt, element, value, m_flag, q_flag, s_flag...</td>
<td></td>
</tr>
</tbody>
</table>
<!--kg-card-end: markdown--><p>Now, to get table details, meaning column information, you just need to call the <code>table()</code> function on the <code>catalog</code> variable:</p><pre><code class="language-python">wr.catalog.table(database=&quot;awswrangler_test&quot;, table=&quot;boston&quot;)</code></pre><p>The command should return the following field list:</p><!--kg-card-begin: markdown--><table>
<thead>
<tr>
<th style="text-align:left"></th>
<th style="text-align:left">Column Name</th>
<th style="text-align:left">Type</th>
<th style="text-align:left">Partition</th>
<th>Comment</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align:left">0</td>
<td style="text-align:left">crim</td>
<td style="text-align:left">double</td>
<td style="text-align:left">False</td>
<td>per capita crime rate by town</td>
</tr>
<tr>
<td style="text-align:left">1</td>
<td style="text-align:left">zn</td>
<td style="text-align:left">double</td>
<td style="text-align:left">False</td>
<td>proportion of residential land zoned for lots ...</td>
</tr>
<tr>
<td style="text-align:left">2</td>
<td style="text-align:left">indus</td>
<td style="text-align:left">double</td>
<td style="text-align:left">False</td>
<td>proportion of non-retail business acres per town</td>
</tr>
<tr>
<td style="text-align:left">3</td>
<td style="text-align:left">chas</td>
<td style="text-align:left">double</td>
<td style="text-align:left">False</td>
<td>Charles River dummy variable (= 1 if tract bou...</td>
</tr>
<tr>
<td style="text-align:left">4</td>
<td style="text-align:left">nox</td>
<td style="text-align:left">double</td>
<td style="text-align:left">False</td>
<td>nitric oxides concentration (parts per 10 mill...</td>
</tr>
<tr>
<td style="text-align:left">5</td>
<td style="text-align:left">rm</td>
<td style="text-align:left">double</td>
<td style="text-align:left">False</td>
<td>average number of rooms per dwelling</td>
</tr>
<tr>
<td style="text-align:left">6</td>
<td style="text-align:left">age</td>
<td style="text-align:left">double</td>
<td style="text-align:left">False</td>
<td>proportion of owner-occupied units built prior...</td>
</tr>
<tr>
<td style="text-align:left">7</td>
<td style="text-align:left">dis</td>
<td style="text-align:left">double</td>
<td style="text-align:left">False</td>
<td>weighted distances to five Boston employment c...</td>
</tr>
<tr>
<td style="text-align:left">8</td>
<td style="text-align:left">rad</td>
<td style="text-align:left">double</td>
<td style="text-align:left">False</td>
<td>index of accessibility to radial highways</td>
</tr>
<tr>
<td style="text-align:left">9</td>
<td style="text-align:left">tax</td>
<td style="text-align:left">double</td>
<td style="text-align:left">False</td>
<td>full-value property-tax rate per $10,000</td>
</tr>
<tr>
<td style="text-align:left">10</td>
<td style="text-align:left">ptratio</td>
<td style="text-align:left">double</td>
<td style="text-align:left">False</td>
<td>pupil-teacher ratio by town</td>
</tr>
<tr>
<td style="text-align:left">11</td>
<td style="text-align:left">b</td>
<td style="text-align:left">double</td>
<td style="text-align:left">False</td>
<td>1000(Bk - 0.63)^2 where Bk is the proportion o...</td>
</tr>
<tr>
<td style="text-align:left">12</td>
<td style="text-align:left">lstat</td>
<td style="text-align:left">double</td>
<td style="text-align:left">False</td>
<td>lower status of the population</td>
</tr>
<tr>
<td style="text-align:left">13</td>
<td style="text-align:left">target</td>
<td style="text-align:left">double</td>
<td style="text-align:left">False</td>
<td></td>
</tr>
</tbody>
</table>
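<p>To give an idea of what the <code>table()</code> call saves you from writing, here is a minimal sketch that flattens the <code>Table</code> part of a Glue <code>get_table</code> response into rows similar to the ones above. It assumes the <code>StorageDescriptor.Columns</code> and <code>PartitionKeys</code> response fields. The function name <code>describe_columns</code> and the sample data are hypothetical, and this is not the library&apos;s actual implementation.</p>

```python
from typing import Any, Dict, List


def describe_columns(table: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Flatten the "Table" part of a Glue get_table response into rows."""
    rows = []
    regular = table.get("StorageDescriptor", {}).get("Columns", [])
    partitions = table.get("PartitionKeys", [])
    for is_partition, columns in ((False, regular), (True, partitions)):
        for column in columns:
            rows.append({
                "Column Name": column["Name"],
                "Type": column.get("Type", ""),
                "Partition": is_partition,
                # Comments are optional, hence the empty default.
                "Comment": column.get("Comment", ""),
            })
    return rows


# Hand-written sample shaped like response["Table"]; with boto3 it would come
# from boto3.client("glue").get_table(DatabaseName=..., Name=...)["Table"].
sample_table = {
    "Name": "boston",
    "StorageDescriptor": {"Columns": [
        {"Name": "crim", "Type": "double", "Comment": "per capita crime rate by town"},
        {"Name": "target", "Type": "double"},
    ]},
    "PartitionKeys": [],
}
for row in describe_columns(sample_table):
    print(row)
```

<p>Regular columns come first and partition keys last, each flagged through the <code>Partition</code> field, which mirrors the layout of the table above.</p>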
<!--kg-card-end: markdown--><p>You may wonder, however, how to create a table, let&apos;s say in Parquet format. To do so, call the <code>to_parquet()</code> function on the <code>s3</code> variable with the required parameters:</p><!--kg-card-begin: markdown--><table>
<thead>
<tr>
<th style="text-align:left">Parameter</th>
<th>Type</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align:left">df</td>
<td>pandas.DataFrame</td>
<td>Pandas DataFrame <a href="https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.html?ref=helyx-org">https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.html</a></td>
</tr>
<tr>
<td style="text-align:left">path</td>
<td>str</td>
<td>S3 path (for file e.g. s3://bucket/prefix/filename.parquet) (for dataset e.g. s3://bucket/prefix)</td>
</tr>
<tr>
<td style="text-align:left">dataset</td>
<td>bool</td>
<td>If True, store a Parquet dataset instead of a single file. If True, enables all the following arguments: partition_cols, mode, database, table, description, parameters, columns_comments.</td>
</tr>
<tr>
<td style="text-align:left">database</td>
<td>str, optional</td>
<td>Glue/Athena catalog: Database name.</td>
</tr>
<tr>
<td style="text-align:left">table</td>
<td>str, optional</td>
<td>Glue/Athena catalog: Table name</td>
</tr>
<tr>
<td style="text-align:left">mode</td>
<td>str, optional</td>
<td>append (Default), overwrite, overwrite_partitions. Only takes effect if dataset=True</td>
</tr>
<tr>
<td style="text-align:left">description</td>
<td>str, optional</td>
<td></td>
</tr>
<tr>
<td style="text-align:left">parameters</td>
<td>Dict[str, str], optional</td>
<td></td>
</tr>
<tr>
<td style="text-align:left">columns_comments</td>
<td>Dict[str, str], optional</td>
<td></td>
</tr>
</tbody>
</table>
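<p>To make the <code>mode</code> parameter more concrete, here is a toy model of the three modes, assuming the usual meaning of each: <code>append</code> keeps existing files, <code>overwrite</code> clears the whole dataset before writing, and <code>overwrite_partitions</code> only clears the partitions being written. The function <code>apply_mode</code> and the dict-based representation of a dataset are hypothetical; this is an illustration, not how the library is implemented.</p>

```python
from typing import Dict, List

Dataset = Dict[str, List[str]]  # partition value -> file names


def apply_mode(existing: Dataset, incoming: Dataset, mode: str = "append") -> Dataset:
    """Toy model of how each write mode combines old and new files."""
    if mode == "append":
        # Keep everything and add the new files next to the old ones.
        result = {part: list(files) for part, files in existing.items()}
        for part, files in incoming.items():
            result.setdefault(part, []).extend(files)
    elif mode == "overwrite":
        # Drop the whole dataset, keep only what is being written.
        result = {part: list(files) for part, files in incoming.items()}
    elif mode == "overwrite_partitions":
        # Replace only the partitions present in the incoming data.
        result = {part: list(files) for part, files in existing.items()}
        for part, files in incoming.items():
            result[part] = list(files)
    else:
        raise ValueError(f"unsupported mode: {mode}")
    return result


existing = {"year=2019": ["a.parquet"], "year=2020": ["b.parquet"]}
incoming = {"year=2020": ["c.parquet"]}

for mode in ("append", "overwrite", "overwrite_partitions"):
    print(mode, apply_mode(existing, incoming, mode))
```

<p>The difference between the last two modes only shows up on partitioned datasets: <code>overwrite_partitions</code> leaves the <code>year=2019</code> partition untouched, while <code>overwrite</code> removes it.</p>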
<!--kg-card-end: markdown--><p>All parameters can be found at the following URL: <a href="https://aws-data-wrangler.readthedocs.io/en/latest/stubs/awswrangler.s3.to_parquet.html?ref=helyx-org#awswrangler.s3.to_parquet">https://aws-data-wrangler.readthedocs.io/en/latest/stubs/awswrangler.s3.to_parquet.html#awswrangler.s3.to_parquet</a>.</p><p>Writing a pandas DataFrame to <em>S3</em> in <em>Parquet</em> format, and referencing it in the <em>Glue Data Catalog</em>, can be done with the following code: </p><pre><code class="language-python">
desc = &quot;&quot;&quot;This is a copy of UCI ML housing dataset. https://archive.ics.uci.edu/ml/machine-learning-databases/housing/
This dataset was taken from the StatLib library which is maintained at Carnegie Mellon University.
The Boston house-price data of Harrison, D. and Rubinfeld, D.L. &#x2018;Hedonic prices and the demand for clean air&#x2019;, J. Environ. Economics &amp; Management, vol.5, 81-102, 1978. Used in Belsley, Kuh &amp; Welsch, &#x2018;Regression diagnostics &#x2026;&#x2019;, Wiley, 1980. N.B. Various transformations are used in the table on pages 244-261 of the latter.
The Boston house-price data has been used in many machine learning papers that address regression problems.
&quot;&quot;&quot;

param = {
    &quot;source&quot;: &quot;scikit-learn&quot;,
    &quot;class&quot;: &quot;cities&quot;
}

comments = {
    &quot;crim&quot;: &quot;per capita crime rate by town&quot;,
    &quot;zn&quot;: &quot;proportion of residential land zoned for lots over 25,000 sq.ft.&quot;,
    &quot;indus&quot;: &quot;proportion of non-retail business acres per town&quot;,
    &quot;chas&quot;: &quot;Charles River dummy variable (= 1 if tract bounds river; 0 otherwise)&quot;,
    &quot;nox&quot;: &quot;nitric oxides concentration (parts per 10 million)&quot;,
    &quot;rm&quot;: &quot;average number of rooms per dwelling&quot;,
    &quot;age&quot;: &quot;proportion of owner-occupied units built prior to 1940&quot;,
    &quot;dis&quot;: &quot;weighted distances to five Boston employment centres&quot;,
    &quot;rad&quot;: &quot;index of accessibility to radial highways&quot;,
    &quot;tax&quot;: &quot;full-value property-tax rate per $10,000&quot;,
    &quot;ptratio&quot;: &quot;pupil-teacher ratio by town&quot;,
    &quot;b&quot;: &quot;1000(Bk - 0.63)^2 where Bk is the proportion of blacks by town&quot;,
    &quot;lstat&quot;: &quot;lower status of the population&quot;,
}

res = wr.s3.to_parquet(
    df=df,
    path=f&quot;s3://{bucket}/boston&quot;,
    dataset=True,
    database=&quot;awswrangler_test&quot;,
    table=&quot;boston&quot;,
    mode=&quot;overwrite&quot;,
    description=desc,
    parameters=param,
    columns_comments=comments
)</code></pre><p>This code example is sourced from the AWS Data Wrangler tutorials, and more specifically the following one: <a href="https://github.com/awslabs/aws-data-wrangler/blob/master/tutorials/005%20-%20Glue%20Catalog.ipynb?ref=helyx-org">https://github.com/awslabs/aws-data-wrangler/blob/master/tutorials/005%20-%20Glue%20Catalog.ipynb</a>.</p><p>Running the previous code sample results in the following table information in the <em>AWS Glue Data Catalog</em>:</p><figure class="kg-card kg-image-card"><img src="https://helyx.org/content/images/2020/06/glue_catalog_table_boston.png" class="kg-image" alt="Pandas on AWS with AWS Data Wrangler" loading="lazy"></figure><h3 id="aws-athena">Amazon Athena</h3><p>Now that we know how to interact with <em>Amazon S3</em> and the <em>AWS Glue Data Catalog</em>, and how to write <em>DataFrames</em> to S3 and reference them as datasets in the <em>Data Catalog</em>, we can focus on how to query that data with Amazon Athena.</p><p><em>AWS Data Wrangler</em> can run queries on Athena and fetch the results in two ways:</p><ul><li>Using CTAS <strong>(ctas_approach=True)</strong>, which is the default method.</li><li>Using regular queries <strong>(ctas_approach=False)</strong>, and parsing CSV results on S3.</li></ul><p><strong>ctas_approach=True</strong></p><p>As mentioned in the <a href="https://github.com/awslabs/aws-data-wrangler/blob/master/tutorials/006%20-%20Amazon%20Athena.ipynb?ref=helyx-org">tutorials</a>, this first approach wraps the query in a <em>CTAS</em> (CREATE TABLE AS SELECT) statement and reads the resulting table data as Parquet directly from S3. It is faster because it relies on <em>Parquet</em> rather than <em>CSV</em>, and it also enables support for nested types. 
It is mostly a trick compared to the approach officially provided by the API, but it is effective and perfectly legitimate.</p><p>The downside of this approach is that you need additional permissions on Glue (create/delete table permissions). Behind the scenes, a temporary table is created and deleted immediately after it has been read. </p><p>Query example: </p><pre><code class="language-python">wr.athena.read_sql_query(&quot;SELECT * FROM noaa&quot;, database=&quot;awswrangler_test&quot;)</code></pre><p><strong>ctas_approach=False</strong></p><p>The regular approach, which parses the <em>CSV</em> file written to <em>S3</em> as the query execution result, does not require additional permissions. Reading the results will not be as fast as with the <em>CTAS</em> approach, but it will still be faster than reading results with the standard <em>AWS APIs</em>.</p><p>Query example: </p><pre><code class="language-python">wr.athena.read_sql_query(&quot;SELECT * FROM noaa&quot;, database=&quot;awswrangler_test&quot;, ctas_approach=False)</code></pre><p>The only difference with the previous example is that the <code>ctas_approach</code> parameter changes from <code>True</code> to <code>False</code>.</p><h4 id="use-of-categories">Use of categories</h4><p>Defining <em>DataFrame</em> columns as categories speeds up execution and also helps to save memory. 
You only need to pass an additional <code>categories</code> parameter to the function to benefit from the improvement.</p><pre><code class="language-python">wr.athena.read_sql_query(&quot;SELECT * FROM noaa&quot;, database=&quot;awswrangler_test&quot;, categories=[&quot;id&quot;, &quot;dt&quot;, &quot;element&quot;, &quot;value&quot;, &quot;m_flag&quot;, &quot;q_flag&quot;, &quot;s_flag&quot;, &quot;obs_time&quot;])</code></pre><p>The returned columns are of type <code>pandas.Categorical</code>.</p><h4 id="batching-read-of-results">Batching read of results</h4><p>Batching is a good option for memory-constrained environments. It is activated by passing the <code>chunksize</code> parameter, whose value is the size of each chunk of data to read. Reading datasets this way bounds memory usage, but it means the full result has to be read by iterating over the chunks.</p><p>Query example:</p><pre><code class="language-python">dfs = wr.athena.read_sql_query(
    &quot;SELECT * FROM noaa&quot;,
    database=&quot;awswrangler_test&quot;,
    ctas_approach=False,
    chunksize=10_000_000
)

for df in dfs:  # Batching
    print(len(df.index))</code></pre><p>Since big datasets can be challenging to load and read, this is a good workaround to avoid memory issues.</p><h2 id="packaging-dependencies">Packaging &amp; Dependencies</h2><h3 id="availability-as-an-aws-lambda-layer">Availability as an AWS Lambda layer</h3><p>Going beyond the toy demo, you may wonder how to integrate the library with your code. Can it be integrated easily with, for example, <em>AWS Lambda</em> functions? Will you have to build a complex pipeline to integrate it the right way into your <em>AWS Lambda</em> package?</p><p>The answer is definitely no! A <em>Lambda Layer</em> zip file is available alongside the Python <em>wheels</em> &amp; <em>eggs</em>. The <em>Lambda Layers</em> are currently available in 3 flavors: Python 3.6, 3.7 &amp; 3.8.</p><h3 id="aws-glue-integration">AWS Glue integration</h3><p>As the <em>AWS Data Wrangler</em> package relies on compiled dependencies (C/C++), there is no support for <em>Glue PySpark</em> for now. Only integration with <a href="https://docs.aws.amazon.com/glue/latest/dg/add-job-python.html?ref=helyx-org" rel="nofollow">Glue Python Shell</a> is possible at the moment.</p><h2 id="going-one-step-deeper">Going one step deeper</h2><p>If you want to learn more about the library, feel free to read the <a href="https://aws-data-wrangler.readthedocs.io/?ref=helyx-org">documentation</a>, as it is a good source of inspiration. 
You can also visit the GitHub repository of the project and browse the <a href="https://github.com/awslabs/aws-data-wrangler/tree/master/tutorials?ref=helyx-org">tutorial directory</a>.</p>]]></content:encoded></item><item><title><![CDATA[Must-know options of the AWS CLI]]></title><description><![CDATA[Explore the AWS CLI capabilities]]></description><link>https://helyx.org/what-can-you-do-with-the-aws-cli/</link><guid isPermaLink="false">5ec6dec804a06e04d218e5f4</guid><category><![CDATA[AWS]]></category><dc:creator><![CDATA[Alexis Kinsella]]></dc:creator><pubDate>Thu, 21 May 2020 21:41:49 GMT</pubDate><media:content url="https://helyx.org/content/images/2020/11/1200x628.jpg" medium="image"/><content:encoded><![CDATA[<img src="https://helyx.org/content/images/2020/11/1200x628.jpg" alt="Must-know options of the AWS CLI"><p>So you have installed the AWS CLI on your system. What can you do with it? Let&apos;s explore some basic usage.</p><h2 id="know-how-to-get-help">Know how to get help</h2><p>At one point or another, you will need some help. You can search the internet, but you can also just use what is at your fingertips.</p><p>By typing the <code>aws</code> command in your favorite shell, you get the usual usage information for the command:</p><pre><code class="language-bash">$ aws
usage: aws [options] &lt;command&gt; &lt;subcommand&gt; [&lt;subcommand&gt; ...] [parameters]
To see help text, you can run:

  aws help
  aws &lt;command&gt; help
  aws &lt;command&gt; &lt;subcommand&gt; help
aws: error: the following arguments are required: command</code></pre><p>By reading the usage carefully, you will notice that you can access help at the CLI level, at the command level, and at the command / subcommand level. </p><p>Without even paying attention, we just learned something interesting: the AWS CLI relies not only on commands, but also on subcommands. Basically, services are referenced at the command level, and the actions related to the selected service at the subcommand level.</p><p>Here is the command structure:</p><pre><code class="language-bash">$ aws &lt;command&gt; &lt;subcommand&gt; [options and parameters]</code></pre><p>Depending on the command / subcommand used, you will be able to use various types of input values, such as numbers, strings, lists, maps or even JSON structures.</p><p>Running the <code>aws help</code> command gives the following output:</p><pre><code class="language-bash">AWS()



NAME
       aws -

DESCRIPTION
       The  AWS  Command  Line  Interface is a unified tool to manage your AWS
       services.

SYNOPSIS
          aws [options] &lt;command&gt; &lt;subcommand&gt; [parameters]

       Use aws command help for information on a  specific  command.  Use  aws
       help  topics  to view a list of available help topics. The synopsis for
       each command shows its parameters and their usage. Optional  parameters
       are shown in square brackets.

OPTIONS
       --debug (boolean)

       Turn on debug logging.

       --endpoint-url (string)

       Override command&apos;s default URL with the given URL.

       --no-verify-ssl (boolean)

       By  default, the AWS CLI uses SSL when communicating with AWS services.
       For each SSL connection, the AWS CLI will verify SSL certificates. This
       option overrides the default behavior of verifying SSL certificates.

       --no-paginate (boolean)

       Disable automatic pagination.

       --output (string)

       The formatting style for command output.

       o json

       o text

       o table

       --query (string)

       A JMESPath query to use in filtering the response data.

       --profile (string)

       Use a specific profile from your credential file.

       --region (string)

       The region to use. Overrides config/env settings.

       --version (string)

       Display the version of this tool.

       --color (string)

       Turn on/off color output.

       o on

       o off

       o auto

       --no-sign-request (boolean)

       Do  not  sign requests. Credentials will not be loaded if this argument
       is provided.

       --ca-bundle (string)

       The CA certificate bundle to use when verifying SSL certificates. Over-
       rides config/env settings.

       --cli-read-timeout (int)

       The  maximum socket read time in seconds. If the value is set to 0, the
       socket read will be blocking and not timeout.

       --cli-connect-timeout (int)

       The maximum socket connect time in seconds. If the value is set  to  0,
       the socket connect will be blocking and not timeout.

AVAILABLE SERVICES
      o ...</code></pre><p>At the bottom, you can see the full list of services supported by your version of the CLI. More importantly, you get all the global options of the CLI, and they already reveal some of its strengths: debug logging, overriding the target endpoint, filtering response content, and selecting the target region or the profile to use.</p><p>Let&apos;s go through some of the most interesting options and see what they have to offer!</p><h2 id="debug-logging">Debug logging</h2><pre><code class="language-bash">aws --debug ...</code></pre><p>Being able to troubleshoot commands becomes critical when you experience issues with the AWS CLI. The debug flag activates highly verbose logs that give you the information you need to understand what is going on.</p><h2 id="endpoint-url">Endpoint URL</h2><pre><code class="language-shell">aws --endpoint-url &lt;string&gt; ...</code></pre><p>When a service is exposed through an endpoint hosted directly within a private VPC, you have to specify that endpoint explicitly, otherwise the CLI uses the default public one.</p><p>This is especially useful in enterprise environments where the company network is connected to the VPC: if you want to avoid going through the internet, you have to configure and use the VPC endpoint associated with the service you are targeting.</p><h2 id="output-format">Output format</h2><p>The output flag is very handy. It lets you choose the format of the response: json, yaml, text, or table. Note that the help output above lists only json, text and table; yaml is available in AWS CLI version 2.</p><p>On the one hand, the text format is useful to process responses with standard Unix tools such as `grep`, `sed` or `awk`. On the other hand, the table format makes the data easy for a human to read.</p><p>The output value can be preconfigured in the AWS CLI config file. Here is an example:</p><pre><code class="language-bash">[default]
output=text</code></pre><p>It is also possible to specify it with an environment variable:</p><pre><code class="language-bash">$ export AWS_DEFAULT_OUTPUT=&quot;table&quot;</code></pre><p>Finally, you can override the configured default for a single command with the flag:</p><pre><code class="language-bash">$ aws swf list-domains --registration-status REGISTERED --output json</code></pre><h3 id="text-format">Text format</h3><p>The text format produces tab-separated output with one record per line, which is convenient when the result has to be read quickly or piped into other tools:</p><pre><code class="language-bash">$ aws iam list-users --output text --query &apos;Users[*].[UserName,Arn,CreateDate,PasswordLastUsed,UserId]&apos;</code></pre><pre><code class="language-bash">Admin         arn:aws:iam::123456789012:user/Admin         2014-10-16T16:03:09+00:00   2016-06-03T18:37:29+00:00   AIDA1111111111EXAMPLE
backup-user   arn:aws:iam::123456789012:user/backup-user   2019-09-17T19:30:40+00:00   None                        AIDA2222222222EXAMPLE
cli-user      arn:aws:iam::123456789012:user/cli-backup</code></pre><h3 id="table-format">Table format</h3><p>If you want something more tabular and visual, for example to paste the result of a request into documentation, you can use the output flag this way:</p><pre><code class="language-bash">aws ec2 describe-volumes --query &apos;Volumes[*].{ID:VolumeId,InstanceId:Attachments[0].InstanceId,AZ:AvailabilityZone,Size:Size}&apos; --output table</code></pre><p>and then get the following result:</p><pre><code class="language-bash">------------------------------------------------------
|                   DescribeVolumes                  | 
+------------+----------------+--------------+-------+
|     AZ     |      ID        | InstanceId   | Size  |
+------------+----------------+--------------+-------+
|  us-west-2a|  vol-e11a5288  |  i-a071c394  |  30   |
|  us-west-2a|  vol-2e410a47  |  i-4b41a37c  |  8    |
+------------+----------------+--------------+-------+</code></pre><h2 id="query-specific-data">Query specific data</h2><pre><code class="language-bash">aws --query &lt;string&gt; ...</code></pre><p>The query flag lets you specify a JMESPath query used to filter the response data. <a href="https://jmespath.org/?ref=helyx-org">JMESPath</a> is a query language for JSON.</p><p>You can find detailed information here:</p><figure class="kg-card kg-bookmark-card"><a class="kg-bookmark-container" href="https://docs.aws.amazon.com/cli/latest/userguide/cli-usage-output.html?ref=helyx-org#cli-usage-output-filter"><div class="kg-bookmark-content"><div class="kg-bookmark-title">Controlling command output from the AWS CLI - AWS Command Line Interface</div><div class="kg-bookmark-description">Control the format of the output from the AWS Command Line Interface (AWS CLI).</div><div class="kg-bookmark-metadata"><img class="kg-bookmark-icon" src="https://docs.aws.amazon.com/assets/images/favicon.ico" alt="Must known options of the AWS CLI"><span class="kg-bookmark-author">AWS Command Line Interface</span></div></div></a></figure><p>Let&apos;s say you want to describe the volumes available in the EC2 service. You have to execute the following command:</p><pre><code class="language-bash">$ aws ec2 describe-volumes</code></pre><p>You will get this kind of answer, provided the output is configured as json:</p><pre><code class="language-bash">{
    &quot;Volumes&quot;: [
        {
            &quot;AvailabilityZone&quot;: &quot;us-west-2a&quot;,
            &quot;Attachments&quot;: [
                {
                    &quot;AttachTime&quot;: &quot;2013-09-17T00:55:03.000Z&quot;,
                    &quot;InstanceId&quot;: &quot;i-a071c394&quot;,
                    &quot;VolumeId&quot;: &quot;vol-e11a5288&quot;,
                    &quot;State&quot;: &quot;attached&quot;,
                    &quot;DeleteOnTermination&quot;: true,
                    &quot;Device&quot;: &quot;/dev/sda1&quot;
                }
            ],
            &quot;VolumeType&quot;: &quot;standard&quot;,
            &quot;VolumeId&quot;: &quot;vol-e11a5288&quot;,
            &quot;State&quot;: &quot;in-use&quot;,
            &quot;SnapshotId&quot;: &quot;snap-f23ec1c8&quot;,
            &quot;CreateTime&quot;: &quot;2013-09-17T00:55:03.000Z&quot;,
            &quot;Size&quot;: 30
        },
        {
            &quot;AvailabilityZone&quot;: &quot;us-west-2a&quot;,
            &quot;Attachments&quot;: [
                {
                    &quot;AttachTime&quot;: &quot;2013-09-18T20:26:16.000Z&quot;,
                    &quot;InstanceId&quot;: &quot;i-4b41a37c&quot;,
                    &quot;VolumeId&quot;: &quot;vol-2e410a47&quot;,
                    &quot;State&quot;: &quot;attached&quot;,
                    &quot;DeleteOnTermination&quot;: true,
                    &quot;Device&quot;: &quot;/dev/sda1&quot;
                }
            ],
            &quot;VolumeType&quot;: &quot;standard&quot;,
            &quot;VolumeId&quot;: &quot;vol-2e410a47&quot;,
            &quot;State&quot;: &quot;in-use&quot;,
            &quot;SnapshotId&quot;: &quot;snap-708e8348&quot;,
            &quot;CreateTime&quot;: &quot;2013-09-18T20:26:15.000Z&quot;,
            &quot;Size&quot;: 8
        }
    ]
}</code></pre><p>This output is verbose and not very handy. This is where the query flag becomes interesting, since it reduces the result payload to only what you are interested in. For example, if you only want the VolumeId, the AvailabilityZone and the Size, execute the following command (a JMESPath multi-select hash takes key:expression pairs, so each field is named explicitly):</p><pre><code class="language-bash">aws ec2 describe-volumes --query &apos;Volumes[*].{VolumeId:VolumeId,AvailabilityZone:AvailabilityZone,Size:Size}&apos;</code></pre><p>The result will be the following:</p><pre><code class="language-json">[
    {
        &quot;AvailabilityZone&quot;: &quot;us-west-2a&quot;,
        &quot;VolumeId&quot;: &quot;vol-e11a5288&quot;,
        &quot;Size&quot;: 30
    },
    {
        &quot;AvailabilityZone&quot;: &quot;us-west-2a&quot;,
        &quot;VolumeId&quot;: &quot;vol-2e410a47&quot;,
        &quot;Size&quot;: 8
    }
]</code></pre><p>You can go even further by giving the keys shorter aliases:</p><pre><code class="language-bash">aws ec2 describe-volumes --query &apos;Volumes[*].{ID:VolumeId,AZ:AvailabilityZone,Size:Size}&apos;</code></pre><p>which gives the following result:</p><pre><code class="language-json">[
    {
        &quot;AZ&quot;: &quot;us-west-2a&quot;,
        &quot;ID&quot;: &quot;vol-e11a5288&quot;,
        &quot;Size&quot;: 30
    },
    {
        &quot;AZ&quot;: &quot;us-west-2a&quot;,
        &quot;ID&quot;: &quot;vol-2e410a47&quot;,
        &quot;Size&quot;: 8
    }
]</code></pre><h2 id="filter-result-content">Filter result content</h2><p>Capabilities are almost limitless given you know how to handle <a href="https://jmespath.org/?ref=helyx-org">JMESPath</a> query language. It is even possible to filter responses with expressions:</p><pre><code class="language-bash">$ aws ec2 describe-volumes \
    --filters &quot;Name=availability-zone,Values=us-west-2a&quot; &quot;Name=status,Values=attached&quot; \
    --query &apos;Volumes[?Size &gt; `50`].{Id:VolumeId,Size:Size,Type:VolumeType}&apos;</code></pre><p>Here we only want attached volumes in us-west-2a with a size greater than 50 GiB. The `--filters` parameter narrows the result on the service side, and the `--query` expression then keeps only the volumes matching the size condition on the client side. The powerful part is that you don&apos;t have to write code to handle this kind of filtering: you just combine the two flags.</p><h3 id="choose-the-profile">Choose the profile</h3><p>There are multiple ways to select a profile. One of them is the `--profile` flag, which can be added to the command being executed and is handy in some situations. For example, this is how you configure a named profile:</p><pre><code class="language-shell">aws configure --profile &lt;profilename&gt;</code></pre><h3 id="region-configuration">Region configuration</h3><p>As with the `profile` option, there are multiple ways to provide the `region` value. The region determines the endpoint the CLI targets when it sends requests.</p><h2 id="conclusion">Conclusion</h2><p>The CLI offers many options as command line flags. Most of the time they have alternatives, for example environment variables. Knowing them will make you more proficient at the tasks you deal with on a daily basis. 
Without these options your work gets harder, as you would have to implement the equivalent feature yourself.</p>]]></content:encoded></item><item><title><![CDATA[How to install AWS CLI v1 on Mac]]></title><description><![CDATA[Mini how-to describing how to install AWS CLI on a Mac computer.]]></description><link>https://helyx.org/how-to-install-aws-cli-v1-on-mac/</link><guid isPermaLink="false">5ec5cc5c0fa9593072c637f5</guid><category><![CDATA[AWS]]></category><category><![CDATA[Mac]]></category><dc:creator><![CDATA[Alexis Kinsella]]></dc:creator><pubDate>Tue, 19 May 2020 22:45:37 GMT</pubDate><media:content url="https://helyx.org/content/images/2020/11/aws-cli.jpeg" medium="image"/><content:encoded><![CDATA[<img src="https://helyx.org/content/images/2020/11/aws-cli.jpeg" alt="How to install AWS CLI v1 on Mac"><p>First of all, you need to know that there are two versions of the AWS CLI. This article focuses on installing AWS CLI v1, as it is the most common and best-known version of the AWS CLI.</p><h2 id="prerequisites">Prerequisites</h2><p>AWS CLI v1 relies on Python and is compatible with both Python 2 and Python 3.</p><p>You can check your Python version with the following command:</p><pre><code class="language-bash">$ python --version</code></pre><!--kg-card-begin: markdown--><p>If your computer doesn&apos;t already have Python, you will first have to install it.</p>
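<p>As a rough sketch of the same prerequisite check, the Python snippet below looks on the PATH for the commands used in this article. It has to be run with an existing Python 3 interpreter (`shutil.which` requires Python 3.3 or later), which can help on a Mac where only `python3` is available. The `find_tools` helper and the `TOOLS` list are illustrative names, not part of the AWS CLI:</p><pre><code class="language-python">import shutil

# Illustrative list (an assumption): the commands used in this article.
TOOLS = ["python", "python3", "pip3", "aws"]


def find_tools(names):
    """Map each command name to its full path on the PATH, or None if absent."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_tools(TOOLS).items():
        print("{:8} {}".format(name, path or "not found"))</code></pre><p>A path printed next to `python` or `python3` means the prerequisite is met; `not found` next to `aws` is expected before the installation.</p>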
<!--kg-card-end: markdown--><h2 id="install-from-zip">Install from Zip</h2><p>This is not the most straightforward way to install the AWS CLI, but you can install it from the Zip bundle that is downloadable from S3.</p><p>You can install AWS CLI v1 with the following commands:</p><pre><code class="language-bash">curl &quot;https://s3.amazonaws.com/aws-cli/awscli-bundle.zip&quot; -o &quot;awscli-bundle.zip&quot;
unzip awscli-bundle.zip
sudo ./awscli-bundle/install -i /usr/local/aws -b /usr/local/bin/aws</code></pre><h3 id="verify-installation">Verify installation</h3><p>If everything went well, the following command prints the version number of the CLI:</p><pre><code class="language-bash">$ aws --version
aws-cli/1.17.4 Python/3.7.4 Darwin/18.7.0 botocore/1.13</code></pre><h2 id="install-with-pip">Install with pip</h2><p>If you prefer, you can also install the CLI with pip by executing this command:</p><pre><code class="language-shell">pip3 install awscli --upgrade --user</code></pre><p>You should then also get a result when typing the command `aws --version`.</p><h2 id="more-informations">More information</h2><p>If you want more information, you can refer to the AWS CLI install page of the official documentation:</p><ul><li><a href="https://docs.aws.amazon.com/cli/latest/userguide/install-macos.html?ref=helyx-org">https://docs.aws.amazon.com/cli/latest/userguide/install-macos.html</a></li></ul>]]></content:encoded></item><item><title><![CDATA[Rust language is 1 year old]]></title><description><![CDATA[The Rust language aims to offer:
- Uncompromising performance and control,
- Prevention of many categories of bugs such as concurrency issues,
- Ergonomics on par with languages like Python and Ruby.]]></description><link>https://helyx.org/rust-language-is-1-year-old/</link><guid isPermaLink="false">62549ee1d35ce80569cadb3b</guid><category><![CDATA[Rust]]></category><dc:creator><![CDATA[Alexis Kinsella]]></dc:creator><pubDate>Tue, 17 May 2016 08:40:10 GMT</pubDate><media:content url="https://helyx.org/content/images/2022/04/safeandunsafe.svg" medium="image"/><content:encoded><![CDATA[<figure class="kg-card kg-image-card"><img src="https://helyx.org/content/images/2022/04/Rust_programming_language_black_logo.svg_-1.png" class="kg-image" alt="Rust language is 1 year old" loading="lazy" width="144" height="144"></figure><img src="https://helyx.org/content/images/2022/04/safeandunsafe.svg" alt="Rust language is 1 year old"><p>The <a href="https://www.rust-lang.org/?ref=helyx-org">Rust</a> language aims to offer:</p><ul><li>Uncompromising performance and control,</li><li>Prevention of many categories of bugs such as concurrency issues,</li><li>Ergonomics on par with languages like Python and Ruby.</li></ul><p>A year separates the release of version 1.0.0 from version 1.8.0. To be more specific, this represents nearly 12,000 commits and no fewer than 700 contributors. Remarkably, the language has become <a href="https://stackoverflow.com/research/developer-survey-2016?ref=helyx-org">the most loved language among developers</a> in the Stack Overflow survey.</p><p>The <a href="https://blog.rust-lang.org/2016/05/16/rust-at-one-year.html?ref=helyx-org">Rust anniversary article</a> also presents concrete cases of adoption of the language:</p><ul><li>The <a href="https://www.dropbox.com/?ref=helyx-org">Dropbox</a> use case is particularly interesting because it highlights how the company used Rust to develop the software controlling the hardware it built in an effort to become independent of <a href="https://aws.amazon.com/?ref=helyx-org">Amazon Web Services</a>. 
Needless to say, the task is critical for a company that decides to operate its own equipment at such a scale. While Dropbox&apos;s back-end infrastructure has historically been written in Go, key issues such as memory footprint and lack of control over server usage prompted some components to be rewritten in Rust. According to <a href="https://news.ycombinator.com/item?id=11283688&amp;ref=helyx-org">Jamie Turner</a>, the advantages of Rust are numerous: advanced abstraction capabilities, no nulls, no segfaults, no leaks, yet performance close to C and adequate memory control.</li><li>As a second example, the article covers <a href="https://servo.org/?ref=helyx-org">Servo</a> and the peripheral developments that are slowly starting to land in the Firefox code base, among them MP4 metadata parsing on OS X and Linux since Firefox version 45. Although the code still runs in test mode, no fewer than 1 billion execution reports have been compared with the C++ version with 100% accuracy. This example is only the tip of the iceberg, since other pieces of code should be integrated in the long term.</li></ul><p>During this first year, the focus was on improving Rust across the board: the ecosystem, the supported platforms, the tools, the compiler, and the language itself. The <a href="http://blog.rust-lang.org/2016/05/16/rust-at-one-year.html?ref=helyx-org">article</a> details each of these categories.</p><p>The first Rust language conference, <a href="http://rustconf.com/?ref=helyx-org">RustConf</a>, is scheduled for September 9-10, 2016 in Portland. 
If the Rust language is of interest to you and you live in Europe, don&apos;t worry: <a href="http://www.rustfest.eu/blog/happy-birthday-announcing-rustfest?ref=helyx-org">RustFest</a> is also scheduled in Berlin on September 17, 2016.</p><p>Finally, if you want to follow Rust news, you can subscribe to the <a href="https://this-week-in-rust.org/?ref=helyx-org">This week in Rust</a> newsletter to keep up to date with what&apos;s new in the ecosystem.</p>]]></content:encoded></item><item><title><![CDATA[Zero downtime deployment with Node.js and Express, a first step ...]]></title><description><![CDATA[<p>When you want to stop or restart a server, several options are available. One of them is to send a <a href="http://en.wikipedia.org/wiki/Unix_signal?ref=helyx-org">SIGTERM</a> signal to the process.</p><p>This solution is commonly used, but unfortunately it leads to the termination of</p>]]></description><link>https://helyx.org/zero-downtime-deployment-avec-node-js-et-express-une-premiere-etape/</link><guid isPermaLink="false">62549ee1d35ce80569cadb3a</guid><category><![CDATA[Nginx]]></category><category><![CDATA[Node.js]]></category><category><![CDATA[Zero Downtime Deployment]]></category><dc:creator><![CDATA[Alexis Kinsella]]></dc:creator><pubDate>Mon, 21 Jul 2014 09:00:20 GMT</pubDate><media:content url="https://helyx.org/content/images/2022/04/node-9.jpeg" medium="image"/><content:encoded><![CDATA[<img src="https://helyx.org/content/images/2022/04/node-9.jpeg" alt="Zero downtime deployment with Node.js and Express, a first step ..."><p>When you want to stop or restart a server, several options are available. 
One of them is to send a <a href="http://en.wikipedia.org/wiki/Unix_signal?ref=helyx-org">SIGTERM</a> signal to the process.</p><p>This solution is commonly used, but unfortunately it cuts the open connections without letting the server complete the requests being processed.</p><p>To provide a better quality of service, it is important to honor every incoming request. So how can a server be restarted smoothly, without abruptly cutting the connections in progress?</p><p>The HTTP protocol lets the server answer incoming requests with a 503 status code, which means that the service is not available (Service Unavailable). The idea is therefore to return this status code as soon as the process has received a SIGTERM signal, while giving the server time to finish processing the in-flight HTTP requests, and then to stop the server once those requests have been handled. It is always possible to kill the server process if this takes too long.</p><h3 id="node-js">Node.js</h3><p>With Node.js, it is possible to listen to the signals received by the process and react to them. It is therefore possible to set up a mechanism that instructs the server to answer new incoming requests with a 503 status code, and then to shut the server down once the in-flight requests have been handled.</p><p>This gives the following code:</p><pre><code class="language-coffeescript">start = new Date()

# Express (app.configure and app.router belong to the Express 3 API)
express = require &apos;express&apos;
app = express()

# Any logger exposing info() works here
logger = console

# Set to true once SIGTERM has been received
gracefullyClosing = false

app.configure -&gt;
    app.set &apos;port&apos;, process.env.PORT or 8000

# While shutting down, refuse new requests with a 503 and close the connection
app.use (req, res, next) -&gt;
    return next() unless gracefullyClosing
    res.setHeader &quot;Connection&quot;, &quot;close&quot;
    res.send 503, &quot;Server is in the process of restarting&quot;

app.use app.router

app.get &apos;/&apos;, (req, res) -&gt;
    res.send 200, &apos;OK&apos;

httpServer = app.listen app.get(&apos;port&apos;)

process.on &apos;SIGTERM&apos;, -&gt;
    logger.info &quot;Received kill signal (SIGTERM), shutting down gracefully.&quot;
    gracefullyClosing = true

    # Stop accepting new connections and exit once in-flight requests are done
    httpServer.close -&gt;
        logger.info &quot;Closed out remaining connections.&quot;
        process.exit()

    # Force the shutdown if connections are still open after 30 seconds
    setTimeout -&gt;
        console.error &quot;Could not close connections in time, forcefully shutting down&quot;
        process.exit(1)
    , 30 * 1000</code></pre><p>Calling the <em>close</em> function on the HTTP server instance returned by <em>Express</em> lets the server finish processing the in-flight requests before stopping.</p><p>If your Node.js application is properly made redundant, with a reverse proxy (Nginx, HAProxy) in front of the different instances, incoming requests will be redirected to other instances that are able to handle them. This will be the case as soon as your application answers incoming requests with status codes such as 502 or 503.</p><h3 id="nginx">Nginx</h3><p>If your application is deployed behind a reverse proxy such as Nginx, you only need to configure it with several <em>upstream</em> servers pointing to different instances of your application, so that it can hand over to another instance when it receives a 502 or 503 error code in response to a request forwarded to one of the instances.</p><pre><code class="language-nginx">upstream my_app_upstream {
	server 127.0.0.1:7000;
	server 127.0.0.1:8000;
	server 127.0.0.1:9000;
}</code></pre><p>Next, you have to declare what Nginx must do when it receives a 502 or 503 response, and that&apos;s it. Here, the <a href="http://nginx.org/en/docs/http/ngx_http_proxy_module.html?ref=helyx-org#proxy_next_upstream">proxy_next_upstream</a> directive tells Nginx to pass the request to the next upstream server when it gets an error, a timeout, or an HTTP 502 or 503 code in response:</p><pre><code class="language-nginx">location /app {
	...
	proxy_next_upstream error timeout http_502 http_503;
	...
	proxy_pass http://my_app_upstream;
}</code></pre><h3 id="limitations">Limitations</h3><p>Ce fonctionnement d&#xE9;crit dans cet article r&#xE9;pond bien aux besoins d&#x2019;applications traitant des requ&#xEA;tes HTTP simples, n&#xE9;anmoins il ne r&#xE9;pond pas au probl&#xE8;me des configurations ayant activ&#xE9; l&#x2019;option de <em>keepalive</em> pour les connexions HTTP, ni aux applications utilisant les websockets. Il faudra d&#xE8;s lors trouver une solutions adapt&#xE9;e.</p><h3 id="conclusion">Conclusion</h3><p>La mise en oeuvre de la notion de <em>Gracefully Closing</em> lors d&#x2019;un red&#xE9;marrage pour raison de d&#xE9;ploiement d&#x2019;une nouvelle version, est une premi&#xE8;re &#xE9;tape importante pour arriver &#xE0; faire du <em>Zero Downtime Deployment</em>. Cela permet, non seulement d&#x2019;honorer les requ&#xEA;tes en cours de traitement, mais &#xE9;galement &#xE0; vos reverse proxy de prendre connaissance de l&#x2019;absence de service et redispatcher les requ&#xEA;tes sur d&#x2019;autres serveurs avant que le flux upstream soit coup&#xE9;.</p>]]></content:encoded></item><item><title><![CDATA[Clusteriser votre application Node.js]]></title><description><![CDATA[<p>Les application Node.js sont par nature mono-thread&#xE9;es, or les serveurs, de nos jours, sont <em>presque</em>* toujours multi-core. 
To exploit the full capacity of these servers, you need to be able to use all the cores.</p><p>There are mainly 2 techniques for this:</p>]]></description><link>https://helyx.org/clusteriser-votre-application-node-js/</link><guid isPermaLink="false">62549ee1d35ce80569cadb39</guid><category><![CDATA[Cluster]]></category><category><![CDATA[Node.js]]></category><dc:creator><![CDATA[Alexis Kinsella]]></dc:creator><pubDate>Thu, 17 Jul 2014 09:00:24 GMT</pubDate><media:content url="https://helyx.org/content/images/2022/04/cluster.jpg" medium="image"/><content:encoded><![CDATA[<img src="https://helyx.org/content/images/2022/04/cluster.jpg" alt="Clustering your Node.js application"><p>Node.js applications are single-threaded by nature, yet servers nowadays are <em>almost</em>* always multi-core. To exploit the full capacity of these servers, you need to be able to use all the cores.</p><p>There are mainly 2 techniques for this:</p><ul><li>Run several instances of a Node.js application on different ports, with a reverse proxy to load balance incoming requests</li><li>Run a Node.js application in cluster mode</li></ul><p>Ideally, you should run as many instances as there are cores on the machine. 
This shares the machine&apos;s power between the instances as well as possible, without degrading performance by sharing cores between several instances.</p><p>In this article we will focus on the second case, that is, running a Node.js application in cluster mode.</p><blockquote><em>* The term &#x201C;almost&#x201D; is used here because many entry-level cloud servers remain single-core (VPS and lowest-price EC2 instances, &#x2026;).</em></blockquote><h3 id="le-module-cluster">The cluster module</h3><p>The <a href="http://nodejs.org/api/cluster.html?ref=helyx-org">cluster</a> module, although flagged as having an experimental API in the Node.js documentation, is widely used today.</p><p>Its principle is simple: when an application is started in cluster mode, a first process is started in <em>master</em> mode. The <em>master</em> process does not handle incoming requests itself; instead, it dispatches them to forked processes that are dedicated to handling requests. These are started in <em>worker</em> mode.</p><p>Instantiating forks in worker mode is the responsibility of the master process, and the forking rules are left to the developer. Throughout the life of the cluster, events are emitted by both the master and the workers. It is important to subscribe to them in order to react to changes in the cluster (a worker crash, for example).</p><p>A simple cluster example is given in the documentation:</p><pre><code class="language-coffeescript">cluster = require(&quot;cluster&quot;)
http = require(&quot;http&quot;)
numCPUs = require(&quot;os&quot;).cpus().length

if cluster.isMaster
  # Fork workers.
  i = 0
  while i &lt; numCPUs
    cluster.fork()
    i++
  cluster.on &quot;exit&quot;, (worker, code, signal) -&gt;
    console.log &quot;worker &quot; + worker.process.pid + &quot; died&quot;
else
  # Workers can share any TCP connection
  # In this case its a HTTP server
  http.createServer((req, res) -&gt;
    res.writeHead 200
    res.end &quot;hello world\n&quot;
  ).listen 8000</code></pre><p>Although the clustering API is fairly simple to use, several details still need to be taken care of, and it is easy to get them wrong. It is therefore advisable to rely on a third-party module for this.</p><p>The <a href="https://github.com/doxout/recluster?ref=helyx-org">recluster</a> module, attractive for its simplicity and maturity, removes all this implementation complexity.</p><h3 id="recluster">recluster</h3><p>To install it, just run the following command:</p><pre><code class="language-bash">npm install recluster --save</code></pre><p>To start an application in cluster mode, just add the following script (cluster.coffee, for example) to your application:</p><pre><code class="language-coffeescript">recluster = require(&quot;recluster&quot;)
path = require(&quot;path&quot;)

cluster = recluster(path.join(__dirname, &quot;server.js&quot;), { ### Options ### })
cluster.run()

process.on &quot;SIGUSR2&quot;, -&gt;
  console.log &quot;Got SIGUSR2, reloading cluster...&quot;
  cluster.reload()

console.log &quot;spawned cluster, kill -s SIGUSR2&quot;, process.pid, &quot;to reload&quot;</code></pre><p>The recluster module takes the path of the script as a parameter. Then just run the cluster.js file instead of the server.js file, and you are done! (Since the examples are written in CoffeeScript, the CoffeeScript files must be compiled beforehand.)</p><h4 id="zero-downtime-reloading">Zero downtime reloading</h4><p>The recluster module can update an application without service interruption; to do so, call the reload function on the cluster variable.</p><p>In the example above, the SIGUSR2 signal tells the program that the cluster must be reloaded. It reloads the workers according to the options passed as parameters (reload timeout, etc.). In-flight requests are completed without being cut abruptly, within a timeout defined by the module&apos;s configuration options.</p><p>This feature is particularly useful when the application must be updated without service interruption. The code base can be updated, then the workers restarted one by one once their in-flight requests have been handled.</p><p>Other conditions can be used to reload the workers of a cluster. For example, a change to the package.json file can trigger a reload with the following code:</p><pre><code class="language-coffeescript">fs = require &apos;fs&apos;
fs.watchFile &quot;package.json&quot;, (curr, prev) -&gt;
    console.log &quot;Package.json changed, reloading cluster...&quot;
    cluster.reload()</code></pre><h3 id="conclusion">Conclusion</h3><p>Don&apos;t be intimidated by the idea of clustering. It is very easy to set up in the Node.js world, thanks to a base API that is part of the <em>core modules</em> and to the many modules built on top of it.</p>]]></content:encoded></item><item><title><![CDATA[Handling errors with Node.js]]></title><description><![CDATA[<p>When an exception is not handled in a Node.js program, the result is usually a crash of the application process. There is, moreover, not much that can be done to</p>]]></description><link>https://helyx.org/gerer-les-erreurs-avec-node-js/</link><guid isPermaLink="false">62549ee1d35ce80569cadb38</guid><category><![CDATA[Node.js]]></category><dc:creator><![CDATA[Alexis Kinsella]]></dc:creator><pubDate>Tue, 15 Jul 2014 09:00:59 GMT</pubDate><media:content url="https://helyx.org/content/images/2022/04/node-7.jpeg" medium="image"/><content:encoded><![CDATA[<img src="https://helyx.org/content/images/2022/04/node-7.jpeg" alt="Handling errors with Node.js"><p>When an exception is not handled in a Node.js program, the result is usually a crash of the application process. There is, moreover, not much that can be done to recover once the error has propagated up to the event loop. This is why errors must be handled with care.</p><p>If your program raises an error that propagates up to the event loop, as follows:</p><pre><code class="language-coffeescript">process.nextTick () -&gt;
	throw new Error(&quot;Some Bad Error&quot;)</code></pre><p>You will get the following error message:</p><pre><code class="language-bash">Express listening on port: 9000
Started in 0.073 seconds

/Users/akinsella/Workspace/Projects/gtfs-playground/build/app-test.js:30
    throw new Error(&quot;Some Bad Error&quot;);
          ^
Error: Some Bad Error
    at /Users/akinsella/Workspace/Projects/gtfs-playground/build/app-test.js:30:11
    at process._tickCallback (node.js:415:13)
    at Function.Module.runMain (module.js:499:11)
    at startup (node.js:119:16)
    at node.js:901:3

Process finished with exit code 8</code></pre><h3 id="l-v-nement-uncaughtexception-">The &#x2018;uncaughtException&#x2019; event</h3><p>Node.js gives you a chance to intercept errors that propagate up to the event loop by dispatching an event of type <em>uncaughtException</em>.</p><p>Contrary to what one might think, the Node.js process does not dispatch this event so that you can catch the error and let the program carry on. Its main purpose is to let you properly release the resources the program has opened, and possibly to log the context of the error more precisely (memory state, etc.).</p><p>Once an error has propagated up to the event loop, the state of the program must no longer be considered consistent. This is why you should not try to catch the exception with the aim of keeping the program running.</p><p>If you want to log an error message when an exception propagates up to the event loop, you can add the following code to your program:</p><pre><code class="language-coffeescript">process.on &apos;uncaughtException&apos;, (err) -&gt;
    console.log JSON.stringify(process.memoryUsage())
    console.error &quot;An uncaughtException was found, the program will end. #{err}, stacktrace: #{err.stack}&quot;
    process.exit 1

process.nextTick () -&gt;
    throw new Error(&quot;Some Bad Error&quot;)</code></pre><p>Which gives the following result:</p><pre><code class="language-bash">/Users/akinsella/.nvm/v0.10.22/bin/node app-test.js
{&quot;rss&quot;:12312576,&quot;heapTotal&quot;:4083456,&quot;heapUsed&quot;:2153648}
An uncaughtException was found, the program will end. Error: Some Bad Error, stacktrace: Error: Some Bad Error
    at /Users/akinsella/Workspace/Projects/gtfs-playground/build/app-test.js:13:11
    at process._tickCallback (node.js:415:13)
    at Function.Module.runMain (module.js:499:11)
    at startup (node.js:119:16)
    at node.js:901:3

Process finished with exit code 1</code></pre><p>Unlike the default handling, we were able to return a specific <em>exit code</em>, here the return code 1.<br>The log message is also different, so we control what gets logged when the process crashes.<br>In addition, the memory information made available in the logs will make the crash easier to analyze.</p><h3 id="express">Express</h3><p>If you use a framework such as <a href="http://expressjs.com/?ref=helyx-org">Express</a>, part of the work is done for you: errors raised while an HTTP request is being processed are caught by the framework, which handles them on your behalf.</p><p>By default, Express merely reports a crash that occurs while processing an HTTP request through a simple log returned in the HTTP response.</p><p>For example, by running the following program:</p><pre><code class="language-coffeescript">express = require &apos;express&apos;

app = express()

app.configure -&gt;
    app.set &apos;port&apos;, process.env.PORT or 9000
    app.use app.router

app.get &quot;/&quot;, (req, res) -&gt;
    throw new Error(&quot;Some Bad Error&quot;)

httpServer = app.listen app.get(&apos;port&apos;)

process.on &apos;uncaughtException&apos;, (err) -&gt;
    console.error &quot;An uncaughtException was found, the program will end. #{err}, stacktrace: #{err.stack}&quot;
    process.exit 1

console.error &quot;Express listening on port: #{app.get(&apos;port&apos;)}&quot;</code></pre><p>Then, by browsing to the URL <em><a href="http://localhost:9000/?ref=helyx-org">http://localhost:9000</a></em>, you will see Express return the following log in the HTTP response:</p><pre><code>Error: Some Bad Error
    at /Users/akinsella/Workspace/Projects/gtfs-playground/build/app-test.js:16:11
    at callbacks (/Users/akinsella/Workspace/Projects/gtfs-playground/node_modules/express/lib/router/index.js:164:37)
    at param (/Users/akinsella/Workspace/Projects/gtfs-playground/node_modules/express/lib/router/index.js:138:11)
    at pass (/Users/akinsella/Workspace/Projects/gtfs-playground/node_modules/express/lib/router/index.js:145:5)
    at Router._dispatch (/Users/akinsella/Workspace/Projects/gtfs-playground/node_modules/express/lib/router/index.js:173:5)
    at Object.router (/Users/akinsella/Workspace/Projects/gtfs-playground/node_modules/express/lib/router/index.js:33:10)
    at next (/Users/akinsella/Workspace/Projects/gtfs-playground/node_modules/express/node_modules/connect/lib/proto.js:193:15)
    at Object.expressInit [as handle] (/Users/akinsella/Workspace/Projects/gtfs-playground/node_modules/express/lib/middleware.js:30:5)
    at next (/Users/akinsella/Workspace/Projects/gtfs-playground/node_modules/express/node_modules/connect/lib/proto.js:193:15)
    at Object.query [as handle] (/Users/akinsella/Workspace/Projects/gtfs-playground/node_modules/express/node_modules/connect/lib/middleware/query.js:45:5)</code></pre><p>You can also enable a more detailed, HTML-formatted log, which is particularly useful in development mode, by adding the following lines:</p><pre><code class="language-coffeescript">app.configure &apos;development&apos;, () -&gt;
    app.use express.errorHandler
        dumpExceptions: true,
        showStack: true</code></pre><p>The result will look like this:</p><figure class="kg-card kg-image-card"><img src="https://helyx.org/content/images/wordpress/express-error.jpg" class="kg-image" alt="Handling errors with Node.js" loading="lazy"></figure><p>Express also lets you register a middleware that can act on the errors encountered while processing HTTP requests. This middleware can be used to log the error, or to release the resources associated with the request being processed.</p><p>It also lets you send the user a suitable response when an unhandled error occurs. This is particularly valuable when implementing a REST API: the server can return an error the client is able to interpret, even for an unhandled error.</p><p>The middleware takes the following form:</p><pre><code class="language-coffeescript">app.use (err, req, res, next) -&gt;
        console.error &quot;Error: #{err}, Stacktrace: #{err.stack}&quot;
        res.send 500, &quot;Something broke! Error: #{err}, Stacktrace: #{err.stack}&quot;</code></pre><h3 id="les-promises">Promises</h3><p><em>Promises</em> can help you handle errors more effectively thanks to their built-in error handling mechanism.</p><p>An exception thrown by processing wrapped in a promise does not propagate up to the <em>event loop</em>: it is caught by the <em>promise</em> and surfaced in the <em>fail</em> or <em>catch</em> function, depending on the library, or in the error callback of the <em>then</em> function.</p><p>It is therefore worth wrapping your processing in promises, not only to improve the readability of the code, but also to make it more resistant to crashes.</p><h3 id="les-domaines">Domains</h3><p>The notion of <em>domain</em> is not covered in this article; note, however, that it was added to Node.js in version 0.8.</p><p>In short, and to keep things simple, the idea is more or less to <em>containerize</em> <em>event emitters</em> by associating them with a <em>domain</em>. 
When an error occurs while handling an event managed by a <em>domain</em>, the exception does not crash the program directly: the <em>domain</em> is in charge of handling the error. In general, however, this will not spare you a restart of the process, as the documentation makes clear:</p><blockquote><em>Domain error handlers are not a substitute for closing down your process when an error occurs.</em><br><br>By the very nature of how throw works in JavaScript, there is almost never any way to safely &#x201C;pick up where you left off&#x201D;, without leaking references, or creating some other sort of undefined brittle state.<br><br>The safest way to respond to a thrown error is to shut down the process. Of course, in a normal web server, you might have many connections open, and it is not reasonable to abruptly shut those down because an error was triggered by someone else.<br><br>The better approach is send an error response to the request that triggered the error, while letting the others finish in their normal time, and stop listening for new requests in that worker.<br><br><em>In this way, domain usage goes hand-in-hand with the cluster module, since the master process can fork a new worker when a worker encounters an error. 
For node programs that scale to multiple machines, the terminating proxy or service registry can take note of the failure, and react accordingly.</em></blockquote><p>The Node.js documentation on <em>domains</em> is available at the following URL:</p><ul><li><em><a href="http://nodejs.org/api/domain.html?ref=helyx-org">http://nodejs.org/api/domain.html</a></em></li></ul>]]></content:encoded></item><item><title><![CDATA[Detecting outdated versions of your Node.js dependencies]]></title><description><![CDATA[<p>The Node.js ecosystem is not only very young, but also very dynamic. The versions of the libraries you use tend to change very quickly. To spare you the constant search for the most</p>]]></description><link>https://helyx.org/detecter-les-versions-depassees-de-vos-dependances-node-js/</link><guid isPermaLink="false">62549ee1d35ce80569cadb37</guid><category><![CDATA[npm]]></category><category><![CDATA[Node.js]]></category><dc:creator><![CDATA[Alexis Kinsella]]></dc:creator><pubDate>Wed, 09 Jul 2014 09:00:22 GMT</pubDate><media:content url="https://helyx.org/content/images/2022/04/nodejs-1.jpg" medium="image"/><content:encoded><![CDATA[<img src="https://helyx.org/content/images/2022/04/nodejs-1.jpg" alt="Detecting outdated versions of your Node.js dependencies"><p>The Node.js ecosystem is not only very young, but also very dynamic. The versions of the libraries you use tend to change very quickly. 
To spare you the constant search for the most recent library versions when updating your <em>package.json</em> file, <em>npm</em> provides the <em>npm-outdated</em> tool, which analyzes your dependencies and tells you which ones are no longer up to date.</p><h3 id="npm-outdated">npm-outdated</h3><p>The <em>npm-outdated</em> tool is very simple to use. Call it as follows:</p><pre><code class="language-bash">npm outdated --depth=0</code></pre><p>It produces the output below:</p><figure class="kg-card kg-image-card"><img src="https://helyx.org/content/images/2022/04/npm-outdated.jpeg" class="kg-image" alt="Detecting outdated versions of your Node.js dependencies" loading="lazy" width="415" height="184"></figure><p>Older versions of the tool do not produce colorized output, so upgrading is worthwhile. The version of <em>npm</em> used here is <em>1.4.9</em>.</p><p><em>npm-outdated</em> analyzes your regular dependencies and your development dependencies alike, without distinction.</p><p>The output only lists dependencies whose version is outdated, so you will not see the dependencies that are already up to date.</p><p>Three different versions are reported: <em>Current</em>, <em>Wanted</em> and <em>Latest</em>. 
They are, respectively, the currently installed version, the latest version that matches the version pattern declared for the dependency in your <em>package.json</em> file, and the latest available version of the library.</p><h4 id="option-depth">The depth option</h4><p>The <em>--depth=0</em> parameter restricts the analysis to your direct dependencies, ignoring the dependencies they pull in transitively.</p><p>With the <em>--depth=2</em> parameter, indirect dependencies start to appear in the tool&apos;s output:</p><figure class="kg-card kg-image-card"><img src="https://helyx.org/content/images/wordpress/npm-outdated-depth-1.jpg" class="kg-image" alt="Detecting outdated versions of your Node.js dependencies" loading="lazy"></figure><h4 id="option-json">The json option</h4><p>The <em>--json</em> parameter produces JSON output. This option is particularly handy for feeding the information into build reports, for example, or for having it consumed by other tools.</p><p>Running the following command line:</p><pre><code class="language-bash">npm outdated --depth=0 --json</code></pre><p>gives you the following output:</p><pre><code class="language-json">{
  &quot;coffee-script&quot;: {
    &quot;current&quot;: &quot;1.6.3&quot;,
    &quot;wanted&quot;: &quot;1.6.3&quot;,
    &quot;latest&quot;: &quot;1.7.1&quot;,
    &quot;location&quot;: &quot;node_modules/coffee-script&quot;
  },
  &quot;passport-local&quot;: {
    &quot;current&quot;: &quot;0.1.6&quot;,
    &quot;wanted&quot;: &quot;0.1.6&quot;,
    &quot;latest&quot;: &quot;1.0.0&quot;,
    &quot;location&quot;: &quot;node_modules/passport-local&quot;
  },
  &quot;uglify-js&quot;: {
    &quot;current&quot;: &quot;2.4.13&quot;,
    &quot;wanted&quot;: &quot;2.4.14&quot;,
    &quot;latest&quot;: &quot;2.4.14&quot;,
    &quot;location&quot;: &quot;node_modules/uglify-js&quot;
  }, ...
}</code></pre><h3 id="npm-update">npm-update</h3><p>Now that you know the latest available versions, you may want to update some of your dependencies. To do so, you can use the <em>npm update</em> command.</p><p>To update the <em>request</em> library, you would run the following command:</p><pre><code class="language-bash">npm update request</code></pre><h3 id="badges-pour-votre-repository-github">Badges for your GitHub repository</h3><p>The <a href="https://david-dm.org/?ref=helyx-org">David</a> project lets you generate badges showing whether the versions of your libraries are up to date or outdated.</p><p>This project is particularly interesting because it not only generates reports for your project without you having to lift a finger, but also generates badges that you can display on your project&apos;s pages to show the state of your dependency versions.</p><p>For example, to find out whether the <a href="https://github.com/akinsella/gtfs-playground?ref=helyx-org">gtfs-playground</a> project has up-to-date dependencies, you can visit the following page: <a href="https://david-dm.org/akinsella/gtfs-playground?ref=helyx-org">https://david-dm.org/akinsella/gtfs-playground</a></p><p>The tool works exclusively with GitHub repositories. 
To build a report for your project, simply fill in your organization and the name of your repository in the following URL, then open it:</p><p><a href="https://david-dm.org/?ref=helyx-org">https://david-dm.org/</a>&lt;organization&gt;/&lt;repository&gt;</p><p>Likewise, to get the badge for your project, build the URL used by your <em>img</em> tag as follows:</p><p><a href="https://david-dm.org/?ref=helyx-org">https://david-dm.org/</a>&lt;organization&gt;/&lt;repository&gt;.&lt;extension&gt;</p><p>Which gives the following result for the PNG format:</p><figure class="kg-card kg-image-card"><img src="https://helyx.org/content/images/2022/05/insecure1.png" class="kg-image" alt="Detecting outdated versions of your Node.js dependencies" loading="lazy" width="146" height="18"></figure><p>And for the SVG format:</p><figure class="kg-card kg-image-card"><img src="https://helyx.org/content/images/2022/05/insecure1-1.png" class="kg-image" alt="Detecting outdated versions of your Node.js dependencies" loading="lazy" width="146" height="18"></figure><h3 id="conclusion">Conclusion</h3><p>The Node.js ecosystem evolves rapidly. Libraries therefore regularly ship new features and bug fixes, so do not hesitate to update them.</p><p>Be careful, however, not to rush into installing a library version that is no longer compatible with your code, or one that is buggy. 
Remember to run your tests to make sure that no regression affects your code base.</p>]]></content:encoded></item><item><title><![CDATA[Transform your Node.js code with the Bluebird promise module]]></title><description><![CDATA[<p>When people talk about <em>promises</em> in the Node.js ecosystem, the <em>Q</em> library immediately comes to mind. However, there are many promise modules, each offering something different. In particular, the <em>bluebird</em> module stands out</p>]]></description><link>https://helyx.org/transformez-votre-code-node-js-grace-au-module-de-promises-bluebird/</link><guid isPermaLink="false">62549ee1d35ce80569cadb36</guid><category><![CDATA[Promise]]></category><category><![CDATA[Node.js]]></category><dc:creator><![CDATA[Alexis Kinsella]]></dc:creator><pubDate>Fri, 04 Jul 2014 09:00:44 GMT</pubDate><media:content url="https://helyx.org/content/images/2022/04/bluebird.jpeg" medium="image"/><content:encoded><![CDATA[<img src="https://helyx.org/content/images/2022/04/bluebird.jpeg" alt="Transform your Node.js code with the Bluebird promise module"><p>When people talk about <em>promises</em> in the Node.js ecosystem, the <em>Q</em> library immediately comes to mind. However, there are many promise modules, each offering something different. In particular, the <em>bluebird</em> module stands out thanks to some very interesting features such as <em>&#x201C;promisification&#x201D;</em>.</p><h3 id="promisification">Promisification</h3><p>The Node.js <em>core modules</em> are callback-based. 
To read a file asynchronously, for instance, you call the <em>readFile</em> function of the <em>fs</em> module and process the result in the callback passed as the last argument of the call:</p><pre><code class="language-coffeescript">fs.readFile &quot;file.json&quot;, (err, val) -&gt;
    if err
        console.error &quot;unable to read file&quot;
    else
        try
            val = JSON.parse(val)
            console.log val.success
        catch e
            console.error &quot;invalid json in file&quot;</code></pre><p><em>Bluebird</em> lets you turn the previous code into the following:</p><pre><code class="language-coffeescript">fs.readFileAsync(&quot;file.json&quot;).then(JSON.parse).then (val) -&gt;
    console.log val.success
.catch SyntaxError, (e) -&gt;
    console.error &quot;invalid json in file&quot;
.catch (e) -&gt;
    console.error &quot;unable to read file&quot;</code></pre><p>This transformation is made possible by the <em>promisification</em> of the <em>fs</em> module: calling the <em>promisifyAll</em> function turns all the exposed functions into functions that return promises:</p><pre><code class="language-coffeescript">fs = require &quot;fs&quot;
Promise = require &quot;bluebird&quot;
Promise.promisifyAll fs

fs.readFileAsync(&quot;file.js&quot;, &quot;utf8&quot;).then(...)</code></pre><p><em>promisifyAll</em> leaves the original functions untouched and adds, for each of them, a counterpart suffixed with <em>Async</em> that returns a promise instead of taking a callback.</p><p>Note that chaining <em>catch</em> functions on the promise lets you handle errors differently depending on their type. Here, errors of type <em>SyntaxError</em> are handled differently from errors of any other type.</p><h4 id="promisify">promisify</h4><p>You can also promisify a single function with the <em>promisify</em> function:</p><pre><code class="language-coffeescript">redisGet = Promise.promisify(redisClient.get, redisClient)
redisGet(&apos;foo&apos;).then () -&gt;
    #...</code></pre><p>There is a catch, though: for a method, the function needs 2 parameters. The first is a reference to the function to promisify, and the second is the object the function is attached to.</p><h3 id="nodeify">nodeify</h3><p>The <em>nodeify</em> function is also very interesting, because it lets you register a callback on a <em>bluebird</em> promise and have it called when the promise is settled:</p><pre><code class="language-coffeescript">getDataFor = (input, callback) -&gt;
    dataFromDataBase(input).nodeify(callback)</code></pre><p>This is particularly valuable because it lets you build APIs that can be used both by callback-based code and by promise-based code.</p><p>If the callback is provided, it will be called. Otherwise, you simply use the promise returned by the function to obtain and process the result of the call.</p><p>Example using the promise mechanism:</p><pre><code class="language-coffeescript">getDataFor(&quot;me&quot;).then (dataForMe) -&gt;
    console.log dataForMe</code></pre><p>The same example using the callback mechanism:</p><pre><code class="language-coffeescript">getDataFor &quot;me&quot;, (err, dataForMe) -&gt;
    if err
        console.error err
    console.log dataForMe</code></pre><h4 id="spread">spread</h4><p>Normally, the following code yields the array [1, 2, 3] as its result:</p><pre><code class="language-coffeescript">Promise.resolve([1,2,3]).nodeify (err, result) -&gt;
    # err == null
    # result: [1,2,3]</code></pre><p>However, passing the <em>{spread: true}</em> option to <em>nodeify</em> spreads the result values across the arguments of the callback function:</p><pre><code class="language-coffeescript">Promise.resolve([1,2,3]).nodeify (err, a, b, c) -&gt;
    # err == null
    # a == 1
    # b == 2
    # c == 3
, {spread: true}</code></pre><h3 id="conclusion">Conclusion</h3><p>The <em>bluebird</em> library is rich in interesting functions; you can find them all on the documentation page of the GitHub project:</p><blockquote><em>Link: <a href="https://github.com/petkaantonov/bluebird/blob/master/API.md?ref=helyx-org">https://github.com/petkaantonov/bluebird/blob/master/API.md</a></em></blockquote>]]></content:encoded></item><item><title><![CDATA[Locking the versions of your Node.js dependencies]]></title><description><![CDATA[<p>Node.js comes with a very effective and essential dependency manager: <a href="https://github.com/npm/npm?ref=helyx-org">npm</a>.</p><p>Relying on the dependency information declared in the <em>package.json</em> file, it takes care of fetching the dependencies</p>]]></description><link>https://helyx.org/locker-les-versions-de-vos-dependances-node-js/</link><guid isPermaLink="false">62549ee1d35ce80569cadb35</guid><category><![CDATA[Node.js]]></category><category><![CDATA[npm]]></category><category><![CDATA[semver]]></category><dc:creator><![CDATA[Alexis Kinsella]]></dc:creator><pubDate>Mon, 30 Jun 2014 08:00:48 GMT</pubDate><media:content url="https://helyx.org/content/images/2022/04/npm.png" medium="image"/><content:encoded><![CDATA[<img src="https://helyx.org/content/images/2022/04/npm.png" alt="Locking the versions of your Node.js dependencies"><p>Node.js comes with a very effective and essential dependency manager: <a href="https://github.com/npm/npm?ref=helyx-org">npm</a>.</p><p>Relying on the dependency information declared in the <em>package.json</em> file, it takes care of fetching the declared dependencies and installing them in the <em>node_modules</em> folder of your project when you run the command:</p><pre><code 
class="language-bash">npm install</code></pre><h3 id="pourquoi">Why?</h3><p>Unlike the mechanism offered by Maven in the Java world, Node.js relies on hierarchical dependencies. In other words, <em>npm</em> fetches and installs the associated libraries at every level: the application, and each of the dependencies themselves.</p><p>For example, if both your project and a library it depends on use the <a href="https://github.com/substack/node-mkdirp?ref=helyx-org"><em>mkdirp</em></a> library, then <em>npm</em> downloads and installs the <a href="https://github.com/substack/node-mkdirp?ref=helyx-org"><em>mkdirp</em></a> dependency both in the <em>node_modules</em> folder of your project and in the <em>node_modules</em> folder of that library.</p><p>For each library you depend on, you must declare a version selection pattern that tells <em>npm</em> which version of the dependency to download. The available patterns are varied, ranging from a wildcard to an exact version.</p><h3 id="quels-risques">What are the risks?</h3><p>Very quickly, you will run into problems caused by dependency versions that change.</p><p>At best, this will prevent your applications from running correctly. At worst, it will cause subtle bugs that are very hard to detect or fix, putting your business, or the perceived quality of your software, at risk.</p><h3 id="quelles-solutions-quels-outils">Which solutions? 
Which tools?</h3><p>One possible technique to guard against this problem is to lock the versions of your dependencies by declaring more restrictive version patterns, or even fully pinned versions.</p><p>That does not get you out of trouble, however. No matter how firmly you pin the versions of your dependencies, they in turn rely on other dependencies, whose authors may not apply the versioning rules that suit you.</p><p>A given library may, for instance, declare one of its dependencies with a wildcard. As a result, you will eventually pull in a version that is either incompatible with your code or simply buggy.</p><p>Ideally, you should be able to lock the whole hierarchy of dependency versions and reinstall those dependencies repeatedly in the selected versions.</p><p>The good news is that solutions exist to meet this need, including the <a href="https://github.com/mozilla/npm-lockdown?ref=helyx-org"><em>lockdown</em></a> and <a href="https://github.com/uber/npm-shrinkwrap?ref=helyx-org"><em>npm-shrinkwrap</em></a> tools.</p><h3 id="lockdown">lockdown</h3><p>The <em>lockdown</em> module locks the versions of your project&apos;s dependencies to ensure that the code you develop relies on the same dependency versions in your IDE, during your test phases and in production.</p><p>With <em>Lockdown</em>, you can keep using the <em>npm install</em> command while being sure to get the same code every time the command is run, and 
without having to copy the code of your dependencies into your source control system or to maintain a private npm repository.</p><p>As explained above, even if you declare the exact version of your dependencies in your project&apos;s <em>package.json</em> file, you are still vulnerable to the sudden appearance of an incompatibility with one of your dependencies.</p><p>For example, if your project depends on a package at a specific version, which itself depends on another package declared with a version range, the version of that transitive dependency may change on a future run of the <em>npm install</em> command.</p><p>Unfortunately, this is not the only cause of problems. A library author can also accidentally break your application&apos;s code in other ways:</p><ul><li>By publishing a new version of the library that no longer supports the version of Node.js you are using</li><li>By introducing a bug in code that previously worked fine</li><li>&#x2026;</li></ul><h4 id="utilisation">Usage</h4><p>1. Install a dependency in your project. For example, from the command line:</p><pre><code class="language-bash">npm install &lt;module&gt;@&lt;version&gt; --save</code></pre><p>2. Generate the <em>lockdown.json</em> file by running the <em>lockdown-relock</em> command:</p><pre><code class="language-bash">node_modules/.bin/lockdown-relock</code></pre><p>3. 
Then add the newly created file to your source control system.</p><h4 id="installer-vos-d-pendances-gr-ce-au-fichier-lockdown-json">Installing your dependencies from the lockdown.json file</h4><p>Once the <em>lockdown.json</em> file has been generated, simply run <em>npm install</em> as usual; it will install all dependencies at the expected versions.</p><h4 id="points-forts">Strengths</h4><p><em>Lockdown</em> aims to guarantee that you use identical source code in development and in production. That is why, in addition to storing the versions of the dependencies in use, it also stores checksums of their code. It can therefore detect that the source code of a given version has been modified, and it alerts you to the problem.</p><p>Another point worth noting: the project is maintained by Mozilla, which is reassuring as to the tool&apos;s reliability and longevity.</p><h3 id="npm-shrinkwrap">npm-shrinkwrap</h3><p>Like the <em>lockdown</em> tool, the <em>npm-shrinkwrap</em> command freezes the versions of your application&apos;s dependencies.</p><p>There is nothing to install, however, since the command ships directly with <em>npm</em>. 
Since <em>npm</em> comes with the <em>Node.js</em> installation, you do not need to lift a finger to have the tool available.</p><p>The file that stores the version information is called <em>npm-shrinkwrap.json</em>.</p><h4 id="utilisation-1">Usage</h4><p>Using <em>npm-shrinkwrap</em> is very similar to using <em>lockdown</em>: run <em>npm install</em> to install your dependencies, then run <em>npm shrinkwrap</em> to generate the version file.</p><h4 id="gestion-des-checksums">Checksum handling</h4><p>Unlike <em>lockdown</em>, the <em>npm-shrinkwrap</em> command does not handle checksums. However, there are complementary solutions, such as the <a href="https://github.com/zaach/npm-seal?ref=helyx-org"><em>npm-seal</em></a> package, which aims to complement the <em>npm-shrinkwrap</em> command by providing the missing feature.</p><p>Unlike the <em>npm-shrinkwrap</em> utility, the <em>npm-seal</em> package is a third-party utility that must be installed separately with the following command:</p><pre><code class="language-bash">npm install seal -g</code></pre><h4 id="points-forts-1">Strengths</h4><p>As we have already seen, the command is built into the <em>npm</em> distribution. 
In addition, the tool is maintained by a company that is itself a mark of reliability: <a href="https://www.uber.com/?ref=helyx-org">Uber</a>.</p><h3 id="bon-savoir">Good to know</h3><p>Although <em>lockdown</em> and <em>npm-shrinkwrap</em> offer different solutions, the two tools can be used together without any problem.</p>]]></content:encoded></item><item><title><![CDATA[Enable JSONP support with Express]]></title><description><![CDATA[<p>If your web services are meant to be called from other domains in a web browser, you will need to enable <a href="http://en.wikipedia.org/wiki/JSONP?ref=helyx-org"><em>JSON Padding</em></a> support for your JSON REST services.</p><p>Depending on the server used, JSON Padding support&#x2026;</p>]]></description><link>https://helyx.org/activer-le-support-jsonp-avec-express/</link><guid isPermaLink="false">62549ee1d35ce80569cadb34</guid><category><![CDATA[Express]]></category><category><![CDATA[Node.js]]></category><category><![CDATA[JSON]]></category><dc:creator><![CDATA[Alexis Kinsella]]></dc:creator><pubDate>Fri, 27 Jun 2014 08:00:20 GMT</pubDate><media:content url="https://helyx.org/content/images/2022/04/nodejs-3.jpg" medium="image"/><content:encoded><![CDATA[<img src="https://helyx.org/content/images/2022/04/nodejs-3.jpg" alt="Enable JSONP support with Express"><p>If your web services are meant to be called from other domains in a web browser, you will need to enable <a href="http://en.wikipedia.org/wiki/JSONP?ref=helyx-org"><em>JSON Padding</em></a> support for your JSON REST services.</p><p>JSON Padding support varies widely from one server to another. 
On the Node.js side with Express, the feature is very well supported but is not enabled by default. You therefore have to enable it in your server configuration.</p><p>Unfortunately, you have to dig a little to find out how. To save you some time, here is how to configure your server:</p><pre><code class="language-coffeescript">app.set &apos;jsonp callback name&apos;, &apos;callback&apos;</code></pre><p>The <em>&#x2018;jsonp callback name&#x2019;</em> setting specifies the name of the query string parameter that carries the callback wrapping the returned JSON. In our case, the parameter is called <em>&#x2018;callback&#x2019;</em>.</p><p>A call without a callback returns the following result:</p><pre><code class="language-json">akinsella@~$ curl http://localhost:8000/api/v1/conferences
[
  {
    &quot;id&quot;: 12,
    &quot;backgroundUrl&quot;: &quot;http://blog.xebia.fr/images/devoxxuk-2014-background.png&quot;,
    &quot;logoUrl&quot;: &quot;http://blog.xebia.fr/images/devoxxuk-2014-logo.png&quot;,
    &quot;iconUrl&quot;: &quot;http://blog.xebia.fr/images/devoxxuk-2014-icon.png&quot;,
    &quot;from&quot;: &quot;2014-06-12&quot;,
    &quot;name&quot;: &quot;DevoxxUK 2014&quot;,
    &quot;description&quot;: &quot;The Devoxx UK annual event.&quot;,
    &quot;location&quot;: &quot;Business Design Center&quot;,
    &quot;enabled&quot;: true,
    &quot;to&quot;: &quot;2014-06-13&quot;
  }, ...]</code></pre><p>The headers specify a Content-Type of <em>&#x2018;application/json&#x2019;</em>:</p><pre><code>akinsella@~$ curl -I http://localhost:8000/api/v1/conferences
HTTP/1.1 200 OK
Content-Type: application/json; charset=utf-8
Content-Length: 4397
Date: Sat, 14 Jun 2014 14:11:36 GMT
Connection: keep-alive</code></pre><p>When resources are requested with the <em>&#x2018;callback&#x2019;</em> query string parameter, however, the server generates responses with a Content-Type of <em>&#x2018;text/javascript&#x2019;</em>:</p><pre><code>akinsella@~$ curl -I http://localhost:8000/api/v1/conferences\?callback\=cb
HTTP/1.1 200 OK
Content-Type: text/javascript; charset=utf-8
Content-Length: 4430
Date: Sat, 14 Jun 2014 13:56:03 GMT
Connection: keep-alive</code></pre><p>The response body is as follows:</p><pre><code class="language-json">akinsella@~$ curl http://localhost:8000/api/v1/conferences\?callback\=cb
typeof cb === &apos;function&apos; &amp;&amp; cb([
  {
    &quot;id&quot;: 12,
    &quot;backgroundUrl&quot;: &quot;http://blog.xebia.fr/images/devoxxuk-2014-background.png&quot;,
    &quot;logoUrl&quot;: &quot;http://blog.xebia.fr/images/devoxxuk-2014-logo.png&quot;,
    &quot;iconUrl&quot;: &quot;http://blog.xebia.fr/images/devoxxuk-2014-icon.png&quot;,
    &quot;from&quot;: &quot;2014-06-12&quot;,
    &quot;name&quot;: &quot;DevoxxUK 2014&quot;,
    &quot;description&quot;: &quot;The Devoxx UK annual event.&quot;,
    &quot;location&quot;: &quot;Business Design Center&quot;,
    &quot;enabled&quot;: true,
    &quot;to&quot;: &quot;2014-06-13&quot;
  }, ...]);</code></pre><p>Icing on the cake: the output is generated properly and follows the coding best practices that protect against some XSS attacks associated with JSON Padding.</p>]]></content:encoded></item><item><title><![CDATA[Disable the 'x-powered-by' response header with Express]]></title><description><![CDATA[<p>From a security standpoint, revealing the type of server that runs your web services can be considered a problem. It is therefore preferable not to send this information in the&#x2026;</p>]]></description><link>https://helyx.org/desactiver-len-tete-de-reponse-x-powered-by-avec-express/</link><guid isPermaLink="false">62549ee1d35ce80569cadb33</guid><category><![CDATA[Express]]></category><category><![CDATA[Node.js]]></category><dc:creator><![CDATA[Alexis Kinsella]]></dc:creator><pubDate>Wed, 25 Jun 2014 09:00:21 GMT</pubDate><media:content url="https://helyx.org/content/images/2022/04/http-headers.jpg" medium="image"/><content:encoded><![CDATA[<img src="https://helyx.org/content/images/2022/04/http-headers.jpg" alt="Disable the &apos;x-powered-by&apos; response header with Express"><p>From a security standpoint, revealing the type of server that runs your web services can be considered a problem. It is therefore preferable not to send this information in the HTTP response headers with Express.</p><p>Example HTTP response with the <em>&#x2018;X-Powered-By&#x2019;</em> header enabled:</p><pre><code class="language-bash">akinsella@~$ curl -I http://localhost:8000/api/v1/conferences
HTTP/1.1 200 OK
X-Powered-By: Express
Content-Type: application/json; charset=utf-8
Content-Length: 4397
Date: Sat, 14 Jun 2014 22:43:15 GMT
Connection: keep-alive</code></pre><p>To disable it, simply declare the following option in your application&apos;s configuration code:</p><pre><code class="language-coffeescript">app.disable &quot;x-powered-by&quot;</code></pre><p>HTTP clients connected to your services will then no longer receive this information:</p><pre><code class="language-bash">akinsella@~$ curl -I http://localhost:8000/api/v1/conferences
HTTP/1.1 200 OK
Content-Type: application/json; charset=utf-8
Content-Length: 4397
Date: Sat, 14 Jun 2014 22:43:15 GMT
Connection: keep-alive</code></pre>]]></content:encoded></item><item><title><![CDATA[Create a logging middleware for incoming HTTP requests with Express]]></title><description><![CDATA[<p>Want to log incoming HTTP connections with Node.js and Express? Nothing could be simpler! Just declare a middleware as follows:</p><pre><code class="language-coffeescript">util = require &apos;util&apos;

module.exports = (req, res, next) -&gt;
    console.log  &quot;&quot;&quot;---------------------------------------------------------
                    Http</code></pre>]]></description><link>https://helyx.org/creer-un-middleware-de-log-des-requetes-http-entrantes-pour-express/</link><guid isPermaLink="false">62549ee1d35ce80569cadb32</guid><category><![CDATA[CoffeeScript]]></category><category><![CDATA[Express]]></category><category><![CDATA[HTTP]]></category><category><![CDATA[Logging]]></category><category><![CDATA[Node.js]]></category><dc:creator><![CDATA[Alexis Kinsella]]></dc:creator><pubDate>Mon, 23 Jun 2014 08:00:50 GMT</pubDate><media:content url="https://helyx.org/content/images/2022/04/logger-8.jpg" medium="image"/><content:encoded><![CDATA[<img src="https://helyx.org/content/images/2022/04/logger-8.jpg" alt="Create a logging middleware for incoming HTTP requests with Express"><p>Want to log incoming HTTP connections with Node.js and Express? Nothing could be simpler! Just declare a middleware as follows:</p><pre><code class="language-coffeescript">util = require &apos;util&apos;

module.exports = (req, res, next) -&gt;
    console.log  &quot;&quot;&quot;---------------------------------------------------------
                    Http Request - Pid process: [#{process.pid}]
                    Http Request - Url: #{req.url}
                    Http Request - Query: #{util.inspect(req.query)}
                    Http Request - Method: #{req.method}
                    Http Request - Headers: #{util.inspect(req.headers)}
                    Http Request - Body: #{util.inspect(req.body)}
                    ---------------------------------------------------------&quot;&quot;&quot;

    next()</code></pre><p>As you can see, no external module is required. Then simply plug your new middleware into the configuration code of your <em>Express</em> server, as follows:</p><pre><code class="language-coffeescript">express = require &apos;express&apos;
requestLogger = require &apos;./lib/requestLogger&apos;

app = express()

app.configure -&gt;
    console.log &quot;Environment: #{app.get(&apos;env&apos;)}&quot;
    app.set &apos;port&apos;, 8000

    ...

    app.use requestLogger

    ...

    app.use app.router

app.listen app.get(&apos;port&apos;)</code></pre><p>The log produced for an HTTP request looks like this:</p><pre><code class="language-bash">---------------------------------------------------------
Http Request - Pid process: [26074]
Http Request - Url: /
Http Request - Query: {}
Http Request - Method: GET
Http Request - Headers: { host: &apos;localhost:9000&apos;,
  connection: &apos;keep-alive&apos;,
  accept: &apos;text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8&apos;,
  &apos;user-agent&apos;: &apos;Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.1916.153 Safari/537.36&apos;,
  &apos;accept-encoding&apos;: &apos;gzip,deflate,sdch&apos;,
  &apos;accept-language&apos;: &apos;fr-FR,fr;q=0.8,en-US;q=0.6,en;q=0.4&apos;,
  cookie: &apos;...&apos; }
Http Request - Body: undefined
---------------------------------------------------------</code></pre>]]></content:encoded></item></channel></rss>