Dell Technologies #PowerStore and #CloudIQ are creating an innovative ecosystem that brings advanced autonomous storage and analytics to a company's fingertips. CloudIQ is a SaaS-based offering that comes at no additional cost with Dell data center products covered by a current ProSupport or ProSupport Plus contract.
CloudIQ is a powerful #AIOps tool – a term coined by Gartner a while back – that stands for Artificial Intelligence for IT Operations. CloudIQ combines proactive monitoring, machine learning and predictive analytics so you can take quick action and simplify operations of your on-premises infrastructure and data protection in the cloud.
Great! Customers with Dell’s data center portfolio can use this powerful, easy-to-use tool at no additional cost to see, analyze, and monitor their environment. The following products are currently integrated into CloudIQ:
- Storage: PowerStore, PowerMax, PowerScale, PowerFlex, PowerVault, Unity/Unity XT, XtremIO, and SC Series
- PowerEdge Servers
- Data Protection: PowerProtect
- Converged and Hyper Converged: VxBlock, VxRail, PowerFlex
- Networking: PowerSwitch / Connectrix
- APEX Data Storage Services offerings
You can even test drive CloudIQ using an interactive demo located here: https://www.dell.com/en-us/dt/product-demos/cloudiq/index.htm
PowerStore + CloudIQ
If your PowerStore array is not currently configured to connect to CloudIQ, then let’s do that now. Simply click on the Settings icon at the top right of PowerStore Manager and then scroll to Support and Support Connectivity. Then click Connection Type and ensure that the “Connect to CloudIQ” box is checked.
Settings -> Support -> Support Connectivity -> Connection Type -> Connect to CloudIQ
Once your PowerStore is connected and sending data into CloudIQ you can start to harness the power of CloudIQ.
Direct launch from PowerStore Manager
First, let’s take note of the “Launch External Applications” icon at the top right of PowerStore Manager. This is a native capability of PowerStore Manager that allows instant connectivity and launching of applications like CloudIQ, VMware vSphere, and metro node technology.
Clicking the CloudIQ link will launch you directly into your securely authenticated CloudIQ environment within the context of the Dell support site.
The initial landing page is the CloudIQ dashboard where you will be presented with an overall view of your data center inventory of supported Dell products.
As you can see, this dashboard is extremely easy to navigate and you can immediately recognize systems with low health scores, cybersecurity vulnerabilities, capacity approaching full, and much more.
By clicking on the “Inventory” tab on the left side of the screen, you can quickly jump to a detailed list of all available assets: storage, servers, networking, and data protection.
If you have a lot of assets, the “Filter” icon at the top left of the Inventory screen is your friend! You can use it to select which product you want displayed. In this example, we are going to select PowerStore, as that is our area of focus.
Once you select PowerStore in the filter list you are presented with only the PowerStore arrays in your inventory.
By selecting one of the listed PowerStore clusters in the Inventory page you will be dropped directly into the dashboard for that PowerStore cluster. This is where granularity comes into play.
You can view the health, inventory, capacity, performance, and cybersecurity of the cluster. The Health tab gives you an overall view of the appliance along with critical information like the health score.
In this example, you can see that the health score has dropped to 70, giving it a poor rating. The primary reason listed for this poor score is:
“Capacity is the top health check category impacting Manufacturing_Dev’s health score.”
Something is going on that has triggered a capacity anomaly. Based on the health score being dropped into the red, the administrator of the array would receive an email notification that the overall health of this array is degraded and needs immediate attention.
But what is causing the capacity anomaly? This could be for various reasons, but a couple of primary examples could be:
Ransomware
In the case of ransomware, one of the primary attack vectors is to encrypt the host data. Encrypting the host data would mean that the writes to the array are now non-reducible since you cannot deduplicate or compress encrypted data. This would mean that the growth profile of a particular set of data is suddenly trending higher than normal – hence an anomaly.
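To make the "non-reducible" point concrete, here is a minimal Python sketch. It is an illustration only, not how PowerStore performs data reduction: it uses the standard zlib compressor, a made-up repetitive record as stand-in host data, and seeded pseudo-random bytes as a stand-in for ciphertext (an assumption, since well-encrypted data looks random to a compressor).

```python
import random
import zlib


def compression_ratio(data: bytes) -> float:
    """Return compressed size divided by original size (lower means more reducible)."""
    return len(zlib.compress(data, 6)) / len(data)


# Hypothetical, repetitive host data: stands in for typical reducible content.
plain = b"order_id=1001;status=shipped;region=us-east\n" * 2000

# Seeded pseudo-random bytes: an assumed stand-in for encrypted host data.
rng = random.Random(42)
encrypted_like = bytes(rng.getrandbits(8) for _ in range(len(plain)))

print(f"plain data ratio:     {compression_ratio(plain):.3f}")
print(f"encrypted-like ratio: {compression_ratio(encrypted_like):.3f}")
```

The repetitive data shrinks to a small fraction of its size, while the random-looking data does not shrink at all. On an array, that difference shows up as physical capacity being consumed much faster for the same amount of host writes.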
Runaway Application or Logs
Anyone administering storage has received the call from time to time about an application or host “running out of space.” Many times this is attributed not to natural growth of an application, as that is planned for, but rather to a sudden burst of capacity utilization. This often takes place as a result of application changes or log setting changes.
In either of these cases, ransomware or runaway apps / logging, CloudIQ will detect this sudden change in capacity behavior, flag it as an imminent issue (anomaly), and drop the health score of the array, triggering a notification to the administrator.
Noting that our health score deduction (-30) was the result of a capacity issue, let’s dig a little deeper by clicking on the “Capacity” tab.
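The arithmetic behind the score in this example is simple: 100 minus the 30-point capacity deduction gives 70. The Python sketch below models that under two assumptions that are not confirmed in this post: that the score is 100 minus the single largest deduction across categories, and that the rating bands are 95 and above for good, 71 to 94 for fair, and 70 or below for poor. The function names and category names other than Capacity are hypothetical.

```python
from typing import Dict, Optional, Tuple


def health_score(deductions: Dict[str, int]) -> Tuple[int, Optional[str]]:
    """Return (score, top impacting category).

    Assumption: the score is 100 minus the largest single deduction.
    """
    impacting = {cat: pts for cat, pts in deductions.items() if pts > 0}
    if not impacting:
        return 100, None
    top = max(impacting, key=impacting.get)
    return 100 - impacting[top], top


def rating(score: int) -> str:
    """Map a score to a rating band (band thresholds are assumed)."""
    if score >= 95:
        return "good"
    if score > 70:
        return "fair"
    return "poor"


# Values from the example in this post: a 30-point capacity deduction.
score, category = health_score({"Capacity": 30, "Performance": 0, "Components": 0})
print(score, category, rating(score))
```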
You have several points of interest here.
- Capacity Forecast – projecting the time the array has until full – red indicating little time remaining
- Previous Forecast – how the array was trending before the anomaly
- Anomaly Forecast – represented by the red line and indicates the anomaly growth pattern
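The difference between the previous forecast and the anomaly forecast can be sketched with a simple linear trend. This is a rough illustration, not CloudIQ's forecasting model, which this post does not describe. The sample values, the 100 TB capacity, and the function names are all made up for the example.

```python
from typing import List, Optional, Tuple

Sample = Tuple[float, float]  # (day, used capacity in TB)


def growth_rate(samples: List[Sample]) -> float:
    """Least-squares slope of used capacity over time, in TB per day."""
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    den = sum((x - mean_x) ** 2 for x, _ in samples)
    return num / den


def days_until_full(samples: List[Sample], capacity_tb: float) -> Optional[float]:
    """Project days remaining from the last sample; None if usage is not growing."""
    slope = growth_rate(samples)
    if slope <= 0:
        return None
    _, last_used = samples[-1]
    return (capacity_tb - last_used) / slope


# Hypothetical samples: steady growth, then a sudden change in the growth pattern.
before_anomaly = [(0, 50.0), (1, 50.5), (2, 51.0), (3, 51.5), (4, 52.0)]
after_anomaly = [(4, 52.0), (5, 58.0), (6, 64.0), (7, 70.0)]

print("previous forecast:", days_until_full(before_anomaly, 100.0), "days")
print("anomaly forecast: ", days_until_full(after_anomaly, 100.0), "days")
```

With these made-up numbers, the steady trend leaves 96 days until full, while the post-anomaly trend leaves only 5. That kind of gap is what turns a routine forecast into an "imminent" one.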
If you hover over the anomaly starting point, you will see a pop-up indicating the exact moment the capacity growth pattern changed or, as you can see from the image above, became “imminent.” This imminent issue or anomaly began at 11:06 on September 2nd, which tells you precisely when the issue began. If hosts are found to be compromised via ransomware, you can instantly recover from an immutable snapshot. Or perhaps you need to have a conversation with an application or host owner to check for changes that have resulted in the sudden capacity change.
Another critical area of importance is the performance of the array. CloudIQ is constantly measuring and monitoring the performance of an array and also trending the statistics.
The Performance tab has a wealth of information that can help you understand trends in workload performance, whether in IOPS, bandwidth, or latency. One of the features I want to highlight is the anomaly detection capability for performance.
In the image above you can see that there has been an anomaly detected regarding overall latency of the system. The latency during this specific time period has exceeded the normal average or historical bounds measured over the past 24 hours. Basically, something has triggered a workload shift that has resulted in higher latency than is normal for the trend of the array. Clicking on the “Details” button will give you a breakout of the volumes that may be contributing to this anomaly. Having the specific volumes means that you can map to a specific host and work to quickly resolve the issue.
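The idea of "exceeding historical bounds" can be sketched in a few lines of Python. This is only a simplified stand-in, since the post does not describe CloudIQ's actual algorithm: it assumes the bounds are the mean plus or minus three standard deviations of the previous 24 hourly samples, and all latency values are invented.

```python
from statistics import mean, stdev
from typing import List, Tuple


def latency_bounds(history: List[float], k: float = 3.0) -> Tuple[float, float]:
    """Return (low, high) bounds as mean +/- k standard deviations (assumed rule)."""
    mu = mean(history)
    sigma = stdev(history)
    return mu - k * sigma, mu + k * sigma


def find_anomalies(history: List[float], recent: List[float], k: float = 3.0) -> List[Tuple[int, float]]:
    """Return (index, value) for each recent sample outside the historical bounds."""
    low, high = latency_bounds(history, k)
    return [(i, v) for i, v in enumerate(recent) if v < low or v > high]


# Hypothetical hourly latency in milliseconds for the past 24 hours.
history = [0.5, 0.6, 0.4] * 8

# Hypothetical new samples: two of them are far above the usual range.
recent = [0.55, 0.62, 1.8, 2.1, 0.58]

for index, value in find_anomalies(history, recent):
    print(f"sample {index}: {value} ms is outside the historical bounds")
```

A real detector would also account for daily and weekly patterns, but the principle is the same: compare current behavior to what is normal for this particular array rather than to a fixed threshold.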
If you again view this feature through the lens of ransomware, you can see that this capability could potentially help you detect and isolate a threat. For example, say a host traditionally runs at 1K IOPS as reported at the volume on the array. If there was a compromise and an encryption engine started running on the host, this could potentially push the IO of the volume over its normal trend. Working to encrypt the data would be above and beyond the normal application workload and could throw a performance anomaly. Leveraging CloudIQ would allow you to detect the specific time something may have started and work towards a remediation, such as recovering from an immutable snapshot.
But what if something on the array has induced the anomaly? Another cool feature within CloudIQ is the ability to detect configuration changes.
Clicking on the light blue bar on the timeline will open a window that shows you exactly what configuration changes were made on the array at that specific time.
This capability not only allows you to see when changes were introduced on a timeline, but exactly WHAT changes were made to the system that could be causing an impact. This is a very granular capability that can track changes down to the specific volume or file system on the PowerStore appliance.
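Conceptually, this kind of change tracking amounts to comparing configuration snapshots taken at different times. The Python sketch below shows the idea with flat dictionaries. It is not CloudIQ's implementation, and the object names and attributes are hypothetical.

```python
from typing import Any, Dict, List, Tuple

Change = Tuple[str, str, Any, Any]  # (kind, key, old value, new value)


def diff_config(before: Dict[str, Any], after: Dict[str, Any]) -> List[Change]:
    """List what was added, removed, or modified between two snapshots."""
    changes: List[Change] = []
    for key in sorted(before.keys() | after.keys()):
        if key not in before:
            changes.append(("added", key, None, after[key]))
        elif key not in after:
            changes.append(("removed", key, before[key], None))
        elif before[key] != after[key]:
            changes.append(("modified", key, before[key], after[key]))
    return changes


# Hypothetical snapshots of an array's configuration at two points in time.
snapshot_before = {
    "volume/app01/size_gb": 500,
    "volume/app01/protection_policy": "gold",
    "filesystem/share01/size_gb": 200,
}
snapshot_after = {
    "volume/app01/size_gb": 1000,
    "volume/app01/protection_policy": "gold",
    "volume/app02/size_gb": 250,
}

for kind, key, old, new in diff_config(snapshot_before, snapshot_after):
    print(f"{kind:9} {key}: {old} -> {new}")
```

Lining up a list like this against the performance timeline is what lets you ask whether a change on the array, rather than a change on the host, introduced the anomaly.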
I have covered the Cybersecurity Assessment feature in an earlier post. In this article, I want to give a few more specifics on this capability.
On the Cybersecurity screen, you can review any known issues or exploits that are making your environment less secure, each with a rating or priority.
You can get more details about the cybersecurity issues by expanding the issue.
CloudIQ will describe the issue in detail and also provide you with a direct link to Dell support with instructions on how to remediate.
There is also the capability of directly viewing cybersecurity advisories.
Above you can see that CloudIQ provides direct overviews of security advisories, whether they relate to VMware on PowerStore or are specific to the Dell PowerStore array. Clicking on the advisory ID will take you directly to the vendor's site covering the security details of the advisory, including remediation details.
The industry is driving towards autonomous management, monitoring, and of course security integration. PowerStore’s autonomous operations capabilities (covered in a separate article), coupled with the artificial intelligence of CloudIQ, make for an incredibly powerful combination – all at no additional cost. If you have not enabled CloudIQ on your PowerStore array, check that box today and start reaping the benefits of a more secure and more autonomous environment.
Remember that CloudIQ covers the Dell Technologies data center portfolio at large. This means that your value goes well beyond just a single storage array. Stay tuned, because there is much more to come regarding CloudIQ updates and digging into security aspects. I see another post coming sooner rather than later!
A guest post by Jodey Hogeland