Making predictions—for example, forecasting the weather or calculating the arrival time to a destination Identifying patterns—Recognizing objects in images and classifying them according to their contents Identifying anomalies—Detecting fraud or hacking attempts
ML/AI requires significant human input, and some applications require more human input than others. In fact, ML/AI is broken down into learning models based on the degree of human input required for them to operate—supervised and unsupervised.
Supervised Learning
Supervised learning applications include predictions and image or voice recognition. It requires creating or “training” an ML/AI model by feeding sample data into it. For example, if you want to create an application that identifies your voice in an audio file, you'd train the model using some quality samples of your voice. Subsequently, the model should be able to identify your voice accurately most of the time.
The accuracy of models depends on the size and quality of the training dataset. In terms of percentage, accuracy can be as high as the upper 90s, but rarely higher.
Unsupervised Learning
Unsupervised learning applications include anomaly detection and automatic classification. Unlike the supervised learning model, the unsupervised learning model requires no manual training. Instead, unsupervised learning automatically finds patterns in a dataset and creates groupings or clusters based on those patterns. For example, given a collection of potato pictures, an unsupervised ML/AI algorithm might classify them based on size, color, and the presence or absence of eyes.
It's important to understand the limitations of this approach. The algorithm doesn't know that it's looking at potatoes, so it might classify a bean as a potato if they look similar enough. All that the algorithm does is to label the pictures based on similarities. If you want your application to be able to take a collection of random images and identify all the potatoes, you'll have to use a supervised learning model.
Another interesting unsupervised learning application is fraud detection. If you have a credit or debit card, there's a good chance that your financial institution is using ML/AI for this purpose. It works by analyzing how you normally use your card—purchase amounts, the types of merchandise or services you purchase, and so on—to establish a baseline pattern. If your purchase activity deviates too much from this pattern, the algorithm registers it as an anomaly, probably generating a call from your financial institution.
If this sounds similar to the supervised learning model, it's because it is, but with one big difference—the model is always retraining itself with new data. To better understand this, imagine that you frequent a particular coffee shop and use your card to buy coffee. One day, the coffee shop closes its doors for good, and you end up going to another shop. This may initially register as a slight anomaly, but as you continue visiting the new coffee shop, that becomes your new baseline. The model is continually retraining itself on what your normal card usage looks like.
To put a finer point on it, in supervised learning the algorithm is looking for things it does recognize. In unsupervised learning, the algorithm is looking for things it doesn't recognize.
Creating and Validating a Cloud Deployment
In this section, I will go deeper into the technology architectures and processes that you'll find in the world of cloud computing. It is important to see how much of an effect virtualization has had in the creation of cloud computing and to understand the process of taking physical hardware resources, virtualizing them, and then assigning these resources to systems and services running in the virtualized data center.
The Cloud Shared Resource Pooling Model
Every cloud service depends on resource pooling. Resource pooling is when the cloud service provider virtualizes physical resources into a group, or pool, and makes these pooled resources available to customers. The underlying physical resources are then dynamically allocated and reallocated as the demand requires. Recall the NIST definition of cloud computing as “a model for enabling convenient, on-demand network access to a shared pool of configurable computing resources …” Resource pooling is a defining feature of the cloud.
Resource pooling hides the physical hardware from the customer and allows many customers to share resources such as storage, compute power, and network bandwidth. This concept is also called multitenancy. We'll look at some examples of these pooled resources in the following sections.
Compute Pools
In the cloud provider's data center, there are numerous physical servers that each run a hypervisor. When you provision a VM in the cloud, the provider's cloud orchestration platform selects an available physical server to host your VM. The important point here is that it doesn't matter which particular host you get. Your VM will run the same regardless of the host. In fact, if you were to stop a running VM and then restart it, it would likely run on a completely different host, and you wouldn't be able to tell the difference. The physical servers that the provider offers up for your VMs to run on are part of a singular compute pool. By pooling physical servers in this way, cloud providers can flexibly dispense computing power to multiple customers. As more customers enter the cloud, the provider just has to add more servers to keep up with demand.
Drilling down a bit into the technical details, the hypervisor on each host will virtualize the physical server resources and make them available to the VM for consumption. Multiple VMs can run on a single physical host, so one of the hypervisor's jobs is to allow the VMs to share the host's resources while remaining isolated from one another so that one VM can't read the memory used by another VM, for instance. Figure 1.14 shows this relationship between the virtual machines and the hardware resources.
FIGURE 1.14 Shared resource pooling
The hypervisor's job is to virtualize the host's CPUs, memory, network interfaces, and—if applicable—storage (more on storage in a moment). Let's briefly go over how the hypervisor virtualizes CPU, memory, and network interfaces.
On any given host, there's a good chance that the number of VMs will greatly outnumber the host's physical CPU cores. Therefore, the hypervisor must coordinate or schedule the various threads run by each VM on the host. If the VMs all need to use the processor simultaneously, the hypervisor will figure out how to dole out the scarce CPU resources to the VMs contending for it. This process is automatic, and you'll probably never have to think about it. But you should know about another “affinity” term that's easy to confuse with hypervisor affinity: CPU affinity. CPU affinity is the ability to assign a processing thread to a core instead of having the hypervisor dynamically allocate it. A VM can have CPU affinity enabled, and when a processing thread is received by the hypervisor, it will be assigned to the CPU it originally ran on. You're not likely to see it come up on the exam, but just be aware that CPU affinity can and often does result in suboptimal performance, so it's generally best to disable it. A particular CPU assigned to a VM can have a very high utilization rate while another CPU might be sitting idle; affinity will override the hypervisor's CPU selection algorithms and be forced to use the saturated CPU instead of the underutilized one.
Now let's talk about random access memory (RAM). Just as there are a limited number of CPU cores, so there is a finite amount of RAM installed in the physical server. This RAM is virtualized by the hypervisor software into memory pools and allocated to virtual machines. When you provision a VM, you choose the amount of RAM to