Question 1 – IoT infrastructure
The fourth industrial revolution revolves around big data, data analytics, and machine learning. It’s focused on technology supported by the presence of the internet, turning the world into the Internet of Things. Data analytics and machine learning are fundamental aspects of the Internet of Things and are driving the fourth industrial revolution. Three key drivers are pushing the fourth industrial revolution forward; they are divided broadly into digital, physical, and biological.
Digital – This category consists of IoT, cloud computing, big data, artificial intelligence, and machine learning. All the sectors mentioned above are driven by the digital power of today’s technology. It’s the broadest and best-known category, originating directly from technology. Artificial intelligence and machine learning depend on big data and the Internet of Things. The ability of machines to learn from a considerable amount of data and gain experience, just like the human mind, is impressive and is what builds up machine learning. The availability of massive data on digital platforms has also made it easier to operate with big data and the Internet of Things.
Physical drivers – The physical drivers consist of autonomous cars and 3D printing. These are inventions of the fourth industrial revolution that depend on the internet to operate. Driverless vehicles will utilize the internet and onboard computers to complete the usual tasks of delivering people and luggage.
Biological drivers – These include existing examples such as genetic engineering and neurotechnology. Genetic engineering is proving effective and is already in use, paving the way for the fourth industrial revolution. All these sectors are supported by the infrastructure of the Internet of Things.
Question 2 – steps in data preprocessing
There are four main stages involved in data preprocessing: data consolidation, data cleaning, data transformation, and data reduction. These steps are necessary for converting misaligned, inaccurate, and complex raw data into well-defined data ready for analytics algorithms. First, during the initial step – consolidation – only relevant data is assembled from different sources, including the selection of necessary variables and records, and unnecessary information is eliminated. The three main tasks handled in this step are the collection, selection, and integration of data. The most popular method employed in this step is SQL queries; SQL is crucial for accessing data from a pool or store of raw information, which is the function of the first step in data preprocessing.
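As a minimal sketch of the consolidation step, assuming a hypothetical SQLite store of raw data (the database file, table names, columns, and date filter are all illustrative, not taken from the text), relevant variables from two sources can be selected and integrated with one SQL query through pandas:

```python
import sqlite3
import pandas as pd

# Connect to a hypothetical SQLite store of raw data (path is illustrative).
conn = sqlite3.connect("raw_data.db")

# Select only the relevant variables and records from two sources
# and integrate them with a join, leaving unnecessary columns behind.
query = """
SELECT p.patient_id, p.age, p.gender, v.diagnosis, v.visit_date
FROM patients AS p
JOIN visits   AS v ON v.patient_id = p.patient_id
WHERE v.visit_date >= '2019-01-01'
"""
consolidated = pd.read_sql_query(query, conn)
conn.close()

print(consolidated.head())
```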
The second step is data cleaning. In this step, missing values are handled appropriately, and noise in the data is identified and reduced. Unnecessary or worthless information is eliminated, and missing values are imputed to make the data complete. Erroneous data is also recognized and eliminated at this stage. There is no single method used in this stage; the information is cleaned according to its nature. The mean, median, or mode is used to fill in the missing values, or they are replaced with a constant or the most likely value. On the other hand, averages and standard deviations are used to reduce noise by identifying outliers, which are then eliminated or smoothed.
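A minimal sketch of the cleaning step in pandas, assuming an illustrative numeric column named `age`; the sample values and the two-standard-deviation cutoff are assumptions made only for the example:

```python
import pandas as pd

df = pd.DataFrame({"age": [25, 31, None, 29, 410, 33, None, 27]})

# Fill missing values with the median (the mean or mode could be used instead).
df["age"] = df["age"].fillna(df["age"].median())

# Identify outliers using the mean and standard deviation, then drop them.
mean, std = df["age"].mean(), df["age"].std()
is_outlier = (df["age"] - mean).abs() > 2 * std
df = df[~is_outlier]

print(df)
```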
Thirdly is data transformation. In this step, data is normalized, discretized, and new attributes are constructed. Data ranges are standardized to mitigate any form of bias, and some numerical values are converted into categorical values, either into hierarchies or into the form of low, medium, and high. The primary method used here is statistical: value ranges are determined and rescaled to a standard range. The final step is data reduction. It’s crucial to reduce the amount of data for easier understanding and evaluation. Here, the number of records and attributes is reduced. Random sampling is a standard method used in this step to minimize the number of records in the data.
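A minimal sketch of transformation and reduction, assuming an illustrative `income` column; the sample values, three bins with low/medium/high labels, and the 50% sampling fraction are assumptions made for the example:

```python
import pandas as pd

df = pd.DataFrame({"income": [1200, 3500, 800, 9700, 4300, 650, 7200, 2900]})

# Transformation: min-max normalization to the standard range [0, 1].
lo, hi = df["income"].min(), df["income"].max()
df["income_norm"] = (df["income"] - lo) / (hi - lo)

# Transformation: discretize the numeric values into low / medium / high.
df["income_level"] = pd.cut(df["income"], bins=3, labels=["low", "medium", "high"])

# Reduction: keep a random sample of half the records.
reduced = df.sample(frac=0.5, random_state=42)

print(reduced)
```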
Question 3 – Confusion Matrix
- Accuracy rate – accuracy is determined by adding the correctly classified instances, positive and negative, and dividing by the total number of cases.
Whereby TP is true positive, TN is true negative, FP is false positive, and FN is false negative:
Accuracy rate = (TP + TN) / total
= (170 + 105) / (170 + 37 + 36 + 105)
= 275 / 348
= 0.7902
The best possible accuracy rate is 1.0, while the worst is 0.0; with an accuracy of 0.7902, which is reasonably close to 1.0, this model does a fair job of predicting whether a patient has Alzheimer’s disease.
- b) Sensitivity rate – it’s also called the true positive rate, and is given by the ratio of correctly classified positives over the total positives. The following formula determines it:
Sensitivity = TP / (TP + FN)
Therefore, our sensitivity value will be; =170 / (170 + 36)
= 170 / 206
= 0.8252
This model is suitable for predicting whether a patient has Alzheimer’s disease because the sensitivity of 0.8252 is close to 1.0. The best sensitivity rate is always 1.0, whereas the worst is 0.0.
- c) Specificity – it’s also called the true negative rate, which is the ratio of correctly classified negatives over the total negative count. The following formula determines it:
Specificity = TN / (TN + FP)
Therefore, in this case, our true value will be; 105 / (105 + 37)
= 105 / 142
= 0.7394
As stated earlier, the best specificity value is 1.0. Therefore, this model is good at predicting whether the patient has Alzheimer’s disease because 0.7394 is reasonably close to 1.0.
- d) F1 score – it is also known as the F measure. The best score is 1.0 while the worst is 0.0. The formula used to calculate the F score is given below;
F measure = (2 * precision *recall) / (precision + recall)
Precision = TP / (TP + FP)
= 170 / (170+37)
= 170 / 207
= 0.8213
Recall = TP / (TP +FN)
= 170 / (170 + 36)
= 170 / 206
=0.8252
Therefore, our F1 score in this case will be: (2 * 0.8213 * 0.8252) / (0.8213 + 0.8252)
= (2 * 0.6777) / 1.6465
= 1.3555 / 1.6465
F1 score = 0.8232
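As a cross-check, here is a small Python sketch that recomputes all of the measures above from the confusion-matrix counts used in this question (TP = 170, TN = 105, FP = 37, FN = 36); the counts and expected values come from the calculations above:

```python
# Confusion-matrix counts from the calculations above.
TP, TN, FP, FN = 170, 105, 37, 36

total = TP + TN + FP + FN
accuracy = (TP + TN) / total          # 275 / 348  ~= 0.7902
sensitivity = TP / (TP + FN)          # 170 / 206  ~= 0.8252 (true positive rate / recall)
specificity = TN / (TN + FP)          # 105 / 142  ~= 0.7394 (true negative rate)
precision = TP / (TP + FP)            # 170 / 207  ~= 0.8213
f1 = 2 * precision * sensitivity / (precision + sensitivity)  # ~= 0.8232

print(f"accuracy={accuracy:.4f} sensitivity={sensitivity:.4f} "
      f"specificity={specificity:.4f} precision={precision:.4f} f1={f1:.4f}")
```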
Question 4 – Social Media Analytics
Social media analytics is the scientific and systematic means of consuming the massive amounts of data created on web-based tools and social media outlets, using a range of techniques, and it helps organizations worldwide increase their competitiveness. Social media analytics now helps many organizations globally reach billions of customers and understand them better than ever before. More so, it’s now acting as a tool for integrated communications and marketing. Many companies around the globe now prefer social media for marketing and advertising their products and services, including attracting more customers, because of the rich data it provides.
Social media sites have influenced customers’ emotions toward various companies, hence affecting purchase decisions. Companies consistently connect with individual potential customers around the world over social media networks such as Facebook, Twitter, and YouTube, among others. The interactions, relationships, and business engagement with customers on these platforms then affect the emotions of customers, on which purchase decisions depend. Companies post attractive photos and pictures of their products on social media and communicate the importance of those products, hence attracting customers and convincing them to buy. Organizations also use analytics on social media platforms, such as views, likes, comments, and shares, to track their progress, competitiveness, and the effectiveness of marketing and advertisement. As a result, social media conversation and customer engagement lead to customer satisfaction and increased profitability.
Social media analytics can be employed to measure customer sentiment or the impact of social media activity on customers’ decisions. There are three categories of tools for measuring these impacts and an organization’s efforts. First, descriptive analytics measures activity using simple counts and statistics; it displays the number of followers, reviews, and the channels most used by customers. The second tool is advanced analytics, which involves text and predictive analytics that critically examine online conversations to identify sentiments, themes, and connections. Last is social network analysis, which identifies the significant sources of influence by analyzing the links that exist between fans, friends, followers, etc.
Question 5 – Stream Analytics
The term stream analytics means the same as analytics of real-time data or data-in-motion. It’s the extraction of actionable information from data that is streaming or flowing continuously. Stream data is data that is active or flowing constantly, and the analytics involved in analyzing such data is stream analytics. The application of stream analytics is essential and widespread in the world.
One primary application is in the energy industry, in power supply systems. It’s applied in the smart grid to ensure a steady and sufficient amount of power reaches consumers. The intelligent networks are integrated into the system in such a way that they process streaming data to ensure power is distributed optimally to users. Apart from that, stream analytics provides accurate short-term predictions concerning unexpected power demands and peaks of renewable energy. Its application in the energy industry fulfills the needs of various energy companies in producing power according to demand. This is made possible by analytics of streaming data that originates from production system sensors and smart meters, among other sources. The capability of this kind of technology to predict future production and consumption, including anomalies, is vital in making supply decisions. It’s also necessary for regulating the use of power and keeping prices favorable. The energy industry has reduced problems in the production and consumption of energy with the help of real-time analytics, simply because the analytics provides information in time, and hence future decisions are made accurately. The prediction capabilities are crucial in meeting consumers’ demand levels and therefore satisfying their needs.
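As a minimal sketch of the idea, assuming a hypothetical stream of smart-meter readings (the sample values, window size, and threshold are illustrative, not taken from the text), a sliding window can compute the recent demand level and flag unexpected peaks as the data flows in:

```python
from collections import deque
from statistics import mean, stdev

def detect_peaks(readings, window_size=5, threshold=2.0):
    """Flag readings that deviate sharply from the recent sliding-window average."""
    window = deque(maxlen=window_size)
    for value in readings:
        if len(window) == window_size:
            avg, sd = mean(window), stdev(window)
            if sd > 0 and abs(value - avg) > threshold * sd:
                print(f"peak detected: {value} (recent average {avg:.1f})")
        window.append(value)

# Illustrative smart-meter readings arriving one at a time (kW).
stream = [21, 22, 20, 23, 21, 22, 58, 22, 21, 20]
detect_peaks(stream)
```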
Question 6 – Major aspects of IoT infrastructure
Applications – they turn data into information that can be easily understood. The applications mostly run electronically on tablets, mobile phones, computers, company servers, or dashboards, and they are made readily available to the users who need them.
Connectivity – it’s an active connection to the internet that allows devices to access data. Devices connect to the internet and are hence used to share data or information through applications.
Software backend – this layer handles data management. Data collected from various sources, devices, and networks is integrated here, mainly through cloud services.
Hardware – comprises the physical parts of the electronic machines used. They usually process, store, and record data in computers, PCs, etc. However, hardware requires proper monitoring, control, and tracking.
IoT infrastructure applies in the healthcare sector and proves useful in managing and caring for patients admitted and treated in wards. First, healthcare providers can effectively manage patients using IoT applications, such as installing IoT sensors in oxygen pumps, nebulizers, wheelchairs, and other monitoring equipment. The sensors help collect data from the medical equipment in the hospital wards, hence improving service delivery to patients. Besides, physicians also apply the IoT to collect data from different medical equipment or wearable devices that help in tracking and managing patients’ health. They achieve set treatment goals by selecting the best treatment method and adhering to care plans informed by the data collected. Hospitals apply various medical equipment to deliver services; the hardware includes wearables and active and passive tags, among others.
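A minimal sketch of how such a ward sensor might report its readings to a hospital backend; the device identifier, payload fields, and endpoint URL are all hypothetical, introduced only to illustrate the flow from hardware through connectivity to the software backend:

```python
import json
import time
import urllib.request

def send_reading(device_id, metric, value, unit):
    """Package one sensor reading and post it to a hypothetical backend endpoint."""
    payload = {
        "device_id": device_id,      # e.g. an oxygen pump in a ward
        "metric": metric,
        "value": value,
        "unit": unit,
        "timestamp": time.time(),
    }
    request = urllib.request.Request(
        "http://backend.example.local/iot/readings",  # hypothetical endpoint
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return response.status

# Example call: report the flow rate measured by a ward oxygen pump.
# send_reading("oxygen-pump-ward3-07", "flow_rate", 4.2, "L/min")
```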
Question 7 – Improving predictive analytics
Prediction of exceptional events is still a challenge when it comes to predictive analytics. Predictive analytics makes future predictions based on historical data, and improving its accuracy and reliability in predicting exceptional events is now a necessity. This can be aided by sophisticated tools and models. To increase the accuracy and reliability of predictive analytics, we should therefore consider the following;
Increase scope – let the Internet of Things cover everything that exists or takes place on earth. The abilities of machine learning and data mining will help analyze all the data and increase the precision and accuracy of these analytics.
Installation of high-powered internet access – technology and the Internet of Things are entirely dependent on the internet, so stable internet access all over the globe would help machines, servers, etc. access data and information from all corners of the world. The installation of the 5G network all over the globe would make it easier for machines to analyze data and make precise predictions.
Increase professionals – the field of information technology should incorporate and train more professionals. They will use their knowledge and skills to put in more effort and bring in more inventions to ensure high accuracy in predictive analytics. Governments should ensure most educational institutions provide students with IT knowledge to increase the chances of innovation.
If all these measures are put in place, I am convinced beyond doubt that predictive analytics would improve greatly and make predictions in real time that help individuals around the world tackle various upcoming situations.