COVID-19 is an infectious disease caused by the SARS-CoV-2 virus. On 31 December 2019, the WHO was informed of a cluster of pneumonia cases of unknown cause detected in Wuhan, China. At the time of writing, there have been 429,229,223 cases and 5,931,537 deaths worldwide. Effective screening enables quick and efficient diagnosis of the virus, and together with vaccination it helps mitigate the burden on the healthcare system. According to the BBC, Margaret Keenan, who was 90 and about to turn 91, became the first person in the world to be given the Pfizer COVID-19 jab as part of a mass vaccination program on 8 December 2020. She described the vaccine as an early birthday present.
This analysis aims to ascertain how the world is coping with the virus through mask wearing, isolation, vaccination, and improved general hygiene. I analyzed two data sets, September to December 2020 (dataset1) and September to December 2021 (dataset2), using k-means clustering.
The data was first consolidated, renamed accordingly, and then cleaned: null and negative values were replaced or removed, depending on the best approach for each situation. Data cleaning is the process of fixing or removing incorrect, corrupted, incorrectly formatted, duplicate, or incomplete data within a data set. It is essential because it improves data quality and, in so doing, increases overall productivity. In this case, it helped prevent errors in the clustering, including clusters with negative values. To confirm the accuracy of the longitude and latitude values in the dataset, I used Geopy, which makes it easy to locate the coordinates of addresses, cities, countries, and landmarks across the globe using third-party geocoders.
Confirmed cases and deaths are the only columns needed for the clustering analysis, so I dropped all other columns and created a new data frame. These two features (the 2 features needed for k-means clustering) are on different scales, because the number of confirmed cases will always exceed the number of deaths.
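The cleaning and column selection described above could look roughly like the following pandas sketch. The toy data and the column names (Country, Lat, Long, Confirmed, Deaths) are assumptions made for illustration and are not taken from the actual dataset. The sketch removes bad rows, whereas the analysis replaced some values instead, depending on the situation.

```python
import pandas as pd

# Hypothetical toy data; column names and values are illustrative assumptions.
raw = pd.DataFrame({
    "Country": ["A", "B", "B", "C", "D"],
    "Lat": [10.0, 20.0, 20.0, 30.0, 40.0],
    "Long": [1.0, 2.0, 2.0, 3.0, 4.0],
    "Confirmed": [1000.0, 250.0, 250.0, None, -5.0],
    "Deaths": [20.0, 5.0, 5.0, 1.0, 0.0],
})


def clean(df: pd.DataFrame) -> pd.DataFrame:
    """Drop duplicate rows, rows with missing counts, and rows with negative counts."""
    df = df.drop_duplicates()
    df = df.dropna(subset=["Confirmed", "Deaths"])
    df = df[(df["Confirmed"] >= 0) & (df["Deaths"] >= 0)]
    return df.reset_index(drop=True)


cleaned = clean(raw)

# Keep only the two features used for clustering in a new data frame.
X = cleaned[["Confirmed", "Deaths"]].copy()
print(X)
```

Here the duplicate row for country B, the row with a missing count (C), and the row with a negative count (D) are all removed before the two clustering features are selected.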
The data was normalized with a standard scaler so that both features are on a comparable scale. This matters because the k-means algorithm depends on Euclidean distance, so two features on different scales would let the larger one dominate the clustering. To find the best value of k, I created an elbow plot, a line plot of the sum of squared errors (SSE) against the number of clusters. The elbow point indicates the optimal value of k. The plot suggested two candidates, k = 3 and k = 4, since the SSE decreases roughly linearly after both of these points. For this analysis, 4 is the best value for k. I then fed this k and the scaled data frame (X_scaled) to the k-means algorithm, and calculated the mean of each feature within each cluster to characterize the clusters. This gave 4 clusters: cluster 2 is high risk, cluster 3 is medium risk, cluster 4 is low risk, and cluster 1 is very low risk. Data visualization was the final step: each cluster was assigned a color on a map, making the result easier to read.
The map supports the result. Between 2020 and 2021, the spread of COVID-19 dropped significantly, except in countries like England and Turkey, which had outbreaks of the Omicron variant. This is consistent with the WHO and Worldometer data. England and Turkey were the only two countries that remained high risk from 2020 to 2021. Argentina moved from high risk to medium risk, and France moved from very high risk to medium risk. Globally, there was a decline in the infection and death rates as nations implemented measures to manage the pandemic.
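As a rough sketch of the scaling, elbow, and clustering steps described above, the following uses scikit-learn on synthetic data. The data, the column names, the range of k values tried, and the rule that ranks clusters by mean confirmed cases to assign risk labels are all assumptions for illustration; the cluster numbers produced here will not necessarily match those reported in the analysis.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the cleaned data (assumption): four groups of ten "countries".
rng = np.random.default_rng(0)
centres = np.array([[1e5, 2e3], [4e5, 8e3], [7e5, 1.4e4], [1e6, 2e4]])
points = np.vstack([c * rng.uniform(0.95, 1.05, size=(10, 2)) for c in centres])
df = pd.DataFrame(points, columns=["Confirmed", "Deaths"])

# Standardize both features to zero mean and unit variance.
X_scaled = StandardScaler().fit_transform(df)

# Elbow method: record the SSE (inertia) for a range of candidate k values.
sse = {}
for k in range(1, 9):
    sse[k] = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X_scaled).inertia_

# Fit the final model with the chosen k and label each row with its cluster.
k = 4
model = KMeans(n_clusters=k, n_init=10, random_state=0)
df["Cluster"] = model.fit_predict(X_scaled)

# Mean of each feature per cluster, in the original (unscaled) units.
cluster_means = df.groupby("Cluster")[["Confirmed", "Deaths"]].mean()

# Assumed labelling rule: rank clusters by mean confirmed cases, lowest first.
order = cluster_means["Confirmed"].sort_values().index
labels = ["very low risk", "low risk", "medium risk", "high risk"]
risk = {cluster: label for cluster, label in zip(order, labels)}
df["Risk"] = df["Cluster"].map(risk)

print(cluster_means)
```

Plotting the values in sse against k gives the elbow plot. Computing the cluster means on the unscaled columns keeps them interpretable as case and death counts.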
References
Cabinet Office (2020) Coronavirus: What has changed - 22 September. Available at: https://www.gov.uk/government/news/coronavirus-covid-19-what-has-changed-22-september (Accessed 1 March 2022).
World Health Organisation (2022) WHO Coronavirus (COVID-19) Dashboard. Available at: https://covid19.who.int/ (Accessed 1 March 2022).
Worldometer (2022) COVID-19 Coronavirus Pandemic. Available at: https://www.worldometers.info/coronavirus.
Department for Transport and Department of Health and Social Care (2021) Red list of countries and territories. Available at: https://www.gov.uk/guidance/red-list-of-countries-and-territories (Accessed 1 March 2022).
Juliana K., Laura F. and Morgan M. (2020) 'Ongoing list of how countries are reopening and which ones remain under lockdown', Business Insider, 23 September. Available at: https://www.google.com/amp/s/www.businessinsider.com/countries-on-lockdown-coronavirus-italy-2020-3%3famp (Accessed 1 March 2022).