Level 7 Diploma in Data Science | Advanced Analytics & Machine Learning


INFORMATION TECHNOLOGY PROGRAMS

Post-Graduate Certificate
Level 7 Diploma in Data Science

From Algorithms to Action — Build Your Expertise in Advanced Data Science

Course Overview

The Level 7 Diploma in Data Science is a postgraduate-level qualification designed to give learners the skills and knowledge to explore, analyze, and interpret complex data sets and translate them into actionable insights. The diploma focuses on enabling professionals to formulate data-driven research hypotheses, uncover hidden patterns, and challenge conventional business thinking using current tools and methodologies.

This qualification prepares students to become data leaders who can support decision-making in modern organizations by applying advanced statistical, computational, and machine learning techniques. Learners will gain hands-on experience with industry-relevant tools such as Python, R, and SQL, enhancing both their technical abilities and strategic understanding.

Whether learners are aiming to enhance their professional expertise, transition into data-focused roles, or prepare for further academic pursuits such as a Master’s degree, the Level 7 Diploma in Data Science offers a powerful foundation for success.

Entry Requirements

To be eligible for this programme, applicants should meet the following criteria:

A relevant Level 6 qualification or equivalent professional experience in IT or a related field.

Be aged 21 years or older.

A strong grasp of English (IELTS 6.5 or equivalent) if the applicant’s first language is not English.

International qualifications are assessed for UK equivalency.

Applicants may be asked to provide academic or professional references and a statement of purpose.

Applicants with significant industry experience may also be admitted through Recognition of Prior Learning (RPL).

Qualification Structure

All units are mandatory.

Exploratory Data Analysis

 Learning the basic rules of programming in both R and Python

 Create and import external datasets in R and Python

 Export R data frames into external flat files

 Data Management in R and Python (Sort, merge, aggregate and subset)

 Introduction to basic concepts of Statistics, such as measures of central tendency, variation, skewness, kurtosis

 Frequency tables, crosstabs and bivariate correlation analysis

 Data visualization: what and why? Grammar of graphics, handling data for visualization

 Commonly used charts and graphs using the ggplot2 package in R and matplotlib in Python

 Advanced graphics in R and Python
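As an illustration of the descriptive statistics listed in this unit (central tendency, variation, skewness, kurtosis), the following is a minimal Python sketch using only the standard library. It is not taken from the course materials; the function name `describe` is hypothetical, and it uses population (not sample) moments.

```python
import math
import statistics as st


def describe(values):
    """Return basic descriptive statistics using population moments."""
    n = len(values)
    mean = st.fmean(values)
    variance = st.pvariance(values)          # population variance
    sd = math.sqrt(variance)
    # Third and fourth central moments
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    return {
        "mean": mean,
        "median": st.median(values),
        "variance": variance,
        "skewness": m3 / sd ** 3,            # > 0 means a longer right tail
        "excess_kurtosis": m4 / sd ** 4 - 3, # 0 for a normal distribution
    }


if __name__ == "__main__":
    print(describe([2, 4, 4, 4, 5, 5, 7, 9]))
```

Statistical packages often apply small-sample corrections to skewness and kurtosis, so their results can differ slightly from these population formulas.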

Statistical Inference

 Concept of random variables and statistical distribution

 Discrete vs. Continuous Random Variables

 Standard discrete distributions: Bernoulli, Binomial and Poisson

 Using R to calculate probabilities

 Fitting of discrete distributions to observed data

 Standard continuous distributions: Normal, Log Normal, Exponential

 Introduction to sampling distributions

 Statistical Hypothesis Testing: concepts and terminology

 Parameter, test statistics, level of significance, power, critical region

 Parametric vs. non-Parametric Tests

 t tests (one sample, independent samples, paired sample)

 F test for equality of variance

 Z tests for proportions (single and independent samples)

 Non-parametric tests (Mann-Whitney U, Wilcoxon's signed rank)

 Tests for Normality, Q-Q plot

 What is analysis of variance?

 Definitions: Variable, factor, levels

 One Way Analysis of Variance

 Two Way Analysis of Variance (including interaction effects)

 Multi Way Analysis of Variance

 Analysis of Covariance

 Kruskal-Wallis Test

 Friedman Test
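To make the one-way analysis of variance topic concrete, here is a minimal Python sketch of the F statistic, computed as the ratio of between-group to within-group mean squares. It is an illustration only, not the course's own material; the function name `one_way_anova_f` is hypothetical, and the p-value lookup against the F distribution is left out.

```python
def one_way_anova_f(groups):
    """Return (F statistic, df between, df within) for a list of samples."""
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in groups]

    # Variation of group means around the grand mean
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of observations around their own group mean
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


if __name__ == "__main__":
    print(one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]]))
```

A large F value relative to the F distribution with the returned degrees of freedom suggests that at least one group mean differs from the others.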

Fundamentals of Predictive Modelling

 Concept of random variables and statistical distribution

 Concept of a statistical model

 Estimation of model parameters using Least Square Method

 Interpreting regression coefficients

 Assessing the goodness of fit of a model

 Global hypothesis testing using F distribution

 Individual testing using t distributions

 Concept of Multicollinearity

 Calculating Variance Inflation Factors

 Resolving problem by dropping variables

 Ridge regression method

 Stepwise regression as a strategy

 Residual analysis

 Shapiro Wilk test, K-S test and Q-Q plot for residuals

 White’s test and Breusch-Pagan Test

 Partitioning data using the caret package

 Model development on training data

 Model validation on testing data using R squared and RMSE

 Concept of k-fold cross validation

 Performing k-fold cross validation using the caret package

 Identifying influential observations
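The least squares estimation and goodness-of-fit topics in this unit can be sketched for a single predictor as follows. This is a minimal Python illustration rather than the course's R/caret workflow; the function names `fit_simple_ols` and `evaluate` are hypothetical.

```python
import math


def fit_simple_ols(x, y):
    """Estimate intercept and slope of y = a + b*x by least squares."""
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def evaluate(x, y, intercept, slope):
    """Return (R squared, RMSE) of the fitted line on the given data."""
    y_bar = sum(y) / len(y)
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    sse = sum(r ** 2 for r in residuals)
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - sse / sst, math.sqrt(sse / len(y))


if __name__ == "__main__":
    x, y = [1, 2, 3, 4, 5], [2, 4, 5, 4, 5]
    a, b = fit_simple_ols(x, y)
    print(a, b, evaluate(x, y, a, b))
```

In a train/test setting, `fit_simple_ols` would be run on the training partition and `evaluate` on the held-out testing partition.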

Advanced Predictive Modelling

 Model definition and parameter estimation

 Estimation of model parameters using MLE

 Interpreting regression coefficients and odds ratio

 Assessing goodness of fit of the model

 Global hypothesis testing using the likelihood ratio test (LRT)

 Individual testing using Wald’s test

 Classification table

 ROC curve

 K-S Statistic

 Multinomial and Ordinal Logistic Regression - model building and parameter estimation

 Interpretation of regression coefficients

 Classification table and deviance test

 Concept of the Generalized Linear Model (GLM) and link function

 Poisson Regression

 Negative Binomial Regression

 Survival Analysis Introduction

 Cox Regression
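As an illustration of the classification table used to assess a binary logistic regression, the sketch below turns predicted probabilities into a 2x2 table at a chosen cut-off. It is a minimal Python example, not the course's own code; the function name `classification_table` and the 0.5 threshold are assumptions.

```python
def classification_table(actual, probabilities, threshold=0.5):
    """Cross-tabulate actual classes (0/1) against thresholded predictions."""
    table = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for y, p in zip(actual, probabilities):
        predicted = 1 if p >= threshold else 0
        if predicted == 1 and y == 1:
            table["tp"] += 1
        elif predicted == 1 and y == 0:
            table["fp"] += 1
        elif predicted == 0 and y == 0:
            table["tn"] += 1
        else:
            table["fn"] += 1
    total = sum(table.values())
    table["accuracy"] = (table["tp"] + table["tn"]) / total
    table["sensitivity"] = table["tp"] / (table["tp"] + table["fn"])
    table["specificity"] = table["tn"] / (table["tn"] + table["fp"])
    return table


if __name__ == "__main__":
    y = [1, 0, 1, 1, 0, 0]
    p = [0.9, 0.2, 0.4, 0.7, 0.6, 0.1]
    print(classification_table(y, p))
```

Repeating the calculation over a range of thresholds and plotting sensitivity against 1 - specificity gives the points of an ROC curve.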

Time Series Analysis

 Components of time series

 Seasonal decomposition

 Trend analysis

 Auto-correlogram

 Partial auto-correlogram

 Dickey-Fuller test

 Converting non-stationary time series data into stationary time series data

 Concepts of AR, MA and ARIMA models

 Model identification using ACF and PACF

 Parameter estimation

 Residual analysis (testing for white noise process)

 Selection of optimal model

 What is Panel data?

 Need for different models for Panel data

 Panel data regression methods

 Dummy variable method

 Random effect model
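Two of the time series ideas in this unit, differencing a non-stationary series and computing the autocorrelations behind a correlogram, can be sketched as follows. This is a minimal Python illustration under simple assumptions; the function names `difference` and `autocorrelation` are hypothetical.

```python
def difference(series, lag=1):
    """Return the lagged differences, often used to remove a trend."""
    return [series[i] - series[i - lag] for i in range(lag, len(series))]


def autocorrelation(series, lag):
    """Sample autocorrelation of the series at the given lag."""
    n = len(series)
    mean = sum(series) / n
    deviations = [x - mean for x in series]
    denominator = sum(d ** 2 for d in deviations)
    numerator = sum(deviations[t] * deviations[t + lag] for t in range(n - lag))
    return numerator / denominator


if __name__ == "__main__":
    trend = [1, 3, 6, 10, 15]
    print(difference(trend))
    print([autocorrelation(trend, k) for k in range(3)])
```

Plotting `autocorrelation` for successive lags gives the auto-correlogram (ACF) used for model identification.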

Unsupervised Multivariate Methods

 Concept of Data reduction

 Definition of first, second, …, pth principal component

 Deriving principal component using Eigenvectors

 Deciding optimum number of principal components

 Developing scoring models using PCA

 Principal component regression

 Orthogonal factor model

 Estimation of loading matrix

 Interpreting factor solution

 Deciding optimum number of factors

 Using factor scores for further analysis

 Factor rotation

 Concept of MDS

 Variable reduction using MDS

 Concept of cluster analysis

 Hierarchical cluster analysis methods (linkage methods)

 Using dendrogram to estimate optimum number of clusters

 k-means clustering methods

 Using k-means runs functions in R and Python to find the optimum number of clusters (k)
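The k-means clustering method named in this unit alternates between assigning points to the nearest centroid and recomputing each centroid as the mean of its cluster. The sketch below is a minimal Python version of that loop, not the course's own code; the function name `kmeans` is hypothetical, and the starting centroids are supplied by the caller rather than chosen at random.

```python
import math


def kmeans(points, centroids, iterations=10):
    """Run k-means from the given starting centroids (tuples of floats)."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)), key=lambda i: math.dist(p, centroids[i])
            )
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        new_centroids = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster))
            if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:  # converged
            break
        centroids = new_centroids
    return centroids, clusters


if __name__ == "__main__":
    pts = [(1, 1), (1.5, 2), (1, 0.5), (8, 8), (9, 9), (8, 9.5)]
    print(kmeans(pts, [pts[0], pts[3]]))
```

Running the procedure for several values of k and comparing the within-cluster variation is one common way to choose the number of clusters.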

Machine Learning

 Bayes theorem and its applications

 Constructing classifier using Naïve Bayes method

 Concept of Hyperplane

 Support vector machine algorithm

 Comparison with Binary Logistic Regression

 Basics of Decision Tree

 Concept of CART

 CHAID algorithm

 ctree function in R

 Bootstrapping and bagging

 Random forest algorithm

 Definitions of support, confidence and lift

 Apriori algorithm for market basket analysis

 Neural networks for classification problems
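The support, confidence and lift measures used in market basket analysis can be computed directly from a list of transactions. The following is a minimal Python sketch for a single rule, not the Apriori algorithm itself; the function name `rule_metrics` and the sample baskets are hypothetical.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return support, confidence and lift for the rule antecedent -> consequent."""
    n = len(transactions)
    antecedent, consequent = set(antecedent), set(consequent)

    def support(items):
        # Share of transactions containing every item in `items`
        return sum(1 for t in transactions if items <= t) / n

    support_rule = support(antecedent | consequent)
    confidence = support_rule / support(antecedent)
    lift = confidence / support(consequent)
    return {"support": support_rule, "confidence": confidence, "lift": lift}


if __name__ == "__main__":
    baskets = [
        {"bread", "milk"},
        {"bread", "butter"},
        {"bread", "milk", "butter"},
        {"milk"},
    ]
    print(rule_metrics(baskets, {"bread"}, {"milk"}))
```

A lift above 1 indicates that the items appear together more often than would be expected if they were independent; a lift below 1 indicates the opposite.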

Further Topics in Data Science

 What is text mining?

 Term Document Matrix

 Word cloud

 Establishing a connection with Twitter using the twitteR package in R and Tweepy in Python

 Introduction to Shiny

 Introduction to R Markdown

 Build dashboards

 Host standalone apps on a webpage, embed them in R Markdown documents, or build dashboards

 What is Big Data?

 Features of Big Data (Volume, Velocity and Variety)

 Big Data in different industries (Healthcare, Telecom, etc.)

 Hadoop architecture

 Introduction to the RHadoop package

 What is AI, and the theory behind AI

 What is Q-learning?

 The Monte Carlo theory

 SQL programming Basics

 Data Wrangling and analysis

 Text mining of Twitter data
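As an illustration of the term-document matrix used in text mining, the sketch below counts how often each term occurs in each document. It is a minimal Python example with naive whitespace tokenisation, not the course's own code; the function name `term_document_matrix` is hypothetical.

```python
from collections import Counter


def term_document_matrix(documents):
    """Return (sorted terms, matrix) with one row per term, one column per document."""
    # Lower-case and split on whitespace; real pipelines also strip
    # punctuation and remove stop words
    counts = [Counter(doc.lower().split()) for doc in documents]
    terms = sorted(set().union(*counts))
    matrix = [[c[term] for c in counts] for term in terms]
    return terms, matrix


if __name__ == "__main__":
    docs = ["data science is fun", "big data is big"]
    terms, matrix = term_document_matrix(docs)
    for term, row in zip(terms, matrix):
        print(term, row)
```

Summing each row gives overall term frequencies, which is the input typically used to draw a word cloud.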

Contemporary Themes in Business Strategy

 Fundamentals of Cloud Computing

 Compare and contrast cloud computing with traditional computing models

 Software as a Service

 Platform as a Service

 Infrastructure as a Service

 Business impact of Cloud Computing

 Historical development of Artificial Intelligence

 The five Vs of data: volume, velocity, variety, veracity and value

 Christensen’s theory of disruptive innovation

 Ethical dilemmas and issues in Artificial Intelligence and Big Data

Key Outcomes

Graduates of this program will be able to:

Demonstrate core mathematical and statistical knowledge required for both basic and advanced data analysis.

Exhibit proficiency in R, Python, and SQL, applying them in practical, real-world data science tasks.

Understand the principles of data management, including cleaning, structuring, and evaluating datasets.

Use modern data visualization tools and techniques to communicate insights effectively.

Apply classical data analytics methods, such as statistical inference, predictive modeling, and dimensionality reduction.

Implement machine learning models to analyze business and organizational problems.

Understand contemporary business themes relevant to strategic planning and execution.

Evaluate and apply data science concepts within a strategic business context, helping organizations make evidence-based decisions.

Duration and Delivery

This diploma is designed to be completed in 9 to 12 months, depending on the study pace and delivery mode.

Modes of delivery include:

Blended learning (on-campus + online)

Fully online, tutor-supported learning

Interactive seminars, webinars, and case-study workshops

Individual and group project-based assignments

Assessment and Verification

The qualification is assessed through practical, work-related assignments designed to reflect real-world tasks. Each unit requires learners to demonstrate subject knowledge, critical thinking, problem-solving, and the ability to make informed recommendations. Assignments are aligned with specific learning outcomes and assessment criteria, incorporating relevant theories and concepts.

Learners are expected to apply their understanding to real organisational contexts, with mature or part-time learners encouraged to draw from personal work experience. Assessments are written to ensure academic rigour appropriate for Level 7 study. Sample assessments and marking schemes are available upon request.

Progression Opportunities

This qualification enables learners to:

Progress to a Master’s degree in IT, Data Science, Cybersecurity, Information Systems, or related disciplines.

Enter or advance within professional roles such as IT Project Manager, Data Analyst, Systems Architect, Network Consultant, or IT Director.
