The dataset is taken from the UCI Repository. We pre-processed the dataset and divided it into training data and testing data in a 70:30 proportion. We then built a training model and applied ML algorithms to it. The learning phase is of two types: single-phase learning and multi-phase learning. In it, we use classification and prediction algorithms for the prediction, and all of these algorithms and their related data are saved in the knowledge base. When the user enters an input, the model automatically shows the results for that input. In this model, we run a comparative analysis of the different algorithms based on their performance metrics, and from those results we choose the best-suited algorithm for the model.
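The 70:30 split and the comparative analysis described above can be sketched as follows. This is only a minimal illustration under stated assumptions, not the pipeline used in this work: the two classifiers (a majority-class baseline and a 1-nearest-neighbour rule), the accuracy metric, the toy rows, and all function names are hypothetical stand-ins chosen so that the example runs with the Python standard library alone.

```python
import random
from collections import Counter


def split_70_30(rows, seed=0):
    """Shuffle the rows and return (train, test) in a 70:30 proportion."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(round(0.7 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def majority_classifier(train):
    """Baseline: always predict the most common training label."""
    label = Counter(y for _, y in train).most_common(1)[0][0]
    return lambda x: label


def nearest_neighbour_classifier(train):
    """1-NN: predict the label of the closest training point."""
    def predict(x):
        def sq_dist(row):
            return sum((a - b) ** 2 for a, b in zip(row[0], x))
        return min(train, key=sq_dist)[1]
    return predict


def accuracy(predict, test):
    """Fraction of test rows whose label is predicted correctly."""
    return sum(predict(x) == y for x, y in test) / len(test)


def compare(rows, builders, seed=0):
    """Train every candidate on the same split and pick the best scorer."""
    train, test = split_70_30(rows, seed)
    scores = {name: accuracy(build(train), test)
              for name, build in builders.items()}
    best = max(scores, key=scores.get)
    return scores, best


# Hypothetical toy data: (features, label) pairs, not taken from the UCI dataset.
rows = [((float(i), float(i % 3)), 0) for i in range(10)] + \
       [((float(i + 20), float(i % 3)), 1) for i in range(10)]

BUILDERS = {"majority": majority_classifier,
            "1-nn": nearest_neighbour_classifier}
scores, best = compare(rows, BUILDERS)
print(scores, best)
```

Every candidate is scored on the same held-out 30%, so the comparison depends on the algorithms and not on the split. In practice, accuracy would be complemented by other performance metrics such as precision and recall.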
Availability of data: Clinical blood report data is not easily available. Data standardization: Before applying the data to the model, we need to normalize it. This involves eliminating null values, filling in missing values, and handling out-of-range values. Overfitting: Occurs when a statistical model or machine learning algorithm captures the noise of the data.
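The standardization step can be illustrated with a short sketch. The specific choices here are assumptions made for illustration and are not stated in this work: missing values are filled with the column mean, out-of-range values are clipped to caller-supplied bounds, and the result is rescaled with min-max normalization. The column name, its values, and the bounds are hypothetical.

```python
def clip_out_of_range(column, lower, upper):
    """Clip observed values to [lower, upper]; leave missing (None) entries alone."""
    return [v if v is None else min(max(v, lower), upper) for v in column]


def impute_missing(column):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in column if v is not None]
    if not observed:
        raise ValueError("column has no observed values")
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in column]


def min_max_normalize(column):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]  # constant column carries no information
    return [(v - lo) / (hi - lo) for v in column]


def preprocess(column, lower, upper):
    """Clip first, so that extreme values do not distort the imputed mean."""
    clipped = clip_out_of_range(column, lower, upper)
    return min_max_normalize(impute_missing(clipped))


# Hypothetical column: one missing entry and one out-of-range entry (95.0).
haemoglobin = [12.0, None, 14.0, 16.0, 95.0]
cleaned = preprocess(haemoglobin, lower=5.0, upper=20.0)
print(cleaned)
```

Clipping before imputation is a design choice: if the mean were computed first, the single out-of-range entry would pull the imputed value far away from the typical readings.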
In this case, we required a dimensionality reduction algorithm. Underfitting: Occurs when the model or algorithm shows low variance but high bias, and is often a result of an excessively simple model. The remedies we considered are to get more training data, to increase the size or number of parameters in the model, to increase the complexity of the model, and to increase the training time until the cost function is minimized. Because we required a large amount of data, the training time increases.
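A simple way to tell the two failure modes apart is to compare a model's score on the training data with its score on the testing data. The sketch below is a heuristic and not a method taken from this work: the function name and both thresholds (`min_score` and `max_gap`) are hypothetical and would need tuning for a real dataset.

```python
def diagnose_fit(train_score, test_score, min_score=0.8, max_gap=0.1):
    """Label a model's fit from its training and testing accuracy.

    min_score and max_gap are illustrative thresholds, not recommended values.
    """
    if train_score < min_score:
        # High bias: the model is poor even on data it has already seen.
        return "underfitting"
    if train_score - test_score > max_gap:
        # High variance: the model fits the training data, noise included,
        # but does not generalize to the testing data.
        return "overfitting"
    return "reasonable fit"


print(diagnose_fit(0.60, 0.58))
print(diagnose_fit(0.99, 0.70))
print(diagnose_fit(0.90, 0.88))
```

A low training score points to the underfitting remedies listed above (a more complex model or longer training), whereas a large gap between training and testing scores points to the overfitting remedies such as dimensionality reduction.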