Reading notes on Naive Bayes classifiers and Markov chains

Reading notes: further learning in statistics and data science

I read the book ‘Reinforcement Learning: An Introduction’ by Richard S. Sutton and Andrew G. Barto and made these reading notes on Markov chains. Along the way I also learned about Bayes classifiers.


“An algorithm that implements classification, especially in a concrete implementation, is known as a classifier. The term “classifier” sometimes also refers to the mathematical function, implemented by a classification algorithm, that maps input data to a category.”

Bayes classifier

Classifiers based on Bayes’ Theorem.

Bayes’ Theorem
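
For reference, the standard statement of the theorem is written out here. The symbols A and B are generic events chosen for this note; in the classifier setting, A plays the role of the class and B the observed features.

```latex
P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}, \qquad P(B) > 0
```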




Naive Bayes classifier

Classifiers based on Bayes’ Theorem with the added ("naive") assumption that the predictors are independent of one another given the class.
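
Under that independence assumption the class posterior factorizes into one term per predictor. The notation below (class c, predictors x_1 to x_n, predicted class \hat{c}) is my own choice for this sketch, not taken from a specific source.

```latex
P(c \mid x_1, \dots, x_n) \;\propto\; P(c) \prod_{i=1}^{n} P(x_i \mid c),
\qquad
\hat{c} = \arg\max_{c} \; P(c) \prod_{i=1}^{n} P(x_i \mid c)
```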

#### What are the Pros and Cons of Naive Bayes?


Pros:

  1. It is easy and fast to predict the class of a test data set. It also performs well in multi-class prediction.

  2. When the independence assumption holds, a Naive Bayes classifier can perform better than other models such as logistic regression, and it needs less training data.

  3. It performs well with categorical input variables compared to numerical ones. For numerical variables, a normal distribution (bell curve) is assumed, which is a strong assumption.


Cons:

  1. If a categorical variable has a category in the test data set that was not observed in the training data set, the model assigns it a probability of 0 (zero) and is unable to make a prediction. This is often known as “Zero Frequency”. To solve this, we can use a smoothing technique. One of the simplest smoothing techniques is called Laplace estimation.
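
The zero-frequency problem and its Laplace fix can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the toy weather data, the function names `train` and `posterior`, and the choice to count an unseen test value as one extra category are all assumptions made for this example.

```python
from collections import Counter, defaultdict

def train(rows, labels, alpha=1.0):
    """Count class and per-feature value frequencies for a categorical Naive Bayes model."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)      # (class, feature index) -> value -> count
    vocab = [set() for _ in rows[0]]         # distinct values seen per feature
    for row, label in zip(rows, labels):
        for j, value in enumerate(row):
            value_counts[(label, j)][value] += 1
            vocab[j].add(value)
    return class_counts, value_counts, vocab, alpha

def posterior(model, row):
    """Return P(class | row); alpha=0 means no smoothing, alpha=1 is Laplace (add-one)."""
    class_counts, value_counts, vocab, alpha = model
    total = sum(class_counts.values())
    scores = {}
    for c, n_c in class_counts.items():
        score = n_c / total                  # prior P(c)
        for j, value in enumerate(row):
            k = len(vocab[j] | {value})      # categories, counting an unseen value (assumption)
            score *= (value_counts[(c, j)][value] + alpha) / (n_c + alpha * k)
        scores[c] = score
    norm = sum(scores.values())
    # If every class scored 0 there is nothing to normalize: the model cannot decide.
    return {c: s / norm for c, s in scores.items()} if norm > 0 else scores

# Hypothetical toy data: (outlook, temperature) -> play?
rows = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "mild"), ("rainy", "cool")]
labels = ["no", "yes", "yes", "no"]
query = ("overcast", "mild")                 # "overcast" never appears in training

unsmoothed = posterior(train(rows, labels, alpha=0.0), query)  # every class scores 0.0
smoothed = posterior(train(rows, labels, alpha=1.0), query)    # "yes" gets about 0.75
```

Without smoothing, the unseen value "overcast" multiplies every class score by zero. With add-one smoothing each count is raised by 1, so the second feature can still separate the classes.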

Markov chain and Laplace smoothing

This is a pretty tough section, so I originally made my notes in Chinese to help me understand it better.




One application

For example, consider a hypothetical market with Markov properties where historical data has given us the following patterns: after a week characterized by a bull market trend, there is a 90% chance that another bullish week will follow. There is also a 7.5% chance that the bull week will instead be followed by a bearish one, and a 2.5% chance that it will be followed by a stagnant one. After a bearish week there is an 80% chance that the upcoming week will also be bearish, and so on. These probabilities are compiled into a transition matrix, and the results are then derived from it.
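
A small sketch of how such a matrix might be used follows. Only the bull row and the 80% bear-to-bear entry come from the example above; the remaining entries (marked in the comments) are values I assumed so that every row sums to 1, and the helper names `step` and `distribution_after` are my own.

```python
STATES = ["bull", "bear", "stagnant"]

# Row = this week's state, column = next week's state.
P = [
    [0.90, 0.075, 0.025],  # bull row: taken from the example
    [0.15, 0.80, 0.05],    # bear row: 0.80 from the example, 0.15 and 0.05 ASSUMED
    [0.25, 0.25, 0.50],    # stagnant row: entirely ASSUMED
]

def step(dist, matrix):
    """Advance a probability distribution over states by one week (dist times matrix)."""
    n = len(matrix)
    return [sum(dist[i] * matrix[i][j] for i in range(n)) for j in range(n)]

def distribution_after(start, matrix, weeks):
    """Distribution over states after the given number of weeks."""
    dist = list(start)
    for _ in range(weeks):
        dist = step(dist, matrix)
    return dist

start = [1.0, 0.0, 0.0]                       # this week is known to be bullish
next_week = distribution_after(start, P, 1)   # reproduces the bull row of P
long_run = distribution_after(start, P, 200)  # approaches the stationary distribution

for name, p in zip(STATES, long_run):
    print(f"{name}: {p:.4f}")
```

With these assumed rows the long-run distribution settles at roughly 62.5% bull, 31.25% bear and 6.25% stagnant weeks, regardless of the starting state. Different assumed rows would of course give different numbers.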

Laplace smoothing, also known as add-one smoothing, is a commonly used smoothing method. Smoothing methods exist to solve the zero-probability problem.
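
To connect the two topics, here is a rough sketch of add-one smoothing applied to estimating Markov chain transition probabilities from an observed sequence. The short sequence of weeks and the function name `estimate_transitions` are made up for illustration; each estimate is (count + alpha) / (row total + alpha * number of states), with alpha = 1 for Laplace smoothing.

```python
from collections import Counter

def estimate_transitions(sequence, states, alpha=1.0):
    """Estimate P(next | current) from a state sequence with add-alpha smoothing.

    alpha must be positive here: with alpha=0 a state that is never left
    in the data would produce a division by zero.
    """
    pair_counts = Counter(zip(sequence, sequence[1:]))  # counts of (current, next) pairs
    matrix = {}
    for s in states:
        row_total = sum(pair_counts[(s, t)] for t in states)
        denom = row_total + alpha * len(states)
        matrix[s] = {t: (pair_counts[(s, t)] + alpha) / denom for t in states}
    return matrix

states = ["bull", "bear", "stagnant"]
# Hypothetical observations: no stagnant week ever occurs.
observed = ["bull", "bull", "bull", "bear", "bear", "bull"]

T = estimate_transitions(observed, states)
print(T["bull"]["stagnant"])  # small but non-zero, even though never observed
```

Without smoothing, every transition into or out of "stagnant" would be estimated as impossible just because it did not appear in this short sample. Add-one smoothing gives those transitions a small positive probability instead, and each row still sums to 1.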





  • Copyrights © 2021-2025 Alan

