B4 is a good step for chess, not for me.

Reading notes related to text mining

reading notes– Further learning in statistics and data science

I read the book ‘Opinion Mining and Sentiment Analysis’ by Pang and Lee and made these notes on the parts related to statistics and data science.

Text mining

Text mining (also known as text analysis) is the process of transforming unstructured text into structured data for easy analysis. Text mining uses natural language processing (NLP), which allows machines to understand human language and process it automatically.

Normalisation

Text normalization is the process of transforming text into a single canonical form that it might not have had before.

Stemming and lemmatisation are two approaches for text normalisation.

1.Lemmatization is a text normalization technique used in Natural Language Processing (NLP) that reduces a word to its base (dictionary) form. It groups the different inflected forms of a word under one form, the lemma, which carries the same meaning.

Lemma: the basic form of a word, for example the singular form of a noun or the infinitive form of a verb, as it is shown at the beginning of a dictionary entry

2.In linguistic morphology and information retrieval, stemming is the process of reducing inflected (or sometimes derived) words to their word stem, base or root form—generally a written word form.

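Notes (translated from Chinese)

The most common word normalisation practices are:

  1. Stemming: a basic, rule-based process of stripping suffixes (such as “ing”, “ly”, “es”, “s”, and so on).

  2. Lemmatization: by contrast, an organised, step-by-step process of obtaining the root form of a word.

  3. Word normalisation: another kind of textual noise comes from the multiple surface forms of one word. For example, “play”, “player”, “played”, “plays” and “playing” are all variants of the word “play”. Although their meanings differ, they are similar in context. This step converts all the different forms of a word into its canonical form (also called the lemma).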
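The difference between the two approaches can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not how production tools work: the suffix list extends the one in the notes with “ed” (my addition), and the `LEMMAS` dictionary is a hypothetical stand-in for a real vocabulary.

```python
# Toy illustration only: real stemmers (e.g. Porter) use far richer rules,
# and real lemmatizers rely on a full dictionary plus part-of-speech tags.

SUFFIXES = ("ing", "ly", "es", "ed", "s")  # checked in this order; "ed" is an added assumption

# Hypothetical mini-dictionary mapping inflected forms to their lemma.
LEMMAS = {
    "played": "play",
    "plays": "play",
    "playing": "play",
    "better": "good",
    "went": "go",
    "mice": "mouse",
}


def stem(word: str) -> str:
    """Strip the first matching suffix, keeping at least three letters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def lemmatize(word: str) -> str:
    """Look the word up in the dictionary; fall back to the word itself."""
    word = word.lower()
    return LEMMAS.get(word, word)


for w in ["playing", "played", "plays", "better", "mice"]:
    print(w, "->", stem(w), "|", lemmatize(w))
```

The rule-based stemmer handles regular endings but leaves irregular forms such as “better” and “mice” unchanged, while the dictionary lookup can map them to “good” and “mouse”.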

Opinion mining

Opinion mining, or sentiment analysis, is a text analysis technique that uses computational linguistics and natural language processing to automatically identify and extract sentiment or opinion from within text (positive, negative, neutral, etc.).

In my own words, opinion mining is one part of text mining: text mining covers both classical text mining and opinion mining. The former analyses text that is expressed factually, whereas the latter analyses text that is expressed subjectively.

Sentiment analysis, also referred to as opinion mining, is an approach to natural language processing (NLP) that identifies the emotional tone behind a body of text. This is a popular way for organizations to determine and categorize opinions about a product, service, or idea. It involves the use of data mining, machine learning (ML) and artificial intelligence (AI) to mine text for sentiment and subjective information.

Emotional tone refers to positive, negative or neutral attitudes.

Subjectivity polarity score

The polarity score is a float within the range [-1.0, 1.0]. The subjectivity is a float within the range [0.0, 1.0], where 0.0 is very objective and 1.0 is very subjective.

In short, text polarity is a measure of how negative or how positive a piece of text is.
Most of the time, NLP models can label simple positive or negative words and phrases quite well. For example, the words “amazing”, “superb”, and “wonderful” can easily be labelled as highly positive. The words “bad”, “sad”, and “mad” can easily be labelled as negative. However, we can’t just look at polarity at the level of individual words; it’s important to take a larger context into account when evaluating total polarity. For example, the word “bad” may be negative, but what about the phrase “not bad”? Is that neutral? Or is that the opposite of bad? At this point we’re getting into linguistics and semantics rather than natural language processing.

Two of the biggest open source libraries for NLP in Python are spaCy and NLTK. Sentiment tools built on them, such as NLTK's VADER analyzer, typically report polarity on a normalized scale of -1 to 1. The Text API measures, combines, and normalizes polarity values for the overall text, individual sentences, and individual phrases.
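To make the “not bad” problem concrete, here is a minimal lexicon-based sketch. Everything in it is an assumption for illustration: the word scores in `LEXICON`, the negator list, and the `NEGATION_FACTOR` of -0.5 are made up, and real libraries use much larger lexicons and more careful rules.

```python
# Hypothetical mini-lexicon; the scores are invented for illustration.
LEXICON = {
    "amazing": 0.9, "superb": 0.9, "wonderful": 0.8,
    "bad": -0.7, "sad": -0.6, "mad": -0.5,
}
NEGATORS = {"not", "never", "no"}
NEGATION_FACTOR = -0.5  # assumed: "not bad" becomes mildly positive


def polarity(text: str) -> float:
    """Average the scores of known words, flipping a word that follows a negator."""
    tokens = text.lower().replace(",", " ").replace(".", " ").split()
    scores = []
    for i, tok in enumerate(tokens):
        if tok not in LEXICON:
            continue
        score = LEXICON[tok]
        if i > 0 and tokens[i - 1] in NEGATORS:
            score *= NEGATION_FACTOR
        scores.append(score)
    if not scores:
        return 0.0  # no sentiment-bearing words found
    # Every score lies in [-1, 1], so their mean does too.
    return sum(scores) / len(scores)


print(polarity("The food was bad"))
print(polarity("The food was not bad"))
```

Looking only at individual words would score both sentences as negative; checking the preceding word is the simplest way to bring in a little context.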

Example https://www.ebaina.com/articles/140000005269
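Notes (translated from Chinese)

Polarity refers to whether a statement is affirmative or negative. If a word can only appear in affirmative statements, or only in negative statements, it is a polarity item. In English, “at all” is a negative polarity item: it can only appear in negative sentences.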

Opinion holders and opinion targets

One of the key subtasks in sentiment analysis is opinion role extraction. It can be divided into the extraction of opinion holders (OH), i.e. entities expressing an opinion, and the extraction of opinion targets (OT), i.e. entities or propositions at which sentiment is directed.

Reading notes related to Naive Bayes classifiers and Markov Chain

reading notes- Further learning in statistics and data science

I read the book ‘Reinforcement Learning’ by R. S. Sutton and A. G. Barto and made reading notes about Markov chains. I also learned about Bayes classifiers.

Classifier

“An algorithm that implements classification, especially in a concrete implementation, is known as a classifier. The term “classifier” sometimes also refers to the mathematical function, implemented by a classification algorithm, that maps input data to a category.”

Bayes classifier

Classifiers based on Bayes’ Theorem.

Bayes’ Theorem

3.png

1.png

2.png

Naive Bayes classifier

Classifiers based on Bayes’ Theorem with an assumption of independence among predictors.

What are the pros and cons of Naive Bayes?

Pros:

  1. It is easy and fast to predict the class of a test data set. It also performs well in multi-class prediction.

  2. When the assumption of independence holds, a Naive Bayes classifier performs better than other models such as logistic regression, and it needs less training data.

  3. It performs well with categorical input variables compared to numerical variables. For numerical variables, a normal distribution is assumed (bell curve, which is a strong assumption).

Cons:

  1. If a categorical variable has a category in the test data set that was not observed in the training data set, the model assigns it a probability of 0 (zero) and is unable to make a prediction. This is often known as “Zero Frequency”. To solve this, we can use a smoothing technique. One of the simplest smoothing techniques is called Laplace estimation.

https://www.zhihu.com/question/21134457 (useful link on Naive Bayes classifiers)
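The zero-frequency problem and its fix can be seen in a small word-count Naive Bayes classifier. This is a minimal sketch under my own assumptions: the class name `NaiveBayes`, the training sentences, and the choice to skip words never seen in training are all illustrative and not taken from the book.

```python
from collections import Counter, defaultdict
import math


class NaiveBayes:
    """Word-count Naive Bayes with additive smoothing (illustrative sketch)."""

    def __init__(self, alpha: float = 1.0):
        # alpha=1 is Laplace (add-one) smoothing; alpha=0 disables smoothing.
        self.alpha = alpha

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        self.vocab = set()
        for words, label in zip(docs, labels):
            self.word_counts[label].update(words)
            self.vocab.update(words)
        return self

    def log_prob(self, words, label):
        """Log of P(label) times the product of P(word | label)."""
        total_docs = sum(self.class_counts.values())
        logp = math.log(self.class_counts[label] / total_docs)
        total_words = sum(self.word_counts[label].values())
        denom = total_words + self.alpha * len(self.vocab)
        for w in words:
            if w not in self.vocab:
                continue  # assumption: ignore words never seen in training
            num = self.word_counts[label][w] + self.alpha
            if num == 0:
                return float("-inf")  # zero frequency: the whole product is 0
            logp += math.log(num / denom)
        return logp

    def predict(self, words):
        return max(self.class_counts, key=lambda c: self.log_prob(words, c))


# Hypothetical training data.
docs = [["good", "great"], ["great", "fun"], ["bad", "boring"], ["boring", "awful"]]
labels = ["pos", "pos", "neg", "neg"]
review = ["good", "great", "boring"]

unsmoothed = NaiveBayes(alpha=0.0).fit(docs, labels)
smoothed = NaiveBayes(alpha=1.0).fit(docs, labels)
print(unsmoothed.log_prob(review, "pos"), unsmoothed.log_prob(review, "neg"))
print(smoothed.predict(review))
```

Without smoothing, both classes score minus infinity for the review, because each class has never seen one of its words. With add-one smoothing, the classifier can still compare the two classes.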

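Markov chain and Laplace smoothing

This is a pretty tough section, so I made notes (originally in Chinese) to help me understand it better.

1.jpg

This matrix is the transition probability matrix P, and it stays the same over time: the transition probability matrix from day 1 to day 2 is identical to the one from day 2 to day 3.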

2.jpg

One application

For example, consider a hypothetical market with Markov properties where historical data has given us the following patterns: After a week characterized by a bull market trend there is a 90% chance that another bullish week will follow. Additionally, there is a 7.5% chance that the bull week instead will be followed by a bearish one, or a 2.5% chance that it will be a stagnant one. After a bearish week there’s an 80% chance that the upcoming week also will be bearish, and so on. This data is compiled to form a matrix and then the results are drawn thereof.
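The market example can be written as a transition matrix and stepped forward week by week. In this sketch only the bull row and the 80% bear-to-bear entry come from the text. The remaining entries are assumed values that I picked so that each row sums to 1.

```python
STATES = ["bull", "bear", "stagnant"]

# Row i holds the probabilities of moving from state i to each state.
P = [
    [0.90, 0.075, 0.025],  # from bull: given in the text
    [0.15, 0.80, 0.05],    # from bear: only 0.80 is from the text, the rest is assumed
    [0.25, 0.25, 0.50],    # from stagnant: entirely assumed
]


def step(dist, matrix):
    """Advance a probability distribution over states by one week."""
    n = len(matrix)
    return [sum(dist[i] * matrix[i][j] for i in range(n)) for j in range(n)]


def after_weeks(dist, matrix, weeks):
    """Apply the same transition matrix repeatedly (it does not change over time)."""
    for _ in range(weeks):
        dist = step(dist, matrix)
    return dist


start = [1.0, 0.0, 0.0]  # this week is a bull week
for weeks in (1, 2):
    probs = after_weeks(start, P, weeks)
    print(weeks, dict(zip(STATES, probs)))
```

After one week the distribution is simply the bull row of the matrix. After two weeks it mixes all three rows, weighted by how likely each state was after the first week.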

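Laplace smoothing, also known as add-one smoothing, is a commonly used smoothing method. Smoothing methods exist to solve the zero-probability problem.

Background: why is smoothing needed?

The zero-probability problem arises when computing the probability of an instance: if some quantity x never appeared in the observed sample (the training set), the probability of the whole instance becomes 0. In text classification, when a word does not appear in the training samples, the probability of that word is 0, so the probability of the text, computed as a product, is also 0. This is unreasonable: we cannot arbitrarily conclude that an event has probability 0 just because it has not been observed.

Theoretical basis of Laplace smoothing

To solve the zero-probability problem, the French mathematician Laplace first proposed adding 1 to estimate the probability of phenomena that have not been observed, which is why additive smoothing is also called Laplace smoothing.
When the training sample is large, the change in the estimated probability caused by adding 1 to the count of each component x is negligible, yet it conveniently and effectively avoids the zero-probability problem.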
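Written as a formula, the add-one estimate looks as follows. The notation is my own assumption: c(x) is the number of times x was observed, N is the total number of observations, and V is the number of distinct values x can take. The numbers in the example are hypothetical.

```latex
\hat{P}(x) = \frac{c(x) + 1}{N + V}

% Example with N = 1000 observations and V = 100 distinct values:
% an unseen value:          \hat{P} = \frac{0 + 1}{1000 + 100} = \frac{1}{1100} \approx 0.0009 \quad (\text{instead of } 0)
% a value seen 50 times:    \hat{P} = \frac{50 + 1}{1000 + 100} = \frac{51}{1100} \approx 0.0464 \quad (\text{instead of } 0.05)
```

The unseen value no longer has probability 0, while the estimate for a frequently observed value barely moves. The larger N is relative to V, the smaller this change becomes.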

Reading notes--Principles of Economics (chapter19)

Summary

Chapter 19 is Earnings and Discrimination. This chapter explains why wages differ between workers (e.g. why doctors earn more than other workers) and introduces different kinds of discrimination.

Labour market discrimination– education

  • A well-educated worker tends to earn a higher wage because of the human capital effect and the signaling effect.

  • Human capital effect: Well-educated workers are more productive, so they earn higher wage rates.

  • Signaling effect: Attending university does not itself increase workers' productivity, but it signals that they have a greater ability to master skills, so they earn higher wage rates.

Labour market discrimination– job

  • e.g. A superstar always earns a higher wage than a carpenter or a plumber.
  • reason 1: Every customer in the market wants to enjoy the good supplied by the best producer.
  • reason 2: The good is produced with a technology that makes it possible for the best producer to supply every customer at low cost.

Discrimination by employers

  • In markets with free entry and exit, competition tends to eliminate wage differentials caused by employer discrimination.

  • Reason: If employers discriminate against one group of employees, the demand for that group falls, so its wage rate falls in the labour market. Other employers will then recruit these workers because they cost less, which pushes their wage back up.

Definitions

1.Compensating differential– a difference in wages that arises to offset the nonmonetary characteristics of different jobs

2.Strike– the organized withdrawal of labor from a firm by a union

3.Discrimination– the offering of different opportunities to similar individuals who differ only by race, ethnic group, sex, age, or other personal characteristics

4.Human capital– the knowledge and skills that workers acquire through education and on-the-job training

5.Efficiency wage– above equilibrium wages paid by firms to increase worker productivity

6.Statistical discrimination– discrimination that arises because an irrelevant but observable personal characteristic is correlated with a relevant but unobservable attribute

7.Union– a worker association that bargains with employers over wages and working conditions

Review

There are some exercises in this book. I recorded questions I didn’t answer correctly here.

  1. Among full-time U.S. workers, white women earn about_____ percent less than white men, and black men earn about_____ percent less than white men. (Correct answer: C)

a. 5;20

b. 5;40

c. 20;20

d. 20;40

My wrong answer: A

  2. The forces of competition in markets with free entry and exit tend to eliminate wage differentials that arise from discrimination by_____ (Correct answer: A)

a. employers

b. customers

c. government

d. all of the above

My wrong answer: D

An academic subject inspires me

Introduction

One of the academic subjects that has boosted my further interest is Economics. This short essay will demonstrate my journey of learning it.

Background

I could only start studying Economics in Grade 10, because it was not available on the curriculum before then. (Economics can only be studied by international students because it is not a compulsory subject in the Chinese national education system.)

Experience of learning Economics

Despite my relatively poor English, I still understood the main points from my English Economics teacher quickly, perhaps because I worked hard at it. Learning about supply and demand quickly caught my interest; I found it fascinating and wanted to learn more in this field, for instance how a market works. I became aware of the need to improve my English language skills when I did not understand some specific vocabulary such as vice versa, ceteris paribus and capita. From then on, I read English books to improve my vocabulary, for example Principles of Economics, which illustrates some fundamentals of Economics. From this book, I was surprised to find that there is a strong relationship between Mathematics and Economics; for instance, you can use calculus (integration) to calculate total welfare in a market. This opened up a whole new world to me, because I did well in Mathematics and that helped me study this field in more depth. I finished all the exercises in each chapter and made reading notes to summarize the main points.

Experience of taking part in FBLA

Because of my outstanding performance in the final exam of the semester, my teacher suggested that I take part in a business contest called Future Business Leaders of America (FBLA). The preparation was intense, but I learnt a great deal of new information which the school syllabus had not provided. I then worked with two partners on a presentation of a business plan for attracting investment, and we received excellent scores. I strengthened not only my economic knowledge but also my speaking skills while preparing the presentation.

Conclusion

In conclusion, I have been capable and motivated to learn more about Economics and I hope to focus more on this interesting subject in the future and perhaps study it at university level.

My greatest skill

Introduction

My greatest skill is my ability to use study skills to enhance my understanding of school subjects. These include doing efficient research online, managing time wisely and cooperating with others to solve problems when necessary. This short essay will illustrate how I have been developing and strengthening these study skills.

Background

My home is far away from the school, so I live in the school dormitory during weekdays and my parents cannot accompany me or help me. The majority of students in our class take extra-curricular classes to improve their learning and get excellent exam scores. I consider self-study to be more advantageous, so I do that instead of taking those extra-curricular classes.

My study experiences

In the textbooks I use, it is necessary to answer questions in order to check whether I have mastered the material. However, these questions are limited, so I use the internet to find appropriate exercises. Doing past papers is a good choice, but the majority of them only cover one unit. So, I collaborate with another student to classify questions that cover chapters we have already learned and organize them in an online file. In addition, I pay for an annual subscription to a website called Save My Exams, which provides appropriate questions. This website is useful because it addresses individual topics in detail rather than generally, so it is more in-depth and improves my learning.

The skill I need to develop

I am keen to avoid procrastination, which is the enemy of these study skills, especially time management. I once did badly in a monthly Physics exam, which was disappointing. I had not organized my revision time well and failed to carry out my plan for reviewing science subjects, so I had not mastered the material.

Conclusion

Overall, because of my experience living in the school dormitory, I have been strengthening my self-study skills which will be beneficial for me when I attend university.

Reading notes--Principles of Economics (chapter18)

Summary

Chapter 18 is The Markets for the Factors of Production. This chapter introduces the market for labour, land and capital.

The versatility of supply and demand

The market for labour

  • The market for labour
  • The price of labour is the wage rate.
  • The market wage and the quantity of labour are determined by the demand for and supply of labour.

The market for land

  • The market for land

The market for capital

  • The market for capital

The markets for land and capital work in the same way as the market for labour.

Production function

acb46cbeb5a72834ca8c4e331aadecc.jpg

  • Production function is the relationship between the quantity of inputs used to make a good and the quantity of output of that good.

The value of the marginal product of labour

image.png

  • Value of the marginal product is the marginal product of an input times the price of the output
  • It decreases as the quantity of labour increases (because of the diminishing marginal product)
  • Diminishing marginal product is the property whereby the marginal product of an input declines as the quantity of the input increases
  • A competitive, profit-maximizing firm hires labour up to the point where the value of the marginal product equals the market wage
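A small numerical sketch shows how the value of the marginal product determines how many workers a firm hires. The production schedule `OUTPUT`, the output price and the wage are hypothetical numbers chosen for illustration.

```python
PRICE = 10.0   # assumed price of one unit of output
WAGE = 500.0   # assumed market wage per worker

# Hypothetical production function: OUTPUT[n] is total output with n workers.
OUTPUT = [0, 100, 180, 240, 280, 300]


def marginal_products(output):
    """Extra output from each additional worker."""
    return [output[n] - output[n - 1] for n in range(1, len(output))]


def value_of_marginal_products(output, price):
    """Marginal product of each worker times the price of the output."""
    return [mp * price for mp in marginal_products(output)]


def workers_hired(output, price, wage):
    """Hire each extra worker whose value of marginal product covers the wage."""
    hired = 0
    for vmpl in value_of_marginal_products(output, price):
        if vmpl < wage:
            break
        hired += 1
    return hired


print(marginal_products(OUTPUT))
print(value_of_marginal_products(OUTPUT, PRICE))
print(workers_hired(OUTPUT, PRICE, WAGE))
```

The marginal product falls with each extra worker (diminishing marginal product), so the value of the marginal product falls too, and hiring stops once it drops below the wage.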

Definitions

1.Factors of production– the inputs used to produce goods and services

2.Production function– the relationship between the quantity of inputs used to make a good and the quantity of output of that good

3.Marginal product of labour– the increase in the amount of output from an additional unit of labor

4.Diminishing marginal product– the property whereby the marginal product of an input declines as the quantity of the input increases

5.Value of the marginal product– the marginal product of an input times the price of the output

6.Capital– the equipment and structures used to produce goods and services

Review

There are some exercises in this book. I recorded questions I didn’t answer correctly here.

  1. Approximately what percentage of U.S. national income is paid to workers rather than to owners of capital and land? (Correct answer: C)

a. 25 percent

b. 45 percent

c. 65 percent

d. 85 percent

My wrong answer: B

  2. Around 1973, the U.S. economy experienced a significant_____ in productivity growth, coupled with a_____ in the growth of real wages. (Correct answer: D)

a. pickup; pickup

b. pickup; slowdown

c. slowdown; pickup

d. slowdown; slowdown

My wrong answer: D

Reading notes--Principles of Economics (chapter17)

Summary

Chapter 17 is Oligopoly. This chapter introduces oligopoly, a kind of market structure.

  • Oligopoly is a market structure in which only a few sellers offer similar or identical products

Prisoners’ dilemma

Prisoners’ dilemma is a particular “game” between two captured prisoners that illustrates why cooperation is difficult to maintain even when it is mutually beneficial.

image.png

  • Each player will choose the best strategy for them regardless of the strategies chosen by others.

  • In this picture, each player will choose to defect although cooperation is mutually beneficial.
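The reasoning above can be checked mechanically with a payoff table. The jail terms in `YEARS` are hypothetical, since the picture's numbers are not reproduced in these notes. Here “cooperate” means staying silent and “defect” means confessing.

```python
# Hypothetical payoffs: years in jail as (row player, column player); fewer is better.
YEARS = {
    ("cooperate", "cooperate"): (1, 1),
    ("cooperate", "defect"): (20, 0),
    ("defect", "cooperate"): (0, 20),
    ("defect", "defect"): (8, 8),
}
ACTIONS = ["cooperate", "defect"]


def best_response(other_action, player):
    """Action that minimizes this player's jail time, given the other's action."""
    def years(action):
        key = (action, other_action) if player == 0 else (other_action, action)
        return YEARS[key][player]
    return min(ACTIONS, key=years)


def dominant_strategy(player):
    """Return the action that is best whatever the other does, or None."""
    responses = {best_response(other, player) for other in ACTIONS}
    return responses.pop() if len(responses) == 1 else None


def nash_equilibria():
    """Pairs of actions where each one is a best response to the other."""
    return [
        (a, b)
        for a in ACTIONS
        for b in ACTIONS
        if best_response(b, 0) == a and best_response(a, 1) == b
    ]


print(dominant_strategy(0), dominant_strategy(1))
print(nash_equilibria())
```

With these payoffs both players have “defect” as a dominant strategy, so the only Nash equilibrium is mutual defection, even though mutual cooperation would leave both better off.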

Nash equilibrium

  • Each company will choose to produce more goods in order to get more market shares rather than producing fewer to keep the market price high.

  • As the supply increases, the market price decreases.

  • Each company stops expanding output once producing more would lower its own profit. The resulting quantities form a Nash equilibrium.

  • The quantity supplied in oligopoly is greater than that in monopoly, but smaller than that in the competitive market.

Definitions

1.Oligopoly– a market structure in which only a few sellers offer similar or identical products

2.Cartel– a group of firms acting in unison

3.Prisoners’ dilemma– a particular “game” between two captured prisoners that illustrates why cooperation is difficult to maintain even when it is mutually beneficial

4.Game theory– the study of how people behave in strategic situations

5.Nash equilibrium– a situation in which economic actors interacting with one another each choose their best strategy given the strategies that all the other actors have chosen

6.Dominant strategy– a strategy that is best for a player in a game regardless of the strategies chosen by the other players

7.Collusion– an agreement among firms in a market about quantities to produce or prices to charge

Review

I did all questions correctly! Excellent!

Personal statement-- my stories of learning Mathematics

The place where I learnt something very important to my personal growth was the A level centre where I studied for my IGCSE examinations. This short essay will focus on my experience with Mathematics when I was in Grade 10.

My experience of a mock exam

According to Cambridge Assessment International Education, Asian students perform well in IGCSE Mathematics examinations. Most students in my class, including me, believed that the IGCSE Mathematics examination is easier than the Chinese alternative because students need to [...] in the Chinese alternative. During lessons, I used to feel extremely bored and spent whole sessions daydreaming, which caused me to miss some important formulae. After nearly one month, we were given a mock examination. My result was average rather than excellent, which was disappointing. I therefore talked with the high-achieving students and found that they paid full attention to the teacher during class.

From then on, I previewed and reviewed each mathematics chapter to make sure I had mastered the formulae. But I still had a problem: I was too careless. Here is one of my stories.

My experience of Euclid Contest

Three months after the mock examination, I took part in a mathematics contest called Euclid at school, which was a little difficult because it had no multiple-choice questions. Due to my inattention, I even forgot that our teacher had told us this. In addition, I read so fast that I misunderstood some questions and answered them incorrectly. I scored 68 out of 100, one mark short of the 69 required for the international top 25%.

Summary

This contest experience taught me a number of lessons. First, it encouraged me to study harder in mathematics, as the questions in the contest were difficult. Second, I have developed a good learning habit of previewing and reviewing. Finally, this experience made me try my best, and I realized I should be more attentive and improve my concentration and motivation if I want to fulfil my potential.

Reading notes--Principles of Economics (chapter16)

Review

Chapter 16 is Monopolistic Competition. This chapter introduces monopolistic competition, a kind of market structure.

  • Monopolistic competition is a market structure in which many firms sell products that are similar but not identical.

Characteristics

  • Many sellers: There are many firms competing for the same group of customers.
  • Product differentiation: Each firm produces a product that is at least slightly different from those of other firms. Thus, rather than being a price taker, each firm faces a downward-sloping demand curve.
  • Free entry and exit: Firms can enter or exit the market without restriction. Thus, the number of firms in the market adjusts until economic profits are driven to zero.

image.png

  • Firms cannot earn economic profits in the long run.
  • P > MC
  • Profit maximization: sell goods at the quantity where MR = MC

image.png

  • Efficient scale is the quantity where MC = ATC.
  • Excess capacity is the difference between efficient scale and quantity produced.
  • Markup is the difference between price and marginal cost.
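A worked example ties these three quantities together. The linear demand curve and the quadratic cost function below are assumptions, chosen so that the firm earns zero economic profit at its profit-maximizing quantity, as in long-run equilibrium.

```python
import math

A, B = 40.0, 3.0    # assumed demand curve: P = A - B*Q
F, C = 100.0, 1.0   # assumed total cost: TC = F + C*Q**2


def price(q):
    return A - B * q


def marginal_cost(q):
    return 2 * C * q


def average_total_cost(q):
    return F / q + C * q


# Profit maximization: MR = A - 2*B*Q equals MC = 2*C*Q.
q_star = A / (2 * B + 2 * C)

# Efficient scale: ATC is minimized where MC = ATC, i.e. Q = sqrt(F / C).
efficient_scale = math.sqrt(F / C)

markup = price(q_star) - marginal_cost(q_star)
excess_capacity = efficient_scale - q_star
profit = (price(q_star) - average_total_cost(q_star)) * q_star

print(q_star, price(q_star), markup, excess_capacity, profit)
```

With these numbers the firm produces below its efficient scale and charges a price above marginal cost, yet earns zero economic profit because price equals average total cost.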

Definitions

1.Oligopoly– a market structure in which only a few sellers offer similar or identical products

2.Monopolistic competition– a market structure in which many firms sell products that are similar but not identical

Review

There are some exercises in this book. I recorded questions I didn’t answer correctly here.

  1. New firms will enter a monopolistically competitive market if_____(Correct answer: D)

a. marginal revenue is greater than marginal cost

b. marginal revenue is greater than average total cost

c. price is greater than marginal cost

d. price is greater than average total cost

My wrong answer: C

  2. What is true of a monopolistically competitive market in long-run equilibrium? (Correct answer: A)

a. Price is greater than marginal cost

b. Price is equal to marginal revenue

c. Firms make positive economic profits

d. Firms produce at the minimum of average total cost

My wrong answer: D

  3. Advertising can be a signal of quality_____ (Correct answer: B)

a. if advertising is freely available to all firms

b. if the benefit of attracting customers is greater for firms with better products

c. only if consumers are irrationally attracted to products they see advertised

d. only if the content of the ads contains credible information about the products

My wrong answer: D

Reading notes--Principles of Economics (chapter15)

Review

Chapter 15 is Monopoly. This chapter introduces monopoly, a kind of market structure.

  • A monopoly is a firm that is the sole seller of a product without any close substitutes.

image.png

  • Natural monopoly is a type of monopoly that arises because a single firm can supply a good or service to an entire market at a lower cost than could two or more firms

image.png

  • A monopolist’s marginal revenue is less than the price of its good because of the downward-sloping demand curve.

Profit maximization

image.png

  • Find the Q at which MR = MC.
  • On the demand curve, find P at which consumers will buy Q.
  • If P > ATC, the monopoly earns a profit.

For a competitive firm: P=MR=MC

For a monopoly firm: P>MR=MC
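The three steps for profit maximization can be followed with a linear demand curve. This is a sketch under assumed numbers: the demand curve P = a - bQ, the constant marginal cost and the fixed cost are all hypothetical.

```python
def monopoly_outcome(a, b, mc, fixed_cost):
    """Profit-maximizing quantity, price and profit for demand P = a - b*Q.

    Assumes a constant marginal cost `mc`, so total cost is fixed_cost + mc*Q.
    """
    # Step 1: marginal revenue is a - 2*b*Q; set it equal to marginal cost.
    quantity = (a - mc) / (2 * b)
    # Step 2: read the price off the demand curve at that quantity.
    price = a - b * quantity
    # Step 3: profit is (P - ATC) * Q, which equals revenue minus total cost.
    total_cost = fixed_cost + mc * quantity
    profit = price * quantity - total_cost
    return {"quantity": quantity, "price": price, "profit": profit}


base = monopoly_outcome(a=100.0, b=1.0, mc=20.0, fixed_cost=600.0)
higher_fixed = monopoly_outcome(a=100.0, b=1.0, mc=20.0, fixed_cost=800.0)
print(base)
print(higher_fixed)
```

The price (60) is above marginal revenue and marginal cost (both 20 at the chosen quantity), which matches P > MR = MC. Raising the fixed cost changes neither the quantity nor the price and only lowers profit, because fixed cost does not enter the MR = MC condition.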

The deadweight loss

image.png

  • A benevolent social planner maximizes total surplus in the market by choosing the level of output where the marginal cost curve and demand curve intersect.

image.png

  • Because a monopoly charges a price above marginal cost, not all consumers who value the good at more than its cost buy it. This causes a deadweight loss.
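The size of the deadweight loss can be worked out for a simple case. The demand curve P = 100 - Q and the constant marginal cost of 20 are assumed numbers for illustration.

```latex
% Assumed: demand P = 100 - Q, constant marginal cost MC = 20.
% Efficient quantity (demand meets marginal cost):  100 - Q_c = 20 \Rightarrow Q_c = 80
% Monopoly quantity (MR = MC):                      100 - 2Q_m = 20 \Rightarrow Q_m = 40, \quad P_m = 100 - 40 = 60

\text{DWL} = \tfrac{1}{2}\,(P_m - MC)\,(Q_c - Q_m)
           = \tfrac{1}{2}\,(60 - 20)\,(80 - 40)
           = 800
```

The triangle covers the 40 units between the monopoly quantity and the efficient quantity. Buyers value each of those units at more than the marginal cost of 20, but they are not produced.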

Price discrimination

  • Price discrimination is the business practice of selling the same good at different prices to different customers.

image.png

image.png

  • Because different consumers have a different willingness to pay for the same good, price discrimination can increase total surplus.

Differences between Competition and Monopoly

image.png

Definitions

1.Monopoly– a firm that is the sole seller of a product without any close substitutes

2.Natural monopoly– a type of monopoly that arises because a single firm can supply a good or service to an entire market at a lower cost than could two or more firms

3.Price discrimination– the business practice of selling the same good at different prices to different customers

Review

There are some exercises in this book. I recorded questions I didn’t answer correctly here.

  1. If a monopoly’s fixed costs increase, its price will_____ and its profit will_____. (Correct answer: D)

a. increase; decrease

b. decrease; increase

c. increase; stay the same

d. stay the same; decrease

My wrong answer: A

  2. The deadweight loss from monopoly arises because_____. (Correct answer: B)

a. the monopoly firm makes higher profits than a competitive firm would

b. some potential consumers who forgo buying the good value it more than its marginal cost

c. consumers who buy the good have to pay more than marginal cost, reducing their consumer surplus

d. the monopoly firm chooses a quantity that fails to equate price and average revenue

My wrong answer: D

  3. Antitrust regulators are likely to prohibit two firms from merging if_____. (Correct answer: C)

a. there are many other firms in the industry

b. there are sizable synergies to the combination

c. the combined firm will have a large share of the market

d. the combined firm will undercut competitors with lower prices

My wrong answer: A

  • Copyrights © 2021-2022 Alan