Reflections

Pretending to Be Asleep

Cross-posted on Quora under the question “What can non-Chinese do to convince the Chinese that they are being oppressed?”

I do not think non-Chinese people should do anything. There is a popular saying in China: you cannot wake up a person who is pretending to be asleep (你叫不醒一个装睡的人). I believe this is a proper description of what is going on today with many Chinese people.

The recent emergence of a wave of overseas Chinese who grew up in mainland China and who continue to show unwavering, sometimes unconditional, support for the CCP-ruled “motherland” (祖国) after moving abroad, often with zero tolerance for any (imagined or real) “criticism/insult against China” (辱华), has given the world a good opportunity to see how some Chinese people pretend to be asleep even when they are no longer obliged to do so. I would refer to Chris Wang’s answer in this same thread for a plausibly representative perspective from this group. But before I give my two cents on the issue, I would commend Chris Wang for pointing out that whether a person grows up under CCP rule or in the West, they are very likely to be brainwashed either way.

However, the crux of the matter is, how many types of brainwashing can there be?

For those growing up under CCP rule (like myself), there is basically only one type of brainwashing, i.e. listening to the Party. This includes explicit propaganda, censorship/eradication of unwanted voices offline and online, mandatory patriotic/nationalistic education for schoolchildren, etc. Even if a member of this group has physically relocated to another country, the knee-jerk responses formed by their younger selves’ experience remain an integral part of their way of life (political or otherwise).

Now compare that to those growing up in an environment where there are numerous channels and countless ways of brainwashing, including, but not limited to, the press (say, the New York Times, which, by the way, is a profit-making business whose stock price has gone up almost 200% in the past three years), social media, Wikipedia, Q&A platforms such as Quora, this Chinese-politics-related Quora question, the many (oft-opposing) answers people have submitted under this question, the comments on the answers, the comments on the comments, and so on—all of which are freely available on the internet. And if you go into the streets, parks, and sidewalks during a protest, you might see more eye-widening or even repugnant brainwashing ideas. Sure, these brainwashing ideas are not equally distributed, but the possibilities are endless, and vastly different brainwashing ideas do have a fair chance to compete against each other, at least in theory.

So why do some overseas Chinese people like Chris Wang still feel oppressed, provided that more is better when it comes to brainwashing? I believe this lies in the fact that unlike what they were accustomed to back where they grew up, they have now realized that in the West it is not at all easy to brainwash another person using only one type of brainwashing technique—which they themselves acquired from being brainwashed as they grew up—no matter how well the people in the “motherland” have responded.

Hence the feeling of powerlessness.

But remember, even when they air such grievances against the “other,” those overseas Chinese are not forced to shut up by any authority. Indeed, when they freely express opinions that their Western neighbors, police officers, mayors, and prime ministers might not like, they are fully aware that they can do so precisely because, whatever they say, they will face relatively minor oppression, if any, with the only exception being “bad-mouthing China,” possibly because “it is my motherland,” or more likely because “my current/future business is there.”

While few of them would openly acknowledge this “I can say whatever I want to say” privilege, they nevertheless take it for granted when they vent their anger on Facebook, Twitter, and Quora, as well as when they march through the streets carrying the Five-starred Red Flag, knowing from the bottom of their hearts that none of these is feasible today where they grew up. Moreover, here they are given a full opportunity to brainwash the “other” using their very own, albeit limited, argument (or name-calling, for better or for worse), which is impossible for the “other” to do in the same way in China without some serious repercussions.

So you think they need to wake up? They have been awake since the very beginning.

A question remains to be answered: for those overseas Chinese who claim that they truly feel oppressed by the non-Chinese actors in their diasporas (the press, the government, the “other” people, etc.) but not their own government and people back in China, if they are reasonable, consistent, and honest to themselves, then why don’t they simply refrain from publicly expressing their complaints—just like what they would self-consciously do towards their “less oppressive motherland”—and go back to China for an ostensibly better future?

I bet many of them know exactly why they are not doing it.

Mathematics, Reflections

On Kant and Mathematics

Cross-posted on Quora under the question “Did Kant make a mistake when he said that mathematics is synthetic and a priori?”

The influential German philosopher Immanuel Kant famously claimed in his Critique of Pure Reason that mathematics contains a body of knowledge that is synthetic a priori. I believe this claim is wrong, and here is why.

First, it seems obvious to me that there is some mathematics involved in our everyday interaction with the physical world. Yet, to say that such mathematics is a priori and true becomes questionable as soon as we look more closely into how human brains “think” mathematics.

From an evolutionary biology and cognitive science perspective, our brains can “think” everyday mathematics mainly because:

  1. Natural selection has preserved such hunting-and-gathering-related capacities as perceiving, abstracting, classifying, and grouping objects, which are coded in our genes and expressed as part of the biological development of our brains;
  2. Through subjective experience in the real world, especially during early childhood, our brains have acquired stereotyped “principles” and “rules” in relation to objects and their mathematical properties, which are operationalized by physical/chemical reactions among neurons under specific arrangements/relations/patterns in our brains.

Now, to evaluate the claim that mathematics is synthetic a priori, we may need to review the particular definition of the term that Kant used. According to Wikipedia (Analytic–synthetic distinction):

In the Introduction to the Critique of Pure Reason, Kant contrasts his distinction between analytic and synthetic propositions with another distinction, the distinction between a priori and a posteriori propositions. He defines these terms as follows:

a priori proposition: a proposition whose justification does not rely upon experience. Moreover, the proposition can be validated by experience, but is not grounded in experience. Therefore, it is logically necessary.

So basically an a priori mathematical proposition is justifiable independently of any experience whatsoever. For instance, and I follow Kant’s famous example, one can see that 5 + 7 = 12 is “true” by merely thinking about it, without referencing any experience such as 5 apples, 7 apples, 12 apples, and so on.

However, when we take the human brain’s inner workings into account, the fact that you know 5 + 7 = 12 is “true” by merely thinking about it actually involves the firing of neurons whose arrangements/relations/patterns have all been determined by the previous physical and experiential processes I have described above. There is absolutely no guarantee that these processes will lead to truth. Indeed, it is well known that in the mathematics used to describe the quantum world, the commutative law of multiplication (i.e., a × b = b × a) that we take for granted does not generally apply: observables are represented by operators (matrices), and for two such operators A and B, AB need not equal BA. There really is no a priori truth other than what you believe to be true, mostly because of your heuristic experience with the world, which has been embedded in your thinking apparatus by the time you make the calculation.

To an outsider who can observe your neural activities, this whole “I know 5 + 7 = 12 by merely thinking about it” magic is no different, in principle, from a mechanical Turing machine manipulating symbols on a strip of tape according to a predetermined table of rules, which will unfailingly give the result “5 + 7 = 12” (in its binary form) if requested. But nothing prevents the rules from changing so that something like “5 + 7 = 21” becomes the result. And that is exactly why mathematics can be done analytically.

Predictive Analytics

Predicting Outcomes of New Entrants

In business consulting and competitor analysis, an important question that often arises is “what would happen if I were to open a new business here?” For one, it is a counterfactual question, and the consultant/analyst cannot directly find an answer from data analysis alone. This is because the relationships among variables would have been different had the new entrant existed, but by definition, the new entrant could not have been there when data was first collected, so the “naïve” patterns learned from historical data are subject to logical contradictions and poor generalizability.

In this article, I argue that with the help of a conceptual model, it may be possible to predict outcomes of a new entrant in a logically consistent way. I will use a simple example to show what this means.

Suppose I was a fitness franchisor and wanted to decide whether to open a new gym in the city. I gathered some market intelligence and found that there were L locations where customers might come from (e.g. the more affluent neighborhoods), and there were already G gyms in the city competing with each other. I also did some market research and got the total fitness sales revenue for each of the L locations (e.g. from the Consumer Expenditure Surveys).

With all this information, I define a market penetration score (MPS): $$MPS_{lg}=\frac{ISR_{lg}}{TSR_{l}},$$ where ISR is the individual sales revenue (of gym g at location l), and TSR is the total sales revenue (at l). While I cannot observe every g’s individual sales revenue, I do know the total sales revenue based on my market research, and I also know that $$TSR_{l}=\sum_{g=1}^{G}ISR_{lg},$$ which is just a definition.

I then imagine opening a new gym somewhere in the city, and call it h. Instinctively, I would predict the ISR of the new entrant at l via a predicted MPS (by training a machine learning model, which I will describe later) together with my information on TSR: $$\widehat{MPS}_{lh}\times TSR_{l}.$$ However, because of the introduction of the new gym, the ISR of each current g at location l is expected to reduce by $$\frac{\widehat{MPS}_{lh}}{1+\widehat{MPS}_{lh}}\times ISR_{lg}.$$ As a result, the predicted ISR should be modified to $$\frac{\widehat{MPS}_{lh}}{1+\widehat{MPS}_{lh}}\times TSR_{l}.$$
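To make the adjustment concrete, here is a minimal sketch in JavaScript with made-up numbers. The function name `predictEntry`, the revenue figures, and the predicted MPS of 0.25 are all assumptions for illustration. In practice the incumbents' ISR values are not observed; they appear here only to check that revenues still add up to TSR after entry.

```javascript
// Sketch of the entry adjustment at a single location l (hypothetical helper).
// tsr: TSR_l; isrExisting: ISR_lg for each incumbent gym g; mpsHat: predicted raw MPS of entrant h.
function predictEntry(tsr, isrExisting, mpsHat) {
  // Fraction of revenue that shifts to the entrant: MPS / (1 + MPS).
  const shift = mpsHat / (1 + mpsHat);

  // Naive prediction, which ignores that incumbents lose revenue to the entrant.
  const naiveIsr = mpsHat * tsr;

  // Adjusted prediction for the entrant.
  const adjustedIsr = shift * tsr;

  // Each incumbent's ISR is reduced by shift * ISR_lg.
  const incumbentsAfterEntry = isrExisting.map((isr) => isr * (1 - shift));

  // Entrant plus incumbents after entry; should equal TSR_l when the inputs sum to TSR_l.
  const totalAfterEntry =
    adjustedIsr + incumbentsAfterEntry.reduce((a, b) => a + b, 0);

  return { naiveIsr, adjustedIsr, incumbentsAfterEntry, totalAfterEntry };
}

// Illustrative numbers only: three incumbents sharing a total of 100,000.
const entryResult = predictEntry(100000, [50000, 30000, 20000], 0.25);
console.log(entryResult);
```

With the naive formula, the entrant's revenue plus the incumbents' unchanged revenues would exceed TSR, which is the logical contradiction the adjusted formula avoids.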

This is what I mean by having a conceptual model: thinking clearly about the observed patterns being learned by machine learning models vs. the inner workings being applied to get logically consistent predictions.

Now comes the machine learning part. The predicted “raw” MPS can be acquired by fitting $$MPS_{lg}=MPS\left(D_{lg},A_{lg};\Theta\right)$$ to past data for each of my other gyms g at every location l in the same city or another area. Here, D is the distance-specific characteristics between g and l, and A represents other store-specific characteristics. There are a great many ways in machine learning to automatically find a function and parameter set for MPS, but oftentimes a traditional functional form specification such as the Huff model might work just as well (and is easy to interpret).
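As one possible functional form, a Huff-type specification can be sketched as follows. This is a sketch under stated assumptions, not a fitted model: I take A to be a single attractiveness measure (`size`), D to be a single `distance`, and the parameter set to be two exponents `alpha` and `beta`. The function name `huffShares` and all numbers are hypothetical.

```javascript
// Huff-type market shares at one location l (hypothetical helper).
// Each gym's attraction is size^alpha / distance^beta, and its share is
// its attraction divided by the total attraction of all gyms considered.
function huffShares(gyms, alpha, beta) {
  const attraction = gyms.map(
    (g) => Math.pow(g.size, alpha) / Math.pow(g.distance, beta)
  );
  const total = attraction.reduce((a, b) => a + b, 0);
  return attraction.map((a) => a / total);
}

// Two equally sized gyms; the second is twice as far from location l.
const exampleGyms = [
  { name: "g1", size: 1000, distance: 2 },
  { name: "g2", size: 1000, distance: 4 },
];
const exampleShares = huffShares(exampleGyms, 1, 2);
console.log(exampleShares);
```

In a real application, `alpha` and `beta` would be estimated from past data rather than set by hand.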

Finally, by changing site locations and store features, I can search for the best set of characteristics (subject to certain constraints) that maximize the sales revenue of my new gym. Sweet!
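The search itself can be as simple as looping over candidate sites. The sketch below assumes a hypothetical `predictMps(site, location)` function standing in for whatever fitted model is used; the toy predictor, coordinates, and revenue figures are invented purely for illustration.

```javascript
// Pick the candidate site with the highest predicted revenue, summing the
// adjusted prediction MPS / (1 + MPS) * TSR_l over all locations l.
function bestSite(candidates, locations, predictMps) {
  let best = null;
  for (const site of candidates) {
    let revenue = 0;
    for (const loc of locations) {
      const mps = predictMps(site, loc);
      revenue += (mps / (1 + mps)) * loc.tsr;
    }
    if (best === null || revenue > best.revenue) {
      best = { site, revenue };
    }
  }
  return best;
}

// Toy predictor (an assumption, not a fitted model): raw MPS decays with distance.
const toyPredictMps = (site, loc) => {
  const distance = Math.hypot(site.x - loc.x, site.y - loc.y);
  return 0.5 / (1 + distance);
};

const demoLocations = [
  { name: "North", x: 0, y: 0, tsr: 200000 },
  { name: "South", x: 10, y: 0, tsr: 50000 },
];
const demoCandidates = [
  { name: "A", x: 1, y: 0 },
  { name: "B", x: 9, y: 0 },
];

const chosen = bestSite(demoCandidates, demoLocations, toyPredictMps);
console.log(chosen.site.name, Math.round(chosen.revenue));
```

Constraints such as rent or zoning could be handled by filtering `candidates` before the search, and store features could be added as further fields on each candidate.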

Data Visualization, Exploratory Data Analysis

Data Visualization Using Chart.js

Chart.js is a popular library that produces fast, elegant, and interactive charting and data visualization solutions using HTML and JavaScript. The API is very straightforward and easy to use. Chart.js has been widely deployed in admin dashboards and other user-friendly metrics-driven applications, working seamlessly with responsive web design libraries such as Bootstrap 4.

Chart.js also appears in statistics and data analytics oriented applications, though it is less well known in the data science community than the more heavyweight D3.js. While no tool alone can solve all data visualization problems, I argue that Chart.js (together with its rich set of plugins) is able to handle most of the data visualization requests during the exploratory data analysis (EDA) phase of everyday data science practice. Chart.js is especially useful when such tasks are performed in a web application that is accessible to various stakeholders who simply want to discover useful information from data to support their decision-making, regardless of their coding proficiency.

For univariate data visualization, Chart.js is very handy for plotting summary statistics and distributions via the bar chart (histogram of a numerical variable as well as frequency counts of a categorical variable), the pie chart or the doughnut chart (relative frequencies of a categorical variable), and the line chart (values of a time series variable). The tooltip element of Chart.js makes it convenient for the user to instantly locate the exact number behind a bar or a slice of pie.
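As a rough illustration of the histogram case, the sketch below bins a numerical variable by hand and passes the counts to a bar chart configuration. The `histogram` helper, the sample ages, and the canvas id `ageHistogram` are hypothetical, and the `options.scales.y` syntax assumes Chart.js 3 or later (version 2 configures axes differently).

```javascript
// Bin a numeric array into equal-width bins (hypothetical helper, not part of Chart.js).
function histogram(values, binCount) {
  const min = Math.min(...values);
  const max = Math.max(...values);
  const width = (max - min) / binCount || 1; // fall back to 1 when all values are equal
  const counts = new Array(binCount).fill(0);
  for (const v of values) {
    // The maximum value is placed in the last bin rather than a bin of its own.
    const i = Math.min(Math.floor((v - min) / width), binCount - 1);
    counts[i] += 1;
  }
  const labels = counts.map((_, i) => {
    const lo = min + i * width;
    return `${lo.toFixed(1)} to ${(lo + width).toFixed(1)}`;
  });
  return { labels, counts };
}

// Made-up sample data.
const ages = [21, 25, 25, 31, 34, 38, 42, 47, 53, 60];
const ageBins = histogram(ages, 4);

const histogramConfig = {
  type: "bar",
  data: {
    labels: ageBins.labels,
    datasets: [{ label: "Customers per age bin", data: ageBins.counts }],
  },
  options: { scales: { y: { beginAtZero: true } } },
};

// Render only where Chart.js and a DOM are available (e.g. in a browser page).
if (typeof Chart !== "undefined" && typeof document !== "undefined") {
  new Chart(document.getElementById("ageHistogram"), histogramConfig);
}
```

Hovering over a bar then shows the bin's count in the default tooltip, with no extra configuration.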

For multivariate data visualization, in many cases, the user can simply add a second, third, … variable to the one-variable chart by appending data to the value for the datasets key, if it is appropriate to compare these variables with each other. Chart.js creates a “dataset” label for each new variable and shows it in a different color. (See official examples here and here.) When the user clicks on such labels/legends, Chart.js will toggle the visibility of the clicked variables, which essentially performs a select/filter operation.
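Appending a variable might look like the sketch below. The `addVariable` helper, the gym names, the sign-up numbers, and the canvas id `signups` are all invented for illustration; only the `data.labels` and `data.datasets` structure comes from Chart.js itself.

```javascript
// Start from a one-variable bar chart (hypothetical monthly sign-ups for one gym).
const signupConfig = {
  type: "bar",
  data: {
    labels: ["Jan", "Feb", "Mar"],
    datasets: [
      { label: "Downtown", data: [120, 135, 150], backgroundColor: "steelblue" },
    ],
  },
};

// Hypothetical helper: append another variable as a new dataset.
function addVariable(config, label, data, backgroundColor) {
  if (data.length !== config.data.labels.length) {
    throw new Error("a new variable needs exactly one value per label");
  }
  config.data.datasets.push({ label, data, backgroundColor });
  return config;
}

addVariable(signupConfig, "Uptown", [90, 110, 95], "darkorange");

// Render only where Chart.js and a DOM are available (e.g. in a browser page).
if (typeof Chart !== "undefined" && typeof document !== "undefined") {
  new Chart(document.getElementById("signups"), signupConfig);
}
```

If the chart has already been rendered, calling `update()` on the chart instance after changing `datasets` redraws it with the new variable.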

Chart.js is also good for presenting some important features of a two-way table with an intuitive interface. Here I give two examples (here and here) of two-variable data visualization for EDA in Chart.js. My first example uses the scatter chart to plot a y variable against an x variable, where y is categorical. My second example utilizes the bubble chart to plot a categorical y against a similarly categorical x. The flexibility of Chart.js makes it simple to customize configurations using callbacks, and data can be freely manipulated through the API itself. In my second example, the raw data numbers in the r dimension (radius of the bubbles) are correctly squared and rounded as they are displayed in the tooltips.
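A minimal sketch of such a tooltip callback follows. It is not the code from my linked example: the table, labels, and canvas id `twoWay` are made up, and the `options.plugins.tooltip.callbacks.label` location and the `context.raw` property assume Chart.js 3 or later (version 2 uses `options.tooltips` with a different callback signature). The idea is to store the square root of each count as the radius, so that bubble area tracks the count, and then square it back for display.

```javascript
// Hypothetical two-way table: customer counts by membership plan (x) and age group (y).
const twoWayTable = [
  { x: "Basic", y: "Under 30", count: 40 },
  { x: "Basic", y: "30 and over", count: 25 },
  { x: "Premium", y: "Under 30", count: 10 },
  { x: "Premium", y: "30 and over", count: 55 },
];

const bubbleConfig = {
  type: "bubble",
  data: {
    datasets: [
      {
        label: "Customers",
        // Radius is the square root of the count, so bubble area is proportional to the count.
        data: twoWayTable.map(({ x, y, count }) => ({ x, y, r: Math.sqrt(count) })),
      },
    ],
  },
  options: {
    scales: {
      x: { type: "category", labels: ["Basic", "Premium"] },
      y: { type: "category", labels: ["Under 30", "30 and over"] },
    },
    plugins: {
      tooltip: {
        callbacks: {
          // Square the radius and round it so the tooltip shows the original count.
          label: (context) => {
            const { x, y, r } = context.raw;
            return `${context.dataset.label} (${x}, ${y}): ${Math.round(r * r)}`;
          },
        },
      },
    },
  },
};

// Render only where Chart.js and a DOM are available (e.g. in a browser page).
if (typeof Chart !== "undefined" && typeof document !== "undefined") {
  new Chart(document.getElementById("twoWay"), bubbleConfig);
}
```

Rounding matters because squaring a floating-point square root rarely returns the exact integer that went in.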

I am sure there are numerous other ways in Chart.js to play with the data and the charts, and hopefully, this article will help us get started.
