Overview of asynchronous operations

Single thread model

The single-threaded model means that JavaScript runs on only one thread. In other words, JavaScript can perform only one task at a time, and all other tasks must wait in a queue.

Note that JavaScript running on one thread does not mean that the JavaScript engine has only one thread. In fact, the engine has multiple threads. A single script can run on only one of them (called the main thread), while the other threads cooperate in the background.

The reason JavaScript is single-threaded rather than multi-threaded is historical. JavaScript has been single-threaded since its birth, because the intention was to avoid making the browser too complicated: multiple threads need to share resources and may modify each other's results, which is too complex for a web scripting language. Suppose JavaScript had two threads at the same time, and one thread added content to a DOM node of the webpage while the other deleted that node. Which thread should the browser follow? Would a locking mechanism be needed? To avoid this complexity, JavaScript was single-threaded from the beginning. This has become a core feature of the language and will not change in the future.

The advantage of this model is that it is relatively simple to implement and the execution environment is relatively simple. The disadvantage is that if one task takes a long time, all subsequent tasks must wait in the queue, which delays the whole program. A browser that stops responding (appears frozen) is often caused by a piece of JavaScript code that runs for a long time (such as an infinite loop), leaving the page stuck at that point so that no other task can run. The JavaScript language itself is not slow. What is slow is reading and writing external data, such as waiting for an Ajax request to return a result. If the remote server is slow to respond or the network is poor, the script stalls for a long time.

If the queue builds up because of heavy computation that keeps the CPU busy, that is understandable. Most of the time, however, the CPU is idle, because I/O operations (input and output) are slow (for example, an Ajax operation reading data from the network) and the program has to wait for the result before proceeding to the next step. The designers of the JavaScript language realized that the CPU does not need to wait for the I/O operation at all: it can suspend the waiting task and run the tasks queued behind it first. When the I/O operation returns its result, the suspended task is resumed. This mechanism is the "Event Loop" used internally by JavaScript.

Although the single-threaded model imposes significant limits on JavaScript, it also gives it advantages that other languages do not have. Used well, JavaScript programs do not block, which is why Node can handle high-traffic access with very few resources.

To make use of multi-core CPUs, HTML5 introduced the Web Worker standard, which allows a JavaScript script to create multiple threads. However, the child threads are completely controlled by the main thread and must not operate on the DOM. Therefore, this standard does not change the single-threaded nature of JavaScript.

Synchronous tasks and asynchronous tasks

All tasks in a program can be divided into two categories: synchronous tasks and asynchronous tasks.

Synchronous tasks are those tasks that are not suspended by the engine and are queued for execution on the main thread. Only when the previous task is completed can the next task be executed.

Asynchronous tasks are those that the engine sets aside: they do not enter the main thread but go into the task queue instead. Only when the engine determines that an asynchronous task is ready to run (for example, an Ajax operation has received its result from the server) does the task, in the form of a callback function, enter the main thread for execution. The code that follows an asynchronous task runs immediately, without waiting for the task to finish. In other words, asynchronous tasks do not "block".
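The non-blocking behavior described above can be illustrated with a minimal sketch, assuming the standard setTimeout timer as the asynchronous task. The log array is a hypothetical helper used only to record the order of execution.

```javascript
// Records the order in which the steps run (illustrative helper).
var log = [];

log.push("sync 1");

// The callback is set aside and queued; it does not block the code below.
setTimeout(function () {
  log.push("async callback");
  console.log(log.join(" -> "));
  // sync 1 -> sync 2 -> async callback
}, 0);

// Runs immediately, without waiting for the timer above.
log.push("sync 2");
```

Even with a delay of 0, the callback runs only after the surrounding synchronous code has finished.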

For example, an Ajax operation can be handled as a synchronous task or as an asynchronous task, at the developer's discretion. If it is synchronous, the main thread waits for the Ajax operation to return its result before continuing. If it is asynchronous, the main thread sends the Ajax request and moves straight on, and when the Ajax operation has a result, the main thread executes the corresponding callback function.

Task queue and event loop

When JavaScript is running, in addition to a running main thread, the engine also provides a task queue, which contains various asynchronous tasks that need to be processed by the current program. (Actually, there are multiple task queues depending on the type of asynchronous tasks. For ease of understanding, it is assumed that there is only one queue.)

First, the main thread executes all synchronous tasks. When they are finished, it looks at the asynchronous tasks in the task queue. If an asynchronous task meets its conditions, it re-enters the main thread and starts executing, at which point it becomes a synchronous task. When it finishes, the next asynchronous task enters the main thread. Once the task queue is empty, the program ends.

Asynchronous tasks are usually written as callback functions. Once an asynchronous task re-enters the main thread, the corresponding callback function is executed. If an asynchronous task has no callback function, it does not enter the task queue, that is, it does not re-enter the main thread, because no callback function specifies what to do next.

How does the JavaScript engine know whether an asynchronous task has a result and can enter the main thread? The answer is that the engine checks over and over again: whenever the synchronous tasks have finished, the engine checks whether any suspended asynchronous task can enter the main thread. This repeated checking is called the event loop. Wikipedia defines it as "a programming construct that waits for and dispatches events or messages in a program".
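To make the point concrete, here is a small sketch in which a long synchronous task delays a timer callback. It assumes a busy-wait loop of roughly 50 ms as the stand-in for a long synchronous task; the variable names are illustrative.

```javascript
var start = Date.now();
var waited = null; // filled in when the timer callback finally runs

// Ask for the callback as soon as possible (0 ms delay).
setTimeout(function () {
  waited = Date.now() - start;
  console.log("callback ran after about " + waited + " ms");
}, 0);

// A synchronous task that keeps the main thread busy for roughly 50 ms.
while (Date.now() - start < 50) {
  // busy-wait: no queued callback can run during this loop
}
console.log("synchronous task finished");
```

The callback was requested with no delay, yet it cannot enter the main thread until the loop has ended, so the reported wait is roughly the length of the synchronous task.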

Asynchronous operation mode

The following summarizes several modes of asynchronous operation.

Callback

The callback function is the most basic way to handle asynchronous operations.

The following are two functions f1 and f2. The programming intention is that f2 must wait until the execution of f1 is completed before it can be executed.

function f1() {
  // ...
}

function f2() {
  // ...
}

f1();
f2();

The problem with the above code is that if f1 is an asynchronous operation, f2 will be executed immediately instead of waiting until the end of f1.

In that case, you can rewrite f1 so that f2 is passed in as its callback function.

function f1(callback) {
  // ...
  callback();
}

function f2() {
  // ...
}

f1(f2);

The advantage of callback functions is that they are simple and easy to understand and implement. The disadvantage is that they make code harder to read and maintain. The parts of the program become highly [coupled](http://en.wikipedia.org/wiki/Coupling_(computer_programming)), which makes the program structure chaotic and the flow hard to trace (especially when multiple callback functions are nested), and each task can specify only one callback function.
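As a minimal sketch of the callback pattern with a genuinely asynchronous f1, the following assumes a short setTimeout delay stands in for the real work of f1. The calls array is a hypothetical helper that records the order of execution.

```javascript
var calls = [];

// f1 finishes its work later (simulated with a timer), then calls back.
function f1(callback) {
  setTimeout(function () {
    calls.push("f1 done");
    callback();
  }, 10);
}

function f2() {
  calls.push("f2 runs");
  console.log(calls.join(", ")); // f1 done, f2 runs
}

f1(f2);
```

Because f2 is invoked from inside the timer callback, it runs only after the asynchronous part of f1 has completed.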

Event monitoring

Another way of thinking is to use an event-driven model. The execution of asynchronous tasks does not depend on the order of the code, but on whether an event occurs.

Let's take f1 and f2 as the example again. First, bind an event to f1 (jQuery syntax is used here).

f1.on("done", f2);

The above line of code means that when a done event occurs in f1, f2 is executed. Then, rewrite f1:

function f1() {
  setTimeout(function () {
    // ...
    f1.trigger("done");
  }, 1000);
}

In the above code, f1.trigger('done') means that once f1 has finished, the done event is triggered immediately, which starts the execution of f2.

The advantage of this method is that it is fairly easy to understand, multiple events can be bound, each event can have multiple callback functions, and the parts can be decoupled, which helps with modularity. The disadvantage is that the entire program has to become event-driven, and the flow of execution becomes much less clear. When reading the code, it is difficult to see the main flow.
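The same idea can be sketched without jQuery by using the standard EventTarget interface, which is available in browsers and in recent Node.js versions. The emitter object, the extra listener f3 and the fired array are assumptions made for illustration and are not part of the original example.

```javascript
// A standalone event target stands in for the object that emits "done".
var emitter = new EventTarget();
var fired = [];

function f2() {
  fired.push("f2");
}

function f3() {
  fired.push("f3");
}

// Several callback functions can be bound to the same event.
emitter.addEventListener("done", f2);
emitter.addEventListener("done", f3);

function f1() {
  setTimeout(function () {
    // ... the work of f1 ...
    emitter.dispatchEvent(new Event("done"));
    console.log(fired.join(", ")); // f2, f3
  }, 10);
}

f1();
```

Here f1 knows nothing about f2 or f3; it only announces that it is done, which is the decoupling the paragraph above refers to.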

Publish/Subscribe

An event can be understood as a "signal". Suppose there is a "signal center": when a task completes, it "publishes" a signal to the signal center, and other tasks can "subscribe" to that signal at the signal center to know when they can start executing. This is called the "publish-subscribe pattern", also known as the "[observer pattern](http://en.wikipedia.org/wiki/Observer_pattern)".

There are multiple [implementations](http://msdn.microsoft.com/en-us/magazine/hh201955.aspx) of this pattern. The one used below is Ben Alman's [Tiny Pub/Sub](https://gist.github.com/661855), a plug-in for jQuery.

First, f2 subscribes to the done signal at the signal center jQuery.

jQuery.subscribe("done", f2);

Then, f1 is rewritten as follows.

function f1() {
  setTimeout(function () {
    // ...
    jQuery.publish("done");
  }, 1000);
}

In the above code, jQuery.publish('done') means that after f1 has finished, the done signal is published to the signal center jQuery, which triggers the execution of f2.

After f2 has finished executing, it can unsubscribe.

jQuery.unsubscribe("done", f2);

This method is similar in nature to "event monitoring", but it is clearly better: you can inspect the "signal center" to see how many signals exist and how many subscribers each signal has, and so monitor the operation of the program.
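To show what such a signal center might look like internally, here is a minimal sketch written from scratch. It is not the Tiny Pub/Sub implementation; the signalCenter object, its count method and the received array are hypothetical names chosen for illustration.

```javascript
// Hypothetical signal center: maps each signal name to its subscribers.
var signalCenter = {
  topics: {},
  subscribe: function (name, handler) {
    (this.topics[name] = this.topics[name] || []).push(handler);
  },
  unsubscribe: function (name, handler) {
    this.topics[name] = (this.topics[name] || []).filter(function (h) {
      return h !== handler;
    });
  },
  publish: function (name, data) {
    // Copy the list so handlers may unsubscribe while being called.
    (this.topics[name] || []).slice().forEach(function (handler) {
      handler(data);
    });
  },
  // Lets the program inspect how many subscribers a signal has.
  count: function (name) {
    return (this.topics[name] || []).length;
  }
};

var received = [];
function f2(data) {
  received.push(data);
}

signalCenter.subscribe("done", f2);
console.log(signalCenter.count("done")); // 1
signalCenter.publish("done", "result of f1");
signalCenter.unsubscribe("done", f2);
signalCenter.publish("done", "ignored"); // no subscribers left
console.log(received.length); // 1
```

The count method is the kind of inspection the paragraph above describes: because every subscription passes through one object, the program can ask it who is listening.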

Process control of asynchronous operation

If there are multiple asynchronous operations, there is a process control problem: how to determine the order in which asynchronous operations are executed, and how to ensure that this order is adhered to.

function async(arg, callback) {
  console.log(
    "The parameter is " + arg + ", the result will be returned in 1 second"
  );
  setTimeout(function () {
    callback(arg * 2);
  }, 1000);
}

The async function of the above code is an asynchronous task, which is very time-consuming. Each execution takes 1 second to complete, and then the callback function is called.

Suppose there are six such asynchronous tasks, and all of them must complete before the final function can be executed. How should the flow be arranged?

function final(value) {
  console.log("Completed:", value);
}

async(1, function (value) {
  async(2, function (value) {
    async(3, function (value) {
      async(4, function (value) {
        async(5, function (value) {
          async(6, final);
        });
      });
    });
  });
});
// The parameter is 1, the result will be returned in 1 second
// The parameter is 2, the result will be returned in 1 second
// The parameter is 3, the result will be returned in 1 second
// The parameter is 4, the result will be returned in 1 second
// The parameter is 5, the result will be returned in 1 second
// The parameter is 6, the result will be returned in 1 second
// Completed: 12

In the above code, the six nested callback functions are not only troublesome to write, but also error-prone and difficult to maintain.

Serial execution

We can write a flow control function that manages the asynchronous tasks so that each task starts only after the previous one has completed. This is called serial execution.

var items = [1, 2, 3, 4, 5, 6];
var results = [];

function async(arg, callback) {
  console.log(
    "The parameter is " + arg + ", the result will be returned in 1 second"
  );
  setTimeout(function () {
    callback(arg * 2);
  }, 1000);
}

function final(value) {
  console.log("Completed:", value);
}

function series(item) {
  if (item) {
    async(item, function (result) {
      results.push(result);
      return series(items.shift());
    });
  } else {
    return final(results[results.length - 1]);
  }
}

series(items.shift());

In the above code, the series function is the serial flow control function. It executes the asynchronous tasks one after another and calls the final function once all of them are completed. The items array stores the argument of each asynchronous task, and the results array stores the result of each task.

Note that the above code takes about six seconds to complete.

Parallel execution

The flow control function can also run the tasks in parallel: all asynchronous tasks are started at the same time, and the final function is executed after all of them are completed.

var items = [1, 2, 3, 4, 5, 6];
var results = [];

function async(arg, callback) {
  console.log(
    "The parameter is " + arg + ", the result will be returned in 1 second"
  );
  setTimeout(function () {
    callback(arg * 2);
  }, 1000);
}

function final(value) {
  console.log("Completed:", value);
}

items.forEach(function (item) {
  async(item, function (result) {
    results.push(result);
    if (results.length === items.length) {
      final(results[results.length - 1]);
    }
  });
});

In the above code, the forEach method will initiate six asynchronous tasks at the same time, and the final function will not be executed until all of them are completed.

In comparison, the above code takes only about one second to complete. Parallel execution is therefore more efficient and saves time compared with serial execution, which can run only one task at a time. The problem is that with many parallel tasks it is easy to exhaust system resources and slow everything down. Hence a third method of flow control.
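One caveat about the parallel version above: results.push records results in completion order, which matches the input order only because every task takes the same time. A minimal sketch of one way to keep the input order, assuming tasks of differing durations, stores each result by index. The task function and the delays are hypothetical and not from the original example.

```javascript
var items = [30, 10, 20]; // hypothetical delays in milliseconds
var results = new Array(items.length);
var finished = 0;

// Illustrative task: takes `ms` milliseconds, then reports `ms * 2`.
function task(ms, callback) {
  setTimeout(function () {
    callback(ms * 2);
  }, ms);
}

function final(values) {
  console.log("Completed:", values.join(", ")); // Completed: 60, 20, 40
}

items.forEach(function (item, index) {
  task(item, function (result) {
    results[index] = result; // keep input order, not completion order
    finished++;
    if (finished === items.length) {
      final(results);
    }
  });
});
```

A separate counter is needed here because results already has its full length from the start, so results.length can no longer signal completion.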

Combination of Parallel and Serial

The combination of parallel and serial execution sets a threshold: at most n asynchronous tasks run in parallel at any time, which avoids taking up too many system resources.

var items = [1, 2, 3, 4, 5, 6];
var results = [];
var running = 0;
var limit = 2;

function async(arg, callback) {
  console.log(
    "The parameter is " + arg + ", the result will be returned in 1 second"
  );
  setTimeout(function () {
    callback(arg * 2);
  }, 1000);
}

function final(value) {
  console.log("Completed:", value);
}

function launcher() {
  while (running < limit && items.length > 0) {
    var item = items.shift();
    async(item, function (result) {
      results.push(result);
      running--;
      if (items.length > 0) {
        launcher();
      } else if (running === 0) {
        final(results);
      }
    });
    running++;
  }
}

launcher();

In the above code, at most two asynchronous tasks run at the same time. The variable running records the number of tasks currently running. As long as it is below the threshold and items remain, a new task is started. When no items remain and running is equal to 0, all tasks have been executed and the final function is called. Unlike the previous examples, final here receives the whole results array.

This code takes about three seconds to complete, which falls between serial execution and parallel execution. By adjusting the limit variable, you can find a balance between efficiency and resource usage.
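The three strategies can be folded into one reusable helper, sketched below under stated assumptions. The names runLimited and double are hypothetical, and the 10 ms delay replaces the one-second delay only so that the sketch finishes quickly. A limit of 1 gives serial execution, and a limit equal to the number of items gives fully parallel execution.

```javascript
// Hypothetical helper: run worker(item, cb) over items, at most `limit` at once.
function runLimited(items, limit, worker, done) {
  var results = new Array(items.length);
  var next = 0; // index of the next item to start
  var running = 0; // tasks currently in flight
  var finished = 0; // tasks that have called back

  if (items.length === 0) {
    done(results);
    return;
  }

  function launch() {
    while (running < limit && next < items.length) {
      start(next++);
    }
  }

  function start(index) {
    running++;
    worker(items[index], function (result) {
      results[index] = result; // stored by index to keep input order
      running--;
      finished++;
      if (finished === items.length) {
        done(results);
      } else {
        launch();
      }
    });
  }

  launch();
}

// Illustrative task with a short delay.
function double(arg, callback) {
  setTimeout(function () {
    callback(arg * 2);
  }, 10);
}

runLimited([1, 2, 3, 4, 5, 6], 2, double, function (results) {
  console.log("Completed:", results.join(", "));
  // Completed: 2, 4, 6, 8, 10, 12
});
```

Keeping the counters inside the function, rather than in global variables as in the examples above, lets several independent batches run at the same time without interfering with each other.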