title: Quick Review - ByteDance's Speech on Serverless 2021 GMTC
- quick review
create_time: 2023/10/24 23:17:33
update_time: 2023/10/25 00:51:44
publish_time: 2023/10/25 00:51:34
Let's start with the text version and then the audio version.
GMTC 2021 Speech "ByteDance's Upgrade of Frontend Development Mode Based on Serverless"
by Wang Lei from ByteDance Web Infra on 2021-07-07
Here are my notes; some parts are the original text and some are my own thoughts. I will mark the parts taken from the original.
Serverless is not a new concept, especially at large companies. They definitely have their own BaaS systems; after all, FaaS is relatively easy to build, but the underlying infrastructure still needs customization. Keep this in mind.
Original text: Next, I will introduce today's content from the following 6 aspects:
- First, summarize the responsibilities and challenges of frontend in the era of large frontend.
- Then, introduce the common business forms of ByteDance.
- ByteDance's traditional development mode and challenges.
- Then, introduce how we upgraded the frontend development mode based on Serverless.
- In order to ensure stability, we have also done a lot of work in monitoring and operation.
- Finally, make a simple summary and outlook.
1. Multiple Job Responsibilities
The first part is not surprising: job responsibilities have expanded, from slicing up design mockups to full frontend development, then SSR/BFF/micro frontends/cross-platform, integrated development, and Serverless.
I didn't quite understand the progressive development from BFF to micro frontends.
In terms of the knowledge system, traditional framework knowledge is still needed, and backend tooling has to be supplemented: Redis/MQ/object storage, monitoring and alerting, etc. The main gap to fill is backend and operations knowledge.
2. Multiple Business Forms
The second part is also not surprising: toC + toB, so you cannot avoid CSR + SSR + BFF, and micro frontends are highly regarded. As a global company, global deployment is inevitable, and global deployment requires distributed data synchronization.
3. Needs and Challenges
The third part gives examples of web development problems. Let's take a closer look.
- CSR is the baseline frontend deployment; the extended version integrates CDN + k8s cluster deployment.
- Business highlights: handling object storage, login authentication, AB testing, cluster operation and maintenance.
- BFF deployment cannot avoid k8s cluster.
- Business highlights: permissions, operation and maintenance, traffic control, domain name integration (probably using DNS).
- Deployment system
- Project management, release system, travel management, AB management.
There is not much new here, but for an ordinary frontend developer it is a lot, and it is hard. The problem is that the surface area is too big — and this is where Serverless comes in.
Alright, let's get to the good stuff.
4. Solutions based on Serverless
The fourth part is about the development mode based on Serverless.
Concept Alignment and Expansion
Serverless industry practices:
- FaaS already exists, with scheduling, cold start handling, etc.
- Combination with BaaS
- Cloud functions, Node frameworks, Runtimes, etc.
So, let's create a one-stop frontend solution based on Serverless. The so-called solution needs to start with a diagram: architecture diagram + lifecycle roadmap.
Let's continue, the one-stop platform needs to provide basic platform capabilities, common capabilities out of the box, and developer experience (DX) friendly (just like Nuxt.js).
So, what are the platform capabilities?
No need to read the text explanation — it comes down to platformization; if you know, you know. Architecture first, then implementation.
This part is similar to Modern.js: if you want to enable SSR, it is a one-click switch. That is actually hard to pull off. Like Nuxt.js, it provides options for CSR/SSR/SSG modes and automatic mapping of the API directory. Deployment artifacts are configured online, with routing and domain name allocation.
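To make the "one-click SSR" idea concrete, here is a minimal sketch of what such a framework config and its build-target resolution might look like. The type names and fields are my assumptions, modeled loosely on Nuxt-style options — not the platform's real API.

```typescript
// Hypothetical app config for a Nuxt-like "pick your render mode" platform.
type RenderMode = "csr" | "ssr" | "ssg";

interface AppConfig {
  mode: RenderMode;
  apiDir?: string;   // directory auto-mapped to BFF routes (assumption)
  routes: string[];
}

// Decide which build artifacts a given config requires.
function requiredArtifacts(config: AppConfig): string[] {
  const artifacts = ["client-bundle"];          // every mode ships client JS
  if (config.mode === "ssr") artifacts.push("server-bundle");
  if (config.mode === "ssg") artifacts.push("static-html");
  if (config.apiDir) artifacts.push("bff-bundle");
  return artifacts;
}
```

The point is that flipping `mode` changes what the build pipeline produces, which is what makes "one-click SSR" possible from the platform's side.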
Here comes the architecture diagram:
There is a lot of text here, the left side introduces the architecture diagram, and the right side introduces the lifecycle and data flow.
After the architecture, CI/CD. Built on capabilities provided by Coding, the pipeline seems to be: code submission → compile → lint → security check → manual QA approval → Lighthouse performance check → manual approval for deployment.
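The pipeline described above can be sketched as data. The stage names and manual gates come from the description; the structure itself is my own illustration, not their actual pipeline config.

```typescript
// Illustrative model of the described CI/CD pipeline.
interface Stage {
  name: string;
  manualGate: boolean; // true if a human must approve before this stage runs
}

const pipeline: Stage[] = [
  { name: "compile", manualGate: false },
  { name: "lint", manualGate: false },
  { name: "security-check", manualGate: false },
  { name: "qa-approval", manualGate: true },   // manual approval by QA
  { name: "lighthouse", manualGate: false },    // performance check
  { name: "deploy", manualGate: true },         // manual deployment approval
];

// Return the stages that block fully automatic delivery.
function manualGates(stages: Stage[]): string[] {
  return stages.filter((s) => s.manualGate).map((s) => s.name);
}
```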
There is also a service orchestration pipeline: configuration files get converted into pipeline processes.
Then let's move on to implementation.
Implementation of CSR
First, for regular CSR, the artifacts are automatically uploaded to the CDN — one copy in ES5 and one in ES6. Why separate them? For ESM? The deployment process has a platform control diagram; it seems to be:
- Allocate domain names
- Select the folder for object storage
- Choose to publish
A diagram is provided:
So that's why: the ES6 build enables dynamic polyfilling later on — the server returns the right build based on the User-Agent (similar to https://polyfill.io/), filling the gaps on the fly. As expected from a large company: they use every trick they can think of.
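A minimal sketch of the User-Agent-based build selection described above. The browser version cutoffs here are illustrative guesses, not ByteDance's actual rules.

```typescript
// Pick which pre-built bundle directory to serve, based on the User-Agent.
// Unrecognized browsers fall back to the safe ES5 build.
function pickBundleDir(userAgent: string): "es5" | "es6" {
  const chrome = userAgent.match(/Chrome\/(\d+)/);
  if (chrome && Number(chrome[1]) >= 61) return "es6";   // ES modules landed ~Chrome 61
  const firefox = userAgent.match(/Firefox\/(\d+)/);
  if (firefox && Number(firefox[1]) >= 60) return "es6"; // ~Firefox 60
  return "es5";
}
```

A polyfill.io-style service goes further and assembles the exact polyfills the browser is missing, but the routing decision starts the same way: parse the UA, branch on capability.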
Implementation of SSR
When allocating routes, you can choose SSR and micro frontend modules — this corresponds to changing render modes in Nuxt's configuration.
For users hitting SSR, caching is still necessary: check Redis first, otherwise reach the corresponding service through service discovery (distributed architecture). In exceptional cases, CSR is used as a fallback.
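The cache-then-render-then-fallback flow can be sketched like this. An in-memory Map stands in for Redis, and `renderToString` is a stub, not a real renderer — everything here is illustrative.

```typescript
// In-memory stand-in for the Redis HTML cache.
const cache = new Map<string, string>();

// Stub renderer; a real SSR service would run the app framework here.
function renderToString(url: string): string {
  if (url.includes("boom")) throw new Error("render failed");
  return `<div>ssr:${url}</div>`;
}

// Empty shell the client hydrates on its own — the CSR fallback.
const CSR_SHELL = `<div id="app"></div><script src="/main.js"></script>`;

function handleSsr(url: string): string {
  const hit = cache.get(url);
  if (hit !== undefined) return hit;     // 1. cache hit: serve cached HTML
  try {
    const html = renderToString(url);    // 2. miss: render server-side
    cache.set(url, html);                //    and populate the cache
    return html;
  } catch {
    return CSR_SHELL;                    // 3. render failed: degrade to CSR
  }
}
```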
Implementation of BFF
Oh, I see — this maps onto familiar ground: it works like Nuxt's (Nitro) API directory, alongside the SSR described earlier. After compilation, the artifacts include BFF files indicating the corresponding services, and deploying them produces the BFF services.
It's a bit more complicated, and this is as far as I can understand.
SSR service discovery uses RPC calls, CSR uses HTTP calls, as usual.
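The API-directory-to-route mapping can be sketched as a pure function. The path conventions here (`[id]` → `:id`, `index` maps to the directory root) are assumptions borrowed from common file-based routing frameworks, not necessarily ByteDance's rules.

```typescript
// Convert a file path under the api/ directory into an HTTP route,
// file-based-routing style.
function fileToRoute(filePath: string): string {
  return (
    "/" +
    filePath
      .replace(/\.(ts|js)$/, "")     // strip the extension
      .replace(/\/index$/, "")       // index.ts maps to the directory root
      .replace(/\[(\w+)\]/g, ":$1")  // [id].ts becomes a :id path param
  );
}
```

At build time, walking the directory with this mapping is enough to emit the BFF routing table that the deployed service uses.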
Implementation of Micro Frontends
Micro frontends are about embedding one system's pages into another system's pages within a single page; I won't explain it further here. Internally it is integrated with the Garfish micro frontend system, which can be compared to micro-app / wujie.
The goal of a large company is definitely to be smooth, and they are preparing from the project infrastructure.
For target selection: I need to develop CSR/SSR/BFF/micro frontend applications, so development and delivery are unified. Micro frontends add parent-child module relationships, micro frontend menus, and similar features.
I speculate that the parent-child module relationship requires defining the root container and its corresponding URL routes: each child has a page, and it must be associated with the parent to be unified. I didn't quite understand the purpose.
As for the micro frontend menus, I speculate it means the menu is configurable — after all, the menu is the entry point, and the pages it hits are empty shells that serve as containers. Dynamic authentication is required too, which reminds me of past nightmares, hahaha.
That's as far as I can understand, the traffic comes in, goes through the gateway-container page-loads the micro frontend.
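The gateway → container page → micro app flow can be sketched with a generic registry. This deliberately avoids mimicking Garfish's real API — all names here are illustrative.

```typescript
// A child app registered with the container page.
interface MicroApp {
  name: string;
  activeWhen: string;        // URL prefix that activates this app
  load: () => string;        // stands in for fetching and mounting the bundle
}

// Registry the container page consults; in practice this would come from
// the configurable menu / platform config described above.
const registry: MicroApp[] = [
  { name: "admin", activeWhen: "/admin", load: () => "admin-app" },
  { name: "report", activeWhen: "/report", load: () => "report-app" },
];

// The container page resolves the incoming route to a child app and
// mounts it; unknown routes fall through to a not-found shell.
function mount(path: string): string {
  const app = registry.find((a) => path.startsWith(a.activeWhen));
  return app ? app.load() : "not-found";
}
```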
5. Monitoring and Operation
After bolting so many things onto the frontend, we still need to look at load and operations. With k8s this becomes much simpler. I speculate there is panel monitoring, a monitoring-and-alerting solution, automatic restart and rollback, and maybe even flame graphs.
First, define metrics and set thresholds through a rule engine.
- 404 alert
- 5xx alert
- QPS exceeding alert
- SSR failure alert
With alerts, there needs to be a handling flow (a popup): what you can do when one fires:
- Acknowledge it without handling it
- Mute it for half an hour, or for 6 hours
- Unmute it
- Mark it as a true alert or a false alert; a true alert needs to be handled and reported with tracking
- Copy the content to ask for help
With metrics, you can build a big-screen dashboard — clever. Let's see what the load looks like.
Logs still need to go into a log system, with separate management and filtering — essentially information exploration over massive data, aiming for full traceability.
When a problem occurs, how do you preserve the scene and debug? This part is a bit of a blind spot for me; it should still be remote sourcemaps and flame graph analysis, and the speaker admits there are similar custom internal tools for Node runtime debugging.
Heap snapshot / CPU profile analysis is a bit of black magic to me — I don't understand it. Respect.
6. Summary and Outlook
Future development of serverless:
- Runtime needs to better handle cold starts, dynamic scaling
- Better BaaS platform construction
- Serverless+ is a one-stop service
Their infra team:
- APM platform
- Test infra
- Low code
- Cross-platform solution
- Web development engine
- Engineering platform
- Mobile center
- Node.js architecture
- Web IDE
- Micro frontend solution
- Design ops
Final Thoughts on Reading
As expected from a large company: very honest and grounded, and it goes deep.
And my ability to talk a good game is improving — I could guess what they were talking about. Happy!