#serverless
Yang Kai, China Publishing Group of the Ministry of Industry and Information Technology
Outline
- Overview: the disruptive impact of sls and how it relates to current technology
- FaaS is the practical technology that puts the core idea of serverless to work
- BaaS is the other major building block of sls
Overview
#cloudnative Serverless is the most frequently mentioned concept in cloud native.
The concept of sls and its development philosophy: low operation and maintenance costs and significantly reduced server costs.
For example, with GitHub Pages, developers deploy files as a static site, get a domain name assigned, and the site is served. Many details are hidden.
For example, with a CDN there is no need to care about details such as nodes; users around the world get fast access.
Both can be seen as offering computing resources as a product or service, which fits the serverless concept: hide the technical details and expose only the service.
CNCF released the serverless white paper in 2018
sls means that servers do not need to be maintained or managed, either when building the application or when running it.
A sls computing platform should provide one or both of two capabilities: FaaS and BaaS.
Next: usage scenarios, advantages and disadvantages.
Advantages of FaaS: high development efficiency, low deployment cost, low operation and maintenance cost, low learning cost, low server cost, flexible deployment options, high system security
Disadvantages: high platform learning cost, high debugging cost (the runtime is in the cloud, so debugging and log queries are difficult), cold start, vendor lock-in
Usage scenarios:
- The function itself is stateless and relies on external persistence services for state
- Functions work independently; chaining a large number of functions together drives up computing cost
- Automatic scaling
- Event-driven, such as http, database modification, scheduled events, etc.
- Scenarios with low requirements for cold start
FaaS functions do not handle the HTTP protocol themselves, so web APIs are exposed to the outside world through an API gateway.
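A minimal sketch of what such a gateway does, assuming a hypothetical in-process registry `functions` and a hypothetical event shape (`method`, `path`, `query`, `body`); real platforms define their own event formats. The gateway turns an HTTP-like request into a plain event and hands it to the matching function.

```javascript
// Hypothetical function registry: route key -> FaaS-style handler(event, context).
const functions = {
  'GET /hello': (event) => ({
    statusCode: 200,
    body: `hello ${event.query.name || 'world'}`,
  }),
};

// Translate an HTTP-like request into a plain event and invoke the matching function.
function gateway(req) {
  // The base URL is a placeholder; it is only needed to parse a relative path.
  const url = new URL(req.url, 'http://gateway.local');
  const handler = functions[`${req.method} ${url.pathname}`];
  if (!handler) return { statusCode: 404, body: 'function not found' };

  const event = {
    method: req.method,
    path: url.pathname,
    query: Object.fromEntries(url.searchParams),
    body: req.body ?? null,
  };
  return handler(event, { requestId: String(Date.now()) });
}
```

The function only ever sees the event object, so the same handler could be driven by a timer or a database trigger that builds a different event.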
Sls and server-side technology
Monolithic application - layered architecture, UI layer, Business Logic Layer, Data Access Layer
Building on the three-tier architecture, works such as Patterns of Enterprise Application Architecture and Domain-Driven Design refine the layers further:
- Application layer
- Domain layer
- Persistence layer
- Infrastructure layer
Still thinking in terms of layers, dividing the code to achieve the goal of Separation of Concerns (SOC).
Can I abstract the logic of login, permission judgment, etc., so as to reuse them between different businesses?
Microservices architecture style, evolved from distributed architecture. Multiple independent small services for lightweight and controllable software development and management. Services communicate with each other through RPC, so each service can use different technology stacks to complete functions.
The architecture of a system is constrained by the communication structure of the organization (Conway's Law), so the architecture ends up following the organizational structure. For example, a small company uses a monolithic application. As it grows, every new hire has to learn the whole system, communication becomes a mesh in which everyone has to sync with everyone else, and the cost of communication grows rapidly (roughly with the square of the head count).
Splitting the monolith into microservices lets each team focus only on its own business, hides unnecessary information, and reduces communication costs.
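A rough back-of-the-envelope illustration of the communication cost, under the simplifying assumption (mine, not the book's) that every pair of people in a mesh needs one channel, and that after splitting only one link per pair of teams is needed:

```javascript
// Pairwise channels in a fully connected group of n people: n * (n - 1) / 2.
function channels(n) {
  return (n * (n - 1)) / 2;
}

// Same head count split into equal teams: full mesh inside each team,
// plus one channel between each pair of teams. Assumes n is divisible by teamSize.
function channelsWhenSplit(n, teamSize) {
  const teams = n / teamSize;
  return teams * channels(teamSize) + channels(teams);
}

console.log(channels(30));             // 435
console.log(channelsWhenSplit(30, 5)); // 75
```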
Containerization and cloud computing. NoOps is the next stage after DevOps: it reduces the effort R&D staff spend on operation and maintenance. It is a concept.
Sls and frontend technology
BFF, Backend for Frontend. It acts as a glue layer between the frontend and the backend, aggregating and trimming data: one interface does the work of several, unnecessary fields are removed, and so on.
This makes the interface more flexible and lowers communication cost. Fewer requests improve performance, and less exposed data improves security.
GraphQL is not easy to apply in a multi-language microservices architecture, and the cost of migration is too high. Currently there are not many cross-language implementations, so combining it with serverless is the practical route.
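The aggregate-and-trim idea can be sketched as follows. `userService`, `orderService`, and their fields are hypothetical stand-ins for real RPC or HTTP calls to backend services.

```javascript
// Hypothetical upstream services; in practice these would be RPC/HTTP calls.
const userService = {
  getUser: async (id) => ({ id, name: 'Ada', email: 'ada@example.com', passwordHash: 'x' }),
};
const orderService = {
  listOrders: async (userId) => [{ id: 1, userId, total: 42, internalNote: 'n/a' }],
};

// One BFF endpoint replaces two client round trips and drops fields the UI does not need.
async function getProfilePage(userId) {
  const [user, orders] = await Promise.all([
    userService.getUser(userId),
    orderService.listOrders(userId),
  ]);
  return {
    name: user.name,
    orders: orders.map(({ id, total }) => ({ id, total })),
  };
}
```

The client makes a single request, and sensitive or internal fields such as `passwordHash` never leave the backend.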
FaaS Technology
Event-driven is a design model.
FaaS functions are also event-driven, with different triggers acting as event sources, for example HTTP triggers.
A function is stateless and kept simple.
Creating a function. There are many technical details involved, and the creation flow has been revised several times.
- Runtime selection
  - Default
  - Custom runtime, which has the highest compatibility
  - Custom container, such as a GPU container
- Basic information: name, region, trigger (event or HTTP)
- Code upload
  - Demo
  - Zip package
  - OSS upload
  - Folder upload
- Specification
  - CPU 0.05 to 16 vCPU
  - Memory 128 MB to 32 GB
  - Temporary disk 512 MB or 10 GB
  - If you choose container deployment, you can also choose GPU T4/A10 specifications
- Instance concurrency: how many requests one instance processes at the same time
- Timeout 60s
- Time zone
- Authentication
  - Signature
  - JWT
In 2015, the open source community introduced the Serverless Framework, which aims to become the framework and ecosystem for sls and to solve the vendor lock-in problem.
Skipping the details of how to write it.
7 Function Lifecycle
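const fs = require('fs');

// Hypothetical values for illustration; the original note left them undefined.
const filePath = './index.html';
const contentType = 'text/html';

module.exports.handler = function (request, response, context) {
  // Wrap the body in an async IIFE so that any thrown error reaches the 500 branch.
  (async () => {
    response.setStatusCode(200);
    response.setHeader('content-type', contentType);
    response.send(fs.readFileSync(filePath));
  })().catch(err => {
    response.setStatusCode(500);
    response.setHeader('content-type', 'text/plain');
    response.send(err.message);
  });
};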
8 Understanding Function Runtime
23 design patterns
A message queue helps with high concurrency and decouples systems. It smooths out peaks and fills in valleys, and it ensures that messages are received in order.
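Peak shaving can be illustrated with a small simulation. This is a sketch with made-up numbers: a burst of messages arrives in one tick, the queue buffers them, and a consumer that can handle only `capacity` messages per tick drains them in FIFO order.

```javascript
// Simulate a FIFO queue in front of a consumer that handles
// at most `capacity` messages per tick.
function simulate(arrivalsPerTick, capacity) {
  const queue = [];
  const processedPerTick = [];
  const order = [];
  let nextId = 0;

  for (let tick = 0; tick < arrivalsPerTick.length || queue.length > 0; tick++) {
    const arrivals = arrivalsPerTick[tick] || 0;
    for (let i = 0; i < arrivals; i++) queue.push(nextId++); // producers enqueue
    const batch = queue.splice(0, capacity);                 // consumer takes from the head
    processedPerTick.push(batch.length);
    order.push(...batch);
  }
  return { processedPerTick, order };
}

// A burst of 10 messages is flattened into a steady 3 per tick.
console.log(simulate([10, 0, 0, 2], 3).processedPerTick); // [ 3, 3, 3, 3 ]
```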
This brought to mind message-triggered scenarios such as in-app messages, text messages, WeChat push, etc.
9 Building a Simple FaaS
To keep each function secure and prevent functions from interfering with each other, the key isolation technology is the sandbox.
- Use container isolation, such as Docker
- Use process isolation, such as a child process
The master process listens for function calls and starts child processes to execute the functions. The child sends the result back to the main process.
Here, child_process.fork() is used to start a child process for isolation.
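A minimal sketch of the master/child arrangement, under a few assumptions of mine: the worker script is written to a temp file so the example fits in one file, the message shape `{ code, event }` is hypothetical, and `new Function` is only a placeholder for the actual execution step.

```javascript
const { fork } = require('child_process');
const fs = require('fs');
const os = require('os');
const path = require('path');

// Worker script: receives code and an event, runs the code, sends the result back.
// `new Function` is a placeholder for the real (sandboxed) execution step.
const workerSource = `
process.on('message', ({ code, event }) => {
  try {
    const fn = new Function('event', code);
    process.send({ ok: true, result: fn(event) });
  } catch (err) {
    process.send({ ok: false, error: err.message });
  }
});
`;
const workerPath = path.join(os.tmpdir(), `faas-worker-${process.pid}.js`);
fs.writeFileSync(workerPath, workerSource);

// Master side: one child process per invocation, result returned over IPC.
function invoke(code, event) {
  return new Promise((resolve, reject) => {
    const child = fork(workerPath, [], { execArgv: [] });
    child.once('message', (msg) => {
      child.kill(); // the worker has done its job
      if (msg.ok) resolve(msg.result);
      else reject(new Error(msg.error));
    });
    child.once('error', reject);
    child.send({ code, event });
  });
}
```

For example, `invoke('return event.a + event.b', { a: 1, b: 2 })` resolves with the sum computed in the child process. A crash in user code takes down only the child, not the master.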
For executing the code, compared with eval/Function, the better option is Node.js's new vm.Script(code).
But vm also has risks: for example, user code can walk the prototype chain via this.constructor.constructor to reach outside the sandbox, so it is better for the sandbox object to have no prototype chain.
const vm = require('vm')
const sandbox = Object.create(null) // no prototype chain
vm.createContext(sandbox)
vm.runInContext(code, sandbox)
But it is not perfect. The community uses vm2, which wraps objects with the Proxy feature to make this safer.
Therefore, the runtime uses vm2 and child_process.fork in combination to achieve isolation and execution of user code.
To support the HTTP protocol, you can use Koa. Download and execute functions dynamically...
That completes the simplified version, but throughput, security and stability, and development efficiency still need to be considered.
Every execution creates a child process. Creating and destroying these processes is a performance overhead, and too many processes under high concurrency may cause crashes, so consider a process pool.
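A pool can be sketched generically. This is a hypothetical minimal `Pool` class, not a specific library: `create` would call child_process.fork() in the FaaS runtime, while the sketch works with any resource. Callers beyond the limit wait in a queue until a resource is released.

```javascript
// Minimal resource pool: at most `max` resources exist at once.
class Pool {
  constructor(create, max) {
    this.create = create; // factory, e.g. () => fork(workerPath) in a real runtime
    this.max = max;
    this.total = 0;       // resources created so far
    this.idle = [];       // resources ready for reuse
    this.waiters = [];    // callbacks waiting for a free resource
  }

  // Calls `cb(resource)` immediately if possible, otherwise once one is released.
  acquire(cb) {
    if (this.idle.length > 0) return cb(this.idle.pop());
    if (this.total < this.max) {
      this.total++;
      return cb(this.create());
    }
    this.waiters.push(cb);
  }

  // Hands the resource to the oldest waiter, or parks it as idle.
  release(resource) {
    const next = this.waiters.shift();
    if (next) next(resource);
    else this.idle.push(resource);
  }
}
```

With `max` set to the number of CPU cores, a burst of requests reuses a fixed set of processes instead of spawning one per request.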
The cluster module can be used as a higher-level abstraction over child_process, making it easier to use.
Combining cluster and vm2 with Koa gives a workable solution.
Asked GPT, which suggested generic-pool: this module provides a generic resource pool that can manage various resources, including processes. You can use it to create a process pool and avoid frequently starting and stopping child processes. It provides flexible configuration options for customizing the pool size, the resource allocation strategy, etc.
If the user code contains an infinite loop, the task never ends, so set a timeout. Both vm and vm2 support a timeout option, for example 5000 ms.
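A sketch of the timeout using Node's built-in vm module (vm2 is left out to keep the example dependency-free). `runUserCode` is a hypothetical helper name. Note that this timeout only bounds the synchronous execution of the script.

```javascript
const vm = require('vm');

// Run user code in a fresh context whose sandbox object has no prototype chain,
// and abort synchronous execution after `timeoutMs` milliseconds.
function runUserCode(code, timeoutMs = 5000) {
  const sandbox = Object.create(null);
  const context = vm.createContext(sandbox);
  const script = new vm.Script(code);
  return script.runInContext(context, { timeout: timeoutMs });
}

console.log(runUserCode('1 + 2')); // 3

try {
  runUserCode('while (true) {}', 50);
} catch (err) {
  console.log('aborted:', err.message);
}
```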
As for stability, the issue is resource limits: user code may use CPU, memory, and disk heavily. To limit resources, Linux provides CGroup, and Docker relies on the same capability.
The book introduces a number of CGroup concepts; skipping the details, a single command is enough to limit a process to at most 20% of CPU resources. It is a bit complicated, and GPT explained it well.
To improve development efficiency, build in commonly used frontend services.
- Simple KV storage, exposing get/set to functions through vm2
Of course, Redis is still the best choice.
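A sketch of the built-in KV idea, using Node's built-in vm module instead of vm2 to stay dependency-free. `createKv` and `runWithKv` are hypothetical names, and the store is an in-memory Map that is lost when the process exits. Passing host functions into the context weakens isolation, which is exactly the kind of gap vm2's proxies are meant to cover.

```javascript
const vm = require('vm');

// One in-memory store per function name.
const stores = new Map();

// Build the get/set API that is injected into the sandbox.
function createKv(functionName) {
  if (!stores.has(functionName)) stores.set(functionName, new Map());
  const store = stores.get(functionName);
  return Object.freeze({
    get: (key) => store.get(String(key)),
    set: (key, value) => { store.set(String(key), value); },
  });
}

// Run user code with `kv` available as a global inside the sandbox.
function runWithKv(functionName, code) {
  const sandbox = Object.create(null);
  sandbox.kv = createKv(functionName);
  return vm.runInContext(code, vm.createContext(sandbox), { timeout: 1000 });
}

runWithKv('counter', "kv.set('n', 1)");
console.log(runWithKv('counter', "kv.get('n')")); // 1
```

Values persist across invocations of the same function name but are not visible to other functions.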
BaaS Technology
Backend as a Service.
Subsequently, Firebase was mentioned as an example, which provides many out-of-the-box basic services, such as cloud databases, cloud functions, auth, hosting, storage, etc.
There are also crash reports, performance monitoring, test lab, application distribution, and other functions.
There are also extended features for operations:
- In-app messaging
- Analytics
- A/B testing
- Cloud messaging, push IM, etc.
- Remote config
Domestic (Chinese) platforms such as LeanCloud and Bmob still lag behind in overall completeness.
Database design principles.
For example, querying two tables, posts and comments, runs into the N+1 select problem: one query first fetches n records, and then n more queries fetch the data related to each of them. In SQL, a join solves this.
For example: the content of the five most recent comments and the title of the article each belongs to.
With SQL this is relatively simple. In a document database it is easy if the data was designed as embedded documents from the start. If it is already a reference design, some redundant data may be needed: set up a new collection of the latest comments with redundant fields. If reads and writes become a problem, it can be turned into a cache that is refreshed once an hour. The drawback is that a delete has to be applied to every copy.
With embedding, only one query is needed and query performance is good, but it does not suit complex nested data: the data has to be aggregated first and then sorted, and since this is done in memory, it may not work well for large amounts of data.
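The N+1 problem in the "five most recent comments" example can be made concrete with an in-memory stand-in for the database. `db` and its methods are hypothetical, and every method call counts as one query.

```javascript
const posts = [
  { id: 1, title: 'Intro to FaaS' },
  { id: 2, title: 'BaaS basics' },
];
const comments = [1, 2, 3, 4, 5, 6].map((n) => ({
  id: n, postId: (n % 2) + 1, text: `comment ${n}`, createdAt: n,
}));

// Hypothetical data access layer; each method call stands for one round trip.
const db = {
  queries: 0,
  latestComments(n) {
    this.queries++;
    return [...comments].sort((a, b) => b.createdAt - a.createdAt).slice(0, n);
  },
  postById(id) { this.queries++; return posts.find((p) => p.id === id); },
  postsByIds(ids) { this.queries++; return posts.filter((p) => ids.includes(p.id)); },
};

// N+1: one query for the comments, then one more query per comment.
function latestNaive(n) {
  return db.latestComments(n).map((c) => ({ text: c.text, title: db.postById(c.postId).title }));
}

// Batched: one query for the comments, one for all referenced posts.
function latestBatched(n) {
  const latest = db.latestComments(n);
  const ids = [...new Set(latest.map((c) => c.postId))];
  const byId = new Map(db.postsByIds(ids).map((p) => [p.id, p]));
  return latest.map((c) => ({ text: c.text, title: byId.get(c.postId).title }));
}
```

For five comments, `latestNaive(5)` issues six queries and `latestBatched(5)` issues two. A "latest comments" collection that stores the title redundantly would need only one, at the cost of keeping the copies in sync.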
CDN, object storage, and other features were introduced briefly.
For user authentication, the book first introduces OpenID, which did not become popular, and then OAuth 2.0, which was eventually adopted by the IETF and standardized (RFC 6749).
OAuth2 solves authorization, not user authentication, which is why OIDC (OpenID Connect) was introduced.
Tokens are further divided into the ID token and the access token.
JWT is defined in RFC 7519.
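To make the token format concrete, here is a minimal sketch of signing and verifying an HS256 JWT with Node's built-in crypto module. `signJwt` and `verifyJwt` are hypothetical helper names; the sketch checks only the signature, the `alg` header, and `exp`, and it is not a substitute for a maintained JWT library.

```javascript
const crypto = require('crypto');

const b64url = (text) => Buffer.from(text).toString('base64url');
const hmac = (data, secret) => crypto.createHmac('sha256', secret).update(data).digest();

// A JWT is base64url(header) + '.' + base64url(payload) + '.' + base64url(signature).
function signJwt(payload, secret) {
  const header = { alg: 'HS256', typ: 'JWT' };
  const signingInput = `${b64url(JSON.stringify(header))}.${b64url(JSON.stringify(payload))}`;
  return `${signingInput}.${hmac(signingInput, secret).toString('base64url')}`;
}

// Returns the claims if the token is valid, otherwise null.
function verifyJwt(token, secret) {
  const [header, payload, signature] = token.split('.');
  if (!header || !payload || !signature) return null;

  const expected = hmac(`${header}.${payload}`, secret);
  const actual = Buffer.from(signature, 'base64url');
  if (actual.length !== expected.length || !crypto.timingSafeEqual(actual, expected)) return null;

  try {
    const { alg } = JSON.parse(Buffer.from(header, 'base64url').toString('utf8'));
    if (alg !== 'HS256') return null;
    const claims = JSON.parse(Buffer.from(payload, 'base64url').toString('utf8'));
    const now = Math.floor(Date.now() / 1000);
    if (typeof claims.exp === 'number' && claims.exp <= now) return null;
    return claims;
  } catch {
    return null; // malformed JSON in header or payload
  }
}
```

An ID token in OIDC is a JWT of this shape, typically signed with an asymmetric algorithm rather than the shared secret used here.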
IDaaS providers such as Auth0 and Authing emerged to handle integration with third parties.
- Identity authentication and account authorization
- SSO (single sign-on): log in to one application and the login is shared by multiple applications
- Provide OAuth2 to the outside world
- Unified sending of emails and SMS, password reset
- Enterprise identity login
- Two-step verification
Worth a look: there are concepts such as user/role/rules/hooks, etc.
That's it.
Summary and Outlook
The second part is more exciting, the others are average.