#serverless
Yang Kai, China Industry and Information Publishing Group
Outline#
- Overview: the disruptive impact of serverless (sls), and how it relates to current technology
- FaaS is the practical application of the core idea of serverless
- BaaS is the other major pillar of sls
Overview#
#cloudnative Serverless is the most frequently mentioned concept in cloud native
Core ideas and development philosophy of sls: low operation and maintenance costs, significantly reduced server costs
For example, GitHub Pages: developers upload files, and the platform deploys them as a static site, assigns a domain name, and serves traffic. It hides many details.
For example, a CDN: there is no need to care about details such as nodes, and users around the world get fast access.
Both can be seen as computing resources offered as a product or service, which fits the serverless idea: hide the technical details and expose only the service.
CNCF released the serverless white paper in 2018
sls means there is no need to maintain or manage servers when building and running applications
A sls computing platform should provide one or both of two capabilities: FaaS and BaaS
The white paper then covers use cases, advantages, and disadvantages
Advantages of faas: high development efficiency, low deployment costs, low operation and maintenance costs, low learning costs, low server costs, flexible deployment solutions, high system security
Disadvantages: high platform learning costs, high debugging costs (the runtime lives in the cloud, so debugging and querying logs are hard), cold start, vendor lock-in
Use cases:
- The function itself is stateless and relies on persistent services for state
- Functions work independently; chaining a large number of functions together raises the compute cost
- Automatic scaling
- Event-driven, such as http, database modifications, scheduled events, etc.
- Scenarios with low requirements for cold start
faas itself does not support the http protocol, so it needs to provide web APIs to the outside world through an API gateway
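As a rough sketch of that gateway role, the snippet below maps an HTTP method and path to a function and wraps the request as an event. The route table, the `hello` function, and the event shape are all made up for illustration; they are not any vendor's API.

```javascript
// Hypothetical function registry: name -> stateless handler(event).
const functions = {
  hello: (event) => ({ statusCode: 200, body: `hello ${event.query.name || 'world'}` }),
};

// Hypothetical route table the gateway consults.
const routes = [{ method: 'GET', path: '/hello', fn: 'hello' }];

// Translate an HTTP-style request into an event and invoke the matching function.
function gateway(request) {
  const url = new URL(request.url, 'http://gateway.local');
  const route = routes.find((r) => r.method === request.method && r.path === url.pathname);
  if (!route) return { statusCode: 404, body: 'not found' };
  const event = {
    method: request.method,
    path: url.pathname,
    query: Object.fromEntries(url.searchParams),
  };
  return functions[route.fn](event);
}

console.log(gateway({ method: 'GET', url: '/hello?name=sls' }));
```

The function never sees a socket or an HTTP parser; it only receives a plain event object, which is why the same function could also be bound to non-HTTP triggers.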
sls and server-side technology#
Monolithic application - layered architecture, UI layer, Business Logic Layer, Data Access Layer
Building on the three-tier architecture, works such as Patterns of Enterprise Application Architecture and Domain-Driven Design refine the layers further
- Application layer
- Domain layer
- Persistence layer
- Infrastructure layer
This is still layered thinking: divide the code to achieve Separation of Concerns (SoC).
Can login and permission-check logic be extracted and reused across different businesses?
The microservices architectural style evolved from distributed architecture: multiple small, independent services keep software development and management lightweight and controllable. Services communicate through RPC, so each service can be built with a different technology stack.
Conway's Law: the architecture of a system is constrained by the communication structure of the organization that builds it. For example, a small company uses a monolithic application. As the team grows, every new person has to learn the whole system, communication forms a mesh in which everyone must sync with everyone else, and the cost grows roughly with the square of the team size.
After the monolith is split into microservices, each team focuses only on its own business, unnecessary information is hidden, and communication costs drop.
Containerization and cloud computing came next; moving from DevOps to NoOps is the next stage. NoOps is a concept: reduce the effort R&D staff spend on operations and maintenance.
sls and front-end technology#
BFF (Backend for Frontend) acts as a glue layer between the front end and the back end, aggregating and trimming data: one interface does the work of several, unnecessary fields are removed, and so on.
This makes interfaces more flexible and lowers communication cost. Fewer requests improve performance, and less exposed data improves security.
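A minimal sketch of the aggregate-and-trim idea, assuming two hypothetical upstream services (`userService`, `orderService`) that are stubbed in memory here; a real BFF would call them over HTTP or RPC.

```javascript
// Hypothetical upstream services, stubbed in memory for illustration.
const userService = {
  getUser: (id) => ({ id, name: 'Ada', passwordHash: 'x', email: 'ada@example.com' }),
};
const orderService = {
  listOrders: (userId) => [{ id: 1, userId, total: 30, internalNote: 'n/a' }],
};

// Keep only the listed fields of an object.
function pick(obj, fields) {
  return Object.fromEntries(fields.filter((f) => f in obj).map((f) => [f, obj[f]]));
}

// One BFF endpoint replaces two client calls and trims the payload.
function getProfilePage(userId) {
  const user = userService.getUser(userId);
  const orders = orderService.listOrders(userId);
  return {
    user: pick(user, ['id', 'name']),
    orders: orders.map((o) => pick(o, ['id', 'total'])),
  };
}

console.log(JSON.stringify(getProfilePage(7)));
```

The client makes one request instead of two, and fields such as `passwordHash` and `internalNote` never leave the back end.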
GraphQL is not easy to adopt in a multi-language microservices architecture: the cost of migration is too high, and there are currently not many cross-language implementations. Its strength is unified aggregate queries across data sources. According to the book, it only becomes practical in combination with serverless.
FaaS Technology#
Event-driven is a design pattern.
FaaS functions are also event-driven, with different triggers acting as event sources, for example the HTTP trigger.
The function is stateless and simple enough.
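To illustrate triggers as event sources, here is a small sketch using Node's built-in `EventEmitter` as a stand-in for the platform's event bus. The trigger names and payload shapes are assumptions for the example only.

```javascript
const { EventEmitter } = require('events');

// The platform side: triggers publish events onto a bus.
const bus = new EventEmitter();

// A stateless function: its output depends only on the event it receives.
function handler(event) {
  return `${event.source}:${JSON.stringify(event.payload)}`;
}

const results = [];
// Bind two hypothetical trigger types to the same function.
for (const source of ['http', 'timer']) {
  bus.on(source, (payload) => results.push(handler({ source, payload })));
}

// Simulate the event sources firing.
bus.emit('http', { path: '/ping' });
bus.emit('timer', { cron: '0 * * * *' });
console.log(results);
```

Because the handler keeps no state between calls, the platform is free to run each invocation on any instance.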
Create a function. There are many technical details inside, and it has been revised several times now.
- Runtime selection
  - Default
  - Custom runtime, which has the highest compatibility
  - Custom container, such as GPU container
- Basic information: name, region, trigger type (event or HTTP)
- Code upload
  - Demo
  - Zip package
  - OSS upload
  - Folder upload
- Specification design
  - vCPU: 0.05 to 16 cores
  - Memory: 128 MB to 32 GB
  - Temporary disk: 512 MB or 10 GB
  - If you choose container deployment, you can also choose GPU T4/A10 specifications
- Instance concurrency: how many requests one instance processes at the same time
- Timeout 60s
- Time zone
- Authentication
  - Signature
  - JWT
In 2015, the open source community introduced the Serverless Framework, which aims to become the framework and ecosystem for sls and to solve the problem of vendor lock-in.
Skipping the details of how to write it.
7 Function Lifecycle#
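const fs = require('fs');

// Placeholder values for illustration; the original left these undefined.
const filePath = './index.html';
const contentType = 'text/html';

// HTTP-trigger handler: serve a static file, or reply 500 on failure.
module.exports.handler = function (request, response, context) {
  (async () => {
    const body = await fs.promises.readFile(filePath);
    response.setStatusCode(200);
    response.setHeader('content-type', contentType);
    response.send(body);
  })().catch((err) => {
    response.setStatusCode(500);
    response.setHeader('content-type', 'text/plain');
    response.send(err.message);
  });
};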
8 Understanding Function Runtime#
23 design patterns
A message queue handles high concurrency and decouples systems. It smooths out peaks and valleys and ensures that messages are received in order.
This reminded me of message-triggered scenarios such as in-site messages, SMS, and WeChat push.
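The peak-smoothing and ordering points about message queues can be sketched with an in-memory FIFO queue. This is only a toy stand-in for a real broker; the class name and batch size are assumptions.

```javascript
// In-memory FIFO queue standing in for a real message broker.
class MessageQueue {
  constructor() { this.messages = []; }
  publish(msg) { this.messages.push(msg); }
  // Hand out at most `max` messages, oldest first.
  pull(max) { return this.messages.splice(0, max); }
  get size() { return this.messages.length; }
}

const queue = new MessageQueue();
// A burst of 10 requests arrives at once (the peak).
for (let i = 1; i <= 10; i++) queue.publish({ id: i });

// The consumer drains at its own pace: 4 messages per tick.
const processed = [];
let ticks = 0;
while (queue.size > 0) {
  for (const msg of queue.pull(4)) processed.push(msg.id);
  ticks++;
}
console.log(ticks, processed);
```

The producer is never blocked by the consumer's speed, and the consumer sees messages in the order they were published.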
9 Building a Simple FaaS#
To keep each function secure and prevent functions from interfering with each other, the key isolation technology is the sandbox
- Use docker for isolation
- Use process isolation, such as child processes
Use the master to listen to function calls and start child processes to execute functions. Communicate the returned results to the main process.
Here, child_process.fork() is used to start child processes to achieve isolation.
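A minimal sketch of the master/child arrangement with `child_process.fork()`. The worker source, the temp-file location, and the `runInChild` name are assumptions for illustration. The child here runs code with `new Function`, so this shows process isolation only, not a language-level sandbox.

```javascript
const { fork } = require('child_process');
const fs = require('fs');
const os = require('os');
const path = require('path');

// Worker script: runs in the child and receives user code over IPC.
const WORKER_SOURCE = `
process.on('message', (code) => {
  try {
    const result = new Function(code)();
    process.send({ ok: true, result });
  } catch (err) {
    process.send({ ok: false, error: err.message });
  }
});
`;

// Master side: start a child, send it the code, resolve with its reply.
function runInChild(code, timeoutMs = 5000) {
  const workerPath = path.join(os.tmpdir(), `faas-worker-${process.pid}.js`);
  fs.writeFileSync(workerPath, WORKER_SOURCE);
  return new Promise((resolve, reject) => {
    const child = fork(workerPath);
    // Kill the child if it does not answer in time (e.g. infinite loop).
    const timer = setTimeout(() => {
      child.kill();
      reject(new Error('function timed out'));
    }, timeoutMs);
    child.once('message', (msg) => {
      clearTimeout(timer);
      child.kill();
      if (msg.ok) resolve(msg.result);
      else reject(new Error(msg.error));
    });
    child.once('error', (err) => {
      clearTimeout(timer);
      reject(err);
    });
    child.send(code);
  });
}
```

A call such as `runInChild('return 1 + 2').then(console.log)` would run the snippet in a separate process. A crash or endless loop in user code takes down only the child, and the master can still kill it from outside.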
User code could be run with eval or new Function, but a better option is Node's new vm.Script(code).
But vm also has risks: for example, this.constructor.constructor can reach the host's Function constructor and escape the sandbox.
The escape goes through the prototype chain, so use a sandbox object that has no prototype chain:
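const vm = require('vm');
const sandbox = Object.create(null); // no prototype chain to climb
vm.createContext(sandbox);
vm.runInContext(code, sandbox); // `code` is the user-supplied source string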
But it is not perfect. The community uses vm2, which wraps objects with the Proxy feature to make the sandbox more secure.
Therefore, the runtime uses vm2 and child_process.fork to achieve isolation and execution of user code.
To support the http protocol, you can use koa to implement it. Download and execute functions dynamically...
The simplified version is complete, but you still need to consider throughput performance, security and stability, and development efficiency.
Each execution will create a child process to execute, and the creation and destruction management of this process is a performance overhead. Too many processes during high concurrency may also cause crashes, so consider the process pool.
You can use cluster, which further abstracts and encapsulates child_process and is easier to use.
Using cluster and vm2 in conjunction with koa can implement a solution.
Asked GPT, which suggested generic-pool: this module provides a general resource pool that can be used to manage various resources, including processes. You can use it to create a process pool to avoid frequent startup and shutdown of child processes. It provides flexible configuration options that allow you to customize the size of the pool, resource allocation strategies, etc.
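To make the pooling idea concrete without pulling in a dependency, here is a hand-rolled sketch of a bounded resource pool. It is not generic-pool's implementation; the `Pool` class and its option names are assumptions for illustration.

```javascript
// Minimal bounded resource pool; `create` and `destroy` come from the caller.
class Pool {
  constructor({ create, destroy, max }) {
    this.create = create;
    this.destroy = destroy;
    this.max = max;
    this.idle = [];    // resources ready for reuse
    this.total = 0;    // resources created and not yet destroyed
    this.waiting = []; // callers waiting for a free resource
  }

  // Resolve with an idle resource, a new one, or wait until one is released.
  acquire() {
    if (this.idle.length > 0) return Promise.resolve(this.idle.pop());
    if (this.total < this.max) {
      this.total++;
      return Promise.resolve(this.create());
    }
    return new Promise((resolve) => this.waiting.push(resolve));
  }

  // Hand the resource to the next waiter, or park it as idle.
  release(resource) {
    const next = this.waiting.shift();
    if (next) next(resource);
    else this.idle.push(resource);
  }

  // Destroy all idle resources, e.g. on shutdown.
  drain() {
    for (const resource of this.idle.splice(0)) {
      this.destroy(resource);
      this.total--;
    }
  }
}
```

For a process pool, `create` could be something like `() => fork('./worker.js')` and `destroy` could be `(child) => child.kill()`, where `worker.js` is a hypothetical worker script. Capping `max` is what keeps a traffic spike from spawning an unbounded number of processes.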
If the user code contains an infinite loop, the task never terminates, so set a timeout. Both vm and vm2 accept a timeout option, for example 5000 ms.
The built-in timeout does not cover asynchronous code, though; for that you can implement the timer yourself.
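A small sketch of the synchronous case using the built-in `vm` module's `timeout` option. The `runWithTimeout` helper and its result shape are assumptions for illustration.

```javascript
const vm = require('vm');

// Run untrusted code with a time limit on its synchronous execution.
function runWithTimeout(code, timeoutMs) {
  const context = vm.createContext(Object.create(null));
  try {
    return { ok: true, value: vm.runInContext(code, context, { timeout: timeoutMs }) };
  } catch (err) {
    // vm throws once the limit is exceeded; report the error code or message.
    return { ok: false, error: err.code || err.message };
  }
}

console.log(runWithTimeout('1 + 2', 100));
console.log(runWithTimeout('while (true) {}', 100));
```

The limit applies only while `runInContext` is executing. Work that user code schedules for later (promises, callbacks) runs after the call has returned, which is why an external timer that kills the child process is still needed for the asynchronous case.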
To ensure stability, limit resources: user code may make heavy use of CPU, memory, and disk. Linux provides cgroups to limit resources, and Docker uses them too.
The book introduced a number of cgroup concepts; I skipped the details. You can limit a process to at most 20% of CPU with just a command. It's a bit complicated, and GPT explained it well.
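As a hedged sketch of the 20% CPU cap, the snippet below writes the cgroup v2 control files from Node. It assumes a cgroup v2 mount at `/sys/fs/cgroup`, root privileges, and the cpu controller enabled for the parent group; the `faas-demo` group name is made up. Only the helper that formats the `cpu.max` value is actually run here.

```javascript
const fs = require('fs');
const path = require('path');

// cgroup v2 expresses a CPU cap as "<quota> <period>" in microseconds.
function cpuMaxLine(fraction, periodUs = 100000) {
  return `${Math.round(fraction * periodUs)} ${periodUs}`;
}

// Place `pid` into a cgroup capped at `fraction` of one CPU.
// Needs root and cgroup v2; `cgroupDir` is a hypothetical path.
function limitCpu(pid, fraction, cgroupDir = '/sys/fs/cgroup/faas-demo') {
  fs.mkdirSync(cgroupDir, { recursive: true });
  fs.writeFileSync(path.join(cgroupDir, 'cpu.max'), cpuMaxLine(fraction));
  fs.writeFileSync(path.join(cgroupDir, 'cgroup.procs'), String(pid));
}

console.log(cpuMaxLine(0.2)); // 20% of one CPU
```

In a FaaS runtime, `limitCpu(child.pid, 0.2)` would be called right after forking the worker, so the limit is enforced by the kernel rather than by the user code's good behavior.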
To improve development efficiency, build commonly used services into the platform.
- Simple kv storage, exposing get/set to user code through vm2
Of course, Redis is better
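A minimal sketch of a built-in kv service injected into the sandbox. To stay dependency-free it uses Node's built-in `vm` rather than vm2, and a `Map` stands in for Redis; the `kv` object and `runUserCode` helper are assumptions for illustration. Passing host functions into a `vm` context is not a hardened boundary.

```javascript
const vm = require('vm');

// Host-side store; a Map stands in for Redis or another external store.
const store = new Map();

// Expose only get/set to user code through a frozen, prototype-less object.
const kv = Object.freeze(Object.assign(Object.create(null), {
  get: (key) => store.get(String(key)),
  set: (key, value) => { store.set(String(key), value); },
}));

// Run user code in a fresh context that can see `kv` and nothing else from the host.
function runUserCode(code) {
  const sandbox = Object.create(null);
  sandbox.kv = kv;
  return vm.runInContext(code, vm.createContext(sandbox), { timeout: 1000 });
}

// Two separate invocations share state only through the kv service.
runUserCode("kv.set('visits', (kv.get('visits') || 0) + 1)");
runUserCode("kv.set('visits', (kv.get('visits') || 0) + 1)");
console.log(store.get('visits'));
```

The functions themselves stay stateless; anything that must survive between invocations goes through the platform-provided service.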
BaaS Technology#
backend as a service.
Subsequently, Firebase was mentioned, which provides many out-of-the-box basic services, such as cloud databases, cloud functions, auth, hosting, storage, etc.
There are also crash reports, performance monitoring, test lab, application distribution, and other functions.
There are also extended operational functions
- In-app messaging
- Analysis
- A/B testing
- Cloud messaging, push IM, etc.
- Remote config
Domestic platforms such as LeanCloud and Bmob still lag behind in overall completeness.
Database design principles section.
For example, querying two tables, posts and comments, runs into the N+1 query problem: one query returns n records, and then n more queries fetch the related data for each record. In SQL you can avoid this with a join.
For example, fetch the content of the five most recent comments together with the title of the article each one belongs to.
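The difference in query count can be sketched with in-memory arrays and a counter standing in for a real database. The `db` helper functions are hypothetical, not a real driver API.

```javascript
// In-memory tables and a query counter standing in for a real database.
const posts = [{ id: 1, title: 'Intro to FaaS' }, { id: 2, title: 'BaaS basics' }];
const comments = [
  { id: 1, postId: 1, content: 'nice', createdAt: 1 },
  { id: 2, postId: 2, content: 'thanks', createdAt: 2 },
  { id: 3, postId: 1, content: 'more please', createdAt: 3 },
];
let queryCount = 0;
const db = {
  latestComments: (n) => {
    queryCount++;
    return [...comments].sort((a, b) => b.createdAt - a.createdAt).slice(0, n);
  },
  postById: (id) => { queryCount++; return posts.find((p) => p.id === id); },
  postsByIds: (ids) => { queryCount++; return posts.filter((p) => ids.includes(p.id)); },
};

// 1 + n: one query for the comments, then one more per comment for its post.
function naive(n) {
  return db.latestComments(n).map((c) => ({
    content: c.content,
    title: db.postById(c.postId).title,
  }));
}

// Two queries in total: the comments, then all needed posts in one batch.
function batched(n) {
  const latest = db.latestComments(n);
  const ids = [...new Set(latest.map((c) => c.postId))];
  const byId = new Map(db.postsByIds(ids).map((p) => [p.id, p]));
  return latest.map((c) => ({ content: c.content, title: byId.get(c.postId).title }));
}
```

Both functions return the same rows; only the number of round trips differs, which is the cost a SQL join (or a batched lookup like this) avoids.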
With SQL this is relatively simple. In a document database, if the data was designed up front with embedding, it is easier. If it already uses references, you may need to keep more redundant data: set up a new collection of latest comments with redundant fields. If read/write load is an issue, turn it into a cache that is refreshed once an hour. The downside is that a delete must be applied to every copy.
With embedding you only need one query, and query performance is good. It is not well suited to complex nested data: the data has to be aggregated first and then sorted, all in memory, which may not work well for large data sets.
Quickly introduced CDN, object storage, and other functions
For user authentication, the book introduced OpenID, which never became popular, and then OAuth 2.0,
which was eventually adopted and standardized by the IETF.
OAuth solves authorization, not user authentication, which led to OpenID Connect (OIDC).
It further separates the ID token from the access token.
JWT is defined in RFC 7519.
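For a feel of what a JWT looks like, here is a minimal HS256 sign/verify sketch using only Node's `crypto` module. It assumes a shared secret and handles only HS256 and the `exp` claim; a production system would use a maintained library and validate the header and other claims as well.

```javascript
const crypto = require('crypto');

const b64url = (input) => Buffer.from(input).toString('base64url');

// Sign a payload as an HS256 JWT: header.payload.signature, each base64url-encoded.
function sign(payload, secret) {
  const header = b64url(JSON.stringify({ alg: 'HS256', typ: 'JWT' }));
  const body = b64url(JSON.stringify(payload));
  const signature = crypto
    .createHmac('sha256', secret)
    .update(`${header}.${body}`)
    .digest('base64url');
  return `${header}.${body}.${signature}`;
}

// Return the payload if the signature matches and the token is not expired, else null.
function verify(token, secret, nowSeconds = Math.floor(Date.now() / 1000)) {
  const [header, body, signature] = token.split('.');
  if (!header || !body || !signature) return null;
  const expected = crypto.createHmac('sha256', secret).update(`${header}.${body}`).digest();
  const actual = Buffer.from(signature, 'base64url');
  // Compare in constant time; lengths must match before timingSafeEqual.
  if (actual.length !== expected.length || !crypto.timingSafeEqual(actual, expected)) return null;
  const payload = JSON.parse(Buffer.from(body, 'base64url').toString());
  if (typeof payload.exp === 'number' && payload.exp <= nowSeconds) return null;
  return payload;
}
```

The payload is only encoded, not encrypted, so anyone holding the token can read it; the signature only proves that it was issued by someone who knows the secret.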
Integrating with third parties is harder, so IDaaS providers such as Auth0 and Authing emerged
- Identity authentication and account authorization
- SSO (single sign-on): log in to one application and the session is shared by multiple applications
- Provide oauth2 to the outside world
- Unified sending of emails, SMS password reset
- Enterprise identity login
- Two-step verification
Take a look, there are some user/rule/rules/hooks, etc.
That's it.
Summary and Outlook#
The second part is more exciting, the others are average.