pseudoyu


Blockchain Fundamentals and Key Technologies

Introduction#

Recently, I have been taking the "COMP7408 Distributed Ledger and Blockchain Technology" course at HKU. It has given me a more systematic understanding of the basic concepts of blockchain. Combined with Professor Xiaozhen's earlier online course at Peking University, "Blockchain Technology and Applications," it has made me realize how vast the blockchain knowledge system is. I therefore plan to write a series of articles that systematically organize what I have learned about blockchain, Bitcoin, Ethereum, and related topics. If there are any mistakes or omissions, please feel free to provide feedback.

Cryptographic Principles in Blockchain#

Blockchain is closely related to cryptography. For example, Bitcoin relies on public-private key cryptography, digital signatures, and hashes, and many consensus algorithms are built on cryptographic concepts. Before learning blockchain, it is therefore worth understanding a few core cryptographic concepts in order to better understand how they are applied in blockchain systems.

Hash Functions#

A hash function is a method that transforms arbitrary-length source data into a fixed-length output value through a series of algorithms. The concept is simple, but it has several characteristics that make it widely used in various fields.

You can experience the working principle of a hash function (using SHA256 as an example) in this Demo!

The first characteristic is one-way irreversibility. It is easy to perform a hash operation on an input x to obtain the value H(x). However, it is almost impossible to reverse-engineer the value of x given a value H(x). This characteristic effectively protects the source data.

The second characteristic is collision resistance. Given two values x and y with x not equal to y, it is almost impossible for H(x) to equal H(y). Collisions are not completely impossible, since the input space is larger than the output space, but the probability of finding one is negligible. The hash value of a piece of data is therefore almost unique, which is useful for scenarios such as identity verification.

The third characteristic is the unpredictability of hash calculation. There is no practical shortcut for finding an input whose hash satisfies a given condition other than trying candidates one by one, but once a candidate is found, it is easy to verify that it is correct. This property is mainly used in the Proof-of-Work (PoW) mining mechanism.
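The properties above can be observed with a few lines of Python. This is only a small sketch using the standard `hashlib` module; the helper names `sha256_hex` and `differing_bits` and the sample inputs are my own and are not taken from the Demo.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def differing_bits(a_hex: str, b_hex: str) -> int:
    """Count the bit positions at which two equal-length hex digests differ."""
    return bin(int(a_hex, 16) ^ int(b_hex, 16)).count("1")


# Fixed-length output: one byte and ten thousand bytes both give 64 hex characters.
short = sha256_hex(b"a")
long_ = sha256_hex(b"a" * 10_000)
print(len(short), len(long_))

# A one-character change in the input gives an unrelated-looking digest.
h1 = sha256_hex(b"blockchain")
h2 = sha256_hex(b"blockchaim")
print(h1)
print(h2)
print(differing_bits(h1, h2), "of 256 bits differ")
```

Going the other way, from a digest back to its input, has no corresponding function: the only general approach is to guess inputs and hash them.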

Encryption/Decryption#

Encryption mechanisms are mainly divided into symmetric encryption and asymmetric encryption.

Symmetric encryption uses the same key for encryption and decryption on both sides. It is convenient and efficient, but distributing the key is risky: if the key is sent over a network or by other means, it can easily leak and compromise the information.

Asymmetric encryption mainly refers to public-private key encryption. Each person uses an algorithm to generate a pair of keys, called a public key and a private key. If A wants to send a message to B, A encrypts the file with B's public key and sends the encrypted message to B. Even if the message is intercepted or leaked along the way, the source file is not exposed, so the message can be transmitted over any channel. When B receives the encrypted file, B decrypts it with B's private key to obtain the file content. B's private key is never distributed through any channel and is known only to B, which makes the scheme highly secure.

In practical applications, encrypting large files with asymmetric encryption is inefficient, so a combined mechanism is generally used. Suppose A wants to send a large file D to B. A first encrypts the file D with a symmetric key K, and then encrypts the key K with B's public key. A sends the encrypted key K and the encrypted file D to B. Even if they are intercepted or leaked during transmission, the interceptor does not have B's private key, so it cannot recover the key K or access the file D. After B receives the encrypted file and key, B first decrypts the key K with B's own private key, and then decrypts the file D with the key K to obtain the file content.
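The combined mechanism can be sketched in Python. Everything here is a toy for illustration and must not be used for real data: the RSA key is built from the tiny primes 61 and 53, and the "symmetric cipher" is just an XOR with a SHA-256-derived keystream. The names `send`, `receive`, and `keystream_xor` are hypothetical; a real system would use a vetted library with proper padding and an authenticated cipher.

```python
import hashlib
import secrets

# B's toy RSA key pair (assumed values, far too small to be secure).
P, Q = 61, 53
N = P * Q                           # public modulus
E = 17                              # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent, known only to B


def keystream_xor(key: int, data: bytes) -> bytes:
    """Toy symmetric cipher: XOR data with a keystream derived from key.

    Applying it twice with the same key returns the original data.
    """
    out = bytearray()
    for offset in range(0, len(data), 32):
        block = hashlib.sha256(f"{key}:{offset}".encode()).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ k for b, k in zip(chunk, block))
    return bytes(out)


def send(file_d: bytes) -> tuple:
    """A's side: encrypt D with a fresh key K, then wrap K with B's public key."""
    k = secrets.randbelow(N - 2) + 2
    wrapped_k = pow(k, E, N)
    return wrapped_k, keystream_xor(k, file_d)


def receive(wrapped_k: int, ciphertext: bytes) -> bytes:
    """B's side: unwrap K with the private key, then decrypt D with K."""
    k = pow(wrapped_k, D, N)
    return keystream_xor(k, ciphertext)


wrapped, encrypted = send(b"contents of the large file D")
print(receive(wrapped, encrypted))
```

Only the short key K goes through the slow asymmetric step; the file itself is handled by the fast symmetric step, which is the point of combining the two.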

Digital Signatures#

Digital signatures are another application of asymmetric cryptography. As mentioned earlier, each person has a generated pair of public and private keys. In encryption/decryption, the public key is used to encrypt and the private key to decrypt. The digital signature mechanism works the other way around: the file holder signs the file (in practice, its hash) with their private key, and anyone can check the signature with the holder's public key. If the check succeeds, it proves that the signature came from the holder of the private key and that the file has not been altered, which establishes ownership of the file.

The most typical application of digital signatures is in the Bitcoin network, where a private key is used to prove ownership of Bitcoin and to sign transactions, and others use the corresponding public key to verify that a transaction is valid. The private key is never exposed during this process, which keeps the assets secure.
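A sign-then-verify round trip can be sketched as follows. Note the substitution: Bitcoin itself uses ECDSA over the secp256k1 curve, while this sketch uses textbook RSA with tiny primes only because it fits in a few lines of standard-library Python. The key values and the function names `sign` and `verify` are assumptions for illustration, and the scheme is not secure.

```python
import hashlib

# Toy RSA key pair (assumed values, far too small to be secure).
P, Q = 61, 53
N = P * Q
E = 17                              # public exponent, shared with everyone
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent, kept by the signer


def digest_mod_n(message: bytes) -> int:
    """Hash the message and reduce it so it fits the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """Signer: apply the private exponent to the message digest."""
    return pow(digest_mod_n(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Anyone: apply the public exponent and compare with the digest."""
    return pow(signature, E, N) == digest_mod_n(message)


tx = b"A pays B 1 BTC"
sig = sign(tx)
print(verify(tx, sig))
```

Verification needs only the message, the signature, and the public values `E` and `N`; the private exponent `D` never leaves the signer.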

Basic Concepts of Blockchain#

Over the course of history, accounting methods have evolved from single-entry bookkeeping to double-entry bookkeeping, digital bookkeeping, and finally distributed bookkeeping. Traditional centralized digital bookkeeping relies on the trustworthiness of one or more organizations, which poses trust risks. Blockchain technology is essentially a distributed ledger technology, in which a group of participants collectively maintain a decentralized database and record transactions through a consensus mechanism. Historical records on a blockchain are easy to trace, and because of the decentralized trust mechanism, it is almost impossible to tamper with them (or the cost of tampering is much higher than the benefit).

Compared to traditional databases, blockchain only has two types of operations: adding and querying. All operation history is accurately stored in the ledger and is immutable, providing high transparency and security. Of course, the cost is that all nodes must reach consensus through certain mechanisms (therefore, the efficiency is lower and not suitable for real-time operations), and because each node needs to permanently store historical records, it occupies a large amount of storage space.

Application Scenarios#

How can we determine whether a company/business is suitable for adopting blockchain as a solution?

  1. Do you need a database?
  2. Do you need shared writing?
  3. Do you need multiple parties to establish trust?
  4. Can it operate without relying on third-party institutions?
  5. Can it operate without relying on permission mechanisms?

As a distributed database, blockchain mainly focuses on information storage. Through various mechanisms, it allows entities with common needs but without mutual trust to reach consensus at a relatively low cost without the involvement of third-party institutions. In addition, the system has characteristics such as encryption authentication and high transparency, which can meet some business needs. However, if the data involved cannot be made public/the data volume is very large/external services are needed to store the data, or the business rules change frequently, then blockchain is not suitable as a solution.

Therefore, based on the above criteria, the following requirements are well-suited for blockchain as a solution:

  1. Need to establish a shared database with multiple participants.
  2. The parties involved in the business do not trust each other.
  3. The existing business relies on one or more trusted institutions.
  4. The existing business has a demand for encrypted authentication.
  5. Data needs to be integrated into different databases, and there is an urgent need for business digitization and consistency.
  6. There are unified rules for system participants.
  7. Multi-party decision-making is transparent.
  8. Objective and immutable records are required.
  9. Non-real-time processing of business.

However, in many application scenarios, companies need to strike a balance between decentralization and efficiency. Sometimes, complex businesses have different requirements for transparency and rules. Therefore, based on complex commercial needs, there are also solutions such as "consortium chains" that can better integrate with existing systems to meet business needs.

Types of Blockchains#

There are different types of blockchains, mainly private chains, public chains, and consortium chains.

Private chains are mainly used in specific domains or run within a single enterprise. They are mainly used to solve trust issues in scenarios such as cross-department collaboration and generally do not require external access to data.

Public chains are open to everyone, and their transactions are publicly visible. They are often used in businesses that require transaction/data transparency, such as authentication, traceability, and finance. Examples include Bitcoin, Ethereum, and EOS.

The most significant feature of consortium chains is that nodes need to verify permissions to participate in the blockchain network, and the verification is generally associated with their real-world roles. Therefore, consortium chains also have centralized attributes, but they greatly improve efficiency, scalability, and transaction privacy, meeting the needs of enterprise-level applications. The most widely used consortium chain is "Hyperledger Fabric." It is worth mentioning that consortium chains often do not require tokens as incentives. Instead, each participating node serves as a bookkeeping node, and the economic benefits brought by business collaboration between departments through the blockchain mechanism serve as internal incentives. This is a healthier and more suitable way for enterprise applications.

In the long run, public chains and consortium chains will gradually converge technically. Even for the same business, trusted data can be placed on a public chain, while industry data and private data can be placed on a consortium chain to ensure transaction privacy through permission management.

Basic Framework of Blockchain#

What are the components of a blockchain?

  1. Blocks
  2. Blockchain
  3. P2P Network
  4. Consensus Mechanism
  5. ...

Blocks#

A blockchain is composed of blocks. Each block contains several parts, including the hash value of the previous block, a timestamp, the Merkle Root, a Nonce, and the block data. Bitcoin's base block size limit is 1 MB. You can experience the process of generating a block in this Demo.
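The Merkle Root is a single hash that summarizes every transaction in the block. A minimal sketch of how such a root can be computed is shown below. It follows the pairing rule Bitcoin is known for (double SHA-256, duplicating the last hash when a level has an odd count), but it ignores Bitcoin's byte-order conventions, and the function names are my own.

```python
import hashlib


def double_sha256(data: bytes) -> bytes:
    """Apply SHA-256 twice."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root(transactions: list) -> bytes:
    """Hash transactions pairwise, level by level, until one hash remains."""
    if not transactions:
        raise ValueError("at least one transaction is required")
    level = [double_sha256(tx) for tx in transactions]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # odd count: pair the last hash with itself
        level = [
            double_sha256(level[i] + level[i + 1])
            for i in range(0, len(level), 2)
        ]
    return level[0]


txs = [b"A pays B 5", b"B pays C 2", b"C pays D 1"]
print(merkle_root(txs).hex())
```

Changing any single transaction changes the root, so storing only the root in the block is enough to commit to the full transaction list.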

Because each block contains the hash value of the previous block, and because of the hash properties described earlier, even a tiny change to a block's contents produces a completely different hash value. It is therefore easy to detect whether a block has been tampered with. The Nonce is the value that miners vary until the block hash meets the current difficulty target. The network adjusts that difficulty so that a new block is produced roughly every 10 minutes, which helps keep the system secure.
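The role of the Nonce in mining can be sketched as a brute-force search: keep changing the Nonce until the block hash meets a condition. This is a simplified model under my own assumptions. The header is a plain string rather than Bitcoin's 80-byte binary header, a single SHA-256 is used, and "difficulty" means a number of leading zero hex digits rather than a numeric target.

```python
import hashlib


def block_hash(prev_hash: str, data: str, nonce: int) -> str:
    """Hash a simplified block header made of three fields."""
    header = f"{prev_hash}|{data}|{nonce}".encode()
    return hashlib.sha256(header).hexdigest()


def mine(prev_hash: str, data: str, difficulty: int = 3) -> tuple:
    """Try nonces 0, 1, 2, ... until the hash starts with enough zeros."""
    prefix = "0" * difficulty
    nonce = 0
    while True:
        candidate = block_hash(prev_hash, data, nonce)
        if candidate.startswith(prefix):
            return nonce, candidate
        nonce += 1


genesis_prev = "0" * 64
nonce, found = mine(genesis_prev, "A pays B 5", difficulty=3)
print(nonce, found)

# Verifying the result takes one hash, however long the search took.
print(block_hash(genesis_prev, "A pays B 5", nonce) == found)
```

Each extra leading zero makes the search about 16 times longer on average, while verification stays a single hash. That asymmetry is what Proof of Work relies on.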

Blockchain#

All blocks are connected to form a blockchain, which is a ledger that stores all transaction history in the network. Because each block contains the hash of the previous block (the Bitcoin system, for example, applies SHA-256 twice to the previous block header), any change to a past transaction changes that block's hash and breaks the link to every block after it. This Demo demonstrates the process well, and you can experience it!
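The way a change breaks the chain can be sketched with a list of dictionaries. The block layout, the JSON serialization, and the function names here are hypothetical simplifications and do not reflect Bitcoin's real block format.

```python
import hashlib
import json


def hash_block(block: dict) -> str:
    """Hash a block's full contents in a stable (sorted-key) serialization."""
    payload = json.dumps(block, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()


def build_chain(records: list) -> list:
    """Create blocks that each store the hash of the block before them."""
    chain = []
    prev = "0" * 64  # placeholder for the first block
    for index, record in enumerate(records):
        block = {"index": index, "prev_hash": prev, "data": record}
        chain.append(block)
        prev = hash_block(block)
    return chain


def first_broken_link(chain: list):
    """Return the index of the first block whose prev_hash no longer matches."""
    for i in range(1, len(chain)):
        if chain[i]["prev_hash"] != hash_block(chain[i - 1]):
            return i
    return None


chain = build_chain(["A pays B 5", "B pays C 2", "C pays D 1"])
print(first_broken_link(chain))    # intact chain

chain[0]["data"] = "A pays B 500"  # tamper with an old record
print(first_broken_link(chain))    # the next block no longer links back
```

To hide the change, an attacker would have to recompute every later block as well, and on a Proof-of-Work chain each of those blocks would need to be mined again.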

P2P Network#

A P2P network is a distributed network used to share information and resources among different users. Everyone in it can hold a backup of the information and has access rights. In contrast, a centralized network connects everyone to a single central node (or a single group of them), while a decentralized network has multiple central nodes, none of which holds all the information. The following diagram illustrates the difference between them:

[Figure: blockchain_network]

Consensus Mechanism#

A blockchain network is composed of multiple network nodes, each of which stores a copy of the information. How do they reach consensus on transactions? In other words, as independent nodes, they need a mechanism to establish mutual trust, which is the consensus mechanism.

Common consensus mechanisms include Proof of Work (PoW), Proof of Stake (PoS), Delegated Proof of Stake (DPoS), Delegated Byzantine Fault Tolerance (DBFT), etc.

Bitcoin uses the Proof of Work mechanism, as Ethereum originally did. It raises the cost for malicious nodes through computing power competition. In Bitcoin, the mining difficulty is adjusted dynamically so that a new block is produced about every 10 minutes, and a transaction is commonly treated as confirmed after 6 blocks, or roughly an hour. As Bitcoin mining has become more popular, however, it consumes more and more resources, which causes environmental damage. Some mining pools also control a large share of the computing power, which poses centralization risks.
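The dynamic difficulty adjustment can be sketched as a simple proportional rule. The constants below (a 2016-block period, 600-second spacing, and a factor-of-4 clamp) reflect my understanding of Bitcoin's retargeting and are not stated in the course material. The sketch also leaves out Bitcoin's cap on the maximum target. A larger target means an easier puzzle.

```python
BLOCKS_PER_PERIOD = 2016   # blocks between adjustments (assumed Bitcoin value)
TARGET_SPACING = 600       # desired seconds per block (10 minutes)
EXPECTED_TIMESPAN = BLOCKS_PER_PERIOD * TARGET_SPACING  # two weeks in seconds


def retarget(old_target: int, actual_timespan: int) -> int:
    """Scale the target by how long the last period really took.

    Blocks came too fast -> smaller target (harder).
    Blocks came too slow -> larger target (easier).
    The change per period is limited to a factor of 4 either way.
    """
    clamped = max(EXPECTED_TIMESPAN // 4, min(actual_timespan, EXPECTED_TIMESPAN * 4))
    return old_target * clamped // EXPECTED_TIMESPAN


old = 1_000_000
print(retarget(old, EXPECTED_TIMESPAN))        # on schedule: unchanged
print(retarget(old, EXPECTED_TIMESPAN // 2))   # twice as fast: target halves
print(retarget(old, EXPECTED_TIMESPAN * 10))   # very slow: capped at 4x
```

The effect is that adding computing power to the network does not shorten the block interval for long; it only raises the difficulty.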

The Proof of Stake mechanism achieves consensus through voting by stakeholders, whose voting weight usually depends on the tokens they hold. It does not require extensive computing power competition like Proof of Work, but it has its own risk, known as the "Nothing at Stake" problem: stakeholders can bet on every competing chain at once and profit whichever one wins. To solve this problem, the system sets rules such as penalizing users who create blocks on multiple chains simultaneously or on the wrong chain. Ethereum was transitioning to this consensus mechanism when this article was written and completed the switch in September 2022.
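The two ideas in this paragraph, stake-weighted participation and penalties for misbehavior, can be sketched in a few lines. This is a deliberately simplified model of my own and not any real chain's protocol: the function names, the 50% penalty, and the use of a seeded pseudo-random generator are all assumptions.

```python
import random


def pick_validator(stakes: dict, rng: random.Random) -> str:
    """Choose a block proposer with probability proportional to stake."""
    names = list(stakes)
    weights = [stakes[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


def slash(stakes: dict, offender: str, fraction: float = 0.5) -> dict:
    """Return new stakes with part of the offender's stake removed.

    Models the penalty for creating blocks on multiple chains at once.
    """
    updated = dict(stakes)
    updated[offender] = stakes[offender] * (1 - fraction)
    return updated


stakes = {"alice": 60, "bob": 30, "carol": 10}
rng = random.Random(42)  # seeded only to make the sketch repeatable
picks = [pick_validator(stakes, rng) for _ in range(1000)]
print({name: picks.count(name) for name in stakes})

# Alice is caught signing on two competing chains and loses half her stake.
print(slash(stakes, "alice"))
```

Because the penalty is taken from the stake itself, betting on every chain stops being free, which is how the rule addresses the "Nothing at Stake" problem.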

EOS adopts the Delegated Proof of Stake mechanism, which selects representative nodes to vote. This approach aims to optimize the efficiency and results of community voting but brings some centralization risks.

The DBFT consensus mechanism achieves consensus by assigning different roles to nodes, which can greatly reduce costs and avoid forks. However, it also carries the risk of core role misconduct.

Blockchain Security and Privacy#

Security#

As a relatively new technology, blockchain also has many security risks, such as attacks on cryptocurrency exchanges, smart contract vulnerabilities, attacks on consensus protocols, attacks on network traffic (Internet ISPs), and uploading malicious data. Some well-known cases include the Mt.Gox incident and the Ethereum DAO incident. Therefore, the security risks of blockchain are an important research direction.

Risk analysis can be carried out from the perspectives of protocols, encryption schemes, applications, program development, and systems to improve the security of blockchain applications. For example, in the Ethereum blockchain, analysis can be performed on the Solidity programming language, EVM, and the blockchain itself.

For example, smart contracts are exposed to a low-cost attack that identifies operations on the Ethereum network whose Gas cost is low relative to the work they cause, and repeats them to disrupt the entire network.

For security issues, building a universal code detector to check for malicious code will be a more general solution.

Privacy#

When discussing blockchain concepts, transparency was mentioned as an important feature: everyone can see the transaction details and historical records on the chain. This feature is mainly applied in supply chain scenarios such as food and drugs. For some financial scenarios, however, such as personal account balances and transaction information, it can easily pose privacy risks.

What technologies can be applied to protect privacy in these high-value and sensitive information scenarios?

At the hardware level, trusted execution environments built on secure hardware such as Intel SGX can be used, which significantly strengthens privacy. At the network level, multi-path forwarding can be used to prevent real identities from being inferred from node IP addresses.

At the technical level, mixing techniques combine many transactions so that it is difficult to match senders with recipients. Blind signatures prevent third-party institutions from linking the parties involved in a transaction. Ring signatures keep the signer of a transaction anonymous within a group. Zero-knowledge proofs allow a prover to convince another party (the verifier) that a statement is correct without revealing anything beyond the fact that it is correct. Homomorphic encryption protects the original data: given E(x) and E(y), it is easy to compute the encrypted value of certain functions of x and y (homomorphic operations) without decrypting them. Attribute-based encryption (ABE) attaches attributes/roles to each node to achieve access control and protect privacy.
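The homomorphic property can be seen in a very small example. Textbook RSA (without padding) happens to be multiplicatively homomorphic: multiplying two ciphertexts gives a ciphertext of the product. The sketch below uses that fact with toy key values that I chose for illustration. It is not secure, and practical homomorphic schemes are different and far more involved.

```python
# Toy RSA parameters (assumed values, far too small to be secure).
P, Q = 61, 53
N = P * Q
E = 17
D = pow(E, -1, (P - 1) * (Q - 1))


def encrypt(m: int) -> int:
    """E(m) = m^E mod N"""
    return pow(m, E, N)


def decrypt(c: int) -> int:
    """m = c^D mod N"""
    return pow(c, D, N)


x, y = 7, 12

# A third party multiplies the ciphertexts without ever seeing x or y.
product_ciphertext = (encrypt(x) * encrypt(y)) % N

# Only the key holder can decrypt, and the result is x * y.
print(decrypt(product_ciphertext), x * y)
```

The party doing the multiplication works only on E(x) and E(y), which is the sense in which the original data stays protected.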

It is worth noting that even if a transaction generates multiple inputs and outputs, the addresses of these inputs and outputs may still be associated. In addition, the address account and the real-world identity may also be associated.
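One well-known way such associations are made is the assumption that all input addresses of a single transaction belong to the same owner (often called the common-input-ownership heuristic). The text above does not name a method, so treat this choice, the data layout, and the function name as my own assumptions. The sketch groups addresses with a small union-find structure.

```python
def cluster_addresses(transactions: list) -> list:
    """Group addresses that appear together as inputs of one transaction.

    transactions is a list of input-address lists, one list per transaction.
    """
    parent = {}

    def find(addr: str) -> str:
        parent.setdefault(addr, addr)
        while parent[addr] != addr:
            parent[addr] = parent[parent[addr]]  # path compression
            addr = parent[addr]
        return addr

    def union(a: str, b: str) -> None:
        parent[find(a)] = find(b)

    for inputs in transactions:
        for addr in inputs:
            find(addr)                  # register every address
        for addr in inputs[1:]:
            union(inputs[0], addr)      # same transaction -> same cluster

    clusters = {}
    for addr in parent:
        clusters.setdefault(find(addr), set()).add(addr)
    return sorted(sorted(group) for group in clusters.values())


# A and B are spent together, then B and C are spent together.
print(cluster_addresses([["A", "B"], ["B", "C"], ["D"]]))
```

A, B, and C end up in one group even though A and C never appear in the same transaction. If any one of them is later tied to a real-world identity, the whole group is exposed.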

Conclusion#

The above is a summary of some fundamental knowledge of blockchain, mainly from the conceptual and theoretical aspects. In the future, I will update the analysis and thoughts on typical applications such as Bitcoin, Ethereum, Hyperledger Fabric, and explore popular technologies such as IPFS, cross-chain, NFT, etc. Stay tuned!

References#

  1. COMP7408 Distributed Ledger and Blockchain Technology, Professor S.M. Yiu, HKU
  2. Udacity Blockchain Developer Nanodegree, Udacity
  3. Blockchain Technology and Applications, Xiaozhen, Peking University
  4. Advanced Blockchain Technology and Practice, Cai Liang, Li Qilei, Liang Xiubo, Zhejiang University | Qu Chain Technology