The following document introduces a convention for appending information (stored in various files) to the compiled application’s bytes.
The goal of this convention is to standardize the process of verifying and adding this information.
The encoded information byte string is arc23 followed by the IPFS CID v1 of a folder containing the files with the information.
The minimum required file is contract.json representing the contract metadata (as described in ARC-4), and as extended by future potential ARCs).
Specification
The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in this document are to be interpreted as described in RFC-2119.
Comments like this are non-normative.
Files containing Application Information
Application information are represented by various files in a folder that:
MUST contain a file contract.json representing the contract metadata (as described in ARC-4), and as extended by future potential ARCs).
MAY contain a file with the basename application followed by the extension of the high-level language the application is written in (e.g., application.py for PyTeal).
To allow the verification of your contract, be sure to write the version used to compile the file after the import eg: from pyteal import * #pyteal==0.20.1
MAY contain the files approval.teal and clear.teal, that are the compiled versions of approval and clear program in TEAL.
Note that approval.teal will not be able to contain the application information as this would create circularity. If approval.teal is provided, it is assumed that the actualapproval.teal that is deployed corresponds to approval.teal with the proper bytecblock (defined below) appended at the end.
MAY contain other files as defined by other ARCs.
CID, Pinning, and CAR of the Application Information
The CID allows to access the corresponding application information files using IPFS.
The CID MUST:
Represent a folder of files, even if only contract.json is present.
You may need to use the option --wrap-with-directory of ipfs add
Be a version V1 CID
E.g., use the option --cid-version=1 of ipfs add
Use SHA-256 hash algorithm
E.g., use the option --hash=sha2-256 of ipfs add
Since the exact CID depends on the options provided when creating it and of the IPFS software version (if default options are used), for any production application, the folder of files SHOULD be published and pinned on IPFS.
All examples in this ARC assume the use of Kubo IPFS version 0.17.0 with default options apart those explicitly stated.
If the IPFS is not pinned, any production application SHOULD provide a Content Address Archiver (CAR)( file of the folder, obtained using ipfs dag export.
For public networks (e.g., MainNet, TestNet, BetaNet), block explorers and wallets (that support this ARC) SHOULD try to recover application information files from IPFS, and if not possible, SHOULD allow developers to upload a CAR file.
If a CAR file is used, these tools MUST validate the CAR file matches the CID.
For development purposes, on private networks, the application information files MAY be instead provided as a .zip or .tar.gz containing at the root all the required files.
Block explorers and wallets for private networks MAY allow uploading the application information as a .zip or .tar.gz.
They still SHOULD validate the files.
The validation of .zip or .tar.gz files will work if the same version of the IPFS software is used with the same option. Since for development purposes, the same machine is normally used to code the dApp and run the block explorer/wallet, this is most likely not an issue.
However, for production purposes, we cannot assume the same IPFS software is used and a CAR file is the best solution to ensure that the application information files will always be available and possible to validate.
Example: For the example stored in /asset/arc-0023/application_information, the CID is bafybeiavazvdva6uyxqudfsh57jbithx7r7juzvxhrylnhg22aeqau6wte, which can be obtained with the command:
Inclusion of the Encoded Information Byte String in Programs
The encoded information byte string is included in the approval program of the application via a bytecblock with a unique byte string equal to the encoding information byte string.
The size of the compiled application plus the bytecblock MUST be, at most, the maximum size of a compiled application according to the latest consensus parameters supported by the compiler.
At least with TEAL v8 and before, appending the bytecblock to the end of the program should add exactly 44 bytes (1 byte for opcode bytecblock, 1 byte for 0x01 -the number of byte strings-, 1 byte for 0x29 the length of the encoded information byte string, 41 byte for the encodedin information byte string)
The bytecblockMAY be placed anywhere in the TEAL source code as long as it does not modify the semantic of the TEAL source code.
However, if approval.teal is provided as an application information file, the bytecblockSHOULD be the last opcode of the deployed TEAL program.
Developers MUST check that, when adding the bytecblock to their program, semantic is not changed.
At least with TEAL v8 and before, adding a bytecblock opcode at the end of the approval program does not change the semantics of the program, as long as opcodes are correctly aligned, there is no jump after the last position (that would make the program fail without bytecblock), and there is enough space left to add the opcode, at least with TEAL v8 and before.
However, though very unlikely, future versions of TEAL may not satisfy this property.
The bytecblockMUST NOT contain any additional byte string beyond the encoded information byte string.
Retrieval the Encoded Information Byte String and CID from Compiled TEAL Programs
For programs until TEAL v8, a way to find the encoded information byte string is to search for the prefix:
0x2601296172633233
which is then followed by the 36 bytes of the binary CID.
Indeed, this prefix is composed of:
0x26, the bytecblock opcode
0x01, the number of byte strings provided in the bytecblock
0x29, the length of the encoded information byte string
0x6172633233, the hexadecimal of arc23
Software retrieving the encoded information byte string SHOULD check the TEAL version and only perform retrieval for supported TEAL version.
They also SHOULD gracefully handle false positives, that is when the above prefix is found multiple times.
One solution is to allow multiple possible CID for a given compiled program.
Note that opcode encoding may change with the TEAL version (though this did not happen up to TEAL v8 at least).
If the bytecblock opcode encoding changes, software that extract the encoded information byte string from compiled TEAL programs MUST be updated.
Rationale
By appending the IPFS CID of the folder containing information about the Application, any user with access to the blockchain could easily verify the Application and the ABI of the Application and interact with it.
Using IPFS has several advantages:
Allows automatic retrievel of the application information when pinned.
Allows easy archival using CAR.
Allows support of multiple files.
Reference Implementation
The following codes are not audited and are only here for information purposes.
Here is an example of a python script that can generate the hash and append it to the compiled application, according this ARC:
main.py.
If they are not accessible be sure to removed [—only-hash | -n] from your command or check you ipfs node.
Security Considerations
CIDs are unique; however, related files MUST be checked to ensure that the application conforms.
An arc-23 CID added at the end of an application is here to share information, not proof of anything.
In particular, nothing ensures that a provided approval.teal matches the actual program on chain.